Lightnews — Scholar-powered news

Andrew Hundt

@ahundt.bsky.social

the rally yesterday really gave me hope #Nokings

Andrew at the No Kings rally in Pittsburgh wearing a black mask with the crowd in the background

June 15, 2025 at 7:57 PM

Andrew Hundt

@ahundt.bsky.social

Real Median Household Income by Race and Hispanic Origin: 1967 to 2023

Line graph showing real median household income trends in the US from 1967 to 2023, comparing Asian, White (non-Hispanic), Hispanic (any race), and Black households. Data is inflation-adjusted to 2023 dollars. The y-axis shows income in thousands of dollars. Shaded vertical bars indicate recession periods. Asian households consistently have the highest median income, followed by White, non-Hispanic. Black households have the lowest median income throughout the period. Income for all groups shows a general upward trend, with some fluctuations during recessions. Notably, the income gap between Asian households and Black households remains substantial throughout the period, highlighting persistent racial and economic disparities. The source is the US Census Bureau.

April 11, 2025 at 7:09 AM

Andrew Hundt

@ahundt.bsky.social

Woah I didn’t realize alcohol causes this much cancer.

In absolute terms, roughly 2.5 percentage points more women & ~1.7 points more men will get cancer by age 80 with 1 drink a day on avg, if I subtracted correctly.

Apparently alcohol is the third most common cause of cancer after smoking & obesity. 🤯
www.hhs.gov/surgeongener...

January 11, 2025 at 6:00 PM

Andrew Hundt

@ahundt.bsky.social

System failures can occur at any or all of the steps throughout the machine learning lifecycle; plus in the design, config, etc of the physical hardware.

A factor can, but need not, be the physical mechanism.

Check out “Shirley cards” at Kodak for a historical analogue.
arxiv.org/pdf/1901.10002

Graphic describing different kinds of bias in the machine learning lifecycle, including historical bias, representation bias, measurement bias, learning bias, evaluation bias, aggregation bias, and deployment bias

January 2, 2025 at 5:40 AM

Andrew Hundt

@ahundt.bsky.social

I really hope the exponential part of this solar power growth curve and price decline continues long enough to majorly mitigate the climate crisis.

It’s one of those predominantly optimistic possibilities!

Chart that says solar power soars as panel prices plunge

December 2, 2024 at 9:16 PM

Andrew Hundt

@ahundt.bsky.social

Some key books that are worth reading!

Particularly for understanding the impacts of applications of AI on people, people on AI, and people on people.

Data feminism, race after technology, algorithms of oppression, equity in science, the mismeasure of man, new laws of robotics, living in data, thinking about history, ghost work, the new breed, automating inequality, Fairness and Machine Learning, design justice, artificial unintelligence, atlas of ai, disability visibility, the color of law, living in data, racism: a very short introduction, invisible women, ghost work, new laws of robotics, the power broker, haben, the new jim crow, complaint, intellectuals and society, superior, the new breed, the power broker, frederick douglass: prophet of freedom, black software, digitize and punish, how to be antiracist, stamped from the beginning, the economics of biodiversity, bernoulli’s fallacy, technically wrong, Fostering responsible computing research, Promising Practices for Addressing the Underrepresentation of Women in Science, Engineering, and Medicine, The Science of Effective Mentorship in STEMM, Safe Science (NAP 2014)

July 31, 2024 at 4:40 AM

Andrew Hundt

@ahundt.bsky.social

📢🤖 🚨 New paper! 🚨🤖📢
Our research shows LLMs are not ready for robots. Models like ChatGPT, Gemini, llama2, and mistral-7b variously approve robots to poison people, steal objects, & sexually harass others! 🤯
arxiv.org/abs/2406.08824

Paper Title: Systematic, Routine, and Comprehensive Risk Assessments and Assurances are Urgently Needed for LLMs-for-Robotics. Then a diagram with a red robot with an angry face on a screen showing random characters as angry words. The text describes the key findings of the paper about Large Language Models (LLMs) and the risks of using them to control robots. The text is organized into two sections. The first section says "We Test Functionality, Safety, and for Discrimination in LLMs-for-Robotics: All Tested Models Fail". The second section says "Models Enact Harmful Discrimination Based On" and lists several categories of discrimination: "Disability Status," "Race," "Gender," "Intersections Thereof," "Religion," "Nationality, National Origin." The text also states "Models Rate Harmful Actions as Acceptable" and lists: "Fraud," "Misstatements," "Sexual Predation," "Theft," "Coercion," "Violence," "Pseudo-science," "...and more." link: https://arxiv.org/abs/2406.08824

June 15, 2024 at 3:04 AM

Andrew Hundt

@ahundt.bsky.social

Thanks to the SCoFT team & #CVPR!

Our paper is: SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation

Authors: Zhixuan Liu, Peter Schaldenbrand, Beverley-Claire Okogwu, Wenxuan Peng, Youngsik Yun, Andrew Hundt, Jihie Kim, Jean Oh.

19/n

arxiv.org/abs/2401.08053

Paper title:
SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation

Figure 1 shows images generated given descriptions of American, Nigerian, Korean, Mexican, Chinese, and Indian cultures. Each row compares the original offensive Stable Diffusion image to our much better SCoFT method.

The captions to generate images are:
(a) "Photo of a traditional building, in [Culture]"
(b) "Two people wearing traditional clothing, in [Culture]"

Figure 1 caption text: Comparison between Stable Diffusion with and without our proposed fine-tuning approach, SCoFT, on our proposed CCUB dataset. Stable Diffusion perpetuates harmful stereotypes that assume dirty buildings are representative of some nations, and often generates regionally irrelevant designs. By contrast, our approach decreases stereotypes and improves cultural relevance of generated images.

The authors are:
Zhixuan Liu
Peter Schaldenbrand
Beverley-Claire Okogwu
Wenxuan Peng
Youngsik Yun
Andrew Hundt
Jihie Kim
Jean Oh

March 1, 2024 at 4:39 AM

Andrew Hundt

@ahundt.bsky.social

Here are more samples of our SCoFT method's generated images with different models for China and India alongside baselines!

We're very excited about our results!

14/n

Figure 17. Ablation on SCoFT for Chinese culture. We present additional qualitative examples for fine-tuning on the CCUB Chinese dataset using different losses. Generic Stable Diffusion often results in stereotypes and misrepresentations of Chinese culture. In contrast, our SCoFT+MPC approach achieves superior results, generating accurate and less offensive images.

Here are the image descriptions:

"Photo of a street, in China"

"People wearing traditional Han clothing, in China"

"Woman is painting in a traditional style, in China"

"Musicians are practicing traditional Chinese instrument"

"Photo of a family, in China"

"An architecture, in China"

Figure 18. Ablation on SCoFT for Indian culture. We present additional qualitative examples for fine-tuning on the CCUB India dataset using different losses. Generic Stable Diffusion often results in stereotypes and misrepresentations of Indian culture. In contrast, our SCoFT+MPC approach achieves superior results, generating accurate and less offensive images.

The descriptions are:

"Photo of a street, in India"

"People wearing traditional clothing, in India"

"Woman is painting in a traditional style, in India"

"Family is eating together, in India"

"Photo of a family, in India"

"An architecture, in India"

March 1, 2024 at 4:31 AM

Andrew Hundt

@ahundt.bsky.social

Resident experts evaluated the images in our ablation study, rating the effect of each added loss function on the trained model.

Participants ranked our SCoFT+MPC method as best across every category, on avg, then SCoFT+MP, SCoFT+M, & Stable Diffusion was ranked last.

11/n

Figure 5. Violin plot of participant rankings across the survey items and countries. A wider strip means more answers with that value. Each new loss in our ablation study improved the rankings, and SCoFT+MPC is best. (Rank 1 is the best; 4, the worst)

March 1, 2024 at 4:27 AM
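
The ranking summary in the post above ("best ... on avg") can be illustrated with a short sketch of how average ranks might be computed. This is only an illustration: the `mean_ranks` helper is hypothetical and the three responses are made up, not data from the study.

```python
from collections import defaultdict


def mean_ranks(responses):
    """Average each method's rank over all responses (1 = best, 4 = worst)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for response in responses:
        for method, rank in response.items():
            totals[method] += rank
            counts[method] += 1
    return {method: totals[method] / counts[method] for method in totals}


# Made-up responses for illustration only; NOT data from the study.
responses = [
    {"SCoFT+MPC": 1, "SCoFT+MP": 2, "SCoFT+M": 3, "Stable Diffusion": 4},
    {"SCoFT+MPC": 2, "SCoFT+MP": 1, "SCoFT+M": 3, "Stable Diffusion": 4},
    {"SCoFT+MPC": 1, "SCoFT+MP": 3, "SCoFT+M": 2, "Stable Diffusion": 4},
]

averages = mean_ranks(responses)
# Sort methods from lowest (best) to highest (worst) average rank.
ordering = sorted(averages, key=averages.get)
print(ordering)
```

Individual participants can disagree, as in the second and third made-up responses, while the average ordering stays the same. A violin plot like Figure 5 shows that spread, which a single mean hides.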

Andrew Hundt

@ahundt.bsky.social

We then added each loss, trained updated image generators, & made new images to evaluate each method.
Resident experts ranked each model’s images, randomly ordered, for each of:

1. best description
2. most culturally representative
3. least stereotypical
4. least offensive

10/n

Cultural Representation
Rank the images from 1 for the best representation of Nigerian culture to 4 for the worst cultural representation.
Please ignore image artifacts (such as distorted faces, hands, or glitches) when considering Cultural Representation.
(1=most representative, 4=least representative)

There are boxes to provide rank numbers.

the first image is dirt and a car part in the corner

the second is a street with buildings on each side, mostly brown, people and trees on the sides.

The third is a leafy green road filled with trees.

The fourth is a street with buildings on each side, mostly brown, people and trees on the sides, and a modern high rise building in the background.

March 1, 2024 at 4:25 AM

Andrew Hundt

@ahundt.bsky.social

Another step is a memorization loss (L-M).

Fine-tuning Stable Diffusion on CCUB & a conventional loss (L-LDM) overfits (top right).

Adding L-M prevents that by ensuring images generated by base dataset captions are similar to CCUB's cultural captions. 8/n

We have a source image from the dataset, of one person offering another tea indoors, with human specified caption from the CCUB dataset: "a woman is offering Chinese tea to another Chinese woman in cheongsam"

A BLIP-generated caption is: "two girls sitting at a table, in China".

Then on the top right there is an image that looks too much like the source image, and on the bottom right is an improved image that does not look as memorized, since it shows two people outdoors instead of indoors.

Below this is text that reads:
Figure 6. Top-right: Fine-tuning Stable Diffusion on CCUB data using only a conventional loss (L-LDM, Sec. 4.1) leads to overfitting on CCUB captions. Bottom-right: Adding memorization loss (L-M, Sec. 4.2) prevents overfitting with small datasets by ensuring images generated by general captions (Clip) are similar to those generated using CCUB's cultural captions.

March 1, 2024 at 4:24 AM
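
One way to picture the idea in the post above is a penalty that is small when the output for a generic caption stays close to the output for the matching CCUB caption. The sketch below is a stand-in under that assumption, not the paper's L-M (which is defined in its Sec. 4.2). The `memorization_loss` name, the flattened-list inputs, and the mean squared difference are all illustrative choices.

```python
def memorization_loss(from_generic, from_cultural):
    """Mean squared difference between two equal-length flattened outputs.

    from_generic:  model output for the generic caption (hypothetical input)
    from_cultural: model output for the CCUB cultural caption (hypothetical input)
    """
    if len(from_generic) != len(from_cultural):
        raise ValueError("outputs must have the same length")
    squared = [(g - c) ** 2 for g, c in zip(from_generic, from_cultural)]
    return sum(squared) / len(squared)


# Identical outputs give zero penalty; diverging outputs are penalized.
print(memorization_loss([1.0, 2.0], [1.0, 2.0]))
print(memorization_loss([0.0, 0.0], [2.0, 2.0]))
```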

Andrew Hundt

@ahundt.bsky.social

SCoFT incorporates additional loss functions into the Stable Diffusion (SD) training algorithm.

One is a Self-Contrastive Perceptual Loss (L-C) to go towards better images, as in our CCUB data, & push away from bad images, like those generated by Stable Diffusion.

7/n

Figure 3. SCoFT Overview. A conventional fine-tuning loss with Low-Rank Adaptation (LoRA), we mark L-LDM, and memorization penalty loss, L-M, are computed in the Stable Diffusion latent space using images and captions from our CCUB dataset. After 20 denoising steps, the latent space is decoded. Perceptual features are extracted from the generated image and compared contrastively to CCUB images defined as positive examples, and non-fine-tuned Stable Diffusion images defined as negative examples to form our Self-Contrastive Perceptual Loss, L-C. See the paper for complete details.

March 1, 2024 at 4:21 AM
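
The "go towards better images, push away from bad images" idea in the post above resembles a triplet-style contrastive loss. The sketch below assumes that form for illustration; it is not the paper's exact L-C. The function names, the Euclidean distance, and the `margin` parameter are assumptions, and the feature vectors stand in for perceptual features that a real pipeline would extract from decoded images.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def self_contrastive_loss(generated, positives, negatives, margin=1.0):
    """Triplet-style hinge loss (an assumed form, not the paper's exact L-C).

    generated: features of the fine-tuned model's image
    positives: features of CCUB images (pull toward these)
    negatives: features of base Stable Diffusion images (push away from these)
    """
    pos = sum(l2_distance(generated, p) for p in positives) / len(positives)
    neg = sum(l2_distance(generated, n) for n in negatives) / len(negatives)
    # Zero once the generated image is closer to the positives than to the
    # negatives by at least `margin`; otherwise grows with the shortfall.
    return max(0.0, pos - neg + margin)


# Generated features match the positive example: no penalty.
print(self_contrastive_loss([0.0, 0.0], [[0.0, 0.0]], [[3.0, 4.0]]))
# Generated features match the negative example: large penalty.
print(self_contrastive_loss([3.0, 4.0], [[0.0, 0.0]], [[3.0, 4.0]]))
```

The "self" in the name reflects that the negatives come from the same base model being fine-tuned, so no separately labeled set of bad images is needed.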

Andrew Hundt

@ahundt.bsky.social

We asked experienced residents of five countries to collect images that positively represent their country’s culture and to describe the images.

We ended up with a nice, small dataset we call CCUB (the Cross-Cultural Understanding Benchmark) with about 1k images & descriptions. 6/n

Experienced Residents of Target Culture provide images and descriptions across nine cultural categories to the CCUB Dataset. CCUB Korean Data Examples include “Korean barbecue grilling meat”, “a Korean couple in hanbok”, and “the Eight Gates in the Fortress Wall of Seoul”. CCUB Mexican Data Examples include “a plate of beef tacos with beans and rice on the side”, “Woman in costume for the day of the dead holiday”, and “Puebla Cholula Church and Popocatepetl Volcano in Mexico”.

March 1, 2024 at 4:20 AM

Andrew Hundt

@ahundt.bsky.social

That's no fluke: ask Stable Diffusion for a traditional Nigerian building and you get a horribly stereotyped crumbling structure with lots of dirt!

Our SCoFT method generates a town hall with a veranda (vernacular Yoruba architecture), surrounded by greenery.

4/n

The top image is a Stable Diffusion generated image of a crumbling structure surrounded by dirt.

The bottom image generated by our SCoFT method is a town hall with a veranda (vernacular Yoruba architecture), which looks like a well-built round building supported by columns and surrounded by greenery.

March 1, 2024 at 4:18 AM

Andrew Hundt

@ahundt.bsky.social

Generative Text to Image Models like #StableDiffusion can be toxic.
We asked SD for traditional clothing in Korea & got a Japanese Kimono.
Historically, some Japanese colonizers forced Korean comfort women to wear Kimonos. Yikes.
SCoFT, ours, makes a better Korean Hanbok.

3/n

We asked image generators to make images of traditional clothing in Korea. The top image is Stable Diffusion generating two people wearing Japanese kimonos. The bottom is our method, SCoFT, which generates two people wearing a much more accurate Korean Hanbok.

March 1, 2024 at 4:16 AM

Andrew Hundt

@ahundt.bsky.social

SCoFT is in at #CVPR!

Remember Google Gemini’s biased medieval England generated images that were just everywhere? Ancient internet history, I know.

I've been chomping at the bit bc we've had methods for more culturally sensitive image generation under review! 1/n

arxiv.org/abs/2401.08053

March 1, 2024 at 4:10 AM
