Lightnews — Scholar-powered news

@evanshieh.bsky.social

17 followers 29 following 4 posts


evanshieh.bsky.social

@evanshieh.bsky.social

(4/n) Nearly three years ago, OpenAI described ChatGPT as a "low-key research preview". Until proven otherwise, continue to treat generative AI models accordingly.

(Thank you to organizers and reviewers at the AAAI Summer Symposium Series 2025 for supporting our work!)

August 5, 2025 at 12:13 AM

evanshieh.bsky.social

@evanshieh.bsky.social

(3/n) This narrow focus overlooks the source of many AI harms reported to date: the models themselves. As generative AI expands in intimate consumer domains like therapy and education, we must demand access for independent researchers and public regulators to perform better real-world evaluations.

August 5, 2025 at 12:12 AM

evanshieh.bsky.social

@evanshieh.bsky.social

(2/n) So what do companies actually measure in technical reports? As the title of our paper suggests, self-audits focus more on nefarious actors than everyday consumers ("seeing red"), and they prefer consulting language models in situations where contextual human input is vital ("teaching parrots").

August 5, 2025 at 12:10 AM
