Lightnews — Scholar-powered news

Yijia Shao

@echoshao8899.bsky.social

🚨 70 million US workers are about to face their biggest workplace transition due to AI agents. But nobody’s asking them what they want.

While AI R&D races to automate everything, we took a different approach: auditing what workers want vs. what AI can deliver across the US workforce.🧵

June 12, 2025 at 4:34 PM

Reposted by Yijia Shao

Hao Zhu 朱昊

@zhuhao.me

We are getting closer to having agents operating in the real physical world. However, can we trust frontier models to make embodied decisions 🎮 aligned with human norms 👩‍⚖️?

With EgoNormia, a 1.8k ego-centric video 🥽 QA benchmark, we show that this is surprisingly challenging!

March 4, 2025 at 4:32 AM

Yijia Shao

@echoshao8899.bsky.social

🎉 For the first time ever: Collaborate with AI agents in real-time! Collaborative Gym UI is now IRB-approved and live at cogym.saltlab.stanford.edu!

A group of agents is eager to work with you. By providing feedback, you will see the agent's identity and its feedback to you!

February 12, 2025 at 7:24 PM

Reposted by Yijia Shao

Yijia Shao

@echoshao8899.bsky.social

LM agents today primarily aim to automate tasks. Can we turn them into collaborative teammates? 🤖➕👤

Introducing Collaborative Gym (Co-Gym), a framework for enabling & evaluating human-agent collaboration! I've now gotten used to agents proactively seeking confirmation or my deeper thinking. (🧵 with video)

January 17, 2025 at 5:44 PM

Yijia Shao

@echoshao8899.bsky.social

Super fun and easy to play with! Check it out ⬇️

Will Held @williamheld.com · Dec 10

With an increasing number of Large *Audio* Models 🔊, which one do users like the most?

Introducing talkarena.org — an open platform where users speak to LAMs and receive text responses. Through open interaction, we focus on rankings based on user preferences rather than static benchmarks.
🧵 (1/5)

Talk Arena: Interactive Evaluation of Large Audio Models

December 10, 2024 at 5:40 AM

Reposted by Yijia Shao

Stanford NLP Group

@stanfordnlp.bsky.social

PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action
Yijia Shao, Tianshi Li, Weiyan Shi, Yanchen Liu, Diyi Yang
Th, Dec 12, 11:00 PST - Poster Session 3 West

December 9, 2024 at 7:57 PM

Reposted by Yijia Shao

Stanford NLP Group

@stanfordnlp.bsky.social

Papers (partly) from @stanfordnlp at #NeurIPS 2024:

Oral: Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making
Manling Li · Shiyu Zhao · Qineng Wang · Kangrui Wang · … · Weiyu Liu · Percy Liang · Li Fei-Fei · Jiayuan Mao · Jiajun Wu
Wed 11 Dec 11:50 PM UTC [East Ballroom A, B]

December 9, 2024 at 7:57 PM

Yijia Shao

@echoshao8899.bsky.social

Excited to present our PrivacyLens paper at #NeurIPS next week! We explore LM agent privacy risks when deployed as personal assistants. (Details in thread)

I am working on developing LM agents as collaborative research partners, learning aids, personal assistants, and more. Let's connect and chat!!

December 6, 2024 at 6:20 PM

Reposted by Yijia Shao

Stanford NLP Group

@stanfordnlp.bsky.social

Missed some – or all – of our papers at #EMNLP2024?

It's not too late to catch up using this handy list from the Stanford AI Lab blog:

ai.stanford.edu/blog/emnlp-2...

November 18, 2024 at 4:29 PM
