Yijia Shao
echoshao8899.bsky.social
CS PhD student @StanfordNLP
https://cs.stanford.edu/~shaoyj/
🚨 70 million US workers are about to face their biggest workplace transition due to AI agents. But nobody’s asking them what they want.

While AI R&D races to automate everything, we took a different approach: auditing what workers want vs. what AI can deliver across the US workforce.🧵
June 12, 2025 at 4:34 PM
Reposted by Yijia Shao
We are getting closer to having agents operate in the real physical world. However, can we trust frontier models to make embodied decisions 🎮 aligned with human norms 👩‍⚖️?

With EgoNormia, a 1.8k ego-centric video 🥽 QA benchmark, we show that this is surprisingly challenging!
March 4, 2025 at 4:32 AM
🎉 For the first time ever: Collaborate with AI agents in real time! Collaborative Gym UI is now IRB-approved and live at cogym.saltlab.stanford.edu!

A group of agents is eager to work with you. By providing feedback, you will see the agent's identity and its feedback to you!
February 12, 2025 at 7:24 PM
Reposted by Yijia Shao
LM agents today primarily aim to automate tasks. Can we turn them into collaborative teammates? 🤖➕👤

Introducing Collaborative Gym (Co-Gym), a framework for enabling & evaluating human-agent collaboration! I have gotten used to agents proactively seeking confirmations or my deep thinking. (🧵 with video)
January 17, 2025 at 5:44 PM
Super fun and easy to play with! Check it out ⬇️
With an increasing number of Large *Audio* Models 🔊, which one do users like the most?

Introducing talkarena.org — an open platform where users speak to LAMs and receive text responses. Through open interaction, we focus on rankings based on user preferences rather than static benchmarks.
🧵 (1/5)
December 10, 2024 at 5:40 AM
Reposted by Yijia Shao
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action
Yijia Shao, Tianshi Li, Weiyan Shi, Yanchen Liu, Diyi Yang
Th, Dec 12, 11:00 PST - Poster Session 3 West
December 9, 2024 at 7:57 PM
Reposted by Yijia Shao
Papers (partly) from @stanfordnlp at #NeurIPS 2024:

Oral: Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making
Manling Li · Shiyu Zhao · Qineng Wang · Kangrui Wang · … · Weiyu Liu · Percy Liang · Li Fei-Fei · Jiayuan Mao · Jiajun Wu
Wed 11 Dec 11:50 PM UTC [East Ballroom A, B]
December 9, 2024 at 7:57 PM
Excited to present our PrivacyLens paper at #NeurIPS next week! We explore LM agent privacy risks when deployed as personal assistants. (Details in thread)

I am working on developing LM agents as collaborative research partners, learning aids, personal assistants, and more. Let's connect and chat!!
December 6, 2024 at 6:20 PM
Reposted by Yijia Shao
Missed some – or all – of our papers at #EMNLP2024?

It's not too late to catch up using this handy list from the Stanford AI Lab blog:

ai.stanford.edu/blog/emnlp-2...
November 18, 2024 at 4:29 PM