Lightnews — Scholar-powered news

Andrew Wang

@andrewwnlp.bsky.social

370 followers 40 following 4 posts

PhD student @jhuclsp.bsky.social

Andrew Wang

@andrewwnlp.bsky.social

Thanks to my collaborators Sophia Hager, Adi Asija, Nick Andrews, and @danielkhashabi.bsky.social at @jhuclsp.bsky.social !

Arxiv: arxiv.org/abs/2508.11027
Code: github.com/JHU-CLSP/hell-or-high-water
(Data coming soon!)

September 19, 2025 at 2:06 PM

Andrew Wang

@andrewwnlp.bsky.social

More tools = worse at handling tool failures

When tool schemas are provided in-context, we find that the performance gap between adversarial and non-adversarial settings increases with the number of schemas.

September 19, 2025 at 2:05 PM

Andrew Wang

@andrewwnlp.bsky.social

LLM agents do not handle tool failures well

With RAG on tool schemas, we observe a substantial performance gap between adversarial and non-adversarial settings.

September 19, 2025 at 2:04 PM
