Arxiv: arxiv.org/abs/2508.11027
Code: github.com/JHU-CLSP/hell-or-high-water
(Data coming soon!)
Arxiv: arxiv.org/abs/2508.11027
Code: github.com/JHU-CLSP/hell-or-high-water
(Data coming soon!)
When tool schemas are provided in-context, we find that performance gaps between adversarial and non-adversarial settings increases with the number of schemas.
When tool schemas are provided in-context, we find that performance gaps between adversarial and non-adversarial settings increases with the number of schemas.
With RAG on tool schemas, we observe a substantial performance gap between adversarial and non-adversarial settings.
With RAG on tool schemas, we observe a substantial performance gap between adversarial and non-adversarial settings.