Website astrovisbench.github.io
Paper arxiv.org/abs/2505.20538
@utaustin.bsky.social @simonsfoundation.org @uvapress.bsky.social @nyupress.bsky.social @taccutexas.bsky.social @noirlabastro.bsky.social
-interact with a variety of data formats to create diverse visualizations that comply with expert standards
-aid scientists in the middle of their own workflows, when they have no step-by-step procedure to follow and may not know in advance what scientific utility a visualization would bring.
🔎 Findings: Even the best LLMs struggle to execute scientific workflows.