Sebastian Joseph
sebajoe.bsky.social
CS Ph.D. Student at UT Austin
How good are LLMs at 🔭 scientific computing and visualization 🔭?

AstroVisBench tests how well LLMs implement scientific workflows in astronomy and visualize results.

SOTA models like Gemini 2.5 Pro & Claude 4 Opus only match ground truth scientific utility 16% of the time. 🧵
June 2, 2025 at 3:42 PM