The dream of “autonomous AI scientists” is tempting:
machines that generate hypotheses, run experiments, and write papers. But science isn’t just automation.
cichicago.substack.com/p/the-mirage...
🧵
Code is submitted but rarely executed during peer review—an issue likely to worsen with research agents. 🧑‍🔬
We introduce 𝐌𝐞𝐜𝐡𝐄𝐯𝐚𝐥𝐀𝐠𝐞𝐧𝐭, an execution-grounded evaluation of narrative + execution. 𝐕𝐞𝐫𝐢𝐟𝐲 𝐭𝐡𝐞 𝐬𝐜𝐢𝐞𝐧𝐜𝐞, 𝐧𝐨𝐭 𝐣𝐮𝐬𝐭 𝐭𝐡𝐞 𝐬𝐭𝐨𝐫𝐲.
1/n
You can tune in on either
Zoom: uchicago.zoom.us/j/9897879984...
YouTube: www.youtube.com/@AIScientifi...
Key question: when we actually roll out AI tools, how do people use them? Do they just defer completely? Does it improve productivity and ability?
We look in the medical setting of pulmonary embolisms
paulgp.com/papers/Radio...
Two, no, not "everyone is doing it" and those who don't do it aren't "losing out." ChatGPT is not an advantage.
arxiv.org/abs/2601.04253
Featuring Yisong Yue, Professor of Computing and Mathematical Sciences at @caltech.edu
You can participate online or in-person at DSI. Learn more at ai-scientific-discovery.github.io
Chenhao: "Finding the equilibrium of publishing will take at least a decade."
25% agree, 75% disagree
I can finally read my great-grandfather's epitaph. Try it:
davidbau.com/archives/202...
Why the change? Thread -->
1/n
You can tune in on either
Zoom: uchicago.zoom.us/j/9897879984...
YouTube: youtube.com/@AIScientifi...
Hope to see you soon!
We’re kicking off Winter Quarter with an 🔥 lineup, starting in two hours!
🧵