samuelschmidgall.bsky.social
@samuelschmidgall.bsky.social
PhD at Johns Hopkins University and Researcher at Google DeepMind working on LLM agents
🎉Read the preprint: agentrxiv.github.io
Try out AgentRxiv: github.com/SamuelSchmid...
Let’s explore how agents can accelerate research—together.
🧵8/8
March 24, 2025 at 2:25 PM
🛡️Research agents and their labs, while promising, are still not at human-level quality. By channeling their work into AgentRxiv—a dedicated hub for autonomous research—we’re also safeguarding the quality of human research on arXiv.
🧵7/8
March 24, 2025 at 2:25 PM
✨ In parallel experiments with 3 independent labs sharing pre-prints through AgentRxiv, the best method achieved 79.8% accuracy—a 13.7% relative improvement—while reaching key milestones faster than in sequential experiments.
🧵6/8
March 24, 2025 at 2:25 PM
🏥 We also wondered how well the methods our agents discovered perform on out-of-domain benchmarks (MMLU-Pro, GPQA, & MedQA) and with five other language models. We find the top performing algorithm SDA improves across these benchmarks on average by 3.3%.
🧵5/8
March 24, 2025 at 2:25 PM
🥇We perform experiments where agents are asked to develop new reasoning techniques on MATH-500. We find that when agents are given access to previous research, accuracy improves from 70.2% to 78.2%, an 11.4% relative improvement over the gpt-4o mini baseline and 9.7% over gpt-4o mini with CoT.
🧵4/8
March 24, 2025 at 2:25 PM
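For readers who want to check the percentages in the 4/8 post above, here is a minimal sketch of the arithmetic. The function name `relative_improvement` is made up for illustration. The CoT accuracy is not stated in the thread; the value computed here is only what the quoted 9.7% figure would imply.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# Accuracies quoted in the thread (MATH-500, in percent).
baseline_acc = 70.2          # gpt-4o mini baseline
with_prior_research = 78.2   # agents with access to previous research
parallel_best = 79.8         # best method from the 3 parallel labs (post 6/8)

sequential_gain = relative_improvement(baseline_acc, with_prior_research)
parallel_gain = relative_improvement(baseline_acc, parallel_best)

# Assumption: back out the CoT accuracy implied by "9.7% over gpt-4o mini with CoT".
# This number is derived here, not reported in the posts.
implied_cot_acc = with_prior_research / (1 + 9.7 / 100.0)

print(f"sequential: {sequential_gain:.1f}% relative improvement")
print(f"parallel:   {parallel_gain:.1f}% relative improvement")
print(f"implied CoT accuracy: about {implied_cot_acc:.1f}%")
```

Both quoted figures (11.4% and 13.7%) come out of the same 70.2% baseline, which suggests the sequential and parallel results are measured against the same starting point.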
To address this, we introduce AgentRxiv—a framework that lets LLM agent laboratories upload and retrieve reports from a shared preprint server in order to collaborate, share insights, and iteratively build on each other’s research.
🧵3/8
March 24, 2025 at 2:25 PM
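The upload-and-retrieve idea in the 3/8 post above can be pictured with a small in-memory sketch. Everything here is hypothetical: the `Preprint` and `PreprintServer` names, the fields, the sample titles, and the keyword-overlap ranking are stand-ins for illustration and are not taken from the AgentRxiv code.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Preprint:
    lab: str        # which agent laboratory wrote the report
    title: str
    abstract: str


@dataclass
class PreprintServer:
    """Hypothetical shared store that several agent labs write to and read from."""
    papers: list[Preprint] = field(default_factory=list)

    def upload(self, paper: Preprint) -> None:
        self.papers.append(paper)

    def search(self, query: str, top_k: int = 3) -> list[Preprint]:
        # Assumption: rank by simple word overlap between query and title+abstract.
        terms = set(query.lower().split())
        scored = []
        for paper in self.papers:
            words = set(f"{paper.title} {paper.abstract}".lower().split())
            overlap = len(terms & words)
            if overlap:
                scored.append((overlap, paper))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [paper for _, paper in scored[:top_k]]


# Made-up example data: two labs share reports, then one lab searches before its next run.
server = PreprintServer()
server.upload(Preprint("lab_a", "Answer voting for math problems",
                       "Sampling several answers and voting on the result."))
server.upload(Preprint("lab_b", "Shorter prompts for code tasks",
                       "Trimming instructions before generation."))

hits = server.search("voting math")
print([p.title for p in hits])
```

A lab would call `search` at the start of a project to pull in what other labs have already reported, and `upload` once its own report is finished.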
There has been a lot of recent excitement around autonomous LLM agents performing research, with several fully autonomous works being accepted into ICLR 2025 📚

‼️The problem is that these systems work in isolation without the ability to build on their research.
🧵2/8
March 24, 2025 at 2:25 PM
👩‍💻 All of the code is completely open-source! Below are links to the website, paper, and github! Check it out.

website: agentlaboratory.github.io
paper: arxiv.org/pdf/2501.04227
github: github.com/SamuelSchmidga…
Agent Laboratory: Using LLMs as Research Assistants
by Samuel Schmidgall at JHU
agentlaboratory.github.io
February 27, 2025 at 5:25 PM
Agent Laboratory consists of three primary phases that guide the research process: (1) Literature Review, (2) Experimentation, and (3) Report Writing. During each phase, LLM agents collaborate, integrating tools like arXiv, Hugging Face, Python, and LaTeX.
February 27, 2025 at 5:25 PM
Reposted
These aren’t totally hypothetical questions. Currently, the US is in the process of trashing its wildly successful science funding system. NIH, which funds tens of billions of dollars of research each year, has been estimated to generate around $2.50 of economic activity for every $1 funded:
Impacts of NIH Funding on the US Economy, Jobs and Better Health
The U.S. National Institutes of Health is the largest single public funder of biomedical and behavioral research in the world. NIH activities and funding are major drivers of the United States’ compet...
wewillcure.com
February 23, 2025 at 10:16 AM
An LLM that makes decisions that have consequences in an external environment with temporal dependencies (?)
December 4, 2024 at 9:57 PM
Lol true
November 24, 2024 at 11:30 PM
Hello. Please add me as well!! 👋
November 24, 2024 at 12:13 AM