Shikhar Murty
shikharmurty.bsky.social
Final year PhD Student in Computer Science @Stanford

Work on:
- Compositionality, syntax (language structure)
- Web Agents: Synthetic data, tree search, exploration (language interpretation)
Reposted by Shikhar Murty
Ever dreamed of AI agents learning, unsupervised, by interacting with the open world? Our latest preprint introduces NNetNav-Live, which collects training data through exploration on real websites and hindsight labeling, producing a SOTA OSS agent.
February 6, 2025 at 7:22 PM
Want to make a browser agent for *any* domain like banking or healthcare?
We propose methods for training LLMs with open-ended, unsupervised interaction on live websites:
✅ OSS SoTA on WebVoyager
✅ world's smallest high-performing web-agent
Try it here: nnetnav.dev
February 6, 2025 at 5:43 PM
going to stay off twitter for my own mental health. something has gone horribly wrong with that platform.
December 28, 2024 at 10:07 PM
Couldn't make it to NeurIPS due to work, but do check out our workshop happening in West Ballroom B. Lots of cool things to come, including a very fun panel!
Super excited today for the System 2 Reasoning at Scale workshop, come join us to discover how to equip AI systems with reasoning that's optimized for renewable energy and not fossil fuel 🔥🚀

⏰When? today, 9am-5:30pm
📍West Ballroom B

s2r-at-scale-workshop.github.io
#NeurIPS2024
December 15, 2024 at 8:29 PM
Reposted by Shikhar Murty
Come visit our poster "MoEUT: Mixture-of-Experts Universal Transformers" on Friday at 4:30 in East Exhibit Hall A-C #1907 at #NeurIPS2024. With Kazuki Irie, Jürgen Schmidhuber, Christopher Potts and @chrmanning.bsky.social.
December 12, 2024 at 10:46 PM
Reposted by Shikhar Murty
The extraordinary recent takeover of ML/AI by #NLP is well-known but insufficiently reflected on.

Look at the @neuripsconf.bsky.social tutorials in 2024!

neurips.cc/virtual/2024...

14 tutorials; 6 have "LLM" in the title; 4 more cover foundation models, with large NLP coverage. That's > 70% 😲
December 9, 2024 at 7:29 PM
Reposted by Shikhar Murty
🚨 Thrilled to share that Compositional Generalization Across Distributional Shifts with Sparse Tree Operations received a spotlight award at #NeurIPS2024! 🌟 I'll present a poster on Tuesday and give an invited lightning talk at the System 2 Reasoning Workshop on Sunday. 🧵👇
December 9, 2024 at 3:06 PM
Reposted by Shikhar Murty
🧵-1
We are thrilled to release #AgentLab, a new open-source package for developing and evaluating web agents. This builds on the new #BrowserGym package which supports 10 different benchmarks, including #WebArena.
December 3, 2024 at 9:02 PM
Folks, I'm not going to be at NeurIPS this year, but we have an *awesome* workshop that I'm super proud of.

Go attend, and use the link below to ask all of your burning questions about LLM reasoning, agents and compositionality!
🎊Excited for #neurips2024 and our "System 2 Reasoning at Scale" workshop. We have an exciting lineup of speakers who will answer your most burning questions about AI and reasoning 🚀

🔥Got spicy questions? Submit & vote here:
app.sli.do/event/dJNU63...
December 3, 2024 at 7:45 PM
Reposted by Shikhar Murty
🎊Excited for #neurips2024 and our "System 2 Reasoning at Scale" workshop. We have an exciting lineup of speakers who will answer your most burning questions about AI and reasoning 🚀

🔥Got spicy questions? Submit & vote here:
app.sli.do/event/dJNU63...
December 3, 2024 at 5:43 PM
I also wear the AI agents researcher hat. Can't say I'm similarly impressed by reviewers in that community...
ACL syntax track reviewers >> almost any other conference.

These folks care about their sub-field and I learn something new every time!
November 27, 2024 at 11:32 PM
ACL syntax track reviewers >> almost any other conference.

These folks care about their sub-field and I learn something new every time!
November 27, 2024 at 7:44 PM
What is a probing task that is purely about semantics?
Context: I have a probe trained to predict dependency relations, and would like to train another one on a semantics-only task (for research purposes).
November 24, 2024 at 5:00 AM
Asked GPT-4o to draw parse trees in two languages:
November 21, 2024 at 5:49 AM
Hot take (since it's still just friends on this platform):

It's crazy how the classic "sample and rerank" baseline from machine translation and IR got rebranded as "scaling up inference-time compute".
November 21, 2024 at 5:06 AM