Arjun Guha
guha-anderson.com
At Gimme! Coffee -- my favorite New York State coffee chain!
August 30, 2025 at 2:40 PM
1850s baseball.
June 22, 2025 at 6:37 PM
Looking forward to this.
April 15, 2025 at 7:13 PM
Photo taken today at @browncsdept.bsky.social. I'm glad to see that the PhD students (@genevievemp.bsky.social), furniture, and faculty seem to have not changed in 10+ years.
March 5, 2025 at 8:06 PM
However, many problems are so hard that reasoning models “give up” – they output solutions that they know are wrong or argue that the problem is impossible to solve. In some cases, R1 gets stuck “thinking forever”. (See this example of R1 getting “frustrated.”)
February 4, 2025 at 2:37 AM
Our benchmark reveals capability gaps and failure modes that are not evident in existing benchmarks. E.g., we find that o1 is significantly better at these tasks than other reasoning models.
February 4, 2025 at 2:37 AM
In short, we turn the weekly puzzles from the NPR Sunday Puzzle Challenge into a machine-checkable benchmark. These are hard problems, typically solved by a few hundred people a week. But the answers are obvious when revealed (to U.S. adults).
February 4, 2025 at 2:37 AM
Last one: there are a LOT of people to blame for this one. I think @jasvir.bsky.social is to blame for this problem in "Humanity's Last Exam".
January 28, 2025 at 6:29 PM
Ugh, who did this? @joepolitz.bsky.social ? Wait, was it @dbp.bsky.social ? Someone else from @shriram.bsky.social's group?

Also from "Humanity's Last Exam".
January 28, 2025 at 3:51 PM
OK, who is responsible for this? Is it @natefoster.bsky.social?

Source: "Humanity's Last Exam" www.nytimes.com/2025/01/23/t...
January 28, 2025 at 3:47 PM
Although React is now very different from our old work in this space, there is a direct, acknowledged connection to some of our low-level techniques from that time:

x.com/hupp/status/...
January 24, 2025 at 10:30 AM
I did manage to hand-roll a bad parser, which was a small leap from regular expressions, which were in C# by that time. However, I failed to figure out how to represent sum types (i.e., for values). My final note from 25 years ago says:
January 16, 2025 at 12:35 PM
Counterintuitively, this shows the power of GC. IIRC I've heard of similar tricks used for high-performance OCaml at Jane Street.
January 11, 2025 at 2:26 AM