Arvind Nagaraj
@narvind.bsky.social
Deep Learning | ML research | Ex-Robotics at Invento | 🔗 https://narvind2003.github.io

Here strictly to talk about ML, NNs and related ideas. Casual stuff on x.com/nagaraj_arvind
The Hierarchical Reasoning Model (HRM) isn't just another model. It's a deep synthesis: it marries the iterative soul of an RNN (minus the BPTT nightmare) with the raw power of modern attention.
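Roughly what that marriage looks like, as a minimal sketch of my reading of the paper (not the official implementation; the module names, state updates, and loop counts here are my assumptions): two attention-based recurrent modules iterate at different timescales, and BPTT is avoided by detaching every iteration except the last one.

```python
import torch
import torch.nn as nn

class HRMSketch(nn.Module):
    """Toy two-timescale recurrent model: a fast low-level module and a
    slow high-level module, both plain attention blocks."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.low = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.high = nn.TransformerEncoderLayer(dim, heads, batch_first=True)

    def forward(self, x, n_high=2, n_low=2):
        zH = torch.zeros_like(x)  # slow, high-level state
        zL = torch.zeros_like(x)  # fast, low-level state
        # Iterate WITHOUT building a BPTT graph...
        with torch.no_grad():
            for _ in range(n_high):
                for _ in range(n_low):
                    zL = self.low(zL + zH + x)
                zH = self.high(zH + zL)
        # ...then one final pass with gradients: the "1-step" approximation
        # that replaces backprop-through-time.
        zL = self.low(zL + zH + x)
        zH = self.high(zH + zL)
        return zH
```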
August 7, 2025 at 8:50 AM
Then, last month, a paper dropped that changes everything.
This is the architecture I've been waiting for since 2018. A thread on HRM. 🧵
August 7, 2025 at 8:50 AM
For years, I died a little inside every time I taught the Transformer model, grudgingly accepting that the elegant loop of the RNN was dead.
August 7, 2025 at 8:50 AM
I like how the new Gemini 2.0 thinking model insists like a child... lol
December 19, 2024 at 6:38 PM
Why does ChatGPT refuse to say "David Mayer"?? 🤔
I've tried a bunch of ways and it refuses every time!! 😭
December 1, 2024 at 6:38 AM
Casio?
November 26, 2024 at 5:28 PM
😮 Lost it for a moment... luckily, the good people on the interwebs make copies of important files!
November 26, 2024 at 2:01 PM
This thing's "mind" looks like a ferret on crack 🤣
I had to explain it to the poor thing!
November 26, 2024 at 6:47 AM
o1-style reasoning can solve most of these. But 8 is tricky.
DeepSeek's model (inner-monologue thinking tokens) is super interesting to watch. But the CoT trajectories take it to 2 incorrect solutions before it runs out of thinking time: it either adds an extra 8 or uses cube roots.
Can't nest like 👇
November 26, 2024 at 6:39 AM
Looks like it takes a special system prompt to trigger that behavior! huggingface.co/AIDC-AI/Marc...
November 23, 2024 at 3:26 AM
The model spits out straight answers. No reasoning steps of any sort. The repo code is just a standard HF model+tokenizer load and autoregressive decoding.
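For reference, this is all the repo appears to amount to (my reproduction of the pattern, not their exact script; the model id is assumed from the HF link in this thread):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "AIDC-AI/Marco-o1"  # assumed from the (truncated) HF link above
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

# Plain autoregressive decoding -- nothing reasoning-specific happens here.
inputs = tok("How many r's are in 'strawberry'?", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```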
November 22, 2024 at 3:42 PM
The phrase "Wait! Maybe I made some mistakes! I need to rethink from scratch." allows for expansion of the "thoughts tree". It looks like this self-critic step(inspired from the popular ASU papers) adds additional search paths for MCTS to explore.
November 22, 2024 at 3:42 PM
I'm trying to understand what's going on with Alibaba's new Marco-o1 model.
At first glance it looks like they managed to introduce valuable "reasoning" tokens, compute a reward score (softmax over the top-5 candidate token logits, taking the probability of the chosen token), and finally added a reflection phrase.
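That reward score, as I understand it (a hedged sketch of my reading, not their code): per generated token, softmax over just the top-5 logits, take the probability landing on the token actually chosen, and average over the sequence.

```python
import torch

def confidence(logits, chosen):
    """logits: [T, vocab] per-step logits; chosen: [T] sampled token ids."""
    top5_vals, top5_idx = logits.topk(5, dim=-1)   # [T, 5]
    probs = torch.softmax(top5_vals, dim=-1)       # softmax over top-5 only
    # Probability mass on the chosen token (0 if it fell outside the top-5).
    match = top5_idx == chosen.unsqueeze(-1)       # [T, 5] boolean mask
    p_chosen = (probs * match).sum(-1)             # [T]
    return p_chosen.mean()                         # scalar reward
```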
November 22, 2024 at 3:42 PM