Lightnews — Scholar-powered news

Arvind Nagaraj

@narvind.bsky.social

230 followers 950 following 120 posts

Deep Learning | ML research |
Ex.Robotics at Invento | 🔗 https://narvind2003.github.io

Here to strictly talk about ML, NNs and related ideas. Casual stuff on x.com/nagaraj_arvind


Arvind Nagaraj

@narvind.bsky.social

The Hierarchical Reasoning Model (HRM) isn't just another model. It's a deep synthesis. It marries the iterative soul of an RNN (minus the BPTT nightmare) with the raw power of modern Attention.

August 7, 2025 at 8:50 AM

Arvind Nagaraj

@narvind.bsky.social

Then, last month, a paper dropped that changes everything.
This is the architecture I've been waiting for since 2018. A thread on HRM. 🧵

August 7, 2025 at 8:50 AM

Arvind Nagaraj

@narvind.bsky.social

For years, I died a little inside every time I taught the Transformer model, grudgingly accepting that the elegant loop of the RNN was dead.

August 7, 2025 at 8:50 AM

Arvind Nagaraj

@narvind.bsky.social

I like how the new Gemini 2.0 thinking model insists like a child...lol

December 19, 2024 at 6:38 PM

Arvind Nagaraj

@narvind.bsky.social

Why does ChatGPT refuse to say "David Mayer" ?? 🤔
I have tried a bunch of ways and it refuses to!! 😭

December 1, 2024 at 6:38 AM

Arvind Nagaraj

@narvind.bsky.social

Casio?

November 26, 2024 at 5:28 PM

Arvind Nagaraj

@narvind.bsky.social

😮 Lost it for a moment....luckily, the good people on the interwebs make copies of important files!

November 26, 2024 at 2:01 PM

Arvind Nagaraj

@narvind.bsky.social

This thing's "mind" looks like a ferret on crack 🤣
I had to explain it to the poor thing!

November 26, 2024 at 6:47 AM

Arvind Nagaraj

@narvind.bsky.social

o1 style reasoning can solve most of these. But 8 is tricky.
The inner-monologue thinking tokens of DeepSeek's model are super interesting to watch. But the CoT trajectories take it to 2 incorrect solutions before it runs out of thinking time: it either adds an extra 8 or uses cube roots.
Can't nest like👇

November 26, 2024 at 6:39 AM

Arvind Nagaraj

@narvind.bsky.social

Looks like it takes a special system prompt to trigger that behavior! huggingface.co/AIDC-AI/Marc...

November 23, 2024 at 3:26 AM

Arvind Nagaraj

@narvind.bsky.social

The model spits out straight answers. No reasoning steps of any sort. The git repo code is just a straight HF model + tokenizer load as usual, followed by autoregressive decoding.

November 22, 2024 at 3:42 PM

Arvind Nagaraj

@narvind.bsky.social

The phrase "Wait! Maybe I made some mistakes! I need to rethink from scratch." allows for expansion of the "thoughts tree". It looks like this self-critic step (inspired by the popular ASU papers) adds additional search paths for MCTS to explore.

November 22, 2024 at 3:42 PM

Arvind Nagaraj

@narvind.bsky.social

I'm trying to understand what's going on with Alibaba's new Marco-o1 model.
At first glance it looked like they managed to introduce valuable "reasoning" tokens, compute a reward score (softmax of the argmax token's logprob over the top 5 candidate tokens), and finally add a reflection phrase.

November 22, 2024 at 3:42 PM
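
The reward score described in the post above (softmax of the chosen token's logprob over the top 5 candidate tokens) could be sketched roughly as follows. This is a minimal illustration, not Marco-o1's actual code: the function names `token_confidence` and `rollout_reward` are hypothetical, and averaging the per-token scores into one reward for a rollout is an assumption that the post itself does not state.

```python
import math

def token_confidence(chosen_logprob, top_logprobs):
    """Softmax of the chosen token's logprob over the top-k candidate logprobs.

    top_logprobs is assumed to hold the logprobs of the top 5 candidates,
    including the chosen (argmax) token itself.
    """
    m = max(top_logprobs)  # subtract the max for numerical stability
    denom = sum(math.exp(lp - m) for lp in top_logprobs)
    return math.exp(chosen_logprob - m) / denom

def rollout_reward(steps):
    """Assumption: the reward of a rollout is the mean per-token confidence.

    steps is a list of (chosen_logprob, top_logprobs) pairs, one per token.
    """
    scores = [token_confidence(chosen, top) for chosen, top in steps]
    return sum(scores) / len(scores)

# Hypothetical numbers: one confident token and one uncertain token.
confident = [math.log(p) for p in (0.90, 0.04, 0.03, 0.02, 0.01)]
uncertain = [math.log(0.2)] * 5
reward = rollout_reward([(confident[0], confident), (uncertain[0], uncertain)])
```

A token whose top 5 candidates are nearly tied scores close to 0.2, while a token the model is sure about scores close to 1, so a search procedure can prefer rollouts made of confident tokens.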
