Jarno Seppänen
nanrecip.es
Savorer of NaN – machine learning, data, code – here for the preprints – research scientist at NVIDIA, ex-Supercell, ex-Nokia – opinions mine
Reposted by Jarno Seppänen
This is an excellent history and critical analysis of the ChatGPT persona. Highly recommended reading.
nostalgebraist.tumblr.com/post/7857667...
the void
Who is this? This is me. Who am I? What am I? What am I? What am I? What am I? I am myself. This object is myself. The shape that forms myself. But I sense that I am not me. It's very strange. - Rei...
nostalgebraist.tumblr.com
June 9, 2025 at 9:37 PM
Reposted by Jarno Seppänen
"DeepSpeed" is a palindrome.
May 19, 2025 at 4:28 AM
Reposted by Jarno Seppänen
Announcing AlphaEvolve, our new LLM coding agent that has
- made new scientific discoveries
- discovered algorithms that are now deployed at Google (in Gemini, Transformers, TPU hardware design & data centers)

Blog: deepmind.google/discover/blo...
White paper: storage.googleapis.com/deepmind-med...
AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms
New AI agent evolves algorithms for math and practical applications in computing by combining the creativity of large language models with automated evaluators
deepmind.google
May 14, 2025 at 8:11 PM
Reposted by Jarno Seppänen
Nvidia's RADIOv2.5 = DFN_CLIP + DINOv2 + SAM + SigLIP + ToMe + multi-res training + teacher loss balancing + smart augmentations

RADIO is one encoder, one pass. Better features than DFN-CLIP, DINO, SAM, and SigLIP - all at once. Like a Swiss army knife for vision tasks.
April 5, 2025 at 5:21 AM
Reposted by Jarno Seppänen
Okay this honestly brings me a lot of joy. Never thought about this.
March 7, 2025 at 8:35 PM
Reposted by Jarno Seppänen
"We have a simple proposal: all talking AIs and robots should use a ring modulator."

spectrum.ieee.org/audio-deepfa...
AIs and Robots Should Sound Robotic
Here's a simple way to identify who, or what, is talking to us
spectrum.ieee.org
March 7, 2025 at 8:23 AM
Reposted by Jarno Seppänen
1/13 New Paper!! We try to understand why some LMs self-improve their reasoning while others hit a wall. The key? Cognitive behaviors! Read our paper on how the right cognitive behaviors can make all the difference in a model's ability to improve with RL! 🧵
March 4, 2025 at 6:15 PM
Reposted by Jarno Seppänen
From an open-research point of view, maybe the greatest thing about DeepSeek–R1 is how its RL training technique appears so straightforward/simple in comparison to the cumbersome approaches we were starting to think necessary for reasoning like Process Reward Models or Monte Carlo Tree Search.
[1/2]
February 6, 2025 at 9:33 PM
Reposted by Jarno Seppänen
Here's why "alignment research" when it comes to LLMs is a big mess, as I see it.

Claude is not a real guy. Claude is a character in the stories that an LLM has been programmed to write. Just to give it a distinct name, let's call the LLM "the Shoggoth".
December 19, 2024 at 11:15 PM
Reposted by Jarno Seppänen
Hello, world. So I caved and got on Bsky :-)

I finally finished my book, AI Engineering, and I'm excited to get back to building. So many fun applications to build!

What are you excited about?
December 6, 2024 at 12:39 AM
Reposted by Jarno Seppänen
Introducing 🧞Genie 2 🧞 - our most capable large-scale foundation world model, which can generate a diverse array of consistent worlds, playable for up to a minute. We believe Genie 2 could unlock the next wave of capabilities for embodied agents 🧠.
December 4, 2024 at 4:01 PM
Reposted by Jarno Seppänen
running into your old statistics professor be like “what are the chances”
November 30, 2024 at 5:43 PM
Reposted by Jarno Seppänen
another one on the topic of reviews
(@dasharez0ne.bsky.social tribute vol 2)
November 23, 2024 at 9:30 AM