Christian S. Perone
@cperone.bsky.social
http://blog.christianperone.com, Machine Learning, Computer Science and Math. Staff ML Research Engineer working with imitation learning and planning for Autonomous Vehicles. London/UK.
Gemma3n was released a few months ago. I wasn't able to find much info about it, and I found it a *very interesting* architecture with a lot of innovations (Matryoshka Transformer, MobileNetV5, etc), so I decided to dig further. Here are the slides of the talk: drive.google.com/file/d/15hbh...
November 7, 2025 at 2:05 PM
Reposted by Christian S. Perone
Announcing the @iccv.bsky.social NAVSIM Challenge! What's new? We're testing not only on real recordings, but also perturbed futures generated from the real ones via pseudo-simulation! $8K in prizes + several $1.5k travel grants. Submit by September 20! opendrivelab.com/challenge2025/ 🧵👇
September 1, 2025 at 9:14 AM
I feel I could build an entire benchmark dataset out of ONNX errors that would be harder than the Humanity's Last Exam dataset for evaluating AGI
July 31, 2025 at 1:07 PM
Reposted by Christian S. Perone
Guillermo del Toro - Studio Ghibli Masterclass (2013, TIFF festival) ❤️
Now online (80 minutes) >> www.youtube.com/watch?v=q8Uo...
July 25, 2025 at 1:35 PM
"Optimizers Qualitatively Alter Solutions And We Should Leverage This" (arxiv.org/abs/2507.12224), very nice to see this direction of understanding what different optimizers bring in terms of solution properties.
July 22, 2025 at 8:48 AM
I find it incredible how much of the evolution of Machine Learning in the past decade relates to what Simondon described in 1958. The shift towards more generalist systems is exactly what Simondon's concept of "concretization" is about. 1/2
July 21, 2025 at 8:54 AM
I made a diagram on how you can use a World Model with Diffusion Elites:
July 19, 2025 at 10:01 AM
Reposted by Christian S. Perone
This is absolutely *charming* in its straightforwardness.
All kinds of bells and whistles suggest themselves at once, but this gives "really strong baseline for the general case" vibes.
New blog post: "Diffusion Elites: surprisingly good, simple and embarrassingly parallel", blog.christianperone.com/2025/07/diff...
July 18, 2025 at 12:43 AM
Reposted by Christian S. Perone
I'll be at ICML this week, presenting our paper on Wasserstein Policy Optimization on Tuesday! If you're in Vancouver, come say hi!
July 14, 2025 at 8:16 AM
New blog post: "Diffusion Elites: surprisingly good, simple and embarrassingly parallel", blog.christianperone.com/2025/07/diff...
July 9, 2025 at 9:14 PM
"Chip Placement with Diffusion Models" (openreview.net/pdf?id=crCPL...) very cool paper.
June 27, 2025 at 12:52 PM
I'm getting addicted to animations of Langevin sampling with fixed rng and varying params.
June 16, 2025 at 7:28 PM
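For readers who want to try the same kind of experiment, here is a minimal sketch of Langevin sampling with a fixed random seed and a varying step size. It assumes a one-dimensional standard normal target and the unadjusted Langevin update; the target, the function names and the step sizes are illustrative choices, not taken from the post.

```python
import math
import random


def score(x, mu=0.0, sigma=1.0):
    """Gradient of log N(x; mu, sigma^2) with respect to x (assumed target)."""
    return -(x - mu) / sigma**2


def langevin_chain(step_size, n_steps=1000, x0=3.0, seed=0):
    """Unadjusted Langevin: x <- x + step * score(x) + sqrt(2 * step) * noise."""
    rng = random.Random(seed)  # fixed seed: every run sees the same noise draws
    x = x0
    xs = [x]
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step_size * score(x) + math.sqrt(2.0 * step_size) * noise
        xs.append(x)
    return xs


# Same noise sequence, different step sizes: only the parameter changes.
chains = {eps: langevin_chain(eps) for eps in (0.01, 0.05, 0.2)}
for eps, xs in chains.items():
    tail = xs[len(xs) // 2:]  # discard the first half as burn-in
    print(f"step={eps}: tail mean={sum(tail) / len(tail):.3f}")
```

Because the seed is fixed, differences between the chains come only from the step size, which is what makes side-by-side animations of the trajectories easy to compare.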
Given the number of different definitions of world models, at this point, I think I can call any model a world model.
June 3, 2025 at 4:52 PM
If you change the tensor in PyTorch, it will change the tensor in Jax, Numpy, PyTorch and Tensorflow 😅
May 30, 2025 at 7:27 PM
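The behaviour in the post can be illustrated with a stdlib-only analogue, assuming it refers to several frameworks holding zero-copy views over one buffer (an assumption; the post does not name the mechanism). The sketch uses Python's buffer protocol rather than any framework API, with `array` standing in for tensor storage.

```python
from array import array

# One underlying buffer of four float32 values, standing in for tensor storage.
storage = array("f", [1.0, 2.0, 3.0, 4.0])

# Two zero-copy views over the same memory, standing in for two frameworks' tensors.
view_a = memoryview(storage)
view_b = memoryview(storage)

# An explicit copy owns its own memory and is unaffected by later writes.
copied = array("f", storage)

view_a[0] = 42.0  # write through one view...

print(view_b[0])   # ...the other view sees the change
print(storage[0])  # ...and so does the owner of the buffer
print(copied[0])   # the copy still holds the old value
```

The design point is that a view carries no data of its own, so whether a change propagates depends on whether the conversion between libraries copied the buffer or shared it.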
After a lot of issues with power distribution, the first panel of TorchStation proto is finally here 😉 node sel. for distributed training is coming. You will soon have an open-source and open-hardware @pytorch.org distributed training monitor on your desk: www.youtube.com/watch?v=D7po...
TorchStation - Development proto v1
YouTube video by Christian S. Perone
May 27, 2025 at 2:17 PM
We are hiring 2 Machine Learning Engineers in London/UK 🇬🇧 to work with end-to-end automated driving.
➡️ Senior Machine Learning Engineer
woven.toyota/en/careers/d...
➡️ Machine Learning Engineer
woven.toyota/en/careers/d...
We sponsor visas as well!
May 7, 2025 at 12:45 PM
VectorVFS is on the Hacker News front page 🤟
Show HN: VectorVFS, your filesystem as a vector database | Discussion
VectorVFS: Your Filesystem as a Vector Database
vectorvfs.readthedocs.io
May 5, 2025 at 4:04 PM
Introducing VectorVFS, your filesystem as a vector database: github.com/perone/vecto....
VectorVFS stores embeddings directly into filesystem inodes. No external index, daemon, database or metadata files. The first model supported is the SOTA Perception Encoder from Meta.
GitHub - perone/vectorvfs: Your filesystem is a vector database
Your filesystem is a vector database. Contribute to perone/vectorvfs development by creating an account on GitHub.
April 28, 2025 at 9:31 PM
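To make the idea of keeping embeddings with the file itself concrete, here is a minimal sketch that assumes Linux extended attributes as the storage mechanism (an assumption; the post only says "inodes"). The attribute name, the byte layout and all function names are hypothetical and are not VectorVFS's actual format or API.

```python
import os
import struct

XATTR_NAME = "user.demo.embedding"  # hypothetical attribute name, not VectorVFS's


def pack_embedding(vec):
    """Serialize as a little-endian uint32 length followed by float32 values."""
    return struct.pack(f"<I{len(vec)}f", len(vec), *vec)


def unpack_embedding(blob):
    """Inverse of pack_embedding: read the length header, then the floats."""
    (n,) = struct.unpack_from("<I", blob, 0)
    return list(struct.unpack_from(f"<{n}f", blob, 4))


def write_embedding(path, vec):
    """Attach the embedding to the file (Linux only; needs user xattr support)."""
    os.setxattr(path, XATTR_NAME, pack_embedding(vec))


def read_embedding(path):
    """Read the embedding back from the file's extended attribute."""
    return unpack_embedding(os.getxattr(path, XATTR_NAME))
```

With this layout, `write_embedding("photo.jpg", vec)` followed by `read_embedding("photo.jpg")` would round-trip the vector without any sidecar file, provided the filesystem accepts user attributes of that size; limits on attribute size vary by filesystem.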
New open-source project coming out soon 🤞
April 28, 2025 at 10:14 AM
I think the Google Search Appliance (GSA) was a nice concept that suffered from unfortunate timing. Imagine it today with on-premise LLMs, multi-modal document indexing and modern retrieval. All local w/o any data sent to the cloud. I really want to develop a prototype w/ a Jetson.
April 7, 2025 at 1:39 PM
iRoPE on Llama 4 seems very interesting, some clever tricks there.
April 5, 2025 at 11:03 PM
New Forest and its magical beings.
March 30, 2025 at 9:21 PM
What a crazy evolution. Slide from NVIDIA Blackwell Numerics for AI presentation.
March 18, 2025 at 10:33 AM
Machine learning engineers can now feel the pain epidemiologists endured during covid: everyone's suddenly an expert.
January 31, 2025 at 4:30 PM
That can partially explain the similarities between DeepSeek-R1 CoTs and o1-preview CoTs, even though the CoTs from o1 are hidden. There are now probably 3 hypotheses:
1) convergent behavior
2) distillation (but not of CoTs)
3) sinister data leak (directly or through a 3rd party)
OpenAI says it has evidence China’s DeepSeek used its model to train competitor
https://www.ft.com/content/a0dfedd1-5255-4fa9-8ccc-1fe01de87ea6
White House AI tsar David Sacks raises possibility of alleged intellectual property theft
January 29, 2025 at 10:10 AM