Nicolas Beltran-Velez
@velezbeltran.bsky.social
Machine Learning PhD Student
@ Blei Lab & Columbia University.

Working on probabilistic ML | uncertainty quantification | LLM interpretability.

Excited about everything ML, AI and engineering!
Pinned
I am very excited to share our new NeurIPS 2024 paper + package, Treeffuser! 🌳 We combine gradient-boosted trees with diffusion models for fast, flexible probabilistic predictions and well-calibrated uncertainty (a minimal usage sketch follows below).

paper: arxiv.org/abs/2406.07658
repo: github.com/blei-lab/tre...

🧵(1/8)
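A minimal usage sketch of the treeffuser package on a toy heteroscedastic regression problem. This is a sketch under assumptions: the Treeffuser class name, its sklearn-style fit/sample interface, and the shape returned by sample() are taken from the repo README as recalled here and may not match the released API exactly.

```python
# Hedged sketch: Treeffuser on toy data where the noise level grows with x.
# Class name, fit/sample signatures, and output shape are assumptions; check
# the repo linked in the post for the actual API.
import numpy as np
from treeffuser import Treeffuser  # assumed import path

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(1000, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1 + 0.05 * X[:, 0])  # heteroscedastic noise

model = Treeffuser()  # gradient-boosted trees parameterize the diffusion model
model.fit(X, y)

X_test = np.linspace(0, 10, 50).reshape(-1, 1)
samples = model.sample(X_test, n_samples=100)  # assumed shape: (n_samples, n_test)

# Any functional of the predictive distribution is then a NumPy one-liner.
pred_mean = samples.mean(axis=0)
lo, hi = np.quantile(samples, [0.05, 0.95], axis=0)  # 90% predictive interval
```

The appeal is that the predictive distribution comes as samples, so quantiles, intervals, or CRPS can be computed without committing to a parametric likelihood.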
Reposted by Nicolas Beltran-Velez
🎓 Hats off to the 2025 IICD graduates: Yining Ma Junze Huang Yichi Yang Ruilin Dai Boan Zhu Cameron Park @jlfan.bsky.social & Achille Nazaret!
Wishing you all the best in your next chapter — we’re proud of you! 💙 #Columbia2025
@bleilab.bsky.social @khanhndinh.bsky.social @elhamazizi.bsky.social
May 21, 2025 at 1:19 PM
Reposted by Nicolas Beltran-Velez
This is probably not the complete picture of KD, but I can definitely sleep better after writing down and confirming this minimal working explanation.

arXiv: arxiv.org/abs/2505.13111

(3/4)
Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation
Knowledge distillation (KD) is a core component in the training and deployment of modern generative models, particularly large language models (LLMs). While its empirical benefits are well documented-...
arxiv.org
May 20, 2025 at 12:18 PM
Reposted by Nicolas Beltran-Velez
I received a review like this five years ago. It’s probably the right time now to share it with everyone who wrote or got random discouraging reviews from ICML/ACL.
March 28, 2025 at 7:55 PM
Reposted by Nicolas Beltran-Velez
First 11 chapters of RLHF Book have v0 draft done. Should be useful now.

Next:
* Crafting more blog content into future topics,
* DPO+ chapter,
* Meeting with publishers to get wheels turning on physical copies,
* Cleaning & cohesiveness
rlhfbook.com
February 26, 2025 at 4:35 PM
Reposted by Nicolas Beltran-Velez
🔥 Benchmark Alert! MotifBench sets a new standard for evaluating protein design methods in motif scaffolding.
Why does this matter? Reproducibility & fair comparison have been lacking—until now.
Paper: arxiv.org/abs/2502.12479 | Repo: github.com/blt2114/Moti...
A thread ⬇️
February 19, 2025 at 8:50 PM
Reposted by Nicolas Beltran-Velez
The HuggingFace/Nanotron team just shipped an entire pretraining textbook in interactive format. huggingface.co/spaces/nanot...

It’s not just great pedagogical support; it also presents a wealth of new data and experiments in a systematic way for the first time.
February 19, 2025 at 7:13 PM
I just wanted to see what it looked like 😭
February 19, 2025 at 2:26 AM
Good God, please. I just want some gradients that don't vanish 😭
February 17, 2025 at 3:01 AM
Reposted by Nicolas Beltran-Velez
I was hoping that recent events would lead to a mass exodus from X. Many have left, but most of the ML and LLM people have not.

I have lost a lot of respect for the ML community.
February 5, 2025 at 5:58 AM
Reposted by Nicolas Beltran-Velez
Now that Bluesky has GIFs (it didn't work?), I can share (again) my educational notebook on discrete flow matching (by Itai Gat et al.). Please also check out the original article and official implementation by Meta! A toy sketch of the core idea follows below.

🐍 github.com/gle-bellier/...
🐍 github.com/facebookrese...
📄 arxiv.org/abs/2407.15595
February 5, 2025 at 4:54 PM
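As a companion to that post, here is a toy sketch (not the notebook's or Meta's code) of the mask-based probability path often used in discrete flow matching: at time t, each data token is kept with probability t and replaced by a MASK token otherwise; the model is trained to predict the clean tokens, and Euler sampling unmasks positions progressively. The model interface and the mask token id below are assumptions made for illustration.

```python
# Hedged toy sketch of mask-based discrete flow matching (linear schedule).
import torch

MASK = 0  # hypothetical mask token id; real vocabularies reserve their own


def sample_xt(x1: torch.Tensor, t: float) -> torch.Tensor:
    """Corrupt clean token sequences x1 (long tensor) to time t:
    each position keeps its token with probability t, else becomes MASK."""
    keep = torch.rand_like(x1, dtype=torch.float) < t
    return torch.where(keep, x1, torch.full_like(x1, MASK))


@torch.no_grad()
def euler_sample(model, shape, n_steps=100, device="cpu"):
    """Generate by gradually unmasking. `model(x_t, t)` is an assumed
    interface returning per-position logits over the vocabulary."""
    x = torch.full(shape, MASK, dtype=torch.long, device=device)
    for i in range(n_steps):
        t = i / n_steps
        dt = 1.0 / n_steps
        logits = model(x, t)
        pred = torch.distributions.Categorical(logits=logits).sample()
        # Each still-masked position is revealed with probability dt / (1 - t),
        # so everything is unmasked by t = 1.
        unmask = (x == MASK) & (torch.rand_like(x, dtype=torch.float) < dt / (1.0 - t))
        x = torch.where(unmask, pred, x)
    return x
```

Training then amounts to a cross-entropy loss on the model's prediction of x1 from sample_xt(x1, t) with t drawn uniformly; see the linked notebook and paper for the general formulation.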
Reposted by Nicolas Beltran-Velez
Really excited about this! We note a connection between diffusion/flow models and neural/latent SDEs. We show how to use this for simulation-free learning of fully flexible SDEs. We refer to this as SDE Matching and show speed improvements of several orders of magnitude.

arxiv.org/abs/2502.02472
SDE Matching: Scalable and Simulation-Free Training of Latent Stochastic Differential Equations
The Latent Stochastic Differential Equation (SDE) is a powerful tool for time series and sequence modeling. However, training Latent SDEs typically relies on adjoint sensitivity methods, which depend ...
arxiv.org
February 5, 2025 at 2:38 PM
Reposted by Nicolas Beltran-Velez
I have a sinking feeling that by 2029 I'm going to be faking a British accent so no one will think I was one of the *Americans* working on AI during the regime.
February 3, 2025 at 1:24 AM
NGL, it's kind of surprising that more people haven't migrated here, especially given what Musk has been doing these days. I don't get it.
February 3, 2025 at 2:58 AM
Reposted by Nicolas Beltran-Velez
Since everyone wants to learn RL for language models now post DeepSeek, reminder that I've been working on this book quietly in the background for months.

Policy gradient chapter is coming together. Plugging away at the book every day now.

rlhfbook.com/c/11-policy-...
February 1, 2025 at 10:05 PM
Reposted by Nicolas Beltran-Velez
Please stop anthropomorphizing language models, it makes them feel really bad
January 29, 2025 at 11:20 PM
Reposted by Nicolas Beltran-Velez
This comments section is the first time I've felt even a shred of hope in eight days.
From the fednews community on Reddit
www.reddit.com
January 29, 2025 at 5:41 AM
Reposted by Nicolas Beltran-Velez
Nazi salutes and speaking at neo-Nazi rallies seems bad. There's history that we should learn from.
January 26, 2025 at 12:41 AM
Something I really like about NLP research is that it makes everything super intuitive. This week I have been thinking about variational inference in NLP, and a lot of things that seemed to require mathematical intuition become trivial once you think about them in terms of language. So cool :)
January 25, 2025 at 9:52 PM
Reposted by Nicolas Beltran-Velez
New randomized, controlled trial by the World Bank of students using GPT-4 as a tutor in Nigeria. Six weeks of after-school AI tutoring = 2 years of typical learning gains, outperforming 80% of other educational interventions.

And it helped all students, especially girls who were initially behind.
January 15, 2025 at 8:58 PM
Does anyone have good resources for learning about quantization? Essential papers and practical guides on how to quantize and use quantized models would be greatly appreciated!
December 28, 2024 at 4:51 PM
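Not a full answer to the question above, but one concrete starting point is post-training dynamic quantization in PyTorch, which converts nn.Linear weights to int8 and quantizes activations on the fly. The sketch below uses the stock torch.quantization.quantize_dynamic API on a toy model; real LLM workflows typically rely on dedicated libraries (e.g. bitsandbytes or GPTQ/AWQ implementations) instead.

```python
# Hedged sketch: post-training dynamic quantization of a toy model in PyTorch.
# Illustrative only; not how large LLMs are usually quantized in practice.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
model.eval()

# Rewrites nn.Linear modules to use int8 weights; activations are quantized
# dynamically at inference time, so no calibration data is needed.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantized(x).shape)  # same interface as the float model, smaller weights
```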
Reposted by Nicolas Beltran-Velez
1 -> 2 -> 3 -> 3.5 -> 4 -> 4o -> o1 -> o3

I guess we need AGI just to figure out how to name things
December 20, 2024 at 7:17 PM
Reposted by Nicolas Beltran-Velez
If you are into ML theory (RL or not) with a proven track record, and you are interested in an industry research position, PM me. Feel free to spread the word.
December 19, 2024 at 12:55 AM
Reposted by Nicolas Beltran-Velez
🧵 Excited to share #Echidna, a Bayesian framework for quantifying the impact of gene dosage on phenotypic plasticity: tinyurl.com/296kf7hf!
With @elhamazizi.bsky.social and @mingxz.bsky.social, we integrate scRNA-seq & WGS to uncover how CNAs drive tumor evolution and transcriptional variability.
www.biorxiv.org
December 18, 2024 at 1:31 PM
Reposted by Nicolas Beltran-Velez
Proud of this work spearheaded by the phenomenal @jlfan.bsky.social and @mingxz.bsky.social in collaboration w/ Ben Izar! The past 3 years we've worked hard to unravel how #CNVs shape #tumor phenotypic plasticity seen in #singlecell #RNAseq data ➡️ #Echidna 🦔
December 18, 2024 at 2:08 PM