Nicolas Beltran-Velez
@velezbeltran.bsky.social
Machine Learning PhD Student @ Blei Lab & Columbia University.

Working on probabilistic ML | uncertainty quantification | LLM interpretability.

Excited about everything ML, AI and engineering!
Reposted by Nicolas Beltran-Velez
This is probably not the complete picture of KD, but I can definitely sleep better after writing down and confirming this minimal working explanation.

arXiv: arxiv.org/abs/2505.13111

(3/4)
Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation
Knowledge distillation (KD) is a core component in the training and deployment of modern generative models, particularly large language models (LLMs). While its empirical benefits are well documented-...
arxiv.org
May 20, 2025 at 12:18 PM
Tests!! :)
January 25, 2025 at 7:50 PM
But the memory needed for the value function kills the ones that don't have good GPUs 😭
January 25, 2025 at 3:36 PM
I mostly use Copilot for writing code (as autocomplete), GPT-4o for boilerplate, and o1 for serious debugging or boilerplate with some complexity or a lot of requirements. I also use o1 for quick but slightly involved experiments, but not as often.
January 8, 2025 at 7:36 PM
I use ChatGPT over Google for a lot of things because it is really good at fuzzy queries + data aggregation from many sources. I feel that as long as you double-check the results, it is much faster and more convenient.
January 7, 2025 at 12:40 AM