nanoVLM: The simplest way to train your own Vision-Language Model in pure PyTorch, explained step by step!
Easy to read, even easier to use. Train your first VLM today!
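To make the architecture concrete before you dive into the repo, here is a minimal, self-contained sketch of the nanoVLM-style layout (vision encoder → modality projector → decoder LM). Every module choice and size below is an illustrative stand-in, not the repo's actual code:

```python
import torch
import torch.nn as nn

class TinyVLM(nn.Module):
    """Toy VLM with the nanoVLM-style layout: vision encoder -> modality
    projector -> language model. All sizes here are made up for illustration."""

    def __init__(self, vision_dim=192, lm_dim=256, vocab_size=32000):
        super().__init__()
        # Stand-in vision encoder: in the real model this is a pretrained ViT.
        self.vision_encoder = nn.Sequential(
            nn.Conv2d(3, vision_dim, kernel_size=16, stride=16),  # 16x16 patchify
            nn.Flatten(2),                                        # (B, vision_dim, num_patches)
        )
        # Modality projector: maps patch embeddings into the LM's embedding space.
        self.projector = nn.Linear(vision_dim, lm_dim)
        # Stand-in LM: in the real model this is a pretrained causal decoder
        # (and it would use a causal attention mask, omitted here for brevity).
        self.token_embed = nn.Embedding(vocab_size, lm_dim)
        layer = nn.TransformerEncoderLayer(lm_dim, nhead=8, batch_first=True)
        self.lm = nn.TransformerEncoder(layer, num_layers=4)
        self.lm_head = nn.Linear(lm_dim, vocab_size)

    def forward(self, pixel_values, input_ids):
        patches = self.vision_encoder(pixel_values).transpose(1, 2)  # (B, P, vision_dim)
        image_embeds = self.projector(patches)                       # (B, P, lm_dim)
        text_embeds = self.token_embed(input_ids)                    # (B, T, lm_dim)
        h = torch.cat([image_embeds, text_embeds], dim=1)            # image tokens prefix text
        return self.lm_head(self.lm(h))                              # (B, P+T, vocab) logits

model = TinyVLM()
logits = model(torch.randn(1, 3, 128, 128), torch.randint(0, 32000, (1, 16)))
```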
All running locally with no installs. Just open the website.
🔥 Explaining how to create a tiny 256M VLM that uses less than 1GB of RAM and outperforms our 80B models from 18 months ago!
huggingface.co/papers/2504....
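For a feel of what running a VLM this small looks like, here's a transformers inference sketch. The checkpoint id is assumed to be the 256M SmolVLM instruct model on the Hub; swap in your own if it differs:

```python
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "HuggingFaceTB/SmolVLM-256M-Instruct"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16).to("cuda")

image = Image.open("page.png")
messages = [{"role": "user",
             "content": [{"type": "image"},
                         {"type": "text", "text": "Describe this image."}]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```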
Lightning fast: it processes a page in 0.35 seconds on a consumer GPU using <500 MB of VRAM ⚡
SOTA in document conversion, beating every competing model we tested (including models with 27x more params) 🤯
But how? 🧶⬇️
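If you want to sanity-check latency and VRAM claims like these on your own hardware, a quick measurement sketch (assuming a `model` and `inputs` prepared as in the SmolVLM snippet above):

```python
import time
import torch

torch.cuda.reset_peak_memory_stats()
torch.cuda.synchronize()
start = time.perf_counter()
out = model.generate(**inputs, max_new_tokens=512)  # roughly one page of output
torch.cuda.synchronize()
print(f"latency:   {time.perf_counter() - start:.2f} s")
print(f"peak VRAM: {torch.cuda.max_memory_allocated() / 2**20:.0f} MiB")
```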
- 8B wins up to 81% of the time in its class, better than Gemini Flash
- 32B beats Llama 3.2 90B!
- Integrated on @hf.co from Day 0!
Check out their blog! huggingface.co/blog/aya-vis...
Inspired by our team's effort to open-source DeepSeek's R1, we are releasing the training and evaluation code on top of the weights 🫡
Now you can train any SmolVLM—or create your own custom VLMs!
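A hedged sketch of what a minimal fine-tuning loop can look like, not the released recipe: `pairs` below is a stand-in for your own list of (PIL image, chat-formatted text) examples, and hyperparameters are placeholders.

```python
import torch
from torch.utils.data import DataLoader
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "HuggingFaceTB/SmolVLM-256M-Instruct"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16).to("cuda")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def collate(batch):
    images, texts = zip(*batch)
    inputs = processor(text=list(texts), images=[[im] for im in images],
                       return_tensors="pt", padding=True)
    # Causal-LM loss over the whole sequence; in practice you'd also mask
    # padding and image tokens with -100.
    inputs["labels"] = inputs["input_ids"].clone()
    return inputs

model.train()
for batch in DataLoader(pairs, batch_size=2, shuffle=True, collate_fn=collate):
    batch = {k: v.to("cuda") for k, v in batch.items()}
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```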
SmolVLM (256M & 500M) runs in <1GB of GPU memory.
Fine-tune it on your laptop and run it on your toaster. 🚀
Even the 256M model outperforms our Idefics 80B (Aug '23).
How small can we go? 👀
How? By combining step-wise reward models with tree search algorithms :)
We're open sourcing the full recipe and sharing a detailed blog post 👇
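The core idea, in a hedged sketch: a policy model proposes several candidate next reasoning steps, a process reward model (PRM) scores each partial solution, and a step-level beam search keeps the most promising prefixes. `generate_step` and `prm_score` below are hypothetical stand-ins for those two models:

```python
import random

def generate_step(prefix):
    return f"step {random.randint(0, 99)}"   # stand-in: your LLM's next reasoning step

def prm_score(prefix):
    return random.random()                   # stand-in: your PRM's score for a prefix

def beam_search(problem, beam_width=4, expansions=4, max_steps=8):
    beams = [problem]                                  # each beam = prompt + steps so far
    for _ in range(max_steps):
        candidates = [prefix + "\n" + generate_step(prefix)
                      for prefix in beams for _ in range(expansions)]
        candidates.sort(key=prm_score, reverse=True)   # PRM ranks partial solutions
        beams = candidates[:beam_width]                # keep the best prefixes
    return beams[0]

print(beam_search("Solve: 12 * 13 = ?"))
```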
huggingface.co/chat/
2B params, does some amazing stuff, with little memory, quickly. I'll post a couple of examples below.
Super cool stuff from @merve.bsky.social & @andimara.bsky.social!
Check out @andimara.bsky.social's Smol Tools for summarization and rewriting. It uses SmolLM2 to summarize text and make it friendlier or more professional, all running locally thanks to llama.cpp github.com/huggingface/...
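Smol Tools wraps this in a UI, but the core local loop looks roughly like this with llama-cpp-python. The GGUF filename and prompt are illustrative, not Smol Tools' actual code; download a SmolLM2 GGUF from the Hub first:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Illustrative filename: point this at whichever SmolLM2 GGUF you downloaded.
llm = Llama(model_path="SmolLM2-1.7B-Instruct-Q4_K_M.gguf", n_ctx=4096)

text = open("notes.txt").read()
out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "Rewrite the user's text to be more professional."},
        {"role": "user", "content": text},
    ],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```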
you're welcome
apply.workable.com/huggingface/...
SmolVLM can be fine-tuned on a Google Colab and run on a laptop! Or process millions of documents with a consumer GPU!
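For the "millions of documents" case, batched inference is the workhorse. A hedged sketch, where the paths, prompt, batch size, and checkpoint id are all placeholders:

```python
import glob
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "HuggingFaceTB/SmolVLM-256M-Instruct"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16).to("cuda")

prompt = processor.apply_chat_template(
    [{"role": "user", "content": [{"type": "image"},
                                  {"type": "text", "text": "Transcribe this document."}]}],
    add_generation_prompt=True)

paths = sorted(glob.glob("pages/*.png"))
for i in range(0, len(paths), 8):                            # 8 pages per batch
    images = [Image.open(p) for p in paths[i:i + 8]]
    inputs = processor(text=[prompt] * len(images), images=[[im] for im in images],
                       return_tensors="pt", padding=True).to("cuda")
    with torch.inference_mode():
        out = model.generate(**inputs, max_new_tokens=512)
    for page_text in processor.batch_decode(out, skip_special_tokens=True):
        print(page_text[:80])                                # store results however you like
```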
- ColSmolVLM performs better than ColPali and DSE-Qwen2 on all English tasks
- ColSmolVLM is more memory efficient than ColQwen2 💗
Find the model here huggingface.co/vidore/colsm...
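For context on how these Col* retrievers score a query against a page: both sides are embedded as many token-level vectors and relevance is the ColBERT-style MaxSim sum. A pure-PyTorch sketch of that scoring, with random vectors standing in for the model's multi-vector embeddings:

```python
import torch
import torch.nn.functional as F

def maxsim_score(query_embeds, page_embeds):
    """Late interaction: for every query token, take its best-matching page
    patch, then sum those maxima. Shapes: (Q, d) vs (P, d)."""
    sims = query_embeds @ page_embeds.T        # (num_query_tokens, num_page_patches)
    return sims.max(dim=1).values.sum()        # best patch per query token, summed

# Random stand-ins for what a ColSmolVLM-style model would emit.
query = F.normalize(torch.randn(16, 128), dim=-1)    # 16 query token embeddings
pages = [F.normalize(torch.randn(700, 128), dim=-1)  # ~700 patch embeddings per page
         for _ in range(3)]
scores = [maxsim_score(query, p) for p in pages]
print(max(range(3), key=lambda i: scores[i]))        # index of the best-matching page
```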
#NeurIPS2024 #evaleval #AIEvaluation
Hugging Face empowers everyone to use AI to create value and is against the monopolization of AI; it's a hosting platform above all.
We need both!
SmolVLM can be fine-tuned on a Google Colab and run on a laptop! Or process millions of documents with a consumer GPU!