It is a text-LLM wrapper, built on in-house streaming ASR, TTS, and semantic VAD to reduce latency. ⏱️
Unlike Moshi 🟢, Unmute 🔊 is turn-based, but allows customization in two clicks 🖱️: voice and prompt!
Paper and open source coming soon.
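Unmute's code isn't out yet, so here is only a minimal sketch of how such a turn-based cascaded loop could be wired up. Every class and method name below is a hypothetical stand-in, not Unmute's actual API:

```python
# Hypothetical sketch of a turn-based cascaded voice pipeline:
# streaming ASR -> semantic VAD end-of-turn detection -> text LLM -> streaming TTS.
# All components are stand-ins; Unmute's real interfaces are not public yet.

class CascadedVoiceAgent:
    def __init__(self, asr, vad, llm, tts, system_prompt, voice):
        self.asr = asr                      # streaming speech-to-text
        self.vad = vad                      # semantic VAD: decides when the user is done
        self.llm = llm                      # any text-only LLM
        self.tts = tts                      # streaming text-to-speech
        self.system_prompt = system_prompt  # customizable prompt (one of the "two clicks")
        self.voice = voice                  # customizable voice (the other click)

    def run_turn(self, mic_frames):
        transcript = ""
        for frame in mic_frames:
            transcript += self.asr.feed(frame)  # partial hypotheses stream in
            if self.vad.end_of_turn(frame, transcript):
                break  # semantic VAD cuts latency vs. a fixed silence timeout
        # Stream LLM tokens straight into TTS so audio starts
        # before the full reply has been generated.
        reply_tokens = self.llm.stream(self.system_prompt, transcript)
        return self.tts.stream(reply_tokens, voice=self.voice)
```

The latency win of the cascade comes from streaming every stage and from the semantic VAD ending the turn as soon as the user sounds done, rather than waiting out a silence threshold.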
Only 200M weights were added to plug in a ViT through cross-attention with gating 🖼️🔀🎤
Training relies on a mix of text-only and text+audio synthetic data (~20k hours) 💽
It sees, understands, and talks about images — naturally, and out loud.
This opens up new applications, from audio description for the visually impaired to visual access to information.
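A rough PyTorch sketch of the gated cross-attention idea, to make the "200M weights plugged in" concrete. Shapes and module names are my assumptions, not MoshiVis's actual code; the key trick is a gate initialized at zero, so the frozen backbone is untouched at the start of training:

```python
import torch
import torch.nn as nn

class GatedCrossAttention(nn.Module):
    """Sketch of a gated cross-attention adapter (hypothetical, not MoshiVis's code).

    Hidden states from the frozen speech-text backbone attend to ViT image
    embeddings; a tanh gate initialized at zero makes the block an identity
    at initialization, so only the small adapter needs to be learned.
    """

    def __init__(self, dim, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.gate = nn.Parameter(torch.zeros(1))  # starts closed: output == input

    def forward(self, hidden, image_embeds):
        # hidden: (B, T, dim) backbone states; image_embeds: (B, N, dim) ViT tokens
        attended, _ = self.attn(self.norm(hidden), image_embeds, image_embeds)
        return hidden + torch.tanh(self.gate) * attended
```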
See you there!
📅 13th of March 🕰️ 11am ET, 4pm in Paris.
I'll discuss Mimi 🗜️ and multi-stream audio modeling 🔊.
Join on Zoom, replay on YT.
📅 Thursday, March 13 | 11 AM-12 PM EDT
🎙 Speaker: Alexandre Défossez
📖 Topic: "Moshi: a speech-text foundation model for real-time dialogue"
🔗 Details: (poonehmousavi.github.io/rg)
▶️ Missed a session? Watch on YouTube: (www.youtube.com/@CONVAI_RG) 🚀
Following right after @yann-lecun.bsky.social and so many humbling figures of AI:
www.france.tv/documentaire...
www.technologyreview.com/2025/02/07/1...
Hibiki produces spoken and text translations of the input speech in real time, while preserving the speaker's voice and adapting its pace based on the semantic content of the source speech. 🧵
Link: arxiv.org/abs/2502.02996
1/8
We leverage a large synthetic corpus generated with the text translation model MADLAD, plus our own TTS and a simple lag rule.
The model is decoder-only and runs at scale, even on-device 📲
github.com/kyutai-labs/hibiki
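The post doesn't spell out the "simple lag rule", so here is one toy interpretation, purely illustrative and not the paper's actual rule: delay each target word until a few source words past its aligned source word have been heard, so the synthetic target speech never runs ahead of what the source has revealed:

```python
# Toy illustration of a lag rule for building synthetic simultaneous-translation
# data (my guess at the idea; Hibiki's actual rule may differ).
# Each word is a (text, start_time_in_seconds) pair from forced alignment.

def apply_lag(source_words, target_words, alignment, min_lag=2):
    """Shift each target word so it starts only after the source word it is
    aligned to, plus a fixed lag of `min_lag` additional source words."""
    timed_target = []
    for tgt_idx, (word, _) in enumerate(target_words):
        src_idx = alignment[tgt_idx]  # index of the aligned source word
        gate_idx = min(src_idx + min_lag, len(source_words) - 1)
        start = source_words[gate_idx][1]  # wait until that source word is heard
        if timed_target:
            start = max(start, timed_target[-1][1])  # keep target times monotonic
        timed_target.append((word, start))
    return timed_target  # feed these timestamps to TTS to synthesize lagged target speech
```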
What: master's internship and/or PhD positions
Where: Rothschild Foundation Hospital (Paris, France)
Topic: AI and Neuroscience
Supervised by: Pierre Bourdillon and myself
Apply here: forms.gle/KKnea2QAjhAe...
Deadline: Feb 5th
On HF, under CC-BY licence: huggingface.co/kyutai/heliu...