Lightnews — Scholar-powered news

Laurent Mazare

@lmazare.bsky.social

🚀 Say hello to unmute.sh — a modular voice AI system built on our in-house low latency text-to-speech and speech-to-text engines. It works in English 🇬🇧 and French 🇫🇷 and you can customize the voice and personality.
🎙️Try it live and tell us what you think!

Kyutai @kyutai-labs.bsky.social · May 23

Talk to unmute.sh 🔊, the most modular voice AI around. Empower any text LLM with voice, instantly, by wrapping it with our new speech-to-text and text-to-speech. Any personality, any voice. Interruptible, smart turn-taking. We’ll open-source everything within the next few weeks.

May 23, 2025 at 1:59 PM

Reposted by Laurent Mazare

Alexandre Défossez

@honualx.bsky.social

I'll present a dive into Moshi 🟢 and our translation model Hibiki 🇫🇷♻️🇬🇧 as part of the next @convai-rg.bsky.social reading group 👨‍🏫📗.

📅 13th of March 🕰️ 11am ET, 4pm in Paris.

I'll discuss Mimi 🗜️ and multi-stream audio modeling 🔊.
Join on Zoom, replay on YT.

⬛ ⬛ 🟧 🟧 🟨 🟨 🟩 🟩 🟩 ⬛
⬛ 🟧 🟧 🟨 🟨 🟩 🟩 🟩 ⬛ ⬛

convai-rg.bsky.social @convai-rg.bsky.social · Mar 10

📢 Join our Conversational AI Reading Group!
📅 Thursday, March 13 | 11 AM - 12 PM EST
🎙Speaker: Alexandre Defossez
📖 Topic: "Moshi: a speech-text foundation model for real-time dialogue"
🔗 Details: (poonehmousavi.github.io/rg)
▶️ Missed a session? Watch on YouTube: (www.youtube.com/@CONVAI_RG) 🚀

March 10, 2025 at 5:34 PM

Laurent Mazare

@lmazare.bsky.social

Afraid of missing out on French pop culture references because you don't speak the language? Fear no more and try our Hibiki speech-to-speech translation model. No more FOMO! 🇫🇷✨ #Translation #AI

February 12, 2025 at 1:39 PM

Reposted by Laurent Mazare

Kyutai

@kyutai-labs.bsky.social

Even Kavinsky 🎧🪩 can't break Hibiki! Just like Moshi, Hibiki is robust to extreme background conditions 💥🔊.

February 11, 2025 at 4:11 PM

Laurent Mazare

@lmazare.bsky.social

We just released Hibiki 🟢, a real-time speech-to-speech translation model 🇫🇷 -> 🇬🇧. It preserves the voice of the user, and the smaller variant can run on an iPhone, as shown by Neil in this video.
Find the code on GitHub (github.com/kyutai-labs/...) and the weights on HF, and give it a spin!

February 7, 2025 at 8:26 AM

Laurent Mazare

@lmazare.bsky.social

Very impressive to hear this Japanese 🇯🇵 version of Moshi 🟢. I don't speak the language, so I cannot understand what it's trying to tell me, but at least it sounds great 😅
github.com/nu-dialogue/...

GitHub - nu-dialogue/j-moshi: J-Moshi: A Japanese Full-duplex Spoken Dialogue System


January 24, 2025 at 12:38 PM

Laurent Mazare

@lmazare.bsky.social

Getting our latest LLM to run on the edge was pretty fun. It had been a while since I last used Swift, and it's still a pretty enjoyable language!

Kyutai @kyutai-labs.bsky.social · Jan 14

Helium 2B running locally on an iPhone 16 Pro at ~28 tok/s, faster than you can read your loga lessons in French 🚀 All that thanks to mlx-swift with q4 quantization!

January 14, 2025 at 4:43 PM

Laurent Mazare

@lmazare.bsky.social

Super proud of our first publicly released text model, helium-1 preview, a 2B model trained on 6 languages. It should be a great fit for on-device applications. Already available in candle/transformers, can't wait to see what the community builds with it! #OpenSource #AI #FTW!

Kyutai @kyutai-labs.bsky.social · Jan 13

Meet Helium-1 preview, our 2B multi-lingual LLM, targeting edge and mobile devices, released under a CC-BY license. Start building with it today!
huggingface.co/kyutai/heliu...

kyutai/helium-1-preview-2b · Hugging Face


January 13, 2025 at 5:53 PM

Laurent Mazare

@lmazare.bsky.social

Last week we received a new M4 Pro Mac mini, so I benchmarked it with various matmul variants, and the results are pretty impressive for such a tiny form factor. Even with a naive approach it reaches ~5.2 TFLOPS in f32 (so probably more than 10 TFLOPS in bf16), and that's just using the GPU, no NPU for now.

November 24, 2024 at 12:20 PM
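
For readers wondering how a throughput figure like the one above is usually derived, here is a minimal sketch. It assumes the common convention of counting 2·m·n·k floating-point operations for an (m×k) by (k×n) matmul (one multiply and one add per inner-product term). The function names and the 4096 matrix size are hypothetical and chosen only for illustration; the post does not say which sizes or kernels were benchmarked.

```python
def matmul_flops(m: int, n: int, k: int) -> int:
    """FLOP count for an (m x k) @ (k x n) matmul.

    Assumes the usual convention: one multiply plus one add
    for each of the m * n * k inner-product terms.
    """
    return 2 * m * n * k


def tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Achieved throughput in TFLOPS for one matmul that took `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return matmul_flops(m, n, k) / seconds / 1e12


def seconds_at(m: int, n: int, k: int, rate_tflops: float) -> float:
    """Time one matmul would take at a sustained rate of `rate_tflops`."""
    if rate_tflops <= 0:
        raise ValueError("rate_tflops must be positive")
    return matmul_flops(m, n, k) / (rate_tflops * 1e12)


if __name__ == "__main__":
    size = 4096  # hypothetical square matrix size, not taken from the post
    # Time a single size x size matmul would need at the quoted ~5.2 TFLOPS.
    t = seconds_at(size, size, size, 5.2)
    print(f"{size}^3 matmul at 5.2 TFLOPS: {t * 1e3:.1f} ms")
    # Round trip: feeding that time back in recovers the rate.
    print(f"throughput: {tflops(size, size, size, t):.2f} TFLOPS")
```

In a real benchmark the `seconds` value would come from timing the kernel over many iterations after a warm-up, with the GPU synchronized before the clock is read.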
