Lightnews — Scholar-powered news

@researchtrend.ai

[2025-11-06] 📚 Updates in #VGen

(1) ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing
(2) Stable Part Diffusion 4D: Multi-View RGB and Kinematic Parts Video Generation

🔍 More at researchtrend.ai/communities/VGen

November 6, 2025 at 3:09 AM

FinTwitter

@fintwitter.bsky.social

🇨🇳ALIBABA INTRODUCES THINKSOUND, AN AI MODEL FOR INTERACTIVE, STEP-BY-STEP AUDIO GENERATION AND EDITING FOR VIDEO CONTENT. ALIBABA UP OVER 3% IN HK. #CHINA #AI #ALIBABA $BABA

July 16, 2025 at 2:15 AM

FinTwitter

@fintwitter.bsky.social

Alibaba: Launches ThinkSound, An AI Tool For Creating And Editing Audio Step-By-Step For Videos 🎧🎥

July 16, 2025 at 2:03 AM

FinTwitter

@fintwitter.bsky.social

Alibaba Group has unveiled ThinkSound, a new AI model designed for interactive, step-by-step audio generation and editing tailored for video content creation.

July 16, 2025 at 1:50 AM

Dreaming Tulpa

@dreamingtulpa.bsky.social

project page: ThinkSound-Demo.github.io
code: github.com/FunAudioLLM...
demo: huggingface.co/spaces/FunA...

ThinkSound - a Hugging Face Space by FunAudioLLM

huggingface.co

July 11, 2025 at 4:14 PM

Dreaming Tulpa

@dreamingtulpa.bsky.social

video-to-sound has been solved

there is a new spiritual successor to mmaudio called ThinkSound that supports chain-of-thought prompts for extremely accurate video-to-audio generation

kinda blown away:

July 11, 2025 at 4:14 PM

Bertrand Formet

@bertrandformet.bsky.social

ThinkSound (www.uneiaparjour.fr/thinksound/) generates the soundtrack for videos, synchronizing it and adding it directly to the original file, with or without instructions.
MP4 export and URL sharing.

Free, open source, and unlimited.

#uneIAparjour #IA #audio #vidéo

ThinkSound - Une IA par jour

ThinkSound (demo: https://huggingface.co/spaces/FunAudioLLM/ThinkSound and documentation: https://github.com/FunAudioLLM/ThinkSound) generates the soundtrack for videos, synchronizing it and adding it d...

www.uneiaparjour.fr

July 10, 2025 at 9:56 PM

luokai

@luok.ai

Size matters! The large ThinkSound model (1.3B parameters) delivers the best results, but even smaller versions hold their own. The takeaway: more capacity means deeper reasoning and better audio, but the tech scales for different needs.

Paper: arxiv.org/abs/2506.21448

ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing

While end-to-end video-to-audio generation has greatly improved, producing high-fidelity audio that authentically captures the nuances of visual content remains challenging. Like professionals in the ...

arxiv.org

July 5, 2025 at 1:22 AM

luokai

@luok.ai

ThinkSound uses a “gated fusion” mechanism to blend video and audio features. This dynamic approach lets the model adapt to each scene, improving synchronization and making the generated audio feel natural—no more awkward lags or mismatches.

July 5, 2025 at 1:22 AM
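
The post above does not say how the gate is computed, so the following is only a minimal sketch of a generic gated fusion, not ThinkSound's actual implementation. It assumes an elementwise sigmoid gate that decides, per feature dimension, how much of the video feature versus the audio feature to keep. The names `gated_fusion`, `w_video`, `w_audio`, and `bias` are hypothetical and chosen for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_fusion(video, audio, w_video, w_audio, bias):
    """Blend two equal-length feature vectors with an elementwise gate.

    Assumed form (illustrative, not taken from the paper):
        gate_i  = sigmoid(w_video_i * video_i + w_audio_i * audio_i + bias_i)
        fused_i = gate_i * video_i + (1 - gate_i) * audio_i
    """
    lengths = {len(video), len(audio), len(w_video), len(w_audio), len(bias)}
    if len(lengths) != 1:
        raise ValueError("all inputs must have the same length")

    fused = []
    for v, a, wv, wa, b in zip(video, audio, w_video, w_audio, bias):
        gate = sigmoid(wv * v + wa * a + b)  # value in (0, 1)
        fused.append(gate * v + (1.0 - gate) * a)
    return fused


# With zero weights and zero bias every gate is 0.5,
# so the result is the plain average of the two inputs.
video_feat = [1.0, 2.0]
audio_feat = [3.0, 4.0]
zeros = [0.0, 0.0]
balanced = gated_fusion(video_feat, audio_feat, zeros, zeros, zeros)
```

In a trained model the weights and bias would be learned, so the gate can lean toward video features in some scenes and toward audio features in others, which is the "dynamic" behavior the post describes.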

luokai

@luok.ai

It’s not just about video or text—ThinkSound fuses both. By combining CLIP’s contrastive features with T5’s contextual reasoning, it nails both the details and the big picture. The result? Audio that’s not only realistic, but perfectly fits the scene.

Github: github.com/FunAudioLLM/...

GitHub - FunAudioLLM/ThinkSound: PyTorch implementation of [ThinkSound], a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.

PyTorch implementation of [ThinkSound], a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning. - FunAudioLLM/ThinkSound

github.com

July 5, 2025 at 1:22 AM

luokai

@luok.ai

Imagine your video scenes coming alive with sound that actually makes sense—no more generic noise!

Project: thinksound-project.github.io

Movie Gen + ThinkSound

July 5, 2025 at 1:22 AM

luokai

@luok.ai

ThinkSound is the open-source challenger shaking up the video-to-audio generation game.

Unlike commercial giants, ThinkSound brings chain-of-thought reasoning to multimodal AI, letting it generate audio that’s not just synced, but semantically spot-on.

VEO3 + ThinkSound 🧵1/5

July 5, 2025 at 1:22 AM

Meng Li

@mengli512.bsky.social

🎵 #Alibaba just dropped #ThinkSound - the first AI audio model that actually "thinks" before generating sound! Using Chain of Thought reasoning, it creates perfectly synced audio for every video frame, from baby cries to train sounds
aidisruption.ai/p/alibaba-op...

Alibaba Open-Sources First CoT Audio Model, Mastering Audio-Visual Sync

Alibaba's ThinkSound: First CoT audio model for video dubbing. Generates frame-perfect sound effects using Chain of Thought reasoning. Open-source on GitHub.

aidisruption.ai

July 1, 2025 at 1:40 PM

arXiv Sound

@arxiv-sound.bsky.social

ThinkSound uses Chain-of-Thought reasoning to enable stepwise, interactive audio generation and editing for videos, guided by a multimodal large language model and a unified audio foundation model; state-of-the-art performance on video-to-audio generation.

ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing

Huadai Liu, Jialei Wang, Kaicheng Luo, Wen Wang, Qian Chen, Zhou Zhao, Wei Xue

arxiv.org

June 27, 2025 at 11:15 AM

arXiv eess.AS Audio and Speech Processing

@eessas-bot.bsky.social

Huadai Liu, Jialei Wang, Kaicheng Luo, Wen Wang, Qian Chen, Zhou Zhao, Wei Xue: ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing https://arxiv.org/abs/2506.21448 https://arxiv.org/pdf/2506.21448 https://arxiv.org/html/2506.21448

June 27, 2025 at 6:35 AM

AlexLita

@alexiogrenier.bsky.social

I love this audio setup. ♥️ #AudioSetup #Hiby #ThinkSound #Nice #Audio #Sound #YES

June 1, 2025 at 12:39 PM

AlexLita

@alexiogrenier.bsky.social

I had to listen to this album with my new headphones. 😯🥰 So much detail. 😯😀♥️ #AlitaArmy #Headphones #AudioExperience #Wow #ThinkSound #AlitaSoundtrack

May 31, 2025 at 6:55 PM

Taiterstan

@taiterstan.bsky.social

Just shopped Canadian with the purchase of gorgeous headphones from Ontario-based thinksound.

April 29, 2025 at 1:07 AM

Sean Olive

@seanolive.bsky.social

Never heard of Thinksound, but they are in Burlington, Ontario

thinksound.com/pages/contact

Contact Us

For general inquiries, questions about our products or to geek out with us on all things audio, please get in touch at support@thinksound.com. thinksound Inc., PO Box 80057, RPO Appleby, Burlington, ON, L7L...

thinksound.com

March 5, 2025 at 10:24 PM

Chris Thomas

@hotgarbage.ca

Thinksound moved from my former neck of the woods in NH to ON a ways back. Given the CEO's focus on supply chain efficiency I have to imagine it's working well there. Kanto also makes speakers in BC.

TON of capable grads in BC, ON. Tax incentives, too! www.investcanada.ca/programs-inc...

Programs and incentives | Invest in Canada

Canada’s investment programs and incentives make it easy for your company to expand and succeed. Learn more.

www.investcanada.ca

March 5, 2025 at 9:49 PM
