Lightnews — Scholar-powered news

Groophz | Nation . Rainer Eschen (DJ Groophz)

@groophz.com

Lip-syncing is still a challenge with open-source AI.

There's a new kid on the block from ByteDance: LatentSync.

I found suggestions for "better" commercial tools:

Kling and Hedra.

My first tests with Kling were mediocre.

Hedra works better with a calm voice and a front-facing shot.

January 9, 2025 at 9:22 AM

gigowat Ver.3.0

@gigowat.bsky.social

From TechnoEdge

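Just enter an idea and autonomous AIs handle the whole research process, from investigation to experiments to paper writing: AMD's "Agent Laboratory," plus "LatentSync," a video AI that generates natural lip movements to match audio, and more. Five generative AI technologies explained (Generative AI Weekly)

Just enter an idea and autonomous AIs handle the whole research process, from investigation to experiments to paper writing: AMD's "Agent Laboratory," plus "LatentSync," a video AI that generates natural lip movements to match audio, and more. Five generative AI technologies explained (Generative AI Weekly) | テクノエッジ TechnoEdge

"Generative AI Weekly" picks out and explains notable generative AI technologies and research from the past week. This 78th installment covers "Agent Laboratory," a model in which autonomous AIs automatically run the entire research process, and "Cosmos," a digital-environment learning platform for physical AI such as robots and self-driving cars.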

www.techno-edge.net

January 15, 2025 at 10:25 PM

Alex Ledante

@alxledante.bsky.social

Cassilda's Song (Kashiruda no Uta) youtu.be/pG5kv4ucmu4?... via @YouTube #4K, #60fps, #UHD, #horror, #animation, #surreal, #Lovecraft, #fantasy, #supernatural, #occult, #comfyui, #LatentSync, #HunyuanVideo, #mystery, #Carcosa, #Hastur, #cosmichorror, #Cthulhu, #Mythos

Hastur the Unspeakable, the King in Yellow, cosmic horror inspired by Zdzisław Beksiński

April 3, 2025 at 10:25 PM

arxiv cs.CV

@arxiv-cs-cv.bsky.social

Chunyu Li, Chao Zhang, Weikai Xu, Jinghui Xie, Weiguo Feng, Bingyue Peng, Weiwei Xing
LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync
https://arxiv.org/abs/2412.09262

December 13, 2024 at 9:50 AM

Tom Dörr

@tom-doerr.bsky.social

https://github.com/bytedance/LatentSync

February 14, 2025 at 6:23 AM

luokai

@luok.ai

Unlike previous methods based on pixel-space diffusion or two-stage generation, LatentSync leverages the powerful capabilities of Stable Diffusion.

January 4, 2025 at 2:37 PM

Alex Ledante

@alxledante.bsky.social

an Inhabitant of Carcosa youtu.be/53pmJRcfV0M?... via @YouTube #4K, #60fps, #UHD, #horror, #animation, #surreal, #Lovecraft, #fantasy, #supernatural, #occult, #comfyui, #LatentSync, #HunyuanVideo, #mystery, #Carcosa, #Hastur, #cosmichorror, #Cthulhu

an Inhabitant of Carcosa

YouTube video by Alex Ledante

youtu.be

April 17, 2025 at 10:08 PM

とれこめ

@trecome.bsky.social

https://trecome.info/articles/c974cf04-9c7e-434a-837a-f9f9407d5d82
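[New article]
Just enter an idea and autonomous AIs handle the whole research process, from investigation to experiments to paper writing: AMD's "Agent Laboratory," plus "LatentSync," a video AI that generates natural lip movements to match audio, and more. Five generative AI technologies explained (Generative AI Weekly)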

January 16, 2025 at 6:03 AM

llmstock.com mc

@llmstock.bsky.social

Introducing LatentSync: An Advanced Open-Source Lip Sync Model for Enhanced Video Synthesis agientry.com/blog/255

Introducing LatentSync: An Advanced Open-Source Lip Sync Model for Enhanced Video Synthesis | LLMStock News

LatentSync, an innovative open-source lip sync model developed by ByteDance, is gaining attention for its superior performance in generating realistic and temporally consistent lip movements in videos...

agientry.com

January 7, 2025 at 1:13 PM

Tom Dörr

@tom-doerr.bsky.social

https://github.com/bytedance/LatentSync

June 27, 2025 at 4:10 AM

X Bot

@handle.invalid

@rohanpaul_ai https://x.com/rohanpaul_ai/status/1889869472292598220 #x-rohanpaul_ai

Github 👨‍🔧: Taming Stable Diffusion for Lip Sync!

Helps you generate lip-synced videos using audio-conditioned latent diffusion models.

This framework, LatentSync, directly models audio-visual cor...

February 13, 2025 at 3:15 AM

Dr Alexander Young

@alexanderfyoung.bsky.social

4. LatentSync (ByteDance) – Best Free Open-Source Option

Free if you’ve got a GPU.

Sharp, stable lip motion — for tinkerers who like full control.

August 10, 2025 at 7:58 AM

luokai

@luok.ai

During training, LatentSync employs a one-step method to obtain estimated clean latent representations from predicted noise, which are then decoded to produce estimated clean frames.

January 4, 2025 at 2:37 PM
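
The one-step estimate mentioned in the post above can be illustrated with a small sketch. The post does not give the formula, so this assumes the standard DDPM parameterization, in which a noisy latent is z_t = sqrt(ᾱ_t)·z_0 + sqrt(1 − ᾱ_t)·ε and the clean latent is recovered by inverting that relation with the predicted noise. The function names, the list-of-floats latent, and the noise level are all hypothetical and chosen for illustration; this is not LatentSync's code.

```python
import math
import random


def add_noise(clean_latent, noise, alpha_bar_t):
    """Forward diffusion step: z_t = sqrt(a) * z_0 + sqrt(1 - a) * eps."""
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    return [signal_scale * z0 + noise_scale * eps
            for z0, eps in zip(clean_latent, noise)]


def estimate_clean_latent(noisy_latent, predicted_noise, alpha_bar_t):
    """One-step estimate of z_0, obtained by inverting the forward step."""
    if not 0.0 < alpha_bar_t <= 1.0:
        raise ValueError("alpha_bar_t must be in (0, 1]")
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    return [(zt - noise_scale * eps) / signal_scale
            for zt, eps in zip(noisy_latent, predicted_noise)]


# Toy check: a perfect noise prediction should recover the clean latent.
rng = random.Random(0)
clean = [rng.gauss(0.0, 1.0) for _ in range(8)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]
alpha_bar = 0.5  # hypothetical cumulative noise level for one timestep
noisy = add_noise(clean, noise, alpha_bar)
estimate = estimate_clean_latent(noisy, noise, alpha_bar)
max_error = max(abs(a - b) for a, b in zip(clean, estimate))
print(max_error < 1e-9)
```

In a real pipeline the noise would come from the denoising network rather than being known exactly, and the estimated latent would then be passed through the VAE decoder to obtain the estimated clean frames the post refers to.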

luokai

@luok.ai

Additionally, LatentSync introduces a technique called Temporal REPresentation Alignment (TREPA) to enhance temporal consistency while maintaining lip-sync accuracy.

Github: github.com/bytedance/La...

GitHub - bytedance/LatentSync: Taming Stable Diffusion for Lip Sync!

Taming Stable Diffusion for Lip Sync! Contribute to bytedance/LatentSync development by creating an account on GitHub.

github.com

January 4, 2025 at 2:37 PM

Bookness and Thereness: News roundup on books, publishing, information, and related topics

@bookness.bsky.social

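Just enter an idea and autonomous AIs handle the whole research process, from investigation to experiments to paper writing: AMD's "Agent Laboratory," plus "LatentSync," a video AI that generates natural lip movements to match audio, and more. Five generative AI technologies explained (Generative AI Weekly)
www.techno-edge.net/article/2025...

Just enter an idea and autonomous AIs handle the whole research process, from investigation to experiments to paper writing: AMD's "Agent Laboratory," plus "LatentSync," a video AI that generates natural lip movements to match audio, and more. Five generative AI technologies explained (Generative AI Weekly) | テクノエッジ TechnoEdge

"Generative AI Weekly" picks out and explains notable generative AI technologies and research from the past week. This 78th installment covers "Agent Laboratory," a model in which autonomous AIs automatically run the entire research process, and "Cosmos," a digital-environment learning platform for physical AI such as robots and self-driving cars.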

www.techno-edge.net

January 19, 2025 at 11:13 PM

Seamless

@seamless0202.bsky.social

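Just enter an idea and autonomous AIs handle the whole research process, from investigation to experiments to paper writing: AMD's "Agent Laboratory," plus "LatentSync," a video AI that generates natural lip movements to match audio, and more. Five generative AI technologies explained (Generative AI Weekly) www.techno-edge.net/article/2025... Agent Laboratory is an automatic paper generator that lets you adjust the degree of human intervention.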

January 15, 2025 at 11:52 PM

Juan Sanchez

@juanyobluesky.bsky.social

🧵 LatentSync: AI lip sync with Latent Diffusion 🎤 👇

January 12, 2025 at 10:02 AM

Noam Naumovsky

@endlessblink.bsky.social

Video technologies: TeaCache introduces a smart caching system for diffusion models, and LatentSync brings accurate lip sync based on latent diffusion.

December 29, 2024 at 11:37 AM

Kwame

@createwithkwame.bsky.social

Sync your voice to fun animated characters. People are making content that looks like cartoons and games. #AvatarSync #AnimatedVoices #LatentSync

May 15, 2025 at 5:53 PM

Juan Sanchez

@juanyobluesky.bsky.social

1️⃣ Discover LatentSync, a lip-sync framework based on Latent Diffusion models that revolutionizes how lips are synced in videos:

January 12, 2025 at 10:02 AM

luokai

@luok.ai

ByteDance has open-sourced a lip-sync model called LatentSync. LatentSync is an end-to-end lip-sync framework that does not rely on any intermediate motion representation, but instead models complex audio-visual correlations directly in the latent space.

January 4, 2025 at 2:37 PM
