Moayed Haji Ali
moayedha.bsky.social
PhD @RiceUniversity | Research Intern @Snap
A great collaboration with
W. Menapace, A. Siarohin, I. Skorokhodov, A. Canberk, K. S. Lee, V. Ordonez, and S. Tulyakov.

Please repost to support our work, and check out our
arXiv preprint: arxiv.org/abs/2412.15191
Webpage: snap-research.github.io/AVLink/
AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation
We propose AV-Link, a unified framework for Video-to-Audio and Audio-to-Video generation that leverages the activations of frozen video and audio diffusion models for temporally-aligned cross-modal co...
January 14, 2025 at 6:13 PM
While current approaches use external pretrained features (e.g., Meta CLIP, BEATs), we found that diffusion activations hold rich, semantically and temporally aware features, making them well suited for cross-modal generation in a self-contained framework.
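The core idea, reusing a frozen generator's own intermediate activations as features instead of calling an external feature extractor, can be sketched in plain Python. This is a toy stand-in, not the actual AV-Link code; the layer functions below are hypothetical placeholders for a diffusion model's blocks:

```python
# Toy sketch: run a "frozen" model and record its intermediate
# activations, which a cross-modal model could then condition on.
# block1/block2 are hypothetical placeholders, not real diffusion blocks.

def block1(x):  # stands in for an early diffusion block
    return [v * 2.0 for v in x]

def block2(x):  # stands in for a deeper, more semantic block
    return [v + 1.0 for v in x]

FROZEN_LAYERS = [block1, block2]  # weights are never updated

def forward_with_activations(x):
    """Run the frozen model, recording every intermediate activation."""
    activations = []
    for layer in FROZEN_LAYERS:
        x = layer(x)
        activations.append(list(x))  # copy so later layers can't mutate it
    return x, activations

output, feats = forward_with_activations([1.0, 2.0])
print(feats)  # → [[2.0, 4.0], [3.0, 5.0]]
```

In a real framework (e.g., PyTorch) the same pattern is typically implemented with forward hooks on the frozen network, so no external encoder is needed.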

🔊➡️📽️ Example:
Besides Video-to-Audio (📽️➡️🔊), we also support Audio-to-Video (🔊➡️📽️) generation under the same unified framework.
Compared to Meta Movie Gen's Video-to-Audio model, we achieve significantly better temporal synchronization with a model 90% smaller.
Precise temporal synchronization remains a significant challenge for current video-to-audio models. AV-Link addresses this by leveraging diffusion features to accurately capture both local and global temporal events, such as hand slides on a guitar and fretboard pitch changes.