Jaesung Huh
jaesunghuh.bsky.social
PhD student @VGG_Oxford; ex-intern @Meta Reality Labs; Audio-visual learning
I'm releasing the audio-visual diarization pipeline that was used to create the VoxConverse dataset. Along with the original code, an enhanced version featuring new VAD and speaker verification models is now available.
February 20, 2025 at 4:25 AM
As my PhD journey nears its end (I'm currently looking for a full-time Research Engineer / Scientist position!), I'm trying to open-source all the code I've contributed to! This is the first release.

#VoxConverse #Speakerdiarization #Audiovisual
February 20, 2025 at 4:17 AM