Lightnews — Scholar-powered news

Shinji Watanabe

@shinjiw.bsky.social

I'm working at CMU (2021-). I was working at NTT (2001-2011), MERL (2012-2017), and JHU (2017-2020). Speech and Audio Processing is my main research topic.


Reposted by Shinji Watanabe

Kwanghee Choi

@juice500ml.bsky.social

Can self-supervised models 🤖 understand allophony 🗣? Excited to share my new #NAACL2025 paper: Leveraging Allophony in Self-Supervised Speech Models for Atypical Pronunciation Assessment arxiv.org/abs/2502.07029 (1/n)

April 29, 2025 at 5:00 PM

Shinji Watanabe

@shinjiw.bsky.social

📢 Introducing VERSA: our new open-source toolkit for speech & audio evaluation!

- 80+ metrics in one unified interface
- Flexible input support
- Distributed evaluation with Slurm
- ESPnet compatible

Check out the details
wavlab.org/activities/2...
github.com/wavlab-speec...

April 28, 2025 at 7:50 PM

Reposted by Shinji Watanabe

siddhant-arora.bsky.social

@siddhant-arora.bsky.social

New #NAACL2025 demo! Excited to introduce ESPnet-SDS, a new open-source toolkit for building unified web interfaces for both cascaded & end-to-end spoken dialogue systems, providing real-time evaluation, and more!
📜: arxiv.org/abs/2503.08533
Live Demo: huggingface.co/spaces/Siddh...

March 17, 2025 at 2:29 PM

Reposted by Shinji Watanabe

siddhant-arora.bsky.social

@siddhant-arora.bsky.social

🚀 New #ICLR2025 Paper Alert! 🚀

Can Audio Foundation Models like Moshi and GPT-4o truly engage in natural conversations? 🗣️🔊

We benchmark their turn-taking abilities and uncover major gaps in conversational AI. 🧵👇

📜: arxiv.org/abs/2503.01174

March 5, 2025 at 4:03 PM

Reposted by Shinji Watanabe

Badr M. Abdullah, PhD

@badralabsi.bsky.social

📣 #SpeechTech & #SpeechScience people

We are organizing a special session at #Interspeech2025 on: Interpretability in Audio & Speech Technology

Check out the special session website: sites.google.com/view/intersp...

Paper submission deadline 📆 12 February 2025

December 6, 2024 at 9:30 PM

Reposted by Shinji Watanabe

Martijn Bartelds

@mbartelds.bsky.social

Excited to announce the launch of our ML-SUPERB 2.0 challenge @interspeech.bsky.social 2025! Join us in pushing the boundaries of multilingual ASR and LID! 🚀

💻 multilingual.superbbenchmark.org

December 4, 2024 at 6:09 PM

Shinji Watanabe

@shinjiw.bsky.social

We are excited to announce the launch of ML SUPERB 2.0 (multilingual.superbbenchmark.org) as part of the Interspeech 2025 official challenge! We hope this upgraded version of ML SUPERB advances universal access to speech processing worldwide. Please join us!

#Interspeech2025

December 4, 2024 at 2:45 PM

Shinji Watanabe

@shinjiw.bsky.social

This is my first official post at Bluesky with great news :)

We got the best paper award at IEEE SLT'24! This work elegantly and straightforwardly solves contextual biasing issues with dynamic vocabulary arxiv.org/abs/2405.13344. Congrats, Yui, Yosuke, Shakeel, and Yifan! I'm super happy!

December 4, 2024 at 2:16 PM

Reposted by Shinji Watanabe

Odette Scharenborg

@odettes.bsky.social

Hi speech people, super exciting news here!

We are running another "Multimodal information based speech (MISP)" Challenge at @interspeech.bsky.social

Participate!
Spread the word!

More info 👇
mispchallenge.github.io/mispchalleng...

Multimodal Information Based Speech Processing (MISP) 2025 Challenge

mispchallenge.github.io

November 25, 2024 at 11:25 AM
