Shinji Watanabe
shinjiw.bsky.social
Shinji Watanabe
@shinjiw.bsky.social
I have been working at CMU since 2021. Before that, I worked at NTT (2001-2011), MERL (2012-2017), and JHU (2017-2020). Speech and audio processing is my main research topic.
Reposted by Shinji Watanabe
Can self-supervised models 🤖 understand allophony 🗣? Excited to share my new #NAACL2025 paper: Leveraging Allophony in Self-Supervised Speech Models for Atypical Pronunciation Assessment arxiv.org/abs/2502.07029 (1/n)
April 29, 2025 at 5:00 PM
📢 Introducing VERSA: our new open-source toolkit for speech & audio evaluation!

- 80+ metrics in one unified interface
- Flexible input support
- Distributed evaluation with Slurm
- ESPnet compatible

Check out the details
wavlab.org/activities/2...
github.com/wavlab-speec...
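To give a feel for the "unified interface" idea, here is a minimal, self-contained Python sketch of how many metrics can sit behind one dispatch loop. The metric functions and registry names below are illustrative stand-ins, not VERSA's actual API; see the repository above for the real configuration and entry points.

```python
# Sketch of a unified multi-metric evaluation loop (illustrative only,
# NOT VERSA's real API). One registry maps metric names to callables,
# so adding a metric never changes the evaluation code itself.
import numpy as np

def snr_db(pred: np.ndarray, ref: np.ndarray) -> float:
    """Signal-to-noise ratio in dB between reference and prediction."""
    noise = ref - pred
    return float(10.0 * np.log10(np.sum(ref ** 2) / (np.sum(noise ** 2) + 1e-12)))

def mae(pred: np.ndarray, ref: np.ndarray) -> float:
    """Mean absolute sample-level error."""
    return float(np.mean(np.abs(pred - ref)))

METRICS = {"snr_db": snr_db, "mae": mae}

def evaluate(pred: np.ndarray, ref: np.ndarray, metric_names: list[str]) -> dict[str, float]:
    """Run every requested metric on one (prediction, reference) pair."""
    return {name: METRICS[name](pred, ref) for name in metric_names}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.standard_normal(16000)                 # 1 s of toy "audio" at 16 kHz
    pred = ref + 0.01 * rng.standard_normal(16000)   # lightly degraded copy
    print(evaluate(pred, ref, ["snr_db", "mae"]))
```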
April 28, 2025 at 7:50 PM
Reposted by Shinji Watanabe
New #NAACL2025 demo! Excited to introduce ESPnet-SDS, a new open-source toolkit for building unified web interfaces for both cascaded & end-to-end spoken dialogue systems, with real-time evaluation and more!
📜: arxiv.org/abs/2503.08533
Live Demo: huggingface.co/spaces/Siddh...
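As a rough illustration of why a single interface can host both system types, here is a minimal Python sketch in which a cascaded ASR -> LLM -> TTS pipeline and an end-to-end speech-to-speech model expose the same speech-in/speech-out call. All class and function names are hypothetical placeholders, not the ESPnet-SDS API; see the paper and demo above for the real toolkit.

```python
# Cascaded vs. end-to-end spoken dialogue systems behind one call
# (illustrative placeholders only, NOT the ESPnet-SDS API).
from typing import Protocol

class DialogueSystem(Protocol):
    def respond(self, user_speech: bytes) -> bytes:
        """Map one user utterance (audio) to one system utterance (audio)."""
        ...

class CascadedSystem:
    """ASR -> text dialogue model -> TTS, chained explicitly."""
    def __init__(self, asr, llm, tts):
        self.asr, self.llm, self.tts = asr, llm, tts

    def respond(self, user_speech: bytes) -> bytes:
        text_in = self.asr(user_speech)   # speech -> text
        text_out = self.llm(text_in)      # text -> text
        return self.tts(text_out)         # text -> speech

class EndToEndSystem:
    """A single speech-to-speech model."""
    def __init__(self, speech_lm):
        self.speech_lm = speech_lm

    def respond(self, user_speech: bytes) -> bytes:
        return self.speech_lm(user_speech)

def serve_turn(system: DialogueSystem, user_speech: bytes) -> bytes:
    """A web interface only ever calls this, regardless of system type."""
    return system.respond(user_speech)

if __name__ == "__main__":
    # Toy stand-ins: "speech" is just text in bytes so the sketch runs anywhere.
    cascaded = CascadedSystem(
        asr=lambda audio: audio.decode(),
        llm=lambda text: f"You said: {text}",
        tts=lambda text: text.encode(),
    )
    e2e = EndToEndSystem(speech_lm=lambda audio: b"(end-to-end reply to) " + audio)
    for system in (cascaded, e2e):
        print(serve_turn(system, b"hello"))
```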
March 17, 2025 at 2:29 PM
Reposted by Shinji Watanabe
🚀 New #ICLR2025 Paper Alert! 🚀

Can Audio Foundation Models like Moshi and GPT-4o truly engage in natural conversations? 🗣️🔊

We benchmark their turn-taking abilities and uncover major gaps in conversational AI. 🧵👇

📜: arxiv.org/abs/2503.01174
March 5, 2025 at 4:03 PM
Reposted by Shinji Watanabe
📣 #SpeechTech & #SpeechScience people

We are organizing a special session at #Interspeech2025 on: Interpretability in Audio & Speech Technology

Check out the special session website: sites.google.com/view/intersp...

Paper submission deadline 📆 12 February 2025
December 6, 2024 at 9:30 PM
Reposted by Shinji Watanabe
Excited to announce the launch of our ML-SUPERB 2.0 challenge @interspeech.bsky.social 2025! Join us in pushing the boundaries of multilingual ASR and LID! 🚀

💻 multilingual.superbbenchmark.org
December 4, 2024 at 6:09 PM
We are excited to announce the launch of ML-SUPERB 2.0 (multilingual.superbbenchmark.org) as part of the Interspeech 2025 official challenge! We hope this upgraded version of ML-SUPERB advances universal access to speech processing worldwide. Please join us!

#Interspeech2025
December 4, 2024 at 2:45 PM
This is my first official post at Bluesky with great news :)

We got the Best Paper Award at IEEE SLT'24! This work elegantly and straightforwardly solves contextual biasing issues with a dynamic vocabulary: arxiv.org/abs/2405.13344. Congrats, Yui, Yosuke, Shakeel, and Yifan! I'm super happy!
December 4, 2024 at 2:16 PM
Reposted by Shinji Watanabe
Hi speech people, super exciting news here!

We are running another Multimodal Information Based Speech Processing (MISP) challenge, MISP 2025, at @interspeech.bsky.social

Participate!
Spread the word!

More info 👇
mispchallenge.github.io/mispchalleng...
November 25, 2024 at 11:25 AM