naoyukikandaslp.bsky.social
I was just notified that our E2 TTS paper received the Best Paper Award at IEEE #SLT2024! Many thanks to all the remarkable collaborators who made this happen!

Paper: arxiv.org/abs/2406.18009
Demo: aka.ms/e2tts
December 5, 2024 at 3:38 AM
TS3-Codec: yet another audio codec from my former team—simple, fast, and high-quality.

Simple—just a stack of Transformer and linear layers; no convolutions.

Faster and better—superior audio reconstruction quality with fewer MACs compared to strong convolution-based baselines.
The paper presents TS3-Codec, a Transformer-based audio codec that achieves comparable or superior performance to state-of-the-art convolution-based codecs with fewer parameters and less computation under streaming conditions.
TS3-Codec: Transformer-Based Simple Streaming Single Codec
Haibin Wu, Naoyuki Kanda, Sefik Emre Eskimez, Jinyu Li
arxiv.org
December 3, 2024 at 3:53 AM
Reposted
Our GenAI-Speech team at Meta is hiring RS interns for summer 2025 to work on speech, LLMs, dialog generation, and other exciting stuff! Check out the job posting here: www.metacareers.com/jobs/3841154...
Research Scientist Intern, AI Research - Speech & Audio (PhD)
Meta's mission is to build the future of human connection and the technology that makes it possible.
November 22, 2024 at 3:41 AM