One could use diarization to build a searchable archive of all interviews with women politicians...
🌐 Smart AI Transcriptions
#transcription #gdpr #speakerrecognition #fast&fair #diarization #ncaa #multilingual
Amazon Bedrock Data Automation now supports enhanced audio transcription with speaker diarization and channel identification, enabling separate processing of multi-party conversations. Available in 7 AWS regions.
Playing with Voices: Tabletop Role-Playing Game Recordings as a Diarization Challenge
https://arxiv.org/abs/2502.12714
Abstract: This paper describes the speaker diarization system developed for the Multimodal Information-Based Speech Processing (MISP) 2025 Challenge. First, we utilize the Sequence-to-Sequence Neural Diarization [1/3 of https://arxiv.org/abs/2505.16387v1]
https://github.com/QuentinFuxa/WhisperLiveKit
A multilingual speech representation you can fine-tune for ASR or repurpose for tasks like diarization, keyword spotting, or alignment.
Abstract: Self-supervised learning (SSL) models like WavLM can be effectively utilized when building speaker diarization systems but are often large and slow, limiting their use in resource constrained scenarios. [1/4 of https://arxiv.org/abs/2505.24111v1]
💬 Hacker News community praises audio-to-text projects & shares tools—support for diarization debated. 📈
https://news.ycombinator.com/item?id=45337400
http://arxiv.org/abs/2309.08489
Incredible latency & accuracy *and* only $0.15 per hour. Insane.
www.assemblyai.com/blog/introdu...