winbuzzer.com/2025/11/11/a...
AI Speech Recognition and Transcription: New Meta AI System Supports over 1,600 Languages
#AI #MetaAI #OpenSource #ASR #SpeechRecognition #MetaFAIR #LanguageTechnology #Developer #DeepLearning #NLP #Linguistics #Multilingual
AI Speech Recognition and Transcription: New Meta AI System Supports over 1,600 Languages - WinBuzzer
Meta's FAIR division has released Omnilingual ASR, a free, open-source speech recognition model that supports over 1,600 languages, including 500 for the first time.
winbuzzer.com
November 11, 2025 at 8:07 AM
Played a bit with the SpeechRecognition API 🤩
Here’s my playground: codepen.io/leaverou/pen...
Safari claims to support it, but I couldn't get it to recognize any non-English language. Can you?
Also, it seems *exceedingly* slow. Like 5-6 seconds from the moment you stop speaking.
SpeechRecognition demo
...
codepen.io
October 24, 2025 at 3:08 PM
Don't miss our January offers!
Buy a 1-year speech recognition subscription this January, and we'll give you an extra 2 months for free.
Find out more and claim this offer at www.lexacom.co.uk/january-offe...
#nhs #nhsdigital #speechrecognition #healthtech
January 27, 2025 at 8:58 AM
Shutting down my computer with my voice
This voice command (shut down) shuts the computer down.
Demo of this command in the video below
#OpenSource
#Accessiblity
#opensource
#ergonomics
#coding
#python
#speechrecognition
#programming
#oss
#accessibility
December 29, 2024 at 10:27 PM
Tech question: Is there a #speechrecognition software that:
1) works for generating #video captions (I would like to use them for #peertube videos)
2) is trainable
3) is #foss ?
September 17, 2025 at 10:07 AM
SA‑Whisper extends Whisper to transcribe overlapping speech with speaker tags, achieving lower word error rates on the LibriMix benchmark via joint decoding. Read more: https://getnews.me/speaker-attributed-whisper-model-improves-multi-talker-speech-recognition/ #speechrecognition #whispermodel
October 8, 2025 at 7:51 AM
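For context on the word error rate (WER) mentioned in the post above: WER is the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. Below is a minimal Python sketch, assuming plain whitespace tokenization. The function name `word_error_rate` and the sample sentences are illustrative only and are not taken from the SA-Whisper work or its evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a free match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences: one of four reference words is dropped.
example_wer = word_error_rate("shut down the computer", "shut the computer")
```

Real evaluations usually normalize case and punctuation before scoring, and multi-talker benchmarks add speaker-assignment rules on top. This sketch covers only the basic metric.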
I also made a SpeechRecognition polyfill (utilizing an open source STT server) a while ago that's useful for speech-driven experiences on platforms like Quest where SpeechRecognition isn't implemented natively in the browser https://github.com/msub2/sepia-speechrecognition-polyfill
July 3, 2023 at 6:26 PM
Restarting my computer with my voice
This voice command (restart) restarts the computer.
Demo of this command in the video below
#OpenSource
#Accessiblity
#opensource
#ergonomics
#coding
#python
#speechrecognition
#programming
#oss
#accessibility
December 29, 2024 at 11:09 PM
🚀 Meet Solaria! Our revolutionary AI transforms call centers with real-time, multilingual transcription in 100 languages—imagine handling customer queries in any language instantly! How do you think AI can elevate customer service? 🌍💬 #AI #CustomerExperience #SpeechRecognition LINK
April 2, 2025 at 4:18 PM
Breakthrough: Slam-1 speech model in public beta 🎙️
Customizable prompts enhance transcription accuracy for industry-specific needs, no complex development required.
Will this revolutionize speech-to-text technology?
#AI #SpeechRecognition
April 25, 2025 at 11:07 AM
Some nice aspects are:
- flexible choice of VectorStore
- own model routing
- multi-language support using multiple stores
- flexible authentication options
- SSL cert supported
- backend can be swapped, if needed.
- SpeechRecognition
Plus, I wrote a customer-branded WebUI
November 3, 2025 at 7:11 PM
🚀 When we say "a new generation of speech recognition", we have to mention the latest release from Alibaba:
Qwen3-ASR 🎙️ … one model that covers everything!
#Qwen3ASR #AI #ASR #Transcription #حسام_الدين_حسن #ePreneurs #SpeechRecognition #AItools #الذكاء_الاصطناعي
September 10, 2025 at 12:15 PM
#WhisperWeb: Run #OpenAI's #Whisper Large v3 Turbo locally in-browser 🎙️💻 #AI transcribes 25min audio in <20s. 100% free, no internet needed. Built with #TransformersJS. #SpeechRecognition #MLPowered
github.com/xenova/whisp...
GitHub - xenova/whisper-web: ML-powered speech recognition directly in your browser
ML-powered speech recognition directly in your browser - xenova/whisper-web
github.com
October 10, 2024 at 9:08 PM
JMIR Formative Res: Preprocessing Large-Scale Conversational Datasets: A Framework and Its Application to Behavioral Health Transcripts #AI #DataScience #MachineLearning #SpeechRecognition #DataPreprocessing
Preprocessing Large-Scale Conversational Datasets: A Framework and Its Application to Behavioral Health Transcripts
Background: The rise of AI and accessible audio equipment has led to a proliferation of recorded conversation transcripts datasets across various fields. However, automatic mass recording and transcription often produce noisy, unstructured data. First, these datasets naturally include unintended recordings, such as hallway conversations, background noise and media (e.g., TV programs, radio, phone calls). Second, automatic speech recognition (ASR) and speaker diarization errors can result in misidentified words, speaker misattributions, and other transcription inaccuracies. As a result, large conversational transcript datasets require careful preprocessing and filtering to ensure their research utility. This challenge is particularly relevant in behavioral health contexts (e.g., therapy, treatment, counselling): while these transcripts offer valuable insights into patient-provider interactions, therapeutic techniques, and client progress, they must accurately represent the conversations to support meaningful research. Objective: We present a framework for preprocessing and filtering large datasets of conversational transcripts and apply it to a dataset of behavioral health transcripts from community mental health clinics across the United States. Within this framework we explore tools to efficiently filter non-sessions – transcripts of recordings in these clinics that do not reflect a behavioral treatment session but instead capture unrelated conversations or background noise. Methods: Our framework integrates basic feature extraction, human annotation, and advanced applications of large language models (LLMs). We begin by mapping transcription errors and assessing the distribution of sessions and non-sessions. Next, we identify key features to analyze how outliers help in characterizing the type of transcript. Notably, we use LLM perplexity as a measure of comprehensibility to assess transcript noise levels. 
Finally, we use zero-shot LLM prompting to classify transcripts as sessions or non-sessions, validating LLM decisions against expert annotations. Throughout, we prioritize data security by selecting tools that preserve anonymity and minimize the risk of data breaches. Results: Our findings demonstrated that basic statistical outliers, such as speaking rate, are associated with transcription errors and are observed more frequently in non-sessions versus sessions. Specifically, LLM perplexity can flag fragmented and non-verbal segments and is generally lower in sessions (permutation test mean difference = -258, p
dlvr.it
October 24, 2025 at 7:54 PM
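The abstract above uses LLM perplexity as a noise signal. Perplexity is conventionally the exponential of the negative mean per-token log-probability, so text a model finds predictable scores low and fragmented text scores high. Below is a minimal Python sketch, assuming per-token natural-log probabilities have already been obtained from some language model. The function name `perplexity` and all numbers are hypothetical and are not the paper's data or pipeline.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Return exp(-mean(log p)) over the tokens of one transcript."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Hypothetical values, invented for illustration only.
fluent = [math.log(0.5)] * 4       # every token predicted with probability 0.5
fragmented = [math.log(0.01)] * 4  # every token predicted with probability 0.01

fluent_ppl = perplexity(fluent)          # about 2
fragmented_ppl = perplexity(fragmented)  # about 100
```

Under these invented numbers the fragmented sequence scores roughly fifty times higher. A filter built on this idea would need a threshold tuned against annotated transcripts.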
📢️ Published today:
Dr Andrew Whiteley explains why voice recognition has the potential to be one of the most effective additions to the modern GP’s toolkit.
🗞 Read the full article on Healthcare Today.
#nhs #speechrecognition #digitaltransformation #healthcarenews @healthcaretoday.bsky.social
August 7, 2025 at 3:31 PM
Why Is NLP Essential in Speech Recognition Systems?
Audio annotation services play a key role in training machine learning models to accurately understand and interpret audio data. These services use human annotators to label, transc…
#ai #nlp #speechrecognition
Why Is NLP Essential in Speech Recognition Systems?
dzone.com
June 27, 2025 at 3:04 PM
#Nvidia released a powerful new set of open-source tools, #Granary, aimed at giving developers the power to build high-quality #speech #AI for 25 European #languages. It is curated to help teach AI the nuances of #SpeechRecognition & #translation. www.artificialintelligence-news.com/news/nvidia-...
NVIDIA aims to solve AI's issues with many languages
While AI might feel ubiquitous, it primarily operates in a tiny fraction of the world's 7,000 languages, a blind spot NVIDIA aims to fix.
www.artificialintelligence-news.com
August 23, 2025 at 12:46 PM
NLP Applications: From Your Phone to Global Business #machinetranslation #speechrecognition #chatbottechnology #textanalytics #whatisnlp #namedentityrecognition #applicationsofnlpinhealthcare #topicmodeling #ailanguagemodels #nlpexamples
NLP Applications: From Your Phone to Global Business
Unlocking Our Digital World: The Real-World Natural Language Processing Applications You Use Every Day. Ever wonder how your phone finishes your sentences? Or how Gmail knows that sketchy email is…
jivoice.com
October 24, 2025 at 11:34 AM
"It saves a lot of time that I used to spend typing, editing and spell checking" - Dr Murad Khan.
Lexacom Echo scored 94% when we asked customers for their views on the accuracy of its speech recognition.
Read more: www.lexacom.co.uk/proven-time-...
#nhs #speechrecognition #workflows #primarycare
January 13, 2025 at 9:46 AM
Spiralformer reduces token emission latency by 21.6% on LibriSpeech and 7.0% on CSJ while maintaining accuracy, and the paper was accepted to the 2025 IEEE ASRU workshop. Read more: https://getnews.me/spiralformer-low-latency-encoder-for-streaming-speech-recognition/ #spiralformer #speechrecognition
October 3, 2025 at 5:37 AM
Unlock the power of Speech Recognition with advanced Audio Codecs! Dive into how audio processing enables seamless interaction in IoT, automotive, and smart devices.
Explore the future of voice-driven tech. Read more: moschip.com/blog/spe...
#SpeechRecognition #AudioCodecs #IoT
January 28, 2025 at 10:29 AM
Trusted for over 25 years, our customers save more than 3 hours a week.
25,000 clinicians use Lexacom software every day, amounting to an incredible 3.5 million NHS hours saved per year.
Read more: www.lexacom.co.uk/proven-time-...
#nhs #nhsdigital #workflows #speechrecognition #digitaldictation
December 16, 2024 at 11:13 AM
Choose Lexacom, relied on by most UK GPs in their daily work, for unparalleled security on UK-based servers and industry-leading speech recognition.
Learn more at www.lexacom.co.uk/helping-clin...
#nhs #healthtech #nhsAI #workflows #speechrecognition #nhsdigital
July 4, 2025 at 8:16 AM