Auditory-Visual Speech Association (AVISA)
@avsp.bsky.social
The official(ish) account of the Auditory-Visual Speech Association (AVISA). AV 👄 👓 speech references, but mostly what interests me. avisa.loria.fr
Reposted by Auditory-Visual Speech Association (AVISA)
A deep neural network model of audiovisual speech recognition reports the McGurk effect https://pubmed.ncbi.nlm.nih.gov/41709058/
February 19, 2026 at 2:50 PM
Reposted by Auditory-Visual Speech Association (AVISA)
“Humans across multiple languages spontaneously associate the nonwords kiki & bouba with spiky & round shapes, respectively...We tested the bouba-kiki effect in baby chickens. Similar to humans, they spontaneously chose a spiky shape when hearing a kiki sound & a round shape when hearing a bouba.”😲🧪
Matching sounds to shapes: Evidence of the bouba-kiki effect in naïve baby chicks
Humans across multiple languages spontaneously associate the nonwords “kiki” and “bouba” with spiky and round shapes, respectively, a phenomenon named the bouba-kiki effect. To explore the origin of t...
www.science.org
February 19, 2026 at 7:20 PM
Reposted by Auditory-Visual Speech Association (AVISA)
Multisensory simultaneity-judgment training and speech comprehension in patients with schizophrenia https://pubmed.ncbi.nlm.nih.gov/41701990/
February 18, 2026 at 2:52 PM
Acoustic Features of Emotional Vocalizations Account for Early Modulations of ERPs onlinelibrary.wiley.com/doi/10.1111/... "We conclude that acoustic features can account for early ERP modulations... Because previous studies used a variety of stimuli, our result likely resolves previous disputes"
Psychophysiology | SPR Journal | Wiley Online Library
Electroencephalography (EEG) studies of the perception of emotional speech indicate that early components of event-related brain potentials (ERPs) are modulated by emotion. However, the direction and...
onlinelibrary.wiley.com
February 16, 2026 at 9:01 PM
Reposted by Auditory-Visual Speech Association (AVISA)
Facial gestures are enacted through a cortical hierarchy of dynamic and stable codes | Science www.science.org/doi/10.1126/...

#neuroskyence
Facial gestures are enacted through a cortical hierarchy of dynamic and stable codes
Facial gestures are one fundamental set of communicative behaviors in primates, generated through the dynamic arrangement of many fine muscles. Anatomy shows that facial muscles are under direct contr...
www.science.org
February 16, 2026 at 8:17 AM
Some lips are red
Some eyes are blue
Seeing you speak
means I can identify more of the key words you said in noise
February 15, 2026 at 5:32 AM
Reposted by Auditory-Visual Speech Association (AVISA)
Watching Yourself Talk: Motor Experience Sharpens Sensitivity to Gesture-Speech Asynchrony

Tiziana Vercillo, Judith Holler, Uta Noppeney

www.biorxiv.org/content/10.6...
www.biorxiv.org
February 13, 2026 at 10:40 AM
Reposted by Auditory-Visual Speech Association (AVISA)
📚 Citation Classic

"Phonetic and phonological representation of stop consonant voicing"
Patricia Keating (1984)
Citations: 859+

Structured view of [voice] feature to phonetic implement...

🔗 https://www.jstor.org/stable/pdf/413642.pdf

#SpeechScience
February 12, 2026 at 12:19 PM
How does a deep neural network look at lexical stress in English words? pubs.aip.org/asa/jasa/art... CNNs trained to predict stress position from a spectrographic representation of disyllabic words -> 92% accuracy on held-out tests; interpretability analysis points to the stressed vowel's 1st & 2nd formants as key
How does a deep neural network look at lexical stress in English words?
Despite their success in speech processing, neural networks often operate as black boxes, prompting the following questions: What informs their decisions, and h
pubs.aip.org
February 11, 2026 at 10:17 PM
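For readers curious about the shape of such a model, here is a minimal, hedged sketch of a CNN stress classifier over spectrograms. It is not the paper's architecture; the layer sizes, the 64×128 log-mel input, and the two-way "stress on syllable 1 vs 2" output are illustrative assumptions only.

```python
# Hedged sketch, NOT the paper's model: a tiny 2-class CNN over a
# log-mel spectrogram of a disyllabic word, predicting which syllable
# carries lexical stress. All shapes and layer sizes are assumptions.
import torch
import torch.nn as nn

class StressCNN(nn.Module):
    def __init__(self, n_mels=64, n_frames=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * (n_mels // 4) * (n_frames // 4), 2)

    def forward(self, spec):                   # spec: (batch, 1, n_mels, n_frames)
        x = self.features(spec)
        return self.classifier(x.flatten(1))   # logits: stress on syllable 1 vs 2

model = StressCNN()
dummy = torch.randn(8, 1, 64, 128)             # batch of 8 fake spectrograms
print(model(dummy).shape)                      # torch.Size([8, 2])
```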
Reposted by Auditory-Visual Speech Association (AVISA)
Our latest paper, “Visual language models show widespread visual deficits on neuropsychological tests”, is now out in Nature Machine Intelligence: www.nature.com/articles/s42...

Non-paywalled version:
arxiv.org/abs/2504.10786

Tweet thread below from first author @genetang.bsky.social...
Visual language models show widespread visual deficits on neuropsychological tests - Nature Machine Intelligence
Tangtartharakul and Storrs use standardized neuropsychological tests to compare human visual abilities with those of visual language models (VLMs). They report that while VLMs excel in high-level obje...
www.nature.com
February 9, 2026 at 2:40 AM
X-modal processing of auditory & visual symbol representations in the temporo-parietal cortex
www.researchsquare.com/article/rs-8...
Slow event-related 3T fMRI: a passive listening/viewing task with auditory/visual letters & numbers; overlapping activation in auditory cortex for auditory letters/numbers
Cross-modal processing of auditory and visual symbol representations in the temporo-parietal cortex
Numeracy and literacy are fundamental cognitive skills that rely on associating visual symbols with their spoken representations. Prior research has identified the posterior temporal-parietal cortex a...
www.researchsquare.com
February 9, 2026 at 1:49 AM
Attention decoding at the cocktail party: Preserved in hearing aid users, reduced in cochlear implant users www.sciencedirect.com/science/arti... 29 HA, 24 CI users & 29 age-matched TH listeners; EEG while attending 1 of 2 talkers (female/male) in free-field; EEG <-> envelope linear backward & forward models
Attention decoding at the cocktail party: Preserved in hearing aid users, reduced in cochlear implant users
Users of hearing aids (HAs) and cochlear implants (CIs) experience significant difficulty understanding a target speaker in multi-talker environments …
www.sciencedirect.com
February 8, 2026 at 9:27 PM
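As a rough illustration of what a linear "backward model" does in this kind of study, here is a hedged sketch: a time-lagged ridge regression reconstructs the speech envelope from multichannel EEG, and the attended talker is taken to be the one whose envelope correlates best with the reconstruction. This is not the paper's pipeline; the channel count, lag range, and regularization value are assumptions.

```python
# Hedged sketch of backward-model auditory attention decoding.
# Not the paper's code; toy data stands in for preprocessed EEG/envelopes.
import numpy as np

def lag_matrix(eeg, n_lags):
    """Stack time-lagged copies of EEG (samples x channels) column-wise."""
    n, ch = eeg.shape
    X = np.zeros((n, ch * n_lags))
    for lag in range(n_lags):
        X[lag:, lag * ch:(lag + 1) * ch] = eeg[:n - lag]
    return X

def train_backward_model(eeg, envelope, n_lags=16, ridge=1e3):
    # Ridge-regularized least squares: w = (X'X + lambda*I)^-1 X'y
    X = lag_matrix(eeg, n_lags)
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ envelope)

def decode_attention(eeg, env_a, env_b, w, n_lags=16):
    # Reconstruct the envelope from EEG, then pick the better-matching talker.
    recon = lag_matrix(eeg, n_lags) @ w
    corr = lambda a, b: np.corrcoef(a, b)[0, 1]
    return "A" if corr(recon, env_a) > corr(recon, env_b) else "B"

rng = np.random.default_rng(0)
eeg = rng.standard_normal((5000, 32))                      # samples x channels
env_attended = eeg[:, 0] + 0.5 * rng.standard_normal(5000)  # toy attended envelope
env_ignored = rng.standard_normal(5000)                     # toy ignored envelope
w = train_backward_model(eeg, env_attended)
print(decode_attention(eeg, env_attended, env_ignored, w))  # expect "A"
```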
Toward Fuller Integration of Respiratory Rhythms Into Research on Infant Vocal & Motor Development nyaspubs.onlinelibrary.wiley.com/doi/10.1111/... Assays motor control, physiology, speech & language acquisition; proposes respiration is core in early rhythmic coordination linking vocalization & movement
NYAS Publications
From birth, respiration constitutes an intrinsic rhythm. We suggest that vocalizations and bodily movements are interactively coordinated with this respiratory rhythm, providing a temporal framework ....
nyaspubs.onlinelibrary.wiley.com
February 7, 2026 at 3:23 AM
Human newborns form musical predictions based on rhythmic but not melodic structure journals.plos.org/plosbiology/... TRF analyses showed high inter-individual variability in overall neural tracking of musical stimuli; note-by-note predictability was tracked (not in the shuffled condition), a rhythmic rather than melodic effect
Human newborns form musical predictions based on rhythmic but not melodic structure
The ability to anticipate musical structure is a fundamental human trait, but whether it exists at birth is unclear. This study shows that newborns encode rhythmic expectations based on statistical re...
journals.plos.org
February 6, 2026 at 10:01 AM
Explaining the Musical Advantage in Speech Perception Through Beat Perception and Working Memory nyaspubs.onlinelibrary.wiley.com/doi/10.1111/... "Our findings clarify the cognitive and temporal foundations of the musician advantage and highlight the value of considering musical engagement"
NYAS Publications
Musical experience enhances speech-in-noise (SIN) perception, yet the mechanisms remain unclear. We tested 62 young adults using continuous measures of musical engagement, auditory and cognitive skil...
nyaspubs.onlinelibrary.wiley.com
February 5, 2026 at 10:47 PM
Individuals with congenital amusia show degraded performance in a nonword repetition task with lexical tones www.sciencedirect.com/science/arti... Nonword repetition task for syllable-tone combinations, with nonword length gradually increased from 1 to 7 syllables; accuracy & errors analysed
Individuals with congenital amusia show degraded performance in a nonword repetition task with lexical tones
Congenital amusia is a disorder characterized by abnormal pitch processing, including pitch encoding and pitch memory. Individuals with amusia were im…
www.sciencedirect.com
February 5, 2026 at 7:21 AM
Reposted by Auditory-Visual Speech Association (AVISA)
Early multimodal behavioral cues in autism: a micro-analytical exploration of actions, gestures and speech during naturalistic parent-child interactions https://pubmed.ncbi.nlm.nih.gov/41631016/
February 3, 2026 at 2:49 PM
Reposted by Auditory-Visual Speech Association (AVISA)
In 1961, physicist John Kelly programmed an IBM 704 to sing 'Daisy Bell' - the first song ever sung by a computer. This inspired HAL 9000's song in 2001: A Space Odyssey!

🎵 Historic: youtube.com/watch?v=41U78QP8nBk

#SpeechScience #Technology
February 1, 2026 at 10:58 AM
"We want AI agents that can discover like we can, not which contain what we have discovered. Building in our discoveries only makes it harder to see how the discovering process can be done" scholar.google.com.au/scholar?oi=b... 😲
scholar.google.com.au
February 1, 2026 at 12:42 AM
Reposted by Auditory-Visual Speech Association (AVISA)
Cutaneous alternating current stimulation can cause a phasic modulation of speech perception https://pubmed.ncbi.nlm.nih.gov/41617605/
January 31, 2026 at 5:48 AM
Reposted by Auditory-Visual Speech Association (AVISA)

The cortical contribution to the speech-FFR is not modulated by visual information

https://www.biorxiv.org/content/10.64898/2026.01.26.701703v1
January 28, 2026 at 7:24 AM
Reposted by Auditory-Visual Speech Association (AVISA)
Audio-visual speech-in-noise tests for evaluating speech reception thresholds: A scoping review https://pubmed.ncbi.nlm.nih.gov/41592005/
January 28, 2026 at 2:34 AM
An embodied multi-articulatory multimodal language framework: A commentary on Karadöller et al.
journals.sagepub.com/doi/10.1177/...
"we believe it shows that our understanding of the role of gesture in language is incomplete and lacks crucial insight when co-sign gesture is not accounted for"
An embodied multi-articulatory multimodal language framework: A commentary on Karadöller, Sümer and Özyürek - Rachel Miles, Shai Lynne Nielson, Deniz İlkbaşaran, Rachel I Mayberry, 2025
While many researchers working in spoken languages have used modality to distinguish language and gesture, this is not possible for sign language researchers. W...
journals.sagepub.com
January 26, 2026 at 9:22 PM
The involvement of endogenous brain rhythms in speech processing www.sciencedirect.com/science/arti... Reviews oscillation-based theories (dynamic attending, active sensing, asymmetric sampling in time, segmentation theories) & evidence; concludes naturalistic paradigms and resting-state data are key to progress
The involvement of endogenous brain rhythms in speech processing
Endogenous brain rhythms are at the core of oscillation-based neurobiological theories of speech. These brain rhythms have been proposed to play a cru…
www.sciencedirect.com
January 23, 2026 at 9:03 PM
Children Sustain Their Attention on Spatial Scenes When Planning to Describe Spatial Relations Multimodally in Speech & Gesture onlinelibrary.wiley.com/doi/10.1111/... "How do children allocate visual attention to scenes as they prepare to describe them multimodally in speech and co-speech gesture?"
onlinelibrary.wiley.com
January 20, 2026 at 10:55 PM