Auditory-Visual Speech Association (AVISA)
@avsp.bsky.social
The official(ish) account of the Auditory-Visual Speech Association (AVISA) AV 👄 👓 speech references, but mostly what interests me avisa.loria.fr
Pinned
A teaser for the next instalment of AVSP Visionaries
youtube.com/watch?v=y2e9...
Neural Tracking of the Maternal Voice in the Infant Brain
www.jneurosci.org/content/earl...
Used TRFs to look at how 7-month-old human infants track maternal vs unfamiliar speech & whether this affects simultaneous face processing - maternal speech enhances neural tracking & alters how faces are processed
Neural Tracking of the Maternal Voice in the Infant Brain
Infants preferentially process familiar social signals, but the neural mechanisms underlying continuous processing of maternal speech remain unclear. Using EEG-based neural encoding models based on te...
www.jneurosci.org
November 11, 2025 at 8:59 PM
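For the curious: a TRF is just a regularized linear mapping from lagged stimulus features to the neural signal. A minimal NumPy sketch (toy data; the lag count and ridge value are illustrative assumptions, not the paper's pipeline):

```python
# Minimal temporal response function (TRF) sketch: ridge regression mapping
# a lagged stimulus feature (e.g., the speech envelope) onto one EEG channel.
# Toy data & parameter values are illustrative, not the paper's pipeline.
import numpy as np

def lagged_design(stimulus, n_lags):
    """Build a [time x lags] design matrix of past stimulus samples."""
    n = len(stimulus)
    X = np.zeros((n, n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = stimulus[:n - lag]
    return X

def fit_trf(stimulus, eeg, n_lags=32, ridge=1.0):
    """Estimate TRF weights w minimizing ||Xw - eeg||^2 + ridge * ||w||^2."""
    X = lagged_design(stimulus, n_lags)
    XtX = X.T @ X + ridge * np.eye(n_lags)
    return np.linalg.solve(XtX, X.T @ eeg)

rng = np.random.default_rng(0)
envelope = rng.standard_normal(5000)           # stand-in speech envelope
true_trf = np.exp(-np.arange(32) / 8.0)        # toy neural impulse response
eeg = np.convolve(envelope, true_trf)[:5000] + 0.5 * rng.standard_normal(5000)

w = fit_trf(envelope, eeg)
pred = lagged_design(envelope, 32) @ w
print("prediction r =", round(np.corrcoef(pred, eeg)[0, 1], 3))  # tracking strength
```

"Neural tracking" in this framework is just the correlation between the TRF's predicted response and the recorded one.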
Visual Speech Reduces Cognitive Effort as Measured by EEG Theta Power and Pupil Dilation
www.eneuro.org/content/12/1...
Combined pupillometry & EEG to investigate how visual speech cues modulate cognitive effort during speech recognition ... code/software & data github.com/brianman515/... - nice!
Visual Speech Reduces Cognitive Effort as Measured by EEG Theta Power and Pupil Dilation
Listening effort reflects the cognitive and motivational resources allocated to speech comprehension, particularly under challenging conditions. Visual cues are known to enhance speech perception, pot...
www.eneuro.org
November 11, 2025 at 8:52 PM
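Theta power here is a band-limited EEG measure; a minimal sketch of one common way to estimate it (Welch PSD integrated over ~4-7 Hz; the sampling rate, band edges & toy signal are assumptions, not the authors' settings):

```python
# Minimal sketch of EEG theta-band (~4-7 Hz) power via a Welch PSD estimate.
# Band edges, sampling rate & the toy signal are assumptions for illustration.
import numpy as np
from scipy.signal import welch

fs = 250.0                                    # assumed sampling rate (Hz)
rng = np.random.default_rng(1)
t = np.arange(0, 10, 1 / fs)
eeg = np.sin(2 * np.pi * 6 * t) + rng.standard_normal(t.size)  # 6 Hz "theta" + noise

freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))
band = (freqs >= 4) & (freqs <= 7)
theta_power = np.trapz(psd[band], freqs[band])  # integrate PSD over the band
print(f"theta-band power: {theta_power:.3f} (toy units)")
```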
Reposted by Auditory-Visual Speech Association (AVISA)
happy to share our new paper, out now in Neuron! led by the incredible Yizhen Zhang, we explore how the brain segments continuous speech into word-forms and uses adaptive dynamics to code for relative time - www.sciencedirect.com/science/arti...
Human cortical dynamics of auditory word form encoding
We perceive continuous speech as a series of discrete words, despite the lack of clear acoustic boundaries. The superior temporal gyrus (STG) encodes …
www.sciencedirect.com
November 7, 2025 at 6:16 PM
Correlation detection as a stimulus-computable account for AV perception, causal inference & saliency maps in mammals elifesciences.org/articles/106... Image- & sound-computable population model of AV perception -> used simulations to model psychophysical, eye-tracking & pharmacological experiments
Correlation detection as a stimulus computable account for audiovisual perception, causal inference, and saliency maps in mammals
Optimal cue integration, Bayesian Causal Inference, spatial orienting, speech illusions and other key phenomena in audiovisual perception naturally emerge from the collective behavior of a population ...
elifesciences.org
November 7, 2025 at 12:22 PM
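The core computational idea - a correlation detector - is simple enough to sketch: low-pass each unimodal signal, multiply, and average; synchronous AV streams yield a larger product than lagged ones. A toy sketch loosely in the spirit of that model family (time constants & stimuli are made-up assumptions, not the paper's values):

```python
# Toy audiovisual correlation-detector unit: low-pass each unimodal signal,
# multiply, average. Synchronous streams give a larger output than lagged
# ones. Time constants & stimuli are made-up, not the paper's model values.
import numpy as np

def lowpass(x, tau, fs):
    """First-order low-pass (exponential smoothing) with time constant tau (s)."""
    alpha = 1.0 / (1.0 + tau * fs)
    y = np.zeros_like(x)
    for i in range(1, len(x)):
        y[i] = y[i - 1] + alpha * (x[i] - y[i - 1])
    return y

fs = 100.0
t = np.arange(0, 2, 1 / fs)
audio = (np.sin(2 * np.pi * 2 * t) > 0).astype(float)  # toy 2 Hz pulse train
video_sync = audio.copy()                               # synchronous visual
video_lag = np.roll(audio, int(0.25 * fs))              # 250 ms audio-visual lag

for label, video in [("synchronous", video_sync), ("250 ms lag", video_lag)]:
    detector_out = np.mean(lowpass(audio, 0.1, fs) * lowpass(video, 0.1, fs))
    print(f"{label}: {detector_out:.4f}")  # correlation stage: multiply & average
```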
A richly annotated dataset of co-speech hand gestures across diverse speaker contexts www.nature.com/articles/s41... Dataset comprising 2373 annotated gestures from 9 speakers across 3 distinct categories (university lecturers, politicians, and psychotherapists); can be accessed at doi.org/10.17605/OSF...
A richly annotated dataset of co-speech hand gestures across diverse speaker contexts - Scientific Data
www.nature.com
November 6, 2025 at 12:38 PM
Reposted by Auditory-Visual Speech Association (AVISA)
Applications are now open for the MARCS International Visiting Scholar Program 2026! 🌏

We are pleased to offer scholarships for PhD students and postdocs for visits of 1–3 months before the end of 2026.

📅 Applications close 4 December 2025.

If you are interested, email marcs@westernsydney.edu.au
November 5, 2025 at 10:38 PM
Distinct Portions of Superior Temporal Sulcus Combine Auditory Representations with Different Visual Streams www.jneurosci.org/content/45/4... Used ANNs to analyse open-source auditory-cortex fMRI data from people watching a movie, to investigate how the STS combines auditory information with the 2 visual streams
Distinct Portions of Superior Temporal Sulcus Combine Auditory Representations with Different Visual Streams
In humans, the superior temporal sulcus (STS) combines auditory and visual information. However, the extent to which it relies on visual information from the ventral or dorsal stream remains uncertain...
www.jneurosci.org
November 5, 2025 at 9:34 PM
When the brain talks back to the eye "The state of our brain shapes what we see, but how early in the visual system does this start? A new study in PLOS Biology shows that brain state-dependent release of histamine modulates the very first stage of vision in the retina" journals.plos.org/plosbiology/...
When the brain talks back to the eye
The state of our brain shapes what we see, but how early in the visual system does this start? This Primer explores a new PLOS Biology study which shows that brain state-dependent release of histamine...
journals.plos.org
November 5, 2025 at 9:30 PM
Reposted by Auditory-Visual Speech Association (AVISA)
Here's an interesting new study exploring whether LMMs are able to understand the narrative sequencing of comics and... even the best AI models are *terrible* at it for pretty much all tasks that were analyzed aclanthology.org/2025.finding...
Beyond Single Frames: Can LMMs Comprehend Implicit Narratives in Comic Strip?
Xiaochen Wang, Heming Xia, Jialin Song, Longyu Guan, Qingxiu Dong, Rui Li, Yixin Yang, Yifan Pu, Weiyao Luo, Yiru Wang, Xiangdi Meng, Wenjie Li, Zhifang Sui. Findings of the Association for Computatio...
aclanthology.org
November 4, 2025 at 8:05 PM
Reposted by Auditory-Visual Speech Association (AVISA)

Can you feel what I am saying? Speech-based vibrotactile stimulation enhances the cortical tracking of attended speech in a multi-talker background

https://www.biorxiv.org/content/10.1101/2025.10.31.685484v1
November 2, 2025 at 3:28 AM
Audiovisual Synchrony in Left-hemisphere Brain-lesioned Individuals with Aphasia direct.mit.edu/jocn/article... Found a statistically significant effect of aphasia type on measures of AV synchrony; an effect not explained by lesion volume... damage to left posterior temporal cortex is bad for AV processing.
Audiovisual Synchrony in Left-hemisphere Brain-lesioned Individuals with Aphasia
Abstract. We investigated the ability of 40 left-hemisphere brain-lesioned individuals with various diagnoses of aphasia to temporally synchronize the audio of a spoken word to its congruent video usi...
direct.mit.edu
November 1, 2025 at 5:17 AM
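For readers wondering how AV lag is quantified at all: a standard trick is cross-correlating the acoustic envelope with a lip-aperture time series & reading off the peak lag. A toy sketch (the signals & the 120 ms offset are illustrative assumptions, not the study's stimuli or analysis):

```python
# Toy sketch: estimate audio-video lag by cross-correlating the acoustic
# envelope with a lip-aperture time series & reading off the peak lag.
# Signals, names & the 120 ms offset are illustrative assumptions.
import numpy as np

fs = 100.0
t = np.arange(0, 3, 1 / fs)
lip_aperture = np.clip(np.sin(2 * np.pi * 1.5 * t), 0, None)  # toy mouth opening
true_lag = int(0.12 * fs)                    # audio trails video by 120 ms
audio_env = np.roll(lip_aperture, true_lag)  # delayed acoustic envelope

xc = np.correlate(audio_env - audio_env.mean(),
                  lip_aperture - lip_aperture.mean(), mode="full")
lags = np.arange(-len(t) + 1, len(t))        # sample lags for "full" mode
best = lags[np.argmax(xc)]
print(f"estimated AV lag: {best / fs * 1000:.0f} ms")  # expect ~ +120 ms
```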
Expectation-driven shifts in perception and production pubs.aip.org/asa/jasa/art... Failed to find evidence that individuals' expectation-driven shifts in perception correlate with those in production ...
Expectation-driven shifts in perception and production
While phonetic convergence has been taken as evidence for tight perception–production links, attempts to correlate perceptual adjustments with production shifts
pubs.aip.org
October 29, 2025 at 12:49 AM
Yes, the McGurk effect, that's right -> "The influence of age, listener sex, and speaker sex on the McGurk effect" journals.sagepub.com/doi/10.1177/... Are reports of higher sensitivity to the McGurk effect in females than males influenced by the match of Listener-Speaker sex?
journals.sagepub.com
October 29, 2025 at 12:46 AM
Audiovisual speech perception in Mandarin cochlear implant users across age and listening conditions www.sciencedirect.com/science/arti... "AV cues play a critical role in speech perception for Mandarin-speaking CI users, especially under acoustically challenging conditions"
Audiovisual speech perception in Mandarin cochlear implant users across age and listening conditions
To investigate how visual cues influence speech recognition in Mandarin-speaking cochlear implant (CI) users and examine age-related differences in au…
www.sciencedirect.com
October 28, 2025 at 12:17 AM
Visual induction of spatial release from masking during speech perception in noise pubs.aip.org/asa/jel/arti... "There was no enhancement of auditory SRM through visual spatial separation" (shown before) - it did have a negative effect though, i.e., it could "disrupt existing auditory SRM" ...
Visual induction of spatial release from masking during speech perception in noise
Spatially separating target and masker talkers improves speech perception in noise, an effect known as spatial release from masking (SRM). Independently, the pe
pubs.aip.org
October 27, 2025 at 9:50 PM
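Quick refresher on the measure: SRM is typically the improvement in speech reception threshold (SRT) when target & masker are spatially separated. Worked toy numbers (made up for illustration, not the paper's data):

```python
# Spatial release from masking (SRM) is typically the improvement in speech
# reception threshold (SRT) when target & masker are spatially separated.
# The SRT values below are made-up numbers purely for illustration.
srt_colocated = -2.0   # dB SNR needed with target & masker at the same spot
srt_separated = -8.5   # dB SNR needed with the masker moved to the side
srm = srt_colocated - srt_separated
print(f"SRM = {srm:.1f} dB")  # positive SRM: spatial separation helped
```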
Reposted by Auditory-Visual Speech Association (AVISA)
Attentional engagement with target and distractor streams predicts speech comprehension in multitalker environments https://pubmed.ncbi.nlm.nih.gov/41136338/
October 25, 2025 at 2:06 PM
Visible Pre-acoustic Lip Motion Aids Listeners’ Judgments of Speech Onset Times www.researchgate.net/profile/Pete... Visible lip motion facilitates detection of acoustic speech onset ...
www.researchgate.net
October 25, 2025 at 8:07 AM
Ok, yes, sometimes I post just because of the title: Oh, multimodality where art thou? Raised eyebrows in the constructional network www.jbe-platform.com/content/jour... Argues that knowledge about Tell Me About It & raised eyebrows is structured as a nested network of uni- & multimodal constructions
Oh, multimodality where art thou? | John Benjamins
Abstract The present paper explores the network of language-related knowledge about multimodal, stance-related uses of Tell me about it (TMAI) with a particular focus on the co-verbal use of raised ey...
www.jbe-platform.com
October 25, 2025 at 8:02 AM
The Influence of Facial Speech Phenomenon on the Learning Process of Children With Dyslexia: Aspects of Susceptibility & Dependency on Visual & Phonological Stimuli pubs.asha.org/doi/10.1044/... Investigated impact of different training programs (phonetic & visual) on learning (ps < .01 & .05, hmm)
The Influence of Facial Speech Phenomenon on the Learning Process of Children With Dyslexia: Aspects of Susceptibility and Dependency on Visual and Phonological Stimuli
Purpose: This study aimed to analyze audiovisual speech perception strategies in children with dyslexia, specifically addressing difficulties in ...
pubs.asha.org
October 23, 2025 at 8:39 PM
Reposted by Auditory-Visual Speech Association (AVISA)

Attentional disengagement during external and internal distractions reduces neural speech tracking in background noise

https://www.biorxiv.org/content/10.1101/2025.10.17.683146v1
October 20, 2025 at 2:16 PM
Exploring the McGurk Effect in Cochlear-Implant Users: A Systematic Review brill.com/view/journal... Systematic review of studies using the McGurk paradigm to understand speech perception mechanisms in CI-fitted individuals
brill.com
October 18, 2025 at 5:23 AM
sEEG Reveals Neural Signatures of Multisensory Integration in the Human Superior Temporal Sulcus during AV Speech Perception www.jneurosci.org/content/45/4... Used sEEG to directly record from the STG & STS in 42 epilepsy patients -> STS takes the lead (i.e., responds faster than STG) in AV speech perception
Stereoelectroencephalography Reveals Neural Signatures of Multisensory Integration in the Human Superior Temporal Sulcus during Audiovisual Speech Perception
Human speech perception is multisensory, integrating auditory information from the talker's voice with visual information from the talker's face. BOLD fMRI studies have implicated the superior tempora...
www.jneurosci.org
October 15, 2025 at 8:59 PM
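On "takes the lead": latency comparisons like this usually come down to estimating, per electrode, when the response envelope first exceeds a baseline-derived threshold. A toy sketch of that logic (envelopes, timings & the threshold rule are assumptions, not the paper's analysis):

```python
# Toy latency comparison across recording sites: per "electrode", find the
# first post-stimulus sample where the response envelope exceeds a
# baseline-derived threshold. Envelopes, timings & the threshold rule are
# assumptions for illustration, not the paper's analysis.
import numpy as np

def onset_latency(env, fs, baseline_end, n_sd=5.0):
    """First post-baseline time (s) where env > baseline mean + n_sd * SD."""
    base = env[:baseline_end]
    above = np.flatnonzero(env[baseline_end:] > base.mean() + n_sd * base.std())
    return above[0] / fs if above.size else None

fs = 1000.0
t = np.arange(0, 1, 1 / fs)
rng = np.random.default_rng(2)
baseline_end = int(0.2 * fs)  # pretend the stimulus comes on at 200 ms
# Toy envelopes: the "STS" site responds 100 ms post-stimulus, "STG" at 150 ms.
sts = 0.05 * rng.standard_normal(t.size) + (t > 0.30).astype(float)
stg = 0.05 * rng.standard_normal(t.size) + (t > 0.35).astype(float)

for name, env in [("STS", sts), ("STG", stg)]:
    lat = onset_latency(env, fs, baseline_end)
    print(f"{name}: {lat * 1000:.0f} ms post-stimulus")
```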
Reposted by Auditory-Visual Speech Association (AVISA)
Evaluating the temporal order of motor and auditory systems in speech production using intracranial EEG https://pubmed.ncbi.nlm.nih.gov/41062786/
October 9, 2025 at 2:31 PM
"I would be so entertained if I found out an AI lab had wasted their time cheating on my dumb benchmark!" 😆@simonwillison.net
October 2, 2025 at 10:00 PM
Reposted by Auditory-Visual Speech Association (AVISA)
Mapping the task-general and task-specific neural correlates of speech production: Meta-analysis and fMRI direct comparisons of category fluency and picture naming https://pubmed.ncbi.nlm.nih.gov/41030322/
October 2, 2025 at 3:06 AM