Andrew Chang
@candrew123.bsky.social
Postdoctoral researcher at NYU, working on computational cognitive neuroscience, audition (music and speech), and real-world communication. 🇹🇼🇨🇦🇺🇸
Huge thanks to co-authors @yikeli.bsky.social, Iran R. Roman, @davidpoeppel.bsky.social, and to the Interspeech reviewers for the perfect 4/4 score! 🙌

Can’t wait to present and discuss how this bridges machine and human perception! See you in Rotterdam!
June 2, 2025 at 7:00 PM
💥 Key Impact 3:
This paves the way for advances in #CognitiveComputing and audio-related brain–computer interface (#BCI) applications (e.g., sound/speech reconstruction).
June 2, 2025 at 7:00 PM
💥 Key Impact 2:
STM features link directly to brain processing, offering a more interpretable, biologically grounded representation.
June 2, 2025 at 7:00 PM
💥 Key Impact 1:
Without any pretraining, our STM-based DNN matches popular spectrogram-based models on speech, music, and environmental sound classification.
June 2, 2025 at 7:00 PM
While spectrogram-based audio DNNs excel, they’re often bulky, compute-heavy, hard to interpret, and data-hungry.
We explored an alternative: training a DNN on spectrotemporal modulation (#STM) features—an approach inspired by how the human auditory cortex processes sound.
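One common way to approximate STM features is to take a 2D Fourier transform of a (log-)spectrogram, yielding a joint spectral/temporal modulation spectrum. A minimal sketch of that idea is below; the function name and parameters are illustrative, and the paper's actual feature pipeline may differ.

```python
# Sketch: spectrotemporal modulation (STM) features as the 2D modulation
# spectrum of a spectrogram. Hypothetical minimal version, not the
# authors' exact pipeline.
import numpy as np

def stm_features(spectrogram: np.ndarray) -> np.ndarray:
    """2D modulation spectrum of a [freq_bins x time_frames] spectrogram.

    One axis of the result indexes spectral modulation, the other
    temporal modulation.
    """
    # Subtract the mean so overall energy (DC) does not dominate.
    centered = spectrogram - spectrogram.mean()
    # Magnitude of the 2D FFT gives spectrotemporal modulation energy;
    # fftshift centers the zero-modulation point.
    return np.abs(np.fft.fftshift(np.fft.fft2(centered)))

# Toy usage: a spectrogram with a slow temporal ripple (2 cycles over
# 200 frames) concentrates energy at low temporal-modulation bins.
t = np.arange(200)
f = np.arange(64)
spec = np.outer(np.ones_like(f), 1 + np.cos(2 * np.pi * 2 * t / 200.0))
feats = stm_features(spec.astype(float))
```

In this toy case the ripple shows up as two symmetric peaks just off the center of the temporal-modulation axis, which is the kind of compact, interpretable structure the post contrasts with raw spectrogram inputs.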
June 2, 2025 at 7:00 PM
Reposted by Andrew Chang
why DO babies dance? when do they start dancing? what counts as dancing, anyway (and how can we measure it)? out online today in CDPS, @lkcirelli.bsky.social and i attempt to integrate what is known about the development of dance
journals.sagepub.com/doi/epub/10.... (2/4)
March 14, 2025 at 4:39 PM
I have emailed @interspeech.bsky.social, but it would be great if you could also reach out to them at pco@interspeech2025.org if this concerns you as well, so they understand that this will affect many people. I’m sure none of us want to be stuck writing a rebuttal in a hotel at #ICASSP!
March 12, 2025 at 5:26 PM
What's next? We are currently working on (1) refining our ML model by combining active learning and semi-supervised learning approaches and (2) experimenting with new human-computer interaction designs to mitigate negative experiences during videoconferencing. 7/end
March 10, 2025 at 7:24 PM
Beyond improving technical aspects like signal quality and latency of a videoconferencing system, social dynamics can deeply affect user experience. Our research paves the way for future enhancements by predicting and preventing conversational derailments in real time.
6/n
March 10, 2025 at 7:24 PM
One surprising insight: awkward silences—those long gaps in turn-taking—were more detrimental to conversational fluidity and enjoyment than chaotic overlaps or interruptions.
5/n
March 10, 2025 at 7:24 PM
We used multimodal ML on 100+ person-hours of videoconferences, modeling voice, facial expressions, and body movements. Key result: an ROC-AUC of 0.87 in predicting unfluid and unenjoyable moments and in classifying various disruptive events, such as gaps and interruptions.
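For readers unfamiliar with the metric: ROC-AUC is the probability that the model scores a randomly chosen positive example (here, a disrupted moment) above a randomly chosen negative one. A minimal sketch with made-up scores (not the study's data):

```python
# ROC-AUC via the Mann-Whitney U rank statistic: the fraction of
# (positive, negative) pairs where the positive example outranks the
# negative one. Toy scores only, for illustration.
import numpy as np

def roc_auc(scores: np.ndarray, labels: np.ndarray) -> float:
    """Compute ROC-AUC for binary labels (1 = positive, 0 = negative)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # Pairs where the positive outranks the negative; ties count as 0.5.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

scores = np.array([0.9, 0.8, 0.35, 0.6, 0.2, 0.1])
labels = np.array([1, 1, 1, 0, 0, 0])
auc = roc_auc(scores, labels)  # 8 of 9 pairs correctly ranked
```

An AUC of 0.5 is chance-level ranking; 1.0 is a perfect separation of disrupted from fluid moments.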
4/n
March 10, 2025 at 7:24 PM
Videoconferencing has become essential in our professional and personal lives, especially post-pandemic. Yet we've all experienced "derailed" moments, such as awkward pauses and uncoordinated turn-taking, that can make virtual meetings less effective and less enjoyable.
3/n
March 10, 2025 at 7:24 PM
See my thread below, and also this press release: www.nyu.edu/about/news-p...
2/n
Can AI Tell Us if Those Zoom Calls Are Flowing Smoothly? New Study Gives a Thumbs Up
Researchers find machine learning can predict how we rate social interactions in videoconference conversations
March 10, 2025 at 7:24 PM
There is an excellent cross-cultural study on this topic by @norijacoby.bsky.social. A lay summary of the paper can be found here: www.aesthetics.mpg.de/en/research/...
Perception of pitch is culturally influenced
Study on cross-cultural music perception published in Current Biology
February 21, 2025 at 8:05 PM
Thanks for your comment. Yes, there are several recent studies suggesting that chroma is not really an innate or universal property of pitch perception. Our study cannot answer this question, but we did find that the effect of chroma is much weaker than that of height.
February 21, 2025 at 7:56 PM
In short: By combining machine learning and MEG, we show how the brain’s dynamic pitch representation echoes ideas proposed over 100 years ago. Feels like completing a full circle in music cognitive neuroscience! Huge thanks to my collaborators! End/n
February 19, 2025 at 8:19 PM
The helix model reflects the idea that pitches separated by an octave (e.g., the repeating piano keys) are perceived as inherently similar. This concept was first explored in the early 1900s by Géza Révész, laying the groundwork for modern music cognition! 🧠🎹 6/n
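The helix geometry can be written down directly: chroma is an angle (one turn per octave) and height rises linearly with pitch, so octave-related pitches sit on the same vertical line. A minimal sketch, with hypothetical coordinates (the MEG-derived representation in the paper is of course richer than this):

```python
# Révész-style pitch helix: chroma as angle, height as elevation.
# Illustrative coordinates only.
import numpy as np

def helix_coords(midi_pitch: float, radius: float = 1.0) -> np.ndarray:
    """Map a MIDI pitch number onto the pitch helix.

    One full turn around the axis corresponds to one octave (12
    semitones), so octave-related pitches share the same (x, y).
    """
    angle = 2 * np.pi * (midi_pitch % 12) / 12.0
    x = radius * np.cos(angle)
    y = radius * np.sin(angle)
    z = midi_pitch / 12.0  # height in octaves
    return np.array([x, y, z])

# Octave equivalence: C4 (MIDI 60) and C5 (MIDI 72) differ only in
# height, capturing why the repeating piano keys sound "the same".
c4, c5 = helix_coords(60), helix_coords(72)
```

Points close in this 3D space are both near in log-frequency (height) and near in chroma (angle), which is the similarity structure the post describes.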
February 19, 2025 at 8:19 PM