Lightnews — Scholar-powered news

Reposted by Miao Zhang

@mzloteanu.bsky.social

#statstab #465 How to embrace variation and accept uncertainty in linguistic and psycholinguistic data analysis

Thoughts: An accessible paper on communicating your results with nuance.

#bayes #bayesian #uncertainty #error #bias #guide #tutorial

sites.stat.columbia.edu/gelman/resea...

sites.stat.columbia.edu

November 21, 2025 at 5:19 PM

Reposted by Miao Zhang

Stefano Coretta

@scoretta.bsky.social

🎉 Finally out in Journal of Phonetics, tutorial with @paulbuerkner.com

📖 "Bayesian beta regressions with brms in R: A tutorial for phoneticians"

Accepted manuscript here: doi.org/10.31219/osf...

Repo: github.com/stefanocoret...

Publisher link: www.sciencedirect.com/science/arti...

ALT: a close up of a rat looking at the camera with the word drunken written in the corner

media.tenor.com

November 15, 2025 at 3:21 PM

Reposted by Miao Zhang

Eleanor Chodroff

@echodroff.bsky.social

Excited to share our new preprint with @mzhang89.bsky.social : “A crosslinguistic corpus phonetic analysis of intrinsic vowel duration” 🎉

🔗 osf.io/preprints/ps...

OSF

osf.io

October 2, 2025 at 4:46 PM

Reposted by Miao Zhang

Adam L

@adam-lg.bsky.social

Bring back the iPod classic and the 3.5mm headphone jack

erika @yeeeerika.bsky.social · Jan 13

i don't want to hear your most boomer complaint. what's your most millennial complaint?

September 20, 2025 at 10:51 AM

Reposted by Miao Zhang

Adam L

@adam-lg.bsky.social

Simon Wood, the GOAT of generalized additive models & creator of the mgcv #rstats package, has an Annual Review of Statistics essay on GAMs, available open access #statssky #mlsky

www.annualreviews.org/content/jour...

September 10, 2025 at 2:15 AM

Miao Zhang

@mzhang89.bsky.social

I feel that text corrected by Grammarly reads as less AI-generated than text corrected by general AI tools (ChatGPT-like). Is it just my imagination?

September 5, 2025 at 1:58 PM

Reposted by Miao Zhang

Eleanor Chodroff

@echodroff.bsky.social

🗣️Mozilla Common Voice users!🗣️

Important notice: the client ID does not always correspond to a single speaker ID! Every so often, a single client ID contains more than one speaker’s voice. Our #Interspeech2025 paper examines the extent of this problem and proposes a solution

Interspeech 2025 poster on Quantifying and reducing speaker heterogeneity within the Common Voice Corpus

August 29, 2025 at 10:25 AM

Reposted by Miao Zhang

Eleanor Chodroff

@echodroff.bsky.social

✅Similarity scores: huggingface.co/datasets/pac...

📄Paper: www.isca-archive.org/interspeech_...

💻Code: github.com/pacscilab/CV...

💫This was joint work with @mzhang89.bsky.social, Aref Farhadipour, Annie Baker, Jiachen Ma, and Bogdan Pricop

pacscilab/VoxCommunis at main

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

August 29, 2025 at 10:25 AM

Reposted by Miao Zhang

Darin Flynn

@phono-logical.bsky.social

LabPhon 20 will be held in Montréal June 25–28, 2026, on the theme “Looking Back and Looking Forward,” to reflect on the field’s foundational contributions while highlighting new directions in laboratory phonology. Abstract submission deadline: Dec 1, 2025 labphon.org/labphon20/home

Home | Labphon

labphon.org

August 26, 2025 at 1:32 PM

Miao Zhang

@mzhang89.bsky.social

The similarity score file can be found in our VoxCommunis huggingface repo: huggingface.co/datasets/pac.... You can also see the scripts we used to obtain the similarity scores here: github.com/areffarhadi/...

August 21, 2025 at 10:50 AM

Miao Zhang

@mzhang89.bsky.social

We presented our attempt to clean the Common Voice client ID for phonetic analysis at Interspeech 2025. Please check the poster here: www.researchgate.net/publication/.... The paper is also available at: www.isca-archive.org/interspeech_...

(PDF) Quantifying and Reducing Speaker Heterogeneity within the Common Voice Corpus for Phonetic Analysis

PDF | With its crosslinguistic and cross-speaker diversity, the Mozilla Common Voice Corpus (CV) has been a valuable resource for multilingual speech... | Find, read and cite all the research you need...

www.researchgate.net

August 21, 2025 at 10:33 AM

Reposted by Miao Zhang

Josef Fruehwald

@jofrhwld.bsky.social

Introducing the tidynorm package! It's got convenience functions for applying your favorite vowel normalization methods to point measures, formant tracks, and DCT coefficients in a tidyverse workflow, as well as a flexible framework for defining your own normalization methods!

Introducing tidynorm – Væl Space

Here’s a brief introduction to the new tidynorm package.

jofrhwld.github.io

June 16, 2025 at 3:35 PM

Reposted by Miao Zhang

Association for Laboratory Phonology

@labphon.bsky.social

New insights into German #prosody! How do speakers & listeners distinguish utterance-medial vs. utterance-final #intonation boundaries in #German? Subtle differences in intonation, particularly in the rhyme's f0, are key cues for listeners. #LabPhon #openaccess #kinematics doi.org/10.16995/lab...

How final is final: The production and perception of utterance-medial and utterance-final boundaries

We examine the production and perception of two types of phrase-final prosodic boundaries, specifically, utterance-medial and utterance-final intonation phrase (IP) boundaries in German. These two typ...

www.journal-labphon.org

June 14, 2025 at 3:17 AM

Miao Zhang

@mzhang89.bsky.social

When people talk about neutralization in phonology, it's very important to check some phonetic data. As non-native listeners, we quite probably either failed to perceive a contrast or overinterpreted some variance.

June 10, 2025 at 8:47 PM

Reposted by Miao Zhang

Posit

@posit.co

ggplot2 is turning 18! 🎂

For nearly two decades, it’s helped data scientists turn complex data into clear, beautiful insights.

We’re throwing a birthday party at Data+AI Summit, with treats and limited-edition swag. Come celebrate with us and @hadley.nz!

📍 Posit Lounge (402)
📅 June 10, 6–8pm

June 9, 2025 at 9:58 PM

Miao Zhang

@mzhang89.bsky.social

arxiv.org/abs/2506.00733 Our Interspeech 2025 preprint.

Quantifying and Reducing Speaker Heterogeneity within the Common Voice Corpus for Phonetic Analysis

With its crosslinguistic and cross-speaker diversity, the Mozilla Common Voice Corpus (CV) has been a valuable resource for multilingual speech technology and holds tremendous potential for research i...

arxiv.org

June 3, 2025 at 1:15 PM

Miao Zhang

@mzhang89.bsky.social

The debate doesn't exist in China, we just call it [ʈʂi˥ aɪ˥ ɛ˧˥fu]

nat alison @tesseralis.site · May 26

GIF is actually pronounced [xif], where [x] is the voiceless velar fricative

Jerry Chen @jcsalterego.bsky.social · May 26

don’t worry guys, bluesky can be perfect so long as we all get along. first things first, we just need to agree on how "GIF" is pronounced

May 27, 2025 at 12:10 PM

Miao Zhang

@mzhang89.bsky.social

In case you rarely use it or never knew it existed: glimpse() from dplyr is a much better function than head() or summary() when you want a quick, rough look at your dataset.

May 27, 2025 at 11:43 AM
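
A minimal sketch of the comparison in the post above, assuming the built-in mtcars data frame as a stand-in for "your dataset" (the choice of dataset and the variable name df are illustrative, not from the post). The glimpse() call is guarded so the script still runs when dplyr is not installed.

```r
# mtcars ships with base R; it stands in here for "your dataset".
df <- mtcars

# head(): the first six rows; wide tables can wrap across the console.
head(df)

# summary(): per-column summary statistics rather than the raw values.
summary(df)

# glimpse(): one line per column, showing its type and first few values,
# plus the row and column counts at the top.
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr::glimpse(df)
}
```

Because glimpse() prints columns down the page, every column stays visible even when the data frame is too wide for head() to show on one screen.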

Miao Zhang

@mzhang89.bsky.social

youtu.be/cE3bK5XXbDc?... I recently gave a brief MFA tutorial.

MFA Workshop on 28 April 2025 | 张淼 Miao ZHANG | PAPPS | ZA JASRA

YouTube video by ZA JASRA (Linguistics)

youtu.be

May 26, 2025 at 9:54 AM

Reposted by Miao Zhang

Rasmus Puggaard-Rode

@rpuggaardrode.bsky.social

💣 praatpicture version 1.4.0 on CRAN! 💣 (1/3)

January 14, 2025 at 10:46 AM

Reposted by Miao Zhang

Chris Offner

@chrisoffner3d.bsky.social

I also use Zotero for most of my reading. Other than that, the Google Scholar PDF Reader extension is the best thing w.r.t how citations are handled:
chromewebstore.google.com/detail/googl...

Google Scholar PDF Reader - Chrome Web Store

Supercharge your paper reading: follow references, skim outline, jump to figures, cite and save.

chromewebstore.google.com

May 21, 2025 at 8:43 AM

Reposted by Miao Zhang

Rasmus Puggaard-Rode

@rpuggaardrode.bsky.social

Very exciting news for PraatSauce users! We've just pushed a new version which fully rewrites the code base, making things faster and simpler to use.
Instead of setting parameters in a bunch of Praat windows, you now set them in a spreadsheet file that looks like this:
github.com/kirbyj/praat...

May 21, 2025 at 3:19 PM

Miao Zhang

@mzhang89.bsky.social

I’m thrilled to announce that our paper, Quantifying and Reducing Speaker Heterogeneity within the Common Voice Corpus for Phonetic Analysis, coauthored with @echodroff.bsky.social, Aref Farhadipour, Jiachen Ma, Annie Baker and Bogdan Pricop from @cl-uzh.bsky.social was accepted for INTERSPEECH2025.

May 21, 2025 at 9:36 AM

Reposted by Miao Zhang

Matthieu Boisgontier

@matthieuboisgontier.com

When running ANOVAs in #R, use car::Anova().
aov() and anova() use Type I sums of squares, meaning that order matters, which can distort results in unbalanced designs. car::Anova() is safer because it uses Type II sums of squares by default, so each effect is adjusted for all the other effects.

April 13, 2025 at 5:42 PM
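
A small sketch of the order dependence described in the post above, using a made-up unbalanced 2x2 dataset (the data frame d, its values, and the variable names are hypothetical, chosen only for illustration). The base R part shows the sum of squares for A changing with term order; the car::Anova() call is guarded in case the car package is not installed.

```r
# Hypothetical unbalanced design: cell counts are 3, 1, 1, 2,
# so factors A and B are not orthogonal.
d <- data.frame(
  A = factor(c("a1", "a1", "a1", "a1", "a2", "a2", "a2")),
  B = factor(c("b1", "b1", "b1", "b2", "b1", "b2", "b2")),
  y = c(3, 4, 5, 9, 6, 10, 12)
)

# anova() on an lm fit reports Type I (sequential) sums of squares:
# each term is adjusted only for the terms entered before it.
tab_a_first <- anova(lm(y ~ A + B, data = d))
tab_b_first <- anova(lm(y ~ B + A, data = d))

ss_A_entered_first <- tab_a_first["A", "Sum Sq"]  # A ignoring B
ss_A_entered_last  <- tab_b_first["A", "Sum Sq"]  # A adjusted for B

# Same data, same terms, different sum of squares for A.
print(c(A_first = ss_A_entered_first, A_after_B = ss_A_entered_last))

# Type II sums of squares do not depend on term order. In this
# main-effects model, the Type II value for A equals the "A after B" value.
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(lm(y ~ A + B, data = d)))  # type = 2 is the default
}
```

In a balanced design the two sequential tables would agree, which is why the issue only shows up with unequal cell sizes.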

Miao Zhang

@mzhang89.bsky.social

When you train an acoustic model for a tone language with tone labels in the phone set, do you keep the underlying tone label or the surface tone label?

April 13, 2025 at 6:56 PM
