Amy D Willis
amydwillis.bsky.social
Biodiversity-loving, error bar-needing statistics nerd; Associate Professor @UWBiostat. Methods & software for #microbiome & #biodiversity data. She/her.
dude you can just text me with this stuff ❤️

% of microbiome (& changes) aren't identifiable from HTS, but stacked barplots are helpful for generating clues esp if big shifts

data snooping has implications for error rate control. there isn't a hypothesis you're interested in?

good luck & enjoy xx
November 3, 2025 at 6:17 PM
Thanks for your interest, @genomesevolve.bsky.social ! As a field, *on average*, I feel we have moved...

1. away from obsessing over alpha and beta diversity comparisons
2. towards comparisons that are less sensitive to rare/undetected species (than diversity)

So 📈

🥳
October 18, 2025 at 9:11 AM
The question was "...*on the same sequencing run*?"
July 22, 2025 at 1:33 PM
Given that a Poisson regression with log link targets the same parameter as your NB regression, I'd be curious to see the coverage of robust Wald CIs. `rigr` wraps this, so does `raoBust`, so should be easy to add.

(Sorry -- I'm at a workshop today or I'd do it myself)
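
For anyone who wants to see what a robust Wald CI looks like here, below is a minimal sketch for the simplest case only: one binary covariate (two groups), Poisson working model, log link. In that special case the MLE and the HC0 sandwich variance have closed forms, so no model-fitting library is needed. This is not the `rigr` or `raoBust` implementation; the function name `poisson_two_group_wald` and its arguments are hypothetical, chosen for illustration.

```python
import math
from statistics import NormalDist


def poisson_two_group_wald(y0, y1, level=0.95):
    """Wald CIs for the log fold-change in means between two groups.

    Working model: Poisson with log link and a single binary covariate.
    Assumes both group means are strictly positive.
    Hypothetical helper for illustration, not a library API.
    """
    n0, n1 = len(y0), len(y1)
    m0, m1 = sum(y0) / n0, sum(y1) / n1

    # With a binary covariate the Poisson MLE of the slope is the log ratio of means.
    est = math.log(m1 / m0)

    # Model-based variance: trusts the Poisson assumption Var(Y) = E(Y).
    model_var = 1 / (n0 * m0) + 1 / (n1 * m1)

    # Robust (HC0 sandwich) variance: replaces the assumed variance with
    # squared residuals, so it remains valid under overdispersion.
    s0 = sum((y - m0) ** 2 for y in y0)
    s1 = sum((y - m1) ** 2 for y in y1)
    robust_var = s0 / (n0 * m0) ** 2 + s1 / (n1 * m1) ** 2

    z = NormalDist().inv_cdf(0.5 + level / 2)
    model_se, robust_se = math.sqrt(model_var), math.sqrt(robust_var)
    return {
        "estimate": est,
        "model_se": model_se,
        "robust_se": robust_se,
        "model_ci": (est - z * model_se, est + z * model_se),
        "robust_ci": (est - z * robust_se, est + z * robust_se),
    }


result = poisson_two_group_wald([1, 2, 3], [2, 4, 6])
print(result["estimate"], result["model_se"], result["robust_se"])
```

The point estimate is the same under either variance; only the standard error changes. Coverage of the robust interval could then be checked by simulating counts from an NB model and counting how often `robust_ci` contains the true log fold-change.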
July 19, 2025 at 7:55 PM
raoBust doesn't invert score tests @nlaroy.bsky.social, but it does implement (model-misspecification) robust score tests, which are amazing for inference.

Feel free to open a feature request. We'll see what we can do.

github.com/statdivlab/r...
GitHub - statdivlab/raoBust: Generalized Linear Models with robust and non-robust Wald and Rao (score) tests
July 19, 2025 at 7:48 PM
Agreed that (e.g.) MAG assembly vs taxonomic estimation makes a huge difference to your answer. Also, please definitely read the Conflict of Interest statement for the shallow shotgun paper and note how much of its claims rely on bioinformatic subsampling *and not actual shallow sequencing data*.
July 1, 2025 at 10:20 AM
That's wonderful to hear, Isabelle!! Thanks for sharing, too. Hope to see you there in 2026.
June 30, 2025 at 3:57 PM
How do I describe data with a lot of zeroes? Sparse.

How do I describe data with a lot of variance? High-variance.

How do I describe data where the totals convey complex information about an unknown quantity I care about? (abundance)

I don't. I just state my assumptions.
GitHub - statdivlab/radEmu
May 6, 2025 at 7:25 PM
I hope you're all having a better week than me.

😽😽 7/6
May 6, 2025 at 7:15 PM
I leave you with the StatDivLab mantra:

1. choose something meaningful to estimate
2. choose a sensible way to estimate it
3. choose tests that control Type 1 error

That's what we will keep doing, even if anonymous reviewers insist on buzzwords.

6/6
May 6, 2025 at 7:15 PM
Saying "microbiome data is zero-inflated" leads people to seek out "zero-inflated models." Usually, these are bad methods with bad properties. Stay away. 5/6
May 6, 2025 at 7:15 PM
Why does this matter? This sort of thinking leads biologists to trust estimators based on highly-parametrised parametric models that are (1) surely misspecified and (2) have terrible properties under misspecification. 4/6
May 6, 2025 at 7:15 PM
Don't get me started on overdispersed, let alone compositional. Microbiome data is none of these, and I'm not new to this field. 3/6
May 6, 2025 at 7:15 PM
In fact, if you look at blanks and other control data, you see a lot of incorrect detections. There's better evidence that microbiome data is NON-ZERO inflated than zero-inflated.

2/6
May 6, 2025 at 7:15 PM