Levi Waldron
@leviwaldron1.bsky.social
Professor of Biostatistics at CUNY SPH, rstat / Bioconductor enthusiast, cancer genomics/metagenomics, proud HPV OSCC cancer survivor.
also has an associated Bioconductor package `benchdamic` to facilitate benchmarking against commonly used methods (Calgaro et al, Bioinformatics 2023), which we profited from in this analysis academic.oup.com/bioinformati...
benchdamic: benchmarking of differential abundance methods for microbiome data
Abstract. Summary: Recently, an increasing number of methodological approaches have been proposed to tackle the complexity of metagenomics and microbiome dat
academic.oup.com
February 19, 2025 at 3:14 PM
I like your idea, and it seems promising! No snub intended, just not yet in our scope of 'commonly used' or of reviewing broad benchmarking-focused studies. FYI our benchmarking is reproducible and hopefully easily extensible w/ code from github.com/waldronlab/M...
GitHub - waldronlab/MicrobiomeBenchmarkDataAnalyses: Analyses using the datasets provided by the MicrobiomeBenchmarkData package.
github.com
February 19, 2025 at 3:14 PM
I guess I can thank you for breaking my long sojourn from social media :D I made a thread about our preprint and referenced your post - we agree on compositional normalization, but "normalization based methods don't work well" (including for RNAseq) is a big claim! bsky.app/profile/levi...
February 19, 2025 at 12:02 PM
So we agree that compositional normalization is problematic, but disagree about the simple, widely-used methods. Prove your method outperforms 17 methods in the 3 datasets we've provided, and I'll eat humble pie 🙂
February 19, 2025 at 11:53 AM
Discussion! @inschool4life.bsky.social says all these methods are bad and that his new alternative to normalization improves rigor in microbiome *and* RNAseq analysis, as demonstrated by a simulation and a real-data study of each. I hope you're right! bsky.app/profile/insc...
February 19, 2025 at 11:53 AM
My history on simple methods: In 2014, I benchmarked a simple prediction method (can be trained in a spreadsheet!) against penalized regression. In 27 independent microarray studies, it performed comparably to or better than theoretically superior methods (incl. Lasso) academic.oup.com/bioinformati...
Más-o-menos: a simple sign averaging method for discrimination in genomic data analysis
Abstract. Motivation: The successful translation of genomic signatures into clinical settings relies on good discrimination between patient subgroups. Man
academic.oup.com
February 19, 2025 at 11:53 AM
Disclaimer: I favor simple methods for high-dimensional data analysis. They perform well in diverse settings. I'm skeptical of hand-selected benchmarks by researchers with a "horse in the race." It's a good start but far short of showing broad utility.
February 19, 2025 at 11:53 AM
Implications: Our findings suggest researchers should use widely adopted non-parametric or RNA-seq DA methods. Further development of compositional methods should include benchmarking against datasets with known biological ground truth.
February 19, 2025 at 11:53 AM
Key findings: we benchmarked 17 DA approaches and found compositional methods often lack sensitivity and show increased variability. Non-parametric and RNA-seq-derived methods performed best, challenging the assumption that compositional methods are superior.
February 19, 2025 at 11:53 AM
MicrobiomeBenchmarkData includes:
1) Oral microbiomes (supragingival vs. subgingival plaques)
2) Vaginal microbiomes (healthy vs. bacterial vaginosis)
3) Spike-in dataset with known absolute abundances
These datasets cover diverse complexities.
February 19, 2025 at 11:53 AM
The Need for Ground Truth Data: DA method benchmarks usually rely on synthetic data, simulations, or expt'l data without a sequencing-independent biological ground truth. Our BioC package fills this gap with 3 experimental datasets with known ground truths www.bioconductor.org/packages/Mic...
MicrobiomeBenchmarkData
The MicrobiomeBenchmarkData package provides functionality to access microbiome datasets suitable for benchmarking. These datasets have some biological truth, which allows to have expected results for...
www.bioconductor.org
February 19, 2025 at 11:53 AM
Love this post!
December 21, 2023 at 9:15 PM