Maurício M.
@phydev.bsky.social
AI skeptic.

Sul Americano.

https://phydev.github.io
Reposted by Maurício M.
Our didactic review on machine learning for causal inference, now open access:
• identifiability (theory of when the data can answer a causal question)
• machine-learning estimators
• study design (asking well-framed questions + loopholes, e.g. with timewise data)
www.annualreviews.org/content/jour...
August 20, 2025 at 7:12 PM
Reposted by Maurício M.
🖊️AI for health: the impossible necessity of unbiased data

Is unbiased data important to build health AI? Yes!

Can there be unbiased data? No!
Building health on biased data discriminates

The notion of bias depends on the intended use:
gael-varoquaux.info/science/ai-f...
February 14, 2025 at 8:30 AM
What could go wrong when we use random forest based imputation methods for classical inference?

With a simple simulation study, we show how random forest imputation can have catastrophic effects on classical inference, introducing bias and spurious correlations.

phydev.github.io/posts/ranger...
Maurício Moreira-Soares
phydev.github.io
February 9, 2025 at 10:59 AM
Reposted by Maurício M.
Based on this #MICCAI2024 paper, we are currently preparing a new submission with a Bayesian approach to investigate the probability of false claims in medical imaging AI papers. The results are shocking… stay tuned⏰

Great collaboration with @gaelvaroquaux.bsky.social and O. Colliot
Love this. Bold face numbers in a table don't cut it:

"For >60% of papers, the 2nd-ranked method was within the CI of the 1st-ranked method. Current publications typically don't provide sufficient evidence to support which models could be translated into clinical practice" arxiv.org/abs/2409.17763
Confidence intervals uncovered: Are we ready for real-world medical imaging AI?
Medical imaging is spearheading the AI transformation of healthcare. Performance reporting is key to determine which methods should be translated into clinical practice. Frequently, broad conclusions ...
arxiv.org
January 30, 2025 at 8:01 AM
I wrote a short tutorial on how to run deepseek and other models locally with ollama and open-webui: phydev.github.io/posts/deepse...
January 30, 2025 at 10:44 AM
Reposted by Maurício M.
Wrangling string columns for machine learning, the new StringEncoder in @skrub-data.bsky.social gives such a good compute/prediction performance tradeoff.

It's mostly just a bunch of simple tricks, but with well-chosen defaults. This is what we aim for in skrub

skrub-data.org/stable/refer...
January 28, 2025 at 5:47 PM
Reposted by Maurício M.
Hot off the press! 📣📣In this tutorial we illustrate available multiple imputation approaches for handling longitudinal data including when they are clustered within higher level clusters. A reproducible example with R and Stata code provided! #OpenAccess

onlinelibrary.wiley.com/doi/10.1002/...
Multiple Imputation for Longitudinal Data: A Tutorial
Longitudinal studies are frequently used in medical research and involve collecting repeated measures on individuals over time. Observations from the same individual are invariably correlated and thu....
onlinelibrary.wiley.com
January 27, 2025 at 4:14 AM
Reposted by Maurício M.
Happy to share the first paper of my PhD is published☺️!

In case you like to use class imbalance corrections, maybe it is interesting. Let me know what you think!

onlinelibrary.wiley.com/doi/10.1002/...

Many thanks to @maartenvsmeden.bsky.social, @benvancalster.bsky.social, Anne, Kim and Carl !!
January 27, 2025 at 3:27 PM
I’ve been living in Norway for 4.5 years and am still in love with this place. Yesterday it snowed all day long, and this morning the sky is crystal clear with a beautiful yellow moon 🌙, in contrast with the white snow that paints everything. I wish I had my camera with me, a recurring thought here.
January 24, 2025 at 6:49 AM
Reposted by Maurício M.
Let us start 2025 in a positive mood: here are 10 methods things researchers can worry *less* about in 2025
December 23, 2024 at 10:36 AM
Reposted by Maurício M.
Key question to consider before submitting your paper on the development and validation of your new clinical prediction model is:

WHERE IS THE MODEL????
December 11, 2024 at 2:30 PM
Reposted by Maurício M.
NEW PAPER

A two pager with 5 criteria to evaluate prediction models, in particular those based on AI

doi.org/10.1093/eurh...
October 29, 2023 at 7:08 AM
How do E-cadherin mutations, which cause Hereditary Diffuse Gastric Cancer, lead to metastasis? We show with in vitro assays & mathematical models that invasion isn't just due to loss of cell-cell adhesiveness but also to abnormal ECM interaction & a favorable 3D tissue structure.
The ECM and tissue architecture are major determinants of early invasion mediated by E-cadherin dysf...
The use of mathematical modelling and in vitro assays shows that loss of cell-cell adhesion, increased ECM attachment and tissue architecture promote invasion in a model of hereditary diffuse gastric ...
www.nature.com
November 11, 2023 at 9:23 PM