Ehud Karavani
@ehudk.bsky.social
Research Staff Member at IBM Research.
Causal Inference 🔴→🟠←🟡.
Machine Learning 🤖🎓.
Data Communication 📈.
Healthcare ⚕️.
Creator of 𝙲𝚊𝚞𝚜𝚊𝚕𝚕𝚒𝚋: https://github.com/IBM/causallib
Website: https://ehud.co
it ain't nuthin' but a g thang
November 8, 2025 at 5:07 PM
An excellent opportunity to sneak a DAG into a flowchart
October 23, 2025 at 2:05 PM
these parentheses have a lot to unpack 😅
May 22, 2025 at 5:53 AM
fixed it 🤭
May 14, 2025 at 6:46 AM
I'll admit the argument through IF theory is beyond me at this hour of my day, but I believe my case holds in the simplest simulation conceivable.
full code: gist.github.com/ehudkr/a9dd3...
April 26, 2025 at 7:28 PM
oh don't look at me just causally bayesianizing my lm
April 25, 2025 at 11:27 AM
starting on quantum computing, and now I cannot unsee bra-kets over Bell states everywhere I go.
April 23, 2025 at 3:04 PM
i mean, not quite, but also not entirely wrong?
February 27, 2025 at 12:15 PM
when I present this topic, I often also exclude the upper-right quadrant to amplify the effect (and for realism 😬😅)
February 26, 2025 at 5:09 PM
However, if the variables affecting the decision to prolong treatment (X1) are themselves affected by the first treatment decision (A0), which is plausible, then regular regression methods can no longer provide a valid estimate, because of the feedback between the treatment and the confounders.
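For concreteness, here's a toy simulation of that feedback structure (my own invention, with made-up variable names and coefficients, not from the thread), plus one standard fix, inverse-probability weighting for a marginal structural model; it's a sketch, not necessarily what the author had in mind:

```python
# Toy illustration of treatment-confounder feedback:
# X1 confounds A1 but is itself caused by A0.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200_000
pre = rng.normal(size=n)
X0 = rng.normal(size=n)
A0 = rng.binomial(1, 1 / (1 + np.exp(-(X0 + 0.5 * pre))))
X1 = 0.8 * A0 + 0.5 * X0 + rng.normal(size=n)   # affected by A0 (the feedback)
A1 = rng.binomial(1, 1 / (1 + np.exp(-X1)))
post = pre + A0 + A1 + 0.6 * X1 + rng.normal(size=n)
df = pd.DataFrame(dict(pre=pre, X0=X0, A0=A0, X1=X1, A1=A1, post=post))

# Plain regression is stuck: including X1 blocks the A0 -> X1 -> post path
# (so A0's coefficient misses part of its effect), while excluding X1 leaves
# A1 confounded.
print(smf.ols("post ~ pre + X0 + A0 + X1 + A1", df).fit().params[["A0", "A1"]])
print(smf.ols("post ~ pre + X0 + A0 + A1", df).fit().params[["A0", "A1"]])

# One standard fix: inverse-probability weighting, i.e. a marginal structural
# model for the joint intervention on (A0, A1), with unstabilized weights.
ps0 = smf.logit("A0 ~ pre + X0", df).fit(disp=0).predict(df)
ps1 = smf.logit("A1 ~ X1", df).fit(disp=0).predict(df)
w = 1 / (np.where(df["A0"] == 1, ps0, 1 - ps0) *
         np.where(df["A1"] == 1, ps1, 1 - ps1))
msm = smf.wls("post ~ A0 + A1", df, weights=w).fit()
# In this data-generating process the joint-intervention effects are
# A0: 1 + 0.8 * 0.6 = 1.48 (direct plus the path through X1) and A1: 1.
print(msm.params[["A0", "A1"]])
```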
February 19, 2025 at 3:18 PM
If the second treatment is also confounded (as is probably the case) BUT THESE CONFOUNDERS ARE NOT AFFECTED BY THE FIRST TREATMENT, then the adjustment is still quite simple:
post ~ pre + X0 + A0 + X1 + A1
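A minimal sketch of fitting that adjustment with statsmodels' formula API; the data, variable names, and coefficients are my own toy simulation (X1 confounds A1 but is not affected by A0), not from the thread:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
n = 10_000
df = pd.DataFrame({
    "pre": rng.normal(size=n),
    "X0": rng.normal(size=n),
    "X1": rng.normal(size=n),   # confounds A1 but is NOT affected by A0
})
df["A0"] = rng.binomial(1, 1 / (1 + np.exp(-df["X0"])))
df["A1"] = rng.binomial(1, 1 / (1 + np.exp(-df["X1"])))
df["post"] = (df["pre"] + df["A0"] + df["A1"]
              + 0.5 * df["X0"] + 0.5 * df["X1"] + rng.normal(size=n))

# Adjusting for both sets of confounders in one regression is valid here
# precisely because X1 is not a descendant of A0.
model = smf.ols("post ~ pre + X0 + A0 + X1 + A1", data=df).fit()
print(model.params[["A0", "A1"]])  # both should be close to 1
```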
February 19, 2025 at 3:18 PM
If you care about the cumulative effect of treatment over time, then you need to account for treatment varying over time.

In the simplest case, if the second treatment is completely randomized then it isn't a big deal:
post ~ pre + X0 + A0 + A1
will suffice
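A matching toy sketch (again, my own made-up data) where A1 is completely randomized, so no second-stage confounder needs to enter the model:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 10_000
pre = rng.normal(size=n)
X0 = rng.normal(size=n)
A0 = rng.binomial(1, 1 / (1 + np.exp(-X0)))   # confounded by X0
A1 = rng.binomial(1, 0.5, size=n)             # completely randomized
post = pre + A0 + A1 + 0.5 * X0 + rng.normal(size=n)

df = pd.DataFrame(dict(pre=pre, X0=X0, A0=A0, A1=A1, post=post))
fit = smf.ols("post ~ pre + X0 + A0 + A1", data=df).fit()
print(fit.params[["A0", "A1"]])  # both ≈ 1: adjusting for X0 alone suffices
```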
February 19, 2025 at 3:18 PM
Welcome to the zoo of time-varying treatments, Solomon. There are different answers depending on the question of interest.

The simplest answer is to ignore the treatment being dynamic. It will answer whether treatment _initiation_ is effective, regardless of how well patients stick to the protocol.
February 19, 2025 at 3:18 PM
now I'm feeling like a sucker for making this absolutely stunning graphical abstract last week
February 4, 2025 at 1:58 PM
This is a bit of a tangent, but still a related and interesting perspective on the topic (and the authors seem to have read Ben there)
arxiv.org/abs/2407.12220
January 31, 2025 at 8:17 PM
I'm flattered, but now you made me draw DAGs, Ben.
On the left, you don't expect ɛ (y=f(X)+ɛ) to be consistent across data splits since it's random, and thus fitting it is bad.
On the right, you don't expect U (ruler) to appear on deployment, so a model using it instead of X (skin) will be wrong.
January 31, 2025 at 8:08 PM
Appreciate the post, and I agree DL provided new evidence.
I just think overfitting assumes i.i.d. train/test data, so I'm not sure cases like the one described in this paragraph hold (e.g., a black swan).
I don't think poor performance due to distribution shift would be classified as "overfitting".
January 31, 2025 at 9:09 AM
For a Bayesian view of this issue I can recommend "Regularization and Confounding in Linear Regression for Treatment Effect Estimation" by Hahn and friends, if only for coining "regularization-induced confounding"
projecteuclid.org/journals/bay...
January 15, 2025 at 9:26 PM
Gave a hands-on causal inference workshop in Python tonight at the DataNights/DataHack causality course and really enjoyed how engaged everyone was.
January 7, 2025 at 8:36 PM
oh of course, the 3rd english dialect from the pirates of the C
January 6, 2025 at 1:14 PM
I didn't know Quarto could do this!
But it seems as simple as everything else in Quarto:
closeread.dev
January 6, 2025 at 10:24 AM
"next week" who was I kidding...
anyways, it's here now, and I'm glad I sat down and wrote it because I discovered I had to iron out some personal misunderstandings.
so without further ado and with even shinier visuals, a post about double cross-fitting for #causalinference
ehud.co/blog/2024/03...
January 2, 2025 at 7:17 PM