Erick Scott
@erickscott.bsky.social
Scientist, building cStructure.
Have you considered shifting/adding an input route to left or right of the origin?

This would solve the final central tendency problem while also signifying baseline differences. Mixture of distributions also 'falls out' of this set up.
November 4, 2025 at 12:05 AM
I'd love to see someone specify that function...where does Terence Tao sit on the curve?
October 9, 2025 at 8:02 PM
I have been surprised that the first generation is usually more thoughtful with the sycophant warning as a system prompt.

I won't speculate on what's actually happening with matmul/reasoning, but I have found it helpful to counteract the vendor's ingratiating base prompt.
October 9, 2025 at 6:30 PM
Try adding: 'Don't be a sycophant' to your system prompt.

Gemini is more stubborn...
October 9, 2025 at 5:51 PM
I don't think they are useless, and DeepMind is definitely moving forward with principled quantitative approaches. Moving stochastic outputs into structured models + rapid human error correction is what @travisgerke.bsky.social and I are working on at cStructure. Happy to chat anytime.
October 9, 2025 at 5:49 PM
The code that supposedly underpinned the analysis used a fake propensity score (0.1*covariate1 + 0.2*covariate2...) with a comment that a real propensity model should be implemented.

This happens all the time: the code syntax was fine and the semantics were wrong, while the associated text was plausible. User beware.
2/2
October 9, 2025 at 5:45 PM
I have many similar stories. For example, I asked for a propensity score analysis of the Lalonde dataset, assuming this canonical example would be a best-case scenario. I provided the dataset. The generated text provided a correct and nuanced description of the estimator and the ATE. 1/2
October 9, 2025 at 5:41 PM
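To make the syntax-versus-semantics point in the two-part thread above concrete, here is a minimal sketch on synthetic data (not the Lalonde dataset, and not the generated code from the thread). It assumes two covariates and a true ATE of 2.0, both invented for illustration, and every function name is hypothetical. It contrasts a hard-coded linear "score" of the kind described with a propensity model that is actually estimated and then used for inverse probability weighting.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n=4000, seed=0):
    """Synthetic data (assumption): two confounders, true ATE = 2.0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        t = 1 if rng.random() < sigmoid(0.8 * x1 - 0.5 * x2) else 0
        y = 2.0 * t + 1.5 * x1 + 1.0 * x2 + rng.gauss(0, 1)
        rows.append((x1, x2, t, y))
    return rows


def fit_logistic(rows, steps=200, lr=1.0):
    """Estimate P(T=1 | x1, x2) by gradient ascent on the log-likelihood."""
    w = [0.0, 0.0, 0.0]
    n = len(rows)
    for _ in range(steps):
        g = [0.0, 0.0, 0.0]
        for x1, x2, t, _ in rows:
            err = t - sigmoid(w[0] + w[1] * x1 + w[2] * x2)
            g[0] += err
            g[1] += err * x1
            g[2] += err * x2
        w = [wi + lr * gi / n for wi, gi in zip(w, g)]
    return w


def naive_diff(rows):
    """Unadjusted difference in mean outcomes, treated minus control."""
    treated = [y for _, _, t, y in rows if t == 1]
    control = [y for _, _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_ate(rows, score):
    """Normalized (Hajek) inverse-probability-weighted ATE estimate."""
    num1 = den1 = num0 = den0 = 0.0
    for x1, x2, t, y in rows:
        e = score(x1, x2)
        if t == 1:
            num1 += y / e
            den1 += 1.0 / e
        else:
            num0 += y / (1.0 - e)
            den0 += 1.0 / (1.0 - e)
    return num1 / den1 - num0 / den0


rows = simulate()

# The hard-coded "score" is not a probability: count values outside (0, 1).
invalid = sum(not 0.0 < 0.1 * x1 + 0.2 * x2 < 1.0 for x1, x2, _, _ in rows)

w = fit_logistic(rows)
naive = naive_diff(rows)
ipw = ipw_ate(rows, lambda x1, x2: sigmoid(w[0] + w[1] * x1 + w[2] * x2))

print(f"hard-coded scores outside (0, 1): {invalid} of {len(rows)}")
print(f"naive difference in means: {naive:.2f}")
print(f"IPW with estimated propensity: {ipw:.2f} (true ATE is 2.0)")
```

Both versions run without a syntax error, which is the trap: only the estimated model yields valid probabilities that can be inverted into weights.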
Bayesian posterior distributions. So much information packed into the density. If two people disagree on what threshold should be used to make a decision, it's easy to calculate the support for either.
April 18, 2025 at 2:59 AM
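A minimal sketch of the threshold point above, assuming a Beta posterior for a rate under a uniform prior. The counts (30 successes, 70 failures) and the two thresholds are made up for illustration; once posterior draws exist, the support for each person's preferred threshold is just a tail proportion.

```python
import random


def posterior_samples(successes, failures, n_draws=20000, seed=1):
    """Draws from Beta(1 + successes, 1 + failures): uniform prior, binomial data."""
    rng = random.Random(seed)
    return [rng.betavariate(1 + successes, 1 + failures) for _ in range(n_draws)]


def prob_above(samples, threshold):
    """Posterior probability that the rate exceeds the threshold."""
    return sum(s > threshold for s in samples) / len(samples)


# Hypothetical data: 30 successes in 100 trials.
draws = posterior_samples(successes=30, failures=70)

# Two people, two decision thresholds, one posterior.
for threshold in (0.25, 0.35):
    print(f"P(rate > {threshold}) = {prob_above(draws, threshold):.3f}")
```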
Reminds me of the difference b/w Efficacy (ITT if properly used, e.g. abstinence for teen pregnancy) vs Effectiveness (PP, outcomes when abstinence is used in practice).

In practice, I think the effectiveness of a causal method is important as unmeasured confounding is ever present in real data.
April 17, 2025 at 6:49 PM
I really do believe Gordon et al. offered a best effort assessment. The scale, diversity, and quality of their ground truth is quite impressive.

What would you have done differently to reduce assumption violation?
April 17, 2025 at 3:16 PM
I see simulations as a useful tool to assess method performance under various degrees of assumption violation.

I also think the simulations should approximate the magnitude and direction of bias seen in high quality empirical studies.
April 17, 2025 at 3:12 PM
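As one way to read the simulation idea above, here is a small sketch that dials up a single assumption violation (an unmeasured confounder) and records the bias of a naive difference in means. The data-generating process, the confounding strengths, and the function name are all assumptions chosen for illustration; they are not calibrated to any empirical benchmark.

```python
import math
import random
from statistics import mean


def naive_bias(confounding, n=20000, true_effect=1.0, seed=0):
    """Bias of the unadjusted treated-minus-control difference in means."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)  # unmeasured confounder
        p_treat = 1.0 / (1.0 + math.exp(-confounding * u))
        t = 1 if rng.random() < p_treat else 0
        y = true_effect * t + confounding * u + rng.gauss(0, 1)
        (treated if t else control).append(y)
    return mean(treated) - mean(control) - true_effect


# Arbitrary grid of confounding strengths; 0.0 means the assumption holds.
biases = {c: naive_bias(c) for c in (0.0, 0.5, 1.0)}
for strength, bias in biases.items():
    print(f"confounding={strength}: bias={bias:+.2f}")
```

In a benchmark-informed version, the grid would be chosen so the resulting biases match the magnitude and direction seen in high-quality empirical comparisons.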
I love the WeightIt package, thanks for developing it.

How do you interpret these simulation papers in light of large-scale empirical benchmarks?

www.researchgate.net/publication/...
(PDF) Close Enough? A Large-Scale Exploration of Non-Experimental Approaches to Advertising Measurement
April 17, 2025 at 12:51 AM
Am I the only one in industry who looks at this thread and remembers junior hires showing up to their first stakeholder meeting after throwing "all the x's" into scikit-learn and then getting absolutely thrashed by the domain experts?
April 16, 2025 at 7:38 PM
It's like we learned absolutely nothing from the reproducibility crisis, kitchen-sink machine learning models for COVID, population/environmental stratification in genomics, A/B testing at scale...sigh.
April 16, 2025 at 7:34 PM
I just named several industries that in practice don't blindly use LASSO. A/B testing is used by any industry with a website/app and small to large companies employ (data) scientists to design and analyze the experiments. Healthcare is a pretty large industry, Computing is a pretty large industry??
April 16, 2025 at 7:06 PM
Then I am puzzled by the idea that in practice scientists just expect LASSO to select the right variables. Here are the SHAP docs describing why that is a bad assumption **in practice**:
shap.readthedocs.io/en/latest/ex...
April 16, 2025 at 6:14 PM
You should encourage him to explore causal inference.

Practical applications that share the same concern about LASSO: A/B testing, drug development, electrical engineering, physics
April 16, 2025 at 2:36 PM
Empirical studies

RCT-Duplicate: www.rct-duplicate.org

FDA RWE examples: www.fda.gov/media/146258... &

Northwestern/Meta A/B testing: www.kellogg.northwestern.edu/faculty/gord... and arxiv.org/abs/2201.07055
April 14, 2025 at 2:01 PM
There are several excellent technical books on this subject.

Causal Inference: What If by Hernán and Robins: miguelhernan.org/whatifbook

Causal Inference in Statistics: A Primer by Pearl, Glymour, and Jewell: www.amazon.com/Causal-Infer...

Causality by Pearl: bayes.cs.ucla.edu/BOOK-2K/
April 14, 2025 at 1:52 PM
Rectangle is amazing for organizing windows
rectangleapp.com

Spaces are also really helpful to keep work streams partitioned. 3 finger up/down

DBeaver is the best free database GUI for Mac

Homebrew for package management is a must
March 28, 2025 at 9:32 PM