Anne Scheel
@annemscheel.bsky.social
Assistant prof at Utrecht University, trying to make science as reproducible as non-scientists think it is. Blogs at @the100ci.
Completely agree with both of these sentiments — setting a meaningful SESOI is hard, but it’s the right discussion to have!

bsky.app/profile/anne...
October 31, 2025 at 4:06 PM
If I understand you correctly, that would imply restricting your sample size/power so that effects smaller than what you consider relevant wouldn’t become significant. That’s not ideal. Equivalence tests allow you to distinguish between significant and relevant.
October 31, 2025 at 1:24 PM
I personally think that dichotomous claims are very important in science (to the degree that I think binary statistical decision criteria would keep re-emerging naturally even if you’d somehow abolish them now) but of course there are many research questions for which tests would be the wrong tool.
October 31, 2025 at 11:06 AM
1) IMO the choice between Bayesian vs frequentist methods is orthogonal to the choice between tests and (eg) estimation. 2) If you’re not interested in dichotomous claims, of course you don’t need to test, which I tried to imply here bsky.app/profile/anne...
October 31, 2025 at 11:06 AM
Reposted by Anne Scheel
So (beyond this specific example of anchor-based approaches) I would be happy to see many diverse applied examples on how SESOIs have been reasonably specified in sport and exercise science (if they exist...).
October 31, 2025 at 5:19 AM
(Or that a test actually isn’t the best tool for answering the research question, for that matter)
October 31, 2025 at 10:30 AM
Jinx!
October 31, 2025 at 10:28 AM
These are exactly the right discussions to have IMO, especially when they help us better understand what knowledge we need (and might still be missing) for setting up meaningful, informative tests.
October 31, 2025 at 10:28 AM
I completely agree! But if you realise that you can’t specify a SESOI for the life of you, it usually means that you can’t specify your research hypothesis enough to make it statistically testable. In that case, the best decision may be to do something else: journals.sagepub.com/doi/10.1177/...
Why Hypothesis Testers Should Spend Less Time Testing Hypotheses - Anne M. Scheel, Leonid Tiokhin, Peder M. Isager, Daniël Lakens, 2021
For almost half a century, Paul Meehl educated psychologists about how the mindless use of null-hypothesis significance tests made research on theories in the s...
journals.sagepub.com
October 31, 2025 at 10:23 AM
Without an explicit SESOI, readers will apply implicit SESOIs (as happened here) and the discussion becomes unnecessarily confused. SESOIs have nothing to do with sample size, but "with large N, people suddenly become aware of the matter" (@dingdingpeng.the100.ci). Thanks for coming to my TED talk.
October 31, 2025 at 8:13 AM
The solution is to specify and test your alternative hypothesis. You can do this by defining a smallest effect size of interest (SESOI) and performing an equivalence or inferiority test, *in addition to* the null-hypothesis test. Here's a whole tutorial about it: doi.org/10.1177/2515... >
October 31, 2025 at 8:13 AM
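As a rough illustration of the equivalence-testing idea, here is a minimal Python sketch of the two one-sided tests (TOST) procedure against a SESOI. The simulated data, the SESOI of 0.2 raw units, and the helper name tost_ind are illustrative assumptions, not taken from the linked tutorial.

```python
# Minimal TOST equivalence-test sketch (illustrative only; see the tutorial
# linked above for the full procedure). Assumes two independent samples and a
# SESOI expressed as a raw mean difference; all numbers below are made up.
import numpy as np
from scipy import stats

def tost_ind(x1, x2, sesoi):
    """Two one-sided t-tests for equivalence within (-sesoi, +sesoi)."""
    # Test 1: H0: mu1 - mu2 <= -sesoi  vs  H1: mu1 - mu2 > -sesoi
    # (shifting a sample by a constant leaves its variance unchanged)
    _, p_lower = stats.ttest_ind(x1 + sesoi, x2, alternative="greater")
    # Test 2: H0: mu1 - mu2 >= +sesoi  vs  H1: mu1 - mu2 < +sesoi
    _, p_upper = stats.ttest_ind(x1 - sesoi, x2, alternative="less")
    # Equivalence is claimed only if BOTH one-sided tests are significant,
    # so the TOST p-value is the larger of the two.
    return max(p_lower, p_upper)

rng = np.random.default_rng(1)
x1 = rng.normal(0.05, 1, size=5000)   # tiny true difference of 0.05
x2 = rng.normal(0.00, 1, size=5000)
sesoi = 0.2                           # smallest raw difference we care about

_, p_nil = stats.ttest_ind(x1, x2)    # ordinary nil-hypothesis test
p_tost = tost_ind(x1, x2, sesoi)
print(f"test against 0:   p = {p_nil:.4f}")   # may well be 'significant'
print(f"equivalence test: p = {p_tost:.4f}")  # small p => effect within ±SESOI
```

A small TOST p-value lets you claim that the effect, whatever its sign, is smaller than the SESOI, even in cases where the test against 0 comes out significant.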
But *only* testing against 0 usually isn't very interesting and has problems. If you get a non-sig. result, you can claim that power was too low and dismiss it. If you get a sig. result but the effect size is very small, @aaronjfisher.bsky.social may come and say that it's actually meaningless. >
October 31, 2025 at 8:13 AM
Statistical tests (and thus p-values, if you use frequentist methods) are useful if you want to test a hypothesis or make a dichotomous claim (for a nice overview, see doi.org/10.1177/0959... by @uyguntunc.bsky.social et al.), regardless of whether N is small or large. >
The epistemic and pragmatic function of dichotomous claims based on statistical hypothesis tests - Duygu Uygun Tunç, Mehmet Necip Tunç, Daniël Lakens, 2023
Researchers commonly make dichotomous claims based on continuous test statistics. Many have branded the practice as a misuse of statistics and criticize scienti...
doi.org
October 31, 2025 at 8:13 AM
2) But the real elephant in the room is *your effect of interest*. @aaronjfisher.bsky.social is right in that statistical significance doesn't tell you if an effect is practically or theoretically meaningful. But that neither means that p-values are useless, nor does it depend on the sample size. >
October 31, 2025 at 8:13 AM
5% is quite a lot if you think about it. Huge N gives you the luxury to reduce alpha by a lot and still keep very high power. E.g., alpha = 0.5% (0.005) would give you 98.8% power for the same effect (in a t-test). The best balance depends on the cost of each error type, see tinyurl.com/yut35b3u >
Justify Your Alpha by Minimizing or Balancing Error Rates
A preprint ("Justify Your Alpha: A Primer on Two Practical Approaches") that extends the ideas in this blog post is available at: https://ps...
tinyurl.com
October 31, 2025 at 8:13 AM
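The 99.9% and 98.8% figures can be checked with the normal approximation to the power of a two-sided test; the noncentrality value below is an assumption chosen so that power at alpha = .05 is 99.9%, matching the example in the post.

```python
# Sketch: power of a two-sided z/t-test (normal approximation) at two alphas.
# The noncentrality parameter is an assumption picked so that power at
# alpha = .05 is ~99.9%, as in the post above; it comes from no real data.
from scipy import stats

def power_two_sided(noncentrality, alpha):
    z_crit = stats.norm.ppf(1 - alpha / 2)
    # Ignore the negligible probability of rejecting in the wrong direction.
    return stats.norm.sf(z_crit - noncentrality)

nc = stats.norm.ppf(0.999) + stats.norm.ppf(1 - 0.05 / 2)  # ~5.05
print(f"power at alpha = .05:  {power_two_sided(nc, 0.05):.3f}")   # ~0.999
print(f"power at alpha = .005: {power_two_sided(nc, 0.005):.3f}")  # ~0.988
```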
1) With the conventional alpha = 5% and a huge sample, you may have extremely high power for your effect of interest – say, 99.9%. That means beta (type-II error rate) = 0.1%. Are you sure that you want your type-I error rate to be 50x the size of your type-II error rate? >
October 31, 2025 at 8:13 AM
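For completeness, a minimal sketch of the "justify your alpha" idea from the blog post linked above: choose the alpha at which the (cost-weighted) Type I and Type II error rates balance. The effect size, sample size, and cost ratio below are made-up assumptions, and the normal approximation stands in for the exact t-test power.

```python
# Sketch of the 'balance error rates' idea from the Justify Your Alpha post:
# pick alpha so that cost-weighted alpha equals beta. Effect size, sample size,
# and cost ratio are illustrative assumptions, not recommendations.
from scipy import stats, optimize

def beta_two_sample(d, n_per_group, alpha):
    """Type-II error rate of a two-sided two-sample test (normal approximation)."""
    noncentrality = d * (n_per_group / 2) ** 0.5
    z_crit = stats.norm.ppf(1 - alpha / 2)
    return stats.norm.cdf(z_crit - noncentrality)

d, n, cost_ratio = 0.2, 2000, 1.0   # cost_ratio = cost(Type I) / cost(Type II)

# Find the alpha where cost-weighted Type I and Type II error rates are equal.
alpha_balanced = optimize.brentq(
    lambda a: cost_ratio * a - beta_two_sample(d, n, a), 1e-6, 0.5
)
print(f"balanced alpha ~ {alpha_balanced:.4f}, "
      f"beta ~ {beta_two_sample(d, n, alpha_balanced):.4f}")
```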