Clintin Davis-Stober
@clintin.bsky.social
Professor, quantitative psychology, decision theory, data science, mathematics, statistics, open science, modeling, weight lifting, photography, enjoyer of poetry
www.davis-stober.com
My front yard :)
October 16, 2025 at 6:58 PM
Definitely something worth digging into. I’ll give it some thought
August 10, 2025 at 8:32 PM
Reposted by Clintin Davis-Stober
here was our call for methodological standards in metaresearch four years ago. instead of getting fixated on a particular inference we'd like to make, we need to maintain scientific standards, do the hard work, respect the evidence. we can't keep jumping at self-serving solutions without question.
The case for formal methodology in scientific reform | Royal Society Open Science
August 10, 2025 at 3:40 PM
I would add that while p-curve is composed of tests that do indeed correspond to error rates, the actual hypotheses being tested have little to do with “evidential value”
August 10, 2025 at 4:40 PM
The test statistics being used are just simple sums; no third-moment information enters the test.
August 10, 2025 at 12:36 PM
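The "simple sums" point can be made concrete with a small sketch. This is not the authors' code: the half-normal transformation and the .05 significance cutoff follow the published p-curve recipe, but the two three-study p-value sets below are invented for illustration. They are constructed so that the transformed values sum to the same number, so the test statistic is identical, even though the raw p-values are skewed in opposite directions.

```python
import numpy as np
from scipy.stats import norm, skew

ALPHA = 0.05  # p-curve only uses statistically significant results

def stouffer_z(pvals):
    """Stouffer-style p-curve statistic: a plain (rescaled) sum of
    transformed p-values. No third-moment (skew) information enters."""
    pp = np.asarray(pvals) / ALPHA   # "pp-values": uniform under the null
    z = norm.ppf(pp)                 # transform each p-value to a z-score
    return z.sum() / np.sqrt(len(z)) # simple sum -> one number

# Two hypothetical 3-study sets, built so the transformed z-values sum
# to -3 in both cases, while the raw p-values skew in opposite directions.
set_a = ALPHA * norm.cdf([-2.0, -2.0, 1.0])   # right-skewed p-values
set_b = ALPHA * norm.cdf([ 2.0,  2.0, -7.0])  # left-skewed p-values

za, zb = stouffer_z(set_a), stouffer_z(set_b)
print(za, zb)                     # identical statistic for both sets
print(skew(set_a), skew(set_b))   # sample skewness of opposite sign
```

Both sets give Z = -3/sqrt(3), below the one-sided 5% cutoff of about -1.645, so a sum-based test treats them identically despite the opposite skew.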
Would this be grounds for dismissing the remaining 54 studies as lacking value? It makes no sense. Part of the problem is that the original p-curve papers aren’t clear on what exactly is being tested. The authors claim they are tests of skew, but this is incorrect, as the
August 10, 2025 at 12:36 PM
Happy to clarify. P-curve is used to test whether a set of studies has (or lacks) “evidential value” (which is never really defined). But the actual hypotheses being tested by p-curve don’t permit this, as Richard and I show. Suppose one study WAS underpowered in a set of 55 studies -
August 10, 2025 at 12:36 PM
All p-curve tests are just a simple sum of transformed p-values. There is a fundamental disconnect between the null hypotheses being tested by p-curve and the claims being made.
August 9, 2025 at 9:18 PM
What this means is that a significant result for either test only allows one to claim that “at least one” study (out of the set) lacks the property being considered. Why does this happen? Because p-curve completely ignores the configuration of the p-values being considered.
August 9, 2025 at 9:18 PM
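The "at least one study" logic can be sketched numerically. A minimal, hedged example (the 55-study set is invented; the Stouffer variant of the test is assumed): 54 studies sit exactly at the null expectation and contribute nothing, yet one extreme study alone drives the statistic past the cutoff.

```python
import numpy as np
from scipy.stats import norm

ALPHA = 0.05

def pcurve_full_test(pvals):
    """Stouffer-style test of the 'evidential value' null that ALL
    studies have zero effect (significant p-values uniform below ALPHA)."""
    z = norm.ppf(np.asarray(pvals) / ALPHA)
    return z.sum() / np.sqrt(len(z))

# Hypothetical set: 54 studies exactly at the null expectation
# (p = .025, i.e. pp = 0.5, contributing z = 0 each), plus ONE study
# with an extreme p-value.
pvals = [0.025] * 54 + [1e-300]

Z = pcurve_full_test(pvals)
print(Z)  # about -5: far below the one-sided 5% cutoff of about -1.645
```

Rejecting this null licenses only "not all 55 effects are zero", i.e. at least one is nonzero; it says nothing about the other 54 studies, whose configuration the sum never sees.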
The test for evidential value simply examines whether the effect size is zero for all studies. The test for lack of evidential value tests whether all studies are “underpowered,” i.e., have small non-centrality parameters.
August 9, 2025 at 9:18 PM
The developers of p-curve claim that p-curve can be used to make claims about the evidential value (or lack thereof) of whole sets of studies. We show that the actual hypotheses being tested do not allow for such strong conclusions.
August 9, 2025 at 9:18 PM
The basic idea of p-curve rests on the notion that the skew of a set of p-values is informative about whether QRPs are occurring. As we show, the p-curve tests have nothing to do with skew. It is trivial to create left-skewed p-values that p-curve would confidently label as right-skewed.
August 9, 2025 at 9:18 PM
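Here is one hedged construction of that kind (the study set is invented; the Stouffer variant of the right-skew test is assumed): 19 p-values bunched just under .05 (the classic p-hacking signature, producing left skew) plus a single extreme p-value, which by itself pushes the sum-based statistic past the "right skew" cutoff.

```python
import numpy as np
from scipy.stats import norm, skew

ALPHA = 0.05

def right_skew_stat(pvals):
    """Stouffer statistic for p-curve's 'right skew' / evidential-value
    test; significant when well below norm.ppf(0.05), about -1.645."""
    z = norm.ppf(np.asarray(pvals) / ALPHA)
    return z.sum() / np.sqrt(len(z))

# A left-skewed set: 19 p-values piled just under .05, plus one
# extreme p-value that single-handedly dominates the sum.
pvals = [0.045] * 19 + [1e-300]

print(skew(pvals))             # negative: the p-values are LEFT-skewed
print(right_skew_stat(pvals))  # below -1.645: flagged as "right-skewed"
```

The sample skewness of the set is negative, yet the test confidently declares significant right skew, because the sum responds to one extreme value, not to the shape of the distribution.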
I'm so sorry this happened to you. There is no excuse for such bs.
June 11, 2025 at 7:50 PM