František Bartoš
@fbartos.bsky.social
PhD Candidate | Psychological Methods | UvA Amsterdam | interested in statistics, meta-analysis, and publication bias | once flipped a coin too many times
We developed the PublicationBiasBenchmark R package (github.com/FBartos/Publ...), which can be easily extended with new methods and measures. It also automatically generates a webpage with summary reports (fbartos.github.io/PublicationB...). All the raw data, results, and measures are available on OSF.
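A minimal sketch of getting started, assuming the GitHub repository is named after the package; remotes::install_github() is a real function, but consult the linked README for the package's actual workflow:

```r
## Assumption: the repository is FBartos/PublicationBiasBenchmark, matching
## the package name; see the linked README for the documented API.
# install.packages("remotes")
remotes::install_github("FBartos/PublicationBiasBenchmark")
library(PublicationBiasBenchmark)
```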
October 23, 2025 at 4:02 PM
Our proposal addresses other issues with current simulation studies (incomparability, irreproducibility...).
We demonstrate the living synthetic benchmark methodology on the publication bias adjustment literature. See how previous simulations use different methods and measures.
October 23, 2025 at 4:02 PM
To start the process, we suggest
- collecting all published methods and simulations
- evaluating all methods on all simulations (see the sketch after this list)
- publishing this set of results as the initial synthetic benchmark
- letting later research update this benchmark with new methods and simulations
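To make the "all methods on all simulations" step concrete, here is a minimal sketch in R; the method registry, conditions, and measures below are illustrative stand-ins, not the PublicationBiasBenchmark API:

```r
## Illustrative sketch only -- hypothetical method registry and conditions;
## metafor::rma() is a real function, the rest is made up for the example.
library(metafor)
set.seed(2025)

# Each method maps effect sizes (yi) and variances (vi) to a point estimate.
methods <- list(
  fixed_effect   = function(yi, vi) weighted.mean(yi, 1 / vi),
  random_effects = function(yi, vi) unname(coef(rma(yi, vi)))
)

# Simulation conditions collected from the (hypothetical) literature.
conditions <- expand.grid(mu = c(0, 0.3), tau = c(0, 0.15))

# Every method runs on every condition; the same measures are computed.
results <- do.call(rbind, lapply(seq_len(nrow(conditions)), function(i) {
  mu <- conditions$mu[i]; tau <- conditions$tau[i]
  est <- replicate(100, {
    vi <- runif(20, 0.01, 0.10)                       # sampling variances
    yi <- rnorm(20, mu + rnorm(20, 0, tau), sqrt(vi)) # observed effects
    vapply(methods, function(m) m(yi, vi), numeric(1))
  })
  data.frame(mu, tau, method = rownames(est),
             bias = rowMeans(est) - mu,
             rmse = sqrt(rowMeans((est - mu)^2)))
}))
print(results)
```

New methods or new conditions would then just be appended to the registry and the grid, and the whole table recomputed.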
October 23, 2025 at 4:02 PM
We want to separate those two steps.
New simulations should be published without new methods. Instead, they should evaluate all existing methods.
New methods should be published without new simulations. Instead, they should be assessed on all existing simulations.
October 23, 2025 at 4:02 PM
> Why are you actively misrepresenting what others are saying all the time?
I'm happy to discuss with you in person if we meet anywhere, but I don't find replying to you online very productive at this point.
September 24, 2025 at 1:37 PM
> Carter et al are right, and you are wrong
That's pretty much just arguing from authority
September 24, 2025 at 1:35 PM
I did not say meta-analyses with huge heterogeneity lol. I said under any heterogeneity. Would you consider tau = 0.1-0.2 on Cohen's d scale with an average effect size of 0.2-0.4 huge? I would not. Pretty meaningful result (and probably representative of many meta-analyses), but p-curve fails.
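For concreteness, a toy version of the scenario I have in mind (my own illustrative numbers, not the linked simulations):

```r
## Mean d = 0.3 with tau = 0.15 -- hardly "huge" heterogeneity.
library(metafor)
set.seed(1)

k   <- 30                                 # number of studies
n   <- sample(30:150, k, replace = TRUE)  # per-group sample sizes
d_i <- rnorm(k, 0.3, 0.15)                # heterogeneous true effects
vi  <- 2 / n + d_i^2 / (4 * n)            # approx. sampling variance of d
yi  <- rnorm(k, d_i, sqrt(vi))            # observed Cohen's d

# A plain random-effects model recovers mu and tau reasonably well here;
# a p-curve estimate would have to come from its own implementation.
rma(yi, vi, method = "REML")
```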
September 24, 2025 at 1:33 PM
> P-curve does what worse than random effects?
All the simulations I linked show that p-curve estimates the effect size worse, on average, than random effects.
September 24, 2025 at 1:31 PM
Must've been a bug on the platform -- I could not see any responses I sent to the thread, but other features worked fine.
September 24, 2025 at 1:29 PM
For some reason, I cannot reply to Lakens anymore?
Regardless, if anyone is interested in the topic:
- Carter does not say something completely opposite to my claims
- I^2 is not a measure of absolute heterogeneity (quick illustration after this list); Lakens's argument strawmans meta-analysis
- p-curve does worse than random effects
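On the I^2 point, a quick illustration (toy numbers of my own) of why it tracks relative, not absolute, heterogeneity: with tau fixed, I^2 = tau^2 / (tau^2 + s^2) still moves with the typical within-study variance s^2:

```r
## Same absolute heterogeneity (tau = 0.15), very different I^2,
## using the standard approximation I^2 = tau^2 / (tau^2 + s^2).
tau <- 0.15
for (n in c(30, 300)) {
  s2 <- 2 / n  # approx. within-study variance of Cohen's d, n per group
  cat(sprintf("n = %3d per group: I^2 = %2.0f%%\n",
              n, 100 * tau^2 / (tau^2 + s2)))
}
# n =  30 per group: I^2 = 25%
# n = 300 per group: I^2 = 77%
```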
September 24, 2025 at 12:22 PM
It's not completely opposed - they say the method works well only under no heterogeneity. From their and other simulation studies, it seems that a simple random-effects model performs better than p-curve even when publication bias is present. As such, I don't see any reason to use the method.
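A sketch of how such a comparison is set up (illustrative numbers, not the linked simulations); the p-curve estimate itself would come from its own implementation:

```r
## Heterogeneous studies plus one-sided selection for significance; the
## question is whether p-curve beats this plain random-effects fit.
library(metafor)
set.seed(42)

k   <- 200                                # studies before selection
n   <- sample(30:150, k, replace = TRUE)
d_i <- rnorm(k, 0.3, 0.15)                # heterogeneous true effects
vi  <- 2 / n + d_i^2 / (4 * n)
yi  <- rnorm(k, d_i, sqrt(vi))
p   <- 2 * pnorm(-abs(yi / sqrt(vi)))     # two-sided p-values

# Significant positive results always published; the rest 10% of the time.
published <- (p < .05 & yi > 0) | (runif(k) < .10)

rma(yi[published], vi[published], method = "REML")  # overestimates mu = 0.3
```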
September 24, 2025 at 12:22 PM
How is it directly opposite to what I'm saying?
Also, glad we got to the late-stage science where you start pulling arguments from authority. Always great debating with you :)
September 24, 2025 at 12:22 PM
You are still free to find any third-party realistic simulations to address my claim :)
September 24, 2025 at 8:22 AM
The issue is it fails even with low heterogeneity; you are just caricaturing any slightly heterogeneous meta-analysis right now.
September 24, 2025 at 8:21 AM
> And hey, even if a paper they wrote in 2014 on a new method is now partially outdated, so what?
I accept the critique and acknowledge the method is outdated and should not be used. It might have been a great idea back then, but it turned out not to be.
September 24, 2025 at 8:19 AM
btw, we just released JASP 0.95.2, which fixes some previously reported stability issues -- consider updating your version :)
September 17, 2025 at 9:40 AM