Lightnews — Scholar-powered news

Reposted by Gilad Feldman

Emir Efendić

@emire.bsky.social

We have a new pre-print! 📝🖨️

We find that conversing with a disagreeing LLM helped improve people's inaccurate predictions!

osf.io/preprints/ps...

Let me tell you all about it:

February 12, 2026 at 9:13 AM

Reposted by Gilad Feldman

Jon Mellon

@jonmellon.bsky.social

We spent months training grad student RAs and GPT-5 mini still beat them by a lot

Ryan Briggs @ryancbriggs.net · 2d

We coded our ~100k articles using LLMs. Should you believe them? To answer this, we benchmarked 4 human RAs against 3 LLMs on their ability to recover ground truth article data. Details in the paper and appendices, but the LLMs did well and handily beat the highly trained humans.

Graphs of sensitivity, showing LLMs outperforming humans

February 11, 2026 at 5:12 PM

Reposted by Gilad Feldman

Ryan Briggs

@ryancbriggs.net

I have a new paper. We look at ~all stats articles in political science post-2010 & show that 94% have abstracts that claim to reject a null. Only 2% present only null results. This is hard to explain unless the research process has a filter that only lets rejections through.

It must be very hard to publish null results
Publication practices in the social sciences act as a filter that favors statistically significant results over null findings. While the problem of selection on significance (SoS) is well-known in theory, it has been difficult to measure its scope empirically, and it has been challenging to determine how selection varies across contexts. In this article, we use large language models to extract granular and validated data on about 100,000 articles published in over 150 political science journals from 2010 to 2024. We show that fewer than 2% of articles that rely on statistical methods report null-only findings in their abstracts, while over 90% of papers highlight significant results. To put these findings in perspective, we develop and calibrate a simple model of publication bias. Across a range of plausible assumptions, we find that statistically significant results are estimated to be one to two orders of magnitude more likely to enter the published record than null results. Leveraging metadata extracted from individual articles, we show that the pattern of strong SoS holds across subfields, journals, methods, and time periods. However, a few factors such as pre-registration and randomized experiments correlate with greater acceptance of null results. We conclude by discussing implications for the field and the potential of our new dataset for investigating other questions about political science.

February 11, 2026 at 5:00 PM
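
A back-of-the-envelope sketch of why those shares point to a strong filter. This is not the authors' calibrated model, only Bayes' rule applied to the 94% and 2% figures from the post. The share of all conducted studies that end up significant (`share_sig_produced`) is a hypothetical input that nobody observes directly, and articles with mixed results are ignored.

```python
def relative_publication_odds(pub_sig, pub_null, share_sig_produced):
    """How many times more likely a significant result is to be published
    than a null result, under a simple two-outcome model.

    pub_sig, pub_null   -- shares of PUBLISHED articles reporting significant
                           and null-only results (from the post: 0.94, 0.02)
    share_sig_produced  -- ASSUMED share of all CONDUCTED studies that yield
                           a significant result (hypothetical, not observed)
    """
    if not 0 < share_sig_produced < 1:
        raise ValueError("share_sig_produced must be strictly between 0 and 1")
    if pub_null <= 0:
        raise ValueError("pub_null must be positive")
    # Odds of significant vs null among published articles.
    published_odds = pub_sig / pub_null
    # Odds of significant vs null among all studies that were run.
    produced_odds = share_sig_produced / (1 - share_sig_produced)
    # By Bayes' rule, P(pub | sig) / P(pub | null) is the ratio of these odds.
    return published_odds / produced_odds


if __name__ == "__main__":
    for share in (0.5, 0.3):
        ratio = relative_publication_odds(0.94, 0.02, share)
        print(f"assumed share significant = {share}: ratio = {ratio:.1f}")
```

If half of all conducted studies were significant, the ratio would be about 47; if only 30% were, it would be about 110. Both fall within the "one to two orders of magnitude" range quoted in the abstract, although the paper's own model may rest on different assumptions.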

Reposted by Gilad Feldman

Ryan Briggs

@ryancbriggs.net

We coded our ~100k articles using LLMs. Should you believe them? To answer this, we benchmarked 4 human RAs against 3 LLMs on their ability to recover ground truth article data. Details in the paper and appendices, but the LLMs did well and handily beat the highly trained humans.

February 11, 2026 at 5:00 PM

Reposted by Gilad Feldman

Lukas Röseler

@aufdroeseler.bsky.social

You can still contribute events to the Love Replications Week. We are now at 12 events and gained another strong partner with @cos.io.

Lukas Röseler @aufdroeseler.bsky.social · 14d

Join us for the LOVE REPLICATIONS WEEK from March 2 - 6 with talks on reproductions, replications, how to find them, how to conduct them, how to have them conducted on your study, where to publish them, and much more!

February 11, 2026 at 11:59 AM

Reposted by Gilad Feldman

Alex Holcombe

@alexh.bsky.social

Some of the reasons articulated here are why at MetaROR.org, we provide qualitative editorial assessments, rather than making accept/reject decisions. Get your #metascience reviewed with us; you can then pass it on to a partner journal to get your binary accept/reject decision, if you like!

February 10, 2026 at 10:16 PM

Reposted by Gilad Feldman

Daniel Lakens

@lakens.bsky.social

Everything is ready for the Perspectives on Scientific Error conference that starts tomorrow in Leiden! I look forward to hanging out with the mix of metascientists, philosophers of science, and statisticians! So many old friends will be there (and hopefully some new ones)! #PSE8

February 10, 2026 at 5:10 PM

Reposted by Gilad Feldman

Randy Ellis

@randalljellis.bsky.social

Here's my conversation with Mu Yang on Metascience Matters: www.youtube.com/watch?v=E2EK...

We discussed her work as a scientific sleuth, academic incentives for positive data, individual cases she has pursued, and why she loves being a sleuth.

Also on Spotify: open.spotify.com/episode/16R6...

300+ retractions, image manipulation, and why science should be boring | Metascience Matters #3

YouTube video by Metascience Matters

www.youtube.com

February 8, 2026 at 7:10 PM

Reposted by Gilad Feldman

Alex Holcombe

@alexh.bsky.social

Rejecting another Elsevier review request. Hoping my attempt at a dispassionate tone keeps my contempt for Elsevier from leaking through.

February 8, 2026 at 8:06 PM

Reposted by Gilad Feldman

Mike Frank

@mcxfrank.bsky.social

Another example: I was teaching Turing tests in class and wanted to show off a live one with actual students. In 4-5 hours total I built something that connected students either to each other live in class or to an LLM, and logged data with a realtime visualization. github.com/mcfrank/modi...

GitHub - mcfrank/modified_turing_test: Modified turing test for SymSys 1

Modified turing test for SymSys 1. Contribute to mcfrank/modified_turing_test development by creating an account on GitHub.

github.com

February 5, 2026 at 11:44 PM

Reposted by Gilad Feldman

Mike Frank

@mcxfrank.bsky.social

Example: Whybot is a game to measure kids' curiosity: kids get animal/space/food facts, and can either jump to a new topic or drill deeper. An LLM provided the explanations. It took ~30 minutes to get a working prototype to show colleagues and compare to their alternative that took months to build.

February 5, 2026 at 11:44 PM

Reposted by Gilad Feldman

Mike Frank

@mcxfrank.bsky.social

I am flabbergasted by how much vibe coding has expanded my capacities as a scientist and teacher.

In the last few weeks, I've mocked up class demos of a live Turing test, generated cross-references for an encyclopedia, and prototyped new tablet tasks for developmental psych.

It's wild.

February 5, 2026 at 11:44 PM

Reposted by Gilad Feldman

Michèle Nuijten

@michelenuijten.bsky.social

I wrote a blog for the Meta-Research Center expressing my infinite frustration about not getting data. What else is new, you might think? Well, I added an extra layer of annoyance directed at the journals who do NOTHING to enforce promised data sharing.

metaresearch.nl/blog/2026/2/...

Promised Data Unavailable? – I’m Sorry, Ma’am, There’s Nothing We Can Do — Meta-Research Center

This blogpost has been written by Michèle Nuijten. Michèle is an assistant professor of our research group who investigates reproducibility and replicability in psychology. Also, she is the developer ...

metaresearch.nl

February 3, 2026 at 3:03 PM

Reposted by Gilad Feldman

Tom Stafford

@tomstafford.mastodon.online.ap.brid.gy

Read my latest post for reflections on reproducibility, research quality and a summary of a great new study which shows how NOT to do it

https://open.substack.com/pub/tomstafford/p/gambling-with-research-quality

Gambling with research quality

How you get 244 different ways to measure performance on the same test of decision making. And what it means for the reliability of behavioural science

tomstafford.substack.com

February 1, 2026 at 8:58 PM

Reposted by Gilad Feldman

Ian Hussey

@ianhussey.mmmdata.io

If you're interested in data sleuthing but aren't sure where to start,

or if you're conducting a systematic review/meta-analysis and want to ensure you're not including junk studies,

check out this Cochrane training session on Trustworthiness Assessment by @jdwilko.bsky.social

INSPECT-SR: A tool for assessing trustworthiness of randomised controlled trials | Cochrane

www.cochrane.org

February 2, 2026 at 6:05 PM

Reposted by Gilad Feldman

Lukas Röseler

@aufdroeseler.bsky.social

You can now explore 1K (!) replication studies with over 2K findings. That is possible online (forrt.org/FReD-apps/ex...) or locally via our R package (forrt.org/FReD/).

FReD Explorer - FORRT Replication Database

forrt.org

February 2, 2026 at 9:45 AM

Reposted by Gilad Feldman

Lukas Röseler

@aufdroeseler.bsky.social

The FORRT Replication Database has received a massive overhaul (FReD 2.0): We double-coded and validated all data from scratch and extended it in the course of a one-year partnership with @cos.io. We just switched to a faster interface thanks to @lukaswallrich.bsky.social’s wizardry.

February 2, 2026 at 9:45 AM

Reposted by Gilad Feldman

Lukas Röseler

@aufdroeseler.bsky.social

All of this is the work of a large community. Over 250 people have been working on these projects relentlessly for years. Everything that we do, we share with a CC-BY license for others to reuse. Get in touch with us if you want to join the team and contribute!

February 2, 2026 at 9:45 AM

Reposted by Gilad Feldman

Lukas Röseler

@aufdroeseler.bsky.social

Ok researchers rise and shine, it's groundhog day - what better way to get you up to date with what has been going on at the FORRT Replication Hub? forrt.org/replication-...

February 2, 2026 at 9:45 AM

Reposted by Gilad Feldman

Bastian Jaeger

@bxjaeger.bsky.social

Registered report (with 885 US MTurkers) finds no evidence for the claim that people with higher chronic loneliness have a stronger tendency to anthropomorphize nonhuman objects @giladfeldman.bsky.social

doi.org/10.1037/cns0...

February 2, 2026 at 1:56 PM

Reposted by Gilad Feldman

Lukas Röseler

@aufdroeseler.bsky.social

Check out the program and register for online talks: forrt.org/LoveReplicat...
Of course, all of this is open and for free. Get in touch if you want to present something related to repetitive research yourself.

forrt.org

January 30, 2026 at 11:44 AM

Reposted by Gilad Feldman

Lukas Röseler

@aufdroeseler.bsky.social

Join us for the LOVE REPLICATIONS WEEK from March 2 - 6 with talks on reproductions, replications, how to find them, how to conduct them, how to have them conducted on your study, where to publish them, and much more!

January 30, 2026 at 11:44 AM

Reposted by Gilad Feldman

Bob C-J and Geoff Cumming

@thenewstats.bsky.social

I haven’t! Seems like a cool idea, though likely a good bit harder than p, df, and test stat correspondence... but would be cool!

January 29, 2026 at 3:34 PM

Reposted by Gilad Feldman

Michèle Nuijten

@michelenuijten.bsky.social

I haven't! Cool idea :) The main obstacle for me was the text extraction. I guess that once you have the ES & CIs, you can check all sorts of things (is the CI symmetrical? is the ES even *in* the CI?)

January 30, 2026 at 9:48 AM
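
The two checks mentioned in the post (is the estimate inside the CI, and is the CI symmetric around it?) could be sketched roughly as follows. This is a minimal illustration and not part of statcheck. The function name `check_ci` and the default tolerance are assumptions, with the tolerance chosen to absorb rounding when values are reported to two decimals.

```python
def check_ci(es, lower, upper, tol=0.02):
    """Return a list of warnings for a reported effect size (es) and its CI.

    tol is the largest difference between the two half-widths that is
    still treated as rounding noise (an assumed default for 2-decimal
    reporting). An empty list means nothing was flagged.
    """
    issues = []
    if lower > upper:
        issues.append("CI bounds are reversed")
        lower, upper = upper, lower
    if not lower <= es <= upper:
        issues.append("estimate outside CI")
        return issues  # symmetry is meaningless if the estimate is outside
    # Compare the distance from the estimate to each bound.
    if abs((es - lower) - (upper - es)) > tol + 1e-9:
        issues.append("CI asymmetric around estimate")
    return issues


if __name__ == "__main__":
    print(check_ci(0.50, 0.20, 0.80))  # consistent report
    print(check_ci(0.90, 0.20, 0.80))  # estimate lies above the upper bound
    print(check_ci(0.30, 0.20, 0.80))  # half-widths differ by 0.40
```

An asymmetric interval is only a prompt to look closer: intervals for odds ratios, correlations, or bootstrapped estimates are often legitimately asymmetric, so that flag should be read as a question rather than as an error.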

Reposted by Gilad Feldman

Steve Haroz

@steveharoz.com

Cool! I made a custom variant of statcheck a couple years ago to check a paper that had very non-standard reporting with both a CI and a p-value.

Info + osf link here:
pubpeer.com/publications...

PubPeer - Measuring Effects of Spatial Visualization and Domain On Vis...

There are comments on PubPeer for publication: Measuring Effects of Spatial Visualization and Domain On Visualization Task Performance: A Comparative Study (2022)

pubpeer.com

January 30, 2026 at 10:07 AM
