Daniel Paleka
dpaleka.bsky.social
Daniel Paleka
@dpaleka.bsky.social
ai safety researcher | phd ETH Zurich | https://danielpaleka.com
Benchmarks can reward strategic gambling over calibrated forecasting when optimizing for ranking performance.

"Bet everything" on one scenario beats careful probability estimation for maximizing the chance of ranking #1 on the leaderboard. (6/7)
June 5, 2025 at 5:08 PM
Model knowledge cutoffs are guidelines about reliability, not guarantees of no information thereafter. GPT-4o, when nudged, can reveal knowledge beyond its stated Oct 2023 cutoff. (5/7)
June 5, 2025 at 5:08 PM
Date-restricted search leaks future knowledge. Searching pre-2019 articles about “Wuhan” returns results abnormally biased towards the Wuhan Institute of Virology — an association that only emerged later. (4/7)
June 5, 2025 at 5:08 PM
How well can LLMs predict future events? Recent studies suggest LLMs approach human performance. But evaluating forecasters presents unique challenges compared to standard LLM evaluations.

We identify key issues with forecasting evaluations 🧵 (1/7)
June 5, 2025 at 5:08 PM
why is it that whenever i see survivorship bias on my timeline it already has the red-dotted plane in the replies?
May 26, 2025 at 3:07 PM
Quick sycophancy eval: comparing the two recent OpenAI ChatGPT system prompts, it is clear last week's prompt moves other models towards sycophancy too, while the current prompt makes them more disagreeable.
April 30, 2025 at 3:16 PM
what are you doing Claude i thought we were friends
January 17, 2025 at 7:12 AM
Test-time compute based on arbitrage can make forecasts more consistent; this improves specific logical rules such as Negation, but doesn't generalize to the consistency rules we do not optimize over. (9/11)
January 11, 2025 at 1:53 AM
Some consistency checks are better signals than others; for instance, the violation of P(A)P(B|A) = P(A&B) explains a high fraction of the variation in forecasting performance over a range of forecasters. (7/11)
January 11, 2025 at 1:53 AM
Starting from a base question, we generate multiple logically related questions and ask them independently. (5/11)
January 11, 2025 at 1:53 AM
We create consistency checks from base forecasting questions, which we take from various sources (prediction market questions, synthetically generated from news, purely LLM-generated); ask the forecasters for probabilities, and check how consistent the predictions are (4/11)
January 11, 2025 at 1:53 AM
We test 10 different logical rules that a consistent forecaster should satisfy. (3/11)
January 11, 2025 at 1:53 AM
Recent LLM forecasters are getting better at predicting the future. But there's a challenge: How can we evaluate and compare AI forecasters without waiting years to see which predictions were right? (1/11)
January 11, 2025 at 1:53 AM
LLMs rapidly improving at software engineering and math, given that the rate of improvement in ideation is slower, means you should be intentional about what value is gained from doing a highly technical project now as opposed to later
January 8, 2025 at 12:54 AM
my New Year's resolution: don't work on a bigger project if there is not a clear reason for doing it *now*.

disregarding the AGI timelines, the R&D acceleration is a clear reason against technical work where the discount rates on the final product are low
December 31, 2024 at 10:52 PM
TIL that the atmosphere blocks basically all electromagnetic radiation, except three small windows: one for visible light, one for cooling the Earth, and one for radio waves. Earth is the USA of planets.
November 28, 2024 at 7:03 PM