Ken Liu
kzliu.bsky.social
CS PhD @ Stanford AI Lab, Stanford NLP. Prev Google DeepMind.

https://ai.stanford.edu/~kzliu
New paper! We explore a radical paradigm for AI evals: assessing LLMs on *unsolved* questions.

Instead of artificially difficult exams where progress ≠ value, we assess LLMs on organic, unsolved problems via reference-free LLM validation & community verification. LLMs solved ~10/500 so far:
August 26, 2025 at 5:51 PM
Stanford NLP PhDs
November 22, 2024 at 12:34 AM
hi
November 21, 2024 at 8:01 PM