Lightnews — Scholar-powered news

Ken Liu

@kzliu.bsky.social

CS PhD @ Stanford AI Lab, Stanford NLP. Prev Google DeepMind.

https://ai.stanford.edu/~kzliu

New paper! We explore a radical paradigm for AI evals: assessing LLMs on *unsolved* questions.

Instead of artificially difficult exams where progress ≠ value, we assess LLMs on organic, unsolved problems via reference-free LLM validation & community verification. LLMs solved ~10/500 so far: