Lightnews — Scholar-powered news

Lexin Zhou

@lexinzhou.bsky.social

Research Intern at Microsoft | Working on AI Evaluation, Social Computing and NLP | Incoming PhD candidate for Fall 2025
https://lexzhou.github.io


Lexin Zhou

@lexinzhou.bsky.social

Thrilled to share this accessible MSR blogpost that summarizes our latest work on building a Science of AI Evaluation, where we manage to both reliably explain and predict success/failure of general-purpose AI models on new, unforeseen tasks and environments!

May 13, 2025 at 4:08 AM

Lexin Zhou

@lexinzhou.bsky.social

🚨 To continuously foster conceptual & technical innovations for a science of AI Evaluation:

The Leverhulme Centre for the Future of Intelligence has launched an open, collaborative community to adopt and extend our novel methodology.

Join us: kinds-of-intelligence-cfi.github.io/ADELE!

March 14, 2025 at 3:37 AM

Reposted by Lexin Zhou

Peter Henderson

@peterhenderson.bsky.social

To better understand why this matters in high-stakes contexts, you can also check out our previous work. We discuss why predicting model performance (e.g., failures on out-of-distribution languages in machine translation) remains essential in legal contexts.

March 11, 2025 at 8:07 PM

Reposted by Lexin Zhou

Peter Henderson

@peterhenderson.bsky.social

Understanding and extrapolating benchmark results will become essential for effective policymaking and informing users. New work identifies indicators that have high predictive power in modeling LLM performance. Excited for it to be out!

March 11, 2025 at 8:07 PM

Lexin Zhou

@lexinzhou.bsky.social

Thrilled to unlock AI Evaluation with explanatory and predictive power through general ability scales!

With a new methodology to:
- Explain what common benchmarks really measure
- Extract explainable ability profiles of AI systems
- Predict performance for new task instances, in- & out-of-distribution
🧵

March 11, 2025 at 6:12 PM
