Lightnews — Scholar-powered news

Shaily

@shaily99.bsky.social

3.2K followers 530 following 300 posts

PhDing at LTI, CMU
Prev: Ai2, Google Research, MSR
Evaluating language technologies, regularly ranting, and probably procrastinating.
https://sites.google.com/view/shailybhatt/

Reposted by Shaily

Amanda Bertsch

@abertsch.bsky.social

We’re excited about Oolong as a challenging benchmark for information aggregation! Let us know which models we should benchmark next 👀

Paper: arxiv.org/abs/2511.02817
Dataset: huggingface.co/oolongbench
Code: github.com/abertsch72/o...
Leaderboard: oolongbench.github.io

Oolong: Evaluating Long Context Reasoning and Aggregation Capabilities

As model context lengths continue to grow, concerns about whether models effectively use the full context length have persisted. While several carefully designed long-context evaluations have recently...

November 7, 2025 at 5:07 PM

Shaily

@shaily99.bsky.social

It cannot be defined, only experienced!

October 24, 2025 at 3:55 AM

Shaily

@shaily99.bsky.social

I now have a Notion page that I made when I did this ages ago; I blindly follow 2023 me and am grateful to her.

October 8, 2025 at 1:31 PM

Shaily

@shaily99.bsky.social

You guys are writing at 9 AM !!!!!!!!!!!!!!!!!

September 29, 2025 at 9:57 PM

Reposted by Shaily

naitian

@naitian.org

I've written really terrible paragraphs that have made me want to stop at 9AM in the morning.

September 26, 2025 at 2:08 PM

Shaily

@shaily99.bsky.social

@danishpruthi.bsky.social
Kalika Bali at MSR (don't think she's on here)

September 21, 2025 at 12:39 AM
