Lightnews — Scholar-powered news

Ruchira Dhar

@eclecticruchira.bsky.social

PhD Fellow in AI Evals @UniCopenhagen.
Interested in AI Policy / AI Ethics / Responsible AI.
Community Lead @cohereforai.bsky.social
Site: ruchiradhar.github.io
#nlproc #llm #ai

Ruchira Dhar

@eclecticruchira.bsky.social

A big thank you to all my collaborators! @danaesavi.bsky.social @yfyuan01.bsky.social @xinyichen2024.bsky.social @jiaangli.bsky.social @stellafrank.bsky.social @dibyaa.bsky.social @stephaniebrandl.bsky.social @danielhers.bsky.social @delliott.bsky.social

November 13, 2025 at 4:14 PM

Ruchira Dhar

@eclecticruchira.bsky.social

A small but meaningful step toward an evaluation culture that values clarity over marketing. Read the paper: papers.ssrn.com/sol3/papers...
(feat. EvalCards for Qwen & Gemini!)

Thrilled to share this work—hope this leads to more transparent, accessible AI releases 🚀

November 13, 2025 at 4:08 PM

Ruchira Dhar

@eclecticruchira.bsky.social

📌 Why it matters:

As LLM adoption grows, we need clear, honest, and comparable evaluation reporting. EvalCards help enable:
1️⃣ Better model selection
2️⃣ Smoother regulatory compliance
3️⃣ A more transparent AI ecosystem

November 13, 2025 at 4:08 PM

Ruchira Dhar

@eclecticruchira.bsky.social

💡 What we propose:

EvalCards to report model evaluations. They’re designed to be:
✅ Easy to write
✅ Easy to understand
✅ Hard to miss
Each card summarizes capabilities, safety tests, metrics, prompts & key notes. Here’s a sample for an OLMo model from @allen_ai!

November 13, 2025 at 4:08 PM

Ruchira Dhar

@eclecticruchira.bsky.social

AI evaluation reporting has 3 major problems:

⚠️ Reproducibility (missing metrics/prompting)
⚠️ Accessibility (details scattered everywhere)
⚠️ Governance (inconsistent disclosures + rising AI regulations)

November 13, 2025 at 4:08 PM

Ruchira Dhar

@eclecticruchira.bsky.social

Yes, this! And sometimes, I think about how we never really needed AI - like we wouldn't have died without it.

August 25, 2025 at 7:37 AM
