Sasikanth Kotti
@ksasi.bsky.social
Reposted by Sasikanth Kotti
🚨New blog: The AI Evaluation Chart Crisis 📝

From misleading bar heights to missing error bars, recent model launches have sparked debate on AI evals. In our new blogpost, we dig into what’s broken, why it matters and how evals should be presented 👇

evalevalai.com/documentatio...
The AI Evaluation Chart Crisis
Charts used to showcase performance demonstrate broader issues in the AI evaluation ecosystem: a lack of balance between competitive benchmarking and statistical rigor.
evalevalai.com
August 11, 2025 at 7:20 PM
Reposted by Sasikanth Kotti
We are committed to making meaningful progress in machine learning research through open collaboration. Follow this 🧵to stay on top of our research contributions.
January 15, 2025 at 3:53 PM
Reposted by Sasikanth Kotti
Dear computer vision researchers, students & practitioners🔇🔇🔇

Remi Denton & I have written what I consider to be a comprehensive paper on the harms of computer vision systems reported to date & how people have proposed addressing them, from different angles.

PDF: cdn.sanity.io/files/wc2kmx...
December 16, 2024 at 4:52 PM
Reposted by Sasikanth Kotti
The baseline is around 65% accuracy with a simple fine-tuned BERT and ~95% with an LLM like GPT4o-mini... your mission is to beat it while using the least energy possible!
Check out the dataset here:
huggingface.co/collections/...
And stay tuned for the last task later this week!🔥
Frugal AI Challenge Tasks - a frugal-ai-challenge Collection
Find the 3 datasets for the Frugal AI Challenge in this Collection! 🌎 Find all the details of the challenge at https://frugalaichallenge.org/
huggingface.co
December 18, 2024 at 8:17 PM
Reposted by Sasikanth Kotti
Today we’re launching the text challenge of the Frugal AI Challenge for the AI Action Summit: detecting climate misinformation in the media (Press, TV, Radio), sponsored by the French non-profit QuotaClimat.
The goal of the task is to detect climate-based misinformation and to categorize its type 📃
December 18, 2024 at 8:17 PM
Reposted by Sasikanth Kotti
Interesting result, even after you correct for anthropomorphizing language

The key takeaway is that providing information about the training condition (explicitly or implicitly) to an LM makes it "align" (update the probability distribution) only in that condition

www.anthropic.com/research/ali...
Alignment faking in large language models
A paper from Anthropic's Alignment Science team on Alignment Faking in AI large language models
www.anthropic.com
December 18, 2024 at 10:55 PM
Reposted by Sasikanth Kotti
📣📣 Wanna be an Area Chair or a Reviewer for @aclmeeting.bsky.social or know someone who would?

Nominations and self-nominations go here 👇

docs.google.com/forms/d/e/1F...
Volunteer to join ACL 2025 Programme Committee
Use this form to express your interest in joining the ACL 2025 programme committee as a reviewer or area chair (AC). The review period is 1st to 20th of March 2025. ACs need to be available for variou...
docs.google.com
December 6, 2024 at 6:01 AM
Reposted by Sasikanth Kotti
Announcing Global-MMLU - an improved MMLU Open dataset with evaluation coverage across 42 languages.

The result of months of work with the goal of advancing Multilingual LLM evaluation.

Built together with the community and amazing collaborators at Cohere4AI, MILA, MIT, and many more.
December 6, 2024 at 8:59 AM
Reposted by Sasikanth Kotti
Day 2 of smol course and the community is building something here.

👷 If you want to get involved, you can do this:
- read (and star) the repo
- check out our new discord channel
- open a PR to submit an exercise on module 1
- open an issue to improve the course
- review another submission

🧵
December 5, 2024 at 8:48 AM
Reposted by Sasikanth Kotti
If you're potentially interested in transitioning into AI safety research, come collaborate with my team at Anthropic!

Funded fellows program for researchers new to the field here: alignment.anthropic.com/2024/anthrop...
Introducing the Anthropic Fellows Program
alignment.anthropic.com
December 2, 2024 at 8:30 PM
Reposted by Sasikanth Kotti
Just FYI because it seems relevant and I've seen it misstated a few times, LAION retrained their dataset and provided a diff to migrate over to the filtered Re-LAION-5B dataset

HuggingFace, like most platforms that handle user content, checks for CSAM too.

laion.ai/blog/relaion...
Releasing Re-LAION 5B: transparent iteration on LAION-5B with additional safety fixes | LAION
Today, following a safety revision procedure (https://laion.ai/notes/laion-maintenance/), we announce Re-LAION-5B, an updated version of LAION...
laion.ai
November 28, 2024 at 4:35 AM
Reposted by Sasikanth Kotti
We've just launched our AI Benchmarking Hub!
This is a new platform for rigorous, independent evaluations of AI model capabilities, featuring interactive visualizations and in-depth analysis. (1/8)

epoch.ai/blog/introdu...
November 27, 2024 at 6:29 PM
Reposted by Sasikanth Kotti
My deep learning course at the University of Geneva is available on-line. 1000+ slides, ~20h of screen-casts. Full of examples in PyTorch.

fleuret.org/dlc/

And my "Little Book of Deep Learning" is available as a phone-formatted pdf (nearing 700k downloads!)

fleuret.org/lbdl/
November 26, 2024 at 6:15 AM
Reposted by Sasikanth Kotti
I made a Chrome Extension to bring Bluesky comments to any URL :)

Get it here: github.com/joneslloyd/b...

Credits to:

- @emilyliu.me
- @coryzue.com
- @louee.bsky.social

Any feedback and/or PRs are welcome.

I threw this together in 1.5 hours, so expect bugs etc.
GitHub - joneslloyd/bluesky-comments-chrome
Contribute to joneslloyd/bluesky-comments-chrome development by creating an account on GitHub.
github.com
November 26, 2024 at 12:40 AM
Reposted by Sasikanth Kotti
Just realized that with the API being opened and all we could make a Bluesky RAG.
November 24, 2024 at 7:54 PM
Reposted by Sasikanth Kotti
Short stories for children to read and listen to in many languages. Written with human imagination and translated, illustrated, and read with the help of Artificial Intelligence.

From the original Italian to English, Spanish, French, Japanese, and Chinese…

www.storiesottolestelle.com
December 26, 2023 at 12:22 AM