Lightnews — Scholar-powered news

Reposted by Sasikanth Kotti

EvalEval Coalition

@eval-eval.bsky.social

🚨New blog: The AI Evaluation Chart Crisis 📝

From misleading bar heights to missing error bars, recent model launches have sparked debate on AI evals. In our new blogpost, we dig into what’s broken, why it matters, and how results should be presented 👇

evalevalai.com/documentatio...

The AI Evaluation Chart Crisis

Charts used to showcase performance demonstrate broader issues in the AI evaluation ecosystem: a lack of balance between competitive benchmarking and statistical rigor.

evalevalai.com

August 11, 2025 at 7:20 PM

Reposted by Sasikanth Kotti

Cohere Labs

@cohereforai.bsky.social

We are committed to making meaningful progress in machine learning research through open collaboration. Follow this 🧵to stay on top of our research contributions.

January 15, 2025 at 3:53 PM

Reposted by Sasikanth Kotti

Timnit Gebru

@timnitgebru.bsky.social

Dear computer vision researchers, students & practitioners🔇🔇🔇

Remi Denton & I have written what I consider to be a comprehensive paper on the harms of computer vision systems reported to date & how people have proposed addressing them, from different angles.

PDF: cdn.sanity.io/files/wc2kmx...

Screenshot of Table of Contents (Part 1)

Contents
1 Introduction
2 Positionality
3 Overview of Risks and Harms Associated with Computer Vision Systems and Proposed Mitigation Strategies
3.1 Representational Harms
3.2 Quality-of-Service and Allocative Harms
3.3 Interpersonal Harms
3.4 Societal Harms: System Destabilization and Exacerbating Inequalities
4 Frameworks and Principles for Computer Vision Researchers
4.1 Guidelines for Responsible Data and Model Development
4.2 Measurement Modeling
4.3 Reflexivity
5 Reorientations of Computer Vision Research
5.1 Grounded in Historical Context and Considering Power Dynamics
5.2 Small, Task Specific
5.3 Community-Rooted

Screenshot of Table of Contents (Part 2)

6 Systemic Change
6.1 Collective Action and Whistleblowing
6.2 Refusal/The Right not to Build Something
6.3 Independent Funding Outside of Military and Multinational Corporations
7 Conclusion
References

December 16, 2024 at 4:52 PM

Reposted by Sasikanth Kotti

Dr Sasha Luccioni

@sashamtl.bsky.social

The baseline is around 65% accuracy with a simple fine-tuned BERT and ~95% with an LLM like GPT4o-mini... your mission is to beat it while using the least energy possible!
Check out the dataset here:
huggingface.co/collections/...
And stay tuned for the last task later this week!🔥

Frugal AI Challenge Tasks - a frugal-ai-challenge Collection

Find the 3 datasets for the Frugal AI Challenge in this Collection! 🌎 Find all the details of the challenge at https://frugalaichallenge.org/

huggingface.co

December 18, 2024 at 8:17 PM

Reposted by Sasikanth Kotti

Dr Sasha Luccioni

@sashamtl.bsky.social

Today we’re launching the text challenge of the Frugal AI Challenge for the AI Action Summit: detecting climate misinformation in the media (Press, TV, Radio), sponsored by the French non-profit QuotaClimat.
The goal of the task is to detect climate-based misinformation and to categorize its type 📃

December 18, 2024 at 8:17 PM

Reposted by Sasikanth Kotti

Michael Saxon

@saxon.me

Interesting result, even after you correct for anthropomorphizing language

The key takeaway is that providing information about the training condition (explicitly or implicitly) to an LM makes it only "align" (update the probability distribution) in that condition

www.anthropic.com/research/ali...

Alignment faking in large language models

A paper from Anthropic's Alignment Science team on alignment faking in large language models

www.anthropic.com

December 18, 2024 at 10:55 PM

Reposted by Sasikanth Kotti

Mor Geva

@megamor2.bsky.social

📣📣 Wanna be an Area Chair or a Reviewer for @aclmeeting.bsky.social or know someone who would?

Nominations and self-nominations go here 👇

docs.google.com/forms/d/e/1F...

Volunteer to join ACL 2025 Programme Committee

Use this form to express your interest in joining the ACL 2025 programme committee as a reviewer or area chair (AC). The review period is 1st to 20th of March 2025. ACs need to be available for variou...

docs.google.com

December 6, 2024 at 6:01 AM

Reposted by Sasikanth Kotti

Daniel Vila

@dvilasuero.hf.co

Announcing Global-MMLU - an improved MMLU Open dataset with evaluation coverage across 42 languages.

The result of months of work with the goal of advancing Multilingual LLM evaluation.

Built together with the community and amazing collaborators at Cohere4AI, MILA, MIT, and many more.

December 6, 2024 at 8:59 AM

Reposted by Sasikanth Kotti

Ben Burtenshaw

@benburtenshaw.bsky.social

Day 2 of smol course and the community is building something here.

👷 If you want to get involved, you can do this:
- read (and star) the repo
- check out our new discord channel
- open a PR to submit an exercise on module 1
- open an issue to improve the course
- review another submission

🧵

December 5, 2024 at 8:48 AM

Reposted by Sasikanth Kotti

Sam Bowman

@sleepinyourhat.bsky.social

If you're potentially interested in transitioning into AI safety research, come collaborate with my team at Anthropic!

Funded fellows program for researchers new to the field here: alignment.anthropic.com/2024/anthrop...

Introducing the Anthropic Fellows Program

alignment.anthropic.com

December 2, 2024 at 8:30 PM

Reposted by Sasikanth Kotti

Juliet Shen

@julietshen.bsky.social

Just FYI because it seems relevant and I've seen it misstated a few times, LAION retrained their dataset and provided a diff to migrate over to the filtered Re-LAION-5B dataset

Hugging Face, like most platforms that handle user content, checks for CSAM too.

laion.ai/blog/relaion...

Releasing Re-LAION 5B: transparent iteration on LAION-5B with additional safety fixes | LAION

Today, following a safety revision procedure (https://laion.ai/notes/laion-maintenance/), we announce Re-LAION-5B, an updated version of LAION...

laion.ai

November 28, 2024 at 4:35 AM

Reposted by Sasikanth Kotti

Epoch AI

@epochai.bsky.social

We've just launched our AI Benchmarking Hub!
This is a new platform for rigorous, independent evaluations of AI model capabilities, featuring interactive visualizations and in-depth analysis. (1/8)

epoch.ai/blog/introdu...

November 27, 2024 at 6:29 PM

Reposted by Sasikanth Kotti

François Fleuret

@francois.fleuret.org

My deep learning course at the University of Geneva is available on-line. 1000+ slides, ~20h of screen-casts. Full of examples in PyTorch.

fleuret.org/dlc/

And my "Little Book of Deep Learning" is available as a phone-formatted pdf (nearing 700k downloads!)

fleuret.org/lbdl/

November 26, 2024 at 6:15 AM

Reposted by Sasikanth Kotti

Lloyd Jones

@lloydjones.io

I made a Chrome Extension to bring Bluesky comments to any URL :)

Get it here: github.com/joneslloyd/b...

Credits to:

- @emilyliu.me
- @coryzue.com
- @louee.bsky.social

Any feedback and/or PRs are welcome.

I threw this together in 1.5 hours, so expect bugs etc.

GitHub - joneslloyd/bluesky-comments-chrome

Contribute to joneslloyd/bluesky-comments-chrome development by creating an account on GitHub.

github.com

November 26, 2024 at 12:40 AM

Reposted by Sasikanth Kotti

Alexander Doria

@dorialexander.bsky.social

Just realized that with the API being opened and all we could make a Bluesky RAG.

November 24, 2024 at 7:54 PM

Reposted by Sasikanth Kotti

Marco Ciappelli

@handle.invalid

Short stories for children to read and listen to in many languages. Written with human imagination and translated, illustrated, and read with the help of artificial intelligence. From the original Italian to English, Spanish, French, Japanese, and Chinese…

www.storiesottolestelle.com

December 26, 2023 at 12:22 AM
