From misleading bar heights to missing error bars, recent model launches have sparked debate on AI evals. In our new blogpost, we dig into what’s broken, why it matters, and how evals should be presented 👇
evalevalai.com/documentatio...
Remi Denton & I have written what I consider to be a comprehensive paper on the harms of computer vision systems reported to date & how people have proposed addressing them, from different angles.
PDF: cdn.sanity.io/files/wc2kmx...
Check out the dataset here:
huggingface.co/collections/...
And stay tuned for the last task later this week!🔥
The goal of the task is to detect climate-based misinformation and to categorize its type 📃
The key takeaway is that providing information about the training condition (explicitly or implicitly) to an LM makes it "align" (update its probability distribution) only in that condition
www.anthropic.com/research/ali...
Nominations and self-nominations go here 👇
docs.google.com/forms/d/e/1F...
The result of months of work with the goal of advancing Multilingual LLM evaluation.
Built together with the community and amazing collaborators at Cohere4AI, MILA, MIT, and many more.
👷 If you want to get involved, you can do this:
- read (and star) the repo
- check out our new discord channel
- open a PR to submit an exercise on module 1
- open an issue to improve the course
- review another submission
🧵
Funded fellows program for researchers new to the field here: alignment.anthropic.com/2024/anthrop...
HuggingFace, like most platforms that handle user content, checks for CSAM too.
laion.ai/blog/relaion...
This is a new platform for rigorous, independent evaluations of AI model capabilities, featuring interactive visualizations and in-depth analysis. (1/8)
epoch.ai/blog/introdu...
fleuret.org/dlc/
And my "Little Book of Deep Learning" is available as a phone-formatted pdf (nearing 700k downloads!)
fleuret.org/lbdl/
Get it here: github.com/joneslloyd/b...
Credits to:
- @emilyliu.me
- @coryzue.com
- @louee.bsky.social
Any feedback and/or PRs are welcome.
I threw this together in 1.5 hours, so expect bugs etc.
www.storiesottolestelle.com