Max Bartolo
banner
maxbartolo.bsky.social
Max Bartolo
@maxbartolo.bsky.social
Building robust LLMs @Cohere
Reposted by Max Bartolo
Thrilled to share our new preprint on Reinforcement Learning for Reverse Engineering (RLRE) 🚀

We demonstrate that human preferences can be reverse engineered effectively by pipelining LLMs to optimise upstream preambles via reinforcement learning 🧵⬇️
May 22, 2025 at 3:01 PM
I'm excited to share the tech report for our @cohere.com @cohereforai.bsky.social Command A and Command R7B models. We highlight our novel approach to model training including self-refinement algorithms and model merging techniques at scale. Read more below! ⬇️
March 27, 2025 at 3:01 PM
I really enjoyed my MLST chat with Tim @neuripsconf.bsky.social about the research we've been doing on reasoning, robustness and human feedback. If you have an hour to spare and are interested in AI robustness, it may be worth a listen 🎧

Check it out at youtu.be/DL7qwmWWk88?...
March 19, 2025 at 3:11 PM
Check out @lisaalaz.bsky.social's internship work with us @cohere.com questioning the rationale behind rationales 🔥
February 13, 2025 at 4:18 PM
Super excited to see PRISM recognised as a #NeurIPS2024 best paper. This was an incredible large-scale effort by @hannahrosekirk.bsky.social and fantastic collaborators. If you're interested in human feedback, check it out, there are 100+ pages of detailed insights! 🔥
December 11, 2024 at 4:23 PM
Reposted by Max Bartolo
Our paper PRISM alignment won a best paper award at #neurips2024!

All credits to @hannahrosekirk.bsky.social A.Whitefield, P.Röttger, A.M.Bean, K.Margatina, R.Mosquera-Gomez, J.Ciro, @maxbartolo.bsky.social H.He, B.Vidgen, S.Hale

Catch Hannah tomorrow at neurips.cc/virtual/2024/poster/97804
blog.neurips
December 11, 2024 at 4:20 PM
Reposted by Max Bartolo
Excited to reveal Genie 2, our most capable foundation world model that, given a single prompt image, can generate an endless variety of action-controllable, playable 3D worlds. Fantastic cross-team effort by the Open-Endedness Team and many other teams at Google DeepMind! 🧞
Introducing 🧞Genie 2 🧞 - our most capable large-scale foundation world model, which can generate a diverse array of consistent worlds, playable for up to a minute. We believe Genie 2 could unlock the next wave of capabilities for embodied agents 🧠.
December 4, 2024 at 4:13 PM
Looking forward to @neuripsconf.bsky.social #NeurIPS #NeurIPS2024 in Vancouver next week! ❄️

Reach out (or pop by the @cohere.com booth) if you want to chat about human feedback, robustness and reasoning, prompt optimisation, adversarial data, glitch tokens, evaluation, or anything else!
an advertisement for vancouver in british columbia canada
ALT: an advertisement for vancouver in british columbia canada
media.tenor.com
December 2, 2024 at 5:11 PM
Sparks of multi-hop reasoning ✨
🚨 New Paper 🚨
Can LLMs perform latent multi-hop reasoning without exploiting shortcuts? We find the answer is yes – they can recall and compose facts not seen together in training or guessing the answer, but success greatly depends on the type of the bridge entity (80% for country, 6% for year)! 1/N
November 29, 2024 at 9:41 AM
Fun to see Douwe's Dynabench plot continue to inspire new groundbreaking benchmarking work!
Excited to announce "BALROG: a Benchmark for Agentic LLM and VLM Reasoning On Games" led b UCL DARK's @dpaglieri.bsky.social! Douwe Kiela plot below is maybe the scariest for AI progress — LLM benchmarks are saturating at an accelerating rate. BALROG to the rescue. This will keep us busy for years.
November 24, 2024 at 10:11 PM
🚨 LLMs can learn to reason from procedural knowledge in pretraining data! 🚨 I particularly enjoy research where the evidence contradicts our initial hypothesis. If you're interested in LLM reasoning, check out the 60+ pages of in-depth work at arxiv.org/abs/2411.12580
How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this:

Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢

🧵⬇️
November 20, 2024 at 5:28 PM
Reposted by Max Bartolo
We launched Judge Arena with @huggingface.bsky.social
@clefourrier.bsky.social - a platform that lets you easily compare models as judges side-by-side and vote for the best evaluation

Check out the live leaderboard and start voting now 🤗
November 19, 2024 at 7:08 PM