Lisa Alazraki
@lisaalaz.bsky.social
PhD student @ImperialCollege. Research Scientist Intern @Meta; prev. @Cohere, @GoogleAI. Interested in generalisable learning and reasoning. She/her

lisaalaz.github.io
We also observe that LLMs fail to activate all the relevant neurons when they attempt to solve the tasks in AgentCoMa. Instead, they mostly activate neurons relevant to only one reasoning type, likely as a result of single-type reasoning patterns reinforced during training.
August 28, 2025 at 2:01 PM
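A minimal sketch of the kind of neuron-activation probe described in the post above, assuming a HuggingFace causal LM. The model name, probe prompts, and top-k heuristic are illustrative assumptions, not the paper's actual analysis:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # placeholder; swap in the model under study
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

math_prompts = ["What is 12 * 7 + 3?"]                        # placeholder probes
commonsense_prompts = ["Can a sofa fit through a 60cm-wide door?"]
composed_prompts = ["Each mover carries one 40kg box. How many movers for 5 boxes?"]

def mean_activations(prompts):
    """Mean absolute hidden activation per (layer, neuron), over all prompts."""
    acc = None
    for p in prompts:
        inputs = tok(p, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs, output_hidden_states=True)
        # hidden_states: tuple of (1, seq, hidden); average over batch and tokens
        h = torch.stack(out.hidden_states).abs().mean(dim=(1, 2))
        acc = h if acc is None else acc + h
    return acc / len(prompts)

def top_neurons(acts, k=500):
    """Flattened indices of the k most active (layer, neuron) pairs."""
    return set(torch.topk(acts.flatten(), k).indices.tolist())

math_set = top_neurons(mean_activations(math_prompts))
commonsense_set = top_neurons(mean_activations(commonsense_prompts))
mixed_set = top_neurons(mean_activations(composed_prompts))

# If composition recruited both skills, the mixed tasks should overlap strongly
# with both sets; a lopsided overlap suggests single-type reasoning dominates.
print("math overlap:", len(mixed_set & math_set) / len(math_set))
print("commonsense overlap:", len(mixed_set & commonsense_set) / len(commonsense_set))
```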
We test AgentCoMa on 61 contemporary LLMs of different sizes, including reasoning models (both SFT and RL-tuned). While the LLMs perform well on commonsense and math reasoning in isolation, they are far less effective at solving AgentCoMa tasks that require their composition!
August 28, 2025 at 2:01 PM
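A minimal sketch of the evaluation logic, not the released benchmark code: score a model on each reasoning type in isolation, then on composed tasks. The data layout and the `my_model` stub are assumptions, and the sample items are illustrative, not real AgentCoMa tasks:

```python
def accuracy(tasks, query_llm):
    """Exact-match accuracy of `query_llm` over a list of QA tasks."""
    correct = sum(
        query_llm(t["question"]).strip().lower() == t["gold_answer"].strip().lower()
        for t in tasks
    )
    return correct / len(tasks)

def my_model(question: str) -> str:
    """Stub: replace with a real LLM call (API client or local model)."""
    return ""

# Illustrative mini-benchmark in the assumed format (not real AgentCoMa items).
benchmark = {
    "commonsense_only": [
        {"question": "Which room of a house is a fridge usually in?",
         "gold_answer": "the kitchen"},
    ],
    "math_only": [
        {"question": "What is 3 * 450?", "gold_answer": "1350"},
    ],
    "composed": [
        {"question": "A fridge costs $450. What do fridges for 3 identical "
                     "flats cost in total?", "gold_answer": "1350"},
    ],
}

for split in ("commonsense_only", "math_only", "composed"):
    print(split, accuracy(benchmark[split], my_model))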
We have released #AgentCoMa, an agentic reasoning benchmark where each task requires a mix of commonsense and math to be solved 🧐

LLM agents performing real-world tasks should be able to combine these different types of reasoning, but are they fit for the job? 🤔

🧵⬇️
August 28, 2025 at 2:01 PM
Thrilled to share our new preprint on Reinforcement Learning for Reverse Engineering (RLRE) 🚀

We demonstrate that human preferences can be reverse engineered effectively by pipelining LLMs to optimise upstream preambles via reinforcement learning 🧵⬇️
May 22, 2025 at 3:01 PM
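A toy REINFORCE-style sketch of the pipeline described above: an upstream policy writes a preamble, a frozen downstream LLM answers with it, and a preference model scores the answer as the reward. All three components are simplified stand-ins, not the paper's actual training setup:

```python
import torch
import torch.nn as nn

class ToyPreamblePolicy(nn.Module):
    """Picks one of a few canned preambles; stands in for the upstream LLM."""
    def __init__(self, preambles):
        super().__init__()
        self.preambles = preambles
        self.logits = nn.Parameter(torch.zeros(len(preambles)))

    def sample(self):
        dist = torch.distributions.Categorical(logits=self.logits)
        idx = dist.sample()
        return self.preambles[idx.item()], dist.log_prob(idx)

def downstream_llm(prompt: str) -> str:
    """Placeholder frozen LLM: concise only when asked to be."""
    return "short answer" if "concise" in prompt else "a long rambling answer"

def preference_model(response: str) -> float:
    """Placeholder preference score: pretend raters prefer short answers."""
    return 1.0 if "short" in response else 0.0

policy = ToyPreamblePolicy(["Be concise.", "Be verbose.", "Be poetic."])
optimizer = torch.optim.Adam(policy.parameters(), lr=0.1)
baseline = 0.0

for step in range(200):
    preamble, logprob = policy.sample()
    response = downstream_llm(preamble + " Explain photosynthesis.")
    reward = preference_model(response)
    loss = -(reward - baseline) * logprob      # REINFORCE with a baseline
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    baseline = 0.9 * baseline + 0.1 * reward   # running-mean baseline

print("learned preamble:", policy.preambles[policy.logits.argmax().item()])
```

Keeping the downstream model frozen means only the upstream preamble policy is trained, which is what makes the preference signal recoverable purely through the preamble.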
I’ll be presenting Meta-Reasoning Improves Tool Use in Large Language Models at #NAACL25 tomorrow, Thursday May 1st, from 2 to 3:30pm in Hall 3! Come check it out and have a friendly chat if you’re interested in LLM reasoning and tools 🙂 #NAACL
April 30, 2025 at 8:58 PM
We find the implicit setup without rationales is superior in all cases. It also overwhelmingly outperforms CoT, even when we make this baseline more challenging by extending its context with additional, diverse question-answer pairs.
February 13, 2025 at 3:38 PM
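A hypothetical sketch of the two in-context conditions being compared: prior mistakes shown with a corrective rationale vs. shown implicitly with no explanation attached. The templates, field names, and sample item are illustrative, not the paper's exact prompts:

```python
def with_rationales(question, mistakes):
    """Prompt showing each prior mistake together with a corrective rationale."""
    parts = [
        f"Question: {m['question']}\n"
        f"Incorrect answer: {m['wrong_answer']}\n"
        f"Rationale: {m['rationale']}\n"
        for m in mistakes
    ]
    parts.append(f"Question: {question}\nAnswer:")
    return "\n".join(parts)

def implicit(question, mistakes):
    """Prompt showing prior mistakes with no explanation attached."""
    parts = [
        f"Question: {m['question']}\nIncorrect answer: {m['wrong_answer']}\n"
        for m in mistakes
    ]
    parts.append(f"Question: {question}\nAnswer:")
    return "\n".join(parts)

mistakes = [{
    "question": "What is 17 * 24?",
    "wrong_answer": "398",
    "rationale": "The correct product of 17 and 24 is 408; the carry was dropped.",
}]
print(implicit("What is 17 * 24?", mistakes))
```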
Do LLMs need rationales for learning from mistakes? 🤔
When LLMs learn from previous incorrect answers, they typically observe corrective feedback in the form of rationales explaining each mistake. In our new preprint, we find these rationales do not help; in fact, they hurt performance!

🧵
February 13, 2025 at 3:38 PM