Elias Stengel-Eskin
esteng.bsky.social
Postdoc @UNC working on NLP, AI, and computational linguistics. Formerly PhD student @JHU and undergrad @McGill
esteng.github.io
Reposted by Elias Stengel-Eskin
Some personal updates:
- I've completed my PhD at @unccs.bsky.social! 🎓
- Starting Fall 2026, I'll be joining the CS dept. at Johns Hopkins University @jhucompsci.bsky.social as an Assistant Professor 💙
- Currently exploring options for my gap year (Aug 2025 - Jul 2026), so feel free to reach out! 🔎
May 20, 2025 at 5:58 PM
Reposted by Elias Stengel-Eskin
📢 The SoLaR workshop will be collocated with COLM!
@colmweb.org

SoLaR is a collaborative forum for researchers working on responsible development, deployment and use of language models.

We welcome both technical and sociotechnical submissions, deadline July 5th!
May 12, 2025 at 3:25 PM
Reposted by Elias Stengel-Eskin
🚨 Introducing our @tmlrorg.bsky.social paper “Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation”
We present UnLOK-VQA, a benchmark to evaluate unlearning in vision-and-language models, where both images and text may encode sensitive or private information.
May 7, 2025 at 6:55 PM
Reposted by Elias Stengel-Eskin
🔥 BIG CONGRATS to Elias (and UT Austin)! Really proud of you -- it has been a complete pleasure to work with Elias and see him grow into a strong PI on *all* axes 🤗

Make sure to apply for your PhD with him -- he is an amazing advisor and person! 💙
Extremely excited to announce that I will be joining
@utaustin.bsky.social Computer Science in August 2025 as an Assistant Professor! 🎉
May 5, 2025 at 10:00 PM
Extremely excited to announce that I will be joining
@utaustin.bsky.social Computer Science in August 2025 as an Assistant Professor! 🎉
May 5, 2025 at 8:28 PM
🌵 I'm going to be presenting PBT at #NAACL2025 today at 2PM! Come by poster session 2 if you want to hear about:
-- balancing positive and negative persuasion
-- improving LLM teamwork/debate
-- training models on simulated dialogues

With @mohitbansal.bsky.social and @peterbhase.bsky.social
🎉Very excited that our work on Persuasion-Balanced Training has been accepted to #NAACL2025! We introduce a multi-agent tree-based method for teaching models to balance:

1️⃣ Accepting persuasion when it helps
2️⃣ Resisting persuasion when it hurts (e.g. misinformation)

arxiv.org/abs/2410.14596
🧵 1/4
April 30, 2025 at 3:04 PM
Reposted by Elias Stengel-Eskin
I will be presenting ✨Reverse Thinking Makes LLMs Stronger Reasoners✨at #NAACL2025!

In this work, we show
- Improvements across 12 datasets
- Outperforms SFT with 10x more data
- Strong generalization to OOD datasets

📅4/30 2:00-3:30 Hall 3

Let's chat about LLM reasoning and its future directions!
🚨 Reverse Thinking Makes LLMs Stronger Reasoners

We can often reason from a problem to a solution and also in reverse to enhance our overall reasoning. RevThink shows that LLMs can also benefit from reverse thinking 👉 13.53% gains + sample efficiency + strong generalization (on 4 OOD datasets)!
April 29, 2025 at 11:21 PM
✈️ Heading to #NAACL2025 to present 3 main conf. papers, covering training LLMs to balance accepting and rejecting persuasion, multi-agent refinement for more faithful generation, and adaptively addressing varying knowledge conflict.

Reach out if you want to chat!
April 29, 2025 at 5:52 PM
Check out 🚨CAPTURe🚨 -- a new benchmark testing spatial reasoning by making VLMs count objects under occlusion.

SOTA VLMs (GPT-4o, Qwen2-VL, Intern-VL2) have high error rates on CAPTURe (but humans have low error ✅) and models struggle to reason about occluded objects.

arxiv.org/abs/2504.15485

🧵👇
April 24, 2025 at 3:14 PM
Reposted by Elias Stengel-Eskin
🚨Real-world retrieval is messy: queries are ambiguous or docs conflict & have incorrect/irrelevant info. How can we jointly address these problems?

➡️RAMDocs: challenging dataset w/ ambiguity, misinformation & noise
➡️MADAM-RAG: multi-agent framework, debates & aggregates evidence across sources

🧵⬇️
April 18, 2025 at 5:06 PM
Reposted by Elias Stengel-Eskin
Excited to share my first paper as first author: "Task-Circuit Quantization" 🎉
I led this work to explore how interpretability insights can drive smarter model compression. Big thank you to @esteng.bsky.social, Yi-Lin Sung, and @mohitbansal.bsky.social for mentorship and collaboration. More to come!
🚨Announcing TaCQ 🚨 a new mixed-precision quantization method that identifies critical weights to preserve. We integrate key ideas from circuit discovery, model editing, and input attribution to improve low-bit quant., w/ 96% 16-bit acc. at 3.1 avg bits (~6x compression)

📃 arxiv.org/abs/2504.07389
April 16, 2025 at 4:19 PM
Reposted by Elias Stengel-Eskin
What if we could transform advanced math problems into abstract programs that can generate endless, verifiable problem variants?

Presenting EFAGen, which automatically transforms static advanced math problems into their corresponding executable functional abstractions (EFAs).
🧵👇
April 15, 2025 at 7:37 PM
🚨Announcing TaCQ 🚨 a new mixed-precision quantization method that identifies critical weights to preserve. We integrate key ideas from circuit discovery, model editing, and input attribution to improve low-bit quant., w/ 96% 16-bit acc. at 3.1 avg bits (~6x compression)

📃 arxiv.org/abs/2504.07389
April 12, 2025 at 2:19 PM
Reposted by Elias Stengel-Eskin
🥳🥳 Honored and grateful to be awarded the 2025 Apple Scholars in AI/ML PhD Fellowship! ✨

Huge shoutout to my advisor @mohitbansal.bsky.social, & many thanks to my lab mates @unccs.bsky.social, past collaborators + internship advisors for their support ☺️🙏

machinelearning.apple.com/updates/appl...
March 27, 2025 at 7:25 PM
Reposted by Elias Stengel-Eskin
Introducing VEGGIE 🥦—a unified, end-to-end, and versatile instructional video generative model.

VEGGIE supports 8 skills, from object addition/removal/changing and stylization to concept grounding/reasoning. It exceeds SoTA and shows 0-shot multimodal instructional & in-context video editing.
March 19, 2025 at 6:56 PM
🚨UPCORE is our new method for balancing unlearning/forgetting with maintaining model performance.

Best part is it works by selecting a coreset from the data rather than changing the model, so it is compatible with any unlearning method, with consistent gains for 3 methods + 2 tasks!
🚨 Introducing UPCORE, to balance deleting info from LLMs with keeping their other capabilities intact.

UPCORE selects a coreset of forget data, leading to a better trade-off across 2 datasets and 3 unlearning methods.

🧵👇
February 25, 2025 at 2:33 AM
Reposted by Elias Stengel-Eskin
🚨 Introducing UPCORE, to balance deleting info from LLMs with keeping their other capabilities intact.

UPCORE selects a coreset of forget data, leading to a better trade-off across 2 datasets and 3 unlearning methods.

🧵👇
February 25, 2025 at 2:23 AM
Reposted by Elias Stengel-Eskin
🚨 Check out "UTGen & UTDebug" for learning to automatically generate unit tests (i.e., discovering inputs which break your code) and then applying them to debug code with LLMs, with strong gains (>12% pass@1) across multiple models/datasets! (see details in 🧵👇)

1/4
🚨 Excited to share: "Learning to Generate Unit Tests for Automated Debugging" 🚨
which introduces ✨UTGen and UTDebug✨ for teaching LLMs to generate unit tests (UTs) and debugging code from generated tests.

UTGen+UTDebug yields large gains in debugging (+12% pass@1) & addresses 3 key questions:

🧵👇
February 5, 2025 at 6:53 PM
🚨 Excited to announce UTGen and UTDebug, where we first learn to generate unit tests and then apply them to debugging generated code with LLMs, with strong gains (+12% pass@1) on LLM-based debugging across multiple models/datasets via inf.-time scaling and cross-validation+backtracking!

🧵👇
🚨 Excited to share: "Learning to Generate Unit Tests for Automated Debugging" 🚨
which introduces ✨UTGen and UTDebug✨ for teaching LLMs to generate unit tests (UTs) and debugging code from generated tests.

UTGen+UTDebug yields large gains in debugging (+12% pass@1) & addresses 3 key questions:

🧵👇
February 4, 2025 at 7:13 PM
Reposted by Elias Stengel-Eskin
🚨 Excited to share: "Learning to Generate Unit Tests for Automated Debugging" 🚨
which introduces ✨UTGen and UTDebug✨ for teaching LLMs to generate unit tests (UTs) and debugging code from generated tests.

UTGen+UTDebug yields large gains in debugging (+12% pass@1) & addresses 3 key questions:

🧵👇
February 4, 2025 at 7:10 PM
Reposted by Elias Stengel-Eskin
🎉 Congrats to the awesome students, postdocs, & collaborators for this exciting batch of #ICLR2025 and #NAACL2025 accepted papers (FYI some are on the academic/industry job market and a great catch 🙂), on diverse, important topics such as:

-- adaptive data generation environments/policies
...
🧵
January 27, 2025 at 9:38 PM
🎉Very excited that our work on Persuasion-Balanced Training has been accepted to #NAACL2025! We introduce a multi-agent tree-based method for teaching models to balance:

1️⃣ Accepting persuasion when it helps
2️⃣ Resisting persuasion when it hurts (e.g. misinformation)

arxiv.org/abs/2410.14596
🧵 1/4
January 23, 2025 at 4:51 PM
Reposted by Elias Stengel-Eskin
🎉Congratulations to Prof. @mohitbansal.bsky.social on being named a 2025 @RealAAAI Fellow for "significant contributions to multimodal AI foundations & faithful language generation and summarization." 👏

16 Fellows chosen worldwide by cmte. of 9 past fellows & ex-president: aaai.org/about-aaai/a...
January 21, 2025 at 3:56 PM
Congrats @mohitbansal.bsky.social for being selected to be part of this prestigious #AAAI Fellows group! Very well-deserved recognition of long-term contributions 🎉🎉
Thanks @AAAI for selecting me as a #AAAI Fellow! Very humbled+excited to be a part of the respected cohort of this+past years' fellows (& congrats everyone)! 🙏

100% credit goes to my amazing past/current students+postdocs+collab for their work (& thanks to mentors+family)!💙
aaai.org/about-aaai/a...
🎉Congratulations to Prof. @mohitbansal.bsky.social on being named a 2025 @RealAAAI Fellow for "significant contributions to multimodal AI foundations & faithful language generation and summarization." 👏

16 Fellows chosen worldwide by cmte. of 9 past fellows & ex-president: aaai.org/about-aaai/a...
January 21, 2025 at 7:55 PM
Congrats on this huge (and well-deserved) accomplishment @mohitbansal.bsky.social!
Deeply honored & humbled to have received the Presidential #PECASE Award by the @WhiteHouse and @POTUS office! 🙏

Most importantly, very grateful to my amazing mentors, students, postdocs, collaborators, and friends+family for making this possible, and for making the journey worthwhile + beautiful 💙
🎉 Congratulations to Prof. @mohitbansal.bsky.social for receiving the Presidential #PECASE Award by @WhiteHouse, which is the highest honor bestowed by US govt. on outstanding scientists/engineers who show exceptional potential for leadership early in their careers!

whitehouse.gov/ostp/news-up...
January 16, 2025 at 1:21 PM