Andrei Mircea
@mirandrom.bsky.social
PhD student at University of Montreal // Mila ··· mechanistic understanding of LLMs + Human-AI collaboration for science ··· http://mirandrom.github.io
Pinned
Step 1: Understand how scaling improves LLMs.
Step 2: Directly target underlying mechanism.
Step 3: Improve LLMs independently of scale. Profit.

In our ACL 2025 paper, we look at Step 1 in terms of training dynamics.

Project: mirandrom.github.io/zsl
Paper: arxiv.org/pdf/2506.05447
July 12, 2025 at 6:44 PM
Mechanistic understanding of systematic failures in language models is something more research should strive for, IMO. This is really interesting work in that vein by @ziling-cheng.bsky.social; I highly recommend you check it out.
Do LLMs hallucinate randomly? Not quite.

Our #ACL2025 (Main) paper shows that hallucinations under irrelevant contexts follow a systematic failure mode — revealing how LLMs generalize using abstract classes + context cues, albeit unreliably.

📎 Paper: arxiv.org/abs/2505.22630 1/n
June 10, 2025 at 3:12 PM
📢 New paper “Language model scaling laws and zero-sum learning” at Sci4DL #neurips2024.

ℹ️ openreview.net/forum?id=yBq2g832Go

TL;DR: scaling improves LMs by mitigating zero-sum learning, a mechanism that could be targeted directly, independently of scale.

West Meeting Room 205-207, 4:30-5:30 PM

🧵 (1/12)
December 15, 2024 at 5:30 PM
Reposted by Andrei Mircea
My collaborators (Vivian White, @kamdh.bsky.social) will be presenting our work at the #Sci4DL workshop at #NeurIPS2024 today.

Location: West Meeting Room 205-207
Time: 4:30-5:30 PM

We present a principled probability distribution model of pre-trained deep neural networks. Check it out!
December 15, 2024 at 5:13 PM
Reposted by Andrei Mircea
For those of you attending #NeurIPS2024 in person: I'm from Vancouver and I made an extensive list of restaurants, bars, bookstores, etc., that I used to frequent when I still lived there. Enjoy!
dippedrusk.com/posts/2024-0...
Vagrant's Vancouver | Vagrant Gautam
A non-comprehensive list of places to go and things to do in the Greater Vancouver Area as curated by yours truly over 6 years. Might be outdated so please double-check!
November 29, 2024 at 8:49 PM
Reposted by Andrei Mircea
✨I am on the faculty job market in the 2024-2025 cycle!✨

My research centers on advancing Responsible AI, specifically enhancing factuality, robustness, and transparency in AI systems.

If you have relevant positions, let me know: lasharavichander.github.io. Please share/RT!
November 11, 2024 at 2:23 PM
Reposted by Andrei Mircea
✨EMNLP Paper! ✨
Have you ever constructed a table to organize your literature review process? Can we use LMs to generate these automatically?

We are excited to present ArxivDIGESTables 🍽️, a study of collecting, generating, and evaluating 🎓 scientific literature review tables 📃!
November 11, 2024 at 5:37 PM