Andrei Mircea
@mirandrom.bsky.social
PhD student at University of Montreal // Mila ··· mechanistic understanding of LLMs + Human-AI collaboration for science ··· http://mirandrom.github.io
Pinned
Step 1: Understand how scaling improves LLMs.
Step 2: Directly target underlying mechanism.
Step 3: Improve LLMs independently of scale. Profit.

In our ACL 2025 paper, we look at Step 1 in terms of training dynamics.

Project: mirandrom.github.io/zsl
Paper: arxiv.org/pdf/2506.05447
July 12, 2025 at 6:44 PM
Mechanistic understanding of systematic failures in language models is something more research should strive for, IMO. This is really interesting work in that vein by @ziling-cheng.bsky.social; I highly recommend you check it out.
Do LLMs hallucinate randomly? Not quite.

Our #ACL2025 (Main) paper shows that hallucinations under irrelevant contexts follow a systematic failure mode — revealing how LLMs generalize using abstract classes + context cues, albeit unreliably.

📎 Paper: arxiv.org/abs/2505.22630 1/n
June 10, 2025 at 3:12 PM
📢 New paper “Language model scaling laws and zero-sum learning” at Sci4DL #neurips2024.

ℹ️ openreview.net/forum?id=yBq2g832Go

TL;DR: scaling improves LMs by mitigating zero-sum learning, a mechanism that could be targeted directly, independently of scale.

West Meeting Room 205-207, 4:30-5:30 PM

🧵 (1/12)
December 15, 2024 at 5:30 PM
Reposted by Andrei Mircea
My collaborators (Vivian White, @kamdh.bsky.social) will be presenting our work at the #Sci4DL workshop at #NeurIPS2024 today.

Location: West Meeting Room 205-207
Time: 4:30-5:30 PM

We present a principled probability distribution model of pre-trained deep neural networks. Check it out!
December 15, 2024 at 5:13 PM
Reposted by Andrei Mircea
For those of you attending #NeurIPS2024 in person: I'm from Vancouver and I made an extensive list of restaurants, bars, bookstores, etc., that I used to frequent when I still lived there. Enjoy!
dippedrusk.com/posts/2024-0...
Vagrant's Vancouver | Vagrant Gautam
A non-comprehensive list of places to go and things to do in the Greater Vancouver Area as curated by yours truly over 6 years. Might be outdated so please double-check!
November 29, 2024 at 8:49 PM
Reposted by Andrei Mircea
✨I am on the faculty job market in the 2024-2025 cycle!✨

My research centers on advancing Responsible AI, specifically enhancing factuality, robustness, and transparency in AI systems.

If you have relevant positions, let me know: lasharavichander.github.io. Please share/RT!
November 11, 2024 at 2:23 PM
Reposted by Andrei Mircea
✨EMNLP Paper! ✨
Have you ever constructed a table to organize your literature review process? Can we use LMs to generate these automatically?

We are excited to present ArxivDIGESTables 🍽️, a study of collecting, generating, and evaluating 🎓 scientific literature review tables 📃!
November 11, 2024 at 5:37 PM