Lightnews — Scholar-powered news

Janet Liu

@janetlauyeung.bsky.social

740 followers 150 following 22 posts

🏫 asst. prof. of compling at university of pittsburgh

past:
🛎️ postdoc @mainlp.bsky.social, LMU Munich
🤠 PhD in CompLing from Georgetown
🕺🏻 x2 intern @Spotify @SpotifyResearch

https://janetlauyeung.github.io/

🛎️ we are up! #COLM2025

October 10, 2025 at 12:31 PM

🤚🏼 co-organizing a workshop on the 10th!

September 12, 2025 at 7:59 PM

💡 more findings, error analysis, and in-depth discussion are in our paper:

📄 arxiv.org/abs/2503.10515
🤖 github.com/mainlp/disco...

meet and chat with us at our poster in Vienna 🇦🇹 at #ACL2025NLP

🕰️ 11:00-12:30, Wednesday, July 30
📍 Hall 4/5 Session 12: IP-Posters

July 10, 2025 at 12:38 PM

🔍 finding 3: discourse representations are best aligned across languages in the intermediate layers

Layer-wise probe performance by languages. Mean accuracy over five runs.

July 10, 2025 at 12:38 PM

🌍 finding 2: our probes generalize across languages and language families

Mean accuracy over five runs of the Aya-23-35B-probe trained and tested on various partitions of DISRPT.

July 10, 2025 at 12:38 PM

📌 finding 1: model size alone does not lead to discourse probing success; instead, multilingual training, dataset composition, and language-specific factors play significant roles

Mean accuracy over five runs of the probing classifiers trained on the entire DISRPT and full attention representations. The reference system DisCoDisCo achieved a mean accuracy of 47.9% (the red dashed line).

July 10, 2025 at 12:38 PM

🧪 for 23 SOTA LLMs, we use a probing approach to test whether their representations encode information relevant to discourse relation classification on DISRPT 2023, which covers 13 languages, four frameworks, 26 datasets, and various genres, domains, and modalities

July 10, 2025 at 12:38 PM
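
The probing idea in the post above can be illustrated with a minimal sketch: keep the model's representations frozen and train only a small classifier on top of them. Everything here is an assumption for illustration. The posts do not describe the probe architecture, so a plain softmax-regression probe is used; random synthetic vectors stand in for LLM representations; and the function names `train_linear_probe` and `probe_accuracy` are hypothetical. The 17 classes simply mirror the size of the unified label set.

```python
import numpy as np

def train_linear_probe(features, labels, num_classes, lr=0.3, epochs=300):
    """Fit a softmax-regression probe on frozen features via gradient descent.

    Only the probe's weights and bias are learned; the features are never changed.
    """
    n, d = features.shape
    weights = np.zeros((d, num_classes))
    bias = np.zeros(num_classes)
    one_hot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = features @ weights + bias
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - one_hot) / n  # gradient of mean cross-entropy w.r.t. logits
        weights -= lr * (features.T @ grad)
        bias -= lr * grad.sum(axis=0)
    return weights, bias

def probe_accuracy(features, labels, weights, bias):
    """Fraction of examples whose highest-scoring class matches the label."""
    preds = (features @ weights + bias).argmax(axis=1)
    return float((preds == labels).mean())

# Synthetic stand-in for frozen representations (assumption: not real model output).
rng = np.random.default_rng(0)
num_classes, dim = 17, 32
centers = rng.normal(size=(num_classes, dim))  # one cluster center per relation label
train_y = rng.integers(0, num_classes, size=1700)
test_y = rng.integers(0, num_classes, size=340)
train_x = centers[train_y] + 0.3 * rng.normal(size=(1700, dim))
test_x = centers[test_y] + 0.3 * rng.normal(size=(340, dim))

w, b = train_linear_probe(train_x, train_y, num_classes)
acc = probe_accuracy(test_x, test_y, w, b)
print(f"probe accuracy: {acc:.3f}")
```

With real data, `train_x` and `test_x` would be replaced by representations extracted from a chosen model layer, and the probe's held-out accuracy would indicate how much relation information that layer makes linearly accessible.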

❓ problem: discourse relations are central to NLU, but current work is primarily fragmented across frameworks & languages

🔧 solution: we proposed a unified label set of 17 relations across 4 discourse frameworks. This lets us compare model behavior across corpora, languages, and annotation schemes

Examples of the core discourse relation CONDITION (Bunt and Prasad, 2016) annotated in different frameworks and languages using different labels.

the proposed unified label set (see definitions and examples in the appendix of the paper)

July 10, 2025 at 12:38 PM

@munichcenterml.bsky.social
@slds-lmu.bsky.social
@berd-nfdi.bsky.social

May 16, 2025 at 1:24 PM

my amazing co-organizers: @assenmacher.bsky.social, Jacob Beck, @barbaraplank.bsky.social, @stephnie.bsky.social, Frauke Kreuter, Gina Walejko

May 16, 2025 at 1:24 PM
