Siyuan Song
@siyuansong.bsky.social
Senior undergrad @UTexas Linguistics
Looking for a Ph.D. position, Fall 2026
Comp psycholinguistics & CogSci, human-like AI, rock 🎸 @growai.bsky.social
Prev: Summer research visits @MIT BCS (2025) and Harvard Psych (2024); undergrad @SJTU (2022-24)
Opinions are my own.
Study 2: We examined whether LLMs report their own temperature better than other models do. We found that self-report offers no advantage over external temperature prediction (predicting from the prompt and the generated text alone), whether within the same model or across different models. (9/n)
August 26, 2025 at 3:00 PM
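A toy sketch of how such a comparison could be scored: self-reported temperatures and external predictions are each evaluated against the temperatures actually used. All numbers below are made up for illustration; they are not results from the paper.

```python
# Toy scoring sketch: compare self-reported temperatures against
# external predictions made from the prompt + generated text alone.
# All values are invented for illustration.

true_temps = [0.2, 0.7, 1.0, 1.5]       # sampling temperatures actually used
self_reports = [0.5, 0.8, 0.9, 1.2]     # model's own answers to "what was your temperature?"
external_preds = [0.3, 0.6, 1.1, 1.4]   # another model's guesses from the text alone

def mae(preds, truth):
    """Mean absolute error between predicted and true temperatures."""
    return sum(abs(p - t) for p, t in zip(preds, truth)) / len(truth)

self_report_error = mae(self_reports, true_temps)
external_error = mae(external_preds, true_temps)
```

If self-report carried privileged information, `self_report_error` should be reliably lower; the thread reports no such advantage.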
Study 1: We reproduced C&S's temperature self-reporting case using a broader set of prompts and temperature settings. We found that such self-reports are highly sensitive to the prompt: even when the sampling temperature is low, a prompt like 'generate a crazy sentence' leads to a high-temperature report. (8/n)
August 26, 2025 at 3:00 PM
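For context on what the model would have to introspect on: temperature rescales the logits before sampling, so a low temperature sharpens the next-token distribution and a high one flattens it. A minimal, self-contained sketch (toy logits, not from any real model):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to sampling probabilities at a given temperature.
    Lower temperature sharpens the distribution; higher flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.2)   # near-deterministic
high = softmax_with_temperature(logits, 2.0)  # closer to uniform
```

The study's point is that the model's *report* tracks the prompt's wording ("crazy sentence"), not this actual sampling parameter.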
How reliable is what an AI says about itself? The answer depends on whether models can introspect. But if an LLM says its temperature parameter is high (and it is!)… does that mean it's introspecting? Surprisingly tricky to pin down. Our paper: arxiv.org/abs/2508.14802 (1/n)
August 26, 2025 at 3:00 PM
However, the consistency between these measures is low (kappa ~ .25 for experiment 1). And within-model correlation is no higher than across-model correlation when we consider relevantly similar models, like random-seed variants (see plot below for our breakdown of "similar" models). (6/8)
March 12, 2025 at 2:31 PM
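Cohen's kappa, the agreement statistic cited above, corrects raw agreement for chance. A self-contained sketch for two binary judgment sequences (toy data, not the paper's):

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two binary judgment sequences (1 = 'grammatical').
    Corrects observed agreement for the agreement expected by chance."""
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    # chance agreement from each rater's marginal rate of answering 1
    pa1 = sum(a) / n
    pb1 = sum(b) / n
    p_exp = pa1 * pb1 + (1 - pa1) * (1 - pb1)
    return (p_obs - p_exp) / (1 - p_exp)
```

Kappa of 1 means perfect agreement, 0 means chance-level; a value around .25 (as reported) indicates the two measurement methods only weakly agree.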
We see that meta-linguistic prompting and direct measurement of probabilities both reflect grammatical knowledge. Accuracy is high for both methods (and meta-linguistic accuracy is higher than direct accuracy for larger models). (5/8)
March 12, 2025 at 2:31 PM
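The "direct measurement" method scores each member of a minimal pair by its total log-probability under the model and calls the higher-probability sentence grammatical; meta-linguistic prompting instead asks the model a question about the sentences. A toy sketch of the direct method (the log-probability table below is invented, not from any real model):

```python
# Toy per-token log-probabilities standing in for a language model's scores.
# Direct measurement: sum token logprobs for each member of a minimal pair
# and call the higher-probability sentence grammatical.
toy_logprobs = {
    "the": -1.0, "keys": -3.0, "are": -2.0, "is": -4.5, "here": -2.5,
}

def sentence_logprob(tokens):
    """Total log-probability of a sentence under the toy model."""
    return sum(toy_logprobs[t] for t in tokens)

good = ["the", "keys", "are", "here"]   # subject-verb agreement intact
bad = ["the", "keys", "is", "here"]     # agreement violation
direct_choice = "good" if sentence_logprob(good) > sentence_logprob(bad) else "bad"
```

The thread's finding is that this direct score and the model's meta-linguistic answer need not agree, even when each is accurate on its own.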
New preprint w/ @jennhu.bsky.social @kmahowald.bsky.social : Can LLMs introspect about their knowledge of language?
Across models and domains, we did not find evidence that LLMs have privileged access to their own predictions. 🧵(1/8)
March 12, 2025 at 2:31 PM