Probabilistic circuits | Controlled generation
Autoregressive LLMs only “look one token ahead.”
But whether an output matches your style, safety, or format… can only be judged at the end.
This “myopic” generation means you can’t reliably enforce global controls, even with prompts or trained guides.
But LLMs natively model p(xt | x<t), without knowing s.
Here’s the trick:
p(xt | x<t, s) ∝ p(xt | x<t) × p(s | x<t, xt)
This links each next-token choice to eventual constraint satisfaction.
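A minimal toy sketch of that product rule in Python (the numbers in p_lm and p_sat are invented for illustration; this is not TRACE’s API):

```python
# Toy illustration of p(xt | x<t, s) ∝ p(xt | x<t) × p(s | x<t, xt).
import numpy as np

vocab = ["safe", "risky", "neutral"]
p_lm = np.array([0.2, 0.5, 0.3])   # base LM: p(xt | x<t)
p_sat = np.array([0.9, 0.1, 0.6])  # guide: p(s | x<t, xt), invented here

p_controlled = p_lm * p_sat        # unnormalized product
p_controlled /= p_controlled.sum() # renormalize over the vocabulary

for tok, p in zip(vocab, p_controlled):
    print(f"{tok}: {p:.3f}")
# Mass shifts from "risky" (0.50 -> ~0.12) toward tokens whose
# completions are likely to satisfy the constraint.
```

Everything then hinges on the second factor, p(s | x<t, xt): estimating whether a partial sequence can still end up satisfying the constraint.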
Train-time methods (FT/RL):
—Train a model for p(xt | x<t, s) directly, which means retraining for every new constraint s.
Inference-time methods:
—Computing p(s | x<t, xt) exactly means marginalizing over every possible future, which is intractable for long sequences (toy sketch below).
—Auxiliary guides approximate p(s | x<t, xt), but aren’t flexible for new constraints.
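To see where the intractability comes from, here is a brute-force toy version of that marginalization, with a hypothetical uniform “LM” over a 3-token vocabulary and an invented constraint; the point is the |V|^k count of terms, not the numbers themselves:

```python
# p(s | x<t, xt) = sum over every possible future x>t of
# p(x>t | x<t, xt) * 1[s satisfied]. Brute force only works at toy scale.
from itertools import product

vocab = ["a", "b", "c"]
horizon = 8  # remaining tokens to generate

def p_future(future):   # stand-in for p(x>t | x<t, xt): uniform toy "LM"
    return (1 / len(vocab)) ** len(future)

def satisfies(future):  # stand-in for a sequence-level constraint s
    return future.count("a") >= 2

# 3^8 = 6,561 futures here; a real LM faces ~50,000^k, hence intractable.
p_s = sum(p_future(f) for f in product(vocab, repeat=horizon) if satisfies(f))
print(f"p(s | prefix) = {p_s:.4f} over {len(vocab)**horizon} futures")
```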
—TRACE uses an HMM “LM-surrogate” to efficiently predict p(x>t | x<t, xt): the distribution over all possible futures, given the prefix so far.
—With these possible futures, a simple classifier estimates whether the constraint will be met: p(s | x<t, xt, x>t).
This decouples tractable prediction from flexible control.
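TRACE evaluates this expectation over futures exactly, via tractable operations on the HMM (hence “probabilistic circuits”); the log-linear form of the classifier is what makes that integration feasible. The sketch below is only a Monte Carlo stand-in to show the structure (forward-filter the prefix, then score futures rolled out from the posterior); all sizes, parameters, and the constraint are invented:

```python
# Illustrative HMM "LM-surrogate": filter the prefix to a posterior over
# hidden states, then reason about futures from that posterior.
# TRACE does the expectation in closed form; this samples it instead.
import numpy as np

rng = np.random.default_rng(0)
H, V = 4, 6                       # hidden states, vocabulary size
A = rng.dirichlet(np.ones(H), H)  # transitions p(z_{t+1} | z_t), row-stochastic
B = rng.dirichlet(np.ones(V), H)  # emissions p(x_t | z_t)
pi = rng.dirichlet(np.ones(H))    # initial state distribution

def filter_posterior(prefix):
    """Forward algorithm: p(z_t | x_{<=t}) for an observed token prefix."""
    alpha = pi * B[:, prefix[0]]
    for x in prefix[1:]:
        alpha = (alpha @ A) * B[:, x]
    return alpha / alpha.sum()

def p_constraint(prefix, horizon=20, n_samples=2000, target=0):
    """Estimate p(s | x_{<=t}) by rolling futures out of the HMM.
    Invented constraint s: token `target` appears somewhere in the future."""
    post = filter_posterior(prefix)
    hits = 0
    for _ in range(n_samples):
        z = rng.choice(H, p=post)
        for _ in range(horizon):
            z = rng.choice(H, p=A[z])
            if rng.choice(V, p=B[z]) == target:
                hits += 1
                break
    return hits / n_samples

print(p_constraint(prefix=[3, 1, 4]))  # plugs in as p(s | x<t, xt) above
```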
Surprisingly, even with this low overhead, TRACE consistently outperforms much more expensive baselines on global control tasks. For example, in detoxification, TRACE outperforms DPO, RL, and FUDGE in quality while maintaining diversity and fluency.
All you need is a one-time HMM distillation and a log-linear classifier for each constraint.
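For a sense of why per-constraint training takes seconds: a log-linear classifier is just logistic regression over simple features. The toy data below is invented, and TRACE’s actual classifier presumably operates on features the HMM can integrate over, but the cost scale is the takeaway:

```python
# A per-constraint "log-linear classifier" at its simplest: logistic
# regression on bag-of-words features. Fitting this is near-instant.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["you are a wonderful friend", "stupid hateful rant",
         "what a lovely day", "awful toxic garbage"]
labels = [1, 0, 1, 0]  # 1 = constraint satisfied (e.g. non-toxic)

vec = CountVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(texts), labels)

# Probability the constraint holds for new text -- the p(s | ...) guide.
print(clf.predict_proba(vec.transform(["have a lovely day, friend"]))[:, 1])
```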
—Seconds of classifier training to adapt to new constraints
—Only +20% inference time over the base LM
For example, TRACE can instantly imitate Twilight Sparkle’s tone, much better than a prompting baseline.