Probabilistic circuits | Controlled generation
Autoregressive LLMs only “look one token ahead.”
But whether an output matches your style, safety, or format… can only be judged at the end.
This “myopic” generation means you can’t reliably enforce global controls, even with prompts or trained guides.
But LLMs natively model p(xt | x<t), without knowing s.
Here’s the trick:
p(xt | x<t, s) ∝ p(xt | x<t) × p(s | x<t, xt)
This links each next-token choice to eventual constraint satisfaction.
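A minimal toy sketch of that product rule in Python (the numbers in p_lm and p_sat are invented for illustration; this is not TRACE’s API):

```python
# Toy illustration of p(xt | x<t, s) ∝ p(xt | x<t) × p(s | x<t, xt).
import numpy as np

vocab = ["safe", "risky", "neutral"]
p_lm = np.array([0.2, 0.5, 0.3])   # base LM: p(xt | x<t)
p_sat = np.array([0.9, 0.1, 0.6])  # guide: p(s | x<t, xt), invented here

p_controlled = p_lm * p_sat        # unnormalized product
p_controlled /= p_controlled.sum() # renormalize over the vocabulary

for tok, p in zip(vocab, p_controlled):
    print(f"{tok}: {p:.3f}")
# Mass shifts from "risky" (0.50 -> ~0.12) toward tokens whose
# completions are likely to satisfy the constraint.
```

Everything then hinges on the second factor, p(s | x<t, xt): estimating whether a partial sequence can still end up satisfying the constraint.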
Train-time methods (FT/RL):
—Train a model for p(xt | x<t, s) directly, which means retraining for every new constraint s.
Inference-time methods:
—Computing p(s | x<t, xt) exactly means marginalizing over every possible future, which is intractable for long sequences (toy sketch below).
—Auxiliary guides approximate p(s | x<t, xt), but aren’t flexible for new constraints.
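To see where the intractability comes from, here is a brute-force toy version of that marginalization, with a hypothetical uniform “LM” over a 3-token vocabulary and an invented constraint; the point is the |V|^k count of terms, not the numbers themselves:

```python
# p(s | x<t, xt) = sum over every possible future x>t of
# p(x>t | x<t, xt) * 1[s satisfied]. Brute force only works at toy scale.
from itertools import product

vocab = ["a", "b", "c"]
horizon = 8  # remaining tokens to generate

def p_future(future):   # stand-in for p(x>t | x<t, xt): uniform toy "LM"
    return (1 / len(vocab)) ** len(future)

def satisfies(future):  # stand-in for a sequence-level constraint s
    return future.count("a") >= 2

# 3^8 = 6,561 futures here; a real LM faces ~50,000^k, hence intractable.
p_s = sum(p_future(f) for f in product(vocab, repeat=horizon) if satisfies(f))
print(f"p(s | prefix) = {p_s:.4f} over {len(vocab)**horizon} futures")
```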
—TRACE uses an HMM “LM-surrogate” to efficiently predict p(x>t | x<t, xt): the distribution over all possible futures, given the prefix so far.
—With these possible futures, a simple classifier estimates whether the constraint will be met: p(s | x<t, xt, x>t).
This decouples tractable prediction from flexible control.
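TRACE evaluates this expectation over futures exactly, via tractable operations on the HMM (hence “probabilistic circuits”); the log-linear form of the classifier is what makes that integration feasible. The sketch below is only a Monte Carlo stand-in to show the structure (forward-filter the prefix, then score futures rolled out from the posterior); all sizes, parameters, and the constraint are invented:

```python
# Illustrative HMM "LM-surrogate": filter the prefix to a posterior over
# hidden states, then reason about futures from that posterior.
# TRACE does the expectation in closed form; this samples it instead.
import numpy as np

rng = np.random.default_rng(0)
H, V = 4, 6                       # hidden states, vocabulary size
A = rng.dirichlet(np.ones(H), H)  # transitions p(z_{t+1} | z_t), row-stochastic
B = rng.dirichlet(np.ones(V), H)  # emissions p(x_t | z_t)
pi = rng.dirichlet(np.ones(H))    # initial state distribution

def filter_posterior(prefix):
    """Forward algorithm: p(z_t | x_{<=t}) for an observed token prefix."""
    alpha = pi * B[:, prefix[0]]
    for x in prefix[1:]:
        alpha = (alpha @ A) * B[:, x]
    return alpha / alpha.sum()

def p_constraint(prefix, horizon=20, n_samples=2000, target=0):
    """Estimate p(s | x_{<=t}) by rolling futures out of the HMM.
    Invented constraint s: token `target` appears somewhere in the future."""
    post = filter_posterior(prefix)
    hits = 0
    for _ in range(n_samples):
        z = rng.choice(H, p=post)
        for _ in range(horizon):
            z = rng.choice(H, p=A[z])
            if rng.choice(V, p=B[z]) == target:
                hits += 1
                break
    return hits / n_samples

print(p_constraint(prefix=[3, 1, 4]))  # plugs in as p(s | x<t, xt) above
```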
Surprisingly, even with this low overhead, TRACE consistently outperforms much more expensive baselines on global control tasks. For example, in detoxification, TRACE outperforms DPO, RL, and FUDGE in quality while maintaining diversity and fluency.
All you need is a one-time HMM distillation and a log-linear classifier for each constraint.
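For a sense of why per-constraint training takes seconds: a log-linear classifier is just logistic regression over simple features. The toy data below is invented, and TRACE’s actual classifier presumably operates on features the HMM can integrate over, but the cost scale is the takeaway:

```python
# A per-constraint "log-linear classifier" at its simplest: logistic
# regression on bag-of-words features. Fitting this is near-instant.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["you are a wonderful friend", "stupid hateful rant",
         "what a lovely day", "awful toxic garbage"]
labels = [1, 0, 1, 0]  # 1 = constraint satisfied (e.g. non-toxic)

vec = CountVectorizer()
clf = LogisticRegression().fit(vec.fit_transform(texts), labels)

# Probability the constraint holds for new text -- the p(s | ...) guide.
print(clf.predict_proba(vec.transform(["have a lovely day, friend"]))[:, 1])
```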
—Seconds of classifier training to adapt to new constraints
—Only +20% inference time over the base LM
For example, TRACE can instantly imitate Twilight Sparkle’s tone, much better than a prompting baseline.