Probabilistic circuits | Controlled generation
—Seconds of classifier training to adapt to new constraints
—Only +20% inference time over base LM
For example, it can instantly imitate Twilight Sparkle’s tone, far better than a prompting baseline.
Surprisingly, even with this low overhead, TRACE consistently outperforms much more expensive baselines on global control tasks. For example, in detoxification, TRACE outperforms DPO, RL, and FUDGE in quality while maintaining diversity and fluency.
Train-time methods (FT/RL):
—Train a model for p(xt | x<t, s)
Inference-time methods:
—Sampling exactly from the constrained distribution needs p(s | x<t, xt), which is intractable for long sequences since it marginalizes over all possible continuations (decomposition sketched below).
—Auxiliary guides (e.g., FUDGE-style classifiers) approximate p(s | x<t, xt), but don’t adapt easily to new constraints.
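For readers who want the underlying math, inference-time guidance methods rest on the Bayes decomposition below, written here in the thread’s own notation (the exact TRACE formulation may differ in details):

p(xt | x<t, s) ∝ p(xt | x<t) · p(s | x<t, xt), where p(s | x<t, xt) = E_{x>t ~ p(· | x≤t)} [ p(s | x1:T) ].

The expectation over all continuations x>t is what makes exact guidance intractable for an autoregressive LM; the HMM proxy in the next post exists precisely because such lookahead expectations are tractable to compute in an HMM.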
TRACE lets the LM see all possible endings before each move.
– Global control at inference time
– Tractable lookahead via an HMM LM-proxy
– Linear classifier per constraint
Outperforms RL, DPO, and FUDGE at just +20% decoding time over the base LM (lookahead step sketched below).
#ICML2025 @guyvdb.bsky.social
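To make the “tractable lookahead” bullet concrete, here is a minimal sketch, my own rather than the authors’ code, of what one reweighted decoding step could look like. Every name here (guided_next_token_probs, A, B, eaf) is invented for illustration, and it assumes the HMM proxy and the per-constraint linear classifier are already trained; the real TRACE implementation may differ.

import numpy as np

# Hypothetical sketch of HMM-based lookahead reweighting for one decoding step.
# Assumes: an HMM proxy of the base LM (transition matrix A, emission matrix B),
# the current HMM state posterior given the prefix, and eaf[h] = expected
# probability that a sequence continued from hidden state h ends up satisfying
# the constraint s (derived from the per-constraint linear classifier).
def guided_next_token_probs(lm_probs, hmm_belief, A, B, eaf):
    next_state = hmm_belief @ A                   # (H,) predictive hidden-state distribution
    joint = next_state[:, None] * B               # (H, V) joint over next state and next token
    p_token = joint.sum(axis=0) + 1e-12           # (V,) HMM marginal over the next token
    p_sat = (joint * eaf[:, None]).sum(axis=0) / p_token  # (V,) approx p(s | prefix, next token)
    reweighted = lm_probs * p_sat                 # Bayes reweighting: p(xt | x<t) * p(s | x<t, xt)
    return reweighted / reweighted.sum()

# Toy usage with random parameters (H = 8 hidden states, V = 100 tokens):
rng = np.random.default_rng(0)
H, V = 8, 100
A = rng.dirichlet(np.ones(H), size=H)
B = rng.dirichlet(np.ones(V), size=H)
belief = rng.dirichlet(np.ones(H))
lm_probs = rng.dirichlet(np.ones(V))
eaf = rng.uniform(size=H)                         # stand-in for classifier-derived scores
print(guided_next_token_probs(lm_probs, belief, A, B, eaf).sum())  # sums to 1.0

Note that only eaf depends on the constraint, so a new constraint only needs a new linear classifier, which lines up with the “seconds of classifier training” bullet above.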