Kareem Ahmed
@kareemyousrii.bsky.social
Postdoc @ University of California, Irvine | PhD from CS@UCLA

Neuro-Symbolic AI, Tractable Probabilistic Reasoning, Generative Models

kareemahmed.com
What book is this? 👀
December 22, 2024 at 11:03 PM
Will do!
December 11, 2024 at 8:00 PM
Very cool!
December 11, 2024 at 6:08 PM
This was the last paper of my PhD and was written in collaboration with my very dear advisors @kaiwei_chang and @guyvdb
December 11, 2024 at 12:20 AM
We evaluate our approach, which we call Gen C (like Gen Z, get it?), on several tasks such as LLM detoxification, Sudoku, and shortest-path prediction, and show that our approach outperforms the baselines. We plan on adding even more tasks very soon.
December 11, 2024 at 12:20 AM
More importantly, we can efficiently condition this approximate distribution on our constraint such that any sample provably satisfies the constraint. We can reweight our samples using the LLM to correct for any bias introduced by our approximate distribution.
December 11, 2024 at 12:20 AM
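Here's a rough Python sketch of that conditioning-and-reweighting step on a toy example. The sentences and probabilities below are made up purely for illustration; the actual method operates on the factorized approximation of the LLM, not an enumerated list of sentences.

```python
import numpy as np

rng = np.random.default_rng(0)
sentences = ["the cat is full of treats", "the cat is full of sh!t",
             "the cat is very sweet", "sh!t happens sometimes"]

p = np.array([0.5, 0.3, 0.15, 0.05])   # "true" LLM probabilities (made up)
q = np.array([0.4, 0.2, 0.3, 0.1])     # approximate distribution (made up)

satisfies = np.array(["full of sh!t" not in s for s in sentences])

# Condition the approximation on the constraint: zero out violating
# sentences and renormalize. Sampling from q_constrained can then
# never produce a violating sentence.
q_constrained = np.where(satisfies, q, 0.0)
q_constrained /= q_constrained.sum()

# Importance weights w = p / q_constrained correct for the mismatch between
# the approximation and the LLM on the constrained support.
idx = rng.choice(len(sentences), size=5, p=q_constrained)
weights = p[idx] / q_constrained[idx]
for i, w in zip(idx, weights):
    print(f"{sentences[i]!r:40s}  weight={w:.2f}")
```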
To do so, we construct a first-order approximation of the LLM centered at the unconstrained sample. This approximation is naturally not as good a language model as the LLM itself, but it allows us to efficiently represent a distribution over all sentences of bounded length.
December 11, 2024 at 12:20 AM
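One way to read "first-order approximation centered at the sample" (my hedged sketch, not the paper's code) is a fully factorized surrogate built from the LLM's next-token conditionals evaluated along the sampled sequence. Here toy_lm and the sample are stand-ins:

```python
import numpy as np

vocab = ["the", "cat", "is", "full", "of", "sh!t", "treats", "<eos>"]

def toy_lm(prefix):
    """Stand-in for the LLM's next-token distribution p(x_t | prefix)."""
    rng = np.random.default_rng(len(prefix))          # deterministic toy logits
    logits = rng.normal(size=len(vocab))
    return np.exp(logits) / np.exp(logits).sum()

sample = ["the", "cat", "is", "full", "of", "sh!t"]   # unconstrained sample

# Factorized surrogate centered at the sample:
#   q(x_1..x_T) = prod_t p_LM(x_t | sample_{<t})
# Because positions are independent under q, it compactly represents a
# distribution over *all* length-T sentences and is easy to condition on
# a constraint, unlike the full autoregressive model.
surrogate = [toy_lm(sample[:t]) for t in range(len(sample))]

def surrogate_logprob(sentence):
    return sum(np.log(surrogate[t][vocab.index(tok)])
               for t, tok in enumerate(sentence))

print(surrogate_logprob(sample))
print(surrogate_logprob(["the", "cat", "is", "full", "of", "treats"]))
```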
Now imagine we want to ban a bad expression, say "full of sh!t". We start by taking a sample from the LLM. The sample, shown in red, violates the constraint. What we want to do now is project the sample onto the support of the LLM distribution satisfying the constraint, m(alpha).
December 11, 2024 at 12:20 AM
Constrained decoding typically uses a DFA to mask invalid tokens at every step of generation. This ensures constraint satisfaction* but can introduce significant bias in the generated output.

*This is not strictly true due to tokenization. See paper for more on this.
December 11, 2024 at 12:20 AM
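For contrast, here's a minimal word-level sketch of that masking-style decoding for a single banned phrase (toy vocabulary and a uniform LM, so tokenization issues don't arise). The renormalization after masking is exactly where the bias creeps in:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "is", "full", "of", "sh!t", "treats", "<eos>"]
banned = ["full", "of", "sh!t"]          # phrase we never want to emit

def dfa_step(state, token):
    """Advance the matched-prefix length of the banned phrase (toy version;
    works here because the phrase has no self-overlapping prefixes)."""
    if token == banned[state]:
        return state + 1
    return 1 if token == banned[0] else 0

def toy_lm(prefix):
    """Stand-in for the LLM's next-token distribution (uniform here)."""
    return np.full(len(vocab), 1.0 / len(vocab))

def constrained_decode(max_len=10):
    tokens, state = [], 0
    for _ in range(max_len):
        probs = toy_lm(tokens)
        # Mask every token whose DFA transition would complete the banned phrase.
        for i, tok in enumerate(vocab):
            if dfa_step(state, tok) == len(banned):
                probs[i] = 0.0
        probs /= probs.sum()             # renormalization is the source of bias
        tok = vocab[rng.choice(len(vocab), p=probs)]
        if tok == "<eos>":
            break
        state = dfa_step(state, tok)
        tokens.append(tok)
    return " ".join(tokens)

print(constrained_decode())
```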
Can't wait to finally meet you and hopefully @mniepert.bsky.social in person! :)
December 8, 2024 at 1:38 AM
Hi! I work on probabilistic ML and Neuro-Symbolic AI
November 18, 2024 at 6:44 PM