Lightnews — Scholar-powered news

Andrew Jesson

@anndvision.bsky.social

i'm at ICLR this week

presenting at the morning poster session on thursday

excited to catch up with friends and collaborators, old and new

let's chat

April 23, 2025 at 4:10 AM

Andrew Jesson

@anndvision.bsky.social

we explore two different discrepancies: the negative log likelihood (NLL), and the negative log marginal likelihood (NLML)

the NLL gives p-values that are informative of whether there are enough in-context examples

this can reduce risk in safety critical settings

December 13, 2024 at 4:11 PM

Andrew Jesson

@anndvision.bsky.social

we show that the GPC is an effective OOD predictor on generative image completion tasks using a modified Llama-2 model trained from scratch

December 13, 2024 at 4:11 PM

Andrew Jesson

@anndvision.bsky.social

we show that the GPC is an effective predictor of out-of-capability natural language tasks using pre-trained LLMs

December 13, 2024 at 4:11 PM

Andrew Jesson

@anndvision.bsky.social

we show that the GPC is an effective OOD predictor for tabular data using synthetic data and a modified Llama-2 model trained from scratch

December 13, 2024 at 4:11 PM

Andrew Jesson

@anndvision.bsky.social

the result is the generative predictive p-value

pre-selecting a significance level α to threshold the p-value gives us a predictor of model capacity: the generative predictive check (GPC)

December 13, 2024 at 4:11 PM

Andrew Jesson

@anndvision.bsky.social

understanding these nuances is the domain of Bayesian model criticism

posterior predictive checks form a family of model criticism techniques

but for discrepancy functions like the negative log likelihood, PPCs require the likelihood and posterior

December 13, 2024 at 4:11 PM

Andrew Jesson

@anndvision.bsky.social

the posterior is informative about if there are enough in-context examples

but such inferences are made by any model, even misaligned ones

if a model is too flexible, more examples may be needed to specify the task

if it is too specialized, the inferences may be unreliable

December 13, 2024 at 4:11 PM

Andrew Jesson

@anndvision.bsky.social

a model θ defines a joint distribution over datasets x and explanations f

the joint comprises the likelihood over datasets and the prior over explanations

the posterior is a distribution over explanations given a dataset

the posterior predictive gives the model a voice

December 13, 2024 at 4:11 PM

Andrew Jesson

@anndvision.bsky.social

an in-context learning problem comprises a model, a dataset, and a task

knowing when an LLM provides reliable responses is challenging in this setting

there may not be enough in-context examples to specify the task

or the model may just not have the capability to it

December 13, 2024 at 4:11 PM

Andrew Jesson

@anndvision.bsky.social

can generative ai solve your in-context learning problem ?

we develop a predictor that only requires sampling and log probs

we show it works for tabular, natural language, and imaging problems

come chat at the safe generative ai workshop at NeurIPS

📄 arxiv.org/abs/2412.06033

December 13, 2024 at 4:11 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news