Zaid Khan
@codezakh.bsky.social
PhD student @ UNC NLP with @mohitbansal working on grounded reasoning + code generation | currently interning at Ai2 (PRIOR) | formerly NEC Laboratories America | BS + MS @ Northeastern
zaidkhan.me
It was a fun collaboration with @esteng.bsky.social @archiki.bsky.social @jmincho.bsky.social @mohitbansal.bsky.social! 🥳
Paper: arxiv.org/abs/2504.09763
Project Page: zaidkhan.me/EFAGen
Datasets + Models: huggingface.co/collections/...
HF Paper: huggingface.co/papers/2504....
Executable Functional Abstractions: Inferring Generative Programs for Advanced Math Problems
Scientists often infer abstract procedures from specific instances of problems and use the abstractions to generate new, related instances. For example, programs encoding the formal rules and properties…
arxiv.org
April 15, 2025 at 7:37 PM
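To make the idea concrete, here is a minimal, hypothetical sketch of what an EFA might look like as a Python program. This is my own illustration, not taken from the paper: the `sample` interface and the example problem are assumptions.

import random

class SumOfEvensEFA:
    """EFA abstracting: 'What is the sum of the first 10 positive even integers?'"""

    def sample(self, rng: random.Random) -> dict:
        # Vary the free parameter of the abstraction to get a new instance.
        n = rng.randint(3, 50)
        problem = f"What is the sum of the first {n} positive even integers?"
        answer = n * (n + 1)  # 2 + 4 + ... + 2n = n(n + 1)
        return {"problem": problem, "answer": answer}

efa = SumOfEvensEFA()
print(efa.sample(random.Random(0)))  # a fresh variant with an execution-verified answer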
EFAs can be used for adversarial search to find harder problem variants. This has some interesting potential uses, such as finding fresh problems for online RL or identifying gaps/inconsistencies in a model's reasoning ability. We can find variants of even Level 1 problems that GPT-4o solves incorrectly.
April 15, 2025 at 7:37 PM
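A minimal sketch of what that adversarial search could look like, assuming the `sample` interface from the earlier sketch and a hypothetical `solve(problem)` wrapper that returns the target model's final answer (e.g. from GPT-4o):

import random

def find_hard_variants(efa, solve, budget=200, keep=5):
    """Sample EFA variants and keep the ones the target model answers incorrectly."""
    rng = random.Random(0)
    hard = []
    for _ in range(budget):
        inst = efa.sample(rng)
        if solve(inst["problem"]) != inst["answer"]:  # model failed on this variant
            hard.append(inst)
        if len(hard) == keep:
            break
    return hard  # e.g. fresh problems for online RL, or probes of reasoning gaps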
EFAGen can infer EFAs for diverse sources of math data.
We demonstrate this by inferring EFAs on the NuminaMath dataset, which spans problems from grade school to olympiad level. EFAGen successfully infers EFAs for every math source in NuminaMath, including olympiad-level problems.
April 15, 2025 at 7:37 PM
EFAs are effective at augmenting training data.
Getting high-quality math data is expensive. EFAGen offers a way to improve upon existing math training data by generating problem variants through EFAs. EFA-based augmentation leads to consistent improvements across all evaluation metrics.
April 15, 2025 at 7:37 PM
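A rough sketch of how EFA-based augmentation might look, again assuming the `sample` interface above (my illustration; the paper's actual pipeline may differ):

import random

def augment_with_efas(dataset, efas, variants_per_efa=4, seed=0):
    """Extend a static (problem, answer) dataset with EFA-generated variants."""
    rng = random.Random(seed)
    augmented = list(dataset)  # keep the original problems
    for efa in efas:
        for _ in range(variants_per_efa):
            # Answers come from executing the EFA, so they are verified
            # programmatically rather than guessed by a model.
            augmented.append(efa.sample(rng))
    return augmented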
LMs can self-improve at inferring EFAs with execution feedback!
We self-train Llama-3.1-8B-Instruct with rejection finetuning using our derived unit tests as a verifiable reward signal and see substantial improvements in the model’s ability to infer EFAs, especially on harder problems.
April 15, 2025 at 7:37 PM
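In outline, the rejection-finetuning loop might look like this. It is a sketch only: `generate_candidates`, `passes_unit_tests`, and `finetune` are hypothetical stand-ins passed in as callables, not the paper's API.

def rejection_finetune(model, problems, generate_candidates, passes_unit_tests,
                       finetune, n_candidates=8):
    """Keep only self-generated EFAs that pass the unit tests, then finetune."""
    accepted = []
    for prob in problems:
        for efa_code in generate_candidates(model, prob, n_candidates):
            if passes_unit_tests(efa_code, prob):  # execution feedback as reward
                accepted.append({"problem": prob, "efa": efa_code})
    return finetune(model, accepted)  # standard SFT on the filtered samples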
Key Insight💡: We formalize the properties any valid EFA must possess as unit tests, and treat EFA inference as a program synthesis task to which we can apply test-time search.
April 15, 2025 at 7:37 PM
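For illustration, a few checks in the spirit of those unit tests (my paraphrase; the paper's actual test suite is stricter, e.g. it also requires the EFA to be able to reproduce the original seed problem):

import random

def passes_basic_checks(efa, n_samples=10):
    """Weak validity checks: the EFA runs, emits well-formed instances, and is non-degenerate."""
    rng = random.Random(0)
    try:
        samples = [efa.sample(rng) for _ in range(n_samples)]  # must execute
    except Exception:
        return False
    well_formed = all("problem" in s and "answer" in s for s in samples)
    non_degenerate = len({s["problem"] for s in samples}) > 1  # distinct variants
    return well_formed and non_degenerate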
➡️ EFAGen can generate data to augment static math datasets
➡️ EFAGen can infer EFAs for diverse + difficult math problems
➡️ Use EFAs to find + generate harder variants of existing math problems
➡️ LLMs can self-improve at writing EFAs
April 15, 2025 at 7:37 PM
Reposted by Zaid Khan
-- positional bias of faithfulness for long-form summarization
-- improving generation faithfulness via multi-agent collaboration
(PS. Also a big thanks to ACs+reviewers for their effort!)
January 27, 2025 at 9:38 PM
Reposted by Zaid Khan
-- safe T2I/T2V generation
-- generative infinite games
-- procedural+predictive video representation learning
-- bootstrapping VLN via a self-refining data flywheel
-- automated preference data synthesis
-- diagnosing cultural bias of VLMs
-- adaptive decoding to balance contextual+parametric knowledge conflicts
🧵
January 27, 2025 at 9:38 PM
Reposted by Zaid Khan
-- adapting diverse controls to any diffusion model
-- balancing fast+slow System-1.x planning
-- balancing agents' persuasion resistance+acceptance
-- multimodal compositional+modular video reasoning
-- reverse thinking for stronger LLM reasoning
-- lifelong multimodal instruction tuning via dynamic data selection
🧵
January 27, 2025 at 9:38 PM