Alex Lew
@alexlew.bsky.social
Theory & practice of probabilistic programming. Current: MIT Probabilistic Computing Project; Fall '25: Incoming Asst. Prof. at Yale CS
Reposted by Alex Lew
not sure how to get this across to non-academics but here goes,

Imagine if you were suddenly told 'we decided not to pay your salary', that's kind of what the grant cuts felt like.

Now imagine if you were suddenly told 'we are going to set your dog on fire', that's what this feels like:
May 22, 2025 at 6:35 PM
Reposted by Alex Lew
Want to use AWRS SMC?

Check out the GenLM control library: github.com/genlm/genlm-...

GenLM supports not only grammars, but arbitrary programmable constraints from type systems to simulators.

If you can write a Python function, you can control your language model!
May 13, 2025 at 2:22 PM
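The post above says any Python function can serve as a constraint. This is not the genlm-control API itself, just a hedged sketch of the kind of Boolean constraint it describes (a hypothetical checker that a partial generation parses as Python):

```python
import ast

def is_valid_python(text: str) -> bool:
    """Hypothetical Boolean constraint: does the text parse as Python?

    A function like this could, in principle, be used to steer generation
    toward outputs that satisfy it.
    """
    try:
        ast.parse(text)
        return True
    except SyntaxError:
        return False
```

The same shape works for any programmable check — a type checker, a test suite, a simulator — as long as it maps text to True/False.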
Reposted by Alex Lew
Many LM applications may be formulated as text generation conditional on some (Boolean) constraint.

Generate a…
- Python program that passes a test suite.
- PDDL plan that satisfies a goal.
- CoT trajectory that yields a positive reward.
The list goes on…

How can we efficiently satisfy these? 🧵👇
May 13, 2025 at 2:22 PM
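For intuition on why efficiency is the hard part: the naive baseline for conditioning on a Boolean constraint is rejection sampling, whose cost scales inversely with the constraint's acceptance probability. A toy sketch (generic code, not from the paper):

```python
import random

def rejection_sample(generate, constraint, max_tries=1000):
    """Naive baseline: resample complete generations until the constraint
    holds. Expected cost is 1/p tries, where p is the probability that a
    sample satisfies the constraint — hopeless when p is tiny."""
    for _ in range(max_tries):
        candidate = generate()
        if constraint(candidate):
            return candidate
    raise RuntimeError("constraint never satisfied")

# Toy model: uniform 4-bit strings; constraint holds with p = 1/16.
random.seed(0)
sample = rejection_sample(
    lambda: "".join(random.choice("01") for _ in range(4)),
    lambda s: s == "1111",
)
```

Methods like SMC improve on this by steering generation token-by-token instead of discarding whole samples.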
Reposted by Alex Lew
#ICLR2025 Oral

How can we control LMs using diverse signals such as static analyses, test cases, and simulations?

In our paper “Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo” (w/ @benlipkin.bsky.social,
@alexlew.bsky.social, @xtimv.bsky.social) we:
April 25, 2025 at 7:33 PM
@xtimv.bsky.social and I were just discussing an interesting detail in the DeepSeek paper that introduced GRPO: a different way of setting up the KL loss.

It's a little hard to reason about what this does to the objective. 1/
February 10, 2025 at 4:32 AM
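For context on the post above: as I recall, the GRPO objective estimates the per-token KL with the "k3" form r − log r − 1 (r = π_ref/π_θ), which is unbiased and always nonnegative. A sketch, under that reading of the paper:

```python
import math

def grpo_kl_estimate(logp_theta: float, logp_ref: float) -> float:
    """Per-token KL estimate of the form used in the GRPO objective:
    r - log(r) - 1, where r = pi_ref(token) / pi_theta(token).
    Nonnegative for all r > 0, since log(r) <= r - 1; zero iff r = 1."""
    r = math.exp(logp_ref - logp_theta)
    return r - math.log(r) - 1.0
```

Unlike the plain log-ratio, every token contributes a penalty of the same sign, which is part of what makes the effect on the objective subtle to reason about.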
If you're interested in a PhD at the intersection of machine learning and programming languages, consider applying to Yale CS!

We're exploring new approaches to building software that draws inferences and makes predictions. See alexlew.net for details & apply at gsas.yale.edu/admissions/ by Dec. 15
December 8, 2024 at 4:27 PM
Reposted by Alex Lew
Kind of a broken record here but proceedings.neurips.cc/paper_files/...
is totally fascinating in that it postulates two underlying, measurable structures that you can use to assess if RL will be easy or hard in an environment
November 23, 2024 at 6:18 PM
The New Yorker used to have human narrators do pretty great audio versions of selected articles. But then they quietly switched to generic, lifeless AI (with no indication until you click "Listen").

Occasionally they'll still have a human reader, like Sedaris here, and the contrast is insane
David Sedaris writes about travelling with his longtime partner, Hugh—and asking him if a stranded passenger could join their drive from Maine to New York. “The look he gave me was not one I had ever seen before.”
The Long Way Home After a Cancelled Flight, by David Sedaris
Had I proposed earlier that we invite someone stranded to come with us to New York, Hugh would have said no. But now there was really no way for him to back out.
www.newyorker.com
November 23, 2024 at 7:57 PM
Reposted by Alex Lew
Trying something new:
A 🧵 on a topic I find many students struggle with: "why do their 📊 look more professional than my 📊?"

It's *lots* of tiny decisions that aren't the defaults in many libraries, so let's break down 1 simple graph by @jburnmurdoch.bsky.social

🔗 www.ft.com/content/73a1...
November 20, 2024 at 5:09 PM
Reposted by Alex Lew
mixtures of circuit approximations of algorithms, I tell you!

kernel methods in the space of (short, propositional) programs!!

why memorize and interpolate answers when you can memorize and interpolate answer-producing procedures??
To my surprise, we find the opposite of what I thought when we started this project:

The approach to reasoning LLMs use looks unlike retrieval, and more like a generalisable strategy synthesising procedural knowledge from many documents doing a similar form of reasoning.
November 21, 2024 at 1:15 PM
Surprisal of title beginning with 'O'? 3.22
Surprisal of 'o' following 'Treatment '? 0.11
Surprisal that title includes surprisal of each title character? Priceless [...I did not know titles could do this]
November 21, 2024 at 4:06 PM
This is a very cool integration of LLMs + Bayesian methods.

LLMs serve as *likelihoods*: how likely would the human be to have issued this (English) command, given a particular (symbolic) plan? No generation, just scoring :)

A Bayesian agent can then resolve ambiguity in really sensible ways
November 19, 2024 at 7:19 PM
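The LLM-as-likelihood pattern the post above describes can be sketched in a few lines. This is a hypothetical illustration, not any particular library's API; `llm_logprob` is an assumed scoring function returning the LLM's log-probability of the command given a plan:

```python
import math

def posterior_over_plans(command, plans, prior, llm_logprob):
    """Bayes' rule with an LLM likelihood: score each symbolic plan by how
    likely the human's English command would be *given* that plan (no
    generation, just scoring), then combine with the prior and normalize."""
    log_scores = [math.log(prior[p]) + llm_logprob(command, p) for p in plans]
    m = max(log_scores)                       # subtract max for stability
    weights = [math.exp(s - m) for s in log_scores]
    z = sum(weights)
    return {p: w / z for p, w in zip(plans, weights)}
```

Ambiguity resolves naturally: a command that is plausible under several plans spreads posterior mass across them, while a command that only makes sense under one plan concentrates on it.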
It's interesting just how recent this shift was. Autodiff existed but hadn't been adopted by the ML community. Justin Domke had a blog post in 2009 lamenting that so many papers claimed "an efficient algorithm for gradients" as a key technical contribution

justindomke.wordpress.com/2009/02/17/a...
November 18, 2024 at 7:35 PM
Hi Bluesky! My claim to fame is the development of the Alexander Hamiltonian Monte Carlo algorithm.

Younger researchers may not realize due to Moore's Law (Lin-Manuel Miranda becomes roughly half as cool every two years), but back when this was published in 2021, it was considered mildly topical
November 18, 2024 at 7:11 PM