Flaviu Cipcigan
flaviucipcigan.bsky.social
Building AIs for scientific discovery. Discovered antibiotics and materials for carbon capture. Tango dancer. See more at flaviucipcigan.com. Opinions my own.
Super interesting application of program search

Goals are mapped to programs which are embedded in a latent space.

A fitness metric is assigned to each program, and program search is used to synthesise new human-like goals.
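The loop behind this kind of program search can be sketched generically. Everything below (the toy numeric "programs", the fitness and mutation functions, the population sizes) is illustrative, not the paper's actual setup:

```python
import random

random.seed(0)  # for reproducibility of the toy run

def search(population, fitness, mutate, generations=10, keep=5):
    """Generic evolutionary program search: score candidates with a
    fitness metric, keep the best, and mutate them to propose new ones."""
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:keep]
        population = parents + [
            mutate(random.choice(parents))
            for _ in range(len(population) - keep)
        ]
    return max(population, key=fitness)

# Toy stand-in: "programs" are numbers, fitness prefers values near 42.
best = search(
    population=[random.uniform(0, 100) for _ in range(20)],
    fitness=lambda x: -abs(x - 42),
    mutate=lambda x: x + random.gauss(0, 1),
)
```

In the paper's setting, the candidates would be goal programs embedded in a latent space rather than bare numbers.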
February 22, 2025 at 11:53 AM
Wanna try to guess which of those gets parsed as a string and which as a number? Answer in alt text.

YAML parsing in Python is weird.
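The trap comes from YAML's scalar resolution rules. Here's a simplified, illustrative sketch of how a YAML 1.1-style resolver (as in PyYAML's defaults) classifies a plain scalar; the real tables also cover booleans like `no`, octals, and sexagesimals:

```python
import re

# Simplified YAML 1.1-style scalar resolution (illustrative regexes,
# not PyYAML's exact ones).
INT_RE = re.compile(r"^[-+]?\d+$")
FLOAT_RE = re.compile(r"^[-+]?(\d+\.\d*|\.\d+)([eE][-+]?\d+)?$")

def resolve(scalar: str) -> str:
    """Classify an unquoted YAML scalar as int, float, or str."""
    if INT_RE.match(scalar):
        return "int"
    if FLOAT_RE.match(scalar):
        return "float"
    return "str"

# The classic version-number trap:
print(resolve("1.2"))    # float: a two-part version silently becomes a number
print(resolve("1.2.3"))  # str: only the three-part version stays a string
```

Quoting the value (`"1.2"`) is the usual fix.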
February 17, 2025 at 4:49 PM
Interesting idea to generate responses using diffusion rather than left-to-right auto-regressive models
February 17, 2025 at 12:31 PM
Supercomputers - large computer clusters - allow you to work a number of years ahead.

Creating the GUI at PARC seemed like a "waste of FLOPs" but revolutionized computing.
February 15, 2025 at 12:56 PM
Neat idea! Fine-tuning using majority voting and length filtering generalises a model's capabilities.

Models generalise to slightly harder versions of a problem, and the correct answers are used to bootstrap the next model and the next one and so on.
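A minimal sketch of the data-selection step, assuming the method works roughly as described (the function name and length threshold are mine, not the paper's):

```python
from collections import Counter

def select_finetune_examples(samples, max_len=2048):
    """Keep sampled solutions whose final answer matches the majority
    vote, then filter out overly long chains of thought.

    `samples` is a list of (chain_of_thought, final_answer) pairs
    sampled for one problem."""
    if not samples:
        return []
    majority, _ = Counter(ans for _, ans in samples).most_common(1)[0]
    return [(cot, ans) for cot, ans in samples
            if ans == majority and len(cot) <= max_len]

# The surviving pairs become fine-tuning data for the next model.
picked = select_finetune_examples(
    [("short derivation", "42"),
     ("a much longer derivation ...", "42"),
     ("a wrong turn", "41")],
    max_len=100,
)
```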
February 13, 2025 at 1:17 PM
Turning the temperature up using R1

Starting to think

gibberish gibberish gibberish

Focus again. Calm up.

🤣
January 25, 2025 at 6:44 PM
Hm, using reasoning models really feels qualitatively different (using @openrouter.bsky.social for inference).

It's fun to see these aha moments and it'd be interesting to understand whether their presence helps.
January 25, 2025 at 12:26 PM
R1 rewards the model when it uses the correct thinking tags.

At least in that case, it looks like <thinking> is simply a consequence of RL.
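A minimal sketch of such a format reward, using a regex check over the output (the tag names and reward values here are illustrative, not R1's exact recipe):

```python
import re

# Reward the model only when reasoning sits inside the expected tags.
# Tag names and the 1.0/0.0 values are illustrative assumptions.
FORMAT_RE = re.compile(
    r"^<think>.*?</think>\s*<answer>.*?</answer>\s*$", re.DOTALL
)

def format_reward(output: str) -> float:
    """1.0 if the completion follows the thinking-tag format, else 0.0."""
    return 1.0 if FORMAT_RE.match(output) else 0.0

format_reward("<think>2 + 2 = 4</think><answer>4</answer>")  # 1.0
format_reward("the answer is 4")                             # 0.0
```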
January 22, 2025 at 4:34 PM
Hm, seems that a system prompt telling Claude to "think before answering" is what creates the extra chains of thought.
January 22, 2025 at 4:14 PM
Huh, interesting, Claude 3.5 Sonnet seems to do hidden CoT in the app.

Could not reproduce with the API tho.
January 22, 2025 at 4:03 PM
1.5B distilled model beats 4o and Claude 3.5 Sonnet on hard math problems
January 20, 2025 at 6:33 PM
If the benchmarks reproduce, it seems that 7B models distilled from R1 outputs beat 4o, while even a 1.5B model gets close.

No wonder OAI keeps its chains of thought private!
January 20, 2025 at 2:25 PM
These "wait, let's go back" moments are also emergent.
January 20, 2025 at 2:11 PM
The chain of thought gets longer and longer the further the RL algo runs.
January 20, 2025 at 2:11 PM
Just doing RL with a reward related to the correctness signal given by a verifier seems to be enough.

Another example of "the model just wants to learn". No need for fancy search - looks like the model will learn the right algo in the chain of thought.
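A minimal sketch of a verifier-based reward, with an exact-match checker standing in for a real math verifier or code test runner (the `\boxed{}` convention and all names here are my assumptions, not the actual pipeline):

```python
import re

def extract_answer(completion: str):
    """Pull the final answer out of a completion. The \\boxed{}
    convention is just one common choice for math problems."""
    m = re.search(r"\\boxed\{([^}]*)\}", completion)
    return m.group(1) if m else None

def correctness_reward(completion: str, ground_truth: str) -> float:
    """Binary reward from a simple exact-match verifier. Real setups
    would use a symbolic math checker, or run unit tests for code."""
    return 1.0 if extract_answer(completion) == ground_truth else 0.0

correctness_reward(r"... so the answer is \boxed{17}", "17")  # 1.0
correctness_reward("no idea, maybe 17?", "17")                # 0.0
```

The RL loop then just maximises this reward over sampled completions; per the post, that signal alone seems to be enough.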
January 20, 2025 at 2:11 PM
Interesting result re evolutionary algos for inference time search
January 20, 2025 at 11:45 AM
NVIDIA's recent Project Digits is in the same category, with 1 FP4 PFLOPS (0.25 FP16 PFLOPS) and 128 GB unified memory at - hopefully - $3000.

Another important feature is NCCL, NVIDIA's library for fast inter-GPU communication. Thus, multiple boxes can be used together for inference.
January 7, 2025 at 2:50 PM
A good future is one with AIs on every desk.

If that future is to come, we need to catalyse a similar community and similar machines.

The first machine I heard about in this category was Tinybox.

The smallest has 0.7 FP16 PFLOPS and 144 GB GPU memory at $15k.
January 7, 2025 at 2:50 PM
In 1975, the Altair 8800 was released at about $3000 (inflation adjusted).

It was programmed using individual switches and its display was a bunch of lights on the front panel.

Nonetheless, the price was low enough to start a hobbyist community and catalyse the PC revolution.
January 7, 2025 at 2:50 PM
Comparing vectors of landmarks with a remote database is the first product use of homomorphic encryption I've heard of.

It's a good one!

Privacy-preserving RAG with local LLM and remote documents could be done in a very similar way.
January 5, 2025 at 9:31 PM
Another benchmark where o-class models show a jump compared to GPT-class models (arXiv:2406.04520)

Mystery Blocksworld is a block stacking task where the names are randomised, requiring generalisation.

Still plenty of room to go, but clearly the start of a new S-curve.
January 3, 2025 at 6:51 PM
One of the great things about AI is how accessible research tooling is.

Breakthrough labs (this is from OpenAI) are basically GPUs, Python, monitoring, docs, and a chat app.

Even if this post was partly in jest, this is a point of joy. We should make sure the culture of openness continues.
December 27, 2024 at 12:37 PM
One thing I'd wish to know more about is the type of RL used.

My intuition is that it rhymes with MaxEnt RL, with code and math verifiers. OREO, for instance, has similar scaling curves.
December 24, 2024 at 4:44 PM
Nice 5x to 15.6x speed-up of equivariant operations from NVIDIA.
November 20, 2024 at 10:52 AM
This is a super interesting plot.

ML conventional wisdom is the bias-variance trade-off.

Here is a neural net with a single hidden layer. At first, bias decreases and variance increases.

As you train for longer, you get a phase transition and then *both* decrease.
November 18, 2024 at 7:58 PM