I don't know if timestamps would migrate, but I've seen folks with posts whose timestamps predate the launch of Bluesky.
I think focusing on the raw number of parameters is a less useful frame than thinking about inference speed, cost, and where inference happens (on-device vs cloud).
o3 achieving human-level performance on the semi-private eval feels like a significant breakthrough.
Calibrating, I'd say o3 is a GPT-1 or GPT-2 moment. The direction for improvement is getting clearer, with more of the research fog lifting.
GPT-4 was at Level 1, conversational AI: a model competent at 0.1-1s tasks, like holding a conversation.
o1 / R1 reached Level 2, reasoners: models solving 1-10min tasks such as basic coding and math.
I don't think that's quite right.
Here are two papers that helped me form a more nuanced view of this question.
Antimicrobial peptides are proteins that kill bacteria. Most do so by punching circular holes in the bacterial membrane.
In this fun-to-write paper, we instead showed fractal pores in bacterial membranes.
Antimicrobial resistance is a big threat to public health, and here we show how to combine simulation and deep learning to design new antimicrobial peptides, validated in vivo.
One of the interesting things about that project was how well gradient-boosted trees + chemical fingerprints performed in practice.
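For flavor, here's a minimal sketch of that kind of baseline (my own illustration, not the paper's actual pipeline, assuming RDKit and scikit-learn; molecules and labels are toy placeholders):

```python
# Minimal sketch of a fingerprints + gradient-boosted-trees baseline.
# Assumes RDKit and scikit-learn; the molecules and labels below are toy placeholders.
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from sklearn.ensemble import GradientBoostingClassifier

smiles = ["CCO", "c1ccccc1O", "CC(=O)Nc1ccc(O)cc1", "CCN(CC)CC"]  # toy molecules
labels = [0, 1, 1, 0]                                             # toy activity labels

def fingerprint(smi, radius=2, n_bits=2048):
    """ECFP-style Morgan fingerprint as a dense numpy array."""
    mol = Chem.MolFromSmiles(smi)
    bv = AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
    arr = np.zeros((n_bits,))
    DataStructs.ConvertToNumpyArray(bv, arr)
    return arr

X = np.stack([fingerprint(s) for s in smiles])
y = np.array(labels)

model = GradientBoostingClassifier(n_estimators=100, max_depth=3)
model.fit(X, y)
print(model.predict_proba(X[:1]))  # predicted class probabilities for the first molecule
```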
1/ When building AIs for science, it's important for the algorithms to discover things beyond what we already know. This is why effective, open-ended exploration matters. Here we propose MetaGFN, an algorithm for effectively finding distant modes in probability distributions.
Automated labs coupled with active learning are a super exciting area with lots of opportunities for progress.
I promised @cpaxton.bsky.social a short thread on this, so here it goes!
🧪
We have released three models, based on SMILES, SELFIES, and molecular graphs.
More to come shortly - we aim to have a unified collection of state-of-the-art models across all modalities.
Here are some interesting results and some thoughts on future directions.
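To make the three modalities concrete, a tiny sketch (my illustration, assuming the rdkit and selfies packages) of the same molecule in each representation:

```python
# One molecule (aspirin) in the three representations; assumes rdkit and selfies are installed.
import selfies as sf
from rdkit import Chem

smiles = "CC(=O)Oc1ccccc1C(=O)O"        # aspirin as a SMILES string
selfies_str = sf.encoder(smiles)         # the same molecule encoded as SELFIES
mol = Chem.MolFromSmiles(smiles)         # and as an RDKit molecular graph

print(selfies_str)
print(mol.GetNumAtoms(), "atoms,", mol.GetNumBonds(), "bonds")
# simple edge-list view of the graph
print([(b.GetBeginAtomIdx(), b.GetEndAtomIdx()) for b in mol.GetBonds()])
```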
a new 7B Llama-style LLM for embedding genomes & detecting pathogens in wastewater
I've had a hunch that LLMs could lead to some big bio breakthroughs, since genes & proteins feel a lot like a language.
Creating the GUI at PARC seemed like a "waste of FLOPs" but revolutionized computing.
Alan Kay talked about playing "the Wayne Gretzky game", after the hockey player famous for his quote about skating to where the puck will be.
With each generation, a larger and larger number of parameters can be run locally.
Today, an Apple M4 has 28B transistors, meaning I've experienced a scale-up of 1,000,000x in my lifetime.
I expect a similar scale-up for language models.
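As a rough sanity check on that number (my assumption: an early CPU like the Intel 8086, with roughly 29,000 transistors, as the starting point):

```python
# Back-of-the-envelope check of the ~1,000,000x transistor scale-up.
# Starting point is my assumption (Intel 8086, ~29,000 transistors, 1978), not stated in the post.
early_cpu_transistors = 29_000
apple_m4_transistors = 28_000_000_000
print(apple_m4_transistors / early_cpu_transistors)  # ~9.7e5, i.e. roughly a million-fold
```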