Lightnews — Scholar-powered news

Benjamin Warner

@benjaminwarner.dev

Some personal news: I've joined sophont.med to help build the next generation of open medical foundation models.

We've relaunched medarc.ai, our open science research community. Join us if you want to help advance open medical AI.

And we are hiring.

October 27, 2025 at 7:08 PM

Reposted by Benjamin Warner

Tim Kellogg

@timkellogg.me

counterpoint: GPT-5 does this, it says it doesn’t know rather than hallucinate, the world hasn’t fallen apart

The Conversation UK @uk.theconversation.com · Sep 13

The cure is likely to be worse than the disease.

Why OpenAI’s solution to AI hallucinations would kill ChatGPT tomorrow

The cure is likely to be worse than the disease.

tcnv.link

September 13, 2025 at 11:50 AM

Reposted by Benjamin Warner

Tom Aarsen

@tomaarsen.com

ModernBERT goes MULTILINGUAL!

One of the most requested models I've seen, @jhuclsp.bsky.social has trained state-of-the-art massively multilingual encoders using the ModernBERT architecture: mmBERT.

Stronger than an existing models at their sizes, while also much faster!

Details in 🧵

September 9, 2025 at 2:54 PM

Benjamin Warner

@benjaminwarner.dev

ChatGPT has been the best technical search engine since o4-mini.

Thinking Mini still makes for a good faster search if you don’t need the extra reasoning ability.

Simon Willison @simonwillison.net · Sep 6

The previously sensible advice to never use ChatGPT for search needs to be rethought - GPT-5 in thinking mode is shockingly good at running searches now simonwillison.net/2025/Sep/6/r...

GPT-5 Thinking in ChatGPT (aka Research Goblin) is shockingly good at search

“Don’t use chatbots as search engines” was great advice for several years... until it wasn’t. I wrote about how good OpenAI’s o3 was at using its Bing-backed search tool back …

simonwillison.net

September 6, 2025 at 10:19 PM

Benjamin Warner

@benjaminwarner.dev

Good LLMs do know and/or can reason about these things. Small, cheap, and often free LLMs are the models which cannot.

Remember the glue on pizza Reddit post that the subpar Google AI cited uncritically? Bing’s then integration of GPT 3.5 recognized the Reddit post as sarcasm.

Mark @mbainter.bsky.social · Aug 24

LLMs don't reason, so it doesnt know that the reddit post is likely an uncritical repost of the other articles. It doesn't follow the discussion to see it is (probably) debunked in the comments. All things a marginally educated human would do. AI summary raises that bar, because of ai hype.

August 24, 2025 at 9:53 PM

Reposted by Benjamin Warner

Sung Kim

@sungkim.bsky.social

Writing Speed-of-Light Flash Attention for 5090 in CUDA C++ by Thien Tran

He walkthrough how he learned to implement Flash Attention for 5090 in CUDA C++. The main objective is to learn writing attention in CUDA C++,

August 24, 2025 at 12:45 AM

Reposted by Benjamin Warner

Eugene Vinitsky 🍒

@eugenevinitsky.bsky.social

Microsoft made a useful LLM copilot tool that could summarize text in spreadsheets. They provided clear instructions about how to use it and not to use it. In response, journalists are now mocking them for doing exactly the right thing and showing how to use and not use the tools.

August 21, 2025 at 1:50 AM

Benjamin Warner

@benjaminwarner.dev

Reports of AI eating entry level jobs are greatly exaggerated.

My guess is current and near-future LLMs are more likely to increase the demand for programmers, not decrease demand (Jevons Paradox).

John Burn-Murdoch @jburnmurdoch.ft.com · Jul 18

But, plot twist:

The much-discussed contraction in entry-level tech hiring appears to have *reversed* in recent months.

In fact, relative to the pre-generative AI era, recent grads have secured coding jobs at the same rate as they’ve found any job, if not slightly higher.

July 18, 2025 at 5:06 PM

Benjamin Warner

@benjaminwarner.dev

One of the questions we debated while training ModernBERT was whether a modern trained encoder would unlock zero-shot reasoning using only it's generative head?

Spoilers: the answer is yes.

$from transformers import pipeline model_name = "answerdotai/ModernBERT-Large-Instruct" fill_mask = pipeline("fill-mask", model=model_name, tokenizer=model_name) text = """You will be given a question and options. Select the right answer. QUESTION: If (G, .) is a group such that (ab)^-1 = a^-1b^-1, for all a, b in G, then G is a/an CHOICES: - A: commutative semi group - B: abelian group - C: non-abelian group - D: None of these ANSWER: [unused0] [MASK]""" results = fill_mask(text) answer = results[0]["token_str"].strip() print(f"Predicted answer: {answer}") # Answer: B$

February 10, 2025 at 6:13 PM

Reposted by Benjamin Warner

Simon Willison

@simonwillison.net

o3-mini is really good at writing internal documentation - feed it a codebase, get back a detailed explanation of how specific aspects of it work simonwillison.net/2025/Feb/5/o...

o3-mini is really good at writing internal documentation

I wanted to refresh my knowledge of how the Datasette permissions system works today. I already have [extensive hand-written documentation](https://docs.datasette.io/en/latest/authentication.html) for...

simonwillison.net

February 5, 2025 at 6:09 AM

Reposted by Benjamin Warner

Maria Antoniak

@mariaa.bsky.social

If you want to quickly catch up on all the open modeling things (DeepSeek, ModernBERT, etc.), this was a great overview, by @natolambert.bsky.social.

I somehow got into an argument last week with someone who was insisting that all models are industrial blackboxes... and I wish I'd had this on hand.

The latest open artifacts (#6): Reasoning models, China's lead in open-source, and a growing multimodal space

Artifacts log 6 The open LM ecosystem yet again accelerates.

www.interconnects.ai

January 27, 2025 at 3:05 PM

Benjamin Warner

@benjaminwarner.dev

In addition to being the best retrieval model under 300M params on METB (without extra work), and top 10 for under 1B, here's a fun tidbit from Alibaba's GTE ModernBERT model card:

gte-modernbert-base beats gte-qwen1.5-7b on LoCo long context retrieval with 7B less parameters.

January 23, 2025 at 7:22 PM

Reposted by Benjamin Warner

Tom Aarsen

@tomaarsen.com

The newest extremely strong embedding model based on ModernBERT-base is out: `cde-small-v2`. Both faster and stronger than its predecessor, this one tops the MTEB leaderboard for its tiny size!

Details in 🧵

January 14, 2025 at 1:21 PM

Reposted by Benjamin Warner

Antoine Chaffin

@nohtow.bsky.social

ModernBERT-embed-base is awesome because it allows to use ModernBERT-base for various tasks out-of-the-box
But the large variant of ModernBERT is also awesome...
So today, @lightonai.bsky.social is releasing ModernBERT-embed-large, the larger and more capable iteration of ModernBERT-embed!

January 14, 2025 at 3:32 PM

Benjamin Warner

@benjaminwarner.dev

ModernBERT is officially released on Transformers v4.48.0. You no longer need to install from git to use.

If you are plugging ModernBERT into an existing encoder finetuning pipeline, try increasing the learning rate. We've found that ModernBERT tends to prefer a higher LR than older models.

Transformers v4.48.0: ModernBERT, Aria, TimmWrapper, ColPali, Falcon3, Bamba, VitPose, DinoV2 w/ Registers, Emu3, Cohere v2, TextNet, DiffLlama, PixtralLarge, Moonshine

January 10, 2025 at 6:28 PM

Benjamin Warner

@benjaminwarner.dev

The good: 32GB
The bad: $2,000
The Ugly*: PCIe 5 without NVLink

January 7, 2025 at 7:12 AM

Reposted by Benjamin Warner

John West

@johnwest.bsky.social

Via @simonwillison.net's excellent blog, I found this great quote about AI models, from @benjaminwarner.dev et al. www.answer.ai/posts/2024-1...

It seems to me that AI will be most relevant in people's lives because the Honda Civic is ubiquitous, not so much because everyone is driving a Ferrari.

Basically, a frontier model like OpenAI’s O1 is like a Ferrari SF-23. It’s an obvious triumph of engineering, designed to win races, and that’s why we talk about it. But it takes a special pit crew just to change the tires and you can’t buy one for yourself. In contrast, a BERT model is like a Honda Civic. It’s also an engineering triumph, but more subtly, since it is engineered to be affordable, fuel-efficient, reliable, and extremely useful. And that’s why they’re absolutely everywhere.

January 1, 2025 at 4:04 PM

Reposted by Benjamin Warner

Tom Aarsen

@tomaarsen.com

That didn't take long! Nomic AI has finetuned the new ModernBERT-base encoder model into a strong embedding model for search, classification, clustering and more!

Details in 🧵

December 31, 2024 at 3:43 PM

Benjamin Warner

@benjaminwarner.dev

This week we released ModernBERT, the first encoder to reach SOTA on most common benchmarks across language understanding, retrieval, and code, while running twice as fast as DeBERTaV3 on short context and three times faster than NomicBERT & GTE on long context.

December 22, 2024 at 6:12 AM

Reposted by Benjamin Warner

Mark J. Nelson

@mm-jj-nn.bsky.social

Great blog post (by a 15-author team!) on their release of ModernBERT, the continuing relevance of encoder-only models, and how they relate to, say, GPT-4/llama. Accessible enough that I might use this as an undergrad reading.

Finally, a Replacement for BERT: Introducing ModernBERT

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

December 19, 2024 at 7:11 PM

Benjamin Warner

@benjaminwarner.dev

I feel the need for speed.

December 13, 2024 at 9:56 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news