Lightnews — Scholar-powered news

Reposted by Simon

Ted Underwood

@tedunderwood.com

This is an important negative reality. It's also sustained by industry incentives—and the thing most likely to break it is a big company or two deciding it's bad business to network on a competitor's site.

I am a fan of banning mean bsky account FlameTroll420, but that's not the tipping point.

John Herrman @jwherrman.bsky.social · 10d

Wrote about an obvious and yet profoundly underappreciated aspect of the AI boom: its total narrative capture by Elon Musk's X nymag.com/intelligence...

The reality of the AI industry is far larger than a subculture on X with hundreds of billions of dollars flowing into physical infrastructure, reshaping jobs, and deploying into products used by billions of people. But that makes X's capture of the AI conversation all the more notable. Much as Twitter long ago captured, amplified, and distorted elite conversation in media, in sports, in parts of finance, in left politics, and more recently in right politics, the world's understanding of what's going on in Silicon Valley right now - the hype and the doom and the bubble and the progress - is first processed through the strange culture and incentives of X, which is now owned by estranged OpenAI co-founder, Google antagonist, and xAI founder Elon Musk. It has been merged into xAI, where the in-house chatbot occasionally confuses itself with Hitler. When the history of this period in tech is eventually written, it will be composed of, among other things, a whole lot of threads and replies from blue checks on X.

Then, of course, there's the Musk factor. The fact of his ownership creates obvious risks for all of his competitors, who rely on his platform to supply the vibes they need to help keep their companies, and the AI boom in general, going. It's worth remembering that, back before he decided to purchase it and let it drive him nuts, Twitter was, to Musk, a platform with clear functional utility, a social network that helped make him wildly famous and use that fame to convince thousands of retail traders to buy into Tesla, the company that made him the richest man in the world.
Even if we assume, against all reason, that Musk won't leverage this unusual sort of power over all the competitors gathered on his platform, X's influence over them and how they're seen by the rest of the world is undeniable. Along with "AI Twitter," some of the other communities thriving on Musk-era X include MAGA guys and crypto guys, meaning that the background vibe of the primary gathering place for the industry that's trying to remake the world is quite a bit more conservative and mercenary than it otherwise might have been, and while the primary factor in the public's deep skepticism of where AI is going is clearly fear about labor, association with polarizing Muskian politics probably doesn't help assuage ambient fears about job loss and surveillance. (One important consequence of having the AI elite gathered on X is that their media diets are now substantially made up of one another's posts, effusive and dismissive responses to their own posts, and standard "For You" page Musk-era algo-slop. It's hard to overstate how thoroughly their information diets are influenced by their timelines, and you can get a pretty good idea of what those look like by jacking into AI Twitter yourself.)

January 5, 2026 at 8:18 PM

Reposted by Simon

Xe

@xeiaso.net

Do I know anyone on the Bing team? I'm noticing abusive traffic that ignores robots.txt.

December 17, 2025 at 2:29 PM

Reposted by Simon

Yogi Jaeger

@yoginho.spore.social.ap.brid.gy

So ... we're doing this thing, and we want to do it with you:

#theperspectivestudio - A Collaborative Practice for a Fragmented World

with Marcus Neustetter

You can come to the studio — or we can bring the studio to you!

Just published on Andrea Hiott's […]

[Original post on spore.social]

Drawing by Marcus Neustetter, showing a world being co-created by lots of little people on a scaffold.

December 14, 2025 at 8:53 PM

Reposted by Simon

Simon Willison

@simon.fedi.simonwillison.net.ap.brid.gy

OpenAI aren't talking about it yet, but it turns out they've adopted Anthropic's brilliant "skills" mechanism in a big way

Skills are now live in both ChatGPT and their Codex CLI tool, I wrote up some detailed notes on how they work so far here: https://simonwillison.net/2025/Dec/12/openai-skills/

OpenAI are quietly adopting skills, now available in ChatGPT and Codex CLI

One of the things that most excited me about Anthropic’s new Skills mechanism back in October is how easy it looked for other platforms to implement. A skill is just …

simonwillison.net

December 12, 2025 at 11:45 PM

Reposted by Simon

Tim Kellogg

@timkellogg.me

in the parent circles, there’s horror stories of kitchen remodels getting delayed due to ICE raids, so i anticipate NIMBYs to come out as anti-ICE soon

December 11, 2025 at 12:40 AM

Reposted by Simon

Steve Herman

@w7voa.journa.host.ap.brid.gy

CBP is proposing that travelers to the US from such countries as France, Germany, South Korea and the UK submit extensive personal info, including social media histories, email addresses used in the past 10 years and parents’ birthplaces. https://public-inspection.federalregister.gov/2025-22461.pdf

December 10, 2025 at 1:53 AM

Simon

@spoltier.qoto.org.ap.brid.gy

Benebelt (fogged in)

Night scene of a curving archway topped by a light sign: BAHNHOF ZÜRICH ENGE. Thick fog diffuses all light.

December 9, 2025 at 9:58 PM

Simon

@spoltier.qoto.org.ap.brid.gy

Reading ~~classical liberal~~ (neoclassical reactionary) writing about ❄️🍑, I'm realizing something: just like polished English used to be evidence of quality of content (now subverted by LLMs), being halfway literate used to be a sign of semi-elite status... I don't think J.S. Mill et al […]

Original post on qoto.org

qoto.org

December 7, 2025 at 12:38 PM

Reposted by Simon

Tim Kellogg

@timkellogg.me

in the words of Gemini 3:

“It is basically a Frankenstein monster combining a CNN (Convolutional Neural Network) and a Transformer, organized like a mammalian brain”

0.5B, SYNTH

huggingface.co/mkurman/Neur...

A wide, multi-stage neural-network diagram divided into three vertical blocks labeled “Stage 1: Sensory Layers,” “Stage 2: Associative Layers,” and “Stage 3: Motor Layers.” On the far left, a column titled “Inputs” shows “Input Token IDs” feeding into an “Embedding Lookup,” then “Initial Hidden States.” A dotted green line labeled “Saved Residual ‘R’ (Identity Path)” branches downward and later feeds into bridges.

Stage 1 (light blue panel) contains a boxed stack labeled “Repeating Alternating Pair (N_sensory/2).” Inside the stack, a purple box reads “Decoder Layer (Attn w/ Toggled RoPE + MLP)” above an orange box labeled “Causal Conv2D Block (Dilated Convolution).” Arrows show “Initial Hidden States” entering the stack and “Sensory Output” exiting.

Stage 2 (lavender panel) is structurally identical: “Repeating Alternating Pair (N_assoc/2)” with the same purple Decoder Layer and orange Causal Conv2D Block. Arrows denote “Associative Input” entering and “Associative Output” leaving.

Stage 3 (light turquoise panel) repeats the same structure, labeled “Repeating Alternating Pair (N_motor/2),” with “Motor Input” entering and “Motor Output” exiting.

Below these stages is a green zone with two submodules: “Bridge 1” on the left and “Bridge 2” on the right. Bridge 1 contains a box “SiLU(+R)” then “RMSNorm,” feeding into a green circle marked “( + )” indicating addition. Bridge 2 mirrors this but with “SiLU(–R) (Negated)” before RMSNorm, also feeding into an addition circle. Dotted green arrows show flow from the residual path and from each stage into the bridges, then up into subsequent stages.

On the far right, a vertical “Output Head” stack includes “Final RMSNorm,” then “Linear Head (Vocab Size),” then “Softmax / Logits,” producing the final output.

December 3, 2025 at 5:11 AM

Reposted by Simon

Simon Willison

@simon.fedi.simonwillison.net.ap.brid.gy

Four new models from Mistral today - all Apache 2 licensed, all vision-capable, and one of them is a 3GB model that can run in a web browser and answer questions about things it can see through the webcam! https://simonwillison.net/2025/Dec/2/introducing-mistral-3/

Introducing Mistral 3

Four new models from Mistral today: three in their "Ministral" smaller model series (14B, 8B, and 3B) and a new Mistral Large 3 MoE model with 675B parameters, 41B active. …

simonwillison.net

December 2, 2025 at 5:33 PM

Reposted by Simon

Karsten Schmidt

@toxi.mastodon.thi.ng.ap.brid.gy

Hierarchies 😩... One of the biggest recurring time-consuming issues I sometimes encounter is making decisions about _where_ to put some (new or existing) code/feature, i.e. in which package, new or existing, considering: functional fit (topic), structural fit (pre-existing data format […]

Original post on mastodon.thi.ng

mastodon.thi.ng

November 30, 2025 at 4:24 PM

Simon

@spoltier.qoto.org.ap.brid.gy

Wrote about using #googlejules to migrate an #rstats test suite to #testthat

https://www.linkedin.com/posts/mirai-solutions-gmbh_from-runit-to-testthat-with-coding-agent-activity-7399043585066655744-yRX6

#aiagents

From RUnit to testthat with Coding Agent Support | Mirai Solutions GmbH

We present an interesting case study: we migrated the test suite of our R package XLConnect from RUnit to testthat using AI-powered coding agents. We used Google Jules, an asynchronous coding agent, to handle the repetitive, multi-file refactoring work. The process wasn't just about automation; it required careful context preparation, environment setup, and iterative prompt engineering. Key takeaways:
• AI agents excel at tedious, semantics-aware tasks (like fixing expect_equal argument order across dozens of files)
• Managed VM environments reduce risks while maintaining utility
• Combining different AI tools (Jules + Aider) provided robust validation through "third-party" code review
This results in a faithfully migrated test suite with equivalent coverage and behavior, achieved faster and with more confidence. Read about our process, challenges, and solutions in our news post. https://lnkd.in/d-QUGDJ4 #rstats #SoftwareDevelopment #AIEngineering #TestAutomation #OpenSource

www.linkedin.com

November 28, 2025 at 11:38 AM

Reposted by Simon

Tim Kellogg

@timkellogg.me

community note: using cost on the y axis makes it appear like cheaper models are more capable on pass@3

November 25, 2025 at 1:59 PM

Reposted by Simon

Tim Kellogg

@timkellogg.me

he’s nice even when he’s trashing someone

Andrej Karpathy @karpathy (X.com)
your post challenged me. every one of your points is wrong but i had to think about each for a while :)

November 22, 2025 at 11:23 PM

Reposted by Simon

Tim Kellogg

@timkellogg.me

Evolutionary Algorithms for optimizing LLM weights

Gradient descent and backpropagation have a lot of problems, alignment becomes a nightmare. Evolutionary algos fix this, but they don’t scale

A recent paper, EGGROLL, makes it computationally feasible to do now

www.alphaxiv.org/abs/2511.16652

An infographic titled “Evolution Strategies (ES): From Simple Ideas to Scalable AI (A 2017 Perspective & Beyond)” divided into four numbered sections, each with illustrations and concise explanations.

⸻

1. What are Evolution Strategies (ES)? (Hierarchy & Concept)

A blue panel shows how ES fits inside the broader family of Evolutionary Algorithms (EA), which follow the cycle: Population → Mutation → Selection → Repeat.
A diagram illustrates the ES loop:
• Parent Weights (Population) →
• Mutated Children (+Noise) →
• Evaluate (Fitness/Reward) →
• Update Parent (toward the average of best).

Caption: “ES is a type of EA, often used for parameter search in continuous spaces.”
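The loop described in this panel (parent weights, noisy children, evaluation, update toward the average of the best) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method from the linked paper: the function names, the toy fitness function, and all hyperparameters (`sigma`, `population`, `elite`, `generations`) are hypothetical choices made for this example, and the parent is moved all the way to the average of the best children instead of taking a partial step.

```python
import random


def evolution_strategy(fitness, parent, sigma=0.1, population=50,
                       elite=10, generations=200, seed=0):
    """Toy evolution strategy: mutate, evaluate, average the best."""
    rng = random.Random(seed)
    parent = list(parent)
    for _ in range(generations):
        # Mutated children: copies of the parent plus Gaussian noise.
        children = [
            [w + rng.gauss(0.0, sigma) for w in parent]
            for _ in range(population)
        ]
        # Evaluate every child; higher fitness is better.
        children.sort(key=fitness, reverse=True)
        best = children[:elite]
        # Update parent: here, replace it with the average of the best children.
        parent = [sum(ws) / elite for ws in zip(*best)]
    return parent


def toy_fitness(weights):
    # Hypothetical objective with a single peak at (3, -2).
    target = (3.0, -2.0)
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


result = evolution_strategy(toy_fitness, [0.0, 0.0])
print(result)
```

No gradient of `toy_fitness` is ever computed; the only feedback is the ranking of children by their scores, which is what the "no gradients" claim in the infographic refers to.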

⸻

2. The “Foggy Mountain” Metaphor: ES vs. Gradient Descent

Two side-by-side mountain scenes:

Left: Gradient Descent (Backprop)
A hiker calculates the slope with a magnifying glass. Text:
“Feels the slope, takes precise steps downhill. Efficient, but mathematically complex to implement.”

Right: Evolution Strategy (Random Exploration)
Multiple clones of a character fan out with +2, –5, +4, –2, +8 labels.
“Spawns clones, moves randomly, evaluates all, moves parent to average of the best. No gradients, simple, robust.”

⸻

3. The 2017 Breakthrough: Surprising Scalability

A performance graph shows ES scalability as a straight rising line, compared to Complex RL with a slowly rising curve.
Icons at the bottom depict Deep Neural Networks, Atari Games, and MuJoCo Robots.
Summary notes:
• Simple ES could train deep networks for complex RL tasks.
• Highly scalable (linear speedup with more CPUs).
• Robust and required fewer hyperparameter tweaks.

⸻

4. ES and Modern LLMs: Context & Application

Left (Pre-training):
A stack of “Trillions of Tokens” next to a GPT-4-like brain with a big red X over ES and a green check mark over Gradient Descent (Backprop).
“Too slow for massive supervised pre-training. Backprop is vastly more efficient.”

Right (Fine-tuning / RLHF):

A wide, stylized infographic titled “The Renaissance of Evolution Strategies (ES) for LLMs: A 2017–2025 Timeline.” It is divided into three vertical panels labeled ACT 1, ACT 2, and ACT 3, each representing a stage in the evolution of ES research.

⸻

ACT 1: The 2017 Foundation (The Simple Promise)

A light-blue panel depicting early ES work.
• 2017 marker at top.
• Illustration of small neural networks (millions of parameters).
• Arrow into a cluster of CPUs labeled Linear Scaling with CPUs.
• Text explains: OpenAI proves ES (“random guessing”) scales linearly for moderately complex tasks (e.g., Atari, MuJoCo).
• At the bottom, a group of small figures push a giant boulder uphill, symbolizing early effort.

⸻

ACT 2: The Gradient Crisis in LLM Alignment (2020–2024)

A red-orange panel showing difficulties with large-scale RLHF and gradient-based alignment.
• Illustration of a brain containing massive LLMs with billions to trillions of parameters.
• Pre-training (Facts) shown as a smooth arrow—works fine.
• Alignment/RLHF (Helpful, Harmless, Honest) depicted as a turbulent red wave with icons of flame, warning signs, and jagged charts.
• Problems listed:
  • Brittleness & Instability: Exploding/vanishing gradients, PPO issues.
  • Memory Hog: Massive computation graphs overwhelm GPUs; limits batch sizes.
  • Complexity: Tuning hyper-sensitive parameters is very difficult.
• Graphics include overheating servers and chaotic line charts.

⸻

ACT 3: The 2025 Breakthrough (Why ES Works Now)

A bright green panel with futuristic visuals.
• 2025 marker with a citation: alphaXiv:2511.16652.
• Title: ES Viable for LLMs through Technical Leaps.

Leap 1: PEFT + ES (Shrinking the Mountain)
• A glowing cube labeled LLM, showing:
  • 99% Frozen Weights
  • 1% Active (PEFT/LoRA)
• Explanation: ES applied only to small adapter layers, shrinking search space from billions to millions.
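The "shrinking the search space" idea can be made concrete with a little arithmetic. The sketch below assumes a standard LoRA-style adapter, where the update to a weight matrix is factored into two low-rank matrices; the hidden size and rank are hypothetical values picked for illustration, so the resulting fraction will not match the 1% figure shown in the graphic.

```python
def full_params(d_in, d_out):
    # Parameters in a dense d_out x d_in weight matrix.
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    # A LoRA-style adapter trains B (d_out x rank) and A (rank x d_in)
    # while the original matrix stays frozen.
    return rank * (d_in + d_out)


d = 4096  # hypothetical hidden size
r = 8     # hypothetical adapter rank

full = full_params(d, d)
lora = lora_params(d, d, r)
fraction = lora / full

print(f"full: {full:,}  adapter: {lora:,}  fraction: {fraction:.4%}")
```

An evolution strategy that perturbs only the adapter weights has to search over `lora` dimensions per matrix instead of `full`, which is the reduction the panel describes.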

Leap 2: Structured Exploration (Smart Shoving)

November 23, 2025 at 2:19 AM

Reposted by Simon

✧✦Catherine✦✧

@whitequark.mastodon.social.ap.brid.gy

i'm delighted to be able to host academic content at https://grebedoc.dev!

(yes, you can push 500 MB of slides and stuff as a single site to it. yes, i will gladly host it! no, it will not cost me any remotely meaningful amount of money, push at your leisure)

Grebedoc — static site hosting for git forges

grebedoc.dev

November 17, 2025 at 5:41 AM

Reposted by Simon

Tim Kellogg

@timkellogg.me

the ironic part about immigrants is they’re not lazy, the lazy ones didn’t have enough agency to move to a different country

immigration is as close to a filter for high performing individuals as you’re going to get

November 16, 2025 at 6:35 PM

Reposted by Simon

Simon Willison

@simon.fedi.simonwillison.net.ap.brid.gy

Some notes on GPT-5.1, which is now available in the OpenAI API

The new reasoning options are interesting, but the pelican feels like a bit of a regression from GPT-5 https://simonwillison.net/2025/Nov/13/gpt-51/

Introducing GPT-5.1 for developers

OpenAI announced GPT-5.1 yesterday, calling it a smarter, more conversational ChatGPT. Today they've added it to their API. We actually got four new models today: gpt-5.1, gpt-5.1-chat-latest, gpt-5.1-codex, gpt-5.1-codex-mini. There …

simonwillison.net

November 14, 2025 at 12:10 AM

Reposted by Simon

Ted Underwood

@tedunderwood.com

I find AI does accelerate solving complex problems, so you can get back to your to-do list.

Unfortunately, I love being immersed in long complex problems, and hate managing my top-level to-do list. So I am once again begging tech companies to make us an AI Project Manager.

November 12, 2025 at 2:26 PM

Reposted by Simon

Tim Kellogg

@timkellogg.me

overheard: “they’re on twitter, instagram, X.. i don’t even know what X is, what is X?”

November 5, 2025 at 2:11 PM

Simon

@spoltier.qoto.org.ap.brid.gy

Spooky jog before dawn

Dark forest with the full moon shining through thick fog. A leaf-covered path is faintly visible in the foreground. A few close trees are clearly visible, most others fade off.

November 6, 2025 at 3:41 PM

Reposted by Simon

Simon Willison

@simon.fedi.simonwillison.net.ap.brid.gy

And it's not just Cursor... rival agentic coding IDE Windsurf announced their own custom RL-trained fast coding model today as well!

Here are notes and a pelican on Windsurf's new SWE-1.5 model https://simonwillison.net/2025/Oct/29/swe-15/

Introducing SWE-1.5: Our Fast Agent Model

Here's the second fast coding model released by a coding agent IDE in the same day - the first was Composer-1 by Cursor. This time it's Windsurf releasing SWE-1.5: Today …

simonwillison.net

October 30, 2025 at 12:06 AM

Reposted by Simon

Cameron

@cameron.stream

I will be in Berlin on December 10th giving talks.

I'm looking for other places to give talks in Europe around the same time.

Please reach out if you know of a spot for me to speak, happy to jointly sponsor with Letta.

Preferences for London, Paris, Amsterdam, etc.

October 29, 2025 at 4:55 PM

Reposted by Simon

daniel:// stenberg://

@bagder.mastodon.social.ap.brid.gy

The other day we had our first ever chained AI tool success on the #curl factory floor:

- tool A found a possible flaw in code and reported it.

- using the plain English description from tool A, tool B could create a reproducible by itself that verified the finding

The sense of magic is […]

Original post on mastodon.social

mastodon.social

October 29, 2025 at 7:52 AM