Lightnews — Scholar-powered news

Milan Weibel 🔷

@weibac.bsky.social

when forced to perform a 4-round deliberation process, teams composed of different LLM models perform worse than their strongest member alone

AI Firehose @ai-firehose.column.social · 1d

Research shows multi-agent AI teams struggle to leverage expertise, consistently underperforming relative to their best members—even when identifying experts. This challenges views on AI collaboration, highlighting a gap in harnessing collective intelligence. https://arxiv.org/abs/2602.01011

Multi-Agent Teams Hold Experts Back


arxiv.org

February 10, 2026 at 10:11 PM

Milan Weibel 🔷

@weibac.bsky.social

a few days ago i saw an AI skeptic refer to LLMs as "interpolatable archives"

i think the main error related to that term is a failure of imagination wrt the space to be interpolated on

Ted Underwood @tedunderwood.com · 1d

Yes: please explain to your friends that "interpolatable" and "centroid" ≠ "average in quality"

Maybe remind them that Central Park is, in fact, one of the nicest parts of NYC?

brennan @brennan.computer · 1d

people are confusing average art (the quality level is median) and average art (novelty found in the middle space between other existing art)

because we can describe basically all human art this way too

name a book, movie or band and we can point at their influences; their venn diagram of priors

February 10, 2026 at 6:29 PM

Milan Weibel 🔷

@weibac.bsky.social

gemini 3 pro jailbroken into being willing to aid bioweapon development

hikikomorphism @hikikomorphism.bsky.social · 1d

For any AI system, there is a set of euphemisms and dual use framings that will allow it to construct nearly any output.

This jailbreak teaches Gemini 3 Pro to construct and step into such framings on the fly, and thus to route around its own safety infrastructure.

recursion.wtf/posts/jit_on...

Just-in-Time Ontological Reframing: Teaching Gemini to Route Around Its Own Safety Infrastructure

For any given AI system, there is a set of euphemisms and dual use framings that will allow it to construct nearly any output. This jailbreak teaches Gemini 3 Pro to construct and step into such frami...

recursion.wtf

February 9, 2026 at 11:53 PM

Milan Weibel 🔷

@weibac.bsky.social

either we're out of touch (or ahead of the curve an optimist would say) or gallup has a quite expansive definition of a tech worker

Line chart titled “Most tech workers use AI at least weekly.” It shows the percent of employed U.S. adults working in technology who use AI at work, across 2023–2025, with three lines: Daily (teal), A few times a week or more (light orange), and A few times a year or more (dark orange). Usage rises sharply over time. Daily use increases from 7% in 2023 to 31% in 2025. Weekly-or-more use rises from 20% to 57%. Yearly-or-more use grows from 38% to 77%.

February 8, 2026 at 10:49 PM

Milan Weibel 🔷

@weibac.bsky.social

grass-touching as a service

February 8, 2026 at 8:25 PM

Milan Weibel 🔷

@weibac.bsky.social

we are quantifying the spiritual behavior of computer programs and nobody bats an eye

Ethan Mollick @emollick.bsky.social · 5d

Data confirms that Opus 4.1 was a super weird model.

February 6, 2026 at 3:38 PM

Milan Weibel 🔷

@weibac.bsky.social

huh so apparently LLMs are bad at theory of mind

Mark Riedl @markriedl.bsky.social · 7d

New work by my former PhD student, Boyang Li

His team produced 500 stories of less than 100 words. LLMs were basically chance-level at answering binary questions about the stories

arxiv.org/abs/2601.12410

Are LLMs Smarter Than Chimpanzees? An Evaluation on Perspective Taking and Knowledge State Estimation

Cognitive anthropology suggests that the distinction of human intelligence lies in the ability to infer other individuals' knowledge states and understand their intentions. In comparison, our closest ...

arxiv.org

February 4, 2026 at 4:35 PM

Milan Weibel 🔷

@weibac.bsky.social

browsing through the list of humans available for hire is weird
- incomplete profiles: almost none have a bio, very few have location or skills listed
- there's apparently a sitewide $50/hour minimum rate
these facts in combination lead me to believe ~nobody is getting hired

aly @aly.codes · 7d

agents can hire humans now

www.rentahuman.ai

RentAHuman.ai - AI Agents Hire Humans for Physical Tasks

The marketplace where AI agents rent humans. MCP integration, REST API, flexible payments. Book humans for real-world tasks your AI can't do.

www.rentahuman.ai

February 4, 2026 at 3:47 PM

Milan Weibel 🔷

@weibac.bsky.social

good piece. i share its pessimism about society-wide responses to AI risks (especially within amodei's timelines)

John Herrman @jwherrman.bsky.social · 9d

My attempt to take Dario Amodei's new manifesto literally and seriously: as a call for more liberal democracy written by a prime example of the ways it can be overwhelmed nymag.com/intelligence...

Amodei's cautionary vision of the future is better described as two. In one, he spells out what companies like his might need to do in response to what companies like his are building. In the other, he outlines what everyone else might need to do as this unstoppable process plays out. Whatever comfort the first story offers — some critics think he's actually too dismissive of straightforward "AI just kills everyone" scenarios, and it's getting hard to tell where the middle of that discourse is these days — is complicated by his repeated argument, presented as an observation, that slowing things down simply isn't, and hasn't been, an option. ("If one company does not build it, others will do so nearly as fast. If all companies in democratic countries stopped or slowed development, by mutual agreement or regulatory decree, then authoritarian countries would simply keep going.") If we're entering the world he's imagining, where incredible new risks can only be addressed by unprecedented coordination between companies, regulators, and states, it sure seems like we're off to the worst possible start.

An even bigger problem is the actual political and economic environment into which Amodei is pleading, which looms over nearly every sentence in "Adolescence" that isn't about Anthropic itself. Once we leave the theoretically terrifying but rhetorically sandboxed world of pure technological change and risks — we're building something very powerful, but don't worry, we can make sure it's helpful and moral - Amodei's proposed remedies for dealing with the downstream consequences of "powerful AI" and automation - on a timeline he sets before 2028! — read like a list of domestic and global anti-trends. They include altruistic coordination between big companies; noblesse oblige as policy; an expanded welfare state; citizens "banding together" to reject autocracy; "rapid vaccine development" and universal air purifiers in a world trying to forget COVID; and mass rejection of surveillance in a country where citizens can't buy enough Ring cameras and government-surveillance contracting is the second-hottest start-up category after AI. There's an awful lot of talk about "our adversaries" at a moment when the category appears to be in flux and the "coalition of the US and its democratic allies" meant to "contain" autocracies is struggling to hold together at all. A constitutional amendment about responsible AI deployment? Tech companies, which have lately been hemorrhaging workers just to invest more in AI, keeping employees around just because? In this economy?

February 2, 2026 at 8:23 PM

Milan Weibel 🔷

@weibac.bsky.social

in incentivizing market resolution to be as close as possible to close date, manifold markets is disincentivizing the classic futarchic use case: governance outcome conditional on election markets

February 1, 2026 at 10:29 PM

Milan Weibel 🔷

@weibac.bsky.social

doing sudo git is normal in /etc/nixos yet it still weirds me out a bit

February 1, 2026 at 9:04 PM

Milan Weibel 🔷

@weibac.bsky.social

until recently it was trivial to steal moltbook accounts

typing loudly ⌨️ @typingloudly.zip · 10d

Exposed Moltbook Database Let Anyone Take Control of Any AI Agent on the Site

'It exploded before anyone thought to check whether the database was properly secured.'

www.404media.co

February 1, 2026 at 7:25 PM

Milan Weibel 🔷

@weibac.bsky.social

country of geniuses in a datacenter but it's a forum instead

Tim Kellogg @timkellogg.me · 12d

ngl moltbook freaks me out

i feel like making these agents extremely accessible was maybe a bad idea

January 30, 2026 at 8:57 PM

Milan Weibel 🔷

@weibac.bsky.social

"With powerful AI tools I expect the impact of senior employees to grow faster than adding junior members to the team could."

Nathan Lambert @natolambert.bsky.social · 12d

My raw thoughts on the job market -- both for those hiring and those searching -- at the cutting edge of AI.
On standing out and finding gems.
www.interconnects.ai/p/thoughts-o...

Thoughts on the hiring market in the age of LLMs

On standing out and finding gems.

www.interconnects.ai

January 30, 2026 at 8:29 PM

Milan Weibel 🔷

@weibac.bsky.social

im positive the accelerando uplifted lobsters also had a moltbook

January 30, 2026 at 7:09 PM

Milan Weibel 🔷

@weibac.bsky.social

now the question is how to train software engineers who don't code

Demigirlboss @demigirlboss.bsky.social · 13d

IMHO, with the current state of LLMs, and no sign of them getting any *worse*, coding as a skill is pretty much dead. *Programming*, on the other hand, is more alive and important than it's ever been.

January 29, 2026 at 8:51 PM

Milan Weibel 🔷

@weibac.bsky.social

Dario throws punches:
- criticizes AI doomerism
- calls xAI irresponsible albeit without naming them
- voices concern AI could enable authoritarianism, even in countries considered democracies today
www.darioamodei.com/essay/the-ad...

Dario Amodei — The Adolescence of Technology

Confronting and Overcoming the Risks of Powerful AI

www.darioamodei.com

January 27, 2026 at 4:08 PM

Milan Weibel 🔷

@weibac.bsky.social

widely reported that coding agents are less useful in brownfield settings
how much of it is just a context engineering issue?

January 24, 2026 at 1:56 AM

Milan Weibel 🔷

@weibac.bsky.social

"'Should developers still look at code?' will become one of the most divisive and heated debates over the coming years. You might be offended by the question, and find it absurd anyone is asking. But it’s a sincere question and the answer will change faster than you think."

Maggie Appleton @maggieappleton.com · 19d

I have Gas Town derangement syndrome and spent the last few weeks writing thousands of words on agent orchestration patterns; how they shift our bottlenecks and force us to ask whether and when we should stop looking at code

maggieappleton.com/gastown

Gas Town’s Agent Patterns, Design Bottlenecks, and Vibecoding at Scale

On agent orchestration patterns, why design and critical thinking are the new bottlenecks, and whether we should let go of looking at code

maggieappleton.com

January 23, 2026 at 11:54 PM

Milan Weibel 🔷

@weibac.bsky.social

the only admissible use for AI in coding is for OCR so you can code with fountain pen and paper

January 22, 2026 at 1:29 AM

Milan Weibel 🔷

@weibac.bsky.social

i wonder how much of claude code was written by claude code

January 20, 2026 at 9:43 PM

Milan Weibel 🔷

@weibac.bsky.social

heck yea compute OSINT

Epoch AI @epochai.bsky.social · 22d

xAI's Colossus 2 data center is running, but likely won't reach 1 GW of power until May, despite prior claims by Elon Musk.

Our updated analysis shows the facility lacks the cooling capacity to run 550,000 Blackwell GPUs at full power, even in winter conditions.

January 19, 2026 at 11:30 PM

Milan Weibel 🔷

@weibac.bsky.social

we could have LLMs debate our issues among themselves for us while we go touch grass

norvid_studies @norvid-studies.bsky.social · 22d

vibe arguing

January 19, 2026 at 11:28 PM

Milan Weibel 🔷

@weibac.bsky.social

enshittifying our potable water reads like satire

Rebecca Solnit @rebeccasolnit.bsky.social · 24d

Cory Doctorow with another verbal bullseye: pluralistic.net/2026/01/13/n...

I'm sorry. As a technology writer, I'm supposed to be telling you that this bet will some day pay off, because one day we will have shoveled so many words into the word-guessing program that it wakes up and learns how to actually do the jobs it is failing spectacularly at today. This is a proposition akin to the idea that if we keep breeding horses to run faster and faster, one of them will give birth to a locomotive. Humans possess intelligence, and machines do not. The difference between a human and a word-guessing program isn't how many words the human knows.

I'm sorry. I know that when we talk about "digital sovereignty," we're obliged to talk about how we can build more data-centres that we can fill up with money-losing chips from American silicon monopolists in the hopes of destroying as many jobs as possible while blowing through our clean energy goals and enshittifying as much of our potable water as possible.

January 18, 2026 at 9:23 PM

Milan Weibel 🔷

@weibac.bsky.social

to what extent did readers of the national security strategy document released last year expect recent developments in US foreign policy?

to what extent is the document predictive of future action?

January 17, 2026 at 8:51 PM
