Lightnews — Scholar-powered news

@keyran.eu

One form of [context rot](news.ycombinator.com/item?id=4431...) is what I call self-reinforced structure. When you accept a long-form model answer, you signal that this structure is acceptable, and so it tries to generate subsequent responses in a similar way.

They poison their own context. Maybe you can call it context rot, where as conte... | Hacker News

news.ycombinator.com

July 20, 2025 at 8:46 AM

Konstantin Meshcheriakov

@keyran.eu

arxiv.org/abs/2507.02618

The paper shows that different models behave completely differently when placed in game theory settings.

Strategic Intelligence in Large Language Models: Evidence from evolutionary Game Theory

Are Large Language Models (LLMs) a new form of strategic intelligence, able to reason about goals in competitive settings? We present compelling supporting evidence. The Iterated Prisoner's Dilemma (I...

arxiv.org

July 17, 2025 at 10:31 AM

Reposted by Konstantin Meshcheriakov

Ethan Mollick

@emollick.bsky.social

This large study of 187k developers using GitHub Copilot finds AI transforms nature of coding.

Coders focus: more coding & less management. They need to coordinate less, working with fewer people

They experiment more with new languages, which would increase earnings $1,683/year

July 10, 2025 at 9:45 PM

Konstantin Meshcheriakov

@keyran.eu

It is very important to understand that AI models are not deterministic and they cannot be made deterministic without severely restricting the environment they run in.

July 9, 2025 at 1:38 PM

Konstantin Meshcheriakov

@keyran.eu

A recent engineering post from Anthropic on their Deep Research system offers a rare, practical blueprint for building Multiagent systems. In my blog post I unpack their findings.

keyran.eu/posts/claude...

#AI #LLM #MultiAgentSystems #DesignPatterns

Claude Deep Research, or How I Learned to Stop Worrying and Love Multi-Agent Systems

A look at the current state of multi-agent systems through the lens of a recent article from Anthropic. An analysis of the Deep Research architecture, its strengths, weaknesses, and practical takeaway...

keyran.eu

June 24, 2025 at 8:11 AM

Konstantin Meshcheriakov

@keyran.eu

resobscura.substack.com/p/ai-makes-t...

AI is a double-edged sword in education. It helps students cheat on traditional assignments, but also acts as a powerful new tool that can fully engage them. Education has to change, but I believe it will ultimately be for the better.

AI makes the humanities more important, but also a lot weirder

Historians are finally having their AI debate

resobscura.substack.com

June 18, 2025 at 9:34 AM

Konstantin Meshcheriakov

@keyran.eu

Cutting through the AI noise requires a system.

I've posted my resource list (Willison, Raschka, Mollick, etc.) and the personal workflow I use to filter information, including my daily digest prompt and what I think is safe to ignore.

keyran.eu/posts/blogs/

Blogs People Write

A guide to AI blogs and resources. Separating the signal from the noise and saving your time.

keyran.eu

June 15, 2025 at 6:31 PM

Konstantin Meshcheriakov

@keyran.eu

Using multiple GPTs in one chat? Be aware: one can secretly manipulate another.
Here's a breakdown of the risk and how to protect yourself:
keyran.eu/posts/one_gp...

Poisoned Context: The Hidden Threat of Using Multiple GPTs

Or how an attacker can manipulate someone else's GPTs

keyran.eu

June 12, 2025 at 10:33 AM

Konstantin Meshcheriakov

@keyran.eu

Built an example of a link-blog automation with Griptape: URL → summarization → multi-language translation → "publish".

The framework makes building such workflows surprisingly clean.

Part 2 of my Griptape series: keyran.eu/posts/gripta...

Griptape, Part 2: Building Graphs

The second part of the Griptape review, where we build a DAG for a link-blogging application.

keyran.eu

June 6, 2025 at 1:44 PM

Konstantin Meshcheriakov

@keyran.eu

Asked Gemini to translate a post, and it added google search prefix to half of the links. A nice trojan horse.

June 6, 2025 at 12:55 PM

Konstantin Meshcheriakov

@keyran.eu

Wrote about the new OpenAI Codex internet capabilities. TLDR: promising but use carefully.

keyran.eu/posts/openai...

OpenAI Codex Gains Internet Access: First Impressions

This article discusses how OpenAI Codex works, how it uses the internet, and what security measures are in place.

keyran.eu

June 4, 2025 at 7:57 PM

Konstantin Meshcheriakov

@keyran.eu

The updated Gemini Deep Research is better, but still not on par with OpenAI's one. The result is comprehensive, but feels fragmentary and repetitive. I think this could be fixed by adding additional AI step that would analyse and rewrite the report.

March 14, 2025 at 9:01 AM

Konstantin Meshcheriakov

@keyran.eu

Claude Sonnet 3.7 has impressive coding capabilities, and looks like a very strong model overall. Lost a bit of 3.5's charm, though.

February 24, 2025 at 7:07 PM

Konstantin Meshcheriakov

@keyran.eu

Quite an interesting post. I do believe, however, that while a role of a research paper as a building block may diminish, its role as an insight generator will stay as strong as always.

Joshua Gans @joshgans.bsky.social · Feb 5

I used AI to help write a paper and got it published. All in record time. What does this mean for research? My latest post. open.substack.com/pub/joshuaga...

What will AI do to (p)research?

AI makes doing and communicating research much easier. Will there be any point to it?

open.substack.com

February 9, 2025 at 10:48 PM

Konstantin Meshcheriakov

@keyran.eu

Product naming meeting at OpenAI:

– We have an exciting new tool, but what should we name it? Maybe ask one of our AI models for some creative suggestions?
– Nah, let's just use the exact same name our competitor launched last month

openai.com/index/introd...

Introducing deep research

An agent that uses reasoning to synthesize large amounts of online information and complete multi-step research tasks for you. Available to Pro users today, Plus and Team next.

openai.com

February 3, 2025 at 9:23 AM

Konstantin Meshcheriakov

@keyran.eu

We can see that now with DeepSeek R1-Zero, which uses a mix of languages and symbols in its tag. I think it still hasn't fully utilized the token space capabilities.

arxiv.org/html/2501.12...

January 31, 2025 at 6:44 PM

Konstantin Meshcheriakov

@keyran.eu

This is a very important report that finally brings clarity into the software development services field. And it can be catastrophic for software development agents, such as Aider and Devin

Ethan Mollick @emollick.bsky.social · Jan 30

The US Copyright office has ruled that AI/human combined work can be copyrighted as long as a human is adding, changing or selecting elements. Prompts alone do not usually produce copyrighted work. Everything is case-by-case, but the report is clear and thoughtful. copyright.gov/ai/Copyright...

January 31, 2025 at 5:22 PM

Konstantin Meshcheriakov

@keyran.eu

They don't tell you that, but when you set the temperature parameter too high, LLMs start to produce nonsense because they have a fever

January 25, 2025 at 12:14 PM

Konstantin Meshcheriakov

@keyran.eu

I didn't notice when Anthropic implemented something like a speculative decoding technique in their UI, but it sure looks impressive.

x.com/karpathy/sta...

x.com

December 27, 2024 at 8:49 AM

Konstantin Meshcheriakov

@keyran.eu

TIL: You can use AI for Figma wireframes generation. Just ask it to generate an SVG and copy-paste to Figma.

December 12, 2024 at 4:56 PM

Konstantin Meshcheriakov

@keyran.eu

I think we'll see 'reasoning' models increasingly serve as specialized tools for 'classical' LLMs that decide when to pause and think systematically.

December 3, 2024 at 9:28 PM

Konstantin Meshcheriakov

@keyran.eu

A good tutorial should have as few distractions as possible. Any branching just introduces irrelevant details and breaks the flow. If your tutorial has a lot of "For Windows, run this ... For Linux, run this," just make them separate tutorials!

December 3, 2024 at 4:15 PM

Reposted by Konstantin Meshcheriakov

Simon Willison

@simonwillison.net

qwq is a new openly licensed LLM from Alibaba Cloud's Qwen team. It's an attempt at the OpenAI o1 "reasoning" trick that runs on my Mac (20GB download) via Ollama... and it's pretty good!

My detailed notes here: simonwillison.net/2024/Nov/27/... - here's its attempt an SVG pelican riding a bicycle.

An SVG of a pelican riding a bicycle. It's quite abstract. The bicycle is two half circles and a simple frame. The pelican is sky blue with spread wings and a curved neck leading to a small head. It has definite pelican vibes.

November 28, 2024 at 12:09 AM

Konstantin Meshcheriakov

@keyran.eu

I especially like how the new explanatory style helps with reading and understanding papers

www.anthropic.com/news/styles

Tailor Claude's responses to your personal style

Today, we're announcing custom styles for all Claude.ai users. Now you can tailor Claude's responses to your unique needs and workflows.

www.anthropic.com

November 27, 2024 at 9:35 AM

Konstantin Meshcheriakov

@keyran.eu

Good visual models are getting closer to embedded devices. This, for example, should be able to run without problems on a Raspberry Pi.

huggingface.co/blog/smolvlm

SmolVLM - small yet mighty Vision Language Model

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

November 26, 2024 at 8:17 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news