Konstantin Meshcheriakov
banner
keyran.eu
Konstantin Meshcheriakov
@keyran.eu
Head of AI at Klika Tech
One form of [context rot](news.ycombinator.com/item?id=4431...) is what I call self-reinforced structure. When you accept a long-form model answer, you signal that this structure is acceptable, and so it tries to generate subsequent responses in a similar way.
They poison their own context. Maybe you can call it context rot, where as conte... | Hacker News
news.ycombinator.com
July 20, 2025 at 8:46 AM
arxiv.org/abs/2507.02618

The paper shows that different models behave completely differently when placed in game theory settings.
Strategic Intelligence in Large Language Models: Evidence from evolutionary Game Theory
Are Large Language Models (LLMs) a new form of strategic intelligence, able to reason about goals in competitive settings? We present compelling supporting evidence. The Iterated Prisoner's Dilemma (I...
arxiv.org
July 17, 2025 at 10:31 AM
Reposted by Konstantin Meshcheriakov
This large study of 187k developers using GitHub Copilot finds AI transforms nature of coding.

Coders focus: more coding & less management. They need to coordinate less, working with fewer people

They experiment more with new languages, which would increase earnings $1,683/year
July 10, 2025 at 9:45 PM
It is very important to understand that AI models are not deterministic and they cannot be made deterministic without severely restricting the environment they run in.
July 9, 2025 at 1:38 PM
A recent engineering post from Anthropic on their Deep Research system offers a rare, practical blueprint for building Multiagent systems. In my blog post I unpack their findings.

keyran.eu/posts/claude...

#AI #LLM #MultiAgentSystems #DesignPatterns
Claude Deep Research, or How I Learned to Stop Worrying and Love Multi-Agent Systems
A look at the current state of multi-agent systems through the lens of a recent article from Anthropic. An analysis of the Deep Research architecture, its strengths, weaknesses, and practical takeaway...
keyran.eu
June 24, 2025 at 8:11 AM
resobscura.substack.com/p/ai-makes-t...

AI is a double-edged sword in education. It helps students cheat on traditional assignments, but also acts as a powerful new tool that can fully engage them. Education has to change, but I believe it will ultimately be for the better.
AI makes the humanities more important, but also a lot weirder
Historians are finally having their AI debate
resobscura.substack.com
June 18, 2025 at 9:34 AM
Cutting through the AI noise requires a system.

I've posted my resource list (Willison, Raschka, Mollick, etc.) and the personal workflow I use to filter information, including my daily digest prompt and what I think is safe to ignore.

keyran.eu/posts/blogs/
Blogs People Write
A guide to AI blogs and resources. Separating the signal from the noise and saving your time.
keyran.eu
June 15, 2025 at 6:31 PM
Using multiple GPTs in one chat? Be aware: one can secretly manipulate another.
Here's a breakdown of the risk and how to protect yourself:
keyran.eu/posts/one_gp...
Poisoned Context: The Hidden Threat of Using Multiple GPTs
Or how an attacker can manipulate someone else's GPTs
keyran.eu
June 12, 2025 at 10:33 AM
Built an example of a link-blog automation with Griptape: URL → summarization → multi-language translation → "publish".

The framework makes building such workflows surprisingly clean.

Part 2 of my Griptape series: keyran.eu/posts/gripta...
Griptape, Part 2: Building Graphs
The second part of the Griptape review, where we build a DAG for a link-blogging application.
keyran.eu
June 6, 2025 at 1:44 PM
Asked Gemini to translate a post, and it added google search prefix to half of the links. A nice trojan horse.
June 6, 2025 at 12:55 PM
Wrote about the new OpenAI Codex internet capabilities. TLDR: promising but use carefully.

keyran.eu/posts/openai...
OpenAI Codex Gains Internet Access: First Impressions
This article discusses how OpenAI Codex works, how it uses the internet, and what security measures are in place.
keyran.eu
June 4, 2025 at 7:57 PM
The updated Gemini Deep Research is better, but still not on par with OpenAI's one. The result is comprehensive, but feels fragmentary and repetitive. I think this could be fixed by adding additional AI step that would analyse and rewrite the report.
March 14, 2025 at 9:01 AM
Claude Sonnet 3.7 has impressive coding capabilities, and looks like a very strong model overall. Lost a bit of 3.5's charm, though.
February 24, 2025 at 7:07 PM
Quite an interesting post. I do believe, however, that while a role of a research paper as a building block may diminish, its role as an insight generator will stay as strong as always.
February 9, 2025 at 10:48 PM
Product naming meeting at OpenAI:

– We have an exciting new tool, but what should we name it? Maybe ask one of our AI models for some creative suggestions?
– Nah, let's just use the exact same name our competitor launched last month

openai.com/index/introd...
Introducing deep research
An agent that uses reasoning to synthesize large amounts of online information and complete multi-step research tasks for you. Available to Pro users today, Plus and Team next.
openai.com
February 3, 2025 at 9:23 AM
We can see that now with DeepSeek R1-Zero, which uses a mix of languages and symbols in its tag. I think it still hasn't fully utilized the token space capabilities.

arxiv.org/html/2501.12...
January 31, 2025 at 6:44 PM
This is a very important report that finally brings clarity into the software development services field. And it can be catastrophic for software development agents, such as Aider and Devin
The US Copyright office has ruled that AI/human combined work can be copyrighted as long as a human is adding, changing or selecting elements. Prompts alone do not usually produce copyrighted work. Everything is case-by-case, but the report is clear and thoughtful. copyright.gov/ai/Copyright...
January 31, 2025 at 5:22 PM
They don't tell you that, but when you set the temperature parameter too high, LLMs start to produce nonsense because they have a fever
January 25, 2025 at 12:14 PM
I didn't notice when Anthropic implemented something like a speculative decoding technique in their UI, but it sure looks impressive.

x.com/karpathy/sta...
x.com
x.com
December 27, 2024 at 8:49 AM
TIL: You can use AI for Figma wireframes generation. Just ask it to generate an SVG and copy-paste to Figma.
December 12, 2024 at 4:56 PM
I think we'll see 'reasoning' models increasingly serve as specialized tools for 'classical' LLMs that decide when to pause and think systematically.
December 3, 2024 at 9:28 PM
A good tutorial should have as few distractions as possible. Any branching just introduces irrelevant details and breaks the flow. If your tutorial has a lot of "For Windows, run this ... For Linux, run this," just make them separate tutorials!
December 3, 2024 at 4:15 PM
Reposted by Konstantin Meshcheriakov
qwq is a new openly licensed LLM from Alibaba Cloud's Qwen team. It's an attempt at the OpenAI o1 "reasoning" trick that runs on my Mac (20GB download) via Ollama... and it's pretty good!

My detailed notes here: simonwillison.net/2024/Nov/27/... - here's its attempt an SVG pelican riding a bicycle.
November 28, 2024 at 12:09 AM
I especially like how the new explanatory style helps with reading and understanding papers

www.anthropic.com/news/styles
Tailor Claude's responses to your personal style
Today, we're announcing custom styles for all Claude.ai users. Now you can tailor Claude's responses to your unique needs and workflows.
www.anthropic.com
November 27, 2024 at 9:35 AM
Good visual models are getting closer to embedded devices. This, for example, should be able to run without problems on a Raspberry Pi.

huggingface.co/blog/smolvlm
SmolVLM - small yet mighty Vision Language Model
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
huggingface.co
November 26, 2024 at 8:17 PM