Kirill Lutcenko
@lutkir.bsky.social
AI Engineer & Back-End Technical Lead. MSc AI.

10y building software, 2y in AI and tech lead roles — engineering-first views on what works in AI (and what doesn’t).
It will be interesting to see how such copyright violations are handled legally when the content was retrieved from a jailbroken LLM through an agent built by someone unaffiliated with the LLM vendor. Should we start worrying about this when developing agents for B2C products?
"In some cases, jailbroken Claude 3.7 Sonnet outputs entire books near-verbatim ... Taken together, our work highlights that, even with model- and system-level safeguards, extraction of (in-copyright) training data remains a risk for production LLMs."

arxiv.org/abs/2601.02671
Extracting books from production language models
Many unresolved legal questions over LLMs and copyright center on memorization: whether specific training data have been encoded in the model's weights during training, and whether those memorized dat...
January 10, 2026 at 7:14 PM