#GPTBot
you know I thought adding a technobabbling AI-crawler honeypot to my server would be funny, but I didn't expect it to be that effective

was running out of RAM because OpenAI's GPTBot is stuck in the infinite honeypot

(it's https://maurycyz.com/projects/trap_bots/)
November 17, 2025 at 1:35 PM
OpenAI crawling websites
koldfront.dk
November 15, 2025 at 8:41 AM
"In the past year, Common Crawl’s CCBot has become the scraper most widely blocked by the top 1,000 websites, surpassing even OpenAI’s GPTBot, which collects content for ChatGPT" www.theatlantic.com/technology/2... via @theatlantic.com
The Nonprofit Doing the AI Industry’s Dirty Work
The web archive Common Crawl has been quietly funneling paywalled articles to AI companies—and lying to publishers about it.
www.theatlantic.com
November 4, 2025 at 4:57 PM
I think this comparison limps a bit, given your robots.txt. At least I have yet to see a flower shop with a sign saying "Dear robots, please don't pinch anything."
November 4, 2025 at 10:43 AM
[Study] French press versus AI: who blocks which bot?

Exclusive study of 1,132 French news sites: 23% block AI bots. CCBot leads, followed by GPTBot. Discover the publishers' blocking strategies.
www.lvlup.fr
November 3, 2025 at 9:00 AM
Welp, it turns out my site was getting slowed down massively because of AI bots combing it for content. GPTBot had accessed it 1.34k times just this morning!

I can take solace in the fact that at least my art has made their image generation marginally worse. Suck it big tech!!
October 30, 2025 at 1:21 PM
Ok actually, because robots.txt is optional and ignorable, here is our current, full AI bot blocking solution using the non-optional .htaccess mod_rewrite (see alt text)
October 28, 2025 at 5:09 PM
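For readers who are not on Apache, the same idea (refuse requests by user agent on the server, rather than asking nicely in robots.txt) can be sketched as a small Python WSGI wrapper. This is not the .htaccess rule set from the post, which lives in the image alt text and is not reproduced here. The names `BLOCKED_TOKENS`, `block_ai_bots` and `hello_app` are made up for illustration, and the token list is only an assumption based on bots named elsewhere in this feed.

```python
# Hypothetical token list; extend it with whichever bots you want to refuse.
BLOCKED_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "CCBot")


def block_ai_bots(app, tokens=BLOCKED_TOKENS):
    """Wrap a WSGI app so requests from listed user agents get a 403."""
    lowered = tuple(token.lower() for token in tokens)

    def middleware(environ, start_response):
        # WSGI exposes the User-Agent request header under this key.
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in user_agent for token in lowered):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everything else is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware


def hello_app(environ, start_response):
    """Stand-in application used only to demonstrate the wrapper."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


application = block_ai_bots(hello_app)
```

Like any user-agent filter, this only stops crawlers that identify themselves honestly; a bot that spoofs a browser string passes straight through.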
Just adding a bunch of new/renamed AI scraper bots to the ol blocklist

robots.txt:
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: semantic-visions.com
Disallow: /
October 28, 2025 at 1:41 PM
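A quick way to sanity-check rules like the ones in that post is Python's standard `urllib.robotparser`. The sketch below feeds it the same four groups and asks whether a given bot token may fetch a page. The helper `is_allowed` and the `example.com` URL are placeholders, and `Googlebot` is included only as an example of an agent the rules do not mention.

```python
from urllib.robotparser import RobotFileParser

# The rules exactly as listed in the post.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: semantic-visions.com
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(bot, url="https://example.com/some/page"):
    """Return True if the parsed rules let `bot` fetch `url` (placeholder URL)."""
    return parser.can_fetch(bot, url)


for bot in ("GPTBot", "ClaudeBot", "PerplexityBot", "Googlebot"):
    print(bot, "allowed" if is_allowed(bot) else "blocked")
```

Note that `can_fetch` expects the bot's product token (for example `GPTBot`), not the full `Mozilla/5.0 ...` header, and that this only tells you what a well-behaved crawler should do.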
Am I Visible on AI: Check your visibility on ChatGPT #Seo #ChatGPTcrawler #GPTBot #optimisationcontenuIA #robots.txtIA #visibilitéIA
October 24, 2025 at 7:57 AM
That said, on sites that block GPTBot's crawling, the content can't be accessed from ChatGPT Atlas either, so the UA string itself didn't matter much
October 22, 2025 at 5:37 AM
RIP robots.txt (1994-2025)

Died when AI companies discovered "please don't" isn't legally binding.

ClaudeBot makes 70,900 crawls per referral, GPTBot's ratio is 1,700:1, and the Perplexity CEO says: "lol, not a legal framework tho"

Turns out we're all just unpaid contributors

www.heise.de/en/backgroun...
Obituary: Farewell to robots.txt (1994-2025)
The voluntary compliance protocol that civilized the internet has departed, bids Henning Fries farewell.
www.heise.de
October 20, 2025 at 5:47 PM
"OpenAI led the charge with its GPTBot, ChatGPT-User, and OAI-SearchBot – a trinity of violations that left robots.txt helplessly watching its directives being diligently ignored. The company publicly claimed compliance, while in June 2025, Cloudflare documented a devastating crawl-to-referral […]
Original post on mastodon.social
mastodon.social
October 20, 2025 at 8:00 AM
> Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.2; +https://openai.com/gptbot) Thousands of times in my personal nginx logs this evening. Even though they're all getting 403'd. Kinda funny really. (I am aware of Anubis et al.)

bsd.network
October 16, 2025 at 11:30 AM
@awinkler@openbiblio.social @digiSberlin@openbiblio.social WMF has its own GPTbot, after all; it can surely generate the SPARQL queries.
October 15, 2025 at 11:44 AM
On Website Technicals (2025-06) - Tech updates: Junited - Rigby to Buttersafe - GPTBot badness, captions, diversion delay, under-volt, X11 fossil. #junited2025 - https://www.earth.org.uk/note-on-site-technicals-97.html
On Website Technicals (2025-06)
Tech updates: Junited - Rigby to Buttersafe - GPTBot badness, captions, diversion delay, under-volt, X11 fossil. #Junited2025
www.earth.org.uk
October 14, 2025 at 9:53 AM
more than 3,800 top domains are now blocking ai crawlers, and it’s not just openai's bots that are receiving the cold shoulder.

cloudflare’s new data shows gptbot, ccbot, and google-extended lead the pack of most-disallowed ai agents.
October 10, 2025 at 7:01 PM
Google is introducing a requirement that developers of Android apps identify themselves

All makers of LLM GPTs are introducing vibe coding.

How does that work? The GPTbot identifies itself to Google and takes responsibility for the app, including all modifications by the human "developer"?

The human […]
Original post on no-pony.farm
no-pony.farm
October 9, 2025 at 4:44 PM
From Googlebot to GPTBot: Who’s crawling your site in 2025 https://blog.cloudflare.com/from-googlebot-to-gptbot-whos-crawling-your-site-in-2025/
From Googlebot to GPTBot: Who’s crawling your site in 2025
blog.cloudflare.com
October 6, 2025 at 8:24 AM
Zip bombs against aggressive AI crawlers. Some site owners complain about the large number of bots, ...

#openai #gptbot #sha-256 #hashcash #perplexity #ai #anubis #краулеры #zip-бомба

October 5, 2025 at 6:28 PM
From Googlebot to GPTBot: who will be crawling your site in 2025

https://kripta.biz/posts/48494BB0-8029-4BEB-9CEE-7AF5EED9C6F8
October 2, 2025 at 3:41 AM
From Googlebot to GPTBot: who is crawling your site in 2025?

https://qian.cx/posts/B5BFC466-848E-4AC4-9291-D0B36DCCB3E2
October 2, 2025 at 3:40 AM
AI crawlers aren’t welcome everywhere.

Of the top 10,000 domains, 3,710 disallow them in robots.txt:

GPTBot leads with 240 full blocks + 66 partial

ClaudeBot: 161 full, 47 partial

PerplexityBot: 128 full, 52 partial

Even ChatGPT-User, Anthropic, and Amazonbot are on the block list
September 9, 2025 at 5:38 AM
There are signs that gptbot has been accessing my site, and it's creepy
September 7, 2025 at 1:04 AM
was scrolling through my Nginx logs earlier to debug something and noticed a lot of requests from GPTBot, so I ran a simple python script to analyse it.
September 4, 2025 at 11:48 PM
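The script from that last post is not shown, so here is a minimal sketch of what such an analysis might look like, assuming the logs use nginx's default "combined" format. The function name `summarize_bot`, the `SAMPLE` lines and their IP addresses (taken from the documentation-only ranges) are all invented for illustration.

```python
import re
from collections import Counter

# Matches nginx's default "combined" access-log format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize_bot(lines, token="GPTBot"):
    """Count requests per status code and per path for one user-agent token."""
    by_status, by_path = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or token.lower() not in match["agent"].lower():
            continue
        by_status[match["status"]] += 1
        # The request field looks like "GET /path HTTP/1.1".
        parts = match["request"].split()
        if len(parts) >= 2:
            by_path[parts[1]] += 1
    return by_status, by_path


# Made-up sample lines, not real traffic.
GPTBOT_UA = ("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
             "compatible; GPTBot/1.2; +https://openai.com/gptbot)")
SAMPLE = [
    f'203.0.113.7 - - [04/Sep/2025:22:10:01 +0000] "GET /blog/ HTTP/1.1" 200 5120 "-" "{GPTBOT_UA}"',
    f'203.0.113.7 - - [04/Sep/2025:22:10:02 +0000] "GET /blog/ HTTP/1.1" 403 153 "-" "{GPTBOT_UA}"',
    '198.51.100.4 - - [04/Sep/2025:22:10:03 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 Firefox/130.0"',
]

by_status, by_path = summarize_bot(SAMPLE)
print("by status:", dict(by_status))
print("top paths:", by_path.most_common(5))
```

To run it on real data, pass an open log file instead of `SAMPLE`; the usual location is `/var/log/nginx/access.log`, but that depends on your setup.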