What is ClaudeBot? AI crawler guide
Learn what ClaudeBot is, who operates it, its verified user-agent, robots.txt posture, and how blocking it can affect AI search, citations, training, or agent visibility.
Anthropic crawler for public web content that could contribute to Claude model training.
What is ClaudeBot?
ClaudeBot is a web crawler operated by Anthropic. It visits publicly accessible web pages to collect content that may be used to train Claude, Anthropic's AI model. The crawler identifies itself with the user-agent token ClaudeBot and respects the Robots Exclusion Protocol. Site owners can control its access through standard robots.txt directives. Anthropic provides documentation explaining how the crawler works and how to block it. Its activity is part of Anthropic's effort to improve Claude's understanding of language and the world by learning from a broad range of internet text.
What it's for
For site owners, ClaudeBot represents a direct path through which public content may enter Anthropic's training pipeline. Allowing the crawler could mean your material helps shape future versions of Claude, potentially increasing the model's awareness of your domain. Blocking it signals that you do not want your content used for this purpose. Because the crawler honors robots.txt, you have a straightforward mechanism to opt out. This decision can affect whether your site's information appears in or influences AI-generated answers, though it does not impact traditional search engine indexing.
How to handle ClaudeBot
To prevent ClaudeBot from accessing your site, add a rule to your robots.txt file that disallows the user-agent token ClaudeBot. Depending on the paths you list, this instructs the crawler to skip your entire site or only specific sections. The change takes effect the next time the crawler reads your robots.txt file; crawlers may cache that file, so the effect may not be immediate. There is no need to contact Anthropic directly for standard blocking. If you later decide to allow access, remove or adjust the rule. Check your server logs periodically to confirm the crawler's compliance.
robots.txt rule
User-agent: ClaudeBot
Disallow: /
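Before deploying the rule, it can help to confirm that a standards-following parser reads it the way you intend. The sketch below uses Python's standard-library urllib.robotparser as a stand-in; it is not ClaudeBot's own parser, and the example.com URL and the helper name can_fetch are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rule from this page, written one directive per line.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /
"""


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper: it reflects how Python's parser interprets the
    rules, which may differ in edge cases from any particular crawler.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"  # placeholder URL
    print("ClaudeBot allowed:", can_fetch(ROBOTS_TXT, "ClaudeBot", url))
    print("Other agent allowed:", can_fetch(ROBOTS_TXT, "SomeOtherBot", url))
```

Because the rule names only ClaudeBot, user agents with no matching group are unaffected by it.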
Blocking cost
Blocking ClaudeBot may prevent your content from being included in Anthropic's training data, which could reduce the likelihood of your site's information appearing in Claude's future responses.
Examples
- A news website adds ClaudeBot to its robots.txt disallow list, and the crawler stops fetching articles, so new stories are not used for training.
- An e-commerce site allows ClaudeBot, and its product descriptions may later help Claude answer shopping-related questions more accurately.
- A blog owner notices ClaudeBot in server logs, checks the official documentation, and decides to block it to keep personal posts out of AI training.
Related bots
- CCBot: Also tracked as a training crawler.
- GPTBot: Also tracked as a training crawler.
- AI2Bot: Also tracked as a training crawler.
- VelenPublicWebCrawler: Also tracked as a training crawler.
- img2dataset: Also tracked as a training crawler.
- Bytespider: Also tracked as a training crawler.
- Meta-ExternalAgent: Also tracked as a training crawler.
- ICC-Crawler: Also tracked as a training crawler.
- LAIONDownloader: Also tracked as a training crawler.
- AI Training Opt-Out: The policy decision that blocking or allowing ClaudeBot, a training crawler, expresses.
- Anthropic-AI: An operator term associated with Anthropic; ClaudeBot is the crawler tied to it.
Frequently Asked Questions
Does ClaudeBot follow robots.txt rules?
Yes, ClaudeBot honors the Robots Exclusion Protocol. If you disallow it in your robots.txt file, it will not crawl the specified paths.
What happens if I block ClaudeBot?
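Blocking ClaudeBot tells Anthropic that you do not want your public content used for training Claude. The crawler should stop fetching the disallowed paths, so newly published material should not be collected for future training runs. A robots.txt rule only governs future fetches; it does not by itself remove content that was collected before the rule was in place.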
Can I block only parts of my site from ClaudeBot?
Yes, you can use the Disallow directive in robots.txt to restrict ClaudeBot from specific directories or pages while allowing access to other areas.
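A partial block can be checked the same way as a full one. The following is a minimal sketch that uses Python's standard-library urllib.robotparser; the /private/ and /drafts/ directories, the sample paths, and the helper name check_paths are hypothetical, and the parser is only an approximation of how a given crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep ClaudeBot out of two directories only.
PARTIAL_RULES = """\
User-agent: ClaudeBot
Disallow: /private/
Disallow: /drafts/
"""


def check_paths(robots_txt: str, paths: list[str],
                user_agent: str = "ClaudeBot") -> dict[str, bool]:
    """Map each path to True (allowed) or False (disallowed)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(user_agent, path) for path in paths}


if __name__ == "__main__":
    sample = ["/private/notes.html", "/drafts/", "/blog/post-1", "/"]
    for path, allowed in check_paths(PARTIAL_RULES, sample).items():
        print(f"{path}: {'allowed' if allowed else 'disallowed'}")
```

Disallow values are matched as path prefixes, so everything under a listed directory is covered while the rest of the site stays open to the crawler.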
Does ClaudeBot affect my site's search engine ranking?
No, ClaudeBot is not a search engine crawler. Blocking or allowing it has no direct effect on how your site appears in search results.
How can I verify that ClaudeBot is obeying my robots.txt?
You can check your web server logs for requests from the ClaudeBot user-agent and confirm that it is not accessing disallowed paths.
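The log check can be automated along these lines. This is a rough sketch that assumes access logs in the common "combined" format and reuses Python's standard-library robots.txt parser; the sample log lines, IP addresses, user-agent strings, and the function name disallowed_hits are all made up for illustration. A user-agent header can be sent by anyone, so a match only shows that a request claimed to be ClaudeBot.

```python
import re
from urllib.robotparser import RobotFileParser

# Assumed log layout: Apache/nginx "combined" format.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical robots.txt and log excerpt, for illustration only.
ROBOTS_TXT = "User-agent: ClaudeBot\nDisallow: /private/\n"
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Mar/2025:12:00:01 +0000] "GET /blog/post-1 HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.7 - - [10/Mar/2025:12:00:05 +0000] "GET /private/notes.html HTTP/1.1" '
    '200 1024 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.4 - - [10/Mar/2025:12:01:00 +0000] "GET /private/notes.html HTTP/1.1" '
    '200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]


def disallowed_hits(log_lines, robots_txt, token="ClaudeBot"):
    """Return (timestamp, path) for requests whose user-agent contains
    the token and whose path the robots.txt rules disallow for it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    hits = []
    for line in log_lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if token.lower() not in match["agent"].lower():
            continue  # request did not identify as the crawler
        if not parser.can_fetch(token, match["path"]):
            hits.append((match["time"], match["path"]))
    return hits


if __name__ == "__main__":
    for timestamp, path in disallowed_hits(SAMPLE_LOG, ROBOTS_TXT):
        print(f"{timestamp}  {path}")
```

An empty result over a reasonable time window suggests the rule is being honored; a hit shortly after a rule change may simply reflect a cached copy of robots.txt.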
Data & Sources
- Anthropic documentation - Primary source for ClaudeBot crawler details.
- ClaudeBot live crawler data - Trakkr crawler telemetry for this user agent.