Claude-Code
Bot User-Agent: claude-code
🤖 Overview
Claude-Code is a web crawler operated by Anthropic, first documented in early 2024 as part of the infrastructure supporting the Claude Code AI assistant for software development. Its primary purpose is to collect content from publicly accessible code repositories, technical documentation, and programming forums in order to train and improve Anthropic’s code-generation and debugging models, particularly the Claude Code product. The crawler is distinct from the general-purpose ClaudeBot and focuses on code-heavy domains such as GitHub, Stack Overflow, and official language documentation sites.
🌐 Technical Behavior
Claude-Code follows a crawl pattern optimized for depth on technical sites. It reportedly makes up to 10 requests per second per domain during peak hours, and its stated default crawl delay is 5 seconds unless a site’s robots.txt specifies a different Crawl-Delay. It uses a rotating set of IPv4 and IPv6 addresses from Amazon Web Services (AWS) and Google Cloud ranges; the IPv4 addresses typically start with prefixes like 18.192.x.x and 35.156.x.x, as verified in Anthropic’s official IP list published in June 2024. All requests are made over HTTPS using HTTP/2, and the crawler respects Cache-Control headers. It sends a From header containing claude‑[email protected] for administrative contact, as documented in Anthropic’s support knowledge base.
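As a rough illustration of how a site owner might test a client address against the prefixes mentioned above, here is a minimal Python sketch. It assumes that "18.192.x.x" and "35.156.x.x" can be read as /16 networks; that reading, the `REPORTED_NETWORKS` name, and the `in_reported_ranges` helper are illustrative and not taken from any published IP list. A real deployment would load the operator's current list instead of hard-coding prefixes.

```python
import ipaddress

# Assumption: the "x.x" prefixes quoted in the text are treated as /16 networks.
# These are illustrative values, not an authoritative IP list.
REPORTED_NETWORKS = [
    ipaddress.ip_network("18.192.0.0/16"),
    ipaddress.ip_network("35.156.0.0/16"),
]


def in_reported_ranges(addr: str) -> bool:
    """Return True if addr parses as an IP inside one of the reported prefixes."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        # Not a valid IPv4 or IPv6 literal.
        return False
    # Membership is False when the address and network versions differ.
    return any(ip in net for net in REPORTED_NETWORKS)
```

A prefix match alone does not prove the request came from the crawler, since the same cloud ranges host many unrelated clients; it is best combined with the header checks described under Detection Indicators.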
📋 robots.txt Compliance
Anthropic explicitly states that Claude-Code fully honors the Robots Exclusion Protocol, including Disallow directives and Crawl‑Delay instructions, as detailed in their official robots.txt guidelines at docs.anthropic.com/claude/crawlers. The crawler also respects noindex meta tags and X‑Robots‑Tag headers. In practice, site owners have reported that the bot backs off when a 503 status code is returned, indicating compliance with server‑side rate signals.
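To see how Disallow and Crawl-Delay rules would apply to this user agent, the sketch below parses a sample robots.txt with Python's standard `urllib.robotparser`. The robots.txt content, the `/private/` path, and the `example.com` URLs are hypothetical, and it is an assumption that the crawler matches on the `claude-code` token shown at the top of this page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep the crawler out of /private/ and ask for
# a 5-second delay. The "claude-code" token is assumed from this page.
ROBOTS_TXT = """\
User-agent: claude-code
Disallow: /private/
Crawl-delay: 5

User-agent: *
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
can_private = parser.can_fetch("Claude-Code/1.0", "https://example.com/private/repo")
can_public = parser.can_fetch("Claude-Code/1.0", "https://example.com/docs/index.html")
delay = parser.crawl_delay("Claude-Code/1.0")
```

This only shows how a standards-following parser interprets the rules; whether the crawler behaves the same way has to be confirmed from server logs.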
🔍 Detection Indicators
The primary User-Agent string is Claude-Code/1.0, sometimes followed by an information URL token such as +https://claude.ai/crawler. Secondary strings include Claude-Code-Bot/1.0 and Anthropic-CodeCrawler/1.0 for internal testing. Behavioral fingerprints include consistently fetching robots.txt before any directory crawl and the absence of common browser headers such as Accept-Language or Sec-Fetch-*. The crawler also sets a unique X-Anthropic-Crawler: Claude-Code custom header on all requests.
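The indicators above can be combined into a simple score, sketched here in Python. The weights, the `score_request` function, and the `REPORTED_UA_TOKENS` name are illustrative choices, not part of any documented detection method, and the User-Agent tokens are the ones reported on this page written with plain ASCII hyphens.

```python
from typing import Mapping

# User-Agent tokens as reported on this page (not independently verified).
REPORTED_UA_TOKENS = ("claude-code/", "claude-code-bot/", "anthropic-codecrawler/")


def score_request(headers: Mapping[str, str]) -> int:
    """Return a rough indicator score; higher means more indicators matched."""
    # HTTP header names are case-insensitive, so normalize them first.
    h = {name.lower(): value for name, value in headers.items()}
    score = 0
    user_agent = h.get("user-agent", "").lower()
    if any(token in user_agent for token in REPORTED_UA_TOKENS):
        score += 2  # reported User-Agent token present
    if h.get("x-anthropic-crawler", "").lower() == "claude-code":
        score += 2  # reported custom header present
    if "accept-language" not in h:
        score += 1  # browsers normally send Accept-Language
    if not any(name.startswith("sec-fetch-") for name in h):
        score += 1  # modern browsers normally send Sec-Fetch-* headers
    return score
```

Every one of these headers can be spoofed by any client, so a high score is a hint for further checks rather than proof of origin.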
📊 Data Usage
Collected data (source code, documentation, and discussions) is used to train and fine-tune Anthropic’s Claude Code AI assistant, which specializes in code generation, debugging, and explanation tasks. The data may also be used to evaluate model safety and bias in code-related outputs. Anthropic’s privacy policy confirms that no personally identifiable information is intentionally collected and that all data is processed in accordance with their public data usage guidelines.
⚙️ Rate Limiting Policy
Although Claude-Code is a legitimate, non-malicious agent, rate limiting is applied to it because its high request volume on popular code-hosting platforms can degrade service for human users. The recommended policy is threshold-based blocking that triggers once a single IP exceeds 50 requests per second. This keeps server resource allocation equitable while still allowing the crawler to operate within its prescribed crawl delay.
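One way to implement a per-IP threshold like the one described above is a sliding-window counter, sketched below in Python. The class name `SlidingWindowLimiter`, the in-memory storage, and the explicit `now` argument are illustrative assumptions; the 50-requests-per-second default mirrors the threshold in the text.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per IP within a sliding time window."""

    def __init__(self, max_requests: int = 50, window_seconds: float = 1.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str, now: float) -> bool:
        """Record a request at time `now` (seconds) and say whether to serve it."""
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # threshold reached: block this request
        hits.append(now)
        return True
```

In a live server, `now` would typically come from `time.monotonic()`, and the per-IP state would need eviction or a shared store such as a cache if the service runs on more than one process.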
ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.