Scanbot
Scanner · User-Agent: scanbot
🤖 Overview
Scanbot is a web crawler operated by Scanbot GmbH (scanbot.io), a German company providing website auditing and SEO analysis tools. Its primary purpose is to scan websites for technical issues, broken links, duplicate content, and on-page SEO factors, feeding data into the Scanbot Dashboard for site owners and SEO professionals. According to official documentation at scanbot.io/crawler, the bot is designed for constructive website improvement and is explicitly not used for AI model training or content republishing.
🌐 Technical Behavior
Scanbot crawls with a configurable user-agent string that defaults to Scanbot/2.0 but can be customized by the website owner during setup. Requests originate from IP ranges registered to Scanbot GmbH, primarily in German datacenters (ASN 24940 and ASN 197540). By default the crawler waits at least one second between requests, and this delay can be adjusted in the Scanbot Console. It supports both HTTP/1.1 and HTTP/2 and sends a standard Accept-Language: en-US,en;q=0.9 header. Each request also carries a unique X-Scanbot-Request-ID header for traceability, as noted in their technical blog at docs.scanbot.io/crawler-identification.
📋 robots.txt Compliance
Scanbot fully respects robots.txt directives, including Disallow and Crawl-delay rules. Their documentation explicitly states that if a site blocks the bot via robots.txt, Scanbot will not attempt to circumvent the restriction. The bot also honors noindex and nofollow meta tags in HTML pages, as stated on the official compliance page at scanbot.io/robots.
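As an illustration of the directives mentioned above, the sketch below parses a hypothetical robots.txt with Python's standard urllib.robotparser and checks what a compliant crawler identifying as Scanbot/2.0 would be allowed to fetch. The file contents, the /private/ path, the 5-second delay, and example.com are assumptions made for the example, not rules published by Scanbot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keeps Scanbot out of /private/ and requests
# a 5-second crawl delay, while leaving all other bots unrestricted.
ROBOTS_TXT = """\
User-agent: Scanbot
Disallow: /private/
Crawl-delay: 5

User-agent: *
Disallow:
"""


def parse_rules(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = parse_rules(ROBOTS_TXT)

# What a compliant crawler announcing itself as Scanbot/2.0 may fetch.
scanbot_private = rules.can_fetch("Scanbot/2.0", "https://example.com/private/report.html")
scanbot_public = rules.can_fetch("Scanbot/2.0", "https://example.com/index.html")
scanbot_delay = rules.crawl_delay("Scanbot/2.0")

# A different (hypothetical) bot falls under the wildcard group.
other_private = rules.can_fetch("OtherBot/1.0", "https://example.com/private/report.html")

print(scanbot_private, scanbot_public, scanbot_delay, other_private)
```

With these assumed rules, the Scanbot group blocks only /private/ and sets the delay, while other bots fall through to the unrestricted wildcard group.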
🔍 Detection Indicators
The default User-Agent string is Scanbot/2.0, typically sent in full as Scanbot/2.0 (+https://scanbot.io/crawler). Behavioral fingerprints include sequential request patterns with a consistent one-second crawl delay and the absence of JavaScript rendering. Additionally, the X-Scanbot-Request-ID header (a 32-character hex string) is a reliable identifier. Reverse DNS lookups for Scanbot IPs often resolve to *.scanbot.io, per the IP list published at scanbot.io/ip-ranges.
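The indicators above can be combined into a simple server-side check. The following is a minimal sketch, not Scanbot's or any firewall's actual logic: the function names, the regular expressions, and the rule of requiring at least two matching signals are all assumptions. The reverse DNS hostname is passed in by the caller (for example, from socket.gethostbyaddr) so the check itself performs no network lookups.

```python
import re
from typing import Dict, Mapping, Optional

# Patterns derived from the indicators described above (assumed exact shapes).
UA_PATTERN = re.compile(r"\bScanbot/\d+(\.\d+)*", re.IGNORECASE)
REQUEST_ID_PATTERN = re.compile(r"[0-9a-fA-F]{32}")
RDNS_SUFFIX = ".scanbot.io"


def scanbot_signals(headers: Mapping[str, str],
                    rdns_hostname: Optional[str] = None) -> Dict[str, bool]:
    """Report which Scanbot indicators a request matches."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    user_agent = lowered.get("user-agent", "")
    request_id = lowered.get("x-scanbot-request-id", "").strip()
    host = (rdns_hostname or "").rstrip(".").lower()
    return {
        "user_agent": bool(UA_PATTERN.search(user_agent)),
        "request_id": bool(REQUEST_ID_PATTERN.fullmatch(request_id)),
        "reverse_dns": host.endswith(RDNS_SUFFIX),
    }


def looks_like_scanbot(headers: Mapping[str, str],
                       rdns_hostname: Optional[str] = None) -> bool:
    """Assumed policy: require at least two independent signals."""
    return sum(scanbot_signals(headers, rdns_hostname).values()) >= 2


example_headers = {
    "User-Agent": "Scanbot/2.0 (+https://scanbot.io/crawler)",
    "X-Scanbot-Request-ID": "0123456789abcdef0123456789abcdef",
}
print(scanbot_signals(example_headers, "crawl-7.scanbot.io"))
print(looks_like_scanbot({"User-Agent": "Mozilla/5.0"}))
```

Requiring more than one signal is a deliberate choice here: headers are easy to forge, and the user-agent can be customized, so no single indicator is treated as conclusive in this sketch.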
📊 Data Usage
Data collected by Scanbot is used exclusively for the website owner’s own auditing purposes: generating reports on SEO health, site structure, broken links, and performance metrics. The extracted content is not stored on Scanbot servers beyond the duration of the crawl (typically 24-72 hours) and is not shared with third parties. Scanbot GmbH’s privacy policy (scanbot.io/privacy) confirms that no text content is used for training language models or for sale.
⚙️ Rate Limiting Policy
Because Scanbot can be configured to run multiple parallel crawls at high speed (especially when a user schedules a full-site scan), it is commonly rate-limited to prevent excessive load on target servers. Most web application firewalls apply a threshold of 10 requests per second per IP to Scanbot ranges. That is well above the roughly one request per second implied by the bot's default crawl delay, so default crawls pass untouched while aggressive parallel scans are throttled before they can overwhelm smaller or non-production environments.
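To make the 10 requests per second per IP threshold concrete, here is a minimal sliding-window limiter sketch. The class name and its parameters are hypothetical, and real firewalls may use other algorithms (token buckets, fixed windows); this only illustrates the general idea. Timestamps are passed in explicitly so the behavior is easy to reason about and test.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per IP within any window_seconds span."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 1.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Per-IP timestamps of recently accepted requests, oldest first.
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold: reject or delay the request
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=10, window_seconds=1.0)

# Ten requests in quick succession from one (documentation-range) IP pass.
burst = [limiter.allow("203.0.113.10", now=i * 0.01) for i in range(10)]
# An eleventh request inside the same second is rejected.
eleventh = limiter.allow("203.0.113.10", now=0.5)
# After the window has passed, requests are accepted again.
later = limiter.allow("203.0.113.10", now=1.2)

print(all(burst), eleventh, later)
```

A crawler running at the default one-second delay never comes close to this limit; only parallel or accelerated scans would be throttled.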
ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.