Swiftbot
Bot User-Agent: swiftbot
🤖 Overview
Swiftbot is a legitimate web crawler operated by Swiftbot Inc., a data analytics company based in San Francisco, first publicly documented in 2022. Its primary purpose is to collect publicly accessible web content for training proprietary large language models and AI-driven summarization tools. The collected content feeds Swiftbot’s internal AI platform, which enterprise clients use for automated content extraction and analysis.
🌐 Technical Behavior
Swiftbot employs a distributed crawling architecture, issuing requests from IP ranges registered to Amazon Web Services (AS16509) and Google Cloud Platform (AS15169). It supports HTTP/1.1 and HTTP/2, with a default crawl interval of 2–3 seconds per host, but can burst to 15 requests per second during high-demand periods. The crawler follows robots.txt directives, including Crawl-delay when specified, and it identifies itself via the User-Agent header. It preferentially crawls HTML pages, with secondary support for JSON-LD structured data and RSS feeds. According to Swiftbot’s official documentation at https://swiftbot.com/crawler, it avoids crawling paths containing “/admin” or “/private” by default, even without robots.txt instructions, as part of its safe-crawling policy.
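As a rough illustration of how a site owner might express these rules, the Python sketch below parses a sample robots.txt with the standard library's urllib.robotparser and reads back the crawl delay and path rules for the swiftbot token. The file contents, the 5-second delay, and the example paths are assumptions chosen for illustration. The sketch shows how a standards-following parser reads the file, not how Swiftbot itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish; values are illustrative.
ROBOTS_TXT = """\
User-agent: swiftbot
Crawl-delay: 5
Disallow: /admin
Disallow: /private

User-agent: *
Disallow:
"""


def parse_rules(text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = parse_rules(ROBOTS_TXT)

# Crawl delay (in seconds) that applies to the swiftbot token, or None if unset.
delay = rules.crawl_delay("swiftbot")

# Whether a parser following these rules would fetch each path.
admin_allowed = rules.can_fetch("swiftbot", "/admin/login")
blog_allowed = rules.can_fetch("swiftbot", "/blog/post")

print(delay, admin_allowed, blog_allowed)
```

Listing /admin and /private explicitly is a defensive choice: the default avoidance described above is the operator's stated policy, and a site owner cannot verify it from the outside.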
📋 robots.txt Compliance
Swiftbot’s operational documentation explicitly states that it honors Disallow directives in robots.txt, and its crawler checks for the file before crawling each domain. In tests conducted by third-party security researchers, Swiftbot was observed respecting Disallow: / on all tested domains. However, it does not treat the Allow directive as an override: an explicit Disallow is absolute, so a path covered by a Disallow rule is skipped even if an Allow rule also matches it.
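The practical effect of ignoring Allow overrides can be sketched in Python. The example below compares how the standard library's urllib.robotparser reads a sample file with a rough model of the reported Swiftbot behavior, obtained by dropping Allow lines before parsing. The sample rules and the without_allow helper are hypothetical. This is a minimal model of the behavior described above, not Swiftbot's actual parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the site owner intends /public/ to stay crawlable.
SAMPLE_ROBOTS = """\
User-agent: swiftbot
Allow: /public/
Disallow: /
"""


def build_parser(lines):
    """Build a parser from robots.txt lines without any network access."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


def without_allow(lines):
    """Hypothetical helper: drop Allow lines to model a crawler that ignores them."""
    return [ln for ln in lines if not ln.strip().lower().startswith("allow:")]


lines = SAMPLE_ROBOTS.splitlines()
standard = build_parser(lines)              # honors the Allow override
strict = build_parser(without_allow(lines))  # treats Disallow as absolute

standard_result = standard.can_fetch("swiftbot", "/public/page")
strict_result = strict.can_fetch("swiftbot", "/public/page")

print(standard_result, strict_result)
```

Under this model, a site that relies on Allow to expose part of an otherwise disallowed tree should expect Swiftbot to skip that part as well.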
🔍 Detection Indicators
The primary User-Agent string is Swiftbot/1.0 (Mozilla/5.0 compatible; +https://swiftbot.com/bot). A secondary string, Swiftbot-Recon/1.0, is used for pre-crawl checks. Additional behavioral fingerprints include a 500 ms minimum delay between requests and a preference for text/html content types. The bot also sends a custom X-Swiftbot-Version header with the value “2.1” in every request.
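These header-based indicators could be checked with something like the Python sketch below. The function name swiftbot_indicators, the label strings, and the regular expression are illustrative assumptions. Only the User-Agent tokens and the X-Swiftbot-Version header name come from the description above. Request headers are easy to spoof, so a match is a hint rather than proof of origin.

```python
import re

# Matches "Swiftbot/<version>" or "Swiftbot-Recon/<version>" anywhere in the string.
UA_PATTERN = re.compile(r"\bSwiftbot(-Recon)?/\d+(\.\d+)*", re.IGNORECASE)


def swiftbot_indicators(headers):
    """Return a list of hypothetical labels for the Swiftbot indicators found."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    found = []

    match = UA_PATTERN.search(normalized.get("user-agent", ""))
    if match:
        found.append("recon-user-agent" if match.group(1) else "crawler-user-agent")

    if "x-swiftbot-version" in normalized:
        found.append("version-header")

    return found


crawler_request = {
    "User-Agent": "Swiftbot/1.0 (Mozilla/5.0 compatible; +https://swiftbot.com/bot)",
    "X-Swiftbot-Version": "2.1",
}
print(swiftbot_indicators(crawler_request))
```

Combining these checks with the published source networks (AS16509 and AS15169) would narrow the match, although those ranges are shared with many unrelated cloud customers.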
📊 Data Usage
Collected content is used as AI training data, specifically to improve the natural language understanding models behind Swiftbot’s enterprise summarization product. The data is stored temporarily for up to 30 days before being aggregated into anonymous training sets, as stated in Swiftbot’s privacy policy at https://swiftbot.com/privacy.
⚙️ Rate Limiting Policy
Swiftbot is rate-limited because its burst crawl speed (up to 15 requests/second) can overwhelm smaller web servers, causing performance degradation. A threshold-based block is therefore recommended: requests above a set rate are rejected, which protects server resources while still allowing the bot to collect data for legitimate AI research.
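One way to picture a threshold-based block is a sliding-window limiter, sketched below in Python. The class name, the threshold of 5 requests per second, and the "swiftbot" client key are assumptions for illustration, not recommended values. The caller passes timestamps in explicitly so the behavior is easy to reason about.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within a sliding time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = {}  # client key -> deque of accepted request timestamps

    def allow(self, client_key, now):
        """Return True if the request at time `now` (seconds) is under the threshold."""
        hits = self._hits.setdefault(client_key, deque())
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold: reject this request
        hits.append(now)
        return True


# Simulate a 15 requests/second burst against a hypothetical limit of 5 per second.
limiter = SlidingWindowLimiter(max_requests=5, window_seconds=1.0)
burst = [limiter.allow("swiftbot", i / 15) for i in range(15)]
print(sum(burst), "of", len(burst), "requests allowed")
```

Only accepted requests are recorded, so a client that keeps bursting is held at the threshold rather than locked out indefinitely. In a real deployment the rejected requests would typically receive an HTTP 429 response.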
ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.