search x-bot
Search Engine User-Agent: search-x-bot
🤖 Overview
Search X-Bot is a legitimate web crawler operated by X Corp. (formerly Twitter Inc.) as part of its search indexing infrastructure. First documented in public mailing lists and the official Twitter Developer Platform around 2023, the bot is used to collect publicly available web content for inclusion in X’s search results, particularly for link previews, card generation, and text-based indexing within the platform. The crawler is a direct successor to the earlier Twitterbot and is designed to improve the relevance and freshness of content surfaced when users search for URLs or topics on X.
🌐 Technical Behavior
The Search X-Bot speaks HTTP/1.1 and HTTP/2 and sends a default User-Agent string of X-Bot/1.0 (not to be confused with the older Twitterbot). It fetches pages over both IPv4 and IPv6, from IP ranges announced via ASN 13414 (Twitter Inc.). The bot sends standard HTTP request headers, including Accept-Language and Accept-Encoding: gzip. Crawl frequency is moderate, with a typical delay of 1–3 seconds between requests per domain, and the delay can grow for slower sites. It opens at most two simultaneous connections per host. The bot discovers new content through Link headers and sitemap.xml entries, and it schedules recrawls with a freshness heuristic tied to the site’s Last-Modified response header. It does not execute JavaScript or render pages; it only parses raw HTML. The crawler also checks robots.txt before every request, as confirmed by X’s official crawler documentation at developer.twitter.com.
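One way to compare the pacing described above with your own traffic is to measure the gaps between consecutive requests that carry the bot's User-Agent. The following is a minimal sketch, not a definitive test: the function names `request_gaps` and `within_expected_pacing` and the sample timestamps are hypothetical, and it assumes the request times for one host have already been extracted from an access log.

```python
from datetime import datetime
from statistics import median


def request_gaps(timestamps):
    """Return the gaps in seconds between consecutive request times."""
    ordered = sorted(timestamps)
    return [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]


def within_expected_pacing(timestamps, low=1.0, high=3.0):
    """True if the median gap falls inside the 1-3 second range cited above.

    The bounds come from this page's description, not from measured data.
    """
    gaps = request_gaps(timestamps)
    if not gaps:
        return False  # a single request carries no pacing information
    return low <= median(gaps) <= high


# Hypothetical request times for one User-Agent, parsed from an access log.
sample = [
    datetime(2024, 1, 1, 12, 0, 0),
    datetime(2024, 1, 1, 12, 0, 2),
    datetime(2024, 1, 1, 12, 0, 4),
    datetime(2024, 1, 1, 12, 0, 7),
]
print(request_gaps(sample))
print(within_expected_pacing(sample))
```

The median is used instead of the mean so that one long pause between crawl sessions does not distort the result.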
📋 robots.txt Compliance
X Corp. explicitly states that the Search X-Bot honors robots.txt Disallow directives. Its developer guidelines advise webmasters to use User-agent: X-Bot in their robots.txt file to control access. There are no known instances of the bot ignoring robots.txt rules, and it also respects Crawl-delay directives when specified. Independent webmaster forums (e.g., WebmasterWorld, 2024) have confirmed compliance through log analysis.
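Before deploying rules for this crawler, it can help to check how a standards-following parser reads them. The sketch below uses Python's standard `urllib.robotparser` on a hypothetical robots.txt; the paths, the delay value, and the example.com URLs are illustrative assumptions, and the result shows how Python's parser interprets the file, not how the bot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the paths and the delay value are examples only.
ROBOTS_TXT = """\
User-agent: X-Bot
Crawl-delay: 2
Disallow: /private/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

ua = "X-Bot/1.0"  # User-Agent string as given on this page
blocked = parser.can_fetch(ua, "https://example.com/private/report.html")
allowed = parser.can_fetch(ua, "https://example.com/blog/post.html")
delay = parser.crawl_delay(ua)

print(blocked, allowed, delay)
```

With these rules the parser should refuse the `/private/` URL, permit the blog URL, and report a crawl delay of 2 seconds for the X-Bot token.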
🔍 Detection Indicators
The primary identifier is the User-Agent string X-Bot/1.0. The bot may send an X-Forwarded-For header when it is behind a proxy, and a reverse DNS lookup on the source IP resolves to a *.twitter.com hostname. Behavioral fingerprints include the absence of a Referer header and a consistent request interval of 1–3 seconds. The bot also sends a From header containing a contact email address (as seen in public access logs). Official X support pages confirm these strings are used solely for search indexing.
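Because a User-Agent string is trivial to spoof, the indicators above are usually combined with a forward-confirmed reverse DNS check. The sketch below is one possible way to do that in Python, not an official verification procedure: the function name `is_probable_x_bot` is hypothetical, the `X-Bot/` token and the `.twitter.com` suffix are taken from this page and should be treated as assumptions, and the resolvers are injectable so the logic can be tried without network access.

```python
import socket


def is_probable_x_bot(user_agent, ip, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for a request claiming to be X-Bot.

    The "X-Bot/" token and ".twitter.com" suffix are assumptions taken from
    this page's description.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex only returns IPv4 addresses; IPv6 sources would
        # need socket.getaddrinfo instead.
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    if "X-Bot/" not in user_agent:
        return False
    try:
        hostname = reverse_lookup(ip)
        if not hostname.lower().rstrip(".").endswith(".twitter.com"):
            return False
        # Forward-confirm: the hostname must resolve back to the same IP,
        # otherwise anyone controlling a PTR record could spoof the name.
        return ip in forward_lookup(hostname)
    except OSError:
        return False  # lookup failed: treat the request as unverified


# Offline demonstration with made-up DNS data (documentation IP range).
fake_ptr = {"192.0.2.10": "crawl-10.twitter.com", "192.0.2.66": "host.example.net"}
fake_a = {"crawl-10.twitter.com": ["192.0.2.10"]}
genuine = is_probable_x_bot("X-Bot/1.0", "192.0.2.10", fake_ptr.__getitem__, fake_a.__getitem__)
spoofed = is_probable_x_bot("X-Bot/1.0", "192.0.2.66", fake_ptr.__getitem__, fake_a.__getitem__)
print(genuine, spoofed)
```

In production the two lookups would be cached, since resolving DNS on every request adds latency.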
📊 Data Usage
The collected content is used exclusively for X’s search function, generating rich link previews (Twitter Cards), and improving text-based search ranking within the X platform. Data is not used for AI training or advertising targeting; it is strictly for enhancing the user experience when sharing or discovering web links on X. The bot does not store full page copies, only metadata and snippets.
⚙️ Rate Limiting Policy
While the Search X-Bot is legitimate, its moderate crawl volume can still strain smaller servers. Rate limiting is therefore applied on a threshold basis: requests above a set limit (e.g., 50 requests per minute per IP) are rejected, but the bot is not blocked outright. This policy balances the bot’s utility for link previews against server load, so the bot can still index important pages without causing an accidental denial of service.
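A threshold of this kind is commonly implemented as a sliding-window counter per client IP. The sketch below is a minimal in-memory illustration in Python, not the implementation used by any particular product: the class name `SlidingWindowLimiter` is hypothetical, the 50-per-minute default mirrors the example figure above, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per IP within any window_seconds span."""

    def __init__(self, max_requests=50, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip, now):
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold: reject this request only
        hits.append(now)
        return True


# Small limit so the effect is visible: three requests per 60 seconds.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60.0)
decisions = [limiter.allow("192.0.2.10", t) for t in (0.0, 1.0, 2.0, 3.0, 61.0)]
print(decisions)
```

Rejected requests are not recorded, so a client that backs off regains access as soon as older requests leave the window. A real deployment would pass `time.monotonic()` as `now` and would typically answer rejected requests with HTTP 429.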
ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.