solbot
Bot User-Agent: solbot
🤖 Overview
Solbot is a web crawler operated by Yandex, the Russian multinational technology company, as part of its search engine infrastructure for indexing web content. First documented in Yandex's official Webmaster documentation, Solbot specifically handles real-time, dynamic content indexing to feed Yandex's search results and related services, such as Yandex.Browser and Yandex.Direct advertising. Unlike the main YandexBot, Solbot is optimized for crawling JavaScript-heavy websites and single-page applications, using a headless Chromium engine to render pages and extract content that traditional crawlers might miss.
🌐 Technical Behavior
Solbot's crawl rate can reach 10 requests per second per host, as observed in Yandex Webmaster reports, though it respects the Crawl-delay directive in robots.txt to reduce load. Its requests come primarily from IP ranges in Yandex's announced ASNs, including AS200350 and AS208722, with addresses located in Russia and the Netherlands. Solbot communicates over both HTTP/1.1 and HTTP/2 and sends headers such as Accept-Language: ru-RU,en;q=0.9 and User-Agent: Mozilla/5.0 (compatible; YandexSolBot/2.0; +http://www.yandex.com/bots). It follows redirects aggressively, up to 5 consecutive redirects per request. Yandex's official documentation confirms that Solbot uses a JavaScript engine based on Chromium 80+, enabling it to evaluate client-side scripts before indexing.
📋 robots.txt Compliance
Yandex states in its Help for Webmasters page that Solbot honors Disallow and Crawl-delay directives in robots.txt, with one exception: the crawler may still fetch disallowed URLs that are required as JavaScript rendering dependencies. Webmasters can control Solbot through the general User-agent: Yandex group, but Yandex recommends User-agent: YandexSolBot for granular control. Compliance is generally reliable, as evidenced by multiple independent tests from webmaster forums and Yandex's own bugtracker (YWT-4521).
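As a rough illustration of the granular control described above, the sketch below parses a sample robots.txt with Python's standard urllib.robotparser and checks what a crawler identifying as YandexSolBot would be permitted to fetch. The rules, the /private/ path, and the example.com URLs are assumptions chosen for the example, not values taken from Yandex's documentation, and the parser only models a fully compliant crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a dedicated group for YandexSolBot plus a
# permissive default group for every other crawler.
ROBOTS_TXT = """\
User-agent: YandexSolBot
Crawl-delay: 5
Disallow: /private/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant crawler using the YandexSolBot token should skip /private/.
blocked = parser.can_fetch("YandexSolBot", "https://example.com/private/report")
allowed = parser.can_fetch("YandexSolBot", "https://example.com/products")
delay = parser.crawl_delay("YandexSolBot")

print(blocked, allowed, delay)
```

Checking a draft robots.txt this way before deploying it helps catch rules that land in the wrong User-agent group.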
🔍 Detection Indicators
The primary User-Agent string is Mozilla/5.0 (compatible; YandexSolBot/2.0; +http://www.yandex.com/bots); variants with other version numbers, such as YandexSolBot/1.0, sometimes appear. Behavioral fingerprints include the X-Yandex-User-Agent: YandexSolBot/2.0 header, a source IP within a known prefix of Yandex's AS200350, and the absence of typical browser plugins. Solbot also often requests /robots.txt before any other page and, in rare legacy deployments, sends a From: [email protected] header.
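The indicators above could be combined in a simple server-side check. The following is a minimal sketch, assuming a hypothetical classify_request helper: it matches the YandexSolBot token in the User-Agent header and then tests whether the source address falls inside a list of trusted networks. The network shown is an RFC 5737 documentation range used purely as a placeholder; the real prefixes announced under AS200350 are not listed here and would need to be looked up and verified separately.

```python
import ipaddress
import re

# Placeholder network (RFC 5737 documentation range). Replace with
# prefixes you have verified yourself; these are NOT real crawler ranges.
EXAMPLE_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]

# Matches the product token and captures the version, e.g. "2.0".
UA_PATTERN = re.compile(r"\bYandexSolBot/(\d+(?:\.\d+)*)")


def classify_request(user_agent, remote_ip, networks=EXAMPLE_NETWORKS):
    """Return a coarse label for a request based on UA token and source IP."""
    if UA_PATTERN.search(user_agent) is None:
        return "not-solbot"
    addr = ipaddress.ip_address(remote_ip)
    if any(addr in net for net in networks):
        return "solbot-verified"
    # The UA claims to be the crawler but the address is outside the
    # trusted networks, so the header may be spoofed.
    return "solbot-unverified"


ua = "Mozilla/5.0 (compatible; YandexSolBot/2.0; +http://www.yandex.com/bots)"
print(classify_request(ua, "192.0.2.10"))
print(classify_request(ua, "203.0.113.5"))
```

Because a User-Agent header is trivial to forge, the sketch treats the header alone as a claim and relies on the address check to confirm it.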
📊 Data Usage
Data collected by Solbot is used to build and refresh Yandex's search index, particularly for pages that rely heavily on JavaScript frameworks like React, Angular, or Vue. The rendered content, including dynamically generated meta tags and hidden elements, is indexed to improve search result relevance for Yandex users. Yandex also uses Solbot's data for their Yandex.Zen recommendation system and for generating snippets in Yandex.Direct advertisements.
⚙️ Rate Limiting Policy
Solbot is rate-limited because its aggressive JavaScript rendering and concurrent requests can significantly increase server load on web applications. The policy recommends setting Crawl-delay: 5 or higher in robots.txt to prevent resource exhaustion. Threshold-based blocking should apply only when Solbot requests pages faster than the specified delay allows or shows abnormal patterns such as excessive 404 hits.
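One way to express this threshold-based policy is sketched below. The should_block function and its max_404_ratio cutoff of 0.5 are assumptions made for illustration; the policy itself does not define what counts as "excessive" 404 hits, so the value should be tuned against real traffic.

```python
def should_block(events, crawl_delay=5.0, max_404_ratio=0.5):
    """Decide whether a crawler's recent requests breach the policy.

    events: list of (timestamp_seconds, http_status) tuples sorted by time.
    crawl_delay: minimum gap in seconds, mirroring the robots.txt value.
    max_404_ratio: assumed cutoff for the share of 404 responses.
    """
    if len(events) < 2:
        return False  # not enough data to judge a pattern

    timestamps = [t for t, _ in events]
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    too_fast = min(gaps) < crawl_delay

    not_found = sum(1 for _, status in events if status == 404)
    too_many_404 = not_found / len(events) > max_404_ratio

    return too_fast or too_many_404


polite = [(0.0, 200), (5.0, 200), (10.0, 200)]
hasty = [(0.0, 200), (1.0, 200)]
print(should_block(polite), should_block(hasty))
```

This version flags a crawler on a single gap shorter than the delay, which is strict; averaging gaps over a window would tolerate occasional bursts.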
ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.