YandexAdditionalBot

Bot User-Agent: yandexadditionalbot

🤖 Overview

YandexAdditionalBot is a web crawler operated by the Russian technology company Yandex, primarily used to fetch supplementary content not covered by the main YandexBot crawler. According to Yandex’s official documentation at https://yandex.com/support/webmaster/bot-and-robots-tasks.html, this bot is employed for tasks such as checking external links, retrieving CSS and JavaScript resources, and gathering data for Yandex’s proprietary indexing algorithms. It is part of Yandex’s broader ecosystem that powers the Yandex Search engine and related services like Yandex.Maps and Yandex.Video.

🌐 Technical Behavior

This bot’s baseline crawl rate is modest, typically a few resources per domain per minute, but it can burst to multiple requests per second while indexing dynamically generated pages. Requests come from a pool of IP addresses, primarily in the 77.88.0.0/18 and 93.158.0.0/18 ranges, as listed in Yandex’s published IP list at https://yandex.com/support/webmaster/bot-and-robots-tasks/posts/task-ip-addresses.html. It connects over HTTPS and sends a standard HTTP User-Agent header. It follows redirects (HTTP 301/302) and respects Cache-Control headers. The bot does not appear to negotiate HTTP/2 or to send Accept-Encoding: gzip by default, though it may accept compressed responses if a server offers them.
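As a rough illustration of the IP check described above, the following Python sketch tests whether a client address falls inside the two ranges quoted in this section. The function name `in_listed_ranges` is hypothetical, and the range list is only what this page mentions; it is an assumption that these ranges are current, so treat Yandex's published list as the authority.

```python
import ipaddress

# Ranges quoted in this section; illustrative only, not an exhaustive
# or guaranteed-current list of Yandex crawler addresses.
LISTED_RANGES = [
    ipaddress.ip_network("77.88.0.0/18"),
    ipaddress.ip_network("93.158.0.0/18"),
]


def in_listed_ranges(ip: str) -> bool:
    """Return True if `ip` parses and falls inside one of LISTED_RANGES."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Malformed input is treated as "not in range".
        return False
    return any(addr in net for net in LISTED_RANGES)
```

A range match only narrows the candidates. It is best combined with the DNS verification described under Detection Indicators.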

📋 robots.txt Compliance

YandexAdditionalBot honors robots.txt directives as documented in Yandex’s webmaster guidelines, with one reported exception. It obeys Disallow and Allow rules in a group targeted at its User-Agent string: a Disallow rule on a path prefix blocks that whole section, and Disallow: / blocks the entire site. Crawl-delay is a less reliable control, because Yandex announced in 2018 that its crawlers no longer act on that directive and that crawl rate should be set in Yandex Webmaster instead. The exception concerns resources needed for rendering: if CSS or JavaScript files are blocked, the bot may still request them, a behavior Yandex documents as “limited override for page rendering”. Outside that case it adheres to standard rules.
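To preview how a bot-specific rule group reads, the sketch below feeds a sample robots.txt to Python's standard `urllib.robotparser`. The paths and the example.com URLs are made up for illustration, and the result shows how Python's parser interprets the file, which is an approximation of, not a substitute for, Yandex's own matching logic.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one group for this bot, one default group.
# The Allow line is placed first because Python's parser applies the
# first matching rule, while some crawlers prefer the longest match.
ROBOTS_TXT = """\
User-agent: YandexAdditionalBot
Allow: /private/summary.html
Disallow: /private/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BOT = "YandexAdditionalBot"
public_ok = parser.can_fetch(BOT, "https://example.com/blog/post")
summary_ok = parser.can_fetch(BOT, "https://example.com/private/summary.html")
private_ok = parser.can_fetch(BOT, "https://example.com/private/report.pdf")
other_bot_ok = parser.can_fetch("SomeOtherBot", "https://example.com/private/report.pdf")

print(public_ok, summary_ok, private_ok, other_bot_ok)
```

Under Python's parser, only the `/private/report.pdf` request from YandexAdditionalBot is refused; the other bot falls through to the default group, which allows everything.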

🔍 Detection Indicators

The primary detection fingerprint is the User-Agent string: Mozilla/5.0 (compatible; YandexAdditionalBot/3.0; +http://yandex.com/bots). Additional identifying headers reportedly include X-Yandex-Bot: 1 and sometimes a From header carrying the bot’s contact email. X-Robots-Tag is a response header set by the server, so it does not appear in the bot’s requests and cannot serve as a fingerprint. Because any client can copy a User-Agent string, confirm suspicious traffic with a reverse DNS lookup on the source IP, which should return a host under *.yandex.net or *.yandex.ru, followed by a forward lookup of that host that resolves back to the same IP. Behavioral fingerprints include high variance in request intervals and repeated fetches of the same path with different query parameters.
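The checks above can be sketched in Python as a User-Agent match plus a forward-confirmed reverse DNS lookup. The helper names (`looks_like_bot`, `hostname_is_trusted`, `verify_ip`) are hypothetical, the trusted suffixes are the two mentioned in this section, and the forward-confirmation step is a common hardening practice assumed here rather than something this page attributes to Yandex. The resolver functions are injectable so the logic can be exercised without network access.

```python
import re
import socket

# Matches the product token from the User-Agent string quoted above.
UA_PATTERN = re.compile(r"YandexAdditionalBot/\d+(\.\d+)*", re.IGNORECASE)

# Suffixes taken from this section; the leading dot prevents
# look-alike domains such as "fakeyandex.ru" from matching.
TRUSTED_SUFFIXES = (".yandex.net", ".yandex.ru")


def looks_like_bot(user_agent: str) -> bool:
    """Cheap first-pass check; a User-Agent alone is easy to spoof."""
    return bool(UA_PATTERN.search(user_agent))


def hostname_is_trusted(hostname: str) -> bool:
    """True if the hostname ends with one of the trusted suffixes."""
    return hostname.lower().rstrip(".").endswith(TRUSTED_SUFFIXES)


def verify_ip(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Reverse-resolve `ip`, check the suffix, then forward-confirm.

    Note: gethostbyname_ex only returns IPv4 addresses.
    """
    try:
        hostname = reverse(ip)[0]
        if not hostname_is_trusted(hostname):
            return False
        return ip in forward(hostname)[2]
    except OSError:
        # Covers socket.herror and socket.gaierror (lookup failures).
        return False
```

In practice the DNS result is worth caching per IP, since a lookup on every request would add latency.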

📊 Data Usage

Collected data is used exclusively for Yandex Search indexing and for building Yandex’s knowledge graph. The bot retrieves supplementary resources to improve search result relevance, including link validation, metadata extraction, and image alt-text verification. According to Yandex’s privacy policy at https://yandex.com/legal/confidential/, the data is not used for AI training or third-party analytics but solely to maintain and enhance Yandex’s search index.

⚙️ Rate Limiting Policy

Rate limiting is recommended because YandexAdditionalBot can initiate abrupt bursts of requests, particularly when re‑indexing updated content. A threshold-based block (e.g., >20 requests per second from a single IP) is prudent to protect server resources without harming legitimate indexing, as the bot will automatically back off and retry later per its crawl logic.
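One way to implement the threshold described above is a per-IP sliding-window counter. The Python sketch below is a minimal in-memory version under stated assumptions: the class name `SlidingWindowLimiter` is hypothetical, the default of 20 requests per one-second window mirrors the example figure in this section, and a production setup would more likely enforce the limit at the proxy or firewall layer.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each IP."""

    def __init__(self, max_requests: int = 20, window_seconds: float = 1.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed hits

    def allow(self, ip: str, now: float = None) -> bool:
        """Record a request and return False once the IP exceeds the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A caller would typically answer a refused request with HTTP 429 rather than dropping the connection, on the assumption that a well-behaved crawler retries later.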

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.