Semrush Bot — Detection, Blocking & Technical Analysis


Bot User-Agent: semrush

🤖 Overview

SemrushBot is the web crawling agent operated by Semrush, a digital marketing and competitive intelligence platform headquartered in Boston, USA. Its primary purpose is to collect publicly available web data—such as site structure, content, backlinks, keyword usage, and advertising placements—for the company's suite of SEO, PPC, content marketing, and social media analysis tools. The bot feeds data into Semrush’s Domain Analytics, Organic Research, Backlink Analytics, and Position Tracking products, enabling users to benchmark their online performance against competitors. According to Semrush’s official documentation (semrush.com/bot/), the crawler systematically indexes pages to provide up-to-date metrics for millions of domains worldwide.

🌐 Technical Behavior

SemrushBot employs a breadth-first crawl strategy, starting from seed URLs derived from search engine results, user-submitted domains, and publicly available link data. It typically requests pages over HTTP/1.1 at a default rate of one request every 1–2 seconds, though faster bursts are possible during large-scale re-indexing cycles. The crawler respects the Crawl-delay directive in robots.txt, with a documented maximum delay of 5 seconds. Requests originate from Semrush-owned IPv4 blocks, including 5.255.88.0/24, 93.158.160.0/20, and 185.144.32.0/22 (verified via WHOIS and Semrush's official IP list at semrush.com/bot/ip/). The bot supports TLS 1.2+ and gzip compression. It does not execute JavaScript, but it parses HTML, CSS, and raw link structures to extract URLs, anchor text, and metadata.
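As a rough illustration of how the address ranges above could be used in log analysis, the following sketch checks whether a client IP falls inside one of the listed blocks. The function name `in_listed_blocks` is hypothetical, and the CIDR values are copied from the text as-is; they should be confirmed against current WHOIS data before being used for allow or deny decisions.

```python
import ipaddress

# CIDR blocks as listed in the text above. Treat them as unverified examples
# and confirm them against current WHOIS data before relying on them.
LISTED_BLOCKS = [
    ipaddress.ip_network("5.255.88.0/24"),
    ipaddress.ip_network("93.158.160.0/20"),
    ipaddress.ip_network("185.144.32.0/22"),
]


def in_listed_blocks(ip: str) -> bool:
    """Return True if `ip` parses as an address inside any listed block."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Malformed input (for example, a truncated log field) is not a match.
        return False
    return any(addr in net for net in LISTED_BLOCKS)


if __name__ == "__main__":
    for candidate in ("93.158.165.10", "93.158.176.1", "not-an-ip"):
        print(candidate, in_listed_blocks(candidate))
```

An IP match alone does not prove a request came from SemrushBot, and a miss does not rule it out, so this check is best combined with the header-based indicators described further down.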

📋 robots.txt Compliance

Semrush explicitly states that SemrushBot honors robots.txt directives as defined by the Robots Exclusion Protocol. Third-party reports (e.g., user reports on webmaster forums) indicate that it stops crawling paths covered by a Disallow rule. The crawler also respects the Allow and Crawl-delay fields. Semrush's bot page notes that site operators can adjust access via robots.txt or by contacting Semrush support to request a crawl pause. Non-compliance cases are rare and typically arise from misconfigured robots.txt files.
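Because misconfigured robots.txt files are named as the usual cause of apparent non-compliance, it can help to test a rule set before deploying it. The sketch below uses Python's standard `urllib.robotparser` to evaluate a sample file; the paths and the `ROBOTS_TXT` content are made-up examples, and Python's parser applies rules in file order, which may differ from the precedence SemrushBot itself uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only.
# The Allow line comes first because urllib.robotparser uses the first
# matching rule within a group, not the longest match.
ROBOTS_TXT = """\
User-agent: SemrushBot
Allow: /private/press/
Disallow: /private/
Crawl-delay: 5

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

if __name__ == "__main__":
    for path in ("/private/report.html", "/private/press/launch.html", "/blog/"):
        print(path, parser.can_fetch("SemrushBot", path))
    print("crawl delay:", parser.crawl_delay("SemrushBot"))
```

Running a check like this against the live file (via `set_url` and `read` instead of `parse`) is a quick way to confirm that a Disallow rule covers the paths it was meant to cover.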

🔍 Detection Indicators

The primary identifier is the User‑Agent string: SemrushBot/7.0 (+http://www.semrush.com/bot.html), though older versions like SemrushBot/6.0 may still be encountered. Secondary indicators include the From header (set to an email address from semrush.com) and a consistent “X‑Crawler‑Name: SemrushBot” header. Behavioural fingerprints include rapid consecutive requests to multiple subdirectories of the same domain, a lack of cookie storage, and no referrer header.
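The header-based indicators above can be combined into a simple check, sketched here under a few assumptions: the function name `semrush_indicators` is hypothetical, the sample From address is invented for the example, and all three headers are client-supplied and therefore easy to spoof.

```python
import re
from typing import List, Mapping

# Match the product token anywhere in the User-Agent, regardless of version.
UA_PATTERN = re.compile(r"semrushbot", re.IGNORECASE)
# Match an email address at semrush.com or one of its subdomains.
FROM_PATTERN = re.compile(r"@(?:[\w-]+\.)*semrush\.com\s*$", re.IGNORECASE)


def semrush_indicators(headers: Mapping[str, str]) -> List[str]:
    """Return the names of the request headers that point to SemrushBot."""
    # Header names are case-insensitive, so normalize them first.
    h = {name.lower(): value for name, value in headers.items()}
    found = []
    if UA_PATTERN.search(h.get("user-agent", "")):
        found.append("user-agent")
    if FROM_PATTERN.search(h.get("from", "")):
        found.append("from")
    if h.get("x-crawler-name", "").strip().lower() == "semrushbot":
        found.append("x-crawler-name")
    return found


if __name__ == "__main__":
    sample = {
        "User-Agent": "SemrushBot/7.0 (+http://www.semrush.com/bot.html)",
        "From": "bot@semrush.com",  # invented address for illustration
        "X-Crawler-Name": "SemrushBot",
    }
    print(semrush_indicators(sample))
```

Returning the list of matched indicators rather than a single boolean leaves it to the caller to decide how many signals are enough before acting on a request.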

📊 Data Usage

Collected data is processed and aggregated to power Semrush’s analytics dashboard: organic keyword rankings, backlink profiles, domain authority scores, and traffic estimates. The company uses the data internally for machine learning models that generate, for example, keyword difficulty scores and content topic suggestions. Semrush does not sell raw crawl data; instead, it offers derived insights to subscribers under strict terms of service that prohibit reverse‑engineering or re‑publication of raw data.

⚙️ Rate Limiting Policy

Server administrators may rate-limit SemrushBot to prevent server strain. The policy recommends threshold-based blocking only when the bot exceeds 500 requests per minute or ignores robots.txt directives for more than 24 hours. Semrush encourages reasonable limits and provides a dedicated contact for adjusting excessive crawl volume.
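One way to apply the 500-requests-per-minute threshold is a sliding-window counter over request timestamps. The following is a minimal sketch, not a production rate limiter: the class name `SlidingWindowCounter` is hypothetical, timestamps are passed in explicitly so the logic is easy to test, and a real deployment would keep one counter per client and usually enforce the limit at the web server or proxy layer.

```python
from collections import deque


class SlidingWindowCounter:
    """Count requests seen within the most recent `window_seconds`."""

    def __init__(self, limit: int = 500, window_seconds: float = 60.0):
        self.limit = limit
        self.window = window_seconds
        self.timestamps = deque()

    def record(self, now: float) -> bool:
        """Record a request at time `now`; return True if the limit is exceeded."""
        self.timestamps.append(now)
        cutoff = now - self.window
        # Drop timestamps that have fallen out of the window.
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        return len(self.timestamps) > self.limit


if __name__ == "__main__":
    counter = SlidingWindowCounter(limit=3, window_seconds=60.0)
    for t in (0.0, 1.0, 2.0, 3.0, 100.0):
        print(t, counter.record(t))
```

The small limit in the usage example is only there to make the behaviour visible; with the defaults, `record` returns True once more than 500 requests arrive within any 60-second span.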


ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.
