statbot Bot — Detection, Blocking & Technical Analysis

Bot User-Agent: statbot

🤖 Overview

Statbot is a web crawler operated by StatCounter, an Irish web analytics company founded in 1999. Its stated purposes are to collect publicly available website metadata in support of the StatCounter Global Stats reports and to verify that StatCounter’s tracking code is installed correctly on client sites. The bot is documented on StatCounter’s official crawler page at https://statcounter.com/statbot.html, which states that it is used only to check that the StatCounter code is present and functional, not to harvest page content.

🌐 Technical Behavior

Statbot sends HTTP HEAD and GET requests at a low frequency, typically one request per URL per day. According to its own policy page, it does not honor robots.txt Disallow rules by default. Its IP addresses are drawn from a pool of StatCounter-owned ranges, including 69.16.255.0/24 and 72.14.254.0/24, as documented on https://statcounter.com/ip-addresses.html. The crawler operates over HTTP/1.1, supports gzip compression, respects the Cache-Control header, and does not parse JavaScript or CSS. Its crawl pattern is linear: it requests the homepage first, then checks the HTML response for the StatCounter tracking snippet.
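The address ranges quoted above can be turned into a simple membership check. The following is a minimal sketch, not an official tool: the function name `is_statbot_ip` and the constant `STATBOT_RANGES` are hypothetical, and the two ranges are copied from this page only, so they should be checked against StatCounter's published list before use.

```python
import ipaddress

# Ranges quoted in the text above (assumption: still current; verify
# against StatCounter's published list before relying on them).
STATBOT_RANGES = [
    ipaddress.ip_network("69.16.255.0/24"),
    ipaddress.ip_network("72.14.254.0/24"),
]


def is_statbot_ip(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside one of the listed ranges."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # not a parseable IP address
    return any(addr in net for net in STATBOT_RANGES)
```

Combining this check with the User-Agent is generally more reliable than using either signal alone, since a User-Agent header can be set to any value by any client.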

📋 robots.txt Compliance

StatCounter explicitly states on its bot page that Statbot does not honor robots.txt directives when performing verification checks. The stated rationale is that the bot makes only a single HEAD request to confirm code presence and never downloads resources such as images, scripts, or large files. For sites that wish to block even this minimal check, StatCounter provides an opt-out mechanism based on a specific User-Agent token described in its FAQ.
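If robots.txt is not honored, a site that wants to refuse the check has to do so at the server. The sketch below shows one way this might look as a WSGI middleware. It is an illustration under assumptions: the wrapper name `block_statbot` is hypothetical, it matches on the substring "statbot" in the User-Agent, and it is unrelated to the opt-out token mentioned above, which this page does not reproduce.

```python
def block_statbot(app):
    """Wrap a WSGI app; answer 403 when the User-Agent names Statbot."""

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if "statbot" in user_agent.lower():
            body = b"Forbidden\n"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Any other client is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware
```

Note that blocking the check may interfere with StatCounter's ability to confirm that the tracking code is installed, so this is only appropriate for sites that do not use the service.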

🔍 Detection Indicators

The primary User-Agent string is Mozilla/5.0 (compatible; Statbot/1.0; +http://www.statcounter.com/statbot.html); a more recent variant is Statbot/2.0 (http://www.statcounter.com/statbot.html). Behavioral fingerprints include a single HEAD request per session, a missing Referer header, and a Connection: close header in the request. No Accept-Language header is sent, and the User-Agent is the only identifying header.
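The indicators above can be expressed as a small per-request check. This is a sketch under assumptions rather than a definitive detector: the names `STATBOT_UA`, `statbot_signals`, and `looks_like_statbot` are hypothetical, the regular expression is inferred from the two User-Agent strings quoted above, and the "single HEAD request per session" indicator is reduced here to checking the method of one request.

```python
import re

# Matches both User-Agent variants quoted above (Statbot/1.0, Statbot/2.0).
STATBOT_UA = re.compile(r"\bStatbot/\d+\.\d+\b", re.IGNORECASE)


def statbot_signals(method: str, headers: dict) -> dict:
    """Evaluate each indicator from the text for a single request."""
    h = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    return {
        "ua_match": bool(STATBOT_UA.search(h.get("user-agent", ""))),
        "head_request": method.upper() == "HEAD",
        "no_referer": "referer" not in h,
        "connection_close": h.get("connection", "").strip().lower() == "close",
        "no_accept_language": "accept-language" not in h,
    }


def looks_like_statbot(method: str, headers: dict) -> bool:
    """True only when every indicator is present."""
    return all(statbot_signals(method, headers).values())
```

Requiring every indicator is a deliberately strict choice. Since the text also mentions GET requests, a site may prefer to treat `ua_match` as the primary signal and the remaining entries as corroboration.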

📊 Data Usage

Data collected by Statbot is used exclusively to confirm that the StatCounter tracking code is active on a website, which in turn allows StatCounter to produce aggregated analytics for its clients. No page content, user data, or personal information is stored; the bot only checks for the presence of the StatCounter JavaScript snippet. StatCounter Global Stats reports are derived from the tracking data collected from participating websites, not from the bot’s checks.

⚙️ Rate Limiting Policy

Because Statbot makes very few requests and serves a legitimate operational purpose, rate limits are typically set to allow a small number of daily requests per IP (e.g., 5–10) without penalty. Blocking for excessive requests is unnecessary under normal conditions, but thresholds may be applied if a site detects abnormal frequency spikes caused by misconfiguration or by multiple StatCounter accounts.
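A per-IP daily allowance of this kind can be sketched with an in-memory counter. The class name `DailyLimiter` and its parameters are hypothetical, the default of 10 is taken from the upper end of the example range above, and a real deployment would more likely keep the counts in a shared store than in process memory.

```python
from collections import defaultdict
from datetime import date
from typing import Optional


class DailyLimiter:
    """Allow up to max_per_day requests per IP; counts reset each calendar day."""

    def __init__(self, max_per_day: int = 10):
        self.max_per_day = max_per_day
        self._day: Optional[date] = None
        self._counts = defaultdict(int)

    def allow(self, ip: str, today: Optional[date] = None) -> bool:
        today = today or date.today()
        if today != self._day:
            # First request of a new day: drop all of yesterday's counts.
            self._day = today
            self._counts.clear()
        self._counts[ip] += 1
        return self._counts[ip] <= self.max_per_day
```

The optional `today` argument exists only so the reset behavior can be exercised without waiting for the date to change.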

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.
