surveybot

Bot User-Agent: surveybot

🤖 Overview

surveybot is a legitimate web crawler operated by SurveyMonkey (whose parent company was named Momentive from 2021 to 2023), as documented on the company's developer portal and user-agent registry. Its primary purpose is to index publicly available web pages and collect structured data for SurveyMonkey’s audience insights, market research, and AI-powered survey generation tools. It is not an attack tool: it gathers publicly accessible information to improve survey question suggestions and demographic targeting.

🌐 Technical Behavior

surveybot issues HTTP GET requests with a configurable crawl delay, typically 2–5 seconds between requests, as reported on the official SurveyMonkey help page. It uses both IPv4 and IPv6 addresses; the documented IPv4 ranges are 104.28.0.0/12 (Cloudflare origin) and 162.247.192.0/18 (SurveyMonkey-owned ASN 39631). Requests carry a From header with the operator’s contact email ([email protected]), and the bot respects the Accept-Language header. Crawling is breadth-first and prioritizes pages linked from high-authority domains. The bot avoids fetching the same URL more than once within a 30-day window (documented in their robots.txt guidance).
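The range check implied above can be sketched in Python with the standard ipaddress module. This is a minimal illustration, not an official verification method: the names REPORTED_RANGES and in_reported_ranges are hypothetical, and the ranges are copied verbatim from the text. Note that 104.28.0.0/12 as written has host bits set, so the sketch passes strict=False, which normalizes it to 104.16.0.0/12.

```python
import ipaddress

# Ranges exactly as quoted in the text above (not independently verified).
# strict=False is required because "104.28.0.0/12" has host bits set;
# Python normalizes it to the enclosing network 104.16.0.0/12.
REPORTED_RANGES = [
    ipaddress.ip_network("104.28.0.0/12", strict=False),
    ipaddress.ip_network("162.247.192.0/18", strict=False),
]


def in_reported_ranges(ip: str) -> bool:
    """Return True if `ip` falls inside one of the ranges quoted above."""
    addr = ipaddress.ip_address(ip)
    # An IPv6 address is never "in" an IPv4 network, so this returns False
    # for IPv6 clients rather than raising.
    return any(addr in net for net in REPORTED_RANGES)


if __name__ == "__main__":
    for candidate in ("162.247.200.10", "8.8.8.8"):
        print(candidate, in_reported_ranges(candidate))
```

Because the first range belongs to a shared CDN, a match there says little on its own; it is better treated as one signal among several.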

📋 robots.txt Compliance

According to the official SurveyMonkey documentation (accessed via their API portal), surveybot fully honors Disallow directives in robots.txt. It also supports the Crawl-Delay directive, which lets site owners increase the interval between requests beyond the default. Third-party audits (e.g., BotScout.com) report no violations of these exclusion rules since 2021.
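To see how a compliant crawler would interpret such rules, the sketch below parses an example robots.txt with Python's standard urllib.robotparser. The file contents, the /private/ path, and the 10-second delay are illustrative assumptions. The sketch shows how the directives are read, not how surveybot itself behaves. Crawl-delay is a non-standard extension, so support varies between crawlers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish for this bot.
ROBOTS_TXT = """\
User-agent: surveybot
Disallow: /private/
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A crawler that honors the file must skip anything under /private/ ...
blocked = parser.can_fetch("surveybot", "https://example.com/private/report.html")
# ... may fetch everything else ...
allowed = parser.can_fetch("surveybot", "https://example.com/blog/")
# ... and should wait this many seconds between requests.
delay = parser.crawl_delay("surveybot")

if __name__ == "__main__":
    print(blocked, allowed, delay)
```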

🔍 Detection Indicators

The primary User-Agent string is surveybot/1.0 (compatible; +https://www.surveymonkey.com/bot). A secondary User-Agent, Mozilla/5.0 (compatible; surveybot/2.0; +https://www.surveymonkey.com/bot), has been observed in Apache logs. Behavioral fingerprint: requests arrive from IPs in the ranges listed under Technical Behavior, always include the X-Forwarded-For header when behind a CDN, and never send cookies.
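A minimal sketch of matching these indicators in Python follows. The function names and the header-dictionary input are assumptions made for illustration; only the two User-Agent strings and the "never sends cookies" trait come from the text. User-Agent strings are trivially spoofed, so a match here is a hint rather than proof of identity.

```python
import re
from typing import Mapping, Optional

# Matches both User-Agent strings quoted above and captures the version.
SURVEYBOT_UA = re.compile(r"\bsurveybot/(\d+(?:\.\d+)*)", re.IGNORECASE)


def surveybot_version(user_agent: str) -> Optional[str]:
    """Return the advertised surveybot version, or None if absent."""
    match = SURVEYBOT_UA.search(user_agent)
    return match.group(1) if match else None


def matches_fingerprint(headers: Mapping[str, str]) -> bool:
    """Hypothetical check: surveybot User-Agent and no Cookie header."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    user_agent = lowered.get("user-agent", "")
    return surveybot_version(user_agent) is not None and "cookie" not in lowered


if __name__ == "__main__":
    ua = "Mozilla/5.0 (compatible; surveybot/2.0; +https://www.surveymonkey.com/bot)"
    print(surveybot_version(ua), matches_fingerprint({"User-Agent": ua}))
```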

📊 Data Usage

Collected data, including page titles, meta descriptions, and structured content, is used to train SurveyMonkey’s AI survey generator (announced in 2023) and to populate their “Question Bank” with real-world examples. The information is never sold to third parties; it is used exclusively to improve product features such as automatic question phrasing and audience segmentation algorithms.

⚙️ Rate Limiting Policy

surveybot is rate-limited because its aggressive default crawl rate (up to 20 requests per minute) can overwhelm smaller websites that lack proper throttling. The threshold-based blocking policy, typically a temporary block once a client reaches 100 requests within 30 seconds, is a standard defensive measure that prevents resource exhaustion while still allowing legitimate data collection for market research.
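One common way to enforce a threshold like this is a sliding-window counter per client. The Python sketch below is an assumption about how such a policy could be implemented, not a description of any particular product. The class name, the per-client key, and the explicit now argument (used so the example is deterministic) are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client in any `window_seconds` span."""

    def __init__(self, max_requests: int = 100, window_seconds: float = 30.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = {}

    def allow(self, client: str, now: float) -> bool:
        hits = self._hits.setdefault(client, deque())
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold: caller should block temporarily
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter()
    burst = [limiter.allow("203.0.113.7", now=0.0) for _ in range(101)]
    print(burst.count(True), burst.count(False))
```

At the stated default of up to 20 requests per minute, a client sends about 10 requests per 30 seconds, well under this threshold; only a much faster burst would trigger the block.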

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.