surveybot
Bot User-Agent: surveybot
🤖 Overview
surveybot is a legitimate web crawler operated by SurveyMonkey (known as Momentive from 2021 to 2023), as documented on the company’s developer portal and user-agent registry. Its primary purpose is to index publicly available web pages and collect structured data for SurveyMonkey’s audience insights, market research, and AI-powered survey generation tools. It is not an attack tool: it gathers publicly accessible information to improve survey question suggestions and demographic targeting.
🌐 Technical Behavior
surveybot issues HTTP GET requests with a configurable crawl delay, typically 2–5 seconds between requests, as reported on the official SurveyMonkey help page. It reportedly uses both IPv4 and IPv6 addresses, although only two IPv4 ranges are listed: 104.28.0.0/12 (Cloudflare origin) and 162.247.192.0/18 (SurveyMonkey-owned ASN 39631). The first prefix is not aligned to a /12 boundary; the enclosing network is 104.16.0.0/12. Requests carry a From header with the operator’s contact email (obscured in the source as [email protected]) and an Accept-Language header. Crawling is breadth-first, prioritizing pages linked from high-authority domains, and the bot avoids refetching the same URL within a 30-day window (documented in their robots.txt guidance).
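To check whether a logged client address falls inside the ranges quoted above, a minimal sketch using Python's standard ipaddress module could look like the following. The ranges are copied from this page and are not independently verified; the function name in_listed_ranges is a hypothetical helper introduced for illustration.

```python
import ipaddress

# Ranges as quoted on this page (unverified). strict=False is required
# because 104.28.0.0/12 has host bits set; it normalizes to 104.16.0.0/12.
LISTED_RANGES = [
    ipaddress.ip_network("104.28.0.0/12", strict=False),
    ipaddress.ip_network("162.247.192.0/18"),
]


def in_listed_ranges(ip: str) -> bool:
    """Return True if ip falls inside one of the listed IPv4 ranges."""
    addr = ipaddress.ip_address(ip)
    # An IPv6 address is never contained in an IPv4 network, so this
    # returns False for IPv6 input instead of raising.
    return any(addr in net for net in LISTED_RANGES)


if __name__ == "__main__":
    for candidate in ("162.247.200.1", "104.20.5.9", "8.8.8.8"):
        print(candidate, in_listed_ranges(candidate))
```

Because the first range sits inside address space shared by many Cloudflare customers, a match there is weak evidence on its own and is best combined with other indicators.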
📋 robots.txt Compliance
According to the official SurveyMonkey documentation (accessed via their API portal), surveybot fully honors Disallow directives in robots.txt. It also supports the non-standard Crawl-delay directive, which lets site owners increase the interval between requests beyond the default. Third-party audits (e.g., BotScout.com) report compliance with standard exclusion protocols, with no violations recorded since 2021.
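As an illustration of how a compliant crawler would read such rules, the sketch below parses a hypothetical robots.txt with Python's standard urllib.robotparser. The paths, the example.com URLs, and the 10-second delay are assumptions made for the example, not values published by the operator.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site (illustrative values only).
ROBOTS_TXT = """\
User-agent: surveybot
Disallow: /private/
Crawl-delay: 10

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant crawler identifying as "surveybot" would evaluate the rules so:
allowed_public = parser.can_fetch("surveybot", "https://example.com/blog/post")
allowed_private = parser.can_fetch("surveybot", "https://example.com/private/report")
delay = parser.crawl_delay("surveybot")

if __name__ == "__main__":
    print("public page allowed:", allowed_public)
    print("private page allowed:", allowed_private)
    print("crawl delay (seconds):", delay)
```

Site owners can use the same parser to test a draft robots.txt before publishing it.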
🔍 Detection Indicators
The primary User-Agent string is surveybot/1.0 (compatible; +https://www.surveymonkey.com/bot). A secondary User-Agent string, Mozilla/5.0 (compatible; surveybot/2.0; +https://www.surveymonkey.com/bot), has been observed in Apache logs. Behavioral fingerprint: requests arrive from IPs in the ranges listed under Technical Behavior, always include the X-Forwarded-For header when behind a CDN, and never send cookies.
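The two User-Agent strings quoted above can be matched with a single regular expression. The following is a minimal sketch; the pattern is derived only from those two strings, and surveybot_version is a hypothetical helper name.

```python
import re
from typing import Optional

# Matches "surveybot/<major>.<minor>" followed later by the bot info URL,
# as seen in both User-Agent strings quoted on this page.
SURVEYBOT_UA = re.compile(
    r"surveybot/(?P<version>\d+\.\d+)\b.*\+https://www\.surveymonkey\.com/bot",
    re.IGNORECASE,
)


def surveybot_version(user_agent: str) -> Optional[str]:
    """Return the advertised surveybot version, or None if the UA does not match."""
    match = SURVEYBOT_UA.search(user_agent)
    return match.group("version") if match else None


if __name__ == "__main__":
    samples = [
        "surveybot/1.0 (compatible; +https://www.surveymonkey.com/bot)",
        "Mozilla/5.0 (compatible; surveybot/2.0; +https://www.surveymonkey.com/bot)",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    ]
    for ua in samples:
        print(surveybot_version(ua), "<-", ua)
```

A User-Agent header is trivially spoofed, so a match here is a hint rather than proof; pairing it with the IP and header indicators gives a stronger signal.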
📊 Data Usage
Collected data—including page titles, meta descriptions, and structured content—is used to train SurveyMonkey’s AI survey generator (announced in 2023) and to populate their “Question Bank” with real-world examples. The information is never sold to third parties; it is exclusively applied to improve product features like automatic question phrasing and audience segmentation algorithms.
⚙️ Rate Limiting Policy
surveybot is rate-limited because its aggressive default crawl rate (up to 20 requests per minute) can overwhelm smaller websites that lack proper throttling. A threshold-based blocking policy, typically a temporary block once a client exceeds 100 requests in 30 seconds, is a standard defensive measure that prevents resource exhaustion while still allowing legitimate data collection for market research purposes.
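A threshold of this kind can be sketched as a sliding-window counter. The example below is an assumption about how such a policy might be implemented, not a description of any particular product: the class name SlidingWindowLimiter is hypothetical, and the sketch only rejects requests while a client is over the threshold (it does not model a fixed block duration).

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests: int = 100, window_seconds: float = 30.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> timestamps of allowed requests

    def allow(self, client: str, now: float) -> bool:
        hits = self._hits[client]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold: reject this request
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter()
    # One request every 3 seconds (20 per minute) stays far below the threshold.
    steady = all(limiter.allow("surveybot", t * 3.0) for t in range(200))
    print("steady 20 req/min crawl always allowed:", steady)
```

For scale, 100 requests in 30 seconds corresponds to 200 requests per minute, so a crawler holding to 20 requests per minute would not trip this threshold on its own.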
ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.