scollspider
Crawler User-Agent: scollspider
🤖 Overview
ScollSpider is a legitimate web crawler operated by Scoll Inc., a data aggregation company based in the United States, first publicly documented in 2019. Its primary purpose is to collect publicly accessible web content for market research, competitive intelligence, and lead generation services offered through the Scoll Data Platform. According to the official documentation on Scoll's corporate website, the crawler indexes pages from e‑commerce sites, job boards, and news sites to feed structured datasets used by enterprise clients.
🌐 Technical Behavior
ScollSpider performs periodic full‑site crawls at a configurable request rate, typically between 1 and 5 requests per second per domain according to Scoll's published crawl guidelines. It uses both HTTP/1.1 and HTTP/2, and its requests include an Accept-Language: en-US,en;q=0.9 header. The crawler's IP ranges are documented under AS396982 (Scoll Inc.) and include subnets such as 23.92.17.0/24 and 45.33.32.0/19. It respects standard robots exclusion rules, but it may fetch multiple pages from the same host concurrently using session‑based parallel requests.
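As a minimal sketch of how the subnets quoted above could be used, the following Python snippet checks whether a client address falls inside one of them. It assumes the two listed subnets are examples only and not a complete list; the names SCOLL_SUBNETS and in_documented_range are hypothetical and chosen for illustration.

```python
import ipaddress

# Subnets quoted in the text above. Assumption: these are examples,
# not necessarily the crawler's complete address space.
SCOLL_SUBNETS = [
    ipaddress.ip_network("23.92.17.0/24"),
    ipaddress.ip_network("45.33.32.0/19"),
]


def in_documented_range(ip: str) -> bool:
    """Return True if ip lies inside one of the subnets listed above."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Malformed input is treated as "not in range".
        return False
    return any(addr in net for net in SCOLL_SUBNETS)
```

A match only shows that the address is in a quoted range; it is best combined with the User‑Agent and reverse DNS checks rather than used alone.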
📋 robots.txt Compliance
Scoll Inc. explicitly states in its robots.txt best‑practices guide that ScollSpider honors Disallow directives. The crawler fetches /robots.txt at the start of each crawl and caches it for 24 hours. Operators have reported in community forums that newly updated directives are occasionally obeyed with a delay, but overall compliance is considered good for a data‑aggregation bot.
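To preview how Disallow rules would apply to this crawler before publishing them, a site owner could test a draft robots.txt with Python's standard urllib.robotparser module. The sketch below is illustrative: the rules in ROBOTS_TXT and the paths /private/ and /checkout/ are made-up examples, and the parser shows how a compliant client interprets the file, not how ScollSpider itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content used only for this example.
ROBOTS_TXT = """\
User-agent: ScollSpider
Disallow: /private/
Disallow: /checkout/

User-agent: *
Disallow:
"""


def allowed_for_scollspider(path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant ScollSpider client may fetch path."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("ScollSpider", path)
```

Given the 24‑hour cache described above, a change to the live file may take up to a day to be picked up by the crawler.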
🔍 Detection Indicators
The primary User‑Agent string is ScollSpider/1.0 (compatible; https://scoll.com/crawler). Secondary fingerprints include the custom X‑Scoll‑Crawl: 1 header and a consistent Connection: keep‑alive header. Webmasters can verify requests by checking that a reverse DNS lookup of the source IP resolves to a hostname under *.crawler.scoll.com.
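The checks above can be sketched in Python roughly as follows. The User‑Agent test and the hostname suffix come from the text; the forward lookup that confirms the hostname maps back to the same IP is an added assumption (a common hardening step, not something the source attributes to Scoll). All function names are hypothetical, and the resolvers are injectable so the logic can be exercised without network access.

```python
import socket
from typing import Callable, List

EXPECTED_SUFFIX = ".crawler.scoll.com"


def claims_scollspider(user_agent: str) -> bool:
    """True if the User-Agent header identifies itself as ScollSpider."""
    return "scollspider" in user_agent.lower()


def hostname_matches(hostname: str) -> bool:
    """True if hostname is under *.crawler.scoll.com."""
    return hostname.lower().rstrip(".").endswith(EXPECTED_SUFFIX)


def _reverse(ip: str) -> str:
    # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
    return socket.gethostbyaddr(ip)[0]


def _forward(hostname: str) -> List[str]:
    # gethostbyname_ex returns (hostname, aliaslist, ipaddrlist).
    return socket.gethostbyname_ex(hostname)[2]


def verify_ip(
    ip: str,
    reverse: Callable[[str], str] = _reverse,
    forward: Callable[[str], List[str]] = _forward,
) -> bool:
    """Reverse lookup, suffix check, then forward confirmation (assumed step)."""
    try:
        hostname = reverse(ip)
        if not hostname_matches(hostname):
            return False
        return ip in forward(hostname)
    except OSError:
        # socket.herror and socket.gaierror are subclasses of OSError.
        return False
```

A request would then be treated as genuine only when both claims_scollspider(ua) and verify_ip(ip) hold, since the User‑Agent header alone is trivially spoofed.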
📊 Data Usage
Collected data is used to build aggregated datasets for market trend analysis, price comparison, and lead generation. Scoll does not use the data to train AI models; instead, it is processed into anonymized, structured feeds sold to business customers for analytics dashboards and competitive monitoring, as described in their privacy policy.
⚙️ Rate Limiting Policy
Because ScollSpider can sustain moderate crawl rates (up to 5 req/s) and may revisit pages frequently, webmasters are advised to rate‑limit by IP or User‑Agent if server resources are constrained. Threshold‑based blocking (e.g., 10 req/s) is justified to protect site performance while still permitting legitimate data access.
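One way to apply such a threshold is a sliding‑window counter keyed by IP or User‑Agent. The Python sketch below is a minimal in‑memory illustration, assuming the 10 req/s example figure from the text; the class name SlidingWindowLimiter and its parameters are hypothetical, and a production setup would more likely rely on the web server's or CDN's built‑in rate limiting.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most max_requests per key within a rolling time window."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 1.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        """Record a request for key and report whether it is within the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # Over the threshold: caller may respond with HTTP 429.
        hits.append(now)
        return True
```

Because the documented crawl rate tops out at 5 req/s, a 10 req/s threshold should leave normal ScollSpider traffic unaffected while capping bursts.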
ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.