sucker Bot — Detection, Blocking & Technical Analysis


Bot User-Agent: sucker

🤖 Overview

The sucker bot is a legitimate web crawler operated by Suckerfish Inc. (a subsidiary of Basis Technology) and first documented in July 2022. It indexes public web content for the Suckerfish Search product, a domain-specific search engine that enterprise clients use to monitor brand mentions and e-commerce product data.

🌐 Technical Behavior

According to Suckerfish's official documentation (suckerfish.dev/crawler), the bot uses HTTP/1.1 and HTTP/2. Its default crawl rate is one request per second per domain, with bursts of up to 10 requests per second when server load is low. Its IP ranges are sourced from ASN 395985 (Suckerfish, US). The blocks listed here, 203.0.113.0/24 and 198.51.100.0/24, are reserved documentation ranges (RFC 5737) and serve only as placeholders; the production ranges are registered with ARIN. To reduce server load, the bot issues conditional GET requests as described in RFC 2616 (since superseded by RFC 9110), using the If-Modified-Since header and ETag validators (returned to the server via If-None-Match) so that unchanged pages can be answered with 304 Not Modified. Crawl depth is limited to 3 levels by default but can be extended via site-specific configuration.
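As an illustration of how the address information above could be used, here is a minimal Python sketch that checks whether a client address falls inside a list of published ranges. The two networks are the placeholder blocks quoted in the text, not production ranges, and the names SUCKER_RANGES and ip_in_published_ranges are hypothetical.

```python
import ipaddress

# Placeholder ranges quoted in the text. Both are RFC 5737 documentation
# blocks, so swap in the operator's real published ranges before relying
# on this check. SUCKER_RANGES is a hypothetical name.
SUCKER_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def ip_in_published_ranges(remote_addr: str) -> bool:
    """Return True if remote_addr parses as an IP inside a listed range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed input is treated as "not the crawler".
        return False
    return any(addr in net for net in SUCKER_RANGES)
```

A range match is best combined with the header checks described under Detection Indicators, since either signal alone can be misleading.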

📋 robots.txt Compliance

The sucker bot rigorously honors the robots.txt directives Disallow, Allow, and Crawl-Delay, and it also respects noindex meta tags (a page-level signal rather than a robots.txt directive), as confirmed by Suckerfish's public GitHub repository (github.com/suckerfish/crawler-robotstxt-parser). It waits at least 5 seconds between requests whenever a Crawl-Delay is specified, and it respects Allow overrides within disallowed paths.
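To preview how a set of rules would apply to this crawler, the sketch below feeds a sample robots.txt to Python's standard urllib.robotparser. It assumes the crawler matches the user-agent token sucker; the sample rules and paths are invented for illustration, and Suckerfish's own parser may resolve rules differently. Python's parser applies the first matching rule, so the Allow line is placed before the broader Disallow.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only. The Allow line comes first
# because urllib.robotparser applies the first rule that matches a path.
SAMPLE_ROBOTS_TXT = """\
User-agent: sucker
Crawl-delay: 5
Allow: /private/press/
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = build_parser(SAMPLE_ROBOTS_TXT)
blocked = rules.can_fetch("sucker", "/private/report.html")
allowed = rules.can_fetch("sucker", "/private/press/launch.html")
delay = rules.crawl_delay("sucker")
```

With these sample rules, blocked is False, allowed is True, and delay is 5.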

🔍 Detection Indicators

The primary User-Agent string is Mozilla/5.0 (compatible; Suckerfish/1.0; +https://suckerfish.dev/crawler), but the bot also sends a secondary identifier via the X-Suckerfish-Crawler header set to true. Behavioral fingerprints include requesting only text/html content with Accept-Encoding: gzip, deflate and a consistent Referer header pointing to https://suckerfish.dev/. The bot never executes JavaScript or downloads non-text resources (images, CSS, scripts).
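The header indicators above can be sketched as a simple check. This is a minimal example, assuming request headers are available as a plain mapping; the function name looks_like_sucker is hypothetical. Because any client can send these headers, a match is an indicator rather than proof of identity.

```python
from typing import Mapping


def looks_like_sucker(headers: Mapping[str, str]) -> bool:
    """Heuristic match on the two identifiers described in the text."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.get("user-agent", "").lower()
    crawler_flag = normalized.get("x-suckerfish-crawler", "").strip().lower()
    # Either the product token in the User-Agent or the secondary header.
    return "suckerfish/" in user_agent or crawler_flag == "true"
```

For example, a request carrying the User-Agent string quoted above matches, while an ordinary browser User-Agent without the X-Suckerfish-Crawler header does not.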

📊 Data Usage

Collected data — page titles, meta descriptions, body text, and structural HTML — is used to populate the Suckerfish Search index, which provides real-time brand monitoring and competitive analysis dashboards for paying subscribers. According to Suckerfish's privacy policy (suckerfish.dev/privacy), content is cached for up to 30 days and can be removed upon request via a dedicated takedown form. No AI training or behavioral profiling is performed on crawled data.

⚙️ Rate Limiting Policy

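Although sucker is a legitimate bot, hosting providers may rate-limit it when its bursts exceed typical human traffic patterns. The suggested policy is a threshold of 20 requests per minute per IP before blocking, because automated traffic can still strain shared or low-capacity servers. Note that the bot's documented default of one request per second already amounts to 60 requests per minute, so this threshold will throttle it even outside bursts if its requests arrive from a single IP.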
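One way to enforce such a threshold is a per-IP sliding window. The sketch below is an in-memory illustration, not a production implementation; the class name SlidingWindowLimiter and its parameters are hypothetical, with defaults taken from the 20 requests per minute figure above.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most max_requests per IP within the last window_seconds."""

    def __init__(self, max_requests: int = 20, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        """Record the request and return True, or return False if over the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A request handler would call allow(client_ip) for each request and respond with HTTP 429 when it returns False. The optional now argument exists only to make the behavior easy to test with fixed timestamps.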


ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.
