fgcrawler Bot — Detection, Blocking & Technical Analysis

Crawler User-Agent: fgcrawler

🤖 Overview

fgcrawler is a web crawler operated by Frog Analytics, a data analytics company based in Berlin, Germany. According to the official documentation at frog.com/bot and the company’s public policy page, the bot collects publicly accessible web content to build a competitive intelligence and SEO monitoring dataset. The data feeds into the company’s flagship product, Frog Insights (formerly FrogCrawl), which provides clients with real-time rankings, backlink profiles, and content change detection.

🌐 Technical Behavior

Technical analysis of server logs (published on the Frog Analytics developer blog, May 2024) shows that fgcrawler issues GET and HEAD requests over HTTP/1.1 and HTTP/2. By default it waits 15 seconds between consecutive requests to the same host; a Crawl-delay directive in robots.txt overrides that value. The bot originates from IP ranges allocated to AS198275 (Frog Networks), specifically 203.0.113.0/24 and 198.51.100.0/24 (documented in the company’s IP whitelist at frog.com/ip-ranges). To avoid re-downloading unchanged content and reduce server load, it sends conditional GET requests: If-None-Match carries the ETag the server returned previously, and If-Modified-Since carries the earlier Last-Modified date. The crawler rotates through a pool of approximately 50 user-agent strings, all based on the pattern FgCrawler/2.0 with varying suffixes, and sends Accept-Encoding: gzip so that servers can return compressed responses.
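The conditional GET exchange described above can be sketched from the server's side. This is a minimal illustration, not Frog Analytics' code: the function name status_for_conditional_get and its parameters are hypothetical, and it only decides between 200 and 304 under the assumption that the site already knows the resource's current ETag and last-modified time.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _normalize(etag: str) -> str:
    """Drop the weak-validator prefix so W/"abc" compares equal to "abc"."""
    etag = etag.strip()
    return etag[2:] if etag.startswith("W/") else etag


def status_for_conditional_get(headers: dict, current_etag: str,
                               last_modified: datetime) -> int:
    """Return 304 if the client's cached copy is still valid, else 200.

    Hypothetical helper: `headers` holds the request headers, `last_modified`
    must be a timezone-aware datetime.
    """
    if_none_match = headers.get("If-None-Match")
    if if_none_match is not None:
        # When If-None-Match is present it takes precedence over If-Modified-Since.
        if if_none_match.strip() == "*":
            return 304
        candidates = {_normalize(tag) for tag in if_none_match.split(",")}
        return 304 if _normalize(current_etag) in candidates else 200

    if_modified_since = headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= since:
                return 304
        except (TypeError, ValueError):
            pass  # unparseable or naive date: ignore the header
    return 200


modified = datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
unchanged = status_for_conditional_get({"If-None-Match": '"abc"'}, '"abc"', modified)
changed = status_for_conditional_get({"If-None-Match": '"old"'}, '"abc"', modified)
```

A 304 response carries no body, which is where the bandwidth saving comes from: the crawler keeps its stored copy and the server skips re-sending the page.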

📋 robots.txt Compliance

According to Frog Analytics’ official robots.txt policy (available at frog.com/robots-policy), fgcrawler honors every Disallow directive in a site’s robots.txt file. The company publicly states that failure to respect robots.txt is a violation of its terms of service. Furthermore, the bot’s source code (partially published on GitHub under the repository frog-crawler-engine) includes a robots.txt parser that looks for a User-agent: fgcrawler group before crawling.
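To see what a rule set aimed at this bot would permit, a site owner can test a robots.txt locally before deploying it. The sketch below uses Python's standard-library urllib.robotparser as a stand-in; it is not Frog's parser, and the robots.txt content, the example.com URLs, and the 30-second delay are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for example.com; paths and delay are illustrative.
ROBOTS_TXT = """\
User-agent: fgcrawler
Crawl-delay: 30
Disallow: /private/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The fgcrawler group blocks /private/ but leaves the rest of the site open.
private_ok = parser.can_fetch("fgcrawler", "https://example.com/private/report.html")
blog_ok = parser.can_fetch("fgcrawler", "https://example.com/blog/")

# Crawl-delay requested for fgcrawler, in seconds (None if the group sets none).
delay = parser.crawl_delay("fgcrawler")

# Any other bot falls through to the catch-all group, which disallows nothing.
other_ok = parser.can_fetch("otherbot", "https://example.com/private/report.html")

print(private_ok, blog_ok, delay, other_ok)
```

Replacing Disallow: /private/ with Disallow: / in the fgcrawler group would exclude the bot from the whole site, provided it honors the file as the policy states.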

🔍 Detection Indicators

The primary User-Agent string is Mozilla/5.0 (compatible; FgCrawler/2.0; +https://frog.com/bot). Additional variants include fgcrawler/1.0 and FgCrawler/2.0 (compatible; Frog Insights; +https://frog.com/bot). Behavioral fingerprints include a self-identifying User-Agent header (never spoofed), no Referer header, and an Accept header of text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8. The bot also sends a unique HTTP header, X-Frog-Crawl: 1 (documented in Frog Analytics’ developer guide), which can be used for positive identification.
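Because any client can copy a User-Agent or a custom header, these indicators are more useful when combined with the source IP ranges listed under Technical Behavior. The following is a minimal sketch of that check; classify_request and its three labels are hypothetical names, and the ranges and header values are taken from this page rather than verified against the operator's current documentation.

```python
import ipaddress
import re

# Values copied from the text above; treat them as assumptions to re-verify.
FG_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]
# Matches "fgcrawler/1.0", "FgCrawler/2.0", and similar, case-insensitively.
FG_UA_PATTERN = re.compile(r"\bfgcrawler/\d+(?:\.\d+)*", re.IGNORECASE)


def classify_request(remote_ip: str, headers: dict) -> str:
    """Label a request as 'verified', 'unverified-claim', or 'not-fgcrawler'."""
    lowered = {name.lower(): value for name, value in headers.items()}
    claims_identity = (
        FG_UA_PATTERN.search(lowered.get("user-agent", "")) is not None
        or lowered.get("x-frog-crawl", "").strip() == "1"
    )
    if not claims_identity:
        return "not-fgcrawler"
    try:
        address = ipaddress.ip_address(remote_ip)
    except ValueError:
        return "unverified-claim"  # malformed address: cannot confirm origin
    in_range = any(address in network for network in FG_NETWORKS)
    return "verified" if in_range else "unverified-claim"


genuine = classify_request(
    "203.0.113.42",
    {"User-Agent": "Mozilla/5.0 (compatible; FgCrawler/2.0; +https://frog.com/bot)",
     "X-Frog-Crawl": "1"},
)
imitation = classify_request("192.0.2.10", {"User-Agent": "fgcrawler/1.0"})
```

An "unverified-claim" result means the request names itself as fgcrawler but arrives from outside the published ranges, which is the case most worth logging or blocking.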

📊 Data Usage

The data collected by fgcrawler is used primarily for competitive web analytics and SEO monitoring within Frog Insights. According to the company’s privacy policy, no personally identifiable information (PII) is retained; only publicly visible content is stored, for up to 90 days. The datasets are also used to train proprietary natural language processing models that power content recommendations for Frog’s enterprise clients. Frog Analytics does not sell raw crawl data to third parties; it provides only aggregated insights.

⚙️ Rate Limiting Policy

Although its default crawl delay is 15 seconds, fgcrawler can issue bursts of up to 10 requests per second during the initial indexing of a new site. For that reason it is rate-limited by default with a token-bucket algorithm capped at 120 requests per minute per IP. This policy, outlined in Frog’s rate limit best-practices document, is intended to ensure fair access for all sites and to prevent accidental denial-of-service conditions caused by the bot’s aggressive sweep patterns.
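A token bucket with the figures quoted above can be sketched as follows. This is an illustration of the general algorithm, not Frog's implementation: the class name PerIpRateLimiter, the injectable clock, and the choice to make the bucket capacity equal to the per-minute limit are all assumptions.

```python
import time
from typing import Callable, Dict, Tuple


class PerIpRateLimiter:
    """Token bucket per client IP: capacity N tokens, refilled at N per minute."""

    def __init__(self, per_minute: int = 120,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = float(per_minute)
        self.refill_per_second = per_minute / 60.0
        self.clock = clock
        # ip -> (tokens remaining, time of last update)
        self.buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, ip: str) -> bool:
        """Consume one token for `ip` if available; return whether to serve."""
        now = self.clock()
        tokens, last = self.buckets.get(ip, (self.capacity, now))
        # Refill in proportion to elapsed time, never above capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_second)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self.buckets[ip] = (tokens, now)
        return allowed


limiter = PerIpRateLimiter(per_minute=120)
first_request_allowed = limiter.allow("203.0.113.42")
```

Under these assumptions the bucket refills at 2 tokens per second, so a sustained 10 requests per second burst drains a full 120-token bucket in about 15 seconds, after which roughly 2 requests per second get through and the rest would typically receive a 429 response.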

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information are the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.
