Web Fetch Bot — Detection, Blocking & Technical Analysis

Web Fetch

Bot User-Agent: web-fetch

🤖 Overview

Web Fetch is a web crawler operated by Fetch.ai (fetch.ai), a decentralized machine learning platform that enables autonomous economic agents. First publicly documented in 2020, the bot collects publicly accessible web content to train AI agents and improve their real‑world decision‑making capabilities within the Fetch ecosystem. It is part of the Fetch.ai network, which uses blockchain technology to coordinate agent interactions.

🌐 Technical Behavior

The Web Fetch crawler operates via a distributed network of agents, each making HTTP/1.1 GET requests at a median rate of one request every 10–15 seconds per IP. IP addresses are drawn from cloud providers including AWS (ec2‑range) and Google Cloud (compute‑range), and are dynamically reassigned every 24 hours. The bot prioritizes new or updated content by parsing sitemaps and internal link structures, and it uses Last‑Modified and ETag validators to avoid re‑downloading unchanged resources. It supports TLS 1.2 and 1.3, does not execute JavaScript, and does not maintain session state.
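To illustrate the Last-Modified and ETag behavior described above, here is a minimal server-side sketch of how a conditional GET can be answered with 304 Not Modified. The function name `is_not_modified` and its parameters are hypothetical and chosen for illustration. The precedence of If-None-Match over If-Modified-Since follows general HTTP semantics, not anything specific to Web Fetch, and the comma split is a simplification.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _opaque(tag):
    """Strip whitespace and a weak-validator prefix so tags compare weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(request_headers, etag, last_modified):
    """Return True when a 304 Not Modified response would be appropriate.

    request_headers: mapping of header name to value (any letter case).
    etag: the resource's current entity tag, quotes included.
    last_modified: timezone-aware datetime of the last change.
    """
    headers = {k.lower(): v for k, v in request_headers.items()}

    if_none_match = headers.get("if-none-match")
    if if_none_match is not None:
        # When present, If-None-Match is evaluated instead of If-Modified-Since.
        candidates = [_opaque(t) for t in if_none_match.split(",")]
        return "*" in candidates or _opaque(etag) in candidates

    if_modified_since = headers.get("if-modified-since")
    if if_modified_since is None:
        return False
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return False  # An unparseable date is ignored.
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    return last_modified.replace(microsecond=0) <= since
```

A crawler that sends these validators only receives a full response body when the resource has changed, which keeps bandwidth use low on both sides.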

📋 robots.txt Compliance

According to Fetch.ai’s official crawler policy (fetch.ai/robots), Web Fetch fully honors Disallow directives and respects Crawl‑delay if specified. The bot consults robots.txt on each visit, reusing a cached copy for up to 24 hours. Reports from multiple webmasters indicate that it does not attempt to bypass disallowed paths.
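As a rough illustration of the directives mentioned above, the following sketch uses Python's standard `urllib.robotparser` to check how a rule set would apply to the crawler. The robots.txt content, the `/private/` path, and the 10-second delay are made-up examples. The `WebFetch` and `web-fetch` tokens are assumptions taken from the identifiers shown on this page, not from Fetch.ai's own documentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the path and delay are illustrative only.
ROBOTS_TXT = """\
User-agent: WebFetch
User-agent: web-fetch
Disallow: /private/
Crawl-delay: 10

User-agent: *
Disallow:
"""

# User-Agent string as listed under Detection Indicators.
USER_AGENT = "WebFetch/1.0 (+https://fetch.ai/crawler)"


def build_parser(text):
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

blocked = parser.can_fetch(USER_AGENT, "/private/report.html")
allowed = parser.can_fetch(USER_AGENT, "/blog/post.html")
delay = parser.crawl_delay(USER_AGENT)

print("private path allowed:", blocked)
print("blog path allowed:", allowed)
print("crawl delay (s):", delay)
```

Testing a rule set this way before deploying it helps confirm that the group actually matches the crawler's token. Because the bot is said to cache robots.txt for up to 24 hours, a change may take up to a day to be observed.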

🔍 Detection Indicators

The primary User-Agent string is WebFetch/1.0 (+https://fetch.ai/crawler), and requests may include a From header carrying a contact e-mail address. Behavioral fingerprints include a consistent request rate, no user‑interaction patterns, and no JavaScript execution. The bot also sets a custom HTTP header, X-Fetch-Agent: true, for identification.
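The header-based indicators above can be checked with a few lines of code. This is a minimal sketch, assuming request headers are available as a plain mapping. The function name `web_fetch_signals` and the version-tolerant pattern are illustrative choices, not part of any published Fetch.ai specification.

```python
import re

# Matches the product token "WebFetch/<version>" at the start of the
# User-Agent value. Accepting any version number is an assumption.
UA_PATTERN = re.compile(r"WebFetch/\d+(\.\d+)*\b", re.IGNORECASE)


def web_fetch_signals(headers):
    """Return the list of header-based indicators present in a request.

    headers: mapping of header name to value. Names are compared
    case-insensitively, as HTTP requires.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    signals = []

    if UA_PATTERN.match(normalized.get("user-agent", "").strip()):
        signals.append("user-agent")

    if normalized.get("x-fetch-agent", "").strip().lower() == "true":
        signals.append("x-fetch-agent")

    return signals


if __name__ == "__main__":
    sample = {
        "User-Agent": "WebFetch/1.0 (+https://fetch.ai/crawler)",
        "X-Fetch-Agent": "true",
    }
    print(web_fetch_signals(sample))
```

Both headers are set by the client and are easy to spoof, so a match is best treated as a hint to combine with the behavioral fingerprints, not as proof of identity.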

📊 Data Usage

Collected web data is used to train autonomous economic agents on the Fetch.ai network, enabling them to understand web‑based information and automate tasks such as price monitoring, data extraction, and service discovery. The data is anonymized and used solely for internal AI training; it is not sold or shared with third parties.

⚙️ Rate Limiting Policy

Web Fetch is rate‑limited because its distributed agent architecture can generate high request volumes from multiple IPs simultaneously. Threshold‑based blocking (e.g., >50 requests per minute per IP) is applied to preserve server stability while still allowing legitimate content retrieval.
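The threshold in the example above (more than 50 requests per minute per IP) can be expressed as a sliding-window counter. This is a minimal in-memory sketch and not a description of how Boteraser implements its blocking. The class name `SlidingWindowLimiter` and its parameters are hypothetical, and a production setup would typically need shared state across servers.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per IP within any window_seconds span."""

    def __init__(self, max_requests=50, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Per-IP queue of timestamps for requests that were allowed.
        self._hits = defaultdict(deque)

    def allow(self, ip, now):
        """Return True if the request at time `now` (seconds) may proceed."""
        hits = self._hits[ip]
        cutoff = now - self.window_seconds
        # Discard timestamps that have fallen out of the window.
        while hits and hits[0] <= cutoff:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # This request would exceed the threshold.
        hits.append(now)
        return True


if __name__ == "__main__":
    import time

    limiter = SlidingWindowLimiter()
    # 203.0.113.7 is a documentation-only address used as a placeholder.
    print(limiter.allow("203.0.113.7", time.monotonic()))
```

Passing the timestamp in explicitly keeps the limiter easy to test. Because the crawler spreads requests across many IPs, a per-IP limit on its own may not cap the total load, so it can help to also track requests per User-Agent.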

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.
