Minefield

Bot User-Agent: minefield

🤖 Overview

Minefield is a web crawler operated by the cybersecurity firm Minefield Cyber Ltd. (registered in the United Kingdom) and first publicly documented in a 2022 blog post on the company's official website. Its primary purpose is to collect publicly accessible web content for training proprietary AI models used in automated threat intelligence and vulnerability analysis; the data feeds the company's Minefield AI Platform. No CVE entries or security advisories are associated with this crawler, which is consistent with, but does not by itself confirm, legitimate operation.

🌐 Technical Behavior

Minefield follows a distributed crawl pattern using a pool of 512 IP addresses allocated from the 146.70.0.0/16 and 45.10.0.0/16 netblocks, registered under the Minefield entity. The crawler issues an average of 2 requests per second per IP, with burst peaks reaching 10 requests per second, and uses HTTP/1.1 with keep-alive and gzip compression. It crawls both HTTP and HTTPS endpoints and respects standard robots.txt directives, but it does not cache content. According to Minefield's engineering blog, the crawler indexes up to 20,000 pages per domain per 24-hour period before rotating to new targets, favoring breadth over depth.
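The netblocks above can be used for a simple source-address check. The following is a minimal sketch in Python, assuming the two ranges quoted in this entry are accurate; the names `REPORTED_NETBLOCKS` and `in_reported_netblocks` are illustrative and do not come from Minefield.

```python
import ipaddress

# Netblocks as quoted in this entry; treat them as reported, not verified.
REPORTED_NETBLOCKS = [
    ipaddress.ip_network("146.70.0.0/16"),
    ipaddress.ip_network("45.10.0.0/16"),
]


def in_reported_netblocks(ip: str) -> bool:
    """Return True if `ip` falls inside one of the reported netblocks."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not a parseable IP address, so it cannot match.
        return False
    return any(addr in net for net in REPORTED_NETBLOCKS)


if __name__ == "__main__":
    for candidate in ("146.70.12.34", "45.10.200.1", "8.8.8.8"):
        print(candidate, in_reported_netblocks(candidate))
```

A match only shows that a request came from one of the listed ranges. It is best combined with the User-Agent and header indicators rather than used alone.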

📋 robots.txt Compliance

According to the Minefield Cyber official robots.txt policy page (accessed December 2023), the Minefield crawler fully honors Disallow directives in robots.txt, including wildcard patterns and per-folder blocks. However, it does not support Crawl-delay directives and relies instead on its own internal rate limiting algorithm. The company states that domains on which a robots.txt violation is detected are automatically removed from future crawl cycles.
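To check how a Disallow rule aimed at this crawler would be interpreted, a robots.txt file can be tested locally. The sketch below uses Python's standard `urllib.robotparser`; the sample rules, the `/private/` path, and the `is_allowed` helper are hypothetical. The standard-library parser does plain prefix matching and does not evaluate wildcard patterns, so the sample uses a simple per-folder block.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the "minefield" token from /private/,
# leave all other crawlers unrestricted.
ROBOTS_TXT = """\
User-agent: minefield
Disallow: /private/

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "minefield", "https://example.com/private/report.html"))
    print(is_allowed(ROBOTS_TXT, "minefield", "https://example.com/blog/"))
```

This only verifies that the rules are written as intended. Whether the crawler actually honors them has to be confirmed from server logs.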

🔍 Detection Indicators

The primary User‑Agent string is Minefield/1.0 (compatible; +https://minefield.example/bot) with an additional identifying header X-Minefield-Session: [UUID] sent on all requests. Behavioral fingerprints include a consistent request sequence: first an OPTIONS request to test server response, followed by GET requests on sitemaps, and then incremental crawling of resources. The crawler also sends a Referer header set to the root of the target domain, a unique pattern not observed in other legitimate bots.
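The header-based indicators above can be checked per request. This is a minimal sketch, assuming the User-Agent begins with `Minefield/` followed by a version number and that `X-Minefield-Session` carries a standard UUID, as described in this entry; the function names are illustrative.

```python
import re
import uuid

# Matches a User-Agent that starts with "Minefield/<version>", e.g. "Minefield/1.0".
UA_PATTERN = re.compile(r"^Minefield/\d+(\.\d+)*", re.IGNORECASE)


def _is_uuid(value: str) -> bool:
    """Return True if `value` parses as a UUID."""
    try:
        uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    return True


def minefield_indicators(headers: dict) -> dict:
    """Report which of the documented header indicators a request matches."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return {
        "user_agent": bool(UA_PATTERN.match(normalized.get("user-agent", ""))),
        "session_header": _is_uuid(normalized.get("x-minefield-session", "")),
    }


if __name__ == "__main__":
    sample = {
        "User-Agent": "Minefield/1.0 (compatible; +https://minefield.example/bot)",
        "X-Minefield-Session": "123e4567-e89b-12d3-a456-426614174000",
    }
    print(minefield_indicators(sample))
```

Both headers are trivially spoofable, so a match is a hint rather than proof of origin.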

📊 Data Usage

Collected data is aggregated into the Minefield AI Platform to train models for automated vulnerability detection, exploit pattern recognition, and threat actor tracking. The company claims that no personally identifiable information (PII) is stored beyond public metadata and that all raw data is deleted after 30 days of processing. The results are used to improve commercial cybersecurity products sold to Fortune 500 companies.

⚙️ Rate Limiting Policy

Because the crawler can generate high request volumes during burst periods and can index up to 20,000 pages per domain per day, administrators are advised to apply threshold‑based rate limiting (for example, throttling any single IP that exceeds 100 requests per minute) to prevent resource contention. Minefield Cyber acknowledges this practice and recommends a 72‑hour block if the crawler does not respect custom rate limits set via robots.txt.
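The per-IP threshold described above can be sketched as a sliding-window counter. This is one possible approach, not a method prescribed by Minefield Cyber or Boteraser; the class name and defaults are illustrative, with the 100-requests-per-minute figure taken from the example in this section.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each client IP."""

    def __init__(self, max_requests: int = 100, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str, now: float = None) -> bool:
        """Record a request from `ip` and return True if it is within the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=100, window_seconds=60.0)
    accepted = sum(limiter.allow("146.70.12.34", now=0.0) for _ in range(150))
    print(accepted)
```

In production this logic usually lives in the reverse proxy or WAF. The in-memory version here keeps state per process and is meant only to illustrate the threshold.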


ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information are the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.