virus-detector

Bot User-Agent: virus-detector

🤖 Overview

The VirusDetector bot is a legitimate web crawler operated by the cybersecurity firm VirusDetection Inc. (a subsidiary of Trend Micro), first documented in 2017. Its primary purpose is to autonomously scan publicly accessible web pages for malicious content—including phishing pages, malware downloads, and exploit kits—and feed the collected data into the company’s threat intelligence platform, which powers endpoint protection products and real-time URL blacklisting.

🌐 Technical Behavior

The bot employs a distributed crawling architecture, issuing HTTP GET requests at an average rate of one request every 3–5 seconds per IP, with bursts of up to 10 requests per second during initial discovery phases. It uses IPv4 ranges attributed to ASN 174 (Cogent) and ASN 14618 (Amazon Web Services), specifically 198.51.100.0/24 and 54.240.0.0/12 (based on public NetBlocks data). Note that 198.51.100.0/24 is reserved for documentation (TEST-NET-2, RFC 5737) and is not routable on the public internet, so that entry should be treated as a placeholder rather than a usable allowlist range. The crawler follows redirects and evaluates both the rendered page content and embedded JavaScript, though it does not execute dynamic scripts that require user interaction. It respects the Crawl-delay directive in robots.txt and sends a unique header, X-VirusDetector-Scan: 1, for verification.
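As a rough illustration of how the listed source ranges could be checked, the sketch below tests whether a client address falls inside them. The function name ip_in_listed_ranges and the constant LISTED_RANGES are hypothetical, and the ranges are copied from the text above without independent verification, so this is a starting point rather than a trustworthy allowlist.

```python
import ipaddress

# Ranges copied from the description above; they are unverified and
# should be confirmed against the operator's own published list.
LISTED_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("54.240.0.0/12"),
]


def ip_in_listed_ranges(ip: str) -> bool:
    """Return True if ip parses as an address inside one of the listed ranges."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Malformed input is never treated as a match.
        return False
    return any(addr in net for net in LISTED_RANGES)
```

A source-IP match alone does not prove a request comes from the bot, since the second range covers a large block of cloud address space shared by many tenants. It is best combined with the header and User-Agent checks.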

📋 robots.txt Compliance

According to the official VirusDetection Inc. documentation (published at https://virusdetector.com/crawler-policy), the bot fully honors Disallow directives as of version 2.0. It also checks for a VirusDetector-Allow meta tag that can override global disallow rules for specific pages. Reading a meta tag requires fetching the page first, so this override would imply that some disallowed pages are still requested. Controversially, the bot does not respect Allow directives that conflict with a prior Disallow, a behavior confirmed in a 2019 blog post by the vendor.
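To see how a site owner might preview the effect of robots.txt rules aimed at this crawler, the sketch below parses a sample file with Python's standard urllib.robotparser. The robots.txt content, the /private/ path, and the 5-second delay are hypothetical examples. The parser shows how a standards-following client would read the rules, not how the bot itself is confirmed to behave.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: slow the crawler down and keep it out of /private/.
ROBOTS_TXT = """\
User-agent: virus-detector
Crawl-delay: 5
Disallow: /private/

User-agent: *
Disallow:
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
blocked = parser.can_fetch("virus-detector", "https://example.com/private/report.html")
allowed = parser.can_fetch("virus-detector", "https://example.com/index.html")
delay = parser.crawl_delay("virus-detector")
```

Here blocked is False, allowed is True, and delay is 5. Other crawlers fall through to the wildcard group, whose empty Disallow permits everything.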

🔍 Detection Indicators

The primary User-Agent string is Mozilla/5.0 (compatible; VirusDetector/1.0; +https://virusdetector.com/bot). Secondary strings include VirusDetector-Scanner/2.0 for deep scans. In addition to the X-VirusDetector-Scan header, requests often carry Accept-Language: en-US,en;q=0.5 and Connection: keep-alive. Behavioral fingerprinting shows it never sends cookies. It is also reported never to execute JavaScript, which is at odds with the script evaluation described under Technical Behavior, so JavaScript behavior should be treated as an unreliable signal.
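The indicators above can be combined into a simple header check. The sketch below is one possible approach: the function names and the regular expression are hypothetical, and the rules simply mirror the User-Agent strings, the X-VirusDetector-Scan header, and the no-cookie behavior as described here. Because every header can be forged, a match only means a request claims to be the bot.

```python
import re

# Matches "VirusDetector/1.0" and "VirusDetector-Scanner/2.0" style tokens.
UA_PATTERN = re.compile(r"\bVirusDetector(?:-Scanner)?/\d+(?:\.\d+)*", re.IGNORECASE)


def _normalize(headers: dict) -> dict:
    """Lower-case header names so lookups are case-insensitive."""
    return {name.lower(): value for name, value in headers.items()}


def claims_virusdetector(headers: dict) -> bool:
    """True if the User-Agent contains one of the documented tokens."""
    ua = _normalize(headers).get("user-agent", "")
    return bool(UA_PATTERN.search(ua))


def matches_documented_fingerprint(headers: dict) -> bool:
    """True if the request also carries the scan header and sends no cookies."""
    h = _normalize(headers)
    return (
        claims_virusdetector(headers)
        and h.get("x-virusdetector-scan", "").strip() == "1"
        and "cookie" not in h
    )
```

A request that passes claims_virusdetector but fails matches_documented_fingerprint is a reasonable candidate for closer inspection, since it uses the bot's name without the rest of its described profile.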

📊 Data Usage

Collected data—including page content, URL metadata, and extracted hashes of downloaded files—is used to update the VirusDetection threat intelligence feed, which is consumed by over 50 million endpoint agents worldwide. The data also trains machine learning classifiers for zero-day malware detection, as detailed in Trend Micro’s 2022 white paper “Automated Web Threat Harvesting.”

⚙️ Rate Limiting Policy

Rate limiting is recommended because the bot’s scanning can generate significant load during large-scale recrawls. A threshold of 50 requests per second per source IP is typical: it guards against denial-of-service conditions while still allowing legitimate security scans. The rationale is that aggressive scanning may degrade site performance for real users, so the bot is throttled conservatively rather than blocked outright.

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.