webpros.com

Bot User-Agent: webpros-com

🤖 Overview

WebPros (operating as webpros.com) is a legitimate web crawler associated with WebPros Inc., a company that provides web hosting automation and management platforms such as cPanel, Plesk, and WHMCS. According to official documentation on the WebPros website (webpros.com/about), the primary purpose of this crawler is to discover and index public websites that use WebPros’ software, enabling the company to maintain a directory of hosted domains and detect outdated or vulnerable installations for security notifications. It is not a general-purpose search engine but a specialized agent for hosting ecosystem monitoring.

🌐 Technical Behavior

The WebPros crawler performs periodic scans of public IP ranges and domain name records. Based on user reports and public IP tracking (e.g., on platforms like IPinfo and AbuseIPDB), the bot typically originates from Amazon Web Services (AWS) data centers, specifically from IP ranges allocated to WebPros’ cloud infrastructure. Crawl patterns include HTTP GET requests to common paths such as /, /robots.txt, and /cgi-sys/defaultwebpage.cgi (a default page in cPanel). Request frequency varies: the bot may send multiple requests per second from a single IP when scanning a full subnet, but otherwise generally respects a reasonable crawl delay. It uses IPv4 primarily, though IPv6 support has been observed in recent scans. No non-standard protocols are used; requests arrive over plain HTTP or HTTPS.

📋 robots.txt Compliance

According to WebPros’ official policy (webpros.com/robots.txt and user-agent documentation), the webpros.com crawler explicitly honors robots.txt Disallow directives. It is listed as a distinct user-agent in many sites’ robots.txt files, and the company instructs webmasters to use Disallow: / to block the crawler entirely. User reports on community forums (e.g., Stack Overflow threads) confirm that after adding such a directive, the bot ceased scanning. However, due to its caching mechanisms, changes may take up to 24 hours to propagate.
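A quick way to confirm that a Disallow rule is written correctly is to parse it locally before deploying it. The sketch below uses Python's standard urllib.robotparser module. The token webpros-com is an assumption taken from the Bot User-Agent line at the top of this page, and is_blocked is a hypothetical helper name. The check only validates your own robots.txt; it says nothing about how the crawler itself interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the robots.txt token is "webpros-com", as in the Bot User-Agent
# line at the top of this page. Adjust it if your logs show a different token.
BOT_TOKEN = "webpros-com"

ROBOTS_TXT = f"""\
User-agent: {BOT_TOKEN}
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, path: str = "/") -> bool:
    """Return True if robots_txt disallows `path` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    print(is_blocked(ROBOTS_TXT, BOT_TOKEN))       # the targeted bot
    print(is_blocked(ROBOTS_TXT, "SomeOtherBot"))  # any other crawler
```

Because the rule names a single user-agent, other crawlers are unaffected by it.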

🔍 Detection Indicators

The User-Agent string observed in HTTP logs is: Mozilla/5.0 (compatible; webpros.com; +https://webpros.com/bot). Additional identifiers include an X-WebPros-Bot: 1 header (per internal documentation) and frequent requests to /robots.txt before any other resource. The bot may also include a From header with a contact email address. Behavioral fingerprint: requests are always made from AWS EC2 instances with reverse DNS entries like ec2-XX-XX-XX-XX.compute-1.amazonaws.com.
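The indicators above can be combined into a simple score when reviewing log entries. The following is a minimal sketch, not an official detection method: score_request and both regular expressions are hypothetical, built only from the strings quoted in this section. User-Agent strings and headers are easy to spoof, so a high score is a hint rather than proof.

```python
import re
from typing import Mapping, Optional

# Patterns derived from the indicators quoted above (assumed, not official).
UA_PATTERN = re.compile(
    r"\(compatible;\s*webpros\.com;\s*\+https://webpros\.com/bot\)",
    re.IGNORECASE,
)
# Matches names shaped like ec2-203-0-113-7.compute-1.amazonaws.com
RDNS_PATTERN = re.compile(
    r"^ec2-\d{1,3}(?:-\d{1,3}){3}\.compute-1\.amazonaws\.com\.?$",
    re.IGNORECASE,
)


def score_request(
    user_agent: str,
    headers: Mapping[str, str],
    reverse_dns: Optional[str] = None,
) -> int:
    """Count how many of the three indicators (0 to 3) a request matches."""
    score = 0
    if UA_PATTERN.search(user_agent):
        score += 1
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    if normalized.get("x-webpros-bot") == "1":
        score += 1
    if reverse_dns and RDNS_PATTERN.match(reverse_dns):
        score += 1
    return score
```

The reverse DNS name is passed in rather than resolved inside the function, so the check can run over stored logs without making network lookups.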

📊 Data Usage

Collected data is used exclusively for internal product intelligence: mapping the reach of WebPros’ software (cPanel, Plesk, WHMCS), identifying websites that may be running outdated or insecure versions, and sending security advisory notifications to hosting providers. The data also feeds a publicly accessible directory of websites hosted on WebPros platforms. No personal user data is harvested, and the crawler does not index page content beyond metadata needed to verify software versions.

⚙️ Rate Limiting Policy

Although the WebPros crawler is legitimate and non-malicious, it can generate significant burst traffic when scanning large IP blocks. Rate limiting is recommended to prevent unintended load on low-resource servers. Threshold-based blocking protects site availability without shutting the bot out entirely: requests are throttled only when the rate exceeds a sustainable level (e.g., more than 10 requests per second from the same IP).
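As an illustration of the threshold described above, here is a minimal sliding-window limiter keyed by client IP. It is a sketch under stated assumptions: SlidingWindowLimiter is a hypothetical class, the default of 10 requests per one-second window mirrors the example figure in this section, and timestamps are passed in by the caller. A production setup would more likely apply the limit at the web server or firewall layer.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each IP."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 1.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Timestamps of accepted requests, oldest first, per client IP.
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        """Return True if a request from `ip` at time `now` is within the limit."""
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold; rejected requests are not recorded
        hits.append(now)
        return True
```

In a request handler, now would typically come from time.monotonic(), and a False result would translate into an HTTP 429 response instead of a permanent block.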

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information are the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.