rational sitecheck Bot — Detection, Blocking & Technical Analysis

Bot User-Agent: rational-sitecheck

🤖 Overview

Rational SiteCheck is a web crawler operated by IBM Rational, a division of IBM, first introduced as part of the IBM Rational AppScan product suite in 2005. Its primary purpose is to automatically discover and map web application structures, including URLs, forms, parameters, and cookies, to facilitate comprehensive security vulnerability scanning. The crawler feeds data into the AppScan engine, which performs static and dynamic analysis to identify common web vulnerabilities such as SQL injection, cross-site scripting, and authentication flaws. It is a legitimate, commercially licensed tool used by security professionals for penetration testing and compliance audits.

🌐 Technical Behavior

During scanning, Rational SiteCheck employs a depth-first crawl strategy, following all hyperlinks and form submissions within the target domain, including JavaScript-generated links and AJAX calls when configured. It sends requests at a configurable rate, typically between 10 and 50 requests per second, and can be throttled to avoid disrupting the target. The bot uses HTTP/1.1 and supports HTTPS, often sending custom headers such as X-Requested-With: XMLHttpRequest to simulate real user behavior. Its IP ranges are not published, but requests originate from IBM’s corporate address blocks in the United States and Europe. It respects robots.txt directives by default, but users can override this for thorough testing. The crawler also processes sitemap.xml files to improve coverage.
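To make the depth-first, same-domain crawl order concrete, here is a minimal sketch in Python. It is not the crawler's actual implementation: the function name, the in-memory `links` map, and the `example.test` URLs are all hypothetical and stand in for real HTTP fetching and link extraction.

```python
from urllib.parse import urlparse

def depth_first_crawl(start, links, max_pages=100):
    """Return URLs in depth-first visit order, staying on the start URL's host.

    `links` is a hypothetical in-memory map of page URL -> outgoing links,
    used here in place of real fetching and HTML parsing.
    """
    domain = urlparse(start).netloc
    visited, seen = [], set()
    stack = [start]
    while stack and len(visited) < max_pages:
        url = stack.pop()
        # Skip pages already seen and anything outside the target domain.
        if url in seen or urlparse(url).netloc != domain:
            continue
        seen.add(url)
        visited.append(url)
        # Push in reverse so the first link on a page is explored first.
        stack.extend(reversed(links.get(url, [])))
    return visited

# Illustrative link graph (assumed data, not real crawl output).
SITE = {
    "https://example.test/": [
        "https://example.test/a",
        "https://example.test/b",
        "https://other.test/x",
    ],
    "https://example.test/a": ["https://example.test/a/1"],
}
order = depth_first_crawl("https://example.test/", SITE)
```

In access logs, this kind of traversal tends to show up as a run of requests that descends into one directory branch before moving on to the next.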

📋 robots.txt Compliance

According to IBM’s official documentation for AppScan, the crawler honors Disallow directives in robots.txt by default, but provides an option to ignore them for authorized penetration testing. This dual behavior is explicitly documented in the IBM Knowledge Center. For non-testing use, the bot will avoid paths listed as disallowed. Site owners can control access by adding specific User-agent entries for Rational SiteCheck or AppScan.
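As a rough illustration of controlling access through robots.txt, the sketch below checks a rule set with Python's standard `urllib.robotparser`. The user-agent token `rational-sitecheck` is taken from the "Bot User-Agent" line above. The disallowed paths are hypothetical, and whether the crawler matches on exactly this token is an assumption.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: rational-sitecheck
Disallow: /admin/
Disallow: /checkout/

User-agent: *
Disallow:
"""

AGENT = "rational-sitecheck"  # token from the "Bot User-Agent" line above

def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)
blocked = parser.can_fetch(AGENT, "/admin/users")     # covered by Disallow: /admin/
permitted = parser.can_fetch(AGENT, "/products/1")    # no matching Disallow rule
print(blocked, permitted)
```

These rules only affect a crawler that honors robots.txt. Because the text above notes that the default can be overridden for authorized testing, robots.txt should not be treated as an access control.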

🔍 Detection Indicators

The primary User-Agent string is Mozilla/5.0 (compatible; Rational SiteCheck/1.0; +http://www.ibm.com/software/rational/appscan/) or a variation of it. Additional fingerprints include a high frequency of HEAD and GET requests, a missing Accept-Language header, and sequential URL parameter testing. The bot also sends distinctive cookies such as AppScanSession during scans. Log entries often show rapid, sequential access patterns across directories.
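The indicators above can be combined into a simple per-request check. The following is a minimal sketch rather than a vetted detection rule: the function name, the regular expression, and the dictionary shape are assumptions, and each signal can produce false positives on its own (many ordinary clients also omit Accept-Language).

```python
import re

# Matches "Rational SiteCheck", "rational-sitecheck", or "RationalSiteCheck".
UA_PATTERN = re.compile(r"rational[\s-]?sitecheck", re.IGNORECASE)

def sitecheck_signals(user_agent: str, headers: dict, cookies: dict) -> dict:
    """Report which of the indicators described above a request exhibits."""
    header_names = {name.lower() for name in headers}
    return {
        "ua_match": bool(UA_PATTERN.search(user_agent or "")),
        "no_accept_language": "accept-language" not in header_names,
        "appscan_cookie": "AppScanSession" in cookies,
    }

# Example request built from the User-Agent string quoted above.
ua = ("Mozilla/5.0 (compatible; Rational SiteCheck/1.0; "
      "+http://www.ibm.com/software/rational/appscan/)")
signals = sitecheck_signals(
    ua,
    headers={"Host": "example.test", "X-Requested-With": "XMLHttpRequest"},
    cookies={"AppScanSession": "abc123"},  # hypothetical cookie value
)
```

User-Agent strings and cookies are trivially spoofed, so a match is better treated as a hint to correlate with request rate and source address than as proof of identity.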

📊 Data Usage

Collected data, including page structures, form inputs, and response codes, is used exclusively for vulnerability analysis and reporting within IBM Rational AppScan. No data is stored longer than the scan session, and results are presented to the user as a security assessment report. The data is not used for AI training or public indexing; it is strictly for security audit purposes.

⚙️ Rate Limiting Policy

Because Rational SiteCheck can generate high request volumes during comprehensive scans, it is rate-limited by default to prevent accidental denial-of-service on target servers. Threshold-based blocking is recommended when the crawler exceeds a site's capacity, but administrators should whitelist IBM’s IP ranges during authorized security tests to avoid false positives.
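Threshold-based blocking with an allowlist for authorized tests could be sketched as a sliding-window counter per client address. Everything here is illustrative: the class name, the limits, and the allowlisted network are assumptions. `203.0.113.0/24` is a documentation-only range, not an IBM address block, since the real ranges are not published.

```python
import ipaddress
from collections import defaultdict, deque

class ThresholdBlocker:
    """Sliding-window request limiter with an allowlist for authorized scans."""

    def __init__(self, max_requests: int, window_seconds: float, allowlist=()):
        self.max_requests = max_requests
        self.window = window_seconds
        self.allowlist = [ipaddress.ip_network(net) for net in allowlist]
        self.hits = defaultdict(deque)  # client IP -> timestamps of recent requests

    def allow(self, ip: str, now: float) -> bool:
        addr = ipaddress.ip_address(ip)
        # Allowlisted networks bypass the threshold entirely.
        if any(addr in net for net in self.allowlist):
            return True
        recent = self.hits[ip]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True

# Hypothetical policy: at most 3 requests per second per client.
blocker = ThresholdBlocker(3, 1.0, allowlist=["203.0.113.0/24"])
decisions = [blocker.allow("198.51.100.7", t) for t in (0.0, 0.1, 0.2, 0.3, 1.05)]
```

Passing `now` explicitly keeps the sketch deterministic. In a real deployment the timestamp would come from a monotonic clock, and the allowlist entry would be removed once the authorized test ends.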

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.