trex

Bot User-Agent: trex

🤖 Overview

trex is a web crawler operated by T-REX Inc., a company specializing in large-scale web data extraction for AI training and market intelligence. The bot identifies itself with the user-agent token T-REX/1.0 and was first documented in the company's official developer portal. It feeds data into the company's proprietary machine learning models and analytics platform. According to T-REX's public documentation, the crawler collects publicly available web content to improve the company's natural language understanding and trend analysis products.

🌐 Technical Behavior

trex employs a distributed crawling architecture, requesting pages at approximately 10–20 requests per second per IP, with bursts during off-peak hours. The crawler speaks HTTP/1.1 and HTTP/2 and handles responses sent with Transfer-Encoding: chunked. Its IP ranges are announced via ASN 39471 (T-REX Inc.) and include blocks from major cloud providers such as AWS (us-east-1) and Google Cloud (us-central1). Observed User-Agent values include Mozilla/5.0 (compatible; T-REX/1.0; +https://trex.com/crawler) and trexbot/1.0. trex does not follow JavaScript redirects and processes only static HTML, so pages that require client-side rendering are ignored. Each request carries a custom X-Trex-Crawl-ID header that helps site operators identify individual crawl sessions.

📋 robots.txt Compliance

According to T-REX Inc.’s official FAQ, trex fully honors robots.txt directives, including per-path Disallow rules and Crawl-delay instructions. The crawler is designed to avoid disallowed pages and reduces its request rate when a Crawl-delay directive is present. T-REX provides a public validation tool at https://trex.com/robots-check where site owners can test compliance.
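As a rough illustration of the directives mentioned above, the following Python sketch parses a sample robots.txt with the standard library's urllib.robotparser and checks what a compliant crawler would be allowed to fetch. The robots.txt token "trex", the /private/ path, the 10-second delay, and the example.com URLs are assumptions made for the example; they are not taken from T-REX's documentation, so confirm the token the crawler actually matches before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the "trex" token, the /private/ path and the
# 10-second delay are illustrative assumptions, not documented values.
ROBOTS_TXT = """\
User-agent: trex
Disallow: /private/
Crawl-delay: 10

User-agent: *
Disallow:
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A crawler that honors the rules would skip the disallowed path ...
print(parser.can_fetch("trex", "https://example.com/private/report.html"))
# ... remain free to fetch everything else ...
print(parser.can_fetch("trex", "https://example.com/blog/post.html"))
# ... and wait this many seconds between requests.
print(parser.crawl_delay("trex"))
```

This only shows how the rules would be interpreted by a standards-following parser. Whether trex behaves the same way is a claim from T-REX's FAQ and can be checked against server logs.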

🔍 Detection Indicators

The primary User-Agent string is Mozilla/5.0 (compatible; T-REX/1.0; +https://trex.com/crawler). Additional fingerprints include a missing Accept-Language header, a consistently empty Referer header, and a User-Agent that omits common platform details. The bot also exhibits a predictable request interval pattern with no randomization, making it identifiable through behavioral analysis. Security researchers at the Web Crawler Identification Project have published a signature based on the combination of the User-Agent and the custom X-Trex-Crawl-ID header.
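The indicators above can be combined into a simple header check. The sketch below is a minimal Python example, not the Web Crawler Identification Project's published signature: the function names, the regular expression, and the decision to require both the User-Agent match and the X-Trex-Crawl-ID header are assumptions chosen to mirror the description. Request headers can be spoofed, so the result is best treated as a hint rather than proof.

```python
import re
from typing import Dict, Mapping

# Matches the two User-Agent tokens listed above (T-REX/1.0, trexbot/1.0).
# The exact pattern is an assumption written for this example.
UA_PATTERN = re.compile(r"(?:T-REX|trexbot)/\d+(?:\.\d+)*", re.IGNORECASE)


def collect_signals(headers: Mapping[str, str]) -> Dict[str, bool]:
    """Return the individual indicators found in one request's headers."""
    h = {name.lower(): value for name, value in headers.items()}
    return {
        "ua_match": bool(UA_PATTERN.search(h.get("user-agent", ""))),
        "crawl_id_header": "x-trex-crawl-id" in h,
        "no_accept_language": "accept-language" not in h,
        "empty_referer": not h.get("referer", "").strip(),
    }


def looks_like_trex(headers: Mapping[str, str]) -> bool:
    """Flag a request only when the User-Agent and the custom header agree."""
    signals = collect_signals(headers)
    return signals["ua_match"] and signals["crawl_id_header"]


crawler_request = {
    "User-Agent": "Mozilla/5.0 (compatible; T-REX/1.0; +https://trex.com/crawler)",
    "X-Trex-Crawl-ID": "example-session",  # placeholder value
}
browser_request = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0",
    "Accept-Language": "en-US,en;q=0.9",
    "Referer": "https://example.com/",
}

print(looks_like_trex(crawler_request))
print(looks_like_trex(browser_request))
```

The missing Accept-Language and empty Referer signals are collected but not used for the final decision, because many unrelated clients also omit them; they are more useful as supporting evidence during log review.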

📊 Data Usage

Collected data is primarily used to train T-REX's large language models and to generate market trend reports for enterprise clients. The company's privacy policy states that no personally identifiable information is intentionally stored and that all text is filtered to remove email addresses and phone numbers. trex also feeds a real-time analytics dashboard that tracks brand mentions and sentiment across the web.

⚙️ Rate Limiting Policy

trex is rate-limited because its high request frequency can overwhelm smaller web servers. Threshold-based blocking is recommended: it preserves site performance while still allowing legitimate data collection for AI training and analytics.
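One way to implement threshold-based blocking is a sliding-window counter keyed by client IP. The Python sketch below is a minimal in-memory version under stated assumptions: the class name, the limit of 3 requests per second, and the sample IP address are hypothetical, and a production setup would more likely enforce the limit in the web server, CDN, or a shared store.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_key: str, now: Optional[float] = None) -> bool:
        """Record a request and return True if it is under the threshold."""
        now = time.monotonic() if now is None else now
        hits = self._hits[client_key]
        cutoff = now - self.window_seconds
        # Drop timestamps that have fallen out of the window.
        while hits and hits[0] <= cutoff:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold: block or delay this request
        hits.append(now)
        return True


# Hypothetical threshold: 3 requests per second per IP.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=1.0)
ip = "203.0.113.7"  # documentation-range address used as a placeholder

# Timestamps are passed explicitly so the behavior is reproducible.
decisions = [limiter.allow(ip, now=t) for t in (0.0, 0.1, 0.2, 0.3, 1.15)]
print(decisions)  # [True, True, True, False, True]
```

The fourth request is rejected because three requests already fall inside the one-second window; by 1.15 seconds the two oldest have expired, so the fifth is allowed. Rejected requests are not recorded, which means a blocked client recovers as soon as its earlier requests age out.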

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.