Trae Bot — Detection, Blocking & Technical Analysis

Trae

Bot User-Agent: trae

🤖 Overview

Trae is a legitimate web crawler operated by Trae AI, a company that develops large language models for code generation. According to Trae’s official documentation at docs.trae.ai, the bot systematically collects publicly accessible web content—including technical documentation, programming blogs, and open-source repositories—to train the Trae Code Assistant product, which provides real-time coding suggestions. The crawler was first deployed in early 2024 and is designed to support continuous model improvement.

🌐 Technical Behavior

By default, TraeBot sends 10 requests per second per IP over HTTP/1.1 and HTTP/2, as documented in its crawler policy. It identifies itself with the User-Agent string Mozilla/5.0 (compatible; TraeBot/1.0; +https://trae.ai/crawler). The bot rotates through Cloudflare IP ranges 104.16.0.0/12 and 172.64.0.0/13 (AS13335 and AS209242). Each request carries a From header with a contact email address and a unique Trae-Crawl-ID header for request tracking. TraeBot requests only text-based files (HTML, PDF, source code) and ignores binary content unless explicitly allowed. It also supports conditional requests using If-Modified-Since and ETags to minimize server load during recrawls.
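The identifiers above can be combined into a simple first-pass check. The following is a minimal sketch in Python, assuming the User-Agent token and IP ranges quoted in this section are accurate; the function name classify_request and its return labels are hypothetical. Because the listed ranges are shared Cloudflare address space, an IP match on its own says little, so the sketch only treats it as supporting evidence for a User-Agent match.

```python
import ipaddress
import re

# User-Agent token and IP ranges as quoted in this section (not independently verified).
TRAEBOT_UA = re.compile(r"\bTraeBot/\d+(\.\d+)*", re.IGNORECASE)
TRAEBOT_NETWORKS = [
    ipaddress.ip_network("104.16.0.0/12"),
    ipaddress.ip_network("172.64.0.0/13"),
]


def classify_request(user_agent, remote_ip):
    """Return 'likely-traebot', 'ua-only', or 'other' (hypothetical labels)."""
    ua_match = bool(TRAEBOT_UA.search(user_agent or ""))
    try:
        addr = ipaddress.ip_address(remote_ip)
    except ValueError:
        addr = None  # malformed address: treat as no IP match
    ip_match = addr is not None and any(addr in net for net in TRAEBOT_NETWORKS)

    if ua_match and ip_match:
        return "likely-traebot"
    if ua_match:
        return "ua-only"  # claims to be TraeBot but comes from an unlisted range
    return "other"
```

A request labeled ua-only is worth a closer look, since a User-Agent string is trivial to spoof.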

📋 robots.txt Compliance

TraeBot fully respects the Robots Exclusion Standard, checking robots.txt before each site visit and honoring both Disallow and Allow directives as well as path-level restrictions. It does not implement the Crawl-Delay directive; instead, it enforces its own rate-limiting algorithm. Site owners can request adjustments via the contact page at trae.ai/crawler.
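To illustrate how Allow and Disallow rules might be written and checked before deployment, here is a minimal sketch using Python's standard urllib.robotparser. The product token TraeBot is taken from the User-Agent string above and is an assumption about what the crawler matches on; the paths and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: open one docs subtree to TraeBot, close the rest of /docs/ and /private/.
ROBOTS_TXT = """\
User-agent: TraeBot
Allow: /docs/public/
Disallow: /docs/
Disallow: /private/

User-agent: *
Allow: /
"""


def build_parser(text):
    """Parse robots.txt content from a string instead of fetching it over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = build_parser(ROBOTS_TXT)

# Pass the product token, not the full User-Agent string, when querying the parser.
for path in ("/docs/public/intro.html", "/docs/internal/notes.html", "/blog/post"):
    verdict = rules.can_fetch("TraeBot", "https://example.com" + path)
    print(path, "allowed" if verdict else "blocked")
```

Python's parser applies the first matching rule, while RFC 9309 crawlers use the longest match. Listing the more specific Allow line first keeps both interpretations in agreement.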

🔍 Detection Indicators

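The primary identifier is the User-Agent string listed above. TraeBot's IP addresses resolve via reverse DNS to *.trae.ai or Cloudflare hostnames. The bot is also reported to include an X-Robots-Tag header, although X-Robots-Tag is conventionally a response header set by the server, so this signal should be treated with caution. Its behavioral fingerprint shows consistent 100 ms intervals between requests (matching the default rate of 10 requests per second), which distinguishes it from human traffic and from other crawlers.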
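Two of these indicators, the reverse DNS suffix and the regular request spacing, can be checked from server logs. The sketch below is one possible approach in Python, assuming the *.trae.ai suffix and the 100 ms interval quoted above; the function names, the tolerance, and the minimum sample size are hypothetical choices. The DNS helper uses gethostbyname_ex, which only covers IPv4.

```python
import socket
import statistics


def hostname_matches(hostname, suffixes=(".trae.ai",)):
    """True if a reverse-DNS hostname ends with one of the expected suffixes."""
    host = hostname.rstrip(".").lower()
    return any(host.endswith(suffix) for suffix in suffixes)


def forward_confirmed_rdns(ip):
    """Return the PTR hostname if it resolves back to the same IPv4 address, else None."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
        _, _, addresses = socket.gethostbyname_ex(hostname)
    except OSError:  # covers socket.herror and socket.gaierror
        return None
    return hostname if ip in addresses else None


def looks_machine_paced(timestamps, expected=0.1, tolerance=0.02, min_requests=5):
    """True if gaps between request times (in seconds) sit close to the expected interval."""
    if len(timestamps) < min_requests:
        return False  # too few samples to judge
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    mean_ok = abs(statistics.mean(gaps) - expected) <= tolerance
    steady = statistics.pstdev(gaps) <= tolerance
    return mean_ok and steady
```

In practice the three checks would be combined: confirm the hostname with forward_confirmed_rdns, test it with hostname_matches, and use looks_machine_paced on per-IP request times as a supporting signal rather than proof.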

📊 Data Usage

All collected data is exclusively used to train Trae’s AI models, including the base language model and fine-tuned versions for code generation, debugging, and documentation summarization, as stated in Trae’s privacy policy. The Trae Code Assistant relies on this data to provide context-aware suggestions. Trae asserts that crawled content is not sold or shared with third parties.

⚙️ Rate Limiting Policy

Because TraeBot can generate high request volumes, administrators should rate-limit it at 10 requests per second per IP, which matches the bot's default rate. When the limit is exceeded, the bot pauses and retries after a cooldown, so the limit protects the site while still allowing data collection for AI training.
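A per-IP limit of this kind is often implemented as a token bucket. The following Python sketch shows the idea under the 10 requests per second figure given above; the class name PerIpRateLimiter, the burst capacity of 10, and the injectable clock are assumptions made for illustration, not part of any Trae or Boteraser tooling.

```python
import time


class PerIpRateLimiter:
    """Token bucket per client IP: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate=10.0, capacity=10.0, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.buckets = {}   # ip -> (tokens_remaining, last_update_time)

    def allow(self, ip):
        """Return True if the request may proceed, False if it should be rejected (e.g. HTTP 429)."""
        now = self.clock()
        tokens, last = self.buckets.get(ip, (self.capacity, now))
        # Refill according to elapsed time, never exceeding the bucket capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[ip] = (tokens - 1.0, now)
            return True
        self.buckets[ip] = (tokens, now)
        return False
```

A production deployment would more likely use the rate-limiting features of the web server or CDN, and would need to evict idle entries from the bucket table; the sketch only shows the accounting.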


ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.
