attach

Bot User-Agent: attach

🤖 Overview

Attach is a legitimate web crawler operated by Attach Inc., a company specializing in AI‑powered customer engagement and conversational marketing. According to their official documentation at attach.io/crawler, the bot is designed to collect publicly accessible web content to train and improve their proprietary natural language understanding models, which power chatbots and live‑chat agents for e‑commerce and support websites.

🌐 Technical Behavior

The Attach crawler uses standard HTTP/1.1 GET requests with a default crawl frequency of approximately 10 requests per minute per domain, as documented in their rate‑limiting guidelines. It respects the Crawl‑Delay directive and operates from IP ranges published in the attach‑crawler‑ips list on their site, including 45.33.32.0/24 and 208.72.68.0/24. The bot uses IPv4 exclusively and sends requests with a custom User‑Agent header that includes a version identifier (e.g., Attach/2.0). It does not execute JavaScript or interact with forms, focusing on static HTML and text content.
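As an illustration of how the published ranges could be used, the following Python sketch checks whether a client address falls inside the two CIDR blocks quoted above. The function name `is_attach_ip` and the hard-coded list are assumptions made for this example; a real deployment would more likely load the ranges from the operator's published list, which may change over time.

```python
import ipaddress

# CIDR blocks quoted in the text above. Hard-coding them is an assumption
# made for this example; the published list may change.
ATTACH_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("45.33.32.0/24", "208.72.68.0/24")
]


def is_attach_ip(addr: str) -> bool:
    """Return True if addr parses as an IP inside one of the listed ranges."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        # Not a valid IP address at all.
        return False
    # An IPv6 address is never "in" an IPv4 network, so this yields False
    # for IPv6 clients, matching the IPv4-only behavior described above.
    return any(ip in net for net in ATTACH_RANGES)
```

An address match on its own is only one signal; it is usually combined with the User‑Agent and reverse DNS checks described under Detection Indicators.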

📋 robots.txt Compliance

Attach Inc. states in their robots‑policy page that their crawler fully honors Disallow directives in robots.txt. The bot checks the file at each domain before crawling and will not override any access restrictions. This compliance is verified through server logs where blocked paths are never requested after a Disallow rule is added.
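A minimal sketch of how such a log check might look, using Python's standard `urllib.robotparser`. The robots.txt body, the `attach` user-agent token, the sample paths, and the `find_violations` helper are all hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The "attach" token is an assumption based on the
# bot name used on this page.
ROBOTS_TXT = """\
User-agent: attach
Disallow: /private/
Crawl-delay: 10
"""


def find_violations(robots_txt, requested_paths, user_agent="Attach/2.0"):
    """Return the requested paths that robots_txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in requested_paths if not parser.can_fetch(user_agent, p)]


# Paths as they might be extracted from access-log lines for this bot.
logged_paths = ["/faq", "/products/42", "/private/report.html"]
violations = find_violations(ROBOTS_TXT, logged_paths)
```

An empty `violations` list for requests logged after the rule was added would be consistent with the compliance claim; any entry in it marks a request to a disallowed path.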

🔍 Detection Indicators

The primary detection method is the User‑Agent string: Attach/2.0 (or AttachBot/1.0 in earlier versions). The bot also includes a custom X‑Attach‑Crawl header set to true. Reverse DNS lookups on incoming IPs resolve to subdomains like crawler.attach.io. The bot does not spoof its identity and can be identified by these consistent fingerprints.
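The three indicators above can be combined into a simple check. The sketch below is an assumption about how one might do this, not a documented detection routine: the `attach_signals` helper, the regular expression, and the ASCII header name `X-Attach-Crawl` are illustrative. The reverse DNS hostname is passed in by the caller (for example, obtained with `socket.gethostbyaddr`) so that the function itself needs no network access.

```python
import re
from typing import Dict, Optional

# Matches version-tagged tokens such as "Attach/2.0" or "AttachBot/1.0".
UA_PATTERN = re.compile(r"\bAttach(?:Bot)?/\d+(?:\.\d+)*")


def attach_signals(headers: Dict[str, str],
                   rdns_host: Optional[str] = None) -> Dict[str, bool]:
    """Report which of the three described fingerprints a request shows."""
    # HTTP header names are case-insensitive, so normalize them first.
    h = {name.lower(): value for name, value in headers.items()}
    host = (rdns_host or "").rstrip(".").lower()
    return {
        "user_agent": bool(UA_PATTERN.search(h.get("user-agent", ""))),
        "crawl_header": h.get("x-attach-crawl", "").strip().lower() == "true",
        "reverse_dns": host.endswith(".attach.io"),
    }


signals = attach_signals(
    {"User-Agent": "Attach/2.0", "X-Attach-Crawl": "true"},
    rdns_host="crawler.attach.io",
)
```

Because headers are trivially forged by other clients, the `reverse_dns` signal (ideally confirmed with a forward lookup of the returned hostname) is the stronger of the three.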

📊 Data Usage

Collected content is used exclusively to train Attach’s conversational AI models, which generate automated responses for customer support interactions. The data includes FAQ pages, product descriptions, and policy documents. Attach’s privacy policy confirms that no personally identifiable information (PII) is intentionally stored, and all data is anonymized before model training.

⚙️ Rate Limiting Policy

Attach is not considered malicious, but it is still subject to rate limiting because its default crawl pace (up to 10 requests/min) can strain smaller servers. Threshold‑based blocking (e.g., 50 requests in 5 minutes, which works out to an average of 10 requests/min) is a reasonable policy: it prevents resource exhaustion while allowing legitimate access, and it is consistent with Attach’s own recommended guidelines.
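One way to express a threshold of 50 requests in 5 minutes is a sliding-window counter. The Python sketch below is an illustrative assumption, not the policy engine of any particular product; the class name `SlidingWindowLimiter` and its parameters are hypothetical.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int = 50, window_seconds: float = 300.0):
        self.max_requests = max_requests
        self.window = window_seconds
        # Per-client timestamps of accepted requests, oldest first.
        self._hits = defaultdict(deque)

    def allow(self, client: str, now: float) -> bool:
        hits = self._hits[client]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # Over threshold: block or throttle this request.
        hits.append(now)
        return True


limiter = SlidingWindowLimiter()
# A burst of 51 requests at the same instant: the 51st exceeds the threshold.
burst = [limiter.allow("203.0.113.7", now=0.0) for _ in range(51)]
```

With these defaults, a client that spaces its requests evenly at 10 per minute stays just inside the limit, while short bursts above that pace are rejected until older requests age out of the window.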

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information are the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.