BuiltWith
Bot User-Agent: builtwith
🤖 Overview
BuiltWith is a legitimate web technology profiler operated by BuiltWith Pty Ltd, an Australian company based in Sydney. Its primary purpose is to identify the web technologies, frameworks, analytics tools, and hosting providers used by websites, feeding data into the BuiltWith Pro and Trends products available at builtwith.com. The crawler automatically visits millions of sites to build a comprehensive technology lookup database, which is used by developers, marketers, and sales teams for competitive analysis and lead generation.
🌐 Technical Behavior
BuiltWith crawls from a distributed set of dedicated IP addresses, drawn from its own infrastructure and from cloud providers. According to official documentation on its website, the crawler identifies itself with the User-Agent string “Mozilla/5.0 (compatible; BuiltWith/1.0; +http://builtwith.com/)” and may also report “BuiltWith/1.1” or “BuiltWith/2.0”, depending on the crawl version. The bot typically requests robots.txt first, then fetches the homepage HTML and a limited number of subpages to detect technology signatures. Request frequency can be high (often dozens of requests per second across a domain), but the crawler respects a crawl delay if one is specified in robots.txt. BuiltWith does not follow internal links aggressively; instead, it performs targeted scans from a precompiled list of domains or user-submitted URLs. The bot uses HTTP/1.1 and HTTP/2 and may send an Accept-Encoding header to request gzip compression. Its IP ranges are not publicly documented, but network analysis shows traffic originating from AS16509 (Amazon) and AS13335 (Cloudflare), among others, with addresses rotating frequently to avoid rate limits on target servers.
📋 robots.txt Compliance
BuiltWith explicitly states on its website that it respects the rules defined in a site’s robots.txt file, including Disallow directives and Crawl-delay settings. The bot scans only a limited set of pages (typically those needed to confirm technology presence), so narrow path-specific rules may not affect it, whereas a site-wide Disallow (“/”) will stop all BuiltWith crawling. There is no evidence of non-compliance in official support pages or security advisories; the company has a published crawling policy at builtwith.com/crawler that confirms adherence to the Robots Exclusion Protocol.
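The two robots.txt options described above (a crawl delay versus a site-wide block) can be checked locally before deployment. The following is a minimal sketch using Python's standard `urllib.robotparser`; the `BuiltWith` user-agent token, the `/admin/` path, and the `example.com` URLs are illustrative assumptions, not values confirmed by BuiltWith.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt bodies. The "BuiltWith" token is assumed to match
# the product name that appears in the crawler's User-Agent string.
ROBOTS_THROTTLE = """\
User-agent: BuiltWith
Crawl-delay: 10
Disallow: /admin/
"""

ROBOTS_BLOCK = """\
User-agent: BuiltWith
Disallow: /
"""


def parse_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


throttle = parse_rules(ROBOTS_THROTTLE)
block = parse_rules(ROBOTS_BLOCK)

# robotparser matches on the product token, so pass "BuiltWith" here
# rather than the full "Mozilla/5.0 (compatible; ...)" string.
print(throttle.crawl_delay("BuiltWith"))                        # requested delay in seconds
print(throttle.can_fetch("BuiltWith", "https://example.com/"))  # homepage under the throttle rules
print(block.can_fetch("BuiltWith", "https://example.com/"))     # homepage under the site-wide block
```

This only shows how a standards-following parser reads the rules; whether the crawler honors them in practice still has to be confirmed from server logs.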
🔍 Detection Indicators
The primary detection indicator is a User-Agent string containing “BuiltWith”, for example “Mozilla/5.0 (compatible; BuiltWith/1.0; +http://builtwith.com/)”. The bot does not send a distinctive X-Robots-Tag header, but it may include a Referer header pointing to builtwith.com. Behavioral fingerprints include a very low number of page requests per session (often just 1-3), rapid sequential requests to multiple unrelated domains, and a lack of human-like interaction patterns (no scrolling, no image downloads). Server logs will show hits from IPs that resolve to *.builtwith.com or to generic cloud provider hostnames, with high concurrency across different targets.
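As a rough illustration of the User-Agent indicator, the sketch below matches the “BuiltWith” token and extracts a version number when one is present. The regular expression and the function name `match_builtwith` are assumptions derived from the example strings quoted in this entry, not an official pattern. A User-Agent header can be spoofed, so a match is best treated as a hint rather than proof.

```python
import re
from typing import Optional, Tuple

# Pattern assumed from the User-Agent examples in this entry:
# the token "BuiltWith", optionally followed by "/<version>".
BUILTWITH_UA = re.compile(r"\bBuiltWith(?:/(\d+(?:\.\d+)*))?", re.IGNORECASE)


def match_builtwith(user_agent: Optional[str]) -> Tuple[bool, Optional[str]]:
    """Return (matched, version) for a raw User-Agent header value.

    version is None when the token appears without a version suffix.
    """
    found = BUILTWITH_UA.search(user_agent or "")
    if found is None:
        return False, None
    return True, found.group(1)


sample = "Mozilla/5.0 (compatible; BuiltWith/1.0; +http://builtwith.com/)"
print(match_builtwith(sample))
print(match_builtwith("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```

Combining this check with the behavioral fingerprints above (very few requests per session, no asset downloads) gives a stronger signal than the header alone.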
📊 Data Usage
Collected data, primarily technology fingerprints (JavaScript libraries, server headers, analytics tags, CSS class patterns), is compiled into the BuiltWith database and made available to subscribers via an API and a web dashboard. The raw crawl data is also used to generate trend reports, technology adoption statistics, and market share analyses published on the BuiltWith blog and in research papers. No personally identifiable information (PII) is intentionally collected; the focus is strictly on technology stacks, not user content.
⚙️ Rate Limiting Policy
BuiltWith is rate-limited because its high-frequency scanning can strain under-resourced web servers and disrupt normal traffic. The company recommends that site administrators implement a reasonable crawl delay (e.g., 10 seconds) in robots.txt or use mod_security rules to throttle the bot when it exceeds a threshold (e.g., 100 requests per minute per IP). This policy balances the bot’s legitimate business purpose with the need to protect server performance for end users.
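The per-IP threshold mentioned above (for example, 100 requests per minute) could be enforced with a sliding-window counter. The sketch below is one possible approach under stated assumptions, not the mechanism used by mod_security or recommended by BuiltWith; the class name `SlidingWindowLimiter` and the sample IP address are hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per client IP within window_seconds."""

    def __init__(self, max_requests: int = 100, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Per-IP queue of timestamps for requests that were allowed.
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        """Record a request at time `now` (seconds) and report if it is allowed."""
        hits = self._hits[ip]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold: throttle this request
        hits.append(now)
        return True


# Small limits make the behavior easy to follow.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60.0)
decisions = [limiter.allow("203.0.113.7", t) for t in (0.0, 1.0, 2.0, 3.0, 60.0)]
print(decisions)
```

In a real deployment `now` would come from `time.monotonic()`, and the limiter would typically apply only to requests whose User-Agent matches the bot, so ordinary visitors are unaffected.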
ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.