gold crawler

Crawler User-Agent: gold-crawler

🤖 Overview

The Gold Crawler is a legitimate web crawler operated by Gold Inc., a company specializing in AI-powered search and content analysis. First publicly documented in early 2024, its primary purpose is to index publicly accessible web pages to train and improve Gold’s proprietary language models and enhance its search service, Gold Search. According to the official bot page at https://gold.com/crawler, the crawler collects text, metadata, and structured data from websites while explicitly excluding paywalled or login-protected content.

🌐 Technical Behavior

The Gold Crawler sends requests over HTTP/1.1 and supports the Crawl-Delay directive in robots.txt; when no directive is set, it waits 10 seconds between consecutive requests to the same host. It operates from IP addresses belonging to ASN 12345 (Gold Inc.), in ranges such as 203.0.113.0/24 that are publicly listed in the official documentation. The crawler sends the Accept-Encoding: gzip header to reduce bandwidth usage and respects the X-Robots-Tag header for page-level indexing control. It crawls mainly during off-peak hours (UTC), but may exceed its default pace with bursts of up to 50 requests per minute per domain when indexing new content. The bot follows the Robots Exclusion Protocol, standardized by the IETF in RFC 9309 (Crawl-Delay itself is a widely used extension that the RFC does not define).
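As a rough illustration of how the published IP ranges could be used, the sketch below checks whether a request's source address falls inside a listed network. It assumes only the single range quoted above; the official list may contain more, and the names `GOLD_CRAWLER_NETWORKS` and `is_from_listed_range` are hypothetical.

```python
import ipaddress

# Assumption: only the one range quoted in the text is known here.
# The official documentation may list additional ranges.
GOLD_CRAWLER_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_from_listed_range(remote_addr: str) -> bool:
    """Return True if remote_addr lies inside one of the listed networks."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed addresses are treated as "not the crawler".
        return False
    return any(addr in net for net in GOLD_CRAWLER_NETWORKS)


print(is_from_listed_range("203.0.113.42"))
print(is_from_listed_range("198.51.100.7"))
```

A source-address check of this kind is generally harder to spoof than a header match, which is why the two are often combined.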

📋 robots.txt Compliance

Gold Crawler follows robots.txt directives, a behavior verified by independent tests conducted by Webmaster World in March 2024. It honors both site-wide Disallow rules and per-path exclusions, and it supports the Allow directive for granular control. The official documentation explicitly states that failure to comply with robots.txt may result in the crawler being suspended from Gold’s indexing pipeline.
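To make the Allow and Disallow interplay concrete, here is a minimal sketch that evaluates a sample robots.txt with Python's standard `urllib.robotparser`. The rules, paths, and `example.com` URLs are invented for illustration, and the `gold-crawler` token is taken from the User-Agent line at the top of this page. Python's parser applies the first matching rule, so the more specific Allow line is placed before the broader Disallow line; this ordering also gives the same result under longest-match evaluation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block /private/ except the press subfolder,
# and ask for a 10-second delay between requests.
ROBOTS_TXT = """\
User-agent: gold-crawler
Allow: /private/press/
Disallow: /private/
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

blocked = parser.can_fetch("gold-crawler", "https://example.com/private/report.html")
allowed = parser.can_fetch("gold-crawler", "https://example.com/private/press/launch.html")
delay = parser.crawl_delay("gold-crawler")

print(blocked, allowed, delay)
```

This only shows how a compliant parser would read such a file; it says nothing about how Gold's own parser is implemented.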

🔍 Detection Indicators

The primary User-Agent string is Gold/1.0 (compatible; GoldCrawler; +https://gold.com/bot). Secondary identifiers include the header X-Crawler-Name: Gold in request metadata. Behavioral fingerprints include a consistent request interval of 10–15 seconds and a preference for JavaScript-free pages. The bot also sends a From header with the contact email [email protected] for administrative feedback.
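The indicators above could be checked in server-side code along the following lines. This is a sketch under assumptions: it matches the `GoldCrawler` and `gold-crawler` tokens case-insensitively and treats the `X-Crawler-Name: Gold` header as a secondary signal; the function name `looks_like_gold_crawler` is hypothetical.

```python
import re
from typing import Mapping

# Tokens taken from the strings quoted on this page; case-insensitive
# matching is an assumption about how real traffic may vary.
_UA_PATTERN = re.compile(r"goldcrawler|gold-crawler", re.IGNORECASE)


def looks_like_gold_crawler(headers: Mapping[str, str]) -> bool:
    """Return True if the request headers carry a Gold Crawler indicator."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    if _UA_PATTERN.search(normalized.get("user-agent", "")):
        return True
    return normalized.get("x-crawler-name", "").strip().lower() == "gold"


sample = {"User-Agent": "Gold/1.0 (compatible; GoldCrawler; +https://gold.com/bot)"}
print(looks_like_gold_crawler(sample))
```

Request headers are trivially spoofable, so a match is best read as a hint rather than proof of origin.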

📊 Data Usage

Data collected by the Gold Crawler is used exclusively for training the Gold-LLM series of large language models and for populating the Gold Search index. The company publishes a transparency report at https://gold.com/transparency detailing the types of data stored, retention periods (30 days for raw content, indefinite for aggregated statistics), and opt-out mechanisms for website owners.

⚙️ Rate Limiting Policy

Web administrators are advised to rate-limit the Gold Crawler because its default crawl speed, while polite, can still generate noticeable load on smaller servers. Threshold-based blocking (e.g., 100 requests per minute) is recommended to protect site performance without permanently denying access to this legitimate agent.
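One way to express a threshold such as 100 requests per minute is a sliding-window counter. The sketch below is a minimal in-memory version for a single client; the class name `SlidingWindowLimiter` and its parameters are hypothetical, and a production setup would more likely rely on the web server's or CDN's built-in rate limiting.

```python
import time
from collections import deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds interval."""

    def __init__(self, max_requests: int = 100, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of accepted requests, oldest first

    def allow(self, now: Optional[float] = None) -> bool:
        """Record a request and return True, or return False if over the limit."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False
        self._timestamps.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=100, window_seconds=60.0)
print(limiter.allow())
```

Requests rejected by the limiter can be answered with HTTP 429 (Too Many Requests), which throttles the crawler temporarily without blocking it outright.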

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.