trovator

Bot User-Agent: trovator

🤖 Overview

Trovator is a web crawler operated by Shout Today Ltd., a UK-based company. It powers the Trovator search engine, which indexes publicly available web content for general search and discovery. According to the official documentation at https://www.trovator.com/, the bot aggregates data from websites to build a searchable index, with the goal of returning relevant search results while respecting website owner preferences.

🌐 Technical Behavior

Trovator crawls over HTTP and HTTPS using HTTP/1.1. Its request rate varies, but it is documented to respect Crawl-Delay directives in robots.txt. Its IP ranges are allocated to Shout Today Ltd. and typically fall within the 185.151.200.0/22 and 185.151.204.0/22 blocks, as confirmed by reverse DNS lookups and WHOIS records. The crawler does not change its IP address on every request; instead, it uses a fixed set of identifiable proxies. It fetches pages sequentially, typically waiting 1–5 seconds between requests unless a site specifies otherwise. The bot follows links with GET requests and respects nofollow attributes.
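
As a rough illustration, the two CIDR blocks listed above can be checked with Python's standard ipaddress module. This is a minimal sketch: the function name in_trovator_range is hypothetical, and the ranges are copied from this page rather than from an authoritative feed, so they should be re-verified against current WHOIS data before being used for allow or block decisions.

```python
import ipaddress

# CIDR blocks quoted in the text above (assumption: still current;
# re-check WHOIS records before relying on them).
TROVATOR_NETWORKS = [
    ipaddress.ip_network("185.151.200.0/22"),
    ipaddress.ip_network("185.151.204.0/22"),
]


def in_trovator_range(ip: str) -> bool:
    """Return True if `ip` falls inside one of the listed blocks."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Not a parseable IP address at all.
        return False
    return any(addr in net for net in TROVATOR_NETWORKS)
```

An address match alone does not prove a request came from Trovator, so it is best combined with the User-Agent and header checks described under Detection Indicators.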

📋 robots.txt Compliance

Based on the official robots.txt documentation at https://www.trovator.com/robots.txt, the Trovator bot fully honors Disallow and Crawl-Delay directives. It also recognizes the Allow directive and can be targeted with a User-agent: Trovator group. Reports in community forums suggest that it respects custom rules. Some webmasters have reported occasional crawling of disallowed paths, which was attributed to a misconfigured robots.txt file on the origin server.
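
Because misconfigured rules are the reported cause of unexpected crawling, it can help to test a robots.txt group before deploying it. The sketch below uses Python's standard urllib.robotparser to evaluate a sample group for Trovator. The paths, the delay value, and the example.com URLs are illustrative assumptions, and Python's parser applies the first matching rule, which may differ from how Trovator itself resolves Allow/Disallow conflicts.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt group for Trovator; paths and delay are examples only.
# Allow is listed before Disallow because Python's parser uses first-match order.
SAMPLE_ROBOTS_TXT = """\
User-agent: Trovator
Crawl-delay: 5
Allow: /private/public-report.html
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())

# Pass the product token, not the full User-Agent string.
blocked = parser.can_fetch("Trovator", "https://example.com/private/data.html")
allowed = parser.can_fetch("Trovator", "https://example.com/private/public-report.html")
open_page = parser.can_fetch("Trovator", "https://example.com/index.html")
delay = parser.crawl_delay("Trovator")
```

With this sample, blocked is False, allowed and open_page are True, and delay is 5.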

🔍 Detection Indicators

The primary User-Agent string for Trovator is Mozilla/5.0 (compatible; Trovator/1.1; +https://www.trovator.com/bot). The bot sends a custom From header with the email address [email protected] for contact purposes. Behavioral fingerprints include a consistent request pattern with a fixed User-Agent, no variation in Accept-Encoding, and a typical header order of Host, User-Agent, From, Accept, Connection. The bot sometimes also includes a Via header, indicating that requests pass through a caching layer.
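
The indicators above can be combined into a simple per-request check. The following is a minimal sketch, not a definitive detector: the function name trovator_signals and the regular expression are assumptions, headers are expected as an ordered list of (name, value) pairs, and all three signals can be spoofed by any client, so they are best treated as hints alongside an IP range check.

```python
import re

# Matches the product token from the documented User-Agent, e.g. "Trovator/1.1".
TROVATOR_UA = re.compile(r"\bTrovator/\d+(?:\.\d+)*", re.IGNORECASE)

# Typical header order described in the text above.
EXPECTED_ORDER = ["host", "user-agent", "from", "accept", "connection"]


def trovator_signals(headers: list[tuple[str, str]]) -> dict[str, bool]:
    """Report which Trovator indicators an ordered header list shows."""
    names = [name.lower() for name, _ in headers]
    values = {name.lower(): value for name, value in headers}
    # Keep only the headers of interest, in arrival order (Via etc. are ignored).
    observed = [n for n in names if n in EXPECTED_ORDER]
    return {
        "ua_match": bool(TROVATOR_UA.search(values.get("user-agent", ""))),
        "has_from": "from" in values,
        "order_match": observed == EXPECTED_ORDER,
    }
```

A request that shows all three signals but originates outside the published IP ranges is more likely an impersonator than the real crawler.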

📊 Data Usage

Content collected by Trovator is used exclusively to populate the Trovator search engine index, which serves results for general web queries. According to the privacy policy at https://www.trovator.com/privacy, data is not used for AI model training, advertising profiling, or resale to third parties. The crawler stores only publicly available page text and metadata (title, description, keywords) to build a ranked search index, and it respects robots.txt exclusions and meta tags such as noindex.

⚙️ Rate Limiting Policy

Rate limiting Trovator is recommended because, despite its legitimate purpose, it can generate substantial traffic if a site has many pages or if the bot encounters a crawling trap. The policy rationale for threshold-based blocking is to prevent excessive load on origin servers while still allowing the crawler to index content at a pace that does not degrade site performance for human visitors.
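
One way to implement threshold-based limiting is a sliding-window counter kept per client. The sketch below is an illustration under stated assumptions: the class name SlidingWindowLimiter is hypothetical, any threshold (for example 30 requests per 60 seconds) is a value to tune per site rather than a figure published by Trovator, and the current time is passed in explicitly so the behavior is easy to test.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` within any `window_seconds` span."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: deque[float] = deque()  # timestamps of accepted requests

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window.
        while self._hits and now - self._hits[0] >= self.window_seconds:
            self._hits.popleft()
        if len(self._hits) >= self.max_requests:
            return False  # over the threshold: reject (e.g. respond with 429)
        self._hits.append(now)
        return True
```

In practice, one limiter instance would be kept per crawler identity (for example, per source IP range), with time.monotonic() supplying the now argument. Rejected requests are not recorded, so a crawler that backs off regains capacity as soon as older hits expire.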

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.