iria
Bot User-Agent: iria
🤖 Overview
Iria is a web crawler operated by Iria Technologies (also known as Iria AI), a European startup based in Estonia that launched publicly in April 2023. It collects publicly accessible web content, particularly product listings, pricing data, and structured metadata, to train proprietary e-commerce intelligence models and to power a real-time price comparison API. The crawler is not associated with any major search engine; its data feeds exclusively into Iria’s commercial products, including the Iria Price Watch dashboard and the IRIA-1 recommendation engine. Its architecture is described in the official documentation (iria.ai/crawler) and in a public GitHub repository (github.com/iria-tech/crawler).
🌐 Technical Behavior
Iria performs both broad and targeted crawls, prioritizing product pages, category pages, and structured data endpoints (e.g., JSON-LD schemas). It uses HTTP/2 multiplexing and fetches asynchronously, with a default concurrency of 8 simultaneous requests per domain. Request frequency is dynamic: during off-peak hours (22:00–06:00 UTC) it rises to as many as 10 requests per minute per domain, and during peak traffic it drops to 2 requests per minute. IP ranges are allocated from ASN 207977 (Iria Technologies) and, according to PeeringDB records, include the prefixes 185.236.88.0/24 and 2a0e:b107:3000::/36. Requests carry the headers Accept: text/html,application/json and Accept-Language: en-US,en;q=0.9, and do not include a Referer header.
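As a rough illustration of how the prefixes listed above could be used, the following sketch checks whether a client address falls inside them using Python's standard ipaddress module. The function name is_iria_ip is hypothetical, and the prefix list is copied from this page only, so it should be confirmed against current PeeringDB data before being relied on.

```python
import ipaddress

# Prefixes quoted on this page (assumption: still current; verify before use).
IRIA_PREFIXES = [
    ipaddress.ip_network("185.236.88.0/24"),
    ipaddress.ip_network("2a0e:b107:3000::/36"),
]


def is_iria_ip(address: str) -> bool:
    """Return True if the address lies inside one of the listed prefixes."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Not a parseable IPv4/IPv6 address.
        return False
    # Compare only against networks of the same IP version.
    return any(ip.version == net.version and ip in net for net in IRIA_PREFIXES)
```

A range match alone only shows that the address belongs to the listed network; it is usually combined with a User-Agent or reverse DNS check.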
📋 robots.txt Compliance
According to the official Iria crawler documentation (iria.ai/robots), the bot fully honors Disallow directives. It also supports the Allow directive and respects a Crawl-delay instruction if one is present in robots.txt. In practice, Iria’s crawl manager fetches robots.txt every 24 hours and caches the result, so rule changes may take up to a day to take effect. As of 2025, no violations have been reported in public security mailing lists or CVE entries.
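The directives above can be tried out locally before publishing them. The sketch below parses a sample robots.txt with Python's urllib.robotparser and asks what a crawler identifying as Iriabot would be allowed to fetch. The robots.txt token Iriabot, the example.com URLs, the paths, and the 30-second delay are all illustrative assumptions; the source does not state which token Iria matches against. Python's parser applies the first matching rule, which is why Allow is placed before the broader Disallow.

```python
from urllib.robotparser import RobotFileParser

# Sample policy (hypothetical paths and delay, for illustration only).
ROBOTS_TXT = """\
User-agent: Iriabot
Allow: /products/public/
Disallow: /products/
Crawl-delay: 30

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def iria_may_fetch(url: str) -> bool:
    """Return True if the sample policy lets the assumed 'Iriabot' token fetch url."""
    return parser.can_fetch("Iriabot", url)


# Crawl-delay value (in seconds) that applies to the assumed token.
iria_crawl_delay = parser.crawl_delay("Iriabot")
```

This only shows how a standards-following parser reads the file; whether the crawler interprets rule order the same way is not confirmed by the source.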
🔍 Detection Indicators
The primary User-Agent string is Mozilla/5.0 (compatible; Iriabot/1.0; +https://iria.ai/bot). A secondary string, IriaCrawler/2.0, is used when pages are rendered with a headless Chromium engine to capture JavaScript-generated content. Additional fingerprints include an X-Iria-Version request header set to the crawler’s build number (e.g., 20250101) and a Cache-Control: no-cache request header. Reverse DNS lookups on Iria IPs resolve to hostnames under *.crawl.iria.ai.
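Because a User-Agent string is trivial to spoof, the indicators above are often combined with a forward-confirmed reverse DNS check. The following is a minimal sketch under that assumption: claims_to_be_iria and verify_iria_host are hypothetical helper names, the regular expression is derived only from the two strings quoted above, and the resolver functions are injectable so the logic can be exercised without network access.

```python
import re
import socket

# Matches the two product tokens quoted on this page, with any version number.
UA_PATTERN = re.compile(r"\b(?:Iriabot|IriaCrawler)/\d+(?:\.\d+)*", re.IGNORECASE)


def claims_to_be_iria(user_agent: str) -> bool:
    """Return True if the User-Agent header contains an Iria product token."""
    return bool(UA_PATTERN.search(user_agent or ""))


def verify_iria_host(ip, reverse=socket.gethostbyaddr, forward=socket.getaddrinfo) -> bool:
    """Forward-confirmed reverse DNS: PTR must end in .crawl.iria.ai and map back to ip."""
    try:
        hostname = reverse(ip)[0]
    except OSError:
        return False
    if not hostname.lower().rstrip(".").endswith(".crawl.iria.ai"):
        return False
    try:
        infos = forward(hostname, None)
    except OSError:
        return False
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the resolved address as a string.
    return any(info[4][0] == ip for info in infos)
```

The address comparison is a plain string match, so IPv6 addresses would need normalizing (for example with the ipaddress module) before comparison in real use.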
📊 Data Usage
Collected data is used for training Iria’s proprietary AI models that automate price monitoring, detect out-of-stock patterns, and generate competitive intelligence reports. The extracted product information is also stored in a structured database to power the Iria Price Watch API, which is sold as a SaaS product to e-commerce businesses. According to Iria’s privacy policy, raw page text and images are retained for up to 30 days and are not shared with third parties.
⚙️ Rate Limiting Policy
Iria is rate-limited because its aggressive concurrency and high request volumes during off-peak hours can overwhelm smaller web servers. A threshold-based blocking policy (e.g., rejecting clients that exceed 15 requests per minute per IP) is recommended: it protects origin servers without permanently denying access, since Iria’s documentation explicitly states that the crawler backs off and retries with exponential delays when it receives a 429 Too Many Requests response.
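One way to express such a threshold is a per-IP sliding-window counter that answers 429 once the limit is exceeded. The sketch below is an illustrative assumption, not a description of any particular product's implementation: SlidingWindowLimiter is a hypothetical class, the default of 15 requests per 60 seconds mirrors the example threshold above, and the clock is injectable so the behavior can be tested deterministically.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client IP."""

    def __init__(self, limit=15, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def check(self, ip: str) -> int:
        """Return the HTTP status to send: 200 if allowed, 429 if over the limit."""
        now = self.clock()
        hits = self.hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return 429  # Too Many Requests
        hits.append(now)
        return 200
```

Rejected requests are not recorded here, so a client that keeps retrying is readmitted as soon as its earlier accepted requests age out of the window; a stricter policy could count rejected attempts as well.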
ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.