boia org
Bot User-Agent: boia-org
🤖 Overview
BoiaBot is a legitimate web crawler operated by Boia.org, a Portuguese technology company specializing in web data extraction and digital monitoring services. Announced publicly in 2019, the bot is designed to collect publicly accessible web content to feed into Boia’s proprietary analytics platform, which provides clients with competitor intelligence, SEO auditing, and content change detection. BoiaBot is explicitly positioned as a non‑malicious, ethical crawler that respects website policies and is used exclusively for business‑to‑business data aggregation.
🌐 Technical Behavior
BoiaBot performs HTTP GET requests at a variable but moderate crawl rate, typically issuing between 2 and 10 requests per second depending on the target site’s response time. The crawler identifies itself via the User‑Agent string BoiaBot/1.0 and announces its purpose in a dedicated robots.txt comment block. It follows standard HTTP/1.1 and HTTP/2 protocols and respects Cache‑Control and ETag headers to avoid redundant downloads. It also supports conditional GET requests (If‑Modified‑Since) to minimise server load when content has not changed. IP ranges used by BoiaBot are published on the Boia.org website and fall within the 37.59.0.0/16 and 51.15.0.0/16 blocks, which are allocated to the OVHcloud and Scaleway data center networks.
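As a rough illustration of how the published ranges could be used, the sketch below checks whether a visiting address falls inside either block. The function name and the hard-coded list are assumptions made for this example; the current list on the Boia.org website should take precedence. Because both blocks belong to general-purpose hosting networks, a match is only weak evidence and does not prove that the request came from BoiaBot.

```python
import ipaddress

# Ranges copied from the text above. Treat them as a snapshot, not an
# authoritative list.
BOIABOT_NETWORKS = [
    ipaddress.ip_network("37.59.0.0/16"),
    ipaddress.ip_network("51.15.0.0/16"),
]


def in_published_ranges(ip: str) -> bool:
    """Return True if `ip` parses and lies inside one of the listed blocks."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Malformed input is treated as "not in range".
        return False
    return any(addr in net for net in BOIABOT_NETWORKS)
```

A call such as in_published_ranges("37.59.12.34") returns True, while an address outside both blocks, or a string that is not an IP address, returns False.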
📋 robots.txt Compliance
According to Boia.org’s official documentation (available at https://boia.org/crawler), BoiaBot fully honours Disallow directives in robots.txt and also respects Crawl‑Delay instructions. The bot does not crawl any path explicitly disallowed and will re‑evaluate robots.txt at the start of each new crawl session. However, site owners are advised to test their robots.txt configuration because the crawler does not automatically back off if a large number of URLs are blocked via wildcard patterns.
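Site owners who want to test a configuration before relying on it can do so offline. The sketch below uses Python's standard urllib.robotparser module against a hypothetical robots.txt; the paths, the 10-second delay, and the example.com host are invented for illustration. It shows how the standard-library parser reads the rules, which may differ from how BoiaBot itself interprets them. In particular, this parser matches by path prefix only and does not interpret wildcard patterns.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only.
ROBOTS_TXT = """\
User-agent: BoiaBot
Crawl-delay: 10
Disallow: /private/
Disallow: /search

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def boiabot_allowed(url: str) -> bool:
    """Return True if the rules above permit BoiaBot/1.0 to fetch `url`."""
    return parser.can_fetch("BoiaBot/1.0", url)
```

With these rules, boiabot_allowed("https://example.com/private/report.html") is False, boiabot_allowed("https://example.com/blog/post") is True, and parser.crawl_delay("BoiaBot/1.0") returns 10.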
🔍 Detection Indicators
The primary identifying headers are User‑Agent: BoiaBot/1.0 and a custom HTTP header X‑Boia‑Crawler: 1. Additional behavioural fingerprints include consistently low request concurrency (never more than 10 simultaneous connections per host) and the absence of cookie or session state persistence. Reverse DNS lookups on visiting IPs resolve to hostnames ending in .boia.org or .boiabot.net. No CVEs are associated with BoiaBot. It is a closed‑source proprietary crawler, and no security vulnerabilities have been reported for it.
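The indicators above can be combined into a simple check. The following is a minimal sketch, not an official detection routine. The function name is hypothetical, matching on the prefix "BoiaBot/" rather than the exact string BoiaBot/1.0 is an assumption made to tolerate version changes, and the reverse DNS hostname is expected to be resolved by the caller. Headers are easy to spoof, so a header-only match should be treated as weak evidence.

```python
from typing import Mapping, Optional

# Hostname suffixes taken from the text above.
RDNS_SUFFIXES = (".boia.org", ".boiabot.net")


def looks_like_boiabot(
    headers: Mapping[str, str],
    rdns_hostname: Optional[str] = None,
) -> bool:
    """Check the documented headers and, if supplied, the reverse DNS name."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalized = {k.lower(): v.strip() for k, v in headers.items()}
    ua_ok = normalized.get("user-agent", "").startswith("BoiaBot/")
    marker_ok = normalized.get("x-boia-crawler") == "1"
    if not (ua_ok and marker_ok):
        return False
    if rdns_hostname is None:
        # Header-only match: spoofable, so callers may want the DNS check too.
        return True
    # Drop a trailing root dot before comparing suffixes.
    return rdns_hostname.lower().rstrip(".").endswith(RDNS_SUFFIXES)
```

Passing the hostname separately keeps the function free of network calls, so it can be run against stored access logs as well as live requests.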
📊 Data Usage
Collected data is processed through Boia.org’s CloudMonitor platform, which offers real‑time alerts for website changes, SEO metric tracking, and content duplication detection. The indexed content is used to generate competitive analysis reports and historical trend charts for paying subscribers. Boia does not sell raw data to third parties and retains crawled content for a maximum of 90 days before deletion, as stated in their privacy policy published on the same site.
⚙️ Rate Limiting Policy
BoiaBot is rate‑limited in many production environments because its moderate request rate can still saturate smaller web servers, especially during the initial crawl of large sites. A commonly recommended threshold is 20 requests per minute per IP, after which the server returns HTTP 429 (Too Many Requests). BoiaBot is documented to respect this response and back off automatically. Note that this threshold is far below the crawler's stated rate of 2 to 10 requests per second (120 to 600 per minute), so an unthrottled crawl will reach it within seconds.
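One way to enforce such a threshold is a per-IP sliding window. The sketch below is a minimal in-memory version under stated assumptions. The class and method names are hypothetical, the state is not shared across processes, and a production deployment would more likely use the rate-limiting features of its web server or CDN.

```python
import time
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Sliding-window limiter: at most `limit` requests per `window_seconds` per IP."""

    def __init__(self, limit=20, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def status_for(self, ip: str) -> int:
        """Return the HTTP status to send: 200 if accepted, 429 if over the limit."""
        now = self.clock()
        hits = self._hits[ip]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return 429
        hits.append(now)
        return 200
```

Rejected requests are not recorded, so a client that keeps retrying while blocked does not extend its own penalty. Whether that is desirable depends on the site's policy.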
ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.