qweerybot

Bot User-Agent: qweerybot

🤖 Overview

QweeryBot is a web crawler operated by Qwant, a French privacy-focused search engine that does not track users or build personal profiles. Its primary purpose is to index public web pages for Qwant's search results, which are positioned as an alternative to major search engines such as Google and Bing. The bot is named after the Qwant query language "Qweery" and was first introduced in early 2020 as part of Qwant's infrastructure overhaul to improve index freshness and coverage.

🌐 Technical Behavior

QweeryBot speaks HTTP/1.1 and HTTP/2 and issues GET requests at a configurable crawl rate that typically averages 1 to 5 requests per second per IP, though bursts may occur during initial discovery. It crawls from IP ranges primarily allocated to Qwant SAS (ASN 59701), including 185.17.116.0/22 and 185.18.32.0/22, as documented in the Qwant network disclosure (qwant.com/robots.txt). It respects Cache-Control and Last-Modified headers to avoid re-downloading unchanged content, and sends a Referer header of https://www.qwant.com/ with each request. Crawl depth defaults to 3 hops to prevent excessive recursion, though high-authority domains may be indexed more deeply.
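The IP ranges above can be used as a first-pass check on whether a request claiming to be QweeryBot comes from a listed block. The following is a minimal sketch using Python's standard ipaddress module; the names QWEERYBOT_NETWORKS and is_listed_qweerybot_ip are hypothetical, and the two CIDR blocks are simply the ones quoted above, so they should be confirmed against current routing data before being relied on.

```python
import ipaddress

# CIDR blocks quoted in this entry (assumption: still current and complete).
QWEERYBOT_NETWORKS = [
    ipaddress.ip_network("185.17.116.0/22"),
    ipaddress.ip_network("185.18.32.0/22"),
]


def is_listed_qweerybot_ip(address: str) -> bool:
    """Return True if the address falls inside one of the listed blocks."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Malformed input is treated as "not listed".
        return False
    return any(ip in network for network in QWEERYBOT_NETWORKS)
```

A request whose User-Agent names QweeryBot but whose source address fails this check is worth treating as unverified rather than as proof of spoofing, since the listed ranges may be incomplete.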

📋 robots.txt Compliance

Documentation from Qwant's official developer portal (developer.qwant.com) confirms that QweeryBot fully respects robots.txt directives, including Disallow, Crawl-delay, and Allow rules. The bot checks for the robots.txt file at the root of each domain before every crawl session, caching the result for up to 24 hours to reduce overhead. There are no known cases of QweeryBot ignoring robots.txt, and it explicitly supports the User-agent: QweeryBot/1.0 token.
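To illustrate how Disallow and Crawl-delay rules aimed at this bot would be evaluated, here is a small sketch that parses an example robots.txt with Python's standard urllib.robotparser. The rules, the /private/ path, and the example.com URLs are invented for illustration. The sketch also assumes the bot matches the unversioned product token QweeryBot; whether the crawler itself matches a versioned or unversioned token is not confirmed here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: slow QweeryBot down and keep it out of /private/.
ROBOTS_TXT = """\
User-agent: QweeryBot
Crawl-delay: 5
Disallow: /private/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Blocked by the Disallow rule in the QweeryBot group.
blocked = parser.can_fetch("QweeryBot", "https://example.com/private/report.html")

# No matching rule in the QweeryBot group, so fetching is allowed.
allowed = parser.can_fetch("QweeryBot", "https://example.com/blog/")

# Crawl-delay value (in seconds) declared for the QweeryBot group.
delay = parser.crawl_delay("QweeryBot")

print(blocked, allowed, delay)
```

This shows how one standard-library parser interprets the file, not how QweeryBot itself does; precedence between overlapping Allow and Disallow rules varies between crawlers.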

🔍 Detection Indicators

The primary User-Agent string is Mozilla/5.0 (compatible; QweeryBot/1.0; +https://www.qwant.com/legal/crawler) as listed in Qwant's legal documentation. The bot also recognizes the X-Robots-Tag response header, and some requests carry a From header set to [email protected]. It does not send a Via or X-Forwarded-For header; it connects directly from Qwant's IP ranges. Behavioral fingerprints include a consistent 200–500 ms delay between consecutive requests from the same IP and a tendency to request favicon.ico and sitemap.xml before deeper pages.
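The User-Agent string above can be matched in access logs with a simple pattern. The sketch below is one way to do it in Python; the pattern and the function name qweerybot_version are assumptions derived only from the string quoted in this entry. A User-Agent match alone is easy to spoof, so it is best combined with a source-IP check.

```python
import re
from typing import Optional

# Matches the "QweeryBot/<version>" product token, case-insensitively.
QWEERYBOT_UA = re.compile(r"\bQweeryBot/(\d+(?:\.\d+)*)", re.IGNORECASE)


def qweerybot_version(user_agent: Optional[str]) -> Optional[str]:
    """Return the advertised QweeryBot version, or None if the token is absent."""
    match = QWEERYBOT_UA.search(user_agent or "")
    return match.group(1) if match else None
```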

📊 Data Usage

Collected data is used exclusively to build and update Qwant's search index, which is then served to users via the Qwant search engine. Qwant does not use crawled content for AI/ML training or for selling user data; its privacy policy (qwant.com/privacy) explicitly states that no personal information is extracted from crawled pages. The index powers both the main Qwant search and its specialized verticals like Qwant Images and Qwant News.

⚙️ Rate Limiting Policy

QweeryBot's default crawl rate is polite, but it can still consume significant bandwidth on small sites during deep indexing cycles, so rate limiting is advisable. Threshold-based blocking (for example, more than 10 requests per second per IP sustained over 30 seconds) is recommended: it protects application resources while still allowing the bot to index content, consistent with the webmaster guidelines in Qwant's own documentation.
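One way to express the example threshold is a per-IP sliding window: 10 requests per second averaged over 30 seconds corresponds to at most 300 requests in any 30-second window. The following is a minimal in-memory sketch under that interpretation; the class name SlidingWindowLimiter and its parameters are hypothetical, and a production setup would more likely enforce this at the reverse proxy or WAF layer.

```python
import time
from collections import defaultdict, deque
from typing import Optional

WINDOW_SECONDS = 30.0
MAX_REQUESTS = 300  # 10 req/s averaged over the 30-second window


class SlidingWindowLimiter:
    """Track request timestamps per IP and reject those over the threshold."""

    def __init__(self, max_requests: int = MAX_REQUESTS,
                 window: float = WINDOW_SECONDS) -> None:
        self.max_requests = max_requests
        self.window = window
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        """Return True if the request is within the limit, recording it if so."""
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        cutoff = now - self.window
        # Drop timestamps that have aged out of the window.
        while hits and hits[0] <= cutoff:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold; rejected requests are not recorded
        hits.append(now)
        return True
```

Because rejected requests are not recorded, a client that keeps hammering the site is let back in as soon as its earlier accepted requests age out of the window, which keeps the limiter from permanently locking out a legitimate crawler.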

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.