fr_crawler
Crawler User-Agent: fr-crawler
🤖 Overview
fr_crawler is a legitimate web crawler operated by Fr, a European web intelligence company headquartered in Paris, France. The bot collects publicly accessible web content for the company's AI-powered content analysis and recommendation platform, which processes data for trend detection, summarization, and personalized news aggregation. According to Fr’s official documentation at docs.fr.com/crawler, it was first deployed in 2021.
🌐 Technical Behavior
fr_crawler employs a distributed crawling architecture using IP addresses from the 185.xxx.xxx.xxx range allocated to Fr’s cloud infrastructure, with additional IPv6 addresses from 2a01:xxx::/32. It sends HTTP requests at a default rate of one request per second per domain, but may burst up to five requests per second during peak indexing windows. To minimize bandwidth usage, the crawler issues conditional requests: it revalidates previously fetched pages with If-None-Match (carrying the ETag it received earlier) and If-Modified-Since. It follows all redirects and canonical links. fr_crawler exclusively uses HTTP/2 over TLS 1.2 or later, as documented in Fr’s technical whitepaper on GitHub (github.com/fr/crawler-spec).
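As a rough illustration of the conditional requests mentioned above, the sketch below shows how an origin server might decide whether a revalidation can be answered with 304 Not Modified. The function name is_not_modified and the sample values are hypothetical and do not come from Fr's documentation; the precedence of If-None-Match over If-Modified-Since follows standard HTTP semantics.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _strip_weak(tag: str) -> str:
    """Drop the weak-validator prefix so tags can be compared weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(request_headers: dict, etag: str, last_modified: datetime) -> bool:
    """Return True when the cached copy held by the client is still current."""
    headers = {k.lower(): v for k, v in request_headers.items()}

    # If-None-Match takes precedence when both validators are present.
    if_none_match = headers.get("if-none-match")
    if if_none_match is not None:
        candidates = [_strip_weak(t) for t in if_none_match.split(",")]
        return "*" in candidates or _strip_weak(etag) in candidates

    if_modified_since = headers.get("if-modified-since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # unparseable date: serve the full response
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution.
        return last_modified.replace(microsecond=0) <= since

    return False
```

A server that returns True here can reply with an empty 304 response instead of resending the page body, which is where the bandwidth saving comes from.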
📋 robots.txt Compliance
According to Fr’s official robots.txt policy published at fr.com/robots, fr_crawler fully honors Disallow directives. It fetches the robots.txt file once per crawl session and caches it for 24 hours. Independent testing by webmasters reportedly shows the crawler stopping requests to newly disallowed paths within one hour of a policy update.
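For site owners who want to check a rule set before publishing it, a minimal sketch using Python's standard urllib.robotparser follows. The sample robots.txt, the /private/ path, and the helper is_allowed are illustrative assumptions; the user-agent token fr_crawler is taken from the User-Agent string listed on this page and should be confirmed against Fr's own documentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: block fr_crawler from /private/, allow everything else.
ROBOTS_TXT = """\
User-agent: fr_crawler
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Evaluate a robots.txt body for one user agent and one URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/public/page.html", "/private/data.html"):
        url = "https://example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "fr_crawler", url))
```

This only tests how a standards-following parser reads the rules; it says nothing about how quickly a given crawler picks up a change.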
🔍 Detection Indicators
The primary User-Agent string is fr_crawler/2.0 (compatible; Fr Crawler; +https://fr.com/crawler); a legacy variant, fr_crawler/1.0, also exists. Behavioral fingerprints include an X-Fr-Crawl-ID header containing a unique session UUID and a request interval of exactly 1.0 seconds between consecutive requests to the same host. The bot also sets a From header containing a contact email address for feedback.
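The header-based indicators above could be checked along the following lines. This is a sketch only: the function match_indicators, the regular expression, and the sample header values are assumptions made for illustration, and because request headers are trivially spoofed, a match should be treated as a hint rather than proof of identity.

```python
import re
import uuid

# Matches the product token and version at the start of the User-Agent,
# e.g. "fr_crawler/2.0 (...)" or the legacy "fr_crawler/1.0".
UA_PATTERN = re.compile(r"^fr_crawler/\d+\.\d+\b")


def match_indicators(headers: dict) -> list:
    """Return the names of the listed indicators that a request exhibits."""
    normalized = {k.lower(): v for k, v in headers.items()}
    matched = []

    if UA_PATTERN.match(normalized.get("user-agent", "")):
        matched.append("user-agent")

    crawl_id = normalized.get("x-fr-crawl-id")
    if crawl_id is not None:
        try:
            uuid.UUID(crawl_id)  # raises ValueError if not a well-formed UUID
            matched.append("crawl-id")
        except ValueError:
            pass

    if normalized.get("from"):
        matched.append("from")

    return matched
```

The timing indicator (a fixed 1.0 second interval) is not covered here, since it needs per-host request history rather than a single request's headers.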
📊 Data Usage
Collected data is used exclusively to train Fr’s proprietary NLP models for content summarization, trend detection, and personalized recommendation engines. The company’s privacy policy states that raw page content is not stored long-term; only extracted metadata and embeddings are retained for up to 90 days. No data is used for search indexing or public archives.
⚙️ Rate Limiting Policy
fr_crawler is rate-limited because its aggressive crawl bursts can strain server resources if unmanaged. Threshold-based blocking (e.g., limiting to 10 requests per second per IP) is recommended to maintain service quality while allowing legitimate access, as documented in Fr’s rate limiting guide at fr.com/rate-limit.
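One way to apply a per-IP threshold like the one mentioned above is a sliding-window counter. The sketch below is a minimal in-memory version, not Fr's or Boteraser's implementation; the class name SlidingWindowLimiter is hypothetical, the default of 10 requests per second mirrors the example figure in the text, and timestamps are passed in explicitly to keep the logic easy to test.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client IP within a rolling time window."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 1.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold: reject, e.g. with HTTP 429
        hits.append(now)
        return True
```

In a real deployment the caller would pass time.monotonic() as now and would usually keep the counters in a shared store so that all front-end servers see the same totals.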
ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.