infonavirobot

Bot User-Agent: infonavirobot

🤖 Overview

InfoNaviRobot is a legitimate web crawler operated by InfoNavi Inc., a Japanese technology company specializing in web data aggregation and AI-driven search services. The bot’s primary purpose is to collect publicly accessible web pages for indexing into InfoNavi’s proprietary search engine and to feed data into machine learning models for natural language processing and content classification. It is not a malicious agent; rather, it is an automated data acquisition tool used to enhance the company’s search results and AI training pipelines, as documented in their official crawler policy page (https://www.infonavi.co.jp/crawler).

🌐 Technical Behavior

InfoNaviRobot employs a breadth-first crawling strategy with a configurable request frequency that averages one request per 3–5 seconds per domain, though it may burst up to 10 requests per minute during initial discovery phases. The bot supports HTTP/1.1 and HTTP/2 and sends a standard Accept header of text/html,application/xhtml+xml. Its IP ranges are not publicly disclosed, but according to third-party logs (e.g., community lists on GitHub), requests typically originate from IPv4 addresses in InfoNavi’s ASN (AS38476) in Japan. To reduce server load, the crawler honors Cache-Control response headers and sends If-Modified-Since on repeat visits, so unchanged pages can be answered with 304 Not Modified. It identifies itself via the User-Agent string: Mozilla/5.0 (compatible; InfoNaviRobot/2.0; +https://www.infonavi.co.jp/crawler).
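For site operators, the conditional-request behavior matters because an unchanged page can be answered with 304 Not Modified and no body. The following is a minimal Python sketch of that server-side decision, not InfoNavi's or any framework's implementation. The function name conditional_status is hypothetical, and the sketch assumes the page's last-modified time is available as a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def conditional_status(if_modified_since: Optional[str], last_modified: datetime) -> int:
    """Return 304 if the page is unchanged since the crawler's cached copy, else 200.

    last_modified is assumed to be timezone-aware (hypothetical helper).
    """
    if not if_modified_since:
        return 200  # no validator sent: serve the full response
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


# Example: a page last changed at noon UTC on 1 January 2024.
page_time = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
header = format_datetime(page_time, usegmt=True)
status = conditional_status(header, page_time)
```

A malformed header falls back to a full 200 response, which is the conservative choice: the crawler receives fresh content and no stale copy is confirmed by mistake.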

📋 robots.txt Compliance

InfoNaviRobot fully honors robots.txt directives, as stated in its official policy. It reads the file on each domain visit and respects both Disallow and Crawl-delay directives. Reports on webmaster forums indicate that the bot stops requesting disallowed paths immediately and waits at least 5 seconds between requests when a Crawl-delay is specified.
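To illustrate the directives mentioned above, here is a small sketch that checks a sample robots.txt with Python's standard urllib.robotparser module. The rules, the /private/ path, and the example.com URLs are made up for illustration. The user-agent token infonavirobot is taken from this page's bot name and should be confirmed against the operator's policy page before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep the bot out of /private/ and ask for a 5-second delay.
ROBOTS_TXT = """\
User-agent: infonavirobot
Disallow: /private/
Crawl-delay: 5

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

AGENT = "InfoNaviRobot"  # matched case-insensitively against the User-agent line
allowed_public = parser.can_fetch(AGENT, "https://example.com/articles/1")
allowed_private = parser.can_fetch(AGENT, "https://example.com/private/report")
delay = parser.crawl_delay(AGENT)

print(allowed_public, allowed_private, delay)
```

This only shows how a compliant parser would read the rules. Whether the bot actually follows them is best confirmed from your own access logs.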

🔍 Detection Indicators

The primary detection indicator is the User-Agent string: InfoNaviRobot/2.0 (or older versions like InfoNaviRobot/1.0). Additionally, the bot includes a custom HTTP header X-InfoNavi-Crawler: 1 in some requests, and its IP addresses reverse-resolve to hostnames under the domain *crawl.infonavi.co.jp*. The bot also sends a From header containing a contact email address, in line with RFC 7231.
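Because a User-Agent string is trivial to spoof, the indicators above are more useful in combination. The sketch below pairs a User-Agent match with a forward-confirmed reverse DNS check against the hostname suffix named above. The function names are hypothetical, the suffix is taken from this page rather than verified, and the default resolvers handle IPv4 only. The resolver functions are injectable so the logic can be exercised without network access.

```python
import re
import socket
from typing import Callable, List

UA_PATTERN = re.compile(r"InfoNaviRobot/(\d+(?:\.\d+)*)", re.IGNORECASE)
CRAWLER_DOMAIN = "crawl.infonavi.co.jp"  # suffix as reported on this page (unverified)


def claims_infonavirobot(user_agent: str) -> bool:
    """True if the User-Agent header claims to be InfoNaviRobot (any version)."""
    return UA_PATTERN.search(user_agent or "") is not None


def _reverse(ip: str) -> str:
    return socket.gethostbyaddr(ip)[0]  # PTR lookup


def _forward(host: str) -> List[str]:
    return socket.gethostbyname_ex(host)[2]  # IPv4 A records


def verify_ip(
    ip: str,
    reverse: Callable[[str], str] = _reverse,
    forward: Callable[[str], List[str]] = _forward,
) -> bool:
    """Forward-confirmed reverse DNS: the PTR name must sit under the crawler
    domain, and that name must resolve back to the same IP."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        return False  # no PTR record or lookup failure
    if host != CRAWLER_DOMAIN and not host.endswith("." + CRAWLER_DOMAIN):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False


# Offline example with stub resolvers and a documentation-range IP.
demo = verify_ip(
    "203.0.113.10",
    reverse=lambda ip: "node1.crawl.infonavi.co.jp",
    forward=lambda host: ["203.0.113.10"],
)
```

A request that passes claims_infonavirobot but fails verify_ip is likely an impostor borrowing the bot's name.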

📊 Data Usage

Collected data is used for two primary purposes: (a) building and updating InfoNavi’s search engine index for Japanese-language web content, and (b) training internal AI models for summarization, entity recognition, and semantic search. The company’s privacy policy (https://www.infonavi.co.jp/privacy) states that no personally identifiable information is intentionally collected, and cached pages are deleted after 30 days unless used for model training.

⚙️ Rate Limiting Policy

Although InfoNaviRobot is legitimate and respects standard directives, many web applications rate-limit it because it crawls persistently across many pages. A threshold-based rate limit, for example 50 requests per minute per IP, is recommended so that the bot does not consume excessive server resources while its indexing activity is still allowed.
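One way to apply such a threshold is a sliding-window counter keyed by client IP. The sketch below is an illustrative in-memory version, not Boteraser's or any specific product's mechanism. The class name SlidingWindowLimiter is hypothetical, and the default of 50 requests per 60 seconds mirrors the example figure above.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client IP."""

    def __init__(
        self,
        limit: int = 50,
        window: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str) -> bool:
        now = self._clock()
        hits = self._hits[ip]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False  # over the threshold: reject, e.g. with HTTP 429
        hits.append(now)
        return True


limiter = SlidingWindowLimiter()
first_request_allowed = limiter.allow("203.0.113.10")
```

Rejected requests are not recorded, so a client that keeps retrying is not penalized beyond the window. A production deployment would typically use shared storage so that the counters survive restarts and work across multiple servers.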


ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.