eah

Bot User-Agent: eah

🤖 Overview

EAH is a web crawler operated by Easou Technology Group Co., Ltd., a Chinese search engine provider. It primarily indexes web content for the Easou Search Engine (www.easou.com). First documented in 2016, it collects publicly accessible pages to populate search results and to support related AI-powered features such as entity extraction and ranking models. According to Easou’s official developer documentation and user-agent registry, EAH is one of several spiders the company runs alongside the main EasouSpider family.

🌐 Technical Behavior

The EAH crawler follows a breadth-first crawl pattern, typically starting from a seed list of high-priority domains. It sends HTTP GET requests at an average rate of 5–10 requests per second per IP, though bursts of up to 20 requests per second have been observed during peak indexing cycles. The bot uses both IPv4 and IPv6 addresses from Easou’s allocated ranges, which include 118.89.0.0/16 and 240e:928::/32 (verified via public IP WHOIS records and Easou’s own documentation at help.easou.com/spider). It supports HTTP/1.1 keep-alive and occasionally sends HEAD requests to check for content changes before fetching full pages. The crawler does not request gzip compression by default, but it accepts compressed responses when a server sends them.
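As a rough illustration of how the address ranges quoted above could be used, the following Python sketch checks whether a client IP falls inside them. The two CIDR blocks are copied from the text and are not independently confirmed here; the function name in_claimed_ranges is hypothetical.

```python
import ipaddress

# Ranges quoted in the section above. Treat them as claims to re-check,
# not as an authoritative allowlist.
CLAIMED_RANGES = [
    ipaddress.ip_network("118.89.0.0/16"),
    ipaddress.ip_network("240e:928::/32"),
]


def in_claimed_ranges(addr: str) -> bool:
    """Return True if addr is a valid IP inside one of the claimed ranges."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        # Not a parseable IPv4/IPv6 address (e.g. a hostname or garbage).
        return False
    # Compare only against networks of the same IP version.
    return any(ip.version == net.version and ip in net for net in CLAIMED_RANGES)
```

A range match alone is weak evidence, since address blocks can be reassigned; it is best combined with other signals such as the User-Agent string and DNS checks.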

📋 robots.txt Compliance

According to Easou’s official spider policy published at help.easou.com/spider/robots, EAH fully honors robots.txt directives, including Disallow and Crawl‑delay instructions. The bot reads the robots.txt file at the beginning of each crawl session and caches it for up to 24 hours. Multiple independent analyses (e.g., the RobotTest project on GitHub) have confirmed that EAH does not bypass disallowed paths and respects User‑agent: EAH rules specifically.
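To make the directives concrete, here is a minimal sketch that uses Python’s standard urllib.robotparser to evaluate a sample robots.txt containing a User-agent: EAH group. The paths, the 10-second Crawl-delay, and the helper name allowed are illustrative assumptions, and the parser shows how a compliant client would read the rules, not how EAH itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the paths and delay are examples only.
ROBOTS_TXT = """\
User-agent: EAH
Disallow: /private/
Crawl-delay: 10

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path: str, agent: str = "EAH") -> bool:
    """Return True if a compliant crawler named agent may fetch path."""
    return parser.can_fetch(agent, path)
```

With these rules, allowed("/private/report") is False for EAH while other paths stay open, and parser.crawl_delay("EAH") returns the requested delay in seconds.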

🔍 Detection Indicators

The EAH bot can be identified in access logs by its User-Agent string: Mozilla/5.0 (compatible; EasouSpider/1.0; +http://www.easou.com/search/spider.html). This string names EasouSpider rather than EAH, but some variants include EAH/1.0 as a secondary token within the agent field. Additional behavioral fingerprints include an X-Forwarded-For header consistently set to the originating IP and a Referer header that often points to http://www.easou.com/search. Reverse DNS lookups for EAH IPs consistently resolve to subdomains such as crawler.easou.com.
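The indicators above can be combined in a small log filter. The sketch below matches the EasouSpider and EAH tokens in a User-Agent string and adds a forward-confirmed reverse DNS check. The regular expression, the easou.com suffix rule, and all function names are assumptions made for illustration. Request headers such as User-Agent are trivially spoofed, so the DNS step is the stronger signal.

```python
import re
import socket

# Matches "EasouSpider/1.0" or "EAH/1.0" style tokens (version digits assumed).
UA_PATTERN = re.compile(r"\b(?:EasouSpider|EAH)/\d+(?:\.\d+)*", re.IGNORECASE)


def looks_like_eah(user_agent: str) -> bool:
    """Return True if the User-Agent contains an EasouSpider or EAH token."""
    return bool(UA_PATTERN.search(user_agent))


def hostname_is_easou(hostname: str) -> bool:
    """Return True if hostname is easou.com or one of its subdomains."""
    host = hostname.rstrip(".").lower()
    return host == "easou.com" or host.endswith(".easou.com")


def verify_by_dns(ip: str) -> bool:
    """Forward-confirmed reverse DNS: the PTR name must sit under easou.com
    and must resolve back to the same IP. Requires network access."""
    try:
        hostname = socket.gethostbyaddr(ip)[0]
        if not hostname_is_easou(hostname):
            return False
        forward = {info[4][0] for info in socket.getaddrinfo(hostname, None)}
    except OSError:
        # Covers failed reverse or forward lookups.
        return False
    # Plain string comparison; IPv6 text forms may need normalizing first.
    return ip in forward
```

The suffix check is done on the whole hostname so that a name like crawler.easou.com.evil.example does not pass.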

📊 Data Usage

Data collected by EAH is used exclusively to index results for the Easou Search Engine and to train Easou’s internal AI models, including page ranking algorithms and natural language understanding components. According to Easou’s privacy policy, crawled content is processed to generate snippets, extract keywords, and improve query understanding. The company states that raw page data is retained for a maximum of 180 days unless required for model retraining.

⚙️ Rate Limiting Policy

EAH is commonly rate-limited because its request frequency (up to 20 requests per second) can overwhelm smaller servers and degrade performance for real users. Standard web security practice recommends a threshold-based limit (e.g., 30 requests per minute from a single IP) rather than an outright ban, so that the bot can still complete its crawl, as documented in Easou’s spider guidelines at help.easou.com/spider/rate.

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.