Unitek UniEngine Bot: Detection, Blocking & Technical Analysis

Bot User-Agent: unitek-uniengine

🤖 Overview

Unitek UniEngine is a web crawler operated by Shanghai Unitek Software Co., Ltd., a Chinese enterprise search and data intelligence company. Its primary purpose is to index publicly accessible web content for the UniSearch vertical search engine, which gives enterprise clients and government agencies real-time retrieval of structured and unstructured data. The crawler was first documented in the early 2010s and has since gone through multiple versions, the latest of which (UniEngine/2.0) was introduced in 2021. According to the company’s official product page (unitek.com.cn/uniengine), the bot prioritizes high-relevance pages within .cn domains but also crawls international sites.

🌐 Technical Behavior

UniEngine employs a breadth-first crawling strategy with a configurable concurrency of up to 256 parallel connections per host. Its default request rate is 5 requests per second per IP, which site operators can lower by setting the documented `Crawl-Delay` directive in `robots.txt`. The bot primarily uses HTTP/1.1 with `Keep-Alive` connections and supports both `Last-Modified` and `ETag` validators, so it can issue conditional requests and avoid redundant fetches. Its IP ranges are allocated from AS137276 (Unitek Shanghai) and AS138146 (Unitek Beijing), with common CIDR prefixes including 58.37.0.0/16 and 114.80.0.0/16. The crawler identifies itself in the `User-Agent` header as UniEngine/2.0 and supplies a contact email address in the `From` header: [email protected]. It does not change its IP address during a single crawl session, which makes it easy to distinguish from distributed bots.
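
As a rough illustration of how the prefixes listed above could be used, the sketch below checks whether a client address falls inside either range. It uses only Python's standard `ipaddress` module. The list name `UNIENGINE_NETWORKS` and the function `in_listed_ranges` are hypothetical, and the two prefixes are taken from the text as-is rather than independently verified.

```python
import ipaddress

# CIDR prefixes quoted in the text above; treat them as unverified examples.
UNIENGINE_NETWORKS = [
    ipaddress.ip_network("58.37.0.0/16"),
    ipaddress.ip_network("114.80.0.0/16"),
]


def in_listed_ranges(ip: str) -> bool:
    """Return True if `ip` parses and falls inside one of the listed prefixes."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Malformed input (for example a hostname) never matches.
        return False
    return any(addr in net for net in UNIENGINE_NETWORKS)
```

For example, `in_listed_ranges("58.37.12.9")` is true while `in_listed_ranges("8.8.8.8")` is false. Because these are broad /16 blocks, a match suggests the request may come from the crawler but does not prove it.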

📋 robots.txt Compliance

Official documentation from Unitek (unitek.com.cn/robots.txt) states that UniEngine fully respects the `Disallow` and `Allow` directives as specified in the Robots Exclusion Protocol. The crawler reads `robots.txt` at the start of each crawl session and caches the file for 24 hours. In tests conducted by third-party security researchers (e.g., BotScout report 2022), the bot was observed to honor `Crawl-Delay` settings with a margin of ±1 second. However, it ignores `X-Robots-Tag` headers in HTTP responses, which is a known limitation documented in the Unitek developer FAQ.
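
To make the directives above concrete, the following sketch parses a sample `robots.txt` with Python's standard `urllib.robotparser` and asks what a crawler announcing itself as `UniEngine` would be allowed to fetch. The rule set, the `/private/` path, the 10-second delay, and the use of `UniEngine` as the `User-agent` token are all assumptions made for illustration; the token the crawler actually matches on should be confirmed against Unitek's documentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: slow UniEngine down and keep it out of /private/.
ROBOTS_TXT = """\
User-agent: UniEngine
Crawl-delay: 10
Disallow: /private/
Allow: /

User-agent: *
Disallow:
"""


def parse_rules(text: str) -> RobotFileParser:
    """Build a parser from robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = parse_rules(ROBOTS_TXT)
blocked = rules.can_fetch("UniEngine", "/private/report.html")  # disallowed path
allowed = rules.can_fetch("UniEngine", "/index.html")           # allowed path
delay = rules.crawl_delay("UniEngine")                          # seconds between fetches
```

Python's parser applies rules in file order, which is why the `Disallow` line precedes the broader `Allow` line here. Since the text notes that `X-Robots-Tag` is ignored, restrictions intended for this crawler would need to live in `robots.txt` rather than in response headers.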

🔍 Detection Indicators

Identifying UniEngine in server logs is straightforward: the primary User-Agent string is `Mozilla/5.0 (compatible; UniEngine/2.0; +http://unitek.com.cn/uniengine)`. Older versions (v1.x) send `UniEngine/1.0` without the Mozilla prefix. The bot also always includes a `From` header with the contact email. Behavioral fingerprints include a consistent interval of 200 ms between requests to the same host (which matches the default rate of 5 requests per second) and a tendency to request `robots.txt` before any page. No other identifying headers are known.
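
The User-Agent strings described above can be matched with a single regular expression. The sketch below is one possible approach, not an official detection rule: the pattern `UA_PATTERN` and the function `uniengine_version` are hypothetical names, and the pattern assumes the version always follows `UniEngine/` as in the two strings quoted in the text.

```python
import re
from typing import Optional

# Matches "UniEngine/<version>" with or without the Mozilla-compatible prefix.
UA_PATTERN = re.compile(r"\bUniEngine/(\d+(?:\.\d+)*)")


def uniengine_version(user_agent: str) -> Optional[str]:
    """Return the UniEngine version found in a User-Agent string, or None."""
    match = UA_PATTERN.search(user_agent)
    return match.group(1) if match else None
```

Calling `uniengine_version` on the 2.0 string quoted above returns `"2.0"`, and on the bare legacy string it returns `"1.0"`. A User-Agent header is trivially spoofed, so a match is usually treated as a hint to be combined with a source IP check rather than as proof.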

📊 Data Usage

Data collected by UniEngine is utilized exclusively for search indexing within the UniSearch platform, which provides keyword-based retrieval for corporate knowledge bases, government data portals, and public websites. According to the Unitek privacy policy (unitek.com.cn/privacy), raw content is stored for a maximum of 90 days and is never used for AI training, advertising profiling, or resale. The index is refreshed every 7–14 days for high-traffic pages.

⚙️ Rate Limiting Policy

Although UniEngine is a legitimate and well-behaved crawler, many webmasters rate-limit it because its default concurrency can overwhelm small sites. The recommended threshold is 10 requests per minute per IP; requests beyond that limit receive a 403 response. This policy ensures fair resource allocation while still allowing the bot to index critical content.
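
One way to express the threshold above in code is a per-IP sliding window. The sketch below is a minimal in-memory version under stated assumptions: the class name `SlidingWindowLimiter`, its parameters, and the injectable `clock` are hypothetical, and a production setup would more likely enforce this in the web server or a shared store.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window_seconds` for each client IP."""

    def __init__(self, limit=10, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self.hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def status_for(self, ip: str) -> int:
        """Return the HTTP status to send: 200 if allowed, 403 if over the limit."""
        now = self.clock()
        recent = self.hits[ip]
        # Discard timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return 403
        recent.append(now)
        return 200
```

With the defaults, the eleventh request from one IP inside a minute gets 403, and the IP is served again once the oldest requests leave the window. The 403 status follows the policy stated above; many deployments use 429 Too Many Requests for the same purpose.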

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.