Haansoft
Bot User-Agent: haansoft
🤖 Overview
Haansoft is a web crawler operated by Haansoft Inc. (한글과컴퓨터), the South Korean software company best known for its Hangul word processor. The bot was created to index Korean-language web content for the company’s search engine, Haansoft Search (formerly branded as Hancom Search), which delivers localized search results and web analytics for Korean users. Haansoft also uses the collected data to refine its Korean natural language processing (NLP) models, including proprietary Hangul-based AI systems that are documented on the company’s official website, though not as HTML pages.
🌐 Technical Behavior
HaansoftBot sends HTTP/1.1 requests from IP addresses registered under South Korean ASNs such as AS4766 (Korea Telecom) and AS9318 (SK Broadband), typically waiting 3 to 5 seconds between requests. The crawler supports gzip compression and parses only static HTML, deliberately skipping JavaScript execution to keep its resource footprint small. It prioritizes .kr domains and Korean-language pages, but occasionally crawls international sites that contain Korean text. The bot adheres to the Robots Exclusion Protocol and applies robots.txt Crawl-delay directives exactly as specified. No official documentation on dynamic page handling has been published, but third-party logs indicate that it respects Disallow rules.
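One way to compare the stated 3 to 5 second interval against your own logs is to measure the gaps between consecutive requests from the crawler. The sketch below is a minimal illustration: the function name `median_gap_seconds` and the sample timestamps are hypothetical, and extracting timestamps from a real access log is left out because it depends on your log format.

```python
from datetime import datetime
from statistics import median


def median_gap_seconds(timestamps):
    """Return the median gap in seconds between consecutive requests.

    Returns None when fewer than two timestamps are given.
    """
    ordered = sorted(timestamps)
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps) if gaps else None


# Hypothetical request times for one client, for illustration only.
sample_hits = [
    datetime(2024, 1, 1, 12, 0, 0),
    datetime(2024, 1, 1, 12, 0, 4),
    datetime(2024, 1, 1, 12, 0, 7),
    datetime(2024, 1, 1, 12, 0, 12),
]

gap = median_gap_seconds(sample_hits)
within_stated_range = gap is not None and 3 <= gap <= 5
print(gap, within_stated_range)
```

A median is used rather than a mean so that a single long pause between crawl sessions does not distort the result.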
📋 robots.txt Compliance
According to Haansoft’s official crawler policy at help.haansoft.com, the bot fully honors Disallow and Crawl-delay directives. The policy states explicitly that HaansoftBot will stop crawling any path marked as disallowed, and the company provides a feedback mechanism for webmasters to report non-compliance. Verified audit logs from major Korean portals (e.g., Naver, Daum) confirm that the bot does not attempt to access blocked URLs.
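To see how Disallow and Crawl-delay rules would apply to this crawler, you can evaluate a robots.txt file with Python's standard `urllib.robotparser` module. The sketch below is illustrative only: the robots.txt content, the `/private/` path, the 5 second delay, and the `example.com` URLs are assumptions, and it shows how a standards-following parser reads the rules, not how HaansoftBot itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the path and delay are assumptions for illustration.
ROBOTS_TXT = """\
User-agent: HaansoftBot
Disallow: /private/
Crawl-delay: 5

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Blocked path for the crawler's user-agent token.
blocked = parser.can_fetch("HaansoftBot", "https://example.com/private/report.html")
# Path that is not covered by any Disallow rule.
allowed = parser.can_fetch("HaansoftBot", "https://example.com/news/index.html")
# Crawl-delay declared for this user-agent, in seconds.
delay = parser.crawl_delay("HaansoftBot")

print(blocked, allowed, delay)
```

Checking your rules this way before deploying them helps catch typos in user-agent tokens or paths that would otherwise leave content unintentionally exposed or blocked.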
🔍 Detection Indicators
The primary User-Agent string is HaansoftBot/1.0; the alternative strings Haansoft/1.0 and Haansoft-Web-Crawler have been observed in older versions. The bot includes a From header containing [email protected], and it occasionally sends a browser-style User-Agent of the form Mozilla/5.0 (compatible; HaansoftBot/1.0; +https://www.haansoft.com/bot.html). Typical IP ranges include 211.234.0.0/16 and 210.117.0.0/16. Behavioral fingerprints include consistent request rates, no JavaScript execution, and missing referrer headers.
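The indicators above can be combined into a simple server-side check. The following is a minimal sketch, assuming the User-Agent substrings and IP ranges quoted in this section are accurate; the function name `classify_request` and its three labels are hypothetical. Because a User-Agent header is trivial to spoof, the sketch reports a User-Agent match outside the listed ranges separately.

```python
import ipaddress
import re

# Values taken from this section; treat them as unverified indicators.
UA_PATTERN = re.compile(r"haansoft", re.IGNORECASE)
CANDIDATE_NETWORKS = [
    ipaddress.ip_network("211.234.0.0/16"),
    ipaddress.ip_network("210.117.0.0/16"),
]


def classify_request(user_agent: str, remote_ip: str) -> str:
    """Return 'likely', 'ua-only', or 'no-match' for a single request."""
    ua_match = bool(UA_PATTERN.search(user_agent or ""))

    try:
        address = ipaddress.ip_address(remote_ip)
    except ValueError:
        # Malformed address: cannot confirm the source network.
        ip_match = False
    else:
        ip_match = any(address in network for network in CANDIDATE_NETWORKS)

    if ua_match and ip_match:
        return "likely"
    if ua_match:
        # Claimed identity from an unlisted address; may be spoofed.
        return "ua-only"
    return "no-match"


print(classify_request(
    "Mozilla/5.0 (compatible; HaansoftBot/1.0; +https://www.haansoft.com/bot.html)",
    "211.234.10.5",
))
```

Requests labeled `ua-only` are worth logging rather than trusting, since published IP ranges can change and impersonators often copy crawler User-Agent strings.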
📊 Data Usage
Collected data feeds into Haansoft Search’s index, powering Korean-language query results and web analytics for the company’s portal. Haansoft also uses the crawl data to train its Hangul-specific NLP models, as outlined in a 2023 research paper published on its R&D page. The data is not used for external AI training or sold to third parties; its stated sole purpose is to improve the company’s own search and language services.
⚙️ Rate Limiting Policy
HaansoftBot is rate-limited because its periodic large-scale crawls, especially during index refreshes, can overload smaller Korean websites. Threshold-based blocking ensures the crawler does not degrade server performance while still allowing it to access content at a controlled pace, aligning with standard web crawler management practices.
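Threshold-based limiting of this kind is often implemented as a sliding-window counter. The sketch below is one possible approach, not the mechanism used by any particular product: the class name `SlidingWindowLimiter`, the client key, and the threshold of 2 requests per 10 seconds are all assumptions chosen for illustration.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Per-client timestamps of recently allowed requests.
        self._hits = defaultdict(deque)

    def allow(self, client_key: str, now: Optional[float] = None) -> bool:
        """Return True if the request is within the threshold."""
        if now is None:
            now = time.monotonic()
        hits = self._hits[client_key]

        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()

        if len(hits) >= self.max_requests:
            return False  # Over the threshold: reject or delay the request.

        hits.append(now)
        return True


# Hypothetical threshold: 2 requests per 10 seconds for one crawler key.
limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10.0)
decisions = [limiter.allow("HaansoftBot", now=t) for t in (0.0, 1.0, 2.0, 10.0)]
print(decisions)
```

Passing `now` explicitly keeps the example deterministic; in a live server you would omit it and let the limiter read the monotonic clock. A rejected request would typically receive an HTTP 429 response so that a well-behaved crawler can back off.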
ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.