lapozzbot

Bot User-Agent: lapozzbot

🤖 Overview

lapozzbot is a legitimate web crawler operated by Lapozz Kft., the Hungarian company behind the price comparison portal lapozz.hu. Its primary purpose is to systematically index product listings, prices, availability, and merchant information from e-commerce websites to feed the Lapozz search engine, which allows users to compare offers across thousands of Hungarian and Central European online stores. First publicly documented in 2012, the bot is part of a standard web crawling infrastructure used by price aggregation platforms.

🌐 Technical Behavior

lapozzbot performs scheduled, deep crawls of e-commerce domains, often following pagination links and product category trees. According to official Lapozz documentation, the bot applies a default crawl delay of 10 seconds between requests, which site owners can override with the Crawl-Delay directive in robots.txt. Requests are made over HTTP/1.1 and HTTP/2 from IP ranges belonging to Hungarian internet service providers (e.g., AS5483 and AS12301), with a geographic focus on Hungary. The crawler does not execute JavaScript and retrieves only HTML content, relying on structured data (JSON-LD, Microdata) when available. It keeps session cookies only briefly to limit its impact on server load, and rotates through a set of about 5–10 user-agent strings to prevent simple pattern-based blocking.
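Because the crawler reads static HTML and relies on structured data, product details embedded as JSON-LD are visible to it without any JavaScript execution. The sketch below shows how such a block can be pulled out of a page with Python's standard library. It is an illustration only, not Lapozz's parser: the JsonLdExtractor class, the sample page, and the product values are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


# Hypothetical product page using schema.org Product/Offer markup.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Kettle",
 "offers": {"@type": "Offer", "price": "12990", "priceCurrency": "HUF",
            "availability": "https://schema.org/InStock"}}
</script>
</head><body></body></html>
"""

extractor = JsonLdExtractor()
extractor.feed(SAMPLE_HTML)
product = extractor.items[0]
print(product["name"], product["offers"]["price"], product["offers"]["priceCurrency"])
```

Prices or stock levels that a shop injects only through client-side scripts would not appear in this kind of extraction, which is why server-rendered structured data matters for crawlers of this type.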

📋 robots.txt Compliance

The bot fully honors robots.txt Disallow directives, as confirmed by Lapozz’s public crawler policy page. It also respects Allow and Crawl-Delay directives, and will not crawl a site at all when Disallow: / applies to it. Administrators can use a User-agent: lapozzbot entry to control access. There are no documented cases of the bot disregarding robots.txt rules, and Lapozz provides a contact address ([email protected]) for feedback or disputes.
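The directives above can be checked before deployment. The following minimal sketch parses a robots.txt with Python's standard urllib.robotparser and asks how a compliant crawler identifying as lapozzbot would be treated. The shop domain, the paths, and the 20-second delay are hypothetical values chosen for illustration, not recommendations from Lapozz.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example shop; paths and delay are assumptions.
ROBOTS_TXT = """\
User-agent: lapozzbot
Crawl-delay: 20
Allow: /products/
Disallow: /cart/
Disallow: /checkout/

User-agent: *
Disallow: /admin/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Pass the product token ("lapozzbot"), not the full Mozilla-style string:
# the parser matches on the part of the agent name before the first "/".
print(parser.can_fetch("lapozzbot", "https://shop.example/products/123"))  # True
print(parser.can_fetch("lapozzbot", "https://shop.example/cart/"))         # False
print(parser.crawl_delay("lapozzbot"))                                     # 20
```

This only shows how a standards-following parser reads the file. Whether a given crawler behaves the same way still has to be confirmed from server logs.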

🔍 Detection Indicators

The primary user-agent string is Mozilla/5.0 (compatible; lapozzbot/2.0; +https://lapozz.hu/robot), with variations such as lapozzbot/1.0 seen in older logs. Additional HTTP headers include a From field set to [email protected] and an Accept-Language header preferring Hungarian (hu-HU,hu;q=0.9). The bot rarely sends Referer headers and uses a consistent TCP fingerprint matching Linux kernel versions. Log analysis shows requests originate from a narrow set of IPv4 addresses belonging to Lapozz’s own data center.
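The header indicators above can be combined into a simple log or middleware check. The sketch below is one possible approach, not an official detection method: the match_lapozzbot function and its result keys are hypothetical, and the pattern accepts any major.minor version, which goes beyond the 1.0 and 2.0 values quoted above.

```python
import re

# Built from the user-agent strings quoted above; accepting any "major.minor"
# version is an assumption.
LAPOZZBOT_UA = re.compile(r"\blapozzbot/(\d+\.\d+)\b", re.IGNORECASE)


def match_lapozzbot(headers: dict[str, str]) -> dict[str, object]:
    """Compare request headers with the indicators described for lapozzbot."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    match = LAPOZZBOT_UA.search(normalized.get("user-agent", ""))
    accept_language = normalized.get("accept-language", "").lower()
    return {
        "claims_lapozzbot": match is not None,
        "version": match.group(1) if match else None,
        "hungarian_preferred": accept_language.startswith("hu"),
        "has_referer": "referer" in normalized,
    }


result = match_lapozzbot({
    "User-Agent": "Mozilla/5.0 (compatible; lapozzbot/2.0; +https://lapozz.hu/robot)",
    "Accept-Language": "hu-HU,hu;q=0.9",
})
print(result)
```

A header match only shows that a request claims to be lapozzbot, since any client can send these values. Confirming the source would additionally require comparing the client IP with the operator's address ranges, which are not listed here.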

📊 Data Usage

Collected data (product names, prices, descriptions, stock status, and merchant names) is used exclusively to populate the Lapozz price comparison engine. The platform provides free consumer access to search and filter results, and monetizes via affiliate links and merchant advertisements. No collected data is used for AI training, user profiling, or resale to third parties, as stated in Lapozz’s privacy policy.

⚙️ Rate Limiting Policy

lapozzbot is rate-limited because its deep crawling patterns, while respecting delays, can still generate high volumes of requests during initial indexation or when covering large product catalogs. Administrators are advised to set a Crawl-Delay directive tailored to their server capacity, as the bot does not self-throttle beyond default values.
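To pick a Crawl-Delay that fits a server's capacity, it helps to translate delays into request volumes. The sketch below does that arithmetic under a simplifying assumption: the bot issues one request at a time and waits the full delay between requests, with response time ignored. The function names, the 50,000-page catalog, and the 120 requests per hour budget are hypothetical.

```python
import math

DEFAULT_DELAY_S = 10      # default delay stated in the Technical Behavior section
SECONDS_PER_DAY = 86_400


def requests_per_day(delay_s: float) -> int:
    """Upper bound on daily requests for a strictly sequential crawler."""
    return int(SECONDS_PER_DAY // delay_s)


def days_to_crawl(pages: int, delay_s: float) -> float:
    """Approximate time to fetch every page of a catalog once."""
    return pages * delay_s / SECONDS_PER_DAY


def delay_for_budget(max_requests_per_hour: int) -> int:
    """Smallest whole-second delay that stays within an hourly request budget."""
    return math.ceil(3600 / max_requests_per_hour)


print(requests_per_day(DEFAULT_DELAY_S))                # 8640
print(round(days_to_crawl(50_000, DEFAULT_DELAY_S), 1)) # 5.8
print(delay_for_budget(120))                            # 30
```

Under these assumptions the default delay allows up to 8,640 requests per day, so a large catalog can keep the bot active for several days during initial indexation.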

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information are the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.