b2w

Bot User-Agent: b2w

🤖 Overview

b2w is a web crawler operated by B2W Digital, now part of Americanas S.A., one of the largest e-commerce groups in Brazil. Its primary purpose is to collect product listings, prices, availability, and review data from partner and competitor websites and feed them into B2W’s marketplace platform, price comparison engines, and internal analytics systems. The bot is described as a legitimate commercial agent used for competitive intelligence and catalog enrichment, not for AI training or malicious scraping.

🌐 Technical Behavior

The b2w crawler speaks HTTP/1.1 over both plain HTTP and HTTPS, and its User-Agent string is configurable. According to B2W’s developer documentation (developers.b2wdigital.com), it honors Crawl-delay directives and pauses between requests as instructed. Requests are reported to originate from IP ranges registered to B2W’s Brazilian infrastructure, primarily 177.71.0.0/16 and 191.232.0.0/16 (ASN 265540). The crawler works breadth-first, starting from seed URLs and following links down to deep product pages, at a default rate of up to 10 requests per second. It throttles itself when it receives HTTP 429 responses, and it caches robots.txt, re-fetching the file every 24 hours.
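To cross-check a request against the address ranges quoted above, a small lookup like the following may help. This is a minimal sketch: the two CIDR blocks are copied from this listing and have not been independently verified, and the names `CLAIMED_B2W_NETWORKS` and `in_claimed_b2w_range` are hypothetical.

```python
import ipaddress

# Ranges quoted in this listing. Treat them as unverified claims and
# confirm them against WHOIS data or your own logs before relying on them.
CLAIMED_B2W_NETWORKS = [
    ipaddress.ip_network("177.71.0.0/16"),
    ipaddress.ip_network("191.232.0.0/16"),
]


def in_claimed_b2w_range(ip: str) -> bool:
    """Return True if `ip` is a valid address inside one of the quoted ranges."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Malformed input (for example a hostname) never matches.
        return False
    return any(addr in net for net in CLAIMED_B2W_NETWORKS)
```

A match only says the address falls inside the quoted blocks. It does not prove the request came from the crawler, so it is best combined with other signals.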

📋 robots.txt Compliance

Based on publicly available examples from Brazilian e-commerce sites, the b2w bot honors Disallow and Crawl-delay directives. B2W’s developer hub states that site owners can block or restrict the bot by adding rules for the User-Agent "B2WBot". The crawler caches robots.txt and does not re-request the file until the cache expires, which reduces load on origin servers but also means that rule changes may take up to 24 hours to be picked up.
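Before publishing rules for this bot, it can be useful to check how a standards-following parser reads them. The sketch below uses Python's standard `urllib.robotparser`. The robots.txt content, the paths, the 5-second delay, and the `shop.example` host are illustrative assumptions, not rules recommended by B2W, and the result shows how Python's parser interprets the file, not how the crawler itself behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: restrict B2WBot, leave all other agents unrestricted.
ROBOTS_TXT = """\
User-agent: B2WBot
Crawl-delay: 5
Disallow: /checkout/
Disallow: /account/

User-agent: *
Disallow:
"""


def parse_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = parse_rules(ROBOTS_TXT)

# Product pages stay crawlable, checkout pages do not.
product_allowed = rules.can_fetch("B2WBot/1.0", "https://shop.example/produto/123")
checkout_allowed = rules.can_fetch("B2WBot/1.0", "https://shop.example/checkout/cart")
delay_seconds = rules.crawl_delay("B2WBot/1.0")
```

Here `product_allowed` is `True`, `checkout_allowed` is `False`, and `delay_seconds` is `5`.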

🔍 Detection Indicators

The bot identifies itself with one of two User-Agent strings: "B2WBot/1.0" or "Mozilla/5.0 (compatible; B2WBot/1.0; +https://www.b2wdigital.com/bot)". It may also send a custom HTTP header, X-B2W-Crawler: 1, to self-identify. Behavioral fingerprints include a high share of requests to product pages (URLs containing /produto/, /item/, or /product/) and a steady, evenly paced request rate rather than the bursts typical of aggressive crawlers. IP addresses geolocate to Brazil in over 99% of cases, and the crawler does not rotate its User-Agent between sessions.
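The indicators above can be combined into a simple per-request check. The following is a rough sketch, assuming request headers arrive as a plain dictionary. The function name `b2w_signals` and the returned keys are hypothetical, and the patterns are derived only from the strings quoted in this listing.

```python
import re

# Matches the "B2WBot/<version>" token in either quoted User-Agent form.
UA_PATTERN = re.compile(r"\bB2WBot/\d+(\.\d+)*", re.IGNORECASE)
# Product-page path segments listed as behavioral fingerprints.
PRODUCT_PATH = re.compile(r"/(produto|item|product)/", re.IGNORECASE)


def b2w_signals(headers: dict[str, str], path: str) -> dict[str, bool]:
    """Report which of the listed indicators a single request matches."""
    # Header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return {
        "user_agent": bool(UA_PATTERN.search(normalized.get("user-agent", ""))),
        "self_id_header": normalized.get("x-b2w-crawler", "").strip() == "1",
        "product_path": bool(PRODUCT_PATH.search(path)),
    }
```

Both the User-Agent and the custom header are trivial to spoof, so these signals are better treated as hints to weigh alongside source IP and request rate than as proof of identity.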

📊 Data Usage

Data collected by b2w is used exclusively for commercial purposes: populating B2W’s marketplace product database, generating price comparison reports, improving internal search algorithms, and monitoring competitor pricing. The data is reportedly not used to train large language models or other AI/ML systems. Instead, collection focuses on structured attributes such as SKU, price, stock status, and customer ratings.

⚙️ Rate Limiting Policy

Rate limiting the b2w bot is advisable because sustained crawling at up to 10 requests per second can put significant load on servers, especially on sites with large product catalogs. A threshold-based limit (for example, capping the bot at 10 requests per second or lower and answering excess requests with HTTP 429, which the crawler is reported to back off on) keeps the bot from degrading site performance while still allowing legitimate data collection for marketplace operations.
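One way to apply such a threshold is a sliding-window counter keyed by client. The sketch below is illustrative only: the class name `SlidingWindowLimiter`, the choice of client key, and the injectable clock are assumptions. In production this logic usually lives in a reverse proxy or WAF instead of application code.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each client key."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # client key -> timestamps of allowed requests

    def allow(self, client_key: str) -> bool:
        now = self.clock()
        hits = self._hits[client_key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # caller would typically respond with HTTP 429
        hits.append(now)
        return True


# Example: cap a client identified as the bot at 10 requests per second.
limiter = SlidingWindowLimiter(max_requests=10, window_seconds=1.0)
```

A request handler would call `limiter.allow(key)` with, say, the source IP or the matched User-Agent token as the key, and return HTTP 429 whenever the call returns `False`.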


ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.