Zauba

Bot User-Agent: zauba

🤖 Overview

Zauba is a web crawler operated by Zauba Technologies Pvt. Ltd., an Indian business intelligence firm that aggregates public corporate and financial data. The bot systematically scrapes publicly accessible web pages to feed the Zauba.com platform, which provides company profiles, director information, and financial records sourced primarily from Indian regulatory filings, such as those lodged with the Ministry of Corporate Affairs (MCA). According to the company’s official website (zauba.com), the service aims to make corporate data transparent and accessible, and the crawler is the mechanism by which this data is collected and updated.

🌐 Technical Behavior

The Zauba bot typically targets directories and pages containing business registration details, financial summaries, and legal filings. It keeps a moderate request rate, often 1–2 requests per second, to avoid overwhelming origin servers, though exact throttling may vary. Official documentation from Zauba’s blog and support pages (archived at web.archive.org) indicates that the bot uses HTTP/1.1 with a User-Agent string of “Zauba (business intelligence)”. Server logs show requests originating from IP ranges allocated to AWS (Amazon Web Services) and Indian data centers. The crawler fetches both HTML and JSON endpoints, depending on the site’s structure, and does not execute JavaScript, relying instead on static content.

📋 robots.txt Compliance

Based on robots.txt files observed on multiple websites and Zauba’s own statements (published on their “robots.txt” guidelines page, now archived), the Zauba bot honors Disallow directives when they are explicitly configured. However, no publicly accessible third-party audit confirms strict compliance. The company states that it adheres to standard crawl conventions and expects webmasters to block unwanted access via robots.txt or IP-based restrictions.
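
To see how a Disallow rule would apply to this crawler, the sketch below evaluates a sample robots.txt with Python's standard urllib.robotparser module. The "zauba" token comes from the Bot User-Agent listed at the top of this page; the robots.txt content, the /private/ path, and the example.com URLs are hypothetical. It only shows what a compliant crawler should do with these rules and says nothing about how the bot actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the "zauba" token from /private/,
# leave everything open for all other crawlers.
ROBOTS_TXT = """\
User-agent: zauba
Disallow: /private/

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# What a rule-abiding crawler identifying as "zauba" may fetch.
zauba_private = parser.can_fetch("zauba", "https://example.com/private/report.html")
zauba_public = parser.can_fetch("zauba", "https://example.com/company/acme")

# A different crawler falls through to the wildcard group.
other_private = parser.can_fetch("SomeOtherBot", "https://example.com/private/report.html")

print(zauba_private, zauba_public, other_private)
```

Testing a rule set this way before deploying it helps catch typos in the User-agent token or path prefix.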

🔍 Detection Indicators

The primary identifying User-Agent string is “Mozilla/5.0 (compatible; Zauba/1.0; +https://www.zauba.com/bot)” or simply “Zauba (business intelligence)”. Behavioral fingerprints include consecutive requests to URLs matching patterns like “/company/*” or “/director/*” within short time windows, and a consistent IP origin from AWS or Indian data centers. No additional custom HTTP headers are documented.
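
The indicators above could be combined in a log-analysis script along the following lines. This is a minimal sketch: the function names, the 10-second window, and the 5-hit threshold are assumptions chosen for illustration, not documented values. Because User-Agent strings are easy to spoof, a match should be treated as a hint rather than proof.

```python
import re

# Matches both User-Agent forms quoted above (case-insensitive).
ZAUBA_UA = re.compile(r"\bzauba\b", re.IGNORECASE)

# Paths shaped like /company/* or /director/*.
PROFILE_PATH = re.compile(r"^/(company|director)/")


def is_zauba_user_agent(user_agent: str) -> bool:
    """Return True if the User-Agent header mentions Zauba."""
    return bool(ZAUBA_UA.search(user_agent or ""))


def looks_like_profile_burst(requests, window_seconds=10.0, min_hits=5):
    """Detect many profile-page hits from one client in a short window.

    `requests` is an iterable of (timestamp_seconds, path) tuples.
    The window and threshold defaults are illustrative assumptions.
    """
    times = sorted(t for t, path in requests if PROFILE_PATH.match(path))
    start = 0
    for end in range(len(times)):
        # Shrink the window until it spans at most window_seconds.
        while times[end] - times[start] > window_seconds:
            start += 1
        if end - start + 1 >= min_hits:
            return True
    return False


burst = [(0.0, "/company/a"), (1.0, "/company/b"), (2.0, "/director/c"),
         (3.0, "/company/d"), (4.0, "/director/e")]
print(is_zauba_user_agent("Zauba (business intelligence)"))
print(looks_like_profile_burst(burst))
```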

📊 Data Usage

Collected data is used exclusively to populate and update Zauba.com’s business intelligence database, which offers subscription-based access to corporate records, financials, and director backgrounds. The information supports due diligence, market research, and compliance checks for Indian companies. Zauba does not publicly disclose use of the data for AI training; its primary purpose is searchable business analytics.

⚙️ Rate Limiting Policy

Although Zauba is legitimate and non-malicious, aggressive or high-frequency crawling may degrade server performance for other users, so rate limiting is recommended. Threshold-based blocking, triggered when the bot exceeds a set limit such as 50 requests per minute, ensures fair resource allocation without denying access entirely, in line with standard webmaster practices.
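
One way to implement such a threshold is a sliding-window counter keyed by client. The sketch below uses the 50-requests-per-minute figure from the example above; the class name, the choice of key, and the explicit `now` argument are assumptions made for illustration and testability. A production deployment would more likely use the rate-limiting features of the web server, CDN, or firewall.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests=50, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client key -> accepted timestamps

    def allow(self, client_key, now):
        """Return True if the request at time `now` (seconds) is allowed."""
        hits = self._hits[client_key]
        cutoff = now - self.window_seconds
        # Drop timestamps that have aged out of the window.
        while hits and hits[0] <= cutoff:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold: reject, e.g. with HTTP 429
        hits.append(now)
        return True


limiter = SlidingWindowLimiter()
# 51 requests, one per second: the first 50 pass, the 51st is rejected.
results = [limiter.allow("zauba", float(i)) for i in range(51)]
print(results.count(True), results[-1])
```

In a live handler, `now` would typically come from `time.monotonic()`, and the key could be the client IP or the matched User-Agent. Note that the 1–2 requests per second mentioned under Technical Behavior corresponds to 60–120 requests per minute, so a 50-per-minute threshold would throttle the bot even at its typical pace.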

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.