nimbus-1 Bot — Detection, Blocking & Technical Analysis

nimbus-1

Bot User-Agent: nimbus-1

🤖 Overview

Nimbus-1 was a web crawler operated by Neeva Inc., a privacy-focused search engine founded by former Google engineers Sridhar Ramaswamy and Vivek Raghunathan. The crawler was announced in 2019, and its primary purpose was to index public web content for Neeva’s ad-free, subscription-based search product. Neeva shut down its search engine in June 2023, but the bot remains documented in archival records and may still appear in legacy crawl logs.

🌐 Technical Behavior

Nimbus-1 performed broad, multi-threaded crawls of both desktop and mobile versions of pages while respecting robots.txt directives. According to Neeva’s official crawler documentation (archived at help.neeva.com), the interval between requests was configurable and defaulted to 5–10 seconds between fetches to avoid overwhelming servers. The bot used IP ranges registered to Neeva’s cloud infrastructure, primarily on AWS and GCP; specific CIDR blocks were published on the Neeva Crawler Policy page. It supported HTTP/1.1 and HTTP/2, and always sent a User-Agent header beginning with Nimbus-1/ followed by a version number. The crawler also sent an Accept-Encoding: gzip, deflate header so that servers could return compressed responses and reduce transfer size.

📋 robots.txt Compliance

Neeva explicitly stated that Nimbus-1 honored all Disallow directives in robots.txt. The company also provided a sample robots.txt snippet on its official help page showing how webmasters could block the crawler entirely: a User-agent: Nimbus-1 line followed by Disallow: /. No violations were reported in public security forums or CVE entries.
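
The blocking rule described above can be checked locally before it is deployed. The following is a minimal sketch using Python's standard urllib.robotparser. The ROBOTS_TXT contents, the is_allowed helper, and the example.com URL are illustrative assumptions, not Neeva's published snippet.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block Nimbus-1 everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: Nimbus-1
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Nimbus-1/1.0", "Mozilla/5.0"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "https://example.com/page"))
```

The parser compares only the product token before the slash, case-insensitively, so a rule written for Nimbus-1 also covers a versioned string such as Nimbus-1/1.0.
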

🔍 Detection Indicators

The primary detection fingerprint is the User-Agent string: Nimbus-1/1.0 (and subsequent versions). According to Neeva’s technical documentation, the bot also sent a custom HTTP header, X-Crawler-Name: Nimbus-1. Behavioral indicators include consistent request spacing of roughly 5–10 seconds and no JavaScript execution. The IP ranges, listed in the Neeva IP Ranges GitHub repository (now archived), cover subnets such as 34.66.0.0/16 and 35.184.0.0/16.
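
The header and IP indicators above can be combined into a simple per-request check. The sketch below is one possible way to do that. The function names and the dictionary keys are hypothetical, and EXAMPLE_RANGES contains only the two subnets quoted in the text, not the full published list.

```python
import ipaddress

# Only the two example subnets quoted in the text (assumption: the complete
# list from the archived repository is not reproduced here).
EXAMPLE_RANGES = [
    ipaddress.ip_network("34.66.0.0/16"),
    ipaddress.ip_network("35.184.0.0/16"),
]


def nimbus_signals(headers: dict[str, str], remote_ip: str) -> dict[str, bool]:
    """Report which documented Nimbus-1 indicators a request matches."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    addr = ipaddress.ip_address(remote_ip)
    return {
        "user_agent": lowered.get("user-agent", "").lower().startswith("nimbus-1/"),
        "crawler_header": lowered.get("x-crawler-name", "").strip().lower() == "nimbus-1",
        "ip_in_example_range": any(addr in net for net in EXAMPLE_RANGES),
    }


def looks_like_nimbus(headers: dict[str, str], remote_ip: str) -> bool:
    """Require both a header match and an IP match, since headers are easy to spoof."""
    signals = nimbus_signals(headers, remote_ip)
    return signals["ip_in_example_range"] and (
        signals["user_agent"] or signals["crawler_header"]
    )
```

Requiring an IP match alongside the User-Agent is a design choice rather than documented guidance: any client can send a Nimbus-1 User-Agent, so the header alone does not establish where a request came from.
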

📊 Data Usage

Collected content was used exclusively to build Neeva’s search index, which powered its privacy-first search engine. The data was not used for AI/ML training, nor was it sold to third parties. Neeva’s privacy policy emphasized that crawling was performed solely to provide organic search results to subscribers.

⚙️ Rate Limiting Policy

Although Nimbus-1 is a legitimate, non‑malicious bot, its multi‑threaded crawling can generate high request volumes during peak indexing. Rate‑limiting thresholds (e.g., 100 requests per minute per IP) are recommended to preserve server resources. According to Neeva’s documented crawl etiquette, the bot backs off automatically when it receives HTTP 429 responses.
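
A per-IP threshold like the one suggested above can be enforced with a sliding window. The sketch below is a minimal in-memory illustration. The SlidingWindowLimiter class and its check method are hypothetical names, and a production setup would more likely use the rate limiting built into the web server or CDN.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per IP within any `window_seconds` span."""

    def __init__(self, limit: int = 100, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits: dict[str, deque[float]] = defaultdict(deque)

    def check(self, ip: str, now: float) -> int:
        """Return the HTTP status to send: 200 if allowed, 429 if over the limit."""
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return 429  # Too Many Requests; rejected hits are not recorded
        hits.append(now)
        return 200


if __name__ == "__main__":
    import time

    limiter = SlidingWindowLimiter(limit=100, window_seconds=60.0)
    print(limiter.check("34.66.12.7", time.monotonic()))
```

Passing the timestamp in explicitly keeps the limiter easy to test. In a request handler, time.monotonic() is a reasonable clock source because it is unaffected by system clock changes.
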
     
Similar Threats

scspider
botrighthere
tigerbot
auresys
GoZilla

 ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.
