VCI

Bot User-Agent: vci

🤖 Overview

VCI (Visual Content Indexer) is a web crawler operated by VCI Inc., a private technology company specializing in AI-driven visual search and image recognition. The bot was first publicly documented in the company's robots.txt guidelines (available at developer.vci.com/robots). It systematically collects publicly accessible images, videos, and multimedia metadata from websites, and that material feeds VCI's proprietary Visual Search Engine and the AI training datasets for its computer vision models. Unlike general-purpose search bots, VCI focuses exclusively on visual content, which it categorizes and indexes using computer vision algorithms.

🌐 Technical Behavior

VCI employs a headless Chromium-based fetcher that renders JavaScript-heavy pages to capture dynamic visual elements, so it often consumes more server resources than standard text-only crawlers. According to VCI's official documentation, the bot keeps to a default crawl rate of one request per second per domain, but may burst to three requests per second for sites with high cache-hit rates. Its IP ranges are drawn primarily from ASN 203354 (VCI Infrastructure) and are announced via the BGP prefixes 192.0.2.0/24 and 198.51.100.0/24 (sample ranges listed on vci.com/ip-ranges). Both sample prefixes are reserved for documentation under RFC 5737, so treat them as placeholders and consult the published list for live ranges. The crawler uses HTTP/2 with persistent, reused connections and includes a custom X-VCI-Client header set to vci-crawler/2.0. By default it fetches only files with image, video, and SVG extensions (JPG, PNG, WEBP, MP4, WEBM, SVG), skipping non-visual content unless it is linked from a sitemap.
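As a rough illustration of how the IP ranges above could be used, the following Python sketch checks whether a client address falls inside the two sample prefixes. The function name `in_vci_sample_ranges` and the constant `VCI_SAMPLE_PREFIXES` are hypothetical, and the prefixes are the sample values quoted above rather than a verified live list.

```python
import ipaddress

# Sample prefixes quoted in the text above. These are assumptions for
# illustration; a real deployment would load the current published list.
VCI_SAMPLE_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def in_vci_sample_ranges(remote_addr: str) -> bool:
    """Return True if remote_addr parses as an IP inside a sample prefix."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a valid IP literal (for example a hostname or garbage input).
        return False
    return any(addr in net for net in VCI_SAMPLE_PREFIXES)
```

An IP match alone only shows where a request came from; it is usually combined with the header checks described under Detection Indicators before acting on it.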

📋 robots.txt Compliance

VCI fully supports robots.txt directives, as stated in its official policy (vci.com/robots-policy). The bot reads robots.txt, caches it for 24 hours, and stops crawling any directory or URL pattern that matches a Disallow rule as soon as it has loaded that rule. Because of the 24-hour cache, a newly added rule may take up to a day to be picked up. VCI also respects the Crawl-Delay directive, applying a delay of at least 5 seconds whenever the directive is specified. No violations (e.g., ignoring Disallow for image folders) have been documented in public archives.
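To see how a compliant crawler would interpret rules aimed at this bot, the sketch below parses a small robots.txt with Python's standard `urllib.robotparser`. The robots.txt content, the `/private-images/` path, and the example.com URLs are made up for illustration; the `vci` token is the user-agent token listed at the top of this page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one image folder for VCI and ask for a
# 5-second delay, while leaving every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: vci
Disallow: /private-images/
Crawl-delay: 5

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Evaluate the rules the way a well-behaved crawler identifying as "vci" would.
allowed_public = parser.can_fetch("vci", "https://example.com/gallery/cat.jpg")
allowed_private = parser.can_fetch("vci", "https://example.com/private-images/cat.jpg")
delay = parser.crawl_delay("vci")

print(allowed_public, allowed_private, delay)
```

This only models how the rules read on paper. Whether the bot actually honors them has to be confirmed from server logs.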

🔍 Detection Indicators

The primary User-Agent string is VCI/1.0 (also seen as VCI-Bot/2.0 in some crawls). Additional fingerprints include the X-VCI-Client header value vci-crawler/2.0 and a consistent Accept-Language header of en-US,en;q=0.9. VCI also sets a Referer header of https://vci.com/crawler on each request. Behavioral footprints include a high proportion of HEAD requests, issued to check resource availability before the full GET request, and a characteristic pattern of requesting robots.txt followed by sitemap.xml within the same IP session.
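The header fingerprints above can be combined into a simple scoring check. The following Python sketch is one possible way to do that; the function name `vci_signals` and the regular expression are assumptions for illustration, and every one of these headers can be spoofed by any client.

```python
import re

# Matches the two User-Agent forms mentioned above, e.g. "VCI/1.0" or
# "VCI-Bot/2.0". The exact pattern is an assumption, not a published spec.
UA_PATTERN = re.compile(r"^VCI(-Bot)?/\d+\.\d+", re.IGNORECASE)


def vci_signals(headers: dict) -> list:
    """Return the names of the VCI fingerprints present in the headers."""
    # HTTP header names are case-insensitive, so normalize the keys first.
    h = {k.lower(): v for k, v in headers.items()}
    signals = []
    if UA_PATTERN.search(h.get("user-agent", "")):
        signals.append("user-agent")
    if h.get("x-vci-client") == "vci-crawler/2.0":
        signals.append("x-vci-client")
    if h.get("referer") == "https://vci.com/crawler":
        signals.append("referer")
    if h.get("accept-language") == "en-US,en;q=0.9":
        signals.append("accept-language")
    return signals
```

For example, `vci_signals({"User-Agent": "VCI/1.0", "X-VCI-Client": "vci-crawler/2.0"})` returns `["user-agent", "x-vci-client"]`. Requiring two or more signals, plus a source IP inside the published ranges, reduces false positives compared with matching the User-Agent alone.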

📊 Data Usage

Collected visual content is used to train VCI’s VIPersona image classification model (detailed in arxiv.org/abs/2403.12345), which powers reverse image search and brand‑logo detection services. Additionally, metadata including alt text, captions, and EXIF data is ingested into VCI’s Visual Knowledge Graph for semantic linking. The company explicitly states in its privacy policy that it does not retain personally identifiable visual data beyond 30 days unless aggregated.

⚙️ Rate Limiting Policy

VCI is rate-limited because of its aggressive initial crawl of image-heavy pages and its potential for high numbers of concurrent connections. Administrators are advised to set a threshold of 10 requests per second per IP from the VCI ranges before implementing temporary blocks, as documented in VCI's own crawl guidelines. That threshold sits well above the bot's documented rate of one to three requests per second per domain, so normal crawling should not trigger it. This policy balances legitimate indexing needs against server load, particularly for media-rich sites that may experience cache misses.
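The per-IP threshold described above could be enforced with a sliding-window counter. The sketch below is a minimal in-memory version under stated assumptions: the class name `SlidingWindowLimiter` is hypothetical, the defaults mirror the 10 requests per second figure, and a production setup would more likely use the rate-limiting features of the web server or CDN.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per IP within a rolling time window."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 1.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Maps each IP to the timestamps of its recently accepted requests.
        self._hits = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        """Record a request at time `now` and report whether it is allowed."""
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

In a request handler, `now` would typically come from `time.monotonic()`, and a `False` result would translate into a 429 response or a temporary block for that IP.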

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.