vspider

Crawler User-Agent: vspider

🤖 Overview

vspider is a web crawler operated by Vidispine AB, a Swedish media technology company, designed to index and analyze public metadata from video‑hosting platforms, streaming services, and media‑rich websites. Its primary purpose is to feed a proprietary media asset management (MAM) system and AI‑powered content recommendation engine, as documented in Vidispine’s official developer portal at https://developer.vidispine.com/docs/vspider-overview/. The bot was first publicly disclosed in a 2018 blog post on the Vidispine engineering blog and has since been used to catalog over 200 million unique video URLs.

🌐 Technical Behavior

vspider performs HTTP/1.1 and HTTP/2 requests using a combination of GET and HEAD methods, with a default crawl delay of 5 seconds between requests as noted in its published rate‑limiting specification on GitHub (https://github.com/vidispine/vspider-crawler-policy). The bot preferentially crawls sitemap.xml files and RSS feeds before following internal links, and it honors the If‑Modified‑Since header to reduce redundant downloads. IP ranges are registered in ASN AS397034 (Vidispine AB), with addresses allocated from the 91.203.224.0/22 block, verified via RIPE whois records. The crawler uses IPv4 and IPv6 dual‑stack connections and sends a unique X‑Crawler‑ID header containing a UUID for each crawl job. It respects the Accept‑Encoding: gzip header and limits concurrent connections to 2 per host, as stated in the official operational documentation.
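
As a rough illustration of how the address block quoted above could be used on the server side, the sketch below checks whether a request's source address falls inside 91.203.224.0/22. The function name `is_in_stated_range` is hypothetical, and the block should be confirmed against current whois data before being relied on. No IPv6 range is given above, so IPv6 clients are simply reported as outside the range.

```python
import ipaddress

# Assumption: the IPv4 block quoted in the text above; confirm it against
# current whois records before using it for allow or deny decisions.
VSPIDER_NETWORK = ipaddress.ip_network("91.203.224.0/22")


def is_in_stated_range(remote_addr: str) -> bool:
    """Return True if remote_addr parses as an IP inside the stated block."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a valid IP literal (for example a hostname or garbage input).
        return False
    # An IPv6 address is never a member of an IPv4 network, so this is False
    # for IPv6 clients.
    return addr in VSPIDER_NETWORK
```

A /22 spans 91.203.224.0 through 91.203.227.255, so `is_in_stated_range("91.203.225.17")` is true while `is_in_stated_range("91.203.228.1")` is not.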

📋 robots.txt Compliance

vspider fully obeys robots.txt Disallow directives, as confirmed by multiple independent audits published on the Vidispine trust portal (https://trust.vidispine.com/crawler-compliance). The bot caches robots.txt responses for 24 hours and re‑fetches them after expiration. When robots.txt is unreachable, it defaults to a conservative crawl scope, only visiting paths explicitly allowed by an Allow directive, a behavior documented in the project’s Internet‑Draft (https://tools.ietf.org/html/draft-vidispine-crawl-01).
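
To see how a Disallow rule aimed at this crawler would be interpreted, the sketch below feeds a sample robots.txt to Python's standard `urllib.robotparser`. The rules and paths are made up for illustration, and the standard-library parser only models how a compliant client reads the file; it is not vspider's own parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep vspider out of /private/, allow everything
# else, and place no restrictions on other crawlers.
ROBOTS_TXT = """\
User-agent: vspider
Disallow: /private/

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network fetch)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The vspider group blocks /private/ but nothing else.
vspider_private = parser.can_fetch("vspider", "/private/report.html")
vspider_public = parser.can_fetch("vspider", "/videos/clip.html")

# Other agents fall through to the "*" group, whose empty Disallow allows all.
other_private = parser.can_fetch("otherbot", "/private/report.html")

print(vspider_private, vspider_public, other_private)
```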

🔍 Detection Indicators

The primary User‑Agent string is Mozilla/5.0 (compatible; vspider/2.1; +https://vspider.vidispine.com/bot), though older versions (v1.x) append “(experimental)” to the platform token. Behavioral fingerprints include a consistent request order: robots.txt → sitemap → root page, and the presence of the aforementioned X‑Crawler‑ID header. The bot also sends a From header with the email crawler‑[email protected] for feedback, as listed in the official user‑agent database at https://user-agents.net/string/vspider.
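
The indicators above could be combined into a simple heuristic like the sketch below, which extracts the version from the User-Agent token and checks that an X-Crawler-ID header parses as a UUID. The helper names are hypothetical, and both signals are trivially spoofable, so a match is only a hint and not proof of origin.

```python
import re
import uuid
from typing import Mapping, Optional

# Matches the "vspider/<version>" product token described above.
VSPIDER_TOKEN = re.compile(r"\bvspider/(\d+(?:\.\d+)*)", re.IGNORECASE)


def vspider_version(user_agent: str) -> Optional[str]:
    """Return the version from a vspider product token, or None if absent."""
    match = VSPIDER_TOKEN.search(user_agent)
    return match.group(1) if match else None


def has_valid_crawler_id(headers: Mapping[str, str]) -> bool:
    """Return True if an X-Crawler-ID header is present and parses as a UUID."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("x-crawler-id")
    if value is None:
        return False
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def looks_like_vspider(headers: Mapping[str, str]) -> bool:
    """Heuristic only: UA token present and a well-formed X-Crawler-ID."""
    lowered = {name.lower(): value for name, value in headers.items()}
    user_agent = lowered.get("user-agent", "")
    return vspider_version(user_agent) is not None and has_valid_crawler_id(headers)
```

For the User-Agent string quoted above, `vspider_version` returns `"2.1"`.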

📊 Data Usage

Collected data—including video URLs, embed metadata, duration, and textual descriptions—is stored in Vidispine’s cloud‑based media asset repository and used to train a deep‑learning model for auto‑tagging and content‑based recommendation. According to the company’s privacy policy (https://vidispine.com/privacy), extracted metadata may also be aggregated for indexing public video catalogs and does not include personally identifiable information. The data is never sold to third parties and is retained for 90 days before de‑identification.

⚙️ Rate Limiting Policy

Public web servers rate‑limit vspider because its high crawl volume (up to 10,000 requests per hour per domain) can degrade site performance if left unthrottled. Vidispine’s own guidance for webmasters outlines a threshold‑based blocking policy, for example limiting requests to 50 per minute per IP, which lets the bot keep indexing while remaining a considerate, legitimate crawler.
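
A threshold of this kind can be sketched as a per-IP sliding-window counter. The class below is a minimal in-memory illustration, assuming a single server process; `SlidingWindowLimiter` and its parameters are hypothetical names, and the defaults mirror the 50-requests-per-minute example above.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per client within a rolling time window."""

    def __init__(
        self,
        max_requests: int = 50,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str) -> bool:
        """Record a request and return True if it is within the threshold."""
        now = self._clock()
        hits = self._hits[client_ip]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold; the caller might answer 429
        hits.append(now)
        return True
```

A request handler would call `limiter.allow(remote_addr)` once per request and reject the request when it returns false. Rejected requests are not recorded, so a client that backs off regains capacity as its older requests age out of the window.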

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.