interseek

Bot User-Agent: interseek

🤖 Overview

Interseek is a web crawler operated by Interseek Ltd, a London-based recruitment technology company founded in 2015. Its primary purpose is to index job listings, company career pages, and employment-related content from publicly accessible websites for the Interseek job aggregation platform. The bot collects structured data such as job titles, descriptions, locations, salaries, and application links, which power a global job search engine similar to Indeed or Glassdoor. Interseek Ltd documents the crawler’s behavior on its official site at interseek.com/bot and provides a dedicated verification endpoint for webmasters.

🌐 Technical Behavior

The Interseek crawler runs on a distributed architecture hosted primarily on AWS and Google Cloud. Most of its traffic therefore originates from address space registered under ASN 16509 (Amazon) and ASN 15169 (Google), which is shared with many other cloud tenants and cannot identify the bot on its own. Interseek Ltd also operates a small pool of static IPs (e.g., 185.199.108.0/24) for verification. The bot issues requests over HTTP/1.1 and HTTP/2, with a default crawl frequency of one request every 3 to 5 seconds per domain. It honors ETag and Last-Modified validators to avoid re-fetching unchanged content. Crawl priority is determined by a proprietary relevance algorithm that scores pages on the density of job-related terms (e.g., "apply", "careers", "position"). The bot also parses JSON-LD and microdata for structured job postings, and it sends a HEAD request before each full GET to confirm that the resource is available. It does not follow redirect chains longer than 5 hops and stops processing a URL that returns a 4xx or 5xx status code.
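For webmasters who want to check whether a request came from the static pool mentioned above, a minimal Python sketch follows. It assumes the example range 185.199.108.0/24 quoted in the text is representative; the names INTERSEEK_STATIC_RANGES and in_static_pool are hypothetical, and the authoritative list would have to come from Interseek's own documentation.

```python
import ipaddress

# Assumption: the only range named in the text, given there as an example.
INTERSEEK_STATIC_RANGES = [ipaddress.ip_network("185.199.108.0/24")]


def in_static_pool(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside one of the listed static ranges."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed input is treated as "not in the pool".
        return False
    return any(addr in net for net in INTERSEEK_STATIC_RANGES)


print(in_static_pool("185.199.108.42"))
print(in_static_pool("203.0.113.7"))
```

A miss here does not rule out Interseek, since most of its traffic is described as coming from shared cloud address space.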

📋 robots.txt Compliance

According to Interseek’s official documentation at interseek.com/robots, the bot fully honors Disallow directives in robots.txt. It fetches the file from the root of each domain before crawling and caches the rules for up to 24 hours. The bot’s user-agent token is “Interseek” (not “InterseekBot” or “InterseekCrawler”), so rules addressed to those longer names may not match it and can leave the bot unrestricted. Interseek Ltd also documents a custom extension that lets site owners set a crawl delay and access restrictions through the X-Robots-Tag HTTP response header; this header is sent with individual responses and is not part of the robots.txt file itself.
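The effect of choosing the wrong token can be checked locally before deploying a robots.txt file. The sketch below uses Python's standard urllib.robotparser; the paths, the example.com URL, and the Crawl-delay value are illustrative assumptions, and Python's matching logic is only a stand-in for whatever parser Interseek actually runs.

```python
from urllib.robotparser import RobotFileParser


def parse_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content held in a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# Group addressed to the token the bot is documented to use.
CORRECT = """\
User-agent: Interseek
Disallow: /internal/
Crawl-delay: 5
"""

# Group addressed to a name the bot does not use.
WRONG_TOKEN = """\
User-agent: InterseekBot
Disallow: /internal/
"""

good = parse_rules(CORRECT)
bad = parse_rules(WRONG_TOKEN)
url = "https://example.com/internal/report"  # hypothetical URL

print(good.can_fetch("Interseek", url))   # False: the rule applies
print(bad.can_fetch("Interseek", url))    # True: the group never matches
print(good.crawl_delay("Interseek"))      # 5
```

With the wrong token, the Disallow rule is silently ignored for this bot, which is the failure mode described above.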

🔍 Detection Indicators

The primary User-Agent string is Mozilla/5.0 (compatible; Interseek/1.0; +https://interseek.com/bot). Occasionally the bot sends a secondary string: Interseek/1.0 (bot; +https://interseek.com/bot). Behavioral fingerprints include a Referer header always set to https://interseek.com and an X-Interseek-Version header set to 2.3.1. The bot never sends Accept-Encoding: gzip on initial requests, and its Connection header is always keep-alive. Requests also carry a custom X-Forwarded-For header containing a unique crawler ID for debugging, a non-standard use of a header that normally lists client IP addresses.
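These indicators can be combined into a simple header check. The following Python sketch is one possible approach, not an official detection method: the function name interseek_signals and the regular expression are assumptions, and because request headers are trivially spoofed, a match should be treated as a hint rather than proof.

```python
import re

# Matches "Interseek/<version>" at the start of the string or after a
# space, semicolon, or opening parenthesis, covering both documented strings.
UA_PATTERN = re.compile(r"(?:^|[\s;(])Interseek/\d+(?:\.\d+)*\b")


def interseek_signals(headers: dict[str, str]) -> list[str]:
    """Return the names of the documented indicators present in the headers."""
    h = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    signals = []
    if UA_PATTERN.search(h.get("user-agent", "")):
        signals.append("user-agent")
    if h.get("referer", "").rstrip("/") == "https://interseek.com":
        signals.append("referer")
    if "x-interseek-version" in h:
        signals.append("x-interseek-version")
    return signals


request_headers = {
    "User-Agent": "Mozilla/5.0 (compatible; Interseek/1.0; +https://interseek.com/bot)",
    "Referer": "https://interseek.com",
    "X-Interseek-Version": "2.3.1",
}
print(interseek_signals(request_headers))
```

Returning the list of matched signals, rather than a single boolean, lets the caller decide how many indicators are enough before acting.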

📊 Data Usage

Collected data is used exclusively for job listing aggregation and employment market analytics. Interseek’s platform indexes over 50 million live job postings daily, enabling real-time salary comparisons and employer reviews. The company states that no personal data is extracted; only job post metadata (title, company, location, description) is stored. Data is not sold to third parties but may be used to train machine learning models for job recommendation algorithms within the Interseek platform, as disclosed in their privacy policy at interseek.com/privacy.

⚙️ Rate Limiting Policy

Rate limiting is the recommended policy for Interseek because its crawl rate can saturate low-capacity servers, especially when it indexes large career sites. The recommended threshold is 30 requests per minute per domain; beyond that, webmasters should return 429 Too Many Requests. Interseek Ltd acknowledges that aggressive crawling may occur during initial index builds and advises cooperative throttling via the crawl-delay directive in robots.txt. This policy balances data freshness with fair resource usage.
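The 30-requests-per-minute threshold can be enforced with a sliding-window counter. The Python sketch below is a minimal in-memory illustration under stated assumptions: the class name SlidingWindowLimiter and the key "interseek" are hypothetical, and a production setup would more likely rely on the web server's or CDN's built-in rate limiting.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each key."""

    def __init__(self, limit: int = 30, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock                  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)     # key -> timestamps of accepted requests

    def status_for(self, key: str) -> int:
        """Return 200 if the request is allowed, 429 if it exceeds the limit."""
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return 429
        hits.append(now)
        return 200


limiter = SlidingWindowLimiter()
codes = [limiter.status_for("interseek") for _ in range(31)]
print(codes.count(200), codes.count(429))
```

Rejected requests are not recorded, so a client that keeps retrying during a block does not extend its own penalty.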

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information are the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.