Mojeek Bot — Detection, Blocking & Technical Analysis

Mojeek

Bot User-Agent: mojeek

🤖 Overview

Mojeek is operated by Mojeek Limited, a UK-based independent search engine company founded in 2004. Its crawler, MojeekBot, crawls the web to build Mojeek's own search index rather than relying on results from Bing or Google. That index feeds Mojeek's proprietary search engine, which serves privacy-focused, ad-free search results and powers the Mojeek API used by third-party applications.

🌐 Technical Behavior

MojeekBot uses a breadth-first crawling strategy with a default delay of 5 seconds between requests, according to its official crawler page. It sends If-Modified-Since headers (conditional requests) so that unchanged content does not have to be downloaded again. Requesting IPs can be identified through reverse DNS records that resolve to hostnames under mojeek.com, and include IPv4 addresses from ASN 209242. Crawling is performed over HTTPS only, and the bot sends the User-Agent string MojeekBot/0.3 (+https://www.mojeek.com/bot.html). The bot also parses sitemaps and follows any Crawl-Delay directive in robots.txt, falling back to the 5-second default when none is present.
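For dynamically generated pages, the conditional-request behavior described above only saves bandwidth if the server answers it. The following is a minimal sketch of that server-side decision, not Mojeek's or any framework's implementation. The function name conditional_status is hypothetical, and last_modified is assumed to be a timezone-aware datetime supplied by the application. Most web servers already do this automatically for static files.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(if_modified_since, last_modified):
    """Return 304 if the page is unchanged since the header time, else 200.

    if_modified_since: raw If-Modified-Since header value, or None.
    last_modified: timezone-aware datetime of the page's last change
                   (assumed to come from the application).
    """
    if not if_modified_since:
        return 200  # unconditional request: send the full page
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the header
    if since.tzinfo is None:
        # A date given as "-0000" parses as naive; treat it as UTC.
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    unchanged = last_modified.replace(microsecond=0) <= since
    return 304 if unchanged else 200
```

A handler would call conditional_status with the incoming header and skip rendering the body whenever it returns 304.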

📋 robots.txt Compliance

MojeekBot fully honors Disallow instructions in robots.txt. The official documentation at https://www.mojeek.com/bot.html states that the bot checks robots.txt before each crawl request. It also respects wildcard path patterns and rules addressed specifically to its own user agent. Webmaster tests report that the bot stops crawling disallowed paths within one crawl cycle, and there is no known history of it ignoring these rules.
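Before publishing robots.txt rules for this crawler, it can help to check how a standards-following parser reads them. The sketch below uses Python's standard urllib.robotparser as a stand-in; it is an assumption that MojeekBot interprets the file the same way. The /private/ path and the example.com URLs are hypothetical, and this parser does not implement wildcard matching, so the sample avoids wildcard patterns.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only.
ROBOTS_TXT = """\
User-agent: MojeekBot
Disallow: /private/
Crawl-delay: 5
"""


def mojeek_rules(robots_txt, url):
    """Return (allowed, crawl_delay) for MojeekBot under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed = parser.can_fetch("MojeekBot", url)
    delay = parser.crawl_delay("MojeekBot")  # None if no Crawl-delay applies
    return allowed, delay


if __name__ == "__main__":
    for url in ("https://example.com/private/report.html",
                "https://example.com/blog/"):
        print(url, mojeek_rules(ROBOTS_TXT, url))
```

Under these sample rules, the /private/ URL is reported as disallowed and the /blog/ URL as allowed, both with a crawl delay of 5 seconds.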

🔍 Detection Indicators

The primary identifier is the User-Agent string: MojeekBot/0.3 (+https://www.mojeek.com/bot.html). A second indicator is that a reverse DNS lookup on the requesting IP resolves to a mojeek.com hostname. The bot sends a Referer header set to its own homepage. It does not spoof browser user agents and always includes the bot URL in its User-Agent string. No alternative user-agent variants are known.
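Because a User-Agent string is trivial to forge, the two indicators above are usually combined: match the string first, then confirm the IP with a forward-confirmed reverse DNS check. The sketch below illustrates that idea under stated assumptions. The function names are hypothetical, the regular expression assumes the version number may change, and the lookups are injectable so the logic can be exercised without network access. The default forward lookup uses socket.gethostbyname_ex, which covers IPv4 only.

```python
import re
import socket

# Assumption: the token "MojeekBot/<version>" appears in genuine requests.
UA_PATTERN = re.compile(r"\bMojeekBot/\d", re.IGNORECASE)


def looks_like_mojeekbot(user_agent):
    """Cheap first-pass check on the User-Agent header."""
    return bool(UA_PATTERN.search(user_agent or ""))


def verify_mojeek_ip(ip, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS: IP -> hostname -> IPs must include IP."""
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip).rstrip(".").lower()
        # Require mojeek.com itself or a true subdomain, not a lookalike suffix.
        if host != "mojeek.com" and not host.endswith(".mojeek.com"):
            return False
        return ip in forward_lookup(host)
    except OSError:  # covers socket.herror and socket.gaierror
        return False
```

A request would be treated as genuine only when both looks_like_mojeekbot and verify_mojeek_ip return True. Caching the DNS result per IP avoids repeating the lookups on every request.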

📊 Data Usage

Collected data is used exclusively to build and update the Mojeek search index. This includes full-text content, metadata, and the links used by its ranking algorithms. Mojeek does not use crawled data for AI training, advertising profiling, or third-party resale. The index supports a privacy-focused search engine that does not track users or store search histories, as stated in its privacy policy at https://www.mojeek.com/privacy.

⚙️ Rate Limiting Policy

MojeekBot is a legitimate crawler, but it is rate-limited rather than left unrestricted because its default crawl frequency can overwhelm smaller web servers. The recommended policy is to set Crawl-Delay: 5 in robots.txt, or to use threshold-based blocking (e.g., blocking an IP once it exceeds 10 requests per minute), which protects server resources while still allowing indexing.
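The threshold-based option can be sketched as a per-IP sliding window. This is an illustrative in-memory version, not the policy engine of any particular product. The class name SlidingWindowLimiter is hypothetical, and the caller passes the current time explicitly (for example from time.monotonic()) so the behavior is easy to test.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests=10, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client IP -> timestamps of allowed hits

    def allow(self, client_ip, now):
        """Return True if the request is within the limit, False to block it."""
        hits = self._hits[client_ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # threshold exceeded: block or delay this request
        hits.append(now)
        return True
```

With the defaults, the eleventh request from one IP inside a 60-second span is rejected, and requests are accepted again as older ones leave the window. A production setup would also need to expire idle IPs and share state across server processes.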


ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information are the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.
