picosearch

Search Engine User-Agent: picosearch

🤖 Overview

Picosearch is a lightweight, customisable search service operated by Picosearch.com (previously known as PicoSearch), used mainly to provide site‑specific search for small to medium‑sized websites. Launched in the early 2000s, it lets webmasters build a dedicated search index of their own pages, which visitors query through a hosted search box. The Picosearch crawler feeds this service by indexing the pages that site owners submit. Unlike general‑purpose search engines, Picosearch indexes only pages explicitly added by the webmaster, which makes it a low‑noise, permission‑based crawler.

🌐 Technical Behavior

The Picosearch crawler operates at a modest crawl rate, typically one request every few seconds per domain, and only accesses URLs that the site owner has manually submitted through the Picosearch control panel. It uses HTTP/1.1 and does not render JavaScript: it fetches raw HTML, and requests linked resources such as images or stylesheets only when explicitly required. The bot identifies itself with the User‑Agent string "Picosearch" (exact case). Its IP addresses come from a limited range owned by Picosearch's hosting provider, though no official public IP list is published. The crawler respects robots.txt directives and obeys standard crawl delays. It follows at most one redirect hop and will not index password‑protected or dynamically generated content.

📋 robots.txt Compliance

According to official documentation on Picosearch.com, the bot honours robots.txt Disallow directives and will not crawl any page or directory marked as off‑limits. It also respects the Crawl‑delay directive if one is set. Because the bot only crawls user‑submitted URLs, it is less likely than open‑web crawlers to access restricted areas inadvertently.
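As an illustration of how such rules are evaluated, the sketch below uses Python's standard urllib.robotparser to check what a robots.txt file permits for the "Picosearch" user agent. The robots.txt content, the example.com URLs, and the 5-second delay are assumptions made up for this example; they are not taken from Picosearch documentation, and the parser shown is Python's, not the crawler's own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; paths and delay are assumptions.
ROBOTS_TXT = """\
User-agent: Picosearch
Disallow: /private/
Crawl-delay: 5

User-agent: *
Disallow:
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text in memory (no network request is made)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# URLs below are placeholders on example.com.
allowed_public = rules.can_fetch("Picosearch", "https://example.com/docs/index.html")
allowed_private = rules.can_fetch("Picosearch", "https://example.com/private/report.html")
delay = rules.crawl_delay("Picosearch")

print(allowed_public, allowed_private, delay)
```

A site owner could run a check like this against their own robots.txt before submitting URLs, to confirm that the rules say what they intend.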

🔍 Detection Indicators

The primary identifier is the User‑Agent string "Picosearch". No other custom headers are sent. The bot can be detected by checking server logs for this exact string, or by monitoring requests from known Picosearch IP blocks, which are not officially published but are sometimes shared in webmaster forums. It does not impersonate other user agents or use obfuscation.
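A minimal sketch of the log check described above, assuming the server writes the common "combined" access log format, in which the user agent is the last quoted field. The sample log lines, the IP addresses (taken from documentation-only ranges), and the function name is_picosearch are all hypothetical.

```python
import re

# Captures the last two quoted fields (referer, user agent) of a
# combined-log-format line. The log format is an assumption.
_TAIL_FIELDS = re.compile(r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"\s*$')

BOT_TOKEN = "Picosearch"


def is_picosearch(log_line: str, ignore_case: bool = False) -> bool:
    """Return True if the line's user-agent field contains the bot token."""
    match = _TAIL_FIELDS.search(log_line)
    if match is None:
        return False  # Not a combined-format line; treat as no match.
    user_agent = match.group("ua")
    if ignore_case:
        return BOT_TOKEN.lower() in user_agent.lower()
    return BOT_TOKEN in user_agent


# Made-up sample lines; IPs are from reserved documentation ranges.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /docs/ HTTP/1.1" 200 2326 "-" "Picosearch"',
    '198.51.100.7 - - [10/Oct/2023:13:55:40 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

hits = [line for line in SAMPLE_LOG if is_picosearch(line)]
print(len(hits))
```

Because a User‑Agent header is trivial to spoof, a match on this string alone suggests, rather than proves, that a request came from the Picosearch crawler.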

📊 Data Usage

The collected data is used solely to build a searchable index of the webmaster’s own website. No content is stored outside of the Picosearch platform for AI training, analytics, or third‑party reuse. The index is updated only when the site owner triggers a re‑crawl. Picosearch does not sell or share indexed data.

⚙️ Rate Limiting Policy

Although Picosearch is a low‑frequency crawler, web application firewalls still rate‑limit it, because any automated bot can stress a server when the crawled site contains many pages. Threshold‑based blocking protects server resources and prevents unintended load, especially on shared hosting, where even a few extra requests can degrade performance.
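Threshold‑based limiting of this kind can be sketched as a sliding‑window counter keyed by client. The class below is a simplified illustration, not how any particular firewall implements it; the class name, the limit of 3 requests, and the 10‑second window are assumptions chosen for the example. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per key within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of allowed requests

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # Threshold reached: block this request.
        hits.append(now)
        return True


# Hypothetical thresholds: 3 requests per 10 seconds per user agent.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=10.0)
decisions = [limiter.allow("Picosearch", t) for t in (0.0, 1.0, 2.0, 3.0, 12.0)]
print(decisions)
```

In this run the fourth request arrives while three earlier ones are still inside the window and is refused; by the fifth request the earlier entries have expired, so it is allowed again.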

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information are the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.