Shodan

Bot User-Agent: shodan

🤖 Overview

Shodan is a search engine operated by Shodan.io, founded by John Matherly, that indexes internet-connected devices and services by scanning public IP addresses. Unlike traditional web crawlers, Shodan focuses on banners returned by services (e.g., HTTP, SSH, FTP, RTSP) to build a searchable database of devices from webcams to industrial control systems. Its primary purpose is to provide security researchers, IT professionals, and the public with visibility into exposed devices and misconfigurations, aiding in vulnerability assessment and network discovery.

🌐 Technical Behavior

Shodan continuously probes the IPv4 address space, sending packets to a wide range of ports (e.g., 21, 22, 23, 80, 443, 8080, 8443, 3306, 3389, 5900) and collecting the banners that services return. According to official documentation on Shodan.io, the crawler uses a custom scanning engine that sends an initial probe and parses only the first packet of the response to minimize load on the target. Even so, it can generate high request rates, up to several thousand scans per minute from a single IP. Shodan’s IP ranges are not published in a single list; scans originate from large cloud providers and data centers, and common source ranges belong to DigitalOcean, Linode, and Hetzner. Depending on the protocol, the scanner uses either TCP SYN probes or full TCP connections. It respects TCP timeouts but does not follow robots.txt, because it is not a web crawler in the traditional HTTP sense. Shodan also offers an API for custom queries and real-time scanning.
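To make the banner-collection idea concrete, here is a minimal sketch of reading a service's first response and guessing its protocol. It is not Shodan's scanning engine: the function names `grab_banner` and `classify_banner`, the timeout, and the matching rules are all assumptions chosen for illustration. Only run it against hosts you own or are authorized to test.

```python
import socket


def grab_banner(host: str, port: int, timeout: float = 3.0, max_bytes: int = 1024) -> bytes:
    """Open one TCP connection and return whatever the service sends first.

    Hypothetical helper: reads a single chunk and never sends data itself.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            return sock.recv(max_bytes)
        except socket.timeout:
            # Some services (e.g. HTTP) wait for the client to speak first.
            return b""


def classify_banner(raw: bytes) -> str:
    """Guess the protocol from the first response (illustrative rules only)."""
    text = raw.decode("latin-1").strip()
    if text.startswith("SSH-"):
        return "ssh"
    if text.startswith("220") and "ftp" in text.lower():
        return "ftp"
    if text.startswith("HTTP/"):
        return "http"
    if text.startswith("RTSP/"):
        return "rtsp"
    return "unknown"
```

For example, `classify_banner(grab_banner("127.0.0.1", 22))` would label a local SSH daemon, assuming one is listening on that port. A real scanner would use protocol-specific probes rather than prefix matching.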

📋 robots.txt Compliance

Shodan explicitly states on its FAQ page that it does not honor robots.txt, because it scans IP addresses and service banners rather than web pages. The robots.txt standard governs web crawling over HTTP, not direct socket-level probes. Shodan does provide an opt-out path: network owners can configure their systems to return no response, or to return a specific banner that Shodan will ignore, or they can submit an abuse request to have their IPs excluded from future scans. This is documented on Shodan’s “opt-out” page.

🔍 Detection Indicators

Shodan scanning is typically detected by source IP address, matched against known Shodan scanning servers. The User-Agent header is a secondary signal at best, because the initial scan does not send HTTP requests. When a web server is probed over HTTP, however, the request may carry a header such as User-Agent: Shodan/1.0 or Mozilla/5.0 (compatible; Shodan), as noted in community analysis. Behavioral fingerprints include rapid sequential port scans against a single IP, with no cookie or session handling, often originating from cloud hosting IP ranges. Scanner IPs can reportedly be retrieved through Shodan’s API and the Shodan command-line tool and used to build a blocklist; since Shodan’s ranges are not published in a single list, any such list should be treated as incomplete.
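The two static indicators above, source IP and User-Agent, can be combined in a small check. The sketch below is an assumption-laden illustration: `looks_like_shodan` is a hypothetical helper, and the networks listed are RFC 5737 documentation ranges used as placeholders, not real Shodan addresses. Substitute a scanner list you maintain and trust.

```python
import ipaddress
import re
from typing import Optional

# Placeholder networks (RFC 5737 documentation space), NOT real Shodan ranges.
SUSPECTED_SCANNER_NETWORKS = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]

# Matches both "Shodan/1.0" and "Mozilla/5.0 (compatible; Shodan)".
SHODAN_UA = re.compile(r"shodan", re.IGNORECASE)


def looks_like_shodan(source_ip: str, user_agent: Optional[str] = None) -> bool:
    """Return True if the source IP is in a listed network or the UA mentions Shodan."""
    addr = ipaddress.ip_address(source_ip)
    if any(addr in net for net in SUSPECTED_SCANNER_NETWORKS):
        return True
    # The User-Agent is only present when the probe was an HTTP request.
    return bool(user_agent and SHODAN_UA.search(user_agent))
```

A User-Agent match alone is weak evidence, since any client can send that string; the IP check is the stronger of the two signals.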

📊 Data Usage

Collected data, including service banners, version strings, geolocation, and open port information, is indexed and made searchable on Shodan.io. Users can search for devices running specific software (e.g., “Apache 2.4.49”) or locate exposed industrial control systems. The data is used for vulnerability research, exposure assessments, and academic studies (e.g., analysis of default credentials), as well as by penetration testers. Shodan also offers a “Shodan Maps” feature for geographic visualization and an “Exploits” database linking banners to known CVE entries, such as CVE-2021-41773 (a path traversal flaw in Apache HTTP Server 2.4.49).

⚙️ Rate Limiting Policy

Because Shodan’s scanning can overwhelm poorly defended networks and generate substantial traffic, administrators often rate-limit or block its IP ranges using firewall rules. The rationale is that while Shodan is a legitimate research tool, unchecked scanning degrades network performance and may trigger false alarms in intrusion detection systems, so threshold-based blocking is a prudent security measure.
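Threshold-based blocking of this kind can be sketched as a sliding-window counter of distinct ports touched per source IP. The class name `PortScanThrottle`, the default limits, and the choice to pass timestamps explicitly are illustrative assumptions; production setups would more likely enforce this in the firewall or IDS itself.

```python
from collections import defaultdict, deque


class PortScanThrottle:
    """Flag a source IP once it touches too many distinct ports within a time window."""

    def __init__(self, max_ports: int = 10, window_seconds: float = 60.0):
        self.max_ports = max_ports
        self.window_seconds = window_seconds
        # source IP -> deque of (timestamp, port), oldest first
        self._events = defaultdict(deque)

    def should_block(self, source_ip: str, port: int, now: float) -> bool:
        events = self._events[source_ip]
        events.append((now, port))
        # Drop events that have fallen out of the sliding window.
        while events and now - events[0][0] > self.window_seconds:
            events.popleft()
        distinct_ports = {p for _, p in events}
        return len(distinct_ports) > self.max_ports
```

Counting distinct ports rather than raw connections keeps a busy legitimate client on one port from being flagged, while a sequential sweep across ports crosses the threshold quickly.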

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.