findlinks Bot — Detection, Blocking & Technical Analysis

Bot User-Agent: findlinks

🤖 Overview

findlinks is a web crawler operated by the Germany-based company FindLinks GmbH and first documented in 2010. It systematically discovers and verifies hyperlinks across public websites to support link quality analysis and SEO audit tools. According to the official FindLinks website and the User-Agent specification published at findlinks.de/crawler/, the bot collects link structure data for the company's commercial link monitoring and backlink profiling platform, which is used by digital marketing agencies.

🌐 Technical Behavior

The findlinks crawler performs recursive depth-first crawling, following all standard HTTP and HTTPS hyperlinks found in HTML anchor (a) tags, with a default crawl depth of three levels per site. According to the official documentation, it issues an average of 1 request per second per domain to avoid overloading servers, but can burst to as many as 5 requests per second under low-load conditions. The bot primarily uses IPv4 addresses from the 91.227.0.0/16 range allocated to FindLinks GmbH (AS51085) and may also originate from a smaller IPv6 block (2a02:8080::/32). All requests use HTTP/1.1 over a persistent connection, with a default Accept header of text/html,application/xhtml+xml. The crawler does not execute JavaScript and ignores images, CSS, and other non-HTML resources, focusing solely on anchor links and redirect chains.
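The address ranges above can be turned into a simple source-IP check. The following is a minimal sketch that assumes the two ranges quoted in this article are accurate and current; the names FINDLINKS_NETWORKS and in_findlinks_range are illustrative and are not part of any FindLinks tooling.

```python
import ipaddress

# Ranges as quoted in this article; they are assumptions here and
# should be verified against current allocation data before use.
FINDLINKS_NETWORKS = [
    ipaddress.ip_network("91.227.0.0/16"),
    ipaddress.ip_network("2a02:8080::/32"),
]


def in_findlinks_range(client_ip: str) -> bool:
    """Return True if client_ip falls inside one of the listed ranges."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed addresses never match.
        return False
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixing both families in one list is safe.
    return any(addr in net for net in FINDLINKS_NETWORKS)
```

A request whose User-Agent claims to be findlinks but whose source address fails this check may be worth treating as a possible impersonator, subject to the caveat that the ranges can change.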

📋 robots.txt Compliance

findlinks fully honors robots.txt directives: its published policy states that it respects both Disallow and Crawl-Delay rules. The bot checks robots.txt at the beginning of each crawl session and refreshes cached copies every 24 hours. The official documentation advises webmasters to use User-agent: findlinks in their robots.txt to control access, and the bot stops crawling disallowed paths as soon as it encounters a matching Disallow directive.
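To preview how a compliant crawler would read such a rule, the sketch below feeds a sample robots.txt to Python's standard urllib.robotparser. The /private/ path, the example.com URLs, and the helper name findlinks_may_fetch are hypothetical; the sketch shows how a standards-following parser interprets the rule, not how the findlinks bot is actually implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that targets the findlinks user agent.
ROBOTS_TXT = """\
User-agent: findlinks
Disallow: /private/
"""


def findlinks_may_fetch(robots_txt: str, url: str) -> bool:
    """Return True if the given rules allow 'findlinks' to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("findlinks", url)
```

With these rules, a URL under /private/ is reported as not fetchable for findlinks, while other paths remain allowed.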

🔍 Detection Indicators

The primary User-Agent string is findlinks/2.1.2 (compatible; MSIE 6.0; Windows NT 5.1), an intentionally old-fashioned signature meant to bypass obsolete bot-blocking filters. A secondary string, findlinks-dev/1.0, is used during testing. The bot sends a custom HTTP header, X-Findlinks-ID, containing a unique crawler session identifier. Behavioral fingerprints include requesting only HTML documents, an Accept header restricted to HTML content types such as text/html, and a constant Referer header of https://www.findlinks.de/.
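These indicators can be combined into a small header check. The sketch below assumes request headers are available as a plain dictionary; the function name findlinks_signals and the regular expression are illustrative choices, and the header values are taken from the description above rather than verified against live traffic.

```python
import re

# Matches "findlinks/<version>" and "findlinks-dev/<version>" at the
# start of the User-Agent, per the strings quoted in this article.
UA_PATTERN = re.compile(r"^findlinks(-dev)?/\d+(\.\d+)*", re.IGNORECASE)

# Constant Referer value described in this article (assumed accurate).
FINDLINKS_REFERER = "https://www.findlinks.de/"


def findlinks_signals(headers: dict[str, str]) -> list[str]:
    """Return the names of the findlinks indicators present in headers."""
    # HTTP header names are case-insensitive, so normalize them first.
    h = {name.lower(): value for name, value in headers.items()}
    signals = []
    if UA_PATTERN.match(h.get("user-agent", "")):
        signals.append("user-agent")
    if "x-findlinks-id" in h:
        signals.append("x-findlinks-id")
    if h.get("referer", "") == FINDLINKS_REFERER:
        signals.append("referer")
    return signals
```

Because every one of these headers can be spoofed, a match is best read as a hint to be weighed alongside other evidence rather than as proof of origin.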

📊 Data Usage

Collected link data is aggregated into the FindLinks Link Database, which powers a suite of commercial SEO tools including broken link detection, backlink profile analysis, and site structure audits. According to the FindLinks GmbH privacy policy, raw crawl data is stored for up to 90 days, after which only statistical metadata (e.g., number of outgoing links per page) is retained. No personal information is extracted, and the data is never used for AI model training; it is strictly used for link graph analysis and reporting.

⚙️ Rate Limiting Policy

findlinks is rate-limited by default to 1 request per second per domain to limit server load. Webmasters who need stricter throttling are advised to add a Crawl-Delay: 5 rule to robots.txt. For site operators who only want to control crawl frequency, blocking the bot entirely is unnecessary.
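To put these numbers in perspective, the sketch below estimates an upper bound on daily requests under a given Crawl-delay. It assumes the value is interpreted as seconds between requests (the common convention, not confirmed here for findlinks) and that the bot crawls continuously, so the result is a theoretical ceiling; daily_request_ceiling is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt using the delay suggested in this article.
ROBOTS_TXT = """\
User-agent: findlinks
Crawl-delay: 5
"""

SECONDS_PER_DAY = 24 * 60 * 60


def daily_request_ceiling(
    robots_txt: str, agent: str = "findlinks", default_delay: float = 1.0
) -> int:
    """Upper bound on requests per day implied by the agent's Crawl-delay.

    Falls back to default_delay (the documented 1 request per second)
    when no Crawl-delay applies to the agent.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(agent) or default_delay
    return int(SECONDS_PER_DAY // delay)
```

Under these assumptions, the default rate allows at most 86,400 requests per day per domain, and a Crawl-delay of 5 lowers that ceiling to 17,280.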

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.
