checklinks

Bot User-Agent: checklinks

🤖 Overview

checklinks is a legitimate web crawler operated by CheckLinks Inc., a provider of webmaster tools primarily focused on hyperlink integrity. The bot’s purpose is to systematically crawl websites to verify the validity of internal and external links, reporting broken or redirected URLs to site owners through the company’s free and paid link‑checking services. It does not collect data for AI training or advertising; instead, it serves as a diagnostic agent for website maintenance.

🌐 Technical Behavior

The checklinks crawler follows a polite crawling profile: it typically requests each URL at most once per site per scan cycle, with a default delay of 10–30 seconds between requests. The operator's documentation, referenced at checklinks.com/robots.txt, states that the bot respects the Crawl‑Delay directive if set. Its requests originate from a fixed set of IPv4 addresses published at checklinks.com/ips.txt. The range 198.51.100.0/24 is shown here only as a placeholder (it is a block reserved for documentation); the actual list is maintained live. The bot uses HTTP/1.1 over TLS 1.2+ and sends a standard User-Agent header (see Detection Indicators). It does not execute JavaScript or load external resources, and it parses only anchor (<a>) elements from the raw HTML.
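One way to act on a published address list is to check whether a request's source IP falls inside it. The sketch below uses Python's standard ipaddress module and assumes the list has already been downloaded and parsed into CIDR strings. The single entry is the placeholder range mentioned above, and is_checklinks_ip is a hypothetical helper name, not part of any CheckLinks tooling.

```python
import ipaddress

# Placeholder range from the text above. In practice the entries would be
# loaded from the operator's published list (its file format is an assumption).
PUBLISHED_RANGES = ["198.51.100.0/24"]


def is_checklinks_ip(remote_addr: str, ranges=PUBLISHED_RANGES) -> bool:
    """Return True if remote_addr lies inside any of the given CIDR ranges."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # malformed address: treat as not verified
    return any(addr in ipaddress.ip_network(cidr) for cidr in ranges)
```

An address check like this is harder to spoof than a header match, so it is usually worth running before trusting the User-Agent string.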

📋 robots.txt Compliance

According to its operator's public statement, checklinks fully honors the Robots Exclusion Protocol, and the operator's own robots.txt file follows recommended practices. The bot will not crawl any path explicitly disallowed by a site’s robots.txt, and it obeys both Disallow and Crawl-Delay directives without exception. According to the company’s FAQ, the bot also respects Allow overrides and falls back to User-agent: * entries when no checklinks‑specific rule is found.
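The rule resolution described above can be previewed with Python's standard urllib.robotparser module. This is a sketch of how a compliant parser would read two hypothetical robots.txt files, not a reproduction of the bot's own parser; the paths, the example.com host, and the 20-second delay are illustrative assumptions. Python's parser applies rules in file order, so the Allow line is listed before the broader Disallow.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with a checklinks-specific group.
ROBOTS_WITH_BOT_GROUP = """\
User-agent: checklinks
Allow: /private/status.html
Disallow: /private/
Crawl-delay: 20

User-agent: *
Disallow: /tmp/
"""

# Hypothetical robots.txt with only a wildcard group.
ROBOTS_WILDCARD_ONLY = """\
User-agent: *
Disallow: /tmp/
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


specific = load_rules(ROBOTS_WITH_BOT_GROUP)
wildcard = load_rules(ROBOTS_WILDCARD_ONLY)

# Disallow applies to the bot's own group.
blocked = specific.can_fetch("checklinks", "https://example.com/private/data.html")
# The Allow override re-opens one page inside the disallowed directory.
allowed = specific.can_fetch("checklinks", "https://example.com/private/status.html")
# Crawl-delay declared for the bot's group.
delay = specific.crawl_delay("checklinks")
# With no checklinks group, the wildcard rules apply.
fallback_blocked = wildcard.can_fetch("checklinks", "https://example.com/tmp/x")
```

Running a site's real robots.txt through load_rules is a quick way to confirm what a well-behaved crawler identifying as checklinks should and should not request.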

🔍 Detection Indicators

The primary identifying User-Agent string is CheckLinks/1.0 (compatible; checklinks.com), though variants such as CheckLinks/2.0 have been observed in server logs. The bot also sends a custom X-Crawler: checklinks header for easier identification. Its default request rate is low enough that site operators can distinguish it from aggressive bots by its consistent inter‑request intervals and by the absence of Referer and Cookie headers.
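The indicators above can be combined into a simple header heuristic. The following sketch assumes request headers are available as a plain dictionary; the regular expression is derived only from the two User-Agent strings quoted above, and looks_like_checklinks is a hypothetical helper name. Headers are trivially spoofed, so a match should be treated as a hint rather than proof.

```python
import re

# Pattern based on the quoted strings: "CheckLinks/<major>.<minor>".
UA_PATTERN = re.compile(r"\bCheckLinks/\d+\.\d+\b", re.IGNORECASE)


def looks_like_checklinks(headers: dict) -> bool:
    """Heuristic match on User-Agent, X-Crawler, and missing session headers."""
    # HTTP header names are case-insensitive, so normalise the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}
    ua_match = bool(UA_PATTERN.search(lowered.get("user-agent", "")))
    crawler_match = lowered.get("x-crawler", "").strip().lower() == "checklinks"
    no_session = "cookie" not in lowered and "referer" not in lowered
    return ua_match and crawler_match and no_session
```

Requiring all three signals is a deliberately strict choice for this sketch; a log-analysis script might instead score each signal separately.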

📊 Data Usage

Collected data is used exclusively to generate link integrity reports for subscribed webmasters. The bot records the HTTP status code, response time, and final redirect URL for each link checked. No personal or behavioral data is stored; the results are aggregated and presented in a dashboard that highlights broken links, redirected URLs, and slow‑responding resources. CheckLinks Inc. states it does not sell or share crawl data with third parties.

⚙️ Rate Limiting Policy

Although checklinks is not malicious, it is often rate‑limited because its crawling pattern (checking every link on a large site) can generate hundreds or thousands of requests per scan cycle, which can add noticeable load on smaller servers. A threshold‑based block (e.g., >100 requests/second) is recommended to protect server resources. The bot’s polite default rate of one request every 10–30 seconds stays far below such a threshold, so a normal scan can still complete.
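A threshold-based block of this kind can be sketched as a sliding-window counter kept per client. The class below is a minimal in-memory illustration, assuming the caller supplies request timestamps in seconds; SlidingWindowLimiter is a hypothetical name, and the defaults simply mirror the example threshold of 100 requests per second.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests per window_seconds for a single client."""

    def __init__(self, max_requests: int = 100, window_seconds: float = 1.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def allow(self, now: float) -> bool:
        """Record a request at time `now` (seconds); return False if over the limit."""
        # Drop timestamps that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False  # over the threshold: block or answer with HTTP 429
        self._timestamps.append(now)
        return True
```

Rejected requests are not added to the window here, so a client that slows down is readmitted as soon as older timestamps expire. A production setup would more likely rely on the web server's or CDN's built-in rate limiting.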

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information are the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.