natweb-bad-link-mailer
Email Harvester User-Agent: natweb-bad-link-mailer
🤖 Overview
natweb-bad-link-mailer is a legitimate web crawler operated by the web monitoring service NatWeb, used to proactively identify broken or invalid hyperlinks across public websites and automatically notify site administrators via email. Its primary purpose is to help webmasters maintain link integrity and improve user experience, not to collect data for AI training or indexing. The bot feeds its findings into NatWeb's link-checking platform, which generates reports accessible through the NatWeb dashboard.
🌐 Technical Behavior
The bot crawls web pages by issuing HTTP GET requests at a moderate rate, typically one request every 5–10 seconds per domain, to avoid overwhelming servers. It targets links found in HTML anchor tags and image source attributes, testing each for HTTP 404 and 500 responses and for redirect loops. According to NatWeb's official documentation (https://natweb.com/bot-policy), the crawler operates primarily from IPv4 ranges such as 198.51.100.0/24 (an example placeholder, not a real assignment) and rotates user agents. It does not execute JavaScript or submit forms, focusing solely on static content. The crawl depth defaults to three levels, but may extend if no robots.txt restrictions exist.
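The link-discovery step described above can be sketched in Python. This is a minimal illustration, not NatWeb's actual code: the names `LinkCollector`, `extract_links`, and `classify_result` are hypothetical, and the classification rule (404 or 500 counts as broken, a repeated URL in the redirect chain counts as a loop) is an assumption drawn from the description.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags and src values from <img> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("href"):
            self.links.append(urljoin(self.base_url, attr_map["href"]))
        elif tag == "img" and attr_map.get("src"):
            self.links.append(urljoin(self.base_url, attr_map["src"]))


def extract_links(html, base_url):
    """Return absolute URLs for every anchor and image found in the HTML."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def classify_result(visited_urls, final_status):
    """Label one link check (assumed rule, see lead-in).

    visited_urls is the ordered list of URLs seen while following redirects.
    """
    if len(set(visited_urls)) < len(visited_urls):
        return "redirect-loop"
    if final_status in (404, 500):
        return "broken"
    return "ok"
```

Only static markup is parsed here, which matches the stated behavior of not executing JavaScript or submitting forms.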
📋 robots.txt Compliance
NatWeb explicitly states on its public policy page that natweb-bad-link-mailer honors the Robots Exclusion Protocol. The bot checks robots.txt before every crawl session and respects Disallow directives, including wildcard patterns. However, it does not respect Crawl-Delay directives in the same file, relying instead on its own rate-limiting logic. Reports on webmaster forums are consistent with this: the bot stops crawling disallowed paths.
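To see how a Disallow rule aimed at this bot would be evaluated, the following sketch uses Python's standard `urllib.robotparser`. The robots.txt content, the `/private/` path, and the helper name `is_allowed` are illustrative assumptions. The standard-library parser matches plain path prefixes, so this sketch does not exercise the wildcard patterns the bot is said to support.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the bot from /private/, allow everyone else.
ROBOTS_TXT = """\
User-agent: natweb-bad-link-mailer
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

For example, `is_allowed(ROBOTS_TXT, "natweb-bad-link-mailer", "https://example.com/private/report.html")` evaluates the bot-specific group, while any other agent falls through to the `*` group.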
🔍 Detection Indicators
The primary User-Agent string is Mozilla/5.0 (compatible; natweb-bad-link-mailer/2.0; +https://natweb.com/bot), though older versions use natweb-bad-link-mailer/1.0. Behavioral fingerprints include near-identical request intervals, a missing Accept-Language header, and a custom X-NatWeb-Bot header set to true on all requests. Reverse DNS lookups often resolve to hostnames ending in .natweb.net. The bot rarely sends cookies and does not store session data.
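The header-based indicators above can be combined into a simple check. This is a sketch under stated assumptions: the function names `match_indicators` and `hostname_matches` are hypothetical, the header values are taken from the description above, and none of these signals proves identity on its own, since request headers can be spoofed.

```python
import re

# Matches both the 2.0 and the older 1.0 token described above.
UA_PATTERN = re.compile(r"natweb-bad-link-mailer/\d+\.\d+", re.IGNORECASE)


def match_indicators(headers):
    """Return the list of indicators that a request's headers match."""
    normalized = {name.lower(): value for name, value in headers.items()}
    indicators = []
    if UA_PATTERN.search(normalized.get("user-agent", "")):
        indicators.append("user-agent")
    if normalized.get("x-natweb-bot", "").lower() == "true":
        indicators.append("x-natweb-bot")
    if "accept-language" not in normalized:
        indicators.append("no-accept-language")
    return indicators


def hostname_matches(hostname):
    """Check a reverse-DNS hostname against the .natweb.net suffix."""
    return hostname.rstrip(".").lower().endswith(".natweb.net")
```

In practice the hostname passed to `hostname_matches` would come from a reverse DNS lookup of the client IP, ideally confirmed with a forward lookup; that network step is left out so the sketch stays self-contained.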
📊 Data Usage
Collected data—specifically broken link URLs, response codes, and page referrers—is used exclusively to generate email alerts and dashboard reports for site owners who have registered their domains with NatWeb. NatWeb does not sell this data or use it for any AI training or advertising. The service purges crawl results after 30 days, as stated in their privacy policy (https://natweb.com/privacy).
⚙️ Rate Limiting Policy
Rate limiting for natweb-bad-link-mailer is mainly a matter of preventing excessive bandwidth consumption. The bot imposes its own cap of 60 requests per minute per domain, well above its typical pace of one request every 5–10 seconds. Administrators are advised to apply threshold-based blocking (e.g., 120 requests per minute), which leaves room for the bot's legitimate activity while containing unintended aggressive behavior from misconfigured instances.
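One way to implement threshold-based blocking is a sliding-window counter per client. The sketch below is an assumption about how an administrator might do this, not a NatWeb or Boteraser feature; the class name `SlidingWindowLimiter` is hypothetical, and the default of 120 requests per 60 seconds mirrors the example threshold above.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allows at most max_requests per client within window_seconds."""

    def __init__(self, max_requests=120, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> timestamps of allowed requests

    def allow(self, client_id, now):
        """Record a request at time `now` (seconds); return False if over the limit."""
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A caller would pass a monotonic timestamp, for instance `limiter.allow(client_ip, time.monotonic())`. With the default threshold, a bot that stays at its stated cap of 60 requests per minute is never blocked.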
ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.