urly warning

Bot User-Agent: urly-warning

🤖 Overview

URLy.Warning is a legitimate web crawler operated by URLy (urly.fi), a URL shortening and link safety platform. First identified in public documentation around 2019, it automatically scans shortened URLs and the destination pages they redirect to in order to detect phishing sites, malware hosts, and other malicious content. The bot feeds its findings into URLy’s warning system, which displays a safety notice to users before they visit a potentially dangerous link. The service is documented on urly.fi and referenced in multiple cybersecurity advisories (e.g., AbuseIPDB entries listing URLy.Warning as a benign scanner).

🌐 Technical Behavior

URLy.Warning follows a directed crawl pattern: it first fetches a shortened URL, then follows any HTTP redirects (301, 302, 307) to the final destination. It does not typically crawl the full site. Instead, it limits its requests to the specific landing pages and a few linked resources (e.g., JavaScript or image assets that could indicate malicious payloads). Request frequency is moderate: typically no more than 1–2 requests per second per IP, based on observed traffic patterns reported on webmaster forums. The IP ranges used belong to URLy’s hosting providers (often Hetzner or DigitalOcean) and are reportedly listed in public DNSBLs as benign scanners. The bot uses HTTP/1.1 with standard headers, including Accept-Language and Accept-Encoding.
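The directed crawl described above can be pictured with a short sketch. This is not URLy's code: `resolve_chain`, the `fetch` callback, the `max_hops` limit, and the example URLs are all hypothetical names chosen for illustration. The function follows 301, 302, and 307 responses from a shortened URL to the final destination and stops there instead of crawling further.

```python
from urllib.parse import urljoin

# Redirect status codes named in the description above.
REDIRECT_CODES = {301, 302, 307}


def resolve_chain(start_url, fetch, max_hops=10):
    """Follow redirects from start_url and return every URL visited, in order.

    fetch(url) is a caller-supplied function returning (status, location),
    where location is the Location header value or None. Injecting it keeps
    the sketch independent of any particular HTTP client.
    """
    chain = [start_url]
    url = start_url
    for _ in range(max_hops):
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            break  # final destination reached
        url = urljoin(url, location)  # resolve relative Location values
        if url in chain:
            chain.append(url)  # record the repeated URL, then stop (loop)
            break
        chain.append(url)
    return chain


# Usage with a hypothetical in-memory response table instead of real requests.
RESPONSES = {
    "https://short.example/abc": (302, "/step2"),
    "https://short.example/step2": (301, "https://dest.example/landing"),
    "https://dest.example/landing": (200, None),
}
print(resolve_chain("https://short.example/abc", lambda u: RESPONSES[u]))
```

From a site owner's point of view, the last entry in the chain is the only page on the destination host that such a scanner would be expected to request, along with a few linked assets.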

📋 robots.txt Compliance

According to URLy’s official documentation and confirmed by webmaster reports, URLy.Warning fully honors robots.txt directives. If a path is disallowed, the crawler skips that URL and logs it as inaccessible. This compliance is reportedly built into the bot’s source code, which checks the site’s robots.txt rules (the Robots Exclusion Protocol) before each request.
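To preview how a compliant crawler would read a given robots.txt, the rules can be evaluated locally with Python's standard `urllib.robotparser`. The sketch below is a site-side check, not URLy's parser. The `URLy.Warning` token in the `User-agent` line and the `/private/` path are assumptions for illustration, so confirm the exact token against the bot's own documentation.

```python
from urllib.robotparser import RobotFileParser

# Example rules: block the (assumed) URLy.Warning token from /private/,
# and leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: URLy.Warning
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "URLy.Warning/2.0", "https://example.com/private/page"))
print(is_allowed(ROBOTS_TXT, "URLy.Warning/2.0", "https://example.com/landing"))
```

The parser compares the token against the part of the user-agent string before the first `/`, so a versioned string such as `URLy.Warning/2.0` is matched by the `URLy.Warning` rule group.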

🔍 Detection Indicators

The most reliable detection method is the User-Agent string: URLy.Warning (often with a version suffix like URLy.Warning/2.0). Some variations include URLy Warnings or URLy/2.0. Additionally, requests typically originate from a small set of IP addresses documented in the URLy GitHub repository (github.com/urly/services). The bot rarely sends cookies or session data, and its Referer header is often empty or set to the original shortened URL.
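A minimal sketch of matching the User-Agent variants listed above is shown here. The regular expression and the `is_urly_warning` helper are assumptions written for this page, not an official signature. The pattern accepts `URLy.Warning`, `URLy Warnings`, the hyphenated `urly-warning` token, and `URLy/<version>`, and it ignores case.

```python
import re

# Matches "URLy" followed either by a Warning/Warnings suffix (optionally
# versioned) or by a bare version such as "URLy/2.0". The leading \b keeps
# unrelated words that merely end in "urly" from matching.
URLY_UA = re.compile(
    r"\burly(?:[.\- ]warnings?(?:/[\d.]+)?|/[\d.]+)",
    re.IGNORECASE,
)


def is_urly_warning(user_agent):
    """Return True if the User-Agent header looks like a URLy.Warning variant."""
    return bool(user_agent) and URLY_UA.search(user_agent) is not None


for ua in ["URLy.Warning/2.0", "URLy Warnings", "URLy/2.0", "curly/1.0"]:
    print(ua, is_urly_warning(ua))
```

A User-Agent header can be set to any value by the client, so a match is best treated as a hint and combined with the source-IP check described above.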

📊 Data Usage

Collected data is used exclusively for URLy’s link safety verification. The bot captures the final destination URL, HTTP response codes, page titles, and any malware indicators like known phishing domains (compared against URLy’s internal threat intelligence feed). This data is not used for AI training or search indexing, but to generate real-time warnings displayed to URLy users before they click on a shortened link. No content is stored beyond temporary caching for analysis.

⚙️ Rate Limiting Policy

Although URLy.Warning is not malicious, it can generate a burst of requests when scanning many shortened links that point to the same domain, which can overwhelm smaller sites. Rate limiting is therefore standard practice: a threshold of 10–20 requests per minute per IP is recommended, with a 503 response to make the bot back off. According to its documented behavior, the bot then respects the Retry-After header.
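The policy above can be sketched as a per-IP sliding-window limiter. This is an illustrative sketch under stated assumptions, not a production implementation: the `SlidingWindowLimiter` class, the default of 15 requests per 60 seconds (a value inside the recommended 10–20 range), and the injectable `clock` parameter are all hypothetical choices.

```python
import math
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each IP."""

    def __init__(self, limit=15, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self.hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def check(self, ip):
        """Return (status, headers) to use for a request from `ip`."""
        now = self.clock()
        recent = self.hits[ip]
        # Discard timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            # Seconds until the oldest accepted request leaves the window.
            wait = max(1, math.ceil(self.window - (now - recent[0])))
            return 503, {"Retry-After": str(wait)}
        recent.append(now)
        return 200, {}


limiter = SlidingWindowLimiter(limit=15, window=60.0)
status, headers = limiter.check("192.0.2.10")  # documentation-range IP
print(status, headers)
```

Rejected requests are not recorded, so a client that keeps retrying during the back-off period does not extend its own penalty. In practice the same thresholds are usually configured at the web server or CDN layer.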

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.