aasp

Bot User-Agent: aasp

🤖 Overview

aasp is a legitimate web crawler operated by Amazon Web Services (AWS) as part of the Amazon Advertising Platform, first documented in 2018. Its primary purpose is to crawl e‑commerce and content websites to collect product listings, pricing data, and metadata that feed into Amazon’s ad targeting and product recommendation systems. The data gathered is used to improve the relevance of Amazon’s display and search advertisements across the web.

🌐 Technical Behavior

The aasp crawler sends GET and HEAD requests over HTTP/1.1, with a default crawl interval of 5–10 seconds per host according to AWS documentation. Requests come from IPv4 and IPv6 addresses in Amazon’s ASN 16509 and ASN 14618, and the User-Agent string typically appears as “aasp/1.0”. The crawler does not execute JavaScript and fetches only static HTML pages, so its traffic shows up in server logs as plain page requests that are easy to identify. It respects standard HTTP caching headers (Cache-Control, Expires) and supports conditional requests via the If-Modified-Since header to minimise bandwidth usage.
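As a rough illustration of the conditional-request behaviour described above, the sketch below shows how a server might decide whether to answer an If-Modified-Since request with 304 Not Modified instead of a full page. The function name `is_not_modified` and the sample timestamp are hypothetical, and the code assumes `last_modified` is a timezone-aware datetime; it is not taken from any Amazon documentation.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def is_not_modified(if_modified_since: Optional[str], last_modified: datetime) -> bool:
    """Return True when a 304 Not Modified response is appropriate.

    `last_modified` is assumed to be timezone-aware (hypothetical helper).
    """
    if not if_modified_since:
        return False  # no conditional header: serve the full response
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return False  # unparseable header: serve the full response
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds first.
    return last_modified.replace(microsecond=0) <= since


# Illustrative timestamp for a page that last changed on 15 January 2024.
page_updated = datetime(2024, 1, 15, 12, 0, 0, tzinfo=timezone.utc)
header = format_datetime(page_updated, usegmt=True)

print(is_not_modified(header, page_updated))                           # True
print(is_not_modified("Sun, 14 Jan 2024 12:00:00 GMT", page_updated))  # False
```

A crawler that already holds the current copy receives an empty 304 response, which is where the bandwidth saving comes from.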

📋 robots.txt Compliance

Amazon officially states that aasp honours robots.txt directives, including both Disallow and Crawl-delay. This policy is documented in the Amazon Advertising Developer Guide (https://developer.amazon.com/docs/advertising/crawler-policy.html). Third-party server logs indicate that the bot stops crawling newly disallowed paths after a 24-hour re-evaluation period.
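To see how Disallow and Crawl-delay rules aimed at this user agent would be interpreted, the following sketch runs a sample robots.txt through Python's standard `urllib.robotparser`. The paths, the 10-second delay, and the example.com URLs are placeholders chosen for illustration; the parser shows how a compliant client reads the rules, not how aasp itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: restrict aasp, leave other crawlers unrestricted.
ROBOTS_TXT = """\
User-agent: aasp
Disallow: /checkout/
Disallow: /account/
Crawl-delay: 10

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The parser matches on the product token before the "/" in the User-Agent.
blocked = parser.can_fetch("aasp/1.0", "https://example.com/checkout/cart")
allowed = parser.can_fetch("aasp/1.0", "https://example.com/products/123")
delay = parser.crawl_delay("aasp/1.0")

print(blocked)  # False
print(allowed)  # True
print(delay)    # 10
```

Given the re-evaluation period mentioned above, a change to robots.txt may take up to a day to show up in the crawler's behaviour.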

🔍 Detection Indicators

The main identifying header is the User-Agent string aasp/1.0 (case-sensitive). The From header may also contain crawler‑[email protected], and requests originate exclusively from Amazon-owned IP ranges published in the AWS IP Address Range JSON (https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html). No other custom headers are used, which makes aasp straightforward to differentiate from other bots.
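Because a User-Agent string is trivial to spoof, a check that combines it with the source IP is more reliable. The sketch below is one possible way to do that, assuming the range data follows the published AWS layout with `prefixes[].ip_prefix` and `ipv6_prefixes[].ipv6_prefix` keys. The helper names and the sample ranges are hypothetical: the ranges shown are reserved documentation networks, not real AWS prefixes, and a real deployment would load the current published JSON instead.

```python
import ipaddress

# Placeholder data in the shape of the AWS IP range JSON (NOT real AWS ranges).
SAMPLE_IP_RANGES = {
    "prefixes": [{"ip_prefix": "203.0.113.0/24"}],
    "ipv6_prefixes": [{"ipv6_prefix": "2001:db8::/32"}],
}


def load_networks(ip_ranges: dict) -> list:
    """Build network objects from the IPv4 and IPv6 prefix lists."""
    nets = [ipaddress.ip_network(p["ip_prefix"]) for p in ip_ranges.get("prefixes", [])]
    nets += [ipaddress.ip_network(p["ipv6_prefix"]) for p in ip_ranges.get("ipv6_prefixes", [])]
    return nets


def looks_like_aasp(user_agent: str, remote_ip: str, networks: list) -> bool:
    """True when the User-Agent matches (case-sensitive) and the IP is in a listed range."""
    if not user_agent.startswith("aasp/"):
        return False
    addr = ipaddress.ip_address(remote_ip)
    return any(addr.version == net.version and addr in net for net in networks)


networks = load_networks(SAMPLE_IP_RANGES)
print(looks_like_aasp("aasp/1.0", "203.0.113.7", networks))   # True
print(looks_like_aasp("aasp/1.0", "198.51.100.1", networks))  # False: IP outside the ranges
print(looks_like_aasp("AASP/1.0", "203.0.113.7", networks))   # False: case mismatch
```

A request that claims the aasp User-Agent but arrives from outside the listed ranges is a reasonable candidate for treatment as an impostor.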

📊 Data Usage

Collected data—including product titles, prices, images, and availability flags—is stored in Amazon’s advertising infrastructure to power Sponsored Products and Product Display Ads. The data is also used for training Amazon’s ad‑ranking models and for publisher revenue attribution. Amazon does not use this data to train general‑purpose AI language models.

⚙️ Rate Limiting Policy

aasp is rate-limited because sustained, concurrent crawling of many pages per second can overwhelm smaller servers. Threshold-based blocking protects server resources while still allowing the legitimate advertising data collection that benefits both advertisers and publishers.
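Threshold-based limiting of this kind can be sketched with a sliding-window counter keyed by client. The class name `SlidingWindowLimiter` and the limit of 3 requests per second are illustrative assumptions, not Boteraser's or Amazon's actual thresholds; timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each key (hypothetical sketch)."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of allowed requests

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the threshold: reject (e.g. respond with HTTP 429)
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=3, window_seconds=1.0)
results = [limiter.allow("aasp", t) for t in (0.0, 0.1, 0.2, 0.3, 1.05)]
print(results)  # [True, True, True, False, True]
```

The fourth request is rejected because three requests already fall inside the one-second window; by 1.05 seconds the oldest has expired, so the crawler is admitted again rather than blocked outright.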

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.