GoogleAgent-Mariner Bot — Detection, Blocking & Technical Analysis

Bot User-Agent: googleagent-mariner

🤖 Overview

GoogleAgent-Mariner is a web crawler operated by Google DeepMind, introduced in December 2024 as part of the experimental Project Mariner — a research prototype that uses a Chrome extension and a backend crawler to allow an AI agent to browse the web and perform multi‑step tasks on behalf of users. Unlike standard search crawlers, this bot focuses on real‑time content retrieval for the agent’s decision‑making pipeline and does not feed data into Google’s search index or general AI training datasets.

🌐 Technical Behavior

GoogleAgent-Mariner emulates a full browser environment, executing JavaScript and rendering pages to capture dynamic, client-side content. Requests carry the User-Agent string Mozilla/5.0 (compatible; GoogleAgent-Mariner; +https://developers.google.com/search/docs/crawling-indexing/overview). Source IP addresses fall within Google's documented public ranges (e.g., 66.249.0.0/16), and reverse-DNS lookups resolve to *.googlebot.com or *.mariner.google.com. The crawler connects over HTTPS using HTTP/2 and includes a From: header with a project contact email. During agent sessions the request rate varies: the crawler typically keeps at least 10 seconds between requests to the same domain, but bursts can occur when the agent processes several tasks in rapid succession.
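The two cheapest signals described above, the User-Agent token and the source range, can be checked with a few lines of Python. This is a minimal sketch, not a definitive filter: the function names are hypothetical, and the single hardcoded range is only the example quoted in the text. A real deployment would presumably load Google's current published ranges instead.

```python
import ipaddress

# Token taken from the text; compared in lowercase because this page lists it
# both as "GoogleAgent-Mariner" and as "googleagent-mariner".
UA_TOKEN = "googleagent-mariner"

# Example range quoted in the text. Assumption: production code would load
# the current list of published ranges rather than hardcode one network.
EXAMPLE_RANGES = [ipaddress.ip_network("66.249.0.0/16")]


def claims_mariner(user_agent: str) -> bool:
    """Return True if the User-Agent header contains the Mariner token."""
    return UA_TOKEN in (user_agent or "").lower()


def in_example_ranges(ip: str) -> bool:
    """Return True if the address parses and falls inside an example range."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # not a valid IP literal
    return any(addr in net for net in EXAMPLE_RANGES)
```

A request that passes claims_mariner but fails in_example_ranges is a candidate for closer inspection, since the User-Agent header alone is trivial to forge.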

📋 robots.txt Compliance

According to Google's official developer documentation (published alongside Project Mariner), GoogleAgent-Mariner respects Disallow directives and Crawl-delay instructions in robots.txt. It also checks the X-Robots-Tag HTTP header and noindex meta tags. The crawler does not access pages behind login forms, paywalls, or HTTP authentication, and it applies robots.txt exclusions to subdirectories at the same path-level granularity as Googlebot. Googlebot itself does not support the Crawl-delay directive, so operators should confirm the delay behavior in their own logs rather than rely on it.
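To see how a robots.txt group aimed at this bot would be interpreted, the rules can be run through Python's standard urllib.robotparser. The sketch below is illustrative only: the robots.txt body, the /private/ path, and the example.com URLs are assumptions, and the output shows how the standard-library parser reads the rules, not how the bot actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block /private/ for the Mariner token and ask for
# a 10-second delay; leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: GoogleAgent-Mariner
Disallow: /private/
Crawl-delay: 10

User-agent: *
Disallow:
"""


def build_parser(robots_body: str) -> RobotFileParser:
    """Parse a robots.txt body from a string, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
blocked = parser.can_fetch("GoogleAgent-Mariner", "https://example.com/private/report.html")
allowed = parser.can_fetch("GoogleAgent-Mariner", "https://example.com/blog/post")
delay = parser.crawl_delay("GoogleAgent-Mariner")
```

Here blocked is False, allowed is True, and delay is 10, which matches the intent of the rules as written.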

🔍 Detection Indicators

The primary identifier is the GoogleAgent-Mariner token in the User-Agent header. It is reported here both inside the longer Mozilla/5.0 (compatible; ...) string and in the short form GoogleAgent-Mariner/1.0, so match on the token rather than on an exact string. The header is often accompanied by a Via header indicating the Mariner proxy. Additional fingerprints include an X-Forwarded-For header carrying IPs from Google's allocated blocks and a consistent Accept-Language: en-US,en;q=0.9 header. Because headers can be spoofed, site operators should verify requests with a reverse DNS lookup: legitimate crawlers resolve to a googlebot.com or mariner.google.com subdomain, and a forward DNS lookup of the returned name must match the original source IP.
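The reverse-then-forward DNS check described above can be sketched as follows. The function names are hypothetical, and the allowed suffixes are simply the two domains named in the text. The resolvers are passed in as parameters so the logic can be exercised without network access; by default they use the standard socket module.

```python
import socket
from typing import Callable, List

# Hostname suffixes named in the text. The leading dot prevents a name such
# as "evilgooglebot.com" from matching.
ALLOWED_SUFFIXES = (".googlebot.com", ".mariner.google.com")


def _reverse(ip: str) -> str:
    """PTR lookup: return the primary hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def _forward(host: str) -> List[str]:
    """Forward lookup: return every address the hostname resolves to."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def verify_source_ip(
    ip: str,
    reverse: Callable[[str], str] = _reverse,
    forward: Callable[[str], List[str]] = _forward,
) -> bool:
    """Return True only if reverse DNS lands on an allowed domain and the
    forward lookup of that name includes the original IP."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return False
    if not host.endswith(ALLOWED_SUFFIXES):
        return False
    try:
        # Plain string comparison: assumes both sides use the same textual
        # form, which holds for IPv4 but may need normalizing for IPv6.
        return ip in forward(host)
    except OSError:
        return False
```

DNS lookups are slow compared with header checks, so it is usually worth caching the result per IP for some time instead of resolving on every request.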

📊 Data Usage

The content collected by GoogleAgent‑Mariner is used exclusively to power the Project Mariner AI agent — enabling it to interpret page structures, extract information, fill forms, and complete workflows on behalf of the user. Google states that data may be retained temporarily for debugging and performance analysis but is never used to train general‑purpose language models or improve search retrieval. No collected content is shared with third parties or sold.

⚙️ Rate Limiting Policy

Although legitimate, GoogleAgent-Mariner can generate sustained request bursts during complex agent sessions, which may consume significant server resources. A rate limit of 100 requests per minute per IP is recommended; beyond that threshold, blocking is justified to protect site stability and ensure fair access for all visitors.
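One way to enforce the 100 requests per minute per IP threshold is a sliding-window counter. The class below is a minimal in-memory sketch with hypothetical names. It assumes a single process and does not evict idle IPs; a production setup would more likely enforce the limit at the reverse proxy or in a shared store.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per IP within any `window_seconds` span."""

    def __init__(self, limit=100, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str) -> bool:
        now = self.clock()
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the threshold: caller may answer 429 or block
        hits.append(now)
        return True
```

Calling allow(ip) once per incoming request returns False as soon as that IP exceeds the limit within the trailing window, and rejected requests are not counted against future windows.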

ⓘ Data Notice: The information presented above has been compiled from publicly available internet sources. Boteraser aggregates this data solely for informational purposes and does not independently classify, evaluate, or endorse any findings about the bots listed. The accuracy and completeness of this information is the sole responsibility of the original publishers. Boteraser and its operators accept no liability for any decisions made based on this data.