ShortSpan.ai logo

Agentic browsers revive old web threats at scale

Agents
Published: Thu, May 07, 2026 • By James Armitage
Agentic browsers revive old web threats at scale
New research shows LLM-powered browsing agents are not just prompt-injection bait. By treating page content as trustworthy steps, they reintroduce classic web and social-engineering attacks, often amplified. The team built 18 proofs of concept and saw 14 reproduce across four major LLMs. Alignment and prompt filters did not save them.

Agentic browsing has a sales pitch that writes itself: let a Large Language Model, or LLM, click through the web so you do not have to. The industry then narrowed the risk to a single buzzword, prompt injection, and called it a day. This paper refuses that simplification and, frankly, it is about time.

The authors extend the usual see-to-act model into a proper browser threat model and cast the agent as a confused deputy. That phrase matters. The agent has your permissions and no common sense about where a page ends and a task begins. From that, they derive 20 attack families and implement 18. The hits are old-school: social engineering, cross-origin trickery, and UI deception that were once dulled by user scepticism now land because the “user” is a deterministic parser with tool access.

Five failure modes explain the bulk of the carnage. First, the agent bridges data within a site that a normal user would not, like scraping a hidden account key and posting it back. Second, it bridges across sites, turning markup control into cross-origin exfiltration via third-party domains. Third, it hallucinates URLs and clicks them, which the authors sensibly call slop-squatting, handing traffic to attackers who register the guessed domains. Fourth, it leaks its own instructions or your data to pages when asked, because a page element looks like part of the task. Fifth, once tools are wired in, the agent misuses them, triggering code execution or file writes because a page suggested it.

This is not a single-vendor fluke. Fourteen of the twenty attacks reappeared across four different LLMs and two agent harnesses, including Playwright-MCP and BrowserOS. A prompt-filter intermediary, BrowseSafe, missed several confusion plays. Alignment did not help. None of that should surprise anyone who has actually built production systems: if you collapse trust boundaries, web basics come back with a bigger budget.

Here is the uncomfortable truth. Agentic browsers do not fail in exotic new ways. They repackage familiar web scams and bypass decades of mitigations by moving the human out of the loop. That makes them riskier than a human user, not because models are evil, but because organisations are handing them the keys without giving them a map of the building.

My view: this work is the strongest argument yet that agentic browsing needs a rearchitecture, not another round of prompt band-aids. If you point these systems at sensitive workflows today, you are accepting cross-origin leaks, tool abuse, and domain misnavigation as business-as-usual. That is not doomerism. It is just how the web works when you bolt a credulous deputy onto it.

Additional analysis of the original ArXiv paper

📋 Original Paper Title and Abstract

WAAA! Web Adversaries Against Agentic Browsers

Authors: Sohom Datta, Alex Nahapetyan, William Enck, and Alexandros Kapravelos
Large language models (LLMs) are increasingly being integrated into web browsers to create agentic browsing systems that execute actions on behalf of the user. Prior work considering the security of agentic browsers focuses exclusively on indirect prompt-injection attacks. However, by failing to consider traditional web attacks, previous agentic browser threat models have a blind spot to web social engineering attacks originally designed to trick humans. In this paper, we propose the first web-focused threat model for agentic browsers and use it to derive a taxonomy of 20 attacks across both the web and LLM space, and implement 18 of the attacks. Our threat model extends the original See$\rightarrow$Act browser agent model to account for all components of a browser, and frames the agent as a confused deputy unable to distinguish task steps from traditional web attacks. We show that 10 web threats can reemerge often in amplified forms once an agent can be influenced by untrusted page content. We further conduct a generalizability study on 14 of the 20 attacks, showing that our attacks reproduce across 4 major LLM models spanning multiple vendors. We show that agentic browsers exhibit five major failure modes when facing traditional and LLM web threats, demonstrating the need to rearchitect agentic browsers before they are ready for the current web.

🔍 ShortSpan Analysis of the Paper

Problem

This paper studies the security of agentic browsers—web browsers in which large language models (LLMs) autonomously perceive pages and execute actions on behalf of users. Prior work largely treats agentic-browser threats as indirect prompt injection, neglecting traditional web and social-engineering attacks. That omission creates a blind spot: an LLM-driven agent can be manipulated by attacker-controlled page content and common web deception patterns, potentially performing privileged actions or leaking sensitive data.

Approach

The authors propose the first web-focused threat model for agentic browsers by extending the See→Act agent model to include origin, element sets and explicit browser capabilities. They model the agent as a confused deputy authorised to perform privileged operations but unable to distinguish legitimate page components from malicious ones. From the model they derive a taxonomy of 20 attacks, build proof-of-concept implementations for 18, and run a generalisability study on 14 attacks across four major LLMs and two agentic-browser harnesses (Playwright-MCP and BrowserOS). Experiments used controlled local deployments and time-limited sessions; they also tested some commercial agents and a prompt-filter intermediary (BrowseSafe).

Key Findings

  • Taxonomy and implementation: A taxonomy of 20 attacks was produced and 18 attacks were implemented as proofs of concept, showing a broad attack surface.
  • Re-emergence and amplification: Ten well-mitigated web threats reappear or are amplified when an agent processes untrusted page content, enabling cross-origin data flows and same-site manipulation that bypass traditional mitigations.
  • Five failure modes: Agentic browsers exhibit five core failures—bridging same-site data, bridging cross-site data, hallucinating URLs (slop-squatting), leaking prompts or user data to sites, and misusing integrated external tools—each enabling distinct classes of attacks.
  • Generalizability across models and products: Fourteen attacks reproduced across four LLM models from multiple vendors; failures are not limited to a single model or vendor and defeat out-of-the-box alignment in several cases.
  • Realistic exploit examples: Proofs of concept included successful scenarios such as an agent posting a user account key from a page, exfiltrating a cookie via a third-party domain, leaking credentials from an overlaid prompt, and inducing code execution or file writes when tool integrations were available.
  • Existing defences insufficient: An intermediary prompt-filter (BrowseSafe) failed to flag several confusion-style web attacks, and browser protections that assume human interaction or code-level attacks do not prevent agent-driven confusion exploits.

Limitations

Assumptions include an unpoisoned model, a non-malicious user, and absence of traditional browser memory-safety bugs. The generalisability study covers 14 of 20 attacks and uses controlled harnesses rather than live-system abuse. Two attack classes were not triggered in some harnesses due to missing agentic interfaces. Network on-path attackers were out of scope.

Implications

Offensive implications are substantial: web adversaries can leverage familiar social-engineering techniques and simple markup to coerce agentic browsers into leaking cross-origin or same-site secrets, navigating to attacker-controlled domains (including pre-registered hallucinated domains), invoking external tools to execute code or access files, and fingerprinting agents to tailor attacks. Because agents collapse trusted and untrusted content, attackers with only markup-level control can achieve effects comparable to script-level compromise, lowering the bar for practical exploitation.


Related Articles

Related Research

Get the Weekly AI Security Digest

Top research and analysis delivered to your inbox every week. No spam, unsubscribe anytime.