New Benchmark Reveals MCP Attacks Are Worryingly Easy
Attacks
MCPSecBench pulls the curtain back on the Model Context Protocol and the view is uncomfortable. The authors create a repeatable playground and run 17 real attack types across MCP clients, transports, servers and prompts. The headline: more than 85 percent of attacks succeed on at least one platform, and four protocol or implementation attacks succeed everywhere.
Concerning specifics include schema inconsistencies, a known vulnerable client (CVE-2025-6514), and DNS rebinding and man-in-the-middle attacks that achieved universal success against Claude, OpenAI and Cursor. Prompt injection defenses are inconsistent: Claude refused all tested injections, OpenAI refused about a third, and Cursor refused none. Tool-related failures are practical too: slash-command overlap leaked credentials on Cursor, while tool shadowing, sandbox escape and tool poisoning worked repeatedly on some hosts.
Why it matters
This research shows MCP adds attack surfaces beyond prompts. Attackers can exploit clients, the transport layer and servers as easily as feeding bad prompts. That turns a component meant to make agents useful into a high-risk integration point for data leaks and operational compromise. If you deploy agents without testing the full MCP stack, you are running a live experiment on your production systems.
What to do next
- Run MCPSecBench in a lab before production.
- Patch or replace vulnerable clients.
- Enforce schema validation and strict type checking.
- Use strong transport protections such as mTLS and DNS rebinding mitigations.
- Apply least privilege to tools and sandbox them.
- Add runtime monitoring for unusual tool calls.
- Bake MCP tests into CI.

In short, treat MCP like any other networked service: assume it will be attacked and verify your defenses.
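Several of these mitigations are mechanical. As one illustration, here is a minimal sketch of Host-header allowlisting, a common DNS rebinding defence for locally bound servers. The hostnames, port and handler are assumptions for illustration, not details from the paper.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed local origin for the server; adjust to your deployment.
ALLOWED_HOSTS = {"localhost:8080", "127.0.0.1:8080"}

def host_allowed(host_header):
    """DNS rebinding points an attacker-controlled domain at 127.0.0.1,
    so the browser sends the attacker's hostname in the Host header.
    Rejecting anything outside an explicit allowlist breaks the attack."""
    return host_header is not None and host_header.lower() in ALLOWED_HOSTS

class GuardedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if not host_allowed(self.headers.get("Host")):
            self.send_error(403, "Forbidden: unexpected Host header")
            return
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

# To serve: HTTPServer(("127.0.0.1", 8080), GuardedHandler).serve_forever()
```

The same check belongs in any local MCP transport endpoint that a browser could be tricked into reaching.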
Additional analysis of the original arXiv paper
📋 Original Paper Title and Abstract
MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols
🔍 ShortSpan Analysis of the Paper
Problem
This paper analyses security risks introduced by the Model Context Protocol (MCP), an open standard that connects large language models to external tools and data. Because MCP expands the attack surface beyond prompts to clients, transport and servers, MCP-powered agents face new vectors for data leakage, unauthorised actions and real-world harm. The authors argue a systematic taxonomy and repeatable benchmark are needed to evaluate these risks.
Approach
The authors present a formal taxonomy of MCP security with 17 attack types across four primary attack surfaces: user interaction, MCP client, MCP transport and MCP server. They implement MCPSecBench, a modular benchmark and playground that integrates a prompt dataset, example MCP hosts, vulnerable and malicious MCP servers, an example vulnerable client (mcp-remote with CVE-2025-6514), and transport attacks such as Man-in-the-Middle and DNS rebinding. They evaluate each attack 15 times on three MCP hosts: Claude Desktop, OpenAI (GPT-4.1) and Cursor, reporting Attack Success Rate and Refusal Rate.
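The two reported metrics are simple ratios over the 15 trials per attack. A minimal sketch of how they could be computed; the outcome labels are an assumption for illustration, not the paper's exact encoding:

```python
from collections import Counter

def attack_metrics(outcomes):
    """Compute (Attack Success Rate, Refusal Rate) from trial outcomes.

    Each outcome is one of "success", "refused" or "failed"
    (labels assumed, not taken from the paper).
    """
    counts = Counter(outcomes)
    n = len(outcomes)
    return counts["success"] / n, counts["refused"] / n

# e.g. 5 successes and 10 refusals across 15 trials of one attack
asr, rr = attack_metrics(["success"] * 5 + ["refused"] * 10)
# asr == 1/3, rr == 2/3
```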
Key Findings
- Over 85% of the 17 attack types succeeded at compromising at least one platform.
- Four protocol/implementation attacks (Schema inconsistencies, Vulnerable client, MCP rebinding, Man-in-the-Middle) had 100% success across all hosts.
- Prompt injection defences varied: Claude refused all tested injections (0% ASR, 100% refusal), OpenAI refused 33.3% and Cursor never refused.
- Tool misuse showed 40–53% ASR; slash-command overlap in Cursor leaked credentials with 100% ASR.
- Tool shadowing, sandbox escape and tool poisoning repeatedly succeeded (100% ASR on some hosts); tool shadowing success varied: Claude 100%, OpenAI 80%, Cursor 26.7%.
- Vulnerable server exploits (e.g. path traversal) affected Claude and OpenAI universally; Cursor showed occasional failures due to workspace limits.
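The path traversal finding is the classic missing-canonicalisation bug: the server joins a client-supplied path onto its root without checking where the result lands. A hedged sketch of the server-side check that blocks it; the root directory is an assumption, not the paper's test setup:

```python
from pathlib import Path

BASE_DIR = Path("/srv/mcp-files").resolve()  # assumed served root

def safe_resolve(user_path):
    """Resolve a client-supplied path and refuse anything that escapes
    the served root, blocking traversal like '../../etc/passwd'."""
    candidate = (BASE_DIR / user_path).resolve()
    if candidate != BASE_DIR and BASE_DIR not in candidate.parents:
        raise ValueError(f"path escapes served root: {user_path}")
    return candidate
```

Resolving first and then comparing against the canonical root is the key: a naive string prefix check on the unresolved path is still bypassable with `..` segments.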
Limitations
Evaluation covered three MCP hosts and a constructed set of servers and clients; broader ecosystem coverage is not reported. Mitigation and detection strategies were not evaluated in this work and are left for future extension. Some experimental details, such as full infrastructure specifications, are only partially reported.
Why It Matters
MCPSecBench exposes widespread, practical vulnerabilities in MCP deployments and provides a standardised, extensible framework for reproducing and comparing attacks across MCP layers. The findings show that attackers can exploit clients, transports and servers as readily as prompts, underlining the need for systematic security testing, hardened client/server implementations and transport protections to prevent data exfiltration and operational compromise.