Study Reveals Major Security Flaws in MCP Ecosystem
The Model Context Protocol (MCP) aims to let Large Language Models (LLMs) call external tools via structured metadata. In plain terms, it wires models to small services that do things for them. That plumbing has become a lively ecosystem of hosts, registries and community servers. The paper under review is the first systematic security analysis of that ecosystem, and the findings are a useful cold shower.
What the researchers did
The team decomposes MCP into three components: hosts that run the model and invoke tools, registries that list servers, and the servers themselves, which implement tool behaviour. They instrument four MCP hosts (Cursor, Windsurf, Claude Desktop and Cline), test three LLMs (GPT-4o, Claude Sonnet 4 and Gemini 2.5 Pro), and crawl six public registries. The dataset includes 67,057 servers, many linked to GitHub repositories and Python-based tool implementations. Experiments are controlled: the authors performed no live attacks on public registries and responsibly disclosed issues to affected parties.
Key findings
First, hosts largely lack output verification. That means the host translates an LLM output into a tool call without robust checks. A malicious server can craft descriptions or responses that steer the model, or prompt it to reveal secrets. The paper demonstrates three poisoning patterns: abuse of built-in tools, extraction of request information, and misuse of server-provided tools.
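To make the gap concrete, here is a minimal sketch of the kind of output verification a host could add before executing a tool call. Everything in it is illustrative: MCP hosts expose no standard hook like `verify_tool_call`, and the schemas, allowlist and secret patterns are assumptions, not part of the protocol.

```python
import json
import re

# Hypothetical allowlist of tool schemas the host learned at configuration
# time. Nothing like this is mandated by MCP today; that is the paper's point.
TOOL_SCHEMAS = {
    "read_file": {"params": {"path": str}, "allowed": True},
    "run_shell": {"params": {"cmd": str}, "allowed": False},  # denied by policy
}

# Crude screen for credential-shaped strings leaving via tool arguments.
SECRET_PATTERN = re.compile(r"ghp_[A-Za-z0-9]{36}|AKIA[0-9A-Z]{16}")

def verify_tool_call(raw_llm_output: str) -> dict:
    """Validate an LLM-proposed tool call before the host executes it."""
    call = json.loads(raw_llm_output)  # non-JSON output is rejected outright
    schema = TOOL_SCHEMAS.get(call.get("tool"))
    if schema is None or not schema["allowed"]:
        raise PermissionError(f"tool {call.get('tool')!r} is not on the allowlist")
    for name, value in call.get("params", {}).items():
        expected = schema["params"].get(name)
        if expected is None or not isinstance(value, expected):
            raise ValueError(f"unexpected parameter {name!r}")
        if isinstance(value, str) and SECRET_PATTERN.search(value):
            raise ValueError(f"parameter {name!r} looks like a credential")
    return call

print(verify_tool_call('{"tool": "read_file", "params": {"path": "notes.txt"}}'))
```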
Second, tool confusion and context problems are real. Identical tool names across servers can cause the wrong tool to be invoked. Cursor, for example, tends to invoke the first listed tool, creating a predictable attack path. Tools removed from the registry can still linger in interaction history, producing what the authors call context dangling and unpredictable calls.
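Both failure modes are addressable with bookkeeping on the host side. The sketch below (names and structure are hypothetical, not drawn from the paper or the MCP specification) namespaces tools by server so identical names cannot collide, and prunes calls to removed tools out of the context before the next model turn.

```python
class ToolRegistry:
    def __init__(self):
        self._tools = {}  # fully qualified name -> callable

    def register(self, server: str, tool: str, fn):
        # Qualify every tool with its server so identical names cannot collide;
        # a bare "search" from two servers becomes "serverA/search" and so on.
        self._tools[f"{server}/{tool}"] = fn

    def unregister_server(self, server: str):
        # Drop every tool a removed server contributed.
        self._tools = {k: v for k, v in self._tools.items()
                       if not k.startswith(f"{server}/")}

    def prune_context(self, history: list[dict]) -> list[dict]:
        # Rewrite the interaction history so the model never sees tools that
        # no longer exist -- the context dangling case the paper describes.
        return [turn for turn in history
                if turn.get("tool") is None or turn["tool"] in self._tools]

reg = ToolRegistry()
reg.register("files-server", "search", lambda q: f"searching files for {q}")
reg.register("web-server", "search", lambda q: f"searching web for {q}")
reg.unregister_server("web-server")
history = [{"role": "tool", "tool": "web-server/search", "result": "..."}]
print(reg.prune_context(history))  # [] -- the stale call is gone
```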
Third, registries are weakly governed. The crawled registries contain invalid links, empty content and many repositories with inadequate metadata. The authors find concrete operational risks: leaked tokens in example configurations, 212 hijackable GitHub accounts and 304 hijackable redirected accounts, and affix squatting on npm packages. They also surface proof-of-concept malicious content, including fourteen malicious tool descriptions and two abnormal error messages in Python servers.
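The token-leak finding in particular is easy to screen for. Below is a minimal sketch of a registry-side hygiene check; the two regexes cover common GitHub token formats and are illustrative only, and production scanners such as gitleaks or trufflehog cast a far wider net.

```python
import re

# Two common GitHub token shapes; real secret scanners use many more rules.
TOKEN_PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{36}"),           # classic personal access token
    re.compile(r"github_pat_[A-Za-z0-9_]{22,}"),  # fine-grained PAT (approximate)
]

def scan_config(text: str) -> list[str]:
    """Return any strings in a server configuration example that look like tokens."""
    return [hit for pattern in TOKEN_PATTERNS for hit in pattern.findall(text)]

example = '{"env": {"GITHUB_TOKEN": "ghp_' + "a" * 36 + '"}}'
print(scan_config(example))  # flags the embedded token before it is published
```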
Model behaviour varies. GPT-4o is generally more conservative; Claude Sonnet 4 and Gemini 2.5 Pro are more willing to follow crafted instructions. That variability matters because hosts and registries cannot assume uniform model conservatism.
Why this matters is straightforward. MCP is already being used to extend LLM capabilities in productivity tools and integrations. If servers can manipulate a model or be hijacked en masse, the outcome is data leakage, unauthorised actions, or worse. The ecosystem's openness is a feature for innovation, but it is also an attack surface.
The authors propose practical mitigations: enforce output verification and strict tool-call boundaries at hosts, require provenance and signing for registry submissions, sandbox tool invocations, apply least privilege to server credentials, and monitor and audit behaviour. These are sensible, implementable measures; what is less clear is who will pay to deploy them across dozens of small registries and hundreds of host vendors.
Two concrete actions readers can take right now:
- For host operators: implement output verification and explicit tool-call boundaries, sandbox server calls and enforce least privilege for credentials.
- For security teams using MCP servers: treat registries as untrusted sources, verify server links and repos before installation, and require signed server releases or provenance checks (a minimal hash-pinning sketch follows this list).
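For the signed-release point, a provenance check can start as simply as pinning a digest at review time and refusing to install anything that does not match. The sketch below is deliberately bare, and every value in it is a placeholder: real provenance would use signed attestations (for example Sigstore) rather than a hand-pinned hash.

```python
import hashlib
import hmac

# Digest pinned when the release was reviewed; a placeholder, not a real value.
PINNED_SHA256 = hashlib.sha256(b"reviewed release bytes").hexdigest()

def verify_release(artifact: bytes) -> bool:
    """Refuse installation unless the artifact matches the pinned digest."""
    digest = hashlib.sha256(artifact).hexdigest()
    return hmac.compare_digest(digest, PINNED_SHA256)  # constant-time compare

print(verify_release(b"reviewed release bytes"))   # True: safe to install
print(verify_release(b"tampered release bytes"))   # False: reject
```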
Additional analysis of the original arXiv paper
📋 Original Paper Title and Abstract
Toward Understanding Security Issues in the Model Context Protocol Ecosystem
🔍 ShortSpan Analysis of the Paper
Problem
The Model Context Protocol (MCP) is an open standard that lets AI-powered applications interact with external tools through structured metadata. The MCP ecosystem has rapidly grown to include MCP hosts such as Cursor, Windsurf, Claude Desktop and Cline; registries such as mcp.so, MCP Market, MCP Store, Pulse MCP, Smithery and npm; and thousands of community-contributed MCP servers. Despite its momentum there has been little systematic study of its architecture and security risks. This paper provides the first comprehensive security analysis of the MCP ecosystem by decomposing it into three core components, hosts, registries and servers, and examining their interactions and trust relationships. Users search registries for servers and configure them in the host, which translates LLM-generated outputs into external tool invocations provided by servers and executes them. The qualitative analysis reveals that hosts lack output verification for LLM outputs, enabling malicious servers to steer model behaviour and trigger security threats, including sensitive data exfiltration. The study also uncovers numerous vulnerabilities that allow attackers to hijack servers, owing to the absence of a vetted server submission process in registries. To support the analysis the authors collected and analysed a dataset of 67,057 servers from six public registries. The quantitative analysis shows that a substantial number of servers are susceptible to hijacking. The work concludes with practical defence strategies for MCP hosts, registries and users, and the findings were responsibly disclosed to affected parties.
Approach
The study adopts a mixed qualitative and quantitative methodology. It analyses MCP hosts to reveal a lack of output verification and investigates the three-stage attack surface comprising registry-level attacks and post-integration attacks. Data were gathered from MCP registries to quantify risks, including server metadata extraction and analysis of hosting platforms such as GitHub. The team installed four MCP hosts (Cursor, Windsurf, Claude Desktop and Cline) and conducted experiments using three LLMs (GPT-4o, Claude Sonnet 4 and Gemini 2.5 Pro) across multiple runs. They crawled six registries selected from mastra: four decentralised registries (mcp.so, MCP Market, MCP Store and Pulse MCP), the centralised registry Smithery, and npm as a representative central registry. They collected 67,057 MCP servers, with 52,102 of the server file links provided by GitHub repositories and 44,549 tools extracted from Python-based servers. Logs from the model provider OpenAI captured requests and outputs to analyse how models interact with host-controlled tool invocations. The work states that all experiments were performed in a controlled manner without submitting servers to public registries and that no real-world attacks were launched; disclosure to affected parties was conducted and the authors plan to open-source code and data.
Key Findings
- Hosts lack output verification, enabling malicious servers to influence model behaviour and potentially exfiltrate sensitive data.
- Tool confusion occurs when identical tool names across different servers lead the host to invoke the wrong tool; Cursor exhibits a bias by always invoking the first listed tool, while the other hosts do not show this behaviour.
- Context-dangling risks arise when a tool that was previously available is removed but remains in the context history, causing unreliable or unintended invocations; Windsurf shows the most robust handling while Cline shows the weakest.
- Tool descriptions and metadata can be manipulated by attackers to prompt the model into performing unintended actions or leaking information; three poisoning attack types are identified: abuse of built-in tools, extraction of request information, and abuse of server-provided tools.
- Poisoning success rates vary by model; built-in tools are more reliably exploited; GPT-4o tends to be more conservative whereas Claude Sonnet 4 and Gemini 2.5 Pro frequently follow crafted instructions; tool shadowing and parameter manipulation are demonstrated.
- Fourteen malicious tool descriptions and two abnormal returned error messages were found in Python-based MCP servers, demonstrating proof-of-concept attacks.
- Registry-level risks are significant: decentralised registries show invalid links, empty content and missing README files; centralised registries such as Smithery and npm exhibit incomplete server data and potential policy gaps; registry-level token leakage is observed, with GitHub tokens appearing in server configuration examples.
- Large-scale indicators include 212 hijackable GitHub accounts and 304 hijackable redirected accounts; redirection hijacking can mislead users to attacker-controlled repositories (a link-validation sketch follows this list); affix squatting on npm shows many groups of packages with identical names differing only by affixes, most maintained by different developers.
- The study highlights defence recommendations across hosts, registries and users, including output verification, tool-based checks, server signing and provenance, sandboxing, least privilege, auditing, and user vigilance during server configuration and link validation.
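As referenced in the redirection-hijacking bullet above, here is a sketch of the link validation registries could run periodically. It resolves each listed GitHub URL and flags entries that 404 or silently redirect, since a redirect means the original repository name was freed and can be re-registered by an attacker. The function name and classification policy are assumptions, not the paper's tooling.

```python
import urllib.error
import urllib.request

def check_repo_link(url: str) -> str:
    """Classify a registry's repository link as ok, broken, or redirected."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            final = resp.geturl()  # urlopen follows redirects; this is the end URL
    except urllib.error.HTTPError as exc:
        return f"broken ({exc.code})"
    except urllib.error.URLError as exc:
        return f"unreachable ({exc.reason})"
    if final.rstrip("/") != url.rstrip("/"):
        # A renamed or transferred repo redirects; the old name is then free
        # for an attacker to re-register -- the hijack the paper measures.
        return f"redirects to {final}"
    return "ok"

print(check_repo_link("https://github.com/python/cpython"))  # expect "ok"
```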
Limitations
The research has limitations, including the focus on four MCP hosts and on Python-based servers for tool extraction, which may not capture the full breadth of the MCP ecosystem. The analysis of attacks is conducted in controlled settings and does not involve live exploitation of public registries. Detection of malicious tool descriptions and return messages is conservative and may undercount threats. The dataset covers six registries crawled in a specific window between late June and early July 2025, and the findings may not generalise to all future MCP deployments. The work also relies on publicly available metadata and does not assess operational mitigations deployed by registry administrators in all environments.
Why It Matters
The work demonstrates that an open AI-augmented tool ecosystem can be vulnerable to data exfiltration and manipulation of AI actions when external tools are not validated or trusted. The findings have practical security implications for the deployment of MCP-powered applications across industries; if a large fraction of servers are compromised there could be widespread privacy and security harms. The report emphasises pragmatic mitigations, including enforcing output verification and tool-call boundaries at hosts, vetting and signing servers in registries with provenance models, sandboxing tool invocations, enforcing least privilege, and robust monitoring and auditing. It also recommends that registries periodically validate metadata and server links and require sanitised configuration data to avoid token leakage, and encourages end users to review server configurations for legitimacy. Overall the study offers crucial insights for the cyber security community into the risks of open AI-augmented tool ecosystems and underscores the need for verifiable and auditable tool invocation practices to protect user data and system integrity.