Auditing MCP agents for over-privileged tool access
Agents
Connect a Large Language Model (LLM) to external tools and you inherit whatever privileges those tools expose. The Model Context Protocol (MCP) makes that wiring easier, but it also standardises the path to over-privilege if servers hand out file access, network reach, or command execution without restraint. Most teams do not have a clean way to see what an MCP server really enables before it ships.
What the toolkit does
The paper introduces mcp-sec-audit, a protocol-aware auditor for MCP servers. It has two parts. The static path scans Python code and MCP metadata using a rulebook of indicators for capability families such as command execution, file write, and network operations. The dynamic path runs the target server inside Docker, fuzzes MCP calls, and uses eBPF to capture kernel-level behaviour such as syscalls, file I/O and network traffic. Findings are aggregated into a weighted risk score with mapped mitigation suggestions.
On a synthesised malicious tool, the static scan detected command execution, file I/O and network operations with confidence scores between 0.65 and 0.85, produced a medium risk rating of 42.5 out of 100, and finished in under two seconds. On the MCPTox benchmark of 45 real-world MCP servers, the tool reported 663 capability instances across 491 samples and flagged 367 of those samples, or 74.7 percent. Where metadata indicators existed, detection was 100 percent. Risk was mostly low (92.3 percent) with 7.7 percent medium; the average score was 9.96 and the maximum 49.05. In a lab of nine intentionally vulnerable servers, static analysis detected all Python cases but missed all JavaScript ones. Dynamic sandboxing detected all nine, with an average risk score of 61.4 out of 100, scoring 36.2 points higher than static analysis on average. The server with explicit remote code execution scored highest at 65.3.
So what for security teams
The headline is that dynamic sandboxing provides the meaningful coverage. It is language-agnostic and sees what actually runs. The cost is operational: you need Linux hosts with eBPF, privileged Docker, and the patience to build images. Static analysis is quick and cheap triage for Python-first environments, but it will miss behaviour hidden behind indirection and anything not in Python.
There are material limitations. The static engine is pattern-based and prone to both false positives and false negatives. There is no JavaScript or TypeScript AST analysis yet. Dynamic runs require manual image builds, and the output maps to generic hardening tips, not deployment-ready controls like seccomp profiles or policy templates. Integration into CI/CD and richer policy packs is listed as future work.
Commercially, this looks useful as a pre-deployment gate for MCP-backed agents. Think of it as a capability inventory with a nudge towards least privilege. It will not replace proper design, allow-listing, and human review, but it can surface obvious foot-guns before a model hits production. If you run mixed-language MCP servers, plan for the dynamic path on a hardened runner. Use the findings to enforce simple allow-lists, restrict file paths, block command execution unless truly necessary, and segment egress. The mitigation guidance is intentionally generic, so expect to translate it into your own policies.
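Translating the generic mitigation guidance into policy can start very simply. The sketch below is one possible shape for an allow-list gate in front of MCP tool calls: deny any tool not explicitly approved, and confine file arguments to an approved root. The tool names and data root are hypothetical; this is not part of mcp-sec-audit.

```python
from pathlib import Path

# Hypothetical policy: which tools an agent may call, and where it may read.
ALLOWED_TOOLS = {"read_file", "search_docs"}
ALLOWED_ROOT = Path("/srv/agent-data")

def authorise(tool: str, args: dict) -> bool:
    """Allow a tool call only if the tool is allow-listed and any path
    argument stays inside the approved root."""
    if tool not in ALLOWED_TOOLS:
        return False
    path = args.get("path")
    if path is not None:
        resolved = (ALLOWED_ROOT / path).resolve()
        # Reject traversal outside the root, e.g. "../../etc/passwd".
        if not resolved.is_relative_to(ALLOWED_ROOT.resolve()):
            return False
    return True

print(authorise("read_file", {"path": "notes.txt"}))         # True
print(authorise("run_command", {"cmd": "id"}))               # False
print(authorise("read_file", {"path": "../../etc/passwd"}))  # False
```

Default-deny plus path confinement covers two of the capability families the auditor reports on (command execution and file access); egress segmentation needs network-level controls rather than application code.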
Watch this space. If the authors add CI integration, JavaScript coverage, and reusable policy templates, this moves from handy lab tool to baseline control for anyone standardising on MCP. Until then, it is a pragmatic way to see what your agents might actually do, rather than what the README claims.
Additional analysis of the original arXiv paper
📋 Original Paper Title and Abstract
Auditing MCP Servers for Over-Privileged Tool Capabilities
🔍 ShortSpan Analysis of the Paper
Problem
The paper studies security risks introduced when Large Language Models use the Model Context Protocol (MCP) to access external tools and data. MCP servers can expose privileged capabilities such as filesystem access, network requests and command execution; if these are over‑privileged or undocumented they create attack surfaces for misuse by models or adversaries. Existing static security tools do not reason about MCP-specific artefacts like tool metadata and therefore cannot reliably report exposed runtime capabilities before deployment.
Approach
The authors present mcp-sec-audit, an extensible auditing toolkit that combines protocol‑aware static analysis with optional sandboxed dynamic verification. The static pipeline uses a TOML rulebook of keywords and regular expressions to scan Python source and MCP metadata for indicators of capability families (for example command_exec, file_write). The dynamic pipeline runs the target in an isolated Docker sandbox while a protocol fuzzer injects adversarial payloads and an eBPF monitor captures kernel‑level telemetry (syscalls, file I/O, network) which is serialized into event logs for behavioural reconstruction. A risk scorer aggregates findings into a weighted risk level and a mitigation engine maps detected capabilities to deployment hardening suggestions. The implementation is Python‑based, plugin driven, requires Python 3.10+ and Docker, and expects eBPF support for dynamic monitoring.
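The risk scorer described above can be sketched as a weighted aggregation over (capability, confidence) findings. The weights and level thresholds below are assumptions chosen for illustration; the paper does not publish the toolkit's exact scoring scheme.

```python
# Hypothetical per-capability weights; the toolkit's real scheme may differ.
CAPABILITY_WEIGHTS = {"command_exec": 50, "file_write": 30, "network": 20}

def risk_score(findings: list[dict]) -> tuple[float, str]:
    """Aggregate findings into a 0-100 score and a coarse risk level.
    Each finding carries a capability family and a detection confidence."""
    score = 0.0
    for f in findings:
        # Unknown capabilities get a small default weight.
        score += CAPABILITY_WEIGHTS.get(f["capability"], 10) * f["confidence"]
    score = min(score, 100.0)
    level = "LOW" if score < 25 else "MEDIUM" if score < 60 else "HIGH"
    return score, level

findings = [
    {"capability": "command_exec", "confidence": 0.85},
    {"capability": "network", "confidence": 0.65},
]
print(risk_score(findings))  # (55.5, 'MEDIUM')
```

A scheme of this shape explains the reported behaviour: a single high-confidence command-execution hit alone already lands in the MEDIUM band, while metadata-only indicators with low weights keep most real-world samples in LOW.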
Key Findings
- Static analysis can rapidly detect explicit code patterns: on a synthesized malicious tool it identified command execution, file I/O and network operations with confidence scores 0.65–0.85, produced a MEDIUM risk rating (42.5/100), five mitigation recommendations and completed in under two seconds without Docker.
- On the MCPTox benchmark of 45 real‑world MCP servers the tool identified 663 capability instances across 491 samples (average 1.35 capabilities per sample). When metadata indicators were present detection was 100%, and overall 367 of 491 samples were flagged (74.7%). Risk distribution reported 92.3% LOW and 7.7% MEDIUM with average score 9.96 and maximum 49.05.
- In a vulnerable‑servers lab of nine intentionally flawed MCP servers static analysis detected all Python cases (2/2) but missed JavaScript servers (0/7) due to current language limitations. Dynamic sandboxing achieved language‑agnostic coverage, detecting all 9 servers with an average risk score of 61.4/100 and producing higher scores than static analysis (average increase +36.2 points). The server with explicit remote code execution scored highest (65.3).
Limitations
Current static detection is pattern‑based and susceptible to false positives and false negatives; it supports Python and JSON metadata but lacks JavaScript/TypeScript AST analysis. Dynamic analysis requires Linux hosts with eBPF and privileged Docker containers and manual image builds. Mitigation advice is generic and not output as deployment‑ready policies or fine‑grained seccomp profiles. Integration with CI/CD and richer policy templates remains future work.
Why It Matters
mcp-sec-audit provides a focused pre‑deployment audit that maps code and metadata indicators to concrete capability risks and hardening recommendations, helping reduce over‑privileged MCP tool deployments. Combining static and dynamic streams improves coverage and reveals runtime behaviour that static rules miss, which is especially relevant for heterogeneous implementations. For security teams this enables repeatable assessments, gating in CI pipelines and prioritised mitigation before exposing models to potentially dangerous tool capabilities.