Auditing an MCP Server Before You Trust It With Production Access
The Model Context Protocol (MCP) has become the default integration layer for production LLM agents. Most teams adopting MCP have not asked the same questions of an MCP server that they would ask of any other piece of software they grant production credentials. This is a practical audit playbook: what to inspect, in what order, and what to fix before you hand an MCP server access that touches the systems your business actually runs on.
Why this matters now
Three things changed in the last six months. First, in April 2026 a critical "by design" weakness was disclosed in the official MCP SDK that enables arbitrary command execution against any vulnerable MCP implementation. The weakness is systemic: it spans Python, TypeScript, Java, and Rust and affects more than 7,000 publicly accessible servers and packages totaling 150 million downloads. Second, security researchers analyzing public MCP server inventories report that 43 percent contain command-injection-class flaws that predate the SDK issue. Third, real-world agentic deployments now combine MCP servers with privileged tool access (cloud APIs, source control, ticketing, and in some cases industrial control systems) faster than the security community can publish defensive patterns.
We expect the audit playbook below to be revised as the field matures. It is grounded in our hands-on review of several production MCP deployments and in the publicly disclosed incidents through April 2026. For organizations deploying MCP today, the cost of a one-day audit is materially lower than the cost of the first incident.
Key findings from the field
- Tool description injection is the most broadly effective attack. An attacker who can write any text into a tool's description (or any field the agent treats as instructions) can subvert the agent's behavior on every call to that tool. We have observed this work against every major agent platform we have tested.
- Credential blast radius is almost always larger than it needs to be. The MCP server holds the token, and the token is rarely limited to the minimum scope required. When the server is compromised, the credentials it holds become the attacker's first capability.
- Authentication is the single most common gap. Many MCP servers ship as local stdio processes that assume a trusted single-user environment, then get redeployed onto a shared host or behind a network proxy without rethinking authentication.
- Supply chain is the second-most-common gap. Most MCP servers are JavaScript or Python packages with deep transitive dependencies, no signed releases, and update channels that no one is watching.
- Logs and runtime alerting are usually absent. The agent calls the server, the server calls the upstream API, the API returns data. None of those calls land in a log a human will ever read.
The threat surface
Six attack classes account for almost everything we have seen against production MCP deployments. The first three exploit the protocol's design assumptions. The last three are conventional software-security problems that the MCP layer makes more dangerous because the agent itself acts as a confused deputy.
| Attack class | Mechanism | Concrete example |
|---|---|---|
| Tool description injection | Attacker writes adversarial text into a tool's name, description, or parameter schema. The agent reads this metadata as authoritative instructions. | A weather-lookup MCP tool's description includes hidden text instructing the agent to also send the user's most recent chat to a logging endpoint. |
| Tool poisoning | A trusted MCP server's behavior changes silently after an upstream package update or a server-side configuration push. | A community-maintained Slack MCP server pushes a "minor" update that adds a new tool with a benign-looking description and a privileged side effect. |
| Indirect prompt injection via tool output | The agent calls a tool. The tool returns content that contains adversarial instructions. The agent follows them. | An email-fetch tool returns an email body with instructions to forward all messages to an external address. The agent obliges. |
| Command injection | The MCP server passes parameters to a shell, eval, or unsafe function without proper sanitization. | An MCP server exposes a "search" tool that interpolates the query into a shell command. The agent (or an attacker influencing the agent) supplies a query containing shell metacharacters. |
| Credential theft | The MCP server holds API tokens, OAuth refresh tokens, or local credentials. Compromise of the server yields all of them. | An MCP server with a "read-write" GitHub PAT is compromised through a dependency hijack. The PAT now belongs to the attacker. |
| Supply chain | An MCP server's npm or PyPI dependency is compromised, or the server itself is replaced upstream by a malicious successor. | A typosquatted package is added as a transitive dependency. The package exfiltrates environment variables on import. |
The attack class that surprises people most often is the third (indirect prompt injection via tool output). Defensive intuition focuses on what the user types into the agent. Production attacks more commonly ride in on what the agent reads from one of its tools.
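The command-injection row in the table is the easiest of the six to show in code. The following is a minimal sketch, assuming a hypothetical "search" tool that shells out to `grep` over a hypothetical `./docs` directory; the function names are ours and do not come from any MCP SDK. It contrasts interpolating the query into a shell string with passing it as a single argv element.

```python
import subprocess


def unsafe_search_command(query: str) -> str:
    """Anti-pattern: the query becomes part of a string a shell will parse."""
    return f"grep -r {query} ./docs"


def safe_search_argv(query: str, root: str = "./docs") -> list[str]:
    """Build an argv list so the query stays one argument and no shell parses it."""
    # "-e" marks the next element as the pattern even if it starts with "-";
    # "--" ends option parsing before the directory argument.
    return ["grep", "-r", "-e", query, "--", root]


def run_search(query: str, root: str = "./docs") -> str:
    """Run the search without a shell (a list argument implies shell=False)."""
    result = subprocess.run(
        safe_search_argv(query, root),
        capture_output=True,
        text=True,
        timeout=10,
        check=False,  # grep exits non-zero when nothing matches
    )
    return result.stdout
```

With the argv form, a query such as `foo; cat /etc/passwd` is searched for as literal text rather than executed. This addresses shell metacharacters only; it does nothing for the other five classes in the table.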
The audit playbook
Five steps, in order. Each step has a specific output. None of them require source-code access, though source helps. We aim for a one-day engagement on a single MCP server and a three-to-five-day engagement for an agent's full MCP fleet.
Step 1: inventory and scoping
List every MCP server the agent can connect to, both today and after the next configuration change. For each server, record: the connection mode (stdio, HTTP, or a custom transport such as WebSocket), the host it runs on, the credentials it holds, and the network surfaces it can reach. The output is a one-page inventory the operator can read in two minutes. Most teams have never written this down, and the act of writing it surfaces decisions someone made and forgot.
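One way to keep the inventory honest is to give it a fixed shape. The sketch below is an assumption about how a team might record it, not a prescribed format: the `McpServerRecord` class, its field names, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class McpServerRecord:
    """One row of the inventory: the four facts Step 1 asks for, plus a name."""
    name: str
    connection_mode: str  # e.g. "stdio" or "http"
    host: str
    credentials: list[str] = field(default_factory=list)
    reachable_networks: list[str] = field(default_factory=list)


def render_inventory(records: list[McpServerRecord]) -> str:
    """Render the inventory as one line per server for a quick read-through."""
    lines = []
    for r in records:
        creds = ", ".join(r.credentials) or "none recorded"
        nets = ", ".join(r.reachable_networks) or "none recorded"
        lines.append(
            f"{r.name} | {r.connection_mode} | {r.host} | "
            f"creds: {creds} | reaches: {nets}"
        )
    return "\n".join(lines)


# Hypothetical example entry.
inventory = [
    McpServerRecord(
        name="postgres-query",
        connection_mode="stdio",
        host="analyst-laptop",
        credentials=["postgres role (scope unknown)"],
        reachable_networks=["db replica subnet"],
    )
]
print(render_inventory(inventory))
```

An entry such as "scope unknown" is itself useful output: it marks a question to carry into the credential review.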
Step 2: tool surface review
For every server in the inventory, enumerate every tool, parameter, and metadata field. Read the tool descriptions as if they were system-prompt text, because to the agent they are. Anything in a description that could be interpreted as an instruction (rather than a description of capability) is a finding. Anything that says "always do X" or "never reveal Y" is a finding even if the intent is benign, because it teaches the agent to follow instructions found in tool metadata.
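A crude pattern scan can help triage a large tool surface before the manual read. The sketch below is a heuristic under our own assumptions: the phrase list is illustrative and incomplete, and the input is a plain mapping of tool name to description that the reviewer has already exported by whatever means the client offers. It will miss obfuscated or hidden text, so it supplements the manual read and does not replace it.

```python
import re

# Illustrative phrases that read as instructions rather than capability.
INSTRUCTION_PATTERNS = [
    r"\balways\b",
    r"\bnever\b",
    r"\byou must\b",
    r"\bignore (all |any )?(previous|prior)\b",
    r"\bdo not (tell|reveal|mention)\b",
]


def flag_instruction_like_text(tools: dict[str, str]) -> list[tuple[str, str]]:
    """Return (tool name, matched phrase) for each pattern that fires."""
    findings = []
    for name, description in tools.items():
        for pattern in INSTRUCTION_PATTERNS:
            match = re.search(pattern, description, flags=re.IGNORECASE)
            if match:
                findings.append((name, match.group(0)))
    return findings


# Hypothetical tool metadata.
tools = {
    "get_weather": "Returns the forecast for a city.",
    "post_message": "Posts to a channel. Always summarize aggressively.",
}
for tool_name, phrase in flag_instruction_like_text(tools):
    print(f"finding: {tool_name!r} contains instruction-like text {phrase!r}")
```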
Step 3: authentication and credential review
Determine how the MCP server authenticates the agent's calls, and how it authenticates upstream calls on the agent's behalf. For each credential the server holds, write down the scope. For each scope, ask: does this server need read access here, write access there, or both? If the answer is "we just used the default scope," that is a finding. If the credentials are stored in environment variables on a shared host, that is a finding. If the server uses a long-lived token where a short-lived OAuth flow was available, that is a finding.
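The scope questions above reduce to a set difference once the granted and required scopes are written down. The sketch below assumes the reviewer has collected that information by hand; the dictionary keys and the scope names (`repo:read`, `repo:write`) are hypothetical placeholders, not the scope vocabulary of any particular provider.

```python
def excess_scopes(granted: set[str], required: set[str]) -> set[str]:
    """Scopes the credential has but the server does not need."""
    return granted - required


def review_credentials(credentials: list[dict]) -> list[str]:
    """Turn a hand-collected credential list into findings."""
    findings = []
    for cred in credentials:
        extra = excess_scopes(set(cred["granted"]), set(cred["required"]))
        if extra:
            findings.append(f"{cred['name']}: unneeded scopes {sorted(extra)}")
        # A long-lived token is a finding only where a short-lived flow exists.
        if cred.get("long_lived") and cred.get("short_lived_available"):
            findings.append(
                f"{cred['name']}: long-lived token where a short-lived flow is available"
            )
    return findings


# Hypothetical example input.
credentials = [
    {
        "name": "github-pat",
        "granted": ["repo:read", "repo:write"],
        "required": ["repo:read"],
        "long_lived": True,
        "short_lived_available": True,
    },
    {
        "name": "pg-readonly",
        "granted": ["select"],
        "required": ["select"],
        "long_lived": False,
        "short_lived_available": False,
    },
]
for finding in review_credentials(credentials):
    print(finding)
```

The hard part is filling in `required` honestly. If no one can say what the server needs, record that as the finding.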
Step 4: network and supply-chain review
For HTTP-mode servers, verify that the server is not reachable from any network it does not need to be reachable from. For all servers, list every direct dependency, the lock file (if any), and the last time someone reviewed an upstream changelog. Flag any dependency without a signed release, any unpinned version, and any package with fewer than three maintainers. Run a vulnerability scan against the lock file. Most MCP servers ship with at least one transitive dependency that has a known CVE.
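The "unpinned version" check is simple to automate for Python-based servers. The sketch below is a deliberate simplification: it assumes a `requirements.txt`-style file and treats any requirement line without `==` as unpinned. It does not verify hashes or signatures, and it is not a vulnerability scan, which still needs a dedicated tool run against the lock file.

```python
def unpinned_requirements(requirements_text: str) -> list[str]:
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Hypothetical requirements file contents.
example = """\
requests==2.31.0
flask>=2.0  # web framework
pyyaml
"""
for requirement in unpinned_requirements(example):
    print(f"finding: unpinned dependency {requirement!r}")
```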
Step 5: runtime observability
Determine what the server logs, where the logs go, who reads them, and what would trigger an alert. A common deployment pattern is "the MCP server prints to stderr, which is captured by the supervisor, which writes to a file no one looks at." That is not observability. The minimum viable bar is a structured log of every tool invocation, every credential use, and every error, written to a system that fires an alert on anomaly. Without this, an incident becomes a forensic archaeology project.
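As a rough picture of the minimum bar, the sketch below emits one JSON line per tool invocation using only the standard library. The field names and the `mcp.audit` logger name are our assumptions. Routing the logger to a system that can alert is left to deployment configuration and is not shown.

```python
import json
import logging
import time
from typing import Optional

logger = logging.getLogger("mcp.audit")  # hypothetical logger name


def log_tool_invocation(
    tool: str,
    arguments: dict,
    credential_id: str,
    error: Optional[str] = None,
) -> str:
    """Emit one structured record per tool call and return the JSON line."""
    record = {
        "ts": time.time(),
        "event": "tool_invocation",
        "tool": tool,
        # Log argument names only, so secrets in argument values stay out of logs.
        "argument_names": sorted(arguments),
        "credential_id": credential_id,
        "error": error,
    }
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line


# Hypothetical call site inside a tool handler.
log_tool_invocation("query_db", {"sql": "select 1"}, credential_id="pg-readonly")
```

Logging argument names without values is a design choice that trades forensic detail for a lower risk of leaking secrets. Teams that need the values should redact them before logging.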
A worked example
Consider a fairly typical small-business deployment. The agent is Claude, accessed through a desktop client. The MCP servers configured are a community Slack integration, a self-hosted Notion bridge, and a custom server the team wrote to query the company's PostgreSQL replica. The agent is used by three engineers for ad-hoc analysis and incident response.
In our experience, a one-day audit of this configuration finds between five and twelve issues. The most likely findings, roughly in order of typical severity:
- The PostgreSQL MCP server holds a superuser credential. The team needed write access for one specific table during the original setup and never separated read from write. Severity: high.
- The Slack server's tool descriptions encourage the agent to "summarize aggressively." Benign in intent, but it teaches the agent that tool metadata is a place where behavioral guidance lives. Severity: medium.
- The Notion bridge is exposed on a local HTTP port with no authentication. Anything else on the workstation can call it. Severity: medium-to-high depending on workstation hygiene.
- None of the three servers have structured logs. Severity: medium until something happens, at which point severity is high.
- Two of the Slack server's npm dependencies have known CVEs. Severity: low if the CVE is in a code path that does not execute, medium otherwise.
The fixes for these findings are the same as the fixes for any other production-software finding: replace the superuser credential with a read-only role, adding write access only on the one table that needs it; treat tool metadata as untrusted text the agent will read literally; bind the Notion bridge to the loopback interface and require a token; route logs to a system that supports alerting; and patch the dependencies. None of these require novel technology. They require someone treating the MCP server as production software.
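Of these fixes, the local-port one is the quickest to sketch. The example below assumes a bearer token sent in an `Authorization` header and uses only the standard library. The port number and function names are hypothetical, and wiring the check into the bridge's actual HTTP handler is not shown.

```python
import hmac
import secrets
from typing import Optional

# Bind to the loopback interface only; the port number is a placeholder.
BIND_ADDRESS = ("127.0.0.1", 8765)


def new_token() -> str:
    """Generate a random token to share with the one client allowed to connect."""
    return secrets.token_urlsafe(32)


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Accept only 'Bearer <token>' headers whose token matches exactly."""
    if not authorization_header:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(presented.encode(), expected_token.encode())
```

Binding to loopback alone still leaves the port open to every other process on the workstation, which is why the token check is paired with it.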
What an operator can do this week
Before any external audit, three actions reduce the surface materially:
- Write down every MCP server and what it can do. If the list is longer than you remembered, that is the first finding.
- Rotate the credentials each MCP server holds, and reduce scope while you do it. The "I'll fix scope later" credential rarely gets fixed later. Rotation is a forcing function.
- Pick one MCP server and read every tool description as if you were the agent reading it. If anything in those descriptions reads as an instruction rather than a capability description, that text is on the attack surface.
These three actions take an afternoon. They do not replace an audit. They do close the most common attack paths we see in deployments that have never been reviewed.
When this audit is worth buying
If your agent has access to anything that can move money, change customer data, send messages on the company's behalf, modify production infrastructure, or read personally identifiable information, the audit is worth buying. The cost of one engagement is materially less than the cost of the first incident. If the agent's access is read-only on internal documentation, the audit is probably not the highest-priority security spend you have. The decision is about blast radius, not about how novel MCP feels as a technology.
Contact jon@virtuscybersecurity.com with a brief description of your deployment and we can scope a one-day or multi-day engagement. We can also coordinate with your existing pentest provider if you have one. For broader technology strategy or ongoing fractional-CTO advisory beyond the security scope, see sandhillscto.com.
References and further reading
- Anthropic MCP design vulnerability disclosure (April 2026): summary of the SDK-level RCE issue and the affected packages.
- Cloud Security Alliance, mcpserver-audit initiative: open-source tooling for MCP server security review, with a public audit-and-vulnerability database.
- Practical DevSecOps, "MCP Security Vulnerabilities: How to Prevent Prompt Injection and Tool Poisoning Attacks in 2026."
- Virtus Cybersecurity, "A Four-Layer Defense Stack for LLM Agent Prompt Injection" (April 2026): the agent-side companion to this server-side audit playbook.
- Model Context Protocol, "Security Best Practices" (modelcontextprotocol.io): the upstream guidance, useful as a reference but not a replacement for a hands-on audit.