Unsafe by Flow: Uncovering Bidirectional Data-Flow Risks in MCP Ecosystem
Xinyi Hou, Yanjie Zhao, Haoyu Wang
TLDR
MCP-BiFlow is a static analysis framework that uncovers bidirectional data-flow risks in Model Context Protocol (MCP) ecosystems.
Key contributions
- Introduces MCP-BiFlow, a bidirectional static analysis framework for Model Context Protocol (MCP).
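- Models unsafe data flows in both directions: requester-controlled arguments reaching sensitive operations, and untrusted or sensitive data surfacing through MCP-visible outputs.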
- Identifies 30 of 32 confirmed MCP vulnerability cases (93.8% recall), outperforming CodeQL, Semgrep, Snyk Code, and MCPScan.
- Scans 15,452 real-world MCP server repositories; manual review of 549 candidate clusters confirms 118 vulnerability paths in 87 servers.
Why it matters
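MCP has become the interface layer between LLM agents and external tools, but it introduces unsafe data flows in two directions that existing analyzers handle poorly. MCP-BiFlow's protocol-aware analysis recovers both request-side and return-side flows and finds 30 of 32 confirmed vulnerability cases, more than the general-purpose tools it was compared against. Better detection of these flaws makes LLM agent-tool ecosystems safer to deploy.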
Original Abstract
Model Context Protocol (MCP) servers have quickly become the interface layer between LLM agents and external tools, yet they also introduce unsafe data flows that existing analyzers handle poorly. Vulnerabilities manifest in two directions: requester-controlled arguments may propagate to sensitive operations, while untrusted external or sensitive internal data may surface through MCP-visible outputs and subsequently influence host or model behavior. Accurate detection is complicated by the heterogeneous registration and dispatch patterns MCP servers employ, the need for MCP-specific taint semantics, and the fact that bugs often only materialize along complete tool-scoped execution paths. We present MCP-BiFlow, a bidirectional static analysis framework built around MCP-aware entrypoint recovery, protocol-specific taint modeling, and interprocedural propagation analysis. Against a benchmark of 32 confirmed MCP vulnerability cases, MCP-BiFlow identifies 30 (93.8% recall), substantially outperforming CodeQL, Semgrep, Snyk Code, and MCPScan. Across 15,452 real-world MCP server repositories, MCP-BiFlow surfaces 549 overlap-compressed candidate clusters; manual review confirms 118 vulnerability paths in 87 servers, establishing unsafe propagation as a recurring failure mode that resists detection without protocol-aware recovery of both request-side and return-side flows.
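To make the two flow directions in the abstract concrete, here is a minimal sketch of source-to-sink reachability over a hand-written flow graph for a single tool. It is not MCP-BiFlow's implementation: the graph, the node names (`tool_arg:path`, `build_command`, `format_result`, and so on), and the `find_paths` helper are all hypothetical, and a real analyzer would derive such edges from the server's source code through entrypoint recovery and interprocedural analysis.

```python
from collections import deque

# Hypothetical flow graph for one tool: an edge A -> B means
# "data produced at A can reach B". All node names are illustrative.
FLOW_EDGES = {
    "tool_arg:path": ["build_command"],
    "build_command": ["subprocess.run"],
    "http_response": ["format_result"],
    "env:API_KEY": ["format_result"],
    "format_result": ["tool_return"],
}

# Request-side: requester-controlled arguments reaching sensitive operations.
REQUEST_SOURCES = {"tool_arg:path"}
REQUEST_SINKS = {"subprocess.run"}

# Return-side: untrusted external or sensitive internal data reaching
# the output that the host or model will see.
RETURN_SOURCES = {"http_response", "env:API_KEY"}
RETURN_SINKS = {"tool_return"}


def find_paths(edges, sources, sinks):
    """Return every cycle-free path from a source to a sink (breadth-first)."""
    paths = []
    for src in sorted(sources):
        queue = deque([[src]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                paths.append(path)
                continue
            for nxt in edges.get(node, []):
                if nxt not in path:  # skip nodes already on this path
                    queue.append(path + [nxt])
    return paths


request_side = find_paths(FLOW_EDGES, REQUEST_SOURCES, REQUEST_SINKS)
return_side = find_paths(FLOW_EDGES, RETURN_SOURCES, RETURN_SINKS)

for label, found in (("request-side", request_side), ("return-side", return_side)):
    for path in found:
        print(f"{label}: " + " -> ".join(path))
```

Running the same search twice with different source and sink sets is the point of the sketch: an analyzer that only models the first pair would miss both return-side paths in this toy graph.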