Tracing Agent Tool Flows: A Taint-Analysis View of Tool Calls

by Steven Jung

Applying Taint Analysis to Agent Tool Flows

As developers increasingly rely on AI agents and the Model Context Protocol (MCP), the security paradigm shifts. We face a familiar challenge in a new form: ensuring agents behave as expected when chaining tools together. This problem mirrors the classic data flow challenges of static code analysis. Instead of tracing variables through functions, we must now trace data through tool calls mediated by an LLM.

From Code Flows to Agent Flows

In traditional static analysis, taint analysis tracks how untrusted input or sensitive data propagates through a program. We can apply the same model to AI agents:

  • A tool that fetches private data is a source.
  • A tool that mutates state or exposes data is a sink.
  • The LLM, sitting between them, is an opaque transformation layer and a potential source of risk itself.

By categorizing tools, we can systematically reason about risk:

  • (P) = Private Data Source
  • (U) = Untrusted Data Source
  • (S) = State-Changing Sink

This allows us to define two primary risk templates (sketched in code below):

  • Data Leak Risk (P → S): A confidentiality failure where a tool reading private data is followed by one that exposes it.
  • Tamper Risk (U → S): An integrity failure where a tool consuming untrusted input is followed by one that mutates state.
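
To make the model concrete, here is a minimal sketch in Python of how the categories and risk templates might be encoded. The names (ToolCategory, RISK_TEMPLATES, classify_chain) are illustrative only; they are not part of MCP or any particular agent framework.

    from enum import Enum
    from typing import Optional


    class ToolCategory(Enum):
        PRIVATE_SOURCE = "P"        # reads private data
        UNTRUSTED_SOURCE = "U"      # reads data an attacker can influence
        STATE_CHANGING_SINK = "S"   # mutates state or exposes data


    # The two risk templates: a flow from a source category into a sink.
    RISK_TEMPLATES = {
        (ToolCategory.PRIVATE_SOURCE, ToolCategory.STATE_CHANGING_SINK): "data_leak",
        (ToolCategory.UNTRUSTED_SOURCE, ToolCategory.STATE_CHANGING_SINK): "tamper",
    }


    def classify_chain(source_cat: ToolCategory, sink_cat: ToolCategory) -> Optional[str]:
        """Return the risk type for a source -> sink tool chain, or None if benign."""
        return RISK_TEMPLATES.get((source_cat, sink_cat))

Any two-step chain whose source and sink categories match one of the templates is flagged; everything else is treated as benign by this deliberately coarse model.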

How We Identify Risky Tool Chains

Unlike in static code, data flow between tools is not explicit: the LLM decides at runtime what to pass where. We therefore combine several signals to identify potentially hazardous connections:

  • Schema Overlap: Do two tools share input parameters (e.g., owner, repo)? A strong overlap suggests they operate on the same logical entity.
  • Output-to-Input Mapping: Can the output of a source tool directly satisfy the inputs of a sink tool? This is a strong indicator of a direct data flow.
  • Runtime Analysis: Observing real-world tool call sequences reveals which flows are common in practice, helping to validate static assumptions.
  • LLM-Powered Categorization: We use LLMs to interpret tool descriptions and classify them as sources (P, U) or sinks (S), providing a scalable foundation for our analysis.

By combining these signals, we can build a realistic model of how agents chain tools together and identify high-risk patterns; the sketch below illustrates the two static signals.
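
The sketch scores schema overlap as Jaccard similarity over parameter names and checks whether a source tool's outputs could directly satisfy a sink tool's required inputs. The simplified schema shape (plain sets of field names keyed by input_params, output_fields, and required_inputs) is an assumption made for the example, not the actual MCP schema format.

    def schema_overlap(source_params: set[str], sink_params: set[str]) -> float:
        """Jaccard similarity over input parameter names (e.g. {'owner', 'repo'})."""
        if not source_params or not sink_params:
            return 0.0
        return len(source_params & sink_params) / len(source_params | sink_params)


    def output_satisfies_input(source_outputs: set[str], sink_required: set[str]) -> bool:
        """True if every required sink input could be filled directly from the source's output."""
        return sink_required <= source_outputs


    def flow_plausibility(source_tool: dict, sink_tool: dict) -> float:
        """Combine the two static signals into a rough 0-1 score (weights are placeholders)."""
        overlap = schema_overlap(set(source_tool["input_params"]),
                                 set(sink_tool["input_params"]))
        direct = output_satisfies_input(set(source_tool["output_fields"]),
                                        set(sink_tool["required_inputs"]))
        return 0.4 * overlap + 0.6 * (1.0 if direct else 0.0)

Runtime traces then confirm which high-scoring chains actually occur in practice, and the LLM-powered categorization supplies the P/U/S labels that the risk templates consume.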

Real-World Examples from the GitHub MCP

Here are three examples from the GitHub MCP that illustrate these risks:

1. get_pull_request_files (P) → add_comment_to_pending_review (S)

  • Risk: Data Leak.
  • Flow: The contents of private files (get_pull_request_files) could be fed into the body of a pull request comment (add_comment_to_pending_review), exposing sensitive code.

2. get_issue_comments (U) → create_issue (S)

  • Risk: Tamper.
  • Flow: An untrusted comment from a public issue (get_issue_comments) could contain a malicious prompt that instructs the agent to create a new issue with harmful content (create_issue).

3. get_pull_request_status (U) → merge_pull_request (S)

  • Risk: Tamper.
  • Flow: A manipulated or misunderstood pull request status (get_pull_request_status) could trick the agent into merging a branch that is not ready (merge_pull_request), compromising the repository's integrity.
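
Using the hypothetical ToolCategory and classify_chain helpers sketched earlier, these three chains would be flagged as follows (the category table simply restates the labels from the examples):

    GITHUB_TOOL_CATEGORIES = {
        "get_pull_request_files": ToolCategory.PRIVATE_SOURCE,              # (P)
        "get_issue_comments": ToolCategory.UNTRUSTED_SOURCE,                # (U)
        "get_pull_request_status": ToolCategory.UNTRUSTED_SOURCE,           # (U)
        "add_comment_to_pending_review": ToolCategory.STATE_CHANGING_SINK,  # (S)
        "create_issue": ToolCategory.STATE_CHANGING_SINK,                   # (S)
        "merge_pull_request": ToolCategory.STATE_CHANGING_SINK,             # (S)
    }

    chains = [
        ("get_pull_request_files", "add_comment_to_pending_review"),
        ("get_issue_comments", "create_issue"),
        ("get_pull_request_status", "merge_pull_request"),
    ]

    for source, sink in chains:
        risk = classify_chain(GITHUB_TOOL_CATEGORIES[source],
                              GITHUB_TOOL_CATEGORIES[sink])
        print(f"{source} -> {sink}: {risk}")
        # Prints data_leak, tamper, tamper for the three chains above.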

Putting It All Together

Applying the principles of taint analysis to tool call flows provides a systematic framework for understanding and securing AI agents. By categorizing tools as sources and sinks and analyzing the data flows between them, developers can identify and mitigate potential data leak and tampering risks before they are exploited. Rather than just filtering inputs and outputs, this approach lets us build security rules grounded in actual tool behaviour: mapping how data moves, spotting risky control paths, and enforcing protections intelligently.
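
One way to act on these findings is a runtime guard that remembers which taint categories earlier tool calls introduced and pauses a state-changing sink when a risky chain is about to complete. The sketch below reuses the hypothetical ToolCategory from the earlier examples; it is one possible shape for such a guard, not the API of any existing agent framework.

    class TaintGuard:
        """Tracks taint introduced by earlier tool calls and gates sink calls."""

        def __init__(self, categories: dict):
            self.categories = categories          # tool name -> ToolCategory
            self.active_taints: set = set()

        def before_tool_call(self, tool_name: str) -> str:
            """Return 'allow' or 'require_approval' for the pending call."""
            if self.categories.get(tool_name) == ToolCategory.STATE_CHANGING_SINK:
                if ToolCategory.PRIVATE_SOURCE in self.active_taints:
                    return "require_approval"     # possible data leak (P -> S)
                if ToolCategory.UNTRUSTED_SOURCE in self.active_taints:
                    return "require_approval"     # possible tampering (U -> S)
            return "allow"

        def after_tool_call(self, tool_name: str) -> None:
            """Record any taint the completed call introduced into the session."""
            category = self.categories.get(tool_name)
            if category in (ToolCategory.PRIVATE_SOURCE, ToolCategory.UNTRUSTED_SOURCE):
                self.active_taints.add(category)

A production guard would also need to clear taint, track which data actually flowed, and route blocked calls to a human reviewer, but even this coarse version stops the three GitHub chains above by default.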