When to Use MCP-SD and S2SP

A quick decision guide: does your role benefit from MCP-SD? Does your workload fit the pattern? And if you adopt S2SP, which transfer mode suits your deployment?

1. Who is this for?

Three signals you should adopt MCP-SD: two by role (tool developer, agent developer) and one by workload (multi-server structured-data routing). Any one of them can justify evaluating MCP-SD; all three together make the strongest fit.

Role

MCP tool developers

…whose tools return large chunks of structured data to the agent.

If your MCP server exposes a tool returning list[dict] with dozens of columns—query results, CRM records, search hits, telemetry rows, alert feeds—standard MCP callers receive the full payload in tokens, even when the agent only needs a few columns to make routing decisions.

What you get:

  • Decorator-based integration: @server.sd_resource_tool(). No MCP spec changes; callers that omit abstract_domains get the full result.
  • Traditional MCP agents keep working unchanged (backward-compatible).
  • In the weather-alert demo, MCP-SD-aware agents saw 83–93% fewer tokens for the measured tool result.
  • Body delivery is handled for you—presigned URL, DDI endpoint, caching, TTL.
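
The behavior the decorator adds can be pictured with a plain-Python sketch. Note that `project_rows`, the `body_ref` scheme, and the result shape below are illustrative stand-ins, not the SDK's actual API or wire format:

```python
import json

def project_rows(rows, abstract_domains=None):
    """Simulate the projection an MCP-SD resource tool performs.

    With no abstract_domains, the caller gets the full rows: the
    backward-compatible plain-MCP behavior. With a column list, only
    those columns enter the LLM context; the full body travels out of
    band (represented here by a placeholder reference).
    """
    if abstract_domains is None:
        return {"rows": rows}  # full payload, as before
    projected = [{k: r[k] for k in abstract_domains} for r in rows]
    return {"rows": projected, "body_ref": "ddi://example/body/123"}

alerts = [
    {"event": "Flood", "severity": "Severe", "headline": "River rising",
     "description": "x" * 500},  # heavy field the agent never reads
]
full = project_rows(alerts)
slim = project_rows(alerts, abstract_domains=["event", "severity"])
print(len(json.dumps(full)), len(json.dumps(slim)))
```

The point of the shape: a caller that omits `abstract_domains` is indistinguishable from a plain MCP client, which is what makes the extension backward-compatible.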

Start here: Python SDK → Resource Tool section.

Role

Agent developers

…who want to reduce token spend in structured-data workflows.

If you're building an agent on LangGraph, Claude Agent SDK, AutoGen, or a custom loop, and your workflow shuttles structured data between MCP servers, those payloads are eating your context window and your per-turn cost. MCP-SD gives the agent a way to ask for just the columns it needs.

What you get:

  • Platform-agnostic core: mcp_sd.agent.S2SPDispatcher plugs into any tool-call lifecycle.
  • Ready-made adapters for Claude Agent SDK and LangGraph.
  • Potentially lower end-to-end latency when large body data no longer enters LLM inference.
  • Lower token bill on compatible tool calls where the agent only needs selected columns.

Start here: Python SDK → Agent-side integrations.
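
For a custom loop, the dispatcher's job can be pictured with a stand-in class. This `FakeDispatcher` illustrates the lifecycle hook only; the real `mcp_sd.agent.S2SPDispatcher` interface may differ:

```python
class FakeDispatcher:
    """Stand-in for the dispatcher role: intercept each tool result,
    keep only the abstract columns for the LLM, and stash the body
    rows for a later server-to-server transfer."""

    def __init__(self):
        self.body_store = {}

    def on_tool_result(self, call_id, result):
        body = result.pop("body_rows", None)
        if body is not None:
            self.body_store[call_id] = body  # body never reaches the LLM
        return result  # abstract columns only

dispatcher = FakeDispatcher()
raw = {
    "rows": [{"event": "Flood", "severity": "Severe"}],
    "body_rows": [{"event": "Flood", "severity": "Severe",
                   "description": "long text the agent does not need"}],
}
for_llm = dispatcher.on_tool_result("call-1", raw)
print(for_llm)  # only the slim rows enter the context
```

The adapters for LangGraph and the Claude Agent SDK wire this interception into each framework's tool-call lifecycle so you don't hand-roll it.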

Workload

Multi-server data routing

…your agent requests or forwards structured data among multiple MCP servers.

The canonical topology MCP-SD is built for: one MCP server produces tabular data, the agent filters or routes rows, and another MCP server consumes or transforms them. Database → analytics. CRM → email. Search → summarizer. Log store → alert pipeline. Every hop that currently serializes rows into the LLM context is a hop MCP-SD removes.

What you get:

  • Body rows flow server-to-server over the DDI (async) or through an S2SP-aware agent SDK (sync), outside the LLM context.
  • The agent keeps full orchestration control: it decides what moves where, just without carrying the payload itself.
  • Can be composed in pipelines when each hop exposes the appropriate resource or consumer tool.
  • Works on heterogeneous deployments via presigned URLs (async) or in-process DDI (sync).

Start here: Introduction → The Complete Flow.
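
The topology can be sketched end to end with in-memory stand-ins. `BODY_STORE`, the handle `"h1"`, and both tool functions are invented for illustration; in a real deployment the store is the DDI endpoint or a presigned URL:

```python
BODY_STORE = {}  # stands in for the DDI / presigned-URL data plane

def resource_tool():
    """Produce tabular data; publish the full rows out of band and
    return only the abstract columns plus a handle."""
    rows = [{"id": 1, "severity": "Severe", "detail": "x" * 200},
            {"id": 2, "severity": "Minor", "detail": "y" * 200}]
    BODY_STORE["h1"] = rows
    abstract = [{"id": r["id"], "severity": r["severity"]} for r in rows]
    return {"rows": abstract, "body_ref": "h1"}

def consumer_tool(body_ref, keep_ids):
    """Fetch the full rows from the data plane; the agent passed only
    the handle and its routing decision, never the payload."""
    rows = [r for r in BODY_STORE[body_ref] if r["id"] in keep_ids]
    return len(rows)

# Agent: filter on abstract columns, forward only the handle + ids.
result = resource_tool()
severe = [r["id"] for r in result["rows"] if r["severity"] == "Severe"]
print(consumer_tool(result["body_ref"], severe))  # prints 1
```

This is the orchestration-without-payload pattern: the agent's context holds the abstract columns and a handle, while the heavy `detail` fields move on the data plane.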

2. Is MCP-SD the right fit for this tool?

MCP-SD applies only to a specific kind of tool: one that returns tabular data (list[dict]) where the agent needs only some columns to reason about the rest.

Adopt MCP-SD when…

  • Your tool returns structured row data with enough columns or heavy fields for projection to matter (DataFrames, SQL results, JSON array responses).
  • Only a few columns drive the agent's decisions; the rest are heavy payload (long descriptions, nested JSON, encoded blobs).
  • The tool is called repeatedly across a conversation—per-turn token savings compound quickly.
  • You control the resource server (can add the @sd_resource_tool() decorator).
  • Your agent pipeline forwards data to a downstream consumer tool that does want the full rows.

Skip MCP-SD when…

  • The tool returns a single scalar or tiny blob (e.g., get_price(symbol) → float).
  • The agent must reason over every field—no withheld columns to offload.
  • Data is truly unstructured: images, audio, video, PDFs, free-form text.
  • Tool is one-shot and you don't care about token cost (debug utilities, one-off ops tools).
  • Your pipeline has no downstream consumer—the agent itself is the final stage and needs the whole payload.

3. Once you adopt MCP-SD, pick a mode

S2SP—the reference implementation of MCP-SD—offers two transfer modes for the data plane. Both keep body rows out of the LLM context; they differ in where those rows live in transit.

Pick async (default) when…

  • Resource and consumer servers can reach each other over HTTP (same VPC, public internet, shared overlay).
  • Body payloads are large enough that keeping them out of the agent process is operationally useful.
  • You want the body to never touch the agent process (strictest LLM-isolation story).
  • Multi-hop pipelines: resource → consumer-A → consumer-B, where each stop can fetch from the previous.

Pick sync when…

  • Servers are firewalled, NAT'd, or mTLS-bound—the consumer cannot reach the resource server directly.
  • Single-hop flow: resource → agent → consumer is the whole pipeline.
  • You control the agent process: your own Python loop, LangGraph graph, or Claude Agent SDK agent. The SDK ships adapters for both.
  • You need the body in-process for auditing, logging, or redaction before forwarding.

Caveat for sync mode

Sync mode requires an S2SP-aware agent (using mcp_sd.agent.S2SPDispatcher or one of its adapters). A generic MCP client that forwards raw tool results directly to the LLM will not hide sync-mode body rows. If your client does not install the dispatcher, choose async mode.

4. Concrete scenarios

How the two decisions above play out on common workloads.

Analytics pipeline: weather alerts → chart generator

Resource server exposes get_alerts(area) returning 29 columns per alert. Agent only needs event, severity, headline to pick which alerts to chart. Consumer server draw_chart needs the full rows to render.

Adopt MCP-SD — async mode. Classic fit. Both servers sit in the same cloud, HTTP data-plane fetch is cheap.

CRM sync: customer records → email marketing service

Resource server returns 30+ fields per customer (PII, history, preferences). Agent filters by segment and forwards matching customers to the email service.

Adopt MCP-SD — async if same-VPC, sync if firewalled. If the email service can't reach the CRM directly, use sync so the body transits the agent process but still bypasses the LLM.

Ops pipeline: log search → summarizer

Search returns many structured rows; agent filters to the rows relevant to the incident and forwards them to a downstream tool.

Adopt MCP-SD — async mode. This is a good fit when only a few fields drive filtering and the downstream tool needs the full rows.

Price lookup: get_price(ticker) returns a single float

Tool has no tabular structure. Nothing to selectively disclose.

Skip MCP-SD. Use the tool as a plain MCP call.

Document OCR: PDF → extracted text

Return value is a text blob, not tabular rows.

Skip MCP-SD. Out of scope—MCP-SD deliberately targets structured tabular data only. File-transfer extensions for unstructured blobs are listed as future work.

Reporting: query → agent summarizes to user

Agent is the final stage. No downstream consumer; everything the agent receives needs to go into the LLM for summarization.

Skip MCP-SD. Without a consumer to route withheld body to, there's nothing to save.

5. Composing with other extensions

MCP-SD operates on the return side of tool calls. It composes cleanly with progressive disclosure schemes that operate on the call side (Claude Skills, the MCP Progressive Disclosure extension, Bounded Context Packs): each targets a different layer of the workflow, so they can be run together when their assumptions match your agent stack.

Next steps

If you've decided MCP-SD fits, head to Introduction for the concept in full, Protocol Design for the spec, or Python SDK to start integrating.