How MCP Works: Architecture and Data Flow

Updated May 2026
MCP operates through a layered client-server architecture built on JSON-RPC 2.0. A host application creates isolated client connections to multiple MCP servers, each exposing tools, resources, and prompts through a standardized protocol. The architecture separates the AI experience layer (host), the protocol management layer (client), and the capability layer (server), creating clean security boundaries and deployment flexibility.

The Three-Layer Architecture

MCP defines three distinct roles that work together to connect AI models with external capabilities. Each role has specific responsibilities, and understanding the separation between them is essential for building and deploying MCP-based systems effectively.

The host is the user-facing application. Claude Desktop, Claude Code, Cursor, Windsurf, and VS Code with Copilot are all MCP hosts. The host manages the user interaction, orchestrates the AI model, enforces security policies, and maintains connections to multiple MCP servers. It is the host's responsibility to decide which servers to connect to, what permissions to grant, and how to present tool results to the user.

The client is the protocol handler within the host. The host creates one client for each MCP server it connects to. Each client manages a single, dedicated connection, handles JSON-RPC message exchange, performs capability negotiation, and tracks the session state for its assigned server. The one-to-one relationship between clients and servers ensures isolation: a slow or failing server cannot affect other connections.

The server is a process that exposes specific capabilities. An MCP server is intentionally lightweight and focused. Rather than building one server that provides filesystem access, database queries, and API calls, the MCP pattern encourages small servers that each handle one domain. A typical development setup might include separate servers for the filesystem, GitHub, a database, and web search, all connected simultaneously through the host.

The Connection Lifecycle

An MCP session moves through several phases from initial connection to active use. The lifecycle is designed to establish mutual understanding between client and server before any tools are invoked or data is exchanged.

Connection begins when the host launches or connects to an MCP server. For local servers using stdio transport, the host starts the server as a child process. For remote servers using Streamable HTTP, the host opens an HTTP connection to the server endpoint. In both cases, the transport layer is established first, providing the communication channel for the protocol layer above it.

Initialization follows immediately. The client sends an initialize request containing its protocol version and supported capabilities. The server responds with its own protocol version, its name and version string, and a declaration of which capabilities it supports (tools, resources, prompts, logging, list change notifications, and other optional features). Both sides must agree on a compatible protocol version for the session to continue.

After successful initialization, the client sends an initialized notification to signal that it is ready for normal operations. The server can now accept requests for tool listings, resource listings, prompt listings, and actual tool invocations. The session remains active until either side sends a close notification or the transport connection is lost.

JSON-RPC 2.0 Messaging

All MCP communication uses the JSON-RPC 2.0 message format. This is the same format used by the Language Server Protocol, and it provides a clean, well-understood structure for request-response interactions and one-way notifications.

Requests are messages that expect a response. They contain a unique ID, a method name (like "tools/list" or "tools/call"), and an optional params object with method-specific arguments. The receiver processes the request and returns a response with the same ID, containing either a result object or an error object.

Notifications are one-way messages that do not expect a response. They are used for events like capability change announcements, progress updates, and logging messages. The server might send a "notifications/tools/list_changed" notification when a new tool becomes available, prompting the client to refresh its tool list.

The bidirectional nature of JSON-RPC 2.0 means both client and server can send requests and notifications. While clients primarily send requests (to list tools, invoke tools, read resources), servers can also send requests to clients. The most notable example is sampling, where a server requests an LLM completion from the host. This reverse direction enables recursive agent patterns where a server-side operation needs AI reasoning as part of its execution.

Data Flow of a Tool Call

The complete flow of a tool call illustrates how all the architectural components work together. Consider a user asking Claude Desktop to query a database for recent sales figures.

The user types their request into Claude Desktop (the host). The host sends the user's message to the Claude API along with tool definitions from all connected MCP servers. Claude analyzes the request and determines that a database query tool would produce the best result. It generates a structured tool call with the tool name and the SQL query as an argument.

The host receives Claude's tool call response and routes it to the appropriate MCP client based on which server registered that tool. The client sends a "tools/call" JSON-RPC request to the database MCP server, passing the tool name and arguments.

The database server receives the request, validates the arguments against its configured permissions, executes the SQL query against the connected database, and returns the result as a JSON-RPC response. The result includes both the query output and any metadata about the operation.

The client receives the response and passes it back to the host. The host feeds the tool result back to Claude as a new message in the conversation. Claude processes the database result and generates a natural language summary of the sales figures for the user. The entire round trip, from user query to database result to formatted answer, happens within a single conversation turn.

Transport Mechanisms

MCP separates the protocol layer from the transport layer, allowing the same protocol to work over different communication channels. Two transports are supported: stdio for local servers and Streamable HTTP for remote servers.

Stdio transport is the simplest option. The host launches the MCP server as a child process and communicates by writing JSON-RPC messages to the server's standard input and reading responses from its standard output. No network configuration, no port management, no authentication tokens. The operating system handles process isolation, and all data stays on the local machine. This is the default transport for local development tools and personal utility servers.

Streamable HTTP transport, which replaced the original SSE transport in the November 2025 specification revision, enables MCP servers to run as remote network services. The server exposes an HTTP endpoint, and clients communicate using standard HTTP requests. This transport supports OAuth 2.1 authentication, session management across reconnections, and horizontal scaling behind load balancers. It is the right choice for shared team servers, cloud-hosted tool services, and enterprise integrations where multiple users need access to the same MCP server.

The protocol layer operates identically regardless of which transport is underneath. An MCP server that works over stdio can be adapted to Streamable HTTP without changing any tool implementations. The only differences are in connection setup, authentication, and session management, all of which are handled by the transport layer rather than the application code.

Capability Negotiation in Detail

The initialization handshake is where client and server establish what each side can do. The client declares its capabilities, including whether it supports sampling (server-initiated LLM requests), roots (working context information), and experimental features. The server declares its capabilities, including which primitives it supports (tools, resources, prompts), whether it sends list change notifications, and what logging levels it offers.

After initialization, the client typically requests the full list of available tools, resources, and prompts. Each tool listing includes the tool name, a human-readable description (which the model uses to decide when the tool is appropriate), and a JSON schema defining the expected input parameters. Resource listings include URI patterns and descriptions. Prompt listings include names, descriptions, and parameter definitions.

The quality of these descriptions directly affects how well the AI model uses each capability. A vague tool description like "queries data" gives the model little to work with. A specific description like "executes a read-only SQL SELECT query against the connected PostgreSQL database, returning results as a JSON array of row objects" tells the model exactly what the tool does, what it expects, and what it returns. Good descriptions are arguably the most important aspect of building an effective MCP server.

Session State and Reconnection

MCP sessions are stateful. The server maintains context about the current session, including which capabilities have been discovered, any subscription registrations, and any server-side state accumulated during tool calls. The Streamable HTTP transport supports session tokens that allow clients to reconnect to existing sessions after temporary disconnections, preserving state without requiring full re-initialization.

The 2026 specification additions include standardized session creation, resumption, and migration. Server restarts and horizontal scaling events can be transparent to connected clients, with sessions migrating between server instances without data loss. This makes MCP suitable for production environments where server availability cannot be guaranteed by a single process.

Key Takeaway

MCP works through a three-layer architecture (host, client, server) using JSON-RPC 2.0 messaging over stdio or HTTP transports. The protocol handles capability negotiation, tool discovery, and structured invocation, letting AI models interact with external systems through a consistent, secure interface.