Claude Agent SDK: Complete Guide

Updated May 2026
The Claude Agent SDK is Anthropic's official toolkit for building autonomous AI agents powered by Claude models. Originally released as the Claude Code SDK and renamed in March 2026, it provides the same agent loop, tool infrastructure, and context management that power Claude Code, packaged as a programmable library for Python and TypeScript developers. It offers the deepest Model Context Protocol (MCP) integration of any agent SDK on the market.

Origins and Evolution

The Claude Agent SDK grew directly out of the engineering work behind Claude Code, Anthropic's AI coding assistant. When the Anthropic team realized that the infrastructure they had built for code editing, file navigation, command execution, and multi-step reasoning had applications far beyond coding, they extracted the core agent loop into a standalone SDK. The initial release under the Claude Code SDK name happened in late 2025, with the rebrand to Claude Agent SDK following in March 2026 to better reflect its broader scope.

This lineage matters because the SDK's architecture was exercised against millions of real-world coding sessions before developers got access to it. The tool execution patterns, error recovery logic, context management strategies, and session persistence mechanisms were all refined through production usage at scale rather than designed on paper.

Core Architecture

At its foundation, the Claude Agent SDK implements a persistent agent loop. The agent receives a task, reasons about what actions to take, executes tools, observes the results, and decides what to do next. This cycle repeats until the task is complete, the agent determines it cannot proceed, or a configured limit (turns, tokens, or time) is reached.
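
The cycle described above can be sketched in plain Python. This is an illustrative model only, not the SDK's code or API: the names `Action`, `LoopResult`, `run_agent_loop`, and `scripted_policy` are hypothetical, and a scripted function stands in for the model's reasoning step. Only the turn limit is modeled; token and time limits would be checked the same way.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One decision by the (stubbed) model: call a tool or finish."""
    kind: str            # "tool" or "finish"
    tool: str = ""
    argument: str = ""
    answer: str = ""


@dataclass
class LoopResult:
    status: str          # "complete" or "turn_limit"
    answer: str
    observations: list = field(default_factory=list)


def run_agent_loop(decide, tools, max_turns=10):
    """Reason, act, observe, and repeat until done or the turn limit is hit."""
    observations = []
    for _ in range(max_turns):
        action = decide(observations)                   # reason about the next step
        if action.kind == "finish":
            return LoopResult("complete", action.answer, observations)
        result = tools[action.tool](action.argument)    # execute the chosen tool
        observations.append(result)                     # observe the result
    return LoopResult("turn_limit", "", observations)


def scripted_policy(observations):
    """Stand-in for the model: count words once, then report the answer."""
    if not observations:
        return Action(kind="tool", tool="word_count", argument="the quick brown fox")
    return Action(kind="finish", answer=f"The text has {observations[-1]} words.")


TOOLS = {"word_count": lambda text: str(len(text.split()))}

result = run_agent_loop(scripted_policy, TOOLS)
print(result.status, "-", result.answer)
```

Returning a status alongside the answer lets the caller tell a finished task apart from one that was cut off by a limit.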

The SDK provides built-in tools for the most common agent operations: reading and writing files, executing shell commands, searching codebases, editing code with precision, and browsing the web. These tools are available immediately without any configuration, which means a developer can have a functional agent running within minutes of installing the SDK.

Context management is handled automatically through a compaction mechanism. As the conversation grows and approaches the model's context limit, the SDK summarizes older turns to free space while preserving the essential information the agent needs to continue working. This allows agents to handle tasks that span hundreds or even thousands of tool calls without the developer needing to implement their own context windowing logic.
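
A minimal sketch of the compaction idea follows, assuming a crude token estimate and a placeholder summarizer. The functions `estimate_tokens`, `summarize`, and `compact` are hypothetical names chosen for illustration; the SDK's actual thresholds and summarization strategy are not described in this guide.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: roughly four characters per token."""
    return max(1, len(text) // 4)


def summarize(turns: list) -> str:
    """Placeholder summarizer; a real system would ask the model for a summary."""
    return f"[summary of {len(turns)} earlier turns]"


def compact(history: list, limit: int, keep_recent: int = 2) -> list:
    """Replace older turns with one summary once the history exceeds the limit."""
    total = sum(estimate_tokens(turn) for turn in history)
    if total <= limit or len(history) <= keep_recent:
        return history                      # still fits, nothing to do
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    return [summarize(older)] + recent      # recent turns are kept verbatim


history = ["x" * 400 for _ in range(5)]     # five turns of about 100 tokens each
compacted = compact(history, limit=300)
print(len(compacted), compacted[0])
```

Keeping the most recent turns verbatim while summarizing the rest reflects the trade-off described above: the agent keeps working, but early detail is compressed.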

MCP Integration

The Model Context Protocol is where the Claude Agent SDK truly differentiates itself. MCP is an open standard introduced by Anthropic in November 2024 that standardizes how AI systems connect to external tools and data sources. As of mid-2026, MCP has been adopted by OpenAI, Google, and the broader open-source community, with over 500 publicly available MCP servers.

In the Claude Agent SDK, connecting to an MCP server requires a single configuration line. The SDK handles transport negotiation (supporting both stdio and HTTP transports), tool discovery (automatically detecting what tools the server offers), schema validation (ensuring tool inputs match the expected format), and result parsing (converting tool outputs into a format the model can understand). Pre-built MCP servers exist for Playwright (browser automation), GitHub (repository operations), Slack (messaging), PostgreSQL (database queries), filesystem operations, and hundreds of other services.
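
To make the two transports concrete, here is a hedged sketch of what a server configuration mapping might look like, with a small validator that checks each entry has the field its transport needs. The dictionary shape (`type`, `command`, `args`, `url`) is an assumption modeled on common MCP client configuration, the server names and the URL are placeholders, and `validate_servers` is a hypothetical helper, not an SDK function.

```python
# Hypothetical configuration: one stdio server and one HTTP server.
MCP_SERVERS = {
    "playwright": {
        "type": "stdio",
        "command": "npx",
        "args": ["@playwright/mcp@latest"],      # illustrative package reference
    },
    "internal_docs": {
        "type": "http",
        "url": "https://mcp.example.com/mcp",    # placeholder URL
    },
}

# Fields each transport cannot work without (assumed for this sketch).
REQUIRED_KEYS = {"stdio": {"command"}, "http": {"url"}}


def validate_servers(servers: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for name, config in servers.items():
        transport = config.get("type", "stdio")
        if transport not in REQUIRED_KEYS:
            problems.append(f"{name}: unknown transport '{transport}'")
            continue
        missing = REQUIRED_KEYS[transport] - set(config)
        for key in sorted(missing):
            problems.append(f"{name}: missing '{key}'")
    return problems


print(validate_servers(MCP_SERVERS))
```

A stdio server is launched as a local subprocess, so it needs a command; an HTTP server is reached over the network, so it needs a URL.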

The depth of MCP integration goes beyond simple tool calling. The SDK supports MCP resources (read-only data the agent can access), MCP prompts (pre-defined interaction patterns), and MCP sampling (allowing MCP servers to request LLM completions). This makes it possible to build complex agent architectures where MCP servers provide not just tools but also contextual data and interaction templates.

Session Persistence and Resumption

One of the most practical features of the Claude Agent SDK is session persistence. When you create an agent session, the SDK assigns a unique session ID. Any subsequent queries can reference this session ID to resume exactly where the previous interaction left off, including all files read, commands executed, tool results observed, and conversation history.
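
The resume-by-ID pattern can be illustrated with a toy store. This is a sketch under stated assumptions, not the SDK's storage layer: `SessionStore` and its methods are hypothetical, and a JSON string stands in for whatever durable storage a real deployment would use.

```python
import json
import uuid


class SessionStore:
    """Toy session store: history is keyed by session ID and serializable."""

    def __init__(self, sessions=None):
        self._sessions = sessions if sessions is not None else {}

    def create(self) -> str:
        session_id = str(uuid.uuid4())
        self._sessions[session_id] = []
        return session_id

    def record(self, session_id: str, role: str, content: str) -> None:
        self._sessions[session_id].append({"role": role, "content": content})

    def resume(self, session_id: str) -> list:
        """Return everything recorded so far for this session."""
        if session_id not in self._sessions:
            raise KeyError(f"unknown session: {session_id}")
        return list(self._sessions[session_id])

    def dump(self) -> str:
        return json.dumps(self._sessions)

    @classmethod
    def load(cls, data: str) -> "SessionStore":
        return cls(json.loads(data))


# First interaction: start a session and do some work.
store = SessionStore()
sid = store.create()
store.record(sid, "user", "Review utils.py")
store.record(sid, "tool", "read utils.py (120 lines)")
saved = store.dump()                      # pause: persist the session

# Later interaction: restore and continue with the same session ID.
restored = SessionStore.load(saved)
restored.record(sid, "user", "Now check the tests")
history = restored.resume(sid)
print(len(history), history[0]["content"])
```

The caller only has to hold on to the session ID; the accumulated history travels with the store.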

This enables patterns that are difficult to implement with other SDKs. A long-running code review can be paused and resumed across multiple user interactions. A research agent can accumulate findings over many sessions without losing context. A deployment pipeline can checkpoint its progress and recover from failures without starting over.

The session data is stored on Anthropic's servers (for the cloud API) or locally (for the on-premises deployment option). The compaction mechanism ensures that sessions remain functional even as they grow very large, though developers should be aware that compaction involves summarization, which means some low-priority details from early in the session may be compressed.

Lifecycle Hooks

The SDK exposes 18 hook events that let developers intercept the agent's execution at nearly every significant point. Hooks fire before and after tool calls, before and after model completions, on session start and end, on error conditions, and at context compaction boundaries.

Common hook use cases include logging all tool calls for audit purposes, implementing approval workflows where certain tools require human confirmation before execution, tracking token usage and costs across sessions, implementing custom guardrails that go beyond the model's built-in safety features, and injecting additional context or instructions at specific points in the agent's workflow.

Hooks are configured as callback functions in Python or TypeScript. They receive the full context of the event (including the tool name, arguments, and session state) and can modify the execution flow by approving, rejecting, or altering the pending action.
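
The approve, reject, and alter outcomes can be sketched as follows. The types and functions here (`ToolCall`, `Decision`, `audit`, `guard_shell`, `run_with_hooks`) are hypothetical and only model a pre-tool-call hook; the SDK's real hook signatures and event names may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolCall:
    tool: str
    args: dict


@dataclass
class Decision:
    action: str                       # "approve", "reject", or "alter"
    call: Optional[ToolCall] = None   # replacement call when action is "alter"
    reason: str = ""


audit_log = []


def audit(call: ToolCall) -> Decision:
    """Logging hook: record the tool name, then let the call through."""
    audit_log.append(call.tool)
    return Decision("approve", call)


def guard_shell(call: ToolCall) -> Decision:
    """Guardrail hook: block destructive commands, add a default timeout."""
    if call.tool != "shell":
        return Decision("approve", call)
    if "rm -rf" in call.args.get("command", ""):
        return Decision("reject", reason="destructive command blocked")
    if "timeout" not in call.args:
        return Decision("alter", ToolCall("shell", {**call.args, "timeout": 30}))
    return Decision("approve", call)


def run_with_hooks(call: ToolCall, hooks, execute) -> str:
    """Run each hook in order; any hook may stop or rewrite the pending call."""
    for hook in hooks:
        decision = hook(call)
        if decision.action == "reject":
            return f"blocked: {decision.reason}"
        if decision.action == "alter":
            call = decision.call
    return execute(call)


def execute(call: ToolCall) -> str:
    """Stand-in for real tool execution."""
    return f"executed {call.tool} with {call.args}"


HOOKS = [audit, guard_shell]
print(run_with_hooks(ToolCall("shell", {"command": "ls"}), HOOKS, execute))
print(run_with_hooks(ToolCall("shell", {"command": "rm -rf /tmp/x"}), HOOKS, execute))
```

Placing the audit hook first means rejected calls are still logged, which is usually what an audit trail needs.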

Multi-Agent Coordination

The Claude Agent SDK supports multi-agent patterns through what Anthropic calls "swarms," where multiple specialized Claude agents work together on a task. A typical configuration might include a research agent that gathers information, an analysis agent that processes findings, a writing agent that produces output, and a review agent that checks quality.

Each agent in a swarm has its own system prompt, tool set, and configuration, but they share a coordination layer that manages task delegation, result aggregation, and context sharing. The SDK handles the complexity of maintaining separate conversation contexts for each agent while ensuring they can communicate results to each other.

Swarm coordination is not as structured as Google ADK's workflow-based approach, but it is more flexible. Agents can dynamically decide which specialist to delegate to based on the task at hand, rather than following a predefined execution graph. This makes swarms well-suited for open-ended tasks where the optimal workflow is not known in advance.
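
Dynamic delegation, as opposed to a fixed execution graph, might look like the sketch below. Everything here is hypothetical: the specialists are plain functions rather than Claude agents, and `route` uses keyword matching where a real coordinator would let the model choose.

```python
def researcher(task: str, shared: list) -> str:
    return f"findings for '{task}'"


def analyst(task: str, shared: list) -> str:
    return f"analysis of {len(shared)} prior result(s)"


def writer(task: str, shared: list) -> str:
    return f"draft based on {len(shared)} prior result(s)"


# Each specialist would have its own prompt and tools in a real system.
SPECIALISTS = {"research": researcher, "analyze": analyst, "write": writer}


def route(task: str) -> str:
    """Pick a specialist per task at run time (keyword stub for the model)."""
    lowered = task.lower()
    for keyword in SPECIALISTS:
        if keyword in lowered:
            return keyword
    return "research"                 # default when nothing matches


def run_swarm(tasks: list) -> list:
    """Delegate each task and share accumulated results with later agents."""
    shared = []
    for task in tasks:
        name = route(task)
        output = SPECIALISTS[name](task, shared)
        shared.append(f"{name}: {output}")
    return shared


for line in run_swarm(["Research MCP adoption", "Analyze the findings", "Write a summary"]):
    print(line)
```

Because routing happens per task, the order of specialists is decided by the work that arrives rather than by a graph drawn in advance.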

Pricing and Billing

The Claude Agent SDK uses the standard Claude API pricing, with separate per-million-token rates for input and output on Haiku 4.5, Sonnet 4.6, and Opus 4.7. Agent workloads tend to consume more tokens than single-turn applications because the orchestration loop, tool call formatting, and context management all add overhead.

Dual-bucket billing for subscription plan holders takes effect on June 15, 2026. Under it, Claude subscription plans include a monthly Agent SDK credit, and Agent SDK usage does not count toward standard plan usage limits. This makes the SDK more accessible for individual developers and small teams who want to experiment with agent development without committing to API billing from the start.

Prompt caching is a significant cost optimization tool. When the same system prompt or context block appears in multiple agent turns (which is common in agent loops), cached tokens are billed at 10% of the standard input price. The Batch API provides an additional 50% discount for workloads that can tolerate higher latency.
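
The effect of the two discounts can be worked through with a short calculation. The per-million-token prices below are placeholders chosen for round arithmetic, not Anthropic's actual rates, and `estimate_cost` is a hypothetical helper. The sketch applies the 10% cached-input rate and the 50% batch discount stated above and ignores any charge for writing to the cache.

```python
CACHED_INPUT_RATE = 0.10   # cached tokens billed at 10% of the input price
BATCH_DISCOUNT = 0.50      # Batch API discount


def estimate_cost(fresh_input, cached_input, output,
                  input_price, output_price, batch=False):
    """Estimate cost in dollars; prices are dollars per million tokens."""
    total = (
        fresh_input * input_price
        + cached_input * input_price * CACHED_INPUT_RATE
        + output * output_price
    ) / 1_000_000
    return total * (1 - BATCH_DISCOUNT) if batch else total


# Placeholder prices (NOT real rates): $2 input, $10 output per million tokens.
INPUT_PRICE, OUTPUT_PRICE = 2.00, 10.00

# An agent run with 1M input tokens, 80% of them cache hits, and 100k output tokens.
uncached = estimate_cost(1_000_000, 0, 100_000, INPUT_PRICE, OUTPUT_PRICE)
cached = estimate_cost(200_000, 800_000, 100_000, INPUT_PRICE, OUTPUT_PRICE)
batched = estimate_cost(200_000, 800_000, 100_000, INPUT_PRICE, OUTPUT_PRICE, batch=True)

print(f"no caching: ${uncached:.2f}")
print(f"with caching: ${cached:.2f}")
print(f"caching + batch: ${batched:.2f}")
```

With these placeholder numbers, caching matters most on the input side: output tokens are billed at the full rate either way, so output-heavy workloads see a smaller relative saving.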

Limitations and Trade-offs

The most significant limitation of the Claude Agent SDK is vendor lock-in. It works exclusively with Claude models. If you need to use GPT, Gemini, or open-source models for certain tasks, you cannot do so within the same SDK. Teams that require multi-provider flexibility should consider the Vercel AI SDK or building a custom abstraction layer.

The SDK's batteries-included approach is a double-edged sword. The built-in tools are convenient but opinionated. If you need custom tool implementations that behave differently from the defaults, you may find yourself working around the SDK's assumptions rather than with them. For highly customized agent architectures, OpenAI's primitives-first approach may offer more flexibility.

Session storage on Anthropic's servers raises data residency concerns for some enterprise customers. The on-premises deployment option addresses this but adds operational complexity. Organizations with strict data sovereignty requirements should evaluate the deployment options carefully before committing to the SDK.

Key Takeaway

The Claude Agent SDK is the fastest path from zero to a working AI agent, offering production-tested infrastructure and the deepest MCP ecosystem integration, but it locks you into Claude models exclusively.