Tool Integration: Connecting Agents to Services
How Tool Definitions Work
Every tool available to an agent is defined by a structured schema that the language model reads on each turn. The schema includes the tool name, a natural language description of what it does, and a JSON Schema specification of the parameters it accepts. The model uses this information to decide when to call the tool and what arguments to pass.
The description is the most important part of the schema because it guides the model's decision-making. A description like "Search the web" is too vague. The model does not know what kind of searches this tool is good at, what format the results come in, or when to prefer it over other information-gathering tools. A better description is "Search the web using Google and return the top 10 results with titles, URLs, and snippets. Best for current events, recent data, and topics not covered in the training data." This gives the model enough context to use the tool appropriately.
Parameter schemas use JSON Schema types (string, number, boolean, array, object) with descriptions for each parameter. Required parameters are marked explicitly. Default values help the model make reasonable choices without specifying every parameter. Enum constraints limit parameters to a set of valid values, preventing the model from generating invalid inputs. Well-designed schemas dramatically reduce tool call errors because the model has clear constraints on what inputs are valid.
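As an illustration of the pieces described above, here is a minimal sketch of a tool definition written as a Python dictionary. The tool name web_search, its parameters, and the top-level key input_schema are assumptions chosen for the example; providers differ in the exact key names they expect around the JSON Schema body.

```python
import json

# Hypothetical tool definition. The outer key names vary by provider;
# the "input_schema" body is ordinary JSON Schema.
WEB_SEARCH_TOOL = {
    "name": "web_search",
    "description": (
        "Search the web and return the top results with titles, URLs, "
        "and snippets. Best for current events, recent data, and topics "
        "not covered in the training data."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search terms."},
            "max_results": {
                "type": "integer",
                "description": "How many results to return.",
                "default": 10,  # lets the model omit this parameter
            },
            "freshness": {
                "type": "string",
                "description": "Restrict results to a recent time window.",
                "enum": ["day", "week", "month", "any"],  # only valid values
                "default": "any",
            },
        },
        "required": ["query"],  # required parameters are marked explicitly
    },
}

print(json.dumps(WEB_SEARCH_TOOL, indent=2))
```

Only query is required here; the defaults and the enum narrow what the model has to decide and what it is able to get wrong.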
The Tool Execution Pipeline
When the model generates a tool call, it passes through a multi-stage pipeline before execution. First, the runtime parses the tool call from the model response, extracting the tool name and parameters. Second, it validates the parameters against the tool schema, checking types, required fields, and constraints. Third, it applies any security policies, verifying that the agent has permission to use this tool with these parameters. Fourth, it executes the tool function with the validated parameters. Fifth, it formats the result and adds it to the conversation context for the next model turn.
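The five stages above can be sketched as a single function. This is a simplified illustration, not a production runtime: the registry TOOLS, the policy set ALLOWED_TOOLS, the get_weather tool, and the shape of the incoming JSON are all hypothetical.

```python
import json
from typing import Any, Callable

def get_weather(city: str) -> dict:
    # Stand-in for a real service call.
    return {"city": city, "forecast": "sunny"}

# Hypothetical registry: tool name -> (expected parameter types, function).
TOOLS: dict[str, tuple[dict[str, type], Callable[..., Any]]] = {
    "get_weather": ({"city": str}, get_weather),
}
# Hypothetical policy: the tools this particular agent may call.
ALLOWED_TOOLS = {"get_weather"}

def fail(stage: str, message: str) -> dict:
    return {"ok": False, "stage": stage, "error": message}

def run_tool_call(raw_call: str) -> dict:
    # 1. Parse the tool name and parameters out of the model response.
    try:
        call = json.loads(raw_call)
        name, params = call["name"], call["parameters"]
    except (json.JSONDecodeError, KeyError, TypeError) as exc:
        return fail("parse", f"Could not parse tool call: {exc!r}")

    # 2. Validate: known tool, exact parameter set, expected types.
    if not isinstance(name, str) or name not in TOOLS:
        return fail("validate", f"Unknown tool {name!r}.")
    spec, func = TOOLS[name]
    if not isinstance(params, dict) or set(params) != set(spec):
        return fail("validate", f"Expected parameters {sorted(spec)}.")
    for key, expected in spec.items():
        if not isinstance(params[key], expected):
            return fail("validate", f"'{key}' must be {expected.__name__}.")

    # 3. Apply security policy for this agent.
    if name not in ALLOWED_TOOLS:
        return fail("policy", f"This agent may not call '{name}'.")

    # 4. Execute the tool with the validated parameters.
    try:
        result = func(**params)
    except Exception as exc:
        return fail("execute", f"{name} failed: {exc}")

    # 5. Format the result for the next model turn.
    return {"ok": True, "tool": name, "result": result}

print(run_tool_call('{"name": "get_weather", "parameters": {"city": "Oslo"}}'))
```

Each failure records the stage that rejected the call, which makes it easier to tell a malformed model output apart from a policy denial or a failing service.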
Validation is critical because models occasionally generate invalid tool calls. A model might pass a string where a number is expected, omit a required parameter, or use a parameter name that does not exist in the schema. Catching these errors at the validation stage prevents them from reaching the external service, where they would cause cryptic failures or unintended behavior. Good validation returns clear error messages that help the model correct its call on the next turn.
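A minimal sketch of such a validator follows. It covers only the three failure modes named above (wrong type, missing required parameter, unknown parameter name) plus enum checks, and the schema and parameter names are made up for the example. A real runtime would more likely rely on a full JSON Schema validator than on hand-written checks like these.

```python
JSON_TYPES = {
    "string": str, "number": (int, float), "integer": int,
    "boolean": bool, "array": list, "object": dict,
}

def type_matches(value, json_type: str) -> bool:
    # bool is a subclass of int in Python, so reject it for numeric types.
    if json_type in ("number", "integer") and isinstance(value, bool):
        return False
    return isinstance(value, JSON_TYPES[json_type])

def validate_params(schema: dict, params: dict) -> list[str]:
    """Return every problem found, phrased so the model can fix its call."""
    errors = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in params:
            errors.append(f"Missing required parameter '{name}'.")
    for name, value in params.items():
        if name not in properties:
            valid = ", ".join(sorted(properties))
            errors.append(f"Unknown parameter '{name}'. Valid parameters: {valid}.")
            continue
        rule = properties[name]
        if "type" in rule and not type_matches(value, rule["type"]):
            errors.append(
                f"Parameter '{name}' must be of type {rule['type']}, "
                f"got {type(value).__name__}."
            )
        elif "enum" in rule and value not in rule["enum"]:
            errors.append(f"Parameter '{name}' must be one of {rule['enum']}.")
    return errors

# Hypothetical schema for a search tool.
SCHEMA = {
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "integer"},
        "sort": {"type": "string", "enum": ["newest", "oldest"]},
    },
    "required": ["query"],
}

errors = validate_params(SCHEMA, {"limit": "10", "sort": "random", "order": "asc"})
for line in errors:
    print(line)
```

Collecting all errors instead of stopping at the first one lets the model repair the whole call in a single retry.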
Parallel tool execution improves performance for agents that need multiple independent pieces of information. If the model generates three tool calls in a single response (searching the web, querying a database, and reading a file), the runtime can execute all three simultaneously rather than sequentially, so the total wait is roughly that of the slowest call rather than the sum of all three. The results are returned together in one turn, saving the two extra round trips to the model that issuing one call per turn would require. Modern APIs from Anthropic and OpenAI support multiple tool calls per response, and production runtimes take advantage of this to reduce latency.
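One way to run independent tool calls concurrently in Python is asyncio.gather. The sketch below uses three placeholder tools that simply sleep to simulate I/O; the tool names, their parameters, and the result shape are assumptions for illustration.

```python
import asyncio
import time

# Placeholder tools: each sleeps to simulate network or disk latency.
async def search_web(query: str) -> str:
    await asyncio.sleep(0.2)
    return f"results for {query!r}"

async def query_database(table: str) -> str:
    await asyncio.sleep(0.2)
    return f"rows from {table}"

async def read_file(path: str) -> str:
    await asyncio.sleep(0.2)
    return f"contents of {path}"

TOOLS = {
    "search_web": search_web,
    "query_database": query_database,
    "read_file": read_file,
}

async def execute_all(tool_calls: list[dict]) -> list[dict]:
    coros = [TOOLS[c["name"]](**c["parameters"]) for c in tool_calls]
    # return_exceptions=True keeps one failing tool from discarding the
    # results of the others; outcomes come back in the order of the calls.
    outcomes = await asyncio.gather(*coros, return_exceptions=True)
    results = []
    for call, outcome in zip(tool_calls, outcomes):
        if isinstance(outcome, Exception):
            results.append({"tool": call["name"], "ok": False, "error": str(outcome)})
        else:
            results.append({"tool": call["name"], "ok": True, "result": outcome})
    return results

calls = [
    {"name": "search_web", "parameters": {"query": "agent runtimes"}},
    {"name": "query_database", "parameters": {"table": "orders"}},
    {"name": "read_file", "parameters": {"path": "notes.txt"}},
]

start = time.perf_counter()
results = asyncio.run(execute_all(calls))
elapsed = time.perf_counter() - start
print(results)
print(f"elapsed: {elapsed:.2f}s")
```

This only pays off when the calls are truly independent. If one call needs the output of another, the runtime has to run them in sequence.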
Common Tool Categories
Information retrieval tools help agents gather data from external sources. Web search, knowledge base queries, API lookups, and document retrieval all fall into this category. These tools are typically read-only and safe to execute without approval, since they do not modify any external state.
Computation tools perform calculations, run code, analyze data, and transform information. Code execution sandboxes, data analysis libraries, and mathematical calculators let agents process information in ways that language models cannot do natively. Code execution is particularly powerful because it lets the agent write and run programs to solve novel problems, but it requires strong sandboxing to prevent security issues.
Action tools modify external state. Sending emails, creating database records, deploying code, updating CRM entries, and publishing content are all action tools. These tools require careful access control because their effects are often irreversible. Production agent systems typically require human approval for action tools or limit them to specific contexts.
Communication tools enable the agent to interact with humans or other agents. Sending messages, creating tickets, updating status dashboards, and posting to collaboration platforms keep stakeholders informed about the agent's progress and let them provide guidance when needed.
The Model Context Protocol (MCP)
MCP is a standardized protocol for tool discovery and invocation across different platforms and providers. Instead of each agent framework defining its own tool format, MCP provides a common interface that tool providers can implement once and agent frameworks can consume universally. This standardization is similar to what USB did for hardware peripherals: before USB, many devices needed their own connector and driver. MCP aims to create a universal connector for AI agent tools.
With MCP, a tool provider publishes an MCP server that describes its available tools and handles execution. An agent framework connects to MCP servers as clients, discovering available tools and invoking them through the standardized protocol. The agent can connect to multiple MCP servers simultaneously, aggregating tools from different providers into a single unified tool set.
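The aggregation idea can be sketched without any real MCP library. The classes below are in-memory stand-ins invented for this example: FakeToolServer is not the MCP SDK and does not speak the wire protocol, and the list_tools and call_tool method names merely echo the discovery and invocation steps described above. Prefixing each tool with its server name is one possible way to avoid collisions, not something the text prescribes.

```python
from typing import Any, Callable

class FakeToolServer:
    """In-memory stand-in for an MCP server (hypothetical, not the real SDK)."""

    def __init__(self, tools: dict[str, Callable[..., Any]]):
        self._tools = tools

    def list_tools(self) -> list[str]:
        # Discovery: report which tools this server offers.
        return sorted(self._tools)

    def call_tool(self, name: str, arguments: dict) -> Any:
        # Invocation: run the named tool with the given arguments.
        return self._tools[name](**arguments)

class ToolAggregator:
    """Client side: merges tools from several servers into one tool set."""

    def __init__(self) -> None:
        self._routes: dict[str, tuple[FakeToolServer, str]] = {}

    def connect(self, server_name: str, server: FakeToolServer) -> None:
        for tool in server.list_tools():
            # Qualify each tool with its server name so that two servers
            # exposing a tool called "read" do not overwrite each other.
            self._routes[f"{server_name}.{tool}"] = (server, tool)

    def available_tools(self) -> list[str]:
        return sorted(self._routes)

    def invoke(self, qualified_name: str, arguments: dict) -> Any:
        server, tool = self._routes[qualified_name]
        return server.call_tool(tool, arguments)

agg = ToolAggregator()
agg.connect("files", FakeToolServer({"read": lambda path: f"<contents of {path}>"}))
agg.connect("calendar", FakeToolServer({"read": lambda day: f"<events on {day}>"}))

print(agg.available_tools())
print(agg.invoke("files.read", {"path": "a.txt"}))
```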
Tool Design Best Practices
Effective tool design follows several principles. Tools should do one thing well rather than combining multiple operations. A tool that both searches a database and formats the results as HTML is harder for the model to use correctly than two separate tools. Each tool should have a clear, predictable input-output relationship so the model can anticipate what it will get back.
Error messages from tools should be descriptive and actionable. "Error: 404" tells the model nothing. "Error: User not found. No user exists with email john@example.com. Try searching by username or user ID instead." gives the model enough information to self-correct on the next turn.
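A small sketch of how a tool might build such a message. The user store, the function names, and the three-part error structure (what failed, the detail, a suggested next step) are assumptions made for the example.

```python
# Hypothetical in-memory user store.
USERS = {"jane@example.com": {"id": 7, "username": "jane"}}

def find_user_by_email(email: str) -> dict:
    user = USERS.get(email)
    if user is None:
        # Say what failed, with which input, and what to try next.
        return {
            "ok": False,
            "error": "User not found.",
            "detail": f"No user exists with email {email}.",
            "suggestion": "Try searching by username or user ID instead.",
        }
    return {"ok": True, "user": user}

def render_for_model(result: dict) -> str:
    # Flatten the structured result into the text the model will read.
    if result["ok"]:
        return f"User found: {result['user']}"
    return f"Error: {result['error']} {result['detail']} {result['suggestion']}"

print(render_for_model(find_user_by_email("john@example.com")))
```

Keeping the error structured until the last step lets the runtime log or count failures by kind while the model still receives a readable sentence.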
Result formatting matters because the model needs to process tool results efficiently. Returning raw JSON with hundreds of fields wastes context tokens on irrelevant data. Returning a focused summary of the most relevant fields saves tokens and helps the model focus on what matters. The best tool results include exactly the information the model is likely to need, presented in a format that is easy to parse and reason about.
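One simple way to do this is to project the raw record onto a short list of fields before it reaches the model. The record, the field list, and the truncation limit below are invented for illustration; which fields count as relevant depends on the tool.

```python
import json

# Hypothetical raw API response with far more data than the model needs.
RAW_ORDER = {
    "id": "ord_123",
    "status": "shipped",
    "total": 42.5,
    "currency": "USD",
    "internal_flags": ["a1", "b7", "c3"],
    "audit_log": "x" * 5000,
    "warehouse_routing": {"zone": 4, "bin": "Q-19"},
}

def focus_result(record: dict, fields: list[str], max_chars: int = 200) -> dict:
    """Keep only the listed fields and truncate very long strings."""
    focused = {}
    for field in fields:
        if field not in record:
            continue
        value = record[field]
        if isinstance(value, str) and len(value) > max_chars:
            value = value[:max_chars] + "...[truncated]"
        focused[field] = value
    return focused

summary = focus_result(RAW_ORDER, ["id", "status", "total", "currency"])
print(json.dumps(summary))
print(len(json.dumps(RAW_ORDER)), "characters raw vs", len(json.dumps(summary)), "focused")
```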
Tool Security and Access Control
Not every tool should be available to every agent in every context. Tool access control defines which agents can use which tools, with what parameters, and under what conditions. A customer support agent should not have access to tools that modify production databases. An internal analysis agent should not have access to tools that send external communications. These boundaries prevent both accidental and malicious misuse of agent capabilities.
Permission levels create graduated access. Read-only tools (search, query, read) carry lower risk and can be made available broadly. Write tools (create, update, send) carry higher risk and should be restricted to agents that specifically need them. Delete tools (remove, revoke, cancel) carry the highest risk and should require additional authorization or human approval before execution.
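The graduated levels can be modeled as an ordered enumeration. In this sketch the tool names, the Risk levels, and the three outcomes (allowed, denied, needs_approval) are assumptions; a real system would attach them to whatever identity and approval mechanisms it already has.

```python
from enum import IntEnum

class Risk(IntEnum):
    READ = 1    # search, query, read
    WRITE = 2   # create, update, send
    DELETE = 3  # remove, revoke, cancel

# Hypothetical mapping from tool name to its risk level.
TOOL_RISK = {
    "search_orders": Risk.READ,
    "update_order": Risk.WRITE,
    "cancel_order": Risk.DELETE,
}

def authorize(agent_max: Risk, tool: str, human_approved: bool = False) -> str:
    risk = TOOL_RISK[tool]
    if risk > agent_max:
        return "denied"  # the agent was never granted this level
    if risk is Risk.DELETE and not human_approved:
        return "needs_approval"  # highest-risk tools wait for a human
    return "allowed"

print(authorize(Risk.READ, "search_orders"))
print(authorize(Risk.READ, "update_order"))
print(authorize(Risk.DELETE, "cancel_order"))
print(authorize(Risk.DELETE, "cancel_order", human_approved=True))
```

Because the levels are ordered, granting an agent WRITE access implicitly includes READ tools, which matches the graduated model described above.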
Parameter-level restrictions add another layer of control. An agent might have access to the send_email tool but only be allowed to send emails to internal addresses, not external ones. An agent might have access to the database_query tool but only be allowed to query specific tables, not the entire database. These fine-grained restrictions limit the blast radius of tool misuse while still allowing the agent to perform its intended function.
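Both examples above can be expressed as per-tool check functions that inspect the parameters before execution. The internal domain, the allowed tables, and the parameter shapes (a list of recipients, a single table name rather than raw SQL) are assumptions made to keep the sketch short.

```python
from typing import Callable, Optional

INTERNAL_DOMAIN = "example.com"          # hypothetical company domain
ALLOWED_TABLES = {"orders", "products"}  # hypothetical table allow-list

def check_send_email(params: dict) -> Optional[str]:
    outside = [
        addr for addr in params["to"]
        if not addr.lower().endswith("@" + INTERNAL_DOMAIN)
    ]
    if outside:
        return f"External recipients are not allowed: {', '.join(outside)}."
    return None

def check_database_query(params: dict) -> Optional[str]:
    if params["table"] not in ALLOWED_TABLES:
        allowed = ", ".join(sorted(ALLOWED_TABLES))
        return f"Table '{params['table']}' is not allowed. Allowed tables: {allowed}."
    return None

# Tools without an entry here have no parameter-level restrictions.
PARAM_POLICIES: dict[str, Callable[[dict], Optional[str]]] = {
    "send_email": check_send_email,
    "database_query": check_database_query,
}

def policy_violation(tool: str, params: dict) -> Optional[str]:
    """Return a message describing the violation, or None if the call is fine."""
    check = PARAM_POLICIES.get(tool)
    return check(params) if check else None

print(policy_violation("send_email", {"to": ["ops@example.com"]}))
print(policy_violation("send_email", {"to": ["ops@example.com", "x@other.org"]}))
print(policy_violation("database_query", {"table": "salaries"}))
```

Returning a message rather than a bare boolean means a denied call can be reported back to the model in the same actionable style as any other tool error.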
Rate limiting prevents agents from making excessive tool calls, whether due to bugs, loops, or adversarial inputs. A rate limit might cap an agent at 10 web searches per minute, 100 database queries per task, or 5 email sends per hour. These limits protect external services from overload, prevent runaway costs, and serve as an early warning system for agent behavior anomalies.
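A sliding-window limiter is one common way to enforce limits like these. The sketch below is a minimal single-process version; the class name and the injectable clock parameter are choices made for the example, and a deployed system would usually need shared state across processes.

```python
import time
from collections import deque
from typing import Callable

class SlidingWindowLimiter:
    """Allow at most max_calls within any window_seconds-long interval."""

    def __init__(
        self,
        max_calls: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_calls = max_calls
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._calls: deque = deque()  # timestamps of accepted calls

    def allow(self) -> bool:
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self._window:
            self._calls.popleft()
        if len(self._calls) >= self._max_calls:
            return False  # over the limit: reject and do not record
        self._calls.append(now)
        return True

# Example: cap an agent at 10 web searches per minute.
search_limiter = SlidingWindowLimiter(max_calls=10, window_seconds=60)
decisions = [search_limiter.allow() for _ in range(12)]
print(decisions.count(True), "allowed,", decisions.count(False), "rejected")
```

Rejected calls are also a useful signal in their own right: a sudden run of rejections often points to a loop or a bug in the agent.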
Tool integration quality is often the bottleneck in agent performance. Well-designed tools with clear descriptions, strict schemas, informative error messages, and focused results enable agents to work efficiently. Poorly designed tools force the agent to guess, retry, and waste resources on avoidable errors.