Docker Compose for AI Agents Explained
Why Compose Matters for Agent Stacks
A production AI agent is rarely a single process. At minimum you need the agent runtime, a database for state persistence, and typically a model endpoint. More mature deployments add vector databases for RAG, message queues for task distribution, caching layers for performance, and observability tools for monitoring. Managing these services individually means remembering startup order, configuring network connections manually, tracking which ports each service uses, and coordinating shutdowns. Compose removes most of this operational friction by encoding the entire stack in a single file.
The declarative nature of Compose is its primary advantage. Instead of writing imperative scripts that run containers in sequence, you describe what your stack looks like: which services exist, how they connect, what resources they need, and in what order they should start. Compose reads this description and handles the orchestration details. If a service needs to wait for a database to be ready before starting, you declare that dependency rather than coding a polling loop in a startup script.
The Compose File Structure
A Compose file (compose.yaml) organizes configuration into several top-level sections. The services section is the core, defining each container in your stack with its image, ports, volumes, environment variables, dependencies, and resource limits. The volumes section declares named volumes for persistent data. The networks section defines custom networks if you need more isolation than the single default network that Compose creates for the project. Recent versions also support a models section specifically for declaring AI models as infrastructure resources.
Each service definition specifies either an image to pull from a registry or a build context to build from a Dockerfile. For AI agent stacks, you typically build custom images for your agent runtime (since it contains your application code) and pull pre-built images for infrastructure services like PostgreSQL, Redis, Qdrant, and Ollama. This hybrid approach gives you full control over your application while leveraging battle-tested images for common infrastructure.
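As a rough illustration of that layout, the sketch below combines one built service with two pulled images. The service names, the ./agent build directory, the port, the image tags, and the placeholder password are all assumptions made for the example, not values from any particular project.

```yaml
# compose.yaml: illustrative skeleton only
services:
  agent:
    build: ./agent                 # hypothetical directory containing a Dockerfile
    ports:
      - "8000:8000"                # hypothetical application port
    networks:
      - backend

  postgres:
    image: postgres:16             # pulled from a registry, not built locally
    environment:
      POSTGRES_PASSWORD: example   # placeholder; do not commit real credentials
    volumes:
      - pgdata:/var/lib/postgresql/data
    networks:
      - backend

  qdrant:
    image: qdrant/qdrant
    volumes:
      - qdrant_data:/qdrant/storage
    networks:
      - backend

# Named volumes keep data across container re-creation
volumes:
  pgdata:
  qdrant_data:

# A custom network shared by the three services
networks:
  backend:
```

Only the agent service carries a build key. The other services rely on published images, which is the hybrid approach described above.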
Service Dependencies and Startup Order
The depends_on directive tells Compose which services must start before others. For AI agents, the typical dependency chain is: databases start first, then model servers, then the agent runtime. However, depends_on by default only waits for the container to start, not for the service inside it to be ready. A PostgreSQL container might start in one second but take five seconds to initialize its database and accept connections.
Health check conditions solve this problem. You define a healthcheck on each service that verifies it is actually ready to handle requests, then use depends_on with condition: service_healthy to wait for that check to pass. For PostgreSQL, the health check runs pg_isready. For a model server like Ollama, it checks the HTTP endpoint. For Redis, it runs redis-cli ping. This ensures your agent does not start until every service it depends on has passed its health check.
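A minimal sketch of this pattern for PostgreSQL and Redis follows. The intervals, retry counts, image tags, and the ./agent build path are illustrative assumptions rather than recommended values.

```yaml
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example   # placeholder value
    healthcheck:
      # pg_isready exits 0 once the server accepts connections
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 5

  redis:
    image: redis:7
    healthcheck:
      # redis-cli ping exits 0 when the server answers PONG
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

  agent:
    build: ./agent                 # hypothetical build context
    depends_on:
      postgres:
        condition: service_healthy # wait for the check, not just container start
      redis:
        condition: service_healthy
```

With the long form of depends_on, the agent container is created only after both checks report healthy.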
Model servers require special attention for startup dependencies because large language models can take 60 to 120 seconds to load into GPU memory. Without a generous start_period on the health check, the model server can be marked unhealthy before it finishes loading, and any service waiting on condition: service_healthy will then fail to start. Setting start_period: 120s gives the model time to load, because failed checks during that window do not count toward the retry limit. This is one of the most common misconfiguration issues in AI agent Compose stacks.
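The sketch below shows where start_period sits in a health check for an Ollama service. It makes several assumptions: the check uses the ollama list CLI command instead of an HTTP probe (on the assumption that the image may not ship curl), the timing values are illustrative, and the agent build path is hypothetical. The check also confirms only that the server responds, not that a specific model is loaded.

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama_models:/root/.ollama   # persist downloaded model weights
    healthcheck:
      # Succeeds once the Ollama server answers CLI requests
      test: ["CMD", "ollama", "list"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 120s              # failures in this window do not count toward retries

  agent:
    build: ./agent                    # hypothetical build context
    depends_on:
      ollama:
        condition: service_healthy

volumes:
  ollama_models:
```

If your model takes longer than two minutes to become usable, raise start_period instead of increasing retries, so that failures after startup are still detected quickly.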
Environment Variables and Configuration
Environment variables are the primary mechanism for configuring services in Compose. Each service can define variables inline in the environment section or reference an external file through env_file. For AI agents, environment variables typically configure the model endpoint URL, API keys for cloud model providers, database connection strings, logging levels, and agent-specific parameters like temperature settings and tool timeout values.
The env_file approach is preferable for secrets because it keeps sensitive values out of the Compose file itself, which is usually committed to version control. The two mechanisms behave differently. A .env file in the same directory as your Compose file is loaded automatically, but only for variable substitution within the Compose file. Service-level env_file directives instead point to files whose contents are injected directly into the container environment. Keep these files in .gitignore and provide a .env.example template with placeholder values so team members know which variables to set.
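The following sketch shows both mechanisms on one service. The variable names (MODEL_ENDPOINT_URL, LOG_LEVEL, DATABASE_URL), the agent.env file name, and the hostnames are assumptions chosen for illustration.

```yaml
services:
  agent:
    build: ./agent                     # hypothetical build context
    env_file:
      - ./agent.env                    # hypothetical git-ignored file; injected into the container
    environment:
      # Plain inline value: reach the model server by its service name
      MODEL_ENDPOINT_URL: http://ollama:11434
      # Substituted from the shell or the project's .env file, defaulting to "info"
      LOG_LEVEL: ${LOG_LEVEL:-info}
      # POSTGRES_PASSWORD is read from .env at parse time, so it never appears in this file
      DATABASE_URL: postgresql://agent:${POSTGRES_PASSWORD}@postgres:5432/agent
```

Running docker compose config prints the file with substitutions applied, which is a quick way to confirm that each variable resolved as expected.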
Multi-File Overrides
Compose supports merging multiple configuration files, which lets you maintain separate settings for development, staging, and production without duplicating your service definitions. The base compose.yaml defines services and their relationships. A compose.override.yaml in the same directory is merged automatically when you run docker compose up without any -f flags, making it ideal for development-specific settings like source code mounts, debug ports, and verbose logging.
For production, you explicitly specify a production override file: docker compose -f compose.yaml -f compose.prod.yaml up. When files are listed with -f, compose.override.yaml is not loaded, so development settings do not leak into production. The production file typically adds resource limits, restart policies, and production-grade health checks, and replaces development environment variables with production values. This approach keeps one source of truth for service architecture while allowing environment-specific tuning without modifying the core configuration.
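As a sketch of how the two override files might differ, assume a base compose.yaml that already defines an agent service. The mount path, debug port, log levels, and resource figures below are illustrative assumptions.

```yaml
# compose.override.yaml: merged automatically in development
services:
  agent:
    volumes:
      - ./agent/src:/app/src    # hypothetical source mount for live code edits
    ports:
      - "5678:5678"             # hypothetical debugger port
    environment:
      LOG_LEVEL: debug
```

```yaml
# compose.prod.yaml: applied only when passed explicitly with -f
services:
  agent:
    restart: unless-stopped
    environment:
      LOG_LEVEL: warning
    deploy:
      resources:
        limits:
          cpus: "2.0"
          memory: 4G
```

Each file lists only the keys it changes. Compose merges them over the base definition in the order the files are given, so a later file wins for single-value settings such as LOG_LEVEL.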
The Models Section
Docker Compose now supports a models top-level element that declares AI models as first-class infrastructure. Instead of manually configuring a model server, downloading model weights, and wiring environment variables, you declare the model by its reference in the Compose file and bind it to your agent service. Compose, working with the underlying model runner, handles model pulling, server startup, and endpoint configuration, and passes the endpoint to the service through environment variables.
This feature integrates with Docker Model Runner, which on Docker Desktop runs inference on the host rather than inside a container. The model runner exposes an OpenAI-compatible API endpoint that your agent service reaches at an address Compose provides. Compared with running a model server in a container, this approach can give better GPU access (notably on platforms where containers cannot use the host GPU), simpler model lifecycle management, and model caching at the host level.
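A minimal sketch of a model declaration and its binding to a service is shown below. The field names follow the Compose models element as I understand it, so check them against the documentation for your Compose version. The model reference ai/smollm2, the environment variable names, and the build path are assumptions for illustration.

```yaml
services:
  agent:
    build: ./agent                          # hypothetical build context
    models:
      llm:
        # Names of the variables Compose sets inside the container (chosen here for illustration)
        endpoint_var: MODEL_ENDPOINT_URL    # receives the model's API endpoint URL
        model_var: MODEL_NAME               # receives the model identifier

models:
  llm:
    model: ai/smollm2                       # illustrative model reference
```

The agent code then reads MODEL_ENDPOINT_URL and MODEL_NAME from its environment instead of hard-coding a model server address.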
Common Commands
The most frequently used Compose commands for AI agent work are docker compose up -d (start all services in detached mode), docker compose down (stop and remove the project's containers and networks, leaving named volumes in place unless you add -v), docker compose logs -f service_name (follow logs for a specific service), docker compose ps (show service status and health), docker compose exec service_name bash (open a shell inside a running container, or sh if the image does not include bash), and docker compose build (rebuild images after code changes). The --build flag on up combines building and starting in one command, which is useful during development when you are frequently changing application code.
Docker Compose turns the complexity of multi-service AI agent stacks into a single declarative file. Master service dependencies with health checks, environment-based configuration, and multi-file overrides, and you can manage development through production with the same fundamental tooling.