Agent Communication Protocol (ACP)
Overview
Agent Communication Protocol (ACP) is an open standard designed to facilitate seamless communication and interoperability among AI agents, applications, and humans. Developed by IBM and the BeeAI project, ACP provides a framework-agnostic protocol enabling agents built on diverse technologies (BeeAI, LangChain, CrewAI, custom code) to collaborate effectively. The protocol addresses the problem of agent interoperability through standardized RESTful communication patterns.
ACP emphasizes peer-to-peer agent communication rather than enriching a single model’s context. The architecture supports both synchronous and asynchronous interactions with multimodal messaging using MIME types. A distinctive feature is scale-to-zero support through embedded metadata enabling agent discovery even when inactive.
Key technical components covered:
- RESTful API architecture and HTTP conventions
- Message structure and multimodal MIME types
- Agent discovery and scale-to-zero capabilities
- Architecture patterns (single-agent, multi-agent, distributed)
- Authentication and security (OAuth 2.0, mTLS)
- Capability negotiation and binding
- Python SDK implementation with BeeAI
- Session management and distributed sessions
RESTful API Architecture
ACP utilizes standard HTTP conventions with well-defined REST endpoints ensuring compatibility with common web infrastructure:
REST-Based Communication employs standard HTTP methods:
- GET: Retrieve agent information, capabilities, session state
- POST: Send messages, initiate sessions, invoke agent actions
- PUT: Update session state or agent configuration
- DELETE: Terminate sessions or remove resources
Endpoint structure follows RESTful patterns:
/agents: List available agents/agents/{agent_id}: Agent-specific operations/sessions: Session management/sessions/{session_id}/messages: Message exchange within session/capabilities: Capability discovery
Content negotiation uses HTTP headers:
Content-Type: Specifies request body format (application/json, multipart/mixed)Accept: Indicates preferred response formatAuthorization: Authentication credentials- Custom headers for protocol extensions
Asynchronous and synchronous support:
Synchronous interactions use standard request-response pattern with immediate responses. Client waits for complete agent processing before receiving response. Suitable for simple queries, short-running tasks, and real-time interactions.
Asynchronous interactions employ polling or callbacks for long-running tasks:
- Initial POST returns
202 Acceptedwith status URL - Client polls status URL for completion
- Callbacks via webhook URLs for event notifications
- Suitable for complex reasoning, multi-step workflows, time-intensive operations
Streaming interactions use Server-Sent Events (SSE) over HTTP for real-time updates. Agent streams partial results as they become available. Client receives events incrementally without waiting for complete response. Enables responsive user experiences for generation tasks.
Framework agnostic design means any HTTP client can interact with ACP agents:
- cURL for command-line testing
- Postman for API exploration
- Standard HTTP libraries in any programming language
- No specialized protocol implementation required
Lightweight implementation requires minimal overhead compared to complex RPC protocols. Standard web infrastructure (load balancers, proxies, CDNs) works without modification.
Message Structure and Multimodal MIME Types
ACP messages support diverse content types through multimodal messaging:
Message structure consists of ordered sequence of parts, each representing distinct content:
Part attributes:
- content_type: MIME type specifying content format (text/plain, image/png, audio/wav, video/mp4, application/json)
- content OR content_url: Actual content inline or as external URL
- content_encoding: Encoding method (plain, base64, gzip)
- name: Optional identifier designating part as Artifact
- metadata: Optional key-value pairs providing additional context
Artifacts are specialized message parts representing significant outputs:
- Attachments referenced by name
- Citations with source attribution
- Generated files (documents, code, images)
- Named results for downstream processing
Multimodal message example:
{
"role": "agent/image-analyzer",
"parts": [
{
"content_type": "image/png",
"content": "iVBORw0KGgoAAAANSUhEUgAAAAUA...",
"content_encoding": "base64",
"name": "analyzed_image"
},
{
"content_type": "text/plain",
"content": "Image contains: cat, sofa, window. Confidence: 95%"
},
{
"content_type": "application/json",
"content": "{\"objects\": [{\"label\": \"cat\", \"confidence\": 0.97}]}",
"name": "detection_results"
}
]
}Supported MIME types include:
- Text: text/plain, text/markdown, text/html
- Images: image/jpeg, image/png, image/gif, image/webp
- Audio: audio/mp3, audio/wav, audio/ogg
- Video: video/mp4, video/webm
- Documents: application/pdf, application/msword
- Data: application/json, application/xml, application/octet-stream
- Custom: Any registered MIME type
Content handling strategies:
Inline content: Small data embedded directly in message. Base64 encoding for binary data. Efficient for small payloads avoiding external requests.
External URLs: Large content referenced by URL. Agent or resource server hosts content. Client retrieves content separately when needed. Reduces message size for large files.
Mixed approach: Small critical content inline, large supplementary content via URLs. Balances efficiency with convenience.
Extensibility: Protocol handles new MIME types without core specification changes. Agents negotiate supported content types during capability exchange. Unsupported content types gracefully handled with error responses.
Agent Discovery and Scale-to-Zero
Offline discovery through embedded metadata enables agent discovery without running instances:
Agent Manifest contains essential information embedded in distribution packages:
- Agent name and version
- Capabilities and supported operations
- Input/output specifications
- Resource requirements
- Endpoint URLs when deployed
- Authentication requirements
- Documentation and examples
Embedding mechanisms:
Container image labels: Metadata stored as OCI image annotations. Docker/Kubernetes environments read labels without running containers. Enables registry-based discovery.
Bundled metadata files: JSON/YAML files packaged with agent code. Build process synchronizes manifest with implementation. CI/CD pipelines validate consistency.
Package metadata: Python package metadata, npm package.json, Maven POM. Standard package management tools expose agent information.
Discovery process:
- Client queries registry or repository for available agents
- Retrieves embedded manifests without instantiating agents
- Evaluates agent capabilities against requirements
- Selects appropriate agent(s) for task
- Instantiates agent on-demand when needed
Scale-to-zero architecture benefits:
Resource efficiency: Agents instantiated only when needed. Zero resource consumption when idle. Cost optimization in cloud environments.
Dynamic scaling: Automatic provisioning based on demand. Kubernetes/serverless platforms leverage embedded metadata. Scale from zero to multiple instances seamlessly.
Simplified deployment: Self-contained agent packages with complete metadata. No separate configuration management. Consistent deployment across environments.
Secure environments: Discovery in air-gapped or disconnected networks. No dependency on external discovery services. Local catalogs built from embedded manifests.
Version management: Multiple agent versions discoverable simultaneously. Manifest includes version information. Clients select appropriate version for compatibility.
Use cases:
- Serverless agent deployments (AWS Lambda, Cloud Functions)
- Edge computing with intermittent connectivity
- Development environments with on-demand agent activation
- Multi-tenant platforms with per-tenant agent instances
- Secure enterprise environments with restricted network access
Architecture Patterns
ACP supports flexible deployment patterns from simple single-agent to complex distributed systems:
Single-Agent Architecture:
Client communicates directly with single agent via RESTful interface. ACP server wraps agent exposing HTTP endpoints. Agent processes requests and returns responses in ACP format.
Characteristics:
- Simplest deployment model
- One-to-one client-agent relationship
- Minimal infrastructure requirements
- Suitable for development, testing, specialized applications
Multi-Agent Single Server:
ACP server hosts multiple agents behind single HTTP endpoint. Each agent individually addressable through routing mechanism. Server uses agent metadata to determine appropriate handler.
Benefits:
- Resource efficiency through shared infrastructure
- Simplified deployment with single service
- Centralized logging and monitoring
- Consistent authentication and authorization
- Shared connection pools and caching
Routing mechanisms:
- URL path-based:
/agents/agent1,/agents/agent2 - Header-based:
X-Agent-Id: agent1 - Query parameter:
?agent=agent1
Use cases:
- Agents with similar resource requirements
- Related agents benefiting from co-location
- Development environments with multiple test agents
- Enterprise deployments with centralized management
Distributed Multi-Server Architecture:
ACP clients discover and communicate with multiple independent servers, each hosting one or more agents.
Advantages:
Scalability:
- Independent scaling of different agent types
- Load distribution across multiple servers
- Fault isolation between services
- Horizontal scaling per agent workload
Flexibility:
- Different deployment environments per agent
- Technology stack diversity (Python, TypeScript, Java)
- Independent development and deployment cycles
- Team autonomy in agent development
Resilience:
- Service failure doesn’t affect other agents
- Rolling updates without system-wide downtime
- Geographic distribution for latency optimization
Router Agent Pattern:
Central agent functions as both ACP server and client. Receives incoming requests, decomposes into sub-tasks, delegates to specialized downstream agents, aggregates responses.
Capabilities:
- Task delegation and parallelism
- Specialization routing (route tasks to best-suited agents)
- Complex workflow orchestration
- Response aggregation and synthesis
Implementation:
Client → Router Agent → [Specialized Agent 1]
→ [Specialized Agent 2]
→ [Specialized Agent 3]
← Aggregated Response ←
Use cases:
- Multi-step workflows requiring different expertise
- Parallel processing of independent sub-tasks
- Orchestration of heterogeneous agent ecosystem
- Enterprise integration hubs
Distributed Sessions:
Sessions span multiple independent server instances without shared infrastructure. Session descriptors in protocol itself reference content on arbitrary resource servers.
Architecture:
- Each ACP server manages its own session version
- Session content referenced by HTTP URLs
- Arbitrary resource servers store content
- No centralized session state required
Benefits:
- Fault tolerance (no single point of failure)
- Scalability (no centralized bottleneck)
- Flexibility (any HTTP-accessible storage)
- Cloud-native design (serverless compatible)
Authentication and Security
ACP implements comprehensive security through OAuth 2.0 and Mutual TLS (mTLS):
OAuth 2.0 Implementation:
Authorization flows support various scenarios:
- Authorization Code Flow: Web applications with backend servers
- Client Credentials Flow: Service-to-service communication
- Device Code Flow: IoT devices and limited-input scenarios
- PKCE Extension: Mobile and single-page applications
Token management:
- Access tokens: Short-lived (minutes to hours) for API requests
- Refresh tokens: Long-lived for obtaining new access tokens
- Token rotation: Refresh tokens invalidated after use
- Token revocation: Immediate invalidation on logout or compromise
Security best practices:
- Store tokens securely encrypted at rest
- Never expose tokens in logs or URLs
- Implement token expiration monitoring
- Use HTTPS exclusively for token transmission
Mutual TLS (mTLS) Authentication:
Client certificate authentication requires both parties present X.509 certificates:
Client side:
- Generate public-private key pair
- Obtain signed certificate from trusted CA
- Present certificate during TLS handshake
- Use private key to prove possession
Server side:
- Verify client certificate against trusted CA
- Validate certificate expiration and revocation
- Extract client identity from certificate subject
- Enforce certificate-based access policies
Certificate-bound access tokens:
- Access tokens bound to client’s certificate
- Token includes confirmation claim (
cnf) with certificate digest - Resource servers verify both token and certificate match
- Stolen tokens unusable without corresponding private key
Implementation steps:
- Client certificate generation and registration
- Authorization server mTLS configuration
- Token issuance with certificate binding
- Resource server validation of token and certificate
Benefits:
- Strong client authentication beyond passwords
- Protection against token theft and replay attacks
- Suitable for high-security environments
- Compliance with regulatory requirements
Additional security measures:
Input validation: Sanitize all inputs preventing injection attacks.
Rate limiting: Protect against denial-of-service and brute-force attacks.
Audit logging: Record all authentication attempts and API access for forensics.
Network security: Enforce HTTPS, implement IP whitelisting, use VPNs for sensitive deployments.
Capability Negotiation and Binding
Agent Capability Negotiation and Binding Protocol (ACNBP) provides structured framework for secure agent interactions:
10-step negotiation process:
Capability discovery: Agents publish available capabilities through discovery mechanisms. Includes supported operations, input/output formats, resource requirements, performance characteristics.
Candidate pre-screening: Requesting agent filters potential partners based on published capabilities. Eliminates obviously unsuitable agents early. Reduces negotiation overhead.
Candidate selection: Narrow candidates to shortlist based on more detailed criteria. May involve scoring algorithms or user preferences. Prepares for active negotiation.
Secure negotiation phases: Multi-round negotiation establishing terms of interaction:
- Phase 1: Exchange detailed capability specifications
- Phase 2: Negotiate parameters (timeout, retry policy, error handling)
- Phase 3: Agree on communication protocols and data formats
- Phase 4: Establish security parameters (authentication, encryption)
Digital signatures: Each negotiation message cryptographically signed ensuring:
- Message authenticity (sender verification)
- Message integrity (tampering detection)
- Non-repudiation (sender cannot deny)
Capability attestation: Agents provide verifiable proof of capabilities:
- Cryptographic attestation of supported operations
- Performance guarantees with SLA commitments
- Resource availability declarations
- Compliance and certification proofs
Binding commitment: Final agreement between agents cryptographically bound:
- Binding document includes all negotiated terms
- Both agents sign binding creating mutual commitment
- Binding stored for audit and dispute resolution
- Violation of binding detectable and provable
Protocol extension mechanism: Enables backward-compatible evolution:
- New capability types added without breaking existing agents
- Version negotiation ensures compatibility
- Agents declare supported protocol versions
- Fallback to common subset when versions differ
Security analysis: ACNBP evaluated using MAESTRO threat modeling framework addressing:
- Impersonation attacks (prevented by digital signatures)
- Capability forgery (prevented by attestation)
- Binding repudiation (prevented by cryptographic commitment)
- Protocol downgrade attacks (prevented by version negotiation)
Use case example - Document Translation:
- Translation requester discovers translation agents
- Pre-screens for source/target language support
- Selects candidates based on quality metrics
- Negotiates translation parameters (terminology, style)
- Verifies capability attestations (translation accuracy)
- Establishes secure communication channel
- Commits to binding with agreed terms
- Executes translation with verified capabilities
- Validates results against commitment
- Finalizes transaction with proof of completion
Python SDK and BeeAI Implementation
Official Python SDK provides comprehensive ACP implementation through BeeAI:
Installation:
pip install acp-sdk beeai-sdk
# For framework integration
pip install beeai-framework
pip install 'beeai-framework[acp]'Server implementation:
import asyncio
from acp.server.highlevel import Server, Context
from beeai_sdk.providers.agent import run_agent_provider
from beeai_sdk.schemas.text import TextOutput, TextInput
def main():
server = Server("my-server")
@server.agent(
"hello-world",
"This is my Hello World agent",
input=TextInput,
output=TextOutput,
)
async def run_agent(input: TextInput, ctx: Context) -> TextOutput:
return TextOutput(text=f"Hi there {input.text}")
asyncio.run(run_agent_provider(server))Server components:
- Server class: Manages HTTP endpoints and routing
- @server.agent decorator: Registers agents with metadata
- Context object: Provides access to session state and request information
- Input/Output schemas: Type-safe message handling with Pydantic
Client implementation:
import asyncio
from acp import ClientSession
from acp.client.sse import sse_client
async def run_client():
async with sse_client(url="http://localhost:8000/sse") as streams:
async with ClientSession(*streams) as session:
await session.initialize()
resp = await session.list_agents()
print("Available agents:", [agent.name for agent in resp.agents])
resp = await session.run_agent("hello-world", {"text": "Bee"})
print("Agent:", resp.output["text"])
asyncio.run(run_client())Client features:
- SSE client: Server-Sent Events for streaming responses
- ClientSession: Manages connection lifecycle
- Agent discovery: List available agents dynamically
- Agent invocation: Type-safe agent execution
- Async/await: Native asyncio support for concurrent operations
BeeAI Framework integration:
import asyncio
from beeai_framework.adapters.acp.agents import ACPAgent
from beeai_framework.memory.unconstrained_memory import UnconstrainedMemory
from examples.helpers.io import ConsoleReader
async def main():
reader = ConsoleReader()
agent = ACPAgent(
agent_name="chat",
url="http://127.0.0.1:8001",
memory=UnconstrainedMemory()
)
for prompt in reader:
response = await agent.run(prompt).on(
"update",
lambda data, event: reader.write("Agent 🤖 (debug) : ", data),
)
reader.write("Agent 🤖 : ", response.last_message.text)
asyncio.run(main())Framework features:
- ACPAgent: High-level agent abstraction
- Memory systems: Various memory implementations (unconstrained, windowed, semantic)
- Event handling: Subscribe to agent events (update, error, complete)
- Error handling: Structured exception handling with FrameworkError
SDK capabilities:
- Automatic serialization/deserialization of messages
- Type validation using Pydantic schemas
- Built-in error handling and retry logic
- Session management and state persistence
- Multimodal content handling
- Streaming response support
- Agent discovery and metadata access
Deployment patterns:
- Local development server
- Docker containerized agents
- Kubernetes deployments
- Serverless functions (AWS Lambda, Cloud Functions)
- Edge computing environments
Session Management and Distributed Sessions
Session management enables stateful multi-turn conversations:
Session lifecycle:
- Creation: Client initiates session with initial message or empty state
- Interaction: Multiple message exchanges within session context
- State persistence: Session state maintained across interactions
- Termination: Explicit close or timeout-based cleanup
Session state includes:
- Conversation history: Ordered sequence of messages
- Context variables: Key-value pairs for application state
- Agent memory: Working memory for reasoning processes
- Artifacts: Generated files, images, or documents
- Metadata: Session creation time, participant information, configuration
Local sessions: State stored in server memory or local database. Simple implementation for single-server deployments. Suitable for development and small-scale applications.
Distributed sessions: State distributed across multiple servers without shared infrastructure.
Implementation approach:
Session descriptors: Protocol-level objects describing session state:
- Session ID and version
- Participant agents and roles
- Message references (URLs to messages on resource servers)
- State references (URLs to state snapshots)
- Coordination metadata
Resource servers: Arbitrary HTTP-accessible storage for session content:
- Cloud object storage (S3, GCS, Azure Blob)
- CDN with caching
- Specialized document stores
- Agent-local storage with HTTP access
Session synchronization:
- Each agent maintains its own session view
- Session updates published with descriptors
- Agents fetch content from resource servers as needed
- Eventually consistent model (no strong consistency required)
Benefits:
- Fault tolerance: No single point of failure; agents can continue if some servers unavailable
- Scalability: No centralized session store bottleneck; horizontal scaling unlimited
- Flexibility: Any HTTP storage works; no specific database required
- Cost efficiency: Leverage cheap object storage for session data
- Geographic distribution: Content stored near users for latency optimization
Consistency model:
- Eventual consistency for session state
- Conflict resolution through timestamp ordering
- Optimistic updates with retry on conflict
- Explicit synchronization points when needed
Security considerations:
- Session IDs as capability tokens (knowledge grants access)
- Content URLs with time-limited signed URLs
- Encryption of sensitive session content
- Access control on resource servers
This distributed approach enables cloud-native, scalable ACP deployments without centralized coordination infrastructure.