Agent Communication Protocol (ACP)

Overview

Agent Communication Protocol (ACP) is an open standard designed to facilitate seamless communication and interoperability among AI agents, applications, and humans. Developed by IBM and the BeeAI project, ACP provides a framework-agnostic protocol enabling agents built on diverse technologies (BeeAI, LangChain, CrewAI, custom code) to collaborate effectively. The protocol addresses the problem of agent interoperability through standardized RESTful communication patterns.

ACP emphasizes peer-to-peer agent communication rather than enriching a single model’s context. The architecture supports both synchronous and asynchronous interactions with multimodal messaging using MIME types. A distinctive feature is scale-to-zero support through embedded metadata enabling agent discovery even when inactive.

Key technical components covered:

RESTful API architecture and HTTP conventions
Message structure and multimodal MIME types
Agent discovery and scale-to-zero capabilities
Architecture patterns (single-agent, multi-agent, distributed)
Authentication and security (OAuth 2.0, mTLS)
Capability negotiation and binding
Python SDK implementation with BeeAI
Session management and distributed sessions

RESTful API Architecture

ACP utilizes standard HTTP conventions with well-defined REST endpoints ensuring compatibility with common web infrastructure:

REST-Based Communication employs standard HTTP methods:

GET: Retrieve agent information, capabilities, session state
POST: Send messages, initiate sessions, invoke agent actions
PUT: Update session state or agent configuration
DELETE: Terminate sessions or remove resources

Endpoint structure follows RESTful patterns:

/agents: List available agents
/agents/{agent_id}: Agent-specific operations
/sessions: Session management
/sessions/{session_id}/messages: Message exchange within session
/capabilities: Capability discovery
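
A minimal sketch of how the endpoints above might be addressed from Python's standard library. The base URL, the `build_request` helper, the session ID, and the payload shape are assumptions for illustration, not taken from the ACP specification; the requests are built but not sent.

```python
import json
from typing import Optional
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumed address of an ACP server


def build_request(method: str, path: str, body: Optional[dict] = None) -> Request:
    """Build (but do not send) an HTTP request for an ACP-style endpoint."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


# List agents, then post a message into a (hypothetical) session.
list_agents = build_request("GET", "/agents")
send_message = build_request(
    "POST",
    "/sessions/abc123/messages",
    {"parts": [{"content_type": "text/plain", "content": "Hello"}]},
)
```

Passing either object to `urllib.request.urlopen` would send it; any other HTTP client could issue the same requests.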

Content negotiation uses HTTP headers:

Content-Type: Specifies request body format (application/json, multipart/mixed)
Accept: Indicates preferred response format
Authorization: Authentication credentials
Custom headers for protocol extensions

Asynchronous and synchronous support:

Synchronous interactions use standard request-response pattern with immediate responses. Client waits for complete agent processing before receiving response. Suitable for simple queries, short-running tasks, and real-time interactions.

Asynchronous interactions employ polling or callbacks for long-running tasks:

Initial POST returns 202 Accepted with status URL
Client polls status URL for completion
Callbacks via webhook URLs for event notifications
Suitable for complex reasoning, multi-step workflows, time-intensive operations
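
The polling variant can be sketched as a small loop. The `status` field and its values (`running`, `completed`, `failed`) are assumptions, as is the `fetch_status` callable, which stands in for a GET on the status URL returned with the 202 response.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until it reports a terminal state or attempts run out."""
    for _ in range(max_attempts):
        status = fetch_status()
        if status.get("status") in ("completed", "failed"):
            return status
        sleep(interval)
    raise TimeoutError("task did not finish within the polling budget")


# Stand-in for GET <status URL>: reports "running" twice, then "completed".
_responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "completed", "output": "done"},
])
result = poll_until_done(lambda: next(_responses), sleep=lambda _: None)
```

Bounding the number of attempts keeps a client from polling forever when a task is lost; a webhook callback avoids the loop entirely at the cost of exposing an endpoint on the client side.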

Streaming interactions use Server-Sent Events (SSE) over HTTP for real-time updates. Agent streams partial results as they become available. Client receives events incrementally without waiting for complete response. Enables responsive user experiences for generation tasks.
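
To illustrate the wire format, here is a minimal SSE parser over an iterable of text lines. It handles only the `event` and `data` fields and comment lines; the event name `update` and the payloads are made up for the example.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event from an iterable of text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # the format strips one leading space
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


stream = [
    "event: update\n",
    "data: partial result 1\n",
    "\n",
    ": keep-alive\n",
    "data: final result\n",
    "\n",
]
events = list(parse_sse(stream))
```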

Framework agnostic design means any HTTP client can interact with ACP agents:

cURL for command-line testing
Postman for API exploration
Standard HTTP libraries in any programming language
No specialized protocol implementation required

Lightweight implementation requires minimal overhead compared to complex RPC protocols. Standard web infrastructure (load balancers, proxies, CDNs) works without modification.

Message Structure and Multimodal MIME Types

ACP messages support diverse content types through multimodal messaging:

A message consists of an ordered sequence of parts, each representing a distinct piece of content:

Part attributes:

content_type: MIME type specifying content format (text/plain, image/png, audio/wav, video/mp4, application/json)
content OR content_url: Actual content inline or as external URL
content_encoding: Encoding method (plain, base64, gzip)
name: Optional identifier designating part as Artifact
metadata: Optional key-value pairs providing additional context

Artifacts are specialized message parts representing significant outputs:

Attachments referenced by name
Citations with source attribution
Generated files (documents, code, images)
Named results for downstream processing

Multimodal message example:

{
  "role": "agent/image-analyzer",
  "parts": [
    {
      "content_type": "image/png",
      "content": "iVBORw0KGgoAAAANSUhEUgAAAAUA...",
      "content_encoding": "base64",
      "name": "analyzed_image"
    },
    {
      "content_type": "text/plain",
      "content": "Image contains: cat, sofa, window. Confidence: 95%"
    },
    {
      "content_type": "application/json",
      "content": "{\"objects\": [{\"label\": \"cat\", \"confidence\": 0.97}]}",
      "name": "detection_results"
    }
  ]
}
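
A receiving agent might decode such a message along the following lines. This is a sketch that assumes inline content only; `decode_part` is a hypothetical helper, and the image bytes are just the 8-byte PNG signature rather than a real image.

```python
import base64
import json


def decode_part(part: dict):
    """Return the decoded payload of one message part (inline content only)."""
    content = part["content"]
    if part.get("content_encoding") == "base64":
        return base64.b64decode(content)
    if part["content_type"] == "application/json":
        return json.loads(content)
    return content


message = {
    "role": "agent/image-analyzer",
    "parts": [
        {
            "content_type": "image/png",
            # First 8 bytes of a PNG file (its signature), base64-encoded.
            "content": base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii"),
            "content_encoding": "base64",
            "name": "analyzed_image",
        },
        {
            "content_type": "application/json",
            "content": "{\"objects\": [{\"label\": \"cat\", \"confidence\": 0.97}]}",
            "name": "detection_results",
        },
    ],
}

# Parts that carry a name are artifacts; index them for downstream use.
artifacts = {p["name"]: decode_part(p) for p in message["parts"] if "name" in p}
```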

Supported MIME types include:

Text: text/plain, text/markdown, text/html
Images: image/jpeg, image/png, image/gif, image/webp
Audio: audio/mpeg (MP3), audio/wav, audio/ogg
Video: video/mp4, video/webm
Documents: application/pdf, application/msword
Data: application/json, application/xml, application/octet-stream
Custom: Any registered MIME type

Content handling strategies:

Inline content: Small data embedded directly in message. Base64 encoding for binary data. Efficient for small payloads avoiding external requests.

External URLs: Large content referenced by URL. Agent or resource server hosts content. Client retrieves content separately when needed. Reduces message size for large files.

Mixed approach: Small critical content inline, large supplementary content via URLs. Balances efficiency with convenience.
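
One way to apply these strategies is a size cutoff, sketched below. The 64 KiB limit, the `make_part` helper, and the `fake_upload` function with its example URL are all assumptions; a real deployment would pick its own threshold and storage backend.

```python
import base64
from typing import Callable

INLINE_LIMIT_BYTES = 64 * 1024  # assumed cutoff; tune per deployment


def make_part(content_type: str, payload: bytes, upload: Callable[[bytes], str]) -> dict:
    """Embed small payloads inline; store large ones and reference them by URL."""
    if len(payload) <= INLINE_LIMIT_BYTES:
        return {
            "content_type": content_type,
            "content": base64.b64encode(payload).decode("ascii"),
            "content_encoding": "base64",
        }
    return {"content_type": content_type, "content_url": upload(payload)}


def fake_upload(payload: bytes) -> str:
    # Placeholder for a real object-store upload; returns a made-up URL.
    return f"https://files.example.com/{len(payload)}-bytes"


small = make_part("image/png", b"\x00" * 10, fake_upload)
large = make_part("video/mp4", b"\x00" * (INLINE_LIMIT_BYTES + 1), fake_upload)
```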

Extensibility: Protocol handles new MIME types without core specification changes. Agents negotiate supported content types during capability exchange. Unsupported content types gracefully handled with error responses.

Agent Discovery and Scale-to-Zero

Offline discovery through embedded metadata enables agent discovery without running instances:

Agent Manifest contains essential information embedded in distribution packages:

Agent name and version
Capabilities and supported operations
Input/output specifications
Resource requirements
Endpoint URLs when deployed
Authentication requirements
Documentation and examples

Embedding mechanisms:

Container image labels: Metadata stored as OCI image annotations. Docker/Kubernetes environments read labels without running containers. Enables registry-based discovery.

Bundled metadata files: JSON/YAML files packaged with agent code. Build process synchronizes manifest with implementation. CI/CD pipelines validate consistency.

Package metadata: Python package metadata, npm package.json, Maven POM. Standard package management tools expose agent information.

Discovery process:

Client queries registry or repository for available agents
Retrieves embedded manifests without instantiating agents
Evaluates agent capabilities against requirements
Selects appropriate agent(s) for task
Instantiates agent on-demand when needed
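
The evaluation and selection steps can be sketched as a filter over manifests that were read without starting any agent. The manifest fields (`name`, `version`, `capabilities`) and the catalog entries are hypothetical.

```python
from typing import Iterable, List


def select_agents(manifests: Iterable[dict], required: set) -> List[dict]:
    """Return manifests whose declared capabilities cover every required one."""
    return [m for m in manifests if required <= set(m.get("capabilities", []))]


# Hypothetical manifests, as they might be read from image labels or bundled files.
catalog = [
    {"name": "translator", "version": "1.2.0", "capabilities": ["translate", "detect-language"]},
    {"name": "summarizer", "version": "0.9.1", "capabilities": ["summarize"]},
    {"name": "polyglot", "version": "2.0.0", "capabilities": ["translate", "summarize"]},
]

matches = select_agents(catalog, {"translate"})
names = [m["name"] for m in matches]
```

Only the agents left in `matches` would then need to be instantiated.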

Scale-to-zero architecture benefits:

Resource efficiency: Agents instantiated only when needed. Zero resource consumption when idle. Cost optimization in cloud environments.

Dynamic scaling: Automatic provisioning based on demand. Kubernetes/serverless platforms leverage embedded metadata. Scale from zero to multiple instances seamlessly.

Simplified deployment: Self-contained agent packages with complete metadata. No separate configuration management. Consistent deployment across environments.

Secure environments: Discovery in air-gapped or disconnected networks. No dependency on external discovery services. Local catalogs built from embedded manifests.

Version management: Multiple agent versions discoverable simultaneously. Manifest includes version information. Clients select appropriate version for compatibility.

Use cases:

Serverless agent deployments (AWS Lambda, Cloud Functions)
Edge computing with intermittent connectivity
Development environments with on-demand agent activation
Multi-tenant platforms with per-tenant agent instances
Secure enterprise environments with restricted network access

Architecture Patterns

ACP supports flexible deployment patterns from simple single-agent to complex distributed systems:

Single-Agent Architecture:

Client communicates directly with single agent via RESTful interface. ACP server wraps agent exposing HTTP endpoints. Agent processes requests and returns responses in ACP format.

Characteristics:

Simplest deployment model
One-to-one client-agent relationship
Minimal infrastructure requirements
Suitable for development, testing, specialized applications

Multi-Agent Single Server:

ACP server hosts multiple agents behind single HTTP endpoint. Each agent individually addressable through routing mechanism. Server uses agent metadata to determine appropriate handler.

Benefits:

Resource efficiency through shared infrastructure
Simplified deployment with single service
Centralized logging and monitoring
Consistent authentication and authorization
Shared connection pools and caching

Routing mechanisms:

URL path-based: /agents/agent1, /agents/agent2
Header-based: X-Agent-Id: agent1
Query parameter: ?agent=agent1
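
A server could combine the three mechanisms in one resolver, as in this sketch. The precedence order (path, then header, then query parameter) is an assumption, and `resolve_agent_id` is a hypothetical helper.

```python
from typing import Mapping, Optional
from urllib.parse import parse_qs, urlsplit


def resolve_agent_id(url: str, headers: Mapping[str, str]) -> Optional[str]:
    """Find the target agent from the path, then the header, then the query string."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) >= 2 and segments[0] == "agents":
        return segments[1]
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {k.lower(): v for k, v in headers.items()}
    if "x-agent-id" in lowered:
        return lowered["x-agent-id"]
    query = parse_qs(parts.query)
    if "agent" in query:
        return query["agent"][0]
    return None
```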

Use cases:

Agents with similar resource requirements
Related agents benefiting from co-location
Development environments with multiple test agents
Enterprise deployments with centralized management

Distributed Multi-Server Architecture:

ACP clients discover and communicate with multiple independent servers, each hosting one or more agents.

Advantages:

Scalability:

Independent scaling of different agent types
Load distribution across multiple servers
Fault isolation between services
Horizontal scaling per agent workload

Flexibility:

Different deployment environments per agent
Technology stack diversity (Python, TypeScript, Java)
Independent development and deployment cycles
Team autonomy in agent development

Resilience:

Service failure doesn’t affect other agents
Rolling updates without system-wide downtime
Geographic distribution for latency optimization

Router Agent Pattern:

Central agent functions as both ACP server and client. It receives incoming requests, decomposes them into sub-tasks, delegates these to specialized downstream agents, and aggregates the responses.

Capabilities:

Task delegation and parallelism
Specialization routing (route tasks to best-suited agents)
Complex workflow orchestration
Response aggregation and synthesis

Implementation:

Client → Router Agent → [Specialized Agent 1]
                      → [Specialized Agent 2]
                      → [Specialized Agent 3]
         ← Aggregated Response ←
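
The fan-out and aggregation in the diagram can be sketched with `asyncio.gather`. The two downstream agents here are local coroutines standing in for remote ACP calls, and every name in the block is hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Agent = Callable[[str], Awaitable[str]]


async def translate(task: str) -> str:
    return f"translated({task})"


async def summarize(task: str) -> str:
    return f"summarized({task})"


async def route(task: str, agents: Dict[str, Agent]) -> Dict[str, str]:
    """Send the task to every downstream agent in parallel and collect results."""
    names = list(agents)
    results = await asyncio.gather(*(agents[n](task) for n in names))
    return dict(zip(names, results))


aggregated = asyncio.run(route("report.txt", {"translate": translate, "summarize": summarize}))
```

A real router would usually select a subset of agents per sub-task rather than broadcasting, and would decide how to handle a failure in one branch.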

Use cases:

Multi-step workflows requiring different expertise
Parallel processing of independent sub-tasks
Orchestration of heterogeneous agent ecosystem
Enterprise integration hubs

Distributed Sessions:

Sessions span multiple independent server instances without shared infrastructure. Session descriptors, defined in the protocol itself, reference content stored on arbitrary resource servers.

Architecture:

Each ACP server manages its own session version
Session content referenced by HTTP URLs
Arbitrary resource servers store content
No centralized session state required

Benefits:

Fault tolerance (no single point of failure)
Scalability (no centralized bottleneck)
Flexibility (any HTTP-accessible storage)
Cloud-native design (serverless compatible)

Authentication and Security

ACP implements comprehensive security through OAuth 2.0 and Mutual TLS (mTLS):

OAuth 2.0 Implementation:

Authorization flows support various scenarios:

Authorization Code Flow: Web applications with backend servers
Client Credentials Flow: Service-to-service communication
Device Code Flow: IoT devices and limited-input scenarios
PKCE Extension: Mobile and single-page applications

Token management:

Access tokens: Short-lived (minutes to hours) for API requests
Refresh tokens: Long-lived for obtaining new access tokens
Token rotation: Refresh tokens invalidated after use
Token revocation: Immediate invalidation on logout or compromise

Security best practices:

Store tokens securely encrypted at rest
Never expose tokens in logs or URLs
Implement token expiration monitoring
Use HTTPS exclusively for token transmission

Mutual TLS (mTLS) Authentication:

Client certificate authentication requires both parties to present X.509 certificates:

Client side:

Generate public-private key pair
Obtain signed certificate from trusted CA
Present certificate during TLS handshake
Use private key to prove possession

Server side:

Verify client certificate against trusted CA
Validate certificate expiration and revocation
Extract client identity from certificate subject
Enforce certificate-based access policies

Certificate-bound access tokens:

Access tokens bound to client’s certificate
Token includes confirmation claim (cnf) with certificate digest
Resource servers verify both token and certificate match
Stolen tokens unusable without corresponding private key
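
The resource server's matching step can be sketched as follows. The `x5t#S256` member name and the unpadded base64url SHA-256 encoding follow RFC 8705; the source itself only says "certificate digest". The certificate bytes are placeholders, and verification of the token's own signature is assumed to happen elsewhere.

```python
import base64
import hashlib
import hmac


def cert_thumbprint(cert_der: bytes) -> str:
    """Base64url-encoded SHA-256 digest of a DER certificate, without padding."""
    digest = hashlib.sha256(cert_der).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def token_matches_cert(claims: dict, cert_der: bytes) -> bool:
    """Check that the token's cnf claim names the certificate used on this connection."""
    expected = claims.get("cnf", {}).get("x5t#S256", "")
    return hmac.compare_digest(expected, cert_thumbprint(cert_der))


# Placeholder bytes stand in for real DER-encoded certificates.
client_cert = b"client certificate bytes"
claims = {"sub": "agent-a", "cnf": {"x5t#S256": cert_thumbprint(client_cert)}}
```

A token presented over a connection that uses any other certificate fails the check, which is what makes a stolen token unusable on its own.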

Implementation steps:

Client certificate generation and registration
Authorization server mTLS configuration
Token issuance with certificate binding
Resource server validation of token and certificate

Benefits:

Strong client authentication beyond passwords
Protection against token theft and replay attacks
Suitable for high-security environments
Compliance with regulatory requirements

Additional security measures:

Input validation: Sanitize all inputs preventing injection attacks.

Rate limiting: Protect against denial-of-service and brute-force attacks.

Audit logging: Record all authentication attempts and API access for forensics.

Network security: Enforce HTTPS, implement IP allowlisting, use VPNs for sensitive deployments.

Capability Negotiation and Binding

The Agent Capability Negotiation and Binding Protocol (ACNBP) provides a structured framework for secure agent interactions:

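10-step negotiation process, built from the following elements (the translation use case at the end of this section walks through all ten steps):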

Capability discovery: Agents publish available capabilities through discovery mechanisms. Includes supported operations, input/output formats, resource requirements, performance characteristics.

Candidate pre-screening: Requesting agent filters potential partners based on published capabilities. Eliminates obviously unsuitable agents early. Reduces negotiation overhead.

Candidate selection: Narrow candidates to shortlist based on more detailed criteria. May involve scoring algorithms or user preferences. Prepares for active negotiation.

Secure negotiation phases: Multi-round negotiation establishing terms of interaction:

Phase 1: Exchange detailed capability specifications
Phase 2: Negotiate parameters (timeout, retry policy, error handling)
Phase 3: Agree on communication protocols and data formats
Phase 4: Establish security parameters (authentication, encryption)

Digital signatures: Each negotiation message cryptographically signed ensuring:

Message authenticity (sender verification)
Message integrity (tampering detection)
Non-repudiation (sender cannot deny)

Capability attestation: Agents provide verifiable proof of capabilities:

Cryptographic attestation of supported operations
Performance guarantees with SLA commitments
Resource availability declarations
Compliance and certification proofs

Binding commitment: Final agreement between agents cryptographically bound:

Binding document includes all negotiated terms
Both agents sign binding creating mutual commitment
Binding stored for audit and dispute resolution
Violation of binding detectable and provable

Protocol extension mechanism: Enables backward-compatible evolution:

New capability types added without breaking existing agents
Version negotiation ensures compatibility
Agents declare supported protocol versions
Fallback to common subset when versions differ
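
The fallback rule can be sketched as picking the highest version both agents declare. The dotted version strings and both helper names are assumptions; the source does not specify how ACNBP encodes versions.

```python
from typing import Iterable, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn "1.10" into (1, 10) so versions compare numerically, not as strings."""
    return tuple(int(piece) for piece in text.split("."))


def negotiate_version(ours: Iterable[str], theirs: Iterable[str]) -> Optional[str]:
    """Return the highest protocol version both sides support, or None."""
    common = set(ours) & set(theirs)
    return max(common, key=parse_version) if common else None
```

For example, `negotiate_version(["1.0", "1.2", "1.10"], ["1.2", "1.10", "2.0"])` selects `"1.10"`, whereas a plain string comparison would wrongly prefer `"1.2"`.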

Security analysis: ACNBP evaluated using MAESTRO threat modeling framework addressing:

Impersonation attacks (prevented by digital signatures)
Capability forgery (prevented by attestation)
Binding repudiation (prevented by cryptographic commitment)
Protocol downgrade attacks (prevented by version negotiation)

Use case example - Document Translation:

Translation requester discovers translation agents
Pre-screens for source/target language support
Selects candidates based on quality metrics
Negotiates translation parameters (terminology, style)
Verifies capability attestations (translation accuracy)
Establishes secure communication channel
Commits to binding with agreed terms
Executes translation with verified capabilities
Validates results against commitment
Finalizes transaction with proof of completion

Python SDK and BeeAI Implementation

Official Python SDK provides comprehensive ACP implementation through BeeAI:

Installation:

pip install acp-sdk beeai-sdk
# For framework integration
pip install beeai-framework
pip install 'beeai-framework[acp]'

Server implementation:

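import asyncio
from acp.server.highlevel import Server, Context
from beeai_sdk.providers.agent import run_agent_provider
from beeai_sdk.schemas.text import TextOutput, TextInput

def main():
    # Create the ACP server that will host the agent.
    server = Server("my-server")

    # Register the agent with its name, description, and input/output schemas.
    @server.agent(
        "hello-world",
        "This is my Hello World agent",
        input=TextInput,
        output=TextOutput,
    )
    async def run_agent(input: TextInput, ctx: Context) -> TextOutput:
        return TextOutput(text=f"Hi there {input.text}")

    # Start the provider that serves the registered agent.
    asyncio.run(run_agent_provider(server))

# Without this entry point, main() is defined but never called.
if __name__ == "__main__":
    main()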

Server components:

Server class: Manages HTTP endpoints and routing
@server.agent decorator: Registers agents with metadata
Context object: Provides access to session state and request information
Input/Output schemas: Type-safe message handling with Pydantic

Client implementation:

import asyncio
from acp import ClientSession
from acp.client.sse import sse_client
 
async def run_client():
    async with sse_client(url="http://localhost:8000/sse") as streams:
        async with ClientSession(*streams) as session:
            await session.initialize()
 
            resp = await session.list_agents()
            print("Available agents:", [agent.name for agent in resp.agents])
 
            resp = await session.run_agent("hello-world", {"text": "Bee"})
            print("Agent:", resp.output["text"])
 
asyncio.run(run_client())

Client features:

SSE client: Server-Sent Events for streaming responses
ClientSession: Manages connection lifecycle
Agent discovery: List available agents dynamically
Agent invocation: Type-safe agent execution
Async/await: Native asyncio support for concurrent operations

BeeAI Framework integration:

import asyncio
from beeai_framework.adapters.acp.agents import ACPAgent
from beeai_framework.memory.unconstrained_memory import UnconstrainedMemory
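# ConsoleReader lives in the examples directory of the beeai-framework repository,
# so this import only resolves when run from a checkout of that repository.
from examples.helpers.io import ConsoleReader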
 
async def main():
    reader = ConsoleReader()
 
    agent = ACPAgent(
        agent_name="chat",
        url="http://127.0.0.1:8001",
        memory=UnconstrainedMemory()
    )
 
    for prompt in reader:
        response = await agent.run(prompt).on(
            "update",
            lambda data, event: reader.write("Agent 🤖 (debug) : ", data),
        )
 
        reader.write("Agent 🤖 : ", response.last_message.text)
 
asyncio.run(main())

Framework features:

ACPAgent: High-level agent abstraction
Memory systems: Various memory implementations (unconstrained, windowed, semantic)
Event handling: Subscribe to agent events (update, error, complete)
Error handling: Structured exception handling with FrameworkError

SDK capabilities:

Automatic serialization/deserialization of messages
Type validation using Pydantic schemas
Built-in error handling and retry logic
Session management and state persistence
Multimodal content handling
Streaming response support
Agent discovery and metadata access

Deployment patterns:

Local development server
Docker containerized agents
Kubernetes deployments
Serverless functions (AWS Lambda, Cloud Functions)
Edge computing environments

Session Management and Distributed Sessions

Session management enables stateful multi-turn conversations:

Session lifecycle:

Creation: Client initiates session with initial message or empty state
Interaction: Multiple message exchanges within session context
State persistence: Session state maintained across interactions
Termination: Explicit close or timeout-based cleanup

Session state includes:

Conversation history: Ordered sequence of messages
Context variables: Key-value pairs for application state
Agent memory: Working memory for reasoning processes
Artifacts: Generated files, images, or documents
Metadata: Session creation time, participant information, configuration

Local sessions: State stored in server memory or local database. Simple implementation for single-server deployments. Suitable for development and small-scale applications.

Distributed sessions: State distributed across multiple servers without shared infrastructure.

Implementation approach:

Session descriptors: Protocol-level objects describing session state:

Session ID and version
Participant agents and roles
Message references (URLs to messages on resource servers)
State references (URLs to state snapshots)
Coordination metadata

Resource servers: Arbitrary HTTP-accessible storage for session content:

Cloud object storage (S3, GCS, Azure Blob)
CDN with caching
Specialized document stores
Agent-local storage with HTTP access

Session synchronization:

Each agent maintains its own session view
Session updates published with descriptors
Agents fetch content from resource servers as needed
Eventually consistent model (no strong consistency required)

Benefits:

Fault tolerance: No single point of failure; agents can continue if some servers unavailable
Scalability: No centralized session store bottleneck; servers scale horizontally without coordinating through a shared store
Flexibility: Any HTTP storage works; no specific database required
Cost efficiency: Leverage cheap object storage for session data
Geographic distribution: Content stored near users for latency optimization

Consistency model:

Eventual consistency for session state
Conflict resolution through timestamp ordering
Optimistic updates with retry on conflict
Explicit synchronization points when needed
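
Timestamp-ordered conflict resolution can be sketched as a last-writer-wins merge of two session views. The `(timestamp, value)` state layout is an assumption, and the approach presumes timestamps are comparable across servers.

```python
from typing import Dict, Tuple

# Each entry maps a key to (timestamp, value); timestamps are assumed comparable
# across servers (e.g. synchronized clocks), which real deployments must ensure.
State = Dict[str, Tuple[float, str]]


def merge_states(local: State, remote: State) -> State:
    """Last-writer-wins merge: for each key keep the entry with the newer timestamp."""
    merged = dict(local)
    for key, (ts, value) in remote.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged


server_a = {"language": (10.0, "en"), "topic": (12.0, "billing")}
server_b = {"language": (11.0, "de"), "user": (9.0, "alice")}
merged = merge_states(server_a, server_b)
```

Because the merge gives the same result whichever server applies it first, both views converge once they have exchanged updates. Ties on equal timestamps would need an extra rule, such as comparing server IDs.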

Security considerations:

Session IDs as capability tokens (knowing the ID grants access, so IDs must be unguessable and kept confidential)
Content URLs with time-limited signed URLs
Encryption of sensitive session content
Access control on resource servers
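
Time-limited signed URLs can be sketched with an HMAC over the path and expiry time. The query parameter names (`expires`, `sig`), the shared secret, and the example host are assumptions; cloud object stores provide their own presigned-URL schemes with different formats.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret"  # assumption: key shared by the signer and the resource server


def _signature(path: str, expires: int) -> str:
    message = f"{path}?expires={expires}".encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(base: str, path: str, expires: int) -> str:
    """Append an expiry time and an HMAC signature to a content URL."""
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{base}{path}?{query}"


def verify_url(url: str, now: int) -> bool:
    """Accept the URL only if the signature is valid and it has not expired."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False
    return now <= expires and hmac.compare_digest(sig, _signature(parts.path, expires))


url = sign_url("https://files.example.com", "/sessions/abc123/state.json", expires=1_700_000_600)
```

Since the signature covers both the path and the expiry, a holder of the URL can neither extend its lifetime nor reuse it for a different session's content.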

This distributed approach enables cloud-native, scalable ACP deployments without centralized coordination infrastructure.
