OpenClaw ACP Orchestration

This document describes the local OpenClaw ACP setup used to orchestrate Codex and Claude Code from an OpenClaw agent on the gateway machine.

Scope

The target workflow is:

  • OpenClaw remains the orchestration brain
  • natural-language requests like "use codex for this" or "run this in claude code" are routed to ACP
  • the coding harness runs on the same gateway machine where the local codex and claude clients are installed
  • session lifecycle is handled through OpenClaw ACP rather than sub-agents or shell relay hacks

Local Baseline Before ACP Enablement

Captured on 2026-03-29:

  • OpenClaw: 2026.3.28 (f9b1079)
  • bundled acpx plugin present locally but disabled and not in the plugin allowlist
  • local codex: /opt/homebrew/bin/codex 0.117.0
  • local claude: /opt/homebrew/bin/claude 2.1.87
  • gateway host: 8 CPU cores, 8 GB RAM
  • default OpenClaw agent workspace: ~/.openclaw/workspace

Architectural Decision

Primary architecture:

  • OpenClaw ACP with acpx

Fallback architecture only if parity is not acceptable:

  • openclaw mcp serve with Codex or Claude Code connected as external MCP clients to existing OpenClaw channel conversations

Why ACP is primary:

  • this is the official OpenClaw architecture for "run this in Codex" / "start Claude Code in a thread"
  • it gives durable ACP sessions, resume, bindings, and programmatic spawning through sessions_spawn with runtime: "acp"

Important Runtime Caveat

The bundled acpx runtime supports Codex and Claude, but the stock aliases are adapter commands, not necessarily the bare local terminal binaries:

  • codex -> npx -y @zed-industries/codex-acp@0.9.5
  • claude -> npx -y @zed-industries/claude-agent-acp@0.21.0

That means "same as terminal" behavior has to be validated explicitly. It is not guaranteed just because ACP works.

Baseline Configuration Applied

The current host-local OpenClaw config keeps the native main orchestrator and adds ACP-backed agents alongside it:

  • agents.list[0] = main with runtime.type = "embedded"
  • agents.list[1] = codex with runtime.type = "acp"
  • agents.list[2] = claude with runtime.type = "acp"
  • acp.enabled = true
  • acp.dispatch.enabled = true
  • acp.backend = "acpx"
  • acp.defaultAgent = "codex"
  • acp.allowedAgents = ["claude", "codex"]
  • acp.maxConcurrentSessions = 2
  • plugins.allow += acpx
  • plugins.entries.acpx.enabled = true
  • ACP-specific cwd values are absolute paths, not ~-prefixed shortcuts

The main entry is intentional. Once agents.list is populated, OpenClaw treats that list as the agent inventory. If main is omitted, ACP targets can displace the native orchestrator and break the intended architecture.

ACP Health Equivalents

The docs mention /acp doctor, but the operator-friendly local equivalents on this host are:

  • openclaw config validate
  • openclaw plugins inspect acpx --json
  • openclaw gateway status --json
  • openclaw status --deep
  • cd /opt/homebrew/lib/node_modules/openclaw/dist/extensions/acpx && ./node_modules/.bin/acpx config show

Healthy baseline on this machine means:

  • config validates
  • acpx plugin status is loaded
  • gateway RPC is healthy
  • openclaw status --deep shows Agents 3 with default main
  • acpx config show works without bootstrap errors
  • plugins.installs does not need an acpx record because acpx is bundled with OpenClaw, not separately installed

Important health nuance:

  • openclaw plugins inspect acpx --json only tells you the plugin is loaded, not that the ACP backend is healthy enough for sessions_spawn runtime:"acp"
  • the actual readiness signal is the gateway log line acpx runtime backend ready
  • during rollout, the backend stayed unavailable until ACP-specific cwd values were changed from ~/.openclaw/workspace to absolute paths
  • a later startup bug showed that pinning a custom command path disables the plugin-local managed install path and can leave ACP unavailable after a restart if the local acpx artifact is absent at boot
  • the current host fix is to leave plugins.entries.acpx.config.command unset so the bundled plugin can manage its own plugin-local acpx binary
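
The readiness distinction above can be sketched as a small log classifier. This is a minimal sketch, not part of OpenClaw: acp_backend_state is a hypothetical helper name, and it assumes the marker strings quoted in this document appear verbatim in the gateway log. The last matching line wins, so a restart that registers the backend but never reaches ready is reported as registered rather than ready.

```shell
# Hypothetical helper: read gateway log lines on stdin and report the most
# recent ACP backend state seen. Marker strings are taken from this document;
# their exact position within a log line is an assumption.
acp_backend_state() {
  local state="unknown" line
  while IFS= read -r line; do
    case $line in
      *"acpx runtime backend ready"*)
        state="ready" ;;
      *"acpx runtime backend registered"*)
        state="registered" ;;
      *"ACP runtime backend is currently unavailable"*|*"probe failed"*)
        state="unavailable" ;;
    esac
  done
  printf '%s\n' "$state"
}

# Usage on the gateway host (command taken from the maintenance note):
#   openclaw logs --limit 80 --plain --timeout 10000 | acp_backend_state
```

Anything other than ready means sessions_spawn with runtime: "acp" should not be expected to work yet, even if the plugin reports as loaded.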

Maintenance note:

  • the current host intentionally uses the managed plugin-local default command path rather than a custom override
  • after any OpenClaw upgrade, re-run:
    • openclaw config validate
    • openclaw plugins inspect acpx --json
    • openclaw logs --limit 80 --plain --timeout 10000 | rg 'acpx runtime backend (registered|ready|probe failed)'
    • ls -l /opt/homebrew/lib/node_modules/openclaw/dist/extensions/acpx/node_modules/.bin/acpx
  • if ACP comes up unavailable at startup, check whether a custom plugins.entries.acpx.config.command override was reintroduced before debugging deeper
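
The post-upgrade re-run above could be wrapped in one function so that every check is reported even when an earlier one fails. This is a rough sketch under assumptions: run_step, backend_ready_in_log, and acp_post_upgrade_check are hypothetical names, the openclaw commands are assumed to exit non-zero on failure, and grep stands in for rg so the sketch does not depend on ripgrep.

```shell
# Run one check quietly and print PASS or FAIL with a label.
run_step() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    printf 'PASS  %s\n' "$label"
  else
    printf 'FAIL  %s\n' "$label"
    return 1
  fi
}

# Hypothetical helper: succeeds when the recent gateway log contains the
# readiness line described in this document.
backend_ready_in_log() {
  openclaw logs --limit 80 --plain --timeout 10000 |
    grep -q 'acpx runtime backend ready'
}

# Runs all four checks and returns non-zero if any of them failed.
acp_post_upgrade_check() {
  local failures=0
  local acpx_bin=/opt/homebrew/lib/node_modules/openclaw/dist/extensions/acpx/node_modules/.bin/acpx
  run_step 'config validates' openclaw config validate || failures=$((failures + 1))
  run_step 'acpx plugin inspect' openclaw plugins inspect acpx --json || failures=$((failures + 1))
  run_step 'backend ready in recent log' backend_ready_in_log || failures=$((failures + 1))
  run_step 'plugin-local acpx binary present' ls -l "$acpx_bin" || failures=$((failures + 1))
  [ "$failures" -eq 0 ]
}
```

Calling acp_post_upgrade_check after an upgrade prints one line per check. A FAIL on the readiness line together with a PASS on plugin inspect is the case where the custom command override is worth checking first.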

Security Review

Why this needs a review

ACP coding sessions are headless and non-interactive. If they are allowed to write files and run shell commands, the permission mode decides whether those actions proceed with no human in the loop.

Leading rollout candidate

  • plugins.entries.acpx.config.permissionMode = "approve-all"
  • plugins.entries.acpx.config.nonInteractivePermissions = "deny"

Why deny instead of fail:

  • on this host, graceful degradation is better than crashing an otherwise useful ACP session at the first blocked headless permission prompt
  • the live acpx plugin schema for OpenClaw 2026.3.28 validates deny, so this is an intentional runtime choice rather than a placeholder

What approve-all means here

On this gateway host, an ACP coding harness may:

  • write files in the configured working tree
  • execute shell commands without an interactive prompt
  • access network resources that are already reachable from the host
  • read local home-directory configuration that the launched harness itself can reach

Risk boundaries

This host already runs OpenClaw with:

  • tools.exec.host = "gateway"
  • tools.exec.security = "full"
  • tools.exec.ask = "off"

So ACP approve-all does not create the first fully trusted execution path on this machine. It extends that trust to ACP-backed Codex/Claude sessions. That is still a meaningful trust expansion and should stay limited to trusted operators and trusted channels.

First-wave rollout stance

Recommended first wave:

  • enable ACP only for trusted direct operators
  • prefer explicit agentId routing and minimal bindings
  • defer broad persistent group bindings until parity and lifecycle behavior are proven
  • keep the plugin-tools bridge off unless there is a proven need for ACP harnesses to call OpenClaw plugin tools from inside the session

Observability And Recovery

Minimum required operational checks:

  • openclaw config validate
  • openclaw plugins inspect acpx --json
  • openclaw gateway status --json
  • openclaw status --deep
  • openclaw logs --follow
  • /tmp/openclaw/openclaw-YYYY-MM-DD.log

Operational questions this setup must answer:

  • did an ACP session start
  • which harness was used
  • which session key is active
  • where a stall or permission denial first occurred
  • whether the gateway restart preserved resumable state

Current host signals:

  • plugin status: openclaw plugins inspect acpx --json
  • gateway/runtime health: openclaw gateway status --json
  • agent inventory and active session count: openclaw status --deep
  • ACP adapter defaults and override file discovery: acpx config show
  • first runtime failure point: gateway log under /tmp/openclaw/

Claude adapter noise:

  • the Claude ACP adapter currently emits session/update validation noise for usage_update after otherwise successful turns
  • when filtering logs during Claude ACP troubleshooting, separate that known noise from startup failures by focusing first on:
    • acpx runtime backend ready
    • ACP runtime backend is currently unavailable
    • probe failed
    • actual session spawn/close lines

Concurrency Stance

This machine has 8 CPU cores and 8 GB RAM. A conservative initial ACP concurrency cap is better than the plan's generic placeholder of 8.

Recommended initial cap:

  • acp.maxConcurrentSessions = 2

Reason:

  • enough for one Codex and one Claude session at the same time
  • low enough to reduce memory pressure and noisy contention on the same laptop-class host
  • if operators start using longer-lived persistent ACP sessions heavily, revisit this only after checking real memory pressure and swap behavior on the gateway host

Plugin Tools Bridge

The planning material discussed plugins.entries.acpx.config.pluginToolsMcpBridge, but the local 2026.3.28 bundled acpx schema does not currently expose that key in openclaw plugins inspect acpx --json.

Current stance:

  • treat plugin-tools bridge as unsupported unless the live runtime proves otherwise
  • do not add that key blindly to openclaw.json

Default Workspace Root

The default ACP workspace root for this install is:

  • ~/.openclaw/workspace

Per-session or per-binding cwd values can narrow from there when a specific repository or skill workspace is known.

For ACP plugin/runtime config, use absolute paths instead of ~-prefixed paths.
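
A small guard can catch ~-prefixed values before they reach the config. The two helpers below are hypothetical and only illustrate the rule. They do not read or write openclaw.json, and the example home directory is a placeholder.

```shell
# Expand a leading "~" or "~/" using $HOME; other values pass through unchanged.
expand_home_prefix() {
  case $1 in
    "~")   printf '%s\n' "$HOME" ;;
    "~/"*) printf '%s/%s\n' "$HOME" "${1#??}" ;;  # ${1#??} strips the leading "~/"
    *)     printf '%s\n' "$1" ;;
  esac
}

# Succeed only for absolute paths; complain on stderr otherwise.
require_absolute_cwd() {
  case $1 in
    /*) return 0 ;;
    *)  printf 'not an absolute path: %s\n' "$1" >&2
        return 1 ;;
  esac
}

# Usage:
#   cwd=$(expand_home_prefix '~/.openclaw/workspace')
#   require_absolute_cwd "$cwd" && echo "safe to use as an ACP cwd: $cwd"
```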

Parity Results

Codex ACP parity

Validated directly with acpx codex against a real project worktree.

Observed:

  • correct cwd
  • HOME=/Users/stefano
  • access to ~/.codex
  • access to ~/.openclaw/workspace
  • access to installed Codex skills under ~/.codex/skills
  • persistent named sessions retained state across turns
  • persistent named sessions retained state across an OpenClaw gateway restart

Assessment:

  • Codex ACP is close enough to local terminal behavior for rollout

Claude Code ACP parity

Validated directly with acpx claude against the same project worktree.

Observed:

  • correct cwd
  • HOME=/Users/stefano
  • access to ~/.claude
  • access to ~/.codex when explicitly tested with shell commands
  • persistent named sessions retained state across turns
  • persistent named sessions retained state across an OpenClaw gateway restart

Known defect:

  • the Claude ACP adapter emits an extra session/update validation error after otherwise successful turns:
    • Invalid params
    • sessionUpdate: 'usage_update'

Assessment:

  • Claude ACP is usable, but noisier than Codex
  • this is an adapter/protocol mismatch to monitor, not a rollout blocker for trusted operators

ACPX Override Decision

Decision:

  • do not add ~/.acpx/config.json agent overrides for Codex or Claude right now

Why:

  • Codex parity already passes with the stock alias path
  • swapping Claude from the deprecated @zed-industries/claude-agent-acp package name to @agentclientprotocol/claude-agent-acp@0.24.2 did not remove the session/update validation noise
  • raw local codex and claude CLIs are not drop-in ACP servers, so an override would add maintenance cost without delivering materially better parity

Natural-Language Routing Policy

The main agent is instructed to:

  • stay native as the orchestrator
  • use sessions_spawn with runtime: "acp" when the user explicitly asks for Codex or Claude Code
  • choose agentId: "codex" or agentId: "claude" accordingly
  • use one-shot ACP runs for single tasks
  • use persistent ACP sessions only when the user clearly wants continued context
  • avoid silent fallback to ordinary local exec when ACP was explicitly requested

The live messaging tool surface had to be extended to expose:

  • sessions_spawn
  • sessions_yield

without widening the whole profile beyond what was needed.

Binding Policy

First-wave binding policy is intentionally conservative:

  • no broad top-level persistent bindings[]
  • no automatic permanent channel/topic binds
  • prefer on-demand ACP spawn from the current conversation
  • only introduce persistent binds later if there is a clear operator need

Channel-specific note:

  • WhatsApp does not support ACP thread-bound spawn in the tested path
  • use current-conversation or one-shot ACP behavior there, not thread-bound ACP assumptions

Smoke-Test Findings

What worked:

  • direct acpx codex runs
  • direct acpx claude runs
  • mixed Codex + Claude ACPX runs in parallel
  • persistent ACPX named sessions
  • named-session recall after a gateway restart

What failed and why:

  • channel-less CLI-driven openclaw agent tests can fail ACP spawn with:
    • Channel is required when multiple channels are configured: telegram, whatsapp, bluebubbles
  • this is a context issue, not a backend-registration issue
  • synthetic CLI sessions are not a perfect substitute for a real inbound channel conversation when testing current-conversation ACP spawn
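
When scripting smoke tests, it can help to label a spawn failure as a context problem or a backend problem before digging further. The classifier below is a hypothetical sketch that matches only the two error strings quoted in this document. Any other message is reported as unknown rather than guessed at.

```shell
# Hypothetical helper: classify an ACP spawn error message.
#   context -> the test turn lacked channel context
#   backend -> the ACP backend itself was not available
classify_spawn_failure() {
  case $1 in
    *"Channel is required when multiple channels are configured"*)
      printf 'context\n' ;;
    *"ACP runtime backend is currently unavailable"*)
      printf 'backend\n' ;;
    *)
      printf 'unknown\n' ;;
  esac
}

# Usage:
#   classify_spawn_failure "$error_text"
```

A context result points at how the test was driven rather than at acpx, which matches the interpretation above.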

Operational interpretation:

  • ACP backend + harness parity are good enough for rollout
  • final operator confidence should still come from a real inbound Telegram or WhatsApp conversation, not only a synthetic CLI turn

Fallback Decision

Decision:

  • keep ACP via acpx as the primary architecture
  • do not adopt openclaw mcp serve as the primary mode at this stage

Why fallback was not adopted:

  • Codex parity is good
  • Claude parity is acceptable with one known noisy adapter defect
  • OpenClaw can now expose the ACP spawn tool in the messaging profile
  • the remaining limitation is real channel context for current-conversation spawn, not a fundamental mismatch between ACP and the installed gateway clients

Rollback

Back up ~/.openclaw/openclaw.json before any ACP change.

Current ACP implementation backup:

  • ~/.openclaw/openclaw.json.bak.pre-acp-implementation-20260329-231818
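
A backup with the same naming pattern can be produced by a short helper. This is a sketch: backup_config is a hypothetical name, and the label argument is free-form text chosen by the operator.

```shell
# Copy a config file to <file>.bak.<label>-<YYYYMMDD-HHMMSS> and print the
# path of the backup that was written.
backup_config() {
  local src="$1" label="$2" stamp dest
  stamp=$(date +%Y%m%d-%H%M%S)
  dest="$src.bak.$label-$stamp"
  cp -p "$src" "$dest" && printf '%s\n' "$dest"
}

# Usage:
#   backup_config "$HOME/.openclaw/openclaw.json" pre-acp-change
```

Printing the destination path makes it easy to paste the exact backup name into the rollback command later.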

Rollback approach:

  1. restore the backup config
  2. validate config
  3. restart the gateway
  4. confirm ACP plugin status and channel health

Example rollback:

cp ~/.openclaw/openclaw.json.bak.pre-acp-implementation-20260329-231818 ~/.openclaw/openclaw.json
openclaw config validate
openclaw gateway restart
openclaw plugins inspect acpx --json
openclaw status --deep

Implementation Hazards

Two local quirks were discovered during rollout:

  • openclaw config set is not safe for parallel writes to the same config file. Concurrent config set calls can clobber each other.
  • host-local legacy keys can reappear if a write path round-trips older config state. For this rollout, atomic file edits plus explicit validation were safer than chaining many config set commands.
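
The "atomic file edits plus explicit validation" approach could look roughly like the sketch below. atomic_swap_config is a hypothetical helper, not an OpenClaw command. It assumes the validator passed as trailing arguments (for example openclaw config validate) exits non-zero when the live config is invalid, and it does not by itself serialize concurrent writers.

```shell
# atomic_swap_config TARGET CANDIDATE [VALIDATOR...]
# Replace TARGET with CANDIDATE via a same-directory temp file and rename,
# then run the optional validator; restore the previous file if it fails.
# Returns 0 on success, 1 on copy/rename errors, 2 if validation failed.
atomic_swap_config() {
  local target="$1" candidate="$2"
  shift 2
  local tmp="$target.tmp.$$" prev="$target.prev.$$"
  cp "$candidate" "$tmp" || return 1
  cp -p "$target" "$prev" || { rm -f "$tmp"; return 1; }
  # rename within one directory replaces the file in a single step
  mv -f "$tmp" "$target" || { rm -f "$tmp" "$prev"; return 1; }
  if [ "$#" -gt 0 ] && ! "$@"; then
    mv -f "$prev" "$target"
    return 2
  fi
  rm -f "$prev"
}

# Usage:
#   atomic_swap_config "$HOME/.openclaw/openclaw.json" ./openclaw.edited.json \
#     openclaw config validate
```

Editing a candidate file and swapping it in once avoids the many small writes that chained config set calls produce. Running the swap from a single operator session at a time remains the simplest protection against clobbering.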

Implementation Notes

This document is updated milestone by milestone as the ACP rollout is implemented and verified.