diff --git a/docs/openclaw-acp-orchestration.md b/docs/openclaw-acp-orchestration.md
index a6f0a39..6974ff8 100644
--- a/docs/openclaw-acp-orchestration.md
+++ b/docs/openclaw-acp-orchestration.md
@@ -61,6 +61,8 @@ The current host-local OpenClaw config keeps the native `main` orchestrator and
 - `acp.maxConcurrentSessions = 2`
 - `plugins.allow += acpx`
 - `plugins.entries.acpx.enabled = true`
+- `plugins.entries.acpx.config.command = /opt/homebrew/lib/node_modules/openclaw/dist/extensions/acpx/node_modules/acpx/dist/cli.js`
+- ACP-specific `cwd` values are absolute paths, not `~`-prefixed shortcuts
 
 The `main` entry is intentional. Once `agents.list` is populated, OpenClaw treats that list as the agent inventory. If `main` is omitted, ACP targets can displace the native orchestrator and break the intended architecture.
 
@@ -83,6 +85,22 @@ Healthy baseline on this machine means:
 - `acpx config show` works without bootstrap errors
 - `plugins.installs` does not need an `acpx` record because `acpx` is bundled with OpenClaw, not separately installed
 
+Important health nuance:
+
+- `openclaw plugins inspect acpx --json` only tells you the plugin is loaded, not that the ACP backend is healthy enough for `sessions_spawn runtime:"acp"`
+- the actual readiness signal is the gateway log line `acpx runtime backend ready`
+- during rollout, the backend stayed unavailable until ACP-specific `cwd` values were changed from `~/.openclaw/workspace` to absolute paths and the plugin command was pinned to the direct `acpx/dist/cli.js` entrypoint
+
+Maintenance note:
+
+- the pinned command path and `expectedVersion = "0.3.1"` are intentionally explicit because they were needed to get this host healthy
+- after any OpenClaw upgrade, re-run:
+  - `openclaw config validate`
+  - `openclaw plugins inspect acpx --json`
+  - `openclaw logs --limit 80 --plain --timeout 10000 | rg 'acpx runtime backend (registered|ready|probe failed)'`
+  - `node /opt/homebrew/lib/node_modules/openclaw/dist/extensions/acpx/node_modules/acpx/dist/cli.js --version`
+- if the pinned path or version no longer matches the bundled layout, update the config deliberately instead of assuming the old override remains valid
+
 ## Security Review
 
 ### Why this needs a review
@@ -154,6 +172,15 @@ Current host signals:
 - ACP adapter defaults and override file discovery: `acpx config show`
 - first runtime failure point: gateway log under `/tmp/openclaw/`
 
+Claude adapter noise:
+
+- the Claude ACP adapter currently emits `session/update` validation noise for `usage_update` after otherwise successful turns
+- when filtering logs during Claude ACP troubleshooting, separate that known noise from startup failures by focusing first on:
+  - `acpx runtime backend ready`
+  - `ACP runtime backend is currently unavailable`
+  - `probe failed`
+  - actual session spawn/close lines
+
 ## Concurrency Stance
 
 This machine has 8 CPU cores and 8 GB RAM. A conservative initial ACP concurrency cap is better than the plan's generic placeholder of `8`.
@@ -166,6 +193,7 @@ Reason:
 
 - enough for one Codex and one Claude session at the same time
 - low enough to reduce memory pressure and noisy contention on the same laptop-class host
+- if operators start using longer-lived persistent ACP sessions heavily, revisit this only after checking real memory pressure and swap behavior on the gateway host
 
 ## Plugin Tools Bridge
 
@@ -184,6 +212,132 @@ The default ACP workspace root for this install is:
 
 Per-session or per-binding `cwd` values can narrow from there when a specific repository or skill workspace is known.
 
+For ACP plugin/runtime config, use absolute paths instead of `~`-prefixed paths.
+
+## Parity Results
+
+### Codex ACP parity
+
+Validated directly with `acpx codex` against a real project worktree.
+
+Observed:
+
+- correct `cwd`
+- `HOME=/Users/stefano`
+- access to `~/.codex`
+- access to `~/.openclaw/workspace`
+- access to installed Codex skills under `~/.codex/skills`
+- persistent named sessions retained state across turns
+- persistent named sessions retained state across an OpenClaw gateway restart
+
+Assessment:
+
+- Codex ACP is close enough to local terminal behavior for rollout
+
+### Claude Code ACP parity
+
+Validated directly with `acpx claude` against the same project worktree.
+
+Observed:
+
+- correct `cwd`
+- `HOME=/Users/stefano`
+- access to `~/.claude`
+- access to `~/.codex` when explicitly tested with shell commands
+- persistent named sessions retained state across turns
+- persistent named sessions retained state across an OpenClaw gateway restart
+
+Known defect:
+
+- the Claude ACP adapter emits an extra `session/update` validation error after otherwise successful turns:
+  - `Invalid params`
+  - `sessionUpdate: 'usage_update'`
+
+Assessment:
+
+- Claude ACP is usable, but noisier than Codex
+- this is an adapter/protocol mismatch to monitor, not a rollout blocker for trusted operators
+
+## ACPX Override Decision
+
+Decision:
+
+- do **not** add `~/.acpx/config.json` agent overrides for Codex or Claude right now
+
+Why:
+
+- Codex parity already passes with the stock alias path
+- swapping Claude from the deprecated package name to `@agentclientprotocol/claude-agent-acp@0.24.2` did **not** remove the `session/update` validation noise
+- raw local `codex` and `claude` CLIs are not drop-in ACP servers, so an override would add maintenance cost without delivering materially better parity
+
+## Natural-Language Routing Policy
+
+The `main` agent is instructed to:
+
+- stay native as the orchestrator
+- use `sessions_spawn` with `runtime: "acp"` when the user explicitly asks for Codex or Claude Code
+- choose `agentId: "codex"` or `agentId: "claude"` accordingly
+- use one-shot ACP runs for single tasks
+- use persistent ACP sessions only when the user clearly wants continued context
+- avoid silent fallback to ordinary local exec when ACP was explicitly requested
+
+The live messaging tool surface had to be extended to expose:
+
+- `sessions_spawn`
+- `sessions_yield`
+
+without widening the whole profile beyond what was needed.
+
+## Binding Policy
+
+First-wave binding policy is intentionally conservative:
+
+- no broad top-level persistent `bindings[]`
+- no automatic permanent channel/topic binds
+- prefer on-demand ACP spawn from the current conversation
+- only introduce persistent binds later if there is a clear operator need
+
+Channel-specific note:
+
+- WhatsApp does not support ACP thread-bound spawn in the tested path
+- use current-conversation or one-shot ACP behavior there, not thread-bound ACP assumptions
+
+## Smoke-Test Findings
+
+What worked:
+
+- direct `acpx codex` runs
+- direct `acpx claude` runs
+- mixed Codex + Claude ACPX runs in parallel
+- persistent ACPX named sessions
+- named-session recall after a gateway restart
+
+What failed and why:
+
+- channel-less CLI-driven `openclaw agent` tests can fail ACP spawn with:
+  - `Channel is required when multiple channels are configured: telegram, whatsapp, bluebubbles`
+- this is a context issue, not a backend-registration issue
+- synthetic CLI sessions are not a perfect substitute for a real inbound channel conversation when testing current-conversation ACP spawn
+
+Operational interpretation:
+
+- ACP backend + harness parity are good enough for rollout
+- final operator confidence should still come from a real inbound Telegram or WhatsApp conversation, not only a synthetic CLI turn
+
+## Fallback Decision
+
+Decision:
+
+- keep ACP via `acpx` as the primary architecture
+- do **not** adopt `openclaw mcp serve` as the primary mode at this stage
+
+Why fallback was not adopted:
+
+- Codex parity is good
+- Claude parity is acceptable with one known noisy adapter defect
+- OpenClaw can now expose the ACP spawn tool in the messaging profile
+- the remaining limitation is real channel context for current-conversation spawn, not a fundamental mismatch between ACP and the installed gateway clients
+
 ## Rollback
 
 Back up `~/.openclaw/openclaw.json` before any ACP change.
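
The Rollback note above asks for a backup of `~/.openclaw/openclaw.json` before any ACP change. One possible way to script that is sketched below in Python. It is not part of this patch: the `backup_config` helper and the `.bak-<timestamp>` suffix are illustrative assumptions, not something OpenClaw provides.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_config(config_path: Path, now: Optional[datetime] = None) -> Path:
    """Copy config_path to a timestamped sibling file and return the copy's path."""
    if not config_path.is_file():
        raise FileNotFoundError(f"config not found: {config_path}")
    # `now` is injectable so the resulting file name is predictable in tests.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    # Hypothetical naming scheme: openclaw.json -> openclaw.json.bak-<timestamp>
    backup_path = config_path.with_name(f"{config_path.name}.bak-{stamp}")
    # copy2 copies the contents and also tries to preserve file metadata.
    shutil.copy2(config_path, backup_path)
    return backup_path
```

Calling `backup_config(Path.home() / ".openclaw" / "openclaw.json")` before editing the ACP settings would leave a timestamped copy next to the live file, so a rollback becomes a plain copy in the other direction. An absolute path is used on purpose, in line with the patch's guidance to avoid `~`-prefixed paths.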