diff --git a/docs/architecture.md b/docs/architecture.md
new file mode 100644
index 0000000..8955934
--- /dev/null
+++ b/docs/architecture.md
@@ -0,0 +1,213 @@
+# ai-cli-dispatch Architecture
+
+This document describes the internal design of `ai-cli-dispatch`, the module breakdown, data flow, key design decisions, and how to extend the tool.
+
+## Module Breakdown
+
+```text
+src/
+├── cli.ts        — Entry point: argument parsing, command routing, I/O formatting
+├── types.ts      — Shared types and error classes
+├── constants.ts  — Client name registry and platform helpers
+├── config.ts     — Layered configuration resolution (flags → env → file → PATH)
+├── detect.ts     — Client discovery: binary lookup and version extraction
+├── dispatch.ts   — Prompt-to-client resolution (explicit flag → keywords → default)
+└── execute.ts    — Subprocess spawning, stdout/stderr capture, timeout handling
+```
+
+### Responsibilities
+
+| Module | Responsibility |
+|---|---|
+| `cli.ts` | Parses `argv` with `minimist`, routes to `list` / `run` / `dispatch`, prints JSON or text output, and controls the process exit code. |
+| `types.ts` | Defines `ClientName`, `ClientInfo`, `ExecResult`, `ToolConfig`, and the error hierarchy (`ClientNotFoundError`, `ExecError`). |
+| `constants.ts` | Holds the canonical `CLIENT_NAMES` array and `isWindows()` helper used by discovery and config. |
+| `config.ts` | Resolves per-client binary paths and the optional `defaultClient` from four layered sources. |
+| `detect.ts` | Locates each client binary on `PATH`, falls back to a manual directory scan, and invokes `--version` to extract a semver string. |
+| `dispatch.ts` | Chooses the target client from a prompt string using ordered keyword matching, with overrides for explicit `--client` and `defaultClient`. |
+| `execute.ts` | Spawns the chosen client with its native argument shape, buffers `stdout`/`stderr`, enforces a timeout, and returns an `ExecResult` or throws a typed error. |
+
+## Data Flow
+
+A typical `dispatch` invocation flows through four stages:
+
+```
+┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
+│   detect    │ ──► │   config    │ ──► │  dispatch   │ ──► │   execute   │
+└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
+     │                    │                    │                    │
+  which/where        flags/env/file       keyword scan         spawn child
+  PATH walk          defaultClient        --client override    capture output
+  --version                               fallback default     timeout / exitCode
+```
+
+### 1. Detect
+
+`detectClients()` iterates over `CLIENT_NAMES` and attempts to locate each binary:
+
+1. Invoke `which <name>` (or `where <name>` on Windows).
+2. If that fails, walk `PATH` segments manually and test `existsSync()`.
+3. If a binary is found, run `<binary> --version` and parse the first semver-like match.
+
+Result: an array of `ClientInfo` objects with `name`, `found`, `path`, and `version`.
+
+### 2. Config
+
+`resolveConfig()` builds a `ResolvedConfig` by layering sources (highest to lowest precedence):
+
+1. **CLI flags** — `--codex-path`, `--claude-path`, `--opencode-path`, `--default-client`
+2. **Environment variables** — `AI_CLI_CODEX_PATH`, `AI_CLI_CLAUDE_PATH`, `AI_CLI_OPENCODE_PATH`, `AI_CLI_DEFAULT_CLIENT`
+3. **Config file** — `~/.openclaw/ai-cli-dispatch.json` (`paths` and `defaultClient` keys)
+4. **PATH discovery** — `which`/`where` fallback via `defaultWhichSync()`
+
+Only values for the three known `ClientName` entries are accepted; unknown `defaultClient` values are ignored.
+
+### 3. Dispatch
+
+`resolveClient(prompt, config)` decides which client to use:
+
+1. If `config.client` is a valid `ClientName`, return it immediately.
+2. Lower-case the prompt and scan for substrings in order:
+   - `"open code"` → `opencode`
+   - `"claude"` → `claude`
+   - `"codex"` → `codex`
+   - `"opencode"` → `opencode`
+3. If no keyword matches, return `config.defaultClient` or `null`.
+
+This ordering intentionally prioritizes `"open code"` before `"opencode"` so the spaced natural-language variant wins.
+
+### 4. Execute
+
+`executePrompt(client, prompt, options)` runs the selected client:
+
+1. Reject empty or whitespace-only prompts with `ExecError`.
+2. Validate that an explicit `clientPath` exists on disk (if provided).
+3. Map the client to its native argument array via `CLIENT_ARGS`:
+   - `codex` → `["exec", prompt]`
+   - `claude` → `["-p", prompt]`
+   - `opencode` → `[prompt]`
+4. `spawn()` the process with `shell: false`.
+5. Buffer `stdout` and `stderr` via `"data"` listeners.
+6. Start a `setTimeout`; if it fires, `child.kill()` is sent.
+7. On `close`, resolve with `{ stdout, stderr, exitCode }`.
+8. On `error`, reject with `ClientNotFoundError` for `ENOENT` or `ExecError` for anything else.
+9. On timeout, reject with `ExecError` containing the buffered output so far.
+
+The default timeout is **5 minutes** (`300_000` ms).
+
+## Design Decisions
+
+### Coexistence with ACP
+
+`ai-cli-dispatch` is intentionally **not** an ACP agent. It is a thin, local subprocess wrapper with no session state, no thread binding, and no orchestrator protocol.
+
+- Use `ai-cli-dispatch` when you need a quick, one-shot CLI execution on the gateway host.
+- Use ACP (`docs/openclaw-acp-orchestration.md`) when you need session-bound coding harnesses, multi-turn review, or orchestrator-managed verification gates.
+
+This separation keeps the dispatcher small and avoids duplicating ACP’s scheduling, context persistence, and review-loop responsibilities.
+
+### Keyword Dispatch vs NLP
+
+Client resolution uses deterministic substring matching instead of natural-language parsing or an LLM call.
+
+**Rationale:**
+- **Speed:** No network round-trip or model load; resolution is synchronous and sub-millisecond.
+- **Predictability:** The same prompt always resolves to the same client. There is no temperature, context window, or model-version drift.
+- **Debuggability:** A user can read the ordered keyword list and know exactly why a given prompt resolved to a given client.
+- **Scope fit:** The dispatcher only needs to distinguish three clients. A full NLP pipeline would be overkill.
+
+The trade-off is that prompts like `"compare codex and claude"` resolve to `codex` because `"codex"` is checked first. Users can always override with `--client`.
+
+### Sync-Only Initial Release
+
+The current implementation is entirely synchronous from the caller’s perspective: `executePrompt` returns a promise that resolves only when the child process exits or the timeout fires.
+
+**Rationale:**
+- The primary use case is one-shot tasks (refactor, add tests, migrate) where the agent needs the complete output before proceeding.
+- Streaming would require a different output contract (callbacks, generators, or an event emitter) and complicates the JSON error model.
+- ACP already covers interactive, streaming, and session-based use cases.
+
+Streaming is an intentional future extension point (see below).
+
+### Error Taxonomy
+
+All runtime failures are represented as typed errors so callers and tests can branch precisely:
+
+| Error | When thrown | Data carried |
+|---|---|---|
+| `ClientNotFoundError` | Binary not on `PATH`, explicit `clientPath` missing, or `ENOENT` from `spawn` | `message` with client name |
+| `ExecError` | Empty prompt, unknown client, timeout, non-`ENOENT` spawn error, or child exit | `message` + full `ExecResult` (`stdout`, `stderr`, `exitCode`) |
+
+`ExecError` carries the `ExecResult` so that timeout handlers still return partial output. This avoids losing buffered stdout/stderr when a long-running task is killed.
+
+### Injection-Friendly Module Boundaries
+
+Every non-trivial module accepts an `options` bag with injectable dependencies (`spawnSync`, `spawn`, `existsSync`, `whichSync`, `readFileSync`, etc.).
+
+**Rationale:**
+- Unit tests can run without touching the real filesystem, `PATH`, or subprocess layer.
+- The CLI itself injects its real dependencies through default parameters, so production behavior is unchanged.
+- There is no global mocking required; each test provides its own narrow fakes.
+
+### Minimal Dependency Surface
+
+The runtime dependency graph contains exactly one external package: `minimist` (argument parsing). Everything else uses Node.js built-ins (`child_process`, `fs`, `os`, `path`).
+
+**Rationale:**
+- Reduces supply-chain risk and install time.
+- Avoids version-lock issues across Node.js 20+ environments.
+- Keeps the compiled/bundled footprint negligible for a tool that is often installed as a sidecar.
+
+## Extension Points
+
+### Adding a New Client
+
+To support a fourth (or fifth) AI CLI client, change four files in `src/` and the corresponding tests:
+
+1. **`src/types.ts`** — Add the new name to the `ClientName` union type.
+2. **`src/constants.ts`** — Append the new name to `CLIENT_NAMES`.
+3. **`src/execute.ts`** — Add an entry to `CLIENT_ARGS` with the client’s native argument shape.
+4. **`src/config.ts`** — No change required; the existing loop over `CLIENT_NAMES` automatically picks up the new env/flag/file keys.
+5. **`src/dispatch.ts`** — Add a keyword check for the new client in `resolveClient`. Decide its precedence relative to existing keywords.
+6. **Tests** — Add colocated test cases in `tests/dispatch.test.ts`, `tests/execute.test.ts`, and `tests/detect.test.ts`.
+
+No changes are needed in `cli.ts` because it iterates over `CLIENT_NAMES` for validation.
+
+### Streaming Support
+
+If a future use case requires real-time output (e.g., long-running codegen with progressive feedback), the cleanest extension is to add an optional `onData` callback to `ExecuteOptions`:
+
+```typescript
+export interface ExecuteOptions {
+  clientPath?: string;
+  timeoutMs?: number;
+  spawn?: ...;
+  existsSync?: ...;
+  onData?: (chunk: string, stream: "stdout" | "stderr") => void;
+}
+```
+
+When `onData` is provided, `executePrompt` would:
+- Continue buffering internally for the final `ExecResult`.
+- Also emit each chunk through `onData` so the caller can stream to a UI or logger.
+- Reject/resolve with the same error taxonomy.
+
+This preserves backward compatibility: existing callers that omit `onData` receive the exact same buffered `ExecResult` they get today.
+
+### Platform Backends
+
+The current Windows support is limited to discovery (`where` instead of `which`, `.exe` extension assumptions). If future clients require platform-specific spawn options (e.g., PowerShell quoting rules), the extension point is `CLIENT_ARGS` or a new `CLIENT_SPAWN_OPTIONS` record keyed by `ClientName`.
+
+## Testing Strategy
+
+The test suite in `tests/` mirrors the `src/` structure:
+
+| Test file | Coverage |
+|---|---|
+| `cli.test.ts` | Argument parsing, command routing, JSON/text output modes, exit codes, error formatting |
+| `config.test.ts` | Layered precedence of flags, env, file, and `which` fallback; malformed JSON tolerance |
+| `detect.test.ts` | `which` success/failure, PATH directory fallback, version parsing, missing binary handling |
+| `dispatch.test.ts` | Keyword matching, case insensitivity, `--client` precedence, `defaultClient` fallback, invalid flag handling |
+| `execute.test.ts` | Successful execution, stderr capture, non-zero exit codes, `ENOENT` → `ClientNotFoundError`, timeout, empty prompt rejection, special-character preservation |
+
+All tests use injected mocks; no test spawns real client binaries or reads the real filesystem.