# ai-cli-dispatch Architecture

This document describes the internal design of `ai-cli-dispatch`, the module breakdown, data flow, key design decisions, and how to extend the tool.

## Module Breakdown

```text
src/
├── cli.ts          - Entry point: argument parsing, command routing, I/O formatting
├── cli-helpers.ts  - Shared formatting, sync/async run handlers, error reporters
├── types.ts        - Shared types and error classes
├── constants.ts    - Client name registry and platform helpers
├── config.ts       - Layered configuration resolution (flags → env → file → PATH)
├── detect.ts       - Client discovery: binary lookup and version extraction
├── dispatch.ts     - Prompt-to-client resolution (explicit flag → keywords → default)
├── execute.ts      - Synchronous subprocess spawning, stdout/stderr capture, timeout handling
└── jobs.ts         - Async job lifecycle: detached spawn, disk-backed state, polling API
```

### Responsibilities

| Module | Responsibility |
|---|---|
| `cli.ts` | Parses `argv` with `minimist`, routes to all commands, prints JSON or text output, and controls the process exit code. |
| `cli-helpers.ts` | Shared helpers for `reportError`, `reportCliError`, `handleSyncRun`, and `handleAsyncRun` to keep `cli.ts` focused on routing. |
| `types.ts` | Defines `ClientName`, `ClientInfo`, `ExecResult`, `ToolConfig`, `Job`, `JobRecord`, `JobStatus`, and the error hierarchy (`ClientNotFoundError`, `ExecError`, `JobNotFoundError`, `JobResultUnavailableError`). |
| `constants.ts` | Holds the canonical `CLIENT_NAMES` array and `isWindows()` helper used by discovery and config. |
| `config.ts` | Resolves per-client binary paths and the optional `defaultClient` from four layered sources. |
| `detect.ts` | Locates each client binary on `PATH`, falls back to a manual directory scan, and invokes `--version` to extract a semver string. |
| `dispatch.ts` | Chooses the target client from a prompt string using ordered keyword matching, with overrides for explicit `--client` and `defaultClient`. |
| `execute.ts` | Spawns the chosen client with its native argument shape, buffers `stdout`/`stderr`, enforces a timeout, and returns an `ExecResult` or throws a typed error. |
| `jobs.ts` | Manages background jobs: writes job records to disk, spawns detached child processes, tracks running children in memory, and provides `status`, `results`, `cancel`, `list`, and `cleanup` operations. |

## Data Flow

### Synchronous dispatch (`run --sync`, `dispatch --sync`)

A sync invocation flows through four stages:

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   detect    │ ──► │   config    │ ──► │  dispatch   │ ──► │   execute   │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │                   │
 which/where         flags/env/file      keyword scan        spawn child
 PATH walk           defaultClient       --client override   capture output
 --version                               fallback default    timeout / exitCode
```

### Asynchronous dispatch (`run`, `dispatch`, `start`)

An async invocation adds the `jobs.ts` stage. The caller receives a job ID immediately; the child process continues in the background.

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   detect    │ ──► │   config    │ ──► │  dispatch   │ ──► │   execute   │ ──► │    jobs     │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │                   │                   │
 which/where         flags/env/file      keyword scan        spawn child         write job file
 PATH walk           defaultClient       --client override   capture output      detached + unref
 --version                               fallback default    timeout / exitCode  update on close
```

Later, lifecycle commands read from or modify the job store:

```
status <jobId>   ──► readJobFile    ──► return Job (sans stdout/stderr)
results <jobId>  ──► readJobFile    ──► return ExecResult (completed only)
cancel <jobId>   ──► readJobFile    ──► kill child or PID ──► write cancelled status
list-jobs        ──► readdir jobDir ──► read each file    ──► sort + filter
cleanup-jobs     ──► readdir jobDir ──► stat mtime        ──► unlink old files
```

### 1. Detect

`detectClients()` iterates over `CLIENT_NAMES` and attempts to locate each binary:

1. Invoke `which <name>` (or `where <name>` on Windows).
2. If that fails, walk `PATH` segments manually and test `existsSync()`.
3. If a binary is found, run `<binary> --version` and parse the first semver-like match.

Result: an array of `ClientInfo` objects with `name`, `found`, `path`, and `version`.
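
The fallback walk and the version parse in steps 2 and 3 can be sketched as two small pure functions. This is an illustrative sketch, not the module's actual code: `findOnPath`, `parseVersion`, the regular expression, and the injected `exists` callback are all assumptions chosen to match the description above.

```typescript
// Hypothetical helpers; names and signatures are assumptions for illustration.
type Exists = (path: string) => boolean;

/** Walk PATH segments and return the first candidate that exists, or null. */
function findOnPath(
  name: string,
  pathEnv: string,
  exists: Exists,
  delimiter = ":", // ";" on Windows
  separator = "/", // "\\" on Windows
): string | null {
  for (const dir of pathEnv.split(delimiter)) {
    if (dir === "") continue; // skip empty PATH segments
    const candidate = `${dir}${separator}${name}`;
    if (exists(candidate)) return candidate;
  }
  return null;
}

/** Return the first semver-like token (three dot-separated numbers), or null. */
function parseVersion(output: string): string | null {
  const match = output.match(/\d+\.\d+\.\d+/);
  return match ? match[0] : null;
}

// Usage with a fake filesystem instead of the real existsSync().
const fakeFiles = ["/opt/bin/claude"];
const exists: Exists = (p) => fakeFiles.indexOf(p) !== -1;

const found = findOnPath("claude", "/usr/bin:/opt/bin", exists);
const version = parseVersion("claude 1.4.2 (build abc)");
console.log(found, version);
```

Passing `exists` as a parameter keeps the walk testable without touching the real filesystem, in line with the injection approach described later in this document.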

### 2. Config

`resolveConfig()` builds a `ResolvedConfig` by layering sources (highest to lowest precedence):

1. **CLI flags**: `--codex-path`, `--claude-path`, `--opencode-path`, `--default-client`, `--timeout`
2. **Environment variables**: `AI_CLI_CODEX_PATH`, `AI_CLI_CLAUDE_PATH`, `AI_CLI_OPENCODE_PATH`, `AI_CLI_DEFAULT_CLIENT`
3. **Config file**: `~/.openclaw/ai-cli-dispatch.json` (`paths`, `defaultClient`, `timeout` keys)
4. **PATH discovery**: `which`/`where` fallback via `defaultWhichSync()`

Only values for the three known `ClientName` entries are accepted; unknown `defaultClient` values are ignored.
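
A minimal sketch of the "first layer that has a value wins" rule follows. The `Layer` and `Resolved` shapes and the `resolveLayers` function are hypothetical stand-ins for the real `resolveConfig()` internals. The sketch also assumes that an unknown `defaultClient` is skipped so that a lower layer can still supply one; the source only says such values are ignored.

```typescript
type ClientName = "codex" | "claude" | "opencode";
const CLIENT_NAMES: readonly ClientName[] = ["codex", "claude", "opencode"];

// Hypothetical shapes: one Layer per source (flags, env, file, PATH discovery).
interface Layer {
  paths?: Partial<Record<ClientName, string>>;
  defaultClient?: string;
}

interface Resolved {
  paths: Partial<Record<ClientName, string>>;
  defaultClient: ClientName | null;
}

function isClientName(value: unknown): value is ClientName {
  return CLIENT_NAMES.some((name) => name === value);
}

/** Layers must be ordered highest precedence first. */
function resolveLayers(layers: Layer[]): Resolved {
  const resolved: Resolved = { paths: {}, defaultClient: null };
  for (const name of CLIENT_NAMES) {
    for (const layer of layers) {
      const path = layer.paths?.[name];
      if (path) {
        resolved.paths[name] = path; // first layer with a value wins
        break;
      }
    }
  }
  for (const layer of layers) {
    if (isClientName(layer.defaultClient)) {
      resolved.defaultClient = layer.defaultClient; // unknown names are skipped
      break;
    }
  }
  return resolved;
}

const config = resolveLayers([
  { paths: { codex: "/flag/codex" }, defaultClient: "not-a-client" }, // CLI flags
  { paths: { codex: "/env/codex", claude: "/env/claude" } }, // environment
  { defaultClient: "claude" }, // config file
  { paths: { opencode: "/usr/bin/opencode" } }, // PATH discovery
]);
console.log(config);
```

Here the flag value for `codex` shadows the environment value, while `claude` and `opencode` fall through to the first layer that defines them.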

### 3. Dispatch

`resolveClient(prompt, config)` decides which client to use:

1. If `config.client` is a valid `ClientName`, return it immediately.
2. Lower-case the prompt and scan for substrings in order:
   - `"open code"` → `opencode`
   - `"claude"` → `claude`
   - `"codex"` → `codex`
   - `"opencode"` → `opencode`
3. If no keyword matches, return `config.defaultClient` or `null`.

The spaced variant `"open code"` is intentionally checked first, so a natural-language mention of OpenCode wins even when the prompt also names `claude` or `codex`. The unspaced `"opencode"` is checked last.
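
The resolution order above can be expressed as an ordered keyword table. The sketch below is an assumption about shape only: the `KEYWORDS` table and `DispatchConfig` interface are illustrative, and validity of `config.client` is enforced here by the type rather than by a runtime check.

```typescript
type ClientName = "codex" | "claude" | "opencode";

// Ordered table: earlier entries win when several keywords appear in a prompt.
const KEYWORDS: ReadonlyArray<readonly [string, ClientName]> = [
  ["open code", "opencode"],
  ["claude", "claude"],
  ["codex", "codex"],
  ["opencode", "opencode"],
];

// Hypothetical config shape for this sketch.
interface DispatchConfig {
  client?: ClientName;
  defaultClient?: ClientName;
}

function resolveClient(prompt: string, config: DispatchConfig = {}): ClientName | null {
  if (config.client) return config.client; // explicit --client always wins
  const lowered = prompt.toLowerCase();
  for (const [keyword, client] of KEYWORDS) {
    if (lowered.includes(keyword)) return client;
  }
  return config.defaultClient ?? null; // no keyword matched
}

console.log(resolveClient("Ask Claude to refactor the parser"));
console.log(resolveClient("write unit tests", { defaultClient: "codex" }));
```

Because the table is scanned top to bottom, a prompt that names several clients resolves to whichever keyword appears earliest in the table, not earliest in the prompt.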

### 4. Execute

`executePrompt(client, prompt, options)` runs the selected client and waits for it to finish (the `--sync` path):

1. Reject empty or whitespace-only prompts with `ExecError`.
2. Validate that an explicit `clientPath` exists on disk (if provided).
3. Map the client to its native argument array via `CLIENT_ARGS`:
   - `codex` → `["exec", "--yolo", prompt]`
   - `claude` → `["-p", prompt, "--dangerously-skip-permissions"]`
   - `opencode` → `["run", "--dangerously-skip-permissions", prompt]`
4. `spawn()` the process with `shell: false`.
5. Buffer `stdout` and `stderr` via `"data"` listeners.
6. Start a `setTimeout`; if it fires, call `child.kill()`.
7. On `close`, resolve with `{ stdout, stderr, exitCode, client, durationMs }`.
8. On `error`, reject with `ClientNotFoundError` for `ENOENT` or `ExecError` for anything else.
9. On timeout, reject with `ExecError` containing the buffered output so far.
10. If `debug` is enabled, emit a `DebugInfo` object via `onDebug`.

The default timeout is **10 minutes** (`600_000` ms).
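
Steps 1 and 3 are easy to show in isolation. The following sketch assumes `CLIENT_ARGS` is a record of builder functions; the source does not specify its exact shape, and `buildInvocation` is a hypothetical helper. A plain `Error` stands in for `ExecError` to keep the example self-contained.

```typescript
type ClientName = "codex" | "claude" | "opencode";

// Argument shapes as listed above; the function-per-client layout is an assumption.
const CLIENT_ARGS: Record<ClientName, (prompt: string) => string[]> = {
  codex: (prompt) => ["exec", "--yolo", prompt],
  claude: (prompt) => ["-p", prompt, "--dangerously-skip-permissions"],
  opencode: (prompt) => ["run", "--dangerously-skip-permissions", prompt],
};

const DEFAULT_TIMEOUT_MS = 600_000; // 10 minutes

// Hypothetical helper: validates the prompt and assembles spawn inputs.
function buildInvocation(client: ClientName, prompt: string, timeoutMs = DEFAULT_TIMEOUT_MS) {
  if (prompt.trim() === "") {
    throw new Error("prompt must not be empty"); // ExecError in the real module
  }
  // With shell: false the prompt travels as one argv entry, so quotes and
  // shell metacharacters reach the client unmodified.
  return { args: CLIENT_ARGS[client](prompt), timeoutMs, shell: false as const };
}

const invocation = buildInvocation("claude", "fix $HOME; ls");
console.log(invocation.args);
```

Keeping the prompt as a single array element is what makes `shell: false` safe for prompts that contain characters a shell would otherwise interpret.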

### 5. Jobs

`startJob(client, prompt, options)` launches a background job:

1. Generate a UUID for the job ID.
2. Build the client argument array via `CLIENT_ARGS`.
3. `spawn()` the process with `detached: true` and `stdio: ["ignore", "pipe", "pipe"]`.
4. Write an initial `JobRecord` to `~/.openclaw/ai-cli-dispatch/jobs/<jobId>.json` with status `running`.
5. Update the record with the child `pid` once available.
6. Register the child in an in-memory `runningChildren` Map for cancellation and timeout tracking.
7. Buffer `stdout`/`stderr` via `"data"` listeners.
8. On `close`, finalize the record: write status (`completed`, `failed`, `timed_out`, or `cancelled`), capture stdout/stderr, and record `durationMs`.
9. Call `child.unref()` so the dispatcher process can exit without waiting for the child.
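
Step 8 has to pick one of four terminal statuses. One plausible way to do that is sketched below; the `CloseInfo` shape, the `finalStatus` helper, and the precedence (cancellation, then timeout, then exit code) are assumptions for illustration, since the source does not state how conflicts are resolved.

```typescript
type JobStatus = "running" | "completed" | "failed" | "timed_out" | "cancelled";

// Hypothetical summary of what is known when the child's "close" event fires.
interface CloseInfo {
  exitCode: number | null; // null when the child was terminated by a signal
  timedOut: boolean;
  cancelled: boolean;
}

function finalStatus(info: CloseInfo): JobStatus {
  if (info.cancelled) return "cancelled"; // assumed to take precedence
  if (info.timedOut) return "timed_out";
  return info.exitCode === 0 ? "completed" : "failed";
}

console.log(finalStatus({ exitCode: 0, timedOut: false, cancelled: false }));
console.log(finalStatus({ exitCode: null, timedOut: true, cancelled: false }));
```

Checking the flags before the exit code matters because a killed child typically reports a `null` or non-zero exit code, which would otherwise be recorded as a plain failure.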

`getJob(jobId)` reads the job file and returns a `Job` (omitting the full stdout/stderr buffers).

`getJobResult(jobId)` returns the `ExecResult` for a completed job.

`cancelJob(jobId)` looks up the running child in memory, sends `SIGTERM`, and writes a `cancelled` status. If the child is no longer in memory, it attempts `process.kill(pid, "SIGTERM")` as a fallback.

`listJobs({ filter })` reads all `.json` files in the job directory, parses them, sorts by `startedAt` descending, and optionally filters by status.

`cleanupJobs({ maxAgeMs })` deletes job files whose age, measured from their `mtime`, exceeds the threshold. The default max age is 24 hours.
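
The selection logic behind `listJobs` and `cleanupJobs` can be separated from the filesystem calls. The sketch below works on plain data; `JobSummary`, `JobFile`, `sortAndFilter`, and `selectExpired` are hypothetical names, and `startedAt` is assumed to be an ISO 8601 timestamp string.

```typescript
// Hypothetical shapes; the real Job and file metadata types may differ.
interface JobSummary {
  id: string;
  status: string;
  startedAt: string; // assumed ISO 8601
}

interface JobFile {
  name: string;
  mtimeMs: number;
}

/** Filter by status (if given) and sort newest first without mutating the input. */
function sortAndFilter(jobs: JobSummary[], status?: string): JobSummary[] {
  return jobs
    .filter((job) => status === undefined || job.status === status)
    .sort((a, b) => Date.parse(b.startedAt) - Date.parse(a.startedAt));
}

const DEFAULT_MAX_AGE_MS = 24 * 60 * 60 * 1000; // 24 hours

/** Return the names of files whose age (now minus mtime) exceeds maxAgeMs. */
function selectExpired(files: JobFile[], now: number, maxAgeMs = DEFAULT_MAX_AGE_MS): string[] {
  return files.filter((file) => now - file.mtimeMs > maxAgeMs).map((file) => file.name);
}

const jobs: JobSummary[] = [
  { id: "a", status: "completed", startedAt: "2024-01-01T10:00:00.000Z" },
  { id: "b", status: "running", startedAt: "2024-01-02T10:00:00.000Z" },
  { id: "c", status: "completed", startedAt: "2024-01-03T10:00:00.000Z" },
];
const now = Date.parse("2024-01-03T12:00:00.000Z");
const files: JobFile[] = [
  { name: "a.json", mtimeMs: Date.parse("2024-01-01T10:05:00.000Z") },
  { name: "c.json", mtimeMs: Date.parse("2024-01-03T10:05:00.000Z") },
];

console.log(sortAndFilter(jobs, "completed").map((job) => job.id));
console.log(selectExpired(files, now));
```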

## Design Decisions

### Async-First Architecture

The default execution mode is **async** (background job). Synchronous execution requires an explicit `--sync` flag.

**Rationale:**

- **Primary use case alignment:** Most AI CLI tasks (refactoring, test generation, migration) run for multiple minutes. Blocking the caller for that long is often undesirable in automation and orchestration contexts.
- **Resilience:** A detached background job survives an unexpected dispatcher exit. The caller can reconnect later via `status` and `results`.
- **Batching:** Multiple jobs can be started in parallel without blocking the dispatcher process.
- **Backward compatibility path:** `--sync` preserves the original one-shot behavior for callers that need it, without changing the default.

### Disk-Backed Job Store

Job state is persisted as JSON files on disk rather than kept solely in memory.

**Rationale:**

- **Durability across restarts:** If the dispatcher process crashes or the host reboots, job files remain. A caller can still query `status` or `results` after recovery.
- **No memory leaks:** Long-running or forgotten jobs do not accumulate in heap. Cleanup is explicit via `cleanup-jobs`.
- **External observability:** Operators can inspect `~/.openclaw/ai-cli-dispatch/jobs/` directly without calling the CLI.
- **Simplicity:** A file-per-job model avoids the need for an embedded database or external service. It maps cleanly to the Node.js `fs` API and is trivial to mock in tests.

**Trade-off:** High-frequency job creation could strain the filesystem, but the expected volume is low (tens to hundreds of jobs, not thousands per second).
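
A file-per-job store needs little more than a write and a read helper. The sketch below uses a temporary directory rather than the real job directory. The `JobRecord` fields shown and the write-then-rename step are assumptions added for illustration; the source does not say whether writes are atomic.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Reduced, hypothetical record shape for this sketch.
interface JobRecord {
  id: string;
  status: "running" | "completed" | "failed" | "timed_out" | "cancelled";
  startedAt: string;
  pid?: number;
}

function writeJobFile(jobDir: string, record: JobRecord): void {
  const target = join(jobDir, `${record.id}.json`);
  const temp = `${target}.tmp`;
  writeFileSync(temp, JSON.stringify(record, null, 2), "utf8");
  renameSync(temp, target); // assumption: rename so readers never see a partial file
}

function readJobFile(jobDir: string, id: string): JobRecord | null {
  try {
    return JSON.parse(readFileSync(join(jobDir, `${id}.json`), "utf8")) as JobRecord;
  } catch {
    return null; // missing or unreadable file
  }
}

// Usage against a throwaway directory.
const jobDir = mkdtempSync(join(tmpdir(), "jobs-"));
writeJobFile(jobDir, { id: "job-1", status: "running", startedAt: new Date(0).toISOString(), pid: 4242 });
const loaded = readJobFile(jobDir, "job-1");
console.log(loaded);
```

Because each job lives in its own file, operators can inspect or delete a single record without any coordination with the dispatcher.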

### Detached-Process Approach

Async jobs use `detached: true` with `child.unref()`.

**Rationale:**

- **Parent independence:** The dispatcher can start a job and exit immediately. This is essential for CLI usage where the user or orchestrator should not hold a shell open for the duration of the task.
- **Signal isolation:** A detached process group means the child does not receive `SIGINT` or `SIGHUP` sent to the dispatcher terminal session.
- **PID tracking:** Even though the child is detached, the `pid` is captured and written to the job file. This enables `cancelJob` to send signals even if the dispatcher has restarted and lost its in-memory `runningChildren` map.

**Trade-off:** The child is truly independent. If the host reboots, the child is lost (same as any other process). The job file will eventually reflect `timed_out` or remain `running` until `cancel` or `cleanup-jobs` is run.
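
The spawn options named above look like this with Node's `child_process.spawn`. This is a sketch only: the current Node executable stands in for a client binary, and nothing is written to a job file.

```typescript
import { spawn } from "node:child_process";

// Stand-in for a client binary: the current Node executable running a one-liner.
const child = spawn(process.execPath, ["-e", "console.log('done')"], {
  detached: true, // the child leads its own process group
  stdio: ["ignore", "pipe", "pipe"], // same stdio layout as the job spawn above
  shell: false,
});

let stdout = "";
child.stdout?.on("data", (chunk: Buffer) => {
  stdout += chunk.toString(); // buffered for the final job record
});

const pid = child.pid; // the real tool persists this to the job file
child.unref(); // the child process handle no longer keeps the event loop alive

// Note: open stdout/stderr pipes are separate handles. Node's documentation
// recommends stdio that is not connected to the parent when a detached child
// must outlive it, so the piped layout here is worth verifying in practice.
console.log(`spawned pid ${pid}`);
```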

### Coexistence with ACP

`ai-cli-dispatch` is intentionally **not** an ACP agent. It is a thin, local subprocess wrapper with no session state, no thread binding, and no orchestrator protocol.

- Use `ai-cli-dispatch` when you need a quick, one-shot CLI execution or a background job on the gateway host.
- Use ACP (`docs/openclaw-acp-orchestration.md`) when you need session-bound coding harnesses, multi-turn review, or orchestrator-managed verification gates.

This separation keeps the dispatcher small and avoids duplicating ACP’s scheduling, context persistence, and review-loop responsibilities.

### Keyword Dispatch vs NLP

Client resolution uses deterministic substring matching instead of natural-language parsing or an LLM call.

**Rationale:**

- **Speed:** No network round-trip or model load; resolution is synchronous and sub-millisecond.
- **Predictability:** The same prompt always resolves to the same client. There is no temperature, context window, or model-version drift.
- **Debuggability:** A user can read the ordered keyword list and know exactly why a given prompt resolved to a given client.
- **Scope fit:** The dispatcher only needs to distinguish three clients. A full NLP pipeline would be overkill.

The trade-off is that a prompt naming several clients resolves by keyword order, not by intent. For example, `"compare codex and claude"` resolves to `claude` because `"claude"` is checked before `"codex"`. Users can always override with `--client`.

### Error Taxonomy

All runtime failures are represented as typed errors so callers and tests can branch precisely:

| Error | When thrown | Data carried |
|---|---|---|
| `ClientNotFoundError` | Binary not on `PATH`, explicit `clientPath` missing, or `ENOENT` from `spawn` | `message` with client name |
| `ExecError` | Empty prompt, unknown client, timeout, non-`ENOENT` spawn error, or child exit | `message` + full `ExecResult` (`stdout`, `stderr`, `exitCode`, `client`, `durationMs`) |
| `JobNotFoundError` | Job ID not found in the job store | `message` with job ID |
| `JobResultUnavailableError` | `results` called on a non-completed job | `message` with job ID and current status |

`ExecError` carries the `ExecResult` so that timeout handlers still return partial output. This avoids losing buffered stdout/stderr when a long-running task is killed.
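
A minimal sketch of two of these error classes shows how a caller can branch on type and still recover partial output. The constructor signatures and the `describeFailure` helper are assumptions; only the class names and the carried fields come from the table above.

```typescript
interface ExecResult {
  stdout: string;
  stderr: string;
  exitCode: number | null;
  client: string;
  durationMs: number;
}

// Constructor shapes are assumed; the real classes live in types.ts.
class ExecError extends Error {
  readonly result: ExecResult;
  constructor(message: string, result: ExecResult) {
    super(message);
    this.name = "ExecError";
    this.result = result;
  }
}

class ClientNotFoundError extends Error {
  constructor(client: string) {
    super(`Client not found: ${client}`);
    this.name = "ClientNotFoundError";
  }
}

// Hypothetical caller-side handler that branches on the error type.
function describeFailure(error: unknown): string {
  if (error instanceof ClientNotFoundError) return "install the client or set its path";
  if (error instanceof ExecError) {
    return `failed after ${error.result.durationMs} ms; partial stdout: ${error.result.stdout}`;
  }
  return "unexpected error";
}

const timeoutError = new ExecError("timed out", {
  stdout: "partial output",
  stderr: "",
  exitCode: null,
  client: "codex",
  durationMs: 600_000,
});
console.log(describeFailure(timeoutError));
```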

### Injection-Friendly Module Boundaries

Every non-trivial module accepts an `options` bag with injectable dependencies (`spawnSync`, `spawn`, `existsSync`, `whichSync`, `readFileSync`, etc.).

**Rationale:**

- Unit tests can run without touching the real filesystem, `PATH`, or subprocess layer.
- The CLI itself injects its real dependencies through default parameters, so production behavior is unchanged.
- There is no global mocking required; each test provides its own narrow fakes.
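
The pattern can be illustrated with a config-file loader that accepts an optional `readFileSync`. The function name `loadFileConfig`, the `FileConfig` fields, and the choice to return an empty object on malformed JSON are assumptions for this sketch; the test table below only states that malformed JSON is tolerated.

```typescript
import { readFileSync as realReadFileSync } from "node:fs";

// Hypothetical, reduced config-file shape.
interface FileConfig {
  defaultClient?: string;
  timeout?: number;
}

interface LoadOptions {
  readFileSync?: (path: string, encoding: "utf8") => string;
}

function loadFileConfig(path: string, options: LoadOptions = {}): FileConfig {
  // Production callers omit the option and get the real filesystem.
  const read =
    options.readFileSync ?? ((p: string, encoding: "utf8") => realReadFileSync(p, encoding));
  try {
    const parsed: unknown = JSON.parse(read(path, "utf8"));
    return typeof parsed === "object" && parsed !== null ? (parsed as FileConfig) : {};
  } catch {
    return {}; // missing file or malformed JSON: treat as an empty layer
  }
}

// A test supplies a narrow fake instead of mocking the fs module globally.
const fromFake = loadFileConfig("ignored.json", {
  readFileSync: () => '{"defaultClient":"claude","timeout":1000}',
});
const fromBroken = loadFileConfig("ignored.json", { readFileSync: () => "{not json" });
console.log(fromFake, fromBroken);
```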

### Minimal Dependency Surface

The runtime dependency graph contains exactly one external package: `minimist` (argument parsing). Everything else uses Node.js built-ins (`child_process`, `fs`, `os`, `path`, `crypto`).

**Rationale:**

- Reduces supply-chain risk and install time.
- Avoids version-lock issues across Node.js 20+ environments.
- Keeps the compiled/bundled footprint negligible for a tool that is often installed as a sidecar.

## Extension Points

### Adding a New Client

To support a fourth (or fifth) AI CLI client, change four files in `src/` (steps 1, 2, 3, and 5 below) and the corresponding tests:

1. **`src/types.ts`**: Add the new name to the `ClientName` union type.
2. **`src/constants.ts`**: Append the new name to `CLIENT_NAMES`.
3. **`src/execute.ts`**: Add an entry to `CLIENT_ARGS` with the client’s native argument shape.
4. **`src/config.ts`**: No change required; the existing loop over `CLIENT_NAMES` automatically picks up the new env/flag/file keys.
5. **`src/dispatch.ts`**: Add a keyword check for the new client in `resolveClient`. Decide its precedence relative to existing keywords.
6. **`src/jobs.ts`**: No change required; `CLIENT_ARGS` is already shared.
7. **Tests**: Add test cases in `tests/dispatch.test.ts`, `tests/execute.test.ts`, `tests/detect.test.ts`, and `tests/jobs.test.ts`.

No changes are needed in `cli.ts` because it iterates over `CLIENT_NAMES` for validation.
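
As a rough illustration of steps 1 to 4, the sketch below adds a made-up client named `examplecli`. The client name, its argument shape, and the `envKeyFor` helper are all hypothetical; the environment key format is inferred from the existing `AI_CLI_CODEX_PATH` style names.

```typescript
// Step 1: extend the union. "examplecli" is an invented name for illustration.
type ClientName = "codex" | "claude" | "opencode" | "examplecli";

// Step 2: append to the registry.
const CLIENT_NAMES: readonly ClientName[] = ["codex", "claude", "opencode", "examplecli"];

// Step 3: Record<ClientName, ...> makes the compiler reject a missing entry.
const CLIENT_ARGS: Record<ClientName, (prompt: string) => string[]> = {
  codex: (prompt) => ["exec", "--yolo", prompt],
  claude: (prompt) => ["-p", prompt, "--dangerously-skip-permissions"],
  opencode: (prompt) => ["run", "--dangerously-skip-permissions", prompt],
  examplecli: (prompt) => ["ask", prompt], // invented argument shape
};

// Step 4: keys derived from CLIENT_NAMES pick up the new client automatically.
function envKeyFor(name: ClientName): string {
  return `AI_CLI_${name.toUpperCase()}_PATH`;
}

console.log(CLIENT_NAMES.map(envKeyFor));
console.log(CLIENT_ARGS.examplecli("summarize the diff"));
```

Typing `CLIENT_ARGS` as a `Record` keyed by `ClientName` turns a forgotten step 3 into a compile error instead of a runtime failure.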

### Streaming Support

If a future use case requires real-time output (e.g., long-running codegen with progressive feedback), the cleanest extension is to add an optional `onData` callback to `ExecuteOptions`:

```typescript
export interface ExecuteOptions {
  clientPath?: string;
  timeoutMs?: number;
  spawn?: ...;
  existsSync?: ...;
  onData?: (chunk: string, stream: "stdout" | "stderr") => void;
}
```

When `onData` is provided, `executePrompt` would:

- Continue buffering internally for the final `ExecResult`.
- Also emit each chunk through `onData` so the caller can stream to a UI or logger.
- Reject/resolve with the same error taxonomy.

This preserves backward compatibility: existing callers that omit `onData` receive the exact same buffered `ExecResult` they get today.

For async jobs, `jobs.ts` could store a partial `stdout`/`stderr` in the job file on each chunk (or at a throttled interval) so `status` callers can see progress without waiting for completion.
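
The "buffer and forward" behavior proposed above could look like the following. `createCollector` is a hypothetical helper, not part of the current codebase; in a real integration its `push` method would be called from the child's `"data"` listeners.

```typescript
type StreamName = "stdout" | "stderr";
type OnData = (chunk: string, stream: StreamName) => void;

// Hypothetical helper: buffers every chunk and optionally forwards it.
function createCollector(onData?: OnData) {
  const buffers: Record<StreamName, string> = { stdout: "", stderr: "" };
  return {
    push(chunk: string, stream: StreamName): void {
      buffers[stream] += chunk; // always buffer for the final ExecResult
      onData?.(chunk, stream); // forward only when a callback was supplied
    },
    result: () => ({ stdout: buffers.stdout, stderr: buffers.stderr }),
  };
}

// Streaming caller: sees each chunk as it arrives.
const seen: string[] = [];
const streaming = createCollector((chunk, stream) => seen.push(`${stream}:${chunk}`));
streaming.push("hello ", "stdout");
streaming.push("warn", "stderr");
streaming.push("world", "stdout");

// Existing caller: omits onData and gets the same buffered result.
const buffered = createCollector();
buffered.push("hello ", "stdout");
buffered.push("world", "stdout");

console.log(seen, streaming.result(), buffered.result());
```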

### Platform Backends

The current Windows support is limited to discovery (`where` instead of `which`, `.exe` extension assumptions). If future clients require platform-specific spawn options (e.g., PowerShell quoting rules), the extension point is `CLIENT_ARGS` or a new `CLIENT_SPAWN_OPTIONS` record keyed by `ClientName`.

## Testing Strategy

The test suite in `tests/` mirrors the `src/` structure:

| Test file | Coverage |
|---|---|
| `cli.test.ts` | Argument parsing, command routing, JSON/text output modes, exit codes, error formatting, sync vs async branches, all job lifecycle commands |
| `cli-helpers.test.ts` | `reportError`, `reportCliError`, `handleSyncRun`, `handleAsyncRun` with JSON and text modes |
| `config.test.ts` | Layered precedence of flags, env, file, and `which` fallback; malformed JSON tolerance |
| `detect.test.ts` | `which` success/failure, PATH directory fallback, version parsing, missing binary handling |
| `dispatch.test.ts` | Keyword matching, case insensitivity, `--client` precedence, `defaultClient` fallback, invalid flag handling |
| `execute.test.ts` | Successful execution, stderr capture, non-zero exit codes, `ENOENT` → `ClientNotFoundError`, timeout, empty prompt rejection, special-character preservation, debug info emission |
| `jobs.test.ts` | Job start, status query, result retrieval, cancellation, listing, cleanup, timeout handling, unknown client fallback, detached process behavior, in-memory vs on-disk consistency |

All tests use injected mocks; no test spawns real client binaries or reads the real filesystem.