feat(create-plan): route review through shared runtime

2026-03-05 23:07:43 -06:00
parent 41a3b9d1ee
commit 04bf34544b
5 changed files with 310 additions and 33 deletions
@@ -117,6 +117,8 @@ Verify Superpowers dependencies exist in your agent skills root:
 - Commits `.gitignore` update locally when added.
 - Asks which reviewer CLI and model to use (or accepts `skip` for no review).
 - Iteratively reviews the plan with the chosen reviewer (max 5 rounds) before generating files.
+- Runs reviewer commands through `reviewer-runtime/run-review.sh` when available, falling back to direct synchronous execution only if the helper is missing or not executable.
+- Captures reviewer stderr and helper status logs for diagnostics and retains them on failed, empty-output, or operator-decision review rounds.
 - Produces:
  - `original-plan.md`
  - `final-transcript.md`
@@ -129,11 +131,38 @@ Verify Superpowers dependencies exist in your agent skills root:
 After the plan is created (design + milestones + stories), the skill sends it to a second model for review:

 1. **Configure** — user picks a reviewer CLI (`codex`, `claude`, `cursor`) and model, or skips
-2. **Submit** — plan is written to a temp file and sent to the reviewer in read-only/ask mode
-3. **Feedback** — reviewer evaluates correctness, risks, missing steps, alternatives, security
-4. **Revise** — the planning agent addresses each issue and re-submits
-5. **Repeat** — up to 5 rounds until the reviewer returns `VERDICT: APPROVED`
-6. **Finalize** — approved plan is used to generate the plan file package
+2. **Prepare** — plan payload and a bash reviewer command script are written to temp files
+3. **Run** — the command script is executed through `reviewer-runtime/run-review.sh` when installed
+4. **Feedback** — reviewer evaluates correctness, risks, missing steps, alternatives, security
+5. **Revise** — the planning agent addresses each issue and re-submits
+6. **Repeat** — up to 5 rounds until the reviewer returns `VERDICT: APPROVED`
+7. **Finalize** — approved plan is used to generate the plan file package
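The round loop in the steps above can be sketched as follows. This is a minimal illustration, not the skill's actual code: `review_until_approved`, `run_round`, and `revise` are hypothetical names, and the choice to apply the last round's feedback even when the cap is reached is an assumption. Only the 5-round cap and the `VERDICT: APPROVED` marker come from the README.

```python
from typing import Callable, Tuple

MAX_ROUNDS = 5  # round cap stated in the README
APPROVED_MARKER = "VERDICT: APPROVED"  # approval marker stated in the README


def review_until_approved(
    plan: str,
    run_round: Callable[[str], str],    # hypothetical: submits the plan, returns review text
    revise: Callable[[str, str], str],  # hypothetical: returns a revised plan given feedback
    max_rounds: int = MAX_ROUNDS,
) -> Tuple[str, bool, int]:
    """Return (final_plan, approved, rounds_used)."""
    for round_no in range(1, max_rounds + 1):
        review = run_round(plan)
        if APPROVED_MARKER in review:
            return plan, True, round_no
        # Assumption: feedback is applied after every non-approving round,
        # including the last one, even though it is not re-submitted.
        plan = revise(plan, review)
    return plan, False, max_rounds
```

Passing the reviewer and the reviser as callables keeps the loop independent of which CLI is used and of whether the helper or the direct path runs the command.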
+
+### Runtime Artifacts
+
+The review flow may create these temp artifacts:
+
+- `/tmp/plan-<id>.md` — plan payload
+- `/tmp/plan-review-<id>.md` — normalized review text
+- `/tmp/plan-review-<id>.json` — raw Cursor JSON output
+- `/tmp/plan-review-<id>.stderr` — reviewer stderr
+- `/tmp/plan-review-<id>.status` — helper heartbeat/status log
+- `/tmp/plan-review-<id>.runner.out` — helper-managed stdout from the reviewer command process
+- `/tmp/plan-review-<id>.sh` — reviewer command script
+
+Status log lines use this format:
+
+```text
+ts=<ISO-8601> level=<info|warn|error> state=<running-silent|running-active|stall-warning|completed|completed-empty-output|failed|needs-operator-decision> elapsed_s=<int> pid=<int> stdout_bytes=<int> stderr_bytes=<int> note="<short message>"
+```
+
+`stall-warning` is a heartbeat/status-log state only. It is not a terminal review result.
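For readers who want to consume the status log programmatically, here is a minimal parsing sketch for the format above. `parse_status_line` is a hypothetical helper, not part of the commit, and it assumes the `note` value never contains an unescaped double quote. The sample line in the usage note is made up for illustration.

```python
import shlex
from typing import Dict, Union

# Fields documented as <int> in the status line format.
INT_FIELDS = ("elapsed_s", "pid", "stdout_bytes", "stderr_bytes")


def parse_status_line(line: str) -> Dict[str, Union[str, int]]:
    """Split one key=value status line into a dict, honoring the quoted note."""
    fields: Dict[str, Union[str, int]] = {}
    # shlex.split keeps note="a b c" together as a single token: note=a b c
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"not a key=value token: {token!r}")
        fields[key] = value
    for name in INT_FIELDS:
        if name in fields:
            fields[name] = int(fields[name])
    return fields
```

For example, `parse_status_line('ts=2026-03-05T23:07:43-06:00 level=info state=running-silent elapsed_s=30 pid=4242 stdout_bytes=0 stderr_bytes=17 note="no output yet"')` yields a dict whose `state` is `"running-silent"` and whose `elapsed_s` is the integer `30`.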
+
+### Failure Handling
+
+- `completed-empty-output` means the reviewer exited without producing review text; surface `.stderr` and `.status`, then retry only after diagnosing the cause.
+- `needs-operator-decision` means the helper reached hard-timeout escalation; surface `.status` and decide whether to keep waiting, abort, or retry with different parameters.
+- Successful rounds clean up temp artifacts. Failed, empty-output, and operator-decision rounds should retain `.stderr`, `.status`, and `.runner.out` until diagnosed.
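The retention rule above can be expressed as a small lookup. This is a sketch under stated assumptions: `must_retain` is a hypothetical name, and treating the `completed` state as the "successful round" case is an inference from the state names rather than something the README states.

```python
from typing import List

# States for which the README says diagnostics should be kept.
RETAIN_STATES = {"failed", "completed-empty-output", "needs-operator-decision"}
# Artifact suffixes the README lists for retention.
DIAGNOSTIC_SUFFIXES = (".stderr", ".status", ".runner.out")


def must_retain(final_state: str, paths: List[str]) -> List[str]:
    """Return the artifact paths that should survive cleanup for this round."""
    if final_state == "completed":  # assumption: this is the successful case
        return []
    if final_state in RETAIN_STATES:
        return [p for p in paths if p.endswith(DIAGNOSTIC_SUFFIXES)]
    # running-* and stall-warning are not terminal, so cleanup is undefined here.
    raise ValueError(f"not a terminal review state: {final_state!r}")
```

Raising on non-terminal states is a deliberate choice in this sketch: it prevents cleanup from running while the reviewer process may still be writing to those files.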

 ### Supported Reviewer CLIs

@@ -143,6 +172,12 @@ After the plan is created (design + milestones + stories), the skill sends it to
 | `claude` | `claude -p --model <model> --allowedTools Read` | No (fresh call each round) | `--allowedTools Read` |
 | `cursor` | `cursor-agent -p --mode=ask --model <model> --trust --output-format json` | Yes (`--resume <id>`) | `--mode=ask` |

+For all three CLIs, the preferred execution path is:
+
+1. write the reviewer command to a bash script
+2. run that script through `reviewer-runtime/run-review.sh`
+3. fall back to direct synchronous execution only if the helper is missing or not executable
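The fallback condition in step 3 can be checked as sketched below. `choose_execution_path` is a hypothetical name, and the sketch only decides which path to take; it deliberately does not build the helper's command line, since the arguments `run-review.sh` accepts are not shown in this diff.

```python
import os


def choose_execution_path(helper_path: str) -> str:
    """Return "helper" if the helper exists and is executable, else "direct"."""
    if os.path.isfile(helper_path) and os.access(helper_path, os.X_OK):
        return "helper"
    # Helper is missing or not executable: fall back to direct synchronous execution.
    return "direct"
```

A caller would pass the installed location of `reviewer-runtime/run-review.sh` and branch on the result before running the reviewer command script.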
+
 ## Template Guardrails

 All plan templates now include guardrail sections that enforce: