feat(create-plan): route review through shared runtime

2026-03-05 23:07:43 -06:00
parent 41a3b9d1ee
commit 04bf34544b
5 changed files with 310 additions and 33 deletions
--- a/skills/create-plan/codex/SKILL.md
+++ b/skills/create-plan/codex/SKILL.md
@@ -94,7 +94,20 @@ Send the plan to the configured reviewer CLI for feedback. Revise and re-submit
 REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
 ```

-Use for all temp file paths: `/tmp/plan-${REVIEW_ID}.md` and `/tmp/plan-review-${REVIEW_ID}.md`.
+Use for temp artifacts:
+- `/tmp/plan-${REVIEW_ID}.md` - plan payload
+- `/tmp/plan-review-${REVIEW_ID}.md` - normalized review text presented to the user
+- `/tmp/plan-review-${REVIEW_ID}.json` - raw Cursor JSON (only for `cursor`)
+- `/tmp/plan-review-${REVIEW_ID}.stderr` - reviewer stderr
+- `/tmp/plan-review-${REVIEW_ID}.status` - helper heartbeat/status log
+- `/tmp/plan-review-${REVIEW_ID}.runner.out` - helper-managed stdout from the reviewer command process
+- `/tmp/plan-review-${REVIEW_ID}.sh` - reviewer command script
+
+Resolve the shared reviewer helper from the installed Codex skills directory:
+
+```bash
+REVIEWER_RUNTIME=~/.codex/skills/reviewer-runtime/run-review.sh
+```

 #### Step 2: Write Plan to Temp File

@@ -102,6 +115,13 @@ Write the complete plan (milestones, stories, design decisions, specs) to `/tmp/

 #### Step 3: Submit to Reviewer (Round 1)

+Write the reviewer invocation to `/tmp/plan-review-${REVIEW_ID}.sh` as a bash script:
+
+```bash
+#!/usr/bin/env bash
+set -euo pipefail
+```
+
 **If `REVIEWER_CLI` is `codex`:**

 ```bash
@@ -136,8 +156,7 @@ claude -p \
 Be specific and actionable. If the plan is solid, end with exactly: VERDICT: APPROVED
 If changes are needed, end with exactly: VERDICT: REVISE" \
  --model ${REVIEWER_MODEL} \
-  --allowedTools Read \
-  > /tmp/plan-review-${REVIEW_ID}.md
+  --allowedTools Read
 ```

 **If `REVIEWER_CLI` is `cursor`:**
@@ -169,10 +188,41 @@ jq -r '.result' /tmp/plan-review-${REVIEW_ID}.json > /tmp/plan-review-${REVIEW_I

 If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivalent.

+Run the command script through the shared helper when available:
+
+```bash
+if [ -x "$REVIEWER_RUNTIME" ]; then
+  "$REVIEWER_RUNTIME" \
+    --command-file /tmp/plan-review-${REVIEW_ID}.sh \
+    --stdout-file /tmp/plan-review-${REVIEW_ID}.runner.out \
+    --stderr-file /tmp/plan-review-${REVIEW_ID}.stderr \
+    --status-file /tmp/plan-review-${REVIEW_ID}.status
+else
+  echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
+  bash /tmp/plan-review-${REVIEW_ID}.sh >/tmp/plan-review-${REVIEW_ID}.runner.out 2>/tmp/plan-review-${REVIEW_ID}.stderr
+fi
+```
+
+After the command completes:
+- If `REVIEWER_CLI=cursor`, extract the final review text:
+
+```bash
+CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/plan-review-${REVIEW_ID}.json)
+jq -r '.result' /tmp/plan-review-${REVIEW_ID}.json > /tmp/plan-review-${REVIEW_ID}.md
+```
+
+- If `REVIEWER_CLI=codex` and the review text is only in `/tmp/plan-review-${REVIEW_ID}.runner.out`, move or copy the actual review body into `/tmp/plan-review-${REVIEW_ID}.md` before verdict parsing.
+
+Fallback is allowed only when the helper is missing or not executable.
+
 #### Step 4: Read Review & Check Verdict

 1. Read `/tmp/plan-review-${REVIEW_ID}.md`
-2. Present review to the user:
+2. If the review failed, produced empty output, or reached helper timeout, also read:
+   - `/tmp/plan-review-${REVIEW_ID}.stderr`
+   - `/tmp/plan-review-${REVIEW_ID}.status`
+   - `/tmp/plan-review-${REVIEW_ID}.runner.out`
+3. Present review to the user:

 ```
 ## Plan Review — Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})
@@ -184,6 +234,8 @@ If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivale
   - **VERDICT: APPROVED** → proceed to Phase 7 (Initialize workspace)
   - **VERDICT: REVISE** → go to Step 5
   - No clear verdict but positive / no actionable items → treat as approved
+   - Helper state `completed-empty-output` → treat as failed review attempt, surface stderr/status, fix invocation or prompt handling, then retry
+   - Helper state `needs-operator-decision` → surface status log and decide whether to keep waiting, abort, or retry with different helper parameters
   - Max rounds (5) reached → proceed with warning

 #### Step 5: Revise the Plan
@@ -237,8 +289,7 @@ Changes made:
 Re-review the full plan. If solid, end with: VERDICT: APPROVED
 If more changes needed, end with: VERDICT: REVISE" \
  --model ${REVIEWER_MODEL} \
-  --allowedTools Read \
-  > /tmp/plan-review-${REVIEW_ID}.md
+  --allowedTools Read
 ```

 **If `REVIEWER_CLI` is `cursor`:**
@@ -265,6 +316,8 @@ jq -r '.result' /tmp/plan-review-${REVIEW_ID}.json > /tmp/plan-review-${REVIEW_I

 If resume fails, fall back to fresh `cursor-agent -p` with context about prior rounds.

+After updating `/tmp/plan-review-${REVIEW_ID}.sh`, run the same helper/fallback flow from Round 1.
+
 Return to Step 4.

 #### Step 7: Present Final Result
@@ -282,9 +335,17 @@ Return to Step 4.
 #### Step 8: Cleanup

 ```bash
-rm -f /tmp/plan-${REVIEW_ID}.md /tmp/plan-review-${REVIEW_ID}.md /tmp/plan-review-${REVIEW_ID}.json
+rm -f /tmp/plan-${REVIEW_ID}.md \
+  /tmp/plan-review-${REVIEW_ID}.md \
+  /tmp/plan-review-${REVIEW_ID}.json \
+  /tmp/plan-review-${REVIEW_ID}.stderr \
+  /tmp/plan-review-${REVIEW_ID}.status \
+  /tmp/plan-review-${REVIEW_ID}.runner.out \
+  /tmp/plan-review-${REVIEW_ID}.sh
 ```

+If the round failed, produced empty output, or reached operator-decision timeout, keep `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them immediately.
+
 ### Phase 7: Initialize Local Plan Workspace (MANDATORY)

 At project root: