feat(implement-plan): route milestone review through shared runtime
@@ -132,6 +132,8 @@ Verify Superpowers execution dependencies exist in your agent skills root:
 - Executes milestones one-by-one, tracking stories in `story-tracker.md`.
 - Runs lint/typecheck/tests as a gate before each milestone review.
 - Sends each milestone to a reviewer CLI for approval (max rounds configurable, default 10).
+- Runs reviewer commands through `reviewer-runtime/run-review.sh` when available, with fallback to direct synchronous execution only if the helper is missing.
+- Captures reviewer stderr and helper status logs for diagnostics and retains them on failed, empty-output, or operator-decision review rounds.
 - Commits each milestone locally only after reviewer approval (does not push).
 - After all milestones approved, merges worktree branch to parent and deletes worktree.
 - Supports resume: detects existing worktree and `in-dev`/`completed` stories.
@@ -141,11 +143,38 @@ Verify Superpowers execution dependencies exist in your agent skills root:
 After each milestone is implemented and verified, the skill sends it to a second model for review:
 
 1. **Configure** — user picks a reviewer CLI (`codex`, `claude`, `cursor`) and model, or skips
-2. **Submit** — milestone spec, acceptance criteria, git diff, and verification output are written to a temp file and sent to the reviewer in read-only/ask mode
-3. **Feedback** — reviewer evaluates correctness, acceptance criteria, code quality, test coverage, security
-4. **Revise** — the implementing agent addresses each issue, re-verifies, and re-submits
-5. **Repeat** — up to max rounds (default 10) until the reviewer returns `VERDICT: APPROVED`
-6. **Approve** — milestone is marked approved in `story-tracker.md`
+2. **Prepare** — milestone payload and a bash reviewer command script are written to temp files
+3. **Run** — the command script is executed through `reviewer-runtime/run-review.sh` when installed
+4. **Feedback** — reviewer evaluates correctness, acceptance criteria, code quality, test coverage, security
+5. **Revise** — the implementing agent addresses each issue, re-verifies, and re-submits
+6. **Repeat** — up to max rounds (default 10) until the reviewer returns `VERDICT: APPROVED`
+7. **Approve** — milestone is marked approved in `story-tracker.md`
 
+### Runtime Artifacts
+
+The milestone review flow may create these temp artifacts:
+
+- `/tmp/milestone-<id>.md` — milestone review payload
+- `/tmp/milestone-review-<id>.md` — normalized review text
+- `/tmp/milestone-review-<id>.json` — raw Cursor JSON output
+- `/tmp/milestone-review-<id>.stderr` — reviewer stderr
+- `/tmp/milestone-review-<id>.status` — helper heartbeat/status log
+- `/tmp/milestone-review-<id>.runner.out` — helper-managed stdout from the reviewer command process
+- `/tmp/milestone-review-<id>.sh` — reviewer command script
+
+Status log lines use this format:
+
+```text
+ts=<ISO-8601> level=<info|warn|error> state=<running-silent|running-active|stall-warning|completed|completed-empty-output|failed|needs-operator-decision> elapsed_s=<int> pid=<int> stdout_bytes=<int> stderr_bytes=<int> note="<short message>"
+```
+
+`stall-warning` is a heartbeat/status-log state only. It is not a terminal review result.
+
+### Failure Handling
+
+- `completed-empty-output` means the reviewer exited without producing review text; surface `.stderr` and `.status`, then retry only after diagnosing the cause.
+- `needs-operator-decision` means the helper reached hard-timeout escalation; surface `.status` and decide whether to keep waiting, abort, or retry with different parameters.
+- Successful rounds clean up temp artifacts. Failed, empty-output, and operator-decision rounds should retain `.stderr`, `.status`, and `.runner.out` until diagnosed.
+
 ### Supported Reviewer CLIs
 
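The status-line format added in this hunk is a flat sequence of `key=value` pairs, so the most recent state can be read with standard tools. The following is a minimal sketch, assuming one record per line exactly as documented; the function name `last_review_state` and the sample records are illustrative and are not part of the helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of run-review.sh): print the state= value
# from the last record of a status log.
last_review_state() {
  local status_file="$1"
  # Take the final record and extract the token that follows " state=".
  tail -n 1 "$status_file" \
    | sed -n 's/.*[[:space:]]state=\([^[:space:]]*\).*/\1/p'
}

# Sample log in the documented format; all values here are made up.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
ts=2025-01-01T00:00:00Z level=info state=running-silent elapsed_s=10 pid=123 stdout_bytes=0 stderr_bytes=0 note="waiting"
ts=2025-01-01T00:05:00Z level=warn state=stall-warning elapsed_s=310 pid=123 stdout_bytes=0 stderr_bytes=0 note="no output yet"
ts=2025-01-01T00:06:00Z level=info state=completed elapsed_s=370 pid=123 stdout_bytes=2048 stderr_bytes=0 note="done"
EOF

last_review_state "$sample"   # prints: completed
rm -f "$sample"
```

Reading only the last record matters because `stall-warning` can appear mid-run and is documented as non-terminal.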
@@ -155,6 +184,26 @@ After each milestone is implemented and verified, the skill sends it to a second
 | `claude` | `claude -p --model <model> --allowedTools Read` | No (fresh call each round) | `--allowedTools Read` |
 | `cursor` | `cursor-agent -p --mode=ask --model <model> --trust --output-format json` | Yes (`--resume <id>`) | `--mode=ask` |
 
+For all three CLIs, the preferred execution path is:
+
+1. write the reviewer command to a bash script
+2. run that script through `reviewer-runtime/run-review.sh`
+3. fall back to direct synchronous execution only if the helper is missing or not executable
+
+The helper also supports manual override flags for diagnostics:
+
+```bash
+run-review.sh \
+  --command-file <path> \
+  --stdout-file <path> \
+  --stderr-file <path> \
+  --status-file <path> \
+  --poll-seconds 10 \
+  --soft-timeout-seconds 600 \
+  --stall-warning-seconds 300 \
+  --hard-timeout-seconds 1800
+```
+
 ## Variant Hardening Notes
 
 ### Claude Code
@@ -136,7 +136,20 @@ Do NOT push. After committing:
 REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
 ```
 
-Use for all temp file paths: `/tmp/milestone-${REVIEW_ID}.md` and `/tmp/milestone-review-${REVIEW_ID}.md`.
+Use `REVIEW_ID` for all milestone review temp file paths:
+- `/tmp/milestone-${REVIEW_ID}.md`
+- `/tmp/milestone-review-${REVIEW_ID}.md`
+- `/tmp/milestone-review-${REVIEW_ID}.json`
+- `/tmp/milestone-review-${REVIEW_ID}.stderr`
+- `/tmp/milestone-review-${REVIEW_ID}.status`
+- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
+- `/tmp/milestone-review-${REVIEW_ID}.sh`
+
+Resolve the shared runtime helper path before writing the command script:
+
+```bash
+REVIEWER_RUNTIME=~/.claude/skills/reviewer-runtime/run-review.sh
+```
 
 #### Step 2: Write Review Payload
 
@@ -165,6 +178,13 @@ Write to `/tmp/milestone-${REVIEW_ID}.md`:
 
 #### Step 3: Submit to Reviewer (Round 1)
 
+Write the reviewer invocation to `/tmp/milestone-review-${REVIEW_ID}.sh` as a bash script:
+
+```bash
+#!/usr/bin/env bash
+set -euo pipefail
+```
+
 **If `REVIEWER_CLI` is `codex`:**
 
 ```bash
@@ -185,7 +205,7 @@ Be specific and actionable. If solid, end with exactly: VERDICT: APPROVED
 If changes are needed, end with exactly: VERDICT: REVISE"
 ```
 
-Capture the Codex session ID from output (line `session id: <uuid>`). Store as `CODEX_SESSION_ID`.
+Do not try to capture the Codex session ID yet. When using the helper, extract it from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the command completes (look for `session id: <uuid>`), then store it as `CODEX_SESSION_ID` for resume in subsequent rounds.
 
 **If `REVIEWER_CLI` is `claude`:**
 
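The replacement text in this hunk defers session capture until the run has finished. One possible way to do that extraction is sketched below, assuming the `session id: <uuid>` line appears verbatim in the captured output and that the ID has the usual 36-character UUID shape. The function name `extract_codex_session_id` and the sample UUID are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pull the first "session id: <uuid>" value out of a
# captured stdout file such as /tmp/milestone-review-${REVIEW_ID}.runner.out.
extract_codex_session_id() {
  local runner_out="$1"
  # Print only the 36-character hex-and-dash token after the label.
  sed -n 's/^.*session id:[[:space:]]*\([0-9a-fA-F-]\{36\}\).*$/\1/p' "$runner_out" \
    | head -n 1
}

# Sample captured output; the surrounding lines and the UUID are made up.
sample="$(mktemp)"
printf '%s\n' \
  'model: example' \
  'session id: 0199a1b2-c3d4-7e5f-8a6b-0123456789ab' \
  'Review body follows...' > "$sample"

CODEX_SESSION_ID="$(extract_codex_session_id "$sample")"
echo "$CODEX_SESSION_ID"
rm -f "$sample"
```

An empty result means no session line was found in the captured output, so the caller should check for that before attempting a resume.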
@@ -203,8 +223,7 @@ Evaluate:
 Be specific and actionable. If solid, end with exactly: VERDICT: APPROVED
 If changes are needed, end with exactly: VERDICT: REVISE" \
   --model ${REVIEWER_MODEL} \
-  --allowedTools Read \
-  > /tmp/milestone-review-${REVIEW_ID}.md
+  --allowedTools Read
 ```
 
 **If `REVIEWER_CLI` is `cursor`:**
@@ -229,19 +248,43 @@ If changes are needed, end with exactly: VERDICT: REVISE" \
   > /tmp/milestone-review-${REVIEW_ID}.json
 ```
 
-Extract session ID and review text (requires `jq`):
+For `cursor`, the command script writes raw JSON to `/tmp/milestone-review-${REVIEW_ID}.json`. Do not run `jq` extraction until after the helper or fallback execution completes. If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivalent.
+
+Run the command script through the shared helper when available:
+
+```bash
+if [ -x "$REVIEWER_RUNTIME" ]; then
+  "$REVIEWER_RUNTIME" \
+    --command-file /tmp/milestone-review-${REVIEW_ID}.sh \
+    --stdout-file /tmp/milestone-review-${REVIEW_ID}.runner.out \
+    --stderr-file /tmp/milestone-review-${REVIEW_ID}.stderr \
+    --status-file /tmp/milestone-review-${REVIEW_ID}.status
+else
+  echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
+  bash /tmp/milestone-review-${REVIEW_ID}.sh >/tmp/milestone-review-${REVIEW_ID}.runner.out 2>/tmp/milestone-review-${REVIEW_ID}.stderr
+fi
+```
+
+After the command completes:
+- If `REVIEWER_CLI=cursor`, extract the final review text:
 
 ```bash
 CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/milestone-review-${REVIEW_ID}.json)
 jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
 ```
 
-If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivalent.
+- If `REVIEWER_CLI=codex`, extract `CODEX_SESSION_ID` from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text is only in `.runner.out`, move or copy the actual review body into `/tmp/milestone-review-${REVIEW_ID}.md` before verdict parsing.
+
+Fallback is allowed only when the helper is missing or not executable.
 
 #### Step 4: Read Review & Check Verdict
 
 1. Read `/tmp/milestone-review-${REVIEW_ID}.md`
-2. Present review to the user:
+2. If the review failed, produced empty output, or reached helper timeout, also read:
+   - `/tmp/milestone-review-${REVIEW_ID}.stderr`
+   - `/tmp/milestone-review-${REVIEW_ID}.status`
+   - `/tmp/milestone-review-${REVIEW_ID}.runner.out`
+3. Present review to the user:
 
 ```
 ## Milestone Review — Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})
@@ -249,10 +292,12 @@ If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivale
 [Reviewer feedback]
 ```
 
-3. Check verdict:
+4. Check verdict:
    - **VERDICT: APPROVED** -> proceed to Phase 4 Step 6 (commit & approve)
    - **VERDICT: REVISE** -> go to Step 5
    - No clear verdict but positive / no actionable items -> treat as approved
+   - Helper state `completed-empty-output` -> treat as failed review attempt, surface stderr/status, fix invocation or prompt handling, then retry
+   - Helper state `needs-operator-decision` -> surface status log, note any `stall-warning` heartbeat lines as non-terminal operator hints, and decide whether to keep waiting, abort, or retry with different helper parameters
    - Max rounds (`MAX_ROUNDS`) reached -> present to user for manual decision (proceed or stop)
 
 #### Step 5: Address Feedback & Re-verify
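The verdict branches in this hunk rely on the reviewer ending its output with an exact `VERDICT:` line. A minimal sketch of that check follows, assuming the verdict sits on its own line with no Markdown decoration. The function name `review_verdict` and its three return words are illustrative; the "no clear verdict" case still needs the judgment described in the list.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a normalized review file by its last
# "VERDICT: ..." line.
review_verdict() {
  local review_file="$1" last
  # Use the last matching line in case the prompt text is echoed earlier.
  last="$(grep -E '^[[:space:]]*VERDICT: (APPROVED|REVISE)[[:space:]]*$' "$review_file" \
    | tail -n 1 | tr -d '[:space:]')"
  case "$last" in
    VERDICT:APPROVED) echo approved ;;
    VERDICT:REVISE)   echo revise ;;
    *)                echo unclear ;;  # no exact verdict line was found
  esac
}

# Sample review text (made up) to show the call.
sample="$(mktemp)"
printf '%s\n' 'Tests cover the new path.' 'VERDICT: APPROVED' > "$sample"
review_verdict "$sample"   # prints: approved
rm -f "$sample"
```

A reviewer that wraps the verdict in bold markers or adds trailing text would be reported as `unclear` by this sketch, which is the conservative outcome.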
@@ -272,6 +317,8 @@ If a revision contradicts the user's explicit requirements, skip it and note it
 
 #### Step 6: Re-submit to Reviewer (Rounds 2-N)
 
+Rewrite `/tmp/milestone-review-${REVIEW_ID}.sh` for the next round. The script should contain the reviewer invocation only; do not run it directly.
+
 **If `REVIEWER_CLI` is `codex`:**
 
 Resume the existing session:
@@ -330,20 +377,31 @@ Changes made:
 Re-review. If solid, end with: VERDICT: APPROVED
 If more changes needed, end with: VERDICT: REVISE" \
   > /tmp/milestone-review-${REVIEW_ID}.json
-
-jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
 ```
 
 If resume fails, fall back to fresh `cursor-agent -p` with context about prior rounds.
 
+Do not run `jq` extraction until after the helper or fallback execution completes, then extract `/tmp/milestone-review-${REVIEW_ID}.md` from the JSON response.
+
+After updating `/tmp/milestone-review-${REVIEW_ID}.sh`, run the same helper/fallback flow from Round 1.
+
 Return to Step 4.
 
 #### Step 7: Cleanup Per Milestone
 
 ```bash
-rm -f /tmp/milestone-${REVIEW_ID}.md /tmp/milestone-review-${REVIEW_ID}.md /tmp/milestone-review-${REVIEW_ID}.json
+rm -f \
+  /tmp/milestone-${REVIEW_ID}.md \
+  /tmp/milestone-review-${REVIEW_ID}.md \
+  /tmp/milestone-review-${REVIEW_ID}.json \
+  /tmp/milestone-review-${REVIEW_ID}.stderr \
+  /tmp/milestone-review-${REVIEW_ID}.status \
+  /tmp/milestone-review-${REVIEW_ID}.runner.out \
+  /tmp/milestone-review-${REVIEW_ID}.sh
 ```
 
+If the round failed, produced empty output, or reached operator-decision timeout, keep `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them immediately.
+
 ### Phase 6: Completion (REQUIRED SUB-SKILL)
 
 After all milestones are approved and committed:
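The cleanup hunk above pairs an unconditional `rm -f` with a prose rule to keep three diagnostic files after a bad round. One way to express both in a single step is sketched below. This is only one reading of the rule: it assumes the payload, review text, JSON, and command script may always be removed, which the source does not state explicitly. The function name `cleanup_review_artifacts` and the optional directory argument are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove review artifacts, but keep the diagnostic files
# when the final helper state indicates a failed or undecided round.
cleanup_review_artifacts() {
  local review_id="$1" final_state="$2" dir="${3:-/tmp}"
  # Assumption: these four are never needed for diagnosis.
  rm -f \
    "$dir/milestone-${review_id}.md" \
    "$dir/milestone-review-${review_id}.md" \
    "$dir/milestone-review-${review_id}.json" \
    "$dir/milestone-review-${review_id}.sh"
  case "$final_state" in
    failed|completed-empty-output|needs-operator-decision)
      # Keep .stderr, .status, and .runner.out until the cause is diagnosed.
      echo "retained diagnostics for review ${review_id}" >&2
      ;;
    *)
      rm -f \
        "$dir/milestone-review-${review_id}.stderr" \
        "$dir/milestone-review-${review_id}.status" \
        "$dir/milestone-review-${review_id}.runner.out"
      ;;
  esac
}

# Demonstration in a scratch directory so nothing real under /tmp is touched.
scratch="$(mktemp -d)"
touch "$scratch/milestone-review-abc12345.status" "$scratch/milestone-review-abc12345.json"
cleanup_review_artifacts abc12345 failed "$scratch"
ls "$scratch"
rm -rf "$scratch"
```

The state names in the `case` pattern are the terminal states listed in the status-log format; any other value is treated as a successful round.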
@@ -169,7 +169,20 @@ Do NOT push. After committing:
 REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
 ```
 
-Use for all temp file paths: `/tmp/milestone-${REVIEW_ID}.md` and `/tmp/milestone-review-${REVIEW_ID}.md`.
+Use `REVIEW_ID` for all milestone review temp file paths:
+- `/tmp/milestone-${REVIEW_ID}.md`
+- `/tmp/milestone-review-${REVIEW_ID}.md`
+- `/tmp/milestone-review-${REVIEW_ID}.json`
+- `/tmp/milestone-review-${REVIEW_ID}.stderr`
+- `/tmp/milestone-review-${REVIEW_ID}.status`
+- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
+- `/tmp/milestone-review-${REVIEW_ID}.sh`
+
+Resolve the shared runtime helper path before writing the command script:
+
+```bash
+REVIEWER_RUNTIME=~/.codex/skills/reviewer-runtime/run-review.sh
+```
 
 #### Step 2: Write Review Payload
 
@@ -198,6 +211,13 @@ Write to `/tmp/milestone-${REVIEW_ID}.md`:
 
 #### Step 3: Submit to Reviewer (Round 1)
 
+Write the reviewer invocation to `/tmp/milestone-review-${REVIEW_ID}.sh` as a bash script:
+
+```bash
+#!/usr/bin/env bash
+set -euo pipefail
+```
+
 **If `REVIEWER_CLI` is `codex`:**
 
 ```bash
@@ -218,7 +238,7 @@ Be specific and actionable. If solid, end with exactly: VERDICT: APPROVED
 If changes are needed, end with exactly: VERDICT: REVISE"
 ```
 
-Capture the Codex session ID from output (line `session id: <uuid>`). Store as `CODEX_SESSION_ID`.
+Do not try to capture the Codex session ID yet. When using the helper, extract it from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the command completes (look for `session id: <uuid>`), then store it as `CODEX_SESSION_ID` for resume in subsequent rounds.
 
 **If `REVIEWER_CLI` is `claude`:**
 
@@ -236,8 +256,7 @@ Evaluate:
 Be specific and actionable. If solid, end with exactly: VERDICT: APPROVED
 If changes are needed, end with exactly: VERDICT: REVISE" \
   --model ${REVIEWER_MODEL} \
-  --allowedTools Read \
-  > /tmp/milestone-review-${REVIEW_ID}.md
+  --allowedTools Read
 ```
 
 **If `REVIEWER_CLI` is `cursor`:**
@@ -262,19 +281,43 @@ If changes are needed, end with exactly: VERDICT: REVISE" \
   > /tmp/milestone-review-${REVIEW_ID}.json
 ```
 
-Extract session ID and review text (requires `jq`):
+For `cursor`, the command script writes raw JSON to `/tmp/milestone-review-${REVIEW_ID}.json`. Do not run `jq` extraction until after the helper or fallback execution completes. If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivalent.
+
+Run the command script through the shared helper when available:
+
+```bash
+if [ -x "$REVIEWER_RUNTIME" ]; then
+  "$REVIEWER_RUNTIME" \
+    --command-file /tmp/milestone-review-${REVIEW_ID}.sh \
+    --stdout-file /tmp/milestone-review-${REVIEW_ID}.runner.out \
+    --stderr-file /tmp/milestone-review-${REVIEW_ID}.stderr \
+    --status-file /tmp/milestone-review-${REVIEW_ID}.status
+else
+  echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
+  bash /tmp/milestone-review-${REVIEW_ID}.sh >/tmp/milestone-review-${REVIEW_ID}.runner.out 2>/tmp/milestone-review-${REVIEW_ID}.stderr
+fi
+```
+
+After the command completes:
+- If `REVIEWER_CLI=cursor`, extract the final review text:
 
 ```bash
 CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/milestone-review-${REVIEW_ID}.json)
 jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
 ```
 
-If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivalent.
+- If `REVIEWER_CLI=codex`, extract `CODEX_SESSION_ID` from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text is only in `.runner.out`, move or copy the actual review body into `/tmp/milestone-review-${REVIEW_ID}.md` before verdict parsing.
+
+Fallback is allowed only when the helper is missing or not executable.
 
 #### Step 4: Read Review & Check Verdict
 
 1. Read `/tmp/milestone-review-${REVIEW_ID}.md`
-2. Present review to the user:
+2. If the review failed, produced empty output, or reached helper timeout, also read:
+   - `/tmp/milestone-review-${REVIEW_ID}.stderr`
+   - `/tmp/milestone-review-${REVIEW_ID}.status`
+   - `/tmp/milestone-review-${REVIEW_ID}.runner.out`
+3. Present review to the user:
 
 ```
 ## Milestone Review — Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})
@@ -282,10 +325,12 @@ If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivale
 [Reviewer feedback]
 ```
 
-3. Check verdict:
+4. Check verdict:
    - **VERDICT: APPROVED** -> proceed to Phase 4 Step 6 (commit & approve)
    - **VERDICT: REVISE** -> go to Step 5
    - No clear verdict but positive / no actionable items -> treat as approved
+   - Helper state `completed-empty-output` -> treat as failed review attempt, surface stderr/status, fix invocation or prompt handling, then retry
+   - Helper state `needs-operator-decision` -> surface status log, note any `stall-warning` heartbeat lines as non-terminal operator hints, and decide whether to keep waiting, abort, or retry with different helper parameters
    - Max rounds (`MAX_ROUNDS`) reached -> present to user for manual decision (proceed or stop)
 
 #### Step 5: Address Feedback & Re-verify
@@ -305,6 +350,8 @@ If a revision contradicts the user's explicit requirements, skip it and note it
 
 #### Step 6: Re-submit to Reviewer (Rounds 2-N)
 
+Rewrite `/tmp/milestone-review-${REVIEW_ID}.sh` for the next round. The script should contain the reviewer invocation only; do not run it directly.
+
 **If `REVIEWER_CLI` is `codex`:**
 
 Resume the existing session:
@@ -363,20 +410,31 @@ Changes made:
 Re-review. If solid, end with: VERDICT: APPROVED
 If more changes needed, end with: VERDICT: REVISE" \
   > /tmp/milestone-review-${REVIEW_ID}.json
-
-jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
 ```
 
 If resume fails, fall back to fresh `cursor-agent -p` with context about prior rounds.
 
+Do not run `jq` extraction until after the helper or fallback execution completes, then extract `/tmp/milestone-review-${REVIEW_ID}.md` from the JSON response.
+
+After updating `/tmp/milestone-review-${REVIEW_ID}.sh`, run the same helper/fallback flow from Round 1.
+
 Return to Step 4.
 
 #### Step 7: Cleanup Per Milestone
 
 ```bash
-rm -f /tmp/milestone-${REVIEW_ID}.md /tmp/milestone-review-${REVIEW_ID}.md /tmp/milestone-review-${REVIEW_ID}.json
+rm -f \
+  /tmp/milestone-${REVIEW_ID}.md \
+  /tmp/milestone-review-${REVIEW_ID}.md \
+  /tmp/milestone-review-${REVIEW_ID}.json \
+  /tmp/milestone-review-${REVIEW_ID}.stderr \
+  /tmp/milestone-review-${REVIEW_ID}.status \
+  /tmp/milestone-review-${REVIEW_ID}.runner.out \
+  /tmp/milestone-review-${REVIEW_ID}.sh
 ```
 
+If the round failed, produced empty output, or reached operator-decision timeout, keep `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them immediately.
+
 ### Phase 6: Completion (REQUIRED SUB-SKILL)
 
 After all milestones are approved and committed:
@@ -169,7 +169,24 @@ Do NOT push. After committing:
 REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
 ```
 
-Use for all temp file paths: `/tmp/milestone-${REVIEW_ID}.md` and `/tmp/milestone-review-${REVIEW_ID}.md`.
+Use `REVIEW_ID` for all milestone review temp file paths:
+- `/tmp/milestone-${REVIEW_ID}.md`
+- `/tmp/milestone-review-${REVIEW_ID}.md`
+- `/tmp/milestone-review-${REVIEW_ID}.json`
+- `/tmp/milestone-review-${REVIEW_ID}.stderr`
+- `/tmp/milestone-review-${REVIEW_ID}.status`
+- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
+- `/tmp/milestone-review-${REVIEW_ID}.sh`
+
+Resolve the shared runtime helper path before writing the command script:
+
+```bash
+if [ -x .cursor/skills/reviewer-runtime/run-review.sh ]; then
+  REVIEWER_RUNTIME=.cursor/skills/reviewer-runtime/run-review.sh
+else
+  REVIEWER_RUNTIME=~/.cursor/skills/reviewer-runtime/run-review.sh
+fi
+```
 
 #### Step 2: Write Review Payload
 
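The Cursor variant above checks a project-local helper before the one under the home directory. The same lookup can be written as a small function that takes an ordered list of candidates, sketched below. The function name `resolve_reviewer_runtime` is illustrative and does not appear in the commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first candidate path that is an executable
# file, or return non-zero if none qualifies.
resolve_reviewer_runtime() {
  local candidate
  for candidate in "$@"; do
    if [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Same search order as the Cursor variant: project-local first, then $HOME.
REVIEWER_RUNTIME="$(resolve_reviewer_runtime \
  .cursor/skills/reviewer-runtime/run-review.sh \
  "$HOME/.cursor/skills/reviewer-runtime/run-review.sh")" || REVIEWER_RUNTIME=""

echo "helper: ${REVIEWER_RUNTIME:-<none found>}"
```

Unlike the committed snippet, this sketch leaves `REVIEWER_RUNTIME` empty when nothing is found, so a later `[ -x "$REVIEWER_RUNTIME" ]` test still selects the fallback path, but the warning message would print an empty path.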
@@ -198,6 +215,13 @@ Write to `/tmp/milestone-${REVIEW_ID}.md`:
 
 #### Step 3: Submit to Reviewer (Round 1)
 
+Write the reviewer invocation to `/tmp/milestone-review-${REVIEW_ID}.sh` as a bash script:
+
+```bash
+#!/usr/bin/env bash
+set -euo pipefail
+```
+
 **If `REVIEWER_CLI` is `codex`:**
 
 ```bash
@@ -218,7 +242,7 @@ Be specific and actionable. If solid, end with exactly: VERDICT: APPROVED
 If changes are needed, end with exactly: VERDICT: REVISE"
 ```
 
-Capture the Codex session ID from output (line `session id: <uuid>`). Store as `CODEX_SESSION_ID`.
+Do not try to capture the Codex session ID yet. When using the helper, extract it from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the command completes (look for `session id: <uuid>`), then store it as `CODEX_SESSION_ID` for resume in subsequent rounds.
 
 **If `REVIEWER_CLI` is `claude`:**
 
@@ -236,8 +260,7 @@ Evaluate:
 Be specific and actionable. If solid, end with exactly: VERDICT: APPROVED
 If changes are needed, end with exactly: VERDICT: REVISE" \
   --model ${REVIEWER_MODEL} \
-  --allowedTools Read \
-  > /tmp/milestone-review-${REVIEW_ID}.md
+  --allowedTools Read
 ```
 
 **If `REVIEWER_CLI` is `cursor`:**
@@ -262,12 +285,7 @@ If changes are needed, end with exactly: VERDICT: REVISE" \
   > /tmp/milestone-review-${REVIEW_ID}.json
 ```
 
-Extract session ID and review text:
+For `cursor`, the command script writes raw JSON to `/tmp/milestone-review-${REVIEW_ID}.json`. Do not run `jq` extraction until after the helper or fallback execution completes.
-
-```bash
-CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/milestone-review-${REVIEW_ID}.json)
-jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
-```
 
 Notes on Cursor flags:
 - `--mode=ask` — read-only mode, no file modifications
@@ -275,10 +293,41 @@ Notes on Cursor flags:
|
|||||||
- `-p` / `--print` — non-interactive mode, output to stdout
|
- `-p` / `--print` — non-interactive mode, output to stdout
|
||||||
- `--output-format json` — structured output with `session_id` and `result` fields
|
- `--output-format json` — structured output with `session_id` and `result` fields
|
||||||
|
|
||||||
|
Run the command script through the shared helper when available:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
if [ -x "$REVIEWER_RUNTIME" ]; then
|
||||||
|
"$REVIEWER_RUNTIME" \
|
||||||
|
--command-file /tmp/milestone-review-${REVIEW_ID}.sh \
|
||||||
|
--stdout-file /tmp/milestone-review-${REVIEW_ID}.runner.out \
|
||||||
|
--stderr-file /tmp/milestone-review-${REVIEW_ID}.stderr \
|
||||||
|
--status-file /tmp/milestone-review-${REVIEW_ID}.status
|
||||||
|
else
|
||||||
|
echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
|
||||||
|
bash /tmp/milestone-review-${REVIEW_ID}.sh >/tmp/milestone-review-${REVIEW_ID}.runner.out 2>/tmp/milestone-review-${REVIEW_ID}.stderr
|
||||||
|
fi
|
||||||
|
```
|
||||||
|
|
||||||
|
After the command completes:
|
||||||
|
- If `REVIEWER_CLI=cursor`, extract the final review text:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/milestone-review-${REVIEW_ID}.json)
|
||||||
|
jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
|
||||||
|
```
|
||||||
|
|
||||||
|
- If `REVIEWER_CLI=codex`, extract `CODEX_SESSION_ID` from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text is only in `.runner.out`, move or copy the actual review body into `/tmp/milestone-review-${REVIEW_ID}.md` before verdict parsing.
|
||||||
|
|
||||||
|
Fallback is allowed only when the helper is missing or not executable.
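
The Codex extraction described above can be sketched as follows. This is a minimal sketch, not part of the skill: `extract_codex_session_id` is a hypothetical helper name, and it assumes the captured output contains a line of the form `session id: <uuid>` with the UUID as the last field.

```bash
# Hypothetical helper (not part of the skill): print the Codex session ID
# found on the first "session id: <uuid>" line of a captured output file.
extract_codex_session_id() {
  local runner_out="$1"
  # Nothing to extract if the helper or fallback never wrote the file.
  [ -f "$runner_out" ] || return 1
  # Assumption: the UUID is the last whitespace-separated field on the line.
  awk '/session id:/ { print $NF; exit }' "$runner_out"
}
```

A caller might then run `CODEX_SESSION_ID=$(extract_codex_session_id "/tmp/milestone-review-${REVIEW_ID}.runner.out")` after the helper or fallback run has finished, and treat an empty result as "no session to resume".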
|
||||||
|
|
||||||
#### Step 4: Read Review & Check Verdict
|
#### Step 4: Read Review & Check Verdict
|
||||||
|
|
||||||
1. Read `/tmp/milestone-review-${REVIEW_ID}.md`
|
1. Read `/tmp/milestone-review-${REVIEW_ID}.md`
|
||||||
2. Present review to the user:
|
2. If the review failed, produced empty output, or reached helper timeout, also read:
|
||||||
|
- `/tmp/milestone-review-${REVIEW_ID}.stderr`
|
||||||
|
- `/tmp/milestone-review-${REVIEW_ID}.status`
|
||||||
|
- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
|
||||||
|
3. Present review to the user:
|
||||||
|
|
||||||
```
|
```
|
||||||
## Milestone Review — Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})
|
## Milestone Review — Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})
|
||||||
@@ -286,10 +335,12 @@ Notes on Cursor flags:
|
|||||||
[Reviewer feedback]
|
[Reviewer feedback]
|
||||||
```
|
```
|
||||||
|
|
||||||
3. Check verdict:
|
4. Check verdict:
|
||||||
- **VERDICT: APPROVED** -> proceed to Phase 4 Step 6 (commit & approve)
|
- **VERDICT: APPROVED** -> proceed to Phase 4 Step 6 (commit & approve)
|
||||||
- **VERDICT: REVISE** -> go to Step 5
|
- **VERDICT: REVISE** -> go to Step 5
|
||||||
- No clear verdict but positive / no actionable items -> treat as approved
|
- No clear verdict but positive / no actionable items -> treat as approved
|
||||||
|
- Helper state `completed-empty-output` -> treat as failed review attempt, surface stderr/status, fix invocation or prompt handling, then retry
|
||||||
|
- Helper state `needs-operator-decision` -> surface status log, note any `stall-warning` heartbeat lines as non-terminal operator hints, and decide whether to keep waiting, abort, or retry with different helper parameters
|
||||||
- Max rounds (`MAX_ROUNDS`) reached -> present to user for manual decision (proceed or stop)
|
- Max rounds (`MAX_ROUNDS`) reached -> present to user for manual decision (proceed or stop)
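
The verdict check above could be mechanized roughly as below. This is only a sketch: `parse_verdict` is a hypothetical function name, and it assumes the reviewer followed the prompt and ended with one of the two exact verdict strings. An `unclear` result still needs the manual judgment described in the list (positive with no actionable items, or a helper state).

```bash
# Hypothetical helper: classify a review file as approved, revise, or unclear.
parse_verdict() {
  local review_file="$1" last
  # Use the last marker in the file, in case the reviewer quotes one earlier.
  last=$(grep -E -o 'VERDICT: (APPROVED|REVISE)' "$review_file" 2>/dev/null | tail -n 1) || true
  case "$last" in
    'VERDICT: APPROVED') echo approved ;;
    'VERDICT: REVISE')   echo revise ;;
    *)                   echo unclear ;;
  esac
}
```

For example, `parse_verdict "/tmp/milestone-review-${REVIEW_ID}.md"` would print one of the three labels for the current round.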
|
||||||
|
|
||||||
#### Step 5: Address Feedback & Re-verify
|
#### Step 5: Address Feedback & Re-verify
|
||||||
@@ -309,6 +360,8 @@ If a revision contradicts the user's explicit requirements, skip it and note it
|
|||||||
|
|
||||||
#### Step 6: Re-submit to Reviewer (Rounds 2-N)
|
#### Step 6: Re-submit to Reviewer (Rounds 2-N)
|
||||||
|
|
||||||
|
Rewrite `/tmp/milestone-review-${REVIEW_ID}.sh` for the next round. The script should contain the reviewer invocation only; do not run it directly.
|
||||||
|
|
||||||
**If `REVIEWER_CLI` is `codex`:**
|
**If `REVIEWER_CLI` is `codex`:**
|
||||||
|
|
||||||
Resume the existing session:
|
Resume the existing session:
|
||||||
@@ -367,20 +420,31 @@ Changes made:
|
|||||||
Re-review. If solid, end with: VERDICT: APPROVED
|
Re-review. If solid, end with: VERDICT: APPROVED
|
||||||
If more changes needed, end with: VERDICT: REVISE" \
|
If more changes needed, end with: VERDICT: REVISE" \
|
||||||
> /tmp/milestone-review-${REVIEW_ID}.json
|
> /tmp/milestone-review-${REVIEW_ID}.json
|
||||||
|
|
||||||
jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
|
|
||||||
```
|
```
|
||||||
|
|
||||||
If resume fails, fall back to fresh `cursor-agent -p` with context about prior rounds.
|
If resume fails, fall back to fresh `cursor-agent -p` with context about prior rounds.
|
||||||
|
|
||||||
|
Run `jq` extraction only after the helper or fallback execution completes; then write the `.result` field of the JSON response to `/tmp/milestone-review-${REVIEW_ID}.md`.
|
||||||
|
|
||||||
|
After updating `/tmp/milestone-review-${REVIEW_ID}.sh`, run the same helper/fallback flow from Round 1.
|
||||||
|
|
||||||
Return to Step 4.
|
Return to Step 4.
|
||||||
|
|
||||||
#### Step 7: Cleanup Per Milestone
|
#### Step 7: Cleanup Per Milestone
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
rm -f /tmp/milestone-${REVIEW_ID}.md /tmp/milestone-review-${REVIEW_ID}.md /tmp/milestone-review-${REVIEW_ID}.json
|
rm -f \
|
||||||
|
/tmp/milestone-${REVIEW_ID}.md \
|
||||||
|
/tmp/milestone-review-${REVIEW_ID}.md \
|
||||||
|
/tmp/milestone-review-${REVIEW_ID}.json \
|
||||||
|
/tmp/milestone-review-${REVIEW_ID}.stderr \
|
||||||
|
/tmp/milestone-review-${REVIEW_ID}.status \
|
||||||
|
/tmp/milestone-review-${REVIEW_ID}.runner.out \
|
||||||
|
/tmp/milestone-review-${REVIEW_ID}.sh
|
||||||
```
|
```
|
||||||
|
|
||||||
|
If the round failed, produced empty output, or reached operator-decision timeout, keep `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them immediately.
|
||||||
|
|
||||||
### Phase 6: Completion (REQUIRED SUB-SKILL)
|
### Phase 6: Completion (REQUIRED SUB-SKILL)
|
||||||
|
|
||||||
After all milestones are approved and committed:
|
After all milestones are approved and committed:
|
||||||
|
|||||||
@@ -154,7 +154,20 @@ Do NOT push. After committing:
|
|||||||
REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
|
REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
|
||||||
```
|
```
|
||||||
|
|
||||||
Use for all temp file paths: `/tmp/milestone-${REVIEW_ID}.md` and `/tmp/milestone-review-${REVIEW_ID}.md`.
|
Use `REVIEW_ID` for all milestone review temp file paths:
|
||||||
|
- `/tmp/milestone-${REVIEW_ID}.md`
|
||||||
|
- `/tmp/milestone-review-${REVIEW_ID}.md`
|
||||||
|
- `/tmp/milestone-review-${REVIEW_ID}.json`
|
||||||
|
- `/tmp/milestone-review-${REVIEW_ID}.stderr`
|
||||||
|
- `/tmp/milestone-review-${REVIEW_ID}.status`
|
||||||
|
- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
|
||||||
|
- `/tmp/milestone-review-${REVIEW_ID}.sh`
|
||||||
|
|
||||||
|
Resolve the shared runtime helper path before writing the command script:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
REVIEWER_RUNTIME=~/.config/opencode/skills/reviewer-runtime/run-review.sh
|
||||||
|
```
|
||||||
|
|
||||||
#### Step 2: Write Review Payload
|
#### Step 2: Write Review Payload
|
||||||
|
|
||||||
@@ -183,6 +196,13 @@ Write to `/tmp/milestone-${REVIEW_ID}.md`:
|
|||||||
|
|
||||||
#### Step 3: Submit to Reviewer (Round 1)
|
#### Step 3: Submit to Reviewer (Round 1)
|
||||||
|
|
||||||
|
Write the reviewer invocation to `/tmp/milestone-review-${REVIEW_ID}.sh` as a bash script:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
#!/usr/bin/env bash
|
||||||
|
set -euo pipefail
|
||||||
|
```
|
||||||
|
|
||||||
**If `REVIEWER_CLI` is `codex`:**
|
**If `REVIEWER_CLI` is `codex`:**
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
@@ -203,7 +223,7 @@ Be specific and actionable. If solid, end with exactly: VERDICT: APPROVED
|
|||||||
If changes are needed, end with exactly: VERDICT: REVISE"
|
If changes are needed, end with exactly: VERDICT: REVISE"
|
||||||
```
|
```
|
||||||
|
|
||||||
Capture the Codex session ID from output (line `session id: <uuid>`). Store as `CODEX_SESSION_ID`.
|
Do not try to capture the Codex session ID yet. When using the helper, extract it from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the command completes (look for `session id: <uuid>`), then store it as `CODEX_SESSION_ID` for resume in subsequent rounds.
|
||||||
|
|
||||||
**If `REVIEWER_CLI` is `claude`:**
|
**If `REVIEWER_CLI` is `claude`:**
|
||||||
|
|
||||||
@@ -221,8 +241,7 @@ Evaluate:
|
|||||||
Be specific and actionable. If solid, end with exactly: VERDICT: APPROVED
|
Be specific and actionable. If solid, end with exactly: VERDICT: APPROVED
|
||||||
If changes are needed, end with exactly: VERDICT: REVISE" \
|
If changes are needed, end with exactly: VERDICT: REVISE" \
|
||||||
--model ${REVIEWER_MODEL} \
|
--model ${REVIEWER_MODEL} \
|
||||||
--allowedTools Read \
|
--allowedTools Read
|
||||||
> /tmp/milestone-review-${REVIEW_ID}.md
|
|
||||||
```
|
```
|
||||||
|
|
||||||
**If `REVIEWER_CLI` is `cursor`:**
|
**If `REVIEWER_CLI` is `cursor`:**
|
||||||
@@ -247,19 +266,43 @@ If changes are needed, end with exactly: VERDICT: REVISE" \
|
|||||||
> /tmp/milestone-review-${REVIEW_ID}.json
|
> /tmp/milestone-review-${REVIEW_ID}.json
|
||||||
```
|
```
|
||||||
|
|
||||||
Extract session ID and review text (requires `jq`):
|
For `cursor`, the command script writes raw JSON to `/tmp/milestone-review-${REVIEW_ID}.json`. Do not run `jq` extraction until after the helper or fallback execution completes. If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivalent.
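
A pre-flight check for `jq` might look like the sketch below. `require_tool` is a hypothetical name introduced here for illustration; it relies only on the shell builtin `command -v`.

```bash
# Hypothetical helper: succeed if a tool is on PATH, otherwise warn and fail.
require_tool() {
  local tool="$1"
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "Missing required tool: $tool" >&2
    return 1
  fi
}
```

With this in place, `require_tool jq || echo "Install jq first, e.g. brew install jq on macOS" >&2` would surface the problem before any extraction is attempted.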
|
||||||
|
|
||||||
|
Run the command script through the shared helper when available:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
if [ -x "$REVIEWER_RUNTIME" ]; then
|
||||||
|
"$REVIEWER_RUNTIME" \
|
||||||
|
--command-file /tmp/milestone-review-${REVIEW_ID}.sh \
|
||||||
|
--stdout-file /tmp/milestone-review-${REVIEW_ID}.runner.out \
|
||||||
|
--stderr-file /tmp/milestone-review-${REVIEW_ID}.stderr \
|
||||||
|
--status-file /tmp/milestone-review-${REVIEW_ID}.status
|
||||||
|
else
|
||||||
|
echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
|
||||||
|
bash /tmp/milestone-review-${REVIEW_ID}.sh >/tmp/milestone-review-${REVIEW_ID}.runner.out 2>/tmp/milestone-review-${REVIEW_ID}.stderr
|
||||||
|
fi
|
||||||
|
```
|
||||||
|
|
||||||
|
After the command completes:
|
||||||
|
- If `REVIEWER_CLI=cursor`, extract the final review text:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/milestone-review-${REVIEW_ID}.json)
|
CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/milestone-review-${REVIEW_ID}.json)
|
||||||
jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
|
jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
|
||||||
```
|
```
|
||||||
|
|
||||||
If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivalent.
|
- If `REVIEWER_CLI=codex`, extract `CODEX_SESSION_ID` from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text is only in `.runner.out`, move or copy the actual review body into `/tmp/milestone-review-${REVIEW_ID}.md` before verdict parsing.
|
||||||
|
|
||||||
|
Fallback is allowed only when the helper is missing or not executable.
|
||||||
|
|
||||||
#### Step 4: Read Review & Check Verdict
|
#### Step 4: Read Review & Check Verdict
|
||||||
|
|
||||||
1. Read `/tmp/milestone-review-${REVIEW_ID}.md`
|
1. Read `/tmp/milestone-review-${REVIEW_ID}.md`
|
||||||
2. Present review to the user:
|
2. If the review failed, produced empty output, or reached helper timeout, also read:
|
||||||
|
- `/tmp/milestone-review-${REVIEW_ID}.stderr`
|
||||||
|
- `/tmp/milestone-review-${REVIEW_ID}.status`
|
||||||
|
- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
|
||||||
|
3. Present review to the user:
|
||||||
|
|
||||||
```
|
```
|
||||||
## Milestone Review — Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})
|
## Milestone Review — Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})
|
||||||
@@ -267,10 +310,12 @@ If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivale
|
|||||||
[Reviewer feedback]
|
[Reviewer feedback]
|
||||||
```
|
```
|
||||||
|
|
||||||
3. Check verdict:
|
4. Check verdict:
|
||||||
- **VERDICT: APPROVED** -> proceed to Phase 5 Step 6 (commit & approve)
|
- **VERDICT: APPROVED** -> proceed to Phase 5 Step 6 (commit & approve)
|
||||||
- **VERDICT: REVISE** -> go to Step 5
|
- **VERDICT: REVISE** -> go to Step 5
|
||||||
- No clear verdict but positive / no actionable items -> treat as approved
|
- No clear verdict but positive / no actionable items -> treat as approved
|
||||||
|
- Helper state `completed-empty-output` -> treat as failed review attempt, surface stderr/status, fix invocation or prompt handling, then retry
|
||||||
|
- Helper state `needs-operator-decision` -> surface status log, note any `stall-warning` heartbeat lines as non-terminal operator hints, and decide whether to keep waiting, abort, or retry with different helper parameters
|
||||||
- Max rounds (`MAX_ROUNDS`) reached -> present to user for manual decision (proceed or stop)
|
- Max rounds (`MAX_ROUNDS`) reached -> present to user for manual decision (proceed or stop)
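
The two helper-state branches above could be detected along these lines. This is a sketch under a loud assumption: it presumes the helper writes the state names (`completed-empty-output`, `needs-operator-decision`) and `stall-warning` heartbeat lines as plain text into the status file. The real status format is not shown here and may differ; both function names are hypothetical.

```bash
# Assumption: state names appear verbatim somewhere in the status file.
classify_helper_state() {
  local status_file="$1"
  if grep -q 'needs-operator-decision' "$status_file" 2>/dev/null; then
    echo needs-operator-decision
  elif grep -q 'completed-empty-output' "$status_file" 2>/dev/null; then
    echo completed-empty-output
  else
    echo other
  fi
}

# Count stall-warning heartbeat lines so they can be reported as hints.
count_stall_warnings() {
  grep -c 'stall-warning' "$1" 2>/dev/null || true
}
```

The count is informational only, since the stall warnings are described as non-terminal; the decision to wait, abort, or retry stays with the operator.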
|
||||||
|
|
||||||
#### Step 5: Address Feedback & Re-verify
|
#### Step 5: Address Feedback & Re-verify
|
||||||
@@ -290,6 +335,8 @@ If a revision contradicts the user's explicit requirements, skip it and note it
|
|||||||
|
|
||||||
#### Step 6: Re-submit to Reviewer (Rounds 2-N)
|
#### Step 6: Re-submit to Reviewer (Rounds 2-N)
|
||||||
|
|
||||||
|
Rewrite `/tmp/milestone-review-${REVIEW_ID}.sh` for the next round. The script should contain the reviewer invocation only; do not run it directly.
|
||||||
|
|
||||||
**If `REVIEWER_CLI` is `codex`:**
|
**If `REVIEWER_CLI` is `codex`:**
|
||||||
|
|
||||||
Resume the existing session:
|
Resume the existing session:
|
||||||
@@ -348,20 +395,31 @@ Changes made:
|
|||||||
Re-review. If solid, end with: VERDICT: APPROVED
|
Re-review. If solid, end with: VERDICT: APPROVED
|
||||||
If more changes needed, end with: VERDICT: REVISE" \
|
If more changes needed, end with: VERDICT: REVISE" \
|
||||||
> /tmp/milestone-review-${REVIEW_ID}.json
|
> /tmp/milestone-review-${REVIEW_ID}.json
|
||||||
|
|
||||||
jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
|
|
||||||
```
|
```
|
||||||
|
|
||||||
If resume fails, fall back to fresh `cursor-agent -p` with context about prior rounds.
|
If resume fails, fall back to fresh `cursor-agent -p` with context about prior rounds.
|
||||||
|
|
||||||
|
Run `jq` extraction only after the helper or fallback execution completes; then write the `.result` field of the JSON response to `/tmp/milestone-review-${REVIEW_ID}.md`.
|
||||||
|
|
||||||
|
After updating `/tmp/milestone-review-${REVIEW_ID}.sh`, run the same helper/fallback flow from Round 1.
|
||||||
|
|
||||||
Return to Step 4.
|
Return to Step 4.
|
||||||
|
|
||||||
#### Step 7: Cleanup Per Milestone
|
#### Step 7: Cleanup Per Milestone
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
rm -f /tmp/milestone-${REVIEW_ID}.md /tmp/milestone-review-${REVIEW_ID}.md /tmp/milestone-review-${REVIEW_ID}.json
|
rm -f \
|
||||||
|
/tmp/milestone-${REVIEW_ID}.md \
|
||||||
|
/tmp/milestone-review-${REVIEW_ID}.md \
|
||||||
|
/tmp/milestone-review-${REVIEW_ID}.json \
|
||||||
|
/tmp/milestone-review-${REVIEW_ID}.stderr \
|
||||||
|
/tmp/milestone-review-${REVIEW_ID}.status \
|
||||||
|
/tmp/milestone-review-${REVIEW_ID}.runner.out \
|
||||||
|
/tmp/milestone-review-${REVIEW_ID}.sh
|
||||||
```
|
```
|
||||||
|
|
||||||
|
If the round failed, produced empty output, or reached operator-decision timeout, keep `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them immediately.
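
One way to combine the cleanup command with the retention rule is sketched below. `cleanup_review_files` and its `keep_diagnostics` and `tmp_dir` parameters are hypothetical, introduced only for illustration; the file names follow the list given for `REVIEW_ID`.

```bash
# Hypothetical helper: remove per-milestone temp files, optionally keeping
# the diagnostic logs (.stderr, .status, .runner.out) for a failed round.
cleanup_review_files() {
  local review_id="$1" keep_diagnostics="${2:-0}" tmp_dir="${3:-/tmp}"
  local base="$tmp_dir/milestone-review-$review_id"
  # Payload, review text, raw JSON, and command script are always removed.
  rm -f "$tmp_dir/milestone-$review_id.md" "$base.md" "$base.json" "$base.sh"
  # Diagnostics are removed only when the round needs no further diagnosis.
  if [ "$keep_diagnostics" != "1" ]; then
    rm -f "$base.stderr" "$base.status" "$base.runner.out"
  fi
}
```

Calling `cleanup_review_files "$REVIEW_ID" 1` after a failed, empty-output, or operator-decision round would leave the three logs in place; a later call with `0` removes them once the issue is understood.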
|
||||||
|
|
||||||
### Phase 7: Completion (REQUIRED SUB-SKILL)
|
### Phase 7: Completion (REQUIRED SUB-SKILL)
|
||||||
|
|
||||||
After all milestones are approved and committed:
|
After all milestones are approved and committed:
|
||||||
|
|||||||