Perform code optimization and document cleanup (#1)

## Summary

- add repository-wide quality tooling and verification scaffolding, including CI workflows, pnpm workspace setup, ESLint/Prettier/markdown checks, and generated-output verification helpers
- reorganize skill sources and generation flow by introducing canonical `_source` variants, generator/manifests, reusable helper abstractions, and shared web-automation/browser utilities
- clean up and expand documentation so the root README flows into docs and skill docs, with clearer development, reviewer, installer, and workflow guidance

## Notable changes

- docs flow and consistency cleanup across `README.md`, `docs/README.md`, and related docs
- new scripts for `check`, docs verification, generated-file verification, shell portability, and safe directory replacement
- refactors in Atlassian and web-automation skill runtimes to reduce duplication and centralize reusable code
- changelog, development documentation, and CI surface updates

## Test Plan

- [ ] `pnpm run check`
- [ ] review generated/manifests and skill sync outputs
- [ ] smoke-check docs flow from `README.md` to `docs/README.md` to skill docs

## Notes

- this branch currently includes tracked `skills/web-automation/shared/node_modules` content that should be reviewed carefully as potentially noisy/accidental committed artifacts

Co-authored-by: Stefano Fiorini <stefano.fiorini@firsthorizon.com>
Reviewed-on: #1
This commit was merged in pull request #1.
@@ -0,0 +1,646 @@
---
name: implement-plan
description: Use when a plan folder (from create-plan) exists and needs to be executed in an isolated git worktree with iterative cross-model milestone review. ALWAYS invoke when user says "implement the plan", "execute the plan", "start implementation", "resume the plan", or similar execution requests.
---

# Implement Plan (Claude Code)

Execute an existing plan (created by `create-plan`) in an isolated git worktree, with iterative cross-model review at each milestone boundary.

## Prerequisite Check (MANDATORY)

Required:

- Plan folder exists under `ai_plan/` at project root
- `continuation-runbook.md` exists in plan folder
- `milestone-plan.md` exists in plan folder
- `story-tracker.md` exists in plan folder
- Git repo with worktree support: `git worktree list`
- Superpowers execution skills:
  - `superpowers:executing-plans`
  - `superpowers:using-git-worktrees`
  - `superpowers:verification-before-completion`
  - `superpowers:finishing-a-development-branch`

If any dependency is missing, stop immediately and return:

"Missing dependency: [specific missing item]. Ensure all prerequisites are met, then retry."

If no plan folder exists:

"No plan found under `ai_plan/`. Run `create-plan` first."

## Process

### Phase 1: Locate Plan

1. Scan `ai_plan/` for plan directories (most recent first by date prefix).
2. If multiple plans exist, ask user which one to implement.
3. If no plan exists, stop: "No plan found. Run create-plan first."
4. Read `continuation-runbook.md` first (source of truth).
5. Read `story-tracker.md` to detect resume state (`in-dev` or `completed` stories).
6. Read `milestone-plan.md` for implementation details.
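
The plan scan and the required-file check can be sketched roughly as below. This is a minimal sketch, not part of the skill: the function names `list_plans` and `missing_plan_files` are hypothetical, and it assumes each plan folder carries a sortable date prefix (for example `2026-03-04-auth-system`).

```bash
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative only.

# Print plan directories under ai_plan/, most recent first.
# Assumes folder names start with a sortable date prefix (YYYY-MM-DD-...).
list_plans() {
  local root="${1:-ai_plan}"
  [ -d "$root" ] || return 1
  find "$root" -mindepth 1 -maxdepth 1 -type d | sort -r
}

# Print every required plan file that is missing from the given plan folder.
# Prints nothing when the folder is complete.
missing_plan_files() {
  local plan_dir="$1" f
  for f in continuation-runbook.md milestone-plan.md story-tracker.md; do
    [ -f "$plan_dir/$f" ] || echo "$f"
  done
}
```

An empty result from `list_plans` corresponds to the "No plan found" stop condition, and any output from `missing_plan_files` corresponds to a "Missing dependency" stop.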

### Phase 2: Configure Reviewer

If the user has already specified a reviewer CLI and model (e.g., "implement the plan, review with claude sonnet"), use those values. Otherwise, ask:

1. **Which CLI should review each milestone?**
   - `codex`: OpenAI Codex CLI (`codex exec`)
   - `claude`: Claude Code CLI (`claude -p`)
   - `cursor`: Cursor Agent CLI (`cursor-agent -p`)
   - `skip`: No external review, proceed with user approval only

2. **Which model?** (only if a CLI was chosen)
   - For `codex`: default `o4-mini`, alternatives: `gpt-5.3-codex`, `o3`
   - For `claude`: default `sonnet`, alternatives: `opus`, `haiku`
   - For `cursor`: **run `cursor-agent models` first** to see available models
   - Accept any model string the user provides

3. **Max review rounds per milestone?** (default: 10)
   - If the user does not provide a value, set `MAX_ROUNDS=10`.

Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS`. These values are fixed for the entire run.

Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`.

If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:

```bash
pi --version
```

For shorthand `pi/<pi-model-name>`, split only on the first slash when the prefix is exactly `pi`; store the complete remainder in `REVIEWER_MODEL`. Examples: `pi/claude-opus-4-7` -> `claude-opus-4-7`, `pi/anthropic/claude-opus-4-7` -> `anthropic/claude-opus-4-7`, and `pi/openrouter/anthropic/claude-opus-4-7` -> `openrouter/anthropic/claude-opus-4-7`.
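
The first-slash split can be expressed with plain parameter expansion. The following is a minimal sketch; `parse_reviewer_spec` is a hypothetical name, and how non-`pi` input is handled is left to the caller.

```bash
# Hypothetical helper: split the "pi/<pi-model-name>" shorthand.
# Only the exact prefix "pi" is split; everything after the first slash
# is stored unchanged in REVIEWER_MODEL.
parse_reviewer_spec() {
  local spec="$1"
  case "$spec" in
    pi/?*)
      REVIEWER_CLI="pi"
      REVIEWER_MODEL="${spec#pi/}"   # strip only the leading "pi/"
      ;;
    *)
      return 1                       # not pi shorthand; handle elsewhere
      ;;
  esac
}
```

Because `${spec#pi/}` removes only the leading `pi/`, any later slashes (provider or router segments) stay in the model string.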

When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the model running this workflow. If the model/provider is unavailable, surface helper stderr/status and use `pi --list-models [search]` to inspect configured models.

### Phase 3: Set Up Worktree (REQUIRED SUB-SKILL)

Invoke `superpowers:using-git-worktrees` explicitly.

1. Branch naming: `implement/<plan-folder-name>` (e.g., `implement/2026-03-04-auth-system`).
2. Follow worktree skill's directory priority: `.worktrees/` > `worktrees/` > CLAUDE.md > ask user.
3. Verify `.gitignore` covers worktree directory.
4. Run project setup (auto-detect: `npm install`, `cargo build`, `pip install`, etc.).
5. Verify clean baseline (run tests).

**Resume detection:** If `story-tracker.md` shows `in-dev` or `completed` stories, check if worktree branch already exists (`git worktree list`). If so, `cd` into existing worktree instead of creating a new one.
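
One way to check for an existing worktree is to parse the machine-readable output of `git worktree list --porcelain`. This is a sketch under the assumption that the branch follows the `implement/<plan-folder-name>` naming above; `find_worktree_for_branch` is a hypothetical name.

```bash
# Hypothetical helper: print the path of the worktree that has the given
# branch checked out, or return 1 if no worktree uses that branch.
find_worktree_for_branch() {
  local branch="$1"
  git worktree list --porcelain | awk -v ref="refs/heads/$branch" '
    $1 == "worktree"            { path = substr($0, 10) }   # text after "worktree "
    $1 == "branch" && $2 == ref { print path; found = 1 }
    END                         { exit found ? 0 : 1 }
  '
}
```

A caller could then run `cd "$(find_worktree_for_branch "implement/<plan-folder-name>")"` when the function succeeds, and fall through to creating a new worktree when it does not.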

### Phase 4: Execute Milestones (Loop)

For each milestone (M1, M2, ...):

#### Step 1: Read Milestone Spec

Read the milestone section from `milestone-plan.md`.

#### Step 2: Update Tracker

Mark first story `in-dev` in `story-tracker.md`.

#### Step 3: Implement Stories

Execute each story in order. After completing each story:

1. Mark `in-dev` -> `completed` in `story-tracker.md`
2. Update counts
3. Mark next story `in-dev`

Commit hashes are not available yet; they are backfilled in Step 6 after the milestone is approved and committed.

#### Step 4: Verify Milestone (REQUIRED SUB-SKILL)

Invoke `superpowers:verification-before-completion` explicitly.

```bash
# Lint changed files
# Typecheck
# Run tests (targeted first, then full suite)
```

All must pass before proceeding. If failures: fix, re-verify. Do NOT proceed to review with failures.

#### Step 5: Milestone Review Loop

Send to reviewer for approval **before committing**. See Phase 5 for details. The review payload uses working-tree diffs (`git diff` for unstaged, `git diff --staged` for staged changes).

**Skip this step if reviewer was set to `skip`.** When skipped, present the milestone summary to the user and ask for approval directly.

#### Step 6: Commit & Approve

Only after the reviewer approves (or user overrides at max rounds):

```bash
git add <changed-files>
git commit -m "feat(<scope>): implement milestone M<N> - <description>"
```

Do NOT push. After committing:

1. Backfill the commit hash into the Notes column for all stories in this milestone in `story-tracker.md`.
2. Mark milestone as `approved` in `story-tracker.md`.
3. Move to next milestone.

### Phase 5: Milestone Review Loop (Detail)

**Skip this phase entirely if reviewer was set to `skip`.**

#### Step 1: Generate Session ID

```bash
REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
```

Use `REVIEW_ID` for all milestone review temp file paths:

- `/tmp/milestone-${REVIEW_ID}.md`
- `/tmp/milestone-review-${REVIEW_ID}.md`
- `/tmp/milestone-review-${REVIEW_ID}.json`
- `/tmp/milestone-review-${REVIEW_ID}.stderr`
- `/tmp/milestone-review-${REVIEW_ID}.status`
- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
- `/tmp/milestone-review-${REVIEW_ID}.sh`

Resolve the shared runtime helper path before writing the command script:

```bash
REVIEWER_RUNTIME=~/.claude/skills/reviewer-runtime/run-review.sh
```

Set helper success-artifact args before writing the command script:

```bash
HELPER_SUCCESS_FILE_ARGS=()
case "$REVIEWER_CLI" in
  codex)
    HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/milestone-review-${REVIEW_ID}.md)
    ;;
  cursor)
    HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/milestone-review-${REVIEW_ID}.json)
    ;;
esac
```

#### Step 2: Write Review Payload

Write to `/tmp/milestone-${REVIEW_ID}.md`:

```markdown
# Milestone M<N> Review: <title>

## Milestone Spec (from plan)
[Copy milestone section from milestone-plan.md]

## Acceptance Criteria
[Copy acceptance criteria checkboxes]

## Changes Made (git diff)
[Output of: git diff -- for unstaged changes, or git diff --staged for staged changes]

## Verification Output
### Lint
[lint output]
### Typecheck
[typecheck output]
### Tests
[test output with pass/fail counts]
```

#### Review Contract (Applies to Every Round)

The reviewer response must use this structure:

```text
## Summary
...

## Findings
### P0
- ...
### P1
- ...
### P2
- ...
### P3
- ...

## Verdict
VERDICT: APPROVED
```

Rules:

- Order findings from `P0` to `P3`.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- Use `- None.` when a severity has no findings.
- `VERDICT: APPROVED` is allowed only when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking.
- The calling agent should still try to fix `P3` findings when they are cheap and safe.

#### Liveness Contract (Applies While Review Is Running)

- The shared reviewer runtime emits `state=in-progress note="In progress N"` heartbeats every 60 seconds while the reviewer child is alive.
- The calling agent must keep waiting as long as a fresh `In progress N` heartbeat keeps arriving roughly once per minute.
- Do not abort just because the review is slow, a soft timeout fired, or a `stall-warning` line appears, as long as the `In progress N` heartbeat continues.
- Treat missing heartbeats, `state=failed`, `state=completed-empty-output`, and `state=needs-operator-decision` as escalation signals.
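
A polling agent could reduce the liveness rules to two small checks, sketched below. The exact layout of the status file is an assumption here: the sketch only assumes that it contains lines with a `state=<value>` token, as in the heartbeat format quoted above. `latest_state` and `needs_escalation` are hypothetical names.

```bash
# Hypothetical helper: print the most recent state= value found in a status file.
# Assumes the file holds lines containing a state=<value> token.
latest_state() {
  local status_file="$1"
  [ -f "$status_file" ] || return 1
  sed -n 's/.*state=\([a-z-]*\).*/\1/p' "$status_file" | tail -n 1
}

# Hypothetical helper: succeed when a state is one of the escalation signals
# named in the liveness contract.
needs_escalation() {
  case "$1" in
    failed|completed-empty-output|needs-operator-decision) return 0 ;;
    *) return 1 ;;
  esac
}
```

Detecting a missing heartbeat would additionally need a timestamp comparison (for example against the status file's modification time), which this sketch leaves out.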

#### Step 3: Submit to Reviewer (Round 1)

Write the reviewer invocation to `/tmp/milestone-review-${REVIEW_ID}.sh` as a bash script:

```bash
#!/usr/bin/env bash
set -euo pipefail
```

**If `REVIEWER_CLI` is `pi`:**

Fresh call every round (Pi reviewer calls do not use session resume):

```bash
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files \
  --model "$REVIEWER_MODEL" \
  --tools read,grep,find,ls \
  -p "Read the file /tmp/milestone-${REVIEW_ID}.md and review. Return exactly the required ## Summary, ## Findings, and ## Verdict structure."
```

**If `REVIEWER_CLI` is `codex`:**

```bash
codex exec \
  -m ${REVIEWER_MODEL} \
  -s read-only \
  -o /tmp/milestone-review-${REVIEW_ID}.md \
  "Review this milestone implementation. The spec, acceptance criteria, git diff, and verification output are in /tmp/milestone-${REVIEW_ID}.md.

Evaluate:
1. Correctness: Does the implementation match the milestone spec?
2. Acceptance criteria: Are all criteria met?
3. Code quality: Clean, maintainable, no obvious issues?
4. Test coverage: Are changes adequately tested?
5. Security: Any security concerns introduced?

Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use \`- None.\` when a severity has no findings.
- \`P0\` = total blocker, \`P1\` = major risk, \`P2\` = must-fix before approval, \`P3\` = cosmetic / nice to have.
- End with exactly one verdict line: \`VERDICT: APPROVED\` or \`VERDICT: REVISE\`
- \`VERDICT: APPROVED\` is allowed only when there are no \`P0\`, \`P1\`, or \`P2\` findings. \`P3\` findings are non-blocking."
```

Backticks inside the double-quoted prompt are escaped so that bash passes them through literally instead of treating them as command substitution.

Do not try to capture the Codex session ID yet. When using the helper, extract it from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the command completes (look for `session id: <uuid>`), then store it as `CODEX_SESSION_ID` for resume in subsequent rounds.
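
Extracting the session ID from the runner output could look like the sketch below. It assumes the line has the `session id: <uuid>` form described above, with a standard 36-character UUID; `extract_codex_session_id` is a hypothetical name.

```bash
# Hypothetical helper: print the first "session id: <uuid>" value found in
# the runner output file. Prints nothing when no such line exists.
extract_codex_session_id() {
  local runner_out="$1"
  sed -n 's/.*session id: \([0-9a-fA-F-]\{36\}\).*/\1/p' "$runner_out" | head -n 1
}
```

For example, `CODEX_SESSION_ID=$(extract_codex_session_id "/tmp/milestone-review-${REVIEW_ID}.runner.out")` would leave the variable empty when no ID was printed, which the caller can treat as "resume unavailable".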

**If `REVIEWER_CLI` is `claude`:**

```bash
claude -p \
  "Review this milestone implementation using the following spec, acceptance criteria, git diff, and verification output:

$(cat /tmp/milestone-${REVIEW_ID}.md)

Evaluate:
1. Correctness: Does the implementation match the milestone spec?
2. Acceptance criteria: Are all criteria met?
3. Code quality: Clean, maintainable, no obvious issues?
4. Test coverage: Are changes adequately tested?
5. Security: Any security concerns introduced?

Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use \`- None.\` when a severity has no findings.
- \`P0\` = total blocker, \`P1\` = major risk, \`P2\` = must-fix before approval, \`P3\` = cosmetic / nice to have.
- End with exactly one verdict line: \`VERDICT: APPROVED\` or \`VERDICT: REVISE\`
- \`VERDICT: APPROVED\` is allowed only when there are no \`P0\`, \`P1\`, or \`P2\` findings. \`P3\` findings are non-blocking." \
  --model ${REVIEWER_MODEL} \
  --strict-mcp-config \
  --setting-sources user
```

**If `REVIEWER_CLI` is `cursor`:**

```bash
cursor-agent -p \
  --mode=ask \
  --model ${REVIEWER_MODEL} \
  --trust \
  --output-format json \
  "Read the file /tmp/milestone-${REVIEW_ID}.md and review this milestone implementation.

Evaluate:
1. Correctness: Does the implementation match the milestone spec?
2. Acceptance criteria: Are all criteria met?
3. Code quality: Clean, maintainable, no obvious issues?
4. Test coverage: Are changes adequately tested?
5. Security: Any security concerns introduced?

Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use \`- None.\` when a severity has no findings.
- \`P0\` = total blocker, \`P1\` = major risk, \`P2\` = must-fix before approval, \`P3\` = cosmetic / nice to have.
- End with exactly one verdict line: \`VERDICT: APPROVED\` or \`VERDICT: REVISE\`
- \`VERDICT: APPROVED\` is allowed only when there are no \`P0\`, \`P1\`, or \`P2\` findings. \`P3\` findings are non-blocking." \
  > /tmp/milestone-review-${REVIEW_ID}.json
```

For `cursor`, the command script writes raw JSON to `/tmp/milestone-review-${REVIEW_ID}.json`. Do not run `jq` extraction until after the helper or fallback execution completes. If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivalent.

Run the command script through the shared helper when available:

```bash
if [ -x "$REVIEWER_RUNTIME" ]; then
  "$REVIEWER_RUNTIME" \
    --command-file /tmp/milestone-review-${REVIEW_ID}.sh \
    --stdout-file /tmp/milestone-review-${REVIEW_ID}.runner.out \
    --stderr-file /tmp/milestone-review-${REVIEW_ID}.stderr \
    --status-file /tmp/milestone-review-${REVIEW_ID}.status \
    "${HELPER_SUCCESS_FILE_ARGS[@]}"
else
  echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
  bash /tmp/milestone-review-${REVIEW_ID}.sh >/tmp/milestone-review-${REVIEW_ID}.runner.out 2>/tmp/milestone-review-${REVIEW_ID}.stderr
fi
```

Run the helper in the foreground and watch its live stdout for `state=in-progress` heartbeats. If your agent environment buffers command output until exit, start the helper in the background and poll `/tmp/milestone-review-${REVIEW_ID}.status` separately instead of treating heartbeats as post-hoc-only data.

After the command completes:

- If `REVIEWER_CLI=cursor`, extract the final review text:

  ```bash
  CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/milestone-review-${REVIEW_ID}.json)
  jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
  ```

- If `REVIEWER_CLI=codex`, extract `CODEX_SESSION_ID` from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text is only in `.runner.out`, move or copy the actual review body into `/tmp/milestone-review-${REVIEW_ID}.md` before verdict parsing.
- If `REVIEWER_CLI=claude` or `REVIEWER_CLI=pi`, promote stdout captured by the helper or fallback runner into the markdown review file:

  ```bash
  cp /tmp/milestone-review-${REVIEW_ID}.runner.out /tmp/milestone-review-${REVIEW_ID}.md
  ```

Fallback is allowed only when the helper is missing or not executable.

#### Step 4: Read Review & Check Verdict

1. Read `/tmp/milestone-review-${REVIEW_ID}.md`
2. If the review failed, produced empty output, or reached helper timeout, also read:
   - `/tmp/milestone-review-${REVIEW_ID}.stderr`
   - `/tmp/milestone-review-${REVIEW_ID}.status`
   - `/tmp/milestone-review-${REVIEW_ID}.runner.out`
3. Present review to the user:

   ```markdown
   ## Milestone Review: Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})

   [Reviewer feedback]
   ```

4. While the reviewer is still running, keep waiting as long as fresh `state=in-progress note="In progress N"` heartbeats continue to appear roughly once per minute.
5. Check verdict:
   - **VERDICT: APPROVED** with no `P0`, `P1`, or `P2` findings -> proceed to Phase 4 Step 6 (commit & approve)
   - **VERDICT: APPROVED** with only `P3` findings -> optionally fix the `P3` items if they are cheap and safe, then proceed
   - **VERDICT: REVISE** or any `P0`, `P1`, or `P2` finding -> go to Step 5
   - No clear verdict but `P0`, `P1`, and `P2` are all `- None.` -> treat as approved
   - Helper state `completed-empty-output` -> treat as failed review attempt, surface stderr/status, fix invocation or prompt handling, then retry
   - Helper state `needs-operator-decision` -> surface status log and decide whether to extend the timeout, abort, or retry with different helper parameters
   - Max rounds (`MAX_ROUNDS`) reached -> present to user for manual decision (proceed or stop)
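
The verdict rules can be approximated mechanically. The sketch below is one possible reading of them, not the skill's own parser: `classify_review` is a hypothetical name, it assumes the review file follows the required `## Findings` layout, and it treats any non-empty line other than `- None.` under `### P0`, `### P1`, or `### P2` as a blocking finding.

```bash
# Hypothetical helper: print "approved", "revise", or "unclear" for a review file.
classify_review() {
  local review="$1" blocking sections verdict
  # Count blocking findings and how many of the P0-P2 sections are present.
  read -r blocking sections <<<"$(awk '
    /^### P[0-2][[:space:]]*$/ { in_block = 1; sections++; next }
    /^##/                      { in_block = 0 }
    in_block && NF && $0 != "- None." { blocking++ }
    END { print blocking + 0, sections + 0 }
  ' "$review")"
  verdict=$(grep -E '^VERDICT: (APPROVED|REVISE)' "$review" | tail -n 1)

  if [ "$blocking" -gt 0 ] || [ "${verdict#VERDICT: REVISE}" != "$verdict" ]; then
    echo "revise"
  elif [ "${verdict#VERDICT: APPROVED}" != "$verdict" ]; then
    echo "approved"
  elif [ "$sections" -eq 3 ]; then
    echo "approved"   # no verdict line, but P0-P2 are all "- None."
  else
    echo "unclear"    # surface to the user instead of guessing
  fi
}
```

Helper states and the `MAX_ROUNDS` limit are deliberately outside this function; they depend on the status file and the round counter rather than on the review text.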

#### Step 5: Address Feedback & Re-verify

1. Address the reviewer findings in priority order (`P0` -> `P1` -> `P2`, then `P3` when practical). Do NOT commit yet.
2. Re-run verification (lint/typecheck/tests); all must pass.
3. Update `/tmp/milestone-${REVIEW_ID}.md` with new diff and verification output.

Summarize revisions for the user:

```markdown
### Revisions (Round N)
- [Change and reason, one bullet per issue addressed]
```

If a revision contradicts the user's explicit requirements, skip it and note it for the user.

#### Step 6: Re-submit to Reviewer (Rounds 2-N)

Rewrite `/tmp/milestone-review-${REVIEW_ID}.sh` for the next round. The script should contain the reviewer invocation only; do not run it directly.

**If `REVIEWER_CLI` is `pi`:**

Fresh call with prior-round context (Pi reviewer calls do not use session resume):

```bash
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files \
  --model "$REVIEWER_MODEL" \
  --tools read,grep,find,ls \
  -p "You previously reviewed this milestone and requested revisions. Read the updated payload at /tmp/milestone-${REVIEW_ID}.md and re-review using the same ## Summary, ## Findings, and ## Verdict structure."
```

**If `REVIEWER_CLI` is `codex`:**

Resume the existing session:

```bash
codex exec resume ${CODEX_SESSION_ID} \
  -o /tmp/milestone-review-${REVIEW_ID}.md \
  "I've addressed your feedback. Updated diff and verification output are in /tmp/milestone-${REVIEW_ID}.md.

Changes made:
[List specific changes]

Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking."
```

If resume fails (session expired), fall back to fresh `codex exec` with context about prior rounds.

**If `REVIEWER_CLI` is `claude`:**

Fresh call with accumulated context (Claude CLI has no session resume):

```bash
claude -p \
  "You previously reviewed milestone M<N> and requested revisions.

Previous feedback summary: [key points from last review]

I've addressed your feedback. Updated diff and verification output are below.

$(cat /tmp/milestone-${REVIEW_ID}.md)

Changes made:
[List specific changes]

Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking." \
  --model ${REVIEWER_MODEL} \
  --strict-mcp-config \
  --setting-sources user
```

**If `REVIEWER_CLI` is `cursor`:**

Resume the existing session:

```bash
cursor-agent --resume ${CURSOR_SESSION_ID} -p \
  --mode=ask \
  --model ${REVIEWER_MODEL} \
  --trust \
  --output-format json \
  "I've addressed your feedback. Updated diff and verification output are in /tmp/milestone-${REVIEW_ID}.md.

Changes made:
[List specific changes]

Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking." \
  > /tmp/milestone-review-${REVIEW_ID}.json
```

If resume fails, fall back to fresh `cursor-agent -p` with context about prior rounds.

Do not run `jq` extraction until after the helper or fallback execution completes, then extract `/tmp/milestone-review-${REVIEW_ID}.md` from the JSON response.

After updating `/tmp/milestone-review-${REVIEW_ID}.sh`, run the same helper/fallback flow from Round 1.

Return to Step 4.

#### Step 7: Cleanup Per Milestone

```bash
rm -f \
  /tmp/milestone-${REVIEW_ID}.md \
  /tmp/milestone-review-${REVIEW_ID}.md \
  /tmp/milestone-review-${REVIEW_ID}.json \
  /tmp/milestone-review-${REVIEW_ID}.stderr \
  /tmp/milestone-review-${REVIEW_ID}.status \
  /tmp/milestone-review-${REVIEW_ID}.runner.out \
  /tmp/milestone-review-${REVIEW_ID}.sh
```

If the round failed, produced empty output, or reached operator-decision timeout, keep `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them immediately.

### Phase 6: Completion (REQUIRED SUB-SKILL)

After all milestones are approved and committed:

1. Invoke `superpowers:finishing-a-development-branch` explicitly.
2. Run full test suite one final time; all must pass.
3. Merge the worktree branch into the parent branch:

   ```bash
   # From the main repo (not the worktree)
   git merge implement/<plan-folder-name>
   ```

4. Delete the worktree and its branch:

   ```bash
   git worktree remove <worktree-path>
   git branch -d implement/<plan-folder-name>
   ```

5. Mark plan status as `completed` in `story-tracker.md`.

### Phase 7: Final Report

Present summary:

```markdown
## Implementation Complete

**Plan:** <plan-folder-name>
**Milestones:** <N> completed, <N> approved
**Review rounds:** <total across all milestones>
**Branch:** implement/<plan-folder-name> (merged and deleted)
```

### Phase 8: Telegram Notification (MANDATORY)

Resolve the Telegram notifier helper from the installed Claude Code skills directory:

```bash
TELEGRAM_NOTIFY_RUNTIME=~/.claude/skills/reviewer-runtime/notify-telegram.sh
```

On every terminal outcome for the implement-plan run (fully completed, stopped after max rounds, skipped reviewer, or failure), send a Telegram summary if the helper exists and both `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` are configured:

```bash
if [ -x "$TELEGRAM_NOTIFY_RUNTIME" ] && [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
  "$TELEGRAM_NOTIFY_RUNTIME" --message "implement-plan completed for <plan-folder-name>: <status summary>"
fi
```

Rules:

- Telegram is the only supported notification path. Do not use desktop notifications, `say`, email, or any other notifier.
- Notification failures are non-blocking, but they must be surfaced to the user.
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
- If Telegram is not configured, state that no Telegram notification was sent.

## Tracker Discipline (MANDATORY)

**ALWAYS update `story-tracker.md` before/after each story. NEVER proceed with stale tracker state.**

Before starting any story:

1. Open `story-tracker.md`
2. Mark story `in-dev`
3. Add notes if relevant
4. Then begin implementation

After completing any story:

1. Mark story `completed`
2. Review pending stories
3. Update Last Updated and Stories Complete counts

Note: Commit hashes are backfilled into story Notes after the milestone commit (Step 6), not per-story.

## Verification Checklist

- [ ] Plan folder located and all required files present
- [ ] Reviewer configured or explicitly skipped
- [ ] Max review rounds confirmed (default: 10)
- [ ] Worktree created with branch `implement/<plan-folder-name>`
- [ ] Worktree directory verified in .gitignore
- [ ] Baseline tests pass in worktree
- [ ] Each milestone: stories tracked (in-dev -> completed)
- [ ] Each milestone: lint/typecheck/tests pass before review
- [ ] Each milestone: reviewer approved (or max rounds + user override)
- [ ] Each milestone: committed locally only after approval
- [ ] Each milestone: marked approved in story-tracker.md
- [ ] All milestones completed, approved, and committed
- [ ] Final test suite passes
- [ ] Worktree branch merged to parent and worktree deleted
- [ ] Story tracker updated with final status
- [ ] Telegram notification attempted if configured
@@ -0,0 +1,723 @@
|
||||
---
|
||||
name: implement-plan
|
||||
description: Use when a plan folder (from create-plan) exists and needs to be executed in an isolated git worktree with iterative cross-model milestone review. ALWAYS invoke when user says "implement the plan", "execute the plan", "start implementation", "resume the plan", or similar execution requests.
|
||||
---
|
||||
|
||||
# Implement Plan (Codex Native Superpowers)
|
||||
|
||||
Execute an existing plan (created by `create-plan`) in an isolated git worktree, with iterative cross-model review at each milestone boundary.
|
||||
|
||||
## Overview
|
||||
|
||||
This skill wraps the Superpowers execution flow for Codex:
|
||||
|
||||
1. Locate plan files under `ai_plan/`
|
||||
2. Set up an isolated git worktree
|
||||
3. Execute milestones one-by-one with lint/typecheck/test gates
|
||||
4. Review each milestone with a second model/provider
|
||||
5. Commit approved milestones, merge to parent branch, and delete worktree
|
||||
|
||||
**Core principle:** Codex uses native skill discovery from `~/.agents/skills/`. Do not use deprecated `superpowers-codex bootstrap` or `use-skill` CLI commands.
|
||||
|
||||
## Prerequisite Check (MANDATORY)
|
||||
|
||||
Required:
|
||||
|
||||
- Plan folder exists under `ai_plan/` at project root
|
||||
- `continuation-runbook.md` exists in plan folder
|
||||
- `milestone-plan.md` exists in plan folder
|
||||
- `story-tracker.md` exists in plan folder
|
||||
- Git repo with worktree support: `git worktree list`
|
||||
- Superpowers skills symlink: `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills`
|
||||
- Superpowers execution skills:
|
||||
- `superpowers:executing-plans`
|
||||
- `superpowers:using-git-worktrees`
|
||||
- `superpowers:verification-before-completion`
|
||||
- `superpowers:finishing-a-development-branch`
|
||||
|
||||
Verify before proceeding:
|
||||
|
||||
```bash
|
||||
test -L ~/.agents/skills/superpowers
|
||||
test -f ~/.agents/skills/superpowers/executing-plans/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/using-git-worktrees/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/verification-before-completion/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md
|
||||
```
|
||||
|
||||
If any dependency is missing, stop and return:
|
||||
|
||||
`Missing dependency: [specific missing item]. Ensure all prerequisites are met, then retry.`
|
||||
|
||||
If no plan folder exists:
|
||||
|
||||
`No plan found under ai_plan/. Run create-plan first.`
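
The plan-file checks above can be sketched as a small helper. This is a minimal sketch, not part of the skill: the function name `check_plan_folder` is hypothetical, and it assumes the plan folder path is passed as the first argument.

```bash
# Sketch: verify that a plan folder contains the three required files.
# check_plan_folder is a hypothetical helper name, not part of the skill.
check_plan_folder() {
  local plan_dir="$1"
  local required

  # No folder at all: report the "no plan" message from the skill.
  if [ ! -d "$plan_dir" ]; then
    echo "No plan found under ai_plan/. Run create-plan first."
    return 1
  fi

  # Folder exists: report the first required file that is missing.
  for required in continuation-runbook.md milestone-plan.md story-tracker.md; do
    if [ ! -f "$plan_dir/$required" ]; then
      echo "Missing dependency: $plan_dir/$required. Ensure all prerequisites are met, then retry."
      return 1
    fi
  done
  return 0
}
```

A call such as `check_plan_folder ai_plan/2026-03-04-auth-system` prints nothing and returns 0 when all three files are present.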
|
||||
|
||||
## Required Skill Invocation Rules
|
||||
|
||||
- Invoke relevant skills through native discovery (no CLI wrapper).
- Announce skill usage explicitly:
  - `I've read the [Skill Name] skill and I'm using it to [purpose].`
- For skills with checklists, track checklist items with `update_plan` todos.
- Tool mapping for Codex:
  - `TodoWrite` -> `update_plan`
  - `Task` subagents -> unavailable in Codex; do the work directly and state the limitation
  - `Skill` -> use native skill discovery from `~/.agents/skills/`
|
||||
|
||||
## Process
|
||||
|
||||
### Phase 1: Locate Plan
|
||||
|
||||
1. Scan `ai_plan/` for plan directories (most recent first by date prefix).
|
||||
2. If multiple plans exist, ask user which one to implement.
|
||||
3. If no plan exists, stop: "No plan found. Run create-plan first."
|
||||
4. Read `continuation-runbook.md` first (source of truth).
|
||||
5. Read `story-tracker.md` to detect resume state (`in-dev` or `completed` stories).
|
||||
6. Read `milestone-plan.md` for implementation details.
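
Step 1 of the list above (most recent plan first) could look like the following sketch. It assumes plan folder names begin with an ISO date such as `2026-03-04-`, so a reverse lexical sort is also a reverse chronological sort. The function name `list_plans` is hypothetical.

```bash
# Sketch: list plan folders under ai_plan/, most recent date prefix first.
# list_plans is a hypothetical helper; it assumes YYYY-MM-DD-prefixed names.
list_plans() {
  local root="${1:-ai_plan}"
  local dir
  for dir in "$root"/*/; do
    # When the glob matches nothing it stays literal, so skip non-directories.
    [ -d "$dir" ] || continue
    basename "$dir"
  done | sort -r
}
```

If `list_plans` prints more than one line, that is the case where the user should be asked which plan to implement.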
|
||||
|
||||
### Phase 2: Configure Reviewer
|
||||
|
||||
If the user has already specified a reviewer CLI and model (e.g., "implement the plan, review with claude sonnet"), use those values. Otherwise, ask:
|
||||
|
||||
1. **Which CLI should review each milestone?**
   - `codex`: OpenAI Codex CLI (`codex exec`)
   - `claude`: Claude Code CLI (`claude -p`)
   - `cursor`: Cursor Agent CLI (`cursor-agent -p`)
   - `skip`: No external review, proceed with user approval only

2. **Which model?** (only if a CLI was chosen)
   - For `codex`: default `o4-mini`, alternatives: `gpt-5.3-codex`, `o3`
   - For `claude`: default `sonnet`, alternatives: `opus`, `haiku`
   - For `cursor`: **run `cursor-agent models` first** to see available models
   - Accept any model string the user provides

3. **Max review rounds per milestone?** (default: 10)
   - If the user does not provide a value, set `MAX_ROUNDS=10`.
|
||||
|
||||
Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS`. These values are fixed for the entire run.
|
||||
|
||||
Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`.
|
||||
|
||||
If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:
|
||||
|
||||
```bash
pi --version
```
|
||||
|
||||
For shorthand `pi/<pi-model-name>`, split only on the first slash when the prefix is exactly `pi`; store the complete remainder in `REVIEWER_MODEL`. Examples: `pi/claude-opus-4-7` -> `claude-opus-4-7`, `pi/anthropic/claude-opus-4-7` -> `anthropic/claude-opus-4-7`, and `pi/openrouter/anthropic/claude-opus-4-7` -> `openrouter/anthropic/claude-opus-4-7`.
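
The first-slash split described above maps directly onto bash prefix removal. A minimal sketch follows; the function name `parse_reviewer_spec` is hypothetical, and the non-`pi` branch (treating the whole string as the CLI name) is an assumption made only to keep the example complete.

```bash
# Sketch: split "pi/<model>" on the first slash only.
# parse_reviewer_spec is a hypothetical helper name.
parse_reviewer_spec() {
  local spec="$1"
  case "$spec" in
    pi/*)
      REVIEWER_CLI="pi"
      # ${spec#pi/} strips only the leading "pi/"; later slashes are kept.
      REVIEWER_MODEL="${spec#pi/}"
      ;;
    *)
      # Assumption: anything else is a bare CLI name with no model attached.
      REVIEWER_CLI="$spec"
      REVIEWER_MODEL=""
      ;;
  esac
}
```

For example, `parse_reviewer_spec pi/openrouter/anthropic/claude-opus-4-7` leaves `REVIEWER_MODEL` set to `openrouter/anthropic/claude-opus-4-7`.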
|
||||
|
||||
When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the model running this workflow. If the model/provider is unavailable, surface helper stderr/status and use `pi --list-models [search]` to inspect configured models.
|
||||
|
||||
### Phase 3: Set Up Worktree (REQUIRED SUB-SKILL)
|
||||
|
||||
Invoke `superpowers:using-git-worktrees`.
|
||||
|
||||
1. Branch naming: `implement/<plan-folder-name>` (e.g., `implement/2026-03-04-auth-system`).
|
||||
2. Follow worktree skill's directory priority: `.worktrees/` > `worktrees/` > CLAUDE.md > ask user.
|
||||
3. Verify `.gitignore` covers worktree directory.
|
||||
4. Run project setup (auto-detect: `npm install`, `cargo build`, `pip install`, etc.).
|
||||
5. Verify clean baseline (run tests).
|
||||
|
||||
**Resume detection:** If `story-tracker.md` shows `in-dev` or `completed` stories, check if worktree branch already exists (`git worktree list`). If so, `cd` into existing worktree instead of creating a new one.
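
One way to sketch the resume check is to parse the output of `git worktree list --porcelain`, which prints a `worktree <path>` line followed by a `branch refs/heads/<name>` line for each worktree. The helper name `find_worktree_for_branch` is hypothetical; it reads the porcelain output on stdin so it can be tried without a real repository.

```bash
# Sketch: print the path of the worktree that has the given branch checked out.
# find_worktree_for_branch is a hypothetical helper; feed it the output of
# `git worktree list --porcelain` on stdin.
find_worktree_for_branch() {
  local branch="$1"
  awk -v ref="refs/heads/$branch" '
    # Remember the most recent worktree path ("worktree " is 9 characters).
    $1 == "worktree" { path = substr($0, 10) }
    # When the branch line matches, print the remembered path and stop.
    $1 == "branch" && $2 == ref { print path; exit }
  '
}
```

A possible use: `existing=$(git worktree list --porcelain | find_worktree_for_branch "implement/<plan-folder-name>")`, then `cd "$existing"` when the result is non-empty, otherwise create a new worktree.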
|
||||
|
||||
### Phase 4: Execute Milestones (Loop)
|
||||
|
||||
For each milestone (M1, M2, ...):
|
||||
|
||||
#### Step 1: Read Milestone Spec
|
||||
|
||||
Read the milestone section from `milestone-plan.md`.
|
||||
|
||||
#### Step 2: Update Tracker
|
||||
|
||||
Mark first story `in-dev` in `story-tracker.md`.
|
||||
|
||||
#### Step 3: Implement Stories
|
||||
|
||||
Execute each story in order. After completing each story:
|
||||
|
||||
1. Mark `in-dev` -> `completed` in `story-tracker.md`
|
||||
2. Update counts
|
||||
3. Mark next story `in-dev`
|
||||
|
||||
Commit hashes are not available yet — they are backfilled in Step 6 after the milestone is approved and committed.
|
||||
|
||||
#### Step 4: Verify Milestone (REQUIRED SUB-SKILL)
|
||||
|
||||
Invoke `superpowers:verification-before-completion`.
|
||||
|
||||
```bash
# Lint changed files
# Typecheck
# Run tests (targeted first, then full suite)
```
|
||||
|
||||
All must pass before proceeding. If failures: fix, re-verify. Do NOT proceed to review with failures.
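
The "all must pass" gate can be sketched as a loop that stops at the first failing command. The helper name `run_gates` is hypothetical, and the commands in the usage note are placeholders: the real lint, typecheck, and test commands depend on the project.

```bash
# Sketch: run verification commands in order and stop at the first failure.
# run_gates is a hypothetical helper; each argument is one shell command line.
run_gates() {
  local gate
  for gate in "$@"; do
    echo "Running gate: $gate"
    if ! bash -c "$gate"; then
      echo "Gate failed: $gate" >&2
      return 1
    fi
  done
  echo "All gates passed"
}
```

For a Node project this might be invoked as `run_gates "npm run lint" "npm run typecheck" "npm test"` (assumed script names); a non-zero return means the milestone is not ready for review.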
|
||||
|
||||
#### Step 5: Milestone Review Loop
|
||||
|
||||
Send to reviewer for approval **before committing**. See Phase 5 for details. The review payload uses working-tree diffs (`git diff` for unstaged, `git diff --staged` for staged changes).
|
||||
|
||||
**Skip this step if reviewer was set to `skip`.** When skipped, present the milestone summary to the user and ask for approval directly.
|
||||
|
||||
#### Step 6: Commit & Approve
|
||||
|
||||
Only after the reviewer approves (or user overrides at max rounds):
|
||||
|
||||
```bash
git add <changed-files>
git commit -m "feat(<scope>): implement milestone M<N> - <description>"
```
|
||||
|
||||
Do NOT push. After committing:
|
||||
|
||||
1. Backfill the commit hash into the Notes column for all stories in this milestone in `story-tracker.md`.
|
||||
2. Mark milestone as `approved` in `story-tracker.md`.
|
||||
3. Move to next milestone.
|
||||
|
||||
### Phase 5: Milestone Review Loop (Detail)
|
||||
|
||||
**Skip this phase entirely if reviewer was set to `skip`.**
|
||||
|
||||
#### Step 1: Generate Session ID
|
||||
|
||||
```bash
REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
```
|
||||
|
||||
Use `REVIEW_ID` for all milestone review temp file paths:
|
||||
|
||||
- `/tmp/milestone-${REVIEW_ID}.md`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.md`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.json`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.stderr`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.status`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.sh`
|
||||
|
||||
Resolve the shared runtime helper path before writing the command script:
|
||||
|
||||
```bash
|
||||
REVIEWER_RUNTIME=~/.codex/skills/reviewer-runtime/run-review.sh
|
||||
```
|
||||
|
||||
Set helper success-artifact args before writing the command script:
|
||||
|
||||
```bash
HELPER_SUCCESS_FILE_ARGS=()
case "$REVIEWER_CLI" in
  codex)
    HELPER_SUCCESS_FILE_ARGS+=(--success-file "/tmp/milestone-review-${REVIEW_ID}.md")
    ;;
  cursor)
    HELPER_SUCCESS_FILE_ARGS+=(--success-file "/tmp/milestone-review-${REVIEW_ID}.json")
    ;;
esac
```
|
||||
|
||||
#### Step 2: Write Review Payload
|
||||
|
||||
Write to `/tmp/milestone-${REVIEW_ID}.md`:
|
||||
|
||||
```markdown
# Milestone M<N> Review: <title>

## Milestone Spec (from plan)
[Copy milestone section from milestone-plan.md]

## Acceptance Criteria
[Copy acceptance criteria checkboxes]

## Changes Made (git diff)
[Output of: git diff (unstaged changes) and git diff --staged (staged changes)]

## Verification Output
### Lint
[lint output]
### Typecheck
[typecheck output]
### Tests
[test output with pass/fail counts]
```
|
||||
|
||||
#### Review Contract (Applies to Every Round)
|
||||
|
||||
The reviewer response must use this structure:
|
||||
|
||||
```text
## Summary
...

## Findings
### P0
- ...
### P1
- ...
### P2
- ...
### P3
- ...

## Verdict
VERDICT: APPROVED
```
|
||||
|
||||
Rules:
|
||||
|
||||
- Order findings from `P0` to `P3`.
|
||||
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
|
||||
- Use `- None.` when a severity has no findings.
|
||||
- `VERDICT: APPROVED` is allowed only when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking.
|
||||
- The calling agent should still try to fix `P3` findings when they are cheap and safe.
|
||||
|
||||
#### Liveness Contract (Applies While Review Is Running)
|
||||
|
||||
- The shared reviewer runtime emits `state=in-progress note="In progress N"` heartbeats every 60 seconds while the reviewer child is alive.
|
||||
- The calling agent must keep waiting as long as a fresh `In progress N` heartbeat keeps arriving roughly once per minute.
|
||||
- Do not abort just because the review is slow, a soft timeout fired, or a `stall-warning` line appears, as long as the `In progress N` heartbeat continues.
|
||||
- Treat missing heartbeats, `state=failed`, `state=completed-empty-output`, and `state=needs-operator-decision` as escalation signals.
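
To act on these signals, the calling agent needs the most recent `state=` value. The sketch below assumes the helper's status log contains the same `state=<name>` tokens that appear in the heartbeat lines quoted above; the exact file format is not specified here, and `last_review_state` is a hypothetical helper name.

```bash
# Sketch: print the most recent state recorded in a status log.
# Assumes lines contain tokens such as state=in-progress or state=failed.
# last_review_state is a hypothetical helper name.
last_review_state() {
  local status_file="$1"
  # Extract every state=<name> token, keep the last one, drop the "state=" prefix.
  grep -o 'state=[a-z-]*' "$status_file" | tail -n 1 | cut -d= -f2
}
```

A result of `in-progress` means keep waiting; `failed`, `completed-empty-output`, or `needs-operator-decision` means escalate as described above.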
|
||||
|
||||
#### Step 3: Submit to Reviewer (Round 1)
|
||||
|
||||
Write the reviewer invocation to `/tmp/milestone-review-${REVIEW_ID}.sh` as a bash script:
|
||||
|
||||
```bash
#!/usr/bin/env bash
set -euo pipefail
```
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call every round (Pi reviewer calls do not use session resume):
|
||||
|
||||
```bash
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files \
  --model "$REVIEWER_MODEL" \
  --tools read,grep,find,ls \
  -p "Read the file /tmp/milestone-${REVIEW_ID}.md and review. Return exactly the required ## Summary, ## Findings, and ## Verdict structure."
```
|
||||
|
||||
**If `REVIEWER_CLI` is `codex`:**
|
||||
|
||||
```bash
# Backticks inside the double-quoted prompt are escaped so bash does not
# treat them as command substitution.
codex exec \
  -m "${REVIEWER_MODEL}" \
  -s read-only \
  -o "/tmp/milestone-review-${REVIEW_ID}.md" \
  "Review this milestone implementation. The spec, acceptance criteria, git diff, and verification output are in /tmp/milestone-${REVIEW_ID}.md.

Evaluate:
1. Correctness: Does the implementation match the milestone spec?
2. Acceptance criteria: Are all criteria met?
3. Code quality: Clean, maintainable, no obvious issues?
4. Test coverage: Are changes adequately tested?
5. Security: Any security concerns introduced?

Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use \`- None.\` when a severity has no findings.
- \`P0\` = total blocker, \`P1\` = major risk, \`P2\` = must-fix before approval, \`P3\` = cosmetic / nice to have.
- End with exactly one verdict line: \`VERDICT: APPROVED\` or \`VERDICT: REVISE\`
- \`VERDICT: APPROVED\` is allowed only when there are no \`P0\`, \`P1\`, or \`P2\` findings. \`P3\` findings are non-blocking."
```
|
||||
|
||||
Do not try to capture the Codex session ID yet. When using the helper, extract it from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the command completes (look for `session id: <uuid>`), then store it as `CODEX_SESSION_ID` for resume in subsequent rounds.
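
Extracting the session ID from the runner output could be sketched as below. The only assumption taken from the text is the `session id: <uuid>` marker; the 36-character UUID pattern and the helper name `extract_codex_session_id` are illustrative choices.

```bash
# Sketch: pull the first "session id: <uuid>" value out of the runner output.
# extract_codex_session_id is a hypothetical helper name.
extract_codex_session_id() {
  local runner_out="$1"
  # Match a 36-character UUID (hex digits and hyphens) after the marker.
  sed -n 's/.*session id: \([0-9a-fA-F-]\{36\}\).*/\1/p' "$runner_out" | head -n 1
}
```

Typical use would be `CODEX_SESSION_ID=$(extract_codex_session_id "/tmp/milestone-review-${REVIEW_ID}.runner.out")`; an empty result means later rounds have to fall back to a fresh `codex exec` call.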
|
||||
|
||||
**If `REVIEWER_CLI` is `claude`:**
|
||||
|
||||
```bash
# $(cat ...) is intentional: it inlines the review payload into the prompt.
# Literal backticks in the prompt are escaped so bash does not execute them.
claude -p \
  "Review this milestone implementation using the following spec, acceptance criteria, git diff, and verification output:

$(cat /tmp/milestone-${REVIEW_ID}.md)

Evaluate:
1. Correctness: Does the implementation match the milestone spec?
2. Acceptance criteria: Are all criteria met?
3. Code quality: Clean, maintainable, no obvious issues?
4. Test coverage: Are changes adequately tested?
5. Security: Any security concerns introduced?

Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use \`- None.\` when a severity has no findings.
- \`P0\` = total blocker, \`P1\` = major risk, \`P2\` = must-fix before approval, \`P3\` = cosmetic / nice to have.
- End with exactly one verdict line: \`VERDICT: APPROVED\` or \`VERDICT: REVISE\`
- \`VERDICT: APPROVED\` is allowed only when there are no \`P0\`, \`P1\`, or \`P2\` findings. \`P3\` findings are non-blocking." \
  --model "${REVIEWER_MODEL}" \
  --strict-mcp-config \
  --setting-sources user
```
|
||||
|
||||
**If `REVIEWER_CLI` is `cursor`:**
|
||||
|
||||
```bash
# Literal backticks in the prompt are escaped so bash does not execute them.
cursor-agent -p \
  --mode=ask \
  --model "${REVIEWER_MODEL}" \
  --trust \
  --output-format json \
  "Read the file /tmp/milestone-${REVIEW_ID}.md and review this milestone implementation.

Evaluate:
1. Correctness: Does the implementation match the milestone spec?
2. Acceptance criteria: Are all criteria met?
3. Code quality: Clean, maintainable, no obvious issues?
4. Test coverage: Are changes adequately tested?
5. Security: Any security concerns introduced?

Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use \`- None.\` when a severity has no findings.
- \`P0\` = total blocker, \`P1\` = major risk, \`P2\` = must-fix before approval, \`P3\` = cosmetic / nice to have.
- End with exactly one verdict line: \`VERDICT: APPROVED\` or \`VERDICT: REVISE\`
- \`VERDICT: APPROVED\` is allowed only when there are no \`P0\`, \`P1\`, or \`P2\` findings. \`P3\` findings are non-blocking." \
  > "/tmp/milestone-review-${REVIEW_ID}.json"
```
|
||||
|
||||
For `cursor`, the command script writes raw JSON to `/tmp/milestone-review-${REVIEW_ID}.json`. Do not run `jq` extraction until after the helper or fallback execution completes. If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivalent.
|
||||
|
||||
Run the command script through the shared helper when available:
|
||||
|
||||
```bash
if [ -x "$REVIEWER_RUNTIME" ]; then
  "$REVIEWER_RUNTIME" \
    --command-file /tmp/milestone-review-${REVIEW_ID}.sh \
    --stdout-file /tmp/milestone-review-${REVIEW_ID}.runner.out \
    --stderr-file /tmp/milestone-review-${REVIEW_ID}.stderr \
    --status-file /tmp/milestone-review-${REVIEW_ID}.status \
    "${HELPER_SUCCESS_FILE_ARGS[@]}"
else
  echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
  bash /tmp/milestone-review-${REVIEW_ID}.sh >/tmp/milestone-review-${REVIEW_ID}.runner.out 2>/tmp/milestone-review-${REVIEW_ID}.stderr
fi
```
|
||||
|
||||
Run the helper in the foreground and watch its live stdout for `state=in-progress` heartbeats. If your agent environment buffers command output until exit, start the helper in the background and poll `/tmp/milestone-review-${REVIEW_ID}.status` separately instead of treating heartbeats as post-hoc-only data.
|
||||
|
||||
After the command completes:
|
||||
|
||||
- If `REVIEWER_CLI=cursor`, extract the final review text:
|
||||
|
||||
```bash
CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/milestone-review-${REVIEW_ID}.json)
jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
```
|
||||
|
||||
- If `REVIEWER_CLI=codex`, extract `CODEX_SESSION_ID` from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text is only in `.runner.out`, move or copy the actual review body into `/tmp/milestone-review-${REVIEW_ID}.md` before verdict parsing.
|
||||
- If `REVIEWER_CLI=claude` or `REVIEWER_CLI=pi`, promote stdout captured by the helper or fallback runner into the markdown review file:
|
||||
|
||||
```bash
cp /tmp/milestone-review-${REVIEW_ID}.runner.out /tmp/milestone-review-${REVIEW_ID}.md
```
|
||||
|
||||
Fallback is allowed only when the helper is missing or not executable.
|
||||
|
||||
#### Step 4: Read Review & Check Verdict
|
||||
|
||||
1. Read `/tmp/milestone-review-${REVIEW_ID}.md`
|
||||
2. If the review failed, produced empty output, or reached helper timeout, also read:
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.stderr`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.status`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
|
||||
3. Present review to the user:
|
||||
|
||||
```markdown
|
||||
## Milestone Review — Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})
|
||||
|
||||
[Reviewer feedback]
|
||||
```
|
||||
|
||||
4. While the reviewer is still running, keep waiting as long as fresh `state=in-progress note="In progress N"` heartbeats continue to appear roughly once per minute.
5. Check verdict:
|
||||
- **VERDICT: APPROVED** with no `P0`, `P1`, or `P2` findings -> proceed to Phase 4 Step 6 (commit & approve)
|
||||
- **VERDICT: APPROVED** with only `P3` findings -> optionally fix the `P3` items if they are cheap and safe, then proceed
|
||||
- **VERDICT: REVISE** or any `P0`, `P1`, or `P2` finding -> go to Step 5
|
||||
- No clear verdict but `P0`, `P1`, and `P2` are all `- None.` -> treat as approved
|
||||
- Helper state `completed-empty-output` -> treat as failed review attempt, surface stderr/status, fix invocation or prompt handling, then retry
|
||||
- Helper state `needs-operator-decision` -> surface status log and decide whether to extend the timeout, abort, or retry with different helper parameters
|
||||
- Max rounds (`MAX_ROUNDS`) reached -> present to user for manual decision (proceed or stop)
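
The verdict rules above can be sketched as a small classifier over the review file. This is an illustrative sketch, not the skill's actual parser: `classify_review` is a hypothetical name, and it assumes the reviewer followed the section layout from the Review Contract (`### P0` to `### P3` headings with `- ` bullets).

```bash
# Sketch: classify a review file as "approved", "revise", or "unclear".
# classify_review is a hypothetical helper name.
classify_review() {
  local review_file="$1"
  awk '
    # Entering a blocking severity section (P0, P1, or P2).
    /^### P[0-2][ \t]*$/ { in_blocking = 1; next }
    # Any other heading (### P3, ## Verdict, ...) ends the blocking section.
    /^#/ { in_blocking = 0 }
    # Count explicit "- None." bullets; any other bullet is a blocking finding.
    in_blocking && /^- None\.[ \t]*$/ { none++ }
    in_blocking && /^- / && $0 !~ /^- None\.[ \t]*$/ { blocking = 1 }
    /^VERDICT: APPROVED[ \t]*$/ { verdict = "approved" }
    /^VERDICT: REVISE[ \t]*$/   { verdict = "revise" }
    END {
      if (blocking || verdict == "revise") print "revise"
      else if (verdict == "approved")      print "approved"
      else if (none == 3)                  print "approved"  # no verdict, P0-P2 all "- None."
      else                                 print "unclear"
    }
  ' "$review_file"
}
```

`revise` sends the run to Step 5, `approved` to Phase 4 Step 6, and `unclear` is a cue to read the `.stderr`, `.status`, and `.runner.out` files before deciding. Helper states and the `MAX_ROUNDS` limit are handled separately from this parse.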
|
||||
|
||||
#### Step 5: Address Feedback & Re-verify
|
||||
|
||||
1. Address the reviewer findings in priority order (`P0` -> `P1` -> `P2`, then `P3` when practical) (do NOT commit yet).
|
||||
2. Re-run verification (lint/typecheck/tests) — all must pass.
|
||||
3. Update `/tmp/milestone-${REVIEW_ID}.md` with new diff and verification output.
|
||||
|
||||
Summarize revisions for the user:
|
||||
|
||||
```markdown
|
||||
### Revisions (Round N)
|
||||
- [Change and reason, one bullet per issue addressed]
|
||||
```
|
||||
|
||||
If a revision contradicts the user's explicit requirements, skip it and note it for the user.
|
||||
|
||||
#### Step 6: Re-submit to Reviewer (Rounds 2-N)
|
||||
|
||||
Rewrite `/tmp/milestone-review-${REVIEW_ID}.sh` for the next round. The script should contain the reviewer invocation only; do not run it directly.
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call with prior-round context (Pi reviewer calls do not use session resume):
|
||||
|
||||
```bash
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files \
  --model "$REVIEWER_MODEL" \
  --tools read,grep,find,ls \
  -p "You previously reviewed this milestone and requested revisions. Read the updated payload at /tmp/milestone-${REVIEW_ID}.md and re-review using the same ## Summary, ## Findings, and ## Verdict structure."
```
|
||||
|
||||
**If `REVIEWER_CLI` is `codex`:**
|
||||
|
||||
Resume the existing session:
|
||||
|
||||
```bash
# Literal backticks in the prompt are escaped so bash does not execute them.
codex exec resume "${CODEX_SESSION_ID}" \
  -o "/tmp/milestone-review-${REVIEW_ID}.md" \
  "I've addressed your feedback. Updated diff and verification output are in /tmp/milestone-${REVIEW_ID}.md.

Changes made:
[List specific changes]

Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking."
```
|
||||
|
||||
If resume fails (session expired), fall back to fresh `codex exec` with context about prior rounds.
|
||||
|
||||
**If `REVIEWER_CLI` is `claude`:**
|
||||
|
||||
Fresh call with accumulated context (Claude CLI has no session resume):
|
||||
|
||||
```bash
# $(cat ...) is intentional: it inlines the updated payload into the prompt.
# Literal backticks in the prompt are escaped so bash does not execute them.
claude -p \
  "You previously reviewed milestone M<N> and requested revisions.

Previous feedback summary: [key points from last review]

I've addressed your feedback. Updated diff and verification output are below.

$(cat /tmp/milestone-${REVIEW_ID}.md)

Changes made:
[List specific changes]

Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking." \
  --model "${REVIEWER_MODEL}" \
  --strict-mcp-config \
  --setting-sources user
```
|
||||
|
||||
**If `REVIEWER_CLI` is `cursor`:**
|
||||
|
||||
Resume the existing session:
|
||||
|
||||
```bash
# Literal backticks in the prompt are escaped so bash does not execute them.
cursor-agent --resume "${CURSOR_SESSION_ID}" -p \
  --mode=ask \
  --model "${REVIEWER_MODEL}" \
  --trust \
  --output-format json \
  "I've addressed your feedback. Updated diff and verification output are in /tmp/milestone-${REVIEW_ID}.md.

Changes made:
[List specific changes]

Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking." \
  > "/tmp/milestone-review-${REVIEW_ID}.json"
```
|
||||
|
||||
If resume fails, fall back to fresh `cursor-agent -p` with context about prior rounds.
|
||||
|
||||
Do not run `jq` extraction until after the helper or fallback execution completes, then extract `/tmp/milestone-review-${REVIEW_ID}.md` from the JSON response.
|
||||
|
||||
After updating `/tmp/milestone-review-${REVIEW_ID}.sh`, run the same helper/fallback flow from Round 1.
|
||||
|
||||
Return to Step 4.
|
||||
|
||||
#### Step 7: Cleanup Per Milestone
|
||||
|
||||
```bash
rm -f \
  /tmp/milestone-${REVIEW_ID}.md \
  /tmp/milestone-review-${REVIEW_ID}.md \
  /tmp/milestone-review-${REVIEW_ID}.json \
  /tmp/milestone-review-${REVIEW_ID}.stderr \
  /tmp/milestone-review-${REVIEW_ID}.status \
  /tmp/milestone-review-${REVIEW_ID}.runner.out \
  /tmp/milestone-review-${REVIEW_ID}.sh
```
|
||||
|
||||
If the round failed, produced empty output, or reached operator-decision timeout, keep `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them immediately.
|
||||
|
||||
### Phase 6: Completion (REQUIRED SUB-SKILL)
|
||||
|
||||
After all milestones are approved and committed:
|
||||
|
||||
1. Invoke `superpowers:finishing-a-development-branch`.
|
||||
2. Run full test suite one final time — all must pass.
|
||||
3. Merge the worktree branch into the parent branch:

   ```bash
   # From the main repo (not the worktree)
   git merge implement/<plan-folder-name>
   ```

4. Delete the worktree and its branch:

   ```bash
   git worktree remove <worktree-path>
   git branch -d implement/<plan-folder-name>
   ```

5. Mark plan status as `completed` in `story-tracker.md`.
|
||||
|
||||
### Phase 7: Final Report
|
||||
|
||||
Present summary:
|
||||
|
||||
```markdown
## Implementation Complete

**Plan:** <plan-folder-name>
**Milestones:** <N> completed, <N> approved
**Review rounds:** <total across all milestones>
**Branch:** implement/<plan-folder-name> (merged and deleted)
```
|
||||
|
||||
### Phase 8: Telegram Notification (MANDATORY)
|
||||
|
||||
Resolve the Telegram notifier helper from the installed Codex skills directory:
|
||||
|
||||
```bash
|
||||
TELEGRAM_NOTIFY_RUNTIME=~/.codex/skills/reviewer-runtime/notify-telegram.sh
|
||||
```
|
||||
|
||||
On every terminal outcome for the implement-plan run (fully completed, stopped after max rounds, skipped reviewer, or failure), send a Telegram summary if the helper exists and both `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` are configured:
|
||||
|
||||
```bash
if [ -x "$TELEGRAM_NOTIFY_RUNTIME" ] && [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
  "$TELEGRAM_NOTIFY_RUNTIME" --message "implement-plan completed for <plan-folder-name>: <status summary>"
fi
```
|
||||
|
||||
Rules:
|
||||
|
||||
- Telegram is the only supported notification path. Do not use desktop notifications, `say`, email, or any other notifier.
|
||||
- Notification failures are non-blocking, but they must be surfaced to the user.
|
||||
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
|
||||
- If Telegram is not configured, state that no Telegram notification was sent.
|
||||
|
||||
## Quick Reference
|
||||
|
||||
| Phase | Action | Required Output |
|---|---|---|
| 1 | Locate plan in `ai_plan/` | Plan folder identified, files read |
| 2 | Configure reviewer CLI and model | `REVIEWER_CLI`, `REVIEWER_MODEL`, `MAX_ROUNDS` |
| 3 | Invoke `superpowers:using-git-worktrees` | Worktree created, baseline passing |
| 4 | Execute milestones (loop) | Stories tracked, verified, reviewed, committed |
| 5 | Milestone review loop (per milestone) | Reviewer approval or max rounds + user override |
| 6 | Invoke `superpowers:finishing-a-development-branch` | Branch merged to parent, worktree deleted |
| 7 | Final report | Summary presented |
| 8 | Send Telegram notification | User notified or notification status reported |
|
||||
|
||||
## Tracker Discipline (MANDATORY)
|
||||
|
||||
**ALWAYS update `story-tracker.md` before/after each story. NEVER proceed with stale tracker state.**
|
||||
|
||||
Before starting any story:
|
||||
|
||||
1. Open `story-tracker.md`
|
||||
2. Mark story `in-dev`
|
||||
3. Add notes if relevant
|
||||
4. Then begin implementation
|
||||
|
||||
After completing any story:
|
||||
|
||||
1. Mark story `completed`
|
||||
2. Review pending stories
|
||||
3. Update Last Updated and Stories Complete counts
|
||||
|
||||
Note: Commit hashes are backfilled into story Notes after the milestone commit (Step 6), not per-story.
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
- Using deprecated commands like `superpowers-codex bootstrap` or `superpowers-codex use-skill`.
|
||||
- Proceeding to milestone review with failing tests.
|
||||
- Pushing commits before milestone approval.
|
||||
- Skipping worktree setup and working directly on the main branch.
|
||||
- Not capturing the Codex session ID for resume in subsequent review rounds.
|
||||
- Forgetting to update `story-tracker.md` between stories.
|
||||
- Creating a new worktree when one already exists for a resumed plan.
|
||||
- Using any notification path other than Telegram.
|
||||
|
||||
## Rationalizations and Counters
|
||||
|
||||
| Rationalization | Counter |
|---|---|
| "Bootstrap CLI is faster" | Deprecated for Codex; native discovery is the supported path. |
| "I can skip the worktree for small plans" | Worktree isolation is mandatory because it protects the main branch. |
| "Tests passed earlier, I can skip re-verification" | Each milestone must be independently verified before review. |
| "The reviewer approved, I can skip my own validation" | Reviewer feedback supplements but does not replace your own verification. |
| "I can commit before the reviewer approves" | Code must only be committed after milestone approval, since reviewers evaluate uncommitted diffs. |
|
||||
|
||||
## Red Flags - Stop and Correct
|
||||
|
||||
- You are about to run any `superpowers-codex` command.
|
||||
- You are pushing commits without user approval.
|
||||
- You did not announce which skill you invoked and why.
|
||||
- You are proceeding to review with failing lint/typecheck/tests.
|
||||
- You are skipping the worktree and working on the main branch.
|
||||
- You are applying a reviewer suggestion that contradicts user requirements.
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
- [ ] Plan folder located and all required files present
- [ ] Reviewer configured or explicitly skipped
- [ ] Max review rounds confirmed (default: 10)
- [ ] Worktree created with branch `implement/<plan-folder-name>`
- [ ] Worktree directory verified in `.gitignore`
- [ ] Baseline tests pass in worktree
- [ ] Each milestone: stories tracked (`in-dev` -> `completed`)
- [ ] Each milestone: lint/typecheck/tests pass before review
- [ ] Each milestone: reviewer approved (or max rounds + user override)
- [ ] Each milestone: committed locally only after approval
- [ ] Each milestone: marked `approved` in `story-tracker.md`
- [ ] All milestones completed, approved, and committed
- [ ] Final test suite passes
- [ ] Worktree branch merged to parent and worktree deleted
- [ ] Story tracker updated with final status
- [ ] Telegram notification attempted if configured
|
||||
@@ -0,0 +1,727 @@
|
||||
---
|
||||
name: implement-plan
|
||||
description: Use when a plan folder (from create-plan) exists and needs to be executed in an isolated git worktree with iterative cross-model milestone review in Cursor Agent CLI workflows. ALWAYS invoke when user says "implement the plan", "execute the plan", "start implementation", "resume the plan", or similar execution requests.
|
||||
---
|
||||
|
||||
# Implement Plan (Cursor Agent CLI)
|
||||
|
||||
Execute an existing plan (created by `create-plan`) in an isolated git worktree, with iterative cross-model review at each milestone boundary.
|
||||
|
||||
## Overview
|
||||
|
||||
This skill wraps the Superpowers execution flow for the Cursor Agent CLI (`cursor-agent`):
|
||||
|
||||
1. Locate plan files under `ai_plan/`
|
||||
2. Set up an isolated git worktree
|
||||
3. Execute milestones one-by-one with lint/typecheck/test gates
|
||||
4. Review each milestone with a second model/provider
|
||||
5. Commit approved milestones, merge to parent branch, and delete worktree
|
||||
|
||||
**Core principle:** Cursor Agent CLI discovers skills from `.cursor/skills/` (repo-local), `~/.cursor/skills/` (global), and installed Cursor plugin cache entries. It also reads `AGENTS.md` at the repo root for additional instructions.
|
||||
|
||||
## Prerequisite Check (MANDATORY)
|
||||
|
||||
Required:
|
||||
|
||||
- Cursor Agent CLI: `cursor-agent --version` (install via `curl https://cursor.com/install -fsS | bash`). The binary is `cursor-agent` (installed to `~/.local/bin/`). Some environments alias it as `cursor agent` (subcommand of the Cursor IDE CLI) — both forms work, but this skill uses `cursor-agent` throughout.
|
||||
- `jq` (required only if using `cursor` as the reviewer CLI): `jq --version` (install via `brew install jq` or your package manager)
|
||||
- Plan folder exists under `ai_plan/` at project root
|
||||
- `continuation-runbook.md` exists in plan folder
|
||||
- `milestone-plan.md` exists in plan folder
|
||||
- `story-tracker.md` exists in plan folder
|
||||
- Git repo with worktree support: `git worktree list`
|
||||
- Superpowers skills available from the Cursor plugin cache, `.cursor/skills/` (repo-local), or `~/.cursor/skills/` (global). Do not install both the plugin and a manual Superpowers copy, or Cursor may show duplicate skill entries.
|
||||
- Superpowers execution skills:
|
||||
- `superpowers:executing-plans`
|
||||
- `superpowers:using-git-worktrees`
|
||||
- `superpowers:verification-before-completion`
|
||||
- `superpowers:finishing-a-development-branch`
|
||||
|
||||
Verify before proceeding:
|
||||
|
||||
```bash
|
||||
cursor-agent --version
|
||||
test -f .cursor/skills/superpowers/skills/executing-plans/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/executing-plans/SKILL.md || find ~/.cursor/plugins/cache/cursor-public/superpowers -path '*/skills/executing-plans/SKILL.md' -print -quit 2>/dev/null | grep -q .
|
||||
test -f .cursor/skills/superpowers/skills/using-git-worktrees/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/using-git-worktrees/SKILL.md || find ~/.cursor/plugins/cache/cursor-public/superpowers -path '*/skills/using-git-worktrees/SKILL.md' -print -quit 2>/dev/null | grep -q .
|
||||
test -f .cursor/skills/superpowers/skills/verification-before-completion/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/verification-before-completion/SKILL.md || find ~/.cursor/plugins/cache/cursor-public/superpowers -path '*/skills/verification-before-completion/SKILL.md' -print -quit 2>/dev/null | grep -q .
|
||||
test -f .cursor/skills/superpowers/skills/finishing-a-development-branch/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/finishing-a-development-branch/SKILL.md || find ~/.cursor/plugins/cache/cursor-public/superpowers -path '*/skills/finishing-a-development-branch/SKILL.md' -print -quit 2>/dev/null | grep -q .
|
||||
# Only if using cursor as reviewer CLI:
|
||||
# jq --version
|
||||
```
|
||||
|
||||
If any dependency is missing, stop and return:
|
||||
|
||||
`Missing dependency: [specific missing item]. Install the Cursor Superpowers plugin or install Superpowers under .cursor/skills/ or ~/.cursor/skills/, then retry.`
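
The four per-skill checks above repeat the same three lookups. A minimal sketch of folding them into one loop, assuming the same three locations listed in this section; the function names `find_superpowers_skill` and `check_superpowers_skills` are hypothetical and not part of Superpowers or Cursor:

```bash
# Hypothetical helper: succeed if the named Superpowers skill is present in
# any of the three locations this document lists.
find_superpowers_skill() {
  local skill="$1"
  local rel="superpowers/skills/${skill}/SKILL.md"
  [ -f ".cursor/skills/${rel}" ] && return 0
  [ -f "${HOME}/.cursor/skills/${rel}" ] && return 0
  # The plugin cache layout may vary by version, so search instead of hard-coding it.
  find "${HOME}/.cursor/plugins/cache/cursor-public/superpowers" \
    -path "*/skills/${skill}/SKILL.md" -print -quit 2>/dev/null | grep -q .
}

# Print the stop message for the first missing skill, if any.
check_superpowers_skills() {
  local skill
  for skill in executing-plans using-git-worktrees \
    verification-before-completion finishing-a-development-branch; do
    if ! find_superpowers_skill "$skill"; then
      echo "Missing dependency: superpowers:${skill}. Install the Cursor Superpowers plugin or install Superpowers under .cursor/skills/ or ~/.cursor/skills/, then retry."
      return 1
    fi
  done
}
```

Calling `check_superpowers_skills` from the project root stops at the first missing skill, so the message names one specific item as the rule above requires.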
|
||||
|
||||
If no plan folder exists:
|
||||
|
||||
`No plan found under ai_plan/. Run create-plan first.`
|
||||
|
||||
## Required Skill Invocation Rules
|
||||
|
||||
- Invoke relevant skills through Cursor-native discovery (`.cursor/skills/`, `~/.cursor/skills/`, or installed Cursor plugin cache entries).
|
||||
- Announce skill usage explicitly:
|
||||
- `I've read the [Skill Name] skill and I'm using it to [purpose].`
|
||||
- For skills with checklists, track checklist items explicitly in conversation.
|
||||
|
||||
## Process
|
||||
|
||||
### Phase 1: Locate Plan
|
||||
|
||||
1. Scan `ai_plan/` for plan directories (most recent first by date prefix).
|
||||
2. If multiple plans exist, ask user which one to implement.
|
||||
3. If no plan exists, stop: "No plan found. Run create-plan first."
|
||||
4. Read `continuation-runbook.md` first (source of truth).
|
||||
5. Read `story-tracker.md` to detect resume state (`in-dev` or `completed` stories).
|
||||
6. Read `milestone-plan.md` for implementation details.
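
The scan in steps 1 to 3 can be sketched as follows. This assumes plan folder names begin with a `YYYY-MM-DD` prefix (as in the `2026-03-04-auth-system` example used later in this skill), so a reverse lexical sort puts the newest first; `list_plans` and `missing_plan_files` are illustrative names, not existing helpers:

```bash
# List plan folders under ai_plan/, newest date prefix first.
# Assumes folder names start with YYYY-MM-DD.
list_plans() {
  local root="${1:-ai_plan}" dir
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue
    basename "$dir"
  done | sort -r
}

# Report which of the three required plan files are missing from a folder.
missing_plan_files() {
  local plan_dir="$1" f
  for f in continuation-runbook.md milestone-plan.md story-tracker.md; do
    [ -f "$plan_dir/$f" ] || echo "$f"
  done
}
```

An empty result from `list_plans` corresponds to the "No plan found" stop condition; more than one line means the user should be asked which plan to implement.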
|
||||
|
||||
### Phase 2: Configure Reviewer
|
||||
|
||||
If the user has already specified a reviewer CLI and model (e.g., "implement the plan, review with codex o4-mini"), use those values. Otherwise, ask:
|
||||
|
||||
1. **Which CLI should review each milestone?**
|
||||
- `codex` — OpenAI Codex CLI (`codex exec`)
|
||||
- `claude` — Claude Code CLI (`claude -p`)
|
||||
- `cursor` — Cursor Agent CLI (`cursor-agent -p`)
|
||||
- `skip` — No external review, proceed with user approval only
|
||||
|
||||
2. **Which model?** (only if a CLI was chosen)
|
||||
- For `codex`: default `o4-mini`, alternatives: `gpt-5.3-codex`, `o3`
|
||||
- For `claude`: default `sonnet`, alternatives: `opus`, `haiku`
|
||||
- For `cursor`: **run `cursor-agent models` first** to see available models
|
||||
- Accept any model string the user provides
|
||||
|
||||
3. **Max review rounds per milestone?** (default: 10)
|
||||
- If the user does not provide a value, set `MAX_ROUNDS=10`.
|
||||
|
||||
Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS`. These values are fixed for the entire run.
|
||||
|
||||
Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`.
|
||||
|
||||
If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:
|
||||
|
||||
```bash
|
||||
pi --version
|
||||
```
|
||||
|
||||
For shorthand `pi/<pi-model-name>`, split only on the first slash when the prefix is exactly `pi`; store the complete remainder in `REVIEWER_MODEL`. Examples: `pi/claude-opus-4-7` -> `claude-opus-4-7`, `pi/anthropic/claude-opus-4-7` -> `anthropic/claude-opus-4-7`, and `pi/openrouter/anthropic/claude-opus-4-7` -> `openrouter/anthropic/claude-opus-4-7`.
|
||||
|
||||
When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the model running this workflow. If the model/provider is unavailable, surface helper stderr/status and use `pi --list-models [search]` to inspect configured models.
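
The `pi/<pi-model-name>` split rule above can be sketched with plain parameter expansion. The function name `parse_pi_shorthand` is hypothetical; only the `REVIEWER_CLI` and `REVIEWER_MODEL` variable names come from this skill:

```bash
# Split reviewer shorthand such as pi/anthropic/claude-opus-4-7.
# Only a leading "pi/" is treated as the CLI prefix; everything after the
# first slash is kept as the model string, slashes included.
parse_pi_shorthand() {
  local spec="$1"
  case "$spec" in
    pi/?*)
      REVIEWER_CLI=pi
      REVIEWER_MODEL="${spec#pi/}"
      ;;
    *)
      # Not Pi shorthand: leave both variables untouched.
      return 1
      ;;
  esac
}
```

Because `${spec#pi/}` strips only the literal `pi/` prefix, provider paths like `openrouter/anthropic/...` survive intact.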
|
||||
|
||||
### Phase 3: Set Up Worktree (REQUIRED SUB-SKILL)
|
||||
|
||||
Invoke `superpowers:using-git-worktrees`.
|
||||
|
||||
1. Branch naming: `implement/<plan-folder-name>` (e.g., `implement/2026-03-04-auth-system`).
|
||||
2. Follow worktree skill's directory priority: `.worktrees/` > `worktrees/` > CLAUDE.md > ask user.
|
||||
3. Verify `.gitignore` covers worktree directory.
|
||||
4. Run project setup (auto-detect: `npm install`, `cargo build`, `pip install`, etc.).
|
||||
5. Verify clean baseline (run tests).
|
||||
|
||||
**Resume detection:** If `story-tracker.md` shows `in-dev` or `completed` stories, check if worktree branch already exists (`git worktree list`). If so, `cd` into existing worktree instead of creating a new one.
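
One way to sketch the resume check is to parse `git worktree list --porcelain`, which prints a `worktree <path>` line followed by a `branch refs/heads/<name>` line for each worktree. The helper name below is illustrative only:

```bash
# Print the worktree path that has the given branch checked out, if any.
# Reads `git worktree list --porcelain` output on stdin.
worktree_path_for_branch() {
  local branch="$1"
  awk -v ref="branch refs/heads/${branch}" '
    /^worktree / { path = substr($0, 10) }   # text after "worktree "
    $0 == ref    { print path; exit }
  '
}

# Usage inside a repository (not executed here):
#   path=$(git worktree list --porcelain | worktree_path_for_branch "implement/2026-03-04-auth-system")
#   [ -n "$path" ] && cd "$path"
```

An empty result means no worktree exists for the branch yet, so the normal creation path applies.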
|
||||
|
||||
### Phase 4: Execute Milestones (Loop)
|
||||
|
||||
For each milestone (M1, M2, ...):
|
||||
|
||||
#### Step 1: Read Milestone Spec
|
||||
|
||||
Read the milestone section from `milestone-plan.md`.
|
||||
|
||||
#### Step 2: Update Tracker
|
||||
|
||||
Mark first story `in-dev` in `story-tracker.md`.
|
||||
|
||||
#### Step 3: Implement Stories
|
||||
|
||||
Execute each story in order. After completing each story:
|
||||
|
||||
1. Mark `in-dev` -> `completed` in `story-tracker.md`
|
||||
2. Update counts
|
||||
3. Mark next story `in-dev`
|
||||
|
||||
Commit hashes are not available yet — they are backfilled in Step 6 after the milestone is approved and committed.
|
||||
|
||||
#### Step 4: Verify Milestone (REQUIRED SUB-SKILL)
|
||||
|
||||
Invoke `superpowers:verification-before-completion`.
|
||||
|
||||
```bash
|
||||
# Lint changed files
|
||||
# Typecheck
|
||||
# Run tests (targeted first, then full suite)
|
||||
```
|
||||
|
||||
All must pass before proceeding. If anything fails, fix it and re-verify. Do NOT proceed to review with failures.
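
Since the block above leaves the actual commands to the project, here is a minimal sketch of a gate runner that enforces the "all must pass" rule. The function `run_gates` is hypothetical, and the commands in the usage note are placeholders for whatever the project really uses:

```bash
# Run verification gates in order and stop at the first failure.
# Each argument is one command line.
run_gates() {
  local gate
  for gate in "$@"; do
    echo "gate: $gate"
    if ! bash -c "$gate"; then
      echo "gate failed: $gate" >&2
      return 1
    fi
  done
}
```

For a Node project this might be called as `run_gates "npm run lint" "npm run typecheck" "npm test"` (assumed script names). A non-zero return means the milestone is not ready for review.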
|
||||
|
||||
#### Step 5: Milestone Review Loop
|
||||
|
||||
Send to reviewer for approval **before committing**. See Phase 5 for details. The review payload uses working-tree diffs (`git diff` for unstaged, `git diff --staged` for staged changes).
|
||||
|
||||
**Skip this step if reviewer was set to `skip`.** When skipped, present the milestone summary to the user and ask for approval directly.
|
||||
|
||||
#### Step 6: Commit & Approve
|
||||
|
||||
Only after the reviewer approves (or user overrides at max rounds):
|
||||
|
||||
```bash
|
||||
git add <changed-files>
|
||||
git commit -m "feat(<scope>): implement milestone M<N> - <description>"
|
||||
```
|
||||
|
||||
Do NOT push. After committing:
|
||||
|
||||
1. Backfill the commit hash into the Notes column for all stories in this milestone in `story-tracker.md`.
|
||||
2. Mark milestone as `approved` in `story-tracker.md`.
|
||||
3. Move to next milestone.
|
||||
|
||||
### Phase 5: Milestone Review Loop (Detail)
|
||||
|
||||
**Skip this phase entirely if reviewer was set to `skip`.**
|
||||
|
||||
#### Step 1: Generate Session ID
|
||||
|
||||
```bash
|
||||
REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
|
||||
```
|
||||
|
||||
Use `REVIEW_ID` for all milestone review temp file paths:
|
||||
|
||||
- `/tmp/milestone-${REVIEW_ID}.md`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.md`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.json`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.stderr`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.status`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.sh`
|
||||
|
||||
Resolve the shared runtime helper path before writing the command script:
|
||||
|
||||
```bash
|
||||
if [ -x .cursor/skills/reviewer-runtime/run-review.sh ]; then
|
||||
REVIEWER_RUNTIME=.cursor/skills/reviewer-runtime/run-review.sh
|
||||
else
|
||||
REVIEWER_RUNTIME=~/.cursor/skills/reviewer-runtime/run-review.sh
|
||||
fi
|
||||
```
|
||||
|
||||
Set helper success-artifact args before writing the command script:
|
||||
|
||||
```bash
|
||||
HELPER_SUCCESS_FILE_ARGS=()
|
||||
case "$REVIEWER_CLI" in
|
||||
codex)
|
||||
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/milestone-review-${REVIEW_ID}.md)
|
||||
;;
|
||||
cursor)
|
||||
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/milestone-review-${REVIEW_ID}.json)
|
||||
;;
|
||||
esac
|
||||
```
|
||||
|
||||
#### Step 2: Write Review Payload
|
||||
|
||||
Write to `/tmp/milestone-${REVIEW_ID}.md`:
|
||||
|
||||
```markdown
|
||||
# Milestone M<N> Review: <title>
|
||||
|
||||
## Milestone Spec (from plan)
|
||||
[Copy milestone section from milestone-plan.md]
|
||||
|
||||
## Acceptance Criteria
|
||||
[Copy acceptance criteria checkboxes]
|
||||
|
||||
## Changes Made (git diff)
|
||||
[Output of: git diff for unstaged changes, or git diff --staged for staged changes]
|
||||
|
||||
## Verification Output
|
||||
### Lint
|
||||
[lint output]
|
||||
### Typecheck
|
||||
[typecheck output]
|
||||
### Tests
|
||||
[test output with pass/fail counts]
|
||||
```
|
||||
|
||||
#### Review Contract (Applies to Every Round)
|
||||
|
||||
The reviewer response must use this structure:
|
||||
|
||||
```text
|
||||
## Summary
|
||||
...
|
||||
|
||||
## Findings
|
||||
### P0
|
||||
- ...
|
||||
### P1
|
||||
- ...
|
||||
### P2
|
||||
- ...
|
||||
### P3
|
||||
- ...
|
||||
|
||||
## Verdict
|
||||
VERDICT: APPROVED
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- Order findings from `P0` to `P3`.
|
||||
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
|
||||
- Use `- None.` when a severity has no findings.
|
||||
- `VERDICT: APPROVED` is allowed only when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking.
|
||||
- The calling agent should still try to fix `P3` findings when they are cheap and safe.
|
||||
|
||||
#### Liveness Contract (Applies While Review Is Running)
|
||||
|
||||
- The shared reviewer runtime emits `state=in-progress note="In progress N"` heartbeats every 60 seconds while the reviewer child is alive.
|
||||
- The calling agent must keep waiting as long as a fresh `In progress N` heartbeat keeps arriving roughly once per minute.
|
||||
- Do not abort just because the review is slow, a soft timeout fired, or a `stall-warning` line appears, as long as the `In progress N` heartbeat continues.
|
||||
- Treat missing heartbeats, `state=failed`, `state=completed-empty-output`, and `state=needs-operator-decision` as escalation signals.
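
The state handling above can be sketched as a small lookup. This assumes the status file contains lines carrying a `state=<value>` token like the heartbeat line quoted above; the real file format written by the shared runtime may differ, and both function names are illustrative:

```bash
# Print the most recent state= value recorded in a status file.
# Assumes each status line carries a state=<value> token.
last_review_state() {
  local status_file="$1"
  [ -f "$status_file" ] || return 1
  grep -o 'state=[a-z-]*' "$status_file" | tail -n 1 | cut -d= -f2
}

# Map a state to the action the liveness contract calls for.
liveness_action() {
  case "$1" in
    in-progress) echo wait ;;
    failed|completed-empty-output|needs-operator-decision) echo escalate ;;
    *) echo unknown ;;
  esac
}
```

This sketch does not detect missing heartbeats; that would additionally require comparing the time of the last heartbeat against the roughly one-minute interval.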
|
||||
|
||||
#### Step 3: Submit to Reviewer (Round 1)
|
||||
|
||||
Write the reviewer invocation to `/tmp/milestone-review-${REVIEW_ID}.sh` as a bash script:
|
||||
|
||||
```bash
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call every round (Pi reviewer calls do not use session resume):
|
||||
|
||||
```bash
|
||||
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files \
|
||||
--model "$REVIEWER_MODEL" \
|
||||
--tools read,grep,find,ls \
|
||||
-p "Read the file /tmp/milestone-${REVIEW_ID}.md and review. Return exactly the required ## Summary, ## Findings, and ## Verdict structure."
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `codex`:**
|
||||
|
||||
```bash
|
||||
codex exec \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
-s read-only \
|
||||
-o /tmp/milestone-review-${REVIEW_ID}.md \
|
||||
"Review this milestone implementation. The spec, acceptance criteria, git diff, and verification output are in /tmp/milestone-${REVIEW_ID}.md.
|
||||
|
||||
Evaluate:
|
||||
1. Correctness — Does the implementation match the milestone spec?
|
||||
2. Acceptance criteria — Are all criteria met?
|
||||
3. Code quality — Clean, maintainable, no obvious issues?
|
||||
4. Test coverage — Are changes adequately tested?
|
||||
5. Security — Any security concerns introduced?
|
||||
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use \`- None.\` when a severity has no findings.
|
||||
- \`P0\` = total blocker, \`P1\` = major risk, \`P2\` = must-fix before approval, \`P3\` = cosmetic / nice to have.
|
||||
- End with exactly one verdict line: \`VERDICT: APPROVED\` or \`VERDICT: REVISE\`
|
||||
- \`VERDICT: APPROVED\` is allowed only when there are no \`P0\`, \`P1\`, or \`P2\` findings. \`P3\` findings are non-blocking."
|
||||
```
|
||||
|
||||
Do not try to capture the Codex session ID yet. When using the helper, extract it from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the command completes (look for `session id: <uuid>`), then store it as `CODEX_SESSION_ID` for resume in subsequent rounds.
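
A minimal sketch of that extraction, assuming the runner output contains a line of the form `session id: <uuid>` as described above; the function name is hypothetical:

```bash
# Pull the first "session id: <uuid>" value out of captured runner output.
# Prints nothing if no such line is present.
extract_codex_session_id() {
  local runner_out="$1"
  sed -n 's/.*session id: \([0-9a-fA-F-]\{36\}\).*/\1/p' "$runner_out" | head -n 1
}

# Usage after the helper finishes (not executed here):
#   CODEX_SESSION_ID=$(extract_codex_session_id "/tmp/milestone-review-${REVIEW_ID}.runner.out")
```

An empty `CODEX_SESSION_ID` means resume is not possible, so later rounds would need the fresh `codex exec` fallback.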
|
||||
|
||||
**If `REVIEWER_CLI` is `claude`:**
|
||||
|
||||
```bash
|
||||
claude -p \
|
||||
"Review this milestone implementation using the following spec, acceptance criteria, git diff, and verification output:
|
||||
|
||||
$(cat /tmp/milestone-${REVIEW_ID}.md)
|
||||
|
||||
Evaluate:
|
||||
1. Correctness — Does the implementation match the milestone spec?
|
||||
2. Acceptance criteria — Are all criteria met?
|
||||
3. Code quality — Clean, maintainable, no obvious issues?
|
||||
4. Test coverage — Are changes adequately tested?
|
||||
5. Security — Any security concerns introduced?
|
||||
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use \`- None.\` when a severity has no findings.
|
||||
- \`P0\` = total blocker, \`P1\` = major risk, \`P2\` = must-fix before approval, \`P3\` = cosmetic / nice to have.
|
||||
- End with exactly one verdict line: \`VERDICT: APPROVED\` or \`VERDICT: REVISE\`
|
||||
- \`VERDICT: APPROVED\` is allowed only when there are no \`P0\`, \`P1\`, or \`P2\` findings. \`P3\` findings are non-blocking." \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--strict-mcp-config \
|
||||
--setting-sources user
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `cursor`:**
|
||||
|
||||
```bash
|
||||
cursor-agent -p \
|
||||
--mode=ask \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--trust \
|
||||
--output-format json \
|
||||
"Read the file /tmp/milestone-${REVIEW_ID}.md and review this milestone implementation.
|
||||
|
||||
Evaluate:
|
||||
1. Correctness — Does the implementation match the milestone spec?
|
||||
2. Acceptance criteria — Are all criteria met?
|
||||
3. Code quality — Clean, maintainable, no obvious issues?
|
||||
4. Test coverage — Are changes adequately tested?
|
||||
5. Security — Any security concerns introduced?
|
||||
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use \`- None.\` when a severity has no findings.
|
||||
- \`P0\` = total blocker, \`P1\` = major risk, \`P2\` = must-fix before approval, \`P3\` = cosmetic / nice to have.
|
||||
- End with exactly one verdict line: \`VERDICT: APPROVED\` or \`VERDICT: REVISE\`
|
||||
- \`VERDICT: APPROVED\` is allowed only when there are no \`P0\`, \`P1\`, or \`P2\` findings. \`P3\` findings are non-blocking." \
|
||||
> /tmp/milestone-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
For `cursor`, the command script writes raw JSON to `/tmp/milestone-review-${REVIEW_ID}.json`. Do not run `jq` extraction until after the helper or fallback execution completes.
|
||||
|
||||
Notes on Cursor flags:
|
||||
|
||||
- `--mode=ask` — read-only mode, no file modifications
|
||||
- `--trust` — trust workspace without prompting (required for non-interactive use)
|
||||
- `-p` / `--print` — non-interactive mode, output to stdout
|
||||
- `--output-format json` — structured output with `session_id` and `result` fields
|
||||
|
||||
Run the command script through the shared helper when available:
|
||||
|
||||
```bash
|
||||
if [ -x "$REVIEWER_RUNTIME" ]; then
|
||||
"$REVIEWER_RUNTIME" \
|
||||
--command-file /tmp/milestone-review-${REVIEW_ID}.sh \
|
||||
--stdout-file /tmp/milestone-review-${REVIEW_ID}.runner.out \
|
||||
--stderr-file /tmp/milestone-review-${REVIEW_ID}.stderr \
|
||||
--status-file /tmp/milestone-review-${REVIEW_ID}.status \
|
||||
"${HELPER_SUCCESS_FILE_ARGS[@]}"
|
||||
else
|
||||
echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
|
||||
bash /tmp/milestone-review-${REVIEW_ID}.sh >/tmp/milestone-review-${REVIEW_ID}.runner.out 2>/tmp/milestone-review-${REVIEW_ID}.stderr
|
||||
fi
|
||||
```
|
||||
|
||||
Run the helper in the foreground and watch its live stdout for `state=in-progress` heartbeats. If your agent environment buffers command output until exit, start the helper in the background and poll `/tmp/milestone-review-${REVIEW_ID}.status` separately instead of treating heartbeats as post-hoc-only data.
|
||||
|
||||
After the command completes:
|
||||
|
||||
- If `REVIEWER_CLI=cursor`, extract the final review text:
|
||||
|
||||
```bash
|
||||
CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/milestone-review-${REVIEW_ID}.json)
|
||||
jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- If `REVIEWER_CLI=codex`, extract `CODEX_SESSION_ID` from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text is only in `.runner.out`, move or copy the actual review body into `/tmp/milestone-review-${REVIEW_ID}.md` before verdict parsing.
|
||||
- If `REVIEWER_CLI=claude` or `REVIEWER_CLI=pi`, promote stdout captured by the helper or fallback runner into the markdown review file:
|
||||
|
||||
```bash
|
||||
cp /tmp/milestone-review-${REVIEW_ID}.runner.out /tmp/milestone-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
Fallback is allowed only when the helper is missing or not executable.
|
||||
|
||||
#### Step 4: Read Review & Check Verdict
|
||||
|
||||
1. Read `/tmp/milestone-review-${REVIEW_ID}.md`
|
||||
2. If the review failed, produced empty output, or reached helper timeout, also read:
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.stderr`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.status`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
|
||||
3. Present review to the user:
|
||||
|
||||
```markdown
|
||||
## Milestone Review — Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})
|
||||
|
||||
[Reviewer feedback]
|
||||
```
|
||||
|
||||
4. While the reviewer is still running, keep waiting as long as fresh `state=in-progress note="In progress N"` heartbeats continue to appear roughly once per minute.
|
||||
5. Check verdict:
|
||||
- **VERDICT: APPROVED** with no `P0`, `P1`, or `P2` findings -> proceed to Phase 4 Step 6 (commit & approve)
|
||||
- **VERDICT: APPROVED** with only `P3` findings -> optionally fix the `P3` items if they are cheap and safe, then proceed
|
||||
- **VERDICT: REVISE** or any `P0`, `P1`, or `P2` finding -> go to Step 5
|
||||
- No clear verdict but `P0`, `P1`, and `P2` are all `- None.` -> treat as approved
|
||||
- Helper state `completed-empty-output` -> treat as failed review attempt, surface stderr/status, fix invocation or prompt handling, then retry
|
||||
- Helper state `needs-operator-decision` -> surface status log and decide whether to extend the timeout, abort, or retry with different helper parameters
|
||||
- Max rounds (`MAX_ROUNDS`) reached -> present to user for manual decision (proceed or stop)
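
The verdict rules above can be sketched as a parser over the review markdown. This assumes the reviewer followed the Review Contract exactly (severity headings on their own lines, findings as `- ` bullets); `severity_status` and `review_decision` are illustrative names, and helper states such as `completed-empty-output` are not covered here:

```bash
# Print "findings", "none", or "missing" for one severity section (P0..P3).
severity_status() {
  local review="$1" sev="$2"
  awk -v hdr="### ${sev}" '
    $0 == hdr       { in_sec = 1; next }
    in_sec && /^#/  { in_sec = 0 }
    in_sec && /^- / { seen = 1; if ($0 != "- None.") found = 1 }
    END {
      if (found) print "findings"
      else if (seen) print "none"
      else print "missing"
    }
  ' "$review"
}

# Print approved, revise, or unclear following the verdict rules above.
review_decision() {
  local review="$1" sev sev_state explicit_none=1
  for sev in P0 P1 P2; do
    sev_state=$(severity_status "$review" "$sev")
    if [ "$sev_state" = findings ]; then
      echo revise   # any P0/P1/P2 finding blocks approval
      return 0
    fi
    [ "$sev_state" = none ] || explicit_none=0
  done
  if grep -q '^VERDICT: REVISE' "$review"; then
    echo revise
  elif grep -q '^VERDICT: APPROVED' "$review" || [ "$explicit_none" -eq 1 ]; then
    echo approved
  else
    echo unclear
  fi
}
```

`P3` findings never change the decision, which matches their non-blocking status; an `unclear` result is a cue to read the raw review rather than guess.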
|
||||
|
||||
#### Step 5: Address Feedback & Re-verify
|
||||
|
||||
1. Address the reviewer findings in priority order (`P0` -> `P1` -> `P2`, then `P3` when practical) (do NOT commit yet).
|
||||
2. Re-run verification (lint/typecheck/tests) — all must pass.
|
||||
3. Update `/tmp/milestone-${REVIEW_ID}.md` with new diff and verification output.
|
||||
|
||||
Summarize revisions for the user:
|
||||
|
||||
```markdown
|
||||
### Revisions (Round N)
|
||||
- [Change and reason, one bullet per issue addressed]
|
||||
```
|
||||
|
||||
If a revision contradicts the user's explicit requirements, skip it and note it for the user.
|
||||
|
||||
#### Step 6: Re-submit to Reviewer (Rounds 2-N)
|
||||
|
||||
Rewrite `/tmp/milestone-review-${REVIEW_ID}.sh` for the next round. The script should contain the reviewer invocation only; do not run it directly.
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call with prior-round context (Pi reviewer calls do not use session resume):
|
||||
|
||||
```bash
|
||||
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files \
|
||||
--model "$REVIEWER_MODEL" \
|
||||
--tools read,grep,find,ls \
|
||||
-p "You previously reviewed this milestone and requested revisions. Read the updated payload at /tmp/milestone-${REVIEW_ID}.md and re-review using the same ## Summary, ## Findings, and ## Verdict structure."
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `codex`:**
|
||||
|
||||
Resume the existing session:
|
||||
|
||||
```bash
|
||||
codex exec resume ${CODEX_SESSION_ID} \
|
||||
-o /tmp/milestone-review-${REVIEW_ID}.md \
|
||||
"I've addressed your feedback. Updated diff and verification output are in /tmp/milestone-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
|
||||
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking."
|
||||
```
|
||||
|
||||
If resume fails (session expired), fall back to fresh `codex exec` with context about prior rounds.
|
||||
|
||||
**If `REVIEWER_CLI` is `claude`:**
|
||||
|
||||
Fresh call with accumulated context (Claude reviewer calls in this workflow do not use session resume):
|
||||
|
||||
```bash
|
||||
claude -p \
|
||||
"You previously reviewed milestone M<N> and requested revisions.
|
||||
|
||||
Previous feedback summary: [key points from last review]
|
||||
|
||||
I've addressed your feedback. Updated diff and verification output are below.
|
||||
|
||||
$(cat /tmp/milestone-${REVIEW_ID}.md)
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
|
||||
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking." \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--strict-mcp-config \
|
||||
--setting-sources user
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `cursor`:**
|
||||
|
||||
Resume the existing session:
|
||||
|
||||
```bash
|
||||
cursor-agent --resume ${CURSOR_SESSION_ID} -p \
|
||||
--mode=ask \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--trust \
|
||||
--output-format json \
|
||||
"I've addressed your feedback. Updated diff and verification output are in /tmp/milestone-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
|
||||
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking." \
|
||||
> /tmp/milestone-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
If resume fails, fall back to fresh `cursor-agent -p` with context about prior rounds.
|
||||
|
||||
Do not run `jq` extraction until after the helper or fallback execution completes, then extract `/tmp/milestone-review-${REVIEW_ID}.md` from the JSON response.
|
||||
|
||||
After updating `/tmp/milestone-review-${REVIEW_ID}.sh`, run the same helper/fallback flow from Round 1.
|
||||
|
||||
Return to Step 4.
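
The Step 4 to Step 6 cycle can be sketched as a bounded loop. `run_round` is a hypothetical callback standing in for "run one reviewer round and report the decision"; only `MAX_ROUNDS` and its default of 10 come from this skill:

```bash
# Drive review rounds until approval or MAX_ROUNDS.
# run_round (hypothetical) takes the round number, runs one reviewer round,
# and prints "approved" or "revise".
review_loop() {
  local max_rounds="${MAX_ROUNDS:-10}" round=1 decision
  while [ "$round" -le "$max_rounds" ]; do
    decision=$(run_round "$round")
    if [ "$decision" = approved ]; then
      echo "approved after ${round} round(s)"
      return 0
    fi
    round=$((round + 1))
  done
  echo "max rounds (${max_rounds}) reached; ask the user to proceed or stop"
  return 2
}
```

The distinct return code for the max-rounds case keeps the manual decision point separate from an ordinary approval.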
|
||||
|
||||
#### Step 7: Cleanup Per Milestone
|
||||
|
||||
```bash
|
||||
rm -f \
|
||||
/tmp/milestone-${REVIEW_ID}.md \
|
||||
/tmp/milestone-review-${REVIEW_ID}.md \
|
||||
/tmp/milestone-review-${REVIEW_ID}.json \
|
||||
/tmp/milestone-review-${REVIEW_ID}.stderr \
|
||||
/tmp/milestone-review-${REVIEW_ID}.status \
|
||||
/tmp/milestone-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/milestone-review-${REVIEW_ID}.sh
|
||||
```
|
||||
|
||||
If the round failed, produced empty output, or reached operator-decision timeout, keep `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them immediately.
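
A sketch of cleanup that follows the retention rule above. The function name, the `outcome` argument, and the optional directory argument are assumptions for illustration; the file names match the list in Step 1:

```bash
# Remove per-milestone review temp files. When the round did not end cleanly,
# keep the diagnostics (.stderr, .status, .runner.out) for inspection.
cleanup_review_files() {
  local review_id="$1" outcome="${2:-ok}" dir="${3:-/tmp}"
  local base="${dir}/milestone-review-${review_id}"
  rm -f "${dir}/milestone-${review_id}.md" "${base}.md" "${base}.json" "${base}.sh"
  if [ "$outcome" = ok ]; then
    rm -f "${base}.stderr" "${base}.status" "${base}.runner.out"
  else
    echo "kept diagnostics: ${base}.stderr ${base}.status ${base}.runner.out" >&2
  fi
}
```

Calling `cleanup_review_files "$REVIEW_ID" failed` leaves the three diagnostic files in place; a later call with `ok` removes them once the issue is diagnosed.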
|
||||
|
||||
### Phase 6: Completion (REQUIRED SUB-SKILL)
|
||||
|
||||
After all milestones are approved and committed:
|
||||
|
||||
1. Invoke `superpowers:finishing-a-development-branch`.
|
||||
2. Run full test suite one final time — all must pass.
|
||||
3. Merge the worktree branch into the parent branch:
|
||||
|
||||
```bash
|
||||
# From the main repo (not the worktree)
|
||||
git merge implement/<plan-folder-name>
|
||||
```
|
||||
|
||||
4. Delete the worktree and its branch:
|
||||
|
||||
```bash
|
||||
git worktree remove <worktree-path>
|
||||
git branch -d implement/<plan-folder-name>
|
||||
```
|
||||
|
||||
5. Mark plan status as `completed` in `story-tracker.md`.
|
||||
|
||||
### Phase 7: Final Report
|
||||
|
||||
Present summary:
|
||||
|
||||
```markdown
|
||||
## Implementation Complete
|
||||
|
||||
**Plan:** <plan-folder-name>
|
||||
**Milestones:** <N> completed, <N> approved
|
||||
**Review rounds:** <total across all milestones>
|
||||
**Branch:** implement/<plan-folder-name> (merged and deleted)
|
||||
```
|
||||
|
||||
### Phase 8: Telegram Notification (MANDATORY)
|
||||
|
||||
Resolve the Telegram notifier helper from Cursor's installed skills directory:
|
||||
|
||||
```bash
|
||||
if [ -x .cursor/skills/reviewer-runtime/notify-telegram.sh ]; then
|
||||
TELEGRAM_NOTIFY_RUNTIME=.cursor/skills/reviewer-runtime/notify-telegram.sh
|
||||
else
|
||||
TELEGRAM_NOTIFY_RUNTIME=~/.cursor/skills/reviewer-runtime/notify-telegram.sh
|
||||
fi
|
||||
```
|
||||
|
||||
On every terminal outcome for the implement-plan run (fully completed, stopped after max rounds, skipped reviewer, or failure), send a Telegram summary if the helper exists and both `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` are configured:
|
||||
|
||||
```bash
|
||||
if [ -x "$TELEGRAM_NOTIFY_RUNTIME" ] && [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
|
||||
"$TELEGRAM_NOTIFY_RUNTIME" --message "implement-plan completed for <plan-folder-name>: <status summary>"
|
||||
fi
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- Telegram is the only supported notification path. Do not use desktop notifications, `say`, email, or any other notifier.
|
||||
- Notification failures are non-blocking, but they must be surfaced to the user.
|
||||
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
|
||||
- If Telegram is not configured, state that no Telegram notification was sent.
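
The last three rules can be sketched as a wrapper that always reports what happened and never fails the run. The wrapper name `notify_or_report` is hypothetical; the `--message` flag and the variable names are the ones used in the block above:

```bash
# Attempt the Telegram summary and always print what happened.
# A failed notification is reported to the user, not treated as fatal.
notify_or_report() {
  local message="$1" rc=0
  if [ ! -x "${TELEGRAM_NOTIFY_RUNTIME:-}" ] || [ -z "${TELEGRAM_BOT_TOKEN:-}" ] || [ -z "${TELEGRAM_CHAT_ID:-}" ]; then
    echo "No Telegram notification was sent (helper or credentials not configured)."
    return 0
  fi
  "$TELEGRAM_NOTIFY_RUNTIME" --message "$message" || rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "Telegram notification sent."
  else
    echo "Telegram notification failed (exit ${rc}); continuing."
  fi
}
```

The printed line is what should be relayed to the user, which covers both the "surface failures" and "state that no notification was sent" rules.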
|
||||
|
||||
## Quick Reference
|
||||
|
||||
| Phase | Action | Required Output |
|
||||
|---|---|---|
|
||||
| 1 | Locate plan in `ai_plan/` | Plan folder identified, files read |
|
||||
| 2 | Configure reviewer CLI and model | `REVIEWER_CLI`, `REVIEWER_MODEL`, `MAX_ROUNDS` |
|
||||
| 3 | Invoke `superpowers:using-git-worktrees` | Worktree created, baseline passing |
|
||||
| 4 | Execute milestones (loop) | Stories tracked, verified, reviewed, committed |
|
||||
| 5 | Milestone review loop (per milestone) | Reviewer approval or max rounds + user override |
|
||||
| 6 | Invoke `superpowers:finishing-a-development-branch` | Branch merged to parent, worktree deleted |
|
||||
| 7 | Final report | Summary presented |
|
||||
| 8 | Send Telegram notification | User notified or notification status reported |
|
||||
|
||||
## Tracker Discipline (MANDATORY)
|
||||
|
||||
Before starting any story:
|
||||
|
||||
1. Open `story-tracker.md`
|
||||
2. Mark story `in-dev`
|
||||
3. Add notes if relevant
|
||||
4. Then begin implementation
|
||||
|
||||
After completing any story:
|
||||
|
||||
1. Mark story `completed`
|
||||
2. Review pending stories
|
||||
3. Update Last Updated and Stories Complete counts
|
||||
|
||||
Note: Commit hashes are backfilled into story Notes after the milestone commit (Phase 4, Step 6), not per-story.
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
- Forgetting `--trust` flag when running `cursor-agent` non-interactively (causes interactive prompt).
|
||||
- Using `--mode=agent` or `--force` for reviews (reviewer should be read-only, use `--mode=ask`).
|
||||
- Proceeding to milestone review with failing tests.
|
||||
- Pushing commits before milestone approval.
|
||||
- Skipping worktree setup and working directly on the main branch.
|
||||
- Forgetting to update `story-tracker.md` between stories.
|
||||
- Creating a new worktree when one already exists for a resumed plan.
|
||||
- Using any notification path other than Telegram.
|
||||
|
||||
## Red Flags - Stop and Correct
|
||||
|
||||
- You started implementing without reading `continuation-runbook.md` first.
|
||||
- You are pushing commits without user approval.
|
||||
- You did not announce which skill you invoked and why.
|
||||
- You are proceeding to review with failing lint/typecheck/tests.
|
||||
- You are skipping the worktree and working on the main branch.
|
||||
- You are applying a reviewer suggestion that contradicts user requirements.
|
||||
- Reviewer CLI is running with write permissions (must be read-only).
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
- [ ] Plan folder located and all required files present
|
||||
- [ ] Reviewer configured or explicitly skipped
|
||||
- [ ] Max review rounds confirmed (default: 10)
|
||||
- [ ] Worktree created with branch `implement/<plan-folder-name>`
|
||||
- [ ] Worktree directory verified in .gitignore
|
||||
- [ ] Baseline tests pass in worktree
|
||||
- [ ] Each milestone: stories tracked (in-dev -> completed)
|
||||
- [ ] Each milestone: lint/typecheck/tests pass before review
|
||||
- [ ] Each milestone: reviewer approved (or max rounds + user override)
|
||||
- [ ] Each milestone: committed locally only after approval
|
||||
- [ ] Each milestone: marked approved in story-tracker.md
|
||||
- [ ] All milestones completed, approved, and committed
|
||||
- [ ] Final test suite passes
|
||||
- [ ] Worktree branch merged to parent and worktree deleted
|
||||
- [ ] Story tracker updated with final status
|
||||
- [ ] Telegram notification attempted if configured
|
||||
@@ -0,0 +1,797 @@
|
||||
---
|
||||
name: implement-plan
|
||||
description: Use when a plan folder (from create-plan) exists and needs to be executed in an isolated git worktree with iterative cross-model milestone review in OpenCode workflows. ALWAYS invoke when user says "implement the plan", "execute the plan", "start implementation", "resume the plan", or similar execution requests.
|
||||
---
|
||||
|
||||
# Implement Plan (OpenCode)
|
||||
|
||||
Execute an existing plan (created by `create-plan`) in an isolated git worktree, with iterative cross-model review at each milestone boundary.
|
||||
|
||||
## Prerequisite Check (MANDATORY)
|
||||
|
||||
This OpenCode variant depends on Superpowers execution skills being installed via OpenCode's native skill system.
|
||||
|
||||
Required:
|
||||
|
||||
- Plan folder exists under `ai_plan/` at project root
|
||||
- `continuation-runbook.md` exists in plan folder
|
||||
- `milestone-plan.md` exists in plan folder
|
||||
- `story-tracker.md` exists in plan folder
|
||||
- Git repo with worktree support: `git worktree list`
|
||||
- OpenCode Superpowers skills available at `~/.agents/skills/superpowers` or `~/.config/opencode/skills/superpowers`
|
||||
- Superpowers execution skills:
|
||||
- `superpowers/executing-plans`
|
||||
- `superpowers/using-git-worktrees`
|
||||
- `superpowers/verification-before-completion`
|
||||
- `superpowers/finishing-a-development-branch`
|
||||
|
||||
Verify before proceeding:
|
||||
|
||||
```bash
|
||||
test -f ~/.agents/skills/superpowers/executing-plans/SKILL.md || test -f ~/.config/opencode/skills/superpowers/executing-plans/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/using-git-worktrees/SKILL.md || test -f ~/.config/opencode/skills/superpowers/using-git-worktrees/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/verification-before-completion/SKILL.md || test -f ~/.config/opencode/skills/superpowers/verification-before-completion/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md || test -f ~/.config/opencode/skills/superpowers/finishing-a-development-branch/SKILL.md
|
||||
```
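The four checks above can also be folded into one loop that reports the first missing skill. This is a minimal sketch, not part of the skill itself: the function names and the `SKILL_ROOTS` variable are hypothetical, and only the two install locations come from the prerequisites list.

```shell
# Hypothetical helpers for illustration only.
# SKILL_ROOTS is an assumed variable; it defaults to the two install
# locations named in the prerequisites. Paths containing spaces are not
# supported because the list is split on whitespace.
SKILL_ROOTS="${SKILL_ROOTS:-$HOME/.agents/skills/superpowers $HOME/.config/opencode/skills/superpowers}"

# Print the SKILL.md path for one skill, or return 1 if no root has it.
find_superpowers_skill() {
  for root in $SKILL_ROOTS; do
    if [ -f "$root/$1/SKILL.md" ]; then
      printf '%s\n' "$root/$1/SKILL.md"
      return 0
    fi
  done
  return 1
}

# Return 1 and name the first execution skill that cannot be found.
check_superpowers_skills() {
  for skill in executing-plans using-git-worktrees \
               verification-before-completion finishing-a-development-branch; do
    if ! find_superpowers_skill "$skill" >/dev/null; then
      echo "Missing dependency: superpowers/$skill" >&2
      return 1
    fi
  done
}
```

Calling `check_superpowers_skills` before Phase 1 gives a single exit status to branch on.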
|
||||
|
||||
If dependencies are missing, stop immediately and return:
|
||||
|
||||
"Missing dependency: OpenCode Superpowers execution skills are required (`superpowers/executing-plans`, `superpowers/using-git-worktrees`, `superpowers/verification-before-completion`, `superpowers/finishing-a-development-branch`). Install from https://github.com/obra/superpowers (OpenCode setup), then retry."
|
||||
|
||||
If no plan folder exists:
|
||||
|
||||
"No plan found under `ai_plan/`. Run `create-plan` first."
|
||||
|
||||
## Process
|
||||
|
||||
### Phase 1: Bootstrap Superpowers Context (REQUIRED)
|
||||
|
||||
Use OpenCode's native skill tool:
|
||||
|
||||
- List the available skills.
- Verify that `superpowers/executing-plans`, `superpowers/using-git-worktrees`, `superpowers/verification-before-completion`, and `superpowers/finishing-a-development-branch` are discoverable.
|
||||
|
||||
### Phase 2: Locate Plan
|
||||
|
||||
1. Scan `ai_plan/` for plan directories (most recent first by date prefix).
|
||||
2. If multiple plans exist, ask user which one to implement.
|
||||
3. If no plan exists, stop: "No plan found. Run create-plan first."
|
||||
4. Read `continuation-runbook.md` first (source of truth).
|
||||
5. Read `story-tracker.md` to detect resume state (`in-dev` or `completed` stories).
|
||||
6. Read `milestone-plan.md` for implementation details.
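The scan and the required-file check can be sketched as two small shell functions. This is an illustration under stated assumptions: the function names are hypothetical, and plan folders are assumed to start with a sortable date prefix such as `2026-03-04-`.

```shell
# Hypothetical helpers for illustration only.

# List plan folder names, newest first. Reverse lexical order matches
# reverse date order when folders start with a YYYY-MM-DD prefix.
list_plans() {
  for dir in "${1:-ai_plan}"/*/; do
    [ -d "$dir" ] && basename "$dir"
  done | sort -r
}

# Return 1 and name the first required plan file that is missing.
plan_is_complete() {
  for f in continuation-runbook.md milestone-plan.md story-tracker.md; do
    [ -f "$1/$f" ] || { echo "Missing $1/$f" >&2; return 1; }
  done
}
```

If `list_plans` prints more than one name, the choice still goes to the user, as step 2 requires.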
|
||||
|
||||
### Phase 3: Configure Reviewer
|
||||
|
||||
If the user has already specified a reviewer CLI and model (e.g., "implement the plan, review with codex o4-mini"), use those values. Otherwise, ask:
|
||||
|
||||
1. **Which CLI should review each milestone?**
- `codex`: OpenAI Codex CLI (`codex exec`)
- `claude`: Claude Code CLI (`claude -p`)
- `cursor`: Cursor Agent CLI (`cursor-agent -p`)
- `opencode`: OpenCode CLI (`opencode run`)
- `pi`: Pi CLI (`pi -p`)
- `skip`: No external review, proceed with user approval only
|
||||
|
||||
2. **Which model?** (only if a CLI was chosen)
|
||||
- For `codex`: default `o4-mini`, alternatives: `gpt-5.3-codex`, `o3`
|
||||
- For `claude`: default `sonnet`, alternatives: `opus`, `haiku`
|
||||
- For `cursor`: **run `cursor-agent models` first** to see available models
|
||||
- Accept any model string the user provides
|
||||
|
||||
3. **Max review rounds per milestone?** (default: 10)
|
||||
- If the user does not provide a value, set `MAX_ROUNDS=10`.
|
||||
|
||||
Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS`. These values are fixed for the entire run.
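The stated defaults can be captured in a few lines of shell. This is a sketch only: the function name is hypothetical, and only the defaults listed above (`o4-mini`, `sonnet`, and 10 rounds) are taken from this document.

```shell
# Hypothetical helper: print the documented default model for a reviewer CLI.
# Returns 1 for CLIs with no stated default, where the user must be asked.
default_reviewer_model() {
  case "$1" in
    codex)  echo o4-mini ;;
    claude) echo sonnet ;;
    *)      return 1 ;;
  esac
}

# Keep a user-supplied value; otherwise use the documented default of 10.
MAX_ROUNDS=${MAX_ROUNDS:-10}
```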
|
||||
|
||||
Valid `REVIEWER_CLI` values are `codex`, `claude`, `cursor`, `opencode`, `pi`, and `skip`.
|
||||
|
||||
If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:
|
||||
|
||||
```bash
|
||||
pi --version
|
||||
```
|
||||
|
||||
For shorthand `pi/<pi-model-name>`, split only on the first slash when the prefix is exactly `pi`; store the complete remainder in `REVIEWER_MODEL`. Examples: `pi/claude-opus-4-7` -> `claude-opus-4-7`, `pi/anthropic/claude-opus-4-7` -> `anthropic/claude-opus-4-7`, and `pi/openrouter/anthropic/claude-opus-4-7` -> `openrouter/anthropic/claude-opus-4-7`.
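The first-slash split can be sketched with a POSIX parameter expansion. The function name `parse_pi_shorthand` is hypothetical; the behavior follows the three examples above.

```shell
# Hypothetical helper: split a reviewer spec such as "pi/anthropic/claude-opus-4-7".
# On success it sets REVIEWER_CLI and REVIEWER_MODEL. Only the exact
# prefix "pi" followed by a non-empty remainder is accepted.
parse_pi_shorthand() {
  case "$1" in
    pi/?*)
      REVIEWER_CLI=pi
      # Strip only the leading "pi/"; later slashes stay in the model name.
      REVIEWER_MODEL=${1#pi/}
      ;;
    *)
      # Not Pi shorthand; leave both variables untouched.
      return 1
      ;;
  esac
}
```

Because `${1#pi/}` removes only the leading `pi/`, provider-qualified names such as `openrouter/anthropic/claude-opus-4-7` are stored whole.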
|
||||
|
||||
When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the model running this workflow. If the model/provider is unavailable, surface helper stderr/status and use `pi --list-models [search]` to inspect configured models.
|
||||
|
||||
### Phase 4: Set Up Worktree (REQUIRED SUB-SKILL)
|
||||
|
||||
Use OpenCode's native skill tool to load:
|
||||
|
||||
- `superpowers/using-git-worktrees`
|
||||
|
||||
Then:
|
||||
|
||||
1. Branch naming: `implement/<plan-folder-name>` (e.g., `implement/2026-03-04-auth-system`).
|
||||
2. Follow the worktree skill's directory priority: `.worktrees/` > `worktrees/` > CLAUDE.md > ask user.
|
||||
3. Verify `.gitignore` covers worktree directory.
|
||||
4. Run project setup (auto-detect: `npm install`, `cargo build`, `pip install`, etc.).
|
||||
5. Verify clean baseline (run tests).
|
||||
|
||||
**Resume detection:** If `story-tracker.md` shows `in-dev` or `completed` stories, check whether the worktree branch already exists (`git worktree list`). If it does, `cd` into the existing worktree instead of creating a new one.
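One way to automate the lookup is to parse the machine-readable form of the same command, `git worktree list --porcelain`. This is a sketch with a hypothetical function name; it reads the porcelain output on stdin so it can be exercised without a repository.

```shell
# Hypothetical helper: print the path of the worktree that has the given
# branch checked out, or return 1 if none does.
# Expects `git worktree list --porcelain` output on stdin.
worktree_for_branch() {
  awk -v ref="refs/heads/$1" '
    # Remember the most recent worktree path (text after "worktree ").
    $1 == "worktree" { path = substr($0, 10) }
    # A matching branch line belongs to the path remembered above.
    $1 == "branch" && $2 == ref { print path; found = 1 }
    END { exit (found ? 0 : 1) }
  '
}
```

Typical use would be `git worktree list --porcelain | worktree_for_branch "implement/<plan-folder-name>"`, then `cd` into the printed path when the call succeeds.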
|
||||
|
||||
### Phase 5: Execute Milestones (Loop)
|
||||
|
||||
For each milestone (M1, M2, ...):
|
||||
|
||||
#### Step 1: Read Milestone Spec
|
||||
|
||||
Read the milestone section from `milestone-plan.md`.
|
||||
|
||||
#### Step 2: Update Tracker
|
||||
|
||||
Mark first story `in-dev` in `story-tracker.md`.
|
||||
|
||||
#### Step 3: Implement Stories
|
||||
|
||||
Execute each story in order. After completing each story:
|
||||
|
||||
1. Mark `in-dev` -> `completed` in `story-tracker.md`
|
||||
2. Update counts
|
||||
3. Mark next story `in-dev`
|
||||
|
||||
Commit hashes are not available yet — they are backfilled in Step 6 after the milestone is approved and committed.
|
||||
|
||||
#### Step 4: Verify Milestone (REQUIRED SUB-SKILL)
|
||||
|
||||
Use OpenCode's native skill tool to load:
|
||||
|
||||
- `superpowers/verification-before-completion`
|
||||
|
||||
```bash
|
||||
# Lint changed files
|
||||
# Typecheck
|
||||
# Run tests (targeted first, then full suite)
|
||||
```
|
||||
|
||||
All must pass before proceeding. If failures: fix, re-verify. Do NOT proceed to review with failures.
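The lint, typecheck, and test commands are project-specific, so the block above leaves them as comments. A minimal sketch of the stop-on-first-failure behavior, with a hypothetical wrapper name:

```shell
# Hypothetical wrapper: run each verification command in order and stop
# at the first failure. The commands are passed in by the caller because
# they differ per project.
run_verification() {
  for cmd in "$@"; do
    echo "==> $cmd"
    if ! sh -c "$cmd"; then
      echo "Verification failed: $cmd" >&2
      return 1
    fi
  done
}
```

For a Node project this might be called as `run_verification "npm run lint" "npm run typecheck" "npm test"`. Those three commands are placeholders and depend on the scripts the project actually defines.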
|
||||
|
||||
#### Step 5: Milestone Review Loop
|
||||
|
||||
Send to reviewer for approval **before committing**. See Phase 6 for details. The review payload uses working-tree diffs (`git diff` for unstaged, `git diff --staged` for staged changes).
|
||||
|
||||
**Skip this step if reviewer was set to `skip`.** When skipped, present the milestone summary to the user and ask for approval directly.
|
||||
|
||||
#### Step 6: Commit & Approve
|
||||
|
||||
Only after the reviewer approves (or user overrides at max rounds):
|
||||
|
||||
```bash
|
||||
git add <changed-files>
|
||||
git commit -m "feat(<scope>): implement milestone M<N> - <description>"
|
||||
```
|
||||
|
||||
Do NOT push. After committing:
|
||||
|
||||
1. Backfill the commit hash into the Notes column for all stories in this milestone in `story-tracker.md`.
|
||||
2. Mark milestone as `approved` in `story-tracker.md`.
|
||||
3. Move to next milestone.
|
||||
|
||||
### Phase 6: Milestone Review Loop (Detail)
|
||||
|
||||
**Skip this phase entirely if reviewer was set to `skip`.**
|
||||
|
||||
#### Step 1: Generate Session ID
|
||||
|
||||
```bash
|
||||
REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
|
||||
```
|
||||
|
||||
Use `REVIEW_ID` for all milestone review temp file paths:
|
||||
|
||||
- `/tmp/milestone-${REVIEW_ID}.md`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.md`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.json`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.stderr`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.status`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.sh`
|
||||
|
||||
Resolve the shared runtime helper path before writing the command script:
|
||||
|
||||
```bash
|
||||
REVIEWER_RUNTIME=~/.config/opencode/skills/reviewer-runtime/run-review.sh
|
||||
```
|
||||
|
||||
Set helper success-artifact args before writing the command script:
|
||||
|
||||
```bash
|
||||
HELPER_SUCCESS_FILE_ARGS=()
|
||||
case "$REVIEWER_CLI" in
|
||||
codex)
|
||||
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/milestone-review-${REVIEW_ID}.md)
|
||||
;;
|
||||
cursor)
|
||||
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/milestone-review-${REVIEW_ID}.json)
|
||||
;;
|
||||
opencode)
|
||||
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/milestone-review-${REVIEW_ID}.md)
|
||||
;;
|
||||
esac
|
||||
```
|
||||
|
||||
#### Step 2: Write Review Payload
|
||||
|
||||
Write to `/tmp/milestone-${REVIEW_ID}.md`:
|
||||
|
||||
```markdown
|
||||
# Milestone M<N> Review: <title>
|
||||
|
||||
## Milestone Spec (from plan)
|
||||
[Copy milestone section from milestone-plan.md]
|
||||
|
||||
## Acceptance Criteria
|
||||
[Copy acceptance criteria checkboxes]
|
||||
|
||||
## Changes Made (git diff)
|
||||
[Output of `git diff` for unstaged changes and `git diff --staged` for staged changes]
|
||||
|
||||
## Verification Output
|
||||
### Lint
|
||||
[lint output]
|
||||
### Typecheck
|
||||
[typecheck output]
|
||||
### Tests
|
||||
[test output with pass/fail counts]
|
||||
```
|
||||
|
||||
#### Review Contract (Applies to Every Round)
|
||||
|
||||
The reviewer response must use this structure:
|
||||
|
||||
```text
|
||||
## Summary
|
||||
...
|
||||
|
||||
## Findings
|
||||
### P0
|
||||
- ...
|
||||
### P1
|
||||
- ...
|
||||
### P2
|
||||
- ...
|
||||
### P3
|
||||
- ...
|
||||
|
||||
## Verdict
|
||||
VERDICT: APPROVED
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- Order findings from `P0` to `P3`.
|
||||
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
|
||||
- Use `- None.` when a severity has no findings.
|
||||
- `VERDICT: APPROVED` is allowed only when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking.
|
||||
- The calling agent should still try to fix `P3` findings when they are cheap and safe.
|
||||
|
||||
#### Liveness Contract (Applies While Review Is Running)
|
||||
|
||||
- The shared reviewer runtime emits `state=in-progress note="In progress N"` heartbeats every 60 seconds while the reviewer child is alive.
|
||||
- The calling agent must keep waiting as long as a fresh `In progress N` heartbeat keeps arriving roughly once per minute.
|
||||
- Do not abort just because the review is slow, a soft timeout fired, or a `stall-warning` line appears, as long as the `In progress N` heartbeat continues.
|
||||
- Treat missing heartbeats, `state=failed`, `state=completed-empty-output`, and `state=needs-operator-decision` as escalation signals.
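The escalation states can be checked mechanically if the status file is available. This is a sketch under an explicit assumption: each status line contains a `state=<value>` token, as in the heartbeat example above. The real helper's line format may differ, and both function names are hypothetical.

```shell
# Hypothetical helper: print the state from the last line that carries a
# state=<value> token. The line format is an assumption.
last_review_state() {
  sed -n 's/.*state=\([^ ]*\).*/\1/p' "$1" | tail -n 1
}

# Return 0 when the latest state is one of the escalation signals.
needs_escalation() {
  case "$(last_review_state "$1")" in
    failed|completed-empty-output|needs-operator-decision) return 0 ;;
    *) return 1 ;;
  esac
}
```

Detecting missing heartbeats is not covered here. That needs a freshness check, for example comparing the time of the last heartbeat against the 60-second interval.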
|
||||
|
||||
#### Step 3: Submit to Reviewer (Round 1)
|
||||
|
||||
Write the reviewer invocation to `/tmp/milestone-review-${REVIEW_ID}.sh` as a bash script:
|
||||
|
||||
```bash
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call every round (Pi reviewer calls do not use session resume):
|
||||
|
||||
```bash
|
||||
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files \
|
||||
--model "$REVIEWER_MODEL" \
|
||||
--tools read,grep,find,ls \
|
||||
-p "Read the file /tmp/milestone-${REVIEW_ID}.md and review. Return exactly the required ## Summary, ## Findings, and ## Verdict structure."
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `codex`:**
|
||||
|
||||
```bash
|
||||
codex exec \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
-s read-only \
|
||||
-o /tmp/milestone-review-${REVIEW_ID}.md \
|
||||
"Review this milestone implementation. The spec, acceptance criteria, git diff, and verification output are in /tmp/milestone-${REVIEW_ID}.md.
|
||||
|
||||
Evaluate:
|
||||
1. Correctness — Does the implementation match the milestone spec?
|
||||
2. Acceptance criteria — Are all criteria met?
|
||||
3. Code quality — Clean, maintainable, no obvious issues?
|
||||
4. Test coverage — Are changes adequately tested?
|
||||
5. Security — Any security concerns introduced?
|
||||
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use \`- None.\` when a severity has no findings.
- \`P0\` = total blocker, \`P1\` = major risk, \`P2\` = must-fix before approval, \`P3\` = cosmetic / nice to have.
- End with exactly one verdict line: \`VERDICT: APPROVED\` or \`VERDICT: REVISE\`
- \`VERDICT: APPROVED\` is allowed only when there are no \`P0\`, \`P1\`, or \`P2\` findings. \`P3\` findings are non-blocking."
|
||||
```
|
||||
|
||||
Do not try to capture the Codex session ID yet. When using the helper, extract it from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the command completes (look for `session id: <uuid>`), then store it as `CODEX_SESSION_ID` for resume in subsequent rounds.
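The extraction can be sketched with `sed`, assuming the runner output contains a line of the form `session id: <uuid>` as described above. The function name is hypothetical.

```shell
# Hypothetical helper: print the first 36-character UUID that follows
# "session id: " in the given file. Prints nothing if there is no match.
extract_codex_session_id() {
  sed -n 's/.*session id: \([0-9a-fA-F-]\{36\}\).*/\1/p' "$1" | head -n 1
}
```

A call such as `CODEX_SESSION_ID=$(extract_codex_session_id /tmp/milestone-review-${REVIEW_ID}.runner.out)` leaves the variable empty when no ID is found, which can be used to trigger the fresh-call fallback.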
|
||||
|
||||
**If `REVIEWER_CLI` is `claude`:**
|
||||
|
||||
```bash
|
||||
claude -p \
|
||||
"Review this milestone implementation using the following spec, acceptance criteria, git diff, and verification output:
|
||||
|
||||
$(cat /tmp/milestone-${REVIEW_ID}.md)
|
||||
|
||||
Evaluate:
|
||||
1. Correctness — Does the implementation match the milestone spec?
|
||||
2. Acceptance criteria — Are all criteria met?
|
||||
3. Code quality — Clean, maintainable, no obvious issues?
|
||||
4. Test coverage — Are changes adequately tested?
|
||||
5. Security — Any security concerns introduced?
|
||||
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use \`- None.\` when a severity has no findings.
- \`P0\` = total blocker, \`P1\` = major risk, \`P2\` = must-fix before approval, \`P3\` = cosmetic / nice to have.
- End with exactly one verdict line: \`VERDICT: APPROVED\` or \`VERDICT: REVISE\`
- \`VERDICT: APPROVED\` is allowed only when there are no \`P0\`, \`P1\`, or \`P2\` findings. \`P3\` findings are non-blocking." \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--strict-mcp-config \
|
||||
--setting-sources user
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `cursor`:**
|
||||
|
||||
```bash
|
||||
cursor-agent -p \
|
||||
--mode=ask \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--trust \
|
||||
--output-format json \
|
||||
"Read the file /tmp/milestone-${REVIEW_ID}.md and review this milestone implementation.
|
||||
|
||||
Evaluate:
|
||||
1. Correctness — Does the implementation match the milestone spec?
|
||||
2. Acceptance criteria — Are all criteria met?
|
||||
3. Code quality — Clean, maintainable, no obvious issues?
|
||||
4. Test coverage — Are changes adequately tested?
|
||||
5. Security — Any security concerns introduced?
|
||||
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use \`- None.\` when a severity has no findings.
- \`P0\` = total blocker, \`P1\` = major risk, \`P2\` = must-fix before approval, \`P3\` = cosmetic / nice to have.
- End with exactly one verdict line: \`VERDICT: APPROVED\` or \`VERDICT: REVISE\`
- \`VERDICT: APPROVED\` is allowed only when there are no \`P0\`, \`P1\`, or \`P2\` findings. \`P3\` findings are non-blocking." \
|
||||
> /tmp/milestone-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
For `cursor`, the command script writes raw JSON to `/tmp/milestone-review-${REVIEW_ID}.json`. Do not run `jq` extraction until after the helper or fallback execution completes. If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivalent.
|
||||
|
||||
**If `REVIEWER_CLI` is `opencode`:**
|
||||
|
||||
OpenCode uses `--agent plan` for read-oriented review. A fresh call each round is the recommended default.
|
||||
|
||||
Round 1:
|
||||
|
||||
```bash
|
||||
opencode run \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"Read the file /tmp/milestone-${REVIEW_ID}.md and review this milestone implementation.
|
||||
|
||||
Evaluate:
|
||||
1. Correctness — Does the implementation match the milestone spec?
|
||||
2. Acceptance criteria — Are all criteria met?
|
||||
3. Code quality — Clean, maintainable, no obvious issues?
|
||||
4. Test coverage — Are changes adequately tested?
|
||||
5. Security — Any security concerns introduced?
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use \`- None.\` when a severity has no findings.
|
||||
- \`P0\` = total blocker, \`P1\` = major risk, \`P2\` = must-fix before approval, \`P3\` = cosmetic / nice to have.
|
||||
- End with exactly one verdict line: \`VERDICT: APPROVED\` or \`VERDICT: REVISE\`
|
||||
- \`VERDICT: APPROVED\` is allowed only when there are no \`P0\`, \`P1\`, or \`P2\` findings. \`P3\` findings are non-blocking." \
|
||||
> /tmp/milestone-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Round 2 and later (fresh-call, recommended default):
|
||||
|
||||
```bash
|
||||
opencode run \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"You previously reviewed this milestone and requested revisions.
|
||||
|
||||
Previous feedback summary: [key points from last review]
|
||||
|
||||
I've revised. Updated milestone payload is in /tmp/milestone-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
|
||||
> /tmp/milestone-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Extract the review body:
|
||||
|
||||
```bash
|
||||
jq -r '.[] | select(.type == "message" and .role == "assistant") | .content' \
|
||||
/tmp/milestone-review-${REVIEW_ID}.json \
|
||||
> /tmp/milestone-review-${REVIEW_ID}.md \
|
||||
|| cp /tmp/milestone-review-${REVIEW_ID}.json /tmp/milestone-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
If `jq` cannot parse the JSON, the `cp` fallback promotes the raw JSON file as the review output. On any opencode CLI or JSON parsing failure, treat the round as `completed-empty-output` and follow the helper-failure escalation in Step 4.
|
||||
|
||||
Run the command script through the shared helper when available:
|
||||
|
||||
```bash
|
||||
if [ -x "$REVIEWER_RUNTIME" ]; then
|
||||
"$REVIEWER_RUNTIME" \
|
||||
--command-file /tmp/milestone-review-${REVIEW_ID}.sh \
|
||||
--stdout-file /tmp/milestone-review-${REVIEW_ID}.runner.out \
|
||||
--stderr-file /tmp/milestone-review-${REVIEW_ID}.stderr \
|
||||
--status-file /tmp/milestone-review-${REVIEW_ID}.status \
|
||||
"${HELPER_SUCCESS_FILE_ARGS[@]}"
|
||||
else
|
||||
echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
|
||||
bash /tmp/milestone-review-${REVIEW_ID}.sh >/tmp/milestone-review-${REVIEW_ID}.runner.out 2>/tmp/milestone-review-${REVIEW_ID}.stderr
|
||||
fi
|
||||
```
|
||||
|
||||
Run the helper in the foreground and watch its live stdout for `state=in-progress` heartbeats. If your agent environment buffers command output until exit, start the helper in the background and poll `/tmp/milestone-review-${REVIEW_ID}.status` separately instead of treating heartbeats as post-hoc-only data.
|
||||
|
||||
After the command completes:
|
||||
|
||||
- If `REVIEWER_CLI=cursor`, extract the final review text:
|
||||
|
||||
```bash
|
||||
CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/milestone-review-${REVIEW_ID}.json)
|
||||
jq -r '.result' /tmp/milestone-review-${REVIEW_ID}.json > /tmp/milestone-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- If `REVIEWER_CLI=codex`, extract `CODEX_SESSION_ID` from `/tmp/milestone-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text is only in `.runner.out`, move or copy the actual review body into `/tmp/milestone-review-${REVIEW_ID}.md` before verdict parsing.
|
||||
- If `REVIEWER_CLI=opencode`, the `jq` extraction above covers output capture. If it falls through, copy runner output: `cp /tmp/milestone-review-${REVIEW_ID}.runner.out /tmp/milestone-review-${REVIEW_ID}.md`. On Round 1, also attempt to capture the session id for optional use in subsequent rounds: `OPENCODE_SESSION_ID=$(jq -r 'if type == "array" then (.[0] | (.id? // .session_id?)) else (.id? // .session_id?) end // empty' /tmp/milestone-review-${REVIEW_ID}.json 2>/dev/null || true)`
|
||||
- If `REVIEWER_CLI=claude` or `REVIEWER_CLI=pi`, promote stdout captured by the helper or fallback runner into the markdown review file:
|
||||
|
||||
```bash
|
||||
cp /tmp/milestone-review-${REVIEW_ID}.runner.out /tmp/milestone-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
Fallback is allowed only when the helper is missing or not executable.
|
||||
|
||||
#### Step 4: Read Review & Check Verdict
|
||||
|
||||
1. Read `/tmp/milestone-review-${REVIEW_ID}.md`
|
||||
2. If the review failed, produced empty output, or reached helper timeout, also read:
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.stderr`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.status`
|
||||
- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
|
||||
3. Present review to the user:
|
||||
|
||||
```markdown
|
||||
## Milestone Review — Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})
|
||||
|
||||
[Reviewer feedback]
|
||||
```
|
||||
|
||||
4. While the reviewer is still running, keep waiting as long as fresh `state=in-progress note="In progress N"` heartbeats continue to appear roughly once per minute.
5. Check verdict:
|
||||
- **VERDICT: APPROVED** with no `P0`, `P1`, or `P2` findings -> proceed to Phase 5 Step 6 (commit & approve)
|
||||
- **VERDICT: APPROVED** with only `P3` findings -> optionally fix the `P3` items if they are cheap and safe, then proceed
|
||||
- **VERDICT: REVISE** or any `P0`, `P1`, or `P2` finding -> go to Step 5
|
||||
- No clear verdict but `P0`, `P1`, and `P2` are all `- None.` -> treat as approved
|
||||
- Helper state `completed-empty-output` -> treat as failed review attempt, surface stderr/status, fix invocation or prompt handling, then retry
|
||||
- Helper state `needs-operator-decision` -> surface status log and decide whether to extend the timeout, abort, or retry with different helper parameters
|
||||
- Max rounds (`MAX_ROUNDS`) reached -> present to user for manual decision (proceed or stop)
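The first three verdict rules can be approximated by a small parser over the review file. This is a sketch, not the shared runtime's logic: the function names are hypothetical, and it assumes the reviewer followed the contract exactly (plain `VERDICT:` line, `### P0` to `### P3` headings, `- None.` for empty severities).

```shell
# Hypothetical helpers for illustration only.

# Print approved, revise, or unclear based on the verdict line alone.
review_verdict() {
  if grep -q '^VERDICT: REVISE' "$1"; then
    echo revise
  elif grep -q '^VERDICT: APPROVED' "$1"; then
    echo approved
  else
    echo unclear
  fi
}

# Return 0 if any bullet other than "- None." appears under P0, P1, or P2.
has_blocking_findings() {
  awk '
    /^### P[0-2]/ { blocking = 1; next }
    /^##/         { blocking = 0 }
    blocking && /^- / && $0 != "- None." { found = 1 }
    END { exit (found ? 0 : 1) }
  ' "$1"
}

# Blocking findings or a REVISE verdict both mean another round.
review_decision() {
  if has_blocking_findings "$1" || [ "$(review_verdict "$1")" = revise ]; then
    echo revise
  elif [ "$(review_verdict "$1")" = approved ]; then
    echo approved
  else
    echo unclear
  fi
}
```

An `unclear` result still needs the manual check described above, namely whether `P0`, `P1`, and `P2` are all explicitly `- None.`.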
|
||||
|
||||
#### Step 5: Address Feedback & Re-verify
|
||||
|
||||
1. Address the reviewer findings in priority order (`P0` -> `P1` -> `P2`, then `P3` when practical). Do NOT commit yet.
|
||||
2. Re-run verification (lint/typecheck/tests) — all must pass.
|
||||
3. Update `/tmp/milestone-${REVIEW_ID}.md` with new diff and verification output.
|
||||
|
||||
Summarize revisions for the user:
|
||||
|
||||
```markdown
|
||||
### Revisions (Round N)
|
||||
- [Change and reason, one bullet per issue addressed]
|
||||
```
|
||||
|
||||
If a reviewer suggestion contradicts the user's explicit requirements, skip it and note it for the user.
|
||||
|
||||
#### Step 6: Re-submit to Reviewer (Rounds 2-N)
|
||||
|
||||
Rewrite `/tmp/milestone-review-${REVIEW_ID}.sh` for the next round. The script should contain the reviewer invocation only; do not run it directly.
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call with prior-round context (Pi reviewer calls do not use session resume):
|
||||
|
||||
```bash
|
||||
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files \
|
||||
--model "$REVIEWER_MODEL" \
|
||||
--tools read,grep,find,ls \
|
||||
-p "You previously reviewed this milestone and requested revisions. Read the updated payload at /tmp/milestone-${REVIEW_ID}.md and re-review using the same ## Summary, ## Findings, and ## Verdict structure."
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `codex`:**
|
||||
|
||||
Resume the existing session:
|
||||
|
||||
```bash
|
||||
codex exec resume ${CODEX_SESSION_ID} \
|
||||
-o /tmp/milestone-review-${REVIEW_ID}.md \
|
||||
"I've addressed your feedback. Updated diff and verification output are in /tmp/milestone-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking."
|
||||
```
|
||||
|
||||
If resume fails (session expired), fall back to fresh `codex exec` with context about prior rounds.
|
||||
|
||||
**If `REVIEWER_CLI` is `claude`:**
|
||||
|
||||
Fresh call with accumulated context (this workflow does not resume Claude CLI sessions):
|
||||
|
||||
```bash
|
||||
claude -p \
|
||||
"You previously reviewed milestone M<N> and requested revisions.
|
||||
|
||||
Previous feedback summary: [key points from last review]
|
||||
|
||||
I've addressed your feedback. Updated diff and verification output are below.
|
||||
|
||||
$(cat /tmp/milestone-${REVIEW_ID}.md)
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking." \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--strict-mcp-config \
|
||||
--setting-sources user
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `cursor`:**
|
||||
|
||||
Resume the existing session:
|
||||
|
||||
```bash
|
||||
cursor-agent --resume ${CURSOR_SESSION_ID} -p \
|
||||
--mode=ask \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--trust \
|
||||
--output-format json \
|
||||
"I've addressed your feedback. Updated diff and verification output are in /tmp/milestone-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking." \
|
||||
> /tmp/milestone-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
If resume fails, fall back to fresh `cursor-agent -p` with context about prior rounds.
|
||||
|
||||
**If `REVIEWER_CLI` is `opencode`:**
|
||||
|
||||
Fresh call (recommended default — opencode has no guaranteed stable session ID in headless mode):
|
||||
|
||||
```bash
|
||||
opencode run \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"You previously reviewed this milestone and requested revisions.
|
||||
|
||||
Previous feedback summary: [key points from last review]
|
||||
|
||||
I've addressed your feedback. Updated milestone payload is in /tmp/milestone-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
|
||||
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking." \
|
||||
> /tmp/milestone-review-${REVIEW_ID}.json
|
||||
|
||||
jq -r '.[] | select(.type == "message" and .role == "assistant") | .content' \
|
||||
/tmp/milestone-review-${REVIEW_ID}.json \
|
||||
> /tmp/milestone-review-${REVIEW_ID}.md \
|
||||
|| cp /tmp/milestone-review-${REVIEW_ID}.json /tmp/milestone-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
Optional session-resume path (only if `OPENCODE_SESSION_ID` was captured on Round 1 and your installed opencode accepts `-s <id>` reliably in headless mode):
|
||||
|
||||
```bash
|
||||
opencode run \
|
||||
-s ${OPENCODE_SESSION_ID} \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"I've addressed your feedback on this milestone. Updated diff and verification output are in /tmp/milestone-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same \`## Summary\`, \`## Findings\`, and \`## Verdict\` structure as before.
|
||||
Keep findings ordered \`P0\` to \`P3\`, use \`- None.\` when a severity has no findings, and only use \`VERDICT: APPROVED\` when no \`P0\`, \`P1\`, or \`P2\` findings remain. \`P3\` findings are non-blocking." \
|
||||
> /tmp/milestone-review-${REVIEW_ID}.json
|
||||
|
||||
jq -r '.[] | select(.type == "message" and .role == "assistant") | .content' \
|
||||
/tmp/milestone-review-${REVIEW_ID}.json \
|
||||
> /tmp/milestone-review-${REVIEW_ID}.md \
|
||||
|| cp /tmp/milestone-review-${REVIEW_ID}.json /tmp/milestone-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
If session resume fails (session expired or not supported), fall back to the fresh-call path above.
|
||||
|
||||
Do not run `jq` extraction until after the helper or fallback execution completes, then extract `/tmp/milestone-review-${REVIEW_ID}.md` from the JSON response.
|
||||
|
||||
After updating `/tmp/milestone-review-${REVIEW_ID}.sh`, run the same helper/fallback flow from Round 1.
|
||||
|
||||
Return to Step 4.
|
||||
|
||||
#### Step 7: Cleanup Per Milestone
|
||||
|
||||
```bash
|
||||
rm -f \
|
||||
/tmp/milestone-${REVIEW_ID}.md \
|
||||
/tmp/milestone-review-${REVIEW_ID}.md \
|
||||
/tmp/milestone-review-${REVIEW_ID}.json \
|
||||
/tmp/milestone-review-${REVIEW_ID}.stderr \
|
||||
/tmp/milestone-review-${REVIEW_ID}.status \
|
||||
/tmp/milestone-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/milestone-review-${REVIEW_ID}.sh
|
||||
```
|
||||
|
||||
If the round failed, produced empty output, or reached operator-decision timeout, keep `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them immediately.
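The keep-diagnostics rule can be folded into the cleanup itself. In this sketch the function name, its second argument, and the `REVIEW_TMPDIR` override are all hypothetical; only the file names come from the list in Step 1.

```shell
# Hypothetical helper: remove per-milestone temp files for one REVIEW_ID.
# Pass "yes" as the second argument to keep .stderr, .status, and
# .runner.out for diagnosis. REVIEW_TMPDIR is an assumed override that
# defaults to /tmp.
cleanup_review_files() {
  id=$1
  keep_diagnostics=${2:-no}
  dir=${REVIEW_TMPDIR:-/tmp}
  rm -f "$dir/milestone-$id.md" \
        "$dir/milestone-review-$id.md" \
        "$dir/milestone-review-$id.json" \
        "$dir/milestone-review-$id.sh"
  if [ "$keep_diagnostics" != yes ]; then
    rm -f "$dir/milestone-review-$id.stderr" \
          "$dir/milestone-review-$id.status" \
          "$dir/milestone-review-$id.runner.out"
  fi
}
```

After a failed round, `cleanup_review_files "$REVIEW_ID" yes` removes the payload and script but leaves the three diagnostic files in place.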
|
||||
|
||||
### Phase 7: Completion (REQUIRED SUB-SKILL)

After all milestones are approved and committed:

1. Use OpenCode's native skill tool to load: `superpowers/finishing-a-development-branch`
2. Run the full test suite one final time; every test must pass.
3. Merge the worktree branch into the parent branch:

   ```bash
   # From the main repo (not the worktree)
   git merge implement/<plan-folder-name>
   ```

4. Delete the worktree and its branch:

   ```bash
   git worktree remove <worktree-path>
   git branch -d implement/<plan-folder-name>
   ```

5. Mark plan status as `completed` in `story-tracker.md`.

### Phase 8: Final Report

Present summary:

```markdown
## Implementation Complete

**Plan:** <plan-folder-name>
**Milestones:** <N> completed, <N> approved
**Review rounds:** <total across all milestones>
**Branch:** implement/<plan-folder-name> (merged and deleted)
```

### Phase 9: Telegram Notification (MANDATORY)

Resolve the Telegram notifier helper from the installed OpenCode skills directory:

```bash
TELEGRAM_NOTIFY_RUNTIME=~/.config/opencode/skills/reviewer-runtime/notify-telegram.sh
```

On every terminal outcome for the implement-plan run (fully completed, stopped after max rounds, skipped reviewer, or failure), send a Telegram summary if the helper exists and both `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` are configured:

```bash
if [ -x "$TELEGRAM_NOTIFY_RUNTIME" ] && [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
  "$TELEGRAM_NOTIFY_RUNTIME" --message "implement-plan completed for <plan-folder-name>: <status summary>"
fi
```

Rules:

- Telegram is the only supported notification path. Do not use desktop notifications, `say`, email, or any other notifier.
- Notification failures are non-blocking, but they must be surfaced to the user.
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
- If Telegram is not configured, state that no Telegram notification was sent.

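The rules above (non-blocking failures that are still surfaced, and an explicit statement when Telegram is not configured) could be wrapped as follows. This is only a sketch: the function name `send_telegram_summary` and its output wording are assumptions, and only the `--message` flag comes from the snippet above.

```bash
# Hypothetical wrapper around the notifier helper resolved above.
# Returns 0 in every case so that notification problems never block the run.
send_telegram_summary() {
  message=$1
  if [ ! -x "${TELEGRAM_NOTIFY_RUNTIME:-}" ] \
    || [ -z "${TELEGRAM_BOT_TOKEN:-}" ] \
    || [ -z "${TELEGRAM_CHAT_ID:-}" ]; then
    echo "Telegram is not configured; no Telegram notification was sent."
    return 0
  fi
  if "$TELEGRAM_NOTIFY_RUNTIME" --message "$message"; then
    echo "Telegram notification sent."
  else
    # Non-blocking, but surfaced to the user on stderr.
    echo "Telegram notification failed (exit $?); continuing." >&2
  fi
  return 0
}
```

Because the wrapper always returns 0, it can be called before any stop for user interaction without changing the control flow of the run.
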
## Tracker Discipline (MANDATORY)

Before starting any story:

1. Open `story-tracker.md`
2. Mark story `in-dev`
3. Add notes if relevant
4. Then begin implementation

After completing any story:

1. Mark story `completed`
2. Review pending stories
3. Update Last Updated and Stories Complete counts

Note: Commit hashes are backfilled into story Notes after the milestone commit (Step 6), not per-story.

## Verification Checklist

- [ ] Plan folder located and all required files present
- [ ] Reviewer configured or explicitly skipped
- [ ] Max review rounds confirmed (default: 10)
- [ ] Worktree created with branch `implement/<plan-folder-name>`
- [ ] Worktree directory verified in .gitignore
- [ ] Baseline tests pass in worktree
- [ ] Each milestone: stories tracked (in-dev -> completed)
- [ ] Each milestone: lint/typecheck/tests pass before review
- [ ] Each milestone: reviewer approved (or max rounds + user override)
- [ ] Each milestone: committed locally only after approval
- [ ] Each milestone: marked approved in story-tracker.md
- [ ] All milestones completed, approved, and committed
- [ ] Final test suite passes
- [ ] Worktree branch merged to parent and worktree deleted
- [ ] Story tracker updated with final status
- [ ] Telegram notification attempted if configured

@@ -0,0 +1,241 @@
---
name: implement-plan
description: Use when a plan folder created by create-plan must be executed in pi with milestone verification, reviewer gates, local commits, and resumable tracker updates.
---

# Implement Plan (Pi)

Execute an existing plan under `ai_plan/` milestone by milestone, using verification gates, reviewer approval, and local commits after each approved milestone.

## Shared Setup

Before using this skill, read:

- [docs/PI-SUPERPOWERS.md](../../../docs/PI-SUPERPOWERS.md)
- [docs/PI-COMMON-REVIEWER.md](../../../docs/PI-COMMON-REVIEWER.md)

This workflow depends on:

- Superpowers execution skills being visible to pi
- the pi reviewer-runtime helper being installed in a supported location

## Prerequisite Check (MANDATORY)

Required:

- `pi --version`
- a plan folder under `ai_plan/`
- `continuation-runbook.md`
- `milestone-plan.md`
- `story-tracker.md`
- git worktree support
- Superpowers `executing-plans`
- Superpowers `using-git-worktrees`
- Superpowers `verification-before-completion`
- Superpowers `finishing-a-development-branch`
- pi reviewer runtime helper
- pi Telegram notifier helper

Quick checks for common installs:

```bash
pi --version
git worktree list
test -f ~/.agents/skills/superpowers/executing-plans/SKILL.md || test -f ~/.pi/agent/skills/superpowers/executing-plans/SKILL.md
test -f ~/.agents/skills/superpowers/using-git-worktrees/SKILL.md || test -f ~/.pi/agent/skills/superpowers/using-git-worktrees/SKILL.md
test -f ~/.agents/skills/superpowers/verification-before-completion/SKILL.md || test -f ~/.pi/agent/skills/superpowers/verification-before-completion/SKILL.md
test -f ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md || test -f ~/.pi/agent/skills/superpowers/finishing-a-development-branch/SKILL.md
test -x .pi/skills/reviewer-runtime/pi/run-review.sh || test -x ~/.pi/agent/skills/reviewer-runtime/pi/run-review.sh
test -x .pi/skills/reviewer-runtime/pi/notify-telegram.sh || test -x ~/.pi/agent/skills/reviewer-runtime/pi/notify-telegram.sh
```

If you use a settings-defined skill path for Superpowers, confirm it matches [docs/PI-SUPERPOWERS.md](../../../docs/PI-SUPERPOWERS.md) before continuing.

If you install the reviewer helper in a nonstandard location, confirm it matches [docs/PI-COMMON-REVIEWER.md](../../../docs/PI-COMMON-REVIEWER.md) before continuing.

If any dependency is missing, stop and return:

`Missing dependency: pi implement-plan requires the execution skills and reviewer setup documented in docs/PI-SUPERPOWERS.md and docs/PI-COMMON-REVIEWER.md.`

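The file checks and the stop message above can be folded into one loop that reports every missing piece before stopping. This is a sketch under assumptions: the function name `check_pi_prerequisites` is hypothetical, it covers only the two common install roots listed above, and the `pi --version` and `git worktree list` checks are left as separate commands.

```bash
# Hypothetical consolidation of the quick checks: report every missing piece,
# then print the documented stop message once.
check_pi_prerequisites() {
  missing=0

  # Each Superpowers skill may live under either supported root.
  for skill in executing-plans using-git-worktrees \
    verification-before-completion finishing-a-development-branch; do
    if [ ! -f "$HOME/.agents/skills/superpowers/$skill/SKILL.md" ] \
      && [ ! -f "$HOME/.pi/agent/skills/superpowers/$skill/SKILL.md" ]; then
      echo "missing skill: $skill" >&2
      missing=1
    fi
  done

  # Helpers may be project-local (.pi/) or user-level (~/.pi/agent/).
  for helper in run-review.sh notify-telegram.sh; do
    if [ ! -x ".pi/skills/reviewer-runtime/pi/$helper" ] \
      && [ ! -x "$HOME/.pi/agent/skills/reviewer-runtime/pi/$helper" ]; then
      echo "missing helper: $helper" >&2
      missing=1
    fi
  done

  if [ "$missing" -ne 0 ]; then
    echo "Missing dependency: pi implement-plan requires the execution skills and reviewer setup documented in docs/PI-SUPERPOWERS.md and docs/PI-COMMON-REVIEWER.md."
    return 1
  fi
}
```

Listing each missing item on stderr before the single stop message makes it easier to see which of the two documents to consult.
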
## Required Workflow Rules

- Load the relevant workflow skill before entering its phase. If pi did not auto-load it, use `/skill:<name>`.
- Announce skill usage explicitly:
  - `I've read the [Skill Name] skill and I'm using it to [purpose].`
- Update `story-tracker.md` before starting and after completing every story.
- Do not use deprecated wrapper CLIs.

## Process

### Phase 1: Locate Plan

1. Scan `ai_plan/` and identify the target plan folder
2. Read `continuation-runbook.md` first
3. Read `story-tracker.md` to identify resume state
4. Read `milestone-plan.md` for the implementation spec

### Phase 2: Configure Reviewer

If the user already provided reviewer settings, use them. Otherwise ask:

Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`

1. Which CLI should review milestone implementations?
2. Reviewer model
3. Max rounds, default `10`

Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS`.

If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:

```bash
pi --version
```

For shorthand `pi/<pi-model-name>`, split only on the first slash when the prefix is exactly `pi`; store the complete remainder in `REVIEWER_MODEL`. Examples: `pi/claude-opus-4-7` -> `claude-opus-4-7`, `pi/anthropic/claude-opus-4-7` -> `anthropic/claude-opus-4-7`, and `pi/openrouter/anthropic/claude-opus-4-7` -> `openrouter/anthropic/claude-opus-4-7`.

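The first-slash split maps directly onto shell prefix removal. A minimal sketch follows; the function name `parse_pi_reviewer_spec` is hypothetical and it handles only the `pi/` shorthand, not the other reviewer CLIs.

```bash
# Hypothetical helper: split a reviewer spec such as "pi/anthropic/claude-opus-4-7".
# Sets REVIEWER_CLI and REVIEWER_MODEL; only the exact prefix "pi" is handled here.
parse_pi_reviewer_spec() {
  spec=$1
  case "$spec" in
    pi/?*)
      REVIEWER_CLI=pi
      # Strip only the leading "pi/"; later slashes stay in the model ID.
      REVIEWER_MODEL=${spec#pi/}
      ;;
    *)
      # Not the pi shorthand; leave both variables untouched.
      return 1
      ;;
  esac
}
```

`${spec#pi/}` removes the shortest matching prefix, so provider-qualified IDs such as `openrouter/anthropic/claude-opus-4-7` pass through unchanged.
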
When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the pi model running this workflow. Use any configured pi model string, including provider-qualified model IDs. If the reviewer model or provider is unavailable, surface the review helper stderr/status and ask for a configured model; use `pi --list-models [search]` to inspect configured models.

The pi reviewer command rendered into `/tmp/milestone-review-${REVIEW_ID}.sh` must be isolated and read-only:

```bash
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files --model "$REVIEWER_MODEL" --tools read,grep,find,ls -p "Read the file /tmp/milestone-${REVIEW_ID}.md and review."
```

The pi reviewer invocation must not load workflow skills and must not include `write`, `edit`, or `bash` tools.

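One way to guard the read-only requirement is to inspect the rendered command script before running it. The sketch below is illustrative only: the function name `assert_read_only_tools` is an assumption, and it presumes the script passes the tool list as a single `--tools` value, as in the command above.

```bash
# Hypothetical check: fail if the rendered reviewer script grants a mutating tool.
assert_read_only_tools() {
  script=$1
  # Take the comma-separated value that follows the --tools flag.
  tools=$(sed -n 's/.*--tools[ =]\([^ ]*\).*/\1/p' "$script" | head -n 1)
  if [ -z "$tools" ]; then
    echo "no --tools flag found in $script" >&2
    return 1
  fi
  for tool in $(printf '%s' "$tools" | tr ',' ' '); do
    case "$tool" in
      write|edit|bash)
        echo "forbidden reviewer tool: $tool" >&2
        return 1
        ;;
    esac
  done
}
```

A missing `--tools` flag is treated as a failure here, on the assumption that an unrestricted reviewer should never be launched by accident.
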
### Phase 3: Set Up Workspace

1. Load `using-git-worktrees`
2. Create or resume the implementation branch/worktree described by the plan
3. Verify baseline setup and tests before changing code

### Phase 4: Execute Milestones

For each milestone:

1. Mark the next story `in-dev` in `story-tracker.md`
2. Implement the story
3. Mark the story `completed`
4. Continue until the milestone stories are done
5. Load `verification-before-completion`
6. Run lint, typecheck, and tests for the changed scope
7. Send the milestone diff and verification output to the reviewer before committing
8. Commit only after approval

### Phase 5: Milestone Review Loop

Skip this phase if `REVIEWER_CLI=skip`.

#### Step 1: Generate Session ID

```bash
REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
```

Use these temp artifacts:

- `/tmp/milestone-${REVIEW_ID}.md`
- `/tmp/milestone-review-${REVIEW_ID}.md`
- `/tmp/milestone-review-${REVIEW_ID}.json`
- `/tmp/milestone-review-${REVIEW_ID}.stderr`
- `/tmp/milestone-review-${REVIEW_ID}.status`
- `/tmp/milestone-review-${REVIEW_ID}.runner.out`
- `/tmp/milestone-review-${REVIEW_ID}.sh`

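Since all seven paths derive from `REVIEW_ID`, they can be generated rather than retyped. A small sketch, assuming a hypothetical helper named `milestone_artifact_paths`; the optional second argument is not part of the skill and exists only so the helper can be exercised outside `/tmp`.

```bash
# Hypothetical helper: print every temp artifact path for one review session.
# The optional second argument (default /tmp) exists only to ease testing.
milestone_artifact_paths() {
  review_id=$1
  root=${2:-/tmp}
  # The review payload has its own prefix.
  printf '%s\n' "$root/milestone-$review_id.md"
  # The remaining artifacts share the milestone-review prefix.
  for ext in md json stderr status runner.out sh; do
    printf '%s\n' "$root/milestone-review-$review_id.$ext"
  done
}
```

For example, `milestone_artifact_paths "$REVIEW_ID" | xargs rm -f` would remove the whole set in one step.
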
Resolve the pi reviewer runtime helper in this order:

```bash
REVIEWER_RUNTIME=""
for candidate in ".pi/skills/reviewer-runtime/pi/run-review.sh" "$HOME/.pi/agent/skills/reviewer-runtime/pi/run-review.sh"; do
  if [ -x "$candidate" ]; then
    REVIEWER_RUNTIME="$candidate"
    break
  fi
done
```

#### Step 2: Build Review Payload

Write the milestone spec, acceptance criteria, diff, and verification output to `/tmp/milestone-${REVIEW_ID}.md`.

Reviewer responses must use this structure:

```text
## Summary
...

## Findings
### P0
- ...
### P1
- ...
### P2
- ...
### P3
- ...

## Verdict
VERDICT: APPROVED
```

Rules:

- Order findings from `P0` to `P3`
- Use `- None.` when a severity has no findings
- `VERDICT: APPROVED` is valid only when no `P0`, `P1`, or `P2` findings remain

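The approval rule can be checked mechanically before a milestone is committed. The following is a rough sketch, assuming the reviewer output follows the structure above exactly; the function name `review_is_approved` is hypothetical, and real reviewer output may need more tolerant matching.

```bash
# Hypothetical gate: succeed only when the review text may count as approved.
review_is_approved() {
  review_file=$1
  # The verdict line must be present verbatim on its own line.
  grep -qx 'VERDICT: APPROVED' "$review_file" || return 1
  # Any bullet under P0, P1, or P2 other than "- None." is a blocking finding.
  awk '
    /^### P[0-3]/ { blocking = ($2 ~ /^P[0-2]/); next }
    /^## /        { blocking = 0; next }
    blocking && /^- / && $0 != "- None." { found = 1 }
    END { exit (found ? 1 : 0) }
  ' "$review_file"
}
```

A review that says `VERDICT: APPROVED` while still listing a `P1` bullet is rejected by this gate, which matches the rule that approval is valid only when no `P0`, `P1`, or `P2` findings remain.
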
#### Step 3: Run Review

Execute the reviewer command script through the helper when available:

```bash
"$REVIEWER_RUNTIME" \
  --command-file /tmp/milestone-review-${REVIEW_ID}.sh \
  --stdout-file /tmp/milestone-review-${REVIEW_ID}.runner.out \
  --stderr-file /tmp/milestone-review-${REVIEW_ID}.stderr \
  --status-file /tmp/milestone-review-${REVIEW_ID}.status
```

Fall back to direct execution only if the helper is missing.

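The helper-or-fallback choice could look roughly like this. It is a sketch under assumptions: the wrapper name `run_milestone_review` and its `root` argument are hypothetical, and the fallback writes a bare exit code to the status file, which may differ from whatever the real helper records there.

```bash
# Hypothetical wrapper: prefer the helper, otherwise run the command script
# directly and record the same three artifacts. "root" defaults to /tmp and is
# an assumption added for testability.
run_milestone_review() {
  review_id=$1
  root=${2:-/tmp}
  base="$root/milestone-review-$review_id"

  if [ -n "${REVIEWER_RUNTIME:-}" ] && [ -x "$REVIEWER_RUNTIME" ]; then
    "$REVIEWER_RUNTIME" \
      --command-file "$base.sh" \
      --stdout-file "$base.runner.out" \
      --stderr-file "$base.stderr" \
      --status-file "$base.status"
  else
    # Direct execution: capture stdout, stderr, and the exit code ourselves.
    status=0
    bash "$base.sh" > "$base.runner.out" 2> "$base.stderr" || status=$?
    # Assumption: a bare exit code; the real helper may record richer state.
    echo "$status" > "$base.status"
    return "$status"
  fi
}
```

Keeping the same output file names on both paths means the later extraction and cleanup steps do not need to know which path ran.
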
#### Step 4: Handle Findings

- Keep waiting while fresh `state=in-progress note="In progress N"` heartbeats continue
- Fix `P0`, `P1`, and `P2` findings before approval
- Fix cheap `P3` findings when safe
- Re-run verification after each revision

### Phase 6: Commit And Track Approval

After milestone approval:

1. Commit the milestone locally
2. Backfill the commit hash into that milestone's story notes
3. Mark the milestone `approved` in `story-tracker.md`
4. Move to the next milestone

### Phase 7: Finalization

After all milestones are approved:

1. Load `finishing-a-development-branch`
2. Run the full verification suite
3. Ask whether to push or keep the work local
4. Mark the plan completed in `story-tracker.md`

### Phase 8: Telegram Completion Notification

Resolve the helper in this order:

```bash
TELEGRAM_NOTIFY_RUNTIME=""
for candidate in ".pi/skills/reviewer-runtime/pi/notify-telegram.sh" "$HOME/.pi/agent/skills/reviewer-runtime/pi/notify-telegram.sh"; do
  if [ -x "$candidate" ]; then
    TELEGRAM_NOTIFY_RUNTIME="$candidate"
    break
  fi
done
```

If the helper exists and both `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` are configured, send a short completion summary. Otherwise state that no Telegram completion notification was sent.