Stefano Fiorini be993429c1 feat(M2): Documentation flow, accuracy, consistency cleanup, and cross-platform shell portability

2026-05-03 20:14:44 -05:00

IMPLEMENT-PLAN

Purpose

Execute an existing plan (created by create-plan) in an isolated git worktree, with iterative cross-model review at each milestone boundary. Milestones are implemented one at a time: each must pass lint/typecheck/test gates, is reviewed by a second model/provider, and is committed locally once approved. The run is complete when every milestone has been approved.

Requirements

Plan folder under ai_plan/ (created by create-plan) with:
- continuation-runbook.md
- milestone-plan.md
- story-tracker.md
Git repo with worktree support
Superpowers execution skills installed from: https://github.com/obra/superpowers
Required dependencies:
- superpowers:executing-plans
- superpowers:using-git-worktrees
- superpowers:verification-before-completion
- superpowers:finishing-a-development-branch
For Codex, native skill discovery must be configured:
- ~/.agents/skills/superpowers -> ~/.codex/superpowers/skills
Cursor can use the Cursor Superpowers plugin cache or a manual install under .cursor/skills/superpowers/skills or ~/.cursor/skills/superpowers/skills.
OpenCode can use ~/.agents/skills/superpowers or ~/.config/opencode/skills/superpowers.
Shared reviewer runtime must be installed beside agent skills when using reviewer CLIs:
- Codex: ~/.codex/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}
- Claude Code: ~/.claude/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}
- OpenCode: ~/.config/opencode/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}
- Cursor: .cursor/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh} or ~/.cursor/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}
- Pi: .pi/skills/reviewer-runtime/pi/{run-review.sh,notify-telegram.sh} or ~/.pi/agent/skills/reviewer-runtime/pi/{run-review.sh,notify-telegram.sh}
Telegram notification setup is documented in TELEGRAM-NOTIFICATIONS.md
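
A minimal sketch of the plan-folder check from the requirements above. The function name `check_plan_folder` is hypothetical, not part of the skill; it reports the first missing file using this document's missing-dependency message format.

```shell
# Hypothetical helper: verify that a plan folder holds the three required files.
check_plan_folder() {
  plan_dir=$1
  for required in continuation-runbook.md milestone-plan.md story-tracker.md; do
    if [ ! -f "$plan_dir/$required" ]; then
      echo "Missing dependency: $plan_dir/$required. Ensure all prerequisites are met, then retry."
      return 1
    fi
  done
  return 0
}
```

A caller could run `check_plan_folder ai_plan/<plan-folder>` and stop on a non-zero exit status.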

If dependencies are missing, stop and return:

"Missing dependency: [specific missing item]. Ensure all prerequisites are met, then retry."

Reviewer CLI Requirements (Optional)

The canonical reviewer CLI support matrix is documented in REVIEWERS.md. To use the iterative milestone review feature, one of these CLIs must be installed:

| Reviewer CLI | Install | Verify |
| --- | --- | --- |
| `codex` | `npm install -g @openai/codex` | `codex --version` |
| `claude` | `npm install -g @anthropic-ai/claude-code` | `claude --version` |
| `cursor` | `curl https://cursor.com/install -fsS \| bash` | `cursor-agent --version` (binary: `cursor-agent`; alias `cursor agent` also works) |
| `opencode` | `brew install opencode` or your package manager | `opencode --version` |
| `pi` | Install Pi coding agent | `pi --version`; list models with `pi --list-models [search]` |

The reviewer CLI is independent of which agent is running the implementation — e.g., Claude Code can send milestones to Codex for review, and vice versa.

Additional dependency for cursor reviewer: jq is required to parse Cursor's JSON output. Install via brew install jq (macOS) or your package manager. Verify: jq --version.
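
As a rough sketch, a pre-flight check for the chosen reviewer CLI might look like the following. The function names are hypothetical; the only facts taken from this document are that the `cursor` reviewer runs as the `cursor-agent` binary and additionally needs `jq`.

```shell
# Map a reviewer CLI name to the binary that must be on PATH.
reviewer_binary() {
  case "$1" in
    cursor) echo "cursor-agent" ;;
    codex|claude|opencode|pi) echo "$1" ;;
    *) return 1 ;;
  esac
}

# Report whether the chosen reviewer CLI (plus jq, for cursor) is installed.
check_reviewer_cli() {
  bin=$(reviewer_binary "$1") || { echo "Unknown reviewer CLI: $1"; return 2; }
  command -v "$bin" >/dev/null 2>&1 || { echo "Missing dependency: $bin"; return 1; }
  if [ "$1" = "cursor" ]; then
    command -v jq >/dev/null 2>&1 || { echo "Missing dependency: jq"; return 1; }
  fi
  echo "ok: $bin"
}
```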

Install

Codex

mkdir -p ~/.codex/skills/implement-plan
cp -R skills/implement-plan/codex/* ~/.codex/skills/implement-plan/
mkdir -p ~/.codex/skills/reviewer-runtime
cp skills/reviewer-runtime/run-review.sh skills/reviewer-runtime/notify-telegram.sh ~/.codex/skills/reviewer-runtime/
chmod +x ~/.codex/skills/reviewer-runtime/*.sh

Claude Code

mkdir -p ~/.claude/skills/implement-plan
cp -R skills/implement-plan/claude-code/* ~/.claude/skills/implement-plan/
mkdir -p ~/.claude/skills/reviewer-runtime
cp skills/reviewer-runtime/run-review.sh skills/reviewer-runtime/notify-telegram.sh ~/.claude/skills/reviewer-runtime/
chmod +x ~/.claude/skills/reviewer-runtime/*.sh

OpenCode

mkdir -p ~/.config/opencode/skills/implement-plan
cp -R skills/implement-plan/opencode/* ~/.config/opencode/skills/implement-plan/
mkdir -p ~/.config/opencode/skills/reviewer-runtime
cp skills/reviewer-runtime/run-review.sh skills/reviewer-runtime/notify-telegram.sh ~/.config/opencode/skills/reviewer-runtime/
chmod +x ~/.config/opencode/skills/reviewer-runtime/*.sh

Cursor

Copy into the repo-local .cursor/skills/ directory (where the Cursor Agent CLI discovers skills):

mkdir -p .cursor/skills/implement-plan
cp -R skills/implement-plan/cursor/* .cursor/skills/implement-plan/
mkdir -p .cursor/skills/reviewer-runtime
cp skills/reviewer-runtime/run-review.sh skills/reviewer-runtime/notify-telegram.sh .cursor/skills/reviewer-runtime/
chmod +x .cursor/skills/reviewer-runtime/*.sh

Or install globally (loaded via ~/.cursor/skills/):

mkdir -p ~/.cursor/skills/implement-plan
cp -R skills/implement-plan/cursor/* ~/.cursor/skills/implement-plan/
mkdir -p ~/.cursor/skills/reviewer-runtime
cp skills/reviewer-runtime/run-review.sh skills/reviewer-runtime/notify-telegram.sh ~/.cursor/skills/reviewer-runtime/
chmod +x ~/.cursor/skills/reviewer-runtime/*.sh

Pi

Recommended full Pi package install:

./scripts/install-pi-package.sh --global
# or, for project-local Pi package install
./scripts/install-pi-package.sh --local

Manual single-skill Pi install from the package mirror:

./scripts/sync-pi-package-skills.sh
mkdir -p .pi/skills/implement-plan
cp -R pi-package/skills/implement-plan/* .pi/skills/implement-plan/
mkdir -p .pi/skills/reviewer-runtime/pi
cp -R skills/reviewer-runtime/pi/* .pi/skills/reviewer-runtime/pi/
chmod +x .pi/skills/reviewer-runtime/pi/*.sh

Global manual installs use ~/.pi/agent/skills/implement-plan/ and ~/.pi/agent/skills/reviewer-runtime/pi/ instead of .pi/skills/....

Pi workflow skills also require Superpowers. See PI-SUPERPOWERS.md and PI-COMMON-REVIEWER.md.

Verify Installation

test -f ~/.codex/skills/implement-plan/SKILL.md || true
test -f ~/.claude/skills/implement-plan/SKILL.md || true
test -f ~/.config/opencode/skills/implement-plan/SKILL.md || true
test -f .cursor/skills/implement-plan/SKILL.md || test -f ~/.cursor/skills/implement-plan/SKILL.md || true
test -f .pi/skills/implement-plan/SKILL.md || test -f ~/.pi/agent/skills/implement-plan/SKILL.md || true
test -x ~/.codex/skills/reviewer-runtime/run-review.sh || true
test -x ~/.claude/skills/reviewer-runtime/run-review.sh || true
test -x ~/.config/opencode/skills/reviewer-runtime/run-review.sh || true
test -x .cursor/skills/reviewer-runtime/run-review.sh || test -x ~/.cursor/skills/reviewer-runtime/run-review.sh || true
test -x .pi/skills/reviewer-runtime/pi/run-review.sh || test -x ~/.pi/agent/skills/reviewer-runtime/pi/run-review.sh || true
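
Each check above ends in `|| true`, so it exits successfully and prints nothing whether or not the file exists. A small sketch that reports the result instead (the `report` function is a hypothetical helper, shown here for the Codex paths only):

```shell
# report <f|x> <path>: f = regular file exists, x = file is executable.
report() {
  if test "-$1" "$2"; then
    echo "ok: $2"
  else
    echo "missing: $2"
  fi
}

report f "$HOME/.codex/skills/implement-plan/SKILL.md"
report x "$HOME/.codex/skills/reviewer-runtime/run-review.sh"
```

The same two calls can be repeated with the Claude Code, OpenCode, Cursor, or Pi paths listed above.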

Verify Superpowers execution dependencies exist in your agent skills root:

Codex: ~/.agents/skills/superpowers/executing-plans/SKILL.md
Codex: ~/.agents/skills/superpowers/using-git-worktrees/SKILL.md
Codex: ~/.agents/skills/superpowers/verification-before-completion/SKILL.md
Codex: ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md
Claude Code: ~/.claude/skills/superpowers/executing-plans/SKILL.md
Claude Code: ~/.claude/skills/superpowers/using-git-worktrees/SKILL.md
Claude Code: ~/.claude/skills/superpowers/verification-before-completion/SKILL.md
Claude Code: ~/.claude/skills/superpowers/finishing-a-development-branch/SKILL.md
OpenCode: ~/.agents/skills/superpowers/executing-plans/SKILL.md or ~/.config/opencode/skills/superpowers/executing-plans/SKILL.md
OpenCode: ~/.agents/skills/superpowers/using-git-worktrees/SKILL.md or ~/.config/opencode/skills/superpowers/using-git-worktrees/SKILL.md
OpenCode: ~/.agents/skills/superpowers/verification-before-completion/SKILL.md or ~/.config/opencode/skills/superpowers/verification-before-completion/SKILL.md
OpenCode: ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md or ~/.config/opencode/skills/superpowers/finishing-a-development-branch/SKILL.md
Cursor: .cursor/skills/superpowers/skills/executing-plans/SKILL.md, ~/.cursor/skills/superpowers/skills/executing-plans/SKILL.md, or the Cursor Superpowers plugin cache
Cursor: .cursor/skills/superpowers/skills/using-git-worktrees/SKILL.md, ~/.cursor/skills/superpowers/skills/using-git-worktrees/SKILL.md, or the Cursor Superpowers plugin cache
Cursor: .cursor/skills/superpowers/skills/verification-before-completion/SKILL.md, ~/.cursor/skills/superpowers/skills/verification-before-completion/SKILL.md, or the Cursor Superpowers plugin cache
Cursor: .cursor/skills/superpowers/skills/finishing-a-development-branch/SKILL.md, ~/.cursor/skills/superpowers/skills/finishing-a-development-branch/SKILL.md, or the Cursor Superpowers plugin cache
Pi: .pi/skills/superpowers/executing-plans/SKILL.md or ~/.pi/agent/skills/superpowers/executing-plans/SKILL.md or ~/.agents/skills/superpowers/executing-plans/SKILL.md
Pi: .pi/skills/superpowers/using-git-worktrees/SKILL.md or ~/.pi/agent/skills/superpowers/using-git-worktrees/SKILL.md or ~/.agents/skills/superpowers/using-git-worktrees/SKILL.md
Pi: .pi/skills/superpowers/verification-before-completion/SKILL.md or ~/.pi/agent/skills/superpowers/verification-before-completion/SKILL.md or ~/.agents/skills/superpowers/verification-before-completion/SKILL.md
Pi: .pi/skills/superpowers/finishing-a-development-branch/SKILL.md or ~/.pi/agent/skills/superpowers/finishing-a-development-branch/SKILL.md or ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md
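
The per-agent paths above all follow the pattern `<skills root>/<skill>/SKILL.md`, so the check can be looped. A sketch, assuming a hypothetical helper `missing_superpowers_skills` that takes the Superpowers root for your agent:

```shell
# Print the name of each required Superpowers skill that is absent under a root.
missing_superpowers_skills() {
  root=$1
  for skill in executing-plans using-git-worktrees \
               verification-before-completion finishing-a-development-branch; do
    [ -f "$root/$skill/SKILL.md" ] || echo "$skill"
  done
}
```

For Codex the root would be `~/.agents/skills/superpowers`; for Cursor manual installs it includes the extra `skills` segment, for example `.cursor/skills/superpowers/skills`. Empty output means all four skills were found.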

Key Behavior

Reads existing plan from ai_plan/YYYY-MM-DD-<short-title>/.
Sets up an isolated git worktree with branch implement/<plan-folder-name>.
Executes milestones one-by-one, tracking stories in story-tracker.md.
Runs lint/typecheck/tests as a gate before each milestone review.
Sends each milestone to a reviewer CLI for approval (max rounds configurable, default 10).
Runs reviewer commands through reviewer-runtime/run-review.sh when available, with fallback to direct synchronous execution only if the helper is missing.
Waits as long as the reviewer runtime keeps emitting per-minute In progress N heartbeats.
Requires reviewer findings to be ordered P0 through P3, with P3 explicitly non-blocking.
Captures reviewer stderr and helper status logs for diagnostics and retains them on failed, empty-output, or operator-decision review rounds.
Commits each milestone locally only after reviewer approval (does not push).
After all milestones are approved, merges the worktree branch into its parent branch and deletes the worktree.
Supports resume: detects existing worktree and in-dev/completed stories.
Sends completion notifications through Telegram only when the shared setup in TELEGRAM-NOTIFICATIONS.md is installed and configured.
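
The worktree setup and resume behavior can be sketched roughly as follows. Only the branch naming scheme `implement/<plan-folder-name>` comes from this document; the function names and the choice of worktree directory are assumptions left to the caller.

```shell
# plan_branch <plan-folder>: derive the branch name implement/<plan-folder-name>.
plan_branch() {
  echo "implement/$(basename "$1")"
}

# setup_worktree <plan-folder> <worktree-dir>: create the worktree, or resume if it exists.
setup_worktree() {
  branch=$(plan_branch "$1")
  if [ -d "$2" ]; then
    echo "resume: $2"                 # existing worktree: pick up where work stopped
  else
    git worktree add -b "$branch" "$2" >/dev/null || return 1
    echo "created: $2 on $branch"
  fi
}
```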

Milestone Review Loop

After each milestone is implemented and verified, the skill sends it to a second model for review:

Configure — user picks a reviewer CLI (codex, claude, cursor, opencode, pi) and model, or skips
Prepare — milestone payload and a bash reviewer command script are written to temp files
Run — the command script is executed through reviewer-runtime/run-review.sh when installed
Feedback — reviewer evaluates correctness, acceptance criteria, code quality, test coverage, security, and returns ## Summary, ## Findings, and ## Verdict
Prioritize — findings are ordered P0, P1, P2, P3
Revise — the implementing agent addresses findings in priority order, re-verifies, and re-submits
Repeat — up to max rounds (default 10) until the reviewer returns VERDICT: APPROVED
Approve — milestone is marked approved in story-tracker.md
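
The loop above can be sketched as below. This is an illustration, not the skill's implementation: `run_review_rounds` is a hypothetical name, and the reviewer is passed in as any command that takes the round number and prints review text.

```shell
# run_review_rounds <max_rounds> <reviewer_cmd>
run_review_rounds() {
  max_rounds=${1:-10}
  reviewer_cmd=$2
  round=1
  while [ "$round" -le "$max_rounds" ]; do
    review=$("$reviewer_cmd" "$round")
    if printf '%s\n' "$review" | grep -q 'VERDICT: APPROVED'; then
      echo "approved in round $round"
      return 0
    fi
    # Here the implementing agent would address findings in priority order,
    # re-verify, and re-submit before the next round.
    round=$((round + 1))
  done
  echo "not approved after $max_rounds rounds"
  return 1
}
```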

Reviewer Output Contract

P0 = total blocker
P1 = major risk
P2 = must-fix before approval
P3 = cosmetic / nice to have
Each severity section should use - None. when empty
VERDICT: APPROVED is valid only when no P0, P1, or P2 findings remain
P3 findings are non-blocking, but the caller should still try to fix them when cheap and safe
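
One way to cross-check an APPROVED verdict against the findings is sketched here. The exact layout of the findings section is not specified in this document, so the sketch assumes each severity is introduced by a heading line such as `### P0`, followed by `- ...` bullets or `- None.` when empty. Both function names are hypothetical.

```shell
# Count bullets under P0, P1, and P2 headings that are not "- None.".
blocking_findings() {
  awk '
    /^#+ *P[0-3]/ { sev = $0; sub(/^#+ */, "", sev); sev = substr(sev, 1, 2); next }
    /^#/          { sev = ""; next }
    /^- / && $0 != "- None." && (sev == "P0" || sev == "P1" || sev == "P2") { n++ }
    END { print n + 0 }
  ' "$1"
}

# Succeeds unless the review claims APPROVED while blocking findings remain.
verdict_is_valid() {
  if grep -q 'VERDICT: APPROVED' "$1"; then
    [ "$(blocking_findings "$1")" -eq 0 ]
  fi
}
```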

Runtime Artifacts

The milestone review flow may create these temp artifacts:

/tmp/milestone-<id>.md — milestone review payload
/tmp/milestone-review-<id>.md — normalized review text
/tmp/milestone-review-<id>.json — raw Cursor JSON output
/tmp/milestone-review-<id>.stderr — reviewer stderr
/tmp/milestone-review-<id>.status — helper heartbeat/status log
/tmp/milestone-review-<id>.runner.out — helper-managed stdout from the reviewer command process
/tmp/milestone-review-<id>.sh — reviewer command script

Status log lines use this format:

ts=<ISO-8601> level=<info|warn|error> state=<running-silent|running-active|in-progress|stall-warning|completed|completed-empty-output|failed|needs-operator-decision> elapsed_s=<int> pid=<int> stdout_bytes=<int> stderr_bytes=<int> note="<short message>"

in-progress is the liveness heartbeat emitted roughly once per minute with note="In progress N". stall-warning is a non-terminal status-log state only. It does not mean the caller should stop waiting if in-progress heartbeats continue.
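
Because every field before `note` is a space-free `key=value` pair, a status line can be read with a few lines of awk. A sketch with hypothetical helper names:

```shell
# status_field <key> <status-line>: print the value of the first key=value match.
status_field() {
  printf '%s\n' "$2" | awk -v key="$1" '{
    for (i = 1; i <= NF; i++)
      if (index($i, key "=") == 1) { print substr($i, length(key) + 2); exit }
  }'
}

# last_state <status-file>: state recorded on the most recent status line.
last_state() {
  status_field state "$(tail -n 1 "$1")"
}
```

For example, `last_state /tmp/milestone-review-<id>.status` would print a value such as `in-progress`. The quoted `note` field contains spaces, so this field-splitting approach is not suitable for reading it.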

Failure Handling

completed-empty-output means the reviewer exited without producing review text; surface .stderr and .status, then retry only after diagnosing the cause.
needs-operator-decision means the helper reached hard-timeout escalation; surface .status and decide whether to extend the timeout, abort, or retry with different parameters.
Successful rounds clean up temp artifacts. Failed, empty-output, and operator-decision rounds should retain .stderr, .status, and .runner.out until diagnosed.
As long as fresh in-progress heartbeats continue to arrive roughly once per minute, the caller should keep waiting.
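
The handling rules above amount to a mapping from the helper's latest state to a caller action. A sketch, where the action labels on the right are illustrative names rather than anything the helper emits:

```shell
# next_action <state>: what the caller should do for a given status-log state.
next_action() {
  case "$1" in
    completed)               echo "read-review" ;;
    completed-empty-output)  echo "surface-stderr-and-status" ;;
    needs-operator-decision) echo "ask-operator" ;;
    failed)                  echo "retain-artifacts-and-diagnose" ;;
    running-silent|running-active|in-progress|stall-warning)
                             echo "keep-waiting" ;;
    *)                       echo "unknown-state" ;;
  esac
}
```

Note that `stall-warning` maps to waiting: it is non-terminal, and the caller keeps waiting as long as heartbeats continue.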

Supported Reviewer CLIs

| CLI | Command | Session Resume | Read-Only Mode |
| --- | --- | --- | --- |
| `codex` | `codex exec -m <model> -s read-only` | Yes (`codex exec resume <id>`) | `-s read-only` |
| `claude` | `claude -p --model <model> --strict-mcp-config --setting-sources user` | No (fresh call each round) | `--strict-mcp-config --setting-sources user` |
| `cursor` | `cursor-agent -p --mode=ask --model <model> --trust --output-format json` | Yes (`--resume <id>`) | `--mode=ask` |
| `opencode` | `opencode run -m <provider>/<model> --agent plan --format json` | Fresh call default; optional `-s <id>` | `--agent plan` |
| `pi` | See PI-COMMON-REVIEWER.md | No (fresh call each round) | `--tools read,grep,find,ls` |

For all supported reviewer CLIs, the preferred execution path is:

write the reviewer command to a bash script
run that script through reviewer-runtime/run-review.sh
fall back to direct synchronous execution only if the helper is missing or not executable
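
A sketch of that execution path, using the `--command-file`, `--stdout-file`, `--stderr-file`, and `--status-file` flags that this document lists for run-review.sh. The wrapper name `run_reviewer` and its argument order are assumptions.

```shell
# run_reviewer <helper> <command-script> <stdout-file> <stderr-file> <status-file>
run_reviewer() {
  helper=$1; script=$2; out_file=$3; err_file=$4; status_file=$5
  if [ -x "$helper" ]; then
    "$helper" --command-file "$script" --stdout-file "$out_file" \
              --stderr-file "$err_file" --status-file "$status_file"
  else
    # Fallback: direct synchronous execution when the helper is missing
    # or not executable. No status log is written on this path.
    bash "$script" >"$out_file" 2>"$err_file"
  fi
}
```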

Pi Reviewer Support

All workflow variants can use Pi itself as a reviewer CLI. Use pi/<pi-model-name> shorthand, for example pi/claude-opus-4-7; this means REVIEWER_CLI=pi and REVIEWER_MODEL=claude-opus-4-7. Provider-qualified or multi-slash Pi model IDs are preserved after the first pi/ prefix, for example pi/anthropic/claude-opus-4-7.
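
The shorthand rule can be expressed with shell parameter expansion. This is a sketch with a hypothetical function name; it strips only the first `pi/` prefix, so provider-qualified IDs keep their remaining slashes.

```shell
# parse_reviewer <spec>: set REVIEWER_CLI and REVIEWER_MODEL from "pi/<model>".
parse_reviewer() {
  case "$1" in
    pi/?*)
      REVIEWER_CLI=pi
      REVIEWER_MODEL=${1#pi/}   # shortest-prefix removal: later slashes are preserved
      ;;
    *)
      return 1                  # not a Pi shorthand
      ;;
  esac
}
```

With this sketch, `parse_reviewer pi/anthropic/claude-opus-4-7` leaves `REVIEWER_MODEL` set to `anthropic/claude-opus-4-7`.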

The canonical isolated read-only Pi reviewer flag contract lives in PI-COMMON-REVIEWER.md. This workflow passes the milestone review payload at /tmp/milestone-${REVIEW_ID}.md and expects the standard ## Summary, ## Findings, and ## Verdict response. Pi reviewer output is captured as markdown stdout, not JSON.

If the Pi reviewer model or provider is unavailable, surface the helper stderr/status and use pi --list-models [search] to inspect configured models.

Notifications

Telegram is the only supported notification path.
Shared setup: TELEGRAM-NOTIFICATIONS.md
Notification failures are non-blocking, but they must be surfaced to the user.
Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.

The reviewer-runtime helper (run-review.sh) also supports manual override flags for diagnostics:

run-review.sh \
  --command-file <path> \
  --stdout-file <path> \
  --stderr-file <path> \
  --status-file <path> \
  --poll-seconds 10 \
  --soft-timeout-seconds 600 \
  --stall-warning-seconds 300 \
  --hard-timeout-seconds 1800

Variant Hardening Notes

Claude Code Hardening

Must invoke explicit required sub-skills:
- superpowers:executing-plans
- superpowers:using-git-worktrees
- superpowers:verification-before-completion
- superpowers:finishing-a-development-branch

Codex Hardening

Must use native skill discovery from ~/.agents/skills/ (no CLI wrappers).
Must verify Superpowers skills symlink: ~/.agents/skills/superpowers -> ~/.codex/superpowers/skills
Must invoke required sub-skills with explicit announcements.
Must track checklist-driven skills with update_plan todos.
Deprecated CLI commands (superpowers-codex bootstrap, use-skill) must NOT be used.
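
The symlink requirement could be verified along these lines. The function name is hypothetical, and the comparison is literal: `readlink` prints the target exactly as it was stored, so a relative or differently spelled target would be reported as unexpected.

```shell
# check_superpowers_link <link> <expected-target>
check_superpowers_link() {
  [ -L "$1" ] || { echo "not a symlink: $1"; return 1; }
  target=$(readlink "$1")
  [ "$target" = "$2" ] || { echo "unexpected target: $target"; return 1; }
  echo "ok: $1 -> $target"
}
```

For Codex this would be called as `check_superpowers_link "$HOME/.agents/skills/superpowers" "$HOME/.codex/superpowers/skills"`.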

OpenCode Hardening

Must use OpenCode native skill tool (not Claude/Codex invocation syntax).
Must verify Superpowers skill discovery under:
- ~/.agents/skills/superpowers
- ~/.config/opencode/skills/superpowers
Must explicitly load all four execution sub-skills.

Cursor Hardening

Must use Cursor-native discovery from .cursor/skills/, ~/.cursor/skills/, or installed Cursor plugin cache entries.
Must announce skill usage explicitly before invocation.
Must use --mode=ask (read-only) and --trust when running reviewer non-interactively.
Must not use --force or --mode=agent for review (reviewer should never write files).

Execution Workflow Rules

Read continuation-runbook.md first (source of truth).
Set up worktree before any implementation work.
Complete one milestone at a time.
Update story-tracker.md before/after each story.
Lint/typecheck/test after each milestone (all must pass).
Send to reviewer for approval before committing.
Address review feedback, re-verify, re-submit (do not commit between rounds).
Commit locally (do not push) only after reviewer approves the milestone.
Move to next milestone only after approval and commit.
After all milestones: run full test suite, merge worktree branch to parent, delete worktree.
Max review rounds defaults to 10 if the user does not specify a value.
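
The "all must pass" gate in the rules above can be sketched as a function that runs each gate command in order and stops at the first failure. The function name is hypothetical, and the actual lint, typecheck, and test commands depend on the project.

```shell
# run_gates <cmd>...: run each gate command in order; stop at the first failure.
run_gates() {
  for gate in "$@"; do
    echo "gate: $gate"
    sh -c "$gate" || { echo "gate failed: $gate"; return 1; }
  done
  echo "all gates passed"
}
```

In a Node project this might be called as `run_gates "npm run lint" "npm run typecheck" "npm test"`, where those script names are assumptions about the project. The milestone goes to the reviewer only if the function returns zero.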
