Perform code optimization and document cleanup (#1)

## Summary
- add repository-wide quality tooling and verification scaffolding, including CI workflows, pnpm workspace setup, ESLint/Prettier/markdown checks, and generated-output verification helpers
- reorganize skill sources and generation flow by introducing canonical `_source` variants, generator/manifests, reusable helper abstractions, and shared web-automation/browser utilities
- clean up and expand documentation so the root README flows into docs and skill docs, with clearer development, reviewer, installer, and workflow guidance

## Notable changes
- docs flow and consistency cleanup across `README.md`, `docs/README.md`, and related docs
- new scripts for `check`, docs verification, generated-file verification, shell portability, and safe directory replacement
- refactors in Atlassian and web-automation skill runtimes to reduce duplication and centralize reusable code
- changelog, development documentation, and CI surface updates

## Test Plan
- [ ] `pnpm run check`
- [ ] review generated/manifests and skill sync outputs
- [ ] smoke-check docs flow from `README.md` to `docs/README.md` to skill docs

## Notes
- this branch currently includes tracked `skills/web-automation/shared/node_modules` content that should be reviewed carefully as potentially noisy/accidental committed artifacts

Co-authored-by: Stefano Fiorini <stefano.fiorini@firsthorizon.com>
Reviewed-on: #1
This commit was merged in pull request #1.
2026-05-04 04:41:34 +00:00
parent 2deab1c1b4
commit 251148c3ff
373 changed files with 28504 additions and 1281 deletions
+140 -61
@@ -2,13 +2,19 @@
## Purpose
Execute a single user-supplied prompt end-to-end with **two reviewer loops** (plan review + implementation review), with TDD-first execution, a pre-implementation verification gate, and a single task commit — all in one run of the skill. `do-task` is scoped to small-to-medium ad-hoc tasks; for multi-milestone work use `create-plan` + `implement-plan` instead.
Execute a single user-supplied prompt end-to-end with **two reviewer loops** (plan review +
implementation review), with TDD-first execution, a pre-implementation verification gate, and a
single task commit — all in one run of the skill. `do-task` is scoped to small-to-medium ad-hoc
tasks; for multi-milestone work use `create-plan` + `implement-plan` instead.
`do-task` persists one plan artifact per run: `ai_plan/YYYY-MM-DD-<slug>/task-plan.md`. The folder is kept as a record after success (not deleted). Resume is supported via the `Status` enum and Runtime State fields.
`do-task` persists one plan artifact per run: `ai_plan/YYYY-MM-DD-<slug>/task-plan.md`. The
folder is kept as a record after success (not deleted). Resume is supported via the `Status` enum
and Runtime State fields.
## Requirements
- Git repo with `/ai_plan/` entry in `.gitignore` (the skill adds the entry automatically if missing and commits it as a separate infra commit).
- Git repo with `/ai_plan/` entry in `.gitignore` (the skill adds the entry automatically if
missing and commits it as a separate infra commit).
- Superpowers skills installed from: https://github.com/obra/superpowers
- Required dependencies (vary by variant; see Install below):
- `superpowers:brainstorming` (or `superpowers/brainstorming` for OpenCode)
@@ -18,32 +24,50 @@ Execute a single user-supplied prompt end-to-end with **two reviewer loops** (pl
- `superpowers:using-git-worktrees` (only when the prompt opts in to a worktree)
- For Codex, native skill discovery must be configured:
- `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills`
- Cursor can use the Cursor Superpowers plugin cache or manual `.cursor/skills/superpowers/skills` / `~/.cursor/skills/superpowers/skills` installs, and `jq` is a hard prerequisite for the Cursor variant.
- Cursor can use the Cursor Superpowers plugin cache or manual `.cursor/skills/superpowers/skills`
/ `~/.cursor/skills/superpowers/skills` installs, and `jq` is a hard prerequisite for the
Cursor variant.
- OpenCode can use `~/.agents/skills/superpowers` or `~/.config/opencode/skills/superpowers`.
- Shared reviewer runtime (`run-review.sh`) AND Telegram notifier helper (`notify-telegram.sh`) must be installed beside agent skills. Both scripts ship under `skills/reviewer-runtime/` in this repo and must be copied into the per-variant location:
- Shared reviewer runtime (`run-review.sh`) AND Telegram notifier helper (`notify-telegram.sh`)
must be installed beside agent skills. Both scripts ship under `skills/reviewer-runtime/` in this
repo and must be copied into the per-variant location:
- Codex: `~/.codex/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}`
- Claude Code: `~/.claude/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}`
- OpenCode: `~/.config/opencode/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}`
- Cursor: `.cursor/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}` (repo-local, preferred) or `~/.cursor/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}` (global fallback)
- Pi: `.pi/skills/reviewer-runtime/pi/{run-review.sh,notify-telegram.sh}` (repo-local) or `~/.pi/agent/skills/reviewer-runtime/pi/{run-review.sh,notify-telegram.sh}` (global)
- Cursor: `.cursor/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}` (repo-local,
preferred) or `~/.cursor/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}`
(global fallback)
- Pi: `.pi/skills/reviewer-runtime/pi/{run-review.sh,notify-telegram.sh}` (repo-local) or
`~/.pi/agent/skills/reviewer-runtime/pi/{run-review.sh,notify-telegram.sh}` (global)
- Variant-specific prerequisites:
- **Claude Code:** `claude --version`, explicit `Skill`-tool invocation of sub-skills.
- **Codex:** `codex --version`; `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills` symlink present.
- **Cursor:** `cursor-agent --version`, `jq --version` (hard prereq), Superpowers available from the Cursor plugin cache or manual Cursor skill roots.
- **OpenCode:** `opencode --version`; Superpowers available from `~/.agents/skills/superpowers` or `~/.config/opencode/skills/superpowers`; Phase 1 runs Bootstrap Superpowers Context.
- **Cursor:** `cursor-agent --version`, `jq --version` (hard prereq), Superpowers available
from the Cursor plugin cache or manual Cursor skill roots.
- **OpenCode:** `opencode --version`; Superpowers available from `~/.agents/skills/superpowers`
or `~/.config/opencode/skills/superpowers`; Phase 1 runs Bootstrap Superpowers Context.
- Telegram notification setup is documented in [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md)
Dependency-missing messages are variant-specific:
- **Claude Code:** `Missing dependency: [specific missing item]. Install required Superpowers skills (https://github.com/obra/superpowers) and the reviewer-runtime helper, then retry.`
- **Codex:** `Missing dependency: [specific missing item]. Install required Superpowers skills (https://github.com/obra/superpowers) and the reviewer-runtime helper, then retry.`
- **Cursor:** `Missing dependency: [specific missing item]. Install Cursor Agent CLI, jq, and the Cursor Superpowers plugin or Superpowers skills under .cursor/skills/ or ~/.cursor/skills/, then retry.`
- **OpenCode:** `Missing dependency: [specific missing item]. Install required OpenCode Superpowers skills (https://github.com/obra/superpowers, OpenCode setup) and the reviewer-runtime helper, then retry.`
- **Pi:** `Missing dependency: [specific missing item]. Install Pi, required Superpowers skills, and the Pi reviewer-runtime helper, then retry.`
- **Claude Code:** `Missing dependency: [specific missing item]. Install required Superpowers
skills (https://github.com/obra/superpowers) and the reviewer-runtime helper, then retry.`
- **Codex:** `Missing dependency: [specific missing item]. Install required Superpowers skills
(https://github.com/obra/superpowers) and the reviewer-runtime helper, then retry.`
- **Cursor:** `Missing dependency: [specific missing item]. Install Cursor Agent CLI, jq, and the
Cursor Superpowers plugin or Superpowers skills under .cursor/skills/ or ~/.cursor/skills/,
then retry.`
- **OpenCode:** `Missing dependency: [specific missing item]. Install required OpenCode
Superpowers skills (https://github.com/obra/superpowers, OpenCode setup) and the
reviewer-runtime helper, then retry.`
- **Pi:** `Missing dependency: [specific missing item]. Install Pi, required Superpowers skills,
and the Pi reviewer-runtime helper, then retry.`
### Reviewer CLI Requirements
One of these CLIs must be installed to drive either of the two review loops:
The canonical reviewer CLI support matrix is documented in
[REVIEWERS.md](./REVIEWERS.md). One of these CLIs must be installed to drive either of the two
review loops:
| Reviewer CLI | Install | Verify | Read-Only Mode | Session Resume |
|---|---|---|---|---|
@@ -53,9 +77,12 @@ One of these CLIs must be installed to drive either of the two review loops:
| `opencode` | `brew install opencode` or your package manager | `opencode --version` | `--agent plan` | Opt-in (`-s <id>`; fresh call is the default) |
| `pi` | Install Pi coding agent | `pi --version`; list models with `pi --list-models [search]` | `--tools read,grep,find,ls` | No (fresh call each round) |
The reviewer CLI is independent of which agent is running the skill — e.g., Claude Code can send both the plan and the implementation to Codex for review.
The reviewer CLI is independent of which agent is running the skill — e.g., Claude Code can send
both the plan and the implementation to Codex for review.
**Additional dependency for `cursor` reviewer:** `jq` is required to parse Cursor's JSON output. Install via `brew install jq` (macOS) or your package manager. Verify: `jq --version`. The cursor variant of `do-task` makes `jq` a hard prerequisite regardless of which reviewer CLI is selected.
**Additional dependency for `cursor` reviewer:** `jq` is required to parse Cursor's JSON output.
Install via `brew install jq` (macOS) or your package manager. Verify: `jq --version`. The cursor
variant of `do-task` makes `jq` a hard prerequisite regardless of which reviewer CLI is selected.
## Install
@@ -124,7 +151,7 @@ Recommended full Pi package install:
Manual single-skill Pi install from the package mirror:
```bash
./scripts/sync-pi-package-skills.sh
pnpm run sync:pi
mkdir -p .pi/skills/do-task
cp -R pi-package/skills/do-task/* .pi/skills/do-task/
mkdir -p .pi/skills/reviewer-runtime/pi
@@ -138,9 +165,11 @@ Pi workflow skills also require Superpowers. See [PI-SUPERPOWERS.md](./PI-SUPERP
## Verify Installation
Run the per-variant checks for everything the corresponding `SKILL.md` enforces. Each check is structured: (1) CLI binary version, (2) skill file presence, (3) reviewer-runtime + notifier helper presence, (4) Superpowers sub-skill discovery, (5) variant-specific extras.
Run the per-variant checks for everything the corresponding `SKILL.md` enforces. Each check is
structured: (1) CLI binary version, (2) skill file presence, (3) reviewer-runtime + notifier
helper presence, (4) Superpowers sub-skill discovery, (5) variant-specific extras.
### Codex
### Codex Verify
```bash
codex --version
@@ -154,7 +183,7 @@ test -f ~/.agents/skills/superpowers/verification-before-completion/SKILL.md
test -f ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md
```
### Claude Code
### Claude Code Verify
```bash
claude --version
@@ -167,7 +196,7 @@ test -f ~/.claude/skills/superpowers/verification-before-completion/SKILL.md
test -f ~/.claude/skills/superpowers/finishing-a-development-branch/SKILL.md
```
### OpenCode
### OpenCode Verify
```bash
opencode --version
@@ -180,7 +209,7 @@ test -f ~/.agents/skills/superpowers/verification-before-completion/SKILL.md ||
test -f ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md || test -f ~/.config/opencode/skills/superpowers/finishing-a-development-branch/SKILL.md
```
### Cursor
### Cursor Verify
```bash
cursor-agent --version
@@ -194,7 +223,7 @@ test -f .cursor/skills/superpowers/skills/verification-before-completion/SKILL.m
test -f .cursor/skills/superpowers/skills/finishing-a-development-branch/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/finishing-a-development-branch/SKILL.md || find ~/.cursor/plugins/cache/cursor-public/superpowers -path '*/skills/finishing-a-development-branch/SKILL.md' -print -quit 2>/dev/null | grep -q .
```
### Pi
### Pi Verify
```bash
pi --version
@@ -210,26 +239,44 @@ test -f .pi/skills/superpowers/finishing-a-development-branch/SKILL.md || test -
## Key Behavior
- Creates one persistent plan artifact at `ai_plan/YYYY-MM-DD-<slug>/task-plan.md`.
- Ensures `/ai_plan/` is in `.gitignore`. If missing, adds it and creates a separate `chore(gitignore): ignore ai_plan local planning artifacts` commit.
- Parses the user prompt, detects the trigger phrase, and asks 1-3 clarifying questions unless the prompt already has a concrete target + outcome + unambiguous scope + resolvable identifiers.
- Invokes `superpowers:brainstorming` for any behavior-changing task (feature creation, non-trivial bug fix, refactor, design decision). The only skip conditions are `pure-documentation` and `pure-comment-whitespace-rename`.
- Asks which reviewer CLI, model, and max rounds to use (or accepts `skip` for no review). "Use defaults" maps to `codex / gpt-5.4 / MAX_ROUNDS=10`.
- Runs the plan review loop (Phase 5) before implementation, iterating up to `MAX_ROUNDS` (default 10) or until the reviewer returns `VERDICT: APPROVED`.
- Executes with TDD-first (Phase 6) via `superpowers:test-driven-development`. Auto-skip permitted only for `pure-documentation` and `pure-comment-whitespace-rename`; all other skips (including config-file additions) require explicit user approval, recorded in the TDD Approach section with an ISO-8601 timestamp.
- Ensures `/ai_plan/` is in `.gitignore`. If missing, adds it and creates a separate
`chore(gitignore): ignore ai_plan local planning artifacts` commit.
- Parses the user prompt, detects the trigger phrase, and asks 1-3 clarifying questions unless
the prompt already has a concrete target + outcome + unambiguous scope + resolvable identifiers.
- Invokes `superpowers:brainstorming` for any behavior-changing task (feature creation,
non-trivial bug fix, refactor, design decision). The only skip conditions are
`pure-documentation` and `pure-comment-whitespace-rename`.
- Asks which reviewer CLI, model, and max rounds to use (or accepts `skip` for no review).
"Use defaults" maps to `codex / gpt-5.4 / MAX_ROUNDS=10`.
- Runs the plan review loop (Phase 5) before implementation, iterating up to `MAX_ROUNDS`
(default 10) or until the reviewer returns `VERDICT: APPROVED`.
- Executes with TDD-first (Phase 6) via `superpowers:test-driven-development`. Auto-skip
permitted only for `pure-documentation` and `pure-comment-whitespace-rename`; all other skips
(including config-file additions) require explicit user approval, recorded in the TDD Approach
section with an ISO-8601 timestamp.
- Runs lint/typecheck/tests as a **verification gate** (Phase 7) before the implementation review loop.
- Runs the implementation review loop (Phase 8) against the diff + verification output, iterating up to `MAX_ROUNDS` or until `APPROVED`.
- Runs the implementation review loop (Phase 8) against the diff + verification output,
iterating up to `MAX_ROUNDS` or until `APPROVED`.
- Scans every outbound reviewer payload for secrets (subroutine step 1a). Per-payload, no caching.
- Creates a **single commit** after the implementation review approves. Does NOT push. Asks the user for explicit `yes` before any push.
- Defaults to the **current branch**. Worktree only on explicit opt-in (`"in a worktree"`, `"use a worktree"`, `"on an isolated branch"`, `"on a new branch called X"`).
- Creates a **single commit** after the implementation review approves. Does NOT push. Asks the
user for explicit `yes` before any push.
- Defaults to the **current branch**. Worktree only on explicit opt-in (`"in a worktree"`,
`"use a worktree"`, `"on an isolated branch"`, `"on a new branch called X"`).
- Supports resume: detects existing folder by slug and uses `Status` + Runtime State to decide how to re-enter.
- Sends completion notifications through Telegram only when the shared setup in [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md) is installed and configured.
- Sends completion notifications through Telegram only when the shared setup in
[TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md) is installed and configured.
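The `.gitignore` behavior in the list above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function name `ensure_ai_plan_ignored` and its `added` / `present` output are assumptions made for the example.

```bash
# Hypothetical helper (not part of the skill): make sure /ai_plan/ is ignored.
# Prints "present" if the entry already exists, "added" if it was appended.
ensure_ai_plan_ignored() {
  local gitignore="${1:-.gitignore}"

  # -x matches the whole line, -F treats the entry as a fixed string.
  if grep -qxF '/ai_plan/' "$gitignore" 2>/dev/null; then
    echo "present"
    return 0
  fi

  # If the file is non-empty and lacks a trailing newline, add one first
  # so the new entry does not get glued onto the previous line.
  if [ -s "$gitignore" ] && [ -n "$(tail -c 1 "$gitignore")" ]; then
    printf '\n' >> "$gitignore"
  fi

  printf '/ai_plan/\n' >> "$gitignore"
  echo "added"
}
```

A caller would then create the separate infra commit only when the helper reports `added`, for example with `git add .gitignore && git commit -m "chore(gitignore): ignore ai_plan local planning artifacts"`.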
## Dual Review Loops
`do-task` runs the reviewer twice per successful run, with separate session IDs so reviewer context never leaks across loops.
1. **Plan review loop (Phase 5)** — payload is the current `task-plan.md` with `Runtime State` and `Review History` stripped. The reviewer evaluates whether the plan matches the prompt, whether assumptions are surfaced, whether acceptance criteria are testable, whether the TDD approach is appropriate, and whether there are missing files/risks/security concerns.
2. **Implementation review loop (Phase 8)** — payload is the approved task plan (without Runtime State) + `git diff` (unstaged + staged) + verification output (lint, typecheck, tests). The reviewer evaluates correctness, code quality, test coverage, security, and regression risk.
1. **Plan review loop (Phase 5)** — payload is the current `task-plan.md` with `Runtime State`
and `Review History` stripped. The reviewer evaluates whether the plan matches the prompt,
whether assumptions are surfaced, whether acceptance criteria are testable, whether the TDD
approach is appropriate, and whether there are missing files/risks/security concerns.
2. **Implementation review loop (Phase 8)** — payload is the approved task plan (without Runtime
State) + `git diff` (unstaged + staged) + verification output (lint, typecheck, tests). The
reviewer evaluates correctness, code quality, test coverage, security, and regression risk.
Both loops share the same 9-step subroutine and the same `MAX_ROUNDS` counter (default 10).
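The round limit and verdict handling shared by both loops might look something like the sketch below. It is illustrative only: `parse_verdict`, `review_loop`, and the idea of passing the per-round command as an argument are assumptions for this example, and it assumes each loop counts its own rounds against the same `MAX_ROUNDS` limit.

```bash
# Hypothetical helper: print the last verdict (APPROVED or REVISE) found on stdin.
# Returns non-zero when no verdict line is present.
parse_verdict() {
  local verdict
  verdict="$(grep -oE 'VERDICT: (APPROVED|REVISE)' | tail -n 1 || true)"
  [ -n "$verdict" ] && printf '%s\n' "${verdict#VERDICT: }"
}

# Hypothetical driver: run the given round command until it is approved
# or MAX_ROUNDS (default 10) is exhausted. The round number is passed as
# the last argument to the command.
review_loop() {
  local max_rounds="${MAX_ROUNDS:-10}" round verdict
  for ((round = 1; round <= max_rounds; round++)); do
    verdict="$("$@" "$round" | parse_verdict)" || verdict=""
    if [ "$verdict" = "APPROVED" ]; then
      echo "approved after $round round(s)"
      return 0
    fi
    # REVISE (or a missing verdict) falls through to the next round.
  done
  echo "MAX_ROUNDS reached without approval after $max_rounds round(s)"
  return 1
}
```

In this sketch a non-zero return signals that the round limit was hit, leaving the decision about what to do next to the caller.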
@@ -239,7 +286,10 @@ Both loops share the same 9-step subroutine and the same `MAX_ROUNDS` counter (d
2. **Secret scan (step 1a)** — per-payload, no caching. See Secret Scan section below.
3. Generate reviewer command script at `/tmp/do-task-<kind>-review-<REVIEW_ID>.sh`.
4. Run via `reviewer-runtime/run-review.sh`.
5. Promote reviewer output and capture the session ID on Round 1; persist it to `task-plan.md` Runtime State under the loop-specific variable (`CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, or `OPENCODE_IMPL_SESSION_ID`).
5. Promote reviewer output and capture the session ID on Round 1; persist it to `task-plan.md`
Runtime State under the loop-specific variable (`CODEX_PLAN_SESSION_ID`,
`CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`,
`OPENCODE_PLAN_SESSION_ID`, or `OPENCODE_IMPL_SESSION_ID`).
6. Parse verdict; append an entry to Review History; bump the round counter.
7. Branch: `APPROVED` → exit, `REVISE` → caller revises and re-enters, `MAX_ROUNDS` → caller decides.
8. Liveness contract: wait while `In progress N` heartbeats arrive from the runner.
@@ -274,7 +324,8 @@ ts=<ISO-8601> level=<info|warn|error> state=<running-silent|running-active|in-pr
```
`in-progress` is the liveness heartbeat emitted roughly once per minute with `note="In progress N"`.
`stall-warning` is a non-terminal status-log state only. It does not mean the caller should stop waiting if `in-progress` heartbeats continue.
`stall-warning` is a non-terminal status-log state only. It does not mean the caller should
stop waiting if `in-progress` heartbeats continue.
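To make the distinction between heartbeats and stall warnings concrete, a caller could classify status-log lines along these lines. This is a sketch under assumptions: the helper names are invented for illustration, and it relies only on the `state=` field and the two state names described above, not on the full log format.

```bash
# Hypothetical helper: extract the value of the state= field from one status-log line.
status_state() {
  local line="$1" re='(^|[[:space:]])state=([^[:space:]]+)'
  if [[ "$line" =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[2]}"
  else
    return 1
  fi
}

# Hypothetical helper: classify a line for the waiting caller.
#   heartbeat             liveness signal, keep waiting
#   warning-non-terminal  stall-warning, not a reason to stop on its own
#   other                 any other state, handled elsewhere
#   unparsed              no state= field found
classify_status_line() {
  local state
  state="$(status_state "$1")" || { echo "unparsed"; return 0; }
  case "$state" in
    in-progress)   echo "heartbeat" ;;
    stall-warning) echo "warning-non-terminal" ;;
    *)             echo "other" ;;
  esac
}
```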
### Persistent Artifact
@@ -295,19 +346,25 @@ The one file kept across runs is `ai_plan/<slug>/task-plan.md`. Its `Status` enu
## Failure Handling
- `completed-empty-output` — the reviewer exited without producing review text; surface `.stderr` and `.status`, then retry only after diagnosing the cause.
- `needs-operator-decision` — the helper reached hard-timeout escalation; surface `.status` and decide whether to extend the timeout, abort, or retry with different parameters.
- Successful rounds clean up temp artifacts. Failed, empty-output, and operator-decision rounds retain `.stderr`, `.status`, and `.runner.out` until diagnosed.
- Verification gate (Phase 7) retries up to 3 times. On exhaustion, `Status` becomes `aborted-verification` and the user is asked whether to retry, override, or abort.
- `completed-empty-output` — the reviewer exited without producing review text; surface
`.stderr` and `.status`, then retry only after diagnosing the cause.
- `needs-operator-decision` — the helper reached hard-timeout escalation; surface `.status`
and decide whether to extend the timeout, abort, or retry with different parameters.
- Successful rounds clean up temp artifacts. Failed, empty-output, and operator-decision rounds
retain `.stderr`, `.status`, and `.runner.out` until diagnosed.
- Verification gate (Phase 7) retries up to 3 times. On exhaustion, `Status` becomes
`aborted-verification` and the user is asked whether to retry, override, or abort.
- As long as fresh `in-progress` heartbeats continue to arrive roughly once per minute, the caller keeps waiting.
## Secret Scan (subroutine step 1a; per-payload; no caching)
Every outbound reviewer payload is scanned **before** being sent to the reviewer CLI. This scan runs on every round of both loops. No results are cached, because the Phase 8 payload includes newly-introduced diff content that earlier rounds never saw.
Every outbound reviewer payload is scanned **before** being sent to the reviewer CLI. This scan
runs on every round of both loops. No results are cached, because the Phase 8 payload includes
newly-introduced diff content that earlier rounds never saw.
Canonical anchored regex list (10 patterns):
```
```text
AWS access key: AKIA[0-9A-Z]{16}
GCP service-acct: "type"\s*:\s*"service_account"
GitHub tokens: (ghp|gho|ghs|ghu|ghr)_[A-Za-z0-9]{36,}
@@ -320,9 +377,14 @@ PEM private keys: -----BEGIN [A-Z ]+ PRIVATE KEY-----
JWT: eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+
```
If a match is found, the skill **redacts the matched text before showing it to the user** using the fixed token `[REDACTED:<pattern-label>:<match-length>-chars]` (pattern labels: `aws-access-key`, `gcp-service-account`, `github-token`, `slack-token`, `openai-key`, `anthropic-key`, `pem-private-key`, `dotenv-style`, `jwt`). File paths and line numbers are kept. Raw match text is never echoed to terminal, chat log, or any persistent file.
If a match is found, the skill **redacts the matched text before showing it to the user** using
the fixed token `[REDACTED:<pattern-label>:<match-length>-chars]` (pattern labels:
`aws-access-key`, `gcp-service-account`, `github-token`, `slack-token`, `openai-key`,
`anthropic-key`, `pem-private-key`, `dotenv-style`, `jwt`). File paths and line numbers are kept.
Raw match text is never echoed to terminal, chat log, or any persistent file.
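As an illustration of the redaction token format, the sketch below builds the token for a match and applies it to AWS access key IDs only. The function names are hypothetical and the real skill covers all listed patterns; this example just shows how the label and match length end up in the token.

```bash
# Hypothetical helper: build the fixed redaction token for one match.
redaction_token() {
  local label="$1" match="$2"
  printf '[REDACTED:%s:%d-chars]' "$label" "${#match}"
}

# Hypothetical helper: replace AWS access key IDs read from stdin with
# redaction tokens, so the raw match text is never printed.
redact_aws_keys() {
  local line match re='AKIA[0-9A-Z]{16}'
  while IFS= read -r line; do
    # The token itself never matches the pattern, so this loop terminates.
    while [[ "$line" =~ $re ]]; do
      match="${BASH_REMATCH[0]}"
      line="${line//"$match"/$(redaction_token aws-access-key "$match")}"
    done
    printf '%s\n' "$line"
  done
}
```

For example, piping a line containing a 20-character key ID through `redact_aws_keys` yields the same line with the key replaced by `[REDACTED:aws-access-key:20-chars]`.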
The user answers `yes` / `no` / `redact`:
- `yes` — proceed; Runtime State records `last_scan_outcome_<kind>=user-approved-with-matches`.
- `redact` — the user supplies redactions, the skill applies them, and re-scans before sending. Runtime State records `last_scan_outcome_<kind>=redacted-and-approved`.
- `no` — stop the loop, set `Status: failed`, send Telegram summary.
@@ -343,15 +405,21 @@ For all supported reviewer CLIs, the preferred execution path is:
2. Run that script through `reviewer-runtime/run-review.sh`.
3. Fall back to direct synchronous execution only if the helper is missing or not executable.
## Pi Reviewer Support
All workflow variants can use Pi itself as a reviewer CLI. Use `pi/<pi-model-name>` shorthand, for example `pi/claude-opus-4-7`; this means `REVIEWER_CLI=pi` and `REVIEWER_MODEL=claude-opus-4-7`. Provider-qualified or multi-slash Pi model IDs are preserved after the first `pi/` prefix, for example `pi/anthropic/claude-opus-4-7`.
All workflow variants can use Pi itself as a reviewer CLI. Use `pi/<pi-model-name>` shorthand,
for example `pi/claude-opus-4-7`; this means `REVIEWER_CLI=pi` and
`REVIEWER_MODEL=claude-opus-4-7`. Provider-qualified or multi-slash Pi model IDs are preserved
after the first `pi/` prefix, for example `pi/anthropic/claude-opus-4-7`.
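The shorthand rule can be sketched as a small parser. This is an illustrative helper, not code from the skill: the function name `parse_pi_reviewer` is an assumption, while `REVIEWER_CLI` and `REVIEWER_MODEL` are the variables named above.

```bash
# Hypothetical helper: split a "pi/<model>" shorthand into REVIEWER_CLI and
# REVIEWER_MODEL. Only the first "pi/" prefix is removed, so provider-qualified
# model IDs keep their remaining slashes.
parse_pi_reviewer() {
  local spec="$1"
  case "$spec" in
    pi/?*)
      REVIEWER_CLI="pi"
      REVIEWER_MODEL="${spec#pi/}"
      ;;
    *)
      echo "not a pi/<model> shorthand: $spec" >&2
      return 1
      ;;
  esac
}
```

With this sketch, `parse_pi_reviewer "pi/anthropic/claude-opus-4-7"` sets `REVIEWER_CLI=pi` and `REVIEWER_MODEL=anthropic/claude-opus-4-7`.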
The canonical isolated read-only Pi reviewer flag contract lives in [PI-COMMON-REVIEWER.md](./PI-COMMON-REVIEWER.md). This workflow passes the plan and implementation review payload at `/tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md` and expects the standard `## Summary`, `## Findings`, and `## Verdict` response. Pi reviewer output is captured as markdown stdout, not JSON.
The canonical isolated read-only Pi reviewer flag contract lives in
[PI-COMMON-REVIEWER.md](./PI-COMMON-REVIEWER.md). This workflow passes the plan and
implementation review payload at `/tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md` and expects the
standard `## Summary`, `## Findings`, and `## Verdict` response. Pi reviewer output is captured
as markdown stdout, not JSON.
If the Pi reviewer model or provider is unavailable, surface the helper stderr/status and use `pi --list-models [search]` to inspect configured models.
If the Pi reviewer model or provider is unavailable, surface the helper stderr/status and use
`pi --list-models [search]` to inspect configured models.
## Notifications
@@ -359,7 +427,8 @@ If the Pi reviewer model or provider is unavailable, surface the helper stderr/s
- Shared setup: [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md)
- Notification failures are non-blocking, but they must be surfaced to the user.
- Before stopping for any user interaction, approval, or manual decision, the skill sends a Telegram summary first if configured.
- Terminal outcomes that trigger Telegram: `pushed`, `local-only`, `aborted-plan-review`, `aborted-impl-review`, `aborted-verification`, `failed`.
- Terminal outcomes that trigger Telegram: `pushed`, `local-only`, `aborted-plan-review`,
`aborted-impl-review`, `aborted-verification`, `failed`.
The reviewer-runtime helper also supports manual override flags for diagnostics:
@@ -377,7 +446,9 @@ run-review.sh \
## Template Guardrails
All four `templates/task-plan.md` files share identical core sections (14 `## `-level headings) and identical Status enum (10 values). Variant-specific guardrail language is permitted in the leading blockquote and in the `Runtime` field of the Metadata table.
All four `templates/task-plan.md` files share identical core sections (14 `##`-level headings)
and identical Status enum (10 values). Variant-specific guardrail language is permitted in the
leading blockquote and in the `Runtime` field of the Metadata table.
**Core sections** (appear in every variant, same order):
@@ -396,11 +467,15 @@ All four `templates/task-plan.md` files share identical core sections (14 `## `-
13. Final Status
14. Guardrails (do NOT remove)
**Runtime State keys** (same across all variants): `plan_review_round`, `implementation_review_round`, `CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, `OPENCODE_IMPL_SESSION_ID`, `last_phase_entered`, `last_round_ts`, `last_scan_outcome_plan`, `last_scan_outcome_impl`, `verification_attempts`, `tests_added_count`, `tdd_used`.
**Runtime State keys** (same across all variants): `plan_review_round`,
`implementation_review_round`, `CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`,
`CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`,
`OPENCODE_IMPL_SESSION_ID`, `last_phase_entered`, `last_round_ts`, `last_scan_outcome_plan`,
`last_scan_outcome_impl`, `verification_attempts`, `tests_added_count`, `tdd_used`.
## Variant Hardening Notes
### Claude Code
### Claude Code Hardening
- Must invoke explicit required sub-skills via the `Skill` tool:
- `superpowers:brainstorming`
@@ -411,7 +486,7 @@ All four `templates/task-plan.md` files share identical core sections (14 `## `-
- Must enforce plan-mode file-write guard in Phase 4:
- If currently in plan mode, instruct user to exit plan mode before writing `task-plan.md`.
### Codex
### Codex Hardening
- Must use native skill discovery from `~/.agents/skills/` (no CLI wrappers).
- Must verify Superpowers skills symlink: `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills`
@@ -422,22 +497,25 @@ All four `templates/task-plan.md` files share identical core sections (14 `## `-
- Helper paths: `~/.codex/skills/reviewer-runtime/...`.
- No plan-mode guard (Codex has no plan-mode concept).
### OpenCode
### OpenCode Hardening
- Must use OpenCode's native skill tool (not Claude's `Skill` tool syntax). OpenCode may load shared skill files from `~/.agents/skills/`, but invocation is still OpenCode-native.
- Phase 1 includes a Bootstrap Superpowers Context step that lists installed skills and confirms the required `superpowers/<skill>` set is discoverable before any other phase runs.
- Must use OpenCode's native skill tool (not Claude's `Skill` tool syntax). OpenCode may load
shared skill files from `~/.agents/skills/`, but invocation is still OpenCode-native.
- Phase 1 includes a Bootstrap Superpowers Context step that lists installed skills and confirms
the required `superpowers/<skill>` set is discoverable before any other phase runs.
- Must verify Superpowers skill discovery under `~/.agents/skills/superpowers` or `~/.config/opencode/skills/superpowers`.
- Helper paths: `~/.config/opencode/skills/reviewer-runtime/...`.
- OpenCode reviewer calls MUST use `--agent plan` (the built-in plan primary agent) for read-only posture.
- No plan-mode guard (OpenCode has no plan-mode concept).
### Cursor
### Cursor Hardening
- Must use Cursor-native discovery from `.cursor/skills/`, `~/.cursor/skills/`, or installed Cursor plugin cache entries.
- Must announce skill usage explicitly before invocation.
- `jq` is a hard prerequisite.
- Helper paths: `.cursor/skills/reviewer-runtime/...` preferred, `~/.cursor/skills/reviewer-runtime/...` fallback.
- Reviewer invocations MUST use `--mode=ask --trust --output-format json`. Never `--mode=agent`, never `--force`, never write-capable modes for reviewer calls.
- Reviewer invocations MUST use `--mode=ask --trust --output-format json`. Never `--mode=agent`,
never `--force`, never write-capable modes for reviewer calls.
- No plan-mode guard (Cursor has no plan-mode concept).
## Execution Workflow Rules
@@ -447,7 +525,8 @@ All four `templates/task-plan.md` files share identical core sections (14 `## `-
- Plan review completes before any implementation starts.
- Phase 7 verification gate must pass before the implementation review starts.
- The task commit is a single commit created in Phase 9.
- The `.gitignore` infra commit (Phase 1) is explicitly separate from the task commit and is allowed even when the final task ends up `aborted` or `failed`.
- The `.gitignore` infra commit (Phase 1) is explicitly separate from the task commit and is
allowed even when the final task ends up `aborted` or `failed`.
- No push without explicit `yes` from the user.
- Secret scan runs per-payload with no caching.
- `MAX_ROUNDS=10` is shared across both loops (single mental model).