docs(do-task): add DO-TASK.md + README updates (M6)

docs/DO-TASK.md covers:
- Purpose, Requirements (variant-specific prereqs + dependency-missing
  messages per variant), Reviewer CLI Requirements table (4 CLIs
  including opencode with fresh-call default).
- Install (4 subsections: Codex, Claude Code, OpenCode, Cursor).
- Per-variant Verify Installation subsections checking CLI binary,
  SKILL.md, run-review.sh, notify-telegram.sh, Superpowers sub-skills,
  and variant extras (Codex symlink, Cursor jq, OpenCode Superpowers ls,
  Cursor repo-vs-global lookup).
- Key Behavior, Dual Review Loops, Subroutine Steps, Reviewer Output
  Contract, Runtime Artifacts, Persistent Artifact Status enum
  (10 values), Failure Handling.
- Secret Scan (subroutine step 1a; per-payload; no caching) with
  canonical 10-pattern regex list and redaction contract.
- Supported Reviewer CLIs table (4 rows, including opencode).
- Notifications, Template Guardrails (14 core sections + Runtime State
  keys), Variant Hardening Notes (4 subsections), Execution Workflow
  Rules.

docs/README.md adds DO-TASK.md entry.

README.md:
- Skills table adds 4 do-task rows (codex, claude-code, opencode, cursor).
- Docs links add "Do-task guide" entry.
- Repository Layout adds do-task/ subdirectory.

Reviewer: codex / gpt-5.4. Approved round 2:
- Round 1: 2 P2 (prereqs inaccurate, Verify Installation incomplete)
  + 1 P3 -> REVISE.
- Round 2: 0 P0/P1/P2/P3 -> APPROVED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -34,6 +34,11 @@ ai-coding-skills/
│   │   ├── claude-code/
│   │   ├── opencode/
│   │   └── cursor/
│   ├── do-task/
│   │   ├── codex/
│   │   ├── claude-code/
│   │   ├── opencode/
│   │   └── cursor/
│   ├── implement-plan/
│   │   ├── codex/
│   │   ├── claude-code/
@@ -64,6 +69,10 @@ ai-coding-skills/
| create-plan | claude-code | Structured planning with milestones, iterative cross-model review, and runbook-first execution workflow | Ready | [CREATE-PLAN](docs/CREATE-PLAN.md) |
| create-plan | opencode | Structured planning with milestones, iterative cross-model review, and runbook-first execution workflow | Ready | [CREATE-PLAN](docs/CREATE-PLAN.md) |
| create-plan | cursor | Structured planning with milestones, iterative cross-model review, and runbook-first execution workflow | Ready | [CREATE-PLAN](docs/CREATE-PLAN.md) |
| do-task | codex | Single-prompt end-to-end execution with dual reviewer loops (plan + implementation), TDD-first, single task commit | Ready | [DO-TASK](docs/DO-TASK.md) |
| do-task | claude-code | Single-prompt end-to-end execution with dual reviewer loops (plan + implementation), TDD-first, single task commit | Ready | [DO-TASK](docs/DO-TASK.md) |
| do-task | opencode | Single-prompt end-to-end execution with dual reviewer loops (plan + implementation), TDD-first, single task commit | Ready | [DO-TASK](docs/DO-TASK.md) |
| do-task | cursor | Single-prompt end-to-end execution with dual reviewer loops (plan + implementation), TDD-first, single task commit | Ready | [DO-TASK](docs/DO-TASK.md) |
| implement-plan | codex | Worktree-isolated plan execution with iterative cross-model milestone review | Ready | [IMPLEMENT-PLAN](docs/IMPLEMENT-PLAN.md) |
| implement-plan | claude-code | Worktree-isolated plan execution with iterative cross-model milestone review | Ready | [IMPLEMENT-PLAN](docs/IMPLEMENT-PLAN.md) |
| implement-plan | opencode | Worktree-isolated plan execution with iterative cross-model milestone review | Ready | [IMPLEMENT-PLAN](docs/IMPLEMENT-PLAN.md) |
@@ -75,6 +84,7 @@ ai-coding-skills/
- Docs index: `docs/README.md`
- Atlassian guide: `docs/ATLASSIAN.md`
- Create-plan guide: `docs/CREATE-PLAN.md`
- Do-task guide: `docs/DO-TASK.md`
- Implement-plan guide: `docs/IMPLEMENT-PLAN.md`
- Web-automation guide: `docs/WEB-AUTOMATION.md`
@@ -0,0 +1,397 @@
# DO-TASK

## Purpose

Execute a single user-supplied prompt end-to-end with **two reviewer loops** (plan review + implementation review), TDD-first execution, a verification gate that runs before the implementation review, and a single task commit, all in one run of the skill. `do-task` is scoped to small-to-medium ad-hoc tasks; for multi-milestone work use `create-plan` + `implement-plan` instead.

`do-task` persists one plan artifact per run: `ai_plan/YYYY-MM-DD-<slug>/task-plan.md`. The folder is kept as a record after success (not deleted). Resume is supported via the `Status` enum and Runtime State fields.

## Requirements

- Git repo with `/ai_plan/` entry in `.gitignore` (the skill adds the entry automatically if missing and commits it as a separate infra commit).
- Superpowers skills installed from: https://github.com/obra/superpowers
- Required dependencies (vary by variant; see Install below):
  - `superpowers:brainstorming` (or `superpowers/brainstorming` for OpenCode)
  - `superpowers:test-driven-development`
  - `superpowers:verification-before-completion`
  - `superpowers:finishing-a-development-branch`
  - `superpowers:using-git-worktrees` (only when the prompt opts in to a worktree)
- For Codex, native skill discovery must be configured:
  - `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills`
- For Cursor, skills must be installed under `.cursor/skills/` (repo-local) or `~/.cursor/skills/` (global), and `jq` is a hard prerequisite.
- For OpenCode, Superpowers must be installed at `~/.config/opencode/skills/superpowers`.
- Shared reviewer runtime (`run-review.sh`) AND Telegram notifier helper (`notify-telegram.sh`) must be installed beside agent skills. Both scripts ship under `skills/reviewer-runtime/` in this repo and must be copied into the per-variant location:
  - Codex: `~/.codex/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}`
  - Claude Code: `~/.claude/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}`
  - OpenCode: `~/.config/opencode/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}`
  - Cursor: `.cursor/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}` (repo-local, preferred) or `~/.cursor/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}` (global fallback)
- Variant-specific prerequisites:
  - **Claude Code:** `claude --version`, explicit `Skill`-tool invocation of sub-skills.
  - **Codex:** `codex --version`; `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills` symlink present.
  - **Cursor:** `cursor-agent --version`, `jq --version` (hard prereq), Superpowers installed under `.cursor/skills/` or `~/.cursor/skills/`.
  - **OpenCode:** `opencode --version`; Superpowers installed at `~/.config/opencode/skills/superpowers`; Phase 1 runs Bootstrap Superpowers Context.
- Telegram notification setup is documented in [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md)

Dependency-missing messages are variant-specific:

- **Claude Code:** `Missing dependency: [specific missing item]. Install required Superpowers skills (https://github.com/obra/superpowers) and the reviewer-runtime helper, then retry.`
- **Codex:** `Missing dependency: [specific missing item]. Install required Superpowers skills (https://github.com/obra/superpowers) and the reviewer-runtime helper, then retry.`
- **Cursor:** `Missing dependency: [specific missing item]. Install Cursor Agent CLI, jq, and Superpowers skills under .cursor/skills/ or ~/.cursor/skills/, then retry.`
- **OpenCode:** `Missing dependency: [specific missing item]. Install required OpenCode Superpowers skills (https://github.com/obra/superpowers, OpenCode setup) and the reviewer-runtime helper, then retry.`

### Reviewer CLI Requirements

One of these CLIs must be installed to drive either of the two review loops:

| Reviewer CLI | Install | Verify | Read-Only Mode | Session Resume |
|---|---|---|---|---|
| `codex` | `npm install -g @openai/codex` | `codex --version` | `-s read-only` | Yes (`codex exec resume <id>`) |
| `claude` | `npm install -g @anthropic-ai/claude-code` | `claude --version` | `--strict-mcp-config --setting-sources user` | No (fresh call each round) |
| `cursor` | `curl https://cursor.com/install -fsS \| bash` | `cursor-agent --version` (binary: `cursor-agent`; alias `cursor agent` also works) | `--mode=ask` | Yes (`--resume <id>`) |
| `opencode` | `brew install opencode` or your package manager | `opencode --version` | `--agent plan` | Opt-in (`-s <id>`; fresh call is the default) |

The reviewer CLI is independent of which agent is running the skill — e.g., Claude Code can send both the plan and the implementation to Codex for review.

**Additional dependency for `cursor` reviewer:** `jq` is required to parse Cursor's JSON output. Install via `brew install jq` (macOS) or your package manager. Verify: `jq --version`. The cursor variant of `do-task` makes `jq` a hard prerequisite regardless of which reviewer CLI is selected.
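
A preflight check for the selected reviewer could look like the sketch below. The function names `reviewer_binary` and `check_reviewer` and the return codes are illustrative assumptions, not part of the skill; only the binary names and the `jq` requirement come from the table above.

```bash
# Hypothetical helper: map a reviewer name to the binary the table lists for it.
reviewer_binary() {
  case "$1" in
    codex)    echo "codex" ;;
    claude)   echo "claude" ;;
    cursor)   echo "cursor-agent" ;;
    opencode) echo "opencode" ;;
    *)        return 1 ;;
  esac
}

# Hypothetical preflight: returns 2 for an unknown reviewer, 1 for a missing binary.
check_reviewer() {
  local reviewer="$1" bin
  bin="$(reviewer_binary "$reviewer")" || {
    echo "Unknown reviewer CLI: $reviewer" >&2
    return 2
  }
  command -v "$bin" >/dev/null 2>&1 || {
    echo "Missing dependency: $bin" >&2
    return 1
  }
  # The cursor reviewer also needs jq to parse its JSON output.
  if [ "$reviewer" = "cursor" ]; then
    command -v jq >/dev/null 2>&1 || {
      echo "Missing dependency: jq" >&2
      return 1
    }
  fi
}
```

Calling `check_reviewer codex` before the first review round would surface a missing binary early instead of failing inside the loop.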

## Install

### Codex

```bash
mkdir -p ~/.codex/skills/do-task
cp -R skills/do-task/codex/* ~/.codex/skills/do-task/
mkdir -p ~/.codex/skills/reviewer-runtime
cp -R skills/reviewer-runtime/* ~/.codex/skills/reviewer-runtime/
```

### Claude Code

```bash
mkdir -p ~/.claude/skills/do-task
cp -R skills/do-task/claude-code/* ~/.claude/skills/do-task/
mkdir -p ~/.claude/skills/reviewer-runtime
cp -R skills/reviewer-runtime/* ~/.claude/skills/reviewer-runtime/
```

### OpenCode

```bash
mkdir -p ~/.config/opencode/skills/do-task
cp -R skills/do-task/opencode/* ~/.config/opencode/skills/do-task/
mkdir -p ~/.config/opencode/skills/reviewer-runtime
cp -R skills/reviewer-runtime/* ~/.config/opencode/skills/reviewer-runtime/
```

### Cursor

Copy into the repo-local `.cursor/skills/` directory (where the Cursor Agent CLI discovers skills):

```bash
mkdir -p .cursor/skills/do-task
cp -R skills/do-task/cursor/* .cursor/skills/do-task/
mkdir -p .cursor/skills/reviewer-runtime
cp -R skills/reviewer-runtime/* .cursor/skills/reviewer-runtime/
```

Or install globally (loaded via `~/.cursor/skills/`):

```bash
mkdir -p ~/.cursor/skills/do-task
cp -R skills/do-task/cursor/* ~/.cursor/skills/do-task/
mkdir -p ~/.cursor/skills/reviewer-runtime
cp -R skills/reviewer-runtime/* ~/.cursor/skills/reviewer-runtime/
```

## Verify Installation

Run the per-variant checks for everything the corresponding `SKILL.md` enforces. Each check is structured: (1) CLI binary version, (2) skill file presence, (3) reviewer-runtime + notifier helper presence, (4) Superpowers sub-skill discovery, (5) variant-specific extras.

### Codex

```bash
codex --version
test -f ~/.codex/skills/do-task/SKILL.md
test -x ~/.codex/skills/reviewer-runtime/run-review.sh
test -x ~/.codex/skills/reviewer-runtime/notify-telegram.sh
test -L ~/.agents/skills/superpowers
test -f ~/.agents/skills/superpowers/brainstorming/SKILL.md
test -f ~/.agents/skills/superpowers/test-driven-development/SKILL.md
test -f ~/.agents/skills/superpowers/verification-before-completion/SKILL.md
test -f ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md
```

### Claude Code

```bash
claude --version
test -f ~/.claude/skills/do-task/SKILL.md
test -x ~/.claude/skills/reviewer-runtime/run-review.sh
test -x ~/.claude/skills/reviewer-runtime/notify-telegram.sh
test -f ~/.claude/skills/superpowers/brainstorming/SKILL.md
test -f ~/.claude/skills/superpowers/test-driven-development/SKILL.md
test -f ~/.claude/skills/superpowers/verification-before-completion/SKILL.md
test -f ~/.claude/skills/superpowers/finishing-a-development-branch/SKILL.md
```

### OpenCode

```bash
opencode --version
test -f ~/.config/opencode/skills/do-task/SKILL.md
test -x ~/.config/opencode/skills/reviewer-runtime/run-review.sh
test -x ~/.config/opencode/skills/reviewer-runtime/notify-telegram.sh
ls -l ~/.config/opencode/skills/superpowers
test -f ~/.config/opencode/skills/superpowers/brainstorming/SKILL.md
test -f ~/.config/opencode/skills/superpowers/test-driven-development/SKILL.md
test -f ~/.config/opencode/skills/superpowers/verification-before-completion/SKILL.md
test -f ~/.config/opencode/skills/superpowers/finishing-a-development-branch/SKILL.md
```

### Cursor

```bash
cursor-agent --version
jq --version
test -f .cursor/skills/do-task/SKILL.md || test -f ~/.cursor/skills/do-task/SKILL.md
test -x .cursor/skills/reviewer-runtime/run-review.sh || test -x ~/.cursor/skills/reviewer-runtime/run-review.sh
test -x .cursor/skills/reviewer-runtime/notify-telegram.sh || test -x ~/.cursor/skills/reviewer-runtime/notify-telegram.sh
test -f .cursor/skills/superpowers/skills/brainstorming/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/brainstorming/SKILL.md
test -f .cursor/skills/superpowers/skills/test-driven-development/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/test-driven-development/SKILL.md
test -f .cursor/skills/superpowers/skills/verification-before-completion/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/verification-before-completion/SKILL.md
test -f .cursor/skills/superpowers/skills/finishing-a-development-branch/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/finishing-a-development-branch/SKILL.md
```
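
Because `test` reports only through its exit code, a failed check in the lists above is easy to miss when the commands are pasted as a batch. One possible wrapper, shown here for the Claude Code variant, prints a line per check. The names `check` and `verify_claude_code` are hypothetical; the paths are the ones listed above.

```bash
# Hypothetical wrapper: print "ok" or "FAIL" for a labelled command.
check() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "ok    $label"
  else
    echo "FAIL  $label"
    return 1
  fi
}

# Runs every Claude Code check and returns non-zero if any of them failed.
verify_claude_code() {
  local root="$HOME/.claude/skills" failed=0 skill
  check "claude CLI" claude --version || failed=1
  check "do-task SKILL.md" test -f "$root/do-task/SKILL.md" || failed=1
  check "run-review.sh" test -x "$root/reviewer-runtime/run-review.sh" || failed=1
  check "notify-telegram.sh" test -x "$root/reviewer-runtime/notify-telegram.sh" || failed=1
  for skill in brainstorming test-driven-development \
    verification-before-completion finishing-a-development-branch; do
    check "superpowers:$skill" test -f "$root/superpowers/$skill/SKILL.md" || failed=1
  done
  return "$failed"
}
```

The same pattern would apply to the other variants by swapping the root directory and adding the variant-specific extras.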

## Key Behavior

- Creates one persistent plan artifact at `ai_plan/YYYY-MM-DD-<slug>/task-plan.md`.
- Ensures `/ai_plan/` is in `.gitignore`. If missing, adds it and creates a separate `chore(gitignore): ignore ai_plan local planning artifacts` commit.
- Parses the user prompt, detects the trigger phrase, and asks 1-3 clarifying questions unless the prompt already has a concrete target + outcome + unambiguous scope + resolvable identifiers.
- Invokes `superpowers:brainstorming` for any behavior-changing task (feature creation, non-trivial bug fix, refactor, design decision). The only skip conditions are `pure-documentation` and `pure-comment-whitespace-rename`.
- Asks which reviewer CLI, model, and max rounds to use (or accepts `skip` for no review). "Use defaults" maps to `codex / gpt-5.4 / MAX_ROUNDS=10`.
- Runs the plan review loop (Phase 5) before implementation, iterating up to `MAX_ROUNDS` (default 10) or until the reviewer returns `VERDICT: APPROVED`.
- Executes with TDD-first (Phase 6) via `superpowers:test-driven-development`. Auto-skip permitted only for `pure-documentation` and `pure-comment-whitespace-rename`; all other skips (including config-file additions) require explicit user approval, recorded in the TDD Approach section with an ISO-8601 timestamp.
- Runs lint/typecheck/tests as a **verification gate** (Phase 7) before the implementation review loop.
- Runs the implementation review loop (Phase 8) against the diff + verification output, iterating up to `MAX_ROUNDS` or until `APPROVED`.
- Scans every outbound reviewer payload for secrets (subroutine step 1a). Per-payload, no caching.
- Creates a **single commit** after the implementation review approves. Does NOT push. Asks the user for explicit `yes` before any push.
- Defaults to the **current branch**. Worktree only on explicit opt-in (`"in a worktree"`, `"use a worktree"`, `"on an isolated branch"`, `"on a new branch called X"`).
- Supports resume: detects existing folder by slug and uses `Status` + Runtime State to decide how to re-enter.
- Sends completion notifications through Telegram only when the shared setup in [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md) is installed and configured.
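
The `.gitignore` behavior above can be illustrated with a minimal sketch. The function names `add_ignore_entry` and `ensure_ai_plan_ignored` are assumptions made for this example; the entry and the commit message are the ones quoted in the list, and how the skill itself implements this step is not shown in this document.

```bash
# Hypothetical helper: append /ai_plan/ if it is not already an exact line.
# Prints "added" or "present" so the caller knows whether a commit is needed.
add_ignore_entry() {
  local ignore_file="$1" entry='/ai_plan/'
  # -x matches whole lines, -F treats the entry as a literal string.
  if [ -f "$ignore_file" ] && grep -qxF -e "$entry" "$ignore_file"; then
    echo "present"
    return 0
  fi
  # Terminate an unterminated last line so the entry lands on its own line.
  if [ -s "$ignore_file" ] && [ -n "$(tail -c 1 "$ignore_file")" ]; then
    printf '\n' >> "$ignore_file"
  fi
  printf '%s\n' "$entry" >> "$ignore_file"
  echo "added"
}

# Hypothetical caller: commit the entry separately from the task commit.
ensure_ai_plan_ignored() {
  local repo_root="$1"
  if [ "$(add_ignore_entry "$repo_root/.gitignore")" = "added" ]; then
    git -C "$repo_root" add .gitignore
    git -C "$repo_root" commit \
      -m 'chore(gitignore): ignore ai_plan local planning artifacts' -- .gitignore
  fi
}
```

Restricting the commit to `.gitignore` keeps the infra commit separate even if other files are already staged.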

## Dual Review Loops

`do-task` runs the reviewer twice per successful run, with separate session IDs so reviewer context never leaks across loops.

1. **Plan review loop (Phase 5)** — payload is the current `task-plan.md` with `Runtime State` and `Review History` stripped. The reviewer evaluates whether the plan matches the prompt, whether assumptions are surfaced, whether acceptance criteria are testable, whether the TDD approach is appropriate, and whether there are missing files/risks/security concerns.
2. **Implementation review loop (Phase 8)** — payload is the approved task plan (without Runtime State) + `git diff` (unstaged + staged) + verification output (lint, typecheck, tests). The reviewer evaluates correctness, code quality, test coverage, security, and regression risk.

Both loops share the same 9-step subroutine and the same `MAX_ROUNDS` limit (default 10). Each loop tracks its own round in Runtime State (`plan_review_round`, `implementation_review_round`).

### Subroutine Steps (inside each review loop)

1. Write payload to `/tmp/do-task-<kind>-<REVIEW_ID>.md`.
2. **Secret scan (step 1a)** — per-payload, no caching. See Secret Scan section below.
3. Generate reviewer command script at `/tmp/do-task-<kind>-review-<REVIEW_ID>.sh`.
4. Run via `reviewer-runtime/run-review.sh`.
5. Promote reviewer output and capture the session ID on Round 1; persist it to `task-plan.md` Runtime State under the loop-specific variable (`CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, or `OPENCODE_IMPL_SESSION_ID`).
6. Parse verdict; append an entry to Review History; bump the round counter.
7. Branch: `APPROVED` → exit, `REVISE` → caller revises and re-enters, `MAX_ROUNDS` → caller decides.
8. Liveness contract: wait while `In progress N` heartbeats arrive from the runner.
9. Cleanup temp artifacts on success.
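
The branching in step 7 might be sketched as a bounded loop. `review_loop` and its callback convention (a command that receives the round number and prints `APPROVED` or `REVISE`) are hypothetical and exist only for illustration; the real skill also revises the plan or code between rounds, which this sketch leaves to the callback.

```bash
# Hypothetical driver: run a per-round callback until it approves
# or the round limit is reached.
# Usage: review_loop <max_rounds> <callback> [callback args...]
review_loop() {
  local max_rounds="$1"
  shift
  local round=1 verdict
  while [ "$round" -le "$max_rounds" ]; do
    verdict="$("$@" "$round")"
    case "$verdict" in
      APPROVED)
        echo "approved after $round round(s)"
        return 0
        ;;
      REVISE)
        round=$((round + 1))
        ;;
      *)
        echo "unexpected verdict: $verdict" >&2
        return 2
        ;;
    esac
  done
  # Limit reached without approval: the caller decides what happens next.
  echo "MAX_ROUNDS reached after $max_rounds round(s)"
  return 1
}
```

Returning distinct exit codes for approval, limit reached, and unexpected output lets the caller map each case to a `Status` value.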

### Reviewer Output Contract

- `P0` = total blocker
- `P1` = major risk
- `P2` = must-fix before approval
- `P3` = cosmetic / nice to have
- Each severity section uses `- None.` when empty.
- `VERDICT: APPROVED` is valid only when no `P0`, `P1`, or `P2` findings remain.
- `P3` findings are non-blocking, but the caller should still try to fix them when cheap and safe.
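
Extracting the verdict from normalized review text could be done along these lines. This is a sketch under assumptions: the function name `parse_verdict` is hypothetical, and the literal line `VERDICT: REVISE` is inferred from the `REVISE` outcome named in the subroutine steps rather than quoted from the contract.

```bash
# Hypothetical parser: print APPROVED, REVISE, or UNKNOWN for a review file.
# REVISE is checked first so a review containing both lines is never
# treated as approved.
parse_verdict() {
  local review_file="$1"
  if grep -qE '^VERDICT: REVISE[[:space:]]*$' "$review_file"; then
    echo "REVISE"
  elif grep -qE '^VERDICT: APPROVED[[:space:]]*$' "$review_file"; then
    echo "APPROVED"
  else
    echo "UNKNOWN"
    return 1
  fi
}
```

A stricter check would also confirm that the `P0`, `P1`, and `P2` sections contain only `- None.` before accepting an approval; that depends on the exact section headings, which are not specified here.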

## Runtime Artifacts

Per review loop (`<kind>` = `plan` or `implementation`):

- `/tmp/do-task-<kind>-<REVIEW_ID>.md` — payload
- `/tmp/do-task-<kind>-review-<REVIEW_ID>.md` — normalized review text
- `/tmp/do-task-<kind>-review-<REVIEW_ID>.json` — raw JSON (cursor always; opencode with `--format json`)
- `/tmp/do-task-<kind>-review-<REVIEW_ID>.stderr` — reviewer stderr
- `/tmp/do-task-<kind>-review-<REVIEW_ID>.status` — helper heartbeat/status log
- `/tmp/do-task-<kind>-review-<REVIEW_ID>.runner.out` — helper-managed stdout
- `/tmp/do-task-<kind>-review-<REVIEW_ID>.sh` — reviewer command script

Status log lines use this format:

```text
ts=<ISO-8601> level=<info|warn|error> state=<running-silent|running-active|in-progress|stall-warning|completed|completed-empty-output|failed|needs-operator-decision> elapsed_s=<int> pid=<int> stdout_bytes=<int> stderr_bytes=<int> note="<short message>"
```

`in-progress` is the liveness heartbeat emitted roughly once per minute with `note="In progress N"`.
`stall-warning` is a non-terminal status-log state only. It does not mean the caller should stop waiting if `in-progress` heartbeats continue.
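
Reading a field out of a status line in the format above can be sketched with standard tools. The function name `status_field` is hypothetical, and the sketch assumes `note` is always the last field on the line, as in the format shown.

```bash
# Hypothetical helper: print the value of one key from a status log line.
status_field() {
  local line="$1" key="$2"
  if [ "$key" = "note" ]; then
    # note is quoted and may contain spaces, so take everything inside the quotes.
    printf '%s\n' "$line" | sed -n 's/.*note="\(.*\)"$/\1/p'
  else
    # Other values are single tokens; split on spaces and keep the first match.
    printf '%s\n' "$line" | tr ' ' '\n' | sed -n "s/^${key}=//p" | head -n 1
  fi
}
```

For example, `status_field "$(tail -n 1 "$status_file")" state` would return the most recent state, where `$status_file` stands for the `.status` path listed above.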

### Persistent Artifact

The one file kept across runs is `ai_plan/<slug>/task-plan.md`. Its `Status` enum drives resume decisions:

| Status | Meaning |
|---|---|
| `draft` | Newly created; plan review not yet started |
| `plan-approved` | Plan review loop returned APPROVED |
| `implementation-in-progress` | Phase 6 executing |
| `implementation-approved` | Phase 8 review loop returned APPROVED; awaiting commit |
| `pushed` | Committed + pushed to remote |
| `local-only` | Committed locally; user declined push |
| `aborted-plan-review` | MAX_ROUNDS reached in Phase 5; user aborted |
| `aborted-impl-review` | MAX_ROUNDS reached in Phase 8; user aborted |
| `aborted-verification` | Phase 7 retries exhausted; user aborted |
| `failed` | Hard tooling failure |

## Failure Handling

- `completed-empty-output` — the reviewer exited without producing review text; surface `.stderr` and `.status`, then retry only after diagnosing the cause.
- `needs-operator-decision` — the helper reached hard-timeout escalation; surface `.status` and decide whether to extend the timeout, abort, or retry with different parameters.
- Successful rounds clean up temp artifacts. Failed, empty-output, and operator-decision rounds retain `.stderr`, `.status`, and `.runner.out` until diagnosed.
- Verification gate (Phase 7) retries up to 3 times. On exhaustion, `Status` becomes `aborted-verification` and the user is asked whether to retry, override, or abort.
- As long as fresh `in-progress` heartbeats continue to arrive roughly once per minute, the caller keeps waiting.
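
The bounded retry of the verification gate could be sketched as below. `run_verification_gate` is a hypothetical name, and the verification command is passed in by the caller because the concrete lint, typecheck, and test commands depend on the project. In the real workflow the agent would fix the failures between attempts; the sketch shows only the attempt counting.

```bash
# Hypothetical gate: run a verification command up to <max_attempts> times.
# Usage: run_verification_gate <max_attempts> <command> [args...]
run_verification_gate() {
  local max_attempts="$1"
  shift
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      echo "verification passed on attempt $attempt"
      return 0
    fi
    attempt=$((attempt + 1))
  done
  # Retries exhausted: the caller would set Status and ask the user.
  echo "verification retries exhausted" >&2
  return 1
}
```

With the documented limit this would be called as `run_verification_gate 3 <command>`.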

## Secret Scan (subroutine step 1a; per-payload; no caching)

Every outbound reviewer payload is scanned **before** being sent to the reviewer CLI. This scan runs on every round of both loops. No results are cached, because the Phase 8 payload includes newly-introduced diff content that earlier rounds never saw.

Canonical anchored regex list (10 patterns):

```
AWS access key:     AKIA[0-9A-Z]{16}
GCP service-acct:   "type"\s*:\s*"service_account"
GitHub tokens:      (ghp|gho|ghs|ghu|ghr)_[A-Za-z0-9]{36,}
Slack tokens:       xox[abpsr]-[0-9]+-[0-9]+-[0-9]+-[A-Za-z0-9]{24,}
                    xox[abpsr]-[A-Za-z0-9]{10,48}
OpenAI API keys:    sk-(proj-)?[A-Za-z0-9_-]{20,}
Anthropic API keys: sk-ant-(api|admin)[0-9]+-[A-Za-z0-9_-]{20,}
PEM private keys:   -----BEGIN [A-Z ]+ PRIVATE KEY-----
.env-style:         (TOKEN|SECRET|PASSWORD|API_?KEY|ACCESS_?KEY)\s*=\s*["']?[A-Za-z0-9+/=_-]{8,}
JWT:                eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+
```

If a match is found, the skill **redacts the matched text before showing it to the user** using the fixed token `[REDACTED:<pattern-label>:<match-length>-chars]` (pattern labels: `aws-access-key`, `gcp-service-account`, `github-token`, `slack-token`, `openai-key`, `anthropic-key`, `pem-private-key`, `dotenv-style`, `jwt`). File paths and line numbers are kept. Raw match text is never echoed to terminal, chat log, or any persistent file.

The user answers `yes` / `no` / `redact`:
- `yes` — proceed; Runtime State records `last_scan_outcome_<kind>=user-approved-with-matches`.
- `redact` — the user supplies redactions, the skill applies them, and re-scans before sending. Runtime State records `last_scan_outcome_<kind>=redacted-and-approved`.
- `no` — stop the loop, set `Status: failed`, send Telegram summary.
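
The scan-and-redact behavior can be illustrated with `grep -oE` over a subset of the patterns. This is a sketch, not the skill's implementation: the function names `scan_pattern` and `scan_payload` are hypothetical, only four of the ten patterns are included, and the report uses the payload path and line number as the location.

```bash
# Hypothetical helper: report every match of one pattern as a redaction token.
# The matched text is only measured, never printed.
scan_pattern() {
  local label="$1" regex="$2" payload="$3" lineno match
  grep -noE -e "$regex" "$payload" | while IFS=: read -r lineno match; do
    printf '%s:%s: [REDACTED:%s:%d-chars]\n' \
      "$payload" "$lineno" "$label" "${#match}"
  done
}

# Hypothetical scanner: returns 0 when the payload is clean, 1 when any
# pattern matched (after printing the redacted report).
scan_payload() {
  local payload="$1" report
  report="$(
    scan_pattern aws-access-key 'AKIA[0-9A-Z]{16}' "$payload"
    scan_pattern github-token '(ghp|gho|ghs|ghu|ghr)_[A-Za-z0-9]{36,}' "$payload"
    scan_pattern pem-private-key '-----BEGIN [A-Z ]+ PRIVATE KEY-----' "$payload"
    scan_pattern jwt 'eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+' "$payload"
  )"
  if [ -z "$report" ]; then
    return 0
  fi
  printf '%s\n' "$report"
  return 1
}
```

Passing each regex with `-e` keeps patterns that start with a dash, such as the PEM header, from being parsed as options. Patterns that use `\s` would need `[[:space:]]` to stay portable across `grep` implementations.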

## Supported Reviewer CLIs

| CLI | Round-1 command | Round-N resume | Output capture |
|---|---|---|---|
| `codex` | `codex exec -m <model> -s read-only -o <out.md> "<prompt>"` | `codex exec resume <session-id> -o <out.md> "<prompt>"` | `<out.md>` directly (helper `--success-file`) |
| `claude` | `claude -p "<prompt>" --model <model> --strict-mcp-config --setting-sources user` | Fresh call with prior-round context summary | `cp <runner.out> <out.md>` |
| `cursor` | `cursor-agent -p --mode=ask --model <model> --trust --output-format json "<prompt>" > <out.json>` | `cursor-agent --resume <id> -p --mode=ask --model <model> --trust --output-format json "<prompt>" > <out.json>` | `jq -r '.result' <out.json> > <out.md>` |
| `opencode` | `opencode run -m <provider>/<model> --agent plan --format json "<prompt>" > <out.json>` | Fresh call (default) OR `opencode run -s <id> -m <provider>/<model> --agent plan --format json "<prompt>" > <out.json>` (opt-in) | `jq -r '.[] \| select(.type == "message" and .role == "assistant") \| .content' <out.json> > <out.md>` |

For all four CLIs, the preferred execution path is:

1. Write the reviewer command to a bash script.
2. Run that script through `reviewer-runtime/run-review.sh`.
3. Fall back to direct synchronous execution only if the helper is missing or not executable.
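
The three steps above might be wired together as follows. `run_reviewer` and its argument order are hypothetical; the helper flags are the ones documented for `run-review.sh` in this guide, and the output suffixes follow the Runtime Artifacts list.

```bash
# Hypothetical dispatcher: prefer the helper, fall back to direct execution.
# Usage: run_reviewer <helper-path> <command-file> <output-prefix>
run_reviewer() {
  local helper="$1" command_file="$2" out_prefix="$3"
  if [ -x "$helper" ]; then
    "$helper" \
      --command-file "$command_file" \
      --stdout-file "$out_prefix.runner.out" \
      --stderr-file "$out_prefix.stderr" \
      --status-file "$out_prefix.status"
  else
    # Fallback: synchronous run with no heartbeats or timeout handling.
    bash "$command_file" > "$out_prefix.runner.out" 2> "$out_prefix.stderr"
  fi
}
```

In the fallback path no `.status` file is written, so a caller that relies on heartbeats has to treat the run as a plain blocking command.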

## Notifications

- Telegram is the only supported notification path.
- Shared setup: [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md)
- Notification failures are non-blocking, but they must be surfaced to the user.
- Before stopping for any user interaction, approval, or manual decision, the skill sends a Telegram summary first if configured.
- Terminal outcomes that trigger Telegram: `pushed`, `local-only`, `aborted-plan-review`, `aborted-impl-review`, `aborted-verification`, `failed`.

The reviewer-runtime helper also supports manual override flags for diagnostics:

```bash
run-review.sh \
  --command-file <path> \
  --stdout-file <path> \
  --stderr-file <path> \
  --status-file <path> \
  --poll-seconds 10 \
  --soft-timeout-seconds 600 \
  --stall-warning-seconds 300 \
  --hard-timeout-seconds 1800
```

## Template Guardrails

All four `templates/task-plan.md` files share identical core sections (14 `## `-level headings) and identical Status enum (10 values). Variant-specific guardrail language is permitted in the leading blockquote and in the `Runtime` field of the Metadata table.

**Core sections** (appear in every variant, same order):

1. Metadata
2. Prompt
3. Interpretation
4. Assumptions
5. Files
6. Approach
7. TDD Approach
8. Acceptance Criteria
9. Verification
10. Rollback
11. Runtime State
12. Review History
13. Final Status
14. Guardrails (do NOT remove)

**Runtime State keys** (same across all variants): `plan_review_round`, `implementation_review_round`, `CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, `OPENCODE_IMPL_SESSION_ID`, `last_phase_entered`, `last_round_ts`, `last_scan_outcome_plan`, `last_scan_outcome_impl`, `verification_attempts`, `tests_added_count`, `tdd_used`.

## Variant Hardening Notes

### Claude Code

- Must invoke explicit required sub-skills via the `Skill` tool:
  - `superpowers:brainstorming`
  - `superpowers:test-driven-development`
  - `superpowers:verification-before-completion`
  - `superpowers:finishing-a-development-branch`
  - `superpowers:using-git-worktrees` (conditional)
- Must enforce plan-mode file-write guard in Phase 4:
  - If currently in plan mode, instruct user to exit plan mode before writing `task-plan.md`.

### Codex

- Must use native skill discovery from `~/.agents/skills/` (no CLI wrappers).
- Must verify Superpowers skills symlink: `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills`
- Must invoke required sub-skills with explicit announcements before any action.
- Must track checklist-driven sub-skills with `update_plan` todos (Codex equivalent of `TodoWrite`).
- `Task` subagents are unavailable — do the work directly and state the limitation.
- Deprecated CLI commands (`superpowers-codex bootstrap`, `use-skill`) must NOT be used.
- Helper paths: `~/.codex/skills/reviewer-runtime/...`.
- No plan-mode guard (Codex has no plan-mode concept).

### OpenCode

- Must use OpenCode's native skill tool (not Claude's `Skill` tool syntax, not Codex's `~/.agents/skills/` paths).
- Phase 1 includes a Bootstrap Superpowers Context step that lists installed skills and confirms the required `superpowers/<skill>` set is discoverable before any other phase runs.
- Must verify Superpowers skill discovery under `~/.config/opencode/skills/superpowers`.
- Helper paths: `~/.config/opencode/skills/reviewer-runtime/...`.
- OpenCode reviewer calls MUST use `--agent plan` (the built-in plan primary agent) for read-only posture.
- No plan-mode guard (OpenCode has no plan-mode concept).

### Cursor

- Must use workspace discovery from `.cursor/skills/` (repo-local) or `~/.cursor/skills/` (global).
- Must announce skill usage explicitly before invocation.
- `jq` is a hard prerequisite.
- Helper paths: `.cursor/skills/reviewer-runtime/...` preferred, `~/.cursor/skills/reviewer-runtime/...` fallback.
- Reviewer invocations MUST use `--mode=ask --trust --output-format json`. Never `--mode=agent`, never `--force`, never write-capable modes for reviewer calls.
- No plan-mode guard (Cursor has no plan-mode concept).

## Execution Workflow Rules

- The skill works from `ai_plan/YYYY-MM-DD-<slug>/task-plan.md` as its single persistent artifact.
- Current branch is the default; worktree is opt-in only through explicit trigger phrases.
- Plan review completes before any implementation starts.
- Phase 7 verification gate must pass before the implementation review starts.
- The task commit is a single commit created in Phase 9.
- The `.gitignore` infra commit (Phase 1) is explicitly separate from the task commit and is allowed even when the final task ends up `aborted` or `failed`.
- No push without explicit `yes` from the user.
- Secret scan runs per-payload with no caching.
- `MAX_ROUNDS=10` is shared across both loops (single mental model).
@@ -6,6 +6,7 @@ This directory contains user-facing docs for each skill.

- [ATLASSIAN.md](./ATLASSIAN.md) — Includes requirements, generated bundle sync, install, auth, safety rules, and usage examples for the Atlassian skill.
- [CREATE-PLAN.md](./CREATE-PLAN.md) — Includes requirements, install, verification, and execution workflow for create-plan.
- [DO-TASK.md](./DO-TASK.md) — Single-prompt end-to-end execution with dual reviewer loops (plan + implementation), TDD-first, single task commit. Sibling of create-plan/implement-plan scoped to ad-hoc tasks.
- [IMPLEMENT-PLAN.md](./IMPLEMENT-PLAN.md) — Includes requirements, install, verification, and milestone review workflow for implement-plan.
- [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md) — Shared Telegram notification setup used by reviewer-driven skills.
- [WEB-AUTOMATION.md](./WEB-AUTOMATION.md) — Includes requirements, install, dependency verification, and usage examples for web-automation.