docs(do-task): add DO-TASK.md + README updates (M6)

docs/DO-TASK.md covers:
- Purpose, Requirements (variant-specific prereqs + dependency-missing messages per variant), Reviewer CLI Requirements table (4 CLIs including opencode with fresh-call default).
- Install (4 subsections: Codex, Claude Code, OpenCode, Cursor).
- Per-variant Verify Installation subsections checking CLI binary, SKILL.md, run-review.sh, notify-telegram.sh, Superpowers sub-skills, and variant extras (Codex symlink, Cursor jq, OpenCode Superpowers ls, Cursor repo-vs-global lookup).
- Key Behavior, Dual Review Loops, Subroutine Steps, Reviewer Output Contract, Runtime Artifacts, Persistent Artifact Status enum (10 values), Failure Handling.
- Secret Scan (subroutine step 1a; per-payload; no caching) with canonical 10-pattern regex list and redaction contract.
- Supported Reviewer CLIs table (4 rows, including opencode).
- Notifications, Template Guardrails (14 core sections + Runtime State keys), Variant Hardening Notes (4 subsections), Execution Workflow Rules.

docs/README.md adds DO-TASK.md entry.

README.md:
- Skills table adds 4 do-task rows (codex, claude-code, opencode, cursor).
- Docs links add "Do-task guide" entry.
- Repository Layout adds do-task/ subdirectory.

Reviewer: codex / gpt-5.4. Approved round 2:
- Round 1: 2 P2 (prereqs inaccurate, Verify Installation incomplete) + 1 P3 -> REVISE.
- Round 2: 0 P0/P1/P2/P3 -> APPROVED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:40:49 -05:00
parent f5161f584d
commit 9853d4937b
3 changed files with 408 additions and 0 deletions
@@ -34,6 +34,11 @@ ai-coding-skills/
 │   │   ├── claude-code/
 │   │   ├── opencode/
 │   │   └── cursor/
+│   ├── do-task/
+│   │   ├── codex/
+│   │   ├── claude-code/
+│   │   ├── opencode/
+│   │   └── cursor/
 │   ├── implement-plan/
 │   │   ├── codex/
 │   │   ├── claude-code/
@@ -64,6 +69,10 @@ ai-coding-skills/
 | create-plan | claude-code | Structured planning with milestones, iterative cross-model review, and runbook-first execution workflow | Ready | [CREATE-PLAN](docs/CREATE-PLAN.md) |
 | create-plan | opencode | Structured planning with milestones, iterative cross-model review, and runbook-first execution workflow | Ready | [CREATE-PLAN](docs/CREATE-PLAN.md) |
 | create-plan | cursor | Structured planning with milestones, iterative cross-model review, and runbook-first execution workflow | Ready | [CREATE-PLAN](docs/CREATE-PLAN.md) |
+| do-task | codex | Single-prompt end-to-end execution with dual reviewer loops (plan + implementation), TDD-first, single task commit | Ready | [DO-TASK](docs/DO-TASK.md) |
+| do-task | claude-code | Single-prompt end-to-end execution with dual reviewer loops (plan + implementation), TDD-first, single task commit | Ready | [DO-TASK](docs/DO-TASK.md) |
+| do-task | opencode | Single-prompt end-to-end execution with dual reviewer loops (plan + implementation), TDD-first, single task commit | Ready | [DO-TASK](docs/DO-TASK.md) |
+| do-task | cursor | Single-prompt end-to-end execution with dual reviewer loops (plan + implementation), TDD-first, single task commit | Ready | [DO-TASK](docs/DO-TASK.md) |
 | implement-plan | codex | Worktree-isolated plan execution with iterative cross-model milestone review | Ready | [IMPLEMENT-PLAN](docs/IMPLEMENT-PLAN.md) |
 | implement-plan | claude-code | Worktree-isolated plan execution with iterative cross-model milestone review | Ready | [IMPLEMENT-PLAN](docs/IMPLEMENT-PLAN.md) |
 | implement-plan | opencode | Worktree-isolated plan execution with iterative cross-model milestone review | Ready | [IMPLEMENT-PLAN](docs/IMPLEMENT-PLAN.md) |
@@ -75,6 +84,7 @@ ai-coding-skills/
 - Docs index: `docs/README.md`
 - Atlassian guide: `docs/ATLASSIAN.md`
 - Create-plan guide: `docs/CREATE-PLAN.md`
+- Do-task guide: `docs/DO-TASK.md`
 - Implement-plan guide: `docs/IMPLEMENT-PLAN.md`
 - Web-automation guide: `docs/WEB-AUTOMATION.md`

@@ -0,0 +1,397 @@
+# DO-TASK
+
+## Purpose
+
+Execute a single user-supplied prompt end-to-end with **two reviewer loops** (plan review + implementation review), with TDD-first execution, a pre-implementation verification gate, and a single task commit — all in one run of the skill. `do-task` is scoped to small-to-medium ad-hoc tasks; for multi-milestone work use `create-plan` + `implement-plan` instead.
+
+`do-task` persists one plan artifact per run: `ai_plan/YYYY-MM-DD-<slug>/task-plan.md`. The folder is kept as a record after success (not deleted). Resume is supported via the `Status` enum and Runtime State fields.
+
+## Requirements
+
+- Git repo with `/ai_plan/` entry in `.gitignore` (the skill adds the entry automatically if missing and commits it as a separate infra commit).
+- Superpowers skills installed from: https://github.com/obra/superpowers
+- Required dependencies (vary by variant; see Install below):
+  - `superpowers:brainstorming` (or `superpowers/brainstorming` for OpenCode)
+  - `superpowers:test-driven-development`
+  - `superpowers:verification-before-completion`
+  - `superpowers:finishing-a-development-branch`
+  - `superpowers:using-git-worktrees` (only when the prompt opts in to a worktree)
+- For Codex, native skill discovery must be configured:
+  - `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills`
+- For Cursor, skills must be installed under `.cursor/skills/` (repo-local) or `~/.cursor/skills/` (global), and `jq` is a hard prerequisite.
+- For OpenCode, Superpowers must be installed at `~/.config/opencode/skills/superpowers`.
+- Shared reviewer runtime (`run-review.sh`) AND Telegram notifier helper (`notify-telegram.sh`) must be installed beside agent skills. Both scripts ship under `skills/reviewer-runtime/` in this repo and must be copied into the per-variant location:
+  - Codex: `~/.codex/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}`
+  - Claude Code: `~/.claude/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}`
+  - OpenCode: `~/.config/opencode/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}`
+  - Cursor: `.cursor/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}` (repo-local, preferred) or `~/.cursor/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}` (global fallback)
+- Variant-specific prerequisites:
+  - **Claude Code:** `claude --version`, explicit `Skill`-tool invocation of sub-skills.
+  - **Codex:** `codex --version`; `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills` symlink present.
+  - **Cursor:** `cursor-agent --version`, `jq --version` (hard prereq), Superpowers installed under `.cursor/skills/` or `~/.cursor/skills/`.
+  - **OpenCode:** `opencode --version`; Superpowers installed at `~/.config/opencode/skills/superpowers`; Phase 1 runs Bootstrap Superpowers Context.
+- Telegram notification setup is documented in [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md)
+
+Dependency-missing messages are variant-specific:
+
+- **Claude Code:** `Missing dependency: [specific missing item]. Install required Superpowers skills (https://github.com/obra/superpowers) and the reviewer-runtime helper, then retry.`
+- **Codex:** `Missing dependency: [specific missing item]. Install required Superpowers skills (https://github.com/obra/superpowers) and the reviewer-runtime helper, then retry.`
+- **Cursor:** `Missing dependency: [specific missing item]. Install Cursor Agent CLI, jq, and Superpowers skills under .cursor/skills/ or ~/.cursor/skills/, then retry.`
+- **OpenCode:** `Missing dependency: [specific missing item]. Install required OpenCode Superpowers skills (https://github.com/obra/superpowers, OpenCode setup) and the reviewer-runtime helper, then retry.`
+
+### Reviewer CLI Requirements
+
+At least one of these CLIs must be installed to act as the reviewer for the two review loops:
+
+| Reviewer CLI | Install | Verify | Read-Only Mode | Session Resume |
+|---|---|---|---|---|
+| `codex` | `npm install -g @openai/codex` | `codex --version` | `-s read-only` | Yes (`codex exec resume <id>`) |
+| `claude` | `npm install -g @anthropic-ai/claude-code` | `claude --version` | `--strict-mcp-config --setting-sources user` | No (fresh call each round) |
+| `cursor` | `curl https://cursor.com/install -fsS \| bash` | `cursor-agent --version` (binary: `cursor-agent`; alias `cursor agent` also works) | `--mode=ask` | Yes (`--resume <id>`) |
+| `opencode` | `brew install opencode` or your package manager | `opencode --version` | `--agent plan` | Opt-in (`-s <id>`; fresh call is the default) |
+
+The reviewer CLI is independent of which agent is running the skill — e.g., Claude Code can send both the plan and the implementation to Codex for review.
+
+**Additional dependency for the `cursor` reviewer:** `jq` is required to parse Cursor's JSON output. Install via `brew install jq` (macOS) or your package manager. Verify: `jq --version`. The Cursor variant of `do-task` makes `jq` a hard prerequisite regardless of which reviewer CLI is selected.
+
+## Install
+
+### Codex
+
+```bash
+mkdir -p ~/.codex/skills/do-task
+cp -R skills/do-task/codex/* ~/.codex/skills/do-task/
+mkdir -p ~/.codex/skills/reviewer-runtime
+cp -R skills/reviewer-runtime/* ~/.codex/skills/reviewer-runtime/
+```
+
+### Claude Code
+
+```bash
+mkdir -p ~/.claude/skills/do-task
+cp -R skills/do-task/claude-code/* ~/.claude/skills/do-task/
+mkdir -p ~/.claude/skills/reviewer-runtime
+cp -R skills/reviewer-runtime/* ~/.claude/skills/reviewer-runtime/
+```
+
+### OpenCode
+
+```bash
+mkdir -p ~/.config/opencode/skills/do-task
+cp -R skills/do-task/opencode/* ~/.config/opencode/skills/do-task/
+mkdir -p ~/.config/opencode/skills/reviewer-runtime
+cp -R skills/reviewer-runtime/* ~/.config/opencode/skills/reviewer-runtime/
+```
+
+### Cursor
+
+Copy into the repo-local `.cursor/skills/` directory (where the Cursor Agent CLI discovers skills):
+
+```bash
+mkdir -p .cursor/skills/do-task
+cp -R skills/do-task/cursor/* .cursor/skills/do-task/
+mkdir -p .cursor/skills/reviewer-runtime
+cp -R skills/reviewer-runtime/* .cursor/skills/reviewer-runtime/
+```
+
+Or install globally (loaded via `~/.cursor/skills/`):
+
+```bash
+mkdir -p ~/.cursor/skills/do-task
+cp -R skills/do-task/cursor/* ~/.cursor/skills/do-task/
+mkdir -p ~/.cursor/skills/reviewer-runtime
+cp -R skills/reviewer-runtime/* ~/.cursor/skills/reviewer-runtime/
+```
+
+## Verify Installation
+
+Run the checks for your variant; together they cover everything the corresponding `SKILL.md` enforces. Each set covers: (1) CLI binary version, (2) skill file presence, (3) reviewer-runtime and notifier helper presence, (4) Superpowers sub-skill discovery, (5) variant-specific extras. The `test` commands print nothing; a non-zero exit status means the check failed.
+
+### Codex
+
+```bash
+codex --version
+test -f ~/.codex/skills/do-task/SKILL.md
+test -x ~/.codex/skills/reviewer-runtime/run-review.sh
+test -x ~/.codex/skills/reviewer-runtime/notify-telegram.sh
+test -L ~/.agents/skills/superpowers
+test -f ~/.agents/skills/superpowers/brainstorming/SKILL.md
+test -f ~/.agents/skills/superpowers/test-driven-development/SKILL.md
+test -f ~/.agents/skills/superpowers/verification-before-completion/SKILL.md
+test -f ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md
+```
+
+### Claude Code
+
+```bash
+claude --version
+test -f ~/.claude/skills/do-task/SKILL.md
+test -x ~/.claude/skills/reviewer-runtime/run-review.sh
+test -x ~/.claude/skills/reviewer-runtime/notify-telegram.sh
+test -f ~/.claude/skills/superpowers/brainstorming/SKILL.md
+test -f ~/.claude/skills/superpowers/test-driven-development/SKILL.md
+test -f ~/.claude/skills/superpowers/verification-before-completion/SKILL.md
+test -f ~/.claude/skills/superpowers/finishing-a-development-branch/SKILL.md
+```
+
+### OpenCode
+
+```bash
+opencode --version
+test -f ~/.config/opencode/skills/do-task/SKILL.md
+test -x ~/.config/opencode/skills/reviewer-runtime/run-review.sh
+test -x ~/.config/opencode/skills/reviewer-runtime/notify-telegram.sh
+ls -l ~/.config/opencode/skills/superpowers
+test -f ~/.config/opencode/skills/superpowers/brainstorming/SKILL.md
+test -f ~/.config/opencode/skills/superpowers/test-driven-development/SKILL.md
+test -f ~/.config/opencode/skills/superpowers/verification-before-completion/SKILL.md
+test -f ~/.config/opencode/skills/superpowers/finishing-a-development-branch/SKILL.md
+```
+
+### Cursor
+
+```bash
+cursor-agent --version
+jq --version
+test -f .cursor/skills/do-task/SKILL.md || test -f ~/.cursor/skills/do-task/SKILL.md
+test -x .cursor/skills/reviewer-runtime/run-review.sh || test -x ~/.cursor/skills/reviewer-runtime/run-review.sh
+test -x .cursor/skills/reviewer-runtime/notify-telegram.sh || test -x ~/.cursor/skills/reviewer-runtime/notify-telegram.sh
+test -f .cursor/skills/superpowers/skills/brainstorming/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/brainstorming/SKILL.md
+test -f .cursor/skills/superpowers/skills/test-driven-development/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/test-driven-development/SKILL.md
+test -f .cursor/skills/superpowers/skills/verification-before-completion/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/verification-before-completion/SKILL.md
+test -f .cursor/skills/superpowers/skills/finishing-a-development-branch/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/finishing-a-development-branch/SKILL.md
+```
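The `test` commands in the blocks above are silent and only set an exit status. As a minimal sketch, a small wrapper can print one result per check. `check_path` is a hypothetical helper name, not something the skill ships, and the two paths are the Claude Code locations listed above.

```bash
# Hypothetical helper (not shipped with do-task): report whether a path
# passes a `test` flag instead of relying on the silent exit status.
check_path() {
  # $1 = test flag (-f, -x, -L), $2 = path to check
  if test "$1" "$2"; then
    printf 'ok       %s\n' "$2"
  else
    printf 'MISSING  %s\n' "$2"
    return 1
  fi
}

# Example: two of the Claude Code checks from the list above.
check_path -f "$HOME/.claude/skills/do-task/SKILL.md" || true
check_path -x "$HOME/.claude/skills/reviewer-runtime/run-review.sh" || true
```

Swapping in the paths for another variant checks that variant the same way.
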
+
+## Key Behavior
+
+- Creates one persistent plan artifact at `ai_plan/YYYY-MM-DD-<slug>/task-plan.md`.
+- Ensures `/ai_plan/` is in `.gitignore`. If missing, adds it and creates a separate `chore(gitignore): ignore ai_plan local planning artifacts` commit.
+- Parses the user prompt, detects the trigger phrase, and asks 1-3 clarifying questions unless the prompt already has a concrete target + outcome + unambiguous scope + resolvable identifiers.
+- Invokes `superpowers:brainstorming` for any behavior-changing task (feature creation, non-trivial bug fix, refactor, design decision). The only skip conditions are `pure-documentation` and `pure-comment-whitespace-rename`.
+- Asks which reviewer CLI, model, and max rounds to use (or accepts `skip` for no review). "Use defaults" maps to `codex / gpt-5.4 / MAX_ROUNDS=10`.
+- Runs the plan review loop (Phase 5) before implementation, iterating up to `MAX_ROUNDS` (default 10) or until the reviewer returns `VERDICT: APPROVED`.
+- Executes with TDD-first (Phase 6) via `superpowers:test-driven-development`. Auto-skip permitted only for `pure-documentation` and `pure-comment-whitespace-rename`; all other skips (including config-file additions) require explicit user approval, recorded in the TDD Approach section with an ISO-8601 timestamp.
+- Runs lint/typecheck/tests as a **verification gate** (Phase 7) before the implementation review loop.
+- Runs the implementation review loop (Phase 8) against the diff + verification output, iterating up to `MAX_ROUNDS` or until `APPROVED`.
+- Scans every outbound reviewer payload for secrets (subroutine step 1a). Per-payload, no caching.
+- Creates a **single commit** after the implementation review approves. Does NOT push. Asks the user for explicit `yes` before any push.
+- Defaults to the **current branch**. Worktree only on explicit opt-in (`"in a worktree"`, `"use a worktree"`, `"on an isolated branch"`, `"on a new branch called X"`).
+- Supports resume: detects an existing folder by slug and uses `Status` + Runtime State to decide how to re-enter.
+- Sends completion notifications through Telegram only when the shared setup in [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md) is installed and configured.
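The worktree opt-in rule above can be sketched as a phrase match. This is an illustration only: `wants_worktree` is a hypothetical name, and the case-insensitive substring match is an assumption about how the documented trigger phrases are detected.

```bash
# Hypothetical sketch: return 0 when the prompt contains one of the
# documented worktree opt-in phrases. Case-insensitive matching is assumed.
wants_worktree() {
  printf '%s\n' "$1" | grep -Eiq \
    'in a worktree|use a worktree|on an isolated branch|on a new branch called [^ ]+'
}

if wants_worktree "Fix the flaky login test in a worktree"; then
  echo "worktree"
else
  echo "current branch"
fi
```

A prompt without any of the phrases falls through to the current-branch default.
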
+
+## Dual Review Loops
+
+`do-task` runs two reviewer loops per successful run, with separate session IDs so reviewer context never leaks across loops.
+
+1. **Plan review loop (Phase 5)** — payload is the current `task-plan.md` with `Runtime State` and `Review History` stripped. The reviewer evaluates whether the plan matches the prompt, whether assumptions are surfaced, whether acceptance criteria are testable, whether the TDD approach is appropriate, and whether there are missing files/risks/security concerns.
+2. **Implementation review loop (Phase 8)** — payload is the approved task plan (without Runtime State) + `git diff` (unstaged + staged) + verification output (lint, typecheck, tests). The reviewer evaluates correctness, code quality, test coverage, security, and regression risk.
+
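+Both loops use the same 9-step subroutine and the same `MAX_ROUNDS` limit (default 10); each loop tracks its own round count in Runtime State (`plan_review_round`, `implementation_review_round`).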
+
+### Subroutine Steps (inside each review loop)
+
+1. Write payload to `/tmp/do-task-<kind>-<REVIEW_ID>.md`.
+2. **Secret scan (step 1a)** — per-payload, no caching. See Secret Scan section below.
+3. Generate reviewer command script at `/tmp/do-task-<kind>-review-<REVIEW_ID>.sh`.
+4. Run via `reviewer-runtime/run-review.sh`.
+5. Promote reviewer output and capture the session ID on Round 1; persist it to `task-plan.md` Runtime State under the loop-specific variable (`CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, or `OPENCODE_IMPL_SESSION_ID`).
+6. Parse verdict; append an entry to Review History; bump the round counter.
+7. Branch: `APPROVED` → exit, `REVISE` → caller revises and re-enters, `MAX_ROUNDS` → caller decides.
+8. Liveness contract: wait while `In progress N` heartbeats arrive from the runner.
+9. Cleanup temp artifacts on success.
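The branching in step 7 can be sketched as a bounded loop. `review_loop` and `run_round` are hypothetical names: `run_round` stands in for steps 1 through 6 and is stubbed here so the sketch runs on its own.

```bash
# Sketch of the approve/revise/round-limit branching. `run_round` is a
# hypothetical stand-in for steps 1-6 and must print APPROVED or REVISE.
MAX_ROUNDS=${MAX_ROUNDS:-10}

review_loop() {
  round=1
  while [ "$round" -le "$MAX_ROUNDS" ]; do
    verdict=$(run_round "$round")
    if [ "$verdict" = "APPROVED" ]; then
      echo "approved after $round round(s)"
      return 0
    fi
    # REVISE: the caller would revise the plan or code here, then re-enter.
    round=$((round + 1))
  done
  echo "MAX_ROUNDS reached; caller decides"
  return 1
}

# Stub reviewer for illustration: approves on the third round.
run_round() {
  if [ "$1" -ge 3 ]; then echo APPROVED; else echo REVISE; fi
}

review_loop
```

With the stub above the loop stops at round 3; lowering `MAX_ROUNDS` below 3 exercises the round-limit branch instead.
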
+
+### Reviewer Output Contract
+
+- `P0` = total blocker
+- `P1` = major risk
+- `P2` = must-fix before approval
+- `P3` = cosmetic / nice to have
+- Each severity section uses `- None.` when empty.
+- `VERDICT: APPROVED` is valid only when no `P0`, `P1`, or `P2` findings remain.
+- `P3` findings are non-blocking, but the caller should still try to fix them when cheap and safe.
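As a minimal sketch of the verdict parsing, assuming the reviewer prints the verdict on its own line as `VERDICT: APPROVED` or `VERDICT: REVISE`. `parse_verdict` is a hypothetical name, and the sample review text is invented for illustration. The sketch does not cross-check the severity sections, so a real caller would still need to confirm that no `P0`, `P1`, or `P2` findings remain.

```bash
# Sketch: extract the verdict from normalized review text. Assumes the
# verdict sits on its own line as `VERDICT: APPROVED` or `VERDICT: REVISE`.
parse_verdict() {
  # $1 = path to the review text; the last verdict line wins.
  verdict=$(grep -E '^VERDICT: (APPROVED|REVISE)[[:space:]]*$' "$1" | tail -n 1) || true
  case "$verdict" in
    "VERDICT: APPROVED"*) echo APPROVED ;;
    "VERDICT: REVISE"*)   echo REVISE ;;
    *)                    echo UNKNOWN ;;
  esac
}

# Invented sample review, written to a temp file for the demonstration.
review=$(mktemp)
printf '%s\n' 'P3' 'Minor naming nit.' 'VERDICT: APPROVED' > "$review"
parse_verdict "$review"
rm -f "$review"
```

Returning `UNKNOWN` for anything else keeps a malformed review from being mistaken for an approval.
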
+
+## Runtime Artifacts
+
+Per review loop (`<kind>` = `plan` or `implementation`):
+
+- `/tmp/do-task-<kind>-<REVIEW_ID>.md` — payload
+- `/tmp/do-task-<kind>-review-<REVIEW_ID>.md` — normalized review text
+- `/tmp/do-task-<kind>-review-<REVIEW_ID>.json` — raw JSON (cursor always; opencode with `--format json`)
+- `/tmp/do-task-<kind>-review-<REVIEW_ID>.stderr` — reviewer stderr
+- `/tmp/do-task-<kind>-review-<REVIEW_ID>.status` — helper heartbeat/status log
+- `/tmp/do-task-<kind>-review-<REVIEW_ID>.runner.out` — helper-managed stdout
+- `/tmp/do-task-<kind>-review-<REVIEW_ID>.sh` — reviewer command script
+
+Status log lines use this format:
+
+```text
+ts=<ISO-8601> level=<info|warn|error> state=<running-silent|running-active|in-progress|stall-warning|completed|completed-empty-output|failed|needs-operator-decision> elapsed_s=<int> pid=<int> stdout_bytes=<int> stderr_bytes=<int> note="<short message>"
+```
+
+`in-progress` is the liveness heartbeat emitted roughly once per minute with `note="In progress N"`.
+`stall-warning` is a non-terminal status-log state only. It does not mean the caller should stop waiting if `in-progress` heartbeats continue.
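A minimal sketch of reading one field from a status-log line in the format above. `status_field` is a hypothetical helper, the sample line uses made-up values, and the approach only handles unquoted fields, so it is not suitable for `note`.

```bash
# Sketch: pull one key=value field out of a status-log line.
# Works for unquoted fields such as ts, level, state, elapsed_s, and pid.
status_field() {
  # $1 = status line, $2 = field name
  # A leading space is added so the first field matches the same pattern.
  printf ' %s\n' "$1" | sed -n "s/.* $2=\([^ ]*\).*/\1/p"
}

# Sample line with invented values.
line='ts=2026-01-01T00:00:00Z level=info state=in-progress elapsed_s=120 pid=4242 stdout_bytes=0 stderr_bytes=0 note="In progress 2"'
status_field "$line" state
status_field "$line" elapsed_s
```

A caller could poll the last line of the `.status` file this way and keep waiting while `state` stays `in-progress`.
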
+
+### Persistent Artifact
+
+The one file kept across runs is `ai_plan/YYYY-MM-DD-<slug>/task-plan.md`. Its `Status` enum drives resume decisions:
+
+| Status | Meaning |
+|---|---|
+| `draft` | Newly created; plan review not yet started |
+| `plan-approved` | Plan review loop returned APPROVED |
+| `implementation-in-progress` | Phase 6 executing |
+| `implementation-approved` | Phase 8 review loop returned APPROVED; awaiting commit |
+| `pushed` | Committed + pushed to remote |
+| `local-only` | Committed locally; user declined push |
+| `aborted-plan-review` | MAX_ROUNDS reached in Phase 5; user aborted |
+| `aborted-impl-review` | MAX_ROUNDS reached in Phase 8; user aborted |
+| `aborted-verification` | Phase 7 retries exhausted; user aborted |
+| `failed` | Hard tooling failure |
+
+## Failure Handling
+
+- `completed-empty-output` — the reviewer exited without producing review text; surface `.stderr` and `.status`, then retry only after diagnosing the cause.
+- `needs-operator-decision` — the helper reached hard-timeout escalation; surface `.status` and decide whether to extend the timeout, abort, or retry with different parameters.
+- Successful rounds clean up temp artifacts. Failed, empty-output, and operator-decision rounds retain `.stderr`, `.status`, and `.runner.out` until diagnosed.
+- Verification gate (Phase 7) retries up to 3 times. On exhaustion, the user is asked whether to retry, override, or abort; choosing abort sets `Status` to `aborted-verification`.
+- As long as fresh `in-progress` heartbeats continue to arrive roughly once per minute, the caller keeps waiting.
+
+## Secret Scan (subroutine step 1a; per-payload; no caching)
+
+Every outbound reviewer payload is scanned **before** being sent to the reviewer CLI. This scan runs on every round of both loops. No results are cached, because the Phase 8 payload includes newly-introduced diff content that earlier rounds never saw.
+
+Canonical anchored regex list (10 patterns):
+
+```text
+AWS access key:     AKIA[0-9A-Z]{16}
+GCP service-acct:   "type"\s*:\s*"service_account"
+GitHub tokens:      (ghp|gho|ghs|ghu|ghr)_[A-Za-z0-9]{36,}
+Slack tokens:       xox[abpsr]-[0-9]+-[0-9]+-[0-9]+-[A-Za-z0-9]{24,}
+                    xox[abpsr]-[A-Za-z0-9]{10,48}
+OpenAI API keys:    sk-(proj-)?[A-Za-z0-9_-]{20,}
+Anthropic API keys: sk-ant-(api|admin)[0-9]+-[A-Za-z0-9_-]{20,}
+PEM private keys:   -----BEGIN [A-Z ]+ PRIVATE KEY-----
+.env-style:         (TOKEN|SECRET|PASSWORD|API_?KEY|ACCESS_?KEY)\s*=\s*["']?[A-Za-z0-9+/=_-]{8,}
+JWT:                eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+
+```
+
+If a match is found, the skill **redacts the matched text before showing it to the user** using the fixed token `[REDACTED:<pattern-label>:<match-length>-chars]` (pattern labels: `aws-access-key`, `gcp-service-account`, `github-token`, `slack-token`, `openai-key`, `anthropic-key`, `pem-private-key`, `dotenv-style`, `jwt`). File paths and line numbers are kept. Raw match text is never echoed to terminal, chat log, or any persistent file.
+
+The user answers `yes` / `no` / `redact`:
+- `yes` — proceed; Runtime State records `last_scan_outcome_<kind>=user-approved-with-matches`.
+- `redact` — the user supplies redactions, the skill applies them, and re-scans before sending. Runtime State records `last_scan_outcome_<kind>=redacted-and-approved`.
+- `no` — stop the loop, set `Status: failed`, send Telegram summary.
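The redaction contract above can be sketched for a single pattern. This is an illustration under stated assumptions: `redact_token` and `scan_aws_keys` are hypothetical names, only the AWS access key regex is wired in, and the key in the sample payload is a dummy value.

```bash
# Sketch: build the redaction token for one match and scan a payload for
# AWS access keys only. The remaining patterns would follow the same shape.
redact_token() {
  # $1 = pattern label, $2 = raw match (only its length is used)
  printf '[REDACTED:%s:%s-chars]\n' "$1" "${#2}"
}

scan_aws_keys() {
  # Prints "<line-number>:<token>" per match; the raw key is never printed.
  grep -Eno 'AKIA[0-9A-Z]{16}' "$1" | while IFS=: read -r line match; do
    printf '%s:%s\n' "$line" "$(redact_token aws-access-key "$match")"
  done
}

# Dummy payload for the demonstration.
payload=$(mktemp)
printf '%s\n' 'region = us-east-1' 'key = AKIAABCDEFGHIJKLMNOP' > "$payload"
scan_aws_keys "$payload"   # prints 2:[REDACTED:aws-access-key:20-chars]
rm -f "$payload"
```

Keeping the line number while dropping the match mirrors the rule that locations are shown but raw secrets are not.
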
+
+## Supported Reviewer CLIs
+
+| CLI | Round-1 command | Round-N resume | Output capture |
+|---|---|---|---|
+| `codex` | `codex exec -m <model> -s read-only -o <out.md> "<prompt>"` | `codex exec resume <session-id> -o <out.md> "<prompt>"` | `<out.md>` directly (helper `--success-file`) |
+| `claude` | `claude -p "<prompt>" --model <model> --strict-mcp-config --setting-sources user` | Fresh call with prior-round context summary | `cp <runner.out> <out.md>` |
+| `cursor` | `cursor-agent -p --mode=ask --model <model> --trust --output-format json "<prompt>" > <out.json>` | `cursor-agent --resume <id> -p --mode=ask --model <model> --trust --output-format json "<prompt>" > <out.json>` | `jq -r '.result' <out.json> > <out.md>` |
+| `opencode` | `opencode run -m <provider>/<model> --agent plan --format json "<prompt>" > <out.json>` | Fresh call (default) OR `opencode run -s <id> -m <provider>/<model> --agent plan --format json "<prompt>" > <out.json>` (opt-in) | `jq -r '.[] \| select(.type == "message" and .role == "assistant") \| .content' <out.json> > <out.md>` |
+
+For all four CLIs, the preferred execution path is:
+
+1. Write the reviewer command to a bash script.
+2. Run that script through `reviewer-runtime/run-review.sh`.
+3. Fall back to direct synchronous execution only if the helper is missing or not executable.
+
+## Notifications
+
+- Telegram is the only supported notification path.
+- Shared setup: [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md)
+- Notification failures are non-blocking, but they must be surfaced to the user.
+- Before stopping for any user interaction, approval, or manual decision, the skill sends a Telegram summary first if configured.
+- Terminal outcomes that trigger Telegram: `pushed`, `local-only`, `aborted-plan-review`, `aborted-impl-review`, `aborted-verification`, `failed`.
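As a small sketch, the terminal outcomes listed above can be checked with a `case` statement. `is_terminal_status` is a hypothetical name; the status values come from the `Status` enum.

```bash
# Sketch: return 0 for the Status values listed above as terminal outcomes.
is_terminal_status() {
  case "$1" in
    pushed|local-only|aborted-plan-review|aborted-impl-review|aborted-verification|failed)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

for s in draft plan-approved pushed failed; do
  if is_terminal_status "$s"; then echo "$s: terminal"; else echo "$s: in progress"; fi
done
```

Non-terminal statuses can still produce a Telegram summary when the skill pauses for a user decision, so this check only covers the end-of-run case.
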
+
+The reviewer-runtime helper also supports manual override flags for diagnostics:
+
+```bash
+run-review.sh \
+  --command-file <path> \
+  --stdout-file <path> \
+  --stderr-file <path> \
+  --status-file <path> \
+  --poll-seconds 10 \
+  --soft-timeout-seconds 600 \
+  --stall-warning-seconds 300 \
+  --hard-timeout-seconds 1800
+```
+
+## Template Guardrails
+
+All four `templates/task-plan.md` files share identical core sections (14 `## `-level headings) and an identical `Status` enum (10 values). Variant-specific guardrail language is permitted in the leading blockquote and in the `Runtime` field of the Metadata table.
+
+**Core sections** (appear in every variant, same order):
+
+1. Metadata
+2. Prompt
+3. Interpretation
+4. Assumptions
+5. Files
+6. Approach
+7. TDD Approach
+8. Acceptance Criteria
+9. Verification
+10. Rollback
+11. Runtime State
+12. Review History
+13. Final Status
+14. Guardrails (do NOT remove)
+
+**Runtime State keys** (same across all variants): `plan_review_round`, `implementation_review_round`, `CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, `OPENCODE_IMPL_SESSION_ID`, `last_phase_entered`, `last_round_ts`, `last_scan_outcome_plan`, `last_scan_outcome_impl`, `verification_attempts`, `tests_added_count`, `tdd_used`.
+
+## Variant Hardening Notes
+
+### Claude Code
+
+- Must invoke the required sub-skills explicitly via the `Skill` tool:
+  - `superpowers:brainstorming`
+  - `superpowers:test-driven-development`
+  - `superpowers:verification-before-completion`
+  - `superpowers:finishing-a-development-branch`
+  - `superpowers:using-git-worktrees` (conditional)
+- Must enforce plan-mode file-write guard in Phase 4:
+  - If currently in plan mode, instruct user to exit plan mode before writing `task-plan.md`.
+
+### Codex
+
+- Must use native skill discovery from `~/.agents/skills/` (no CLI wrappers).
+- Must verify Superpowers skills symlink: `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills`
+- Must invoke required sub-skills with explicit announcements before any action.
+- Must track checklist-driven sub-skills with `update_plan` todos (Codex equivalent of `TodoWrite`).
+- `Task` subagents are unavailable — do the work directly and state the limitation.
+- Deprecated CLI commands (`superpowers-codex bootstrap`, `use-skill`) must NOT be used.
+- Helper paths: `~/.codex/skills/reviewer-runtime/...`.
+- No plan-mode guard (Codex has no plan-mode concept).
+
+### OpenCode
+
+- Must use OpenCode's native skill tool (not Claude's `Skill` tool syntax, not Codex's `~/.agents/skills/` paths).
+- Phase 1 includes a Bootstrap Superpowers Context step that lists installed skills and confirms the required `superpowers/<skill>` set is discoverable before any other phase runs.
+- Must verify Superpowers skill discovery under `~/.config/opencode/skills/superpowers`.
+- Helper paths: `~/.config/opencode/skills/reviewer-runtime/...`.
+- OpenCode reviewer calls MUST use `--agent plan` (the built-in plan primary agent) for read-only posture.
+- No plan-mode guard (OpenCode has no plan-mode concept).
+
+### Cursor
+
+- Must use workspace discovery from `.cursor/skills/` (repo-local) or `~/.cursor/skills/` (global).
+- Must announce skill usage explicitly before invocation.
+- `jq` is a hard prerequisite.
+- Helper paths: `.cursor/skills/reviewer-runtime/...` preferred, `~/.cursor/skills/reviewer-runtime/...` fallback.
+- Reviewer invocations MUST use `--mode=ask --trust --output-format json`. Never `--mode=agent`, never `--force`, never write-capable modes for reviewer calls.
+- No plan-mode guard (Cursor has no plan-mode concept).
+
+## Execution Workflow Rules
+
+- The skill works from `ai_plan/YYYY-MM-DD-<slug>/task-plan.md` as its single persistent artifact.
+- Current branch is the default; worktree is opt-in only through explicit trigger phrases.
+- Plan review completes before any implementation starts.
+- Phase 7 verification gate must pass before the implementation review starts.
+- The task commit is a single commit created in Phase 9.
+- The `.gitignore` infra commit (Phase 1) is explicitly separate from the task commit and is allowed even when the task ends in an `aborted-*` or `failed` status.
+- No push without explicit `yes` from the user.
+- Secret scan runs per-payload with no caching.
+- `MAX_ROUNDS=10` is shared across both loops (single mental model).
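The `.gitignore` rule above can be sketched as an idempotent append. `ensure_ai_plan_ignored` is a hypothetical name, and the exact-line match on `/ai_plan/` is an assumption about how the skill detects an existing entry; the commit itself is left as a comment.

```bash
# Sketch: append /ai_plan/ to .gitignore only when the exact line is absent.
ensure_ai_plan_ignored() {
  # $1 = repository root (defaults to the current directory)
  gitignore="${1:-.}/.gitignore"
  if grep -qxF '/ai_plan/' "$gitignore" 2>/dev/null; then
    echo "already ignored"
  else
    # Add a newline first if the file exists and does not end with one.
    if [ -s "$gitignore" ] && [ -n "$(tail -c 1 "$gitignore")" ]; then
      printf '\n' >> "$gitignore"
    fi
    printf '%s\n' '/ai_plan/' >> "$gitignore"
    echo "added"
    # The skill would then commit this change on its own, for example:
    #   git add .gitignore
    #   git commit -m "chore(gitignore): ignore ai_plan local planning artifacts"
  fi
}
```

Calling the function twice on the same repository root prints `added` and then `already ignored`, which is what keeps the infra commit a one-time event.
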
@@ -6,6 +6,7 @@ This directory contains user-facing docs for each skill.

 - [ATLASSIAN.md](./ATLASSIAN.md) — Includes requirements, generated bundle sync, install, auth, safety rules, and usage examples for the Atlassian skill.
 - [CREATE-PLAN.md](./CREATE-PLAN.md) — Includes requirements, install, verification, and execution workflow for create-plan.
+- [DO-TASK.md](./DO-TASK.md) — Single-prompt end-to-end execution with dual reviewer loops (plan + implementation), TDD-first, single task commit. Sibling of create-plan/implement-plan scoped to ad-hoc tasks.
 - [IMPLEMENT-PLAN.md](./IMPLEMENT-PLAN.md) — Includes requirements, install, verification, and milestone review workflow for implement-plan.
 - [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md) — Shared Telegram notification setup used by reviewer-driven skills.
 - [WEB-AUTOMATION.md](./WEB-AUTOMATION.md) — Includes requirements, install, dependency verification, and usage examples for web-automation.