Perform code optimization and document cleanup (#1)
## Summary - add repository-wide quality tooling and verification scaffolding, including CI workflows, pnpm workspace setup, ESLint/Prettier/markdown checks, and generated-output verification helpers - reorganize skill sources and generation flow by introducing canonical `_source` variants, generator/manifests, reusable helper abstractions, and shared web-automation/browser utilities - clean up and expand documentation so the root README flows into docs and skill docs, with clearer development, reviewer, installer, and workflow guidance ## Notable changes - docs flow and consistency cleanup across `README.md`, `docs/README.md`, and related docs - new scripts for `check`, docs verification, generated-file verification, shell portability, and safe directory replacement - refactors in Atlassian and web-automation skill runtimes to reduce duplication and centralize reusable code - changelog, development documentation, and CI surface updates ## Test Plan - [ ] `pnpm run check` - [ ] review generated/manifests and skill sync outputs - [ ] smoke-check docs flow from `README.md` to `docs/README.md` to skill docs ## Notes - this branch currently includes tracked `skills/web-automation/shared/node_modules` content that should be reviewed carefully as potentially noisy/accidental committed artifacts Co-authored-by: Stefano Fiorini <stefano.fiorini@firsthorizon.com> Reviewed-on: #1
This commit was merged in pull request #1.
This commit is contained in:
@@ -0,0 +1,798 @@
|
||||
---
|
||||
name: do-task
|
||||
description: Execute a single user-supplied prompt end-to-end with two reviewer loops (plan review + implementation review). ALWAYS invoke when the user says `/do-task`, "do this task", "do task ...", "execute this task", or "make it so". Also invoke on the hint phrase "just do ...". Do NOT invoke on "implement this" (that phrase is reserved for implement-plan).
|
||||
---
|
||||
|
||||
# Do Task (Claude Code)
|
||||
|
||||
Execute an ad-hoc user prompt end-to-end: parse → clarify → plan (with reviewer loop) → implement (TDD-first where applicable) → verify → implementation review loop → commit → optional push → notify.
|
||||
|
||||
This is a single-artifact sibling of `create-plan` + `implement-plan`. Unlike `implement-plan`, `do-task` operates on one persistent `task-plan.md` (not a full milestone plan) and defaults to the **current branch** (not a worktree).
|
||||
|
||||
## Prerequisite Check (MANDATORY)
|
||||
|
||||
Required:
|
||||
|
||||
- Claude Code CLI: `claude --version`
|
||||
- Superpowers repo: `https://github.com/obra/superpowers`
|
||||
- `superpowers:brainstorming`
|
||||
- `superpowers:test-driven-development`
|
||||
- `superpowers:verification-before-completion`
|
||||
- `superpowers:finishing-a-development-branch`
|
||||
- `superpowers:using-git-worktrees` (only when the prompt opts in to a worktree)
|
||||
- Shared reviewer runtime: `~/.claude/skills/reviewer-runtime/run-review.sh`
|
||||
- Telegram notifier helper: `~/.claude/skills/reviewer-runtime/notify-telegram.sh`
|
||||
|
||||
If any required dependency is missing, stop immediately and return:
|
||||
|
||||
`Missing dependency: [specific missing item]. Install required Superpowers skills (https://github.com/obra/superpowers) and the reviewer-runtime helper, then retry.`
|
||||
|
||||
This variant depends on explicit sub-skill invocation via the `Skill` tool. Do NOT use shell wrappers.
|
||||
|
||||
## Trigger Phrase Detection
|
||||
|
||||
**Binding triggers** (always invoke this skill):
|
||||
|
||||
- `/do-task`
|
||||
- "do this task"
|
||||
- "do task ..."
|
||||
- "execute this task"
|
||||
- "make it so"
|
||||
|
||||
**Hint trigger** (invoke unless context clearly maps to another skill):
|
||||
|
||||
- "just do ..."
|
||||
|
||||
**Escape phrases** (skip the Phase 2 clarifying-question loop):
|
||||
|
||||
- `--no-questions`
|
||||
- `"just do it:"`
|
||||
- `"just do this:"`
|
||||
- `"no questions:"`
|
||||
|
||||
**Excluded** (do NOT trigger `do-task`):
|
||||
|
||||
- "implement this" — reserved for `implement-plan`.
|
||||
|
||||
**Dropped defaults** (explicitly NOT binding triggers):
|
||||
|
||||
- "work on ..."
|
||||
- "handle this"
|
||||
- "take care of ..."
|
||||
- "get this done"
|
||||
|
||||
**Worktree opt-in phrases** (Phase 4 takes the worktree branch):
|
||||
|
||||
- "in a worktree"
|
||||
- "use a worktree"
|
||||
- "on an isolated branch"
|
||||
- "on a new branch called X"
|
||||
|
||||
## Process
|
||||
|
||||
### Phase 1: Preflight
|
||||
|
||||
1. Verify git repo: `git rev-parse --is-inside-work-tree`.
|
||||
2. Verify `/ai_plan/` is present in `.gitignore`. If missing:
|
||||
- Append `/ai_plan/` to `.gitignore`.
|
||||
- Commit that infra change immediately with message `chore(gitignore): ignore ai_plan local planning artifacts`.
|
||||
- This infra commit is EXPLICITLY separate from the task commit in Phase 9. It may occur even when the final task ends up `aborted` or `failed`.
|
||||
3. Verify required sub-skills are discoverable through the `Skill` tool. Announce each sub-skill before invocation using:
|
||||
`I've read the [Skill Name] skill and I'm using it to [purpose].`
|
||||
|
||||
### Phase 2: Parse Prompt and Question
|
||||
|
||||
1. Capture the exact user prompt verbatim.
|
||||
2. Detect trigger phrase (see above) and record which one matched.
|
||||
3. Detect escape phrase. If set, skip clarifying questions entirely.
|
||||
4. Apply the ask-first heuristic:
|
||||
- Skip clarifying questions ONLY if ALL are true:
|
||||
- Prompt names a concrete target (file, feature, or function).
|
||||
- Prompt names a concrete outcome (what success looks like).
|
||||
- Prompt has no ambiguous scope (no "and maybe also ...").
|
||||
- All identifiers in the prompt are resolvable against the codebase.
|
||||
- Otherwise, ask 1-3 clarifying questions, ONE AT A TIME, multiple-choice preferred.
|
||||
- Empty prompt → ask exactly once: "what task?".
|
||||
5. Invoke `superpowers:brainstorming` via the `Skill` tool for any **behavior-changing** task — feature creation, bug fix with multiple plausible approaches, refactor, design decision. Present 2-3 approaches and recommend one before finalizing the plan. The ONLY skip conditions are the same ones that allow TDD auto-skip: `pure-documentation` and `pure-comment-whitespace-rename`. When skipping, record the skip reason in the Interpretation section of `task-plan.md`.
|
||||
|
||||
### Phase 3: Configure Reviewer
|
||||
|
||||
If the user has already specified a reviewer CLI and model (e.g., "do task X, review with codex gpt-5.4"), use those values. If the user says "use defaults" or otherwise opts out of explicit configuration, proceed with `REVIEWER_CLI=codex`, `REVIEWER_MODEL=gpt-5.4`, and `MAX_ROUNDS=10`. Otherwise, ask:
|
||||
|
||||
1. **Which CLI should review both the plan and the implementation?**
|
||||
- `codex` — OpenAI Codex CLI (`codex exec`)
|
||||
- `claude` — Claude Code CLI (`claude -p`)
|
||||
- `cursor` — Cursor Agent CLI (`cursor-agent -p`)
|
||||
- `opencode` — OpenCode CLI (`opencode run`)
|
||||
- `skip` — No external review, proceed with user approval only at each loop.
|
||||
|
||||
2. **Which model?** (only if a CLI was chosen)
|
||||
- For `codex`: default `gpt-5.4`, alternatives: `gpt-5.3-codex`, `o4-mini`, `o3`.
|
||||
- For `claude`: default `sonnet`, alternatives: `opus`, `haiku`.
|
||||
- For `cursor`: **run `cursor-agent models` first** to see available models.
|
||||
- For `opencode`: provider-qualified form `<provider>/<model>` (e.g., `anthropic/claude-sonnet-4-5`, `openai/gpt-5.4`). Run `opencode models` to list available models.
|
||||
- Accept any model string the user provides.
|
||||
|
||||
3. **Max review rounds shared across both loops?** (default: 10)
|
||||
- If the user does not provide a value, set `MAX_ROUNDS=10`.
|
||||
|
||||
Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS` for Phases 5 and 8.
|
||||
|
||||
Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`.
|
||||
|
||||
If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:
|
||||
|
||||
```bash
|
||||
pi --version
|
||||
```
|
||||
|
||||
For shorthand `pi/<pi-model-name>`, split only on the first slash when the prefix is exactly `pi`; store the complete remainder in `REVIEWER_MODEL`. Examples: `pi/claude-opus-4-7` -> `claude-opus-4-7`, `pi/anthropic/claude-opus-4-7` -> `anthropic/claude-opus-4-7`, and `pi/openrouter/anthropic/claude-opus-4-7` -> `openrouter/anthropic/claude-opus-4-7`.
|
||||
|
||||
When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the model running this workflow. If the model/provider is unavailable, surface helper stderr/status and use `pi --list-models [search]` to inspect configured models.
|
||||
|
||||
### Phase 4: Initialize Plan Workspace
|
||||
|
||||
**PLAN MODE CHECK:** If currently in plan mode:
|
||||
|
||||
1. Inform user that `task-plan.md` cannot be written while in plan mode.
|
||||
2. Instruct user to exit plan mode (approve plan or use `ExitPlanMode`).
|
||||
3. Proceed with file generation only after exiting plan mode.
|
||||
|
||||
Steps:
|
||||
|
||||
1. Compute slug: `YYYY-MM-DD-<slug>` where `<slug>` is a kebab-case hash of the task goal (lowercase, alphanumeric + hyphens only).
|
||||
2. Compute plan folder: `ai_plan/<slug>/`.
|
||||
3. **Resume detection:** If the folder already exists, read `task-plan.md`:
|
||||
- If `Status` is `draft` or `plan-approved` or `implementation-in-progress`: offer to resume, pick a new suffix (`<slug>-v2`), or abort. Default is resume.
|
||||
- If `Status` is any terminal value (`pushed`, `local-only`, `aborted-*`, `failed`): offer a new suffix or abort. Default is new suffix.
|
||||
4. If not resuming, create the folder and write `task-plan.md` from the template at `templates/task-plan.md` (this skill's template folder; falls back to `~/.claude/skills/do-task/templates/task-plan.md` when installed directly).
|
||||
5. Fill in:
|
||||
- `Metadata` block.
|
||||
- `Prompt` (verbatim).
|
||||
- `Interpretation`, `Assumptions`, `Files`, `Approach`, `TDD Approach`, `Acceptance Criteria`, `Verification`, `Rollback`.
|
||||
- Leave `Runtime State`, `Review History`, `Final Status` empty (skill updates these).
|
||||
6. Set `Status: draft`.
|
||||
|
||||
**Worktree branch:** If the prompt opts in to a worktree (see Trigger Phrase Detection), invoke `superpowers:using-git-worktrees` via the `Skill` tool before proceeding. Otherwise continue on the current branch.
|
||||
|
||||
### Phase 5: Plan Review Loop
|
||||
|
||||
If `REVIEWER_CLI=skip`, present `task-plan.md` to the user and proceed only after explicit user approval.
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```text
|
||||
REVIEW_KIND = plan
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
|
||||
PAYLOAD_PATH = /tmp/do-task-plan-${REVIEW_ID}.md
|
||||
PROMPT_TEMPLATE = PLAN_REVIEW_PROMPT (see below)
|
||||
SESSION_ID_VAR = CODEX_PLAN_SESSION_ID | CURSOR_PLAN_SESSION_ID | OPENCODE_PLAN_SESSION_ID
|
||||
```
|
||||
|
||||
Payload is the current `task-plan.md` **with the `Runtime State` and `Review History` blocks stripped** before writing to `PAYLOAD_PATH`. Those two blocks contain reviewer session IDs and scan outcomes that must never be sent back to any reviewer CLI. Reviewers only need the Prompt, Interpretation, Assumptions, Files, Approach, TDD Approach, Acceptance Criteria, Verification, Rollback, and Metadata sections.
|
||||
|
||||
`PLAN_REVIEW_PROMPT`:
|
||||
|
||||
```text
|
||||
Review this task plan for completeness, correctness, and risk. Focus on:
|
||||
1. Does the plan match the user's prompt?
|
||||
2. Are all assumptions surfaced?
|
||||
3. Are acceptance criteria testable?
|
||||
4. Is the TDD approach appropriate per the TDD Approach section?
|
||||
5. Are there missing files, risks, or security concerns?
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use `- None.` when a severity has no findings.
|
||||
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
|
||||
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`.
|
||||
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking.
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: plan-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 6.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-plan-review`.
|
||||
- Send Telegram summary before stopping.
|
||||
- Ask the user whether to override and proceed, restart, or abort.
|
||||
|
||||
### Phase 6: Execute (TDD-first where applicable)
|
||||
|
||||
Native orchestration — do not invoke `superpowers:executing-plans`.
|
||||
|
||||
1. Set `Status: implementation-in-progress`.
|
||||
2. For every behavior-changing file edit:
|
||||
- Invoke `superpowers:test-driven-development` via the `Skill` tool.
|
||||
- Write the failing test first. Run it. Confirm it fails.
|
||||
- Implement the minimal code to make it pass. Run the test. Confirm green.
|
||||
- Do NOT commit yet — a single task commit happens in Phase 9.
|
||||
3. Auto-skip of TDD is permitted ONLY for tasks classified in `task-plan.md` TDD Approach as:
|
||||
- `pure-documentation`
|
||||
- `pure-comment-whitespace-rename`
|
||||
4. Any other skip (including `pure-config-addition`) requires explicit user approval recorded in `task-plan.md` with an ISO-8601 timestamp.
|
||||
5. Update `task-plan.md` after each logical step: add notes to `Approach`, check off `Acceptance Criteria` items as they complete.
|
||||
|
||||
### Phase 7: Verification Gate
|
||||
|
||||
Invoke `superpowers:verification-before-completion` via the `Skill` tool.
|
||||
|
||||
Run the commands listed in the `Verification` section of `task-plan.md`:
|
||||
|
||||
- Lint (changed files first).
|
||||
- Typecheck.
|
||||
- Tests (targeted first, then broader suite if quick).
|
||||
|
||||
All must pass. If a command fails:
|
||||
|
||||
- Fix the issue.
|
||||
- Re-run that command.
|
||||
- Increment `verification_attempts` in Runtime State.
|
||||
|
||||
If `verification_attempts` exceeds 3 without green:
|
||||
|
||||
- Set `Status: aborted-verification`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to retry, override, or abort.
|
||||
|
||||
### Phase 8: Implementation Review Loop
|
||||
|
||||
If `REVIEWER_CLI=skip`, present a diff + verification summary to the user and proceed only after explicit user approval.
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```text
|
||||
REVIEW_KIND = implementation
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8) # distinct from plan-review ID
|
||||
PAYLOAD_PATH = /tmp/do-task-implementation-${REVIEW_ID}.md
|
||||
PROMPT_TEMPLATE = IMPL_REVIEW_PROMPT (see below)
|
||||
SESSION_ID_VAR = CODEX_IMPL_SESSION_ID | CURSOR_IMPL_SESSION_ID | OPENCODE_IMPL_SESSION_ID
|
||||
```
|
||||
|
||||
Payload contents (assembled by the skill):
|
||||
|
||||
```markdown
|
||||
# Implementation Review: [Short Title]
|
||||
|
||||
## Task Plan (the plan that was approved)
|
||||
<embed approved task-plan.md, excluding Runtime State block>
|
||||
|
||||
## Changes Made (git diff)
|
||||
<output of: `git diff` for unstaged + `git diff --staged` for staged>
|
||||
|
||||
## Verification Output
|
||||
### Lint
|
||||
<lint output>
|
||||
### Typecheck
|
||||
<typecheck output>
|
||||
### Tests
|
||||
<test output, pass/fail counts>
|
||||
```
|
||||
|
||||
`IMPL_REVIEW_PROMPT`:
|
||||
|
||||
```text
|
||||
Review this implementation against the task plan. Focus on:
|
||||
1. Correctness — Does the diff satisfy the Acceptance Criteria?
|
||||
2. Code quality — Clean, maintainable, no obvious issues?
|
||||
3. Test coverage — Are behavior changes adequately tested (per the plan's TDD Approach)?
|
||||
4. Security — Any security concerns introduced?
|
||||
5. Regressions — Does the diff risk breaking unrelated code?
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use `- None.` when a severity has no findings.
|
||||
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
|
||||
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`.
|
||||
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking.
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: implementation-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 9.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-impl-review`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to override and commit anyway, restart, or abort.
|
||||
|
||||
### Phase 9: Commit + Push Ask
|
||||
|
||||
Invoke `superpowers:finishing-a-development-branch` via the `Skill` tool.
|
||||
|
||||
1. Stage all changed files explicitly (avoid `git add -A`).
|
||||
2. Single commit with message derived from the task goal:
|
||||
- Format: `<type>(<scope>): <short description>`
|
||||
- Example: `feat(auth): add session token rotation`
|
||||
3. Do NOT push. Update `Status: local-only`.
|
||||
4. Ask the user: "Push to remote? (yes / no)"
|
||||
- On explicit `yes` → push, then set `Status: pushed`.
|
||||
- Any other response → leave `Status: local-only`.
|
||||
|
||||
### Phase 10: Telegram Notification + Finalize
|
||||
|
||||
Resolve the notifier helper:
|
||||
|
||||
```bash
|
||||
TELEGRAM_NOTIFY_RUNTIME=~/.claude/skills/reviewer-runtime/notify-telegram.sh
|
||||
```
|
||||
|
||||
On every terminal outcome (`pushed`, `local-only`, `aborted-*`, `failed`), send a Telegram summary if both `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` are set:
|
||||
|
||||
```bash
|
||||
if [ -x "$TELEGRAM_NOTIFY_RUNTIME" ] && [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
|
||||
"$TELEGRAM_NOTIFY_RUNTIME" --message "do-task <slug>: <status summary>"
|
||||
fi
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- Telegram is the only supported notification path.
|
||||
- Notification failures are non-blocking but must be surfaced to the user.
|
||||
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
|
||||
- If Telegram is not configured, state that no Telegram notification was sent.
|
||||
|
||||
Fill in `Final Status` in `task-plan.md` (include commit hash if any). Do NOT delete the plan folder — it stays as a record.
|
||||
|
||||
---
|
||||
|
||||
## Review Loop (Shared Subroutine)
|
||||
|
||||
This subroutine is invoked twice per `do-task` run: once in Phase 5 (`REVIEW_KIND=plan`) and once in Phase 8 (`REVIEW_KIND=implementation`). Separate session IDs are used for each loop so reviewer context never leaks across loops.
|
||||
|
||||
### Subroutine Inputs
|
||||
|
||||
| Variable | Purpose |
|
||||
|----------|---------|
|
||||
| `REVIEW_KIND` | `plan` or `implementation` |
|
||||
| `REVIEW_ID` | 8-char hex (from `uuidgen`); reused across rounds of the same loop |
|
||||
| `PAYLOAD_PATH` | `/tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md` |
|
||||
| `PROMPT_TEMPLATE` | `PLAN_REVIEW_PROMPT` or `IMPL_REVIEW_PROMPT` |
|
||||
| `REVIEWER_CLI` | `codex` \| `claude` \| `cursor` \| `opencode` \| `pi` |
|
||||
| `REVIEWER_MODEL` | Model name |
|
||||
| `MAX_ROUNDS` | Default 10 |
|
||||
| `SESSION_ID_VAR` | `CODEX_PLAN_SESSION_ID` \| `CODEX_IMPL_SESSION_ID` \| `CURSOR_PLAN_SESSION_ID` \| `CURSOR_IMPL_SESSION_ID` \| `OPENCODE_PLAN_SESSION_ID` \| `OPENCODE_IMPL_SESSION_ID` |
|
||||
|
||||
Temp artifact paths (per loop):
|
||||
|
||||
- `/tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md` — payload
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md` — normalized review text
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json` — raw Cursor/OpenCode JSON (cursor only, plus opencode when `--format json` is used)
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr`
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status`
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out`
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh`
|
||||
|
||||
Resolve the shared helper:
|
||||
|
||||
```bash
|
||||
REVIEWER_RUNTIME=~/.claude/skills/reviewer-runtime/run-review.sh
|
||||
```
|
||||
|
||||
Set helper success-artifact args before writing the command script:
|
||||
|
||||
```bash
|
||||
HELPER_SUCCESS_FILE_ARGS=()
|
||||
case "$REVIEWER_CLI" in
|
||||
codex)
|
||||
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md)
|
||||
;;
|
||||
cursor)
|
||||
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json)
|
||||
;;
|
||||
esac
|
||||
```
|
||||
|
||||
### Step 1: Write Payload
|
||||
|
||||
Write the full payload for this round to `PAYLOAD_PATH`.
|
||||
|
||||
### Step 1a: Secret Scan (per-payload, no caching)
|
||||
|
||||
**BEFORE** sending the payload to any reviewer CLI, scan it for secrets. This scan runs EVERY round — no results are cached. Rationale: Phase 8 payloads include newly-introduced diff content that earlier rounds never saw.
|
||||
|
||||
Run the secret scan with all of these anchored regexes. Use `grep -En` on the payload file:
|
||||
|
||||
```bash
|
||||
SECRET_REGEX_FILE=$(mktemp)
|
||||
cat >"$SECRET_REGEX_FILE" <<'EOF'
|
||||
AKIA[0-9A-Z]{16}
|
||||
"type"\s*:\s*"service_account"
|
||||
(ghp|gho|ghs|ghu|ghr)_[A-Za-z0-9]{36,}
|
||||
xox[abpsr]-[0-9]+-[0-9]+-[0-9]+-[A-Za-z0-9]{24,}
|
||||
xox[abpsr]-[A-Za-z0-9]{10,48}
|
||||
sk-(proj-)?[A-Za-z0-9_-]{20,}
|
||||
sk-ant-(api|admin)[0-9]+-[A-Za-z0-9_-]{20,}
|
||||
-----BEGIN [A-Z ]+ PRIVATE KEY-----
|
||||
(TOKEN|SECRET|PASSWORD|API_?KEY|ACCESS_?KEY)\s*=\s*["']?[A-Za-z0-9+/=_-]{8,}
|
||||
eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+
|
||||
EOF
|
||||
|
||||
SCAN_MATCHES=$(grep -Ensf "$SECRET_REGEX_FILE" "$PAYLOAD_PATH" || true)
|
||||
rm -f "$SECRET_REGEX_FILE"
|
||||
```
|
||||
|
||||
If `SCAN_MATCHES` is non-empty:
|
||||
|
||||
1. **Redact the matched text before surfacing** — never echo the raw secret to the user, chat log, terminal scrollback, or any persistent file. Replace each matched substring with a fixed token that preserves only the fact of a match: `[REDACTED:<pattern-label>:<match-length>-chars]`. Example: a matched AWS key becomes `[REDACTED:aws-access-key:20-chars]`. Keep the file path and line number; they are useful for the user and not secret.
|
||||
2. Present the redacted match summary to the user using this exact wording:
|
||||
|
||||
```text
|
||||
SECRET-SCAN MATCH in outbound reviewer payload (loop: ${REVIEW_KIND}, round: N):
|
||||
<file>:<line>: [REDACTED:<pattern-label>:<match-length>-chars]
|
||||
...
|
||||
Proceed with sending this payload to ${REVIEWER_CLI}? (yes / no / redact)
|
||||
```
|
||||
|
||||
Pattern labels: `aws-access-key`, `gcp-service-account`, `github-token`, `slack-token`, `openai-key`, `anthropic-key`, `pem-private-key`, `dotenv-style`, `jwt`.
|
||||
|
||||
3. Wait for user response.
|
||||
4. On `yes`: record `last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches` in Runtime State, and proceed.
|
||||
5. On `redact`: ask the user to supply redactions, apply them to `PAYLOAD_PATH`, re-scan (this step), record `last_scan_outcome_${REVIEW_KIND}=redacted-and-approved`.
|
||||
6. On `no`: stop the loop, set `Status: failed`, send Telegram, return to the user.
|
||||
|
||||
If `SCAN_MATCHES` is empty, record `last_scan_outcome_${REVIEW_KIND}=clean` and proceed.
|
||||
|
||||
### Step 2: Generate Reviewer Command Script
|
||||
|
||||
Write the reviewer invocation to `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh` as a bash script starting with:
|
||||
|
||||
```bash
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call every round (Pi reviewer calls do not use session resume):
|
||||
|
||||
```bash
|
||||
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files \
|
||||
--model "$REVIEWER_MODEL" \
|
||||
--tools read,grep,find,ls \
|
||||
-p "Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review. Return exactly the required ## Summary, ## Findings, and ## Verdict structure."
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `codex`:**
|
||||
|
||||
Round 1 — fresh `codex exec`:
|
||||
|
||||
```bash
|
||||
codex exec \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
-s read-only \
|
||||
-o /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
"Review the ${REVIEW_KIND} payload in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
${PROMPT_TEMPLATE}"
|
||||
```
|
||||
|
||||
Do not capture the Codex session ID yet. After Round 1 completes, extract it from `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out` (look for `session id: <uuid>`) and persist it to Runtime State under `${SESSION_ID_VAR}`.
|
||||
|
||||
Round 2 and later — resume session:
|
||||
|
||||
```bash
|
||||
codex exec resume ${SESSION_ID_VAR_VALUE} \
|
||||
-o /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
"I've revised based on your feedback. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before.
|
||||
Keep findings ordered P0 to P3, use '- None.' when a severity has no findings, and only use VERDICT: APPROVED when no P0, P1, or P2 findings remain."
|
||||
```
|
||||
|
||||
If resume fails, fall back to fresh `codex exec` with prior-round context.
|
||||
|
||||
**If `REVIEWER_CLI` is `claude`:**
|
||||
|
||||
Fresh call every round (Claude CLI has no session resume):
|
||||
|
||||
```bash
|
||||
claude -p \
|
||||
"${ROUND_PREFIX}Review the following ${REVIEW_KIND} payload.
|
||||
|
||||
$(cat /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md)
|
||||
|
||||
${PROMPT_TEMPLATE}" \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--strict-mcp-config \
|
||||
--setting-sources user
|
||||
```
|
||||
|
||||
Where `${ROUND_PREFIX}` is empty for Round 1 and `"You previously reviewed this ${REVIEW_KIND} and requested revisions. Previous feedback summary: [key points]. "` for subsequent rounds.
|
||||
|
||||
**If `REVIEWER_CLI` is `cursor`:**
|
||||
|
||||
Round 1:
|
||||
|
||||
```bash
|
||||
cursor-agent -p \
|
||||
--mode=ask \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--trust \
|
||||
--output-format json \
|
||||
"Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review.
|
||||
|
||||
${PROMPT_TEMPLATE}" \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Round 2 and later — resume:
|
||||
|
||||
```bash
|
||||
cursor-agent --resume ${SESSION_ID_VAR_VALUE} -p \
|
||||
--mode=ask \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--trust \
|
||||
--output-format json \
|
||||
"I've revised based on your feedback. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
If resume fails, fall back to fresh `cursor-agent -p`.
|
||||
|
||||
After the command completes, extract the session id and review text:
|
||||
|
||||
```bash
|
||||
CURSOR_SID=$(jq -r '.session_id' /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json)
|
||||
jq -r '.result' /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
Persist `CURSOR_SID` to Runtime State under `${SESSION_ID_VAR}` on Round 1.
|
||||
|
||||
**If `REVIEWER_CLI` is `opencode`:**
|
||||
|
||||
OpenCode does not expose a dedicated read-only flag at the CLI level; use the built-in `plan` primary agent (`--agent plan`) for review, which is read-oriented and does not modify files. Session resume is supported via `-s <session-id>`, but the most reliable pattern for non-interactive review is **fresh call each round** (like `claude`) because opencode's session lifecycle and ID capture are less standardized than codex/cursor for headless runs. Skills MAY opt-in to session resume when they have verified the installed opencode version exposes a stable session id in `--format json` output.
|
||||
|
||||
Round 1 (preferred, fresh call):
|
||||
|
||||
```bash
|
||||
opencode run \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review.
|
||||
|
||||
${PROMPT_TEMPLATE}" \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Round 2 and later (fresh-call fallback path — recommended default):
|
||||
|
||||
```bash
|
||||
opencode run \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"You previously reviewed this ${REVIEW_KIND} and requested revisions.
|
||||
|
||||
Previous feedback summary: [key points from last review]
|
||||
|
||||
I've revised. Updated payload is below.
|
||||
|
||||
$(cat /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md)
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Optional session-resume path (only if the installed opencode reliably emits a session id in `--format json` output and accepts it back via `-s`):
|
||||
|
||||
```bash
|
||||
# Round 2+ with resume
|
||||
opencode run \
|
||||
-s ${SESSION_ID_VAR_VALUE} \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"I've revised. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Extract the review body (the JSON stream emits events; the final assistant message contains the review text):
|
||||
|
||||
```bash
|
||||
jq -r '.[] | select(.type == "message" and .role == "assistant") | .content' \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
|| cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
If the JSON parse falls through, promote the raw JSON file as the review output and surface a warning to the user. On any opencode CLI or JSON parsing failure, treat this loop round as `completed-empty-output` and follow the helper-failure escalation in Step 6.
|
||||
|
||||
### Step 3: Run via `run-review.sh`
|
||||
|
||||
Run the command script through the shared helper when available:
|
||||
|
||||
```bash
|
||||
if [ -x "$REVIEWER_RUNTIME" ]; then
|
||||
"$REVIEWER_RUNTIME" \
|
||||
--command-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh \
|
||||
--stdout-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
--stderr-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr \
|
||||
--status-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status \
|
||||
"${HELPER_SUCCESS_FILE_ARGS[@]}"
|
||||
else
|
||||
echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
|
||||
bash /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh \
|
||||
>/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
2>/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr
|
||||
fi
|
||||
```
|
||||
|
||||
Run the helper in the foreground and watch live stdout for `state=in-progress` heartbeats. If your agent environment buffers command output until exit, start the helper in the background and poll the `.status` file instead of treating heartbeats as post-hoc-only data.
|
||||
|
||||
### Step 4: Promote Reviewer Output + Capture Session ID
|
||||
|
||||
After the command completes:
|
||||
|
||||
- `cursor`: already promoted in Step 2 via `jq -r '.result' ...`. Also capture `session_id` if first round.
|
||||
- `codex`: extract `CODEX_SESSION_ID` from `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text lives only in `.runner.out`, `cp` it into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `claude` or `pi`: promote `.runner.out` into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `opencode`: already promoted in Step 2 via `jq` on the JSON stream. If opt-in session-resume is active and the JSON includes a stable session id, capture it and persist to `${SESSION_ID_VAR}`.
|
||||
|
||||
On Round 1, persist the captured session ID (if any) into `task-plan.md`'s Runtime State under `${SESSION_ID_VAR}`.
|
||||
|
||||
### Step 5: Parse Verdict + Update Review History
|
||||
|
||||
1. Read `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md`.
|
||||
2. Append one row to `task-plan.md` Review History:
|
||||
- Timestamp (ISO-8601 UTC).
|
||||
- Loop (`plan` or `implementation`).
|
||||
- Round number.
|
||||
- Verdict (`APPROVED` or `REVISE`).
|
||||
- Summary (first line of the `## Summary` section).
|
||||
3. Increment `plan_review_round` or `implementation_review_round` in Runtime State.
|
||||
|
||||
### Step 6: Branch APPROVED / REVISE / MAX_ROUNDS
|
||||
|
||||
Verdict rules:
|
||||
|
||||
- **VERDICT: APPROVED** with no `P0`, `P1`, or `P2` findings → exit the subroutine with `APPROVED`.
|
||||
- **VERDICT: APPROVED** with only `P3` findings → optionally fix the `P3` items if cheap and safe, then exit with `APPROVED`.
|
||||
- **VERDICT: REVISE** or any `P0`, `P1`, or `P2` finding → go to revision (see below), then return to Step 1 for the next round.
|
||||
- No clear verdict but `P0`, `P1`, and `P2` are all `- None.` → treat as APPROVED.
|
||||
- Helper state `completed-empty-output` → treat as failed review attempt, surface `.stderr`/`.status`, fix invocation or prompt handling, then retry.
|
||||
- Helper state `needs-operator-decision` → surface status log and decide whether to extend the timeout, abort, or retry with different helper parameters.
|
||||
- Round counter ≥ `MAX_ROUNDS` → exit the subroutine with `MAX_ROUNDS`. Caller decides next action per Phase 5 or Phase 8.
|
||||
|
||||
**Revision:** The caller (Phase 5 for plan, Phase 6/7 for implementation) applies findings in priority order (`P0` → `P1` → `P2` → `P3`). For implementation review revisions, Phase 7 verification must be re-run after every revision before returning to Step 1.
|
||||
|
||||
### Step 7: Liveness Contract (during Step 3)
|
||||
|
||||
- The shared reviewer runtime emits `state=in-progress note="In progress N"` heartbeats every 60 seconds while the reviewer child is alive.
|
||||
- Keep waiting as long as a fresh `In progress N` heartbeat keeps arriving roughly once per minute.
|
||||
- Do not abort just because the review is slow, a soft timeout fired, or a `stall-warning` line appears, as long as the `In progress N` heartbeat continues.
|
||||
- Treat missing heartbeats, `state=failed`, `state=completed-empty-output`, and `state=needs-operator-decision` as escalation signals.
|
||||
|
||||
### Step 8: Cleanup (on successful round exit)
|
||||
|
||||
```bash
|
||||
rm -f /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh
|
||||
```
|
||||
|
||||
If the round failed, produced empty output, or reached operator-decision timeout, KEEP `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them.
|
||||
|
||||
---
|
||||
|
||||
## Resume Semantics
|
||||
|
||||
1. Detect existing plan folder by slug at Phase 4.
|
||||
2. Read `task-plan.md` → `Status`.
|
||||
3. Decide next action:
|
||||
|
||||
| Status | Action |
|
||||
|--------|--------|
|
||||
| `draft` | Resume at Phase 5 (plan review) |
|
||||
| `plan-approved` | Resume at Phase 6 (execute) |
|
||||
| `implementation-in-progress` | Resume at Phase 6 (continue execute) |
|
||||
| `implementation-approved` | Resume at Phase 9 (commit + push ask) |
|
||||
| `pushed` \| `local-only` | Ask user: new suffix, abort, or replay for reference only |
|
||||
| `aborted-*` \| `failed` | Offer new suffix or full restart |
|
||||
|
||||
4. When resuming, read Runtime State for `CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, `OPENCODE_IMPL_SESSION_ID`, and the round counters. If a session ID is populated, use it for the first revision round in that loop (Round 2) via `codex exec resume`, `cursor-agent --resume`, or `opencode run -s <id>` as applicable.
|
||||
|
||||
---
|
||||
|
||||
## Tracker Discipline (MANDATORY)
|
||||
|
||||
**ALWAYS update `task-plan.md` before/after each phase transition. NEVER proceed with stale state.**
|
||||
|
||||
Before starting any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Update `last_phase_entered` in Runtime State.
|
||||
|
||||
After completing any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Append notes to the relevant section of `task-plan.md`.
|
||||
|
||||
Review History is append-only.
|
||||
|
||||
---
|
||||
|
||||
## Execution Workflow Rules
|
||||
|
||||
- Current branch is the default; worktree is opt-in only.
|
||||
- Do NOT push without explicit "yes".
|
||||
- Secret scan runs **per-payload, no caching** — every round, including revisions.
|
||||
- Review loops use `MAX_ROUNDS=10` by default, shared across both loops.
|
||||
- The task commit is a single commit created in Phase 9; interim WIP commits are NOT created.
|
||||
- The `.gitignore` infra commit in Phase 1 is explicitly separate from the task commit and is allowed even on abort.
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
- [ ] `ai_plan/` exists and `/ai_plan/` is in `.gitignore`
|
||||
- [ ] `task-plan.md` created under `ai_plan/YYYY-MM-DD-<slug>/`
|
||||
- [ ] Reviewer CLI + model + `MAX_ROUNDS` configured (or `skip`)
|
||||
- [ ] Secret scan ran on every outbound reviewer payload
|
||||
- [ ] Plan review completed (APPROVED, MAX_ROUNDS handled, or skipped)
|
||||
- [ ] Phase 6 executed TDD-first for all behavior-changing steps (or documented skip)
|
||||
- [ ] Phase 7 verification green before Phase 8
|
||||
- [ ] Implementation review completed (APPROVED, MAX_ROUNDS handled, or skipped)
|
||||
- [ ] Single task commit created locally, no push without explicit yes
|
||||
- [ ] Telegram notification attempted if configured
|
||||
- [ ] `task-plan.md` Final Status filled in
|
||||
@@ -0,0 +1,143 @@
|
||||
# Task Plan: [Short Title]
|
||||
|
||||
> **Variant guardrail (Claude Code):** When generating or updating this file, the agent MUST be out of plan mode. Sub-skills (`brainstorming`, `test-driven-development`, `verification-before-completion`, `finishing-a-development-branch`, `using-git-worktrees`) MUST be invoked through the `Skill` tool explicitly — no shell wrappers.
|
||||
|
||||
## Metadata
|
||||
|
||||
| Field | Value |
|
||||
|-------|-------|
|
||||
| Created | YYYY-MM-DD |
|
||||
| Slug | YYYY-MM-DD-<slug> |
|
||||
| Runtime | claude-code |
|
||||
| Reviewer CLI | codex \| claude \| cursor \| opencode \| pi |
|
||||
| Reviewer Model | <model> |
|
||||
| MAX_ROUNDS | 10 |
|
||||
| Branch Strategy | current-branch \| worktree |
|
||||
| Branch Name | <current branch name, or new branch name when worktree is used> |
|
||||
| Worktree Path | <absolute path to worktree dir; blank when Branch Strategy = current-branch> |
|
||||
| Status | draft |
|
||||
|
||||
### Status Enum (authoritative)
|
||||
|
||||
| Value | Meaning |
|
||||
|-------|---------|
|
||||
| `draft` | Newly created; plan review not yet started |
|
||||
| `plan-approved` | Plan review loop returned APPROVED |
|
||||
| `implementation-in-progress` | Phase 6 executing |
|
||||
| `implementation-approved` | Phase 8 review loop returned APPROVED; awaiting commit |
|
||||
| `pushed` | Committed + pushed to remote |
|
||||
| `local-only` | Committed locally; user declined push |
|
||||
| `aborted-plan-review` | MAX_ROUNDS reached in Phase 5; user aborted |
|
||||
| `aborted-impl-review` | MAX_ROUNDS reached in Phase 8; user aborted |
|
||||
| `aborted-verification` | Phase 7 retries exhausted; user aborted |
|
||||
| `failed` | Hard tooling failure |
|
||||
|
||||
---
|
||||
|
||||
## Prompt
|
||||
|
||||
<!-- Exact user prompt, verbatim. -->
|
||||
|
||||
## Interpretation
|
||||
|
||||
<!-- Short restatement of goal + out-of-scope items. -->
|
||||
|
||||
## Assumptions
|
||||
|
||||
<!-- Anything we're assuming and needs confirmation. Empty list OK after clarifying questions. -->
|
||||
|
||||
## Files
|
||||
|
||||
<!-- Files expected to be created / modified / deleted. Paths are absolute or repo-relative. -->
|
||||
|
||||
| Action | Path | Why |
|
||||
|--------|------|-----|
|
||||
| | | |
|
||||
|
||||
## Approach
|
||||
|
||||
<!-- 3-10 bullets describing implementation order. -->
|
||||
|
||||
## TDD Approach
|
||||
|
||||
<!-- One of:
|
||||
(a) **TDD applies** — list the failing test(s) to write first, then implementation, then confirm green.
|
||||
(b) **TDD auto-skipped** — reason must be exactly one of:
|
||||
- `pure-documentation`
|
||||
- `pure-comment-whitespace-rename`
|
||||
(c) **TDD user-approved skip** — user explicitly approved skipping TDD for this task.
|
||||
Record the approval timestamp (ISO-8601) and the specific reason (e.g., `pure-config-addition`).
|
||||
-->
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] <criterion 1>
|
||||
- [ ] <criterion 2>
|
||||
|
||||
## Verification
|
||||
|
||||
<!-- Commands to run:
|
||||
lint: <cmd>
|
||||
typecheck: <cmd>
|
||||
tests: <cmd>
|
||||
-->
|
||||
|
||||
## Rollback
|
||||
|
||||
<!-- How to undo: `git revert <hash>`, or manual steps if the change is not easily revertable. -->
|
||||
|
||||
---
|
||||
|
||||
## Runtime State
|
||||
|
||||
<!-- Updated by the skill at runtime. Used to detect resume and to persist reviewer session IDs across rounds. -->
|
||||
|
||||
```yaml
|
||||
plan_review_round: 0
|
||||
implementation_review_round: 0
|
||||
CODEX_PLAN_SESSION_ID:
|
||||
CODEX_IMPL_SESSION_ID:
|
||||
CURSOR_PLAN_SESSION_ID:
|
||||
CURSOR_IMPL_SESSION_ID:
|
||||
OPENCODE_PLAN_SESSION_ID:
|
||||
OPENCODE_IMPL_SESSION_ID:
|
||||
last_phase_entered:
|
||||
last_round_ts:
|
||||
last_scan_outcome_plan:
|
||||
last_scan_outcome_impl:
|
||||
verification_attempts: 0
|
||||
tests_added_count: 0
|
||||
tdd_used: false
|
||||
```
|
||||
|
||||
## Review History
|
||||
|
||||
<!-- Append one entry per reviewer round, both loops. -->
|
||||
|
||||
| Timestamp (ISO-8601) | Loop | Round | Verdict | Summary |
|
||||
|----------------------|------|-------|---------|---------|
|
||||
| | | | | |
|
||||
|
||||
## Final Status
|
||||
|
||||
<!-- Filled at the terminal outcome (phase 9/10). Populate at least:
|
||||
- Terminal status (one of the 10 Status enum values)
|
||||
- Commit hash (if any)
|
||||
- Plan-review rounds used / MAX_ROUNDS
|
||||
- Implementation-review rounds used / MAX_ROUNDS
|
||||
- TDD used (true|false)
|
||||
- Tests added count
|
||||
- Verification attempts used
|
||||
- Last round ISO-8601 timestamp
|
||||
- Notes (anything the user should know when revisiting)
|
||||
-->
|
||||
|
||||
---
|
||||
|
||||
## Guardrails (do NOT remove)
|
||||
|
||||
- This file is the single persistent artifact for `do-task`. Do not split it or delete it on success.
|
||||
- `Status` must always match one of the 10 enum values.
|
||||
- `Runtime State` is updated by the skill, not by the user.
|
||||
- Review History is append-only.
|
||||
- `last_scan_outcome_plan` and `last_scan_outcome_impl` record the most recent secret-scan result for each loop. They are informational; the scan itself runs per-payload with no caching.
|
||||
@@ -0,0 +1,848 @@
|
||||
---
|
||||
name: do-task
|
||||
description: Execute a single user-supplied prompt end-to-end with two reviewer loops (plan review + implementation review) in Codex. ALWAYS invoke when the user says `/do-task`, "do this task", "do task ...", "execute this task", or "make it so". Also invoke on the hint phrase "just do ...". Do NOT invoke on "implement this" (that phrase is reserved for implement-plan).
|
||||
---
|
||||
|
||||
# Do Task (Codex Native Superpowers)
|
||||
|
||||
Execute an ad-hoc user prompt end-to-end: parse → clarify → plan (with reviewer loop) → implement (TDD-first where applicable) → verify → implementation review loop → commit → optional push → notify.
|
||||
|
||||
This is a single-artifact sibling of `create-plan` + `implement-plan`. Unlike `implement-plan`, `do-task` operates on one persistent `task-plan.md` (not a full milestone plan) and defaults to the **current branch** (not a worktree).
|
||||
|
||||
**Core principle:** Codex uses native skill discovery from `~/.agents/skills/`. Do not use deprecated `superpowers-codex bootstrap` or `use-skill` CLI commands.
|
||||
|
||||
## Prerequisite Check (MANDATORY)
|
||||
|
||||
Required:
|
||||
|
||||
- Codex CLI: `codex --version`
|
||||
- Superpowers repo: `https://github.com/obra/superpowers`
|
||||
- Superpowers skills symlink: `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills`
|
||||
- `superpowers:brainstorming`
|
||||
- `superpowers:test-driven-development`
|
||||
- `superpowers:verification-before-completion`
|
||||
- `superpowers:finishing-a-development-branch`
|
||||
- `superpowers:using-git-worktrees` (only when the prompt opts in to a worktree)
|
||||
- Shared reviewer runtime: `~/.codex/skills/reviewer-runtime/run-review.sh`
|
||||
- Telegram notifier helper: `~/.codex/skills/reviewer-runtime/notify-telegram.sh`
|
||||
|
||||
Verify before proceeding:
|
||||
|
||||
```bash
|
||||
test -L ~/.agents/skills/superpowers
|
||||
test -f ~/.agents/skills/superpowers/brainstorming/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/test-driven-development/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/verification-before-completion/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md
|
||||
```
|
||||
|
||||
If any required dependency is missing, stop immediately and return:
|
||||
|
||||
`Missing dependency: [specific missing item]. Install required Superpowers skills (https://github.com/obra/superpowers) and the reviewer-runtime helper, then retry.`
|
||||
|
||||
## Required Skill Invocation Rules
|
||||
|
||||
- Invoke relevant skills through native discovery (no CLI wrapper).
|
||||
- Announce skill usage explicitly:
|
||||
- `I've read the [Skill Name] skill and I'm using it to [purpose].`
|
||||
- For skills with checklists, track checklist items with `update_plan` todos.
|
||||
- Tool mapping for Codex:
|
||||
- `TodoWrite` -> `update_plan`
|
||||
- `Task` subagents -> unavailable in Codex; do the work directly and state the limitation
|
||||
- `Skill` -> use native skill discovery from `~/.agents/skills/`
|
||||
|
||||
## Trigger Phrase Detection
|
||||
|
||||
**Binding triggers** (always invoke this skill):
|
||||
|
||||
- `/do-task`
|
||||
- "do this task"
|
||||
- "do task ..."
|
||||
- "execute this task"
|
||||
- "make it so"
|
||||
|
||||
**Hint trigger** (invoke unless context clearly maps to another skill):
|
||||
|
||||
- "just do ..."
|
||||
|
||||
**Escape phrases** (skip the Phase 2 clarifying-question loop):
|
||||
|
||||
- `--no-questions`
|
||||
- `"just do it:"`
|
||||
- `"just do this:"`
|
||||
- `"no questions:"`
|
||||
|
||||
**Excluded** (do NOT trigger `do-task`):
|
||||
|
||||
- "implement this" — reserved for `implement-plan`.
|
||||
|
||||
**Dropped defaults** (explicitly NOT binding triggers):
|
||||
|
||||
- "work on ..."
|
||||
- "handle this"
|
||||
- "take care of ..."
|
||||
- "get this done"
|
||||
|
||||
**Worktree opt-in phrases** (Phase 4 takes the worktree branch):
|
||||
|
||||
- "in a worktree"
|
||||
- "use a worktree"
|
||||
- "on an isolated branch"
|
||||
- "on a new branch called X"
|
||||
|
||||
## Process
|
||||
|
||||
### Phase 1: Preflight
|
||||
|
||||
1. Verify git repo: `git rev-parse --is-inside-work-tree`.
|
||||
2. Verify `/ai_plan/` is present in `.gitignore`. If missing:
|
||||
- Append `/ai_plan/` to `.gitignore`.
|
||||
- Commit that infra change immediately with message `chore(gitignore): ignore ai_plan local planning artifacts`.
|
||||
- This infra commit is EXPLICITLY separate from the task commit in Phase 9. It may occur even when the final task ends up `aborted` or `failed`.
|
||||
3. Verify required sub-skills are discoverable under `~/.agents/skills/superpowers/`. Announce each sub-skill before invocation using:
|
||||
`I've read the [Skill Name] skill and I'm using it to [purpose].`
|
||||
|
||||
### Phase 2: Parse Prompt and Question
|
||||
|
||||
1. Capture the exact user prompt verbatim.
|
||||
2. Detect trigger phrase (see above) and record which one matched.
|
||||
3. Detect escape phrase. If set, skip clarifying questions entirely.
|
||||
4. Apply the ask-first heuristic:
|
||||
- Skip clarifying questions ONLY if ALL are true:
|
||||
- Prompt names a concrete target (file, feature, or function).
|
||||
- Prompt names a concrete outcome (what success looks like).
|
||||
- Prompt has no ambiguous scope (no "and maybe also ...").
|
||||
- All identifiers in the prompt are resolvable against the codebase.
|
||||
- Otherwise, ask 1-3 clarifying questions, ONE AT A TIME, multiple-choice preferred.
|
||||
- Empty prompt → ask exactly once: "what task?".
|
||||
5. Invoke `superpowers:brainstorming` via native discovery (`~/.agents/skills/superpowers/brainstorming/SKILL.md`) for any **behavior-changing** task — feature creation, bug fix with multiple plausible approaches, refactor, design decision. Present 2-3 approaches and recommend one before finalizing the plan. The ONLY skip conditions are the same ones that allow TDD auto-skip: `pure-documentation` and `pure-comment-whitespace-rename`. When skipping, record the skip reason in the Interpretation section of `task-plan.md`.
|
||||
|
||||
### Phase 3: Configure Reviewer
|
||||
|
||||
If the user has already specified a reviewer CLI and model (e.g., "do task X, review with codex gpt-5.4"), use those values. If the user says "use defaults" or otherwise opts out of explicit configuration, proceed with `REVIEWER_CLI=codex`, `REVIEWER_MODEL=gpt-5.4`, and `MAX_ROUNDS=10`. Otherwise, ask:
|
||||
|
||||
1. **Which CLI should review both the plan and the implementation?**
|
||||
- `codex` — OpenAI Codex CLI (`codex exec`)
|
||||
- `claude` — Claude Code CLI (`claude -p`)
|
||||
- `cursor` — Cursor Agent CLI (`cursor-agent -p`)
|
||||
- `opencode` — OpenCode CLI (`opencode run`)
|
||||
- `skip` — No external review, proceed with user approval only at each loop.
|
||||
|
||||
2. **Which model?** (only if a CLI was chosen)
|
||||
- For `codex`: default `gpt-5.4`, alternatives: `gpt-5.3-codex`, `o4-mini`, `o3`.
|
||||
- For `claude`: default `sonnet`, alternatives: `opus`, `haiku`.
|
||||
- For `cursor`: **run `cursor-agent models` first** to see available models.
|
||||
- For `opencode`: provider-qualified form `<provider>/<model>` (e.g., `anthropic/claude-sonnet-4-5`, `openai/gpt-5.4`). Run `opencode models` to list available models.
|
||||
- Accept any model string the user provides.
|
||||
|
||||
3. **Max review rounds shared across both loops?** (default: 10)
|
||||
- If the user does not provide a value, set `MAX_ROUNDS=10`.
|
||||
|
||||
Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS` for Phases 5 and 8.
|
||||
|
||||
Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`.
|
||||
|
||||
If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:
|
||||
|
||||
```bash
|
||||
pi --version
|
||||
```
|
||||
|
||||
For shorthand `pi/<pi-model-name>`, split only on the first slash when the prefix is exactly `pi`; store the complete remainder in `REVIEWER_MODEL`. Examples: `pi/claude-opus-4-7` -> `claude-opus-4-7`, `pi/anthropic/claude-opus-4-7` -> `anthropic/claude-opus-4-7`, and `pi/openrouter/anthropic/claude-opus-4-7` -> `openrouter/anthropic/claude-opus-4-7`.
|
||||
|
||||
When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the model running this workflow. If the model/provider is unavailable, surface helper stderr/status and use `pi --list-models [search]` to inspect configured models.
|
||||
|
||||
### Phase 4: Initialize Plan Workspace
|
||||
|
||||
Codex has no plan-mode concept; there is no plan-mode guard here.
|
||||
|
||||
Steps:
|
||||
|
||||
1. Compute slug: `YYYY-MM-DD-<slug>` where `<slug>` is a kebab-case hash of the task goal (lowercase, alphanumeric + hyphens only).
|
||||
2. Compute plan folder: `ai_plan/<slug>/`.
|
||||
3. **Resume detection:** If the folder already exists, read `task-plan.md`:
|
||||
- If `Status` is `draft` or `plan-approved` or `implementation-in-progress`: offer to resume, pick a new suffix (`<slug>-v2`), or abort. Default is resume.
|
||||
- If `Status` is any terminal value (`pushed`, `local-only`, `aborted-*`, `failed`): offer a new suffix or abort. Default is new suffix.
|
||||
4. If not resuming, create the folder and write `task-plan.md` from the template at `templates/task-plan.md` (this skill's template folder; falls back to `~/.codex/skills/do-task/templates/task-plan.md` when installed directly).
|
||||
5. Fill in:
|
||||
- `Metadata` block.
|
||||
- `Prompt` (verbatim).
|
||||
- `Interpretation`, `Assumptions`, `Files`, `Approach`, `TDD Approach`, `Acceptance Criteria`, `Verification`, `Rollback`.
|
||||
- Leave `Runtime State`, `Review History`, `Final Status` empty (skill updates these).
|
||||
6. Set `Status: draft`.
|
||||
|
||||
**Worktree branch:** If the prompt opts in to a worktree (see Trigger Phrase Detection), invoke `superpowers:using-git-worktrees` via native discovery before proceeding. Otherwise continue on the current branch.
|
||||
|
||||
### Phase 5: Plan Review Loop
|
||||
|
||||
If `REVIEWER_CLI=skip`, present `task-plan.md` to the user and proceed only after explicit user approval.
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```text
|
||||
REVIEW_KIND = plan
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
|
||||
PAYLOAD_PATH = /tmp/do-task-plan-${REVIEW_ID}.md
|
||||
PROMPT_TEMPLATE = PLAN_REVIEW_PROMPT (see below)
|
||||
SESSION_ID_VAR = CODEX_PLAN_SESSION_ID | CURSOR_PLAN_SESSION_ID | OPENCODE_PLAN_SESSION_ID
|
||||
```
|
||||
|
||||
Payload is the current `task-plan.md` **with the `Runtime State` and `Review History` blocks stripped** before writing to `PAYLOAD_PATH`. Those two blocks contain reviewer session IDs and scan outcomes that must never be sent back to any reviewer CLI. Reviewers only need the Prompt, Interpretation, Assumptions, Files, Approach, TDD Approach, Acceptance Criteria, Verification, Rollback, and Metadata sections.
|
||||
|
||||
`PLAN_REVIEW_PROMPT`:
|
||||
|
||||
```text
|
||||
Review this task plan for completeness, correctness, and risk. Focus on:
|
||||
1. Does the plan match the user's prompt?
|
||||
2. Are all assumptions surfaced?
|
||||
3. Are acceptance criteria testable?
|
||||
4. Is the TDD approach appropriate per the TDD Approach section?
|
||||
5. Are there missing files, risks, or security concerns?
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use `- None.` when a severity has no findings.
|
||||
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
|
||||
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`.
|
||||
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking.
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: plan-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 6.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-plan-review`.
|
||||
- Send Telegram summary before stopping.
|
||||
- Ask the user whether to override and proceed, restart, or abort.
|
||||
|
||||
### Phase 6: Execute (TDD-first where applicable)
|
||||
|
||||
Native orchestration — do not invoke `superpowers:executing-plans`.
|
||||
|
||||
1. Set `Status: implementation-in-progress`.
|
||||
2. For every behavior-changing file edit:
|
||||
- Invoke `superpowers:test-driven-development` via native discovery.
|
||||
- Write the failing test first. Run it. Confirm it fails.
|
||||
- Implement the minimal code to make it pass. Run the test. Confirm green.
|
||||
- Do NOT commit yet — a single task commit happens in Phase 9.
|
||||
3. Auto-skip of TDD is permitted ONLY for tasks classified in `task-plan.md` TDD Approach as:
|
||||
- `pure-documentation`
|
||||
- `pure-comment-whitespace-rename`
|
||||
4. Any other skip (including `pure-config-addition`) requires explicit user approval recorded in `task-plan.md` with an ISO-8601 timestamp.
|
||||
5. Update `task-plan.md` after each logical step: add notes to `Approach`, check off `Acceptance Criteria` items as they complete.
|
||||
|
||||
### Phase 7: Verification Gate
|
||||
|
||||
Invoke `superpowers:verification-before-completion` via native discovery.
|
||||
|
||||
Run the commands listed in the `Verification` section of `task-plan.md`:
|
||||
|
||||
- Lint (changed files first).
|
||||
- Typecheck.
|
||||
- Tests (targeted first, then broader suite if quick).
|
||||
|
||||
All must pass. If a command fails:
|
||||
|
||||
- Fix the issue.
|
||||
- Re-run that command.
|
||||
- Increment `verification_attempts` in Runtime State.
|
||||
|
||||
If `verification_attempts` exceeds 3 without green:
|
||||
|
||||
- Set `Status: aborted-verification`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to retry, override, or abort.
|
||||
|
||||
### Phase 8: Implementation Review Loop
|
||||
|
||||
If `REVIEWER_CLI=skip`, present a diff + verification summary to the user and proceed only after explicit user approval.
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```text
|
||||
REVIEW_KIND = implementation
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8) # distinct from plan-review ID
|
||||
PAYLOAD_PATH = /tmp/do-task-implementation-${REVIEW_ID}.md
|
||||
PROMPT_TEMPLATE = IMPL_REVIEW_PROMPT (see below)
|
||||
SESSION_ID_VAR = CODEX_IMPL_SESSION_ID | CURSOR_IMPL_SESSION_ID | OPENCODE_IMPL_SESSION_ID
|
||||
```
|
||||
|
||||
Payload contents (assembled by the skill):
|
||||
|
||||
```markdown
|
||||
# Implementation Review: [Short Title]
|
||||
|
||||
## Task Plan (the plan that was approved)
|
||||
<embed approved task-plan.md, excluding Runtime State block>
|
||||
|
||||
## Changes Made (git diff)
|
||||
<output of: `git diff` for unstaged + `git diff --staged` for staged>
|
||||
|
||||
## Verification Output
|
||||
### Lint
|
||||
<lint output>
|
||||
### Typecheck
|
||||
<typecheck output>
|
||||
### Tests
|
||||
<test output, pass/fail counts>
|
||||
```
|
||||
|
||||
`IMPL_REVIEW_PROMPT`:
|
||||
|
||||
```text
|
||||
Review this implementation against the task plan. Focus on:
|
||||
1. Correctness — Does the diff satisfy the Acceptance Criteria?
|
||||
2. Code quality — Clean, maintainable, no obvious issues?
|
||||
3. Test coverage — Are behavior changes adequately tested (per the plan's TDD Approach)?
|
||||
4. Security — Any security concerns introduced?
|
||||
5. Regressions — Does the diff risk breaking unrelated code?
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use `- None.` when a severity has no findings.
|
||||
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
|
||||
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`.
|
||||
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking.
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: implementation-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 9.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-impl-review`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to override and commit anyway, restart, or abort.
|
||||
|
||||
### Phase 9: Commit + Push Ask
|
||||
|
||||
Invoke `superpowers:finishing-a-development-branch` via native discovery.
|
||||
|
||||
1. Stage all changed files explicitly (avoid `git add -A`).
|
||||
2. Single commit with message derived from the task goal:
|
||||
- Format: `<type>(<scope>): <short description>`
|
||||
- Example: `feat(auth): add session token rotation`
|
||||
3. Do NOT push. Update `Status: local-only`.
|
||||
4. Ask the user: "Push to remote? (yes / no)"
|
||||
- On explicit `yes` → push, then set `Status: pushed`.
|
||||
- Any other response → leave `Status: local-only`.
|
||||
|
||||
### Phase 10: Telegram Notification + Finalize
|
||||
|
||||
Resolve the notifier helper:
|
||||
|
||||
```bash
|
||||
TELEGRAM_NOTIFY_RUNTIME=~/.codex/skills/reviewer-runtime/notify-telegram.sh
|
||||
```
|
||||
|
||||
On every terminal outcome (`pushed`, `local-only`, `aborted-*`, `failed`), send a Telegram summary if both `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` are set:
|
||||
|
||||
```bash
|
||||
if [ -x "$TELEGRAM_NOTIFY_RUNTIME" ] && [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
|
||||
"$TELEGRAM_NOTIFY_RUNTIME" --message "do-task <slug>: <status summary>"
|
||||
fi
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- Telegram is the only supported notification path.
|
||||
- Notification failures are non-blocking but must be surfaced to the user.
|
||||
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
|
||||
- If Telegram is not configured, state that no Telegram notification was sent.
|
||||
|
||||
Fill in `Final Status` in `task-plan.md` (include commit hash if any). Do NOT delete the plan folder — it stays as a record.
|
||||
|
||||
---
|
||||
|
||||
## Review Loop (Shared Subroutine)
|
||||
|
||||
This subroutine is invoked twice per `do-task` run: once in Phase 5 (`REVIEW_KIND=plan`) and once in Phase 8 (`REVIEW_KIND=implementation`). Separate session IDs are used for each loop so reviewer context never leaks across loops.
|
||||
|
||||
### Subroutine Inputs
|
||||
|
||||
| Variable | Purpose |
|
||||
|----------|---------|
|
||||
| `REVIEW_KIND` | `plan` or `implementation` |
|
||||
| `REVIEW_ID` | 8-char hex (from `uuidgen`); reused across rounds of the same loop |
|
||||
| `PAYLOAD_PATH` | `/tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md` |
|
||||
| `PROMPT_TEMPLATE` | `PLAN_REVIEW_PROMPT` or `IMPL_REVIEW_PROMPT` |
|
||||
| `REVIEWER_CLI` | `codex` \| `claude` \| `cursor` \| `opencode` \| `pi` |
|
||||
| `REVIEWER_MODEL` | Model name |
|
||||
| `MAX_ROUNDS` | Default 10 |
|
||||
| `SESSION_ID_VAR` | `CODEX_PLAN_SESSION_ID` \| `CODEX_IMPL_SESSION_ID` \| `CURSOR_PLAN_SESSION_ID` \| `CURSOR_IMPL_SESSION_ID` \| `OPENCODE_PLAN_SESSION_ID` \| `OPENCODE_IMPL_SESSION_ID` |
|
||||
|
||||
Temp artifact paths (per loop):
|
||||
|
||||
- `/tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md` — payload
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md` — normalized review text
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json` — raw Cursor/OpenCode JSON (cursor only, plus opencode when `--format json` is used)
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr`
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status`
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out`
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh`
|
||||
|
||||
Resolve the shared helper:
|
||||
|
||||
```bash
|
||||
REVIEWER_RUNTIME=~/.codex/skills/reviewer-runtime/run-review.sh
|
||||
```
|
||||
|
||||
Set helper success-artifact args before writing the command script:
|
||||
|
||||
```bash
|
||||
HELPER_SUCCESS_FILE_ARGS=()
|
||||
case "$REVIEWER_CLI" in
|
||||
codex)
|
||||
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md)
|
||||
;;
|
||||
cursor)
|
||||
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json)
|
||||
;;
|
||||
esac
|
||||
```
|
||||
|
||||
### Step 1: Write Payload
|
||||
|
||||
Write the full payload for this round to `PAYLOAD_PATH`.
|
||||
|
||||
### Step 1a: Secret Scan (per-payload, no caching)
|
||||
|
||||
**BEFORE** sending the payload to any reviewer CLI, scan it for secrets. This scan runs EVERY round — no results are cached. Rationale: Phase 8 payloads include newly-introduced diff content that earlier rounds never saw.
|
||||
|
||||
Run the secret scan with all of these anchored regexes. Use `grep -En` on the payload file:
|
||||
|
||||
```bash
|
||||
SECRET_REGEX_FILE=$(mktemp)
|
||||
cat >"$SECRET_REGEX_FILE" <<'EOF'
|
||||
AKIA[0-9A-Z]{16}
|
||||
"type"\s*:\s*"service_account"
|
||||
(ghp|gho|ghs|ghu|ghr)_[A-Za-z0-9]{36,}
|
||||
xox[abpsr]-[0-9]+-[0-9]+-[0-9]+-[A-Za-z0-9]{24,}
|
||||
xox[abpsr]-[A-Za-z0-9]{10,48}
|
||||
sk-(proj-)?[A-Za-z0-9_-]{20,}
|
||||
sk-ant-(api|admin)[0-9]+-[A-Za-z0-9_-]{20,}
|
||||
-----BEGIN [A-Z ]+ PRIVATE KEY-----
|
||||
(TOKEN|SECRET|PASSWORD|API_?KEY|ACCESS_?KEY)\s*=\s*["']?[A-Za-z0-9+/=_-]{8,}
|
||||
eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+
|
||||
EOF
|
||||
|
||||
SCAN_MATCHES=$(grep -Ensf "$SECRET_REGEX_FILE" "$PAYLOAD_PATH" || true)
|
||||
rm -f "$SECRET_REGEX_FILE"
|
||||
```
|
||||
|
||||
If `SCAN_MATCHES` is non-empty:
|
||||
|
||||
1. **Redact the matched text before surfacing** — never echo the raw secret to the user, chat log, terminal scrollback, or any persistent file. Replace each matched substring with a fixed token that preserves only the fact of a match: `[REDACTED:<pattern-label>:<match-length>-chars]`. Example: a matched AWS key becomes `[REDACTED:aws-access-key:20-chars]`. Keep the file path and line number; they are useful for the user and not secret.
|
||||
2. Present the redacted match summary to the user using this exact wording:
|
||||
|
||||
```text
|
||||
SECRET-SCAN MATCH in outbound reviewer payload (loop: ${REVIEW_KIND}, round: N):
|
||||
<file>:<line>: [REDACTED:<pattern-label>:<match-length>-chars]
|
||||
...
|
||||
Proceed with sending this payload to ${REVIEWER_CLI}? (yes / no / redact)
|
||||
```
|
||||
|
||||
Pattern labels: `aws-access-key`, `gcp-service-account`, `github-token`, `slack-token`, `openai-key`, `anthropic-key`, `pem-private-key`, `dotenv-style`, `jwt`.
|
||||
|
||||
3. Wait for user response.
|
||||
4. On `yes`: record `last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches` in Runtime State, and proceed.
|
||||
5. On `redact`: ask the user to supply redactions, apply them to `PAYLOAD_PATH`, re-scan (this step), record `last_scan_outcome_${REVIEW_KIND}=redacted-and-approved`.
|
||||
6. On `no`: stop the loop, set `Status: failed`, send Telegram, return to the user.
|
||||
|
||||
If `SCAN_MATCHES` is empty, record `last_scan_outcome_${REVIEW_KIND}=clean` and proceed.
|
||||
|
||||
### Step 2: Generate Reviewer Command Script
|
||||
|
||||
Write the reviewer invocation to `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh` as a bash script starting with:
|
||||
|
||||
```bash
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call every round (Pi reviewer calls do not use session resume):
|
||||
|
||||
```bash
|
||||
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files \
|
||||
--model "$REVIEWER_MODEL" \
|
||||
--tools read,grep,find,ls \
|
||||
-p "Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review. Return exactly the required ## Summary, ## Findings, and ## Verdict structure."
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `codex`:**
|
||||
|
||||
Round 1 — fresh `codex exec`:
|
||||
|
||||
```bash
|
||||
codex exec \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
-s read-only \
|
||||
-o /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
"Review the ${REVIEW_KIND} payload in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
${PROMPT_TEMPLATE}"
|
||||
```
|
||||
|
||||
Do not capture the Codex session ID yet. After Round 1 completes, extract it from `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out` (look for `session id: <uuid>`) and persist it to Runtime State under `${SESSION_ID_VAR}`.
|
||||
|
||||
Round 2 and later — resume session:
|
||||
|
||||
```bash
|
||||
codex exec resume ${SESSION_ID_VAR_VALUE} \
|
||||
-o /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
"I've revised based on your feedback. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before.
|
||||
Keep findings ordered P0 to P3, use '- None.' when a severity has no findings, and only use VERDICT: APPROVED when no P0, P1, or P2 findings remain."
|
||||
```
|
||||
|
||||
If resume fails, fall back to fresh `codex exec` with prior-round context.
|
||||
|
||||
**If `REVIEWER_CLI` is `claude`:**
|
||||
|
||||
Fresh call every round (Claude CLI has no session resume):
|
||||
|
||||
```bash
|
||||
claude -p \
|
||||
"${ROUND_PREFIX}Review the following ${REVIEW_KIND} payload.
|
||||
|
||||
$(cat /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md)
|
||||
|
||||
${PROMPT_TEMPLATE}" \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--strict-mcp-config \
|
||||
--setting-sources user
|
||||
```
|
||||
|
||||
Where `${ROUND_PREFIX}` is empty for Round 1 and `"You previously reviewed this ${REVIEW_KIND} and requested revisions. Previous feedback summary: [key points]. "` for subsequent rounds.
|
||||
|
||||
**If `REVIEWER_CLI` is `cursor`:**
|
||||
|
||||
Round 1:
|
||||
|
||||
```bash
|
||||
cursor-agent -p \
|
||||
--mode=ask \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--trust \
|
||||
--output-format json \
|
||||
"Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review.
|
||||
|
||||
${PROMPT_TEMPLATE}" \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Round 2 and later — resume:
|
||||
|
||||
```bash
|
||||
cursor-agent --resume ${SESSION_ID_VAR_VALUE} -p \
|
||||
--mode=ask \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--trust \
|
||||
--output-format json \
|
||||
"I've revised based on your feedback. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
If resume fails, fall back to fresh `cursor-agent -p`.
|
||||
|
||||
After the command completes, extract the session id and review text:
|
||||
|
||||
```bash
|
||||
CURSOR_SID=$(jq -r '.session_id' /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json)
|
||||
jq -r '.result' /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
Persist `CURSOR_SID` to Runtime State under `${SESSION_ID_VAR}` on Round 1.
|
||||
|
||||
**If `REVIEWER_CLI` is `opencode`:**
|
||||
|
||||
OpenCode does not expose a dedicated read-only flag at the CLI level; use the built-in `plan` primary agent (`--agent plan`) for review, which is read-oriented and does not modify files. Session resume is supported via `-s <session-id>`, but the most reliable pattern for non-interactive review is **fresh call each round** (like `claude`) because opencode's session lifecycle and ID capture are less standardized than codex/cursor for headless runs. Skills MAY opt-in to session resume when they have verified the installed opencode version exposes a stable session id in `--format json` output.
|
||||
|
||||
Round 1 (preferred, fresh call):
|
||||
|
||||
```bash
|
||||
opencode run \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review.
|
||||
|
||||
${PROMPT_TEMPLATE}" \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Round 2 and later (fresh-call fallback path — recommended default):
|
||||
|
||||
```bash
|
||||
opencode run \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"You previously reviewed this ${REVIEW_KIND} and requested revisions.
|
||||
|
||||
Previous feedback summary: [key points from last review]
|
||||
|
||||
I've revised. Updated payload is below.
|
||||
|
||||
$(cat /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md)
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Optional session-resume path (only if the installed opencode reliably emits a session id in `--format json` output and accepts it back via `-s`):
|
||||
|
||||
```bash
|
||||
# Round 2+ with resume
|
||||
opencode run \
|
||||
-s ${SESSION_ID_VAR_VALUE} \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"I've revised. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Extract the review body (the JSON stream emits events; the final assistant message contains the review text):
|
||||
|
||||
```bash
|
||||
jq -r '.[] | select(.type == "message" and .role == "assistant") | .content' \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
|| cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
If the JSON parse falls through, promote the raw JSON file as the review output and surface a warning to the user. On any opencode CLI or JSON parsing failure, treat this loop round as `completed-empty-output` and follow the helper-failure escalation in Step 6.
|
||||
|
||||
### Step 3: Run via `run-review.sh`
|
||||
|
||||
Run the command script through the shared helper when available:
|
||||
|
||||
```bash
|
||||
if [ -x "$REVIEWER_RUNTIME" ]; then
|
||||
"$REVIEWER_RUNTIME" \
|
||||
--command-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh \
|
||||
--stdout-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
--stderr-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr \
|
||||
--status-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status \
|
||||
"${HELPER_SUCCESS_FILE_ARGS[@]}"
|
||||
else
|
||||
echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
|
||||
bash /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh \
|
||||
>/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
2>/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr
|
||||
fi
|
||||
```
|
||||
|
||||
Run the helper in the foreground and watch live stdout for `state=in-progress` heartbeats. If your agent environment buffers command output until exit, start the helper in the background and poll the `.status` file instead of treating heartbeats as post-hoc-only data.
|
||||
|
||||
### Step 4: Promote Reviewer Output + Capture Session ID
|
||||
|
||||
After the command completes:
|
||||
|
||||
- `cursor`: already promoted in Step 2 via `jq -r '.result' ...`. Also capture `session_id` if first round.
|
||||
- `codex`: extract `CODEX_SESSION_ID` from `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text lives only in `.runner.out`, `cp` it into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `claude` or `pi`: promote `.runner.out` into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `opencode`: already promoted in Step 2 via `jq` on the JSON stream. If opt-in session-resume is active and the JSON includes a stable session id, capture it and persist to `${SESSION_ID_VAR}`.
|
||||
|
||||
On Round 1, persist the captured session ID (if any) into `task-plan.md`'s Runtime State under `${SESSION_ID_VAR}`.
|
||||
|
||||
### Step 5: Parse Verdict + Update Review History
|
||||
|
||||
1. Read `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md`.
|
||||
2. Append one row to `task-plan.md` Review History:
|
||||
- Timestamp (ISO-8601 UTC).
|
||||
- Loop (`plan` or `implementation`).
|
||||
- Round number.
|
||||
- Verdict (`APPROVED` or `REVISE`).
|
||||
- Summary (first line of the `## Summary` section).
|
||||
3. Increment `plan_review_round` or `implementation_review_round` in Runtime State.
|
||||
|
||||
### Step 6: Branch APPROVED / REVISE / MAX_ROUNDS
|
||||
|
||||
Verdict rules:
|
||||
|
||||
- **VERDICT: APPROVED** with no `P0`, `P1`, or `P2` findings → exit the subroutine with `APPROVED`.
|
||||
- **VERDICT: APPROVED** with only `P3` findings → optionally fix the `P3` items if cheap and safe, then exit with `APPROVED`.
|
||||
- **VERDICT: REVISE** or any `P0`, `P1`, or `P2` finding → go to revision (see below), then return to Step 1 for the next round.
|
||||
- No clear verdict but `P0`, `P1`, and `P2` are all `- None.` → treat as APPROVED.
|
||||
- Helper state `completed-empty-output` → treat as failed review attempt, surface `.stderr`/`.status`, fix invocation or prompt handling, then retry.
|
||||
- Helper state `needs-operator-decision` → surface status log and decide whether to extend the timeout, abort, or retry with different helper parameters.
|
||||
- Round counter ≥ `MAX_ROUNDS` → exit the subroutine with `MAX_ROUNDS`. Caller decides next action per Phase 5 or Phase 8.
|
||||
|
||||
**Revision:** The caller (Phase 5 for plan, Phase 6/7 for implementation) applies findings in priority order (`P0` → `P1` → `P2` → `P3`). For implementation review revisions, Phase 7 verification must be re-run after every revision before returning to Step 1.
|
||||
|
||||
### Step 7: Liveness Contract (during Step 3)
|
||||
|
||||
- The shared reviewer runtime emits `state=in-progress note="In progress N"` heartbeats every 60 seconds while the reviewer child is alive.
|
||||
- Keep waiting as long as a fresh `In progress N` heartbeat keeps arriving roughly once per minute.
|
||||
- Do not abort just because the review is slow, a soft timeout fired, or a `stall-warning` line appears, as long as the `In progress N` heartbeat continues.
|
||||
- Treat missing heartbeats, `state=failed`, `state=completed-empty-output`, and `state=needs-operator-decision` as escalation signals.
|
||||
|
||||
### Step 8: Cleanup (on successful round exit)
|
||||
|
||||
```bash
|
||||
rm -f /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh
|
||||
```
|
||||
|
||||
If the round failed, produced empty output, or reached operator-decision timeout, KEEP `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them.
|
||||
|
||||
---
|
||||
|
||||
## Resume Semantics
|
||||
|
||||
1. Detect existing plan folder by slug at Phase 4.
|
||||
2. Read `task-plan.md` → `Status`.
|
||||
3. Decide next action:
|
||||
|
||||
| Status | Action |
|
||||
|--------|--------|
|
||||
| `draft` | Resume at Phase 5 (plan review) |
|
||||
| `plan-approved` | Resume at Phase 6 (execute) |
|
||||
| `implementation-in-progress` | Resume at Phase 6 (continue execute) |
|
||||
| `implementation-approved` | Resume at Phase 9 (commit + push ask) |
|
||||
| `pushed` \| `local-only` | Ask user: new suffix, abort, or replay for reference only |
|
||||
| `aborted-*` \| `failed` | Offer new suffix or full restart |
|
||||
|
||||
4. When resuming, read Runtime State for `CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, `OPENCODE_IMPL_SESSION_ID`, and the round counters. If a session ID is populated, use it for the first revision round in that loop (Round 2) via `codex exec resume`, `cursor-agent --resume`, or `opencode run -s <id>` as applicable.
|
||||
|
||||
---
|
||||
|
||||
## Tracker Discipline (MANDATORY)
|
||||
|
||||
**ALWAYS update `task-plan.md` before/after each phase transition. NEVER proceed with stale state.**
|
||||
|
||||
Before starting any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Update `last_phase_entered` in Runtime State.
|
||||
|
||||
After completing any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Append notes to the relevant section of `task-plan.md`.
|
||||
|
||||
Review History is append-only.
|
||||
|
||||
---
|
||||
|
||||
## Execution Workflow Rules
|
||||
|
||||
- Current branch is the default; worktree is opt-in only.
|
||||
- Do NOT push without explicit "yes".
|
||||
- Secret scan runs **per-payload, no caching** — every round, including revisions.
|
||||
- Review loops use `MAX_ROUNDS=10` by default, shared across both loops.
|
||||
- The task commit is a single commit created in Phase 9; interim WIP commits are NOT created.
|
||||
- The `.gitignore` infra commit in Phase 1 is explicitly separate from the task commit and is allowed even on abort.
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
- [ ] `ai_plan/` exists and `/ai_plan/` is in `.gitignore`
|
||||
- [ ] `task-plan.md` created under `ai_plan/YYYY-MM-DD-<slug>/`
|
||||
- [ ] Reviewer CLI + model + `MAX_ROUNDS` configured (or `skip`)
|
||||
- [ ] Secret scan ran on every outbound reviewer payload
|
||||
- [ ] Plan review completed (APPROVED, MAX_ROUNDS handled, or skipped)
|
||||
- [ ] Phase 6 executed TDD-first for all behavior-changing steps (or documented skip)
|
||||
- [ ] Phase 7 verification green before Phase 8
|
||||
- [ ] Implementation review completed (APPROVED, MAX_ROUNDS handled, or skipped)
|
||||
- [ ] Single task commit created locally, no push without explicit yes
|
||||
- [ ] Telegram notification attempted if configured
|
||||
- [ ] `task-plan.md` Final Status filled in
|
||||
|
||||
---
|
||||
|
||||
## Variant Hardening Notes — Codex
|
||||
|
||||
- Must use native skill discovery from `~/.agents/skills/` (no CLI wrappers).
|
||||
- Must verify Superpowers skills symlink: `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills`
|
||||
- Must invoke required sub-skills with explicit announcements before any action.
|
||||
- Must track checklist-driven sub-skills with `update_plan` todos (the Codex equivalent of `TodoWrite`).
|
||||
- `Task` subagents are unavailable in Codex — do the work directly and state the limitation in the plan if a subagent was expected.
|
||||
- Deprecated CLI commands (`superpowers-codex bootstrap`, `use-skill`) must NOT be used.
|
||||
- Helper paths are `~/.codex/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}`.
|
||||
- No plan-mode guard (Codex has no plan-mode concept).
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
- Using deprecated commands like `superpowers-codex bootstrap` or `superpowers-codex use-skill`.
|
||||
- Jumping to Phase 6 execution without running `superpowers:brainstorming` on behavior-changing work.
|
||||
- Asking multiple clarifying questions in a single message.
|
||||
- Forgetting `update_plan` tracking for sub-skill checklists.
|
||||
- Forgetting to create/update `.gitignore` for `/ai_plan/`.
|
||||
- Skipping the per-payload secret scan because "the previous round was clean".
|
||||
- Pushing the task commit without explicit user approval.
|
||||
|
||||
## Red Flags — Stop and Correct
|
||||
|
||||
- You are about to run any `superpowers-codex` command.
|
||||
- You are writing `task-plan.md` before the brainstorming step (when required) completes.
|
||||
- You are pushing commits without user approval.
|
||||
- You did not announce which skill you invoked and why.
|
||||
- You are proceeding to implementation review with failing lint/typecheck/tests.
|
||||
- You are echoing raw secret-scan matches to the user or logs.
|
||||
@@ -0,0 +1,143 @@
|
||||
# Task Plan: [Short Title]
|
||||
|
||||
> **Variant guardrail (Codex):** Sub-skills (`brainstorming`, `test-driven-development`, `verification-before-completion`, `finishing-a-development-branch`, `using-git-worktrees`) MUST be invoked through native skill discovery from `~/.agents/skills/superpowers/<skill>/SKILL.md` — no `superpowers-codex` CLI wrappers. Checklist-driven sub-skills MUST track items with `update_plan` todos.
|
||||
|
||||
## Metadata
|
||||
|
||||
| Field | Value |
|
||||
|-------|-------|
|
||||
| Created | YYYY-MM-DD |
|
||||
| Slug | YYYY-MM-DD-<slug> |
|
||||
| Runtime | codex |
|
||||
| Reviewer CLI | codex \| claude \| cursor \| opencode \| pi |
|
||||
| Reviewer Model | <model> |
|
||||
| MAX_ROUNDS | 10 |
|
||||
| Branch Strategy | current-branch \| worktree |
|
||||
| Branch Name | <current branch name, or new branch name when worktree is used> |
|
||||
| Worktree Path | <absolute path to worktree dir; blank when Branch Strategy = current-branch> |
|
||||
| Status | draft |
|
||||
|
||||
### Status Enum (authoritative)
|
||||
|
||||
| Value | Meaning |
|
||||
|-------|---------|
|
||||
| `draft` | Newly created; plan review not yet started |
|
||||
| `plan-approved` | Plan review loop returned APPROVED |
|
||||
| `implementation-in-progress` | Phase 6 executing |
|
||||
| `implementation-approved` | Phase 8 review loop returned APPROVED; awaiting commit |
|
||||
| `pushed` | Committed + pushed to remote |
|
||||
| `local-only` | Committed locally; user declined push |
|
||||
| `aborted-plan-review` | MAX_ROUNDS reached in Phase 5; user aborted |
|
||||
| `aborted-impl-review` | MAX_ROUNDS reached in Phase 8; user aborted |
|
||||
| `aborted-verification` | Phase 7 retries exhausted; user aborted |
|
||||
| `failed` | Hard tooling failure |
|
||||
|
||||
---
|
||||
|
||||
## Prompt
|
||||
|
||||
<!-- Exact user prompt, verbatim. -->
|
||||
|
||||
## Interpretation
|
||||
|
||||
<!-- Short restatement of goal + out-of-scope items. -->
|
||||
|
||||
## Assumptions
|
||||
|
||||
<!-- Anything we're assuming and needs confirmation. Empty list OK after clarifying questions. -->
|
||||
|
||||
## Files
|
||||
|
||||
<!-- Files expected to be created / modified / deleted. Paths are absolute or repo-relative. -->
|
||||
|
||||
| Action | Path | Why |
|
||||
|--------|------|-----|
|
||||
| | | |
|
||||
|
||||
## Approach
|
||||
|
||||
<!-- 3-10 bullets describing implementation order. -->
|
||||
|
||||
## TDD Approach
|
||||
|
||||
<!-- One of:
|
||||
(a) **TDD applies** — list the failing test(s) to write first, then implementation, then confirm green.
|
||||
(b) **TDD auto-skipped** — reason must be exactly one of:
|
||||
- `pure-documentation`
|
||||
- `pure-comment-whitespace-rename`
|
||||
(c) **TDD user-approved skip** — user explicitly approved skipping TDD for this task.
|
||||
Record the approval timestamp (ISO-8601) and the specific reason (e.g., `pure-config-addition`).
|
||||
-->
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] <criterion 1>
|
||||
- [ ] <criterion 2>
|
||||
|
||||
## Verification
|
||||
|
||||
<!-- Commands to run:
|
||||
lint: <cmd>
|
||||
typecheck: <cmd>
|
||||
tests: <cmd>
|
||||
-->
|
||||
|
||||
## Rollback
|
||||
|
||||
<!-- How to undo: `git revert <hash>`, or manual steps if the change is not easily revertable. -->
|
||||
|
||||
---
|
||||
|
||||
## Runtime State
|
||||
|
||||
<!-- Updated by the skill at runtime. Used to detect resume and to persist reviewer session IDs across rounds. -->
|
||||
|
||||
```yaml
|
||||
plan_review_round: 0
|
||||
implementation_review_round: 0
|
||||
CODEX_PLAN_SESSION_ID:
|
||||
CODEX_IMPL_SESSION_ID:
|
||||
CURSOR_PLAN_SESSION_ID:
|
||||
CURSOR_IMPL_SESSION_ID:
|
||||
OPENCODE_PLAN_SESSION_ID:
|
||||
OPENCODE_IMPL_SESSION_ID:
|
||||
last_phase_entered:
|
||||
last_round_ts:
|
||||
last_scan_outcome_plan:
|
||||
last_scan_outcome_impl:
|
||||
verification_attempts: 0
|
||||
tests_added_count: 0
|
||||
tdd_used: false
|
||||
```
|
||||
|
||||
## Review History
|
||||
|
||||
<!-- Append one entry per reviewer round, both loops. -->
|
||||
|
||||
| Timestamp (ISO-8601) | Loop | Round | Verdict | Summary |
|
||||
|----------------------|------|-------|---------|---------|
|
||||
| | | | | |
|
||||
|
||||
## Final Status
|
||||
|
||||
<!-- Filled at the terminal outcome (phase 9/10). Populate at least:
|
||||
- Terminal status (one of the 10 Status enum values)
|
||||
- Commit hash (if any)
|
||||
- Plan-review rounds used / MAX_ROUNDS
|
||||
- Implementation-review rounds used / MAX_ROUNDS
|
||||
- TDD used (true|false)
|
||||
- Tests added count
|
||||
- Verification attempts used
|
||||
- Last round ISO-8601 timestamp
|
||||
- Notes (anything the user should know when revisiting)
|
||||
-->
|
||||
|
||||
---
|
||||
|
||||
## Guardrails (do NOT remove)
|
||||
|
||||
- This file is the single persistent artifact for `do-task`. Do not split it or delete it on success.
|
||||
- `Status` must always match one of the 10 enum values.
|
||||
- `Runtime State` is updated by the skill, not by the user.
|
||||
- Review History is append-only.
|
||||
- `last_scan_outcome_plan` and `last_scan_outcome_impl` record the most recent secret-scan result for each loop. They are informational; the scan itself runs per-payload with no caching.
|
||||
@@ -0,0 +1,853 @@
|
||||
---
|
||||
name: do-task
|
||||
description: Execute a single user-supplied prompt end-to-end with two reviewer loops (plan review + implementation review) in Cursor Agent CLI. ALWAYS invoke when the user says `/do-task`, "do this task", "do task ...", "execute this task", or "make it so". Also invoke on the hint phrase "just do ...". Do NOT invoke on "implement this" (that phrase is reserved for implement-plan).
|
||||
---
|
||||
|
||||
# Do Task (Cursor Agent CLI)
|
||||
|
||||
Execute an ad-hoc user prompt end-to-end: parse → clarify → plan (with reviewer loop) → implement (TDD-first where applicable) → verify → implementation review loop → commit → optional push → notify.
|
||||
|
||||
This is a single-artifact sibling of `create-plan` + `implement-plan`. Unlike `implement-plan`, `do-task` operates on one persistent `task-plan.md` (not a full milestone plan) and defaults to the **current branch** (not a worktree).
|
||||
|
||||
**Core principle:** Cursor Agent CLI discovers skills from `.cursor/skills/` (repo-local), `~/.cursor/skills/` (global), and installed Cursor plugin cache entries. It also reads `AGENTS.md` at the repo root for additional instructions.
|
||||
|
||||
## Prerequisite Check (MANDATORY)
|
||||
|
||||
Required:
|
||||
|
||||
- Cursor Agent CLI: `cursor-agent --version` (install via `curl https://cursor.com/install -fsS | bash`). Binary is `cursor-agent`; the alias `cursor agent` also works.
|
||||
- `jq` (**required** — `do-task` always parses JSON output from at least the cursor reviewer branch, and other reviewers may produce JSON). Install via `brew install jq` (macOS) or your package manager. Verify: `jq --version`.
|
||||
- Superpowers repo: `https://github.com/obra/superpowers`
|
||||
- Superpowers skills available from the Cursor plugin cache, `.cursor/skills/` (repo-local), or `~/.cursor/skills/` (global). Do not install both the plugin and a manual Superpowers copy, or Cursor may show duplicate skill entries.
|
||||
- `superpowers:brainstorming`
|
||||
- `superpowers:test-driven-development`
|
||||
- `superpowers:verification-before-completion`
|
||||
- `superpowers:finishing-a-development-branch`
|
||||
- `superpowers:using-git-worktrees` (only when the prompt opts in to a worktree)
|
||||
- Shared reviewer runtime: `.cursor/skills/reviewer-runtime/run-review.sh` (repo-local, preferred) OR `~/.cursor/skills/reviewer-runtime/run-review.sh` (global fallback)
|
||||
- Telegram notifier helper: `.cursor/skills/reviewer-runtime/notify-telegram.sh` OR `~/.cursor/skills/reviewer-runtime/notify-telegram.sh`
|
||||
|
||||
Verify before proceeding:
|
||||
|
||||
```bash
|
||||
cursor-agent --version
|
||||
jq --version
|
||||
test -f .cursor/skills/superpowers/skills/brainstorming/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/brainstorming/SKILL.md || find ~/.cursor/plugins/cache/cursor-public/superpowers -path '*/skills/brainstorming/SKILL.md' -print -quit 2>/dev/null | grep -q .
|
||||
test -f .cursor/skills/superpowers/skills/test-driven-development/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/test-driven-development/SKILL.md || find ~/.cursor/plugins/cache/cursor-public/superpowers -path '*/skills/test-driven-development/SKILL.md' -print -quit 2>/dev/null | grep -q .
|
||||
test -f .cursor/skills/superpowers/skills/verification-before-completion/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/verification-before-completion/SKILL.md || find ~/.cursor/plugins/cache/cursor-public/superpowers -path '*/skills/verification-before-completion/SKILL.md' -print -quit 2>/dev/null | grep -q .
|
||||
test -f .cursor/skills/superpowers/skills/finishing-a-development-branch/SKILL.md || test -f ~/.cursor/skills/superpowers/skills/finishing-a-development-branch/SKILL.md || find ~/.cursor/plugins/cache/cursor-public/superpowers -path '*/skills/finishing-a-development-branch/SKILL.md' -print -quit 2>/dev/null | grep -q .
|
||||
```
|
||||
|
||||
If any required dependency is missing, stop immediately and return:
|
||||
|
||||
`Missing dependency: [specific missing item]. Install Cursor Agent CLI, jq, and the Cursor Superpowers plugin or Superpowers skills under .cursor/skills/ or ~/.cursor/skills/, then retry.`
|
||||
|
||||
## Required Skill Invocation Rules
|
||||
|
||||
- Invoke relevant skills through Cursor-native discovery (`.cursor/skills/`, `~/.cursor/skills/`, or installed Cursor plugin cache entries).
|
||||
- Announce skill usage explicitly:
|
||||
- `I've read the [Skill Name] skill and I'm using it to [purpose].`
|
||||
- For skills with checklists, track checklist items explicitly in conversation.
|
||||
- The reviewer CLI branch for `cursor` MUST use `--mode=ask --trust --output-format json`. Never use `--mode=agent` or `--force` for reviewer calls.
|
||||
|
||||
## Trigger Phrase Detection
|
||||
|
||||
**Binding triggers** (always invoke this skill):
|
||||
|
||||
- `/do-task`
|
||||
- "do this task"
|
||||
- "do task ..."
|
||||
- "execute this task"
|
||||
- "make it so"
|
||||
|
||||
**Hint trigger** (invoke unless context clearly maps to another skill):
|
||||
|
||||
- "just do ..."
|
||||
|
||||
**Escape phrases** (skip the Phase 2 clarifying-question loop):
|
||||
|
||||
- `--no-questions`
|
||||
- `"just do it:"`
|
||||
- `"just do this:"`
|
||||
- `"no questions:"`
|
||||
|
||||
**Excluded** (do NOT trigger `do-task`):
|
||||
|
||||
- "implement this" — reserved for `implement-plan`.
|
||||
|
||||
**Dropped defaults** (explicitly NOT binding triggers):
|
||||
|
||||
- "work on ..."
|
||||
- "handle this"
|
||||
- "take care of ..."
|
||||
- "get this done"
|
||||
|
||||
**Worktree opt-in phrases** (Phase 4 takes the worktree branch):
|
||||
|
||||
- "in a worktree"
|
||||
- "use a worktree"
|
||||
- "on an isolated branch"
|
||||
- "on a new branch called X"
|
||||
|
||||
## Process
|
||||
|
||||
### Phase 1: Preflight
|
||||
|
||||
1. Verify git repo: `git rev-parse --is-inside-work-tree`.
|
||||
2. Verify `/ai_plan/` is present in `.gitignore`. If missing:
|
||||
- Append `/ai_plan/` to `.gitignore`.
|
||||
- Commit that infra change immediately with message `chore(gitignore): ignore ai_plan local planning artifacts`.
|
||||
- This infra commit is EXPLICITLY separate from the task commit in Phase 9. It may occur even when the final task ends up `aborted` or `failed`.
|
||||
3. Verify required sub-skills are discoverable through Cursor-native skill discovery: the Cursor Superpowers plugin cache, `.cursor/skills/superpowers/skills/`, or `~/.cursor/skills/superpowers/skills/`. Announce each sub-skill before invocation using:
|
||||
`I've read the [Skill Name] skill and I'm using it to [purpose].`
|
||||
|
||||
### Phase 2: Parse Prompt and Question
|
||||
|
||||
1. Capture the exact user prompt verbatim.
|
||||
2. Detect trigger phrase (see above) and record which one matched.
|
||||
3. Detect escape phrase. If set, skip clarifying questions entirely.
|
||||
4. Apply the ask-first heuristic:
|
||||
- Skip clarifying questions ONLY if ALL are true:
|
||||
- Prompt names a concrete target (file, feature, or function).
|
||||
- Prompt names a concrete outcome (what success looks like).
|
||||
- Prompt has no ambiguous scope (no "and maybe also ...").
|
||||
- All identifiers in the prompt are resolvable against the codebase.
|
||||
- Otherwise, ask 1-3 clarifying questions, ONE AT A TIME, multiple-choice preferred.
|
||||
- Empty prompt → ask exactly once: "what task?".
|
||||
5. Invoke `superpowers:brainstorming` through Cursor-native skill discovery for any **behavior-changing** task — feature creation, bug fix with multiple plausible approaches, refactor, design decision. Present 2-3 approaches and recommend one before finalizing the plan. The ONLY skip conditions are the same ones that allow TDD auto-skip: `pure-documentation` and `pure-comment-whitespace-rename`. When skipping, record the skip reason in the Interpretation section of `task-plan.md`.
|
||||
|
||||
### Phase 3: Configure Reviewer
|
||||
|
||||
If the user has already specified a reviewer CLI and model (e.g., "do task X, review with codex gpt-5.4"), use those values. If the user says "use defaults" or otherwise opts out of explicit configuration, proceed with `REVIEWER_CLI=codex`, `REVIEWER_MODEL=gpt-5.4`, and `MAX_ROUNDS=10`. Otherwise, ask:
|
||||
|
||||
1. **Which CLI should review both the plan and the implementation?**
|
||||
- `codex` — OpenAI Codex CLI (`codex exec`)
|
||||
- `claude` — Claude Code CLI (`claude -p`)
|
||||
- `cursor` — Cursor Agent CLI (`cursor-agent -p`)
|
||||
- `opencode` — OpenCode CLI (`opencode run`)
|
||||
- `skip` — No external review, proceed with user approval only at each loop.
|
||||
|
||||
2. **Which model?** (only if a CLI was chosen)
|
||||
- For `codex`: default `gpt-5.4`, alternatives: `gpt-5.3-codex`, `o4-mini`, `o3`.
|
||||
- For `claude`: default `sonnet`, alternatives: `opus`, `haiku`.
|
||||
- For `cursor`: **run `cursor-agent models` first** to see available models.
|
||||
- For `opencode`: provider-qualified form `<provider>/<model>` (e.g., `anthropic/claude-sonnet-4-5`, `openai/gpt-5.4`). Run `opencode models` to list available models.
|
||||
- Accept any model string the user provides.
|
||||
|
||||
3. **Max review rounds shared across both loops?** (default: 10)
|
||||
- If the user does not provide a value, set `MAX_ROUNDS=10`.
|
||||
|
||||
Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS` for Phases 5 and 8.
|
||||
|
||||
Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`.
|
||||
|
||||
If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:
|
||||
|
||||
```bash
|
||||
pi --version
|
||||
```
|
||||
|
||||
For shorthand `pi/<pi-model-name>`, split only on the first slash when the prefix is exactly `pi`; store the complete remainder in `REVIEWER_MODEL`. Examples: `pi/claude-opus-4-7` -> `claude-opus-4-7`, `pi/anthropic/claude-opus-4-7` -> `anthropic/claude-opus-4-7`, and `pi/openrouter/anthropic/claude-opus-4-7` -> `openrouter/anthropic/claude-opus-4-7`.
|
||||
|
||||
When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the model running this workflow. If the model/provider is unavailable, surface helper stderr/status and use `pi --list-models [search]` to inspect configured models.
|
||||
|
||||
### Phase 4: Initialize Plan Workspace
|
||||
|
||||
Cursor Agent CLI has no plan-mode concept; there is no plan-mode guard here.
|
||||
|
||||
Steps:
|
||||
|
||||
1. Compute slug: `YYYY-MM-DD-<slug>` where `<slug>` is a kebab-case hash of the task goal (lowercase, alphanumeric + hyphens only).
|
||||
2. Compute plan folder: `ai_plan/<slug>/`.
|
||||
3. **Resume detection:** If the folder already exists, read `task-plan.md`:
|
||||
- If `Status` is `draft` or `plan-approved` or `implementation-in-progress`: offer to resume, pick a new suffix (`<slug>-v2`), or abort. Default is resume.
|
||||
- If `Status` is any terminal value (`pushed`, `local-only`, `aborted-*`, `failed`): offer a new suffix or abort. Default is new suffix.
|
||||
4. If not resuming, create the folder and write `task-plan.md` from the template at `templates/task-plan.md` (this skill's template folder; resolves to `.cursor/skills/do-task/templates/task-plan.md` repo-local or `~/.cursor/skills/do-task/templates/task-plan.md` global when installed directly).
|
||||
5. Fill in:
|
||||
- `Metadata` block.
|
||||
- `Prompt` (verbatim).
|
||||
- `Interpretation`, `Assumptions`, `Files`, `Approach`, `TDD Approach`, `Acceptance Criteria`, `Verification`, `Rollback`.
|
||||
- Leave `Runtime State`, `Review History`, `Final Status` empty (skill updates these).
|
||||
6. Set `Status: draft`.
|
||||
|
||||
**Worktree branch:** If the prompt opts in to a worktree (see Trigger Phrase Detection), invoke `superpowers:using-git-worktrees` via Cursor-native discovery before proceeding. Otherwise continue on the current branch.
|
||||
|
||||
### Phase 5: Plan Review Loop
|
||||
|
||||
If `REVIEWER_CLI=skip`, present `task-plan.md` to the user and proceed only after explicit user approval.
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```text
|
||||
REVIEW_KIND = plan
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
|
||||
PAYLOAD_PATH = /tmp/do-task-plan-${REVIEW_ID}.md
|
||||
PROMPT_TEMPLATE = PLAN_REVIEW_PROMPT (see below)
|
||||
SESSION_ID_VAR = CODEX_PLAN_SESSION_ID | CURSOR_PLAN_SESSION_ID | OPENCODE_PLAN_SESSION_ID
|
||||
```
|
||||
|
||||
Payload is the current `task-plan.md` **with the `Runtime State` and `Review History` blocks stripped** before writing to `PAYLOAD_PATH`. Those two blocks contain reviewer session IDs and scan outcomes that must never be sent back to any reviewer CLI. Reviewers only need the Prompt, Interpretation, Assumptions, Files, Approach, TDD Approach, Acceptance Criteria, Verification, Rollback, and Metadata sections.
|
||||
|
||||
`PLAN_REVIEW_PROMPT`:
|
||||
|
||||
```text
|
||||
Review this task plan for completeness, correctness, and risk. Focus on:
|
||||
1. Does the plan match the user's prompt?
|
||||
2. Are all assumptions surfaced?
|
||||
3. Are acceptance criteria testable?
|
||||
4. Is the TDD approach appropriate per the TDD Approach section?
|
||||
5. Are there missing files, risks, or security concerns?
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use `- None.` when a severity has no findings.
|
||||
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
|
||||
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`.
|
||||
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking.
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: plan-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 6.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-plan-review`.
|
||||
- Send Telegram summary before stopping.
|
||||
- Ask the user whether to override and proceed, restart, or abort.
|
||||
|
||||
### Phase 6: Execute (TDD-first where applicable)
|
||||
|
||||
Native orchestration — do not invoke `superpowers:executing-plans`.
|
||||
|
||||
1. Set `Status: implementation-in-progress`.
|
||||
2. For every behavior-changing file edit:
|
||||
- Invoke `superpowers:test-driven-development` via Cursor-native discovery.
|
||||
- Write the failing test first. Run it. Confirm it fails.
|
||||
- Implement the minimal code to make it pass. Run the test. Confirm green.
|
||||
- Do NOT commit yet — a single task commit happens in Phase 9.
|
||||
3. Auto-skip of TDD is permitted ONLY for tasks classified in `task-plan.md` TDD Approach as:
|
||||
- `pure-documentation`
|
||||
- `pure-comment-whitespace-rename`
|
||||
4. Any other skip (including `pure-config-addition`) requires explicit user approval recorded in `task-plan.md` with an ISO-8601 timestamp.
|
||||
5. Update `task-plan.md` after each logical step: add notes to `Approach`, check off `Acceptance Criteria` items as they complete.
|
||||
|
||||
### Phase 7: Verification Gate
|
||||
|
||||
Invoke `superpowers:verification-before-completion` via Cursor-native discovery.
|
||||
|
||||
Run the commands listed in the `Verification` section of `task-plan.md`:
|
||||
|
||||
- Lint (changed files first).
|
||||
- Typecheck.
|
||||
- Tests (targeted first, then broader suite if quick).
|
||||
|
||||
All must pass. If a command fails:
|
||||
|
||||
- Fix the issue.
|
||||
- Re-run that command.
|
||||
- Increment `verification_attempts` in Runtime State.
|
||||
|
||||
If `verification_attempts` exceeds 3 without green:
|
||||
|
||||
- Set `Status: aborted-verification`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to retry, override, or abort.
|
||||
|
||||
### Phase 8: Implementation Review Loop
|
||||
|
||||
If `REVIEWER_CLI=skip`, present a diff + verification summary to the user and proceed only after explicit user approval.
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```text
|
||||
REVIEW_KIND = implementation
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8) # distinct from plan-review ID
|
||||
PAYLOAD_PATH = /tmp/do-task-implementation-${REVIEW_ID}.md
|
||||
PROMPT_TEMPLATE = IMPL_REVIEW_PROMPT (see below)
|
||||
SESSION_ID_VAR = CODEX_IMPL_SESSION_ID | CURSOR_IMPL_SESSION_ID | OPENCODE_IMPL_SESSION_ID
|
||||
```
|
||||
|
||||
Payload contents (assembled by the skill):
|
||||
|
||||
```markdown
|
||||
# Implementation Review: [Short Title]
|
||||
|
||||
## Task Plan (the plan that was approved)
|
||||
<embed approved task-plan.md, excluding Runtime State block>
|
||||
|
||||
## Changes Made (git diff)
|
||||
<output of: `git diff` for unstaged + `git diff --staged` for staged>
|
||||
|
||||
## Verification Output
|
||||
### Lint
|
||||
<lint output>
|
||||
### Typecheck
|
||||
<typecheck output>
|
||||
### Tests
|
||||
<test output, pass/fail counts>
|
||||
```
|
||||
|
||||
`IMPL_REVIEW_PROMPT`:
|
||||
|
||||
```text
|
||||
Review this implementation against the task plan. Focus on:
|
||||
1. Correctness — Does the diff satisfy the Acceptance Criteria?
|
||||
2. Code quality — Clean, maintainable, no obvious issues?
|
||||
3. Test coverage — Are behavior changes adequately tested (per the plan's TDD Approach)?
|
||||
4. Security — Any security concerns introduced?
|
||||
5. Regressions — Does the diff risk breaking unrelated code?
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use `- None.` when a severity has no findings.
|
||||
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
|
||||
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`.
|
||||
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking.
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: implementation-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 9.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-impl-review`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to override and commit anyway, restart, or abort.
|
||||
|
||||
### Phase 9: Commit + Push Ask
|
||||
|
||||
Invoke `superpowers:finishing-a-development-branch` via Cursor-native discovery.
|
||||
|
||||
1. Stage all changed files explicitly (avoid `git add -A`).
|
||||
2. Single commit with message derived from the task goal:
|
||||
- Format: `<type>(<scope>): <short description>`
|
||||
- Example: `feat(auth): add session token rotation`
|
||||
3. Do NOT push. Update `Status: local-only`.
|
||||
4. Ask the user: "Push to remote? (yes / no)"
|
||||
- On explicit `yes` → push, then set `Status: pushed`.
|
||||
- Any other response → leave `Status: local-only`.
|
||||
|
||||
### Phase 10: Telegram Notification + Finalize
|
||||
|
||||
Resolve the notifier helper (prefer repo-local, fall back to global):
|
||||
|
||||
```bash
|
||||
if [ -x .cursor/skills/reviewer-runtime/notify-telegram.sh ]; then
|
||||
TELEGRAM_NOTIFY_RUNTIME=.cursor/skills/reviewer-runtime/notify-telegram.sh
|
||||
else
|
||||
TELEGRAM_NOTIFY_RUNTIME=~/.cursor/skills/reviewer-runtime/notify-telegram.sh
|
||||
fi
|
||||
```
|
||||
|
||||
On every terminal outcome (`pushed`, `local-only`, `aborted-*`, `failed`), send a Telegram summary if both `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` are set:
|
||||
|
||||
```bash
|
||||
if [ -x "$TELEGRAM_NOTIFY_RUNTIME" ] && [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
|
||||
"$TELEGRAM_NOTIFY_RUNTIME" --message "do-task <slug>: <status summary>"
|
||||
fi
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- Telegram is the only supported notification path.
|
||||
- Notification failures are non-blocking but must be surfaced to the user.
|
||||
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
|
||||
- If Telegram is not configured, state that no Telegram notification was sent.
|
||||
|
||||
Fill in `Final Status` in `task-plan.md` (include commit hash if any). Do NOT delete the plan folder — it stays as a record.
|
||||
|
||||
---
|
||||
|
||||
## Review Loop (Shared Subroutine)
|
||||
|
||||
This subroutine is invoked twice per `do-task` run: once in Phase 5 (`REVIEW_KIND=plan`) and once in Phase 8 (`REVIEW_KIND=implementation`). Separate session IDs are used for each loop so reviewer context never leaks across loops.
|
||||
|
||||
### Subroutine Inputs
|
||||
|
||||
| Variable | Purpose |
|
||||
|----------|---------|
|
||||
| `REVIEW_KIND` | `plan` or `implementation` |
|
||||
| `REVIEW_ID` | 8-char hex (from `uuidgen`); reused across rounds of the same loop |
|
||||
| `PAYLOAD_PATH` | `/tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md` |
|
||||
| `PROMPT_TEMPLATE` | `PLAN_REVIEW_PROMPT` or `IMPL_REVIEW_PROMPT` |
|
||||
| `REVIEWER_CLI` | `codex` \| `claude` \| `cursor` \| `opencode` \| `pi` |
|
||||
| `REVIEWER_MODEL` | Model name |
|
||||
| `MAX_ROUNDS` | Default 10 |
|
||||
| `SESSION_ID_VAR` | `CODEX_PLAN_SESSION_ID` \| `CODEX_IMPL_SESSION_ID` \| `CURSOR_PLAN_SESSION_ID` \| `CURSOR_IMPL_SESSION_ID` \| `OPENCODE_PLAN_SESSION_ID` \| `OPENCODE_IMPL_SESSION_ID` |
|
||||
|
||||
Temp artifact paths (per loop):
|
||||
|
||||
- `/tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md` — payload
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md` — normalized review text
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json` — raw Cursor/OpenCode JSON (cursor only, plus opencode when `--format json` is used)
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr`
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status`
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out`
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh`
|
||||
|
||||
Resolve the shared helper (prefer repo-local, fall back to global):
|
||||
|
||||
```bash
|
||||
if [ -x .cursor/skills/reviewer-runtime/run-review.sh ]; then
|
||||
REVIEWER_RUNTIME=.cursor/skills/reviewer-runtime/run-review.sh
|
||||
else
|
||||
REVIEWER_RUNTIME=~/.cursor/skills/reviewer-runtime/run-review.sh
|
||||
fi
|
||||
```
|
||||
|
||||
Set helper success-artifact args before writing the command script:
|
||||
|
||||
```bash
|
||||
HELPER_SUCCESS_FILE_ARGS=()
|
||||
case "$REVIEWER_CLI" in
|
||||
codex)
|
||||
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md)
|
||||
;;
|
||||
cursor)
|
||||
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json)
|
||||
;;
|
||||
esac
|
||||
```
|
||||
|
||||
### Step 1: Write Payload
|
||||
|
||||
Write the full payload for this round to `PAYLOAD_PATH`.
|
||||
|
||||
### Step 1a: Secret Scan (per-payload, no caching)
|
||||
|
||||
**BEFORE** sending the payload to any reviewer CLI, scan it for secrets. This scan runs EVERY round — no results are cached. Rationale: Phase 8 payloads include newly-introduced diff content that earlier rounds never saw.
|
||||
|
||||
Run the secret scan with all of these anchored regexes. Use `grep -En` on the payload file:
|
||||
|
||||
```bash
|
||||
SECRET_REGEX_FILE=$(mktemp)
|
||||
cat >"$SECRET_REGEX_FILE" <<'EOF'
|
||||
AKIA[0-9A-Z]{16}
|
||||
"type"\s*:\s*"service_account"
|
||||
(ghp|gho|ghs|ghu|ghr)_[A-Za-z0-9]{36,}
|
||||
xox[abpsr]-[0-9]+-[0-9]+-[0-9]+-[A-Za-z0-9]{24,}
|
||||
xox[abpsr]-[A-Za-z0-9]{10,48}
|
||||
sk-(proj-)?[A-Za-z0-9_-]{20,}
|
||||
sk-ant-(api|admin)[0-9]+-[A-Za-z0-9_-]{20,}
|
||||
-----BEGIN [A-Z ]+ PRIVATE KEY-----
|
||||
(TOKEN|SECRET|PASSWORD|API_?KEY|ACCESS_?KEY)\s*=\s*["']?[A-Za-z0-9+/=_-]{8,}
|
||||
eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+
|
||||
EOF
|
||||
|
||||
SCAN_MATCHES=$(grep -Ensf "$SECRET_REGEX_FILE" "$PAYLOAD_PATH" || true)
|
||||
rm -f "$SECRET_REGEX_FILE"
|
||||
```
|
||||
|
||||
If `SCAN_MATCHES` is non-empty:
|
||||
|
||||
1. **Redact the matched text before surfacing** — never echo the raw secret to the user, chat log, terminal scrollback, or any persistent file. Replace each matched substring with a fixed token that preserves only the fact of a match: `[REDACTED:<pattern-label>:<match-length>-chars]`. Example: a matched AWS key becomes `[REDACTED:aws-access-key:20-chars]`. Keep the file path and line number; they are useful for the user and not secret.
|
||||
2. Present the redacted match summary to the user using this exact wording:
|
||||
|
||||
```text
|
||||
SECRET-SCAN MATCH in outbound reviewer payload (loop: ${REVIEW_KIND}, round: N):
|
||||
<file>:<line>: [REDACTED:<pattern-label>:<match-length>-chars]
|
||||
...
|
||||
Proceed with sending this payload to ${REVIEWER_CLI}? (yes / no / redact)
|
||||
```
|
||||
|
||||
Pattern labels: `aws-access-key`, `gcp-service-account`, `github-token`, `slack-token`, `openai-key`, `anthropic-key`, `pem-private-key`, `dotenv-style`, `jwt`.
|
||||
|
||||
3. Wait for user response.
|
||||
4. On `yes`: record `last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches` in Runtime State, and proceed.
|
||||
5. On `redact`: ask the user to supply redactions, apply them to `PAYLOAD_PATH`, re-scan (this step), record `last_scan_outcome_${REVIEW_KIND}=redacted-and-approved`.
|
||||
6. On `no`: stop the loop, set `Status: failed`, send Telegram, return to the user.
|
||||
|
||||
If `SCAN_MATCHES` is empty, record `last_scan_outcome_${REVIEW_KIND}=clean` and proceed.
|
||||
|
||||
### Step 2: Generate Reviewer Command Script
|
||||
|
||||
Write the reviewer invocation to `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh` as a bash script starting with:
|
||||
|
||||
```bash
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call every round (Pi reviewer calls do not use session resume):
|
||||
|
||||
```bash
|
||||
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files \
|
||||
--model "$REVIEWER_MODEL" \
|
||||
--tools read,grep,find,ls \
|
||||
-p "Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review. Return exactly the required ## Summary, ## Findings, and ## Verdict structure."
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `codex`:**
|
||||
|
||||
Round 1 — fresh `codex exec`:
|
||||
|
||||
```bash
|
||||
codex exec \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
-s read-only \
|
||||
-o /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
"Review the ${REVIEW_KIND} payload in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
${PROMPT_TEMPLATE}"
|
||||
```
|
||||
|
||||
Do not capture the Codex session ID yet. After Round 1 completes, extract it from `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out` (look for `session id: <uuid>`) and persist it to Runtime State under `${SESSION_ID_VAR}`.
|
||||
|
||||
Round 2 and later — resume session:
|
||||
|
||||
```bash
|
||||
codex exec resume ${SESSION_ID_VAR_VALUE} \
|
||||
-o /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
"I've revised based on your feedback. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before.
|
||||
Keep findings ordered P0 to P3, use '- None.' when a severity has no findings, and only use VERDICT: APPROVED when no P0, P1, or P2 findings remain."
|
||||
```
|
||||
|
||||
If resume fails, fall back to fresh `codex exec` with prior-round context.
|
||||
|
||||
**If `REVIEWER_CLI` is `claude`:**
|
||||
|
||||
Fresh call every round (Claude CLI has no session resume):
|
||||
|
||||
```bash
|
||||
claude -p \
|
||||
"${ROUND_PREFIX}Review the following ${REVIEW_KIND} payload.
|
||||
|
||||
$(cat /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md)
|
||||
|
||||
${PROMPT_TEMPLATE}" \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--strict-mcp-config \
|
||||
--setting-sources user
|
||||
```
|
||||
|
||||
Where `${ROUND_PREFIX}` is empty for Round 1 and `"You previously reviewed this ${REVIEW_KIND} and requested revisions. Previous feedback summary: [key points]. "` for subsequent rounds.
|
||||
|
||||
**If `REVIEWER_CLI` is `cursor`:**
|
||||
|
||||
Round 1:
|
||||
|
||||
```bash
|
||||
cursor-agent -p \
|
||||
--mode=ask \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--trust \
|
||||
--output-format json \
|
||||
"Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review.
|
||||
|
||||
${PROMPT_TEMPLATE}" \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Round 2 and later — resume:
|
||||
|
||||
```bash
|
||||
cursor-agent --resume ${SESSION_ID_VAR_VALUE} -p \
|
||||
--mode=ask \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--trust \
|
||||
--output-format json \
|
||||
"I've revised based on your feedback. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
If resume fails, fall back to fresh `cursor-agent -p`.
|
||||
|
||||
After the command completes, extract the session id and review text:
|
||||
|
||||
```bash
|
||||
CURSOR_SID=$(jq -r '.session_id' /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json)
|
||||
jq -r '.result' /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
Persist `CURSOR_SID` to Runtime State under `${SESSION_ID_VAR}` on Round 1.
|
||||
|
||||
**If `REVIEWER_CLI` is `opencode`:**
|
||||
|
||||
OpenCode does not expose a dedicated read-only flag at the CLI level; use the built-in `plan` primary agent (`--agent plan`) for review, which is read-oriented and does not modify files. Session resume is supported via `-s <session-id>`, but the most reliable pattern for non-interactive review is **fresh call each round** (like `claude`) because opencode's session lifecycle and ID capture are less standardized than codex/cursor for headless runs. Skills MAY opt-in to session resume when they have verified the installed opencode version exposes a stable session id in `--format json` output.
|
||||
|
||||
Round 1 (preferred, fresh call):
|
||||
|
||||
```bash
|
||||
opencode run \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review.
|
||||
|
||||
${PROMPT_TEMPLATE}" \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Round 2 and later (fresh-call fallback path — recommended default):
|
||||
|
||||
```bash
|
||||
opencode run \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"You previously reviewed this ${REVIEW_KIND} and requested revisions.
|
||||
|
||||
Previous feedback summary: [key points from last review]
|
||||
|
||||
I've revised. Updated payload is below.
|
||||
|
||||
$(cat /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md)
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Optional session-resume path (only if the installed opencode reliably emits a session id in `--format json` output and accepts it back via `-s`):
|
||||
|
||||
```bash
|
||||
# Round 2+ with resume
|
||||
opencode run \
|
||||
-s ${SESSION_ID_VAR_VALUE} \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"I've revised. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Extract the review body (the JSON stream emits events; the final assistant message contains the review text):
|
||||
|
||||
```bash
|
||||
jq -r '.[] | select(.type == "message" and .role == "assistant") | .content' \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
|| cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
If the JSON parse falls through, promote the raw JSON file as the review output and surface a warning to the user. On any opencode CLI or JSON parsing failure, treat this loop round as `completed-empty-output` and follow the helper-failure escalation in Step 6.
|
||||
|
||||
### Step 3: Run via `run-review.sh`
|
||||
|
||||
Run the command script through the shared helper when available:
|
||||
|
||||
```bash
|
||||
if [ -x "$REVIEWER_RUNTIME" ]; then
|
||||
"$REVIEWER_RUNTIME" \
|
||||
--command-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh \
|
||||
--stdout-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
--stderr-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr \
|
||||
--status-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status \
|
||||
"${HELPER_SUCCESS_FILE_ARGS[@]}"
|
||||
else
|
||||
echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
|
||||
bash /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh \
|
||||
>/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
2>/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr
|
||||
fi
|
||||
```
|
||||
|
||||
Run the helper in the foreground and watch live stdout for `state=in-progress` heartbeats. If your agent environment buffers command output until exit, start the helper in the background and poll the `.status` file instead of treating heartbeats as post-hoc-only data.
|
||||
|
||||
### Step 4: Promote Reviewer Output + Capture Session ID
|
||||
|
||||
After the command completes:
|
||||
|
||||
- `cursor`: already promoted in Step 2 via `jq -r '.result' ...`. Also capture `session_id` if first round.
|
||||
- `codex`: extract `CODEX_SESSION_ID` from `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text lives only in `.runner.out`, `cp` it into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `claude` or `pi`: promote `.runner.out` into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `opencode`: already promoted in Step 2 via `jq` on the JSON stream. If opt-in session-resume is active and the JSON includes a stable session id, capture it and persist to `${SESSION_ID_VAR}`.
|
||||
|
||||
On Round 1, persist the captured session ID (if any) into `task-plan.md`'s Runtime State under `${SESSION_ID_VAR}`.
|
||||
|
||||
### Step 5: Parse Verdict + Update Review History
|
||||
|
||||
1. Read `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md`.
|
||||
2. Append one row to `task-plan.md` Review History:
|
||||
- Timestamp (ISO-8601 UTC).
|
||||
- Loop (`plan` or `implementation`).
|
||||
- Round number.
|
||||
- Verdict (`APPROVED` or `REVISE`).
|
||||
- Summary (first line of the `## Summary` section).
|
||||
3. Increment `plan_review_round` or `implementation_review_round` in Runtime State.
|
||||
|
||||
### Step 6: Branch APPROVED / REVISE / MAX_ROUNDS
|
||||
|
||||
Verdict rules:
|
||||
|
||||
- **VERDICT: APPROVED** with no `P0`, `P1`, or `P2` findings → exit the subroutine with `APPROVED`.
|
||||
- **VERDICT: APPROVED** with only `P3` findings → optionally fix the `P3` items if cheap and safe, then exit with `APPROVED`.
|
||||
- **VERDICT: REVISE** or any `P0`, `P1`, or `P2` finding → go to revision (see below), then return to Step 1 for the next round.
|
||||
- No clear verdict but `P0`, `P1`, and `P2` are all `- None.` → treat as APPROVED.
|
||||
- Helper state `completed-empty-output` → treat as failed review attempt, surface `.stderr`/`.status`, fix invocation or prompt handling, then retry.
|
||||
- Helper state `needs-operator-decision` → surface status log and decide whether to extend the timeout, abort, or retry with different helper parameters.
|
||||
- Round counter ≥ `MAX_ROUNDS` → exit the subroutine with `MAX_ROUNDS`. Caller decides next action per Phase 5 or Phase 8.
|
||||
|
||||
**Revision:** The caller (Phase 5 for plan, Phase 6/7 for implementation) applies findings in priority order (`P0` → `P1` → `P2` → `P3`). For implementation review revisions, Phase 7 verification must be re-run after every revision before returning to Step 1.
|
||||
|
||||
### Step 7: Liveness Contract (during Step 3)
|
||||
|
||||
- The shared reviewer runtime emits `state=in-progress note="In progress N"` heartbeats every 60 seconds while the reviewer child is alive.
|
||||
- Keep waiting as long as a fresh `In progress N` heartbeat keeps arriving roughly once per minute.
|
||||
- Do not abort just because the review is slow, a soft timeout fired, or a `stall-warning` line appears, as long as the `In progress N` heartbeat continues.
|
||||
- Treat missing heartbeats, `state=failed`, `state=completed-empty-output`, and `state=needs-operator-decision` as escalation signals.
|
||||
|
||||
### Step 8: Cleanup (on successful round exit)
|
||||
|
||||
```bash
|
||||
rm -f /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh
|
||||
```
|
||||
|
||||
If the round failed, produced empty output, or reached operator-decision timeout, KEEP `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them.
|
||||
|
||||
---
|
||||
|
||||
## Resume Semantics
|
||||
|
||||
1. Detect existing plan folder by slug at Phase 4.
|
||||
2. Read `task-plan.md` → `Status`.
|
||||
3. Decide next action:
|
||||
|
||||
| Status | Action |
|
||||
|--------|--------|
|
||||
| `draft` | Resume at Phase 5 (plan review) |
|
||||
| `plan-approved` | Resume at Phase 6 (execute) |
|
||||
| `implementation-in-progress` | Resume at Phase 6 (continue execute) |
|
||||
| `implementation-approved` | Resume at Phase 9 (commit + push ask) |
|
||||
| `pushed` \| `local-only` | Ask user: new suffix, abort, or replay for reference only |
|
||||
| `aborted-*` \| `failed` | Offer new suffix or full restart |
|
||||
|
||||
4. When resuming, read Runtime State for `CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, `OPENCODE_IMPL_SESSION_ID`, and the round counters. If a session ID is populated, use it for the first revision round in that loop (Round 2) via `codex exec resume`, `cursor-agent --resume`, or `opencode run -s <id>` as applicable.
|
||||
|
||||
---
|
||||
|
||||
## Tracker Discipline (MANDATORY)
|
||||
|
||||
**ALWAYS update `task-plan.md` before/after each phase transition. NEVER proceed with stale state.**
|
||||
|
||||
Before starting any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Update `last_phase_entered` in Runtime State.
|
||||
|
||||
After completing any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Append notes to the relevant section of `task-plan.md`.
|
||||
|
||||
Review History is append-only.
|
||||
|
||||
---
|
||||
|
||||
## Execution Workflow Rules
|
||||
|
||||
- Current branch is the default; worktree is opt-in only.
|
||||
- Do NOT push without explicit "yes".
|
||||
- Secret scan runs **per-payload, no caching** — every round, including revisions.
|
||||
- Review loops use `MAX_ROUNDS=10` by default, shared across both loops.
|
||||
- The task commit is a single commit created in Phase 9; interim WIP commits are NOT created.
|
||||
- The `.gitignore` infra commit in Phase 1 is explicitly separate from the task commit and is allowed even on abort.
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
- [ ] `ai_plan/` exists and `/ai_plan/` is in `.gitignore`
|
||||
- [ ] `task-plan.md` created under `ai_plan/YYYY-MM-DD-<slug>/`
|
||||
- [ ] Reviewer CLI + model + `MAX_ROUNDS` configured (or `skip`)
|
||||
- [ ] Secret scan ran on every outbound reviewer payload
|
||||
- [ ] Plan review completed (APPROVED, MAX_ROUNDS handled, or skipped)
|
||||
- [ ] Phase 6 executed TDD-first for all behavior-changing steps (or documented skip)
|
||||
- [ ] Phase 7 verification green before Phase 8
|
||||
- [ ] Implementation review completed (APPROVED, MAX_ROUNDS handled, or skipped)
|
||||
- [ ] Single task commit created locally, no push without explicit yes
|
||||
- [ ] Telegram notification attempted if configured
|
||||
- [ ] `task-plan.md` Final Status filled in
|
||||
|
||||
---
|
||||
|
||||
## Variant Hardening Notes — Cursor
|
||||
|
||||
- Must use Cursor-native discovery from `.cursor/skills/`, `~/.cursor/skills/`, or installed Cursor plugin cache entries.
|
||||
- Reviewer invocation MUST use `--mode=ask --trust --output-format json`. Never `--mode=agent`, never `--force`, never write-capable modes for reviewer calls.
|
||||
- `jq` is a hard prerequisite because the cursor reviewer branch always returns JSON that must be parsed for `session_id` and `result`.
|
||||
- Helper paths are `.cursor/skills/reviewer-runtime/...` preferred, with `~/.cursor/skills/reviewer-runtime/...` as the fallback path.
|
||||
- Sub-skills are invoked through Cursor-native discovery with explicit announcement — Cursor does not expose a `Skill` tool.
|
||||
- `cursor-agent --version` must succeed as part of the prerequisite check. The binary is `cursor-agent`; the alias `cursor agent` is equivalent.
|
||||
- No plan-mode guard (Cursor Agent CLI has no plan-mode concept).
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
- Forgetting `--trust` flag when running `cursor-agent` non-interactively (causes interactive prompt).
|
||||
- Using `--mode=agent` or `--force` for reviews (reviewer must be read-only, use `--mode=ask`).
|
||||
- Forgetting to install `jq` (cursor reviewer branch silently breaks).
|
||||
- Skipping workspace-discovery skill verification in Prerequisite Check.
|
||||
- Asking multiple clarifying questions in a single message.
|
||||
- Skipping the per-payload secret scan because "the previous round was clean".
|
||||
- Pushing the task commit without explicit user approval.
|
||||
|
||||
## Red Flags — Stop and Correct
|
||||
|
||||
- Reviewer CLI is running with `--mode=agent`, `--force`, or any write-capable flag.
|
||||
- You did not announce which skill you invoked and why.
|
||||
- You are proceeding to implementation review with failing lint/typecheck/tests.
|
||||
- You are echoing raw secret-scan matches to the user or logs.
|
||||
- You are pushing without explicit user approval.
|
||||
@@ -0,0 +1,143 @@
|
||||
# Task Plan: [Short Title]
|
||||
|
||||
> **Variant guardrail (Cursor):** Sub-skills (`brainstorming`, `test-driven-development`, `verification-before-completion`, `finishing-a-development-branch`, `using-git-worktrees`) MUST be invoked through workspace discovery from `.cursor/skills/superpowers/skills/<skill>/SKILL.md` or `~/.cursor/skills/superpowers/skills/<skill>/SKILL.md`. Reviewer invocations MUST use `--mode=ask --trust --output-format json`. `jq` is a hard prerequisite.
|
||||
|
||||
## Metadata
|
||||
|
||||
| Field | Value |
|
||||
|-------|-------|
|
||||
| Created | YYYY-MM-DD |
|
||||
| Slug | YYYY-MM-DD-<slug> |
|
||||
| Runtime | cursor |
|
||||
| Reviewer CLI | codex \| claude \| cursor \| opencode \| pi |
|
||||
| Reviewer Model | <model> |
|
||||
| MAX_ROUNDS | 10 |
|
||||
| Branch Strategy | current-branch \| worktree |
|
||||
| Branch Name | <current branch name, or new branch name when worktree is used> |
|
||||
| Worktree Path | <absolute path to worktree dir; blank when Branch Strategy = current-branch> |
|
||||
| Status | draft |
|
||||
|
||||
### Status Enum (authoritative)
|
||||
|
||||
| Value | Meaning |
|
||||
|-------|---------|
|
||||
| `draft` | Newly created; plan review not yet started |
|
||||
| `plan-approved` | Plan review loop returned APPROVED |
|
||||
| `implementation-in-progress` | Phase 6 executing |
|
||||
| `implementation-approved` | Phase 8 review loop returned APPROVED; awaiting commit |
|
||||
| `pushed` | Committed + pushed to remote |
|
||||
| `local-only` | Committed locally; user declined push |
|
||||
| `aborted-plan-review` | MAX_ROUNDS reached in Phase 5; user aborted |
|
||||
| `aborted-impl-review` | MAX_ROUNDS reached in Phase 8; user aborted |
|
||||
| `aborted-verification` | Phase 7 retries exhausted; user aborted |
|
||||
| `failed` | Hard tooling failure |
|
||||
|
||||
---
|
||||
|
||||
## Prompt
|
||||
|
||||
<!-- Exact user prompt, verbatim. -->
|
||||
|
||||
## Interpretation
|
||||
|
||||
<!-- Short restatement of goal + out-of-scope items. -->
|
||||
|
||||
## Assumptions
|
||||
|
||||
<!-- Anything we're assuming and needs confirmation. Empty list OK after clarifying questions. -->
|
||||
|
||||
## Files
|
||||
|
||||
<!-- Files expected to be created / modified / deleted. Paths are absolute or repo-relative. -->
|
||||
|
||||
| Action | Path | Why |
|
||||
|--------|------|-----|
|
||||
| | | |
|
||||
|
||||
## Approach
|
||||
|
||||
<!-- 3-10 bullets describing implementation order. -->
|
||||
|
||||
## TDD Approach
|
||||
|
||||
<!-- One of:
|
||||
(a) **TDD applies** — list the failing test(s) to write first, then implementation, then confirm green.
|
||||
(b) **TDD auto-skipped** — reason must be exactly one of:
|
||||
- `pure-documentation`
|
||||
- `pure-comment-whitespace-rename`
|
||||
(c) **TDD user-approved skip** — user explicitly approved skipping TDD for this task.
|
||||
Record the approval timestamp (ISO-8601) and the specific reason (e.g., `pure-config-addition`).
|
||||
-->
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] <criterion 1>
|
||||
- [ ] <criterion 2>
|
||||
|
||||
## Verification
|
||||
|
||||
<!-- Commands to run:
|
||||
lint: <cmd>
|
||||
typecheck: <cmd>
|
||||
tests: <cmd>
|
||||
-->
|
||||
|
||||
## Rollback
|
||||
|
||||
<!-- How to undo: `git revert <hash>`, or manual steps if the change is not easily revertable. -->
|
||||
|
||||
---
|
||||
|
||||
## Runtime State
|
||||
|
||||
<!-- Updated by the skill at runtime. Used to detect resume and to persist reviewer session IDs across rounds. -->
|
||||
|
||||
```yaml
|
||||
plan_review_round: 0
|
||||
implementation_review_round: 0
|
||||
CODEX_PLAN_SESSION_ID:
|
||||
CODEX_IMPL_SESSION_ID:
|
||||
CURSOR_PLAN_SESSION_ID:
|
||||
CURSOR_IMPL_SESSION_ID:
|
||||
OPENCODE_PLAN_SESSION_ID:
|
||||
OPENCODE_IMPL_SESSION_ID:
|
||||
last_phase_entered:
|
||||
last_round_ts:
|
||||
last_scan_outcome_plan:
|
||||
last_scan_outcome_impl:
|
||||
verification_attempts: 0
|
||||
tests_added_count: 0
|
||||
tdd_used: false
|
||||
```
|
||||
|
||||
## Review History
|
||||
|
||||
<!-- Append one entry per reviewer round, both loops. -->
|
||||
|
||||
| Timestamp (ISO-8601) | Loop | Round | Verdict | Summary |
|
||||
|----------------------|------|-------|---------|---------|
|
||||
| | | | | |
|
||||
|
||||
## Final Status
|
||||
|
||||
<!-- Filled at the terminal outcome (phase 9/10). Populate at least:
|
||||
- Terminal status (one of the 10 Status enum values)
|
||||
- Commit hash (if any)
|
||||
- Plan-review rounds used / MAX_ROUNDS
|
||||
- Implementation-review rounds used / MAX_ROUNDS
|
||||
- TDD used (true|false)
|
||||
- Tests added count
|
||||
- Verification attempts used
|
||||
- Last round ISO-8601 timestamp
|
||||
- Notes (anything the user should know when revisiting)
|
||||
-->
|
||||
|
||||
---
|
||||
|
||||
## Guardrails (do NOT remove)
|
||||
|
||||
- This file is the single persistent artifact for `do-task`. Do not split it or delete it on success.
|
||||
- `Status` must always match one of the 10 enum values.
|
||||
- `Runtime State` is updated by the skill, not by the user.
|
||||
- Review History is append-only.
|
||||
- `last_scan_outcome_plan` and `last_scan_outcome_impl` record the most recent secret-scan result for each loop. They are informational; the scan itself runs per-payload with no caching.
|
||||
@@ -0,0 +1,848 @@
|
||||
---
|
||||
name: do-task
|
||||
description: Execute a single user-supplied prompt end-to-end with two reviewer loops (plan review + implementation review) in OpenCode. ALWAYS invoke when the user says `/do-task`, "do this task", "do task ...", "execute this task", or "make it so". Also invoke on the hint phrase "just do ...". Do NOT invoke on "implement this" (that phrase is reserved for implement-plan).
|
||||
---
|
||||
|
||||
# Do Task (OpenCode)
|
||||
|
||||
Execute an ad-hoc user prompt end-to-end: parse → clarify → plan (with reviewer loop) → implement (TDD-first where applicable) → verify → implementation review loop → commit → optional push → notify.
|
||||
|
||||
This is a single-artifact sibling of `create-plan` + `implement-plan`. Unlike `implement-plan`, `do-task` operates on one persistent `task-plan.md` (not a full milestone plan) and defaults to the **current branch** (not a worktree).
|
||||
|
||||
**Core principle:** OpenCode loads skills through its native skill tool. Local skills live under `~/.config/opencode/skills/`, and OpenCode can also expose shared agent skills from `~/.agents/skills/`. Sub-skill invocations use OpenCode's native mechanism — not Claude's `Skill` tool, not Cursor's discovery mechanism.
|
||||
|
||||
## Prerequisite Check (MANDATORY)
|
||||
|
||||
Required:
|
||||
|
||||
- OpenCode CLI: `opencode --version` (install via your package manager or `brew install opencode`).
|
||||
- Superpowers repo: `https://github.com/obra/superpowers`
|
||||
- OpenCode Superpowers skills available at `~/.agents/skills/superpowers` or `~/.config/opencode/skills/superpowers`
|
||||
- `superpowers/brainstorming`
|
||||
- `superpowers/test-driven-development`
|
||||
- `superpowers/verification-before-completion`
|
||||
- `superpowers/finishing-a-development-branch`
|
||||
- `superpowers/using-git-worktrees` (only when the prompt opts in to a worktree)
|
||||
- Shared reviewer runtime: `~/.config/opencode/skills/reviewer-runtime/run-review.sh`
|
||||
- Telegram notifier helper: `~/.config/opencode/skills/reviewer-runtime/notify-telegram.sh`
|
||||
|
||||
Verify before proceeding:
|
||||
|
||||
```bash
|
||||
opencode --version
|
||||
test -f ~/.agents/skills/superpowers/brainstorming/SKILL.md || test -f ~/.config/opencode/skills/superpowers/brainstorming/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/test-driven-development/SKILL.md || test -f ~/.config/opencode/skills/superpowers/test-driven-development/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/verification-before-completion/SKILL.md || test -f ~/.config/opencode/skills/superpowers/verification-before-completion/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md || test -f ~/.config/opencode/skills/superpowers/finishing-a-development-branch/SKILL.md
|
||||
```
|
||||
|
||||
If any required dependency is missing, stop immediately and return:
|
||||
|
||||
`Missing dependency: [specific missing item]. Install required OpenCode Superpowers skills (https://github.com/obra/superpowers, OpenCode setup) and the reviewer-runtime helper, then retry.`
|
||||
|
||||
## Required Skill Invocation Rules
|
||||
|
||||
- Invoke relevant skills through OpenCode's native skill tool.
|
||||
- Announce skill usage explicitly:
|
||||
- `I've read the [Skill Name] skill and I'm using it to [purpose].`
|
||||
- For skills with checklists, track checklist items explicitly in conversation.
|
||||
- Do NOT use Claude's `Skill` tool syntax or Cursor's discovery mechanism. OpenCode's skill system may expose shared files from `~/.agents/skills/`, but invocation still goes through OpenCode's native skill mechanism.
|
||||
|
||||
## Trigger Phrase Detection
|
||||
|
||||
**Binding triggers** (always invoke this skill):
|
||||
|
||||
- `/do-task`
|
||||
- "do this task"
|
||||
- "do task ..."
|
||||
- "execute this task"
|
||||
- "make it so"
|
||||
|
||||
**Hint trigger** (invoke unless context clearly maps to another skill):
|
||||
|
||||
- "just do ..."
|
||||
|
||||
**Escape phrases** (skip the Phase 2 clarifying-question loop):
|
||||
|
||||
- `--no-questions`
|
||||
- `"just do it:"`
|
||||
- `"just do this:"`
|
||||
- `"no questions:"`
|
||||
|
||||
**Excluded** (do NOT trigger `do-task`):
|
||||
|
||||
- "implement this" — reserved for `implement-plan`.
|
||||
|
||||
**Dropped defaults** (explicitly NOT binding triggers):
|
||||
|
||||
- "work on ..."
|
||||
- "handle this"
|
||||
- "take care of ..."
|
||||
- "get this done"
|
||||
|
||||
**Worktree opt-in phrases** (Phase 4 takes the worktree branch):
|
||||
|
||||
- "in a worktree"
|
||||
- "use a worktree"
|
||||
- "on an isolated branch"
|
||||
- "on a new branch called X"
|
||||
|
||||
## Process
|
||||
|
||||
### Phase 1: Preflight (includes Bootstrap Superpowers Context)
|
||||
|
||||
1. **Bootstrap Superpowers context** — use OpenCode's native skill tool to list installed skills and confirm `superpowers/brainstorming`, `superpowers/test-driven-development`, `superpowers/verification-before-completion`, and `superpowers/finishing-a-development-branch` are discoverable. If any is missing, stop with the Prerequisite Check error message.
|
||||
2. Verify git repo: `git rev-parse --is-inside-work-tree`.
|
||||
3. Verify `/ai_plan/` is present in `.gitignore`. If missing:
|
||||
- Append `/ai_plan/` to `.gitignore`.
|
||||
- Commit that infra change immediately with message `chore(gitignore): ignore ai_plan local planning artifacts`.
|
||||
- This infra commit is EXPLICITLY separate from the task commit in Phase 9. It may occur even when the final task ends up `aborted` or `failed`.
|
||||
4. Announce each sub-skill before invocation using:
|
||||
`I've read the [Skill Name] skill and I'm using it to [purpose].`
|
||||
|
||||
### Phase 2: Parse Prompt and Question
|
||||
|
||||
1. Capture the exact user prompt verbatim.
|
||||
2. Detect trigger phrase (see above) and record which one matched.
|
||||
3. Detect escape phrase. If set, skip clarifying questions entirely.
|
||||
4. Apply the ask-first heuristic:
|
||||
- Skip clarifying questions ONLY if ALL are true:
|
||||
- Prompt names a concrete target (file, feature, or function).
|
||||
- Prompt names a concrete outcome (what success looks like).
|
||||
- Prompt has no ambiguous scope (no "and maybe also ...").
|
||||
- All identifiers in the prompt are resolvable against the codebase.
|
||||
- Otherwise, ask 1-3 clarifying questions, ONE AT A TIME, multiple-choice preferred.
|
||||
- Empty prompt → ask exactly once: "what task?".
|
||||
5. Invoke `superpowers/brainstorming` via OpenCode's native skill tool for any **behavior-changing** task — feature creation, bug fix with multiple plausible approaches, refactor, design decision. Present 2-3 approaches and recommend one before finalizing the plan. The ONLY skip conditions are the same ones that allow TDD auto-skip: `pure-documentation` and `pure-comment-whitespace-rename`. When skipping, record the skip reason in the Interpretation section of `task-plan.md`.
|
||||
|
||||
### Phase 3: Configure Reviewer
|
||||
|
||||
If the user has already specified a reviewer CLI and model (e.g., "do task X, review with codex gpt-5.4"), use those values. If the user says "use defaults" or otherwise opts out of explicit configuration, proceed with `REVIEWER_CLI=codex`, `REVIEWER_MODEL=gpt-5.4`, and `MAX_ROUNDS=10`. Otherwise, ask:
|
||||
|
||||
1. **Which CLI should review both the plan and the implementation?**
|
||||
- `codex` — OpenAI Codex CLI (`codex exec`)
|
||||
- `claude` — Claude Code CLI (`claude -p`)
|
||||
- `cursor` — Cursor Agent CLI (`cursor-agent -p`)
|
||||
- `opencode` — OpenCode CLI (`opencode run`)
|
||||
- `skip` — No external review, proceed with user approval only at each loop.
|
||||
|
||||
2. **Which model?** (only if a CLI was chosen)
|
||||
- For `codex`: default `gpt-5.4`, alternatives: `gpt-5.3-codex`, `o4-mini`, `o3`.
|
||||
- For `claude`: default `sonnet`, alternatives: `opus`, `haiku`.
|
||||
- For `cursor`: **run `cursor-agent models` first** to see available models.
|
||||
- For `opencode`: provider-qualified form `<provider>/<model>` (e.g., `anthropic/claude-sonnet-4-5`, `openai/gpt-5.4`). Run `opencode models` to list available models.
|
||||
- Accept any model string the user provides.
|
||||
|
||||
3. **Max review rounds shared across both loops?** (default: 10)
|
||||
- If the user does not provide a value, set `MAX_ROUNDS=10`.
|
||||
|
||||
Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS` for Phases 5 and 8.
|
||||
|
||||
Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`.
|
||||
|
||||
If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:
|
||||
|
||||
```bash
|
||||
pi --version
|
||||
```
|
||||
|
||||
For shorthand `pi/<pi-model-name>`, split only on the first slash when the prefix is exactly `pi`; store the complete remainder in `REVIEWER_MODEL`. Examples: `pi/claude-opus-4-7` -> `claude-opus-4-7`, `pi/anthropic/claude-opus-4-7` -> `anthropic/claude-opus-4-7`, and `pi/openrouter/anthropic/claude-opus-4-7` -> `openrouter/anthropic/claude-opus-4-7`.
|
||||
|
||||
When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the model running this workflow. If the model/provider is unavailable, surface helper stderr/status and use `pi --list-models [search]` to inspect configured models.
|
||||
|
||||
### Phase 4: Initialize Plan Workspace
|
||||
|
||||
OpenCode has no plan-mode concept; there is no plan-mode guard here.
|
||||
|
||||
Steps:
|
||||
|
||||
1. Compute slug: `YYYY-MM-DD-<slug>` where `<slug>` is a kebab-case hash of the task goal (lowercase, alphanumeric + hyphens only).
|
||||
2. Compute plan folder: `ai_plan/<slug>/`.
|
||||
3. **Resume detection:** If the folder already exists, read `task-plan.md`:
|
||||
- If `Status` is `draft` or `plan-approved` or `implementation-in-progress`: offer to resume, pick a new suffix (`<slug>-v2`), or abort. Default is resume.
|
||||
- If `Status` is any terminal value (`pushed`, `local-only`, `aborted-*`, `failed`): offer a new suffix or abort. Default is new suffix.
|
||||
4. If not resuming, create the folder and write `task-plan.md` from the template at `templates/task-plan.md` (this skill's template folder; falls back to `~/.config/opencode/skills/do-task/templates/task-plan.md` when installed directly).
|
||||
5. Fill in:
|
||||
- `Metadata` block.
|
||||
- `Prompt` (verbatim).
|
||||
- `Interpretation`, `Assumptions`, `Files`, `Approach`, `TDD Approach`, `Acceptance Criteria`, `Verification`, `Rollback`.
|
||||
- Leave `Runtime State`, `Review History`, `Final Status` empty (skill updates these).
|
||||
6. Set `Status: draft`.
|
||||
|
||||
**Worktree branch:** If the prompt opts in to a worktree (see Trigger Phrase Detection), invoke `superpowers/using-git-worktrees` via OpenCode's native skill tool before proceeding. Otherwise continue on the current branch.
|
||||
|
||||
### Phase 5: Plan Review Loop
|
||||
|
||||
If `REVIEWER_CLI=skip`, present `task-plan.md` to the user and proceed only after explicit user approval.
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```text
|
||||
REVIEW_KIND = plan
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
|
||||
PAYLOAD_PATH = /tmp/do-task-plan-${REVIEW_ID}.md
|
||||
PROMPT_TEMPLATE = PLAN_REVIEW_PROMPT (see below)
|
||||
SESSION_ID_VAR = CODEX_PLAN_SESSION_ID | CURSOR_PLAN_SESSION_ID | OPENCODE_PLAN_SESSION_ID
|
||||
```
|
||||
|
||||
Payload is the current `task-plan.md` **with the `Runtime State` and `Review History` blocks stripped** before writing to `PAYLOAD_PATH`. Those two blocks contain reviewer session IDs and scan outcomes that must never be sent back to any reviewer CLI. Reviewers only need the Prompt, Interpretation, Assumptions, Files, Approach, TDD Approach, Acceptance Criteria, Verification, Rollback, and Metadata sections.
|
||||
|
||||
`PLAN_REVIEW_PROMPT`:
|
||||
|
||||
```text
|
||||
Review this task plan for completeness, correctness, and risk. Focus on:
|
||||
1. Does the plan match the user's prompt?
|
||||
2. Are all assumptions surfaced?
|
||||
3. Are acceptance criteria testable?
|
||||
4. Is the TDD approach appropriate per the TDD Approach section?
|
||||
5. Are there missing files, risks, or security concerns?
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use `- None.` when a severity has no findings.
|
||||
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
|
||||
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`.
|
||||
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking.
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: plan-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 6.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-plan-review`.
|
||||
- Send Telegram summary before stopping.
|
||||
- Ask the user whether to override and proceed, restart, or abort.
|
||||
|
||||
### Phase 6: Execute (TDD-first where applicable)
|
||||
|
||||
Native orchestration — do not invoke `superpowers:executing-plans`.
|
||||
|
||||
1. Set `Status: implementation-in-progress`.
|
||||
2. For every behavior-changing file edit:
|
||||
- Invoke `superpowers/test-driven-development` via OpenCode's native skill tool.
|
||||
- Write the failing test first. Run it. Confirm it fails.
|
||||
- Implement the minimal code to make it pass. Run the test. Confirm green.
|
||||
- Do NOT commit yet — a single task commit happens in Phase 9.
|
||||
3. Auto-skip of TDD is permitted ONLY for tasks classified in `task-plan.md` TDD Approach as:
|
||||
- `pure-documentation`
|
||||
- `pure-comment-whitespace-rename`
|
||||
4. Any other skip (including `pure-config-addition`) requires explicit user approval recorded in `task-plan.md` with an ISO-8601 timestamp.
|
||||
5. Update `task-plan.md` after each logical step: add notes to `Approach`, check off `Acceptance Criteria` items as they complete.
|
||||
|
||||
### Phase 7: Verification Gate
|
||||
|
||||
Invoke `superpowers/verification-before-completion` via OpenCode's native skill tool.
|
||||
|
||||
Run the commands listed in the `Verification` section of `task-plan.md`:
|
||||
|
||||
- Lint (changed files first).
|
||||
- Typecheck.
|
||||
- Tests (targeted first, then broader suite if quick).
|
||||
|
||||
All must pass. If a command fails:
|
||||
|
||||
- Fix the issue.
|
||||
- Re-run that command.
|
||||
- Increment `verification_attempts` in Runtime State.
|
||||
|
||||
If `verification_attempts` exceeds 3 without green:
|
||||
|
||||
- Set `Status: aborted-verification`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to retry, override, or abort.
|
||||
|
||||
### Phase 8: Implementation Review Loop
|
||||
|
||||
If `REVIEWER_CLI=skip`, present a diff + verification summary to the user and proceed only after explicit user approval.
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```text
|
||||
REVIEW_KIND = implementation
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8) # distinct from plan-review ID
|
||||
PAYLOAD_PATH = /tmp/do-task-implementation-${REVIEW_ID}.md
|
||||
PROMPT_TEMPLATE = IMPL_REVIEW_PROMPT (see below)
|
||||
SESSION_ID_VAR = CODEX_IMPL_SESSION_ID | CURSOR_IMPL_SESSION_ID | OPENCODE_IMPL_SESSION_ID
|
||||
```
|
||||
|
||||
Payload contents (assembled by the skill):
|
||||
|
||||
```markdown
|
||||
# Implementation Review: [Short Title]
|
||||
|
||||
## Task Plan (the plan that was approved)
|
||||
<embed approved task-plan.md, excluding Runtime State block>
|
||||
|
||||
## Changes Made (git diff)
|
||||
<output of: `git diff` for unstaged + `git diff --staged` for staged>
|
||||
|
||||
## Verification Output
|
||||
### Lint
|
||||
<lint output>
|
||||
### Typecheck
|
||||
<typecheck output>
|
||||
### Tests
|
||||
<test output, pass/fail counts>
|
||||
```
|
||||
|
||||
`IMPL_REVIEW_PROMPT`:
|
||||
|
||||
```text
|
||||
Review this implementation against the task plan. Focus on:
|
||||
1. Correctness — Does the diff satisfy the Acceptance Criteria?
|
||||
2. Code quality — Clean, maintainable, no obvious issues?
|
||||
3. Test coverage — Are behavior changes adequately tested (per the plan's TDD Approach)?
|
||||
4. Security — Any security concerns introduced?
|
||||
5. Regressions — Does the diff risk breaking unrelated code?
|
||||
|
||||
Return exactly these sections in order:
|
||||
## Summary
|
||||
## Findings
|
||||
### P0
|
||||
### P1
|
||||
### P2
|
||||
### P3
|
||||
## Verdict
|
||||
|
||||
Rules:
|
||||
- Order findings from highest severity to lowest.
|
||||
- Use `- None.` when a severity has no findings.
|
||||
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
|
||||
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`.
|
||||
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking.
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: implementation-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 9.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-impl-review`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to override and commit anyway, restart, or abort.
|
||||
|
||||
### Phase 9: Commit + Push Ask
|
||||
|
||||
Invoke `superpowers/finishing-a-development-branch` via OpenCode's native skill tool.
|
||||
|
||||
1. Stage all changed files explicitly (avoid `git add -A`).
|
||||
2. Single commit with message derived from the task goal:
|
||||
- Format: `<type>(<scope>): <short description>`
|
||||
- Example: `feat(auth): add session token rotation`
|
||||
3. Do NOT push. Update `Status: local-only`.
|
||||
4. Ask the user: "Push to remote? (yes / no)"
|
||||
- On explicit `yes` → push, then set `Status: pushed`.
|
||||
- Any other response → leave `Status: local-only`.
|
||||
|
||||
### Phase 10: Telegram Notification + Finalize
|
||||
|
||||
Resolve the notifier helper:
|
||||
|
||||
```bash
|
||||
TELEGRAM_NOTIFY_RUNTIME=~/.config/opencode/skills/reviewer-runtime/notify-telegram.sh
|
||||
```
|
||||
|
||||
On every terminal outcome (`pushed`, `local-only`, `aborted-*`, `failed`), send a Telegram summary if both `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` are set:
|
||||
|
||||
```bash
|
||||
if [ -x "$TELEGRAM_NOTIFY_RUNTIME" ] && [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
|
||||
"$TELEGRAM_NOTIFY_RUNTIME" --message "do-task <slug>: <status summary>"
|
||||
fi
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- Telegram is the only supported notification path.
|
||||
- Notification failures are non-blocking but must be surfaced to the user.
|
||||
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
|
||||
- If Telegram is not configured, state that no Telegram notification was sent.
|
||||
|
||||
Fill in `Final Status` in `task-plan.md` (include commit hash if any). Do NOT delete the plan folder — it stays as a record.
|
||||
|
||||
---
|
||||
|
||||
## Review Loop (Shared Subroutine)
|
||||
|
||||
This subroutine is invoked twice per `do-task` run: once in Phase 5 (`REVIEW_KIND=plan`) and once in Phase 8 (`REVIEW_KIND=implementation`). Separate session IDs are used for each loop so reviewer context never leaks across loops.
|
||||
|
||||
### Subroutine Inputs
|
||||
|
||||
| Variable | Purpose |
|
||||
|----------|---------|
|
||||
| `REVIEW_KIND` | `plan` or `implementation` |
|
||||
| `REVIEW_ID` | 8-char hex (from `uuidgen`); reused across rounds of the same loop |
|
||||
| `PAYLOAD_PATH` | `/tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md` |
|
||||
| `PROMPT_TEMPLATE` | `PLAN_REVIEW_PROMPT` or `IMPL_REVIEW_PROMPT` |
|
||||
| `REVIEWER_CLI` | `codex` \| `claude` \| `cursor` \| `opencode` \| `pi` |
|
||||
| `REVIEWER_MODEL` | Model name |
|
||||
| `MAX_ROUNDS` | Default 10 |
|
||||
| `SESSION_ID_VAR` | `CODEX_PLAN_SESSION_ID` \| `CODEX_IMPL_SESSION_ID` \| `CURSOR_PLAN_SESSION_ID` \| `CURSOR_IMPL_SESSION_ID` \| `OPENCODE_PLAN_SESSION_ID` \| `OPENCODE_IMPL_SESSION_ID` |
|
||||
|
||||
Temp artifact paths (per loop):
|
||||
|
||||
- `/tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md` — payload
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md` — normalized review text
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json` — raw Cursor/OpenCode JSON (cursor only, plus opencode when `--format json` is used)
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr`
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status`
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out`
|
||||
- `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh`
|
||||
|
||||
Resolve the shared helper:
|
||||
|
||||
```bash
|
||||
REVIEWER_RUNTIME=~/.config/opencode/skills/reviewer-runtime/run-review.sh
|
||||
```
|
||||
|
||||
Set helper success-artifact args before writing the command script:
|
||||
|
||||
```bash
|
||||
HELPER_SUCCESS_FILE_ARGS=()
|
||||
case "$REVIEWER_CLI" in
|
||||
codex)
|
||||
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md)
|
||||
;;
|
||||
cursor)
|
||||
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json)
|
||||
;;
|
||||
esac
|
||||
```
|
||||
|
||||
### Step 1: Write Payload
|
||||
|
||||
Write the full payload for this round to `PAYLOAD_PATH`.
|
||||
|
||||
### Step 1a: Secret Scan (per-payload, no caching)
|
||||
|
||||
**BEFORE** sending the payload to any reviewer CLI, scan it for secrets. This scan runs EVERY round — no results are cached. Rationale: Phase 8 payloads include newly-introduced diff content that earlier rounds never saw.
|
||||
|
||||
Run the secret scan with all of these anchored regexes. Use `grep -En` on the payload file:
|
||||
|
||||
```bash
|
||||
SECRET_REGEX_FILE=$(mktemp)
|
||||
cat >"$SECRET_REGEX_FILE" <<'EOF'
|
||||
AKIA[0-9A-Z]{16}
|
||||
"type"\s*:\s*"service_account"
|
||||
(ghp|gho|ghs|ghu|ghr)_[A-Za-z0-9]{36,}
|
||||
xox[abpsr]-[0-9]+-[0-9]+-[0-9]+-[A-Za-z0-9]{24,}
|
||||
xox[abpsr]-[A-Za-z0-9]{10,48}
|
||||
sk-(proj-)?[A-Za-z0-9_-]{20,}
|
||||
sk-ant-(api|admin)[0-9]+-[A-Za-z0-9_-]{20,}
|
||||
-----BEGIN [A-Z ]+ PRIVATE KEY-----
|
||||
(TOKEN|SECRET|PASSWORD|API_?KEY|ACCESS_?KEY)\s*=\s*["']?[A-Za-z0-9+/=_-]{8,}
|
||||
eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+
|
||||
EOF
|
||||
|
||||
SCAN_MATCHES=$(grep -Ensf "$SECRET_REGEX_FILE" "$PAYLOAD_PATH" || true)
|
||||
rm -f "$SECRET_REGEX_FILE"
|
||||
```
|
||||
|
||||
If `SCAN_MATCHES` is non-empty:
|
||||
|
||||
1. **Redact the matched text before surfacing** — never echo the raw secret to the user, chat log, terminal scrollback, or any persistent file. Replace each matched substring with a fixed token that preserves only the fact of a match: `[REDACTED:<pattern-label>:<match-length>-chars]`. Example: a matched AWS key becomes `[REDACTED:aws-access-key:20-chars]`. Keep the file path and line number; they are useful for the user and not secret.
|
||||
2. Present the redacted match summary to the user using this exact wording:
|
||||
|
||||
```text
|
||||
SECRET-SCAN MATCH in outbound reviewer payload (loop: ${REVIEW_KIND}, round: N):
|
||||
<file>:<line>: [REDACTED:<pattern-label>:<match-length>-chars]
|
||||
...
|
||||
Proceed with sending this payload to ${REVIEWER_CLI}? (yes / no / redact)
|
||||
```
|
||||
|
||||
Pattern labels: `aws-access-key`, `gcp-service-account`, `github-token`, `slack-token`, `openai-key`, `anthropic-key`, `pem-private-key`, `dotenv-style`, `jwt`.
|
||||
|
||||
3. Wait for user response.
|
||||
4. On `yes`: record `last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches` in Runtime State, and proceed.
|
||||
5. On `redact`: ask the user to supply redactions, apply them to `PAYLOAD_PATH`, re-scan (this step), record `last_scan_outcome_${REVIEW_KIND}=redacted-and-approved`.
|
||||
6. On `no`: stop the loop, set `Status: failed`, send Telegram, return to the user.
|
||||
|
||||
If `SCAN_MATCHES` is empty, record `last_scan_outcome_${REVIEW_KIND}=clean` and proceed.
|
||||
|
||||
### Step 2: Generate Reviewer Command Script
|
||||
|
||||
Write the reviewer invocation to `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh` as a bash script starting with:
|
||||
|
||||
```bash
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call every round (Pi reviewer calls do not use session resume):
|
||||
|
||||
```bash
|
||||
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files \
|
||||
--model "$REVIEWER_MODEL" \
|
||||
--tools read,grep,find,ls \
|
||||
-p "Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review. Return exactly the required ## Summary, ## Findings, and ## Verdict structure."
|
||||
```
|
||||
|
||||
**If `REVIEWER_CLI` is `codex`:**
|
||||
|
||||
Round 1 — fresh `codex exec`:
|
||||
|
||||
```bash
|
||||
codex exec \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
-s read-only \
|
||||
-o /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
"Review the ${REVIEW_KIND} payload in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
${PROMPT_TEMPLATE}"
|
||||
```
|
||||
|
||||
Do not capture the Codex session ID yet. After Round 1 completes, extract it from `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out` (look for `session id: <uuid>`) and persist it to Runtime State under `${SESSION_ID_VAR}`.
|
||||
|
||||
Round 2 and later — resume session:
|
||||
|
||||
```bash
|
||||
codex exec resume ${SESSION_ID_VAR_VALUE} \
|
||||
-o /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
"I've revised based on your feedback. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before.
|
||||
Keep findings ordered P0 to P3, use '- None.' when a severity has no findings, and only use VERDICT: APPROVED when no P0, P1, or P2 findings remain."
|
||||
```
|
||||
|
||||
If resume fails, fall back to fresh `codex exec` with prior-round context.
|
||||
|
||||
**If `REVIEWER_CLI` is `claude`:**
|
||||
|
||||
Fresh call every round (Claude CLI has no session resume):
|
||||
|
||||
```bash
|
||||
claude -p \
|
||||
"${ROUND_PREFIX}Review the following ${REVIEW_KIND} payload.
|
||||
|
||||
$(cat /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md)
|
||||
|
||||
${PROMPT_TEMPLATE}" \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--strict-mcp-config \
|
||||
--setting-sources user
|
||||
```
|
||||
|
||||
Where `${ROUND_PREFIX}` is empty for Round 1 and `"You previously reviewed this ${REVIEW_KIND} and requested revisions. Previous feedback summary: [key points]. "` for subsequent rounds.
|
||||
|
||||
**If `REVIEWER_CLI` is `cursor`:**
|
||||
|
||||
Round 1:
|
||||
|
||||
```bash
|
||||
cursor-agent -p \
|
||||
--mode=ask \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--trust \
|
||||
--output-format json \
|
||||
"Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review.
|
||||
|
||||
${PROMPT_TEMPLATE}" \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Round 2 and later — resume:
|
||||
|
||||
```bash
|
||||
cursor-agent --resume ${SESSION_ID_VAR_VALUE} -p \
|
||||
--mode=ask \
|
||||
--model ${REVIEWER_MODEL} \
|
||||
--trust \
|
||||
--output-format json \
|
||||
"I've revised based on your feedback. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
If resume fails, fall back to fresh `cursor-agent -p`.
|
||||
|
||||
After the command completes, extract the session id and review text:
|
||||
|
||||
```bash
|
||||
CURSOR_SID=$(jq -r '.session_id' /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json)
|
||||
jq -r '.result' /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
Persist `CURSOR_SID` to Runtime State under `${SESSION_ID_VAR}` on Round 1.
|
||||
|
||||
**If `REVIEWER_CLI` is `opencode`:**
|
||||
|
||||
OpenCode does not expose a dedicated read-only flag at the CLI level; use the built-in `plan` primary agent (`--agent plan`) for review, which is read-oriented and does not modify files. Session resume is supported via `-s <session-id>`, but the most reliable pattern for non-interactive review is **fresh call each round** (like `claude`) because opencode's session lifecycle and ID capture are less standardized than codex/cursor for headless runs. Skills MAY opt-in to session resume when they have verified the installed opencode version exposes a stable session id in `--format json` output.
|
||||
|
||||
Round 1 (preferred, fresh call):
|
||||
|
||||
```bash
|
||||
opencode run \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review.
|
||||
|
||||
${PROMPT_TEMPLATE}" \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Round 2 and later (fresh-call fallback path — recommended default):
|
||||
|
||||
```bash
|
||||
opencode run \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"You previously reviewed this ${REVIEW_KIND} and requested revisions.
|
||||
|
||||
Previous feedback summary: [key points from last review]
|
||||
|
||||
I've revised. Updated payload is below.
|
||||
|
||||
$(cat /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md)
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Optional session-resume path (only if the installed opencode reliably emits a session id in `--format json` output and accepts it back via `-s`):
|
||||
|
||||
```bash
|
||||
# Round 2+ with resume
|
||||
opencode run \
|
||||
-s ${SESSION_ID_VAR_VALUE} \
|
||||
-m ${REVIEWER_MODEL} \
|
||||
--agent plan \
|
||||
--format json \
|
||||
"I've revised. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
|
||||
|
||||
Changes made:
|
||||
[List specific changes]
|
||||
|
||||
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
|
||||
```
|
||||
|
||||
Extract the review body (the JSON stream emits events; the final assistant message contains the review text):
|
||||
|
||||
```bash
|
||||
jq -r '.[] | select(.type == "message" and .role == "assistant") | .content' \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
|| cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
If the JSON parse falls through, promote the raw JSON file as the review output and surface a warning to the user. On any opencode CLI or JSON parsing failure, treat this loop round as `completed-empty-output` and follow the helper-failure escalation in Step 6.
|
||||
|
||||
### Step 3: Run via `run-review.sh`
|
||||
|
||||
Run the command script through the shared helper when available:
|
||||
|
||||
```bash
|
||||
if [ -x "$REVIEWER_RUNTIME" ]; then
|
||||
"$REVIEWER_RUNTIME" \
|
||||
--command-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh \
|
||||
--stdout-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
--stderr-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr \
|
||||
--status-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status \
|
||||
"${HELPER_SUCCESS_FILE_ARGS[@]}"
|
||||
else
|
||||
echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
|
||||
bash /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh \
|
||||
>/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
2>/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr
|
||||
fi
|
||||
```
|
||||
|
||||
Run the helper in the foreground and watch live stdout for `state=in-progress` heartbeats. If your agent environment buffers command output until exit, start the helper in the background and poll the `.status` file instead of treating heartbeats as post-hoc-only data.
|
||||
|
||||
### Step 4: Promote Reviewer Output + Capture Session ID
|
||||
|
||||
After the command completes:
|
||||
|
||||
- `cursor`: already promoted in Step 2 via `jq -r '.result' ...`. Also capture `session_id` if first round.
|
||||
- `codex`: extract `CODEX_SESSION_ID` from `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text lives only in `.runner.out`, `cp` it into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `claude` or `pi`: promote `.runner.out` into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `opencode`: already promoted in Step 2 via `jq` on the JSON stream. If opt-in session-resume is active and the JSON includes a stable session id, capture it and persist to `${SESSION_ID_VAR}`.
|
||||
|
||||
On Round 1, persist the captured session ID (if any) into `task-plan.md`'s Runtime State under `${SESSION_ID_VAR}`.
|
||||
|
||||
### Step 5: Parse Verdict + Update Review History
|
||||
|
||||
1. Read `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md`.
|
||||
2. Append one row to `task-plan.md` Review History:
|
||||
- Timestamp (ISO-8601 UTC).
|
||||
- Loop (`plan` or `implementation`).
|
||||
- Round number.
|
||||
- Verdict (`APPROVED` or `REVISE`).
|
||||
- Summary (first line of the `## Summary` section).
|
||||
3. Increment `plan_review_round` or `implementation_review_round` in Runtime State.
|
||||
|
||||
### Step 6: Branch APPROVED / REVISE / MAX_ROUNDS
|
||||
|
||||
Verdict rules:
|
||||
|
||||
- **VERDICT: APPROVED** with no `P0`, `P1`, or `P2` findings → exit the subroutine with `APPROVED`.
|
||||
- **VERDICT: APPROVED** with only `P3` findings → optionally fix the `P3` items if cheap and safe, then exit with `APPROVED`.
|
||||
- **VERDICT: REVISE** or any `P0`, `P1`, or `P2` finding → go to revision (see below), then return to Step 1 for the next round.
|
||||
- No clear verdict but `P0`, `P1`, and `P2` are all `- None.` → treat as APPROVED.
|
||||
- Helper state `completed-empty-output` → treat as failed review attempt, surface `.stderr`/`.status`, fix invocation or prompt handling, then retry.
|
||||
- Helper state `needs-operator-decision` → surface status log and decide whether to extend the timeout, abort, or retry with different helper parameters.
|
||||
- Round counter ≥ `MAX_ROUNDS` → exit the subroutine with `MAX_ROUNDS`. Caller decides next action per Phase 5 or Phase 8.
|
||||
|
||||
**Revision:** The caller (Phase 5 for plan, Phase 6/7 for implementation) applies findings in priority order (`P0` → `P1` → `P2` → `P3`). For implementation review revisions, Phase 7 verification must be re-run after every revision before returning to Step 1.
|
||||
|
||||
### Step 7: Liveness Contract (during Step 3)
|
||||
|
||||
- The shared reviewer runtime emits `state=in-progress note="In progress N"` heartbeats every 60 seconds while the reviewer child is alive.
|
||||
- Keep waiting as long as a fresh `In progress N` heartbeat keeps arriving roughly once per minute.
|
||||
- Do not abort just because the review is slow, a soft timeout fired, or a `stall-warning` line appears, as long as the `In progress N` heartbeat continues.
|
||||
- Treat missing heartbeats, `state=failed`, `state=completed-empty-output`, and `state=needs-operator-decision` as escalation signals.
|
||||
|
||||
### Step 8: Cleanup (on successful round exit)
|
||||
|
||||
```bash
|
||||
rm -f /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh
|
||||
```
|
||||
|
||||
If the round failed, produced empty output, or reached operator-decision timeout, KEEP `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them.
|
||||
|
||||
---
|
||||
|
||||
## Resume Semantics
|
||||
|
||||
1. Detect existing plan folder by slug at Phase 4.
|
||||
2. Read `task-plan.md` → `Status`.
|
||||
3. Decide next action:
|
||||
|
||||
| Status | Action |
|
||||
|--------|--------|
|
||||
| `draft` | Resume at Phase 5 (plan review) |
|
||||
| `plan-approved` | Resume at Phase 6 (execute) |
|
||||
| `implementation-in-progress` | Resume at Phase 6 (continue execute) |
|
||||
| `implementation-approved` | Resume at Phase 9 (commit + push ask) |
|
||||
| `pushed` \| `local-only` | Ask user: new suffix, abort, or replay for reference only |
|
||||
| `aborted-*` \| `failed` | Offer new suffix or full restart |
|
||||
|
||||
4. When resuming, read Runtime State for `CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, `OPENCODE_IMPL_SESSION_ID`, and the round counters. If a session ID is populated, use it for the first revision round in that loop (Round 2) via `codex exec resume`, `cursor-agent --resume`, or `opencode run -s <id>` as applicable.
|
||||
|
||||
---
|
||||
|
||||
## Tracker Discipline (MANDATORY)
|
||||
|
||||
**ALWAYS update `task-plan.md` before/after each phase transition. NEVER proceed with stale state.**
|
||||
|
||||
Before starting any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Update `last_phase_entered` in Runtime State.
|
||||
|
||||
After completing any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Append notes to the relevant section of `task-plan.md`.
|
||||
|
||||
Review History is append-only.
|
||||
|
||||
---
|
||||
|
||||
## Execution Workflow Rules
|
||||
|
||||
- Current branch is the default; worktree is opt-in only.
|
||||
- Do NOT push without explicit "yes".
|
||||
- Secret scan runs **per-payload, no caching** — every round, including revisions.
|
||||
- Review loops use `MAX_ROUNDS=10` by default, shared across both loops.
|
||||
- The task commit is a single commit created in Phase 9; interim WIP commits are NOT created.
|
||||
- The `.gitignore` infra commit in Phase 1 is explicitly separate from the task commit and is allowed even on abort.
|
||||
|
||||
---
|
||||
|
||||
## Verification Checklist
|
||||
|
||||
- [ ] `ai_plan/` exists and `/ai_plan/` is in `.gitignore`
|
||||
- [ ] `task-plan.md` created under `ai_plan/YYYY-MM-DD-<slug>/`
|
||||
- [ ] Reviewer CLI + model + `MAX_ROUNDS` configured (or `skip`)
|
||||
- [ ] Secret scan ran on every outbound reviewer payload
|
||||
- [ ] Plan review completed (APPROVED, MAX_ROUNDS handled, or skipped)
|
||||
- [ ] Phase 6 executed TDD-first for all behavior-changing steps (or documented skip)
|
||||
- [ ] Phase 7 verification green before Phase 8
|
||||
- [ ] Implementation review completed (APPROVED, MAX_ROUNDS handled, or skipped)
|
||||
- [ ] Single task commit created locally, no push without explicit yes
|
||||
- [ ] Telegram notification attempted if configured
|
||||
- [ ] `task-plan.md` Final Status filled in
|
||||
|
||||
---
|
||||
|
||||
## Variant Hardening Notes — OpenCode
|
||||
|
||||
- Must use OpenCode's native skill tool for sub-skill invocation. Do NOT use Claude's `Skill` tool syntax. OpenCode may load shared skill files from `~/.agents/skills/`, but invocation is still OpenCode-native.
|
||||
- Phase 1 includes a Bootstrap Superpowers Context step that lists installed skills and confirms `superpowers/brainstorming`, `superpowers/test-driven-development`, `superpowers/verification-before-completion`, and `superpowers/finishing-a-development-branch` are discoverable before any other phase runs.
|
||||
- Helper paths are `~/.config/opencode/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}`.
|
||||
- OpenCode reviewer CLI branch (when `REVIEWER_CLI=opencode`):
|
||||
- Binary: `opencode`. Non-interactive: `opencode run "<message>"`.
|
||||
- Model: `-m <provider>/<model>` (e.g., `openai/gpt-5.4`, `anthropic/claude-sonnet-4-5`).
|
||||
- Read-only posture: `--agent plan` (uses OpenCode's built-in plan primary agent; no explicit `--read-only` flag exists).
|
||||
- Output: `--format json` for structured output. Extraction uses `jq` against the JSON event stream.
|
||||
- Session resume: `-s <session-id>` or `--continue`. Fresh call each round is the recommended default since session id capture is less standardized than codex/cursor for headless runs.
|
||||
- No plan-mode guard (OpenCode has no plan-mode concept).
|
||||
|
||||
## Common Mistakes
|
||||
|
||||
- Skipping the Bootstrap Superpowers Context step in Phase 1 (breaks native skill discovery).
|
||||
- Using Claude `Skill` tool syntax, or treating shared `~/.agents/skills/` files as anything other than OpenCode-native skill entries.
|
||||
- Forgetting to set `--agent plan` on opencode reviewer calls (would use the default `build` agent which can write files).
|
||||
- Asking multiple clarifying questions in a single message.
|
||||
- Skipping the per-payload secret scan because "the previous round was clean".
|
||||
- Pushing the task commit without explicit user approval.
|
||||
- Using a non-provider-qualified model string for opencode (e.g., `gpt-5.4` instead of `openai/gpt-5.4`).
|
||||
|
||||
## Red Flags — Stop and Correct
|
||||
|
||||
- You are invoking sub-skills via Claude's `Skill` tool or Codex native-discovery paths instead of OpenCode's native skill tool.
|
||||
- You are running an opencode reviewer call without `--agent plan`.
|
||||
- You did not announce which skill you invoked and why.
|
||||
- You are proceeding to implementation review with failing lint/typecheck/tests.
|
||||
- You are echoing raw secret-scan matches to the user or logs.
|
||||
- You are pushing without explicit user approval.
|
||||
@@ -0,0 +1,143 @@
|
||||
# Task Plan: [Short Title]
|
||||
|
||||
> **Variant guardrail (OpenCode):** Sub-skills (`brainstorming`, `test-driven-development`, `verification-before-completion`, `finishing-a-development-branch`, `using-git-worktrees`) MUST be invoked through OpenCode's native skill tool from `~/.config/opencode/skills/superpowers/<skill>/SKILL.md`. Phase 1 MUST include the Bootstrap Superpowers Context step. Opencode reviewer calls MUST use `--agent plan` to stay read-only.
|
||||
|
||||
## Metadata
|
||||
|
||||
| Field | Value |
|
||||
|-------|-------|
|
||||
| Created | YYYY-MM-DD |
|
||||
| Slug | YYYY-MM-DD-<slug> |
|
||||
| Runtime | opencode |
|
||||
| Reviewer CLI | codex \| claude \| cursor \| opencode \| pi |
|
||||
| Reviewer Model | <model> |
|
||||
| MAX_ROUNDS | 10 |
|
||||
| Branch Strategy | current-branch \| worktree |
|
||||
| Branch Name | <current branch name, or new branch name when worktree is used> |
|
||||
| Worktree Path | <absolute path to worktree dir; blank when Branch Strategy = current-branch> |
|
||||
| Status | draft |
|
||||
|
||||
### Status Enum (authoritative)
|
||||
|
||||
| Value | Meaning |
|
||||
|-------|---------|
|
||||
| `draft` | Newly created; plan review not yet started |
|
||||
| `plan-approved` | Plan review loop returned APPROVED |
|
||||
| `implementation-in-progress` | Phase 6 executing |
|
||||
| `implementation-approved` | Phase 8 review loop returned APPROVED; awaiting commit |
|
||||
| `pushed` | Committed + pushed to remote |
|
||||
| `local-only` | Committed locally; user declined push |
|
||||
| `aborted-plan-review` | MAX_ROUNDS reached in Phase 5; user aborted |
|
||||
| `aborted-impl-review` | MAX_ROUNDS reached in Phase 8; user aborted |
|
||||
| `aborted-verification` | Phase 7 retries exhausted; user aborted |
|
||||
| `failed` | Hard tooling failure |
|
||||
|
||||
---
|
||||
|
||||
## Prompt
|
||||
|
||||
<!-- Exact user prompt, verbatim. -->
|
||||
|
||||
## Interpretation
|
||||
|
||||
<!-- Short restatement of goal + out-of-scope items. -->
|
||||
|
||||
## Assumptions
|
||||
|
||||
<!-- Anything we're assuming and needs confirmation. Empty list OK after clarifying questions. -->
|
||||
|
||||
## Files
|
||||
|
||||
<!-- Files expected to be created / modified / deleted. Paths are absolute or repo-relative. -->
|
||||
|
||||
| Action | Path | Why |
|
||||
|--------|------|-----|
|
||||
| | | |
|
||||
|
||||
## Approach
|
||||
|
||||
<!-- 3-10 bullets describing implementation order. -->
|
||||
|
||||
## TDD Approach
|
||||
|
||||
<!-- One of:
|
||||
(a) **TDD applies** — list the failing test(s) to write first, then implementation, then confirm green.
|
||||
(b) **TDD auto-skipped** — reason must be exactly one of:
|
||||
- `pure-documentation`
|
||||
- `pure-comment-whitespace-rename`
|
||||
(c) **TDD user-approved skip** — user explicitly approved skipping TDD for this task.
|
||||
Record the approval timestamp (ISO-8601) and the specific reason (e.g., `pure-config-addition`).
|
||||
-->
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] <criterion 1>
|
||||
- [ ] <criterion 2>
|
||||
|
||||
## Verification
|
||||
|
||||
<!-- Commands to run:
|
||||
lint: <cmd>
|
||||
typecheck: <cmd>
|
||||
tests: <cmd>
|
||||
-->
|
||||
|
||||
## Rollback
|
||||
|
||||
<!-- How to undo: `git revert <hash>`, or manual steps if the change is not easily revertable. -->
|
||||
|
||||
---
|
||||
|
||||
## Runtime State
|
||||
|
||||
<!-- Updated by the skill at runtime. Used to detect resume and to persist reviewer session IDs across rounds. -->
|
||||
|
||||
```yaml
|
||||
plan_review_round: 0
|
||||
implementation_review_round: 0
|
||||
CODEX_PLAN_SESSION_ID:
|
||||
CODEX_IMPL_SESSION_ID:
|
||||
CURSOR_PLAN_SESSION_ID:
|
||||
CURSOR_IMPL_SESSION_ID:
|
||||
OPENCODE_PLAN_SESSION_ID:
|
||||
OPENCODE_IMPL_SESSION_ID:
|
||||
last_phase_entered:
|
||||
last_round_ts:
|
||||
last_scan_outcome_plan:
|
||||
last_scan_outcome_impl:
|
||||
verification_attempts: 0
|
||||
tests_added_count: 0
|
||||
tdd_used: false
|
||||
```
|
||||
|
||||
## Review History
|
||||
|
||||
<!-- Append one entry per reviewer round, both loops. -->
|
||||
|
||||
| Timestamp (ISO-8601) | Loop | Round | Verdict | Summary |
|
||||
|----------------------|------|-------|---------|---------|
|
||||
| | | | | |
|
||||
|
||||
## Final Status
|
||||
|
||||
<!-- Filled at the terminal outcome (phase 9/10). Populate at least:
|
||||
- Terminal status (one of the 10 Status enum values)
|
||||
- Commit hash (if any)
|
||||
- Plan-review rounds used / MAX_ROUNDS
|
||||
- Implementation-review rounds used / MAX_ROUNDS
|
||||
- TDD used (true|false)
|
||||
- Tests added count
|
||||
- Verification attempts used
|
||||
- Last round ISO-8601 timestamp
|
||||
- Notes (anything the user should know when revisiting)
|
||||
-->
|
||||
|
||||
---
|
||||
|
||||
## Guardrails (do NOT remove)
|
||||
|
||||
- This file is the single persistent artifact for `do-task`. Do not split it or delete it on success.
|
||||
- `Status` must always match one of the 10 enum values.
|
||||
- `Runtime State` is updated by the skill, not by the user.
|
||||
- Review History is append-only.
|
||||
- `last_scan_outcome_plan` and `last_scan_outcome_impl` record the most recent secret-scan result for each loop. They are informational; the scan itself runs per-payload with no caching.
|
||||
@@ -0,0 +1,210 @@
|
||||
---
|
||||
name: do-task
|
||||
description: Execute a single user-supplied prompt end-to-end in pi with plan review, implementation review, verification, and one persistent task-plan artifact.
|
||||
---
|
||||
|
||||
# Do Task (Pi)
|
||||
|
||||
Execute an ad-hoc user prompt end-to-end: parse, clarify, plan, implement, verify, review, commit, and optionally push.
|
||||
|
||||
This variant uses one persistent `task-plan.md` under `ai_plan/` and defaults to the current branch unless the prompt explicitly opts into a worktree workflow.
|
||||
|
||||
## Shared Setup
|
||||
|
||||
Before using this skill, read:
|
||||
|
||||
- [docs/PI-SUPERPOWERS.md](../../../docs/PI-SUPERPOWERS.md)
|
||||
- [docs/PI-COMMON-REVIEWER.md](../../../docs/PI-COMMON-REVIEWER.md)
|
||||
|
||||
This workflow depends on:
|
||||
|
||||
- Superpowers skills being visible to pi
|
||||
- the pi reviewer-runtime helper being installed in a supported location
|
||||
|
||||
## Prerequisite Check (MANDATORY)
|
||||
|
||||
Required:
|
||||
|
||||
- `pi --version`
|
||||
- Superpowers `brainstorming`
|
||||
- Superpowers `test-driven-development`
|
||||
- Superpowers `verification-before-completion`
|
||||
- Superpowers `finishing-a-development-branch`
|
||||
- Superpowers `using-git-worktrees` when the prompt opts into a worktree
|
||||
- pi reviewer runtime helper
|
||||
- pi Telegram notifier helper
|
||||
|
||||
Quick checks for common installs:
|
||||
|
||||
```bash
|
||||
pi --version
|
||||
test -f ~/.agents/skills/superpowers/brainstorming/SKILL.md || test -f ~/.pi/agent/skills/superpowers/brainstorming/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/test-driven-development/SKILL.md || test -f ~/.pi/agent/skills/superpowers/test-driven-development/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/verification-before-completion/SKILL.md || test -f ~/.pi/agent/skills/superpowers/verification-before-completion/SKILL.md
|
||||
test -f ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md || test -f ~/.pi/agent/skills/superpowers/finishing-a-development-branch/SKILL.md
|
||||
test -x .pi/skills/reviewer-runtime/pi/run-review.sh || test -x ~/.pi/agent/skills/reviewer-runtime/pi/run-review.sh
|
||||
test -x .pi/skills/reviewer-runtime/pi/notify-telegram.sh || test -x ~/.pi/agent/skills/reviewer-runtime/pi/notify-telegram.sh
|
||||
```
|
||||
|
||||
If you use a settings-defined skill path for Superpowers, confirm it matches [docs/PI-SUPERPOWERS.md](../../../docs/PI-SUPERPOWERS.md) before continuing.
|
||||
|
||||
If you install the reviewer helper in a nonstandard location, confirm it matches [docs/PI-COMMON-REVIEWER.md](../../../docs/PI-COMMON-REVIEWER.md) before continuing.
|
||||
|
||||
If any required dependency is missing, stop immediately and return:
|
||||
|
||||
`Missing dependency: pi do-task requires the workflow skills and reviewer setup documented in docs/PI-SUPERPOWERS.md and docs/PI-COMMON-REVIEWER.md.`
|
||||
|
||||
## Required Workflow Rules
|
||||
|
||||
- Load the relevant workflow skill before entering its phase. If pi did not auto-load it, use `/skill:<name>`.
|
||||
- Announce skill usage explicitly:
|
||||
- `I've read the [Skill Name] skill and I'm using it to [purpose].`
|
||||
- Keep the `task-plan.md` artifact current as work progresses.
|
||||
- Do not use deprecated wrapper CLIs.
|
||||
|
||||
## Trigger Detection
|
||||
|
||||
Always use this skill for:
|
||||
|
||||
- `/do-task`
|
||||
- `do this task`
|
||||
- `do task ...`
|
||||
- `execute this task`
|
||||
- `make it so`
|
||||
- `just do ...` when another skill is not a better fit
|
||||
|
||||
Use current-branch execution by default. Only switch to a worktree when the prompt explicitly asks for one.
|
||||
|
||||
## Process
|
||||
|
||||
### Phase 1: Preflight
|
||||
|
||||
1. Verify the repo: `git rev-parse --is-inside-work-tree`
|
||||
2. Ensure `/ai_plan/` exists in `.gitignore`
|
||||
3. Confirm the required workflow skills are available to pi
|
||||
4. Announce each workflow skill before using it
|
||||
|
||||
### Phase 2: Parse Prompt And Clarify
|
||||
|
||||
1. Capture the user's prompt verbatim
|
||||
2. Detect whether the prompt is concrete enough to proceed without questions
|
||||
3. If needed, ask 1-3 short questions one at a time
|
||||
4. Load `brainstorming` for behavior-changing work unless the task is pure documentation or pure comment/whitespace/rename work
|
||||
|
||||
### Phase 3: Configure Reviewer
|
||||
|
||||
If the user already specified reviewer settings, use them. Otherwise ask:
|
||||
|
||||
Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`
|
||||
|
||||
1. Which CLI should review the plan and implementation?
|
||||
2. Reviewer model
|
||||
3. Max rounds, default `10`
|
||||
|
||||
Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS`.
|
||||
|
||||
If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:
|
||||
|
||||
```bash
|
||||
pi --version
|
||||
```
|
||||
|
||||
For shorthand `pi/<pi-model-name>`, split only on the first slash when the prefix is exactly `pi`; store the complete remainder in `REVIEWER_MODEL`. Examples: `pi/claude-opus-4-7` -> `claude-opus-4-7`, `pi/anthropic/claude-opus-4-7` -> `anthropic/claude-opus-4-7`, and `pi/openrouter/anthropic/claude-opus-4-7` -> `openrouter/anthropic/claude-opus-4-7`.
|
||||
|
||||
When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the pi model running this workflow. Use any configured pi model string, including provider-qualified model IDs. If the reviewer model or provider is unavailable, surface the review helper stderr/status and ask for a configured model; use `pi --list-models [search]` to inspect configured models.
|
||||
|
||||
The pi reviewer command rendered into `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh` must be isolated and read-only:
|
||||
|
||||
```bash
|
||||
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files --model "$REVIEWER_MODEL" --tools read,grep,find,ls -p "Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review."
|
||||
```
|
||||
|
||||
The pi reviewer invocation must not load workflow skills and must not include `write`, `edit`, or `bash` tools.
|
||||
|
||||
### Phase 4: Initialize `task-plan.md`
|
||||
|
||||
1. Compute `ai_plan/YYYY-MM-DD-<slug>/`
|
||||
2. Resume if an existing plan folder is active, otherwise create a new one
|
||||
3. Write `task-plan.md` from this skill's `templates/task-plan.md`
|
||||
4. Fill `Metadata`, `Prompt`, `Interpretation`, `Assumptions`, `Files`, `Approach`, `TDD Approach`, `Acceptance Criteria`, `Verification`, and `Rollback`
|
||||
5. Set `Status: draft`
|
||||
|
||||
If the prompt explicitly opts into a worktree, load `using-git-worktrees` before implementation. Otherwise remain on the current branch.
|
||||
|
||||
### Phase 5: Plan Review Loop
|
||||
|
||||
Skip this phase if `REVIEWER_CLI=skip`.
|
||||
|
||||
1. Write a reviewer payload from `task-plan.md`
|
||||
2. Strip the runtime-only sections before sending it out
|
||||
3. Run the reviewer through the pi reviewer-runtime helper when available
|
||||
4. Fix `P0`, `P1`, and `P2` findings before proceeding
|
||||
5. Keep `P3` findings for optional cleanup
|
||||
6. Set `Status: plan-approved` when the reviewer approves
|
||||
|
||||
The reviewer response format must be:
|
||||
|
||||
```text
|
||||
## Summary
|
||||
...
|
||||
|
||||
## Findings
|
||||
### P0
|
||||
- ...
|
||||
### P1
|
||||
- ...
|
||||
### P2
|
||||
- ...
|
||||
### P3
|
||||
- ...
|
||||
|
||||
## Verdict
|
||||
VERDICT: APPROVED
|
||||
```
|
||||
|
||||
### Phase 6: Execute
|
||||
|
||||
1. Set `Status: implementation-in-progress`
|
||||
2. Load `test-driven-development` for every behavior-changing edit unless `task-plan.md` explicitly records an allowed skip
|
||||
3. Update `task-plan.md` as acceptance criteria are completed
|
||||
4. Do not commit yet
|
||||
|
||||
### Phase 7: Verification Gate
|
||||
|
||||
1. Load `verification-before-completion`
|
||||
2. Run the commands listed in `task-plan.md`
|
||||
3. Fix failures and re-run verification until green
|
||||
4. If verification stalls repeatedly, stop and surface the blocker
|
||||
|
||||
### Phase 8: Implementation Review Loop
|
||||
|
||||
Skip this phase if `REVIEWER_CLI=skip`.
|
||||
|
||||
1. Build a review payload from the approved plan, current diff, and verification output
|
||||
2. Run the reviewer through the pi reviewer-runtime helper
|
||||
3. Address `P0`, `P1`, and `P2` findings before approval
|
||||
4. Fix cheap `P3` findings when safe
|
||||
5. Set `Status: implementation-approved` when approved
|
||||
|
||||
### Phase 9: Commit And Push Decision
|
||||
|
||||
1. Load `finishing-a-development-branch`
|
||||
2. Stage only the intended files
|
||||
3. Create one commit for the task
|
||||
4. Ask whether to push or keep the work local
|
||||
|
||||
### Phase 10: Telegram Completion Notification
|
||||
|
||||
Resolve the helper in this order:
|
||||
|
||||
```bash
|
||||
TELEGRAM_NOTIFY_RUNTIME=""
|
||||
for candidate in ".pi/skills/reviewer-runtime/pi/notify-telegram.sh" "$HOME/.pi/agent/skills/reviewer-runtime/pi/notify-telegram.sh"; do
|
||||
if [ -x "$candidate" ]; then
|
||||
TELEGRAM_NOTIFY_RUNTIME="$candidate"
|
||||
break
|
||||
fi
|
||||
done
|
||||
```
|
||||
|
||||
If the helper exists and both `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` are configured, send a short completion summary. Otherwise state that no Telegram completion notification was sent.
|
||||
@@ -0,0 +1,128 @@
|
||||
# Task Plan: [Short Title]
|
||||
|
||||
> **Variant guardrail (pi):** Required workflow skills (`brainstorming`, `test-driven-development`, `verification-before-completion`, `finishing-a-development-branch`, `using-git-worktrees`) must be available to pi as documented in `docs/PI-SUPERPOWERS.md`. Load the relevant workflow skill before entering its matching phase.
|
||||
|
||||
## Metadata
|
||||
|
||||
| Field | Value |
|
||||
|-------|-------|
|
||||
| Created | YYYY-MM-DD |
|
||||
| Slug | YYYY-MM-DD-<slug> |
|
||||
| Runtime | pi |
|
||||
| Reviewer CLI | codex \| claude \| cursor \| opencode \| pi |
|
||||
| Reviewer Model | <model> |
|
||||
| MAX_ROUNDS | 10 |
|
||||
| Branch Strategy | current-branch \| worktree |
|
||||
| Branch Name | <current branch name, or new branch name when worktree is used> |
|
||||
| Worktree Path | <absolute path to worktree dir; blank when Branch Strategy = current-branch> |
|
||||
| Status | draft |
|
||||
|
||||
### Status Enum (authoritative)
|
||||
|
||||
| Value | Meaning |
|
||||
|-------|---------|
|
||||
| `draft` | Newly created; plan review not yet started |
|
||||
| `plan-approved` | Plan review loop returned APPROVED |
|
||||
| `implementation-in-progress` | Phase 6 executing |
|
||||
| `implementation-approved` | Phase 8 review loop returned APPROVED; awaiting commit |
|
||||
| `pushed` | Committed + pushed to remote |
|
||||
| `local-only` | Committed locally; user declined push |
|
||||
| `aborted-plan-review` | MAX_ROUNDS reached in Phase 5; user aborted |
|
||||
| `aborted-impl-review` | MAX_ROUNDS reached in Phase 8; user aborted |
|
||||
| `aborted-verification` | Phase 7 retries exhausted; user aborted |
|
||||
| `failed` | Hard tooling failure |
|
||||
|
||||
---
|
||||
|
||||
## Prompt
|
||||
|
||||
<!-- Exact user prompt, verbatim. -->
|
||||
|
||||
## Interpretation
|
||||
|
||||
<!-- Short restatement of goal + out-of-scope items. -->
|
||||
|
||||
## Assumptions
|
||||
|
||||
<!-- Anything we're assuming and needs confirmation. Empty list OK after clarifying questions. -->
|
||||
|
||||
## Files
|
||||
|
||||
<!-- Files expected to be created / modified / deleted. Paths are absolute or repo-relative. -->
|
||||
|
||||
| Action | Path | Why |
|
||||
|--------|------|-----|
|
||||
| | | |
|
||||
|
||||
## Approach
|
||||
|
||||
<!-- 3-10 bullets describing implementation order. -->
|
||||
|
||||
## TDD Approach
|
||||
|
||||
<!-- One of:
|
||||
(a) **TDD applies** — list the failing test(s) to write first, then implementation, then confirm green.
|
||||
(b) **TDD auto-skipped** — reason must be exactly one of:
|
||||
- `pure-documentation`
|
||||
- `pure-comment-whitespace-rename`
|
||||
(c) **TDD user-approved skip** — user explicitly approved skipping TDD for this task.
|
||||
Record the approval timestamp (ISO-8601) and the specific reason.
|
||||
-->
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] <criterion 1>
|
||||
- [ ] <criterion 2>
|
||||
|
||||
## Verification
|
||||
|
||||
<!-- Commands to run:
|
||||
lint: <cmd>
|
||||
typecheck: <cmd>
|
||||
tests: <cmd>
|
||||
-->
|
||||
|
||||
## Rollback
|
||||
|
||||
<!-- How to undo: `git revert <hash>`, or manual steps if the change is not easily reversible. -->
|
||||
|
||||
---
|
||||
|
||||
## Runtime State
|
||||
|
||||
```yaml
|
||||
plan_review_round: 0
|
||||
implementation_review_round: 0
|
||||
CODEX_PLAN_SESSION_ID:
|
||||
CODEX_IMPL_SESSION_ID:
|
||||
CURSOR_PLAN_SESSION_ID:
|
||||
CURSOR_IMPL_SESSION_ID:
|
||||
OPENCODE_PLAN_SESSION_ID:
|
||||
OPENCODE_IMPL_SESSION_ID:
|
||||
last_phase_entered:
|
||||
last_round_ts:
|
||||
last_scan_outcome_plan:
|
||||
last_scan_outcome_impl:
|
||||
verification_attempts: 0
|
||||
tests_added_count: 0
|
||||
tdd_used: false
|
||||
```
|
||||
|
||||
## Review History
|
||||
|
||||
| Timestamp (ISO-8601) | Loop | Round | Verdict | Summary |
|
||||
|----------------------|------|-------|---------|---------|
|
||||
| | | | | |
|
||||
|
||||
## Final Status
|
||||
|
||||
<!-- Populate the terminal status, commit hash if any, rounds used, TDD usage, tests added, verification attempts, and any revisit notes. -->
|
||||
|
||||
---
|
||||
|
||||
## Guardrails (do NOT remove)
|
||||
|
||||
- This file is the single persistent artifact for `do-task`. Do not split it or delete it on success.
|
||||
- `Status` must always match one of the enum values.
|
||||
- `Runtime State` is updated by the skill, not by the user.
|
||||
- Review History is append-only.
|
||||
@@ -0,0 +1,19 @@
|
||||
{
|
||||
"$schema": "https://ai-coding-skills.dev/schemas/generated-manifest/v1.json",
|
||||
"generator": "scripts/generate-skills.mjs",
|
||||
"generatedRoot": "skills/do-task/claude-code",
|
||||
"files": [
|
||||
{
|
||||
"path": "SKILL.md",
|
||||
"kind": "file",
|
||||
"mode": "644",
|
||||
"sha256": "0acbb9e6c130667d5dfb57a3c31cbb5f06a1570e8586dead032a7c3eae6bbc06"
|
||||
},
|
||||
{
|
||||
"path": "templates/task-plan.md",
|
||||
"kind": "file",
|
||||
"mode": "644",
|
||||
"sha256": "7c34d2c380e9252cf23ae63581ef2c882c6be28b97f75463f2211f7db6e9e0ca"
|
||||
}
|
||||
]
|
||||
}
|
||||
@@ -3,6 +3,8 @@ name: do-task
|
||||
description: Execute a single user-supplied prompt end-to-end with two reviewer loops (plan review + implementation review). ALWAYS invoke when the user says `/do-task`, "do this task", "do task ...", "execute this task", or "make it so". Also invoke on the hint phrase "just do ...". Do NOT invoke on "implement this" (that phrase is reserved for implement-plan).
|
||||
---
|
||||
|
||||
<!-- ⚠️ GENERATED FILE – do not edit directly. Edit the canonical source in skills/do-task/_source/claude-code/ and run `pnpm run sync:pi`. -->
|
||||
|
||||
# Do Task (Claude Code)
|
||||
|
||||
Execute an ad-hoc user prompt end-to-end: parse → clarify → plan (with reviewer loop) → implement (TDD-first where applicable) → verify → implementation review loop → commit → optional push → notify.
|
||||
@@ -12,6 +14,7 @@ This is a single-artifact sibling of `create-plan` + `implement-plan`. Unlike `i
|
||||
## Prerequisite Check (MANDATORY)
|
||||
|
||||
Required:
|
||||
|
||||
- Claude Code CLI: `claude --version`
|
||||
- Superpowers repo: `https://github.com/obra/superpowers`
|
||||
- `superpowers:brainstorming`
|
||||
@@ -31,6 +34,7 @@ This variant depends on explicit sub-skill invocation via the `Skill` tool. Do N
|
||||
## Trigger Phrase Detection
|
||||
|
||||
**Binding triggers** (always invoke this skill):
|
||||
|
||||
- `/do-task`
|
||||
- "do this task"
|
||||
- "do task ..."
|
||||
@@ -38,24 +42,29 @@ This variant depends on explicit sub-skill invocation via the `Skill` tool. Do N
|
||||
- "make it so"
|
||||
|
||||
**Hint trigger** (invoke unless context clearly maps to another skill):
|
||||
|
||||
- "just do ..."
|
||||
|
||||
**Escape phrases** (skip the Phase 2 clarifying-question loop):
|
||||
|
||||
- `--no-questions`
|
||||
- `"just do it:"`
|
||||
- `"just do this:"`
|
||||
- `"no questions:"`
|
||||
|
||||
**Excluded** (do NOT trigger `do-task`):
|
||||
|
||||
- "implement this" — reserved for `implement-plan`.
|
||||
|
||||
**Dropped defaults** (explicitly NOT binding triggers):
|
||||
|
||||
- "work on ..."
|
||||
- "handle this"
|
||||
- "take care of ..."
|
||||
- "get this done"
|
||||
|
||||
**Worktree opt-in phrases** (Phase 4 takes the worktree branch):
|
||||
|
||||
- "in a worktree"
|
||||
- "use a worktree"
|
||||
- "on an isolated branch"
|
||||
@@ -113,7 +122,6 @@ Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS` for Phases 5 and 8.
|
||||
|
||||
Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`.
|
||||
|
||||
|
||||
If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:
|
||||
|
||||
```bash
|
||||
@@ -127,11 +135,13 @@ When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the
|
||||
### Phase 4: Initialize Plan Workspace
|
||||
|
||||
**PLAN MODE CHECK:** If currently in plan mode:
|
||||
|
||||
1. Inform user that `task-plan.md` cannot be written while in plan mode.
|
||||
2. Instruct user to exit plan mode (approve plan or use `ExitPlanMode`).
|
||||
3. Proceed with file generation only after exiting plan mode.
|
||||
|
||||
Steps:
|
||||
|
||||
1. Compute slug: `YYYY-MM-DD-<slug>` where `<slug>` is a kebab-case hash of the task goal (lowercase, alphanumeric + hyphens only).
|
||||
2. Compute plan folder: `ai_plan/<slug>/`.
|
||||
3. **Resume detection:** If the folder already exists, read `task-plan.md`:
|
||||
@@ -153,7 +163,7 @@ If `REVIEWER_CLI=skip`, present `task-plan.md` to the user and proceed only afte
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```
|
||||
```text
|
||||
REVIEW_KIND = plan
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
|
||||
PAYLOAD_PATH = /tmp/do-task-plan-${REVIEW_ID}.md
|
||||
@@ -191,11 +201,13 @@ Rules:
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: plan-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 6.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-plan-review`.
|
||||
- Send Telegram summary before stopping.
|
||||
- Ask the user whether to override and proceed, restart, or abort.
|
||||
@@ -221,16 +233,19 @@ Native orchestration — do not invoke `superpowers:executing-plans`.
|
||||
Invoke `superpowers:verification-before-completion` via the `Skill` tool.
|
||||
|
||||
Run the commands listed in the `Verification` section of `task-plan.md`:
|
||||
|
||||
- Lint (changed files first).
|
||||
- Typecheck.
|
||||
- Tests (targeted first, then broader suite if quick).
|
||||
|
||||
All must pass. If a command fails:
|
||||
|
||||
- Fix the issue.
|
||||
- Re-run that command.
|
||||
- Increment `verification_attempts` in Runtime State.
|
||||
|
||||
If `verification_attempts` exceeds 3 without green:
|
||||
|
||||
- Set `Status: aborted-verification`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to retry, override, or abort.
|
||||
@@ -241,7 +256,7 @@ If `REVIEWER_CLI=skip`, present a diff + verification summary to the user and pr
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```
|
||||
```text
|
||||
REVIEW_KIND = implementation
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8) # distinct from plan-review ID
|
||||
PAYLOAD_PATH = /tmp/do-task-implementation-${REVIEW_ID}.md
|
||||
@@ -297,11 +312,13 @@ Rules:
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: implementation-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 9.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-impl-review`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to override and commit anyway, restart, or abort.
|
||||
@@ -336,6 +353,7 @@ fi
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- Telegram is the only supported notification path.
|
||||
- Notification failures are non-blocking but must be surfaced to the user.
|
||||
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
|
||||
@@ -426,18 +444,19 @@ If `SCAN_MATCHES` is non-empty:
|
||||
1. **Redact the matched text before surfacing** — never echo the raw secret to the user, chat log, terminal scrollback, or any persistent file. Replace each matched substring with a fixed token that preserves only the fact of a match: `[REDACTED:<pattern-label>:<match-length>-chars]`. Example: a matched AWS key becomes `[REDACTED:aws-access-key:20-chars]`. Keep the file path and line number; they are useful for the user and not secret.
|
||||
2. Present the redacted match summary to the user using this exact wording:
|
||||
|
||||
```
|
||||
```text
|
||||
SECRET-SCAN MATCH in outbound reviewer payload (loop: ${REVIEW_KIND}, round: N):
|
||||
<file>:<line>: [REDACTED:<pattern-label>:<match-length>-chars]
|
||||
...
|
||||
Proceed with sending this payload to ${REVIEWER_CLI}? (yes / no / redact)
|
||||
```
|
||||
|
||||
Pattern labels: `aws-access-key`, `gcp-service-account`, `github-token`, `slack-token`, `openai-key`, `anthropic-key`, `pem-private-key`, `dotenv-style`, `jwt`.
|
||||
|
||||
2. Wait for user response.
|
||||
3. On `yes`: record `last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches` in Runtime State, and proceed.
|
||||
4. On `redact`: ask the user to supply redactions, apply them to `PAYLOAD_PATH`, re-scan (this step), record `last_scan_outcome_${REVIEW_KIND}=redacted-and-approved`.
|
||||
5. On `no`: stop the loop, set `Status: failed`, send Telegram, return to the user.
|
||||
3. Wait for user response.
|
||||
4. On `yes`: record `last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches` in Runtime State, and proceed.
|
||||
5. On `redact`: ask the user to supply redactions, apply them to `PAYLOAD_PATH`, re-scan (this step), record `last_scan_outcome_${REVIEW_KIND}=redacted-and-approved`.
|
||||
6. On `no`: stop the loop, set `Status: failed`, send Telegram, return to the user.
|
||||
|
||||
If `SCAN_MATCHES` is empty, record `last_scan_outcome_${REVIEW_KIND}=clean` and proceed.
|
||||
|
||||
@@ -450,7 +469,6 @@ Write the reviewer invocation to `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID
|
||||
set -euo pipefail
|
||||
```
|
||||
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call every round (Pi reviewer calls do not use session resume):
|
||||
@@ -654,15 +672,19 @@ After the command completes:
|
||||
|
||||
- `cursor`: already promoted in Step 2 via `jq -r '.result' ...`. Also capture `session_id` if first round.
|
||||
- `codex`: extract `CODEX_SESSION_ID` from `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text lives only in `.runner.out`, `cp` it into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `claude` or `pi`: promote `.runner.out` into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `opencode`: already promoted in Step 2 via `jq` on the JSON stream. If opt-in session-resume is active and the JSON includes a stable session id, capture it and persist to `${SESSION_ID_VAR}`.
|
||||
|
||||
On Round 1, persist the captured session ID (if any) into `task-plan.md`'s Runtime State under `${SESSION_ID_VAR}`.
|
||||
@@ -720,6 +742,7 @@ If the round failed, produced empty output, or reached operator-decision timeout
|
||||
1. Detect existing plan folder by slug at Phase 4.
|
||||
2. Read `task-plan.md` → `Status`.
|
||||
3. Decide next action:
|
||||
|
||||
| Status | Action |
|
||||
|--------|--------|
|
||||
| `draft` | Resume at Phase 5 (plan review) |
|
||||
@@ -728,6 +751,7 @@ If the round failed, produced empty output, or reached operator-decision timeout
|
||||
| `implementation-approved` | Resume at Phase 9 (commit + push ask) |
|
||||
| `pushed` \| `local-only` | Ask user: new suffix, abort, or replay for reference only |
|
||||
| `aborted-*` \| `failed` | Offer new suffix or full restart |
|
||||
|
||||
4. When resuming, read Runtime State for `CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, `OPENCODE_IMPL_SESSION_ID`, and the round counters. If a session ID is populated, use it for the first revision round in that loop (Round 2) via `codex exec resume`, `cursor-agent --resume`, or `opencode run -s <id>` as applicable.
|
||||
|
||||
---
|
||||
@@ -737,10 +761,12 @@ If the round failed, produced empty output, or reached operator-decision timeout
|
||||
**ALWAYS update `task-plan.md` before/after each phase transition. NEVER proceed with stale state.**
|
||||
|
||||
Before starting any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Update `last_phase_entered` in Runtime State.
|
||||
|
||||
After completing any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Append notes to the relevant section of `task-plan.md`.
|
||||
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
<!-- ⚠️ GENERATED FILE – do not edit directly. Edit the canonical source in skills/do-task/_source/claude-code/ and run `pnpm run sync:pi`. -->
|
||||
# Task Plan: [Short Title]
|
||||
|
||||
> **Variant guardrail (Claude Code):** When generating or updating this file, the agent MUST be out of plan mode. Sub-skills (`brainstorming`, `test-driven-development`, `verification-before-completion`, `finishing-a-development-branch`, `using-git-worktrees`) MUST be invoked through the `Skill` tool explicitly — no shell wrappers.
|
||||
@@ -132,7 +133,6 @@ tdd_used: false
|
||||
- Notes (anything the user should know when revisiting)
|
||||
-->
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Guardrails (do NOT remove)
|
||||
|
||||
@@ -0,0 +1,19 @@
|
||||
{
|
||||
"$schema": "https://ai-coding-skills.dev/schemas/generated-manifest/v1.json",
|
||||
"generator": "scripts/generate-skills.mjs",
|
||||
"generatedRoot": "skills/do-task/codex",
|
||||
"files": [
|
||||
{
|
||||
"path": "SKILL.md",
|
||||
"kind": "file",
|
||||
"mode": "644",
|
||||
"sha256": "534d823d34fcfa89935c3fd350726d921a57c806154a44e36e8bfc9bef1a5995"
|
||||
},
|
||||
{
|
||||
"path": "templates/task-plan.md",
|
||||
"kind": "file",
|
||||
"mode": "644",
|
||||
"sha256": "7e460a5cd7df46daa5b1bf8daa7aaf07f5ac0ee26eee2320826a16aedf1cf39a"
|
||||
}
|
||||
]
|
||||
}
|
||||
@@ -3,6 +3,8 @@ name: do-task
|
||||
description: Execute a single user-supplied prompt end-to-end with two reviewer loops (plan review + implementation review) in Codex. ALWAYS invoke when the user says `/do-task`, "do this task", "do task ...", "execute this task", or "make it so". Also invoke on the hint phrase "just do ...". Do NOT invoke on "implement this" (that phrase is reserved for implement-plan).
|
||||
---
|
||||
|
||||
<!-- ⚠️ GENERATED FILE – do not edit directly. Edit the canonical source in skills/do-task/_source/codex/ and run `pnpm run sync:pi`. -->
|
||||
|
||||
# Do Task (Codex Native Superpowers)
|
||||
|
||||
Execute an ad-hoc user prompt end-to-end: parse → clarify → plan (with reviewer loop) → implement (TDD-first where applicable) → verify → implementation review loop → commit → optional push → notify.
|
||||
@@ -14,6 +16,7 @@ This is a single-artifact sibling of `create-plan` + `implement-plan`. Unlike `i
|
||||
## Prerequisite Check (MANDATORY)
|
||||
|
||||
Required:
|
||||
|
||||
- Codex CLI: `codex --version`
|
||||
- Superpowers repo: `https://github.com/obra/superpowers`
|
||||
- Superpowers skills symlink: `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills`
|
||||
@@ -53,6 +56,7 @@ If any required dependency is missing, stop immediately and return:
|
||||
## Trigger Phrase Detection
|
||||
|
||||
**Binding triggers** (always invoke this skill):
|
||||
|
||||
- `/do-task`
|
||||
- "do this task"
|
||||
- "do task ..."
|
||||
@@ -60,24 +64,29 @@ If any required dependency is missing, stop immediately and return:
|
||||
- "make it so"
|
||||
|
||||
**Hint trigger** (invoke unless context clearly maps to another skill):
|
||||
|
||||
- "just do ..."
|
||||
|
||||
**Escape phrases** (skip the Phase 2 clarifying-question loop):
|
||||
|
||||
- `--no-questions`
|
||||
- `"just do it:"`
|
||||
- `"just do this:"`
|
||||
- `"no questions:"`
|
||||
|
||||
**Excluded** (do NOT trigger `do-task`):
|
||||
|
||||
- "implement this" — reserved for `implement-plan`.
|
||||
|
||||
**Dropped defaults** (explicitly NOT binding triggers):
|
||||
|
||||
- "work on ..."
|
||||
- "handle this"
|
||||
- "take care of ..."
|
||||
- "get this done"
|
||||
|
||||
**Worktree opt-in phrases** (Phase 4 takes the worktree branch):
|
||||
|
||||
- "in a worktree"
|
||||
- "use a worktree"
|
||||
- "on an isolated branch"
|
||||
@@ -135,7 +144,6 @@ Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS` for Phases 5 and 8.
|
||||
|
||||
Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`.
|
||||
|
||||
|
||||
If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:
|
||||
|
||||
```bash
|
||||
@@ -151,6 +159,7 @@ When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the
|
||||
Codex has no plan-mode concept; there is no plan-mode guard here.
|
||||
|
||||
Steps:
|
||||
|
||||
1. Compute slug: `YYYY-MM-DD-<slug>` where `<slug>` is a kebab-case hash of the task goal (lowercase, alphanumeric + hyphens only).
|
||||
2. Compute plan folder: `ai_plan/<slug>/`.
|
||||
3. **Resume detection:** If the folder already exists, read `task-plan.md`:
|
||||
@@ -172,7 +181,7 @@ If `REVIEWER_CLI=skip`, present `task-plan.md` to the user and proceed only afte
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```
|
||||
```text
|
||||
REVIEW_KIND = plan
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
|
||||
PAYLOAD_PATH = /tmp/do-task-plan-${REVIEW_ID}.md
|
||||
@@ -210,11 +219,13 @@ Rules:
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: plan-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 6.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-plan-review`.
|
||||
- Send Telegram summary before stopping.
|
||||
- Ask the user whether to override and proceed, restart, or abort.
|
||||
@@ -240,16 +251,19 @@ Native orchestration — do not invoke `superpowers:executing-plans`.
|
||||
Invoke `superpowers:verification-before-completion` via native discovery.
|
||||
|
||||
Run the commands listed in the `Verification` section of `task-plan.md`:
|
||||
|
||||
- Lint (changed files first).
|
||||
- Typecheck.
|
||||
- Tests (targeted first, then broader suite if quick).
|
||||
|
||||
All must pass. If a command fails:
|
||||
|
||||
- Fix the issue.
|
||||
- Re-run that command.
|
||||
- Increment `verification_attempts` in Runtime State.
|
||||
|
||||
If `verification_attempts` exceeds 3 without green:
|
||||
|
||||
- Set `Status: aborted-verification`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to retry, override, or abort.
|
||||
@@ -260,7 +274,7 @@ If `REVIEWER_CLI=skip`, present a diff + verification summary to the user and pr
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```
|
||||
```text
|
||||
REVIEW_KIND = implementation
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8) # distinct from plan-review ID
|
||||
PAYLOAD_PATH = /tmp/do-task-implementation-${REVIEW_ID}.md
|
||||
@@ -316,11 +330,13 @@ Rules:
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: implementation-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 9.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-impl-review`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to override and commit anyway, restart, or abort.
|
||||
@@ -355,6 +371,7 @@ fi
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- Telegram is the only supported notification path.
|
||||
- Notification failures are non-blocking but must be surfaced to the user.
|
||||
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
|
||||
@@ -445,18 +462,19 @@ If `SCAN_MATCHES` is non-empty:
|
||||
1. **Redact the matched text before surfacing** — never echo the raw secret to the user, chat log, terminal scrollback, or any persistent file. Replace each matched substring with a fixed token that preserves only the fact of a match: `[REDACTED:<pattern-label>:<match-length>-chars]`. Example: a matched AWS key becomes `[REDACTED:aws-access-key:20-chars]`. Keep the file path and line number; they are useful for the user and not secret.
|
||||
2. Present the redacted match summary to the user using this exact wording:
|
||||
|
||||
```
|
||||
```text
|
||||
SECRET-SCAN MATCH in outbound reviewer payload (loop: ${REVIEW_KIND}, round: N):
|
||||
<file>:<line>: [REDACTED:<pattern-label>:<match-length>-chars]
|
||||
...
|
||||
Proceed with sending this payload to ${REVIEWER_CLI}? (yes / no / redact)
|
||||
```
|
||||
|
||||
Pattern labels: `aws-access-key`, `gcp-service-account`, `github-token`, `slack-token`, `openai-key`, `anthropic-key`, `pem-private-key`, `dotenv-style`, `jwt`.
|
||||
|
||||
2. Wait for user response.
|
||||
3. On `yes`: record `last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches` in Runtime State, and proceed.
|
||||
4. On `redact`: ask the user to supply redactions, apply them to `PAYLOAD_PATH`, re-scan (this step), record `last_scan_outcome_${REVIEW_KIND}=redacted-and-approved`.
|
||||
5. On `no`: stop the loop, set `Status: failed`, send Telegram, return to the user.
|
||||
3. Wait for user response.
|
||||
4. On `yes`: record `last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches` in Runtime State, and proceed.
|
||||
5. On `redact`: ask the user to supply redactions, apply them to `PAYLOAD_PATH`, re-scan (this step), record `last_scan_outcome_${REVIEW_KIND}=redacted-and-approved`.
|
||||
6. On `no`: stop the loop, set `Status: failed`, send Telegram, return to the user.
|
||||
|
||||
If `SCAN_MATCHES` is empty, record `last_scan_outcome_${REVIEW_KIND}=clean` and proceed.
|
||||
|
||||
@@ -469,7 +487,6 @@ Write the reviewer invocation to `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID
|
||||
set -euo pipefail
|
||||
```
|
||||
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call every round (Pi reviewer calls do not use session resume):
|
||||
@@ -673,15 +690,19 @@ After the command completes:
|
||||
|
||||
- `cursor`: already promoted in Step 2 via `jq -r '.result' ...`. Also capture `session_id` if first round.
|
||||
- `codex`: extract `CODEX_SESSION_ID` from `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text lives only in `.runner.out`, `cp` it into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `claude` or `pi`: promote `.runner.out` into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `opencode`: already promoted in Step 2 via `jq` on the JSON stream. If opt-in session-resume is active and the JSON includes a stable session id, capture it and persist to `${SESSION_ID_VAR}`.
|
||||
|
||||
On Round 1, persist the captured session ID (if any) into `task-plan.md`'s Runtime State under `${SESSION_ID_VAR}`.
|
||||
@@ -739,6 +760,7 @@ If the round failed, produced empty output, or reached operator-decision timeout
|
||||
1. Detect existing plan folder by slug at Phase 4.
|
||||
2. Read `task-plan.md` → `Status`.
|
||||
3. Decide next action:
|
||||
|
||||
| Status | Action |
|
||||
|--------|--------|
|
||||
| `draft` | Resume at Phase 5 (plan review) |
|
||||
@@ -747,6 +769,7 @@ If the round failed, produced empty output, or reached operator-decision timeout
|
||||
| `implementation-approved` | Resume at Phase 9 (commit + push ask) |
|
||||
| `pushed` \| `local-only` | Ask user: new suffix, abort, or replay for reference only |
|
||||
| `aborted-*` \| `failed` | Offer new suffix or full restart |
|
||||
|
||||
4. When resuming, read Runtime State for `CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, `OPENCODE_IMPL_SESSION_ID`, and the round counters. If a session ID is populated, use it for the first revision round in that loop (Round 2) via `codex exec resume`, `cursor-agent --resume`, or `opencode run -s <id>` as applicable.
|
||||
|
||||
---
|
||||
@@ -756,10 +779,12 @@ If the round failed, produced empty output, or reached operator-decision timeout
|
||||
**ALWAYS update `task-plan.md` before/after each phase transition. NEVER proceed with stale state.**
|
||||
|
||||
Before starting any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Update `last_phase_entered` in Runtime State.
|
||||
|
||||
After completing any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Append notes to the relevant section of `task-plan.md`.
|
||||
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
<!-- ⚠️ GENERATED FILE – do not edit directly. Edit the canonical source in skills/do-task/_source/codex/ and run `pnpm run sync:pi`. -->
|
||||
# Task Plan: [Short Title]
|
||||
|
||||
> **Variant guardrail (Codex):** Sub-skills (`brainstorming`, `test-driven-development`, `verification-before-completion`, `finishing-a-development-branch`, `using-git-worktrees`) MUST be invoked through native skill discovery from `~/.agents/skills/superpowers/<skill>/SKILL.md` — no `superpowers-codex` CLI wrappers. Checklist-driven sub-skills MUST track items with `update_plan` todos.
|
||||
@@ -132,7 +133,6 @@ tdd_used: false
|
||||
- Notes (anything the user should know when revisiting)
|
||||
-->
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Guardrails (do NOT remove)
|
||||
|
||||
@@ -0,0 +1,19 @@
|
||||
{
|
||||
"$schema": "https://ai-coding-skills.dev/schemas/generated-manifest/v1.json",
|
||||
"generator": "scripts/generate-skills.mjs",
|
||||
"generatedRoot": "skills/do-task/cursor",
|
||||
"files": [
|
||||
{
|
||||
"path": "SKILL.md",
|
||||
"kind": "file",
|
||||
"mode": "644",
|
||||
"sha256": "3a255d12f67be0eef3c45969befb23e3283a8a91f061053d939714d50e5f8cee"
|
||||
},
|
||||
{
|
||||
"path": "templates/task-plan.md",
|
||||
"kind": "file",
|
||||
"mode": "644",
|
||||
"sha256": "6af4c73434e932c1705ba13faf47be2152d61971f0b60ee9bece534d22493596"
|
||||
}
|
||||
]
|
||||
}
|
||||
@@ -3,6 +3,8 @@ name: do-task
|
||||
description: Execute a single user-supplied prompt end-to-end with two reviewer loops (plan review + implementation review) in Cursor Agent CLI. ALWAYS invoke when the user says `/do-task`, "do this task", "do task ...", "execute this task", or "make it so". Also invoke on the hint phrase "just do ...". Do NOT invoke on "implement this" (that phrase is reserved for implement-plan).
|
||||
---
|
||||
|
||||
<!-- ⚠️ GENERATED FILE – do not edit directly. Edit the canonical source in skills/do-task/_source/cursor/ and run `pnpm run sync:pi`. -->
|
||||
|
||||
# Do Task (Cursor Agent CLI)
|
||||
|
||||
Execute an ad-hoc user prompt end-to-end: parse → clarify → plan (with reviewer loop) → implement (TDD-first where applicable) → verify → implementation review loop → commit → optional push → notify.
|
||||
@@ -14,6 +16,7 @@ This is a single-artifact sibling of `create-plan` + `implement-plan`. Unlike `i
|
||||
## Prerequisite Check (MANDATORY)
|
||||
|
||||
Required:
|
||||
|
||||
- Cursor Agent CLI: `cursor-agent --version` (install via `curl https://cursor.com/install -fsS | bash`). Binary is `cursor-agent`; the alias `cursor agent` also works.
|
||||
- `jq` (**required** — `do-task` always parses JSON output from at least the cursor reviewer branch, and other reviewers may produce JSON). Install via `brew install jq` (macOS) or your package manager. Verify: `jq --version`.
|
||||
- Superpowers repo: `https://github.com/obra/superpowers`
|
||||
@@ -52,6 +55,7 @@ If any required dependency is missing, stop immediately and return:
|
||||
## Trigger Phrase Detection
|
||||
|
||||
**Binding triggers** (always invoke this skill):
|
||||
|
||||
- `/do-task`
|
||||
- "do this task"
|
||||
- "do task ..."
|
||||
@@ -59,24 +63,29 @@ If any required dependency is missing, stop immediately and return:
|
||||
- "make it so"
|
||||
|
||||
**Hint trigger** (invoke unless context clearly maps to another skill):
|
||||
|
||||
- "just do ..."
|
||||
|
||||
**Escape phrases** (skip the Phase 2 clarifying-question loop):
|
||||
|
||||
- `--no-questions`
|
||||
- `"just do it:"`
|
||||
- `"just do this:"`
|
||||
- `"no questions:"`
|
||||
|
||||
**Excluded** (do NOT trigger `do-task`):
|
||||
|
||||
- "implement this" — reserved for `implement-plan`.
|
||||
|
||||
**Dropped defaults** (explicitly NOT binding triggers):
|
||||
|
||||
- "work on ..."
|
||||
- "handle this"
|
||||
- "take care of ..."
|
||||
- "get this done"
|
||||
|
||||
**Worktree opt-in phrases** (Phase 4 takes the worktree branch):
|
||||
|
||||
- "in a worktree"
|
||||
- "use a worktree"
|
||||
- "on an isolated branch"
|
||||
@@ -134,7 +143,6 @@ Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS` for Phases 5 and 8.
|
||||
|
||||
Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`.
|
||||
|
||||
|
||||
If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:
|
||||
|
||||
```bash
|
||||
@@ -150,6 +158,7 @@ When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the
|
||||
Cursor Agent CLI has no plan-mode concept; there is no plan-mode guard here.
|
||||
|
||||
Steps:
|
||||
|
||||
1. Compute slug: `YYYY-MM-DD-<slug>` where `<slug>` is a kebab-case hash of the task goal (lowercase, alphanumeric + hyphens only).
|
||||
2. Compute plan folder: `ai_plan/<slug>/`.
|
||||
3. **Resume detection:** If the folder already exists, read `task-plan.md`:
|
||||
@@ -171,7 +180,7 @@ If `REVIEWER_CLI=skip`, present `task-plan.md` to the user and proceed only afte
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```
|
||||
```text
|
||||
REVIEW_KIND = plan
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
|
||||
PAYLOAD_PATH = /tmp/do-task-plan-${REVIEW_ID}.md
|
||||
@@ -209,11 +218,13 @@ Rules:
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: plan-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 6.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-plan-review`.
|
||||
- Send Telegram summary before stopping.
|
||||
- Ask the user whether to override and proceed, restart, or abort.
|
||||
@@ -239,16 +250,19 @@ Native orchestration — do not invoke `superpowers:executing-plans`.
|
||||
Invoke `superpowers:verification-before-completion` via Cursor-native discovery.
|
||||
|
||||
Run the commands listed in the `Verification` section of `task-plan.md`:
|
||||
|
||||
- Lint (changed files first).
|
||||
- Typecheck.
|
||||
- Tests (targeted first, then broader suite if quick).
|
||||
|
||||
All must pass. If a command fails:
|
||||
|
||||
- Fix the issue.
|
||||
- Re-run that command.
|
||||
- Increment `verification_attempts` in Runtime State.
|
||||
|
||||
If `verification_attempts` exceeds 3 without green:
|
||||
|
||||
- Set `Status: aborted-verification`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to retry, override, or abort.
|
||||
@@ -259,7 +273,7 @@ If `REVIEWER_CLI=skip`, present a diff + verification summary to the user and pr
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```
|
||||
```text
|
||||
REVIEW_KIND = implementation
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8) # distinct from plan-review ID
|
||||
PAYLOAD_PATH = /tmp/do-task-implementation-${REVIEW_ID}.md
|
||||
@@ -315,11 +329,13 @@ Rules:
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: implementation-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 9.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-impl-review`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to override and commit anyway, restart, or abort.
|
||||
@@ -358,6 +374,7 @@ fi
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- Telegram is the only supported notification path.
|
||||
- Notification failures are non-blocking but must be surfaced to the user.
|
||||
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
|
||||
@@ -452,18 +469,19 @@ If `SCAN_MATCHES` is non-empty:
|
||||
1. **Redact the matched text before surfacing** — never echo the raw secret to the user, chat log, terminal scrollback, or any persistent file. Replace each matched substring with a fixed token that preserves only the fact of a match: `[REDACTED:<pattern-label>:<match-length>-chars]`. Example: a matched AWS key becomes `[REDACTED:aws-access-key:20-chars]`. Keep the file path and line number; they are useful for the user and not secret.
|
||||
2. Present the redacted match summary to the user using this exact wording:
|
||||
|
||||
```
|
||||
```text
|
||||
SECRET-SCAN MATCH in outbound reviewer payload (loop: ${REVIEW_KIND}, round: N):
|
||||
<file>:<line>: [REDACTED:<pattern-label>:<match-length>-chars]
|
||||
...
|
||||
Proceed with sending this payload to ${REVIEWER_CLI}? (yes / no / redact)
|
||||
```
|
||||
|
||||
Pattern labels: `aws-access-key`, `gcp-service-account`, `github-token`, `slack-token`, `openai-key`, `anthropic-key`, `pem-private-key`, `dotenv-style`, `jwt`.
|
||||
|
||||
2. Wait for user response.
|
||||
3. On `yes`: record `last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches` in Runtime State, and proceed.
|
||||
4. On `redact`: ask the user to supply redactions, apply them to `PAYLOAD_PATH`, re-scan (this step), record `last_scan_outcome_${REVIEW_KIND}=redacted-and-approved`.
|
||||
5. On `no`: stop the loop, set `Status: failed`, send Telegram, return to the user.
|
||||
3. Wait for user response.
|
||||
4. On `yes`: record `last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches` in Runtime State, and proceed.
|
||||
5. On `redact`: ask the user to supply redactions, apply them to `PAYLOAD_PATH`, re-scan (this step), record `last_scan_outcome_${REVIEW_KIND}=redacted-and-approved`.
|
||||
6. On `no`: stop the loop, set `Status: failed`, send Telegram, return to the user.
|
||||
|
||||
If `SCAN_MATCHES` is empty, record `last_scan_outcome_${REVIEW_KIND}=clean` and proceed.
|
||||
|
||||
@@ -476,7 +494,6 @@ Write the reviewer invocation to `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID
|
||||
set -euo pipefail
|
||||
```
|
||||
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call every round (Pi reviewer calls do not use session resume):
|
||||
@@ -680,15 +697,19 @@ After the command completes:
|
||||
|
||||
- `cursor`: already promoted in Step 2 via `jq -r '.result' ...`. Also capture `session_id` if first round.
|
||||
- `codex`: extract `CODEX_SESSION_ID` from `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text lives only in `.runner.out`, `cp` it into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `claude` or `pi`: promote `.runner.out` into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `opencode`: already promoted in Step 2 via `jq` on the JSON stream. If opt-in session-resume is active and the JSON includes a stable session id, capture it and persist to `${SESSION_ID_VAR}`.
|
||||
|
||||
On Round 1, persist the captured session ID (if any) into `task-plan.md`'s Runtime State under `${SESSION_ID_VAR}`.
|
||||
@@ -746,6 +767,7 @@ If the round failed, produced empty output, or reached operator-decision timeout
|
||||
1. Detect existing plan folder by slug at Phase 4.
|
||||
2. Read `task-plan.md` → `Status`.
|
||||
3. Decide next action:
|
||||
|
||||
| Status | Action |
|
||||
|--------|--------|
|
||||
| `draft` | Resume at Phase 5 (plan review) |
|
||||
@@ -754,6 +776,7 @@ If the round failed, produced empty output, or reached operator-decision timeout
|
||||
| `implementation-approved` | Resume at Phase 9 (commit + push ask) |
|
||||
| `pushed` \| `local-only` | Ask user: new suffix, abort, or replay for reference only |
|
||||
| `aborted-*` \| `failed` | Offer new suffix or full restart |
|
||||
|
||||
4. When resuming, read Runtime State for `CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, `OPENCODE_IMPL_SESSION_ID`, and the round counters. If a session ID is populated, use it for the first revision round in that loop (Round 2) via `codex exec resume`, `cursor-agent --resume`, or `opencode run -s <id>` as applicable.
|
||||
|
||||
---
|
||||
@@ -763,10 +786,12 @@ If the round failed, produced empty output, or reached operator-decision timeout
|
||||
**ALWAYS update `task-plan.md` before/after each phase transition. NEVER proceed with stale state.**
|
||||
|
||||
Before starting any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Update `last_phase_entered` in Runtime State.
|
||||
|
||||
After completing any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Append notes to the relevant section of `task-plan.md`.
|
||||
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
<!-- ⚠️ GENERATED FILE – do not edit directly. Edit the canonical source in skills/do-task/_source/cursor/ and run `pnpm run sync:pi`. -->
|
||||
# Task Plan: [Short Title]
|
||||
|
||||
> **Variant guardrail (Cursor):** Sub-skills (`brainstorming`, `test-driven-development`, `verification-before-completion`, `finishing-a-development-branch`, `using-git-worktrees`) MUST be invoked through workspace discovery from `.cursor/skills/superpowers/skills/<skill>/SKILL.md` or `~/.cursor/skills/superpowers/skills/<skill>/SKILL.md`. Reviewer invocations MUST use `--mode=ask --trust --output-format json`. `jq` is a hard prerequisite.
|
||||
@@ -132,7 +133,6 @@ tdd_used: false
|
||||
- Notes (anything the user should know when revisiting)
|
||||
-->
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Guardrails (do NOT remove)
|
||||
|
||||
@@ -0,0 +1,19 @@
|
||||
{
|
||||
"$schema": "https://ai-coding-skills.dev/schemas/generated-manifest/v1.json",
|
||||
"generator": "scripts/generate-skills.mjs",
|
||||
"generatedRoot": "skills/do-task/opencode",
|
||||
"files": [
|
||||
{
|
||||
"path": "SKILL.md",
|
||||
"kind": "file",
|
||||
"mode": "644",
|
||||
"sha256": "ebbae7c050b3cfe0d1de8722cd71c8c10ba74de709adeec4ebb9645e2f770a03"
|
||||
},
|
||||
{
|
||||
"path": "templates/task-plan.md",
|
||||
"kind": "file",
|
||||
"mode": "644",
|
||||
"sha256": "a4c0ad9a9faaf738c503b403a285b6a005ee245b6b05f5af973c8e0224f6fefe"
|
||||
}
|
||||
]
|
||||
}
|
||||
@@ -3,6 +3,8 @@ name: do-task
|
||||
description: Execute a single user-supplied prompt end-to-end with two reviewer loops (plan review + implementation review) in OpenCode. ALWAYS invoke when the user says `/do-task`, "do this task", "do task ...", "execute this task", or "make it so". Also invoke on the hint phrase "just do ...". Do NOT invoke on "implement this" (that phrase is reserved for implement-plan).
|
||||
---
|
||||
|
||||
<!-- ⚠️ GENERATED FILE – do not edit directly. Edit the canonical source in skills/do-task/_source/opencode/ and run `pnpm run sync:pi`. -->
|
||||
|
||||
# Do Task (OpenCode)
|
||||
|
||||
Execute an ad-hoc user prompt end-to-end: parse → clarify → plan (with reviewer loop) → implement (TDD-first where applicable) → verify → implementation review loop → commit → optional push → notify.
|
||||
@@ -14,6 +16,7 @@ This is a single-artifact sibling of `create-plan` + `implement-plan`. Unlike `i
|
||||
## Prerequisite Check (MANDATORY)
|
||||
|
||||
Required:
|
||||
|
||||
- OpenCode CLI: `opencode --version` (install via your package manager or `brew install opencode`).
|
||||
- Superpowers repo: `https://github.com/obra/superpowers`
|
||||
- OpenCode Superpowers skills available at `~/.agents/skills/superpowers` or `~/.config/opencode/skills/superpowers`
|
||||
@@ -50,6 +53,7 @@ If any required dependency is missing, stop immediately and return:
|
||||
## Trigger Phrase Detection
|
||||
|
||||
**Binding triggers** (always invoke this skill):
|
||||
|
||||
- `/do-task`
|
||||
- "do this task"
|
||||
- "do task ..."
|
||||
@@ -57,24 +61,29 @@ If any required dependency is missing, stop immediately and return:
|
||||
- "make it so"
|
||||
|
||||
**Hint trigger** (invoke unless context clearly maps to another skill):
|
||||
|
||||
- "just do ..."
|
||||
|
||||
**Escape phrases** (skip the Phase 2 clarifying-question loop):
|
||||
|
||||
- `--no-questions`
|
||||
- `"just do it:"`
|
||||
- `"just do this:"`
|
||||
- `"no questions:"`
|
||||
|
||||
**Excluded** (do NOT trigger `do-task`):
|
||||
|
||||
- "implement this" — reserved for `implement-plan`.
|
||||
|
||||
**Dropped defaults** (explicitly NOT binding triggers):
|
||||
|
||||
- "work on ..."
|
||||
- "handle this"
|
||||
- "take care of ..."
|
||||
- "get this done"
|
||||
|
||||
**Worktree opt-in phrases** (Phase 4 takes the worktree branch):
|
||||
|
||||
- "in a worktree"
|
||||
- "use a worktree"
|
||||
- "on an isolated branch"
|
||||
@@ -133,7 +142,6 @@ Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS` for Phases 5 and 8.
|
||||
|
||||
Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`.
|
||||
|
||||
|
||||
If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:
|
||||
|
||||
```bash
|
||||
@@ -149,6 +157,7 @@ When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the
|
||||
OpenCode has no plan-mode concept; there is no plan-mode guard here.
|
||||
|
||||
Steps:
|
||||
|
||||
1. Compute slug: `YYYY-MM-DD-<slug>` where `<slug>` is a kebab-case hash of the task goal (lowercase, alphanumeric + hyphens only).
|
||||
2. Compute plan folder: `ai_plan/<slug>/`.
|
||||
3. **Resume detection:** If the folder already exists, read `task-plan.md`:
|
||||
@@ -170,7 +179,7 @@ If `REVIEWER_CLI=skip`, present `task-plan.md` to the user and proceed only afte
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```
|
||||
```text
|
||||
REVIEW_KIND = plan
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
|
||||
PAYLOAD_PATH = /tmp/do-task-plan-${REVIEW_ID}.md
|
||||
@@ -208,11 +217,13 @@ Rules:
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: plan-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 6.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-plan-review`.
|
||||
- Send Telegram summary before stopping.
|
||||
- Ask the user whether to override and proceed, restart, or abort.
|
||||
@@ -238,16 +249,19 @@ Native orchestration — do not invoke `superpowers:executing-plans`.
|
||||
Invoke `superpowers/verification-before-completion` via OpenCode's native skill tool.
|
||||
|
||||
Run the commands listed in the `Verification` section of `task-plan.md`:
|
||||
|
||||
- Lint (changed files first).
|
||||
- Typecheck.
|
||||
- Tests (targeted first, then broader suite if quick).
|
||||
|
||||
All must pass. If a command fails:
|
||||
|
||||
- Fix the issue.
|
||||
- Re-run that command.
|
||||
- Increment `verification_attempts` in Runtime State.
|
||||
|
||||
If `verification_attempts` exceeds 3 without green:
|
||||
|
||||
- Set `Status: aborted-verification`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to retry, override, or abort.
|
||||
@@ -258,7 +272,7 @@ If `REVIEWER_CLI=skip`, present a diff + verification summary to the user and pr
|
||||
|
||||
Otherwise, invoke the Review Loop (Shared Subroutine) with:
|
||||
|
||||
```
|
||||
```text
|
||||
REVIEW_KIND = implementation
|
||||
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8) # distinct from plan-review ID
|
||||
PAYLOAD_PATH = /tmp/do-task-implementation-${REVIEW_ID}.md
|
||||
@@ -314,11 +328,13 @@ Rules:
|
||||
```
|
||||
|
||||
On APPROVED:
|
||||
|
||||
- Set `Status: implementation-approved`.
|
||||
- Append APPROVED row to Review History.
|
||||
- Proceed to Phase 9.
|
||||
|
||||
On MAX_ROUNDS:
|
||||
|
||||
- Set `Status: aborted-impl-review`.
|
||||
- Send Telegram summary.
|
||||
- Ask the user whether to override and commit anyway, restart, or abort.
|
||||
@@ -353,6 +369,7 @@ fi
|
||||
```
|
||||
|
||||
Rules:
|
||||
|
||||
- Telegram is the only supported notification path.
|
||||
- Notification failures are non-blocking but must be surfaced to the user.
|
||||
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
|
||||
@@ -443,18 +460,19 @@ If `SCAN_MATCHES` is non-empty:
|
||||
1. **Redact the matched text before surfacing** — never echo the raw secret to the user, chat log, terminal scrollback, or any persistent file. Replace each matched substring with a fixed token that preserves only the fact of a match: `[REDACTED:<pattern-label>:<match-length>-chars]`. Example: a matched AWS key becomes `[REDACTED:aws-access-key:20-chars]`. Keep the file path and line number; they are useful for the user and not secret.
|
||||
2. Present the redacted match summary to the user using this exact wording:
|
||||
|
||||
```
|
||||
```text
|
||||
SECRET-SCAN MATCH in outbound reviewer payload (loop: ${REVIEW_KIND}, round: N):
|
||||
<file>:<line>: [REDACTED:<pattern-label>:<match-length>-chars]
|
||||
...
|
||||
Proceed with sending this payload to ${REVIEWER_CLI}? (yes / no / redact)
|
||||
```
|
||||
|
||||
Pattern labels: `aws-access-key`, `gcp-service-account`, `github-token`, `slack-token`, `openai-key`, `anthropic-key`, `pem-private-key`, `dotenv-style`, `jwt`.
|
||||
|
||||
2. Wait for user response.
|
||||
3. On `yes`: record `last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches` in Runtime State, and proceed.
|
||||
4. On `redact`: ask the user to supply redactions, apply them to `PAYLOAD_PATH`, re-scan (this step), record `last_scan_outcome_${REVIEW_KIND}=redacted-and-approved`.
|
||||
5. On `no`: stop the loop, set `Status: failed`, send Telegram, return to the user.
|
||||
3. Wait for user response.
|
||||
4. On `yes`: record `last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches` in Runtime State, and proceed.
|
||||
5. On `redact`: ask the user to supply redactions, apply them to `PAYLOAD_PATH`, re-scan (this step), record `last_scan_outcome_${REVIEW_KIND}=redacted-and-approved`.
|
||||
6. On `no`: stop the loop, set `Status: failed`, send Telegram, return to the user.
|
||||
|
||||
If `SCAN_MATCHES` is empty, record `last_scan_outcome_${REVIEW_KIND}=clean` and proceed.
|
||||
|
||||
@@ -467,7 +485,6 @@ Write the reviewer invocation to `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID
|
||||
set -euo pipefail
|
||||
```
|
||||
|
||||
|
||||
**If `REVIEWER_CLI` is `pi`:**
|
||||
|
||||
Fresh call every round (Pi reviewer calls do not use session resume):
|
||||
@@ -671,15 +688,19 @@ After the command completes:
|
||||
|
||||
- `cursor`: already promoted in Step 2 via `jq -r '.result' ...`. Also capture `session_id` if first round.
|
||||
- `codex`: extract `CODEX_SESSION_ID` from `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text lives only in `.runner.out`, `cp` it into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `claude` or `pi`: promote `.runner.out` into the `.md` file:
|
||||
|
||||
```bash
|
||||
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
|
||||
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
|
||||
```
|
||||
|
||||
- `opencode`: already promoted in Step 2 via `jq` on the JSON stream. If opt-in session-resume is active and the JSON includes a stable session id, capture it and persist to `${SESSION_ID_VAR}`.
|
||||
|
||||
On Round 1, persist the captured session ID (if any) into `task-plan.md`'s Runtime State under `${SESSION_ID_VAR}`.
|
||||
@@ -737,6 +758,7 @@ If the round failed, produced empty output, or reached operator-decision timeout
|
||||
1. Detect existing plan folder by slug at Phase 4.
|
||||
2. Read `task-plan.md` → `Status`.
|
||||
3. Decide next action:
|
||||
|
||||
| Status | Action |
|
||||
|--------|--------|
|
||||
| `draft` | Resume at Phase 5 (plan review) |
|
||||
@@ -745,6 +767,7 @@ If the round failed, produced empty output, or reached operator-decision timeout
|
||||
| `implementation-approved` | Resume at Phase 9 (commit + push ask) |
|
||||
| `pushed` \| `local-only` | Ask user: new suffix, abort, or replay for reference only |
|
||||
| `aborted-*` \| `failed` | Offer new suffix or full restart |
|
||||
|
||||
4. When resuming, read Runtime State for `CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, `OPENCODE_IMPL_SESSION_ID`, and the round counters. If a session ID is populated, use it for the first revision round in that loop (Round 2) via `codex exec resume`, `cursor-agent --resume`, or `opencode run -s <id>` as applicable.
|
||||
|
||||
---
|
||||
@@ -754,10 +777,12 @@ If the round failed, produced empty output, or reached operator-decision timeout
|
||||
**ALWAYS update `task-plan.md` before/after each phase transition. NEVER proceed with stale state.**
|
||||
|
||||
Before starting any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Update `last_phase_entered` in Runtime State.
|
||||
|
||||
After completing any phase:
|
||||
|
||||
1. Update `Status` if it transitions.
|
||||
2. Append notes to the relevant section of `task-plan.md`.
|
||||
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
<!-- ⚠️ GENERATED FILE – do not edit directly. Edit the canonical source in skills/do-task/_source/opencode/ and run `pnpm run sync:pi`. -->
|
||||
# Task Plan: [Short Title]
|
||||
|
||||
> **Variant guardrail (OpenCode):** Sub-skills (`brainstorming`, `test-driven-development`, `verification-before-completion`, `finishing-a-development-branch`, `using-git-worktrees`) MUST be invoked through OpenCode's native skill tool from `~/.config/opencode/skills/superpowers/<skill>/SKILL.md`. Phase 1 MUST include the Bootstrap Superpowers Context step. Opencode reviewer calls MUST use `--agent plan` to stay read-only.
|
||||
@@ -132,7 +133,6 @@ tdd_used: false
|
||||
- Notes (anything the user should know when revisiting)
|
||||
-->
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Guardrails (do NOT remove)
|
||||
|
||||
@@ -0,0 +1,19 @@
|
||||
{
|
||||
"$schema": "https://ai-coding-skills.dev/schemas/generated-manifest/v1.json",
|
||||
"generator": "scripts/generate-skills.mjs",
|
||||
"generatedRoot": "skills/do-task/pi",
|
||||
"files": [
|
||||
{
|
||||
"path": "SKILL.md",
|
||||
"kind": "file",
|
||||
"mode": "644",
|
||||
"sha256": "4920ad0cdeda546b37432c2268159724de54ddb01922308f2df88fcca4db8d31"
|
||||
},
|
||||
{
|
||||
"path": "templates/task-plan.md",
|
||||
"kind": "file",
|
||||
"mode": "644",
|
||||
"sha256": "fd38213fabf350e14b48c5209910d00c16ff74c455101618063835fa8c19e73e"
|
||||
}
|
||||
]
|
||||
}
|
||||
@@ -3,6 +3,8 @@ name: do-task
|
||||
description: Execute a single user-supplied prompt end-to-end in pi with plan review, implementation review, verification, and one persistent task-plan artifact.
|
||||
---
|
||||
|
||||
<!-- ⚠️ GENERATED FILE – do not edit directly. Edit the canonical source in skills/do-task/_source/pi/ and run `pnpm run sync:pi`. -->
|
||||
|
||||
# Do Task (Pi)
|
||||
|
||||
Execute an ad-hoc user prompt end-to-end: parse, clarify, plan, implement, verify, review, commit, and optionally push.
|
||||
|
||||
@@ -1,3 +1,4 @@
|
||||
<!-- ⚠️ GENERATED FILE – do not edit directly. Edit the canonical source in skills/do-task/_source/pi/ and run `pnpm run sync:pi`. -->
|
||||
# Task Plan: [Short Title]
|
||||
|
||||
> **Variant guardrail (pi):** Required workflow skills (`brainstorming`, `test-driven-development`, `verification-before-completion`, `finishing-a-development-branch`, `using-git-worktrees`) must be available to pi as documented in `docs/PI-SUPERPOWERS.md`. Load the relevant workflow skill before entering its matching phase.
|
||||
|
||||
Reference in New Issue
Block a user