| name | description |
|---|---|
| do-task | Execute a single user-supplied prompt end-to-end with two reviewer loops (plan review + implementation review) in OpenCode. ALWAYS invoke when the user says `/do-task`, "do this task", "do task ...", "execute this task", or "make it so". Also invoke on the hint phrase "just do ...". Do NOT invoke on "implement this" (that phrase is reserved for implement-plan). |
Do Task (OpenCode)
Execute an ad-hoc user prompt end-to-end: parse → clarify → plan (with reviewer loop) → implement (TDD-first where applicable) → verify → implementation review loop → commit → optional push → notify.
This is a single-artifact sibling of create-plan + implement-plan. Unlike implement-plan, do-task operates on one persistent task-plan.md (not a full milestone plan) and defaults to the current branch (not a worktree).
Core principle: OpenCode loads skills through its native skill tool. Local skills live under ~/.config/opencode/skills/, and OpenCode can also expose shared agent skills from ~/.agents/skills/. Sub-skill invocations use OpenCode's native mechanism — not Claude's Skill tool, not Cursor's workspace discovery.
Prerequisite Check (MANDATORY)
Required:
- OpenCode CLI: opencode --version (install via your package manager or brew install opencode).
- Superpowers repo: https://github.com/obra/superpowers
- OpenCode Superpowers skills available at ~/.agents/skills/superpowers or ~/.config/opencode/skills/superpowers:
  - superpowers/brainstorming
  - superpowers/test-driven-development
  - superpowers/verification-before-completion
  - superpowers/finishing-a-development-branch
  - superpowers/using-git-worktrees (only when the prompt opts in to a worktree)
- Shared reviewer runtime: ~/.config/opencode/skills/reviewer-runtime/run-review.sh
- Telegram notifier helper: ~/.config/opencode/skills/reviewer-runtime/notify-telegram.sh
Verify before proceeding:
opencode --version
test -f ~/.agents/skills/superpowers/brainstorming/SKILL.md || test -f ~/.config/opencode/skills/superpowers/brainstorming/SKILL.md
test -f ~/.agents/skills/superpowers/test-driven-development/SKILL.md || test -f ~/.config/opencode/skills/superpowers/test-driven-development/SKILL.md
test -f ~/.agents/skills/superpowers/verification-before-completion/SKILL.md || test -f ~/.config/opencode/skills/superpowers/verification-before-completion/SKILL.md
test -f ~/.agents/skills/superpowers/finishing-a-development-branch/SKILL.md || test -f ~/.config/opencode/skills/superpowers/finishing-a-development-branch/SKILL.md
If any required dependency is missing, stop immediately and return:
Missing dependency: [specific missing item]. Install required OpenCode Superpowers skills (https://github.com/obra/superpowers, OpenCode setup) and the reviewer-runtime helper, then retry.
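The per-skill checks above repeat the same two-location test, so they could be folded into one loop. The following is a minimal sketch only: the helper names find_skill and check_required_skills, and the AGENTS_SKILLS and OPENCODE_SKILLS override variables, are hypothetical and are not part of OpenCode or Superpowers.

```shell
#!/bin/sh
# Hypothetical helper: print the first SKILL.md found for a Superpowers skill.
# AGENTS_SKILLS / OPENCODE_SKILLS are illustrative overrides for the two roots.
find_skill() (
  name=$1
  for root in "${AGENTS_SKILLS:-$HOME/.agents/skills}" \
              "${OPENCODE_SKILLS:-$HOME/.config/opencode/skills}"; do
    candidate="$root/superpowers/$name/SKILL.md"
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      exit 0
    fi
  done
  exit 1
)

# Hypothetical helper: stop at the first required skill that is missing.
check_required_skills() (
  for skill in brainstorming test-driven-development \
               verification-before-completion finishing-a-development-branch; do
    if ! find_skill "$skill" >/dev/null; then
      printf 'Missing dependency: superpowers/%s\n' "$skill"
      exit 1
    fi
  done
  echo "all required skills found"
)
```

Calling check_required_skills before Phase 1 would surface the first missing skill name, which can then be substituted into the error message above.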
Required Skill Invocation Rules
- Invoke relevant skills through OpenCode's native skill tool.
- Announce skill usage explicitly:
I've read the [Skill Name] skill and I'm using it to [purpose].
- For skills with checklists, track checklist items explicitly in conversation.
- Do NOT use Claude's Skill tool syntax or Cursor's workspace discovery. OpenCode's skill system may expose shared files from ~/.agents/skills/, but invocation still goes through OpenCode's native skill mechanism.
Trigger Phrase Detection
Binding triggers (always invoke this skill):
- /do-task
- "do this task"
- "do task ..."
- "execute this task"
- "make it so"
Hint trigger (invoke unless context clearly maps to another skill):
- "just do ..."
Escape phrases (skip the Phase 2 clarifying-question loop):
- --no-questions
- "just do it:"
- "just do this:"
- "no questions:"
Excluded (do NOT trigger do-task):
- "implement this" — reserved for
implement-plan.
Dropped defaults (explicitly NOT binding triggers):
- "work on ..."
- "handle this"
- "take care of ..."
- "get this done"
Worktree opt-in phrases (Phase 4 takes the worktree branch):
- "in a worktree"
- "use a worktree"
- "on an isolated branch"
- "on a new branch called X"
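The phrase lists above amount to a small classifier. As a rough sketch, assuming simplified case-insensitive matching (prefix checks for triggers, substring checks for escape phrases), it might look like this. The function names classify_trigger and has_escape_phrase are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: classify a prompt by its leading trigger phrase.
classify_trigger() (
  prompt=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$prompt" in
    /do-task*|"do this task"*|"do task "*|"execute this task"*|"make it so"*)
      echo binding ;;
    "just do "*)
      echo hint ;;
    "implement this"*)
      echo excluded ;;   # reserved for implement-plan
    *)
      echo none ;;       # includes dropped defaults such as "work on ..."
  esac
)

# Hypothetical helper: succeed when the prompt contains an escape phrase.
has_escape_phrase() (
  prompt=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$prompt" in
    *--no-questions*|*"just do it:"*|*"just do this:"*|*"no questions:"*) exit 0 ;;
    *) exit 1 ;;
  esac
)
```

Note that "just do it: ..." is both a hint trigger and an escape phrase under this sketch, which matches the lists above.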
Process
Phase 1: Preflight (includes Bootstrap Superpowers Context)
- Bootstrap Superpowers context: use OpenCode's native skill tool to list installed skills and confirm superpowers/brainstorming, superpowers/test-driven-development, superpowers/verification-before-completion, and superpowers/finishing-a-development-branch are discoverable. If any is missing, stop with the Prerequisite Check error message.
- Verify git repo: git rev-parse --is-inside-work-tree.
- Verify /ai_plan/ is present in .gitignore. If missing:
  - Append /ai_plan/ to .gitignore.
  - Commit that infra change immediately with message chore(gitignore): ignore ai_plan local planning artifacts.
  - This infra commit is EXPLICITLY separate from the task commit in Phase 9. It may occur even when the final task ends up aborted or failed.
- Announce each sub-skill before invocation using:
I've read the [Skill Name] skill and I'm using it to [purpose].
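The .gitignore step in Phase 1 can be sketched as a single idempotent function. This is an illustration under stated assumptions, not the skill's actual implementation: the function name ensure_ai_plan_ignored is hypothetical, and passing the path to git commit is one way to keep unrelated staged changes out of the infra commit.

```shell
#!/bin/sh
# Hypothetical helper: make sure /ai_plan/ is ignored, committing only .gitignore.
ensure_ai_plan_ignored() (
  # -x matches the whole line, -F treats the pattern as a literal string.
  if grep -qxF '/ai_plan/' .gitignore 2>/dev/null; then
    echo "already-ignored"
    exit 0
  fi
  # Add a newline first if the file exists and does not end with one.
  if [ -s .gitignore ] && [ -n "$(tail -c 1 .gitignore)" ]; then
    echo >> .gitignore
  fi
  printf '/ai_plan/\n' >> .gitignore
  git add .gitignore
  # The trailing pathspec limits the commit to .gitignore.
  git commit -q -m 'chore(gitignore): ignore ai_plan local planning artifacts' -- .gitignore
  echo "committed"
)
```

Running it a second time is a no-op, so resuming a task does not produce duplicate infra commits.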
Phase 2: Parse Prompt and Question
- Capture the exact user prompt verbatim.
- Detect trigger phrase (see above) and record which one matched.
- Detect escape phrase. If set, skip clarifying questions entirely.
- Apply the ask-first heuristic:
  - Skip clarifying questions ONLY if ALL are true:
    - Prompt names a concrete target (file, feature, or function).
    - Prompt names a concrete outcome (what success looks like).
    - Prompt has no ambiguous scope (no "and maybe also ...").
    - All identifiers in the prompt are resolvable against the codebase.
  - Otherwise, ask 1-3 clarifying questions, ONE AT A TIME, multiple-choice preferred.
  - Empty prompt → ask exactly once: "what task?".
- Invoke superpowers/brainstorming via OpenCode's native skill tool for any behavior-changing task (feature creation, bug fix with multiple plausible approaches, refactor, design decision). Present 2-3 approaches and recommend one before finalizing the plan. The ONLY skip conditions are the same ones that allow TDD auto-skip: pure-documentation and pure-comment-whitespace-rename. When skipping, record the skip reason in the Interpretation section of task-plan.md.
Phase 3: Configure Reviewer
If the user has already specified a reviewer CLI and model (e.g., "do task X, review with codex gpt-5.4"), use those values. If the user says "use defaults" or otherwise opts out of explicit configuration, proceed with REVIEWER_CLI=codex, REVIEWER_MODEL=gpt-5.4, and MAX_ROUNDS=10. Otherwise, ask:
- Which CLI should review both the plan and the implementation?
  - codex: OpenAI Codex CLI (codex exec)
  - claude: Claude Code CLI (claude -p)
  - cursor: Cursor Agent CLI (cursor-agent -p)
  - opencode: OpenCode CLI (opencode run)
  - skip: No external review, proceed with user approval only at each loop.
- Which model? (only if a CLI was chosen)
  - For codex: default gpt-5.4, alternatives: gpt-5.3-codex, o4-mini, o3.
  - For claude: default sonnet, alternatives: opus, haiku.
  - For cursor: run cursor-agent models first to see available models.
  - For opencode: provider-qualified form <provider>/<model> (e.g., anthropic/claude-sonnet-4-5, openai/gpt-5.4). Run opencode models to list available models.
  - Accept any model string the user provides.
- Max review rounds shared across both loops? (default: 10)
  - If the user does not provide a value, set MAX_ROUNDS=10.
Store REVIEWER_CLI, REVIEWER_MODEL, and MAX_ROUNDS for Phases 5 and 8.
Reviewer CLI: codex, claude, cursor, opencode, pi, or skip.
If REVIEWER_CLI=pi, verify the Pi reviewer binary before entering the review loop:
pi --version
For shorthand pi/<pi-model-name>, split only on the first slash when the prefix is exactly pi; store the complete remainder in REVIEWER_MODEL. Examples: pi/claude-opus-4-7 -> claude-opus-4-7, pi/anthropic/claude-opus-4-7 -> anthropic/claude-opus-4-7, and pi/openrouter/anthropic/claude-opus-4-7 -> openrouter/anthropic/claude-opus-4-7.
When REVIEWER_CLI=pi, the reviewer model is configured independently from the model running this workflow. If the model/provider is unavailable, surface helper stderr/status and use pi --list-models [search] to inspect configured models.
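The first-slash split for the pi/<pi-model-name> shorthand maps directly onto shell prefix removal. A minimal sketch, with the hypothetical function name pi_model_from_shorthand:

```shell
#!/bin/sh
# Hypothetical helper: print the model part of a pi/<model> shorthand.
# Fails when the prefix is not exactly "pi" or when no model follows it.
pi_model_from_shorthand() (
  case "$1" in
    pi/?*)
      # ${1#pi/} strips only the leading "pi/"; later slashes are preserved.
      printf '%s\n' "${1#pi/}" ;;
    *)
      exit 1 ;;
  esac
)
```

For example, pi_model_from_shorthand pi/openrouter/anthropic/claude-opus-4-7 prints openrouter/anthropic/claude-opus-4-7, which is the value stored in REVIEWER_MODEL.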
Phase 4: Initialize Plan Workspace
OpenCode has no plan-mode concept; there is no plan-mode guard here.
Steps:
- Compute slug: YYYY-MM-DD-<slug> where <slug> is a short kebab-case summary of the task goal (lowercase, alphanumeric + hyphens only).
- Compute plan folder: ai_plan/YYYY-MM-DD-<slug>/.
- Resume detection: If the folder already exists, read task-plan.md:
  - If Status is draft or plan-approved or implementation-in-progress: offer to resume, pick a new suffix (<slug>-v2), or abort. Default is resume.
  - If Status is any terminal value (pushed, local-only, aborted-*, failed): offer a new suffix or abort. Default is new suffix.
- If not resuming, create the folder and write task-plan.md from the template at templates/task-plan.md (this skill's template folder; falls back to ~/.config/opencode/skills/do-task/templates/task-plan.md when installed directly).
- Fill in:
  - Metadata block.
  - Prompt (verbatim).
  - Interpretation, Assumptions, Files, Approach, TDD Approach, Acceptance Criteria, Verification, Rollback.
  - Leave Runtime State, Review History, Final Status empty (skill updates these).
- Set Status: draft.
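The slug and plan-folder steps could be sketched as below. This assumes the slug is produced by normalizing the task goal text directly; the function names slugify and plan_folder are hypothetical, and no length limit is applied because the text above does not specify one.

```shell
#!/bin/sh
# Hypothetical helper: lowercase the goal and collapse every run of
# non-alphanumeric characters into a single hyphen.
slugify() (
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
)

# Hypothetical helper: build the plan folder path.
# $2 is an optional date override so the result is reproducible.
plan_folder() (
  day=${2:-$(date +%Y-%m-%d)}
  printf 'ai_plan/%s-%s\n' "$day" "$(slugify "$1")"
)
```

With these definitions, plan_folder "Fix login bug" 2025-01-15 prints ai_plan/2025-01-15-fix-login-bug.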
Worktree branch: If the prompt opts in to a worktree (see Trigger Phrase Detection), invoke superpowers/using-git-worktrees via OpenCode's native skill tool before proceeding. Otherwise continue on the current branch.
Phase 5: Plan Review Loop
If REVIEWER_CLI=skip, present task-plan.md to the user and proceed only after explicit user approval.
Otherwise, invoke the Review Loop (Shared Subroutine) with:
REVIEW_KIND = plan
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
PAYLOAD_PATH = /tmp/do-task-plan-${REVIEW_ID}.md
PROMPT_TEMPLATE = PLAN_REVIEW_PROMPT (see below)
SESSION_ID_VAR = CODEX_PLAN_SESSION_ID | CURSOR_PLAN_SESSION_ID | OPENCODE_PLAN_SESSION_ID
Payload is the current task-plan.md with the Runtime State and Review History blocks stripped before writing to PAYLOAD_PATH. Those two blocks contain reviewer session IDs and scan outcomes that must never be sent back to any reviewer CLI. Reviewers only need the Prompt, Interpretation, Assumptions, Files, Approach, TDD Approach, Acceptance Criteria, Verification, Rollback, and Metadata sections.
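One way to strip the two blocks before writing the payload is a small awk filter. This sketch assumes that task-plan.md marks its sections with level-2 headings named exactly "## Runtime State" and "## Review History"; the template is not shown here, so treat those heading names and the function name strip_private_sections as assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print a plan file without its Runtime State and
# Review History sections. A section runs from its "## " heading up to
# the next "## " heading.
strip_private_sections() {
  awk '
    /^## / { skip = ($0 == "## Runtime State" || $0 == "## Review History") }
    !skip  { print }
  ' "$1"
}
```

Usage would be along the lines of strip_private_sections ai_plan/<slug>/task-plan.md > "$PAYLOAD_PATH", followed by the secret scan.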
PLAN_REVIEW_PROMPT:
Review this task plan for completeness, correctness, and risk. Focus on:
1. Does the plan match the user's prompt?
2. Are all assumptions surfaced?
3. Are acceptance criteria testable?
4. Is the TDD approach appropriate per the TDD Approach section?
5. Are there missing files, risks, or security concerns?
Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict
Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`.
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking.
On APPROVED:
- Set Status: plan-approved.
- Append APPROVED row to Review History.
- Proceed to Phase 6.
On MAX_ROUNDS:
- Set Status: aborted-plan-review.
- Send Telegram summary before stopping.
- Ask the user whether to override and proceed, restart, or abort.
Phase 6: Execute (TDD-first where applicable)
Native orchestration — do not invoke superpowers:executing-plans.
- Set Status: implementation-in-progress.
- For every behavior-changing file edit:
  - Invoke superpowers/test-driven-development via OpenCode's native skill tool.
  - Write the failing test first. Run it. Confirm it fails.
  - Implement the minimal code to make it pass. Run the test. Confirm green.
  - Do NOT commit yet; a single task commit happens in Phase 9.
- Auto-skip of TDD is permitted ONLY for tasks classified in task-plan.md TDD Approach as:
  - pure-documentation
  - pure-comment-whitespace-rename
- Any other skip (including pure-config-addition) requires explicit user approval recorded in task-plan.md with an ISO-8601 timestamp.
- Update task-plan.md after each logical step: add notes to Approach, check off Acceptance Criteria items as they complete.
Phase 7: Verification Gate
Invoke superpowers/verification-before-completion via OpenCode's native skill tool.
Run the commands listed in the Verification section of task-plan.md:
- Lint (changed files first).
- Typecheck.
- Tests (targeted first, then broader suite if quick).
All must pass. If a command fails:
- Fix the issue.
- Re-run that command.
- Increment verification_attempts in Runtime State.
If verification_attempts exceeds 3 without green:
- Set Status: aborted-verification.
- Send Telegram summary.
- Ask the user whether to retry, override, or abort.
Phase 8: Implementation Review Loop
If REVIEWER_CLI=skip, present a diff + verification summary to the user and proceed only after explicit user approval.
Otherwise, invoke the Review Loop (Shared Subroutine) with:
REVIEW_KIND = implementation
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8) # distinct from plan-review ID
PAYLOAD_PATH = /tmp/do-task-implementation-${REVIEW_ID}.md
PROMPT_TEMPLATE = IMPL_REVIEW_PROMPT (see below)
SESSION_ID_VAR = CODEX_IMPL_SESSION_ID | CURSOR_IMPL_SESSION_ID | OPENCODE_IMPL_SESSION_ID
Payload contents (assembled by the skill):
# Implementation Review: [Short Title]
## Task Plan (the plan that was approved)
<embed approved task-plan.md, excluding Runtime State block>
## Changes Made (git diff)
<output of: `git diff` for unstaged + `git diff --staged` for staged>
## Verification Output
### Lint
<lint output>
### Typecheck
<typecheck output>
### Tests
<test output, pass/fail counts>
IMPL_REVIEW_PROMPT:
Review this implementation against the task plan. Focus on:
1. Correctness — Does the diff satisfy the Acceptance Criteria?
2. Code quality — Clean, maintainable, no obvious issues?
3. Test coverage — Are behavior changes adequately tested (per the plan's TDD Approach)?
4. Security — Any security concerns introduced?
5. Regressions — Does the diff risk breaking unrelated code?
Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict
Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`.
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking.
On APPROVED:
- Set Status: implementation-approved.
- Append APPROVED row to Review History.
- Proceed to Phase 9.
On MAX_ROUNDS:
- Set Status: aborted-impl-review.
- Send Telegram summary.
- Ask the user whether to override and commit anyway, restart, or abort.
Phase 9: Commit + Push Ask
Invoke superpowers/finishing-a-development-branch via OpenCode's native skill tool.
- Stage all changed files explicitly (avoid git add -A).
- Single commit with message derived from the task goal:
  - Format: <type>(<scope>): <short description>
  - Example: feat(auth): add session token rotation
- Do NOT push. Update Status: local-only.
- Ask the user: "Push to remote? (yes / no)"
  - On explicit yes → push, then set Status: pushed.
  - Any other response → leave Status: local-only.
Phase 10: Telegram Notification + Finalize
Resolve the notifier helper:
TELEGRAM_NOTIFY_RUNTIME=~/.config/opencode/skills/reviewer-runtime/notify-telegram.sh
On every terminal outcome (pushed, local-only, aborted-*, failed), send a Telegram summary if both TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID are set:
if [ -x "$TELEGRAM_NOTIFY_RUNTIME" ] && [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
"$TELEGRAM_NOTIFY_RUNTIME" --message "do-task <slug>: <status summary>"
fi
Rules:
- Telegram is the only supported notification path.
- Notification failures are non-blocking but must be surfaced to the user.
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
- If Telegram is not configured, state that no Telegram notification was sent.
Fill in Final Status in task-plan.md (include commit hash if any). Do NOT delete the plan folder — it stays as a record.
Review Loop (Shared Subroutine)
This subroutine is invoked twice per do-task run: once in Phase 5 (REVIEW_KIND=plan) and once in Phase 8 (REVIEW_KIND=implementation). Separate session IDs are used for each loop so reviewer context never leaks across loops.
Subroutine Inputs
| Variable | Purpose |
|---|---|
| REVIEW_KIND | plan or implementation |
| REVIEW_ID | 8-char hex (from uuidgen); reused across rounds of the same loop |
| PAYLOAD_PATH | /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md |
| PROMPT_TEMPLATE | PLAN_REVIEW_PROMPT or IMPL_REVIEW_PROMPT |
| REVIEWER_CLI | codex \| claude \| cursor \| opencode \| pi |
| REVIEWER_MODEL | Model name |
| MAX_ROUNDS | Default 10 |
| SESSION_ID_VAR | CODEX_PLAN_SESSION_ID \| CODEX_IMPL_SESSION_ID \| CURSOR_PLAN_SESSION_ID \| CURSOR_IMPL_SESSION_ID \| OPENCODE_PLAN_SESSION_ID \| OPENCODE_IMPL_SESSION_ID |
Temp artifact paths (per loop):
- /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md: payload
- /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md: normalized review text
- /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json: raw Cursor/OpenCode JSON (cursor only, plus opencode when --format json is used)
- /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr
- /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status
- /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out
- /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh
Resolve the shared helper:
REVIEWER_RUNTIME=~/.config/opencode/skills/reviewer-runtime/run-review.sh
Set helper success-artifact args before writing the command script:
HELPER_SUCCESS_FILE_ARGS=()
case "$REVIEWER_CLI" in
codex)
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md)
;;
cursor)
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json)
;;
esac
Step 1: Write Payload
Write the full payload for this round to PAYLOAD_PATH.
Step 1a: Secret Scan (per-payload, no caching)
BEFORE sending the payload to any reviewer CLI, scan it for secrets. This scan runs EVERY round — no results are cached. Rationale: Phase 8 payloads include newly-introduced diff content that earlier rounds never saw.
Run the secret scan with all of these prefix-anchored regexes. Use grep -En on the payload file, supplying the patterns from a file via -f:
SECRET_REGEX_FILE=$(mktemp)
cat >"$SECRET_REGEX_FILE" <<'EOF'
AKIA[0-9A-Z]{16}
"type"\s*:\s*"service_account"
(ghp|gho|ghs|ghu|ghr)_[A-Za-z0-9]{36,}
xox[abpsr]-[0-9]+-[0-9]+-[0-9]+-[A-Za-z0-9]{24,}
xox[abpsr]-[A-Za-z0-9]{10,48}
sk-(proj-)?[A-Za-z0-9_-]{20,}
sk-ant-(api|admin)[0-9]+-[A-Za-z0-9_-]{20,}
-----BEGIN [A-Z ]+ PRIVATE KEY-----
(TOKEN|SECRET|PASSWORD|API_?KEY|ACCESS_?KEY)\s*=\s*["']?[A-Za-z0-9+/=_-]{8,}
eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+
EOF
SCAN_MATCHES=$(grep -Ensf "$SECRET_REGEX_FILE" "$PAYLOAD_PATH" || true)
rm -f "$SECRET_REGEX_FILE"
If SCAN_MATCHES is non-empty:
- Redact the matched text before surfacing: never echo the raw secret to the user, chat log, terminal scrollback, or any persistent file. Replace each matched substring with a fixed token that preserves only the fact of a match: [REDACTED:<pattern-label>:<match-length>-chars]. Example: a matched AWS key becomes [REDACTED:aws-access-key:20-chars]. Keep the file path and line number; they are useful for the user and not secret.
- Present the redacted match summary to the user using this exact wording:
  SECRET-SCAN MATCH in outbound reviewer payload (loop: ${REVIEW_KIND}, round: N):
  <file>:<line>: [REDACTED:<pattern-label>:<match-length>-chars]
  ...
  Proceed with sending this payload to ${REVIEWER_CLI}? (yes / no / redact)
  Pattern labels: aws-access-key, gcp-service-account, github-token, slack-token, openai-key, anthropic-key, pem-private-key, dotenv-style, jwt.
- Wait for user response.
- On yes: record last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches in Runtime State, and proceed.
- On redact: ask the user to supply redactions, apply them to PAYLOAD_PATH, re-scan (this step), record last_scan_outcome_${REVIEW_KIND}=redacted-and-approved.
- On no: stop the loop, set Status: failed, send Telegram, return to the user.
If SCAN_MATCHES is empty, record last_scan_outcome_${REVIEW_KIND}=clean and proceed.
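The redaction rule can be illustrated with a short sketch that reports each match by file, line, label, and length only. The function names redact_match and summarize_matches are hypothetical, and the example handles one pattern at a time rather than the full regex file.

```shell
#!/bin/sh
# Hypothetical helper: build the redaction token for one matched secret.
redact_match() (
  label=$1
  secret=$2
  # ${#secret} is the match length; the secret itself is never printed.
  printf '[REDACTED:%s:%s-chars]\n' "$label" "${#secret}"
)

# Hypothetical helper: list redacted matches of one regex in one file.
summarize_matches() (
  file=$1
  label=$2
  regex=$3
  # grep -Eno prints "<line>:<matched text>" for each match.
  grep -Eno "$regex" "$file" | while IFS=: read -r line match; do
    printf '%s:%s: %s\n' "$file" "$line" "$(redact_match "$label" "$match")"
  done
)
```

For instance, summarize_matches "$PAYLOAD_PATH" aws-access-key 'AKIA[0-9A-Z]{16}' yields lines in the <file>:<line>: [REDACTED:...] form shown above.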
Step 2: Generate Reviewer Command Script
Write the reviewer invocation to /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh as a bash script starting with:
#!/usr/bin/env bash
set -euo pipefail
If REVIEWER_CLI is pi:
Fresh call every round (Pi reviewer calls do not use session resume):
pi --no-session --no-skills --no-prompt-templates --no-extensions --no-context-files \
--model "$REVIEWER_MODEL" \
--tools read,grep,find,ls \
-p "Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review. Return exactly the required ## Summary, ## Findings, and ## Verdict structure."
If REVIEWER_CLI is codex:
Round 1 — fresh codex exec:
codex exec \
-m ${REVIEWER_MODEL} \
-s read-only \
-o /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
"Review the ${REVIEW_KIND} payload in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
${PROMPT_TEMPLATE}"
Do not capture the Codex session ID yet. After Round 1 completes, extract it from /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out (look for session id: <uuid>) and persist it to Runtime State under ${SESSION_ID_VAR}.
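Extracting the session ID from the runner output might look like the sketch below. It assumes the line has the form session id: <uuid> with a standard 36-character UUID, as described above; the function name extract_codex_session_id is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first UUID that follows "session id: ".
# Prints nothing when no such line exists.
extract_codex_session_id() {
  sed -n 's/.*session id: \([0-9a-fA-F-]\{36\}\).*/\1/p' "$1" | head -n 1
}
```

An empty result would indicate that no session ID was captured, in which case later rounds fall back to a fresh codex exec.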
Round 2 and later — resume session:
codex exec resume ${SESSION_ID_VAR_VALUE} \
-o /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
"I've revised based on your feedback. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
Changes made:
[List specific changes]
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before.
Keep findings ordered P0 to P3, use '- None.' when a severity has no findings, and only use VERDICT: APPROVED when no P0, P1, or P2 findings remain."
If resume fails, fall back to fresh codex exec with prior-round context.
If REVIEWER_CLI is claude:
Fresh call every round (this workflow does not use session resume for the Claude CLI):
claude -p \
"${ROUND_PREFIX}Review the following ${REVIEW_KIND} payload.
$(cat /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md)
${PROMPT_TEMPLATE}" \
--model ${REVIEWER_MODEL} \
--strict-mcp-config \
--setting-sources user
Where ${ROUND_PREFIX} is empty for Round 1 and "You previously reviewed this ${REVIEW_KIND} and requested revisions. Previous feedback summary: [key points]. " for subsequent rounds.
If REVIEWER_CLI is cursor:
Round 1:
cursor-agent -p \
--mode=ask \
--model ${REVIEWER_MODEL} \
--trust \
--output-format json \
"Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review.
${PROMPT_TEMPLATE}" \
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
Round 2 and later — resume:
cursor-agent --resume ${SESSION_ID_VAR_VALUE} -p \
--mode=ask \
--model ${REVIEWER_MODEL} \
--trust \
--output-format json \
"I've revised based on your feedback. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
Changes made:
[List specific changes]
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
If resume fails, fall back to fresh cursor-agent -p.
After the command completes, extract the session id and review text:
CURSOR_SID=$(jq -r '.session_id' /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json)
jq -r '.result' /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
Persist CURSOR_SID to Runtime State under ${SESSION_ID_VAR} on Round 1.
If REVIEWER_CLI is opencode:
OpenCode does not expose a dedicated read-only flag at the CLI level; use the built-in plan primary agent (--agent plan) for review, which is read-oriented and does not modify files. Session resume is supported via -s <session-id>, but the most reliable pattern for non-interactive review is a fresh call each round (as with claude), because opencode's session lifecycle and ID capture are less standardized than codex/cursor for headless runs. Skills MAY opt in to session resume once they have verified that the installed opencode version exposes a stable session id in --format json output.
Round 1 (preferred, fresh call):
opencode run \
-m ${REVIEWER_MODEL} \
--agent plan \
--format json \
"Read the file /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md and review.
${PROMPT_TEMPLATE}" \
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
Round 2 and later (fresh-call fallback path — recommended default):
opencode run \
-m ${REVIEWER_MODEL} \
--agent plan \
--format json \
"You previously reviewed this ${REVIEW_KIND} and requested revisions.
Previous feedback summary: [key points from last review]
I've revised. Updated payload is below.
$(cat /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md)
Changes made:
[List specific changes]
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
Optional session-resume path (only if the installed opencode reliably emits a session id in --format json output and accepts it back via -s):
# Round 2+ with resume
opencode run \
-s ${SESSION_ID_VAR_VALUE} \
-m ${REVIEWER_MODEL} \
--agent plan \
--format json \
"I've revised. Updated payload is in /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md.
Changes made:
[List specific changes]
Re-review using the same ## Summary, ## Findings, and ## Verdict structure as before." \
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json
Extract the review body (the JSON stream emits events; the final assistant message contains the review text):
jq -r '.[] | select(.type == "message" and .role == "assistant") | .content' \
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
> /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
|| cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
If the JSON parse falls through, promote the raw JSON file as the review output and surface a warning to the user. On any opencode CLI or JSON parsing failure, treat this loop round as completed-empty-output and follow the helper-failure escalation in Step 6.
Step 3: Run via run-review.sh
Run the command script through the shared helper when available:
if [ -x "$REVIEWER_RUNTIME" ]; then
"$REVIEWER_RUNTIME" \
--command-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh \
--stdout-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
--stderr-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr \
--status-file /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status \
"${HELPER_SUCCESS_FILE_ARGS[@]}"
else
echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
bash /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh \
>/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
2>/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr
fi
Run the helper in the foreground and watch live stdout for state=in-progress heartbeats. If your agent environment buffers command output until exit, start the helper in the background and poll the .status file instead of treating heartbeats as post-hoc-only data.
Step 4: Promote Reviewer Output + Capture Session ID
After the command completes:
- cursor: already promoted in Step 2 via jq -r '.result' .... Also capture session_id if first round.
- codex: extract CODEX_SESSION_ID from /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out after the helper or fallback run. If the review text lives only in .runner.out, cp it into the .md file:
  cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
     /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
- claude or pi: promote .runner.out into the .md file:
  cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
     /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
- opencode: already promoted in Step 2 via jq on the JSON stream. If opt-in session-resume is active and the JSON includes a stable session id, capture it and persist to ${SESSION_ID_VAR}.
On Round 1, persist the captured session ID (if any) into task-plan.md's Runtime State under ${SESSION_ID_VAR}.
Step 5: Parse Verdict + Update Review History
- Read /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md.
- Append one row to task-plan.md Review History:
  - Timestamp (ISO-8601 UTC).
  - Loop (plan or implementation).
  - Round number.
  - Verdict (APPROVED or REVISE).
  - Summary (first line of the ## Summary section).
- Increment plan_review_round or implementation_review_round in Runtime State.
Step 6: Branch APPROVED / REVISE / MAX_ROUNDS
Verdict rules:
- VERDICT: APPROVED with no P0, P1, or P2 findings → exit the subroutine with APPROVED.
- VERDICT: APPROVED with only P3 findings → optionally fix the P3 items if cheap and safe, then exit with APPROVED.
- VERDICT: REVISE or any P0, P1, or P2 finding → go to revision (see below), then return to Step 1 for the next round.
- No clear verdict but P0, P1, and P2 are all - None. → treat as APPROVED.
- Helper state completed-empty-output → treat as failed review attempt, surface .stderr/.status, fix invocation or prompt handling, then retry.
- Helper state needs-operator-decision → surface status log and decide whether to extend the timeout, abort, or retry with different helper parameters.
- Round counter ≥ MAX_ROUNDS → exit the subroutine with MAX_ROUNDS. Caller decides next action per Phase 5 or Phase 8.
Revision: The caller (Phase 5 for plan, Phase 6/7 for implementation) applies findings in priority order (P0 → P1 → P2 → P3). For implementation review revisions, Phase 7 verification must be re-run after every revision before returning to Step 1.
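The verdict rules in Step 6 can be approximated with grep and awk over the normalized review file. The sketch below covers only the verdict and findings rules, not the helper states or the round counter. The function names findings_state and review_outcome, and the UNCLEAR result, are hypothetical, and it assumes findings are written as "- " bullets under the ### P0 to ### P3 headings.

```shell
#!/bin/sh
# Hypothetical helper: classify the P0-P2 sections of a review file as
# "blocking" (any real finding), "clear" (all three say "- None."),
# or "incomplete" (a section is missing or empty).
findings_state() {
  awk '
    /^### P[0-3]/ { sev = $2; next }
    /^## /        { sev = "" }
    sev ~ /^P[012]$/ && /^- / {
      if ($0 == "- None.") none[sev] = 1; else blocking++
    }
    END {
      if (blocking > 0) print "blocking"
      else if (none["P0"] && none["P1"] && none["P2"]) print "clear"
      else print "incomplete"
    }
  ' "$1"
}

# Hypothetical helper: combine the verdict line with the findings state.
review_outcome() (
  verdict=$(grep -E '^VERDICT: (APPROVED|REVISE)$' "$1" | tail -n 1)
  state=$(findings_state "$1")
  case "$verdict:$state" in
    *:blocking|"VERDICT: REVISE":*) echo REVISE ;;
    "VERDICT: APPROVED":*|:clear)   echo APPROVED ;;
    *)                              echo UNCLEAR ;;  # escalate, do not guess
  esac
)
```

A blocking finding wins over an APPROVED verdict line, which mirrors the rule that approval is only valid with no P0, P1, or P2 findings.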
Step 7: Liveness Contract (during Step 3)
- The shared reviewer runtime emits state=in-progress note="In progress N" heartbeats every 60 seconds while the reviewer child is alive.
- Keep waiting as long as a fresh In progress N heartbeat keeps arriving roughly once per minute.
- Do not abort just because the review is slow, a soft timeout fired, or a stall-warning line appears, as long as the In progress N heartbeat continues.
- Treat missing heartbeats, state=failed, state=completed-empty-output, and state=needs-operator-decision as escalation signals.
Step 8: Cleanup (on successful round exit)
rm -f /tmp/do-task-${REVIEW_KIND}-${REVIEW_ID}.md \
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md \
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.json \
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.stderr \
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.status \
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.sh
If the round failed, produced empty output, or reached operator-decision timeout, KEEP .stderr, .status, and .runner.out until the issue is diagnosed instead of deleting them.
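The keep-on-failure rule can be made explicit with a small wrapper around the rm call. This is a sketch under one reading of the rule (diagnostic files survive any non-success outcome, other artifacts are always removed). The function name cleanup_review_artifacts, the outcome argument, and the ARTIFACT_DIR override are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: remove per-loop temp artifacts.
# $1 = REVIEW_KIND, $2 = REVIEW_ID, $3 = "success" or any other outcome.
# ARTIFACT_DIR is an illustrative override for /tmp.
cleanup_review_artifacts() (
  kind=$1
  id=$2
  outcome=$3
  base="${ARTIFACT_DIR:-/tmp}/do-task-$kind"
  rm -f "$base-$id.md" \
        "$base-review-$id.md" \
        "$base-review-$id.json" \
        "$base-review-$id.sh"
  # Keep diagnostics unless the round succeeded.
  if [ "$outcome" = success ]; then
    rm -f "$base-review-$id.stderr" \
          "$base-review-$id.status" \
          "$base-review-$id.runner.out"
  fi
)
```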
Resume Semantics
- Detect existing plan folder by slug at Phase 4.
- Read task-plan.md → Status.
- Decide next action:

| Status | Action |
|---|---|
| draft | Resume at Phase 5 (plan review) |
| plan-approved | Resume at Phase 6 (execute) |
| implementation-in-progress | Resume at Phase 6 (continue execute) |
| implementation-approved | Resume at Phase 9 (commit + push ask) |
| pushed \| local-only | Ask user: new suffix, abort, or replay for reference only |
| aborted-* \| failed | Offer new suffix or full restart |

- When resuming, read Runtime State for CODEX_PLAN_SESSION_ID, CODEX_IMPL_SESSION_ID, CURSOR_PLAN_SESSION_ID, CURSOR_IMPL_SESSION_ID, OPENCODE_PLAN_SESSION_ID, OPENCODE_IMPL_SESSION_ID, and the round counters. If a session ID is populated, use it for the first revision round in that loop (Round 2) via codex exec resume, cursor-agent --resume, or opencode run -s <id> as applicable.
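The status-to-action mapping above is a plain lookup, sketched here as a case statement. The function name resume_action and the output strings are hypothetical; how Status is read out of task-plan.md is left open because the template format is not shown.

```shell
#!/bin/sh
# Hypothetical helper: map a task-plan.md Status value to the resume action.
resume_action() (
  case "$1" in
    draft)                                    echo "resume-phase-5" ;;
    plan-approved|implementation-in-progress) echo "resume-phase-6" ;;
    implementation-approved)                  echo "resume-phase-9" ;;
    pushed|local-only)                        echo "ask-new-suffix-abort-or-replay" ;;
    aborted-*|failed)                         echo "offer-new-suffix-or-restart" ;;
    *)
      echo "unknown status: $1" >&2
      exit 1 ;;
  esac
)
```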
Tracker Discipline (MANDATORY)
ALWAYS update task-plan.md before/after each phase transition. NEVER proceed with stale state.
Before starting any phase:
- Update Status if it transitions.
- Update last_phase_entered in Runtime State.
After completing any phase:
- Update Status if it transitions.
- Append notes to the relevant section of task-plan.md.
Review History is append-only.
Execution Workflow Rules
- Current branch is the default; worktree is opt-in only.
- Do NOT push without explicit "yes".
- Secret scan runs per-payload, no caching — every round, including revisions.
- Review loops use MAX_ROUNDS=10 by default, shared across both loops.
- The task commit is a single commit created in Phase 9; interim WIP commits are NOT created.
- The .gitignore infra commit in Phase 1 is explicitly separate from the task commit and is allowed even on abort.
Verification Checklist
- ai_plan/ exists and /ai_plan/ is in .gitignore
- task-plan.md created under ai_plan/YYYY-MM-DD-<slug>/
- Reviewer CLI + model + MAX_ROUNDS configured (or skip)
- Secret scan ran on every outbound reviewer payload
- Plan review completed (APPROVED, MAX_ROUNDS handled, or skipped)
- Phase 6 executed TDD-first for all behavior-changing steps (or documented skip)
- Phase 7 verification green before Phase 8
- Implementation review completed (APPROVED, MAX_ROUNDS handled, or skipped)
- Single task commit created locally, no push without explicit yes
- Telegram notification attempted if configured
- task-plan.md Final Status filled in
Variant Hardening Notes — OpenCode
- Must use OpenCode's native skill tool for sub-skill invocation. Do NOT use Claude's Skill tool syntax. OpenCode may load shared skill files from ~/.agents/skills/, but invocation is still OpenCode-native.
- Phase 1 includes a Bootstrap Superpowers Context step that lists installed skills and confirms superpowers/brainstorming, superpowers/test-driven-development, superpowers/verification-before-completion, and superpowers/finishing-a-development-branch are discoverable before any other phase runs.
- Helper paths are ~/.config/opencode/skills/reviewer-runtime/{run-review.sh,notify-telegram.sh}.
- OpenCode reviewer CLI branch (when REVIEWER_CLI=opencode):
  - Binary: opencode. Non-interactive: opencode run "<message>".
  - Model: -m <provider>/<model> (e.g., openai/gpt-5.4, anthropic/claude-sonnet-4-5).
  - Read-only posture: --agent plan (uses OpenCode's built-in plan primary agent; no explicit --read-only flag exists).
  - Output: --format json for structured output. Extraction uses jq against the JSON event stream.
  - Session resume: -s <session-id> or --continue. Fresh call each round is the recommended default since session id capture is less standardized than codex/cursor for headless runs.
- No plan-mode guard (OpenCode has no plan-mode concept).
Common Mistakes
- Skipping the Bootstrap Superpowers Context step in Phase 1 (breaks native skill discovery).
- Using Claude Skill tool syntax, or treating shared ~/.agents/skills/ files as anything other than OpenCode-native skill entries.
- Forgetting to set --agent plan on opencode reviewer calls (would use the default build agent, which can write files).
- Asking multiple clarifying questions in a single message.
- Skipping the per-payload secret scan because "the previous round was clean".
- Pushing the task commit without explicit user approval.
- Using a non-provider-qualified model string for opencode (e.g., gpt-5.4 instead of openai/gpt-5.4).
Red Flags — Stop and Correct
- You are invoking sub-skills via Claude's Skill tool or Codex native-discovery paths instead of OpenCode's native skill tool.
- You are running an opencode reviewer call without --agent plan.
- You did not announce which skill you invoked and why.
- You are proceeding to implementation review with failing lint/typecheck/tests.
- You are echoing raw secret-scan matches to the user or logs.
- You are pushing without explicit user approval.