feat(M2): Documentation flow, accuracy, consistency cleanup, and cross-platform shell portability

Stefano Fiorini
2026-05-03 20:14:44 -05:00
parent 0443381aa0
commit be993429c1
59 changed files with 1898 additions and 385 deletions
+33 -9
@@ -12,6 +12,7 @@ This is a single-artifact sibling of `create-plan` + `implement-plan`. Unlike `i
## Prerequisite Check (MANDATORY)
Required:
- Claude Code CLI: `claude --version`
- Superpowers repo: `https://github.com/obra/superpowers`
- `superpowers:brainstorming`
@@ -31,6 +32,7 @@ This variant depends on explicit sub-skill invocation via the `Skill` tool. Do N
## Trigger Phrase Detection
**Binding triggers** (always invoke this skill):
- `/do-task`
- "do this task"
- "do task ..."
@@ -38,24 +40,29 @@ This variant depends on explicit sub-skill invocation via the `Skill` tool. Do N
- "make it so"
**Hint trigger** (invoke unless context clearly maps to another skill):
- "just do ..."
**Escape phrases** (skip the Phase 2 clarifying-question loop):
- `--no-questions`
- `"just do it:"`
- `"just do this:"`
- `"no questions:"`
**Excluded** (do NOT trigger `do-task`):
- "implement this" — reserved for `implement-plan`.
**Dropped defaults** (explicitly NOT binding triggers):
- "work on ..."
- "handle this"
- "take care of ..."
- "get this done"
**Worktree opt-in phrases** (Phase 4 takes the worktree branch):
- "in a worktree"
- "use a worktree"
- "on an isolated branch"
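A minimal sketch of how the phrase lists above could be matched in a shell helper. The function names `classify_trigger` and `wants_worktree` are hypothetical, only the phrases visible in this excerpt are covered, and the check order (binding, then excluded, then hint) is an assumption rather than something the skill specifies.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; names and check order are assumptions, not part of the skill.

# classify_trigger MESSAGE -> prints binding | excluded | hint | none
classify_trigger() {
  local msg
  # Lowercase the message so matching is case-insensitive.
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    "/do-task"*|*"do this task"*|*"do task "*|*"make it so"*) echo "binding" ;;
    *"implement this"*) echo "excluded" ;;   # reserved for implement-plan
    *"just do "*) echo "hint" ;;
    *) echo "none" ;;                        # includes the dropped defaults
  esac
}

# wants_worktree MESSAGE -> exit status 0 if a worktree opt-in phrase is present
wants_worktree() {
  local msg
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"in a worktree"*|*"use a worktree"*|*"on an isolated branch"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `classify_trigger "handle this"` prints `none`, because the dropped defaults are deliberately left unmatched.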
@@ -113,7 +120,6 @@ Store `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS` for Phases 5 and 8.
Reviewer CLI: `codex`, `claude`, `cursor`, `opencode`, `pi`, or `skip`.
If `REVIEWER_CLI=pi`, verify the Pi reviewer binary before entering the review loop:
```bash
@@ -127,11 +133,13 @@ When `REVIEWER_CLI=pi`, the reviewer model is configured independently from the
### Phase 4: Initialize Plan Workspace
**PLAN MODE CHECK:** If currently in plan mode:
1. Inform user that `task-plan.md` cannot be written while in plan mode.
2. Instruct user to exit plan mode (approve plan or use `ExitPlanMode`).
3. Proceed with file generation only after exiting plan mode.
Steps:
1. Compute slug: `YYYY-MM-DD-<slug>` where `<slug>` is a kebab-case hash of the task goal (lowercase, alphanumeric + hyphens only).
2. Compute plan folder: `ai_plan/<slug>/`.
3. **Resume detection:** If the folder already exists, read `task-plan.md`:
@@ -153,7 +161,7 @@ If `REVIEWER_CLI=skip`, present `task-plan.md` to the user and proceed only afte
Otherwise, invoke the Review Loop (Shared Subroutine) with:
```text
REVIEW_KIND = plan
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
PAYLOAD_PATH = /tmp/do-task-plan-${REVIEW_ID}.md
@@ -191,11 +199,13 @@ Rules:
```
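The `REVIEW_ID` expression in the invocation above can be tried on its own. The sketch below wraps it in a hypothetical `new_review_id` function and adds a fallback for hosts without `uuidgen`. The fallback is an assumption and not part of the original skill.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the REVIEW_ID pipeline shown above.
new_review_id() {
  if command -v uuidgen >/dev/null 2>&1; then
    # Same pipeline as the REVIEW_ID line: lowercase, keep the first 8 characters.
    uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8
  else
    # Assumed fallback (not in the original): 4 random bytes printed as 8 hex digits.
    od -An -N4 -tx1 /dev/urandom | tr -d ' \n'
  fi
}

REVIEW_ID=$(new_review_id)
PAYLOAD_PATH="/tmp/do-task-plan-${REVIEW_ID}.md"
echo "$PAYLOAD_PATH"
```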
On APPROVED:
- Set `Status: plan-approved`.
- Append APPROVED row to Review History.
- Proceed to Phase 6.
On MAX_ROUNDS:
- Set `Status: aborted-plan-review`.
- Send Telegram summary before stopping.
- Ask the user whether to override and proceed, restart, or abort.
@@ -221,16 +231,19 @@ Native orchestration — do not invoke `superpowers:executing-plans`.
Invoke `superpowers:verification-before-completion` via the `Skill` tool.
Run the commands listed in the `Verification` section of `task-plan.md`:
- Lint (changed files first).
- Typecheck.
- Tests (targeted first, then broader suite if quick).
All must pass. If a command fails:
- Fix the issue.
- Re-run that command.
- Increment `verification_attempts` in Runtime State.
If `verification_attempts` exceeds 3 without green:
- Set `Status: aborted-verification`.
- Send Telegram summary.
- Ask the user whether to retry, override, or abort.
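The retry policy above can be sketched as a small loop. The helper names `run_verification` and `verify_with_retries` are hypothetical, and the sketch re-runs the whole command list after a failure instead of only the failing command. The threshold of 3 and the `aborted-verification` status come from the text.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the verification retry policy; helper names are assumptions.
MAX_VERIFICATION_ATTEMPTS=3   # threshold taken from the text above

# run_verification CMD...: run each command string in order, stop at the first failure.
run_verification() {
  local cmd
  for cmd in "$@"; do
    bash -c "$cmd" || return 1
  done
}

# verify_with_retries CMD...: count failed attempts and abort once the count exceeds 3.
verify_with_retries() {
  local verification_attempts=0
  until run_verification "$@"; do
    verification_attempts=$((verification_attempts + 1))
    if [ "$verification_attempts" -gt "$MAX_VERIFICATION_ATTEMPTS" ]; then
      echo "Status: aborted-verification"
      return 1
    fi
    # A real run would fix the underlying issue here before re-running.
  done
  echo "verified after ${verification_attempts} failed attempt(s)"
}
```

Calling `verify_with_retries 'true'` reports zero failed attempts, while a command that always fails ends with the `aborted-verification` status after the fourth failure.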
@@ -241,7 +254,7 @@ If `REVIEWER_CLI=skip`, present a diff + verification summary to the user and pr
Otherwise, invoke the Review Loop (Shared Subroutine) with:
```text
REVIEW_KIND = implementation
REVIEW_ID = $(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8) # distinct from plan-review ID
PAYLOAD_PATH = /tmp/do-task-implementation-${REVIEW_ID}.md
@@ -297,11 +310,13 @@ Rules:
```
On APPROVED:
- Set `Status: implementation-approved`.
- Append APPROVED row to Review History.
- Proceed to Phase 9.
On MAX_ROUNDS:
- Set `Status: aborted-impl-review`.
- Send Telegram summary.
- Ask the user whether to override and commit anyway, restart, or abort.
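The status transitions for both review loops (plan and implementation) can be summarized in one lookup. This is a sketch only: the function name `status_for` is hypothetical, while the status strings are the ones listed for each outcome.

```bash
#!/usr/bin/env bash
# Hypothetical lookup: status_for REVIEW_KIND OUTCOME -> prints the Status value to record.
status_for() {
  case "$1:$2" in
    plan:APPROVED)             echo "plan-approved" ;;
    plan:MAX_ROUNDS)           echo "aborted-plan-review" ;;
    implementation:APPROVED)   echo "implementation-approved" ;;
    implementation:MAX_ROUNDS) echo "aborted-impl-review" ;;
    *) echo "unknown review outcome: $1:$2" >&2; return 1 ;;
  esac
}
```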
@@ -336,6 +351,7 @@ fi
```
Rules:
- Telegram is the only supported notification path.
- Notification failures are non-blocking but must be surfaced to the user.
- Before stopping for any user interaction, approval, or manual decision, send a Telegram summary first if configured.
@@ -426,18 +442,19 @@ If `SCAN_MATCHES` is non-empty:
1. **Redact the matched text before surfacing** — never echo the raw secret to the user, chat log, terminal scrollback, or any persistent file. Replace each matched substring with a fixed token that preserves only the fact of a match: `[REDACTED:<pattern-label>:<match-length>-chars]`. Example: a matched AWS key becomes `[REDACTED:aws-access-key:20-chars]`. Keep the file path and line number; they are useful for the user and not secret.
2. Present the redacted match summary to the user using this exact wording:
```text
SECRET-SCAN MATCH in outbound reviewer payload (loop: ${REVIEW_KIND}, round: N):
<file>:<line>: [REDACTED:<pattern-label>:<match-length>-chars]
...
Proceed with sending this payload to ${REVIEWER_CLI}? (yes / no / redact)
```
Pattern labels: `aws-access-key`, `gcp-service-account`, `github-token`, `slack-token`, `openai-key`, `anthropic-key`, `pem-private-key`, `dotenv-style`, `jwt`.
3. Wait for user response.
4. On `yes`: record `last_scan_outcome_${REVIEW_KIND}=user-approved-with-matches` in Runtime State, and proceed.
5. On `redact`: ask the user to supply redactions, apply them to `PAYLOAD_PATH`, re-scan (this step), record `last_scan_outcome_${REVIEW_KIND}=redacted-and-approved`.
6. On `no`: stop the loop, set `Status: failed`, send Telegram, return to the user.
If `SCAN_MATCHES` is empty, record `last_scan_outcome_${REVIEW_KIND}=clean` and proceed.
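A minimal sketch of the redaction token format described in step 1. The helper names `redaction_token` and `redacted_line` are hypothetical, the key is AWS's published documentation example rather than a real secret, and the file path in the usage note is made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for the [REDACTED:<pattern-label>:<match-length>-chars] token.

# redaction_token LABEL MATCH: prints the token; the match itself is never echoed.
redaction_token() {
  local label=$1 match=$2
  printf '[REDACTED:%s:%d-chars]\n' "$label" "${#match}"
}

# redacted_line FILE LINE LABEL MATCH: keeps file and line number, redacts the match.
redacted_line() {
  printf '%s:%s: %s\n' "$1" "$2" "$(redaction_token "$3" "$4")"
}
```

For instance, `redacted_line config/.env 12 aws-access-key AKIAIOSFODNN7EXAMPLE` prints `config/.env:12: [REDACTED:aws-access-key:20-chars]`.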
@@ -450,7 +467,6 @@ Write the reviewer invocation to `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID
set -euo pipefail
```
**If `REVIEWER_CLI` is `pi`:**
Fresh call every round (Pi reviewer calls do not use session resume):
@@ -654,15 +670,19 @@ After the command completes:
- `cursor`: already promoted in Step 2 via `jq -r '.result' ...`. Also capture `session_id` if first round.
- `codex`: extract `CODEX_SESSION_ID` from `/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text lives only in `.runner.out`, `cp` it into the `.md` file:
```bash
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
```
- `claude` or `pi`: promote `.runner.out` into the `.md` file:
```bash
cp /tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.runner.out \
/tmp/do-task-${REVIEW_KIND}-review-${REVIEW_ID}.md
```
- `opencode`: already promoted in Step 2 via `jq` on the JSON stream. If opt-in session-resume is active and the JSON includes a stable session id, capture it and persist to `${SESSION_ID_VAR}`.
On Round 1, persist the captured session ID (if any) into `task-plan.md`'s Runtime State under `${SESSION_ID_VAR}`.
@@ -720,6 +740,7 @@ If the round failed, produced empty output, or reached operator-decision timeout
1. Detect existing plan folder by slug at Phase 4.
2. Read `task-plan.md` → `Status`.
3. Decide next action:
| Status | Action |
|--------|--------|
| `draft` | Resume at Phase 5 (plan review) |
@@ -728,6 +749,7 @@ If the round failed, produced empty output, or reached operator-decision timeout
| `implementation-approved` | Resume at Phase 9 (commit + push ask) |
| `pushed` \| `local-only` | Ask user: new suffix, abort, or replay for reference only |
| `aborted-*` \| `failed` | Offer new suffix or full restart |
4. When resuming, read Runtime State for `CODEX_PLAN_SESSION_ID`, `CODEX_IMPL_SESSION_ID`, `CURSOR_PLAN_SESSION_ID`, `CURSOR_IMPL_SESSION_ID`, `OPENCODE_PLAN_SESSION_ID`, `OPENCODE_IMPL_SESSION_ID`, and the round counters. If a session ID is populated, use it for the first revision round in that loop (Round 2) via `codex exec resume`, `cursor-agent --resume`, or `opencode run -s <id>` as applicable.
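The resume decision can be sketched as a `case` dispatch on `Status`. The function name `resume_action` is hypothetical, and only the table rows visible in this excerpt are covered. Any other status falls through to an error instead of being guessed.

```bash
#!/usr/bin/env bash
# Hypothetical dispatch: resume_action STATUS -> prints the action from the table above.
resume_action() {
  case "$1" in
    draft)                   echo "resume at Phase 5 (plan review)" ;;
    implementation-approved) echo "resume at Phase 9 (commit + push ask)" ;;
    pushed|local-only)       echo "ask user: new suffix, abort, or replay for reference only" ;;
    aborted-*|failed)        echo "offer new suffix or full restart" ;;
    *)                       echo "status not covered by this sketch: $1" >&2; return 1 ;;
  esac
}
```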
---
@@ -737,10 +759,12 @@ If the round failed, produced empty output, or reached operator-decision timeout
**ALWAYS update `task-plan.md` before/after each phase transition. NEVER proceed with stale state.**
Before starting any phase:
1. Update `Status` if it transitions.
2. Update `last_phase_entered` in Runtime State.
After completing any phase:
1. Update `Status` if it transitions.
2. Append notes to the relevant section of `task-plan.md`.
@@ -132,7 +132,6 @@ tdd_used: false
- Notes (anything the user should know when revisiting)
-->
---
## Guardrails (do NOT remove)