ai-coding-skills/skills/create-plan/codex/SKILL.md
name: create-plan
description: Use when a user asks to create or maintain a structured implementation plan in Codex, including milestones, bite-sized stories, and resumable local planning artifacts under ai_plan.

Create Plan (Codex Native Superpowers)

Create and maintain a local plan workspace under ai_plan/ at project root.

Overview

This skill wraps the current Superpowers flow for Codex:

  1. Design first with superpowers:brainstorming
  2. Then build an implementation plan with superpowers:writing-plans
  3. Review the plan iteratively with a second model/provider
  4. Persist a local execution package in ai_plan/YYYY-MM-DD-<short-title>/

Core principle: Codex uses native skill discovery from ~/.agents/skills/. Do not use deprecated superpowers-codex bootstrap or use-skill CLI commands.

Prerequisite Check (MANDATORY)

Required:

  • Superpowers skills symlink: ~/.agents/skills/superpowers -> ~/.codex/superpowers/skills
  • superpowers:brainstorming
  • superpowers:writing-plans

Verify before proceeding:

test -L ~/.agents/skills/superpowers
test -f ~/.agents/skills/superpowers/brainstorming/SKILL.md
test -f ~/.agents/skills/superpowers/writing-plans/SKILL.md

If any dependency is missing, stop and return:

Missing dependency: native Superpowers skills are required (superpowers:brainstorming, superpowers:writing-plans). Ensure ~/.agents/skills/superpowers is configured, then retry.
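The checks above can be sketched as one gate. This is a minimal illustration, not part of the skill; the `check_superpowers_skills` function name and the directory-argument override are hypothetical conveniences:

```shell
# Hypothetical wrapper around the prerequisite checks above.
# Accepts an optional directory override (for testing); defaults to
# the ~/.agents/skills/superpowers layout this skill expects.
check_superpowers_skills() {
  local dir="${1:-$HOME/.agents/skills/superpowers}"
  local skill
  [ -e "$dir" ] || { echo "Missing dependency: $dir" >&2; return 1; }
  for skill in brainstorming writing-plans; do
    if [ ! -f "$dir/$skill/SKILL.md" ]; then
      echo "Missing dependency: superpowers:$skill" >&2
      return 1
    fi
  done
}

if ! check_superpowers_skills; then
  echo "Missing dependency: native Superpowers skills are required (superpowers:brainstorming, superpowers:writing-plans). Ensure ~/.agents/skills/superpowers is configured, then retry." >&2
fi
```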

Required Skill Invocation Rules

  • Invoke relevant skills through native discovery (no CLI wrapper).
  • Announce skill usage explicitly:
    • I've read the [Skill Name] skill and I'm using it to [purpose].
  • For skills with checklists, track checklist items with update_plan todos.
  • Tool mapping for Codex:
    • TodoWrite -> update_plan
    • Task subagents -> unavailable in Codex; do the work directly and state the limitation
    • Skill -> use native skill discovery from ~/.agents/skills/

Process

Phase 1: Analyze

  • Explore the codebase and existing patterns.

Phase 2: Gather Requirements

  • Ask questions one at a time until the user says they are ready.
  • Confirm scope, constraints, success criteria, dependencies.

Phase 3: Configure Reviewer

If the user has already specified a reviewer CLI and model (e.g., "create a plan, review with claude sonnet"), use those values. Otherwise, ask:

  1. Which CLI should review the plan?

    • codex — OpenAI Codex CLI (codex exec)
    • claude — Claude Code CLI (claude -p)
    • cursor — Cursor Agent CLI (cursor-agent -p)
    • skip — No external review, proceed directly to file generation
  2. Which model? (only if a CLI was chosen)

    • For codex: default o4-mini, alternatives: gpt-5.3-codex, o3
    • For claude: default sonnet, alternatives: opus, haiku
    • For cursor: run cursor-agent models first to see your account's available models (availability varies by subscription)
    • Accept any model string the user provides
  3. Max review rounds for the plan? (default: 10)

    • If the user does not provide a value, set MAX_ROUNDS=10.

Store the chosen REVIEWER_CLI, REVIEWER_MODEL, and MAX_ROUNDS for Phase 6 (Iterative Plan Review).
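The defaulting behavior described above can be sketched as follows. This is illustrative only; the model names mirror the defaults listed in this phase, and the `skip` fallback when nothing was chosen is an assumption:

```shell
# Hypothetical defaulting logic for the reviewer configuration.
REVIEWER_CLI="${REVIEWER_CLI:-skip}"   # assume skip if the user chose nothing
MAX_ROUNDS="${MAX_ROUNDS:-10}"         # default max review rounds
case "$REVIEWER_CLI" in
  codex)  REVIEWER_MODEL="${REVIEWER_MODEL:-o4-mini}" ;;
  claude) REVIEWER_MODEL="${REVIEWER_MODEL:-sonnet}" ;;
  cursor) ;;  # no default: ask the user after running `cursor-agent models`
  skip)   REVIEWER_MODEL="" ;;
esac
```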

Phase 4: Design (REQUIRED SUB-SKILL)

Invoke superpowers:brainstorming, then propose 2-3 approaches and recommend one.

Phase 5: Plan (REQUIRED SUB-SKILL)

Invoke superpowers:writing-plans, then break work into milestones and bite-sized stories.

Phase 6: Iterative Plan Review

Send the plan to the configured reviewer CLI for feedback. Revise and re-submit until approved (default max 10 rounds).

Skip this phase entirely if reviewer was set to skip.

Step 1: Generate Session ID

REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)

Use for temp artifacts:

  • /tmp/plan-${REVIEW_ID}.md - plan payload
  • /tmp/plan-review-${REVIEW_ID}.md - normalized review text presented to the user
  • /tmp/plan-review-${REVIEW_ID}.json - raw Cursor JSON (only for cursor)
  • /tmp/plan-review-${REVIEW_ID}.stderr - reviewer stderr
  • /tmp/plan-review-${REVIEW_ID}.status - helper heartbeat/status log
  • /tmp/plan-review-${REVIEW_ID}.runner.out - helper-managed stdout from the reviewer command process
  • /tmp/plan-review-${REVIEW_ID}.sh - reviewer command script

Resolve the shared reviewer helper from the installed Codex skills directory:

REVIEWER_RUNTIME=~/.codex/skills/reviewer-runtime/run-review.sh

Set helper success-artifact args before writing the command script:

HELPER_SUCCESS_FILE_ARGS=()
case "$REVIEWER_CLI" in
  codex)
    HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/plan-review-${REVIEW_ID}.md)
    ;;
  cursor)
    HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/plan-review-${REVIEW_ID}.json)
    ;;
esac

Step 2: Write Plan to Temp File

Write the complete plan (milestones, stories, design decisions, specs) to /tmp/plan-${REVIEW_ID}.md.
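One way to materialize that payload is a heredoc. The skeleton headings below are illustrative, not a required format, and `PLAN_BODY` is a hypothetical variable standing in for the assembled plan text:

```shell
# Sketch: write the plan payload for the reviewer to a temp file.
REVIEW_ID="${REVIEW_ID:-example}"   # normally set in Step 1
PLAN_BODY="${PLAN_BODY:-}"          # hypothetical: the assembled plan text
cat > "/tmp/plan-${REVIEW_ID}.md" <<EOF
# Implementation Plan

## Design Decisions
## Milestones
## Stories

${PLAN_BODY}
EOF
```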

Review Contract (Applies to Every Round)

The reviewer response must use this structure:

## Summary
...

## Findings
### P0
- ...
### P1
- ...
### P2
- ...
### P3
- ...

## Verdict
VERDICT: APPROVED

Rules:

  • Order findings from P0 to P3.
  • P0 = total blocker, P1 = major risk, P2 = must-fix before approval, P3 = cosmetic / nice to have.
  • Use - None. when a severity has no findings.
  • VERDICT: APPROVED is allowed only when no P0, P1, or P2 findings remain. P3 findings are non-blocking.
  • The calling agent should still try to fix P3 findings when they are cheap and safe.

Liveness Contract (Applies While Review Is Running)

  • The shared reviewer runtime emits state=in-progress note="In progress N" heartbeats every 60 seconds while the reviewer child is alive.
  • The calling agent must keep waiting as long as a fresh In progress N heartbeat keeps arriving roughly once per minute.
  • Do not abort just because the review is slow, a soft timeout fired, or a stall-warning line appears, as long as the In progress N heartbeat continues.
  • Treat missing heartbeats, state=failed, state=completed-empty-output, and state=needs-operator-decision as escalation signals.
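A liveness check against the helper's status file might look like this. The `state=` line format follows the contract above, but the `heartbeat_fresh` helper and its 90-second staleness window are assumptions, not part of the runtime:

```shell
# Sketch: is the reviewer still alive per the status-file heartbeat?
STATUS_FILE="/tmp/plan-review-${REVIEW_ID:-example}.status"
heartbeat_fresh() {
  local now mtime
  [ -f "$STATUS_FILE" ] || return 1
  # Last status line must report an in-progress heartbeat.
  tail -n 1 "$STATUS_FILE" | grep -q 'state=in-progress' || return 1
  now=$(date +%s)
  # GNU stat vs BSD stat mtime flags.
  mtime=$(stat -c %Y "$STATUS_FILE" 2>/dev/null || stat -f %m "$STATUS_FILE")
  # Treat heartbeats older than ~90s as stale (assumed window).
  [ $((now - mtime)) -le 90 ]
}
```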

Step 3: Submit to Reviewer (Round 1)

Write the reviewer invocation to /tmp/plan-review-${REVIEW_ID}.sh as a bash script:

#!/usr/bin/env bash
set -euo pipefail

If REVIEWER_CLI is codex:

codex exec \
  -m ${REVIEWER_MODEL} \
  -s read-only \
  -o /tmp/plan-review-${REVIEW_ID}.md \
  "Review the implementation plan in /tmp/plan-${REVIEW_ID}.md. Focus on:
1. Correctness — Will this plan achieve the stated goals?
2. Risks — What could go wrong? Edge cases? Data loss?
3. Missing steps — Is anything forgotten?
4. Alternatives — Is there a simpler or better approach?
5. Security — Any security concerns?

Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking."

Do not try to capture the Codex session ID yet. When using the helper, extract it from /tmp/plan-review-${REVIEW_ID}.runner.out after the command completes (look for session id: <uuid>), then store it as CODEX_SESSION_ID for resume in subsequent rounds.
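The extraction could be sketched like this. The `session id: <uuid>` line is what this step describes, but the exact surrounding log text may vary by codex version, and `extract_codex_session_id` is a hypothetical helper name:

```shell
# Sketch: pull the first UUID after "session id:" out of the runner output.
extract_codex_session_id() {
  grep -oE 'session id: [0-9a-f-]+' "$1" 2>/dev/null | head -n 1 | awk '{print $3}'
}
CODEX_SESSION_ID=$(extract_codex_session_id \
  "/tmp/plan-review-${REVIEW_ID:-example}.runner.out" || true)
```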

If REVIEWER_CLI is claude:

claude -p \
  "Review the implementation plan below. Focus on:

$(cat /tmp/plan-${REVIEW_ID}.md)

1. Correctness — Will this plan achieve the stated goals?
2. Risks — What could go wrong? Edge cases? Data loss?
3. Missing steps — Is anything forgotten?
4. Alternatives — Is there a simpler or better approach?
5. Security — Any security concerns?

Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking." \
  --model ${REVIEWER_MODEL} \
  --strict-mcp-config \
  --setting-sources user

If REVIEWER_CLI is cursor:

cursor-agent -p \
  --mode=ask \
  --model ${REVIEWER_MODEL} \
  --trust \
  --output-format json \
  "Read the file /tmp/plan-${REVIEW_ID}.md and review the implementation plan. Focus on:
1. Correctness — Will this plan achieve the stated goals?
2. Risks — What could go wrong? Edge cases? Data loss?
3. Missing steps — Is anything forgotten?
4. Alternatives — Is there a simpler or better approach?
5. Security — Any security concerns?

Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking." \
  > /tmp/plan-review-${REVIEW_ID}.json

For cursor, the command script writes raw JSON to /tmp/plan-review-${REVIEW_ID}.json. Do not run jq extraction until after the helper or fallback execution completes. If jq is not installed, inform the user: brew install jq (macOS) or equivalent.

Run the command script through the shared helper when available:

if [ -x "$REVIEWER_RUNTIME" ]; then
  "$REVIEWER_RUNTIME" \
    --command-file /tmp/plan-review-${REVIEW_ID}.sh \
    --stdout-file /tmp/plan-review-${REVIEW_ID}.runner.out \
    --stderr-file /tmp/plan-review-${REVIEW_ID}.stderr \
    --status-file /tmp/plan-review-${REVIEW_ID}.status \
    "${HELPER_SUCCESS_FILE_ARGS[@]}"
else
  echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
  bash /tmp/plan-review-${REVIEW_ID}.sh >/tmp/plan-review-${REVIEW_ID}.runner.out 2>/tmp/plan-review-${REVIEW_ID}.stderr
fi

Run the helper in the foreground and watch its live stdout for state=in-progress heartbeats. If your agent environment buffers command output until exit, start the helper in the background and poll /tmp/plan-review-${REVIEW_ID}.status separately instead of treating heartbeats as post-hoc-only data.

After the command completes:

  • If REVIEWER_CLI=cursor, extract the final review text:
CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/plan-review-${REVIEW_ID}.json)
jq -r '.result' /tmp/plan-review-${REVIEW_ID}.json > /tmp/plan-review-${REVIEW_ID}.md
  • If REVIEWER_CLI=codex, extract CODEX_SESSION_ID from /tmp/plan-review-${REVIEW_ID}.runner.out after the helper or fallback run. If the review text is only in .runner.out, move or copy the actual review body into /tmp/plan-review-${REVIEW_ID}.md before verdict parsing.
  • If REVIEWER_CLI=claude, promote stdout captured by the helper or fallback runner into the markdown review file:
cp /tmp/plan-review-${REVIEW_ID}.runner.out /tmp/plan-review-${REVIEW_ID}.md

Fallback is allowed only when the helper is missing or not executable.

Step 4: Read Review & Check Verdict

  1. Read /tmp/plan-review-${REVIEW_ID}.md
  2. If the review failed, produced empty output, or reached helper timeout, also read:
    • /tmp/plan-review-${REVIEW_ID}.stderr
    • /tmp/plan-review-${REVIEW_ID}.status
    • /tmp/plan-review-${REVIEW_ID}.runner.out
  3. Present review to the user:
## Plan Review — Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})

[Reviewer feedback]
  4. While the reviewer is still running, keep waiting as long as fresh state=in-progress note="In progress N" heartbeats continue to appear roughly once per minute.
  5. Check verdict:
    • VERDICT: APPROVED with no P0, P1, or P2 findings → proceed to Phase 7 (Initialize workspace)
    • VERDICT: APPROVED with only P3 findings → optionally fix the P3 items if they are cheap and safe, then proceed
    • VERDICT: REVISE or any P0, P1, or P2 finding → go to Step 5
    • No clear verdict but P0, P1, and P2 are all - None. → treat as approved
    • Helper state completed-empty-output → treat as failed review attempt, surface stderr/status, fix invocation or prompt handling, then retry
    • Helper state needs-operator-decision → surface status log and decide whether to extend the timeout, abort, or retry with different helper parameters
    • Max rounds (MAX_ROUNDS) reached → present the outcome to the user for a manual decision (proceed or stop)
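A rough verdict parse over the normalized review file could look like this. It assumes the reviewer honored the contract above (exact `VERDICT:` lines); the `review_verdict` function is illustrative, and the P0/P1/P2 severity bullets still need to be confirmed by reading the findings themselves:

```shell
# Sketch: classify the review outcome from the normalized review file.
review_verdict() {
  local f="$1"
  if grep -q '^VERDICT: APPROVED' "$f"; then
    echo approved   # still confirm no P0/P1/P2 findings remain
  elif grep -q '^VERDICT: REVISE' "$f"; then
    echo revise
  else
    echo unclear    # fall back to the manual rules above
  fi
}
```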

Step 5: Revise the Plan

Address the reviewer findings in priority order (P0, then P1, then P2, then P3 when practical). Update the plan in conversation context and rewrite /tmp/plan-${REVIEW_ID}.md.

Summarize revisions for the user:

### Revisions (Round N)
- [Change and reason, one bullet per issue addressed]

If a revision contradicts the user's explicit requirements, skip it and note it for the user.

Step 6: Re-submit to Reviewer (Rounds 2-N)

Rewrite /tmp/plan-review-${REVIEW_ID}.sh for the next round. The script should contain the reviewer invocation only; do not run it directly.

If REVIEWER_CLI is codex:

Resume the existing session:

codex exec resume ${CODEX_SESSION_ID} \
  -o /tmp/plan-review-${REVIEW_ID}.md \
  "I've revised the plan based on your feedback. Updated plan is in /tmp/plan-${REVIEW_ID}.md.

Changes made:
[List specific changes]

Re-review using the same `## Summary`, `## Findings`, and `## Verdict` structure as before.
Keep findings ordered `P0` to `P3`, use `- None.` when a severity has no findings, and only use `VERDICT: APPROVED` when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking."

If resume fails (session expired), fall back to fresh codex exec with context about prior rounds.

If REVIEWER_CLI is claude:

Fresh call with accumulated context (Claude CLI has no session resume):

claude -p \
  "You previously reviewed an implementation plan and requested revisions.

Previous feedback summary: [key points from last review]

I've revised the plan. Updated version is below.

$(cat /tmp/plan-${REVIEW_ID}.md)

Changes made:
[List specific changes]

Re-review the full plan using the same `## Summary`, `## Findings`, and `## Verdict` structure as before.
Keep findings ordered `P0` to `P3`, use `- None.` when a severity has no findings, and only use `VERDICT: APPROVED` when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking." \
  --model ${REVIEWER_MODEL} \
  --strict-mcp-config \
  --setting-sources user

If REVIEWER_CLI is cursor:

Resume the existing session:

cursor-agent --resume ${CURSOR_SESSION_ID} -p \
  --mode=ask \
  --model ${REVIEWER_MODEL} \
  --trust \
  --output-format json \
  "I've revised the plan based on your feedback. Updated plan is in /tmp/plan-${REVIEW_ID}.md.

Changes made:
[List specific changes]

Re-review using the same `## Summary`, `## Findings`, and `## Verdict` structure as before.
Keep findings ordered `P0` to `P3`, use `- None.` when a severity has no findings, and only use `VERDICT: APPROVED` when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking." \
  > /tmp/plan-review-${REVIEW_ID}.json

jq -r '.result' /tmp/plan-review-${REVIEW_ID}.json > /tmp/plan-review-${REVIEW_ID}.md

If resume fails, fall back to fresh cursor-agent -p with context about prior rounds.

After updating /tmp/plan-review-${REVIEW_ID}.sh, run the same helper/fallback flow from Round 1.

Return to Step 4.

Step 7: Present Final Result

## Plan Review — Final (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})

**Status:** Approved after N round(s)
[or]
**Status:** Max rounds (`MAX_ROUNDS`) reached — not fully approved

[Final feedback / remaining concerns]

Step 8: Cleanup

rm -f /tmp/plan-${REVIEW_ID}.md \
  /tmp/plan-review-${REVIEW_ID}.md \
  /tmp/plan-review-${REVIEW_ID}.json \
  /tmp/plan-review-${REVIEW_ID}.stderr \
  /tmp/plan-review-${REVIEW_ID}.status \
  /tmp/plan-review-${REVIEW_ID}.runner.out \
  /tmp/plan-review-${REVIEW_ID}.sh

If the round failed, produced empty output, or reached operator-decision timeout, keep .stderr, .status, and .runner.out until the issue is diagnosed instead of deleting them immediately.

Phase 7: Initialize Local Plan Workspace (MANDATORY)

At project root:

  1. Ensure ai_plan/ exists. Create it if missing.
  2. Ensure .gitignore contains /ai_plan/.
  3. If .gitignore was changed, commit that change immediately (local commit only).

Recommended commit message:

  • chore(gitignore): ignore ai_plan local planning artifacts
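Steps 1 and 2 above are idempotent and can be sketched as one helper; the commit itself is left to the agent. The `ensure_ai_plan_workspace` function name and the root-directory argument are illustrative:

```shell
# Sketch: make sure ai_plan/ exists and .gitignore covers it (idempotent).
ensure_ai_plan_workspace() {
  local root="${1:-.}"
  mkdir -p "$root/ai_plan"
  touch "$root/.gitignore"
  # Append the ignore rule only if it is not already present as a full line.
  grep -qx '/ai_plan/' "$root/.gitignore" || echo '/ai_plan/' >> "$root/.gitignore"
}
```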

Phase 8: Generate Plan Files (MANDATORY)

Create ai_plan/YYYY-MM-DD-<short-title>/ with all files below:

  1. original-plan.md - copy of original planner-generated plan.
  2. final-transcript.md - copy of final planning transcript used to reach approved plan.
  3. milestone-plan.md - full implementation spec (from template).
  4. story-tracker.md - story/milestone status tracker (from template).
  5. continuation-runbook.md - execution instructions and context (from template).

Use templates from this skill's templates/ folder.
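The folder layout above can be scaffolded as follows. This is only a sketch: it creates empty placeholders, whereas the real flow fills each file from the skill's templates/ folder, and `scaffold_plan_dir` is a hypothetical helper:

```shell
# Sketch: create ai_plan/YYYY-MM-DD-<short-title>/ with the five required files.
scaffold_plan_dir() {
  local title="$1"
  local dir="ai_plan/$(date +%F)-${title}"
  local f
  mkdir -p "$dir"
  for f in original-plan.md final-transcript.md milestone-plan.md \
           story-tracker.md continuation-runbook.md; do
    : > "$dir/$f"   # placeholder; copy from templates/ in real use
  done
  echo "$dir"
}
```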

Phase 9: Handoff

Always instruct the executing agent:

Read ai_plan/YYYY-MM-DD-<short-title>/continuation-runbook.md first, then execute from that folder.

Do not rely on planner-private files during implementation.

Phase 10: Telegram Completion Notification (MANDATORY)

Resolve the Telegram notifier helper from the installed Codex skills directory:

TELEGRAM_NOTIFY_RUNTIME=~/.codex/skills/reviewer-runtime/notify-telegram.sh

On every terminal outcome for the create-plan run (approved, max rounds reached, skipped reviewer, or failure), send a Telegram summary if the helper exists and both TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID are configured:

if [ -x "$TELEGRAM_NOTIFY_RUNTIME" ] && [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
  "$TELEGRAM_NOTIFY_RUNTIME" --message "create-plan completed for <plan-folder-name>: <status summary>"
fi

Rules:

  • Telegram is the only supported completion notification path. Do not use desktop notifications, the `say` command, email, or any other notifier.
  • Notification failures are non-blocking, but they must be surfaced to the user.
  • If Telegram is not configured, state that no completion notification was sent.

Quick Reference

| Phase | Action | Required Output |
| --- | --- | --- |
| 1 | Analyze codebase/context | Constraints and known patterns |
| 2 | Gather requirements (one question at a time) | Confirmed scope and success criteria |
| 3 | Configure reviewer CLI and model | REVIEWER_CLI, REVIEWER_MODEL, MAX_ROUNDS (or skip) |
| 4 | Invoke superpowers:brainstorming | Chosen design approach |
| 5 | Invoke superpowers:writing-plans | Milestones and bite-sized stories |
| 6 | Iterative plan review (max MAX_ROUNDS rounds) | Reviewer approval or max-rounds warning |
| 7 | Initialize ai_plan/ + .gitignore | Local planning workspace ready |
| 8 | Build plan package from templates | Full plan folder with required files |
| 9 | Handoff with runbook-first instruction | Resumable execution context |
| 10 | Send Telegram completion notification | User notified or notification status reported |

Execution Rules to Include in Plan (MANDATORY)

  • Run lint/typecheck/tests after each milestone.
  • Prefer linting changed files only for speed.
  • Commit locally after each completed milestone (do not push).
  • Stop and ask user for feedback.
  • Apply feedback, rerun checks, and commit again.
  • Move to next milestone only after user approval.
  • After all milestones are completed and approved, ask permission to push.
  • Only after approved push: mark plan as completed.

Gitignore Note

ai_plan/ is intentionally local and must stay gitignored. Do not treat inability to commit plan-file updates in ai_plan/ as a problem.

Common Mistakes

  • Using deprecated commands like superpowers-codex bootstrap or superpowers-codex use-skill.
  • Jumping to implementation planning without running superpowers:brainstorming first.
  • Asking multiple requirement questions in one message.
  • Forgetting to create/update .gitignore for /ai_plan/.
  • Omitting one or more required files in the plan package.
  • Handoff without explicit "read runbook first" direction.
  • Skipping the reviewer phase without explicit user opt-out.
  • Not capturing the Codex session ID for resume in subsequent review rounds.
  • Using any completion notification path other than Telegram.

Rationalizations and Counters

| Rationalization | Counter |
| --- | --- |
| "Bootstrap CLI is faster" | Deprecated for Codex; native discovery is the supported path. |
| "I can skip brainstorming for small tasks" | Creative/planning work still requires design validation first. |
| "I don't need update_plan for checklist skills" | Checklist tracking is mandatory for execution reliability. |
| "I can keep plan files outside ai_plan/" | This skill standardizes local resumable planning under ai_plan/. |
| "The reviewer approved, I can skip my own validation" | Reviewer feedback supplements but does not replace your own verification. |

Red Flags - Stop and Correct

  • You are about to run any superpowers-codex command.
  • You started writing milestones before design validation.
  • You did not announce which skill you invoked and why.
  • You are marking planning complete without all required files.
  • Handoff does not explicitly point to continuation-runbook.md.
  • You are applying a reviewer suggestion that contradicts user requirements.

Verification Checklist

  • ai_plan/ exists at project root
  • .gitignore includes /ai_plan/
  • .gitignore ignore-rule commit was created if needed
  • Plan directory created under ai_plan/YYYY-MM-DD-<short-title>/
  • Reviewer configured or explicitly skipped
  • Max review rounds confirmed (default: 10)
  • Plan review completed (approved or max rounds) — or skipped
  • original-plan.md present
  • final-transcript.md present
  • milestone-plan.md present
  • story-tracker.md present
  • continuation-runbook.md present
  • Handoff explicitly says to read runbook first and execute from plan folder
  • Telegram completion notification attempted if configured