ai-coding-skills/skills/create-plan/codex/SKILL.md
name: create-plan
description: Use when a user asks to create or maintain a structured implementation plan in Codex, including milestones, bite-sized stories, and resumable local planning artifacts under ai_plan.

Create Plan (Codex Native Superpowers)

Create and maintain a local plan workspace under ai_plan/ at project root.

Overview

This skill wraps the current Superpowers flow for Codex:

  1. Design first with superpowers:brainstorming
  2. Then build an implementation plan with superpowers:writing-plans
  3. Review the plan iteratively with a second model/provider
  4. Persist a local execution package in ai_plan/YYYY-MM-DD-<short-title>/

Core principle: Codex uses native skill discovery from ~/.agents/skills/. Do not use deprecated superpowers-codex bootstrap or use-skill CLI commands.

Prerequisite Check (MANDATORY)

Required:

  • Superpowers skills symlink: ~/.agents/skills/superpowers -> ~/.codex/superpowers/skills
  • superpowers:brainstorming
  • superpowers:writing-plans

Verify before proceeding:

test -L ~/.agents/skills/superpowers
test -f ~/.agents/skills/superpowers/brainstorming/SKILL.md
test -f ~/.agents/skills/superpowers/writing-plans/SKILL.md

If any dependency is missing, stop and return:

Missing dependency: native Superpowers skills are required (superpowers:brainstorming, superpowers:writing-plans). Ensure ~/.agents/skills/superpowers is configured, then retry.
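The checks above can be sketched as one gate. This is a minimal illustration, not part of the skill; the `check_superpowers_skills` function name and the directory-argument override are hypothetical conveniences:

```shell
# Hypothetical wrapper around the prerequisite checks above.
# Accepts an optional directory override (for testing); defaults to
# the ~/.agents/skills/superpowers layout this skill expects.
check_superpowers_skills() {
  local dir="${1:-$HOME/.agents/skills/superpowers}"
  local skill
  [ -e "$dir" ] || { echo "Missing dependency: $dir" >&2; return 1; }
  for skill in brainstorming writing-plans; do
    if [ ! -f "$dir/$skill/SKILL.md" ]; then
      echo "Missing dependency: superpowers:$skill" >&2
      return 1
    fi
  done
}

if ! check_superpowers_skills; then
  echo "Missing dependency: native Superpowers skills are required (superpowers:brainstorming, superpowers:writing-plans). Ensure ~/.agents/skills/superpowers is configured, then retry." >&2
fi
```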

Required Skill Invocation Rules

  • Invoke relevant skills through native discovery (no CLI wrapper).
  • Announce skill usage explicitly:
    • I've read the [Skill Name] skill and I'm using it to [purpose].
  • For skills with checklists, track checklist items with update_plan todos.
  • Tool mapping for Codex:
    • TodoWrite -> update_plan
    • Task subagents -> unavailable in Codex; do the work directly and state the limitation
    • Skill -> use native skill discovery from ~/.agents/skills/

Process

Phase 1: Analyze

  • Explore the codebase and existing patterns.

Phase 2: Gather Requirements

  • Ask questions one at a time until the user says they are ready.
  • Confirm scope, constraints, success criteria, dependencies.

Phase 3: Configure Reviewer

If the user has already specified a reviewer CLI and model (e.g., "create a plan, review with claude sonnet"), use those values. Otherwise, ask:

  1. Which CLI should review the plan?

    • codex — OpenAI Codex CLI (codex exec)
    • claude — Claude Code CLI (claude -p)
    • cursor — Cursor Agent CLI (cursor-agent -p)
    • skip — No external review, proceed directly to file generation
  2. Which model? (only if a CLI was chosen)

    • For codex: default o4-mini, alternatives: gpt-5.3-codex, o3
    • For claude: default sonnet, alternatives: opus, haiku
    • For cursor: run cursor-agent models first to see your account's available models (availability varies by subscription)
    • Accept any model string the user provides
  3. Max review rounds for the plan? (default: 10)

    • If the user does not provide a value, set MAX_ROUNDS=10.

Store the chosen REVIEWER_CLI, REVIEWER_MODEL, and MAX_ROUNDS for Phase 6 (Iterative Plan Review).
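The defaulting behavior described above can be sketched as follows. This is illustrative only; the model names mirror the defaults listed in this phase, and the `skip` fallback when nothing was chosen is an assumption:

```shell
# Hypothetical defaulting logic for the reviewer configuration.
REVIEWER_CLI="${REVIEWER_CLI:-skip}"   # assume skip if the user chose nothing
MAX_ROUNDS="${MAX_ROUNDS:-10}"         # default max review rounds
case "$REVIEWER_CLI" in
  codex)  REVIEWER_MODEL="${REVIEWER_MODEL:-o4-mini}" ;;
  claude) REVIEWER_MODEL="${REVIEWER_MODEL:-sonnet}" ;;
  cursor) ;;  # no default: ask the user after running `cursor-agent models`
  skip)   REVIEWER_MODEL="" ;;
esac
```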

Phase 4: Design (REQUIRED SUB-SKILL)

Invoke superpowers:brainstorming, then propose 2-3 approaches and recommend one.

Phase 5: Plan (REQUIRED SUB-SKILL)

Invoke superpowers:writing-plans, then break work into milestones and bite-sized stories.

Phase 6: Iterative Plan Review

Send the plan to the configured reviewer CLI for feedback. Revise and re-submit until approved (default max 10 rounds).

Skip this phase entirely if reviewer was set to skip.

Step 1: Generate Session ID

REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)

Use for temp artifacts:

  • /tmp/plan-${REVIEW_ID}.md - plan payload
  • /tmp/plan-review-${REVIEW_ID}.md - normalized review text presented to the user
  • /tmp/plan-review-${REVIEW_ID}.json - raw Cursor JSON (only for cursor)
  • /tmp/plan-review-${REVIEW_ID}.stderr - reviewer stderr
  • /tmp/plan-review-${REVIEW_ID}.status - helper heartbeat/status log
  • /tmp/plan-review-${REVIEW_ID}.runner.out - helper-managed stdout from the reviewer command process
  • /tmp/plan-review-${REVIEW_ID}.sh - reviewer command script

Resolve the shared reviewer helper from the installed Codex skills directory:

REVIEWER_RUNTIME=~/.codex/skills/reviewer-runtime/run-review.sh

Set helper success-artifact args before writing the command script:

HELPER_SUCCESS_FILE_ARGS=()
case "$REVIEWER_CLI" in
  codex)
    HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/plan-review-${REVIEW_ID}.md)
    ;;
  cursor)
    HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/plan-review-${REVIEW_ID}.json)
    ;;
esac

Step 2: Write Plan to Temp File

Write the complete plan (milestones, stories, design decisions, specs) to /tmp/plan-${REVIEW_ID}.md.
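One way to materialize that payload is a heredoc. The skeleton headings below are illustrative, not a required format, and `PLAN_BODY` is a hypothetical variable standing in for the assembled plan text:

```shell
# Sketch: write the plan payload for the reviewer to a temp file.
REVIEW_ID="${REVIEW_ID:-example}"   # normally set in Step 1
PLAN_BODY="${PLAN_BODY:-}"          # hypothetical: the assembled plan text
cat > "/tmp/plan-${REVIEW_ID}.md" <<EOF
# Implementation Plan

## Design Decisions
## Milestones
## Stories

${PLAN_BODY}
EOF
```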

Review Contract (Applies to Every Round)

The reviewer response must use this structure:

## Summary
...

## Findings
### P0
- ...
### P1
- ...
### P2
- ...
### P3
- ...

## Verdict
VERDICT: APPROVED

Rules:

  • Order findings from P0 to P3.
  • P0 = total blocker, P1 = major risk, P2 = must-fix before approval, P3 = cosmetic / nice to have.
  • Use - None. when a severity has no findings.
  • VERDICT: APPROVED is allowed only when no P0, P1, or P2 findings remain. P3 findings are non-blocking.
  • The calling agent should still try to fix P3 findings when they are cheap and safe.

Liveness Contract (Applies While Review Is Running)

  • The shared reviewer runtime emits state=in-progress note="In progress N" heartbeats every 60 seconds while the reviewer child is alive.
  • The calling agent must keep waiting as long as a fresh In progress N heartbeat keeps arriving roughly once per minute.
  • Do not abort just because the review is slow, a soft timeout fired, or a stall-warning line appears, as long as the In progress N heartbeat continues.
  • Treat missing heartbeats, state=failed, state=completed-empty-output, and state=needs-operator-decision as escalation signals.
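A liveness check against the helper's status file might look like this. The `state=` line format follows the contract above, but the `heartbeat_fresh` helper and its 90-second staleness window are assumptions, not part of the runtime:

```shell
# Sketch: is the reviewer still alive per the status-file heartbeat?
STATUS_FILE="/tmp/plan-review-${REVIEW_ID:-example}.status"
heartbeat_fresh() {
  local now mtime
  [ -f "$STATUS_FILE" ] || return 1
  # Last status line must report an in-progress heartbeat.
  tail -n 1 "$STATUS_FILE" | grep -q 'state=in-progress' || return 1
  now=$(date +%s)
  # GNU stat vs BSD stat mtime flags.
  mtime=$(stat -c %Y "$STATUS_FILE" 2>/dev/null || stat -f %m "$STATUS_FILE")
  # Treat heartbeats older than ~90s as stale (assumed window).
  [ $((now - mtime)) -le 90 ]
}
```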

Step 3: Submit to Reviewer (Round 1)

Write the reviewer invocation to /tmp/plan-review-${REVIEW_ID}.sh as a bash script:

#!/usr/bin/env bash
set -euo pipefail

If REVIEWER_CLI is codex:

codex exec \
  -m ${REVIEWER_MODEL} \
  -s read-only \
  -o /tmp/plan-review-${REVIEW_ID}.md \
  "Review the implementation plan in /tmp/plan-${REVIEW_ID}.md. Focus on:
1. Correctness — Will this plan achieve the stated goals?
2. Risks — What could go wrong? Edge cases? Data loss?
3. Missing steps — Is anything forgotten?
4. Alternatives — Is there a simpler or better approach?
5. Security — Any security concerns?

Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking."

Do not try to capture the Codex session ID yet. When using the helper, extract it from /tmp/plan-review-${REVIEW_ID}.runner.out after the command completes (look for session id: <uuid>), then store it as CODEX_SESSION_ID for resume in subsequent rounds.
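The extraction could be sketched like this. The `session id: <uuid>` line is what this step describes, but the exact surrounding log text may vary by codex version, and `extract_codex_session_id` is a hypothetical helper name:

```shell
# Sketch: pull the first UUID after "session id:" out of the runner output.
extract_codex_session_id() {
  grep -oE 'session id: [0-9a-f-]+' "$1" 2>/dev/null | head -n 1 | awk '{print $3}'
}
CODEX_SESSION_ID=$(extract_codex_session_id \
  "/tmp/plan-review-${REVIEW_ID:-example}.runner.out" || true)
```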

If REVIEWER_CLI is claude:

claude -p \
  "Review the implementation plan below. Focus on:

$(cat /tmp/plan-${REVIEW_ID}.md)

1. Correctness — Will this plan achieve the stated goals?
2. Risks — What could go wrong? Edge cases? Data loss?
3. Missing steps — Is anything forgotten?
4. Alternatives — Is there a simpler or better approach?
5. Security — Any security concerns?

Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking." \
  --model ${REVIEWER_MODEL} \
  --strict-mcp-config \
  --setting-sources user

If REVIEWER_CLI is cursor:

cursor-agent -p \
  --mode=ask \
  --model ${REVIEWER_MODEL} \
  --trust \
  --output-format json \
  "Read the file /tmp/plan-${REVIEW_ID}.md and review the implementation plan. Focus on:
1. Correctness — Will this plan achieve the stated goals?
2. Risks — What could go wrong? Edge cases? Data loss?
3. Missing steps — Is anything forgotten?
4. Alternatives — Is there a simpler or better approach?
5. Security — Any security concerns?

Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking." \
  > /tmp/plan-review-${REVIEW_ID}.json

For cursor, the command script writes raw JSON to /tmp/plan-review-${REVIEW_ID}.json. Do not run jq extraction until after the helper or fallback execution completes. If jq is not installed, inform the user: brew install jq (macOS) or equivalent.

Run the command script through the shared helper when available:

if [ -x "$REVIEWER_RUNTIME" ]; then
  "$REVIEWER_RUNTIME" \
    --command-file /tmp/plan-review-${REVIEW_ID}.sh \
    --stdout-file /tmp/plan-review-${REVIEW_ID}.runner.out \
    --stderr-file /tmp/plan-review-${REVIEW_ID}.stderr \
    --status-file /tmp/plan-review-${REVIEW_ID}.status \
    "${HELPER_SUCCESS_FILE_ARGS[@]}"
else
  echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
  bash /tmp/plan-review-${REVIEW_ID}.sh >/tmp/plan-review-${REVIEW_ID}.runner.out 2>/tmp/plan-review-${REVIEW_ID}.stderr
fi

Run the helper in the foreground and watch its live stdout for state=in-progress heartbeats. If your agent environment buffers command output until exit, start the helper in the background and poll /tmp/plan-review-${REVIEW_ID}.status separately instead of treating heartbeats as post-hoc-only data.

After the command completes:

  • If REVIEWER_CLI=cursor, extract the final review text:
CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/plan-review-${REVIEW_ID}.json)
jq -r '.result' /tmp/plan-review-${REVIEW_ID}.json > /tmp/plan-review-${REVIEW_ID}.md
  • If REVIEWER_CLI=codex, extract CODEX_SESSION_ID from /tmp/plan-review-${REVIEW_ID}.runner.out after the helper or fallback run. If the review text is only in .runner.out, move or copy the actual review body into /tmp/plan-review-${REVIEW_ID}.md before verdict parsing.
  • If REVIEWER_CLI=claude, promote stdout captured by the helper or fallback runner into the markdown review file:
cp /tmp/plan-review-${REVIEW_ID}.runner.out /tmp/plan-review-${REVIEW_ID}.md

Fallback is allowed only when the helper is missing or not executable.

Step 4: Read Review & Check Verdict

  1. Read /tmp/plan-review-${REVIEW_ID}.md
  2. If the review failed, produced empty output, or reached helper timeout, also read:
    • /tmp/plan-review-${REVIEW_ID}.stderr
    • /tmp/plan-review-${REVIEW_ID}.status
    • /tmp/plan-review-${REVIEW_ID}.runner.out
  3. Present review to the user:
## Plan Review — Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})

[Reviewer feedback]
  4. While the reviewer is still running, keep waiting as long as fresh state=in-progress note="In progress N" heartbeats continue to appear roughly once per minute.
  5. Check verdict:
    • VERDICT: APPROVED with no P0, P1, or P2 findings → proceed to Phase 7 (Initialize workspace)
    • VERDICT: APPROVED with only P3 findings → optionally fix the P3 items if they are cheap and safe, then proceed
    • VERDICT: REVISE or any P0, P1, or P2 finding → go to Step 5
    • No clear verdict but P0, P1, and P2 are all - None. → treat as approved
    • Helper state completed-empty-output → treat as failed review attempt, surface stderr/status, fix invocation or prompt handling, then retry
    • Helper state needs-operator-decision → surface status log and decide whether to extend the timeout, abort, or retry with different helper parameters
    • Max rounds (MAX_ROUNDS) reached → present the outcome to the user for a manual decision (proceed or stop)
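A rough verdict parse over the normalized review file could look like this. It assumes the reviewer honored the contract above (exact `VERDICT:` lines); the `review_verdict` function is illustrative, and the P0/P1/P2 severity bullets still need to be confirmed by reading the findings themselves:

```shell
# Sketch: classify the review outcome from the normalized review file.
review_verdict() {
  local f="$1"
  if grep -q '^VERDICT: APPROVED' "$f"; then
    echo approved   # still confirm no P0/P1/P2 findings remain
  elif grep -q '^VERDICT: REVISE' "$f"; then
    echo revise
  else
    echo unclear    # fall back to the manual rules above
  fi
}
```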

Step 5: Revise the Plan

Address the reviewer findings in priority order (P0, then P1, then P2, then P3 when practical). Update the plan in conversation context and rewrite /tmp/plan-${REVIEW_ID}.md.

Summarize revisions for the user:

### Revisions (Round N)
- [Change and reason, one bullet per issue addressed]

If a revision contradicts the user's explicit requirements, skip it and note it for the user.

Step 6: Re-submit to Reviewer (Rounds 2-N)

Rewrite /tmp/plan-review-${REVIEW_ID}.sh for the next round. The script should contain the reviewer invocation only; do not run it directly.

If REVIEWER_CLI is codex:

Resume the existing session:

codex exec resume ${CODEX_SESSION_ID} \
  -o /tmp/plan-review-${REVIEW_ID}.md \
  "I've revised the plan based on your feedback. Updated plan is in /tmp/plan-${REVIEW_ID}.md.

Changes made:
[List specific changes]

Re-review using the same `## Summary`, `## Findings`, and `## Verdict` structure as before.
Keep findings ordered `P0` to `P3`, use `- None.` when a severity has no findings, and only use `VERDICT: APPROVED` when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking."

If resume fails (session expired), fall back to fresh codex exec with context about prior rounds.

If REVIEWER_CLI is claude:

Fresh call with accumulated context (Claude CLI has no session resume):

claude -p \
  "You previously reviewed an implementation plan and requested revisions.

Previous feedback summary: [key points from last review]

I've revised the plan. Updated version is below.

$(cat /tmp/plan-${REVIEW_ID}.md)

Changes made:
[List specific changes]

Re-review the full plan using the same `## Summary`, `## Findings`, and `## Verdict` structure as before.
Keep findings ordered `P0` to `P3`, use `- None.` when a severity has no findings, and only use `VERDICT: APPROVED` when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking." \
  --model ${REVIEWER_MODEL} \
  --strict-mcp-config \
  --setting-sources user

If REVIEWER_CLI is cursor:

Resume the existing session:

cursor-agent --resume ${CURSOR_SESSION_ID} -p \
  --mode=ask \
  --model ${REVIEWER_MODEL} \
  --trust \
  --output-format json \
  "I've revised the plan based on your feedback. Updated plan is in /tmp/plan-${REVIEW_ID}.md.

Changes made:
[List specific changes]

Re-review using the same `## Summary`, `## Findings`, and `## Verdict` structure as before.
Keep findings ordered `P0` to `P3`, use `- None.` when a severity has no findings, and only use `VERDICT: APPROVED` when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking." \
  > /tmp/plan-review-${REVIEW_ID}.json

jq -r '.result' /tmp/plan-review-${REVIEW_ID}.json > /tmp/plan-review-${REVIEW_ID}.md

If resume fails, fall back to fresh cursor-agent -p with context about prior rounds.

After updating /tmp/plan-review-${REVIEW_ID}.sh, run the same helper/fallback flow from Round 1.

Return to Step 4.

Step 7: Present Final Result

## Plan Review — Final (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})

**Status:** Approved after N round(s)
[or]
**Status:** Max rounds (`MAX_ROUNDS`) reached — not fully approved

[Final feedback / remaining concerns]

Step 8: Cleanup

rm -f /tmp/plan-${REVIEW_ID}.md \
  /tmp/plan-review-${REVIEW_ID}.md \
  /tmp/plan-review-${REVIEW_ID}.json \
  /tmp/plan-review-${REVIEW_ID}.stderr \
  /tmp/plan-review-${REVIEW_ID}.status \
  /tmp/plan-review-${REVIEW_ID}.runner.out \
  /tmp/plan-review-${REVIEW_ID}.sh

If the round failed, produced empty output, or reached operator-decision timeout, keep .stderr, .status, and .runner.out until the issue is diagnosed instead of deleting them immediately.

Phase 7: Initialize Local Plan Workspace (MANDATORY)

At project root:

  1. Ensure ai_plan/ exists. Create it if missing.
  2. Ensure .gitignore contains /ai_plan/.
  3. If .gitignore was changed, commit that change immediately (local commit only).

Recommended commit message:

  • chore(gitignore): ignore ai_plan local planning artifacts
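Steps 1 and 2 above are idempotent and can be sketched as one helper; the commit itself is left to the agent. The `ensure_ai_plan_workspace` function name and the root-directory argument are illustrative:

```shell
# Sketch: make sure ai_plan/ exists and .gitignore covers it (idempotent).
ensure_ai_plan_workspace() {
  local root="${1:-.}"
  mkdir -p "$root/ai_plan"
  touch "$root/.gitignore"
  # Append the ignore rule only if it is not already present as a full line.
  grep -qx '/ai_plan/' "$root/.gitignore" || echo '/ai_plan/' >> "$root/.gitignore"
}
```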

Phase 8: Generate Plan Files (MANDATORY)

Create ai_plan/YYYY-MM-DD-<short-title>/ with all files below:

  1. original-plan.md - copy of original planner-generated plan.
  2. final-transcript.md - copy of final planning transcript used to reach approved plan.
  3. milestone-plan.md - full implementation spec (from template).
  4. story-tracker.md - story/milestone status tracker (from template).
  5. continuation-runbook.md - execution instructions and context (from template).

Use templates from this skill's templates/ folder.
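The folder layout above can be scaffolded as follows. This is only a sketch: it creates empty placeholders, whereas the real flow fills each file from the skill's templates/ folder, and `scaffold_plan_dir` is a hypothetical helper:

```shell
# Sketch: create ai_plan/YYYY-MM-DD-<short-title>/ with the five required files.
scaffold_plan_dir() {
  local title="$1"
  local dir="ai_plan/$(date +%F)-${title}"
  local f
  mkdir -p "$dir"
  for f in original-plan.md final-transcript.md milestone-plan.md \
           story-tracker.md continuation-runbook.md; do
    : > "$dir/$f"   # placeholder; copy from templates/ in real use
  done
  echo "$dir"
}
```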

Phase 9: Handoff

Always instruct the executing agent:

Read ai_plan/YYYY-MM-DD-<short-title>/continuation-runbook.md first, then execute from that folder.

Do not rely on planner-private files during implementation.

Phase 10: Telegram Completion Notification (MANDATORY)

Resolve the Telegram notifier helper from the installed Codex skills directory:

TELEGRAM_NOTIFY_RUNTIME=~/.codex/skills/reviewer-runtime/notify-telegram.sh

On every terminal outcome for the create-plan run (approved, max rounds reached, skipped reviewer, or failure), send a Telegram summary if the helper exists and both TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID are configured:

if [ -x "$TELEGRAM_NOTIFY_RUNTIME" ] && [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
  "$TELEGRAM_NOTIFY_RUNTIME" --message "create-plan completed for <plan-folder-name>: <status summary>"
fi

Rules:

  • Telegram is the only supported completion notification path. Do not use desktop notifications, the `say` command, email, or any other notifier.
  • Notification failures are non-blocking, but they must be surfaced to the user.
  • If Telegram is not configured, state that no completion notification was sent.

Quick Reference

| Phase | Action | Required Output |
| --- | --- | --- |
| 1 | Analyze codebase/context | Constraints and known patterns |
| 2 | Gather requirements (one question at a time) | Confirmed scope and success criteria |
| 3 | Configure reviewer CLI and model | REVIEWER_CLI, REVIEWER_MODEL, MAX_ROUNDS (or skip) |
| 4 | Invoke superpowers:brainstorming | Chosen design approach |
| 5 | Invoke superpowers:writing-plans | Milestones and bite-sized stories |
| 6 | Iterative plan review (max MAX_ROUNDS rounds) | Reviewer approval or max-rounds warning |
| 7 | Initialize ai_plan/ + .gitignore | Local planning workspace ready |
| 8 | Build plan package from templates | Full plan folder with required files |
| 9 | Handoff with runbook-first instruction | Resumable execution context |
| 10 | Send Telegram completion notification | User notified or notification status reported |

Execution Rules to Include in Plan (MANDATORY)

  • Run lint/typecheck/tests after each milestone.
  • Prefer linting changed files only for speed.
  • Commit locally after each completed milestone (do not push).
  • Stop and ask user for feedback.
  • Apply feedback, rerun checks, and commit again.
  • Move to next milestone only after user approval.
  • After all milestones are completed and approved, ask permission to push.
  • Only after approved push: mark plan as completed.

Gitignore Note

ai_plan/ is intentionally local and must stay gitignored. Do not treat inability to commit plan-file updates in ai_plan/ as a problem.

Common Mistakes

  • Using deprecated commands like superpowers-codex bootstrap or superpowers-codex use-skill.
  • Jumping to implementation planning without running superpowers:brainstorming first.
  • Asking multiple requirement questions in one message.
  • Forgetting to create/update .gitignore for /ai_plan/.
  • Omitting one or more required files in the plan package.
  • Handoff without explicit "read runbook first" direction.
  • Skipping the reviewer phase without explicit user opt-out.
  • Not capturing the Codex session ID for resume in subsequent review rounds.
  • Using any completion notification path other than Telegram.

Rationalizations and Counters

| Rationalization | Counter |
| --- | --- |
| "Bootstrap CLI is faster" | Deprecated for Codex; native discovery is the supported path. |
| "I can skip brainstorming for small tasks" | Creative/planning work still requires design validation first. |
| "I don't need update_plan for checklist skills" | Checklist tracking is mandatory for execution reliability. |
| "I can keep plan files outside ai_plan/" | This skill standardizes local resumable planning under ai_plan/. |
| "The reviewer approved, I can skip my own validation" | Reviewer feedback supplements but does not replace your own verification. |

Red Flags - Stop and Correct

  • You are about to run any superpowers-codex command.
  • You started writing milestones before design validation.
  • You did not announce which skill you invoked and why.
  • You are marking planning complete without all required files.
  • Handoff does not explicitly point to continuation-runbook.md.
  • You are applying a reviewer suggestion that contradicts user requirements.

Verification Checklist

  • ai_plan/ exists at project root
  • .gitignore includes /ai_plan/
  • .gitignore ignore-rule commit was created if needed
  • Plan directory created under ai_plan/YYYY-MM-DD-<short-title>/
  • Reviewer configured or explicitly skipped
  • Max review rounds confirmed (default: 10)
  • Plan review completed (approved or max rounds) — or skipped
  • original-plan.md present
  • final-transcript.md present
  • milestone-plan.md present
  • story-tracker.md present
  • continuation-runbook.md present
  • Handoff explicitly says to read runbook first and execute from plan folder
  • Telegram completion notification attempted if configured