Files
ai-coding-skills/skills/create-plan/codex/SKILL.md
2026-03-24 11:45:58 -05:00

579 lines
22 KiB
Markdown

---
name: create-plan
description: Use when a user asks to create or maintain a structured implementation plan in Codex, including milestones, bite-sized stories, and resumable local planning artifacts under ai_plan.
---
# Create Plan (Codex Native Superpowers)
Create and maintain a local plan workspace under `ai_plan/` at project root.
## Overview
This skill wraps the current Superpowers flow for Codex:
1. Design first with `superpowers:brainstorming`
2. Then build an implementation plan with `superpowers:writing-plans`
3. Review the plan iteratively with a second model/provider
4. Persist a local execution package in `ai_plan/YYYY-MM-DD-<short-title>/`
**Core principle:** Codex uses native skill discovery from `~/.agents/skills/`. Do not use deprecated `superpowers-codex bootstrap` or `use-skill` CLI commands.
## Prerequisite Check (MANDATORY)
Required:
- Superpowers skills symlink: `~/.agents/skills/superpowers -> ~/.codex/superpowers/skills`
- `superpowers:brainstorming`
- `superpowers:writing-plans`
Verify before proceeding:
```bash
test -L ~/.agents/skills/superpowers
test -f ~/.agents/skills/superpowers/brainstorming/SKILL.md
test -f ~/.agents/skills/superpowers/writing-plans/SKILL.md
```
If any dependency is missing, stop and return:
`Missing dependency: native Superpowers skills are required (superpowers:brainstorming, superpowers:writing-plans). Ensure ~/.agents/skills/superpowers is configured, then retry.`
## Required Skill Invocation Rules
- Invoke relevant skills through native discovery (no CLI wrapper).
- Announce skill usage explicitly:
- `I've read the [Skill Name] skill and I'm using it to [purpose].`
- For skills with checklists, track checklist items with `update_plan` todos.
- Tool mapping for Codex:
- `TodoWrite` -> `update_plan`
- `Task` subagents -> unavailable in Codex; do the work directly and state the limitation
- `Skill` -> use native skill discovery from `~/.agents/skills/`
## Process
### Phase 1: Analyze
- Explore the codebase and existing patterns.
### Phase 2: Gather Requirements
- Ask questions one at a time until user says ready.
- Confirm scope, constraints, success criteria, dependencies.
### Phase 3: Configure Reviewer
If the user has already specified a reviewer CLI and model (e.g., "create a plan, review with claude sonnet"), use those values. Otherwise, ask:
1. **Which CLI should review the plan?**
- `codex` — OpenAI Codex CLI (`codex exec`)
- `claude` — Claude Code CLI (`claude -p`)
- `cursor` — Cursor Agent CLI (`cursor-agent -p`)
- `skip` — No external review, proceed directly to file generation
2. **Which model?** (only if a CLI was chosen)
- For `codex`: default `o4-mini`, alternatives: `gpt-5.3-codex`, `o3`
- For `claude`: default `sonnet`, alternatives: `opus`, `haiku`
- For `cursor`: **run `cursor-agent models` first** to see your account's available models (availability varies by subscription)
- Accept any model string the user provides
3. **Max review rounds for the plan?** (default: 10)
- If the user does not provide a value, set `MAX_ROUNDS=10`.
Store the chosen `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS` for Phase 6 (Iterative Plan Review).
### Phase 4: Design (REQUIRED SUB-SKILL)
Invoke `superpowers:brainstorming`, then propose 2-3 approaches and recommend one.
### Phase 5: Plan (REQUIRED SUB-SKILL)
Invoke `superpowers:writing-plans`, then break work into milestones and bite-sized stories.
### Phase 6: Iterative Plan Review
Send the plan to the configured reviewer CLI for feedback. Revise and re-submit until approved (default max 10 rounds).
**Skip this phase entirely if reviewer was set to `skip`.**
#### Step 1: Generate Session ID
```bash
REVIEW_ID=$(uuidgen | tr '[:upper:]' '[:lower:]' | head -c 8)
```
Use for temp artifacts:
- `/tmp/plan-${REVIEW_ID}.md` - plan payload
- `/tmp/plan-review-${REVIEW_ID}.md` - normalized review text presented to the user
- `/tmp/plan-review-${REVIEW_ID}.json` - raw Cursor JSON (only for `cursor`)
- `/tmp/plan-review-${REVIEW_ID}.stderr` - reviewer stderr
- `/tmp/plan-review-${REVIEW_ID}.status` - helper heartbeat/status log
- `/tmp/plan-review-${REVIEW_ID}.runner.out` - helper-managed stdout from the reviewer command process
- `/tmp/plan-review-${REVIEW_ID}.sh` - reviewer command script
Resolve the shared reviewer helper from the installed Codex skills directory:
```bash
REVIEWER_RUNTIME=~/.codex/skills/reviewer-runtime/run-review.sh
```
Set helper success-artifact args before writing the command script:
```bash
HELPER_SUCCESS_FILE_ARGS=()
case "$REVIEWER_CLI" in
codex)
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/plan-review-${REVIEW_ID}.md)
;;
cursor)
HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/plan-review-${REVIEW_ID}.json)
;;
esac
```
#### Step 2: Write Plan to Temp File
Write the complete plan (milestones, stories, design decisions, specs) to `/tmp/plan-${REVIEW_ID}.md`.
#### Review Contract (Applies to Every Round)
The reviewer response must use this structure:
```text
## Summary
...
## Findings
### P0
- ...
### P1
- ...
### P2
- ...
### P3
- ...
## Verdict
VERDICT: APPROVED
```
Rules:
- Order findings from `P0` to `P3`.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- Use `- None.` when a severity has no findings.
- `VERDICT: APPROVED` is allowed only when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking.
- The calling agent should still try to fix `P3` findings when they are cheap and safe.
#### Liveness Contract (Applies While Review Is Running)
- The shared reviewer runtime emits `state=in-progress note="In progress N"` heartbeats every 60 seconds while the reviewer child is alive.
- The calling agent must keep waiting as long as a fresh `In progress N` heartbeat keeps arriving roughly once per minute.
- Do not abort just because the review is slow, a soft timeout fired, or a `stall-warning` line appears, as long as the `In progress N` heartbeat continues.
- Treat missing heartbeats, `state=failed`, `state=completed-empty-output`, and `state=needs-operator-decision` as escalation signals.
#### Step 3: Submit to Reviewer (Round 1)
Write the reviewer invocation to `/tmp/plan-review-${REVIEW_ID}.sh` as a bash script:
```bash
#!/usr/bin/env bash
set -euo pipefail
```
**If `REVIEWER_CLI` is `codex`:**
```bash
codex exec \
-m ${REVIEWER_MODEL} \
-s read-only \
-o /tmp/plan-review-${REVIEW_ID}.md \
"Review the implementation plan in /tmp/plan-${REVIEW_ID}.md. Focus on:
1. Correctness — Will this plan achieve the stated goals?
2. Risks — What could go wrong? Edge cases? Data loss?
3. Missing steps — Is anything forgotten?
4. Alternatives — Is there a simpler or better approach?
5. Security — Any security concerns?
Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict
Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking."
```
Do not try to capture the Codex session ID yet. When using the helper, extract it from `/tmp/plan-review-${REVIEW_ID}.runner.out` after the command completes (look for `session id: <uuid>`), then store it as `CODEX_SESSION_ID` for resume in subsequent rounds.
**If `REVIEWER_CLI` is `claude`:**
```bash
claude -p \
"Review the implementation plan below. Focus on:
$(cat /tmp/plan-${REVIEW_ID}.md)
1. Correctness — Will this plan achieve the stated goals?
2. Risks — What could go wrong? Edge cases? Data loss?
3. Missing steps — Is anything forgotten?
4. Alternatives — Is there a simpler or better approach?
5. Security — Any security concerns?
Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict
Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking." \
--model ${REVIEWER_MODEL} \
--strict-mcp-config \
--setting-sources user
```
**If `REVIEWER_CLI` is `cursor`:**
```bash
cursor-agent -p \
--mode=ask \
--model ${REVIEWER_MODEL} \
--trust \
--output-format json \
"Read the file /tmp/plan-${REVIEW_ID}.md and review the implementation plan. Focus on:
1. Correctness — Will this plan achieve the stated goals?
2. Risks — What could go wrong? Edge cases? Data loss?
3. Missing steps — Is anything forgotten?
4. Alternatives — Is there a simpler or better approach?
5. Security — Any security concerns?
Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict
Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking." \
> /tmp/plan-review-${REVIEW_ID}.json
```
For `cursor`, the command script writes raw JSON to `/tmp/plan-review-${REVIEW_ID}.json`. Do not run `jq` extraction until after the helper or fallback execution completes. If `jq` is not installed, inform the user: `brew install jq` (macOS) or equivalent.
Run the command script through the shared helper when available:
```bash
if [ -x "$REVIEWER_RUNTIME" ]; then
"$REVIEWER_RUNTIME" \
--command-file /tmp/plan-review-${REVIEW_ID}.sh \
--stdout-file /tmp/plan-review-${REVIEW_ID}.runner.out \
--stderr-file /tmp/plan-review-${REVIEW_ID}.stderr \
--status-file /tmp/plan-review-${REVIEW_ID}.status \
"${HELPER_SUCCESS_FILE_ARGS[@]}"
else
echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
bash /tmp/plan-review-${REVIEW_ID}.sh >/tmp/plan-review-${REVIEW_ID}.runner.out 2>/tmp/plan-review-${REVIEW_ID}.stderr
fi
```
Run the helper in the foreground and watch its live stdout for `state=in-progress` heartbeats. If your agent environment buffers command output until exit, start the helper in the background and poll `/tmp/plan-review-${REVIEW_ID}.status` separately instead of treating heartbeats as post-hoc-only data.
After the command completes:
- If `REVIEWER_CLI=cursor`, extract the final review text:
```bash
CURSOR_SESSION_ID=$(jq -r '.session_id' /tmp/plan-review-${REVIEW_ID}.json)
jq -r '.result' /tmp/plan-review-${REVIEW_ID}.json > /tmp/plan-review-${REVIEW_ID}.md
```
- If `REVIEWER_CLI=codex`, extract `CODEX_SESSION_ID` from `/tmp/plan-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text is only in `.runner.out`, move or copy the actual review body into `/tmp/plan-review-${REVIEW_ID}.md` before verdict parsing.
- If `REVIEWER_CLI=claude`, promote stdout captured by the helper or fallback runner into the markdown review file:
```bash
cp /tmp/plan-review-${REVIEW_ID}.runner.out /tmp/plan-review-${REVIEW_ID}.md
```
Fallback is allowed only when the helper is missing or not executable.
#### Step 4: Read Review & Check Verdict
1. Read `/tmp/plan-review-${REVIEW_ID}.md`
2. If the review failed, produced empty output, or reached helper timeout, also read:
- `/tmp/plan-review-${REVIEW_ID}.stderr`
- `/tmp/plan-review-${REVIEW_ID}.status`
- `/tmp/plan-review-${REVIEW_ID}.runner.out`
3. Present review to the user:
```
## Plan Review — Round N (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})
[Reviewer feedback]
```
4. While the reviewer is still running, keep waiting as long as fresh `state=in-progress note="In progress N"` heartbeats continue to appear roughly once per minute.
5. Check verdict:
- **VERDICT: APPROVED** with no `P0`, `P1`, or `P2` findings → proceed to Phase 7 (Initialize workspace)
- **VERDICT: APPROVED** with only `P3` findings → optionally fix the `P3` items if they are cheap and safe, then proceed
- **VERDICT: REVISE** or any `P0`, `P1`, or `P2` finding → go to Step 5
- No clear verdict but `P0`, `P1`, and `P2` are all `- None.` → treat as approved
- Helper state `completed-empty-output` → treat as failed review attempt, surface stderr/status, fix invocation or prompt handling, then retry
- Helper state `needs-operator-decision` → surface status log and decide whether to extend the timeout, abort, or retry with different helper parameters
- Max rounds (`MAX_ROUNDS`) reached → present the outcome to the user for a manual decision (proceed or stop)
#### Step 5: Revise the Plan
Address the reviewer findings in priority order (`P0``P1``P2`, then `P3` when practical). Update the plan in conversation context and rewrite `/tmp/plan-${REVIEW_ID}.md`.
Summarize revisions for the user:
```
### Revisions (Round N)
- [Change and reason, one bullet per issue addressed]
```
If a revision contradicts the user's explicit requirements, skip it and note it for the user.
#### Step 6: Re-submit to Reviewer (Rounds 2-N)
Rewrite `/tmp/plan-review-${REVIEW_ID}.sh` for the next round. The script should contain the reviewer invocation only; do not run it directly.
**If `REVIEWER_CLI` is `codex`:**
Resume the existing session:
```bash
codex exec resume ${CODEX_SESSION_ID} \
-o /tmp/plan-review-${REVIEW_ID}.md \
"I've revised the plan based on your feedback. Updated plan is in /tmp/plan-${REVIEW_ID}.md.
Changes made:
[List specific changes]
Re-review using the same `## Summary`, `## Findings`, and `## Verdict` structure as before.
Keep findings ordered `P0` to `P3`, use `- None.` when a severity has no findings, and only use `VERDICT: APPROVED` when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking."
```
If resume fails (session expired), fall back to fresh `codex exec` with context about prior rounds.
**If `REVIEWER_CLI` is `claude`:**
Fresh call with accumulated context (Claude CLI has no session resume):
```bash
claude -p \
"You previously reviewed an implementation plan and requested revisions.
Previous feedback summary: [key points from last review]
I've revised the plan. Updated version is below.
$(cat /tmp/plan-${REVIEW_ID}.md)
Changes made:
[List specific changes]
Re-review the full plan using the same `## Summary`, `## Findings`, and `## Verdict` structure as before.
Keep findings ordered `P0` to `P3`, use `- None.` when a severity has no findings, and only use `VERDICT: APPROVED` when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking." \
--model ${REVIEWER_MODEL} \
--strict-mcp-config \
--setting-sources user
```
**If `REVIEWER_CLI` is `cursor`:**
Resume the existing session:
```bash
cursor-agent --resume ${CURSOR_SESSION_ID} -p \
--mode=ask \
--model ${REVIEWER_MODEL} \
--trust \
--output-format json \
"I've revised the plan based on your feedback. Updated plan is in /tmp/plan-${REVIEW_ID}.md.
Changes made:
[List specific changes]
Re-review using the same `## Summary`, `## Findings`, and `## Verdict` structure as before.
Keep findings ordered `P0` to `P3`, use `- None.` when a severity has no findings, and only use `VERDICT: APPROVED` when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking." \
> /tmp/plan-review-${REVIEW_ID}.json
jq -r '.result' /tmp/plan-review-${REVIEW_ID}.json > /tmp/plan-review-${REVIEW_ID}.md
```
If resume fails, fall back to fresh `cursor-agent -p` with context about prior rounds.
After updating `/tmp/plan-review-${REVIEW_ID}.sh`, run the same helper/fallback flow from Round 1.
Return to Step 4.
#### Step 7: Present Final Result
```
## Plan Review — Final (reviewer: ${REVIEWER_CLI} / ${REVIEWER_MODEL})
**Status:** Approved after N round(s)
[or]
**Status:** Max rounds (`MAX_ROUNDS`) reached — not fully approved
[Final feedback / remaining concerns]
```
#### Step 8: Cleanup
```bash
rm -f /tmp/plan-${REVIEW_ID}.md \
/tmp/plan-review-${REVIEW_ID}.md \
/tmp/plan-review-${REVIEW_ID}.json \
/tmp/plan-review-${REVIEW_ID}.stderr \
/tmp/plan-review-${REVIEW_ID}.status \
/tmp/plan-review-${REVIEW_ID}.runner.out \
/tmp/plan-review-${REVIEW_ID}.sh
```
If the round failed, produced empty output, or reached operator-decision timeout, keep `.stderr`, `.status`, and `.runner.out` until the issue is diagnosed instead of deleting them immediately.
### Phase 7: Initialize Local Plan Workspace (MANDATORY)
At project root:
1. Ensure `ai_plan/` exists. Create it if missing.
2. Ensure `.gitignore` contains `/ai_plan/`.
3. If `.gitignore` was changed, commit that change immediately (local commit only).
Recommended commit message:
- `chore(gitignore): ignore ai_plan local planning artifacts`
### Phase 8: Generate Plan Files (MANDATORY)
Create `ai_plan/YYYY-MM-DD-<short-title>/` with all files below:
1. `original-plan.md` - copy of original planner-generated plan.
2. `final-transcript.md` - copy of final planning transcript used to reach approved plan.
3. `milestone-plan.md` - full implementation spec (from template).
4. `story-tracker.md` - story/milestone status tracker (from template).
5. `continuation-runbook.md` - execution instructions and context (from template).
Use templates from this skill's `templates/` folder.
### Phase 9: Handoff
Always instruct the executing agent:
> Read `ai_plan/YYYY-MM-DD-<short-title>/continuation-runbook.md` first, then execute from that folder.
Do not rely on planner-private files during implementation.
### Phase 10: Telegram Completion Notification (MANDATORY)
Resolve the Telegram notifier helper from the installed Codex skills directory:
```bash
TELEGRAM_NOTIFY_RUNTIME=~/.codex/skills/reviewer-runtime/notify-telegram.sh
```
On every terminal outcome for the create-plan run (approved, max rounds reached, skipped reviewer, or failure), send a Telegram summary if the helper exists and both `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` are configured:
```bash
if [ -x "$TELEGRAM_NOTIFY_RUNTIME" ] && [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
"$TELEGRAM_NOTIFY_RUNTIME" --message "create-plan completed for <plan-folder-name>: <status summary>"
fi
```
Rules:
- Telegram is the only supported completion notification path. Do not use desktop notifications, `say`, email, or any other notifier.
- Notification failures are non-blocking, but they must be surfaced to the user.
- If Telegram is not configured, state that no completion notification was sent.
## Quick Reference
| Phase | Action | Required Output |
|---|---|---|
| 1 | Analyze codebase/context | Constraints and known patterns |
| 2 | Gather requirements (one question at a time) | Confirmed scope and success criteria |
| 3 | Configure reviewer CLI and model | `REVIEWER_CLI`, `REVIEWER_MODEL`, `MAX_ROUNDS` (or `skip`) |
| 4 | Invoke `superpowers:brainstorming` | Chosen design approach |
| 5 | Invoke `superpowers:writing-plans` | Milestones and bite-sized stories |
| 6 | Iterative plan review (max `MAX_ROUNDS` rounds) | Reviewer approval or max-rounds warning |
| 7 | Initialize `ai_plan/` + `.gitignore` | Local planning workspace ready |
| 8 | Build plan package from templates | Full plan folder with required files |
| 9 | Handoff with runbook-first instruction | Resumable execution context |
| 10 | Send Telegram completion notification | User notified or notification status reported |
## Execution Rules to Include in Plan (MANDATORY)
- Run lint/typecheck/tests after each milestone.
- Prefer linting changed files only for speed.
- Commit locally after each completed milestone (**do not push**).
- Stop and ask user for feedback.
- Apply feedback, rerun checks, and commit again.
- Move to next milestone only after user approval.
- After all milestones are completed and approved, ask permission to push.
- Only after approved push: mark plan as completed.
## Gitignore Note
`ai_plan/` is intentionally local and must stay gitignored. Do not treat inability to commit plan-file updates in `ai_plan/` as a problem.
## Common Mistakes
- Using deprecated commands like `superpowers-codex bootstrap` or `superpowers-codex use-skill`.
- Jumping to implementation planning without running `superpowers:brainstorming` first.
- Asking multiple requirement questions in one message.
- Forgetting to create/update `.gitignore` for `/ai_plan/`.
- Omitting one or more required files in the plan package.
- Handoff without explicit "read runbook first" direction.
- Skipping the reviewer phase without explicit user opt-out.
- Not capturing the Codex session ID for resume in subsequent review rounds.
- Using any completion notification path other than Telegram.
## Rationalizations and Counters
| Rationalization | Counter |
|---|---|
| "Bootstrap CLI is faster" | Deprecated for Codex; native discovery is the supported path. |
| "I can skip brainstorming for small tasks" | Creative/planning work still requires design validation first. |
| "I don't need `update_plan` for checklist skills" | Checklist tracking is mandatory for execution reliability. |
| "I can keep plan files outside `ai_plan/`" | This skill standardizes local resumable planning under `ai_plan/`. |
| "The reviewer approved, I can skip my own validation" | Reviewer feedback supplements but does not replace your own verification. |
## Red Flags - Stop and Correct
- You are about to run any `superpowers-codex` command.
- You started writing milestones before design validation.
- You did not announce which skill you invoked and why.
- You are marking planning complete without all required files.
- Handoff does not explicitly point to `continuation-runbook.md`.
- You are applying a reviewer suggestion that contradicts user requirements.
## Verification Checklist
- [ ] `ai_plan/` exists at project root
- [ ] `.gitignore` includes `/ai_plan/`
- [ ] `.gitignore` ignore-rule commit was created if needed
- [ ] Plan directory created under `ai_plan/YYYY-MM-DD-<short-title>/`
- [ ] Reviewer configured or explicitly skipped
- [ ] Max review rounds confirmed (default: 10)
- [ ] Plan review completed (approved or max rounds) — or skipped
- [ ] `original-plan.md` present
- [ ] `final-transcript.md` present
- [ ] `milestone-plan.md` present
- [ ] `story-tracker.md` present
- [ ] `continuation-runbook.md` present
- [ ] Handoff explicitly says to read runbook first and execute from plan folder
- [ ] Telegram completion notification attempted if configured