Align reviewer runtime and Telegram notifications
@@ -73,7 +73,10 @@ If the user has already specified a reviewer CLI and model (e.g., "create a plan
- For `cursor`: **run `cursor-agent models` first** to see your account's available models (availability varies by subscription)
- Accept any model string the user provides

Store the chosen `REVIEWER_CLI` and `REVIEWER_MODEL` for Phase 6 (Iterative Plan Review).
3. **Max review rounds for the plan?** (default: 10)
   - If the user does not provide a value, set `MAX_ROUNDS=10`.

Store the chosen `REVIEWER_CLI`, `REVIEWER_MODEL`, and `MAX_ROUNDS` for Phase 6 (Iterative Plan Review).
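
The default described above could be applied with a small guard like the following. This is only a sketch: the function name `resolve_max_rounds` and the `USER_ANSWER` variable are hypothetical, and falling back to 10 for non-numeric or zero input is an assumption that goes beyond what the skill text states.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the skill itself.
# Prints the answer when it is a positive integer, otherwise the default of 10.
resolve_max_rounds() {
  local answer="${1:-}"
  case "$answer" in
    ''|*[!0-9]*|0*) echo 10 ;;   # empty, non-numeric, zero, or leading-zero input
    *) echo "$answer" ;;
  esac
}

# USER_ANSWER is an assumed variable holding whatever the user replied.
MAX_ROUNDS="$(resolve_max_rounds "${USER_ANSWER:-}")"
```

Keeping the fallback in one place means later phases can read `MAX_ROUNDS` without re-checking it.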

### Phase 4: Design (REQUIRED SUB-SKILL)

@@ -86,7 +89,7 @@ Story IDs: `S-{milestone}{sequence}`.

### Phase 6: Iterative Plan Review

Send the plan to the configured reviewer CLI for feedback. Revise and re-submit until approved (max 5 rounds).
Send the plan to the configured reviewer CLI for feedback. Revise and re-submit until approved (default max 10 rounds).

**Skip this phase entirely if reviewer was set to `skip`.**

@@ -115,10 +118,60 @@ else
fi
```

Set helper success-artifact args before writing the command script:

```bash
HELPER_SUCCESS_FILE_ARGS=()
case "$REVIEWER_CLI" in
  codex)
    HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/plan-review-${REVIEW_ID}.md)
    ;;
  cursor)
    HELPER_SUCCESS_FILE_ARGS+=(--success-file /tmp/plan-review-${REVIEW_ID}.json)
    ;;
esac
```

#### Step 2: Write Plan to Temp File

Write the complete plan (milestones, stories, design decisions, specs) to `/tmp/plan-${REVIEW_ID}.md`.

#### Review Contract (Applies to Every Round)

The reviewer response must use this structure:

```text
## Summary
...

## Findings
### P0
- ...
### P1
- ...
### P2
- ...
### P3
- ...

## Verdict
VERDICT: APPROVED
```

Rules:
- Order findings from `P0` to `P3`.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- Use `- None.` when a severity has no findings.
- `VERDICT: APPROVED` is allowed only when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking.
- The calling agent should still try to fix `P3` findings when they are cheap and safe.
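
One way the calling agent might check a review file against the contract above is sketched below. The function name `review_is_approvable` is hypothetical, and the check assumes the reviewer followed the structure literally: headings on their own lines, one bullet per finding, and the verdict as the last non-empty line.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the skill itself.
# Succeeds only when the verdict is APPROVED and no P0/P1/P2 finding is listed.
review_is_approvable() {
  local review_file="$1" last_line
  # The last non-empty line must be exactly the approval verdict.
  last_line="$(grep -v '^[[:space:]]*$' "$review_file" | tail -n 1)"
  [ "$last_line" = "VERDICT: APPROVED" ] || return 1
  # Any bullet other than "- None." under P0, P1, or P2 counts as blocking.
  awk '
    /^### P[0-2] *$/ { blocking = 1; next }
    /^#/             { blocking = 0 }
    blocking && /^- / && $0 != "- None." { found = 1 }
    END { exit (found ? 1 : 0) }
  ' "$review_file"
}
```

Bullets under `### P3` are ignored on purpose, since the rules treat them as non-blocking.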

#### Liveness Contract (Applies While Review Is Running)

- The shared reviewer runtime emits `state=in-progress note="In progress N"` heartbeats every 60 seconds while the reviewer child is alive.
- The calling agent must keep waiting as long as a fresh `In progress N` heartbeat keeps arriving roughly once per minute.
- Do not abort just because the review is slow, a soft timeout fired, or a `stall-warning` line appears, as long as the `In progress N` heartbeat continues.
- Treat missing heartbeats, `state=failed`, `state=completed-empty-output`, and `state=needs-operator-decision` as escalation signals.
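
The state handling in this contract can be illustrated with a small classifier. It is a sketch under the assumption that each helper line carries a `state=<value>` token in the form quoted above; the function name `helper_action_for_line` is hypothetical, and detecting a missing heartbeat would additionally need a timing check that is not shown here.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the reviewer runtime.
# Maps one helper output line to "wait", "escalate", or "unknown".
helper_action_for_line() {
  local line="$1" state
  # Pull out the value of the state=... token, if the line has one.
  state="$(printf '%s\n' "$line" | sed -n 's/.*state=\([a-z-]*\).*/\1/p')"
  case "$state" in
    in-progress) echo wait ;;
    failed|completed-empty-output|needs-operator-decision) echo escalate ;;
    *) echo unknown ;;
  esac
}
```

A `stall-warning` line without one of the escalation states falls through to `unknown`, which matches the rule that it alone is not a reason to abort.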

#### Step 3: Submit to Reviewer (Round 1)

Write the reviewer invocation to `/tmp/plan-review-${REVIEW_ID}.sh` as a bash script:
@@ -142,8 +195,21 @@ codex exec \
4. Alternatives — Is there a simpler or better approach?
5. Security — Any security concerns?

Be specific and actionable. If the plan is solid, end with exactly: VERDICT: APPROVED
If changes are needed, end with exactly: VERDICT: REVISE"
Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking."
```

Do not try to capture the Codex session ID yet. When using the helper, extract it from `/tmp/plan-review-${REVIEW_ID}.runner.out` after the command completes (look for `session id: <uuid>`), then store it as `CODEX_SESSION_ID` for resume in subsequent rounds.
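
The extraction described above might look like the following. This is a sketch that assumes the runner output contains a line with `session id: ` followed by a standard 36-character UUID; the function name `extract_codex_session_id` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the skill itself.
# Prints the first UUID that follows "session id: " in the given file.
extract_codex_session_id() {
  sed -n 's/.*session id: \([0-9a-fA-F-]\{36\}\).*/\1/p' "$1" | head -n 1
}

# Assumed usage once the helper or fallback run has finished:
#   CODEX_SESSION_ID="$(extract_codex_session_id "/tmp/plan-review-${REVIEW_ID}.runner.out")"
```

If the function prints nothing, the session ID was not found and a later resume would need the fresh `codex exec` fallback.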
@@ -162,8 +228,21 @@ $(cat /tmp/plan-${REVIEW_ID}.md)
4. Alternatives — Is there a simpler or better approach?
5. Security — Any security concerns?

Be specific and actionable. If the plan is solid, end with exactly: VERDICT: APPROVED
If changes are needed, end with exactly: VERDICT: REVISE" \
Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking." \
  --model ${REVIEWER_MODEL} \
  --strict-mcp-config \
  --setting-sources user
@@ -184,8 +263,21 @@ cursor-agent -p \
4. Alternatives — Is there a simpler or better approach?
5. Security — Any security concerns?

Be specific and actionable. If the plan is solid, end with exactly: VERDICT: APPROVED
If changes are needed, end with exactly: VERDICT: REVISE" \
Return exactly these sections in order:
## Summary
## Findings
### P0
### P1
### P2
### P3
## Verdict

Rules:
- Order findings from highest severity to lowest.
- Use `- None.` when a severity has no findings.
- `P0` = total blocker, `P1` = major risk, `P2` = must-fix before approval, `P3` = cosmetic / nice to have.
- End with exactly one verdict line: `VERDICT: APPROVED` or `VERDICT: REVISE`
- `VERDICT: APPROVED` is allowed only when there are no `P0`, `P1`, or `P2` findings. `P3` findings are non-blocking." \
  > /tmp/plan-review-${REVIEW_ID}.json
```

@@ -205,13 +297,16 @@ if [ -x "$REVIEWER_RUNTIME" ]; then
    --command-file /tmp/plan-review-${REVIEW_ID}.sh \
    --stdout-file /tmp/plan-review-${REVIEW_ID}.runner.out \
    --stderr-file /tmp/plan-review-${REVIEW_ID}.stderr \
    --status-file /tmp/plan-review-${REVIEW_ID}.status
    --status-file /tmp/plan-review-${REVIEW_ID}.status \
    "${HELPER_SUCCESS_FILE_ARGS[@]}"
else
  echo "Warning: reviewer runtime helper not found at $REVIEWER_RUNTIME; falling back to direct synchronous review." >&2
  bash /tmp/plan-review-${REVIEW_ID}.sh >/tmp/plan-review-${REVIEW_ID}.runner.out 2>/tmp/plan-review-${REVIEW_ID}.stderr
fi
```

Run the helper in the foreground and watch its live stdout for `state=in-progress` heartbeats. If your agent environment buffers command output until exit, start the helper in the background and poll `/tmp/plan-review-${REVIEW_ID}.status` separately instead of treating heartbeats as post-hoc-only data.
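
The background-and-poll variant could be sketched roughly as follows. This assumes the status file is appended to line by line, which the source does not confirm; the function name `poll_until_exit` and its interval parameter are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the reviewer runtime.
# Prints the latest status line every INTERVAL seconds while PID is alive,
# then returns the exit status of that background process.
poll_until_exit() {
  local pid="$1" status_file="$2" interval="${3:-60}"
  while kill -0 "$pid" 2>/dev/null; do
    # Show the most recent status line so heartbeats are visible live.
    [ -f "$status_file" ] && tail -n 1 "$status_file"
    sleep "$interval"
  done
  wait "$pid"
}

# Assumed usage: start the helper with "&", then call
#   poll_until_exit "$!" "/tmp/plan-review-${REVIEW_ID}.status" 60
```

Because `wait` only works on children of the current shell, the helper has to be started from the same shell that calls the function.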

After the command completes:
- If `REVIEWER_CLI=cursor`, extract the final review text:

@@ -221,6 +316,11 @@ jq -r '.result' /tmp/plan-review-${REVIEW_ID}.json > /tmp/plan-review-${REVIEW_I
```

- If `REVIEWER_CLI=codex`, extract `CODEX_SESSION_ID` from `/tmp/plan-review-${REVIEW_ID}.runner.out` after the helper or fallback run. If the review text is only in `.runner.out`, move or copy the actual review body into `/tmp/plan-review-${REVIEW_ID}.md` before verdict parsing.
- If `REVIEWER_CLI=claude`, promote stdout captured by the helper or fallback runner into the markdown review file:

```bash
cp /tmp/plan-review-${REVIEW_ID}.runner.out /tmp/plan-review-${REVIEW_ID}.md
```

#### Step 4: Read Review & Check Verdict

@@ -237,17 +337,19 @@ jq -r '.result' /tmp/plan-review-${REVIEW_ID}.json > /tmp/plan-review-${REVIEW_I
[Reviewer feedback]
```

3. Check verdict:
   - **VERDICT: APPROVED** → proceed to Phase 7 (Initialize workspace)
   - **VERDICT: REVISE** → go to Step 5
   - No clear verdict but positive / no actionable items → treat as approved
4. While the reviewer is still running, keep waiting as long as fresh `state=in-progress note="In progress N"` heartbeats continue to appear roughly once per minute.
5. Check verdict:
   - **VERDICT: APPROVED** with no `P0`, `P1`, or `P2` findings → proceed to Phase 7 (Initialize workspace)
   - **VERDICT: APPROVED** with only `P3` findings → optionally fix the `P3` items if they are cheap and safe, then proceed
   - **VERDICT: REVISE** or any `P0`, `P1`, or `P2` finding → go to Step 5
   - No clear verdict but `P0`, `P1`, and `P2` are all `- None.` → treat as approved
   - Helper state `completed-empty-output` → treat as failed review attempt, surface stderr/status, fix invocation or prompt handling, then retry
   - Helper state `needs-operator-decision` → surface status log and decide whether to keep waiting, abort, or retry with different helper parameters
   - Max rounds (5) reached → proceed with warning
   - Helper state `needs-operator-decision` → surface status log and decide whether to extend the timeout, abort, or retry with different helper parameters
   - Max rounds (`MAX_ROUNDS`) reached → present the outcome to the user for a manual decision (proceed or stop)
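
The verdict branches in the updated list can be condensed into a small decision function. This is only a sketch: the name `next_review_action`, its argument order, and the returned labels are hypothetical, and it assumes the verdict and the presence of blocking findings were already extracted from the review file. The helper-state branches are not covered.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the skill itself.
# Args: verdict (APPROVED, REVISE, or empty), blocking findings (yes/no),
#       current round, max rounds (defaults to 10).
next_review_action() {
  local verdict="$1" blocking="$2" round="$3" max_rounds="${4:-10}"
  if [ "$blocking" = "no" ] && [ "$verdict" != "REVISE" ]; then
    # Approved, or no clear verdict with P0/P1/P2 all "- None.".
    echo "proceed-to-phase-7"
  elif [ "$round" -ge "$max_rounds" ]; then
    # Out of rounds: hand the decision to the user.
    echo "ask-user"
  else
    echo "revise"
  fi
}
```

An `APPROVED` verdict that still lists a blocking finding lands in the `revise` branch, following the "any `P0`, `P1`, or `P2` finding" rule.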

#### Step 5: Revise the Plan

Address each issue the reviewer raised. Update the plan in conversation context and rewrite `/tmp/plan-${REVIEW_ID}.md`.
Address the reviewer findings in priority order (`P0` → `P1` → `P2`, then `P3` when practical). Update the plan in conversation context and rewrite `/tmp/plan-${REVIEW_ID}.md`.

Summarize revisions for the user:

@@ -258,7 +360,9 @@ Summarize revisions for the user:

If a revision contradicts the user's explicit requirements, skip it and note it for the user.

#### Step 6: Re-submit to Reviewer (Rounds 2-5)
#### Step 6: Re-submit to Reviewer (Rounds 2-N)

Rewrite `/tmp/plan-review-${REVIEW_ID}.sh` for the next round. The script should contain the reviewer invocation only; do not run it directly.

**If `REVIEWER_CLI` is `codex`:**

@@ -272,8 +376,8 @@ codex exec resume ${CODEX_SESSION_ID} \
Changes made:
[List specific changes]

Re-review. If solid, end with: VERDICT: APPROVED
If more changes needed, end with: VERDICT: REVISE"
Re-review using the same `## Summary`, `## Findings`, and `## Verdict` structure as before.
Keep findings ordered `P0` to `P3`, use `- None.` when a severity has no findings, and only use `VERDICT: APPROVED` when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking."
```

If resume fails (session expired), fall back to fresh `codex exec` with context about prior rounds.
@@ -295,8 +399,8 @@ $(cat /tmp/plan-${REVIEW_ID}.md)
Changes made:
[List specific changes]

Re-review the full plan. If solid, end with: VERDICT: APPROVED
If more changes needed, end with: VERDICT: REVISE" \
Re-review the full plan using the same `## Summary`, `## Findings`, and `## Verdict` structure as before.
Keep findings ordered `P0` to `P3`, use `- None.` when a severity has no findings, and only use `VERDICT: APPROVED` when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking." \
  --model ${REVIEWER_MODEL} \
  --strict-mcp-config \
  --setting-sources user
@@ -317,8 +421,8 @@ cursor-agent --resume ${CURSOR_SESSION_ID} -p \
Changes made:
[List specific changes]

Re-review. If solid, end with: VERDICT: APPROVED
If more changes needed, end with: VERDICT: REVISE" \
Re-review using the same `## Summary`, `## Findings`, and `## Verdict` structure as before.
Keep findings ordered `P0` to `P3`, use `- None.` when a severity has no findings, and only use `VERDICT: APPROVED` when no `P0`, `P1`, or `P2` findings remain. `P3` findings are non-blocking." \
  > /tmp/plan-review-${REVIEW_ID}.json

jq -r '.result' /tmp/plan-review-${REVIEW_ID}.json > /tmp/plan-review-${REVIEW_ID}.md
@@ -337,7 +441,7 @@ Return to Step 4.

**Status:** Approved after N round(s)
[or]
**Status:** Max rounds (5) reached — not fully approved
**Status:** Max rounds (`MAX_ROUNDS`) reached — not fully approved

[Final feedback / remaining concerns]
```
@@ -384,19 +488,45 @@ Always instruct the executing agent:

Do not rely on planner-private files during implementation.

### Phase 10: Telegram Completion Notification (MANDATORY)

Resolve the Telegram notifier helper from Cursor's installed skills directory:

```bash
if [ -x .cursor/skills/reviewer-runtime/notify-telegram.sh ]; then
  TELEGRAM_NOTIFY_RUNTIME=.cursor/skills/reviewer-runtime/notify-telegram.sh
else
  TELEGRAM_NOTIFY_RUNTIME=~/.cursor/skills/reviewer-runtime/notify-telegram.sh
fi
```

On every terminal outcome for the create-plan run (approved, max rounds reached, skipped reviewer, or failure), send a Telegram summary if the helper exists and both `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` are configured:

```bash
if [ -x "$TELEGRAM_NOTIFY_RUNTIME" ] && [ -n "${TELEGRAM_BOT_TOKEN:-}" ] && [ -n "${TELEGRAM_CHAT_ID:-}" ]; then
  "$TELEGRAM_NOTIFY_RUNTIME" --message "create-plan completed for <plan-folder-name>: <status summary>"
fi
```

Rules:
- Telegram is the only supported completion notification path. Do not use desktop notifications, `say`, email, or any other notifier.
- Notification failures are non-blocking, but they must be surfaced to the user.
- If Telegram is not configured, state that no completion notification was sent.
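
The last two rules could be folded into a wrapper along these lines. This is a sketch rather than the skill's own code: the function name `notify_completion` and the exact wording of its messages are hypothetical, while the `--message` flag and the three variables come from the snippet above.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper, not part of the skill itself.
# Never fails the run: reports "not configured" or a send failure instead.
notify_completion() {
  local message="$1"
  if [ ! -x "${TELEGRAM_NOTIFY_RUNTIME:-}" ] \
     || [ -z "${TELEGRAM_BOT_TOKEN:-}" ] \
     || [ -z "${TELEGRAM_CHAT_ID:-}" ]; then
    echo "Telegram is not configured; no completion notification was sent."
    return 0
  fi
  if ! "$TELEGRAM_NOTIFY_RUNTIME" --message "$message"; then
    # Non-blocking, but surfaced so the agent can relay it to the user.
    echo "Warning: Telegram notification failed." >&2
  fi
  return 0
}
```

Returning 0 in every branch keeps a notification problem from changing the outcome of the create-plan run.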

## Quick Reference

| Phase | Action | Required Output |
|---|---|---|
| 1 | Analyze codebase/context | Constraints and known patterns |
| 2 | Gather requirements (one question at a time) | Confirmed scope and success criteria |
| 3 | Configure reviewer CLI and model | `REVIEWER_CLI` and `REVIEWER_MODEL` (or `skip`) |
| 3 | Configure reviewer CLI and model | `REVIEWER_CLI`, `REVIEWER_MODEL`, `MAX_ROUNDS` (or `skip`) |
| 4 | Invoke `superpowers:brainstorming` | Chosen design approach |
| 5 | Invoke `superpowers:writing-plans` | Milestones and bite-sized stories |
| 6 | Iterative plan review (max 5 rounds) | Reviewer approval or max-rounds warning |
| 6 | Iterative plan review (max `MAX_ROUNDS` rounds) | Reviewer approval or max-rounds warning |
| 7 | Initialize `ai_plan/` + `.gitignore` | Local planning workspace ready |
| 8 | Build plan package from templates | Full plan folder with required files |
| 9 | Handoff with runbook-first instruction | Resumable execution context |
| 10 | Send Telegram completion notification | User notified or notification status reported |

## Tracker Discipline (MANDATORY)

@@ -437,6 +567,7 @@ After completing any story:
- Omitting one or more required files in the plan package.
- Handoff without explicit "read runbook first" direction.
- Skipping the reviewer phase without explicit user opt-out.
- Using any completion notification path other than Telegram.

## Red Flags - Stop and Correct

@@ -454,6 +585,7 @@ After completing any story:
- [ ] `.gitignore` ignore-rule commit was created if needed
- [ ] Plan directory created under `ai_plan/YYYY-MM-DD-<short-title>/`
- [ ] Reviewer configured or explicitly skipped
- [ ] Max review rounds confirmed (default: 10)
- [ ] Plan review completed (approved or max rounds) — or skipped
- [ ] `original-plan.md` present
- [ ] `final-transcript.md` present
@@ -461,6 +593,7 @@ After completing any story:
- [ ] `story-tracker.md` present
- [ ] `continuation-runbook.md` present
- [ ] Handoff explicitly says to read runbook first and execute from plan folder
- [ ] Telegram completion notification attempted if configured

## Exit Triggers for Question Phase
User says: "ready", "done", "let's plan", "proceed", "enough questions"