Align reviewer runtime and Telegram notifications

2026-03-24 11:45:58 -05:00
parent 4d37674626
commit 63a048a26c
17 changed files with 1756 additions and 200 deletions
@@ -18,6 +18,7 @@ Create structured implementation plans with milestone and story tracking, and op
  - Claude Code: `~/.claude/skills/reviewer-runtime/run-review.sh`
  - OpenCode: `~/.config/opencode/skills/reviewer-runtime/run-review.sh`
  - Cursor: `.cursor/skills/reviewer-runtime/run-review.sh` or `~/.cursor/skills/reviewer-runtime/run-review.sh`
+- Telegram notification setup is documented in [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md)

 If dependencies are missing, stop and return:

@@ -115,10 +116,13 @@ Verify Superpowers dependencies exist in your agent skills root:
 - Creates plans under `ai_plan/YYYY-MM-DD-<short-title>/`.
 - Ensures `/ai_plan/` is in `.gitignore`.
 - Commits `.gitignore` update locally when added.
-- Asks which reviewer CLI and model to use (or accepts `skip` for no review).
-- Iteratively reviews the plan with the chosen reviewer (max 5 rounds) before generating files.
+- Asks which reviewer CLI, model, and max rounds to use (or accepts `skip` for no review).
+- Iteratively reviews the plan with the chosen reviewer (default max 10 rounds) before generating files.
 - Runs reviewer commands through `reviewer-runtime/run-review.sh` when available, with fallback to direct synchronous execution only if the helper is missing.
+- Waits as long as the reviewer runtime keeps emitting per-minute `In progress N` heartbeats.
+- Requires reviewer findings to be ordered `P0` through `P3`, with `P3` explicitly non-blocking.
 - Captures reviewer stderr and helper status logs for diagnostics and retains them on failed, empty-output, or operator-decision review rounds.
+- Sends completion notifications through Telegram only when the shared setup in [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md) is installed and configured.
 - Produces:
  - `original-plan.md`
  - `final-transcript.md`
@@ -130,13 +134,24 @@ Verify Superpowers dependencies exist in your agent skills root:

 After the plan is created (design + milestones + stories), the skill sends it to a second model for review:

-1. **Configure** — user picks a reviewer CLI (`codex`, `claude`, `cursor`) and model, or skips
+1. **Configure** — user picks a reviewer CLI (`codex`, `claude`, `cursor`), a model, and optional max rounds (default 10), or skips
 2. **Prepare** — plan payload and a bash reviewer command script are written to temp files
 3. **Run** — the command script is executed through `reviewer-runtime/run-review.sh` when installed
-4. **Feedback** — reviewer evaluates correctness, risks, missing steps, alternatives, security
-5. **Revise** — the planning agent addresses each issue and re-submits
-6. **Repeat** — up to 5 rounds until the reviewer returns `VERDICT: APPROVED`
-7. **Finalize** — approved plan is used to generate the plan file package
+4. **Feedback** — reviewer evaluates correctness, risks, missing steps, alternatives, security, and returns `## Summary`, `## Findings`, and `## Verdict`
+5. **Prioritize** — findings are ordered `P0`, `P1`, `P2`, `P3`
+6. **Revise** — the planning agent addresses findings in priority order and re-submits
+7. **Repeat** — up to max rounds until the reviewer returns `VERDICT: APPROVED`
+8. **Finalize** — approved plan is used to generate the plan file package
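
The configure, run, revise, and repeat steps above can be read as a bounded loop. The sketch below is only an illustration of that control flow: `review_loop`, `run_reviewer`, and `revise` are hypothetical names, not part of the skill, and what the skill does when the round limit is reached without approval is not stated here, so the sketch simply reports `approved=False`.

```python
def is_approved(review_text):
    """True when some line of the review is exactly `VERDICT: APPROVED`."""
    return any(line.strip() == "VERDICT: APPROVED"
               for line in review_text.splitlines())


def review_loop(plan, run_reviewer, revise, max_rounds=10):
    """Run up to max_rounds review rounds.

    run_reviewer(plan) -> review text   (hypothetical callable)
    revise(plan, review) -> new plan    (hypothetical callable)
    Returns (plan, approved, rounds_used).
    """
    for round_no in range(1, max_rounds + 1):
        review = run_reviewer(plan)
        if is_approved(review):
            return plan, True, round_no
        # Only revise when another round is left to re-submit to.
        if round_no < max_rounds:
            plan = revise(plan, review)
    return plan, False, max_rounds


# Toy stand-ins for the reviewer and the planning agent.
def fake_reviewer(plan):
    if "rollback" in plan:
        return "## Verdict\nVERDICT: APPROVED"
    return "## Findings\n- P1: no rollback step\n## Verdict\nVERDICT: REVISE"


def fake_revise(plan, review):
    return plan + "\n- add rollback step"
```

With these stand-ins, `review_loop("- deploy", fake_reviewer, fake_revise)` approves on the second round; passing `max_rounds=1` returns the original plan unapproved.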
+
+### Reviewer Output Contract
+
+- `P0` = total blocker
+- `P1` = major risk
+- `P2` = must-fix before approval
+- `P3` = cosmetic / nice to have
+- Each severity section should use `- None.` when empty
+- `VERDICT: APPROVED` is valid only when no `P0`, `P1`, or `P2` findings remain
+- `P3` findings are non-blocking, but the caller should still try to fix them when cheap and safe
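
A caller could enforce the approval rule from this contract with a check like the one below. This is a minimal sketch under assumptions: the findings are taken to be already parsed into a dict that maps each severity to its bullet lines, and the names `open_findings` and `approval_is_valid` are hypothetical.

```python
BLOCKING_SEVERITIES = ("P0", "P1", "P2")  # P3 is explicitly non-blocking


def open_findings(bullets):
    """Drop the `- None.` placeholder that marks an empty severity section."""
    return [b for b in bullets if b.strip() != "- None."]


def approval_is_valid(findings):
    """True only when no P0, P1, or P2 findings remain.

    `findings` maps a severity label to a list of bullet lines, an assumed
    in-memory shape for the parsed `## Findings` section.
    """
    return all(not open_findings(findings.get(sev, []))
               for sev in BLOCKING_SEVERITIES)


clean = {"P0": ["- None."], "P1": ["- None."], "P2": ["- None."],
         "P3": ["- Rename the milestone heading."]}
blocked = dict(clean, P2=["- Migration step has no rollback."])
```

Here `approval_is_valid(clean)` is `True` even though a `P3` item is open, while `approval_is_valid(blocked)` is `False`.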

 ### Runtime Artifacts

@@ -153,16 +168,18 @@ The review flow may create these temp artifacts:
 Status log lines use this format:

 ```text
-ts=<ISO-8601> level=<info|warn|error> state=<running-silent|running-active|stall-warning|completed|completed-empty-output|failed|needs-operator-decision> elapsed_s=<int> pid=<int> stdout_bytes=<int> stderr_bytes=<int> note="<short message>"
+ts=<ISO-8601> level=<info|warn|error> state=<running-silent|running-active|in-progress|stall-warning|completed|completed-empty-output|failed|needs-operator-decision> elapsed_s=<int> pid=<int> stdout_bytes=<int> stderr_bytes=<int> note="<short message>"
 ```

-`stall-warning` is a heartbeat/status-log state only. It is not a terminal review result.
+`in-progress` is the liveness heartbeat emitted roughly once per minute with `note="In progress N"`.
+`stall-warning` is a non-terminal status-log state only. It does not mean the caller should stop waiting if `in-progress` heartbeats continue.
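
Because each status line is a flat run of `key=value` pairs with one quoted field, it can be split with a shell-style tokenizer. The parser below is a sketch, not the helper's own code; `parse_status_line`, `is_heartbeat`, and the sample line (including its pid and byte counts) are made up for illustration.

```python
import shlex

INT_FIELDS = ("elapsed_s", "pid", "stdout_bytes", "stderr_bytes")


def parse_status_line(line):
    """Split a status-log line into a dict, converting numeric fields to int."""
    fields = {}
    # shlex keeps note="In progress 2" together as one token and drops the quotes.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    for name in INT_FIELDS:
        if name in fields:
            fields[name] = int(fields[name])
    return fields


def is_heartbeat(fields):
    """True for the per-minute liveness state."""
    return fields.get("state") == "in-progress"


# Invented sample line that follows the documented format.
sample = ('ts=2026-03-24T11:45:58-05:00 level=info state=in-progress '
          'elapsed_s=120 pid=4242 stdout_bytes=0 stderr_bytes=512 '
          'note="In progress 2"')
fields = parse_status_line(sample)
```

For the sample line, `fields["note"]` is `"In progress 2"` and `is_heartbeat(fields)` is `True`.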

 ### Failure Handling

 - `completed-empty-output` means the reviewer exited without producing review text; surface `.stderr` and `.status`, then retry only after diagnosing the cause.
-- `needs-operator-decision` means the helper reached hard-timeout escalation; surface `.status` and decide whether to keep waiting, abort, or retry with different parameters.
+- `needs-operator-decision` means the helper reached hard-timeout escalation; surface `.status` and decide whether to extend the timeout, abort, or retry with different parameters.
 - Successful rounds clean up temp artifacts. Failed, empty-output, and operator-decision rounds should retain `.stderr`, `.status`, and `.runner.out` until diagnosed.
+- As long as fresh `in-progress` heartbeats continue to arrive roughly once per minute, the caller should keep waiting.
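
The keep-waiting rule can be expressed as a freshness check on the most recent `in-progress` heartbeat. In the sketch below the 150-second grace window is an assumption chosen to tolerate a late heartbeat; the source only says heartbeats arrive roughly once per minute, and `should_keep_waiting` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

# Assumed tolerance: a bit more than two heartbeat intervals.
HEARTBEAT_GRACE = timedelta(seconds=150)


def should_keep_waiting(last_heartbeat, now, grace=HEARTBEAT_GRACE):
    """True while the latest `in-progress` heartbeat is still fresh."""
    return now - last_heartbeat <= grace


now = datetime(2026, 3, 24, 12, 0, 0, tzinfo=timezone.utc)
fresh = now - timedelta(seconds=65)    # one heartbeat ago, slightly late
stale = now - timedelta(seconds=400)   # several heartbeats missed
```

With these values, `should_keep_waiting(fresh, now)` is `True` and `should_keep_waiting(stale, now)` is `False`.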

 ### Supported Reviewer CLIs

@@ -178,6 +195,12 @@ For all three CLIs, the preferred execution path is:
 2. run that script through `reviewer-runtime/run-review.sh`
 3. fall back to direct synchronous execution only if the helper is missing or not executable

+## Notifications
+
+- Telegram is the only supported completion notification path.
+- Shared setup: [TELEGRAM-NOTIFICATIONS.md](./TELEGRAM-NOTIFICATIONS.md)
+- Notification failures are non-blocking, but they must be surfaced to the user.
+
 ## Template Guardrails

 All plan templates now include guardrail sections that enforce: