RFC Review: Downgrade Webhook — Billing-Side Quota Notification
Companion review for downgrade-webhook.md. R1–R5 reviewed prior drafts.
R6 reviewed the milestone-based re-trigger rewrite. R7 reflects the event simplification:
DowngradeNotificationEvent removed — all five milestones use NegativeBalanceEvent;
consumers filter milestone == "day_0" for email.
Executive Summary
- Overall Score:
8.5/10
- Rating:
Strong
- RFC Type:
backend
- Sub-Type:
enhancement
- Assessment Confidence:
High
- Applied Caps/Gates: none triggered
- Implementation Readiness Verdict:
PROCEED with notes — chunks 1, 3, 5 are agent-executable immediately; chunks 2 and 6 are blocked on two design confirmations (§5 Q1: delayed job support; §5 Q3: same-day re-entry policy). Both are design questions, not spec gaps — the RFC documents both options for each.
- RFC Author: addo.hernando@mekari.com | Reviewed: 2026-06-30
An AI agent can implement the milestone-based downgrade notification flow from this RFC without guessing. The trigger logic is unambiguous, the two new tables have migration-precision DDL with constraints and indexes, the DowngradeNotificationWorker is specified at pseudocode level with explicit FOR UPDATE idempotency, and Kafka payloads are given as Go structs. The biggest remaining gap is §5 Q1 (delayed job mechanism) — the RFC correctly documents a fallback (cron sweeper) if perform_at is unavailable, but the agent must pause to confirm which path to implement before coding chunk 6. §5 Q3 (same-day re-entry policy) similarly needs a design decision before migration 2.
Quick Verdict
Why this RFC can be implemented agentically:
- Source Verification table cites real file + line for every pattern touched (
publishNegativeBalanceEventsIfNeeded at line 752, re-trigger gate at line 767, jobEnqueuer usage at line 883).
- DDL is complete: 2 new tables with correct constraint naming, milestone CHECK, status CHECK, unique index, lookup indexes. No schema ambiguity.
DowngradeNotificationWorker logic is fully specified (SELECT FOR UPDATE → quota read → publish or cancel). Test command is sourced from qontak-billing/Makefile.
Why this RFC will require design clarification before full implementation:
- §5 Q1: agent cannot know whether
jobEnqueuer supports perform_at without inspecting the implementation — and the two paths (delayed job vs cron sweeper) produce different code for chunk 6.
- §5 Q3: idempotency key design (
yyyymmdd vs uuid) affects migration 2 DDL comment; agent needs a confirmed decision.
Findings Ledger
| ID | Severity | Finding | RFC location | Status | First seen | Resolved in | Evidence / fix |
|---|
REV-1 | major | Delayed job perform_at support in jobEnqueuer not confirmed | §5 Q1, §4.C chunk 6 | open | R6 | — | RFC documents both paths (Option A: delayed job; Option B: cron sweeper). Agent must pause at chunk 6 until confirmed. |
REV-2 | major | Same-day re-entry policy not decided (§5 Q3) — affects migration 2 DDL comment and Day 0 guard logic | §5 Q3, §2.2 DDL | open | R6 | — | Two options: yyyymmdd key (same-day blocked) vs uuid key (re-entry allowed). Decision needed before chunk 2. |
REV-3 | minor | gen_random_uuid() availability on billing Postgres unconfirmed | §5 Q6, §2.2 DDL | open | R6 | — | Run SELECT 1 FROM pg_extension WHERE extname='pgcrypto' pre-migration; or switch to app-side UUID. |
REV-4 | minor | Backfill codes in migration 1 not verified against prod data | §5 Q7, §2.2 migration 1 | open | R6 | — | Run SELECT against prod to confirm 6 codes exist before running migration. |
REV-5 | minor | billing.quota_management.downgrade_notification topic provisioning | §4.A | fixed | R6 | R7 | Topic removed — DowngradeNotificationEvent eliminated; single topic negative_balance sufficient. |
REV-6 | minor | Job enqueue failure for milestone jobs (§5 Q4) leaves orphaned scheduled rows with no worker — only mitigated if cron fallback (Option B) is implemented | §3 failure catalog, §5 Q4 | open | R6 | — | RFC acknowledges; cron fallback closes the gap. |
REV-7 | minor | Mermaid validity: 6 blocks validated with mmdc — all pass | §2.1 diagrams | fixed | R6 | R6 | 6/6 parse; no </> issues |
Ledger summary: 2 major open (REV-1, REV-2), 3 minor open (REV-3, REV-4, REV-6), 2 fixed this cycle (REV-5, REV-7). No external cross-squad blockers.
PRD → RFC Traceability
| PRD requirement | RFC section | Coverage |
|---|
| Downgrade flow triggers on negative quota for eligible components | §1 trigger logic pseudocode, §2 Decision 1 | Full |
| Day 0 immediate + Week 1/2/3/Month 1 staggered milestones | §1 milestone table, §2.1 sequence diagrams, §2 Decision 2 | Full |
| Abort schedule when quota resolves | §1 trigger logic, §2.1 "balance resolves" diagram, §2.4 Detail 2.5 | Full |
| Email once at first negative occurrence | §1 AC, §2 Decision 4, downgrade_notification Kafka event | Full |
| No Qontak One / plan-type gate | §1 trigger logic, §1 Out of Scope | Full (explicitly removed) |
| All work in qontak-billing only | §2 topology, §4.C execution plan | Full |
Summary: all acceptance criteria from the task description are covered. No scope creep.
Scorecard
| Category | Score | Evidence |
|---|
| PRT — PRD Traceability | 9.0 | All 6 PRD/AC items mapped; no scope creep. |
| TDC — Technical Decisions | 9.0 | 5 decisions, each with context/options/rationale/reversibility. Decision 2 documents fallback if delayed job unavailable. REV-1/2 are design confirmations, not dangling decisions. |
| DMS — Data Model & Schema | 8.5 | Complete DDL for 2 tables + column extension; constraints named, indexes justified, lifecycle table, PII/retention notes. Held from 9.0 by REV-2 (idempotency key design open) and REV-3 (pgcrypto). |
| ACV — API Contract & Versioning | 9.5 | One event type (NegativeBalanceEvent), one topic, five milestones. milestone field is the sole differentiator. Consumer contract explicit (filter milestone=day_0 for email). Struct additions are backward-compatible (zero-value defaults). Simpler than two-event design. |
| DIC — Data Integrity & Consistency | 9.0 | Detail 2.5 covers all 6 write paths with scope, partial failure behavior, idempotency key, consistency level. Safe order (INSERT before publish) documented. |
| FMC — Failure Mode & Retry | 8.5 | Failure catalog covers all external calls; orphaned-row risk documented (REV-6); milestone Kafka retry path clear. Held from 9.0 by REV-6. |
| CSS — Concurrency & Scaling | 9.0 | Concurrency map covers 3 collision scenarios; FOR UPDATE guard; idempotent cancel UPDATE. |
| SAS — Security | 9.5 | No new HTTP surfaces; no PII; existing gosec/staticcheck cover new code. |
| MRP — Migration & Rollout | 8.5 | 4-step migration sequence; all additive; rollback = drop column/tables. Chunk dependency order explicit (chunk 4 depends on chunk 1; chunk 6 on chunks 2,3,4,5). Held from 9.0 by REV-1 (delayed job path affects chunk 6 implementation). |
| OBS — Observability | 9.0 | 8 named log events with fields; alert threshold; "debug at 3am" implied by fields. |
| SBC — Service Boundary | 9.5 | Precisely bounded to qontak-billing; consumer-side email RFC explicitly deferred; no coupling to external services. |
| CPA — Pattern Alignment | 9.0 | Source Verification cites file + line for every pattern; chunk 3 mentions make generate for sqlc; worker registration follows existing pattern. |
| CDG — Compliance | 9.0 | No PII in new tables; no compliance trigger. |
Overall: 8.5/10 — strong spec. Two major open items are design questions (REV-1, REV-2) with documented fallback paths, not spec holes.
Decision Closure Assessment
| # | Decision | Status | Agent-implementable? |
|---|
| 1 | triggers_downgrade on billing_components | Resolved | yes — DDL + backfill given |
| 2 | Milestone scheduling mechanism | Resolved (two documented paths) | yes — but path choice affects chunk 6 (§5 Q1) |
| 3 | Day 0 idempotency guard (active-schedule check) | Resolved | yes — FOR UPDATE guard + status check |
| 4 | Email notification: Day 0 only, Kafka event | Resolved | yes — struct + topic given |
| 5 | Balance re-check in milestone worker: direct DB read | Resolved | yes — same quota tables as recalculateComponentQuota |
5 of 5 Resolved. 0 Dangling.
Data Integrity Deep-Dive
| Write path | Idempotency key | Safe order | Duplicate handling |
|---|
downgrade_events INSERT | idempotency_key unique index | before schedule INSERT | unique constraint rejects duplicate → skip Day 0 |
downgrade_schedules INSERT x4 | (downgrade_event_id, milestone) unique | after downgrade_events | unique constraint prevents re-insertion |
Kafka negative_balance Day 0 | consumer dedup on (company_id, billing_code, milestone, timestamp) | after DB inserts | at-most-once; consumer handles idempotency |
Kafka downgrade_notification Day 0 | consumer uses EventID UUID | after negative_balance | EventID is unique per Day 0 event |
Milestone status update | FOR UPDATE row lock | after quota re-check | second executor sees non-scheduled → no-op |
| Cancel remaining | WHERE status='scheduled' predicate | idempotent | re-running UPDATE touches 0 rows if already cancelled |
Concurrency Collision Map
| Resource | Writers | Collision | Resolution |
|---|
| Milestone row | worker + Sidekiq retry | double-fire | FOR UPDATE + status guard |
downgrade_events Day 0 | concurrent recalculations | duplicate key | unique idempotency_key → second fails → skip |
| Cancel remaining scheduled rows | milestone worker + main use case | both cancel | idempotent WHERE status='scheduled' → converges |
Strengths
- Acceptance criteria encoded verbatim — the §1 milestone table and sequence diagrams directly implement the stated ACs (Day 0 + Week 1/2/3/Month 1, abort on resolve, email once). No interpretation needed.
- Self-contained in qontak-billing — the milestone worker queries the billing DB directly (Decision 5). No cross-service read at milestone time. Lower latency, higher reliability.
- Documented fallback — Decision 2 explicitly covers the cron-sweeper fallback if
perform_at is unavailable, so the agent has a complete implementation path regardless of job system capabilities.
Biggest Gaps
- REV-1 (major):
jobEnqueuer perform_at support unknown — agent cannot write chunk 6 without this confirmed.
- REV-2 (major): same-day re-entry policy open — affects migration 2 DDL and Day 0 guard logic.
Priority Actions
- (REV-1) Inspect
jobEnqueuer implementation to confirm perform_at support; document the result in §6 comment log and update §2 Decision 2 with the confirmed path.
- (REV-2) Decide same-day re-entry policy (§5 Q3) and update
idempotency_key format in migration 2 accordingly.
- (REV-3) Run
SELECT 1 FROM pg_extension WHERE extname='pgcrypto' on billing Postgres; note in §6 log.
- (REV-4) Verify 6 backfill
parent_component_code values against prod before migration 1.
(REV-5) Resolved R7 — downgrade_notification topic removed.
Implementation Readiness Checklist
Blocked (resolve before starting chunk)
Pre-deploy checks
Task Manifest
| Order | Chunk | Files | Commands | Acceptance criteria | Depends on |
|---|
| 1 | Migration 1: triggers_downgrade | db/migrations/<ts>_add_triggers_downgrade.up/down.sql | make migrate-up | column exists default false; 6 codes true; rollback clean | — |
| 2 | Migration 2: downgrade_events + downgrade_schedules | db/migrations/<ts>_create_downgrade_tables.up/down.sql | make migrate-up | tables + constraints + indexes; rollback drops tables | REV-2 resolved; REV-3 resolved |
| 3 | Payload + topic constant | internal/app/payload/kafka_event.go, internal/kafka/topics.go | go build ./... | structs compile; topic accessible | — |
| 4 | Query extension | db/queries/billing_components.sql, internal/app/repository/billing_components.sql.go | make generate | TriggersDowngrade populated; existing tests pass | Chunk 1 |
| 5 | DowngradeNotificationWorker | new file + worker registration | go test -race ./internal/app/usecase/quota_management/... | fires when negative; cancels when resolved; FOR UPDATE guard | Chunks 2, 3 |
| 6 | compareComponents Day 0 extension | activate_or_update_component_quota.go | go test -race ./internal/app/usecase/quota_management/... | Day 0 publishes + 4 rows + 4 jobs; active-schedule guard skips; resolve cancels | Chunks 2,3,4,5; REV-1 resolved |
| 7 | Full verification | — | make test lint sec | all green | Chunks 1–6; REV-4, REV-5 resolved |
Open Questions
| # | Question | Category | Severity |
|---|
| 1 | Does jobEnqueuer support perform_at / delayed execution? | TDC / MRP | Major (blocks chunk 6) |
| 2 | Same-day re-entry policy: block (yyyymmdd key) or allow (uuid key)? | DMS | Major (blocks chunk 2) |
| 3 | Cleanup cron for downgrade_events/downgrade_schedules — same RFC or follow-up? | DMS | Nice-to-have |
| 4 | If enqueue fails for a milestone, is cron fallback (Option B) mandatory or optional? | FMC | Important |
| 5 | Who consumes billing.quota_management.negative_balance to send email (filter milestone=day_0)? Separate RFC needed. | SBC | Non-blocking |
| 6 | gen_random_uuid() availability on billing Postgres? | DMS | Pre-deploy gate |
| 7 | Backfill codes confirmed against production data? | DMS | Pre-deploy gate |
Review History
| Cycle | Date | RFC revision | Score | Verdict | Notes |
|---|
| R1–R5 | 2026-06-23/30 | prior drafts | 7.5–8.5 | PROCEED with notes | Superseded by scope/architecture rewrite |
| R6 | 2026-06-30 | last_updated 2026-06-30 (milestone-based re-trigger rewrite) | 8.5 | PROCEED with notes | Milestone scheduling (Day 0 + Week 1/2/3/Month 1); email once at Day 0; 2 design confirmations needed (REV-1,2); 6/6 mermaid pass |
| R7 | 2026-06-30 | last_updated 2026-06-30 (event + struct simplification) | 8.5 | PROCEED with notes | Removed DowngradeNotificationEvent; renamed Milestone string → TriggerSequence int (1..5); removed TriggersDowngrade from event; consumers filter trigger_sequence==1 for email; ACV raised to 9.5; REV-5 closed |