
RFC Review: Downgrade Webhook — Billing-Side Quota Notification

Companion review for downgrade-webhook.md. R1–R5 reviewed prior drafts. R6 reviewed the milestone-based re-trigger rewrite. R7 reflects the event simplification: DowngradeNotificationEvent removed — all five milestones use NegativeBalanceEvent; consumers filter milestone == "day_0" for email.

Executive Summary

  • Overall Score: 8.5/10
  • Rating: Strong
  • RFC Type: backend
  • Sub-Type: enhancement
  • Assessment Confidence: High
  • Applied Caps/Gates: none triggered
  • Implementation Readiness Verdict: PROCEED with notes — chunks 1, 3, 5 are agent-executable immediately; chunks 2 and 6 are blocked on two design confirmations (§5 Q1: delayed job support; §5 Q3: same-day re-entry policy). Both are design questions, not spec gaps — the RFC documents both options for each.
  • RFC Author: addo.hernando@mekari.com | Reviewed: 2026-06-30

An AI agent can implement the milestone-based downgrade notification flow from this RFC without guessing. The trigger logic is unambiguous, the two new tables have migration-precision DDL with constraints and indexes, the DowngradeNotificationWorker is specified at pseudocode level with explicit FOR UPDATE idempotency, and Kafka payloads are given as Go structs. The biggest remaining gap is §5 Q1 (delayed job mechanism) — the RFC correctly documents a fallback (cron sweeper) if perform_at is unavailable, but the agent must pause to confirm which path to implement before coding chunk 6. §5 Q3 (same-day re-entry policy) similarly needs a design decision before migration 2.


Quick Verdict

Why this RFC can be implemented agentically:

  • Source Verification table cites real file + line for every pattern touched (publishNegativeBalanceEventsIfNeeded at line 752, re-trigger gate at line 767, jobEnqueuer usage at line 883).
  • DDL is complete: 2 new tables with correct constraint naming, milestone CHECK, status CHECK, unique index, lookup indexes. No schema ambiguity.
  • DowngradeNotificationWorker logic is fully specified (SELECT FOR UPDATE → quota read → publish or cancel). Test command is sourced from qontak-billing/Makefile.
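
The worker flow named above (lock the row, re-read quota, then publish or cancel) can be sketched as a pure decision function. This is a minimal illustration, not the RFC's code: the `Action` type, the `decide` helper, and the `"fired"` status are hypothetical names, and the sketch assumes a balance of zero or more counts as resolved.

```go
package main

import "fmt"

// Action is what the milestone worker does after locking a schedule row.
// The type and its values are hypothetical, used only for this sketch.
type Action string

const (
	ActionNoop    Action = "noop"    // row already handled by another executor
	ActionPublish Action = "publish" // balance still negative: emit the event
	ActionCancel  Action = "cancel"  // balance resolved: cancel remaining milestones
)

// decide applies the status guard first, then the quota re-check.
// Assumption: status and balance were read inside the transaction that
// holds the FOR UPDATE lock on the schedule row.
func decide(status string, balance int64) Action {
	if status != "scheduled" {
		return ActionNoop
	}
	if balance < 0 {
		return ActionPublish
	}
	return ActionCancel
}

func main() {
	fmt.Println(decide("scheduled", -5)) // publish
	fmt.Println(decide("scheduled", 0))  // cancel
	fmt.Println(decide("fired", -5))     // noop
}
```

Keeping the decision separate from the database and Kafka calls makes the three outcomes easy to cover in unit tests without a running Postgres.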

Why this RFC will require design clarification before full implementation:

  • §5 Q1: agent cannot know whether jobEnqueuer supports perform_at without inspecting the implementation — and the two paths (delayed job vs cron sweeper) produce different code for chunk 6.
  • §5 Q3: idempotency key design (yyyymmdd vs uuid) affects migration 2 DDL comment; agent needs a confirmed decision.
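
To make the §5 Q3 trade-off concrete, the two candidate key formats might look like the sketch below. The key layout, the `dailyKey` and `uuidKey` helpers, and the `example_code` billing code are assumptions for illustration; the RFC may define the key differently.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"time"
)

// dailyKey builds a yyyymmdd-style key: a second negative-balance event for
// the same company and component on the same UTC day produces the same key,
// so the unique index blocks same-day re-entry.
func dailyKey(companyID int64, billingCode string, at time.Time) string {
	return fmt.Sprintf("%d:%s:%s", companyID, billingCode, at.UTC().Format("20060102"))
}

// uuidKey builds a random version 4 UUID: every event gets a fresh key,
// so same-day re-entry is allowed.
func uuidKey() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// exampleTime returns 2026-06-30 at the given hour (UTC) for the demo below.
func exampleTime(hour int) time.Time {
	return time.Date(2026, 6, 30, hour, 0, 0, 0, time.UTC)
}

func main() {
	morning := dailyKey(42, "example_code", exampleTime(1))
	evening := dailyKey(42, "example_code", exampleTime(23))
	fmt.Println(morning == evening) // same day, same key

	k, err := uuidKey()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(k)) // canonical UUID text length
}
```

With the daily key the unique index itself enforces the policy; with the UUID key the Day 0 guard has to rely on the active-schedule check alone.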

Findings Ledger

| ID | Severity | Finding | RFC location | Status | First seen | Resolved in | Evidence / fix |
|---|---|---|---|---|---|---|---|
| REV-1 | major | Delayed job perform_at support in jobEnqueuer not confirmed | §5 Q1, §4.C chunk 6 | open | R6 | | RFC documents both paths (Option A: delayed job; Option B: cron sweeper). Agent must pause at chunk 6 until confirmed. |
| REV-2 | major | Same-day re-entry policy not decided (§5 Q3); affects migration 2 DDL comment and Day 0 guard logic | §5 Q3, §2.2 DDL | open | R6 | | Two options: yyyymmdd key (same-day blocked) vs uuid key (re-entry allowed). Decision needed before chunk 2. |
| REV-3 | minor | gen_random_uuid() availability on billing Postgres unconfirmed | §5 Q6, §2.2 DDL | open | R6 | | Run SELECT 1 FROM pg_extension WHERE extname='pgcrypto' pre-migration; or switch to app-side UUID. |
| REV-4 | minor | Backfill codes in migration 1 not verified against prod data | §5 Q7, §2.2 migration 1 | open | R6 | | Run SELECT against prod to confirm 6 codes exist before running migration. |
| REV-5 | minor | billing.quota_management.downgrade_notification topic provisioning | §4.A | fixed | R6 | R7 | Topic removed: DowngradeNotificationEvent eliminated; single topic negative_balance sufficient. |
| REV-6 | minor | Job enqueue failure for milestone jobs (§5 Q4) leaves orphaned scheduled rows with no worker; only mitigated if cron fallback (Option B) is implemented | §3 failure catalog, §5 Q4 | open | R6 | | RFC acknowledges; cron fallback closes the gap. |
| REV-7 | minor | Mermaid validity: 6 blocks validated with mmdc, all pass | §2.1 diagrams | fixed | R6 | R6 | 6/6 parse; no </> issues |

Ledger summary: 2 major open (REV-1, REV-2), 3 minor open (REV-3, REV-4, REV-6), 2 fixed (REV-5 in R7, REV-7 in R6). No external cross-squad blockers.


PRD → RFC Traceability

| PRD requirement | RFC section | Coverage |
|---|---|---|
| Downgrade flow triggers on negative quota for eligible components | §1 trigger logic pseudocode, §2 Decision 1 | Full |
| Day 0 immediate + Week 1/2/3/Month 1 staggered milestones | §1 milestone table, §2.1 sequence diagrams, §2 Decision 2 | Full |
| Abort schedule when quota resolves | §1 trigger logic, §2.1 "balance resolves" diagram, §2.4 Detail 2.5 | Full |
| Email once at first negative occurrence | §1 AC, §2 Decision 4, NegativeBalanceEvent with milestone == "day_0" (downgrade_notification event removed in R7) | Full |
| No Qontak One / plan-type gate | §1 trigger logic, §1 Out of Scope | Full (explicitly removed) |
| All work in qontak-billing only | §2 topology, §4.C execution plan | Full |

Summary: all acceptance criteria from the task description are covered. No scope creep.
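
The milestone cadence in the table above (Day 0 immediate, then Week 1/2/3 and Month 1) can be sketched as a schedule computed from the Day 0 timestamp. This is an illustration under assumptions: the milestone names other than `day_0` are hypothetical, and "Month 1" is taken here as one calendar month, which the RFC may instead define as 30 days.

```go
package main

import (
	"fmt"
	"time"
)

// Milestone is one delayed re-trigger. Names are hypothetical apart from
// the cadence itself, which comes from the reviewed RFC.
type Milestone struct {
	Name  string
	DueAt time.Time
}

// scheduleFrom returns the four delayed milestones relative to Day 0.
// Day 0 itself fires immediately and gets no schedule row.
func scheduleFrom(day0 time.Time) []Milestone {
	return []Milestone{
		{Name: "week_1", DueAt: day0.AddDate(0, 0, 7)},
		{Name: "week_2", DueAt: day0.AddDate(0, 0, 14)},
		{Name: "week_3", DueAt: day0.AddDate(0, 0, 21)},
		{Name: "month_1", DueAt: day0.AddDate(0, 1, 0)}, // assumption: calendar month
	}
}

// exampleDay0 is a sample Day 0 timestamp used by the demo below.
var exampleDay0 = time.Date(2026, 6, 30, 9, 0, 0, 0, time.UTC)

func main() {
	for _, m := range scheduleFrom(exampleDay0) {
		fmt.Println(m.Name, m.DueAt.Format("2006-01-02"))
	}
}
```

Four rows come back, matching the "4 rows + 4 jobs" acceptance criterion for chunk 6.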


Scorecard

| Category | Score | Evidence |
|---|---|---|
| PRT (PRD Traceability) | 9.0 | All 6 PRD/AC items mapped; no scope creep. |
| TDC (Technical Decisions) | 9.0 | 5 decisions, each with context/options/rationale/reversibility. Decision 2 documents fallback if delayed job unavailable. REV-1/2 are design confirmations, not dangling decisions. |
| DMS (Data Model & Schema) | 8.5 | Complete DDL for 2 tables + column extension; constraints named, indexes justified, lifecycle table, PII/retention notes. Held from 9.0 by REV-2 (idempotency key design open) and REV-3 (pgcrypto). |
| ACV (API Contract & Versioning) | 9.5 | One event type (NegativeBalanceEvent), one topic, five milestones. milestone field is the sole differentiator. Consumer contract explicit (filter milestone=day_0 for email). Struct additions are backward-compatible (zero-value defaults). Simpler than two-event design. |
| DIC (Data Integrity & Consistency) | 9.0 | Detail 2.5 covers all 6 write paths with scope, partial failure behavior, idempotency key, consistency level. Safe order (INSERT before publish) documented. |
| FMC (Failure Mode & Retry) | 8.5 | Failure catalog covers all external calls; orphaned-row risk documented (REV-6); milestone Kafka retry path clear. Held from 9.0 by REV-6. |
| CSS (Concurrency & Scaling) | 9.0 | Concurrency map covers 3 collision scenarios; FOR UPDATE guard; idempotent cancel UPDATE. |
| SAS (Security) | 9.5 | No new HTTP surfaces; no PII; existing gosec/staticcheck cover new code. |
| MRP (Migration & Rollout) | 8.5 | 4-step migration sequence; all additive; rollback = drop column/tables. Chunk dependency order explicit (chunk 4 depends on chunk 1; chunk 6 on chunks 2,3,4,5). Held from 9.0 by REV-1 (delayed job path affects chunk 6 implementation). |
| OBS (Observability) | 9.0 | 8 named log events with fields; alert threshold; "debug at 3am" implied by fields. |
| SBC (Service Boundary) | 9.5 | Precisely bounded to qontak-billing; consumer-side email RFC explicitly deferred; no coupling to external services. |
| CPA (Pattern Alignment) | 9.0 | Source Verification cites file + line for every pattern; chunk 3 mentions make generate for sqlc; worker registration follows existing pattern. |
| CDG (Compliance) | 9.0 | No PII in new tables; no compliance trigger. |

Overall: 8.5/10 — strong spec. Two major open items are design questions (REV-1, REV-2) with documented fallback paths, not spec holes.
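
The consumer contract credited under ACV (one event type, email only for Day 0) might look like the sketch below on the consumer side. The struct here is a hypothetical subset of NegativeBalanceEvent: the JSON field names and the `shouldSendEmail` helper are assumptions, not the RFC's definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NegativeBalanceEvent is a hypothetical subset of the real payload;
// field names and JSON tags are assumed for illustration.
type NegativeBalanceEvent struct {
	CompanyID   int64  `json:"company_id"`
	BillingCode string `json:"billing_code"`
	Milestone   string `json:"milestone"`
}

// shouldSendEmail decodes a raw message and reports whether it is the
// Day 0 event, the only milestone that should produce an email.
func shouldSendEmail(raw []byte) (bool, error) {
	var ev NegativeBalanceEvent
	if err := json.Unmarshal(raw, &ev); err != nil {
		return false, err
	}
	return ev.Milestone == "day_0", nil
}

func main() {
	day0 := []byte(`{"company_id":1,"billing_code":"example_code","milestone":"day_0"}`)
	legacy := []byte(`{"company_id":1,"billing_code":"example_code"}`)

	send, err := shouldSendEmail(day0)
	fmt.Println(send, err)

	// A payload without the new field decodes to the zero value, so
	// older producers never trigger the email path.
	send, err = shouldSendEmail(legacy)
	fmt.Println(send, err)
}
```

The second call illustrates the backward-compatibility point: a missing milestone field decodes to an empty string and is simply filtered out.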


Decision Closure Assessment

| # | Decision | Status | Agent-implementable? |
|---|---|---|---|
| 1 | triggers_downgrade on billing_components | Resolved | yes: DDL + backfill given |
| 2 | Milestone scheduling mechanism | Resolved (two documented paths) | yes, but path choice affects chunk 6 (§5 Q1) |
| 3 | Day 0 idempotency guard (active-schedule check) | Resolved | yes: FOR UPDATE guard + status check |
| 4 | Email notification: Day 0 only, Kafka event | Resolved | yes: struct + topic given |
| 5 | Balance re-check in milestone worker: direct DB read | Resolved | yes: same quota tables as recalculateComponentQuota |

5 of 5 Resolved. 0 Dangling.


Data Integrity Deep-Dive

| Write path | Idempotency key | Safe order | Duplicate handling |
|---|---|---|---|
| downgrade_events INSERT | idempotency_key unique index | before schedule INSERT | unique constraint rejects duplicate → skip Day 0 |
| downgrade_schedules INSERT x4 | (downgrade_event_id, milestone) unique | after downgrade_events | unique constraint prevents re-insertion |
| Kafka negative_balance Day 0 | consumer dedup on (company_id, billing_code, milestone, timestamp) | after DB inserts | at-least-once; consumer handles idempotency |
| Kafka downgrade_notification Day 0 (removed in R7, see REV-5) | consumer uses EventID UUID | after negative_balance | EventID is unique per Day 0 event |
| Milestone status update | FOR UPDATE row lock | after quota re-check | second executor sees non-scheduled → no-op |
| Cancel remaining | WHERE status='scheduled' predicate | idempotent | re-running UPDATE touches 0 rows if already cancelled |
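
The safe write order in the table (event row, then schedule rows, then publish, with a duplicate key skipping Day 0) can be sketched with an in-memory stand-in for the tables and the topic. Everything here is hypothetical scaffolding: the `store` type, its methods, and the milestone names are assumptions, and a real implementation would run the inserts in a database transaction.

```go
package main

import (
	"errors"
	"fmt"
)

var errDuplicate = errors.New("duplicate idempotency key")

// store is an in-memory stand-in: events models the unique index on
// idempotency_key, schedules the milestone rows, published the Kafka topic.
type store struct {
	events    map[string]bool
	schedules []string
	published []string
}

func newStore() *store { return &store{events: map[string]bool{}} }

// insertEvent fails on a repeated key, like a unique-constraint violation.
func (s *store) insertEvent(key string) error {
	if s.events[key] {
		return errDuplicate
	}
	s.events[key] = true
	return nil
}

// handleDay0 writes the event row, then the four schedule rows, and only
// then publishes. It returns false when a duplicate key skips Day 0.
func (s *store) handleDay0(key string) (bool, error) {
	if err := s.insertEvent(key); err != nil {
		if errors.Is(err, errDuplicate) {
			return false, nil // already handled: skip Day 0 entirely
		}
		return false, err
	}
	for _, m := range []string{"week_1", "week_2", "week_3", "month_1"} {
		s.schedules = append(s.schedules, key+"/"+m)
	}
	s.published = append(s.published, key+"/day_0")
	return true, nil
}

func main() {
	s := newStore()
	first, _ := s.handleDay0("42:example_code:20260630")
	second, _ := s.handleDay0("42:example_code:20260630")
	fmt.Println(first, second, len(s.schedules), len(s.published))
}
```

Because the publish comes last, a crash before it leaves rows without an event rather than an event without rows, which is the direction a retry or sweeper can repair.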

Concurrency Collision Map

| Resource | Writers | Collision | Resolution |
|---|---|---|---|
| Milestone row | worker + Sidekiq retry | double-fire | FOR UPDATE + status guard |
| downgrade_events Day 0 | concurrent recalculations | duplicate key | unique idempotency_key → second fails → skip |
| Cancel remaining scheduled rows | milestone worker + main use case | both cancel | idempotent WHERE status='scheduled' → converges |
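
Two of the resolutions above, the status guard under a row lock and the idempotent cancel, can be illustrated with a small concurrent sketch. A `sync.Mutex` stands in for SELECT ... FOR UPDATE here; the `scheduleRow` type, its methods, and the `"fired"` and `"cancelled"` status values are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"sync"
)

// scheduleRow models one milestone row; mu stands in for the row lock
// that SELECT ... FOR UPDATE would take in Postgres.
type scheduleRow struct {
	mu     sync.Mutex
	status string
}

// fire returns true only for the executor that holds the lock while the
// row is still "scheduled"; every later executor sees a no-op.
func (r *scheduleRow) fire() bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.status != "scheduled" {
		return false
	}
	r.status = "fired"
	return true
}

// raceFire runs n concurrent executors against one row and counts how
// many of them actually fired.
func raceFire(n int) int {
	row := &scheduleRow{status: "scheduled"}
	var wg sync.WaitGroup
	var countMu sync.Mutex
	fired := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if row.fire() {
				countMu.Lock()
				fired++
				countMu.Unlock()
			}
		}()
	}
	wg.Wait()
	return fired
}

// cancelRemaining flips every still-scheduled row to "cancelled" and
// reports how many rows changed, like UPDATE ... WHERE status='scheduled'.
func cancelRemaining(rows []*scheduleRow) int {
	n := 0
	for _, r := range rows {
		r.mu.Lock()
		if r.status == "scheduled" {
			r.status = "cancelled"
			n++
		}
		r.mu.Unlock()
	}
	return n
}

func main() {
	fmt.Println(raceFire(8)) // exactly one executor wins

	rows := []*scheduleRow{{status: "scheduled"}, {status: "fired"}, {status: "scheduled"}}
	fmt.Println(cancelRemaining(rows), cancelRemaining(rows)) // second run changes nothing
}
```

The second `cancelRemaining` call touching zero rows is the convergence property the map relies on when the milestone worker and the main use case both cancel.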

Strengths

  • Acceptance criteria encoded verbatim — the §1 milestone table and sequence diagrams directly implement the stated ACs (Day 0 + Week 1/2/3/Month 1, abort on resolve, email once). No interpretation needed.
  • Self-contained in qontak-billing — the milestone worker queries the billing DB directly (Decision 5). No cross-service read at milestone time. Lower latency, higher reliability.
  • Documented fallback — Decision 2 explicitly covers the cron-sweeper fallback if perform_at is unavailable, so the agent has a complete implementation path regardless of job system capabilities.
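
For the cron-sweeper fallback (Option B), the selection step might look like the sketch below: pick rows that are still scheduled and already due, which also recovers rows whose delayed job was never enqueued. The `schedule` struct, the `dueSchedules` helper, and the sample data are hypothetical; in the service this would be a SQL query rather than a loop.

```go
package main

import (
	"fmt"
	"time"
)

// schedule is a hypothetical in-memory view of a downgrade_schedules row.
type schedule struct {
	ID     int
	Status string
	DueAt  time.Time
}

// dueSchedules returns the IDs a sweeper run would hand to the worker:
// rows still in "scheduled" status whose due time is at or before now.
func dueSchedules(rows []schedule, now time.Time) []int {
	var ids []int
	for _, r := range rows {
		if r.Status == "scheduled" && !r.DueAt.After(now) {
			ids = append(ids, r.ID)
		}
	}
	return ids
}

// Sample data for the demo: row 1 is overdue (for example, its job was
// never enqueued), row 2 is not due yet, row 3 was already cancelled.
var (
	exampleNow  = time.Date(2026, 7, 8, 9, 0, 0, 0, time.UTC)
	exampleRows = []schedule{
		{ID: 1, Status: "scheduled", DueAt: time.Date(2026, 7, 7, 9, 0, 0, 0, time.UTC)},
		{ID: 2, Status: "scheduled", DueAt: time.Date(2026, 7, 14, 9, 0, 0, 0, time.UTC)},
		{ID: 3, Status: "cancelled", DueAt: time.Date(2026, 7, 7, 9, 0, 0, 0, time.UTC)},
	}
)

func main() {
	fmt.Println(dueSchedules(exampleRows, exampleNow))
}
```

Since the worker itself is guarded by the row lock and status check, running the sweeper alongside delayed jobs would not double-fire a milestone.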

Biggest Gaps

  • REV-1 (major): jobEnqueuer perform_at support unknown — agent cannot write chunk 6 without this confirmed.
  • REV-2 (major): same-day re-entry policy open — affects migration 2 DDL and Day 0 guard logic.

Priority Actions

  1. (REV-1) Inspect jobEnqueuer implementation to confirm perform_at support; document the result in §6 comment log and update §2 Decision 2 with the confirmed path.
  2. (REV-2) Decide same-day re-entry policy (§5 Q3) and update idempotency_key format in migration 2 accordingly.
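  3. (REV-3) Check the billing Postgres version first: gen_random_uuid() is built in from PostgreSQL 13, so the pgcrypto check (SELECT 1 FROM pg_extension WHERE extname='pgcrypto') is only needed on older servers; note the result in §6 log.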
  4. (REV-4) Verify 6 backfill parent_component_code values against prod before migration 1.
  5. (REV-5) Resolved R7 — downgrade_notification topic removed.

Implementation Readiness Checklist

Unblocked (start immediately)

  • Trigger logic complete and unambiguous
  • Migration 1 DDL (triggers_downgrade column + backfill) — chunk 1
  • Payload + topic constants — chunk 3
  • DowngradeNotificationWorker fully specified with pseudocode — chunk 5
  • Query extension (TriggersDowngrade in struct) — chunk 4 (depends chunk 1)
  • Data integrity: safe write order; FOR UPDATE guard; idempotent cancel
  • All 5 decisions resolved; no dangling architectural choices
  • 6/6 mermaid blocks parse (REV-7 fixed R6)

Blocked (resolve before starting chunk)

  • REV-1 — confirm delayed job perform_at before chunk 6 (Day 0 enqueue)
  • REV-2 — confirm same-day re-entry policy before chunk 2 (migration 2)

Pre-deploy checks

  • REV-3 — pgcrypto confirmation
  • REV-4 — backfill codes verified
  • REV-5: closed in R7; the downgrade_notification topic was removed, so no new Kafka topic needs provisioning

Task Manifest

| Order | Chunk | Files | Commands | Acceptance criteria | Depends on |
|---|---|---|---|---|---|
| 1 | Migration 1: triggers_downgrade | db/migrations/<ts>_add_triggers_downgrade.up/down.sql | make migrate-up | column exists default false; 6 codes true; rollback clean | |
| 2 | Migration 2: downgrade_events + downgrade_schedules | db/migrations/<ts>_create_downgrade_tables.up/down.sql | make migrate-up | tables + constraints + indexes; rollback drops tables | REV-2 resolved; REV-3 resolved |
| 3 | Payload + topic constant | internal/app/payload/kafka_event.go, internal/kafka/topics.go | go build ./... | structs compile; topic accessible | |
| 4 | Query extension | db/queries/billing_components.sql, internal/app/repository/billing_components.sql.go | make generate | TriggersDowngrade populated; existing tests pass | Chunk 1 |
| 5 | DowngradeNotificationWorker | new file + worker registration | go test -race ./internal/app/usecase/quota_management/... | fires when negative; cancels when resolved; FOR UPDATE guard | Chunks 2, 3 |
| 6 | compareComponents Day 0 extension | activate_or_update_component_quota.go | go test -race ./internal/app/usecase/quota_management/... | Day 0 publishes + 4 rows + 4 jobs; active-schedule guard skips; resolve cancels | Chunks 2,3,4,5; REV-1 resolved |
| 7 | Full verification | | make test lint sec | all green | Chunks 1–6; REV-4 resolved (REV-5 already closed in R7) |
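
The chunk-to-chunk dependencies in the manifest can be grouped into waves of work that may proceed in parallel. The sketch below is illustrative only: `chunkDeps` encodes just the chunk dependencies from the table, deliberately leaves out the REV-n gates, and the `waves` helper is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// chunkDeps lists, for each chunk, the chunks it depends on.
// Assumption: REV gates (REV-1 through REV-4) are tracked separately.
var chunkDeps = map[int][]int{
	1: nil,
	2: nil,
	3: nil,
	4: {1},
	5: {2, 3},
	6: {2, 3, 4, 5},
	7: {1, 2, 3, 4, 5, 6},
}

// waves groups chunks so that every chunk appears after all of its
// dependencies; chunks in the same wave have no ordering between them.
// It returns an error if the dependencies cannot be satisfied.
func waves(deps map[int][]int) ([][]int, error) {
	done := map[int]bool{}
	var out [][]int
	for len(done) < len(deps) {
		var ready []int
		for c, ds := range deps {
			if done[c] {
				continue
			}
			ok := true
			for _, d := range ds {
				if !done[d] {
					ok = false
					break
				}
			}
			if ok {
				ready = append(ready, c)
			}
		}
		if len(ready) == 0 {
			return nil, errors.New("unsatisfiable or cyclic chunk dependencies")
		}
		sort.Ints(ready) // map iteration order is random; sort for stable output
		for _, c := range ready {
			done[c] = true
		}
		out = append(out, ready)
	}
	return out, nil
}

func main() {
	w, err := waves(chunkDeps)
	if err != nil {
		panic(err)
	}
	fmt.Println(w) // [[1 2 3] [4 5] [6] [7]]
}
```

Layering the open REV gates on top of this ordering is left out on purpose, since which gates are closed changes from cycle to cycle.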

Open Questions

| # | Question | Category | Severity |
|---|---|---|---|
| 1 | Does jobEnqueuer support perform_at / delayed execution? | TDC / MRP | Major (blocks chunk 6) |
| 2 | Same-day re-entry policy: block (yyyymmdd key) or allow (uuid key)? | DMS | Major (blocks chunk 2) |
| 3 | Cleanup cron for downgrade_events/downgrade_schedules: same RFC or follow-up? | DMS | Nice-to-have |
| 4 | If enqueue fails for a milestone, is cron fallback (Option B) mandatory or optional? | FMC | Important |
| 5 | Who consumes billing.quota_management.negative_balance to send email (filter milestone=day_0)? Separate RFC needed. | SBC | Non-blocking |
| 6 | gen_random_uuid() availability on billing Postgres? | DMS | Pre-deploy gate |
| 7 | Backfill codes confirmed against production data? | DMS | Pre-deploy gate |

Review History

| Cycle | Date | RFC revision | Score | Verdict | Notes |
|---|---|---|---|---|---|
| R1–R5 | 2026-06-23/30 | prior drafts | 7.5–8.5 | PROCEED with notes | Superseded by scope/architecture rewrite |
| R6 | 2026-06-30 | last_updated 2026-06-30 (milestone-based re-trigger rewrite) | 8.5 | PROCEED with notes | Milestone scheduling (Day 0 + Week 1/2/3/Month 1); email once at Day 0; 2 design confirmations needed (REV-1,2); 6/6 mermaid pass |
| R7 | 2026-06-30 | last_updated 2026-06-30 (event + struct simplification) | 8.5 | PROCEED with notes | Removed DowngradeNotificationEvent; renamed Milestone string → TriggerSequence int (1..5); removed TriggersDowngrade from event; consumers filter trigger_sequence==1 for email; ACV raised to 9.5; REV-5 closed |