Skip to main content

RFC: Postpaid Usage Monthly Scheduler — Snapshot, Dashboard & Bulk Export (Backend)

Document Conventions (do not remove)

This RFC follows the Qontak RFC Template format for governance — the metadata table, Confluence sections 1–6, and Comment logs are mandatory. Sections marked N/A — reason are intentionally not applicable, not omitted.

It is agent-execution-ready: §1 PRD-to-Schema Derivation, §2 Repo Reading Guide (Detail 2.0), mermaid diagrams, and §4 Agent Execution Plan + Verification & Rollback Recipe are complete (see §7).

Delivery & project management live elsewhere. This RFC is the technical artifact only. Staffing, effort, timeline, and rollout schedule live in the initiative's delivery/ folder. This RFC has not yet been handed to delivery.

Scope note — this is the BACKEND RFC for two repos: moderator-be + hub_core. The Modpanel "Postpaid Usage Dashboard" (PUMS-S02…S06) is server-rendered Rails + Hotwire (Turbo/Stimulus) inside moderator-be — not a separate SPA and not a @mekari/ds design-system frontend — so its controllers, ERB views, and Stimulus controllers are in-repo moderator-be work covered here. The visual design (Figma) is still pending (PRD §7 "Figma: TBD"); these views are specified functionally against the read/search/export contracts, and pixel-level design polish is tracked as an Open Question (§5 OQ-5), not a blocker for the data-and-behavior layer.

The PRD's service map was wrong about the data source — this RFC corrects it. PRD §6.2 assumed the usage data lives in a Go service qontak-billing and async export in a Go report-worker. Neither exists in the grounded repo set. The billing data (monthly_usages, organization_packages, organizations, waba_accounts) lives in hub_core — a Rails engine (gem) mounted into the qontak_chat_service host app — and moderator-be already reaches it cross-service over HTTP via pigeon_get (verified: Core::Repositories::AppIntegrations::ChatPanel::Reports::DownloadMonthlyPostpaidUsageCHAT_PANEL_SERVICE_BASE_URL /api/core/v1/reports/billing/download/monthly_postpaid_usage). The async ZIP export is built in moderator-be Sidekiq (it already bundles caxlsx and has an async-export precedent — new_reporting_flow), not in a Go report-worker.

Metadata

FieldValueNotes
StatusRFC (in discussion)Human label; YAML status: carries the remapped linter enum draft
DRIpm-group.qontak@mekari.comSingle accountable owner (mirrors PRD/initiative). Per-task staffing lives in delivery/.
TeambifrostAdvisory squad slug carried from the PRD / initiative README
Author(s)HafrizBackend implementor (initiative README implementors: Hafriz (Backend))
Reviewersmoderator-be · hub_core · qontak_chat_service reviewers (TBD)Tech reviewers across the affected repos
Approver(s)Bifrost tech lead (TBD) + infosec approver (TBD)PII (encrypted organization_name) is read cross-service — infosec sign-off required
Submitted Date2026-06-30Date RFC opened for discussion
Last Updated2026-06-30Bump on every material edit
Target Release2026-Q3
Target Quarter2026-Q3Advisory, carried from the PRD
Deliverynot yet handed to delivery
Related../prds/prd-postpaid-usage-monthly-scheduler.mdSource PRD (v1.3)
DiscussionTBD — #bifrost squad Slack channel

Type: backend Sub-type: new-feature

Sections at a Glance

  1. Overview (incl. §1 PRD-to-Schema Derivation — entities, business rules, contracts; no Figma)
  2. Technical Design (Infrastructure Topology → Technical Decisions [ADR] → Repo Reading Guide → Architecture & Service Map → Sequence Diagrams → DDL → APIs → integrity / concurrency / async specs)
  3. High-Availability & Security
  4. Backwards Compatibility and Rollout Plan (incl. §4 Agent Execution Plan + Verification & Rollback Recipe)
  5. Concern, Questions, or Known Limitations
  6. Comment logs
  7. Ready for agent execution

1. Overview

The Finance team manually downloads postpaid usage reports one client at a time at the start of every billing cycle (~3 hours/month, confirmed PRD §12). Billing V3 multiplies the problem by adding multiple postpaid components per client (WhatsApp Balance, MUV, Voice/Call Balance, whitelisted quotas). Today the only tool is moderator-be's synchronous download_monthly_postpaid_usage endpoint, which produces one CSV per organization_name query by proxying to the billing host app.

This RFC defines the backend of a Postpaid Usage Monthly Scheduler:

  1. A monthly cron (moderator-be Sidekiq, 0 2 1 * * Asia/Jakarta) that reads the prior month's finalized usage for all postpaid CIDs from hub_core (cross-service via the qontak_chat_service HTTP API) and writes frozen snapshot rows — one per CID per billing type — into a new postpaid_usage_snapshots table in moderator-be's billing DB.
  2. A Modpanel dashboard (server-rendered Rails + Hotwire in moderator-be) for Finance to list / search / filter those snapshots (paginated, 50/page).
  3. A bulk async ZIP export (moderator-be Sidekiq) that packages the selected snapshots into one .xlsx per CID per type inside a single ZIP, tracked in a new postpaid_usage_exports status table.

The change touches two repos (plus a thin contract addition to the billing host app):

  • moderator-be (Ruby/Rails 7.2) — cron scheduler, snapshot + export tables, Modpanel UI, async ZIP worker, Finance RBAC (Pundit + permission), Flipper flag, cross-service read client.
  • hub_core (Ruby/Rails engine, billing domain) — owns the source data; must be extended to expose a bulk monthly-usage summary that adds waba_id, billing version/type, V3 Call Balance, and whitelisted-quota components (its current report returns only WA Balance + MUV and no waba_id). Exposed via a new qontak_chat_service route.

Success Criteria

  • On the 1st of each month, snapshots are generated for ≥ 99% of eligible postpaid CIDs within 2 hours of the cron start (PRD §6, §12); a per-CID failure logs snapshot_failed and does not abort the run (PUMS-S01/ERR-1).
  • The dashboard table loads ≤ 3 s for up to 1,000 records (PRD §6) reading from the local snapshot table (no live cross-service call on page load).
  • A bulk ZIP for a selection ≤ 50 MB completes ≤ 30 min (PRD §6), with ≥ 95% job success within 60 days of GA.
  • Finance time on the monthly download drops from 3 h → ≤ 30 min (PRD §12 ⭐ KPI).
  • Snapshot data is immutable once written — a mid-month read returns end-of-month values unchanged (PUMS-S01/AC-2).

Out of Scope

  • Real-time usage (snapshots are frozen month-end values only) — PRD §5.1.
  • V1 components other than WA Balance + MUV — PRD §5.2.
  • Edit / correct / reprocess / manual regenerate of usage data — PRD §5.3, §5.6.
  • External-client / CID-admin exposure (internal Finance Modpanel only) — PRD §5.4.
  • Email / push notification when a snapshot is ready — PRD §5.5.
  • ERP/accounting integration — PRD §5.7.
  • Building a Go qontak-billing or report-worker service — they do not exist; this RFC uses hub_core + moderator-be Sidekiq instead.
  • Pixel-level Modpanel visual design — Figma pending (§5 OQ-5); views built functionally.

Assumptions

  • qontak_chat_service (the host app mounting hub_core) is reachable from moderator-be via ENV['CHAT_PANEL_SERVICE_BASE_URL'] with the existing Core::Services::Redis::ChatPanel::Auth token — verified by the existing DownloadMonthlyPostpaidUsage repository.
  • hub_core's daily aggregation (Billings::AddDailyUsageWorker, ~00:00 + 7h Jakarta offset) has fully written month-end monthly_usages rows before the snapshot cron runs at 02:00 on the 1st (the 2-hour gap is the safety margin — PRD §15, 2026-06-23 decision).
  • moderator-be's billing DB (chat_billing, models under Billings::ApplicationRecord) is the correct home for the new snapshot/export tables (it is operationally owned by Bifrost and already separate from hub_core's billing DB).
  • billing_version "1.0.0" = V1, "3.0.0" = V3 (verified migration). "2.0.0" handling is unresolved — see §5 OQ-3 (carried from PRD OQ-3).

Dependencies

DependencyTypeWhat we needOwnerAvailability
hub_core bulk summaryInternal (data source)New repo/interactor returning per-CID per-type usage incl. waba_id, V3 Call Balance, quotashub_core (Bifrost)needs building (extends existing DownloadPostpaidMonthlyUsage)
qontak_chat_service routeInternal (HTTP contract)New host-app route mounting the hub_core interactor, callable by moderator-be pigeon clientchat_service host (Bifrost)needs building
moderator-be Sidekiq + sidekiq-cronInfra (exists)Cron scheduling + worker queuesBifrostexists (config/schedule.yml)
caxlsx gemLibrary (exists)XLSX generation per CID fileexists (Gemfile:128) — resolves PRD OQ-2
rubyzip gemLibrary (new)Package per-CID .xlsx into one ZIPneeds adding (no zip gem present today)
Object storage + presigned URLInfra (unconfirmed)Store generated ZIP, serve 24 h download linkInfraBLOCKER for export delivery — no S3/OSS client found in moderator-be; see §5 OQ-1 + Decision 6
Flipper / Core::Services::PreferenceInfra (exists)Feature flag postpaid_usage_scheduler_enabledBifrostexists
Pundit + Permission/RoleInfra (exists)New Finance permission keyBifrostexists

PRD-to-Schema Derivation (backend-specific — required)

Backend RFCs derive the schema/contract from the PRD as a domain spec; no Figma. Every §2.3 DDL row and §2.4 endpoint traces back to a row here.

PRD-described entity / attribute / rulePersisted as (table.column)Exposed via (endpoint / event)Enforced whereSource (PRD §)
Frozen monthly usage per CID per postpaid typepostpaid_usage_snapshots(cid, organization_id, waba_id, year_month, billing_type, usage_value, report_date)hub_core bulk summary (write side) + Modpanel list endpoint (read side)PostpaidUsageSnapshotWorker (write); immutable after insert (no update path)§9.2 PUMS-S01
Snapshot is frozen/immutable mid-monthpostpaid_usage_snapshots — no UPDATE path; unique key (cid, year_month, billing_type)n/a (read-only thereafter)DB unique index + worker upsert-once semanticsPUMS-S01/AC-2
V1 = WA Balance + MUV onlybilling_type ∈ {WA_BALANCE_V1, MUV_V1}hub_core summary filters billing_version="1.0.0"hub_core repo queryPUMS-S01/AC-3
V3 = WA Balance + MUV + Call Balance + whitelisted quotasbilling_type ∈ {WA_BALANCE_V3, MUV_V3, CALL_BALANCE_V3, <quota codes>}hub_core summary reads monthly_usages + organization_package_component_postpaid_limitshub_core repo queryPUMS-S01/AC-4
WA Balance value = monthly_usages.mcc_total; MUV = muv_totalpostpaid_usage_snapshots.usage_valuehub_core summaryhub_core repo mappingPUMS-S01 (Data Fields)
WABA ID per CIDpostpaid_usage_snapshots.waba_idorganizations.settings['waba_id'] or waba_accounts.waba_idhub_core summary (new field)hub_core repo joinPUMS-S02, PRD OQ-1 (resolved)
Company name (decrypted)snapshot stores company_name; source organizations.name / organization_packages.organization_name (lockbox-encrypted)hub_core summary (decrypted server-side, never logged)hub_core lockbox has_encryptedPUMS-S02, PRD §6.2
Per-CID snapshot failure isolationsnapshot_failed log event (cid, year_month, reason); run continuesworker rescue-per-CIDPostpaidUsageSnapshotWorkerPUMS-S01/ERR-1
snapshot_failed rate > 5% → alertmetric moderator_panel_postpaid_snapshot_failed countDatadog alertworker + DatadogPUMS-S01/ERR-2, §11.1
Paginated dashboard read (50/page)read postpaid_usage_snapshotsGET /api/v1/qontak/chat_panel/reports/postpaid_usage_snapshotsPundit can? + PagyPUMS-S02
Search by CID/WABA + filter by year_monthquery params search_query, year_monthsame list endpointrepository WHERE (exact match)PUMS-S04
Postpaid type label / "Unknown" fallbackbilling_type → display maplist endpoint serializerview/serializerPUMS-S03 + PUMS-S03/ERR-1
Bulk async ZIP, ≤ 50 MB, 24 h linkpostpaid_usage_exports(job_id, status, file_ref, file_size_bytes, expires_at, requested_by, error_details)POST .../postpaid_usage_exports (trigger) + GET .../postpaid_usage_exports/:job_id (poll)PostpaidUsageExportWorker; 50 MB pre-check; TTL cleanupPUMS-S06
ZIP file naming {CID} {Company Name} {Postpaid Usage YYYY-MM} {Type}.xlsxn/a (file name)export workercaxlsx builderPUMS-S06/AC-3
Download link expiry 24 hpostpaid_usage_exports.expires_atpoll endpoint returns "expired" past TTLTTL cleanup cron + checkPUMS-S06/AC-4, §6.1
Finance-only accessn/a (authz)all endpoints behind permission billing_postpaid_usage_*Pundit policy + can?PUMS-S02…S06 Permission Model

Detail 1.A — PRD Traceability Matrix

Composite AC ids (<STORY-ID>/AC-n, also ERR-n) — AC numbering restarts per story.

Forward (PRD AC → RFC):

PRD composite AC idService / endpoint / jobRFC section
PUMS-S01/AC-1..AC-4PostpaidUsageSnapshotWorker + hub_core summary§2 Decision 1–3 · §2.2 seq-1 · §2.3 · §4.C ch.2,4,6
PUMS-S01/ERR-1, ERR-2per-CID rescue + Datadog alert§2.2 seq-2 · §3.A.1 · §3 Monitoring
PUMS-S02/AC-1..AC-3, ERR-1, ERR-2GET .../postpaid_usage_snapshots + Modpanel view§2.4 row 1 · §4.C ch.7,8
PUMS-S03/AC-1..AC-3, ERR-1billing_type display map§2.3 (billing_type enum) · §4.C ch.8
PUMS-S04/AC-1..AC-4, ERR-1, ERR-2list endpoint search/filter params§2.4 row 1 · §4.C ch.8
PUMS-S05/AC-1..AC-5, ERR-1client-side selection (Stimulus)§4.C ch.9 (FE-in-repo)
PUMS-S06/AC-1..AC-4, ERR-1..ERR-3POST/GET .../postpaid_usage_exports + PostpaidUsageExportWorker§2 Decision 4–6 · §2.2 seq-3,4 · §2.3 · §2.C · §4.C ch.10,11,12

Reverse (RFC → PRD AC):

New endpoint / table / service / dependencyPRD composite AC id it serves
postpaid_usage_snapshots tablePUMS-S01/AC-1, PUMS-S02/AC-1
hub_core bulk summary (waba_id + V3)PUMS-S01/AC-4, PUMS-S02/AC-1
GET postpaid_usage_snapshots (paginated)PUMS-S02/AC-3, PUMS-S04/AC-1..AC-4
postpaid_usage_exports table + workerPUMS-S06/AC-1, AC-2
50 MB pre-checkPUMS-S06/ERR-1
expires_at + TTL cleanupPUMS-S06/AC-4
Finance permission keyPUMS-S02…S06 Permission Model

UI / Consumer Surface Coverage

PRD-named surfaceConsumerRequired reads (BE endpoint)Required writes (BE endpoint)Status surface
/modpanel/postpaid-usage Dashboard (table, search, filter)web (Modpanel, server-rendered)GET .../postpaid_usage_snapshotsn/a — read-only viewreport_date per row
Bulk download / ZIP status toastweb (Modpanel + Stimulus poll)GET .../postpaid_usage_exports/:job_idPOST .../postpaid_usage_exportspostpaid_usage_exports.status enum

Role Coverage

PRD roleAuthorization mechanismEndpoints permittedCross-tenant?Audit trail
Finance Modpanel userPundit can? + permission billing_postpaid_usage_index / _downloadlist + export trigger + pollyes — Finance sees all CIDs (internal tool, by design)bulk_download_triggered event + integration_logs
Non-Finance Modpanel userPundit denies (no permission)none → 403 / hiddenn/adenied attempt logged
System (cron)no auth surface — Sidekiq-internaln/a (no manual-trigger endpoint)n/asnapshot_generated / snapshot_failed

PRD Section Coverage

PRD §TitleWhere covered
3One-liner + Problem§1 Overview
4Target Users + Persona§1 (Finance) · Role Coverage
5Non-Goals§1 Out of Scope
Scope ChangesBackend/Frontend§1 (BE here; FE = in-repo server-rendered)
6 / 6.1 / 6.2Constraints / Lifecycle / Service Architecture§3 Performance · §2.3 retention · §2 Infra Topology (corrected)
7New Features (Dashboard)§2.4 · §4.C ch.7–9
8API & Webhook Behavior§2.4 APIs · §2.C async
9.1 / 9.2System Flow / User Stories§2.2 sequence · Detail 1.C
10Rollout§4 Rollout Strategy
11 / 11.1Observability / Cadence§3 Monitoring & Alerting
12Success Metrics§1 Success Criteria
13Launch Plan & Stage Gates§4 Rollout Strategy
14Dependencies§1 Dependencies
15Key Decisions§2 Technical Decisions (ADR)
16Open Questions§5 (OQ-1 storage, OQ-3 V2 carried; OQ-1 WABA resolved)

Detail 1.B — Key Decisions Summary (full ADR treatment in §2 Technical Decisions)

#DecisionChosen option§2 block
1Where snapshots are storedNew postpaid_usage_snapshots table in moderator-be chat_billing DBDecision 1
2How moderator-be reads source usageCross-service pigeon_get to hub_core via qontak_chat_service (reuse existing pattern)Decision 2
3hub_core: extend vs new bulk endpointExtend the existing DownloadPostpaidMonthlyUsage report into a bulk summary (adds waba_id + V3)Decision 3
4Snapshot generation sync vs asyncAsync Sidekiq cron, per-CID isolation, no run abortDecision 4
5Bulk export sync vs async + ZIP build locationAsync Sidekiq worker in moderator-be, caxlsx + rubyzipDecision 5
6ZIP storage + deliveryObject storage + presigned URL — mechanism unconfirmed (recommended option + blocker)Decision 6
7File format XLSX vs CSVXLSX via caxlsx (already bundled) — resolves PRD OQ-2Decision 7

Detail 1.C — Per-Story Change Map (human-readable index)

Story idTitleLayer scopeChanges (concrete BE artifacts)Composite AC idsAcceptance criteria (verifiable)RFC anchors
PUMS-S01Monthly Snapshot GenerationRuntime / behavior (cron) + Cross-squad (hub_core)PostpaidUsageSnapshotWorker; schedule.yml cron 0 2 1 * * Asia/Jakarta; postpaid_usage_snapshots migration+model; SummaryPostpaidUsage pigeon repo; hub_core bulk summary repo/interactor/builder + chat_service routeS01/AC-1..AC-4, ERR-1, ERR-2rspec on worker passes; running worker for a fixture month inserts N rows = eligible CIDs × types; per-CID raise logs snapshot_failed and run completes; Datadog count emitted§2 Dec 1–4 · §2.2 seq-1,2 · §2.3 · §4.C ch.2,3,4,6
PUMS-S02View Postpaid Usage TableBE + FE consumes newGET .../postpaid_usage_snapshots (Pagy 50/page); Modpanel controller + ERB view + StimulusS02/AC-1..AC-3, ERR-1, ERR-2rspec request spec returns 50 rows w/ columns; flag-off → 403/hidden; empty month → empty-state JSON§2.4 row 1 · §4.C ch.7,8
PUMS-S03Identify Postpaid TypesBE + FE consumes newbilling_type enum → label map (WA Balance / MUV / Call Balance / Unknown)S03/AC-1..AC-3, ERR-1unit test maps each enum to label; unknown code → "Unknown" not blank§2.3 enum · §4.C ch.8
PUMS-S04Search & FilterBE + FE consumes newlist endpoint params search_query (exact CID/WABA), year_month (required, default latest)S04/AC-1..AC-4, ERR-1, ERR-2request spec: exact-match CID returns subset; combined filter ANDs; no match → empty-state§2.4 row 1 · §4.C ch.8
PUMS-S05Bulk SelectionFE-only (Stimulus, in-repo)Stimulus controller: per-row + select-all-across-pages, persist across pagination, clear on filter changeS05/AC-1..AC-5, ERR-1Stimulus unit/jest or system spec: select-all sets count; filter change clears; refresh clears§4.C ch.9
PUMS-S06Bulk Download (Async ZIP)BE + FE consumes newPOST/GET .../postpaid_usage_exports; PostpaidUsageExportWorker; postpaid_usage_exports migration+model; caxlsx+rubyzip; storage+presigned (Dec 6); 50 MB pre-checkS06/AC-1..AC-4, ERR-1..ERR-3request spec: >50 MB → reject + zip_size_limit_exceeded; worker builds ZIP w/ exact file names; status → completed; link past expires_at → "expired"; 0 selected → 422§2 Dec 5–7 · §2.2 seq-3,4 · §2.3 · §2.C · §4.C ch.10,11,12

2. Technical Design

Infrastructure Topology (start here)

Deployment topology

flowchart TB
finance(["Finance user (browser)"]) -->|HTTPS| lb["Load Balancer / Ingress"]
lb -->|HTTP| mod["moderator-be api pods (Rails 7.2, Puma)"]
mod -->|"read / write"| billdb[("chat_billing Postgres (primary)")]
mod -->|"enqueue"| redis[("Redis (Sidekiq queues)")]
redis -->|consume| worker["moderator-be Sidekiq worker pods"]
cron["sidekiq-cron (schedule.yml)"] -->|"0 2 1 * * Asia/Jakarta"| redis
worker -->|"read / write"| billdb
worker -->|"HTTPS pigeon_get (auth token)"| chatsvc["qontak_chat_service (host app + hub_core engine)"]
chatsvc -->|"read"| hubdb[("hub_core billing Postgres + replica_billing")]
worker -->|"PUT object (UNCONFIRMED — Decision 6)"| oss[("Object storage (ZIP files)")]
mod -->|"GET presigned URL (UNCONFIRMED)"| oss

Per-service responsibility

flowchart LR
subgraph mod["moderator-be (Bifrost — Ruby/Rails 7.2)"]
uc1["PostpaidUsageSnapshotWorker (cron — generate snapshots)"]
uc2["GET postpaid_usage_snapshots (list / search / filter)"]
uc3["POST/GET postpaid_usage_exports (bulk ZIP)"]
uc4["PostpaidUsageExportWorker (build XLSX + ZIP)"]
end
subgraph hub["hub_core engine via qontak_chat_service (Bifrost)"]
s1["GET summary_postpaid_usage (bulk monthly usage + waba_id + V3)"]
end
uc1 -->|"HTTPS pigeon_get"| s1
s1 -->|"ActiveRecord read (replica)"| hubread[("monthly_usages, organizations, waba_accounts, postpaid_limits")]
uc1 -->|"INSERT frozen rows"| snaptbl[("postpaid_usage_snapshots")]
uc2 -->|"SELECT (Pagy)"| snaptbl
uc3 -->|"enqueue + status"| exptbl[("postpaid_usage_exports")]
uc4 -->|"read snapshots, write status"| exptbl
uc4 -->|"caxlsx + rubyzip, PUT object"| oss[("Object storage")]

Every external/internal edge above is verified except the Object storage edges (Decision 6, §5 OQ-1) — drawn dashed-in-prose as UNCONFIRMED.


Technical Decisions (ADR-format — the engineering heart)


Decision 1: Store snapshots in moderator-be's chat_billing DB (not in hub_core)

Context The frozen monthly snapshot is a new dataset shaped for the Modpanel dashboard (per-CID per-type rows with waba_id, display company_name, report_date). It must be queried with pagination/search on every page load (≤ 3 s for 1,000 rows) and must be immutable. hub_core owns the source aggregates (monthly_usages) but is a shared billing engine; moderator-be owns the Modpanel and its billing DB (chat_billing, models under Billings::ApplicationRecord).

Options considered

  • Option A — new table in moderator-be chat_billing: snapshot lives next to the dashboard.
    • Pros: dashboard reads are local (no cross-service latency on page load); Bifrost owns it end-to-end; no schema churn in the shared billing engine.
    • Cons: data duplication (snapshot copies a projection of monthly_usages); a cross-service read at generation time.
  • Option B — new table in hub_core billing DB: snapshot lives with the source data.
    • Pros: no cross-service read at generation; single source of truth.
    • Cons: every dashboard page load becomes a cross-service call (blows the 3 s budget / couples Modpanel to chat_service uptime); schema change in a shared engine consumed by many host apps; Modpanel RBAC can't gate a hub_core table.

Decision: Option A.

Rationale The dashboard's read-latency budget and Modpanel ownership dominate. Generation is a once-a-month batch where a cross-service read is acceptable; page loads are frequent and must be local. Mirrors the existing split — moderator-be already proxies billing reads cross-service rather than embedding the billing engine.

Consequences Snapshot is a denormalized projection; if monthly_usages is later corrected the snapshot does not auto-update (acceptable — snapshots are intentionally frozen, PUMS-S01/AC-2). A cross-service dependency exists at generation (handled by Decision 4 isolation).

Reversibility Moderate. The table + worker are additive; moving to Option B means relocating the migration to hub_core and swapping the read endpoint for a local query — bounded, no data loss (snapshots regenerate monthly).


Decision 2: Read source usage cross-service via pigeon_get to qontak_chat_service

Context moderator-be cannot read hub_core's billing DB directly — hub_core is a Rails engine mounted in a different service (qontak_chat_service) with its own DB. The verified, in-production pattern for billing reads is pigeon_get to ENV['CHAT_PANEL_SERVICE_BASE_URL'] with a Core::Services::Redis::ChatPanel::Auth token (see DownloadMonthlyPostpaidUsage).

Options considered

  • Option A — reuse pigeon_get cross-service HTTP: extend the existing report client.
    • Pros: zero new infra; reuses auth, circuit breaker, retry, timeout already wired in Core::Repositories::AbstractRepository; matches how Modpanel already fetches billing data.
    • Cons: bulk payload (all postpaid CIDs for a month) over HTTP — must paginate/stream to avoid a huge response.
  • Option B — direct DB connection from moderator-be to hub_core billing DB: add a shard.
    • Pros: no HTTP serialization overhead.
    • Cons: cross-service DB coupling (anti-pattern here); bypasses the billing engine's encryption/decryption (organization_name is lockbox-encrypted in hub_core); ops/secret-sharing burden.

Decision: Option A.

Rationale Reuse of a proven, secured path; respects the service boundary; lets hub_core do the lockbox decryption server-side so moderator-be never holds the billing master key. Bulk size is managed by paginating the summary endpoint (Decision 3) and the once-a-month cadence.

Consequences Generation latency includes HTTP round-trips; the summary endpoint must paginate. A chat_service outage delays (not corrupts) snapshot generation — the cron can be re-run.

Reversibility Easy — it's one repository class behind the worker.


Decision 3: Extend hub_core's existing report into a paginated bulk summary (adds waba_id + V3)

Context hub_core already has Billings::Repositories::V1::Reports::DownloadPostpaidMonthlyUsage (reads monthly_usages joined to organizations, returns mcc_total/muv_total). It does not return waba_id, does not cover V3 Call Balance or whitelisted quota components (organization_package_component_postpaid_limits), and is shaped to render a CSV for one organization_name filter — not a bulk machine-readable summary for all CIDs.

Options considered

  • Option A — extend the existing report repo into a bulk JSON summary: add waba_id, V3 components, billing_version, pagination; expose via a new chat_service route.
    • Pros: reuses the verified query (read_from_replica_db, org join, encryption handling); one place for postpaid-usage logic; least new surface in the shared engine.
    • Cons: the existing repo is CSV-oriented (Builders::...DownloadPostpaidMonthlyUsage); needs a sibling builder for JSON + the extra fields.
  • Option B — brand-new unrelated interactor: build from scratch.
    • Pros: clean separation from the CSV report.
    • Cons: duplicates the org-join + replica-read + encryption logic; two code paths to keep correct.

Decision: Option A — a new bulk summary interactor/repository/builder alongside the existing report, sharing its read helpers, exposed at a new route (e.g. GET /api/core/v1/reports/billing/summary_postpaid_usage).

Rationale Maximizes reuse of verified, encryption-aware read code; isolates the new JSON shape in a dedicated builder; keeps the CSV report untouched (backward compatible). Resolves PRD OQ-1 (waba_id from organizations.settings['waba_id'] or waba_accounts) in the one place that can do it without leaking the encryption key.

Consequences hub_core gains a new bulk endpoint and a quota/V3 query path (organization_package_component_postpaid_limits); a sibling builder is maintained. V3 quota-code enumeration must be confirmed with the billing team (§5 OQ-2-bis).

Reversibility Easy — additive; the existing CSV report is unchanged.


Decision 4: Snapshot generation is async (Sidekiq cron) with per-CID isolation

Context Generation spans all postpaid CIDs × billing types and a cross-service read — far over 100 ms and partial-failure-prone. PRD requires the run to continue past a single CID failure and finish within 2 h.

Options considered

  • Option A — one Sidekiq cron job, iterate CIDs, rescue per CID: retry: 0, log + continue.
    • Pros: simple; matches existing monthly crons (chatbot_ai_reset_monthly_task, airene_reset_monthly_task — both 0 0 1 * * Asia/Jakarta); one run = one month.
    • Cons: a single long-running job; one slow CID slows the whole run.
  • Option B — fan-out: dispatcher enqueues one job per CID (mirrors hub_core's SingleAddMonthlyUsageWorker).
    • Pros: parallelism, natural per-CID isolation/retry.
    • Cons: more moving parts; harder to assert "run complete + failure rate"; thundering cross-service reads against chat_service.

Decision: Option A for Phase 1 (single job, per-CID begin/rescue, accumulate failures, emit snapshot_failed per CID + a run summary). Revisit Option B only if the 2 h budget is breached.

Rationale Matches the verified monthly-cron precedent and retry: 0 convention; keeps the cross-service read rate controlled; makes ERR-2 (>5% failure → alert) a single post-run check.

Consequences A pathologically slow chat_service could approach the 2 h budget; mitigated by paginating the summary read and a per-page timeout. No automatic retry of the whole run (manual re-trigger via re-enqueue if needed).

Reversibility Easy — swap the loop body for perform_async per CID.


Decision 5: Bulk export is async (Sidekiq) building XLSX+ZIP in moderator-be

Context A Finance selection can be hundreds of CIDs; synchronous generation times out the browser (PRD §15). The named repo set has no Go report-worker — so the ZIP is built in moderator-be. moderator-be already bundles caxlsx and has an async-export precedent (new_reporting_flow, and a download_mcc_v2 path that returns a zip).

Options considered

  • Option A — moderator-be Sidekiq worker, caxlsx + rubyzip: enqueue on trigger, poll status.
    • Pros: stays in the named repos; reuses Sidekiq + the existing async-export UX precedent; caxlsx already present.
    • Cons: needs a new rubyzip dependency and object storage (Decision 6); ZIP build is CPU/memory on the worker pod.
  • Option B — delegate to the chat/reporting service's async export (the service behind new_reporting_flow).
    • Pros: that service may already have storage + presigned URLs.
    • Cons: that service is not in this RFC's grounded repo set; its contract for "bulk postpaid snapshot ZIP" is unverified; couples delivery to another team's roadmap.

Decision: Option A.

Rationale Keeps the work inside the two named repos and reuses caxlsx + the async precedent; avoids a hard dependency on an un-grounded service. The one gap (storage) is isolated to Decision 6.

Consequences Adds rubyzip; worker pods need memory headroom for ZIP build (≤ 50 MB cap bounds it); introduces the storage dependency (Decision 6 / OQ-1).

Reversibility Moderate — the worker + tables are additive; switching to Option B replaces the worker body with a delegation call.


Context The generated ZIP must persist and be served to Finance as a link that expires in 24 h (PRD §6.1). No S3/OSS/GCS client or storage gem was found in moderator-be (verified grep — only caxlsx, no aws-sdk/aliyun/fog/storage service). This is the one place the design cannot be fully grounded in existing code.

Options considered

  • Option A — object storage + presigned URL (S3/Alibaba OSS), expires_at = created_at + 24h.
    • Pros: standard; link expiry maps to presigned TTL; no large blobs in Postgres.
    • Cons: requires provisioning a bucket + client in moderator-be — not present today.
  • Option B — store the ZIP bytes in Postgres / on a shared volume, serve via an authenticated moderator-be download endpoint that checks expires_at.
    • Pros: no new external infra; moderator-be already serves files via send_data.
    • Cons: large blobs in the DB (≤ 50 MB × concurrent jobs) is poor practice; volume needs cleanup.

Decision: Recommend Option A; treat the concrete storage client/bucket as a blocker for the export-delivery chunks only (§4.C ch.11–12) until Infra confirms availability (§5 OQ-1). Option B is the documented fallback if no bucket can be provisioned in Phase 1.

Rationale Object storage is the correct long-term answer and matches the PRD's presigned-URL intent; but the RFC must not fabricate a storage client that does not exist. The snapshot + dashboard (Phase 1a) ship independently of this; bulk export (Phase 1b) gates on this confirmation.

Consequences Phase 1b cannot execute until storage is decided. The postpaid_usage_exports schema is storage-agnostic (file_ref holds either an object key or a local path) so neither option forces a migration change.

Reversibility The file_ref indirection makes A↔B a worker-internal change, no schema churn.


Decision 7: File format is XLSX via caxlsx (resolves PRD OQ-2)

Context PRD §16 OQ-2 left XLSX-vs-CSV open, assuming the Go report-worker only did CSV.

Options considered

  • Option A — XLSX via caxlsx: Gemfile:128 already bundles it.
    • Pros: matches PRD's stated .xlsx naming; no new dependency for the spreadsheet itself.
    • Cons: slightly larger files than CSV.
  • Option B — CSV: simpler.
    • Pros: smallest files.
    • Cons: diverges from PRD's stated format; Finance asked for .xlsx.

Decision: Option A — XLSX via caxlsx.

Rationale caxlsx is already a dependency, so XLSX carries no new library cost — the original OQ-2 concern (needing a new Go Excel lib) does not apply in the Ruby path.

Consequences Per-file .xlsx; ZIP packaging via rubyzip.

Reversibility Trivial — the builder can emit CSV instead.


Detail 2.0 — Repo Reading Guide

Repo Map (mermaid)

flowchart LR
subgraph mod["moderator-be"]
sched["config/schedule.yml"]
wkr["app/domains/core/workers/sidekiq/"]
repo["app/domains/core/repositories/app_integrations/chat_panel/reports/"]
ctrl["app/controllers/api/v1/qontak/chat_panel/reports/"]
mdl["app/models/billings/"]
mig["db/billing/migrate/"]
views["app/views/billing/postpaid_usage/"]
end
subgraph hub["hub_core"]
hint["app/apps/billings/interactors/v1/reports/"]
hrepo["app/apps/billings/repositories/v1/reports/"]
hbld["app/apps/billings/builders/v1/reports/"]
hmdl["app/core/domains/models/billing/"]
end
sched --> wkr --> repo --> hint
hint --> hrepo --> hmdl
hrepo --> hbld
ctrl --> mdl --> mig
ctrl --> views

Existing Code Anchors

Path (repo)Why the agent reads itWhat pattern it teaches
moderator-be: app/controllers/api/v1/qontak/chat_panel/reports/reports_controller.rbThe existing postpaid endpoint + async (new_reporting_flow) + zip (download_mcc_v2) precedentDry::Matcher::ResultMatcher; send_data; flag-gated async/zip branching
moderator-be: app/domains/core/repositories/app_integrations/chat_panel/reports/download_monthly_postpaid_usage.rbThe exact cross-service read pattern to copy for the summary readpigeon_get(service: 'qontak_chat_service', path:…) + Core::Services::Redis::ChatPanel::Auth + parse_response_csv
moderator-be: app/domains/core/workers/sidekiq/chatbot_ai_reset_monthly_worker.rb + abstract_worker.rbMonthly-cron worker shape to copy< Core::Workers::Sidekiq::AbstractWorker; sidekiq_options queue:, retry: 0; perform → repository
moderator-be: config/schedule.yml (chatbot_ai_reset_monthly_task)Exact cron entry format6-field cron + Asia/Jakarta; name/class/queue keys
moderator-be: app/domains/core/repositories/abstract_repository.rbBase for HTTP/circuit-breaker/Dry::Monadsinclude …Http::PigeonHttp, CircuitBreakers; parse_response
moderator-be: app/domains/core/services/preference.rbFlipper flag definition + checkCore::Services::Preference.new.enabled?(:flag)
moderator-be: app/controllers/application_controller.rb (can?) + app/policies/application_policy.rbPermission gate + Pundit policy shapebefore_action :can?; Core::UseCases::Permissions::Check; pundit_user = current_account_user
moderator-be: db/billing_schema.rb + config/database.yml (migrations_paths: db/billing/migrate)Billing DB dialect + migration homeActiveRecord::Schema[7.2]; UUID PK gen_random_uuid(); decimal 15,2; billing migrations live in db/billing/migrate
moderator-be: app/models/billings/application_record.rbBase class for new billing modelsBillings::ApplicationRecord (chat_billing connection)
moderator-be: app/domains/core/services/datadog/custom_metric.rbMetric naming conventionDatadog::Statsd; names prefixed moderator_panel_*
hub_core: app/apps/billings/repositories/v1/reports/download_postpaid_monthly_usage.rb (+ interactor + builder)The report to extend into a bulk summaryread_from_replica_db { … }; Models::Billing::MonthlyUsage.where(timestamp: range); org join
hub_core: app/core/domains/models/billing/organization_package.rbEncryption + billing_version + company_idhas_encrypted :organization_name, key: :lockbox_billing_key; billing_version; payment_type
hub_core: app/core/domains/models/organization.rb + waba_account.rbWABA join (resolves PRD OQ-1)store_accessor :settings, :waba_id; has_many :waba_accounts

Existing Contracts to Reuse, Extend, or Replace

ContractStatusJustificationOwner
moderator-be cross-service billing read (pigeon_get to chat_service)reuseExact pattern in DownloadMonthlyPostpaidUsageBifrost
hub_core DownloadPostpaidMonthlyUsage reportextendAdd waba_id, V3 Call Balance, quota components, bulk JSON + pagination via a sibling builderBifrost
chat_service route GET /api/core/v1/reports/billing/summary_postpaid_usagenew-with-justificationNo bulk machine-readable summary exists; the only route is the CSV download for one org filterBifrost
moderator-be GET .../reports/postpaid_usage_snapshots (list)new-with-justificationThe existing endpoint returns a synchronous file for one name filter; the dashboard needs paginated JSON from the new local snapshot tableBifrost
moderator-be POST/GET .../reports/postpaid_usage_exportsnew-with-justificationNo bulk async-ZIP export endpoint for snapshots existsBifrost
Sidekiq cron worker + schedule.ymlreuse/extendAdd one cron entry + one worker following the monthly-cron precedentBifrost
Flipper flag via Core::Services::Preferencereusepostpaid_usage_scheduler_enabledBifrost
Pundit can? + Permission modelreuse/extendAdd one permission keyBifrost

Patterns to Follow

ConcernPattern in repoReference fileDeviation?
HTTP handler shapeDry::Matcher::ResultMatcher over a use-case resultmoderator-be: …/reports_controller.rbnone
Cross-service readpigeon_get + chat-panel auth tokenmoderator-be: …/download_monthly_postpaid_usage.rbnone
Repository / DB access< Core::Repositories::AbstractRepository; AR for billing modelsmoderator-be: abstract_repository.rbnone
Worker shape< Core::Workers::Sidekiq::AbstractWorker; sidekiq_optionsmoderator-be: chatbot_ai_reset_monthly_worker.rbnone
Migration dialectPostgres, UUID PK gen_random_uuid(), decimal 15,2moderator-be: db/billing_schema.rbnone
hub_core readread_from_replica_db { … } on replica shardhub_core: download_postpaid_monthly_usage.rbnone
Logging / metricsRails.logger.error + Core::Services::Datadog::CustomMetric (moderator_panel_*)moderator-be: custom_metric.rbnone
Encryptionlockbox has_encrypted — decrypt only in hub_corehub_core: organization_package.rbnone — moderator-be never holds the key

Reading Order for the Agent

  1. moderator-be: …/reports/reports_controller.rb — controller + async/zip precedent.
  2. moderator-be: …/reports/download_monthly_postpaid_usage.rb — cross-service read to copy.
  3. moderator-be: app/domains/core/workers/sidekiq/chatbot_ai_reset_monthly_worker.rb + abstract_worker.rb — worker shape.
  4. moderator-be: config/schedule.yml — cron entry format.
  5. moderator-be: db/billing_schema.rb + config/database.yml — billing dialect + db/billing/migrate path.
  6. moderator-be: app/models/billings/application_record.rb — billing model base.
  7. moderator-be: app/controllers/application_controller.rb (can?) + app/policies/application_policy.rb — authz.
  8. moderator-be: app/domains/core/services/preference.rb — Flipper flag check.
  9. hub_core: …/reports/download_postpaid_monthly_usage.rb (+ interactor + builder) — the report to extend.
  10. hub_core: organization.rb + waba_account.rb + organization_package.rb — WABA join + encryption + billing_version.

Source Verification (anti-hallucination — required)

Anchor / pattern / contractVerified byEvidence
moderator-be reports controllerreadclass Api::V1::Qontak::ChatPanel::Reports::ReportsController; download_monthly_postpaid_usage L45-54; new_reporting_flow branch L24-30; download_mcc_v2'zip'
moderator-be cross-service readreadpigeon_get(service: 'qontak_chat_service', path: '/api/core/v1/reports/billing/download/monthly_postpaid_usage'…); @base_url = ENV['CHAT_PANEL_SERVICE_BASE_URL']; Core::Services::Redis::ChatPanel::Auth
Worker base + monthly cronreadclass Core::Workers::Sidekiq::AbstractWorker; include Sidekiq::Worker; ChatbotAiResetMonthlyWorker < …AbstractWorker, sidekiq_options queue: :chatbot_ai_reset_monthly, retry: 0
Cron formatreadchatbot_ai_reset_monthly_task: cron: "0 0 1 * * Asia/Jakarta" (6-field + TZ); airene_reset_monthly_task same
AbstractRepository HTTPreadinclude Core::Repositories::Http::PigeonHttp, CircuitBreakers; parse_response, parse_response_file
Flipper preferencereadCore::Services::Preference#enabled?(:feature, …)Flipper.enabled?(feature.to_sym)
Permission/Punditread (agent)application_controller.rb can?Core::UseCases::Permissions::Check; application_policy.rb pundit_user = current_account_user
Billing migration path + dialectreadconfig/database.yml: migrations_paths: db/billing/migrate; db/billing_schema.rb ActiveRecord::Schema[7.2], id: :uuid, default: -> { "gen_random_uuid()" }, decimal precision: 15, scale: 2
Billing model basereadapp/models/billings/application_record.rb (Billings::ApplicationRecord)
Datadog metric namingread (agent)Core::Services::Datadog::CustomMetric; @name = "moderator_panel_#{name}"
caxlsx present / no zip/OSS gemread (grep)Gemfile:128 gem 'caxlsx'; grep for rubyzip|aws|aliyun|fog|presigned|put_objectno match
hub_core report to extendreadBillings::Repositories::V1::Reports::DownloadPostpaidMonthlyUsage; read_from_replica_db { Models::Organization.select(:id,:name,:company_id,:moderator_account_id)… }; Models::Billing::MonthlyUsage.where(timestamp: @start_date..@end_date)no waba_id
hub_core monthly_usages columnsread (agent)migration create_monthly_usages: organization_id, timestamp, mcc_total, muv_total (decimal 15,2)
hub_core WABA join (OQ-1)read (agent)Models::Organization store_accessor :settings, :waba_id; has_many :waba_accounts; Models::WabaAccount waba_id, waba_name
hub_core encryption + billing_versionread (agent)organization_package.rb has_encrypted :organization_name, key: :lockbox_billing_key; billing_version default '1.0.0'; payment_type (prepaid/postpaid)

Detail 2.1 — Architecture (mermaid)

Component diagram

flowchart TB
cron["sidekiq-cron schedule.yml"] --> snapwkr["PostpaidUsageSnapshotWorker"]
snapwkr --> sumrepo["SummaryPostpaidUsage (pigeon repo)"]
sumrepo -->|HTTPS| chatsvc["qontak_chat_service / hub_core summary interactor"]
chatsvc --> hubrepo["Billings::Repositories::V1::Reports::SummaryPostpaidUsage"]
hubrepo --> hubdb[("monthly_usages / organizations / waba_accounts / postpaid_limits")]
snapwkr --> snapmodel["Billings::PostpaidUsageSnapshot"]
snapmodel --> snaptbl[("postpaid_usage_snapshots")]
finance(["Finance (Modpanel)"]) --> listctrl["ReportsController#postpaid_usage_snapshots"]
listctrl --> snaptbl
finance --> expctrl["ReportsController#postpaid_usage_exports"]
expctrl --> expmodel["Billings::PostpaidUsageExport"]
expmodel --> exptbl[("postpaid_usage_exports")]
expctrl --> expwkr["PostpaidUsageExportWorker"]
expwkr --> snaptbl
expwkr --> xlsx["caxlsx + rubyzip"]
xlsx --> oss[("Object storage (Decision 6 — unconfirmed)")]

Service use cases & third-party connections

flowchart LR
subgraph mod["moderator-be (Bifrost)"]
g1["snapshot cron (generate)"]
g2["list / search (read)"]
g3["export trigger + poll"]
g4["export worker (build ZIP)"]
end
g1 -->|"HTTPS pigeon — qontak_chat_service"| ext1(["hub_core summary (Bifrost)"])
g1 -->|"AR write"| db[("chat_billing Postgres")]
g2 -->|"AR read (Pagy)"| db
g3 -->|"enqueue Redis"| q[["Sidekiq queue"]]
g4 -->|"object PUT (unconfirmed)"| oss(["Object storage"])
g4 -->|"AR read/write"| db

Data model (erDiagram)

erDiagram
POSTPAID_USAGE_SNAPSHOTS {
uuid id PK
string cid
uuid organization_id
string waba_id
string company_name
string year_month
string billing_type
decimal usage_value
date report_date
timestamptz created_at
timestamptz updated_at
}
POSTPAID_USAGE_EXPORTS {
uuid id PK
string job_id
string status
string year_month
integer selected_count
bigint file_size_bytes
string file_ref
timestamptz expires_at
uuid requested_by
text error_details
timestamptz created_at
timestamptz updated_at
}
POSTPAID_USAGE_EXPORTS ||--o{ POSTPAID_USAGE_SNAPSHOTS : "selects (by year_month + ids, no FK)"

State machine — postpaid_usage_exports.status

stateDiagram-v2
[*] --> pending: POST export (within 50MB)
pending --> processing: worker picks up
processing --> completed: ZIP built and stored
processing --> failed: build or storage error
completed --> expired: past expires_at (24h)
failed --> [*]
expired --> [*]

Branch & skip flow — per-CID failure + size guard

flowchart TD
start(["snapshot worker: next CID"]) --> read{"summary read OK for CID?"}
read -- yes --> ins["INSERT frozen snapshot rows"]
read -- no --> skip["log snapshot_failed (cid, year_month, reason) + continue"]
ins --> more{"more CIDs?"}
skip --> more
more -- yes --> start
more -- no --> rate{"failure rate above 5 percent?"}
rate -- yes --> alert["emit Datadog alert metric"]
rate -- no --> done(["run complete"])
alert --> done
flowchart TD
trig(["POST export: selected ids"]) --> zero{"selected count is zero?"}
zero -- yes --> rej0["422 — button should be disabled"]
zero -- no --> size{"estimated size above 50MB?"}
size -- yes --> rej["reject — log zip_size_limit_exceeded"]
size -- no --> enq["insert export row pending + enqueue worker"]
enq --> ok(["return job_id — toast: generating"])

Detail 2.2 — Sequence (end-to-end across infra layers)

Happy path — monthly snapshot generation

sequenceDiagram
autonumber
participant Cron as sidekiq-cron
participant Q as Redis (Sidekiq)
participant W as PostpaidUsageSnapshotWorker
participant R as SummaryPostpaidUsage repo
participant CS as qontak_chat_service (hub_core)
participant DR as hub_core replica_billing
participant DB as chat_billing primary
Cron->>Q: enqueue at 02:00 on the 1st (Asia/Jakarta)
Q->>W: perform
W->>W: check flag postpaid_usage_scheduler_enabled
loop paginated over postpaid CIDs
W->>R: fetch summary (year_month, page)
R->>CS: pigeon_get summary_postpaid_usage (auth token)
CS->>DR: read monthly_usages + orgs + waba + quotas
DR-->>CS: usage rows (decrypted org_name)
CS-->>R: 200 JSON (page of CIDs x types)
R-->>W: parsed rows
W->>DB: INSERT frozen snapshot rows (idempotent on cid+year_month+type)
Note over W,DB: emit snapshot_generated per CID
end
W->>W: if failure rate above 5 percent emit alert metric

Failure path — cross-service read fails for a CID page

sequenceDiagram
autonumber
participant W as PostpaidUsageSnapshotWorker
participant R as SummaryPostpaidUsage repo
participant CS as qontak_chat_service
W->>R: fetch summary (year_month, page)
R->>CS: pigeon_get (timeout 60s, circuit breaker)
CS--xR: timeout or 5xx
R-->>W: Failure(reason)
W->>W: log snapshot_failed for affected CIDs + continue next page
Note over W: run does not abort — ERR-1

Happy path — bulk ZIP export

sequenceDiagram
autonumber
actor F as Finance (Modpanel)
participant LB as Load Balancer
participant API as moderator-be api
participant DB as chat_billing
participant Q as Redis (Sidekiq)
participant EW as PostpaidUsageExportWorker
participant OSS as Object storage (Decision 6)
F->>LB: POST postpaid_usage_exports (selected ids)
LB->>API: HTTP
API->>API: can? billing_postpaid_usage_download
API->>DB: estimate size — if above 50MB reject
API->>DB: INSERT export row (pending)
API->>Q: enqueue PostpaidUsageExportWorker(job_id)
API-->>F: 202 job_id — toast generating
Q->>EW: perform(job_id)
EW->>DB: read selected snapshots
EW->>EW: build one .xlsx per CID per type (caxlsx) + zip (rubyzip)
EW->>OSS: PUT zip object (private)
EW->>DB: UPDATE export completed + file_ref + expires_at (24h)
F->>API: GET postpaid_usage_exports/:job_id (poll)
API-->>F: completed + presigned URL

Failure path — export build/storage error

sequenceDiagram
autonumber
participant EW as PostpaidUsageExportWorker
participant DB as chat_billing
participant OSS as Object storage
EW->>EW: build XLSX + ZIP
EW->>OSS: PUT zip object
OSS--xEW: storage error
EW->>DB: UPDATE export failed + error_details
Note over EW: retry with exponential backoff up to MaxFails then leave failed

Detail 2.3 — Database Model (DDL)

Postgres, billing DB (chat_billing); migrations in moderator-be: db/billing/migrate/. Matches db/billing_schema.rb dialect (UUID PK gen_random_uuid(), decimal 15,2). Models inherit Billings::ApplicationRecord.

-- db/billing/migrate/<ts>_create_postpaid_usage_snapshots.rb
CREATE TABLE postpaid_usage_snapshots (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
cid varchar NOT NULL,
organization_id uuid NOT NULL,
waba_id varchar, -- nullable: shared/absent WABA (PRD OQ-1 resolved)
company_name varchar, -- decrypted in hub_core, stored plain for display
year_month varchar(7) NOT NULL, -- 'YYYY-MM' (prior month)
billing_type varchar NOT NULL, -- enum below
usage_value decimal(15,2) NOT NULL DEFAULT 0.0,
report_date date NOT NULL,
created_at timestamptz NOT NULL,
updated_at timestamptz NOT NULL
);
-- one frozen row per CID per type per month (idempotent regeneration / no dup)
CREATE UNIQUE INDEX idx_pus_unique ON postpaid_usage_snapshots (cid, year_month, billing_type); -- supports upsert-once + dedup
CREATE INDEX idx_pus_year_month ON postpaid_usage_snapshots (year_month); -- supports default-month list
CREATE INDEX idx_pus_search ON postpaid_usage_snapshots (year_month, cid, waba_id); -- supports search (PUMS-S04)
-- db/billing/migrate/<ts>_create_postpaid_usage_exports.rb
CREATE TABLE postpaid_usage_exports (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
job_id varchar NOT NULL,
status varchar NOT NULL DEFAULT 'pending', -- pending|processing|completed|failed|expired
year_month varchar(7) NOT NULL,
selected_count integer NOT NULL DEFAULT 0,
file_size_bytes bigint,
file_ref varchar, -- object key OR local path (Decision 6 indirection)
expires_at timestamptz, -- created + 24h on completion
requested_by uuid NOT NULL, -- moderator account id
error_details text,
created_at timestamptz NOT NULL,
updated_at timestamptz NOT NULL
);
CREATE UNIQUE INDEX idx_pue_job_id ON postpaid_usage_exports (job_id); -- poll by job_id
CREATE INDEX idx_pue_status_expiry ON postpaid_usage_exports (status, expires_at); -- TTL cleanup cron
  • billing_type enum (PUMS-S01/AC-3,4 · PUMS-S03): WA_BALANCE_V1, MUV_V1, WA_BALANCE_V3, MUV_V3, CALL_BALANCE_V3, plus whitelisted V3 quota component codes (e.g. CP-QONTAKCHAT-2025-0005). Display map: WA_BALANCE_* → "WA Balance", MUV_* → "MUV", CALL_BALANCE_V3 → "Call Balance", unmapped → "Unknown" (PUMS-S03/ERR-1).

  • Cardinality / growth: ~(postpaid CIDs) × (2–5 types) rows/month. At, say, 2,000 CIDs × 4 ≈ 8,000 rows/month → ~192k rows over the 24-month retention (PRD §6.1). Trivial for Postgres.

  • Example row: ('12345', '<uuid>', '628xxx', 'Acme Corp', '2026-05', 'WA_BALANCE_V3', 1234567.89, '2026-06-01').

  • PII classification: company_name = sensitive (business identity) — decrypted in hub_core, stored plain here for display, never logged. waba_id/cid = internal identifiers. usage_value = financial. No customer phone/PII in this table.

  • Retention: snapshots 24 months (PRD §6.1) — annual purge cron (out of scope to build now; noted). Exports: row + file 24 h, then TTL cleanup → expired + object delete.

  • Per-status lifecycle (postpaid_usage_exports.status):

    StatusVisibilityRetentionRestore semanticsTransitions allowed
    pendingpoll onlyuntil processingn/a→ processing
    processingpoll onlyuntil done/failn/a→ completed / failed
    completeddownload link activefile 24 hre-trigger to regenerate→ expired
    failed"Generation failed"24 hre-triggerterminal
    expired"link expired"row retained per auditre-triggerterminal
  • Partition / sharding: none — volume is small.

  • NoSQL alternative: rejected — relational filtering/pagination on year_month/cid/waba_id is the core access pattern; Postgres already hosts the billing DB.

Detail 2.4 — APIs

Outbound endpoints (consumers call us — moderator-be)

EndpointMethodAuthN/AuthZRequestResponseStatus codesIdempotencyVersioningReuse?
/api/v1/qontak/chat_panel/reports/postpaid_usage_snapshotsGETsession + can? billing_postpaid_usage_indexyear_month (req, default latest), search_query (opt, exact CID/WABA), page{ data: [rows], pagy: {…} }200 / 403 / 422n/a (read)/api/v1new-with-justification
/api/v1/qontak/chat_panel/reports/postpaid_usage_exportsPOSTsession + can? billing_postpaid_usage_download{ year_month, snapshot_ids: [], select_all: bool }{ status:'success', job_id }202 / 403 / 422 (size/zero)job_id (server-generated)/api/v1new-with-justification
/api/v1/qontak/chat_panel/reports/postpaid_usage_exports/:job_idGETsession + can? billing_postpaid_usage_downloadpath job_id{ status, download_url?, expires_at?, message? }200 / 403 / 404n/a (read)/api/v1new-with-justification

Internal endpoint (moderator-behub_core host app)

EndpointMethodAuthN/AuthZSourceRequestResponseStatusIdempotencyVersioning
/api/core/v1/reports/billing/summary_postpaid_usageGETchat-panel auth tokenmoderator-be snapshot workeryear_month, page, per_page{ data: [{cid, organization_id, waba_id, company_name, billing_type, usage_value}], page_meta }200 / 401 / 422n/a (read)/api/core/v1

Per-endpoint notes:

  • Pagination: list endpoint uses Pagy (already in moderator-be); 50/page (PUMS-S02/AC-3). The internal summary uses page/per_page to bound payload (Decision 2).
  • Rate limits / size: export POST enforces ≤ 50 MB estimated (PUMS-S06/ERR-1) before enqueue; reject with zip_size_limit_exceeded.
  • Backward compatibility: all endpoints are net-new; the existing download_monthly_postpaid_usage endpoint is untouched (no consumer migration).
  • Error shape: matches repo convention — render json: { status: errors || :error }, status: :unprocessable_entity (Dry::Matcher::ResultMatcher).

Detail 2.A — Data Integrity Matrix

Write pathTransaction scopePartial failureIdempotency key + TTLConsistencyDuplicate handlingStale-read
Snapshot insert (per CID)single-row insert per (cid, year_month, type)per-CID rescue → snapshot_failed, continueunique index (cid,year_month,billing_type)INSERT … ON CONFLICT DO NOTHING (regenerate-safe)eventual (monthly batch)unique index dedups a re-runsource is finalized month-end data (frozen)
Export row insertsingle insert (pending)request fails → no rowjob_id uniquestrong (local)unique job_idn/a
Export status updatesingle-row updateworker retry/backoff; terminal failedjob_idstrongidempotent set-statusn/a

Detail 2.B — Concurrency Collision Map

ResourceWritersCollisionResolutionOn failure
postpaid_usage_snapshots (cid,year_month,type)cron worker (+ manual re-run)two runs for same monthunique index + ON CONFLICT DO NOTHINGsecond run no-ops (idempotent)
postpaid_usage_exports.statusexport workerconcurrent retries of same job_idSidekiq single-job + status guard (only pending/processing → terminal)duplicate completion ignored
chat_service read loadmany CID pagesthundering readspaginate + per-page timeout + circuit breaker (AbstractRepository)page failure isolated (ERR-1)

Detail 2.C — Async Job / Event Consumer Spec

JobTriggerInputRetryDLQ / retentionConcurrencyIdempotency keyPer-message timeoutPoison handling
PostpaidUsageSnapshotWorkersidekiq-cron 0 2 1 * * Asia/Jakartanone (computes prior month)retry: 0 (matches monthly-cron precedent); manual re-enqueuenone — failures logged as snapshot_failed; run summary1 (single monthly run)unique snapshot index (regenerate-safe)per-page pigeon timeout (60 s default)per-CID rescue; never aborts run
PostpaidUsageExportWorkerenqueued by POST exports{ job_id }exponential backoff up to MaxFails then failedrow stays failed 24 hbounded by Sidekiq queue concurrencyjob_id + status guardjob-level (≤ 30 min budget)on repeated failure → failed + error_details
PostpaidUsageExportCleanupWorker (TTL)daily cronnoneretry: 0n/a1n/ashortmarks past-expires_atexpired, deletes object

Detail 2.D — Responsibility Boundary Matrix

Step (execution order)Owning serviceInbound triggerOutbound effectFailure handlerPRD anchor
1. Fire monthly cronmoderator-be (sidekiq-cron)schedule 0 2 1 * *enqueue snapshot workermissed run → manual re-enqueuePUMS-S01
2. Read finalized usagehub_core (via chat_service)pigeon_get summaryreturn per-CID per-type usage + waba_id5xx/timeout → page-level snapshot_failedPUMS-S01/AC-4, OQ-1
3. Write frozen snapshotsmoderator-beparsed summary rowsinsert snapshot rowsper-CID rescue → continuePUMS-S01/ERR-1
4. Serve dashboardmoderator-beFinance GETpaginated rowsDB error → error statePUMS-S02
5. Build + store ZIPmoderator-be export workerPOST exportsXLSX+ZIP → object storagestorage error → failed (Decision 6)PUMS-S06

No disagreement with the PRD on ownership except the PRD's §6.2 service map (Go qontak-billing

  • report-worker), which this RFC corrects to hub_core + moderator-be Sidekiq (see scope note + Decisions 2, 5).

Detail 2.E — State Surface Contract

EntityState field / eventDefaultUpdated byRead viaStale window
Snapshot rowreport_date (generation date)set at insertsnapshot workerlist endpointfrozen (never changes)
Export jobstatus enum + download_urlpendingexport worker / cleanuppoll endpointpoll cadence (FE Stimulus, ~3–5 s)

3. High-Availability & Security

moderator-be api/worker pods are stateless and horizontally scaled; state lives in Postgres (chat_billing) + Redis. A chat_service outage during generation delays (does not corrupt) snapshots — the cron is re-runnable and idempotent. An export-worker crash leaves the row pending/processing; the TTL cleanup and retry/backoff converge it to completed/failed.

Performance Requirement

  • Dashboard read: ≤ 3 s for 1,000 rows (PRD §6) — served by idx_pus_year_month / idx_pus_search; no cross-service call on page load (Decision 1).
  • Snapshot run: ≤ 2 h (PRD §6) — paginated cross-service reads, per-page timeout; single job.
  • Export: ≤ 30 min for ≤ 50 MB (PRD §6).
  • Scalability: generation is monthly batch (low steady RPS); export concurrency bounded by the Sidekiq queue. Load test: simulate worst-month CID count against a staging chat_service.

Monitoring & Alerting

  • Metrics (Datadog Statsd, prefix moderator_panel_* — matches Core::Services::Datadog::CustomMetric): moderator_panel_postpaid_snapshot_generated (count, tag year_month), moderator_panel_postpaid_snapshot_failed (count, tags year_month,reason), moderator_panel_postpaid_export_completed / _failed (count, tag job_id), moderator_panel_postpaid_export_duration_seconds (histogram), moderator_panel_postpaid_zip_size_limit_exceeded (count).
  • Structured logs (Rails.logger): events snapshot_generated, snapshot_failed, usage_dashboard_loaded, bulk_download_triggered, zip_job_completed, zip_job_failed, zip_size_limit_exceeded (PRD §11) — plus integration_logs row for the cross-service summary call.
  • Alerts (PRD §11.1): snapshot_failed rate > 5% in a monthly run → immediate Bifrost investigation; zip_job_failed rate > 10% in any 7-day window → escalate within 24 h.
  • "Debug at 3am": check the monthly run summary log → snapshot_failed reasons → chat_service health / circuit-breaker state.

Logging

  • Fields: cid, year_month, billing_type, job_id, user_id, reason, duration_seconds.
  • PII scrub: never log company_name, usage_value, or the chat-panel auth token. The cross-service summary payload is not logged verbatim (only counts) — matches the Kafka producer convention (logs error class only).

Security Implications

  • Threat model: an unauthorized Modpanel user attempting to read all-CID financial data; leakage of decrypted company_name/usage; a stale presigned link being reused.
  • AuthN/AuthZ matrix:

Role × Endpoint Authorization Matrix

RoleEndpoint(s)MethodsTenant scopeConstraintAudit trail
Finance Modpanel userpostpaid_usage_snapshots, postpaid_usage_exports[/:job_id]GET/POSTglobal (all CIDs — internal tool by design)read+download only; no write to usagebulk_download_triggered + integration_logs
Non-Finance Modpanel userall abovenonecan? denies → 403; nav hiddendenied attempt logged
System (cron)none (Sidekiq-internal)n/ano manual-trigger endpoint exposed (PUMS-S01 Permission Model)snapshot_* events
  • Enforcement layer: before_action :can? + Pundit policy (pundit_user = current_account_user) on every new endpoint; permission key billing_postpaid_usage_index / _download added to the permissions table and granted to the Finance role.
  • Input validation: year_month matches ^\d{4}-\d{2}$; search_query exact-match only (parameterized AR — no LIKE injection); snapshot_ids are UUIDs scoped to the requested year_month.
  • Injection: ActiveRecord parameterization; outbound pigeon URL is a fixed env-configured host (no SSRF surface).
  • Secrets: chat-panel auth via Core::Services::Redis::ChatPanel::Auth (existing); lockbox key stays in hub_core onlymoderator-be never receives it. Object-storage credentials (if Decision 6 Option A) via env/secret store, never logged.
  • Presigned URL: TTL = 24 h = expires_at; private ACL; link not logged.
  • Static analysis: bundle exec brakeman (existing CI step).
  • ISO 27001/27701: financial + business-identity data — covered by the Compliance table below.

Detail 3.A — Failure Mode & Retry Catalog

External callTimeoutRetriesCircuit breakerDLQCaller behavior on persistent failure
pigeon_get summary (moderator-be → chat_service)60 s (HTTP_REQUEST_TIMEOUT)per AbstractRepository/pigeon retry_thresholdSLEEP_WINDOW=60, ERROR_THRESHOLD=5 (existing)nonepage-level snapshot_failed, continue run (ERR-1)
Object storage PUT (export)per SDK (Decision 6)exponential backoff up to MaxFailsn/arow failedmark export failed + error_details; Finance sees "Generation failed"

Detail 3.A.1 — Branch & Skip Catalog

Branch triggerWhere checkedDownstream effectAudit trailUser-visible?
Flag postpaid_usage_scheduler_enabled OFFworker start + controllerno generation / dashboard hiddenflag stateyes (feature absent)
Per-CID summary read failssnapshot workerskip that CID, log, continuesnapshot_failedno (Finance sees missing row)
Selection > 50 MBexport controllerreject before enqueuezip_size_limit_exceededyes (error toast)
Selection = 0export controller / FEbutton disabled / 422yes (disabled)
Past expires_atpoll endpoint"link expired"yes

Detail 3.B — Error Response Catalog

EndpointError codeHTTPMessageWhenUser-facing?
listerror422"Could not load usage data. Try again."DB query failure (PUMS-S02/ERR-1)yes
list(empty 200)200"No usage data available for this period."no rows for month (PUMS-S02/ERR-2)yes
export POSTzip_size_limit_exceeded422"Selection exceeds 50MB limit. Reduce your selection and try again."size > 50 MB (PUMS-S06/ERR-1)yes
export POSTerror422"Could not start download. Please try again."enqueue/Redis failureyes
export pollexpired200"Download link expired. Generate again."past TTL (PUMS-S06/AC-4)yes
any(403)403(hidden / forbidden)non-Financeyes

Detail 3.C — Compliance & Data Governance

FieldClassificationLegal basisRetentionEncryptionAccess auditRight-to-delete
company_namesensitive (business identity)UU PDP (legitimate interest — billing ops)24 monthsrest: encrypted in hub_core (lockbox); plain in chat_billing for display; transit: HTTPSintegration_logs + Pundit denialscovered by CID offboarding (out of scope)
usage_valuefinancialUU PDP24 monthstransit HTTPS; at rest Postgresdashboard access logsn/a
waba_id, cidinternal identifier24 monthstransit HTTPSn/a

Trigger met (financial + business-identity data) — infosec approval required before GA.


4. Backwards Compatibility and Rollout Plan

Compatibility

  • Existing endpoints changed: none. The current download_monthly_postpaid_usage endpoint and the hub_core CSV report are untouched (additive sibling endpoints only).
  • hub_core: the new summary is a new route + new builder; the existing report/builder unchanged.
  • API version strategy: additive under /api/v1 (moderator-be) and /api/core/v1 (chat_service).

Rollout Strategy

  • Migration sequence: (1) create postpaid_usage_snapshots + postpaid_usage_exports in chat_billing (db/billing/migratedb:migrate:billing); no backfill (PRD §10 — first snapshot is the next 1st-of-month cron). (2) Ship hub_core summary + chat_service route. (3) Ship moderator-be worker + endpoints behind the flag. (4) Ship Modpanel views.
  • Schema during migration: purely additive; no intermediate dual-write state.
  • Feature flag: postpaid_usage_scheduler_enabled (Flipper via Core::Services::Preference), default ON at GA (PRD §6) but created OFF and enabled per stage; global (no per-CID). Kill-switch: disabling hides the dashboard and skips the cron.
  • Stages (PRD §13): Internal QA (staging — Bifrost + 1–2 Finance reps, 1 wk, gate: snapshots for all eligible CIDs with 0 failures, dashboard loads, ZIP valid with exact naming, 0 P0/P1) → GA (all Finance, gate: snapshot ≥ 99% on first prod run, ZIP ≥ 95% in 2 wks).
  • Rollback trigger: snapshot failure > 10% for 2 consecutive monthly runs (PRD §11.1) → disable flag pending fix.
  • Rollback mechanism: toggle flag OFF (instant hide + skip cron); snapshots already written are inert frozen rows (safe to leave); migrations reversible (db:rollback:billing).
  • PIC + timeline per stage: in delivery/ (not in this RFC).

Detail 4.A — Configuration Contract

Env / config / flagTypeDefaultRequiredProvisionerSecret?
postpaid_usage_scheduler_enabledFlipper flagcreated OFF → ON at GAyesBifrost (Preference)no
CHAT_PANEL_SERVICE_BASE_URLenvexistingyesInfra (existing)no
Permission billing_postpaid_usage_index/_downloadDB seedyesBifrost (seed/migration)no
schedule.yml postpaid_usage_snapshot_taskcron0 2 1 * * Asia/JakartayesBifrostno
Object-storage bucket + creds (Decision 6 Option A)env/secretTBDyes for exportInfra (unconfirmed — OQ-1)yes

Detail 4.B — Test Plan (commands sourced from repo)

LayerCommand (source)What it must prove
Unit/Request (moderator-be)bundle exec rspec spec/... (source: .rspec, Makefile:84 make test)worker inserts rows; list paginates/filters; export size-guard; status transitions
Lint (moderator-be)bundle exec rubocop (source: Makefile:86 make lint)style passes
Security (moderator-be)bundle exec brakeman (source: Makefile:92 make security)no new findings
Unit (hub_core)RAILS_ENV=test bundle exec rspec app/apps/billings/... (source: bitbucket-pipelines.yml)summary repo returns waba_id + V3 + decrypted org_name
Lint (hub_core)bundle exec rubocop (source: bitbucket-pipelines.yml:89)style passes
DB setup (moderator-be CI)RAILS_ENV=test bin/rails db:create:billing db:schema:load:billing (source: bitbucket-pipelines.yml:71)new tables load from schema

Detail 4.C — Agent Execution Plan

OrderChunkFiles to modify/createCommandsAcceptance criteria (verifiable)
1Add migrations for both tablesmoderator-be: db/billing/migrate/<ts>_create_postpaid_usage_snapshots.rb, …_create_postpaid_usage_exports.rbbin/rails db:migrate:billingboth tables + indexes exist in db/billing_schema.rb; db:rollback:billing reverts
2hub_core bulk summary repo/interactor/builderhub_core: app/apps/billings/{repositories,interactors,builders}/v1/reports/summary_postpaid_usage.rbRAILS_ENV=test bundle exec rspec app/apps/billings/.../summary_postpaid_usage_spec.rbrspec: returns waba_id, V3 Call Balance + quota rows, decrypted company_name; paginated
3chat_service route for summaryhost app routes (/api/core/v1/reports/billing/summary_postpaid_usage) + controllerhost-app request specGET returns 200 JSON page with auth token; 401 without
4moderator-be summary read repo (pigeon)moderator-be: app/domains/core/repositories/app_integrations/chat_panel/reports/summary_postpaid_usage.rbbundle exec rspec spec/.../summary_postpaid_usage_spec.rbrspec (stubbed pigeon): parses page; failure → Failure(reason)
5Modelsmoderator-be: app/models/billings/postpaid_usage_snapshot.rb, postpaid_usage_export.rbbundle exec rspec spec/models/...models inherit Billings::ApplicationRecord; enum + validations
6Snapshot worker + cron entrymoderator-be: app/domains/core/workers/sidekiq/postpaid_usage_snapshot_worker.rb; config/schedule.yml (postpaid_usage_snapshot_task)bundle exec rspec spec/.../postpaid_usage_snapshot_worker_spec.rbrunning worker for fixture month inserts N rows; per-CID raise → snapshot_failed + run completes; idempotent on re-run
7List/search endpoint + permissionmoderator-be: …/reports_controller.rb (+route config/routes/api/v1.rb); permission seedbundle exec rspec spec/requests/.../postpaid_usage_snapshots_spec.rb50/page; exact-match filter; year_month default latest; can? 403 for non-Finance; empty-state
8Modpanel view + type label mapmoderator-be: app/views/billing/postpaid_usage/*, Stimulus controller; billing_type label helperbundle exec rspec spec/.../ + system spectable renders columns; unknown type → "Unknown"; loading/empty/error states
9Bulk selection (Stimulus)moderator-be: app/javascript/controllers/postpaid_usage_selection_controller.jssystem/JS specselect-all-across-pages sets count; filter change clears; refresh clears
10Export trigger + poll endpointsmoderator-be: …/reports_controller.rb (+routes); use casesbundle exec rspec spec/requests/.../postpaid_usage_exports_spec.rbPOST >50 MB → 422 zip_size_limit_exceeded; 0 selected → 422; returns job_id; poll returns status
11Export worker (XLSX+ZIP) [gated on OQ-1]moderator-be: app/domains/core/workers/sidekiq/postpaid_usage_export_worker.rb; add rubyzip to Gemfile; storage client (Decision 6)bundle install + bundle exec rspec spec/.../postpaid_usage_export_worker_spec.rbbuilds one .xlsx per CID per type named {CID} {Company Name} {Postpaid Usage YYYY-MM} {Type}.xlsx; zips; status→completed+expires_at
12TTL cleanup cron [gated on OQ-1]moderator-be: app/domains/core/workers/sidekiq/postpaid_usage_export_cleanup_worker.rb; config/schedule.ymlbundle exec rspec spec/.../cleanup_worker_spec.rbpast-expires_atexpired + object deleted

Detail 4.D — Verification & Rollback Recipe

  • Pre-merge (run in order):
    1. bundle exec rubocop (moderator-be + hub_core)
    2. bundle exec brakeman
    3. bundle exec rspec (scoped to the new specs above)
    4. CI DB load: RAILS_ENV=test bin/rails db:create:billing db:schema:load:billing
  • Post-deploy signals:
    • Datadog: moderator_panel_postpaid_snapshot_failed rate < 5% on the first monthly run.
    • Log query: snapshot_generated count ≈ eligible-CIDs × types for the run.
    • moderator_panel_postpaid_export_completed / _failed ratio ≥ 95% completed over 2 weeks.
  • Rollback recipe (in order):
    1. Toggle postpaid_usage_scheduler_enabled OFF (Core::Services::Preference) — hides dashboard + skips cron.
    2. If a bad migration: bin/rails db:rollback:billing (additive tables — safe).
    3. Revert the moderator-be / hub_core PRs.
    4. Confirm moderator_panel_postpaid_* error metrics return to baseline within 15 min; existing download_monthly_postpaid_usage endpoint still works (unaffected).

Detail 4.E — Resource & Cost Notes

  • Compute: one monthly batch job + on-demand export workers — negligible steady load; export ZIP build needs memory headroom bounded by the 50 MB cap.
  • DB: ~192k snapshot rows over 24 months — trivial; two new tables on chat_billing.
  • Network: one monthly burst of paginated cross-service reads; per-export object PUT.
  • Storage: ZIP files ≤ 50 MB each, 24 h TTL — small footprint (Decision 6 dependent).

5. Concern, Questions, or Known Limitations

#Question / limitationOwnerImpact if unresolved
OQ-1Object storage for ZIP (BLOCKER for export delivery). No S3/OSS/GCS client or storage gem exists in moderator-be (verified). Decision 6 recommends object storage + presigned URL but the bucket/client must be provisioned (or fall back to DB/volume blob + authenticated send_data).Infra + BifrostBlocks §4.C ch.11–12 (bulk export delivery). Snapshot + dashboard (ch.1–9) ship independently.
OQ-2-bisV3 whitelisted quota enumeration. The exact set of quota component codes and the source field for V3 Call Balance / quotas (organization_package_component_postpaid_limits) must be confirmed with the billing team for the hub_core summary query.hub_core / billingWrong/missing V3 billing_type rows in snapshots (PUMS-S01/AC-4).
OQ-3Billing version "2.0.0". Whether V2 CIDs exist in prod and which postpaid types apply (carried from PRD OQ-3). The summary query must explicitly include or exclude billing_version="2.0.0".hub_core / billingSilent omission or double-count of V2 CIDs.
OQ-4WABA cardinality. waba_id resolves via organizations.settings['waba_id'] OR waba_accounts (1:many). For a CID with multiple WABAs, confirm whether the snapshot is per-WABA or per-CID (one waba_id chosen).Bifrost + FinanceRow granularity / dedup of dashboard rows.
OQ-5Modpanel visual design (Figma) pending (PRD §7). Views are built functionally against the contracts; pixel design is a follow-up, not a data-layer blocker.Design + BifrostCosmetic only.
Known limitSnapshot is a frozen projection — a later monthly_usages correction does not propagate (by design, PUMS-S01/AC-2).Documented behavior, not a defect.

6. Comment logs

DateComment(s) FromAction Item(s)
2026-06-30rfc-starter (author)Drafted from PRD v1.3, grounded in moderator-be + hub_core. Corrected PRD §6.2 service map (no Go qontak-billing/report-worker; data in hub_core, async ZIP in moderator-be). Resolved PRD OQ-1 (WABA) and OQ-2 (XLSX via caxlsx). All 16 mermaid blocks validated to parse (see below). Open blocker: object storage (OQ-1).

7. Ready for agent execution

  • yes — for Phase 1a (snapshot generation + dashboard read/search/type-label: §4.C chunks 1–9). All infra topology, ADR decisions, repo anchors (verified), DDL, APIs, sequence/ER/state/branch diagrams, failure/branch catalogs, config contract, and execution chunks are complete and grounded.
  • no — one gated dependency — for Phase 1b (bulk async ZIP export: §4.C chunks 10–12). The only blocker is §5 OQ-1 (object storage mechanism); chunk 10 (trigger/poll endpoints + size guard) is execution-ready, but chunks 11–12 (build/store/cleanup) must wait on the storage decision (Decision 6). The export schema is storage-agnostic so the decision needs no migration change.
  • Secondary confirmations (non-blocking for 1a): OQ-2-bis (V3 quota codes), OQ-3 ("2.0.0"), OQ-4 (WABA cardinality) — needed before the hub_core summary is finalized.

Optional next step: hand to rfc-reviewer for a second-pass score, and/or rfc-task-breakdown for man-day estimates (note the Phase 1a / 1b split above).