RFC: Postpaid Usage Monthly Scheduler — Snapshot, Dashboard & Bulk Export (Backend)
Document Conventions (do not remove)
This RFC follows the Qontak RFC Template format for governance — the metadata table, Confluence sections 1–6, and Comment logs are mandatory. Sections marked
N/A — reasonare intentionally not applicable, not omitted.It is agent-execution-ready: §1 PRD-to-Schema Derivation, §2 Repo Reading Guide (Detail 2.0), mermaid diagrams, and §4 Agent Execution Plan + Verification & Rollback Recipe are complete (see §7).
Delivery & project management live elsewhere. This RFC is the technical artifact only. Staffing, effort, timeline, and rollout schedule live in the initiative's
delivery/folder. This RFC has not yet been handed to delivery.Scope note — this is the BACKEND RFC for two repos:
moderator-be+hub_core. The Modpanel "Postpaid Usage Dashboard" (PUMS-S02…S06) is server-rendered Rails + Hotwire (Turbo/Stimulus) insidemoderator-be— not a separate SPA and not a@mekari/dsdesign-system frontend — so its controllers, ERB views, and Stimulus controllers are in-repomoderator-bework covered here. The visual design (Figma) is still pending (PRD §7 "Figma: TBD"); these views are specified functionally against the read/search/export contracts, and pixel-level design polish is tracked as an Open Question (§5 OQ-5), not a blocker for the data-and-behavior layer.The PRD's service map was wrong about the data source — this RFC corrects it. PRD §6.2 assumed the usage data lives in a Go service
qontak-billingand async export in a Goreport-worker. Neither exists in the grounded repo set. The billing data (monthly_usages,organization_packages,organizations,waba_accounts) lives inhub_core— a Rails engine (gem) mounted into theqontak_chat_servicehost app — andmoderator-bealready reaches it cross-service over HTTP viapigeon_get(verified:Core::Repositories::AppIntegrations::ChatPanel::Reports::DownloadMonthlyPostpaidUsage→CHAT_PANEL_SERVICE_BASE_URL/api/core/v1/reports/billing/download/monthly_postpaid_usage). The async ZIP export is built inmoderator-beSidekiq (it already bundlescaxlsxand has an async-export precedent —new_reporting_flow), not in a Goreport-worker.
Metadata
| Field | Value | Notes |
|---|---|---|
| Status | RFC (in discussion) | Human label; YAML status: carries the remapped linter enum draft |
| DRI | pm-group.qontak@mekari.com | Single accountable owner (mirrors PRD/initiative). Per-task staffing lives in delivery/. |
| Team | bifrost | Advisory squad slug carried from the PRD / initiative README |
| Author(s) | Hafriz | Backend implementor (initiative README implementors: Hafriz (Backend)) |
| Reviewers | moderator-be · hub_core · qontak_chat_service reviewers (TBD) | Tech reviewers across the affected repos |
| Approver(s) | Bifrost tech lead (TBD) + infosec approver (TBD) | PII (encrypted organization_name) is read cross-service — infosec sign-off required |
| Submitted Date | 2026-06-30 | Date RFC opened for discussion |
| Last Updated | 2026-06-30 | Bump on every material edit |
| Target Release | 2026-Q3 | |
| Target Quarter | 2026-Q3 | Advisory, carried from the PRD |
| Delivery | not yet handed to delivery | |
| Related | ../prds/prd-postpaid-usage-monthly-scheduler.md | Source PRD (v1.3) |
| Discussion | TBD — #bifrost squad Slack channel |
Type: backend Sub-type: new-feature
Sections at a Glance
- Overview (incl. §1 PRD-to-Schema Derivation — entities, business rules, contracts; no Figma)
- Technical Design (Infrastructure Topology → Technical Decisions [ADR] → Repo Reading Guide → Architecture & Service Map → Sequence Diagrams → DDL → APIs → integrity / concurrency / async specs)
- High-Availability & Security
- Backwards Compatibility and Rollout Plan (incl. §4 Agent Execution Plan + Verification & Rollback Recipe)
- Concern, Questions, or Known Limitations
- Comment logs
- Ready for agent execution
1. Overview
The Finance team manually downloads postpaid usage reports one client at a time at the
start of every billing cycle (~3 hours/month, confirmed PRD §12). Billing V3 multiplies the
problem by adding multiple postpaid components per client (WhatsApp Balance, MUV, Voice/Call
Balance, whitelisted quotas). Today the only tool is moderator-be's synchronous
download_monthly_postpaid_usage endpoint, which produces one CSV per organization_name
query by proxying to the billing host app.
This RFC defines the backend of a Postpaid Usage Monthly Scheduler:
- A monthly cron (
moderator-beSidekiq,0 2 1 * * Asia/Jakarta) that reads the prior month's finalized usage for all postpaid CIDs fromhub_core(cross-service via theqontak_chat_serviceHTTP API) and writes frozen snapshot rows — one per CID per billing type — into a newpostpaid_usage_snapshotstable inmoderator-be's billing DB. - A Modpanel dashboard (server-rendered Rails + Hotwire in
moderator-be) for Finance to list / search / filter those snapshots (paginated, 50/page). - A bulk async ZIP export (
moderator-beSidekiq) that packages the selected snapshots into one.xlsxper CID per type inside a single ZIP, tracked in a newpostpaid_usage_exportsstatus table.
The change touches two repos (plus a thin contract addition to the billing host app):
moderator-be(Ruby/Rails 7.2) — cron scheduler, snapshot + export tables, Modpanel UI, async ZIP worker, Finance RBAC (Pundit + permission), Flipper flag, cross-service read client.hub_core(Ruby/Rails engine, billing domain) — owns the source data; must be extended to expose a bulk monthly-usage summary that addswaba_id, billing version/type, V3 Call Balance, and whitelisted-quota components (its current report returns only WA Balance + MUV and nowaba_id). Exposed via a newqontak_chat_serviceroute.
Success Criteria
- On the 1st of each month, snapshots are generated for ≥ 99% of eligible postpaid CIDs
within 2 hours of the cron start (PRD §6, §12); a per-CID failure logs
snapshot_failedand does not abort the run (PUMS-S01/ERR-1). - The dashboard table loads ≤ 3 s for up to 1,000 records (PRD §6) reading from the local snapshot table (no live cross-service call on page load).
- A bulk ZIP for a selection ≤ 50 MB completes ≤ 30 min (PRD §6), with ≥ 95% job success within 60 days of GA.
- Finance time on the monthly download drops from 3 h → ≤ 30 min (PRD §12 ⭐ KPI).
- Snapshot data is immutable once written — a mid-month read returns end-of-month values unchanged (PUMS-S01/AC-2).
Out of Scope
- Real-time usage (snapshots are frozen month-end values only) — PRD §5.1.
- V1 components other than WA Balance + MUV — PRD §5.2.
- Edit / correct / reprocess / manual regenerate of usage data — PRD §5.3, §5.6.
- External-client / CID-admin exposure (internal Finance Modpanel only) — PRD §5.4.
- Email / push notification when a snapshot is ready — PRD §5.5.
- ERP/accounting integration — PRD §5.7.
- Building a Go
qontak-billingorreport-workerservice — they do not exist; this RFC useshub_core+moderator-beSidekiq instead. - Pixel-level Modpanel visual design — Figma pending (§5 OQ-5); views built functionally.
Related Documents
- PRD (source):
../prds/prd-postpaid-usage-monthly-scheduler.md(v1.3, 2026-06-23). - Sibling RFC (style + repo precedent):
../../billing-expired-handling/rfcs/billing-expired-handling.md— same author, samemoderator-be+hub_corerepo set. - Initiative README:
../README.md— EpicBIF-8641, QA Lane B.
Assumptions
qontak_chat_service(the host app mountinghub_core) is reachable frommoderator-beviaENV['CHAT_PANEL_SERVICE_BASE_URL']with the existingCore::Services::Redis::ChatPanel::Authtoken — verified by the existingDownloadMonthlyPostpaidUsagerepository.hub_core's daily aggregation (Billings::AddDailyUsageWorker, ~00:00 + 7h Jakarta offset) has fully written month-endmonthly_usagesrows before the snapshot cron runs at 02:00 on the 1st (the 2-hour gap is the safety margin — PRD §15, 2026-06-23 decision).moderator-be's billing DB (chat_billing, models underBillings::ApplicationRecord) is the correct home for the new snapshot/export tables (it is operationally owned by Bifrost and already separate fromhub_core'sbillingDB).billing_version"1.0.0"= V1,"3.0.0"= V3 (verified migration)."2.0.0"handling is unresolved — see §5 OQ-3 (carried from PRD OQ-3).
Dependencies
| Dependency | Type | What we need | Owner | Availability |
|---|---|---|---|---|
| hub_core bulk summary | Internal (data source) | New repo/interactor returning per-CID per-type usage incl. waba_id, V3 Call Balance, quotas | hub_core (Bifrost) | needs building (extends existing DownloadPostpaidMonthlyUsage) |
| qontak_chat_service route | Internal (HTTP contract) | New host-app route mounting the hub_core interactor, callable by moderator-be pigeon client | chat_service host (Bifrost) | needs building |
| moderator-be Sidekiq + sidekiq-cron | Infra (exists) | Cron scheduling + worker queues | Bifrost | exists (config/schedule.yml) |
caxlsx gem | Library (exists) | XLSX generation per CID file | — | exists (Gemfile:128) — resolves PRD OQ-2 |
rubyzip gem | Library (new) | Package per-CID .xlsx into one ZIP | — | needs adding (no zip gem present today) |
| Object storage + presigned URL | Infra (unconfirmed) | Store generated ZIP, serve 24 h download link | Infra | BLOCKER for export delivery — no S3/OSS client found in moderator-be; see §5 OQ-1 + Decision 6 |
Flipper / Core::Services::Preference | Infra (exists) | Feature flag postpaid_usage_scheduler_enabled | Bifrost | exists |
| Pundit + Permission/Role | Infra (exists) | New Finance permission key | Bifrost | exists |
PRD-to-Schema Derivation (backend-specific — required)
Backend RFCs derive the schema/contract from the PRD as a domain spec; no Figma. Every §2.3 DDL row and §2.4 endpoint traces back to a row here.
| PRD-described entity / attribute / rule | Persisted as (table.column) | Exposed via (endpoint / event) | Enforced where | Source (PRD §) |
|---|---|---|---|---|
| Frozen monthly usage per CID per postpaid type | postpaid_usage_snapshots(cid, organization_id, waba_id, year_month, billing_type, usage_value, report_date) | hub_core bulk summary (write side) + Modpanel list endpoint (read side) | PostpaidUsageSnapshotWorker (write); immutable after insert (no update path) | §9.2 PUMS-S01 |
| Snapshot is frozen/immutable mid-month | postpaid_usage_snapshots — no UPDATE path; unique key (cid, year_month, billing_type) | n/a (read-only thereafter) | DB unique index + worker upsert-once semantics | PUMS-S01/AC-2 |
| V1 = WA Balance + MUV only | billing_type ∈ {WA_BALANCE_V1, MUV_V1} | hub_core summary filters billing_version="1.0.0" | hub_core repo query | PUMS-S01/AC-3 |
| V3 = WA Balance + MUV + Call Balance + whitelisted quotas | billing_type ∈ {WA_BALANCE_V3, MUV_V3, CALL_BALANCE_V3, <quota codes>} | hub_core summary reads monthly_usages + organization_package_component_postpaid_limits | hub_core repo query | PUMS-S01/AC-4 |
WA Balance value = monthly_usages.mcc_total; MUV = muv_total | postpaid_usage_snapshots.usage_value | hub_core summary | hub_core repo mapping | PUMS-S01 (Data Fields) |
| WABA ID per CID | postpaid_usage_snapshots.waba_id ← organizations.settings['waba_id'] or waba_accounts.waba_id | hub_core summary (new field) | hub_core repo join | PUMS-S02, PRD OQ-1 (resolved) |
| Company name (decrypted) | snapshot stores company_name; source organizations.name / organization_packages.organization_name (lockbox-encrypted) | hub_core summary (decrypted server-side, never logged) | hub_core lockbox has_encrypted | PUMS-S02, PRD §6.2 |
| Per-CID snapshot failure isolation | snapshot_failed log event (cid, year_month, reason); run continues | worker rescue-per-CID | PostpaidUsageSnapshotWorker | PUMS-S01/ERR-1 |
snapshot_failed rate > 5% → alert | metric moderator_panel_postpaid_snapshot_failed count | Datadog alert | worker + Datadog | PUMS-S01/ERR-2, §11.1 |
| Paginated dashboard read (50/page) | read postpaid_usage_snapshots | GET /api/v1/qontak/chat_panel/reports/postpaid_usage_snapshots | Pundit can? + Pagy | PUMS-S02 |
| Search by CID/WABA + filter by year_month | query params search_query, year_month | same list endpoint | repository WHERE (exact match) | PUMS-S04 |
| Postpaid type label / "Unknown" fallback | billing_type → display map | list endpoint serializer | view/serializer | PUMS-S03 + PUMS-S03/ERR-1 |
| Bulk async ZIP, ≤ 50 MB, 24 h link | postpaid_usage_exports(job_id, status, file_ref, file_size_bytes, expires_at, requested_by, error_details) | POST .../postpaid_usage_exports (trigger) + GET .../postpaid_usage_exports/:job_id (poll) | PostpaidUsageExportWorker; 50 MB pre-check; TTL cleanup | PUMS-S06 |
ZIP file naming {CID} {Company Name} {Postpaid Usage YYYY-MM} {Type}.xlsx | n/a (file name) | export worker | caxlsx builder | PUMS-S06/AC-3 |
| Download link expiry 24 h | postpaid_usage_exports.expires_at | poll endpoint returns "expired" past TTL | TTL cleanup cron + check | PUMS-S06/AC-4, §6.1 |
| Finance-only access | n/a (authz) | all endpoints behind permission billing_postpaid_usage_* | Pundit policy + can? | PUMS-S02…S06 Permission Model |
Detail 1.A — PRD Traceability Matrix
Composite AC ids (
<STORY-ID>/AC-n, alsoERR-n) — AC numbering restarts per story.
Forward (PRD AC → RFC):
| PRD composite AC id | Service / endpoint / job | RFC section |
|---|---|---|
| PUMS-S01/AC-1..AC-4 | PostpaidUsageSnapshotWorker + hub_core summary | §2 Decision 1–3 · §2.2 seq-1 · §2.3 · §4.C ch.2,4,6 |
| PUMS-S01/ERR-1, ERR-2 | per-CID rescue + Datadog alert | §2.2 seq-2 · §3.A.1 · §3 Monitoring |
| PUMS-S02/AC-1..AC-3, ERR-1, ERR-2 | GET .../postpaid_usage_snapshots + Modpanel view | §2.4 row 1 · §4.C ch.7,8 |
| PUMS-S03/AC-1..AC-3, ERR-1 | billing_type display map | §2.3 (billing_type enum) · §4.C ch.8 |
| PUMS-S04/AC-1..AC-4, ERR-1, ERR-2 | list endpoint search/filter params | §2.4 row 1 · §4.C ch.8 |
| PUMS-S05/AC-1..AC-5, ERR-1 | client-side selection (Stimulus) | §4.C ch.9 (FE-in-repo) |
| PUMS-S06/AC-1..AC-4, ERR-1..ERR-3 | POST/GET .../postpaid_usage_exports + PostpaidUsageExportWorker | §2 Decision 4–6 · §2.2 seq-3,4 · §2.3 · §2.C · §4.C ch.10,11,12 |
Reverse (RFC → PRD AC):
| New endpoint / table / service / dependency | PRD composite AC id it serves |
|---|---|
postpaid_usage_snapshots table | PUMS-S01/AC-1, PUMS-S02/AC-1 |
hub_core bulk summary (waba_id + V3) | PUMS-S01/AC-4, PUMS-S02/AC-1 |
GET postpaid_usage_snapshots (paginated) | PUMS-S02/AC-3, PUMS-S04/AC-1..AC-4 |
postpaid_usage_exports table + worker | PUMS-S06/AC-1, AC-2 |
| 50 MB pre-check | PUMS-S06/ERR-1 |
expires_at + TTL cleanup | PUMS-S06/AC-4 |
| Finance permission key | PUMS-S02…S06 Permission Model |
UI / Consumer Surface Coverage
| PRD-named surface | Consumer | Required reads (BE endpoint) | Required writes (BE endpoint) | Status surface |
|---|---|---|---|---|
/modpanel/postpaid-usage Dashboard (table, search, filter) | web (Modpanel, server-rendered) | GET .../postpaid_usage_snapshots | n/a — read-only view | report_date per row |
| Bulk download / ZIP status toast | web (Modpanel + Stimulus poll) | GET .../postpaid_usage_exports/:job_id | POST .../postpaid_usage_exports | postpaid_usage_exports.status enum |
Role Coverage
| PRD role | Authorization mechanism | Endpoints permitted | Cross-tenant? | Audit trail |
|---|---|---|---|---|
| Finance Modpanel user | Pundit can? + permission billing_postpaid_usage_index / _download | list + export trigger + poll | yes — Finance sees all CIDs (internal tool, by design) | bulk_download_triggered event + integration_logs |
| Non-Finance Modpanel user | Pundit denies (no permission) | none → 403 / hidden | n/a | denied attempt logged |
| System (cron) | no auth surface — Sidekiq-internal | n/a (no manual-trigger endpoint) | n/a | snapshot_generated / snapshot_failed |
PRD Section Coverage
| PRD § | Title | Where covered |
|---|---|---|
| 3 | One-liner + Problem | §1 Overview |
| 4 | Target Users + Persona | §1 (Finance) · Role Coverage |
| 5 | Non-Goals | §1 Out of Scope |
| Scope Changes | Backend/Frontend | §1 (BE here; FE = in-repo server-rendered) |
| 6 / 6.1 / 6.2 | Constraints / Lifecycle / Service Architecture | §3 Performance · §2.3 retention · §2 Infra Topology (corrected) |
| 7 | New Features (Dashboard) | §2.4 · §4.C ch.7–9 |
| 8 | API & Webhook Behavior | §2.4 APIs · §2.C async |
| 9.1 / 9.2 | System Flow / User Stories | §2.2 sequence · Detail 1.C |
| 10 | Rollout | §4 Rollout Strategy |
| 11 / 11.1 | Observability / Cadence | §3 Monitoring & Alerting |
| 12 | Success Metrics | §1 Success Criteria |
| 13 | Launch Plan & Stage Gates | §4 Rollout Strategy |
| 14 | Dependencies | §1 Dependencies |
| 15 | Key Decisions | §2 Technical Decisions (ADR) |
| 16 | Open Questions | §5 (OQ-1 storage, OQ-3 V2 carried; OQ-1 WABA resolved) |
Detail 1.B — Key Decisions Summary (full ADR treatment in §2 Technical Decisions)
| # | Decision | Chosen option | §2 block |
|---|---|---|---|
| 1 | Where snapshots are stored | New postpaid_usage_snapshots table in moderator-be chat_billing DB | Decision 1 |
| 2 | How moderator-be reads source usage | Cross-service pigeon_get to hub_core via qontak_chat_service (reuse existing pattern) | Decision 2 |
| 3 | hub_core: extend vs new bulk endpoint | Extend the existing DownloadPostpaidMonthlyUsage report into a bulk summary (adds waba_id + V3) | Decision 3 |
| 4 | Snapshot generation sync vs async | Async Sidekiq cron, per-CID isolation, no run abort | Decision 4 |
| 5 | Bulk export sync vs async + ZIP build location | Async Sidekiq worker in moderator-be, caxlsx + rubyzip | Decision 5 |
| 6 | ZIP storage + delivery | Object storage + presigned URL — mechanism unconfirmed (recommended option + blocker) | Decision 6 |
| 7 | File format XLSX vs CSV | XLSX via caxlsx (already bundled) — resolves PRD OQ-2 | Decision 7 |
Detail 1.C — Per-Story Change Map (human-readable index)
| Story id | Title | Layer scope | Changes (concrete BE artifacts) | Composite AC ids | Acceptance criteria (verifiable) | RFC anchors |
|---|---|---|---|---|---|---|
| PUMS-S01 | Monthly Snapshot Generation | Runtime / behavior (cron) + Cross-squad (hub_core) | PostpaidUsageSnapshotWorker; schedule.yml cron 0 2 1 * * Asia/Jakarta; postpaid_usage_snapshots migration+model; SummaryPostpaidUsage pigeon repo; hub_core bulk summary repo/interactor/builder + chat_service route | S01/AC-1..AC-4, ERR-1, ERR-2 | rspec on worker passes; running worker for a fixture month inserts N rows = eligible CIDs × types; per-CID raise logs snapshot_failed and run completes; Datadog count emitted | §2 Dec 1–4 · §2.2 seq-1,2 · §2.3 · §4.C ch.2,3,4,6 |
| PUMS-S02 | View Postpaid Usage Table | BE + FE consumes new | GET .../postpaid_usage_snapshots (Pagy 50/page); Modpanel controller + ERB view + Stimulus | S02/AC-1..AC-3, ERR-1, ERR-2 | rspec request spec returns 50 rows w/ columns; flag-off → 403/hidden; empty month → empty-state JSON | §2.4 row 1 · §4.C ch.7,8 |
| PUMS-S03 | Identify Postpaid Types | BE + FE consumes new | billing_type enum → label map (WA Balance / MUV / Call Balance / Unknown) | S03/AC-1..AC-3, ERR-1 | unit test maps each enum to label; unknown code → "Unknown" not blank | §2.3 enum · §4.C ch.8 |
| PUMS-S04 | Search & Filter | BE + FE consumes new | list endpoint params search_query (exact CID/WABA), year_month (required, default latest) | S04/AC-1..AC-4, ERR-1, ERR-2 | request spec: exact-match CID returns subset; combined filter ANDs; no match → empty-state | §2.4 row 1 · §4.C ch.8 |
| PUMS-S05 | Bulk Selection | FE-only (Stimulus, in-repo) | Stimulus controller: per-row + select-all-across-pages, persist across pagination, clear on filter change | S05/AC-1..AC-5, ERR-1 | Stimulus unit/jest or system spec: select-all sets count; filter change clears; refresh clears | §4.C ch.9 |
| PUMS-S06 | Bulk Download (Async ZIP) | BE + FE consumes new | POST/GET .../postpaid_usage_exports; PostpaidUsageExportWorker; postpaid_usage_exports migration+model; caxlsx+rubyzip; storage+presigned (Dec 6); 50 MB pre-check | S06/AC-1..AC-4, ERR-1..ERR-3 | request spec: >50 MB → reject + zip_size_limit_exceeded; worker builds ZIP w/ exact file names; status → completed; link past expires_at → "expired"; 0 selected → 422 | §2 Dec 5–7 · §2.2 seq-3,4 · §2.3 · §2.C · §4.C ch.10,11,12 |
2. Technical Design
Infrastructure Topology (start here)
Deployment topology
flowchart TB
finance(["Finance user (browser)"]) -->|HTTPS| lb["Load Balancer / Ingress"]
lb -->|HTTP| mod["moderator-be api pods (Rails 7.2, Puma)"]
mod -->|"read / write"| billdb[("chat_billing Postgres (primary)")]
mod -->|"enqueue"| redis[("Redis (Sidekiq queues)")]
redis -->|consume| worker["moderator-be Sidekiq worker pods"]
cron["sidekiq-cron (schedule.yml)"] -->|"0 2 1 * * Asia/Jakarta"| redis
worker -->|"read / write"| billdb
worker -->|"HTTPS pigeon_get (auth token)"| chatsvc["qontak_chat_service (host app + hub_core engine)"]
chatsvc -->|"read"| hubdb[("hub_core billing Postgres + replica_billing")]
worker -->|"PUT object (UNCONFIRMED — Decision 6)"| oss[("Object storage (ZIP files)")]
mod -->|"GET presigned URL (UNCONFIRMED)"| oss
Per-service responsibility
flowchart LR
subgraph mod["moderator-be (Bifrost — Ruby/Rails 7.2)"]
uc1["PostpaidUsageSnapshotWorker (cron — generate snapshots)"]
uc2["GET postpaid_usage_snapshots (list / search / filter)"]
uc3["POST/GET postpaid_usage_exports (bulk ZIP)"]
uc4["PostpaidUsageExportWorker (build XLSX + ZIP)"]
end
subgraph hub["hub_core engine via qontak_chat_service (Bifrost)"]
s1["GET summary_postpaid_usage (bulk monthly usage + waba_id + V3)"]
end
uc1 -->|"HTTPS pigeon_get"| s1
s1 -->|"ActiveRecord read (replica)"| hubread[("monthly_usages, organizations, waba_accounts, postpaid_limits")]
uc1 -->|"INSERT frozen rows"| snaptbl[("postpaid_usage_snapshots")]
uc2 -->|"SELECT (Pagy)"| snaptbl
uc3 -->|"enqueue + status"| exptbl[("postpaid_usage_exports")]
uc4 -->|"read snapshots, write status"| exptbl
uc4 -->|"caxlsx + rubyzip, PUT object"| oss[("Object storage")]
Every external/internal edge above is verified except the
Object storageedges (Decision 6, §5 OQ-1) — drawn dashed-in-prose as UNCONFIRMED.
Technical Decisions (ADR-format — the engineering heart)
Decision 1: Store snapshots in moderator-be's chat_billing DB (not in hub_core)
Context The frozen monthly snapshot is a new dataset shaped for the Modpanel dashboard
(per-CID per-type rows with waba_id, display company_name, report_date). It must be queried
with pagination/search on every page load (≤ 3 s for 1,000 rows) and must be immutable. hub_core
owns the source aggregates (monthly_usages) but is a shared billing engine; moderator-be
owns the Modpanel and its billing DB (chat_billing, models under Billings::ApplicationRecord).
Options considered
- Option A — new table in
moderator-bechat_billing: snapshot lives next to the dashboard.- Pros: dashboard reads are local (no cross-service latency on page load); Bifrost owns it end-to-end; no schema churn in the shared billing engine.
- Cons: data duplication (snapshot copies a projection of
monthly_usages); a cross-service read at generation time.
- Option B — new table in
hub_corebilling DB: snapshot lives with the source data.- Pros: no cross-service read at generation; single source of truth.
- Cons: every dashboard page load becomes a cross-service call (blows the 3 s budget / couples Modpanel to chat_service uptime); schema change in a shared engine consumed by many host apps; Modpanel RBAC can't gate a hub_core table.
Decision: Option A.
Rationale The dashboard's read-latency budget and Modpanel ownership dominate. Generation is a
once-a-month batch where a cross-service read is acceptable; page loads are frequent and must be
local. Mirrors the existing split — moderator-be already proxies billing reads cross-service
rather than embedding the billing engine.
Consequences Snapshot is a denormalized projection; if monthly_usages is later corrected the
snapshot does not auto-update (acceptable — snapshots are intentionally frozen, PUMS-S01/AC-2).
A cross-service dependency exists at generation (handled by Decision 4 isolation).
Reversibility Moderate. The table + worker are additive; moving to Option B means relocating the
migration to hub_core and swapping the read endpoint for a local query — bounded, no data loss
(snapshots regenerate monthly).
Decision 2: Read source usage cross-service via pigeon_get to qontak_chat_service
Context moderator-be cannot read hub_core's billing DB directly — hub_core is a Rails
engine mounted in a different service (qontak_chat_service) with its own DB. The verified,
in-production pattern for billing reads is pigeon_get to ENV['CHAT_PANEL_SERVICE_BASE_URL']
with a Core::Services::Redis::ChatPanel::Auth token (see DownloadMonthlyPostpaidUsage).
Options considered
- Option A — reuse
pigeon_getcross-service HTTP: extend the existing report client.- Pros: zero new infra; reuses auth, circuit breaker, retry, timeout already wired in
Core::Repositories::AbstractRepository; matches how Modpanel already fetches billing data. - Cons: bulk payload (all postpaid CIDs for a month) over HTTP — must paginate/stream to avoid a huge response.
- Pros: zero new infra; reuses auth, circuit breaker, retry, timeout already wired in
- Option B — direct DB connection from
moderator-betohub_corebilling DB: add a shard.- Pros: no HTTP serialization overhead.
- Cons: cross-service DB coupling (anti-pattern here); bypasses the billing engine's encryption/decryption (
organization_nameis lockbox-encrypted in hub_core); ops/secret-sharing burden.
Decision: Option A.
Rationale Reuse of a proven, secured path; respects the service boundary; lets hub_core do
the lockbox decryption server-side so moderator-be never holds the billing master key. Bulk size
is managed by paginating the summary endpoint (Decision 3) and the once-a-month cadence.
Consequences Generation latency includes HTTP round-trips; the summary endpoint must paginate. A chat_service outage delays (not corrupts) snapshot generation — the cron can be re-run.
Reversibility Easy — it's one repository class behind the worker.
Decision 3: Extend hub_core's existing report into a paginated bulk summary (adds waba_id + V3)
Context hub_core already has Billings::Repositories::V1::Reports::DownloadPostpaidMonthlyUsage
(reads monthly_usages joined to organizations, returns mcc_total/muv_total). It does not
return waba_id, does not cover V3 Call Balance or whitelisted quota components
(organization_package_component_postpaid_limits), and is shaped to render a CSV for one
organization_name filter — not a bulk machine-readable summary for all CIDs.
Options considered
- Option A — extend the existing report repo into a bulk JSON summary: add
waba_id, V3 components,billing_version, pagination; expose via a new chat_service route.- Pros: reuses the verified query (
read_from_replica_db, org join, encryption handling); one place for postpaid-usage logic; least new surface in the shared engine. - Cons: the existing repo is CSV-oriented (
Builders::...DownloadPostpaidMonthlyUsage); needs a sibling builder for JSON + the extra fields.
- Pros: reuses the verified query (
- Option B — brand-new unrelated interactor: build from scratch.
- Pros: clean separation from the CSV report.
- Cons: duplicates the org-join + replica-read + encryption logic; two code paths to keep correct.
Decision: Option A — a new bulk summary interactor/repository/builder alongside the existing
report, sharing its read helpers, exposed at a new route (e.g. GET /api/core/v1/reports/billing/summary_postpaid_usage).
Rationale Maximizes reuse of verified, encryption-aware read code; isolates the new JSON shape
in a dedicated builder; keeps the CSV report untouched (backward compatible). Resolves PRD OQ-1
(waba_id from organizations.settings['waba_id'] or waba_accounts) in the one place that can
do it without leaking the encryption key.
Consequences hub_core gains a new bulk endpoint and a quota/V3 query path
(organization_package_component_postpaid_limits); a sibling builder is maintained. V3 quota-code
enumeration must be confirmed with the billing team (§5 OQ-2-bis).
Reversibility Easy — additive; the existing CSV report is unchanged.
Decision 4: Snapshot generation is async (Sidekiq cron) with per-CID isolation
Context Generation spans all postpaid CIDs × billing types and a cross-service read — far over 100 ms and partial-failure-prone. PRD requires the run to continue past a single CID failure and finish within 2 h.
Options considered
- Option A — one Sidekiq cron job, iterate CIDs, rescue per CID:
retry: 0, log + continue.- Pros: simple; matches existing monthly crons (
chatbot_ai_reset_monthly_task,airene_reset_monthly_task— both0 0 1 * * Asia/Jakarta); one run = one month. - Cons: a single long-running job; one slow CID slows the whole run.
- Pros: simple; matches existing monthly crons (
- Option B — fan-out: dispatcher enqueues one job per CID (mirrors hub_core's
SingleAddMonthlyUsageWorker).- Pros: parallelism, natural per-CID isolation/retry.
- Cons: more moving parts; harder to assert "run complete + failure rate"; thundering cross-service reads against chat_service.
Decision: Option A for Phase 1 (single job, per-CID begin/rescue, accumulate failures, emit
snapshot_failed per CID + a run summary). Revisit Option B only if the 2 h budget is breached.
Rationale Matches the verified monthly-cron precedent and retry: 0 convention; keeps the
cross-service read rate controlled; makes ERR-2 (>5% failure → alert) a single post-run check.
Consequences A pathologically slow chat_service could approach the 2 h budget; mitigated by paginating the summary read and a per-page timeout. No automatic retry of the whole run (manual re-trigger via re-enqueue if needed).
Reversibility Easy — swap the loop body for perform_async per CID.
Decision 5: Bulk export is async (Sidekiq) building XLSX+ZIP in moderator-be
Context A Finance selection can be hundreds of CIDs; synchronous generation times out the
browser (PRD §15). The named repo set has no Go report-worker — so the ZIP is built in
moderator-be. moderator-be already bundles caxlsx and has an async-export precedent
(new_reporting_flow, and a download_mcc_v2 path that returns a zip).
Options considered
- Option A —
moderator-beSidekiq worker,caxlsx+rubyzip: enqueue on trigger, poll status.- Pros: stays in the named repos; reuses Sidekiq + the existing async-export UX precedent;
caxlsxalready present. - Cons: needs a new
rubyzipdependency and object storage (Decision 6); ZIP build is CPU/memory on the worker pod.
- Pros: stays in the named repos; reuses Sidekiq + the existing async-export UX precedent;
- Option B — delegate to the chat/reporting service's async export (the service behind
new_reporting_flow).- Pros: that service may already have storage + presigned URLs.
- Cons: that service is not in this RFC's grounded repo set; its contract for "bulk postpaid snapshot ZIP" is unverified; couples delivery to another team's roadmap.
Decision: Option A.
Rationale Keeps the work inside the two named repos and reuses caxlsx + the async precedent;
avoids a hard dependency on an un-grounded service. The one gap (storage) is isolated to Decision 6.
Consequences Adds rubyzip; worker pods need memory headroom for ZIP build (≤ 50 MB cap bounds
it); introduces the storage dependency (Decision 6 / OQ-1).
Reversibility Moderate — the worker + tables are additive; switching to Option B replaces the worker body with a delegation call.
Decision 6: ZIP storage + 24 h presigned delivery — recommended mechanism, unconfirmed infra
Context The generated ZIP must persist and be served to Finance as a link that expires in
24 h (PRD §6.1). No S3/OSS/GCS client or storage gem was found in moderator-be (verified
grep — only caxlsx, no aws-sdk/aliyun/fog/storage service). This is the one place the design
cannot be fully grounded in existing code.
Options considered
- Option A — object storage + presigned URL (S3/Alibaba OSS),
expires_at = created_at + 24h.- Pros: standard; link expiry maps to presigned TTL; no large blobs in Postgres.
- Cons: requires provisioning a bucket + client in
moderator-be— not present today.
- Option B — store the ZIP bytes in Postgres / on a shared volume, serve via an authenticated
moderator-bedownload endpoint that checksexpires_at.- Pros: no new external infra;
moderator-bealready serves files viasend_data. - Cons: large blobs in the DB (≤ 50 MB × concurrent jobs) is poor practice; volume needs cleanup.
- Pros: no new external infra;
Decision: Recommend Option A; treat the concrete storage client/bucket as a blocker for the export-delivery chunks only (§4.C ch.11–12) until Infra confirms availability (§5 OQ-1). Option B is the documented fallback if no bucket can be provisioned in Phase 1.
Rationale Object storage is the correct long-term answer and matches the PRD's presigned-URL intent; but the RFC must not fabricate a storage client that does not exist. The snapshot + dashboard (Phase 1a) ship independently of this; bulk export (Phase 1b) gates on this confirmation.
Consequences Phase 1b cannot execute until storage is decided. The postpaid_usage_exports
schema is storage-agnostic (file_ref holds either an object key or a local path) so neither option
forces a migration change.
Reversibility The file_ref indirection makes A↔B a worker-internal change, no schema churn.
Decision 7: File format is XLSX via caxlsx (resolves PRD OQ-2)
Context PRD §16 OQ-2 left XLSX-vs-CSV open, assuming the Go report-worker only did CSV.
Options considered
- Option A — XLSX via
caxlsx:Gemfile:128already bundles it.- Pros: matches PRD's stated
.xlsxnaming; no new dependency for the spreadsheet itself. - Cons: slightly larger files than CSV.
- Pros: matches PRD's stated
- Option B — CSV: simpler.
- Pros: smallest files.
- Cons: diverges from PRD's stated format; Finance asked for
.xlsx.
Decision: Option A — XLSX via caxlsx.
Rationale caxlsx is already a dependency, so XLSX carries no new library cost — the original
OQ-2 concern (needing a new Go Excel lib) does not apply in the Ruby path.
Consequences Per-file .xlsx; ZIP packaging via rubyzip.
Reversibility Trivial — the builder can emit CSV instead.
Detail 2.0 — Repo Reading Guide
Repo Map (mermaid)
flowchart LR
subgraph mod["moderator-be"]
sched["config/schedule.yml"]
wkr["app/domains/core/workers/sidekiq/"]
repo["app/domains/core/repositories/app_integrations/chat_panel/reports/"]
ctrl["app/controllers/api/v1/qontak/chat_panel/reports/"]
mdl["app/models/billings/"]
mig["db/billing/migrate/"]
views["app/views/billing/postpaid_usage/"]
end
subgraph hub["hub_core"]
hint["app/apps/billings/interactors/v1/reports/"]
hrepo["app/apps/billings/repositories/v1/reports/"]
hbld["app/apps/billings/builders/v1/reports/"]
hmdl["app/core/domains/models/billing/"]
end
sched --> wkr --> repo --> hint
hint --> hrepo --> hmdl
hrepo --> hbld
ctrl --> mdl --> mig
ctrl --> views
Existing Code Anchors
| Path (repo) | Why the agent reads it | What pattern it teaches |
|---|---|---|
moderator-be: app/controllers/api/v1/qontak/chat_panel/reports/reports_controller.rb | The existing postpaid endpoint + async (new_reporting_flow) + zip (download_mcc_v2) precedent | Dry::Matcher::ResultMatcher; send_data; flag-gated async/zip branching |
moderator-be: app/domains/core/repositories/app_integrations/chat_panel/reports/download_monthly_postpaid_usage.rb | The exact cross-service read pattern to copy for the summary read | pigeon_get(service: 'qontak_chat_service', path:…) + Core::Services::Redis::ChatPanel::Auth + parse_response_csv |
moderator-be: app/domains/core/workers/sidekiq/chatbot_ai_reset_monthly_worker.rb + abstract_worker.rb | Monthly-cron worker shape to copy | < Core::Workers::Sidekiq::AbstractWorker; sidekiq_options queue:, retry: 0; perform → repository |
moderator-be: config/schedule.yml (chatbot_ai_reset_monthly_task) | Exact cron entry format | 6-field cron + Asia/Jakarta; name/class/queue keys |
moderator-be: app/domains/core/repositories/abstract_repository.rb | Base for HTTP/circuit-breaker/Dry::Monads | include …Http::PigeonHttp, CircuitBreakers; parse_response |
moderator-be: app/domains/core/services/preference.rb | Flipper flag definition + check | Core::Services::Preference.new.enabled?(:flag) |
moderator-be: app/controllers/application_controller.rb (can?) + app/policies/application_policy.rb | Permission gate + Pundit policy shape | before_action :can?; Core::UseCases::Permissions::Check; pundit_user = current_account_user |
moderator-be: db/billing_schema.rb + config/database.yml (migrations_paths: db/billing/migrate) | Billing DB dialect + migration home | ActiveRecord::Schema[7.2]; UUID PK gen_random_uuid(); decimal 15,2; billing migrations live in db/billing/migrate |
moderator-be: app/models/billings/application_record.rb | Base class for new billing models | Billings::ApplicationRecord (chat_billing connection) |
moderator-be: app/domains/core/services/datadog/custom_metric.rb | Metric naming convention | Datadog::Statsd; names prefixed moderator_panel_* |
hub_core: app/apps/billings/repositories/v1/reports/download_postpaid_monthly_usage.rb (+ interactor + builder) | The report to extend into a bulk summary | read_from_replica_db { … }; Models::Billing::MonthlyUsage.where(timestamp: range); org join |
hub_core: app/core/domains/models/billing/organization_package.rb | Encryption + billing_version + company_id | has_encrypted :organization_name, key: :lockbox_billing_key; billing_version; payment_type |
hub_core: app/core/domains/models/organization.rb + waba_account.rb | WABA join (resolves PRD OQ-1) | store_accessor :settings, :waba_id; has_many :waba_accounts |
Existing Contracts to Reuse, Extend, or Replace
| Contract | Status | Justification | Owner |
|---|---|---|---|
moderator-be cross-service billing read (pigeon_get to chat_service) | reuse | Exact pattern in DownloadMonthlyPostpaidUsage | Bifrost |
hub_core DownloadPostpaidMonthlyUsage report | extend | Add waba_id, V3 Call Balance, quota components, bulk JSON + pagination via a sibling builder | Bifrost |
chat_service route GET /api/core/v1/reports/billing/summary_postpaid_usage | new-with-justification | No bulk machine-readable summary exists; the only route is the CSV download for one org filter | Bifrost |
moderator-be GET .../reports/postpaid_usage_snapshots (list) | new-with-justification | The existing endpoint returns a synchronous file for one name filter; the dashboard needs paginated JSON from the new local snapshot table | Bifrost |
moderator-be POST/GET .../reports/postpaid_usage_exports | new-with-justification | No bulk async-ZIP export endpoint for snapshots exists | Bifrost |
Sidekiq cron worker + schedule.yml | reuse/extend | Add one cron entry + one worker following the monthly-cron precedent | Bifrost |
Flipper flag via Core::Services::Preference | reuse | postpaid_usage_scheduler_enabled | Bifrost |
Pundit can? + Permission model | reuse/extend | Add one permission key | Bifrost |
Patterns to Follow
| Concern | Pattern in repo | Reference file | Deviation? |
|---|---|---|---|
| HTTP handler shape | Dry::Matcher::ResultMatcher over a use-case result | moderator-be: …/reports_controller.rb | none |
| Cross-service read | pigeon_get + chat-panel auth token | moderator-be: …/download_monthly_postpaid_usage.rb | none |
| Repository / DB access | < Core::Repositories::AbstractRepository; AR for billing models | moderator-be: abstract_repository.rb | none |
| Worker shape | < Core::Workers::Sidekiq::AbstractWorker; sidekiq_options | moderator-be: chatbot_ai_reset_monthly_worker.rb | none |
| Migration dialect | Postgres, UUID PK gen_random_uuid(), decimal 15,2 | moderator-be: db/billing_schema.rb | none |
| hub_core read | read_from_replica_db { … } on replica shard | hub_core: download_postpaid_monthly_usage.rb | none |
| Logging / metrics | Rails.logger.error + Core::Services::Datadog::CustomMetric (moderator_panel_*) | moderator-be: custom_metric.rb | none |
| Encryption | lockbox has_encrypted — decrypt only in hub_core | hub_core: organization_package.rb | none — moderator-be never holds the key |
Reading Order for the Agent
moderator-be: …/reports/reports_controller.rb— controller + async/zip precedent.moderator-be: …/reports/download_monthly_postpaid_usage.rb— cross-service read to copy.moderator-be: app/domains/core/workers/sidekiq/chatbot_ai_reset_monthly_worker.rb+abstract_worker.rb— worker shape.moderator-be: config/schedule.yml— cron entry format.moderator-be: db/billing_schema.rb+config/database.yml— billing dialect +db/billing/migratepath.moderator-be: app/models/billings/application_record.rb— billing model base.moderator-be: app/controllers/application_controller.rb(can?) +app/policies/application_policy.rb— authz.moderator-be: app/domains/core/services/preference.rb— Flipper flag check.hub_core: …/reports/download_postpaid_monthly_usage.rb(+ interactor + builder) — the report to extend.hub_core: organization.rb+waba_account.rb+organization_package.rb— WABA join + encryption + billing_version.
Source Verification (anti-hallucination — required)
| Anchor / pattern / contract | Verified by | Evidence |
|---|---|---|
| moderator-be reports controller | read | class Api::V1::Qontak::ChatPanel::Reports::ReportsController; download_monthly_postpaid_usage L45-54; new_reporting_flow branch L24-30; download_mcc_v2 → 'zip' |
| moderator-be cross-service read | read | pigeon_get(service: 'qontak_chat_service', path: '/api/core/v1/reports/billing/download/monthly_postpaid_usage'…); @base_url = ENV['CHAT_PANEL_SERVICE_BASE_URL']; Core::Services::Redis::ChatPanel::Auth |
| Worker base + monthly cron | read | class Core::Workers::Sidekiq::AbstractWorker; include Sidekiq::Worker; ChatbotAiResetMonthlyWorker < …AbstractWorker, sidekiq_options queue: :chatbot_ai_reset_monthly, retry: 0 |
| Cron format | read | chatbot_ai_reset_monthly_task: cron: "0 0 1 * * Asia/Jakarta" (6-field + TZ); airene_reset_monthly_task same |
| AbstractRepository HTTP | read | include Core::Repositories::Http::PigeonHttp, CircuitBreakers; parse_response, parse_response_file |
| Flipper preference | read | Core::Services::Preference#enabled?(:feature, …) → Flipper.enabled?(feature.to_sym) |
| Permission/Pundit | read (agent) | application_controller.rb can? → Core::UseCases::Permissions::Check; application_policy.rb pundit_user = current_account_user |
| Billing migration path + dialect | read | config/database.yml: migrations_paths: db/billing/migrate; db/billing_schema.rb ActiveRecord::Schema[7.2], id: :uuid, default: -> { "gen_random_uuid()" }, decimal precision: 15, scale: 2 |
| Billing model base | read | app/models/billings/application_record.rb (Billings::ApplicationRecord) |
| Datadog metric naming | read (agent) | Core::Services::Datadog::CustomMetric; @name = "moderator_panel_#{name}" |
caxlsx present / no zip/OSS gem | read (grep) | Gemfile:128 gem 'caxlsx'; grep for rubyzip|aws|aliyun|fog|presigned|put_object → no match |
| hub_core report to extend | read | Billings::Repositories::V1::Reports::DownloadPostpaidMonthlyUsage; read_from_replica_db { Models::Organization.select(:id,:name,:company_id,:moderator_account_id)… }; Models::Billing::MonthlyUsage.where(timestamp: @start_date..@end_date) — no waba_id |
| hub_core monthly_usages columns | read (agent) | migration create_monthly_usages: organization_id, timestamp, mcc_total, muv_total (decimal 15,2) |
| hub_core WABA join (OQ-1) | read (agent) | Models::Organization store_accessor :settings, :waba_id; has_many :waba_accounts; Models::WabaAccount waba_id, waba_name |
| hub_core encryption + billing_version | read (agent) | organization_package.rb has_encrypted :organization_name, key: :lockbox_billing_key; billing_version default '1.0.0'; payment_type (prepaid/postpaid) |
Detail 2.1 — Architecture (mermaid)
Component diagram
flowchart TB
cron["sidekiq-cron schedule.yml"] --> snapwkr["PostpaidUsageSnapshotWorker"]
snapwkr --> sumrepo["SummaryPostpaidUsage (pigeon repo)"]
sumrepo -->|HTTPS| chatsvc["qontak_chat_service / hub_core summary interactor"]
chatsvc --> hubrepo["Billings::Repositories::V1::Reports::SummaryPostpaidUsage"]
hubrepo --> hubdb[("monthly_usages / organizations / waba_accounts / postpaid_limits")]
snapwkr --> snapmodel["Billings::PostpaidUsageSnapshot"]
snapmodel --> snaptbl[("postpaid_usage_snapshots")]
finance(["Finance (Modpanel)"]) --> listctrl["ReportsController#postpaid_usage_snapshots"]
listctrl --> snaptbl
finance --> expctrl["ReportsController#postpaid_usage_exports"]
expctrl --> expmodel["Billings::PostpaidUsageExport"]
expmodel --> exptbl[("postpaid_usage_exports")]
expctrl --> expwkr["PostpaidUsageExportWorker"]
expwkr --> snaptbl
expwkr --> xlsx["caxlsx + rubyzip"]
xlsx --> oss[("Object storage (Decision 6 — unconfirmed)")]
Service use cases & third-party connections
flowchart LR
subgraph mod["moderator-be (Bifrost)"]
g1["snapshot cron (generate)"]
g2["list / search (read)"]
g3["export trigger + poll"]
g4["export worker (build ZIP)"]
end
g1 -->|"HTTPS pigeon — qontak_chat_service"| ext1(["hub_core summary (Bifrost)"])
g1 -->|"AR write"| db[("chat_billing Postgres")]
g2 -->|"AR read (Pagy)"| db
g3 -->|"enqueue Redis"| q[["Sidekiq queue"]]
g4 -->|"object PUT (unconfirmed)"| oss(["Object storage"])
g4 -->|"AR read/write"| db
Data model (erDiagram)
erDiagram
POSTPAID_USAGE_SNAPSHOTS {
uuid id PK
string cid
uuid organization_id
string waba_id
string company_name
string year_month
string billing_type
decimal usage_value
date report_date
timestamptz created_at
timestamptz updated_at
}
POSTPAID_USAGE_EXPORTS {
uuid id PK
string job_id
string status
string year_month
integer selected_count
bigint file_size_bytes
string file_ref
timestamptz expires_at
uuid requested_by
text error_details
timestamptz created_at
timestamptz updated_at
}
POSTPAID_USAGE_EXPORTS ||--o{ POSTPAID_USAGE_SNAPSHOTS : "selects (by year_month + ids, no FK)"
State machine — postpaid_usage_exports.status
stateDiagram-v2
[*] --> pending: POST export (within 50MB)
pending --> processing: worker picks up
processing --> completed: ZIP built and stored
processing --> failed: build or storage error
completed --> expired: past expires_at (24h)
failed --> [*]
expired --> [*]
Branch & skip flow — per-CID failure + size guard
flowchart TD
start(["snapshot worker: next CID"]) --> read{"summary read OK for CID?"}
read -- yes --> ins["INSERT frozen snapshot rows"]
read -- no --> skip["log snapshot_failed (cid, year_month, reason) + continue"]
ins --> more{"more CIDs?"}
skip --> more
more -- yes --> start
more -- no --> rate{"failure rate above 5 percent?"}
rate -- yes --> alert["emit Datadog alert metric"]
rate -- no --> done(["run complete"])
alert --> done
flowchart TD
trig(["POST export: selected ids"]) --> zero{"selected count is zero?"}
zero -- yes --> rej0["422 — button should be disabled"]
zero -- no --> size{"estimated size above 50MB?"}
size -- yes --> rej["reject — log zip_size_limit_exceeded"]
size -- no --> enq["insert export row pending + enqueue worker"]
enq --> ok(["return job_id — toast: generating"])
Detail 2.2 — Sequence (end-to-end across infra layers)
Happy path — monthly snapshot generation
sequenceDiagram
autonumber
participant Cron as sidekiq-cron
participant Q as Redis (Sidekiq)
participant W as PostpaidUsageSnapshotWorker
participant R as SummaryPostpaidUsage repo
participant CS as qontak_chat_service (hub_core)
participant DR as hub_core replica_billing
participant DB as chat_billing primary
Cron->>Q: enqueue at 02:00 on the 1st (Asia/Jakarta)
Q->>W: perform
W->>W: check flag postpaid_usage_scheduler_enabled
loop paginated over postpaid CIDs
W->>R: fetch summary (year_month, page)
R->>CS: pigeon_get summary_postpaid_usage (auth token)
CS->>DR: read monthly_usages + orgs + waba + quotas
DR-->>CS: usage rows (decrypted org_name)
CS-->>R: 200 JSON (page of CIDs x types)
R-->>W: parsed rows
W->>DB: INSERT frozen snapshot rows (idempotent on cid+year_month+type)
Note over W,DB: emit snapshot_generated per CID
end
W->>W: if failure rate above 5 percent emit alert metric
Failure path — cross-service read fails for a CID page
sequenceDiagram
autonumber
participant W as PostpaidUsageSnapshotWorker
participant R as SummaryPostpaidUsage repo
participant CS as qontak_chat_service
W->>R: fetch summary (year_month, page)
R->>CS: pigeon_get (timeout 60s, circuit breaker)
CS--xR: timeout or 5xx
R-->>W: Failure(reason)
W->>W: log snapshot_failed for affected CIDs + continue next page
Note over W: run does not abort — ERR-1
Happy path — bulk ZIP export
sequenceDiagram
autonumber
actor F as Finance (Modpanel)
participant LB as Load Balancer
participant API as moderator-be api
participant DB as chat_billing
participant Q as Redis (Sidekiq)
participant EW as PostpaidUsageExportWorker
participant OSS as Object storage (Decision 6)
F->>LB: POST postpaid_usage_exports (selected ids)
LB->>API: HTTP
API->>API: can? billing_postpaid_usage_download
API->>DB: estimate size — if above 50MB reject
API->>DB: INSERT export row (pending)
API->>Q: enqueue PostpaidUsageExportWorker(job_id)
API-->>F: 202 job_id — toast generating
Q->>EW: perform(job_id)
EW->>DB: read selected snapshots
EW->>EW: build one .xlsx per CID per type (caxlsx) + zip (rubyzip)
EW->>OSS: PUT zip object (private)
EW->>DB: UPDATE export completed + file_ref + expires_at (24h)
F->>API: GET postpaid_usage_exports/:job_id (poll)
API-->>F: completed + presigned URL
Failure path — export build/storage error
sequenceDiagram
autonumber
participant EW as PostpaidUsageExportWorker
participant DB as chat_billing
participant OSS as Object storage
EW->>EW: build XLSX + ZIP
EW->>OSS: PUT zip object
OSS--xEW: storage error
EW->>DB: UPDATE export failed + error_details
Note over EW: retry with exponential backoff up to MaxFails then leave failed
Detail 2.3 — Database Model (DDL)
Postgres, billing DB (
chat_billing); migrations inmoderator-be: db/billing/migrate/. Matchesdb/billing_schema.rbdialect (UUID PKgen_random_uuid(), decimal 15,2). Models inheritBillings::ApplicationRecord.
-- db/billing/migrate/<ts>_create_postpaid_usage_snapshots.rb
CREATE TABLE postpaid_usage_snapshots (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
cid varchar NOT NULL,
organization_id uuid NOT NULL,
waba_id varchar, -- nullable: shared/absent WABA (PRD OQ-1 resolved)
company_name varchar, -- decrypted in hub_core, stored plain for display
year_month varchar(7) NOT NULL, -- 'YYYY-MM' (prior month)
billing_type varchar NOT NULL, -- enum below
usage_value decimal(15,2) NOT NULL DEFAULT 0.0,
report_date date NOT NULL,
created_at timestamptz NOT NULL,
updated_at timestamptz NOT NULL
);
-- one frozen row per CID per type per month (idempotent regeneration / no dup)
CREATE UNIQUE INDEX idx_pus_unique ON postpaid_usage_snapshots (cid, year_month, billing_type); -- supports upsert-once + dedup
CREATE INDEX idx_pus_year_month ON postpaid_usage_snapshots (year_month); -- supports default-month list
CREATE INDEX idx_pus_search ON postpaid_usage_snapshots (year_month, cid, waba_id); -- supports search (PUMS-S04)
-- db/billing/migrate/<ts>_create_postpaid_usage_exports.rb
CREATE TABLE postpaid_usage_exports (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
job_id varchar NOT NULL,
status varchar NOT NULL DEFAULT 'pending', -- pending|processing|completed|failed|expired
year_month varchar(7) NOT NULL,
selected_count integer NOT NULL DEFAULT 0,
file_size_bytes bigint,
file_ref varchar, -- object key OR local path (Decision 6 indirection)
expires_at timestamptz, -- created + 24h on completion
requested_by uuid NOT NULL, -- moderator account id
error_details text,
created_at timestamptz NOT NULL,
updated_at timestamptz NOT NULL
);
CREATE UNIQUE INDEX idx_pue_job_id ON postpaid_usage_exports (job_id); -- poll by job_id
CREATE INDEX idx_pue_status_expiry ON postpaid_usage_exports (status, expires_at); -- TTL cleanup cron
-
billing_typeenum (PUMS-S01/AC-3,4 · PUMS-S03):WA_BALANCE_V1,MUV_V1,WA_BALANCE_V3,MUV_V3,CALL_BALANCE_V3, plus whitelisted V3 quota component codes (e.g.CP-QONTAKCHAT-2025-0005). Display map:WA_BALANCE_* → "WA Balance",MUV_* → "MUV",CALL_BALANCE_V3 → "Call Balance", unmapped →"Unknown"(PUMS-S03/ERR-1). -
Cardinality / growth: ~(postpaid CIDs) × (2–5 types) rows/month. At, say, 2,000 CIDs × 4 ≈ 8,000 rows/month → ~192k rows over the 24-month retention (PRD §6.1). Trivial for Postgres.
-
Example row:
('12345', '<uuid>', '628xxx', 'Acme Corp', '2026-05', 'WA_BALANCE_V3', 1234567.89, '2026-06-01'). -
PII classification:
company_name= sensitive (business identity) — decrypted in hub_core, stored plain here for display, never logged.waba_id/cid= internal identifiers.usage_value= financial. No customer phone/PII in this table. -
Retention: snapshots 24 months (PRD §6.1) — annual purge cron (out of scope to build now; noted). Exports: row + file 24 h, then TTL cleanup →
expired+ object delete. -
Per-status lifecycle (
postpaid_usage_exports.status):Status Visibility Retention Restore semantics Transitions allowed pendingpoll only until processing n/a → processing processingpoll only until done/fail n/a → completed / failed completeddownload link active file 24 h re-trigger to regenerate → expired failed"Generation failed" 24 h re-trigger terminal expired"link expired" row retained per audit re-trigger terminal -
Partition / sharding: none — volume is small.
-
NoSQL alternative: rejected — relational filtering/pagination on
year_month/cid/waba_idis the core access pattern; Postgres already hosts the billing DB.
Detail 2.4 — APIs
Outbound endpoints (consumers call us — moderator-be)
| Endpoint | Method | AuthN/AuthZ | Request | Response | Status codes | Idempotency | Versioning | Reuse? |
|---|---|---|---|---|---|---|---|---|
/api/v1/qontak/chat_panel/reports/postpaid_usage_snapshots | GET | session + can? billing_postpaid_usage_index | year_month (req, default latest), search_query (opt, exact CID/WABA), page | { data: [rows], pagy: {…} } | 200 / 403 / 422 | n/a (read) | /api/v1 | new-with-justification |
/api/v1/qontak/chat_panel/reports/postpaid_usage_exports | POST | session + can? billing_postpaid_usage_download | { year_month, snapshot_ids: [], select_all: bool } | { status:'success', job_id } | 202 / 403 / 422 (size/zero) | job_id (server-generated) | /api/v1 | new-with-justification |
/api/v1/qontak/chat_panel/reports/postpaid_usage_exports/:job_id | GET | session + can? billing_postpaid_usage_download | path job_id | { status, download_url?, expires_at?, message? } | 200 / 403 / 404 | n/a (read) | /api/v1 | new-with-justification |
Internal endpoint (moderator-be → hub_core host app)
| Endpoint | Method | AuthN/AuthZ | Source | Request | Response | Status | Idempotency | Versioning |
|---|---|---|---|---|---|---|---|---|
/api/core/v1/reports/billing/summary_postpaid_usage | GET | chat-panel auth token | moderator-be snapshot worker | year_month, page, per_page | { data: [{cid, organization_id, waba_id, company_name, billing_type, usage_value}], page_meta } | 200 / 401 / 422 | n/a (read) | /api/core/v1 |
Per-endpoint notes:
- Pagination: list endpoint uses Pagy (already in
moderator-be); 50/page (PUMS-S02/AC-3). The internal summary uses page/per_page to bound payload (Decision 2). - Rate limits / size: export
POSTenforces ≤ 50 MB estimated (PUMS-S06/ERR-1) before enqueue; reject withzip_size_limit_exceeded. - Backward compatibility: all endpoints are net-new; the existing
download_monthly_postpaid_usageendpoint is untouched (no consumer migration). - Error shape: matches repo convention —
render json: { status: errors || :error }, status: :unprocessable_entity(Dry::Matcher::ResultMatcher).
Detail 2.A — Data Integrity Matrix
| Write path | Transaction scope | Partial failure | Idempotency key + TTL | Consistency | Duplicate handling | Stale-read |
|---|---|---|---|---|---|---|
| Snapshot insert (per CID) | single-row insert per (cid, year_month, type) | per-CID rescue → snapshot_failed, continue | unique index (cid,year_month,billing_type) → INSERT … ON CONFLICT DO NOTHING (regenerate-safe) | eventual (monthly batch) | unique index dedups a re-run | source is finalized month-end data (frozen) |
| Export row insert | single insert (pending) | request fails → no row | job_id unique | strong (local) | unique job_id | n/a |
| Export status update | single-row update | worker retry/backoff; terminal failed | job_id | strong | idempotent set-status | n/a |
Detail 2.B — Concurrency Collision Map
| Resource | Writers | Collision | Resolution | On failure |
|---|---|---|---|---|
postpaid_usage_snapshots (cid,year_month,type) | cron worker (+ manual re-run) | two runs for same month | unique index + ON CONFLICT DO NOTHING | second run no-ops (idempotent) |
postpaid_usage_exports.status | export worker | concurrent retries of same job_id | Sidekiq single-job + status guard (only pending/processing → terminal) | duplicate completion ignored |
| chat_service read load | many CID pages | thundering reads | paginate + per-page timeout + circuit breaker (AbstractRepository) | page failure isolated (ERR-1) |
Detail 2.C — Async Job / Event Consumer Spec
| Job | Trigger | Input | Retry | DLQ / retention | Concurrency | Idempotency key | Per-message timeout | Poison handling |
|---|---|---|---|---|---|---|---|---|
PostpaidUsageSnapshotWorker | sidekiq-cron 0 2 1 * * Asia/Jakarta | none (computes prior month) | retry: 0 (matches monthly-cron precedent); manual re-enqueue | none — failures logged as snapshot_failed; run summary | 1 (single monthly run) | unique snapshot index (regenerate-safe) | per-page pigeon timeout (60 s default) | per-CID rescue; never aborts run |
PostpaidUsageExportWorker | enqueued by POST exports | { job_id } | exponential backoff up to MaxFails then failed | row stays failed 24 h | bounded by Sidekiq queue concurrency | job_id + status guard | job-level (≤ 30 min budget) | on repeated failure → failed + error_details |
PostpaidUsageExportCleanupWorker (TTL) | daily cron | none | retry: 0 | n/a | 1 | n/a | short | marks past-expires_at → expired, deletes object |
Detail 2.D — Responsibility Boundary Matrix
| Step (execution order) | Owning service | Inbound trigger | Outbound effect | Failure handler | PRD anchor |
|---|---|---|---|---|---|
| 1. Fire monthly cron | moderator-be (sidekiq-cron) | schedule 0 2 1 * * | enqueue snapshot worker | missed run → manual re-enqueue | PUMS-S01 |
| 2. Read finalized usage | hub_core (via chat_service) | pigeon_get summary | return per-CID per-type usage + waba_id | 5xx/timeout → page-level snapshot_failed | PUMS-S01/AC-4, OQ-1 |
| 3. Write frozen snapshots | moderator-be | parsed summary rows | insert snapshot rows | per-CID rescue → continue | PUMS-S01/ERR-1 |
| 4. Serve dashboard | moderator-be | Finance GET | paginated rows | DB error → error state | PUMS-S02 |
| 5. Build + store ZIP | moderator-be export worker | POST exports | XLSX+ZIP → object storage | storage error → failed (Decision 6) | PUMS-S06 |
No disagreement with the PRD on ownership except the PRD's §6.2 service map (Go
qontak-billing
report-worker), which this RFC corrects tohub_core+moderator-beSidekiq (see scope note + Decisions 2, 5).
Detail 2.E — State Surface Contract
| Entity | State field / event | Default | Updated by | Read via | Stale window |
|---|---|---|---|---|---|
| Snapshot row | report_date (generation date) | set at insert | snapshot worker | list endpoint | frozen (never changes) |
| Export job | status enum + download_url | pending | export worker / cleanup | poll endpoint | poll cadence (FE Stimulus, ~3–5 s) |
3. High-Availability & Security
moderator-be api/worker pods are stateless and horizontally scaled; state lives in Postgres
(chat_billing) + Redis. A chat_service outage during generation delays (does not corrupt)
snapshots — the cron is re-runnable and idempotent. An export-worker crash leaves the row
pending/processing; the TTL cleanup and retry/backoff converge it to completed/failed.
Performance Requirement
- Dashboard read: ≤ 3 s for 1,000 rows (PRD §6) — served by
idx_pus_year_month/idx_pus_search; no cross-service call on page load (Decision 1). - Snapshot run: ≤ 2 h (PRD §6) — paginated cross-service reads, per-page timeout; single job.
- Export: ≤ 30 min for ≤ 50 MB (PRD §6).
- Scalability: generation is monthly batch (low steady RPS); export concurrency bounded by the Sidekiq queue. Load test: simulate worst-month CID count against a staging chat_service.
Monitoring & Alerting
- Metrics (Datadog
Statsd, prefixmoderator_panel_*— matchesCore::Services::Datadog::CustomMetric):moderator_panel_postpaid_snapshot_generated(count, tagyear_month),moderator_panel_postpaid_snapshot_failed(count, tagsyear_month,reason),moderator_panel_postpaid_export_completed/_failed(count, tagjob_id),moderator_panel_postpaid_export_duration_seconds(histogram),moderator_panel_postpaid_zip_size_limit_exceeded(count). - Structured logs (
Rails.logger): eventssnapshot_generated,snapshot_failed,usage_dashboard_loaded,bulk_download_triggered,zip_job_completed,zip_job_failed,zip_size_limit_exceeded(PRD §11) — plusintegration_logsrow for the cross-service summary call. - Alerts (PRD §11.1):
snapshot_failedrate > 5% in a monthly run → immediate Bifrost investigation;zip_job_failedrate > 10% in any 7-day window → escalate within 24 h. - "Debug at 3am": check the monthly run summary log →
snapshot_failedreasons → chat_service health / circuit-breaker state.
Logging
- Fields:
cid,year_month,billing_type,job_id,user_id,reason,duration_seconds. - PII scrub: never log
company_name,usage_value, or the chat-panel auth token. The cross-service summary payload is not logged verbatim (only counts) — matches the Kafka producer convention (logs error class only).
Security Implications
- Threat model: an unauthorized Modpanel user attempting to read all-CID financial data;
leakage of decrypted
company_name/usage; a stale presigned link being reused. - AuthN/AuthZ matrix:
Role × Endpoint Authorization Matrix
| Role | Endpoint(s) | Methods | Tenant scope | Constraint | Audit trail |
|---|---|---|---|---|---|
| Finance Modpanel user | postpaid_usage_snapshots, postpaid_usage_exports[/:job_id] | GET/POST | global (all CIDs — internal tool by design) | read+download only; no write to usage | bulk_download_triggered + integration_logs |
| Non-Finance Modpanel user | all above | — | none | can? denies → 403; nav hidden | denied attempt logged |
| System (cron) | none (Sidekiq-internal) | — | n/a | no manual-trigger endpoint exposed (PUMS-S01 Permission Model) | snapshot_* events |
- Enforcement layer:
before_action :can?+ Pundit policy (pundit_user = current_account_user) on every new endpoint; permission keybilling_postpaid_usage_index/_downloadadded to thepermissionstable and granted to the Finance role. - Input validation:
year_monthmatches^\d{4}-\d{2}$;search_queryexact-match only (parameterized AR — no LIKE injection);snapshot_idsare UUIDs scoped to the requestedyear_month. - Injection: ActiveRecord parameterization; outbound pigeon URL is a fixed env-configured host (no SSRF surface).
- Secrets: chat-panel auth via
Core::Services::Redis::ChatPanel::Auth(existing); lockbox key stays in hub_core only —moderator-benever receives it. Object-storage credentials (if Decision 6 Option A) via env/secret store, never logged. - Presigned URL: TTL = 24 h =
expires_at; private ACL; link not logged. - Static analysis:
bundle exec brakeman(existing CI step). - ISO 27001/27701: financial + business-identity data — covered by the Compliance table below.
Detail 3.A — Failure Mode & Retry Catalog
| External call | Timeout | Retries | Circuit breaker | DLQ | Caller behavior on persistent failure |
|---|---|---|---|---|---|
pigeon_get summary (moderator-be → chat_service) | 60 s (HTTP_REQUEST_TIMEOUT) | per AbstractRepository/pigeon retry_threshold | SLEEP_WINDOW=60, ERROR_THRESHOLD=5 (existing) | none | page-level snapshot_failed, continue run (ERR-1) |
| Object storage PUT (export) | per SDK (Decision 6) | exponential backoff up to MaxFails | n/a | row failed | mark export failed + error_details; Finance sees "Generation failed" |
Detail 3.A.1 — Branch & Skip Catalog
| Branch trigger | Where checked | Downstream effect | Audit trail | User-visible? |
|---|---|---|---|---|
Flag postpaid_usage_scheduler_enabled OFF | worker start + controller | no generation / dashboard hidden | flag state | yes (feature absent) |
| Per-CID summary read fails | snapshot worker | skip that CID, log, continue | snapshot_failed | no (Finance sees missing row) |
| Selection > 50 MB | export controller | reject before enqueue | zip_size_limit_exceeded | yes (error toast) |
| Selection = 0 | export controller / FE | button disabled / 422 | — | yes (disabled) |
Past expires_at | poll endpoint | "link expired" | — | yes |
Detail 3.B — Error Response Catalog
| Endpoint | Error code | HTTP | Message | When | User-facing? |
|---|---|---|---|---|---|
| list | error | 422 | "Could not load usage data. Try again." | DB query failure (PUMS-S02/ERR-1) | yes |
| list | (empty 200) | 200 | "No usage data available for this period." | no rows for month (PUMS-S02/ERR-2) | yes |
| export POST | zip_size_limit_exceeded | 422 | "Selection exceeds 50MB limit. Reduce your selection and try again." | size > 50 MB (PUMS-S06/ERR-1) | yes |
| export POST | error | 422 | "Could not start download. Please try again." | enqueue/Redis failure | yes |
| export poll | expired | 200 | "Download link expired. Generate again." | past TTL (PUMS-S06/AC-4) | yes |
| any | (403) | 403 | (hidden / forbidden) | non-Finance | yes |
Detail 3.C — Compliance & Data Governance
| Field | Classification | Legal basis | Retention | Encryption | Access audit | Right-to-delete |
|---|---|---|---|---|---|---|
company_name | sensitive (business identity) | UU PDP (legitimate interest — billing ops) | 24 months | rest: encrypted in hub_core (lockbox); plain in chat_billing for display; transit: HTTPS | integration_logs + Pundit denials | covered by CID offboarding (out of scope) |
usage_value | financial | UU PDP | 24 months | transit HTTPS; at rest Postgres | dashboard access logs | n/a |
waba_id, cid | internal identifier | — | 24 months | transit HTTPS | — | n/a |
Trigger met (financial + business-identity data) — infosec approval required before GA.
4. Backwards Compatibility and Rollout Plan
Compatibility
- Existing endpoints changed: none. The current
download_monthly_postpaid_usageendpoint and the hub_core CSV report are untouched (additive sibling endpoints only). - hub_core: the new summary is a new route + new builder; the existing report/builder unchanged.
- API version strategy: additive under
/api/v1(moderator-be) and/api/core/v1(chat_service).
Rollout Strategy
- Migration sequence: (1) create
postpaid_usage_snapshots+postpaid_usage_exportsinchat_billing(db/billing/migrate→db:migrate:billing); no backfill (PRD §10 — first snapshot is the next 1st-of-month cron). (2) Ship hub_core summary + chat_service route. (3) Ship moderator-be worker + endpoints behind the flag. (4) Ship Modpanel views. - Schema during migration: purely additive; no intermediate dual-write state.
- Feature flag:
postpaid_usage_scheduler_enabled(Flipper viaCore::Services::Preference), default ON at GA (PRD §6) but created OFF and enabled per stage; global (no per-CID). Kill-switch: disabling hides the dashboard and skips the cron. - Stages (PRD §13): Internal QA (staging — Bifrost + 1–2 Finance reps, 1 wk, gate: snapshots for all eligible CIDs with 0 failures, dashboard loads, ZIP valid with exact naming, 0 P0/P1) → GA (all Finance, gate: snapshot ≥ 99% on first prod run, ZIP ≥ 95% in 2 wks).
- Rollback trigger: snapshot failure > 10% for 2 consecutive monthly runs (PRD §11.1) → disable flag pending fix.
- Rollback mechanism: toggle flag OFF (instant hide + skip cron); snapshots already written are
inert frozen rows (safe to leave); migrations reversible (
db:rollback:billing). - PIC + timeline per stage: in
delivery/(not in this RFC).
Detail 4.A — Configuration Contract
| Env / config / flag | Type | Default | Required | Provisioner | Secret? |
|---|---|---|---|---|---|
postpaid_usage_scheduler_enabled | Flipper flag | created OFF → ON at GA | yes | Bifrost (Preference) | no |
CHAT_PANEL_SERVICE_BASE_URL | env | existing | yes | Infra (existing) | no |
Permission billing_postpaid_usage_index/_download | DB seed | — | yes | Bifrost (seed/migration) | no |
schedule.yml postpaid_usage_snapshot_task | cron | 0 2 1 * * Asia/Jakarta | yes | Bifrost | no |
| Object-storage bucket + creds (Decision 6 Option A) | env/secret | TBD | yes for export | Infra (unconfirmed — OQ-1) | yes |
Detail 4.B — Test Plan (commands sourced from repo)
| Layer | Command (source) | What it must prove |
|---|---|---|
| Unit/Request (moderator-be) | bundle exec rspec spec/... (source: .rspec, Makefile:84 make test) | worker inserts rows; list paginates/filters; export size-guard; status transitions |
| Lint (moderator-be) | bundle exec rubocop (source: Makefile:86 make lint) | style passes |
| Security (moderator-be) | bundle exec brakeman (source: Makefile:92 make security) | no new findings |
| Unit (hub_core) | RAILS_ENV=test bundle exec rspec app/apps/billings/... (source: bitbucket-pipelines.yml) | summary repo returns waba_id + V3 + decrypted org_name |
| Lint (hub_core) | bundle exec rubocop (source: bitbucket-pipelines.yml:89) | style passes |
| DB setup (moderator-be CI) | RAILS_ENV=test bin/rails db:create:billing db:schema:load:billing (source: bitbucket-pipelines.yml:71) | new tables load from schema |
Detail 4.C — Agent Execution Plan
| Order | Chunk | Files to modify/create | Commands | Acceptance criteria (verifiable) |
|---|---|---|---|---|
| 1 | Add migrations for both tables | moderator-be: db/billing/migrate/<ts>_create_postpaid_usage_snapshots.rb, …_create_postpaid_usage_exports.rb | bin/rails db:migrate:billing | both tables + indexes exist in db/billing_schema.rb; db:rollback:billing reverts |
| 2 | hub_core bulk summary repo/interactor/builder | hub_core: app/apps/billings/{repositories,interactors,builders}/v1/reports/summary_postpaid_usage.rb | RAILS_ENV=test bundle exec rspec app/apps/billings/.../summary_postpaid_usage_spec.rb | rspec: returns waba_id, V3 Call Balance + quota rows, decrypted company_name; paginated |
| 3 | chat_service route for summary | host app routes (/api/core/v1/reports/billing/summary_postpaid_usage) + controller | host-app request spec | GET returns 200 JSON page with auth token; 401 without |
| 4 | moderator-be summary read repo (pigeon) | moderator-be: app/domains/core/repositories/app_integrations/chat_panel/reports/summary_postpaid_usage.rb | bundle exec rspec spec/.../summary_postpaid_usage_spec.rb | rspec (stubbed pigeon): parses page; failure → Failure(reason) |
| 5 | Models | moderator-be: app/models/billings/postpaid_usage_snapshot.rb, postpaid_usage_export.rb | bundle exec rspec spec/models/... | models inherit Billings::ApplicationRecord; enum + validations |
| 6 | Snapshot worker + cron entry | moderator-be: app/domains/core/workers/sidekiq/postpaid_usage_snapshot_worker.rb; config/schedule.yml (postpaid_usage_snapshot_task) | bundle exec rspec spec/.../postpaid_usage_snapshot_worker_spec.rb | running worker for fixture month inserts N rows; per-CID raise → snapshot_failed + run completes; idempotent on re-run |
| 7 | List/search endpoint + permission | moderator-be: …/reports_controller.rb (+route config/routes/api/v1.rb); permission seed | bundle exec rspec spec/requests/.../postpaid_usage_snapshots_spec.rb | 50/page; exact-match filter; year_month default latest; can? 403 for non-Finance; empty-state |
| 8 | Modpanel view + type label map | moderator-be: app/views/billing/postpaid_usage/*, Stimulus controller; billing_type label helper | bundle exec rspec spec/.../ + system spec | table renders columns; unknown type → "Unknown"; loading/empty/error states |
| 9 | Bulk selection (Stimulus) | moderator-be: app/javascript/controllers/postpaid_usage_selection_controller.js | system/JS spec | select-all-across-pages sets count; filter change clears; refresh clears |
| 10 | Export trigger + poll endpoints | moderator-be: …/reports_controller.rb (+routes); use cases | bundle exec rspec spec/requests/.../postpaid_usage_exports_spec.rb | POST >50 MB → 422 zip_size_limit_exceeded; 0 selected → 422; returns job_id; poll returns status |
| 11 | Export worker (XLSX+ZIP) [gated on OQ-1] | moderator-be: app/domains/core/workers/sidekiq/postpaid_usage_export_worker.rb; add rubyzip to Gemfile; storage client (Decision 6) | bundle install + bundle exec rspec spec/.../postpaid_usage_export_worker_spec.rb | builds one .xlsx per CID per type named {CID} {Company Name} {Postpaid Usage YYYY-MM} {Type}.xlsx; zips; status→completed+expires_at |
| 12 | TTL cleanup cron [gated on OQ-1] | moderator-be: app/domains/core/workers/sidekiq/postpaid_usage_export_cleanup_worker.rb; config/schedule.yml | bundle exec rspec spec/.../cleanup_worker_spec.rb | past-expires_at → expired + object deleted |
Detail 4.D — Verification & Rollback Recipe
- Pre-merge (run in order):
bundle exec rubocop(moderator-be + hub_core)bundle exec brakemanbundle exec rspec(scoped to the new specs above)- CI DB load:
RAILS_ENV=test bin/rails db:create:billing db:schema:load:billing
- Post-deploy signals:
- Datadog:
moderator_panel_postpaid_snapshot_failedrate < 5% on the first monthly run. - Log query:
snapshot_generatedcount ≈ eligible-CIDs × types for the run. moderator_panel_postpaid_export_completed/_failedratio ≥ 95% completed over 2 weeks.
- Datadog:
- Rollback recipe (in order):
- Toggle
postpaid_usage_scheduler_enabledOFF (Core::Services::Preference) — hides dashboard + skips cron. - If a bad migration:
bin/rails db:rollback:billing(additive tables — safe). - Revert the moderator-be / hub_core PRs.
- Confirm
moderator_panel_postpaid_*error metrics return to baseline within 15 min; existingdownload_monthly_postpaid_usageendpoint still works (unaffected).
- Toggle
Detail 4.E — Resource & Cost Notes
- Compute: one monthly batch job + on-demand export workers — negligible steady load; export ZIP build needs memory headroom bounded by the 50 MB cap.
- DB: ~192k snapshot rows over 24 months — trivial; two new tables on
chat_billing. - Network: one monthly burst of paginated cross-service reads; per-export object PUT.
- Storage: ZIP files ≤ 50 MB each, 24 h TTL — small footprint (Decision 6 dependent).
5. Concern, Questions, or Known Limitations
| # | Question / limitation | Owner | Impact if unresolved |
|---|---|---|---|
| OQ-1 | Object storage for ZIP (BLOCKER for export delivery). No S3/OSS/GCS client or storage gem exists in moderator-be (verified). Decision 6 recommends object storage + presigned URL but the bucket/client must be provisioned (or fall back to DB/volume blob + authenticated send_data). | Infra + Bifrost | Blocks §4.C ch.11–12 (bulk export delivery). Snapshot + dashboard (ch.1–9) ship independently. |
| OQ-2-bis | V3 whitelisted quota enumeration. The exact set of quota component codes and the source field for V3 Call Balance / quotas (organization_package_component_postpaid_limits) must be confirmed with the billing team for the hub_core summary query. | hub_core / billing | Wrong/missing V3 billing_type rows in snapshots (PUMS-S01/AC-4). |
| OQ-3 | Billing version "2.0.0". Whether V2 CIDs exist in prod and which postpaid types apply (carried from PRD OQ-3). The summary query must explicitly include or exclude billing_version="2.0.0". | hub_core / billing | Silent omission or double-count of V2 CIDs. |
| OQ-4 | WABA cardinality. waba_id resolves via organizations.settings['waba_id'] OR waba_accounts (1:many). For a CID with multiple WABAs, confirm whether the snapshot is per-WABA or per-CID (one waba_id chosen). | Bifrost + Finance | Row granularity / dedup of dashboard rows. |
| OQ-5 | Modpanel visual design (Figma) pending (PRD §7). Views are built functionally against the contracts; pixel design is a follow-up, not a data-layer blocker. | Design + Bifrost | Cosmetic only. |
| Known limit | Snapshot is a frozen projection — a later monthly_usages correction does not propagate (by design, PUMS-S01/AC-2). | — | Documented behavior, not a defect. |
6. Comment logs
| Date | Comment(s) From | Action Item(s) |
|---|---|---|
| 2026-06-30 | rfc-starter (author) | Drafted from PRD v1.3, grounded in moderator-be + hub_core. Corrected PRD §6.2 service map (no Go qontak-billing/report-worker; data in hub_core, async ZIP in moderator-be). Resolved PRD OQ-1 (WABA) and OQ-2 (XLSX via caxlsx). All 16 mermaid blocks validated to parse (see below). Open blocker: object storage (OQ-1). |
7. Ready for agent execution
- yes — for Phase 1a (snapshot generation + dashboard read/search/type-label: §4.C chunks 1–9). All infra topology, ADR decisions, repo anchors (verified), DDL, APIs, sequence/ER/state/branch diagrams, failure/branch catalogs, config contract, and execution chunks are complete and grounded.
- no — one gated dependency — for Phase 1b (bulk async ZIP export: §4.C chunks 10–12). The only blocker is §5 OQ-1 (object storage mechanism); chunk 10 (trigger/poll endpoints + size guard) is execution-ready, but chunks 11–12 (build/store/cleanup) must wait on the storage decision (Decision 6). The export schema is storage-agnostic so the decision needs no migration change.
- Secondary confirmations (non-blocking for 1a): OQ-2-bis (V3 quota codes), OQ-3 (
"2.0.0"), OQ-4 (WABA cardinality) — needed before the hub_core summary is finalized.
Optional next step: hand to
rfc-reviewerfor a second-pass score, and/orrfc-task-breakdownfor man-day estimates (note the Phase 1a / 1b split above).