Qontak | Chatbot & AI | AI Agent Impact Report — Phase 1: Live Impact Report (MVP)

Template: NEW PRD v1.1 · Companion to PRD Section Reference v1.5 + Hierarchy v1.0

HEADER BLOCK

Field	Value
PM	Dimas Fauzi Hidayat
PRD Version	1.2
Status	DRAFT
PRD Type	NEW
Epic	QC-XXXXX — add once Epic is created
Squad	BOT — Bot, AI & Automation
RFC Link	N/A — RFC to be raised after READY
Figma Master	Pending — not yet designed. Reference: mekari-taste plain-language wireframe + annotated spec wireframe (this session). Stitch prompts in Appendix.
Anchor	Yes — Qontak | Chatbot & AI | AI Agent Impact Report — ANCHOR
Labels	`epic:qontak-chatbot-ai` | `module:chatbot-ai` | `feature:ai-agent-impact-report`
Last Updated	2026-07-01

Status values: DRAFT → READY → BUILD → SHIPPED
Phase: Phase 1 of 3 (see ANCHOR Phase Index).

HEADER BLOCK
Scope Changes
2. One-liner + Problem
3. What Happens If We Don't Ship This Phase
4. Target Users + Persona Context
5. Non-Goals
6. Constraints
- 6.7 Data Lifecycle
7. New Features
8. API & Webhook Behavior
9. System Flow + User Stories + ACs
- 9.1 System Flow
- 9.2 User Stories
10. Rollout
- 10.5 Semantic Regression Rollback
11. Observability
- 11.1 Post-Launch Monitoring Cadence
12. Success Metrics
13. Launch Plan & Stage Gates
14. Dependencies
15. Key Decisions + Alternatives Rejected
16. Open Questions
AI-Readiness Score
Appendix A — Stitch UI Prompts
PRD CHANGELOG

Scope Changes

Backend · Frontend · Data — Phase 1 ships a native report page (Frontend) reading the ai_activity_logs datamart (Data) via a new report + cost-assumption API (Backend). No new AI-telemetry instrumentation this phase (deferred to Phase 2).

2. One-liner + Problem

One-liner: Give the Qontak org admin a plain-language, per-org report showing how much the AI agent handled on its own, how well, and what it saved.

Problem: Customers pay for the Qontak AI Agent but have no in-product view of the value it delivers, so a non-technical org admin or CS supervisor cannot answer their own boss's question, "is this AI thing worth the money?" Every AI-agent account (WhatsApp-first Indonesian SMEs) is affected, and the person judging the AI does so on inbox "feel" with no way to show load handled, quality, trend, or money saved. The cost is a paid add-on that renews on gut feeling with no value narrative — while Intercom Fin, Zendesk AI, and Ada all ship an AI resolution/ROI dashboard as standard.

3. What Happens If We Don't Ship This Phase

Within one renewal cycle (~12 months): the AI add-on keeps renewing on faith — the first line cut when budgets tighten — across every AI-enabled account, with no in-product evidence to defend it.
Immediately, in live deals and renewals: the gap with Intercom, Zendesk, and Ada (all shipping AI ROI dashboards today) stays open. Phase 1 is the cheapest possible close because it needs no new backend instrumentation, so a delay leaves the gap open with no data dependency to justify it.
Over the next 1–2 quarters: the ai_activity_logs datamart enhancements (persisted confidence, answer-scope reason) stay unprioritized with no consuming product, blocking Phases 2–3 and the wider telemetry roadmap.

4. Target Users + Persona Context

Primary Persona: Org Admin / CS Supervisor ("Bu Rina")

Field	Detail
Role	Non-technical org admin / CS supervisor at an Indonesian SME running WhatsApp-first customer service on Qontak; approved the AI add-on, manages a small human-agent team
Goal	Know at a glance whether the AI is doing a good job and worth the spend — and defend that to her boss with a clear number
Pain	No in-product view; AI value is invisible; she judges it on inbox "feel" and can't answer "is it worth it?"
Workaround	Eyeballs the inbox, counts by hand, guesses; no trend, quality, or money view

(See Section 6 for plan availability and feature flag scope.)

Secondary Persona: Business Owner / Budget Decision-Maker

Field	Detail
Role	The owner/manager who approves the AI spend and the renewal
Goal	A simple, credible hours-saved / rupiah-saved story to justify continuing or expanding
Pain	Only sees the invoice, never the value
Workaround	Asks the supervisor, who has no data to show

5. Non-Goals

No CSAT / customer-satisfaction survey. This phase does not collect or show any customer-rated satisfaction signal — deferred to Phase 3.
No quality-validated resolution. Phase 1 reports containment (resolved with no human), not answer quality (Ada-grade Relevant/Accurate/Safe) — that needs persisted confidence, deferred to Phase 3.
No answer-coverage or out-of-scope reasons. The in-scope/out-of-scope split and the "why the AI couldn't answer" breakdown are Phase 2 (they need the AI-service reason persisted).
No knowledge-gap → projected-lift or "Draft with AI". The prescriptive gap-closing section is Phase 2.
No audited rupiah. Phase 1's money figure is an admin-set assumption applied to volume — it is explicitly not a Billing-audited cost-per-resolution.
No per-message blended transcript. The AI+human journey is reported at room grain only (AI-only / AI-assisted-then-human / escalated); a message-level timeline is out (the BE does not store human-agent reply text).
No scheduled email / push digest. Phase 1 is an in-product view; a proactive monthly digest is a later enhancement.
Not on the mobile app. Web admin only this phase.

6. Constraints

Platform:       Web only (Qontak omnichannel web admin). Not available on the mobile app this phase.
Performance:    Report renders ≤ 3s at p95 for a 90-day window; underlying aggregation query ≤ 2s at p95.
                Data freshness ≤ 24h (daily pre-computed aggregation; report is not real-time).
Data limits:    Date range selectable up to 12 months; all metrics scoped to a single organization.
Plan scope:     Accounts with the AI Agent (Generative AI) add-on enabled only. Not shown for accounts without it.
Feature flag:   ai_agent_impact_report | default: OFF (enabled per account).
                Sub-flag: ai_agent_impact_report_forecast | default: OFF (forecast panel, see §10.5).
Read/write:     Report is read-only. Viewable by Owner, Admin, Supervisor (mirrors existing /reports role gate).
                The cost-saved assumption (agent-hour rate, minutes/conversation) is writable by Owner/Admin only.
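
The flag, plan, and role rules above can be read as two small predicates. This is a minimal sketch under stated assumptions: the `Account` shape, the lowercase role strings, and both function names are hypothetical and do not describe the existing /reports role gate.

```python
from dataclasses import dataclass

VIEW_ROLES = {"owner", "admin", "supervisor"}  # may view the report
EDIT_ROLES = {"owner", "admin"}                # may edit the cost assumption


@dataclass(frozen=True)
class Account:
    # Hypothetical shape; field names are assumptions for illustration.
    has_ai_addon: bool    # AI Agent (Generative AI) add-on enabled
    report_flag_on: bool  # ai_agent_impact_report, OFF by default


def can_view_report(account: Account, role: str) -> bool:
    """All three conditions must hold: flag ON, AI add-on, eligible role."""
    return (
        account.report_flag_on
        and account.has_ai_addon
        and role.lower() in VIEW_ROLES
    )


def can_edit_cost_assumption(account: Account, role: str) -> bool:
    """Editing is a strict subset of viewing: Owner/Admin only."""
    return can_view_report(account, role) and role.lower() in EDIT_ROLES
```

Keeping the edit check as a subset of the view check means a Supervisor sees the rupiah figure but can never reach the write path.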

6.7 Data Lifecycle

Artifact Type	Retention Period	Cleanup Trigger	User-Visible Effect
Daily pre-computed impact aggregates (per org, per day)	Rolling 13 months	TTL from `computed_at`; nightly job	None — older ranges simply unavailable beyond 12-month selector
Cost-saved assumption record (per org)	Lifetime of account; overwritten on edit	Manual edit by Owner/Admin	New rupiah figures recompute on next view

7. New Features

Feature: AI Agent Impact Report

URL:     /reports/ai-agent-impact  (also linked from the existing Reports area)
Access:  Owner / Admin / Supervisor on AI-agent-enabled accounts. Cost-assumption editor: Owner / Admin only.

Component Tree:

AiAgentImpactPage
├── ReportHeader                     — date-range selector, cost-assumption gear (Owner/Admin)
├── PlainHeadlineSummary             — one-sentence plain-language read ("your AI answered 6 in 10…")
├── ValueDeliveredGrid               — "What your AI did"
│   ├── HonestContainmentTile        — hero: net containment %, verdict pill, "leaves out repeat-askers"
│   ├── VolumeTile                    — conversations handled by AI
│   ├── AfterHoursTile               — handled while team offline (count + %)
│   └── WorkAbsorbedTile             — hours + rupiah saved; requires an admin cost assumption (prompts to set if unset)
├── BlendedJourneyBar                — AI-only / AI-assisted-then-human / escalated (room grain)
├── QualityGrid                      — "Is your AI doing a good job?"
│   ├── ReopenRateTile               — customers who had to ask again ≤48h + verdict
│   ├── SentimentDeltaTile           — customer mood (AI-contained vs escalated)
│   └── TurnsTile                     — messages to solve, avg + verdict
├── ForecastPanel                    — containment trend vs onboarding baseline + next-period projection (flag-gated)
└── CostAssumptionModal              — set agent-hour rate + minutes/conversation (Owner/Admin)

UI States:

Empty:    New / low-volume org (< 30 days of data OR < defined minimum conversations) → "Baseline forming"
          state: show volume + a friendly note that trend/verdicts appear once enough data accrues. No fabricated %.
Loading:  Skeleton tiles + skeleton chart while the daily aggregate is fetched.
Error:    "We couldn't load your report. Try again." + Retry button. Logs event impact_report_load_failed with reason.
Success:  Full report with a verdict on every metric, plain-language headline, and (if flag on) forecast.

Figma: Pending — design not yet created. Reference: mekari-taste plain-language customer wireframe +
       annotated PM spec wireframe (this session). Stitch prompts to be generated at READY.

📊 UI State Diagram — AI Agent Impact Report

stateDiagram-v2
    [*] --> Loading: Admin opens /reports/ai-agent-impact
    Loading --> BaselineForming: Aggregate returns < 30 days or < min conversations
    Loading --> Success: Aggregate returns sufficient data
    Loading --> Error: Query fails / timeout > 3s
    Error --> Loading: User clicks Retry
    BaselineForming --> [*]: Admin leaves (volume shown, no verdicts)
    Success --> Success: Admin changes date range / edits cost assumption (recompute)
    Success --> [*]: Admin leaves
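
The transitions in the state diagram reduce to one decision function. The sketch below is illustrative only: `resolve_state`, its parameters, and the `ReportState` enum are assumed names, and the minimum conversation count is passed in because this PRD does not fix its value.

```python
from enum import Enum
from typing import Optional

MIN_DAYS_OF_DATA = 30  # from the Empty-state rule above


class ReportState(Enum):
    LOADING = "loading"
    BASELINE_FORMING = "baseline_forming"
    SUCCESS = "success"
    ERROR = "error"


def resolve_state(
    query_ok: Optional[bool],
    days_of_data: int,
    conversation_count: int,
    min_conversations: int,
) -> ReportState:
    """Pick the UI state for one report load.

    query_ok is None while the fetch is in flight and False when the
    aggregate query failed or timed out.
    """
    if query_ok is None:
        return ReportState.LOADING
    if not query_ok:
        return ReportState.ERROR
    # Either shortfall alone is enough to withhold derived percentages.
    if days_of_data < MIN_DAYS_OF_DATA or conversation_count < min_conversations:
        return ReportState.BASELINE_FORMING
    return ReportState.SUCCESS
```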

8. API & Webhook Behavior

Behavior 1: Fetch AI Agent Impact report

Entity affected:      Read-only. Reads the `ai_activity_logs` datamart — a per-org daily aggregate of room outcomes derived from `rooms`.
Triggered by:         Admin opens the report, or changes the date range.
Information passed:    Organization ID (from auth session), date range, requesting user role.
Expected behavior:    - Returns the metric set for the range: honest containment (net), volume, after-hours,
                        work-absorbed inputs, blended-journey split, reopen rate, sentiment delta, turns,
                        trend series, and (if flag on) forecast.
                      - Excludes bot_preview channel traffic and closed_reason = 'SPAM' from all metrics.
                      - If the org has insufficient data, returns a "baseline_forming" flag instead of derived %.
Failure behavior:     - If the aggregate query fails or exceeds the timeout: return an error the UI renders as
                        the Error state with Retry; log impact_report_load_failed with reason.
                      - If the org has no AI add-on: endpoint returns 403-equivalent; report entry not shown.

Aggregate source:     Decided — the `ai_activity_logs` datamart (native page). Not a direct Postgres read, not a Metabase iframe.
[Claude to resolve during RFC: HTTP method, path, request/response JSON schema, error codes.]

Behavior 2: Read / update the cost-saved assumption

Entity affected:      Creates/updates one cost-assumption record per organization.
Triggered by:         Owner/Admin opens the cost-assumption editor and saves (agent-hour rate, minutes/conversation).
Information passed:    Organization ID, agent-hour rate (rupiah), average minutes per conversation.
Expected behavior:    - Persists the assumption; the "work absorbed ≈ rupiah" figure recomputes from it on next view.
                      - Non-negative numeric validation; sensible bounds enforced.
Failure behavior:     - Invalid input (negative / non-numeric / out of bounds): inline validation error, not saved.
                      - Non-Owner/Admin attempt: editor not rendered; write rejected server-side.

[Claude to resolve during RFC: HTTP method, path, request/response JSON schema, validation bounds.]
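
As a rough illustration of Behavior 2, the validation and the recompute could look like the sketch below. The function names are hypothetical, and the upper bounds are parameters because the actual validation bounds are still to be resolved in the RFC.

```python
from typing import List, Tuple


def validation_errors(rate, minutes, max_rate: int, max_minutes: int) -> List[str]:
    """Return inline validation messages; an empty list means the input may be saved."""
    errors = []
    checks = (
        ("agent_hour_rate", rate, max_rate),
        ("minutes_per_conversation", minutes, max_minutes),
    )
    for name, value, upper in checks:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            errors.append(f"{name}: must be a whole number")
        elif value < 0:
            errors.append(f"{name}: must not be negative")
        elif value > upper:
            errors.append(f"{name}: above the allowed maximum")
    return errors


def work_absorbed(contained_conversations: int, rate: int, minutes: int) -> Tuple[float, float]:
    """contained conversations x minutes -> hours, then hours x rate -> rupiah."""
    hours = contained_conversations * minutes / 60
    return hours, hours * rate
```

For example, 1,200 contained conversations at 5 minutes each and Rp 50,000 per agent hour give 100 hours and Rp 5,000,000. The figure stays an org-owned estimate, not an audited cost.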

9. System Flow + User Stories + ACs

9.1 System Flow

Flow: Admin views the AI Agent Impact report
Type: User Journey

 1. Admin opens /reports/ai-agent-impact (or clicks it in the Reports area).
 2. System checks: feature flag ON + role ∈ {Owner, Admin, Supervisor} + account has the AI add-on.
 3. If not eligible → report entry hidden / access denied (no 403 surfaced to an ineligible role).
 4. System requests the pre-computed daily impact aggregate for the org + selected date range.
 5. System computes honest containment = conversations with closed_reason='RESOLVE_AI', assign_channel_agent_id IS NULL,
    closed_at IS NOT NULL, minus same-contact reopens within 48h; excludes bot_preview channel + closed_reason='SPAM'.
 6. System computes value tiles (volume, after-hours via business-hours config, work-absorbed = contained volume ×
    admin assumption), blended journey (room grain), quality tiles (reopen, sentiment, turns), trend vs onboarding baseline.
 7. Decision: if data < minimum (new/low-volume org) → render Baseline-forming state (volume only, no derived %).
 8. If forecast flag ON and sufficient trend → project next-period containment.
 9. Render report with a plain-language headline and a verdict on every metric.
10. Admin optionally changes the date range → recompute (return to step 4).
11. Owner/Admin optionally edits the cost assumption → work-absorbed rupiah recomputes.
12. Failure branch: if the aggregate query fails or times out → render Error state + Retry; log impact_report_load_failed.
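
The honest-containment rule in the flow above can be sketched at row level as follows. This is for illustration only: the report itself reads the pre-computed datamart aggregate, the `Room` shape and function names are assumptions, and treating a reopen at exactly 48h as inside the window is a guess, since that boundary is still an open QA item.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional

REOPEN_WINDOW = timedelta(hours=48)


@dataclass(frozen=True)
class Room:
    # Hypothetical row shape built from the `rooms` fields named in IMPACT-S01.
    contact_id: str
    channel: str
    closed_reason: Optional[str]
    assign_channel_agent_id: Optional[str]
    created_at: datetime
    closed_at: Optional[datetime]


def is_eligible(room: Room) -> bool:
    """Drop bot_preview traffic and SPAM from numerator and denominator."""
    return room.channel != "bot_preview" and room.closed_reason != "SPAM"


def is_gross_contained(room: Room) -> bool:
    """RESOLVE_AI, no human attached, actually closed. Bare RESOLVE never counts."""
    return (
        room.closed_reason == "RESOLVE_AI"
        and room.assign_channel_agent_id is None
        and room.closed_at is not None
    )


def was_reopened(room: Room, rooms: List[Room]) -> bool:
    """True if the same contact opened another room within 48h of closure."""
    return any(
        other is not room
        and other.contact_id == room.contact_id
        and room.closed_at < other.created_at <= room.closed_at + REOPEN_WINDOW
        for other in rooms
    )


def containment(rooms: List[Room]) -> Optional[Dict[str, float]]:
    """Return gross, net, and reopen-gap shares, or None with no eligible rooms."""
    eligible = [r for r in rooms if is_eligible(r)]
    if not eligible:
        return None
    gross = [r for r in eligible if is_gross_contained(r)]
    net = [r for r in gross if not was_reopened(r, eligible)]
    total = len(eligible)
    return {
        "gross": len(gross) / total,
        "net": len(net) / total,
        "reopen_gap": (len(gross) - len(net)) / total,
    }
```

The pairwise reopen scan is quadratic and only suits a hand-checked sample, such as the one used to validate the formula in Internal Alpha.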

📊 System Flow — AI Agent Impact Report

sequenceDiagram
    actor Admin
    participant FE as Web Admin (chatbot-fe)
    participant BE as Report API (chatbot BE)
    participant AGG as Impact Aggregate (rooms-derived)
    Admin->>FE: Open /reports/ai-agent-impact
    FE->>BE: Request report (org, date range, role)
    BE->>BE: Check flag + role + AI add-on
    alt Not eligible
        BE-->>FE: Access denied (entry hidden)
    else Eligible
        BE->>AGG: Fetch daily aggregate for range
        alt Query fails / timeout > 3s
            AGG-->>BE: Error
            BE-->>FE: Error → Retry (log impact_report_load_failed)
        else Success
            AGG-->>BE: Aggregated room outcomes
            BE->>BE: Compute honest containment (net reopens), journey, quality, trend
            alt Data < minimum
                BE-->>FE: Baseline-forming (volume only)
            else Sufficient data
                BE-->>FE: Full report + verdicts (+ forecast if flag ON)
            end
        end
    end
    Admin->>FE: Edit cost assumption (Owner/Admin)
    FE->>BE: Save assumption → recompute rupiah

9.2 User Stories

User Story	Importance	Mockup / Technical Notes	Acceptance Criteria
[IMPACT-S01] — View honest containment + value delivered As an org admin / supervisor, I want to see how much my AI agent handled on its own and what it saved, so that I can tell whether it's worth the spend.	Must Have	Figma: Pending — mekari-taste plain-language wireframe (this session) Data Fields: • `closed_reason` (enum, required) — source: `rooms` • `assign_channel_agent_id` (id, optional/nullable) — source: `rooms` • `closed_at` (timestamp, required) — source: `rooms` • `contact_id` (id, required) — source: `rooms` • `channel` (string, required) — source: `rooms` Before-After Behavior: Before — no in-product view of AI value. After — the admin sees a net honest-containment hero plus value tiles, each with a plain-language verdict.	— Happy Path — • AC-1: Given an AI-enabled org with ≥ minimum data, when the admin opens the report, then the hero shows honest containment = share of eligible conversations with `closed_reason='RESOLVE_AI'` AND `assign_channel_agent_id IS NULL` AND `closed_at IS NOT NULL`, for the selected range. • AC-2: Given the containment calculation, when computed, then conversations reopened by the same `contact_id` within 48h are excluded from the numerator, and the hero shows the net figure with gross and the reopen gap available. • AC-3: Given eligibility filtering, when metrics are computed, then `bot_preview` channel traffic and `closed_reason='SPAM'` are excluded from both numerator and denominator. • AC-4: Given bare `closed_reason='RESOLVE'` (human-button or idle-timeout, conflated), when computing containment, then those conversations are NOT counted as an AI win. • AC-5: Given each value tile, when rendered, then it shows a plain-language label and a verdict indicator (e.g. Healthy / Always-on), never a bare number alone. — Edge — • AC-6: Given an org with < 30 days of data OR < the minimum conversation count, when the report loads, then the Baseline-forming state is shown (volume only) and no containment % is fabricated. — Error / Unhappy Path — • ERR-1: Given the aggregate query fails or exceeds the 3s timeout, when the admin opens the report, then the Error state with Retry is shown, and event `impact_report_load_failed` is logged with `reason`. — Permission Model — • CAN: Owner, Admin, Supervisor on AI-enabled accounts. • CANNOT: Agent role; any role on non-AI accounts; read-only billing roles. • Unauthorized: report entry not rendered; no 403 surfaced. — UI States — • Loading: skeleton tiles. • Empty: Baseline-forming (volume only). • Error: "We couldn't load your report." + Retry. • Success: value grid with verdicts + plain headline.
[IMPACT-S02] — See the AI + human journey As an org admin, I want to see how AI and my human agents split the work, so that I understand the AI's real contribution, not just a bot-only number.	Must Have	Figma: Pending Data Fields: • `closed_reason` (enum, required) — `rooms` • `assign_channel_agent_id` (id, optional/nullable) — `rooms` • `assigned_at` (timestamp, optional/nullable) — `rooms` Before-After Behavior: Before — no view of AI vs human collaboration. After — a room-grain split bar (AI-only / AI-assisted-then-human / escalated).	— Happy Path — • AC-1: Given eligible conversations, when the journey bar renders, then it splits them into AI-only resolved (`RESOLVE_AI`, no human attached), AI-assisted-then-human (AI engaged AND `assign_channel_agent_id NOT NULL`), and escalated/human (`ASSIGN_AGENT` / `ASSIGN_AGENT_AI` / `WAITING_ASSIGN_AGENT`), summing to 100%. • AC-2: Given the AI-assisted segment, when computed, then it counts rooms where the AI produced ≥ 1 response and a human agent was subsequently attached. • AC-3: Given the segment labels, when displayed, then each uses plain language ("AI resolved alone", "AI assisted, human closed", "Escalated to human"), not the raw enum. — Error / Unhappy Path — • ERR-1: Given journey data is unavailable while the rest loads, when rendering, then the bar shows a "not enough data" placeholder without blocking the rest of the report. — Permission Model — • CAN: Owner, Admin, Supervisor (AI-enabled). • CANNOT: Agent; non-AI accounts. • Unauthorized: not rendered. — UI States — • Loading: skeleton bar. • Empty: "not enough data" placeholder. • Error: placeholder, rest of report unaffected. • Success: 3-segment bar with plain labels.
[IMPACT-S03] — See whether the AI is doing a good job As an org admin, I want to see quality signals (repeat contacts, mood, effort), so that I can trust the containment number isn't just customers giving up.	Must Have	Figma: Pending Data Fields: • `contact_id` (id, required), `closed_at` (timestamp, required) — `rooms` (reopen) • `sentiment` (enum/score, optional) — `omnichannel_room_summaries` (MongoDB) • message count per room (int, required) — histories Before-After Behavior: Before — no quality signal at all. After — reopen rate, sentiment delta, and turns-to-resolve, each with a verdict.	— Happy Path — • AC-1: Given closed AI conversations, when the reopen rate is computed, then it is the share where the same `contact_id` opened a new conversation within 48h, shown with a verdict (e.g. "Low — good"). • AC-2: Given the `sentiment` field on room summaries, when the sentiment delta is computed, then it compares average sentiment of AI-contained vs escalated conversations and renders it in plain language ("more positive"). • AC-3: Given per-room message counts, when turns-to-resolve is computed, then it shows the average customer back-and-forth for AI-contained conversations with a verdict. — Edge — • AC-4: Given `sentiment` is missing for some conversations (cross-store gap), when computing the delta, then those are excluded and the tile notes reduced coverage rather than showing a wrong value. — Error / Unhappy Path — • ERR-1: Given the MongoDB sentiment source is unreachable, when the report loads, then the sentiment tile degrades to "not available" while the rest of the report renders. — Permission Model — • CAN: Owner, Admin, Supervisor (AI-enabled). • CANNOT: Agent; non-AI accounts. • Unauthorized: not rendered. — UI States — • Loading: skeleton tiles. • Empty: Baseline-forming. • Error: per-tile "not available". • Success: 3 quality tiles with verdicts.
[IMPACT-S04] — Set a cost assumption and see money saved As an Owner / Admin, I want to set my own agent-hour cost and handling time, so that the report estimates rupiah saved in a number I trust.	Should Have	Figma: Pending — CostAssumptionModal Data Fields: • `agent_hour_rate` (int, rupiah, required) — user input • `minutes_per_conversation` (int, required) — user input • contained volume (int, required) — computed Before-After Behavior: Before — no money figure. After — an editable, org-owned assumption drives a "work absorbed ≈ rupiah" figure.	— Happy Path — • AC-1: Given an Owner/Admin opens the cost-assumption editor, when they save a valid agent-hour rate and minutes/conversation, then the "work absorbed ≈ Rp" figure recomputes as contained conversations × minutes → hours × rate. • AC-2: Given the money layer, when displayed, then it is clearly labelled as based on the org's own assumption (with an "edit" affordance) and never presented as an audited Qontak figure. • AC-3: Given no cost assumption is set, when the report loads, then the work-absorbed tile shows a "set your assumption to see hours and value saved" prompt with no fabricated figure — hours and rupiah appear only after a valid assumption (agent-hour rate + minutes/conversation) is saved. No Qontak default rate is applied. — Error / Unhappy Path — • ERR-1: Given invalid input (negative / non-numeric / out of bounds), when saving, then an inline validation error is shown and nothing is persisted. — Permission Model — • CAN: Owner, Admin (edit). • CANNOT: Supervisor (sees figure, cannot edit); Agent (no access). • Unauthorized: editor not rendered; server rejects the write. — UI States — • Loading: save spinner. • Empty: "set your assumption" prompt. • Error: inline validation. • Success: recomputed figure.
[IMPACT-S05] — See containment trend + next-period forecast As an org admin, I want to see how containment changed since onboarding and where it's heading, so that I can show improvement over time.	Should Have	Figma: Pending — ForecastPanel Data Fields: • `created_at` (timestamp, required), `closed_at` (timestamp, required), `closed_reason` (enum, required) over range — `rooms` Before-After Behavior: Before — no trend or forecast. After — a trend line vs onboarding baseline with an optional (flag-gated) projection.	— Happy Path — • AC-1: Given ≥ minimum history, when the trend renders, then it plots net containment over the selected range against an onboarding baseline (first 30 days of AI activity). • AC-2: Given `ai_agent_impact_report_forecast` is ON and sufficient trend data, when the forecast renders, then it projects next-period containment with a visibly distinct (dashed) line and a plain-language read. • AC-3: Given the forecast flag is OFF, when the report loads, then only the historical trend renders and no projection is shown. — Edge — • AC-4: Given insufficient history for a stable baseline, when the trend renders, then it shows the available series and notes the baseline is still forming rather than projecting. — Permission Model — • CAN: Owner, Admin, Supervisor (AI-enabled). • CANNOT: Agent; non-AI accounts. • Unauthorized: not rendered. — UI States — • Loading: skeleton chart. • Empty: baseline-forming note. • Error: chart hidden, rest renders. • Success: trend (+ forecast if flag on).

Negative Scenarios (from Section 5 Non-Goals)

User Story	Importance	Acceptance Criteria
[IMPACT-S01-NEG] — No AI add-on (Guard Rail)	Guard Rail	• NEG-1: Given an account without the AI add-on, when a user navigates to `/reports/ai-agent-impact`, then the report is not accessible and no metrics are returned.
[IMPACT-S02-NEG] — Ineligible role (Guard Rail)	Guard Rail	• NEG-2: Given an Agent-role user, when they attempt to open the report, then no report entry is rendered and access is denied (no 403 surfaced).
[IMPACT-S03-NEG] — Out-of-scope metrics (Guard Rail)	Guard Rail	• NEG-3: Given a user looks for CSAT or answer-quality scoring, when viewing the Phase-1 report, then those signals are absent (out of scope this phase).
[IMPACT-S04-NEG] — Mobile app (Guard Rail)	Guard Rail	• NEG-4: Given the mobile app, when a user looks for the report, then it is not available (web only).

🧪 Test Coverage Matrix — [IMPACT-S01]

Dimension	Coverage	Notes
Boundary values	✅ defined	AC-6 covers < 30 days / < min; AC-2 covers reopen exclusion boundary (48h)
State transitions	✅ defined	UI states cover loading → baseline / success / error
Data validation	⚠️ partial	AC-3/AC-4 exclude bot_preview/SPAM/bare RESOLVE; ⚠️ QA: verify NULL `closed_at` handling
Concurrency	⚠️ TBD	⚠️ QA: two admins viewing while the daily aggregate recomputes
Network/timeout	✅ defined	ERR-1 covers query failure/timeout → Error + Retry

🧪 Test Coverage Matrix — [IMPACT-S02]

Dimension	Coverage	Notes
Boundary values	⚠️ partial	Segments sum to 100%; ⚠️ QA: room with AI response but immediate human takeover (assisted vs escalated boundary)
State transitions	✅ defined	Loading / placeholder / success covered
Data validation	✅ defined	AC-1 enumerates exact closed_reason → segment mapping
Concurrency	⚠️ TBD	⚠️ QA: assignment changes mid-aggregation window
Network/timeout	✅ defined	ERR-1 covers journey-data-unavailable without blocking report
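
The segment mapping in IMPACT-S02 AC-1 and AC-2 can be sketched as a small classifier. Several things here are assumptions rather than decided behavior: the `ai_response_count` field, the order in which the rules are checked (assisted before escalated, which is exactly the boundary flagged for QA above), and leaving rooms that match no rule out of the bar.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Optional

ESCALATED_REASONS = {"ASSIGN_AGENT", "ASSIGN_AGENT_AI", "WAITING_ASSIGN_AGENT"}

# Plain-language labels from AC-3, keyed by an internal segment id.
LABELS = {
    "ai_only": "AI resolved alone",
    "ai_assisted": "AI assisted, human closed",
    "escalated": "Escalated to human",
}


class RoomOutcome(NamedTuple):
    closed_reason: str
    assign_channel_agent_id: Optional[str]
    ai_response_count: int  # hypothetical: number of AI responses in the room


def classify(room: RoomOutcome) -> Optional[str]:
    """Map one room to a segment id, or None if no rule applies."""
    has_human = room.assign_channel_agent_id is not None
    if room.closed_reason == "RESOLVE_AI" and not has_human:
        return "ai_only"
    if room.ai_response_count >= 1 and has_human:
        return "ai_assisted"
    if room.closed_reason in ESCALATED_REASONS:
        return "escalated"
    return None


def journey_split(rooms: Iterable[RoomOutcome]) -> Dict[str, float]:
    """Percent per labelled segment over classified rooms; sums to 100."""
    counts = Counter(seg for seg in map(classify, rooms) if seg is not None)
    total = sum(counts.values())
    if total == 0:
        return {}  # caller renders the "not enough data" placeholder
    return {LABELS[seg]: 100 * counts[seg] / total for seg in LABELS}
```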

🧪 Test Coverage Matrix — [IMPACT-S03]

Dimension	Coverage	Notes
Boundary values	✅ defined	AC-1 reopen window (48h); AC-4 missing-sentiment exclusion
State transitions	✅ defined	Per-tile degrade covered
Data validation	⚠️ partial	⚠️ QA: sentiment enum vs score normalization across room summaries
Concurrency	⚠️ TBD	⚠️ QA: contact reopens exactly at the 48h boundary
Network/timeout	✅ defined	ERR-1 covers Mongo unreachable → tile "not available"
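
The sentiment-delta rule in IMPACT-S03 (AC-2 and AC-4) can be sketched as below. It assumes sentiment has already been normalized to a numeric score, which is itself an open QA item, and the function name and return shape are hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple


def sentiment_delta(
    contained: Sequence[Optional[float]],
    escalated: Sequence[Optional[float]],
) -> Tuple[Optional[float], float]:
    """Return (delta, coverage).

    delta is mean(AI-contained) minus mean(escalated) over rooms that have a
    score; it is None when either group has no scored rooms. coverage is the
    share of rooms with a score, so the tile can note reduced coverage.
    """
    scored_contained = [s for s in contained if s is not None]
    scored_escalated = [s for s in escalated if s is not None]
    total = len(contained) + len(escalated)
    scored = len(scored_contained) + len(scored_escalated)
    coverage = scored / total if total else 0.0
    if not scored_contained or not scored_escalated:
        return None, coverage
    return mean(scored_contained) - mean(scored_escalated), coverage
```

Returning coverage alongside the delta lets the tile show "not available" or a reduced-coverage note instead of a misleading number.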

10. Rollout

Feature flag:    ai_agent_impact_report (see Section 6 — OFF by default)
Rollout:         Stage 1 → Internal QA accounts only (5 accounts)
                 Stage 2 → Closed beta: 10 AI-enabled orgs across tiers (manually enabled)
                 Stage 3 → Open beta: AI-enabled Professional + Enterprise accounts, on request
                 GA       → All AI-agent-enabled accounts (flag on)
Backward compat: Yes — read-only, additive. No existing report or behavior changes.
Migration:       None to existing data. Additive only: a daily impact-aggregation job and a per-org
                 cost-assumption record. No backfill of user-facing state required.

10.5 Semantic Regression Rollback

The forecast panel (§IMPACT-S05) is a derived projection — it can look valid but mislead.

Model flag:          ai_agent_impact_report_forecast | default: OFF (per account)
Regression metric:   Forecast error — absolute gap between projected next-period containment and the actual
                     realized containment once the period closes.
Rollback threshold:  If mean absolute forecast error ≥ 10 percentage points across beta accounts over 2 periods.
Rollback path:       Toggle ai_agent_impact_report_forecast OFF (no deploy) — the historical trend and all other
                     tiles remain live; only the projection is withdrawn.
Monitoring:          Review impact_forecast_rendered projections vs realized containment (Section 11 events) each period during beta.
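
The rollback threshold above can be checked with a few lines. This sketch assumes one reading of the rule: the error is pooled across all beta accounts over the two most recent closed periods. The function names and the input shape are hypothetical.

```python
from typing import List, Sequence, Tuple

ROLLBACK_THRESHOLD_PTS = 10.0  # mean absolute error, in percentage points
MIN_PERIODS = 2

Pair = Tuple[float, float]  # (projected containment %, realized containment %)


def mean_absolute_error(pairs: Sequence[Pair]) -> float:
    """Average absolute gap between projected and realized containment."""
    return sum(abs(projected - realized) for projected, realized in pairs) / len(pairs)


def should_roll_back(periods: List[List[Pair]]) -> bool:
    """periods holds one list of per-account pairs per closed period, oldest first."""
    if len(periods) < MIN_PERIODS:
        return False  # not enough closed periods to judge yet
    recent = [pair for period in periods[-MIN_PERIODS:] for pair in period]
    if not recent:
        return False
    return mean_absolute_error(recent) >= ROLLBACK_THRESHOLD_PTS
```

If the intended rule is instead "at or above 10 points in each of two consecutive periods", the pooled average would be replaced with a per-period check.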

11. Observability

Key Events:

Event Name	Trigger	Properties
`impact_report_viewed`	Admin opens the report	org_id, role, date_range, has_forecast, timestamp
`impact_report_load_failed`	Aggregate query fails/timeout	org_id, reason, latency_ms, timestamp
`impact_report_baseline_forming`	Report renders in baseline-forming state	org_id, days_of_data, conversation_count, timestamp
`cost_assumption_updated`	Owner/Admin saves an assumption	org_id, agent_hour_rate, minutes_per_conversation, timestamp
`impact_forecast_rendered`	Forecast panel renders (flag ON)	org_id, projected_containment, baseline_containment, timestamp

Dashboard owner: BOT — Bot, AI & Automation (squad: BOT)

Alerts:
- impact_report_load_failed rate > 5% of views in 1h → Slack #bot-ai-alerts
- p95 report render latency > 3s over 30 min → Slack #bot-ai-alerts

11.1 Post-Launch Monitoring Cadence

Review cadence:    Weekly for the first 4 weeks post-GA, then monthly.
Owner:             BOT squad PM (Dimas).
Review scope:      impact_report_viewed (adoption), impact_report_load_failed (reliability),
                   impact_report_baseline_forming (coverage), impact_forecast_rendered vs realized (forecast quality).
Trigger thresholds:
  - impact_report_load_failed > 5% of views in any week → investigate immediately.
  - Adoption (viewing orgs) drops > 10% week-over-week during ramp → investigate.
Rollback consideration:
  If load-failure rate exceeds 5% and cannot be resolved within 48h, PM disables ai_agent_impact_report
  globally pending root cause. Forecast issues follow the §10.5 forecast-flag rollback.

12. Success Metrics

Adoption & Usage:

⭐ Primary KPI: Impact report adoption
   Definition:  % of AI-agent-enabled org accounts whose admin views the report ≥ 1× per month
   Baseline:    N/A — report does not exist today
   Target:      ≥ 50% of AI-enabled orgs within 90 days of GA

- Repeat engagement
   Definition:  % of viewing admins who return in ≥ 2 consecutive months
   Baseline:    N/A
   Target:      ≥ 30% within 2 quarters of GA

Quality & Accuracy:

- Honest-containment integrity
   Definition:  % of rendered reports that display the reopen-adjusted (net) containment + its false-resolution
                gap, never a gross-only number
   Baseline:    N/A
   Target:      100% of reports at GA

- Report reliability
   Definition:  1 − (impact_report_load_failed ÷ impact_report_viewed)
   Baseline:    N/A — new surface
   Target:      ≥ 99% successful loads within 30 days of GA

Efficiency & Impact:

- Value configured & visible
   Definition:  % of AI-enabled orgs that set a cost assumption and see a "work absorbed" figure (hours + rupiah)
   Baseline:    0% — no value figure exists today
   Target:      ≥ 40% of AI-enabled orgs within 90 days of GA (gated on the admin setting an assumption)
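
Two of the definitions above, adoption and report reliability, translate directly into arithmetic. The sketch below is illustrative; the function names and inputs are assumptions about how event counts might be gathered, not an existing dashboard query.

```python
from typing import Optional, Set


def adoption_rate(ai_enabled_orgs: Set[str], viewing_orgs: Set[str]) -> Optional[float]:
    """Percent of AI-enabled orgs with at least one report view in the month."""
    if not ai_enabled_orgs:
        return None
    # Ignore viewers that are not in the AI-enabled population.
    viewers = viewing_orgs & ai_enabled_orgs
    return 100 * len(viewers) / len(ai_enabled_orgs)


def reliability(viewed: int, load_failed: int) -> Optional[float]:
    """1 - (impact_report_load_failed / impact_report_viewed)."""
    if viewed == 0:
        return None  # no views yet, so the ratio is undefined
    return 1 - load_failed / viewed
```

With 4 AI-enabled orgs of which 2 viewed the report, adoption is 50%, which meets the primary KPI target; 2 failed loads out of 200 views gives 0.99 reliability, right at the 99% target.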

13. Launch Plan & Stage Gates

Stage	Audience	Duration	Success Gate to Advance	Owner
Internal Alpha	Internal QA — 5 accounts	2 weeks	0 P0/P1 bugs; containment formula validated against a hand-checked sample; `impact_report_load_failed` ≤ 5%; p95 render ≤ 3s	PM + QA
Closed Beta	10 AI-enabled orgs across tiers	3 weeks	≥ 60% of beta admins view ≥ 1×; load reliability ≥ 99%; no metric-correctness escalation; forecast MAE < 10 pts	PM + CSM
Open Beta	AI-enabled Pro + Ent, on request	3 weeks	Reliability ≥ 99% sustained 1 week; honest-containment shown on 100% of reports; no P0/P1 open	Eng Lead
GA	All AI-agent-enabled accounts	Ongoing	All Open Beta gates sustained 2 weeks; PMM launch approved; adoption tracking live per Section 12	PM + PMM

14. Dependencies

Dependency	Owning Team	Deliverable Needed	Blocking?
`ai_activity_logs` datamart (per-org daily aggregate of room outcomes)	BI / Data	A queryable per-org daily aggregate of room outcomes (`closed_reason`, assignment, timestamps, contact) sized for ≤ 2s reads	YES
Report screen + cost-assumption modal	Chatbot FE (chatbot-fe)	New `/reports/ai-agent-impact` page, tiles, journey bar, forecast panel, CostAssumptionModal	YES
Business-hours configuration source	Chatbot BE	Per-org working-hours reference to classify after-hours conversations	YES (for the after-hours tile only — see §16 mitigation)
Cross-store sentiment read	Chatbot BE / Data	Access to `omnichannel_room_summaries.sentiment` (MongoDB) joined to room outcomes	NO (sentiment tile degrades gracefully)
Reports role gate / ownership middleware	Chatbot BE	Reuse existing owner/supervisor/admin gate from the current `/reports` endpoint	NO — already exists
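
To show what the business-hours dependency has to supply, here is a minimal sketch of after-hours classification. It assumes a per-org config that maps weekdays to an open and close time, and WIB (UTC+7) as the org's local time. The config shape, `EXAMPLE_HOURS`, and `is_after_hours` are all hypothetical, since the actual configuration source is still open (see §16).

```python
from datetime import datetime, time, timedelta, timezone
from typing import Dict, Tuple

# Assumption: WIB (UTC+7, no daylight saving) as the org's local time.
WIB = timezone(timedelta(hours=7))

# Hypothetical config shape: weekday (0 = Monday) -> (open, close); missing = closed.
BusinessHours = Dict[int, Tuple[time, time]]

EXAMPLE_HOURS: BusinessHours = {d: (time(9, 0), time(17, 0)) for d in range(5)}


def is_after_hours(started_at: datetime, hours: BusinessHours,
                   tz: timezone = WIB) -> bool:
    """True when a conversation starts outside the org's working hours."""
    if started_at.tzinfo is None:
        raise ValueError("started_at must be timezone-aware")
    local = started_at.astimezone(tz)
    window = hours.get(local.weekday())
    if window is None:
        return True  # whole day is outside working hours
    open_at, close_at = window
    return not (open_at <= local.time() < close_at)


# 2026-07-01 is a Wednesday; 13:30 UTC is 20:30 WIB, so this is after hours.
print(is_after_hours(datetime(2026, 7, 1, 13, 30, tzinfo=timezone.utc), EXAMPLE_HOURS))
```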

📊 Dependency Graph — AI Agent Impact Report (Phase 1)

graph LR
    R[AI Agent Impact Report P1]
    R -->|BLOCKING| AGG[ai_activity_logs datamart - BI/Data]
    R -->|BLOCKING| FE[Report screen + modal - chatbot-fe]
    R -->|BLOCKING: after-hours tile| BH[Business-hours config - BE]
    R -->|non-blocking| SENT[Sentiment read - Mongo]
    R -->|non-blocking: exists| GATE[Reports role gate - BE]

15. Key Decisions + Alternatives Rejected

15a — Decisions Made

Date	Decision	Rationale
2026-07-01	Hero = honest containment (`RESOLVE_AI`, no human, net of 48h reopens); exclude bare `RESOLVE`, `bot_preview`, `SPAM`	Code dig: `RESOLVE` conflates human-resolve with idle-timeout; a reopen-adjusted net figure is the credible number (enforced in IMPACT-S01 AC-1..4).
2026-07-01	Blended journey is room-grain only this phase	BE does not store human-agent reply text; message-grain timeline is out (Non-Goal 6). Room-grain still delivers the differentiator.
2026-07-01	Money is an admin-set assumption applied to volume; not Billing-audited	No rupiah rate/human benchmark in BE; a customer-set figure is feasible now and more trusted (audited rupiah is Phase 3).
2026-07-01	Forecast ships behind its own flag, OFF by default	It's a projection that can mislead; flag + §10.5 rollback keep the rest of the report safe.
2026-07-01	Plain-language UI with a verdict on every metric; no jargon/internal markers	Readability audit ("Bu Rina") — the customer view must be jargon-free; the annotated spec wireframe stays internal.
2026-07-01	Phase 1 reads existing data only — no new backend instrumentation	Keeps the MVP dependency-light and shippable; instrumentation (reason, confidence) is deferred to Phase 2.
2026-07-01	Phase 1 is a native page sourced from the `ai_activity_logs` datamart — not a direct Postgres read, not a Metabase iframe	Aligns with the flip-the-dependency strategy: the datamart becomes the report's spine and is ready for the Phase 2/3 telemetry contract; a native page is required for the interactive cost toggle + forecast.
2026-07-01	The work-absorbed value figure requires the admin to set a cost assumption first (rate + minutes/conversation); no Qontak default rate	Keeps every money/hours figure the org's own — avoids a vendor-imposed number. Tile shows a "set your assumption" prompt until configured (IMPACT-S04 AC-3).
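
The honest-containment decision in the first row can be sketched as a filter plus a ratio. This is one possible reading rather than the confirmed formula. It assumes `bot_preview` and `SPAM` appear as `closed_reason` values and are dropped from the base, while bare `RESOLVE` rooms stay in the base but never count as contained. The `human_assigned` and `reopened_within_48h` flags are hypothetical derived fields.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Assumption: these appear as closed_reason values and are dropped from the base.
EXCLUDED_FROM_BASE = {"bot_preview", "SPAM"}


@dataclass(frozen=True)
class RoomOutcome:
    closed_reason: str
    human_assigned: bool        # hypothetical flag derived from assignment data
    reopened_within_48h: bool   # hypothetical flag from the reopen rule


def honest_containment(rooms: Iterable[RoomOutcome]) -> Optional[float]:
    """Share of eligible rooms the AI resolved alone, net of 48h reopens."""
    eligible = [r for r in rooms if r.closed_reason not in EXCLUDED_FROM_BASE]
    if not eligible:
        return None  # baseline still forming: show volume only, no percentage
    contained = sum(
        1 for r in eligible
        if r.closed_reason == "RESOLVE_AI"
        and not r.human_assigned
        and not r.reopened_within_48h
    )
    return contained / len(eligible)


rooms = [
    RoomOutcome("RESOLVE_AI", False, False),  # counts
    RoomOutcome("RESOLVE_AI", False, True),   # reopened: does not count
    RoomOutcome("RESOLVE_AI", True, False),   # human involved: does not count
    RoomOutcome("RESOLVE", False, False),     # bare RESOLVE: never counts
    RoomOutcome("SPAM", False, False),        # dropped from the base
]
print(honest_containment(rooms))  # 0.25
```

Gross containment would score the same sample at 3 of 4 (every `RESOLVE_AI` room), which is the over-count that the rejected alternative in 15b describes.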

15b — Alternatives Rejected

Alternative	Why Rejected	Date
Gross containment as hero	Over-counts silent abandonment / idle-timeout closes; fails the credibility bar.	2026-07-01
Qontak-computed rupiah ROI	No rate/benchmark in-product; contestable. Admin-set assumption chosen.	2026-07-01
Message-grain AI+human transcript	BE doesn't persist human-agent reply text; not feasible this phase.	2026-07-01
Pure Metabase-iframe report	Can't host the interactive cost toggle / forecast / assumption editor; native surface needed.	2026-07-01
Ship the report with CSAT/quality now	Requires persisted confidence + survey capture (not present); would block the whole MVP. Deferred to Phase 3.	2026-07-01

16. Open Questions

#	Type	Question	Owner	Deadline
1	Assumption	48h is the right reopen window, and "same `contact_id`, new conversation" is the right false-resolution rule (topic-match not required in P1).	Dimas + BI	2026-07-18
2	Assumption	A per-org business-hours config exists (or is derivable) for the after-hours tile. Mitigation: if unavailable at build, ship P1 without the after-hours tile and add it when the config lands — non-blocking to the rest.	Dimas + BE	2026-07-18
3	Open Question	Cross-store join feasibility/perf for the sentiment tile (Postgres `rooms` ⋈ Mongo `omnichannel_room_summaries`) once the datamart aggregate is the primary source.	Dimas + BI/BE	2026-08-15
4	Risk	Excluding bare `RESOLVE` under-counts genuine human-button resolves that were actually AI-assisted, slightly understating containment. Mitigation: label the outcome clearly and revisit with `closed_reason` enrichment / reason persistence in Phase 2.	Dimas + BE	2026-08-15

Types: Assumption · Open Question · Risk

Resolved during grooming (now in §15 Decisions): aggregate source = `ai_activity_logs` datamart (native page); money requires the admin to set an assumption first. Risk row #4 carries an explicit mitigation, so it does not block READY.
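
The false-resolution rule in row 1 (same `contact_id`, new conversation within 48h, no topic match) can be sketched as below. The `Conversation` tuple, its field names other than `contact_id`, and the inclusive 48h boundary are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Iterable, NamedTuple, Set

REOPEN_WINDOW = timedelta(hours=48)


class Conversation(NamedTuple):
    room_id: str
    contact_id: str
    started_at: datetime
    resolved_at: datetime


def false_resolutions(ai_resolved: Iterable[Conversation],
                      all_conversations: Iterable[Conversation]) -> Set[str]:
    """Room ids of AI resolves followed by a new conversation from the same
    contact within 48 hours (no topic matching, per the P1 assumption)."""
    candidates = list(all_conversations)
    flagged = set()
    for room in ai_resolved:
        deadline = room.resolved_at + REOPEN_WINDOW
        for other in candidates:
            if (other.contact_id == room.contact_id
                    and other.room_id != room.room_id
                    and room.resolved_at < other.started_at <= deadline):
                flagged.add(room.room_id)
                break
    return flagged


t0 = datetime(2026, 7, 1, 9, 0)
a = Conversation("r1", "c1", t0, t0 + timedelta(minutes=10))
b = Conversation("r2", "c1", t0 + timedelta(hours=20), t0 + timedelta(hours=21))
c = Conversation("r3", "c2", t0, t0 + timedelta(minutes=5))
d = Conversation("r4", "c2", t0 + timedelta(hours=72), t0 + timedelta(hours=73))
print(false_resolutions([a, c], [a, b, c, d]))  # {'r1'}
```

The nested loop is quadratic and only suits a hand-checked sample. A production version would presumably group conversations by contact inside the datamart query.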

AI-Readiness Score

PRD AI-Readiness Score

Score: 9.8 / 10.0
PRD Type: NEW (Phase 1 of 3)
Verdict: READY — sufficient for RFC and story generation
        (Status is DRAFT — coaching score; all READY gates pass)
Scored: 2026-07-01 by Claude

Gates:
  ✅ Gate 1 — Overall ≥ 8.0 (9.8)
  ✅ Gate 2 — Stories + ACs (§9) ≥ 9.0 (9.9)
  ✅ Gate 3 — No blocking open items (Risk #4 mitigated; no deadline past)

Section Scores (display №):
  Header Block                     9.2 ✅   (H5 1/2 — Epic deferred + Figma pending)
  2  One-liner + Problem          10.0 ✅
  3  What If We Don't Ship        10.0 ✅
  4  Persona                      10.0 ✅
  5  Non-Goals                    10.0 ✅
  6  Constraints                  10.0 ✅
  7  New Features                  8.3 ⚠️   (11.5 0/2 — Figma frame pending)
  8  API & Webhook                10.0 ✅
  9  System Flow + Stories + ACs   9.9 ✅   (flow 10.0 · stories 9.9; 13b.10 Figma 0/2)
  10 Rollout                      10.0 ✅
  11 Observability                10.0 ✅
  12 Success Metrics              10.0 ✅
  13 Launch Plan & Stage Gates    10.0 ✅
  14 Dependencies                 10.0 ✅
  15 Decisions + Alternatives     10.0 ✅
  16 Open Questions               10.0 ✅

DIAGRAM COVERAGE
  S9   System flow diagram     ✅ present & consistent
  S7   UI state diagram        ✅ present & consistent
  S14  Dependency graph        ✅ present & consistent   [Tier 2]
  S8   API sequence diagram    ❌ missing (optional)     [Tier 2]
  S7   Component tree diagram  ❌ text tree only         [Tier 2]
  3 present · 2 missing · 0 contradicting

Remaining opportunities (design/epic-gated, non-blocking):
  - Figma frames pending → New Features 11.5 (0/2) and Stories 13b.10 (0/2).
    Add frame-level links once design lands (Stitch prompts in Appendix A bootstrap this).
  - Header H5 (1/2) lifts once the Jira Epic key + Anchor Confluence link are wired in.

Known gaps (treat as undefined downstream): Figma frame links; final aggregate-query
implementation within the ai_activity_logs datamart (RFC to confirm).

Appendix A — Stitch UI Prompts

Generated at READY to bootstrap UI references while Figma is pending. Run each row in Google Stitch in order; paste each generated image into the next row's reference. Anchor the visual style on the mekari-taste plain-language wireframe from grooming.

=== SHARED PREAMBLE (paste at the start of every Stitch prompt) ===
Product: Qontak (Chatbot & AI) — B2B omnichannel CRM, WhatsApp-first, Indonesian SMEs
Users: Org Admin / CS Supervisor (non-technical) and Business Owner
Design tone: Enterprise B2B SaaS, Mekari Pixel look — calm, layered surfaces, brand color only on the
  hero number + primary action, a plain-language verdict on every metric, NO jargon, NO internal markers
Persistent UI: Qontak web-admin shell — white top bar (mekari qontak logo, user block) + left nav rail
  (surface tone, "Reports" active); report content on a white stage
Navigation flow: Reports → "AI agent impact" → (optionally open Cost Assumption modal) → back to report
Cross-screen consistency: For Screen 2, attach the Generated Image from Screen 1 and match its palette,
  type scale, spacing, and component style exactly.
=== END PREAMBLE ===

#	Screen	Instructions	Stitch Prompt (copy in full)	Reference Image	Generated Image
1	AI Agent Impact Report (main page)	Open Stitch. Paste the prompt. Attach the mekari-taste plain-language wireframe as reference. Generate.	[SHARED PREAMBLE] Screen: AI Agent Impact Report Purpose: A non-technical admin sees, in plain language, how much the AI handled, how well, and what it saved. Access: Owner / Admin / Supervisor on AI-enabled accounts. Sections top to bottom: - Plain headline sentence ("Last month your AI answered 6 in 10 conversations on its own…") - "What your AI did" — 4 tiles: Handled by AI on its own (hero %, verdict "Healthy"), Conversations, After-hours (count + %), Team time saved (hours; rupiah only if assumption set) - "AI + human collaboration" — one horizontal split bar (AI only / AI assisted, human closed / Escalated) - "Is your AI doing a good job?" — 3 tiles: Customers who had to ask again (%, verdict), Customer mood (words), Messages to solve (avg, verdict) - "If this keeps up" — small trend line, solid past + dashed projection Generate all 4 UI states: Empty: "Baseline forming" — show volume only, friendly note, no fabricated %. Loading: skeleton tiles + skeleton chart. Error: "We couldn't load your report." + Retry button. Success: full report with a verdict on every metric. Do NOT include: CSAT, quality scores, in-scope/out-of-scope, competitor names, phase tags, "net/gross", any jargon.	Attach the mekari-taste plain-language wireframe.	(paste Stitch output here)
2	Cost Assumption modal	Open Stitch. Paste the prompt. Attach the Generated Image from Step 1 as reference. Generate.	[SHARED PREAMBLE] Screen: Cost Assumption modal (opened from the report's "Team time saved" tile / header gear) Purpose: Owner/Admin sets their own agent-hour cost and average handling time so the report can estimate rupiah saved. Access: Owner / Admin only (Supervisor sees the figure but cannot edit). Fields: - Agent cost per hour (rupiah input) - Average minutes per conversation (number input) - Helper text: "We use your own numbers — Qontak never guesses your costs." - Primary button "Save", secondary "Cancel" Generate all 4 UI states: Empty: first-time — fields blank with the helper text and a short "why we ask" line. Loading: Save button spinner, fields disabled. Error: inline validation under a field ("Enter a positive number"). Success: modal closes, the report's rupiah figure recomputes (show the tile updated). Do NOT include: any Qontak default/suggested rate, audited-cost claims, jargon.	Attach the Generated Image from Step 1.	(paste Stitch output here)

⚠️ Generate in order (#1 → #2). After Stitch produces each screen, paste it into its Generated Image cell before running the next. Share the finished table with your designer to rebuild in Figma using real Pixel components.

PRD CHANGELOG

Append-only. Claude generates entries — PM never writes to this table manually.

Version	Date	By	Section	Type	Summary
1.0	2026-07-01	Claude	All	CREATED	Initial Phase 1 NEW PRD generated from the product-ideation → feature-comparison → code-grounding → mekari-taste wireframe session. Live Impact Report MVP off existing `rooms` data: honest-containment hero, blended journey, quality tiles, trend + forecast, admin-set cost assumption. Companion to the AI Agent Impact Report ANCHOR.
1.1	2026-07-01	Claude	S8, S12, S14, S15, S16	MODIFIED	Incorporated grooming decisions: (1) Phase 1 sources from the `ai_activity_logs` datamart as a native page (resolved OQ); (2) work-absorbed value requires the admin to set a cost assumption first, no default rate (resolved OQ, updated IMPACT-S04 AC-3 + S12 metric); (3) confirmed ≥50% adoption target; (4) Jira epic deferred (placeholder retained).
1.2	2026-07-01	Claude	S2, S3, S9, S16, +Score +Appendix A	MODIFIED	Post-score polish to READY: tightened one-liner to ≤25 words; added time horizons + magnitude to S3; added `required` flags to all story Data Fields; converted open-question deadlines to dates; inserted AI-Readiness Score (9.8, READY) and Appendix A Stitch UI prompts.
