
RFC: Unified Agent Quality Scorecard — Phase 1: Scorecard Settings & Rubric Config

Document Conventions (do not remove)

This RFC follows the Qontak RFC Template format for governance — the metadata table, sections 1–6, and Comment log are mandatory.

It is also agent-execution-ready: §1 PRD-to-Schema Derivation (BE half) + §2.A UI Contract (FE half), §2.0 Repo Reading Guide for both layers, mermaid diagrams, §2.G Cross-Layer Contract Verification, and §4 Agent Execution Plan + Verification & Rollback Recipe are complete.

Agent-execution-ready RFC derived 1:1 from ../prds/phase-1-settings-and-rubric-config.md. Phase 1 builds the config layer only — no scoring, in-room panel, report, or gate. Backend = Qontak Chatbot (chatbot, Rails 7.1 / Grape / Clean Architecture). Frontend = Qontak Chatbot FE (chatbot-fe, Nuxt 4 / Vue 3 / Pinia / Pixel3).

Metadata

| Field | Value | Notes |
|---|---|---|
| Status | RFC | Working vocabulary IDEA/RFC/AGREED/ABANDON. YAML status: uses linter enum (RFC→in-review); kept draft until reviewed. |
| Owner (DRI) | Dimas Fauzi Hidayat | Mirrors frontmatter dri. Single accountable owner; staffing lives in delivery/. |
| Source PRD | ../prds/phase-1-settings-and-rubric-config.md | PRD v1.2. |
| Anchor | ../unified-agent-scorecard-anchor.md | Initiative master index. |
| Delivery | not yet handed to delivery | Timeline/effort/rollout scheduling lives in delivery/. |
| Type | full-stack | Backend (Grape API + chatbot_gpt DB) + Frontend (Nuxt settings UI). |
| Squad | BOT — Bot, AI & Automation | |
| Infosec approver | required at review — see §7 | Touches auth-gated org config + audited PII-adjacent settings. |
| Last Updated | 2026-06-20 | |

Sections at a Glance

| § | Section | Type hint |
|---|---|---|
| §1 | Overview, traceability, decisions index | PRD coverage, AC map, schema derivation |
| §2 | Technical design | Repo reading guide, infra topology, ADRs, ER + sequence diagrams, API + UI contracts |
| §3 | HA & Security | Perf, auth matrix, failure catalog, error catalog, a11y |
| §4 | Backwards compat & rollout | Flag contract, test plan, agent execution plan, rollback recipe |
| §5 | Concerns / open questions | Carried from PRD + grounding gaps |
| §6 | Comment log | |
| §7 | Ready for agent execution | The readiness gate |

1. Overview

Phase 1 ships the configuration layer for AI-agent quality scoring, behind the ai_qa_unified_scorecard feature flag. It does three things and nothing more:

  1. Extends scorecard_preferences with an AI on-switch + AI pass threshold (is_ai_auto_score, ai_passing_grade), leaving the existing human auto-score (is_auto_score / passing_grade, which already drive auto_agent_scoring.rb) untouched.
  2. Wires the dormant scorecard_custom_parameters.prompt field into the API + UI as an "AI judging rubric", widening it string → text. A non-empty rubric marks a custom parameter auto-scorable (a derived attribute).
  3. Surfaces a read-only Default AI Rubric viewer (the 9 Qontak-calibrated metrics), served by a new read-only backend endpoint from static config.

No scores are produced in Phase 1. The persisted config is consumed by the Phase 2 scoring pipeline.

Success Criteria

  • AI scoring preference (is_ai_auto_score + ai_passing_grade) persists per org and round-trips through GET; settings save success ≥ 99% (PRD §13).
  • A custom parameter with a non-empty prompt persists and is returned auto_scorable: true; empty prompt returns auto_scorable: false.
  • The 9-metric default rubric loads read-only with veto flags on Groundedness + Policy.
  • All three surfaces are invisible unless ai_qa_unified_scorecard is enabled for the org.
  • Save P95 ≤ 500ms (PRD §6).
  • Human manual scorecard config and human auto-scoring behavior are byte-for-byte unchanged.

Out of Scope

Phase 2 scoring pipeline, in-room panel, multi-actor scoring, Analytics report (P3), validation harness (P4), go-live gate (P5), mobile, billing/packaging, any change to human manual scoring or auto_agent_scoring.rb runtime behavior. See PRD §5.

Forward-looking note (not built here): the room-resolve trigger at chatbot/app/core/use_cases/api/internal_service/v1/webhook/room_resolve_interactions.rb:48-63 currently skips AutoAgentScoringWorker when custom parameters exist (unless is_custom_parameter || scorecard_exists). Phase 2 must revisit this guard so AI scoring with custom params actually runs. Phase 1 changes nothing here.

| Document | Path | What was taken from it |
|---|---|---|
| Phase 1 PRD v1.2 | ../prds/phase-1-settings-and-rubric-config.md | All requirements, ACs, rubric content. |
| Initiative anchor | ../unified-agent-scorecard-anchor.md | Phase map; confirms auto_agent_scoring.rb scores the human agent only. |
| Phase 2 PRD | ../prds/phase-2-auto-scoring-and-in-room-scorecard.md | Reviewed — consumer of this config; no Phase 1 impact. |
| chatbot AGENTS.md API rules | chatbot/AGENTS.md §"API Specification Rules" | Mandatory OpenAPI bundle/split/validate workflow (§4.B). |

Assumptions

  • A1 — Enabling is_ai_auto_score before Phase 2 exists is a recorded preference with no customer-visible effect (PRD Open Q#3). Grounded: GET/PATCH already persist a preference with no runtime side effect beyond the human path; the new AI columns are inert until Phase 2 reads them.
  • A2 — Plan-gating (Pro+Ent only) is enforced by provisioning the ai_qa_unified_scorecard org-feature only for eligible plans; the build only checks the flag. (No plan-tier check exists in code to reuse — see §5 Open Q#5.)
  • A3 — prompt max length is 4,000 chars (PRD Open Q#2 proposed value). Adopted as the build default; tunable via the validation macro.

Dependencies

| Dependency | Owner | Needed | Blocking? |
|---|---|---|---|
| prompt widen string→text migration | BOT (this RFC) | DDL | NO — in scope (§2.3). |
| Design frames (settings + rubric editor) | Design squad | Figma for CHG-001/CHG-002 | YES for FE pixel-faithfulness — see §5 Open Q#1. Stitch prompts in PRD Appendix B are the interim spec. |
| DSAI 9-metric definitions | DSAI | Confirm default rubric content | NO for build · advisory for accuracy (rubric served as PROPOSED). |
| ai_qa_unified_scorecard org-feature provisioning | Billing/Provisioning | Feature rows for Pro+Ent orgs | NO for build (defaults OFF). |

Detail 1.A — Coverage Matrices

1.A.1 — PRD Section Coverage

| PRD § | Title | Covered in |
|---|---|---|
| 2 | One-liner + Problem | §1 Overview |
| 3 | What happens if we don't build | §1 (motivation) |
| 4 | Target users + persona | §3 Role × Endpoint matrix |
| 5 | Non-Goals | §1 Out of Scope |
| 6 | Constraints | §3 Performance, §4 Flag contract, §3 Role matrix |
| 7 | Feature Changes (CHG-001/002) | §2.3 DDL, §2.4 APIs, §2.A UI |
| 8 | New Features (editor + viewer) | §2.A UI Contract, §2.4 (new endpoint) |
| 9 | API & Webhook Behavior | §2.4 APIs |
| 10 | System Flow + Stories + ACs | §1.C, §2.1a Sequence, §1.A.4 AC map |
| 11 | Rollout | §4 Rollout |
| 12 | Observability | §3 Monitoring & Logging |
| 13 | Success Metrics | §1 Success Criteria, §4.D signals |
| 14 | Launch Plan & Stage Gates | §4 Rollout (technical view; scheduling → delivery/) |
| 15 | Dependencies | §1 Dependencies |
| 16 | Key Decisions + Alternatives | §1.B + §2 ADRs |
| 17 | Open Questions | §5 |
| App. A | AI Scoring Rubric | §2.4 default-rubric endpoint payload |
| App. B | Stitch UI Prompts | §2.A interim design spec |

1.A.2 — UI / Consumer Surface Coverage

| Surface | PRD ref | Backing read endpoint | RFC anchor |
|---|---|---|---|
| Scorecard settings page /settings/scorecard | CHG-001, S01, S03 | GET .../scorecard_preferences + GET .../scorecard_ai_default_rubric | §2.A, §2.4 |
| Custom-parameter editor /settings/scorecard/custom-parameters | CHG-002, S02 | GET .../scorecard_custom_parameters (existing list) | §2.A, §2.4 |
| Default Rubric viewer (within settings) | S03 | GET .../scorecard_ai_default_rubric (new) | §2.A, §2.4 |

1.A.3 — Role Coverage

| PRD persona | Grounded role (hub-core user.rb:38-44 enum) | Access |
|---|---|---|
| QA Lead / Supervisor | supervisor | Read all; write threshold + custom rubric |
| Bot / AI Admin (Agent Owner) | owner / admin | Read all; write threshold + custom rubric |
| End CS agent | agent / member | No access (controls not rendered; API 403) |

Grounding correction (PRD vs code): the PRD names "QA Lead" and "Bot/AI Admin" roles. The platform has no such roles. The single-role enum issued by hub-service /users/me is {owner, admin, supervisor, agent, member}; supervisor is the closest to QA Lead. Per confirmed decision, scorecard writes keep set_role(%w[owner admin supervisor]) and map the personas onto these roles (ADR-7).

1.A.4 — Acceptance-Criteria → Design Element Map

| PRD Story | Composite AC ids | Design element | Test spec ref |
|---|---|---|---|
| UASC-S01 — Enable AI auto-scoring + threshold | UASC-S01/AC-1, /AC-2, /AC-3, /ERR-1, /NEG-1 | §2.3 new cols · §2.4 row 1 (PATCH pref) · §2.A AutoScoreToggle · §4.C chunks 1,3,6 | tests/phase-1-settings-and-rubric-config.md |
| UASC-S02 — Custom param + rubric | UASC-S02/AC-1..AC-4, /ERR-1, /NEG-1 | §2.3 prompt text · §2.4 row 2 (custom param) · §2.A CustomParamEditor · §4.C chunks 2,4,7 | |
| UASC-S03 — View default rubric | UASC-S03/AC-1, /AC-2, /AC-3, /ERR-1 | §2.4 row 3 (new endpoint) · §2.A DefaultRubricViewer · §4.C chunks 5,8 | |

1.A.5 — PRD-to-Schema Derivation (BE)

| PRD entity/attribute/rule | table.column | Exposed by | Enforced at | PRD ref |
|---|---|---|---|---|
| AI auto-score on-switch | scorecard_preferences.is_ai_auto_score (new, bool, default false) | GET/PATCH preference | Dry contract (optional bool), upsert repo | CHG-001 |
| AI pass threshold | scorecard_preferences.ai_passing_grade (new, float, nullable) | GET/PATCH preference | Dry contract rule 0–100 | CHG-001, S01/AC-2,3 |
| Org-specific AI rubric | scorecard_custom_parameters.prompt (string→text) | POST/PATCH custom param; list GET | Dry contract length ≤ 4000 | CHG-002, S02 |
| Auto-scorable flag | derived prompt.present? (not stored) | custom param response auto_scorable | computed in entity/builder | S02/AC-1,3; NEG-1 |
| 9 default AI metrics + veto flags | static config (no table) | new read-only endpoint | constant + Grape entity | S03, App. A |

Detail 1.B — Decisions Closed (index → §2 ADRs)

| # | Decision | ADR |
|---|---|---|
| 1 | New columns is_ai_auto_score + ai_passing_grade, not overloading existing human cols | ADR-1 |
| 2 | Widen prompt string→text (Postgres change_column) | ADR-2 |
| 3 | auto_scorable is derived from prompt.present?, not a stored boolean | ADR-3 |
| 4 | Default rubric served by a new read-only endpoint from static Ruby config | ADR-4 |
| 5 | Gate all surfaces on ai_qa_unified_scorecard (BE: OrganizationFeatures::FindFeature; FE: $hasSubscription) | ADR-5 |
| 6 | Settings/rubric writes are synchronous; analytics fired async via SendMixpanelEventWorker | ADR-6 |
| 7 | Reuse set_role(%w[owner admin supervisor]); map PRD personas onto the existing enum | ADR-7 |
| 8 | ai_passing_grade validated 0–100 inclusive (PRD), diverging from human passing_grade 1–99 | ADR-8 |

Detail 1.C — Per-Story Change Map

| Story | Layer scope | Changes (concrete artifacts) | Acceptance criteria | RFC anchors |
|---|---|---|---|---|
| UASC-S01 | FE + BE | BE: migration add 2 cols; ScorecardPreference::Patch/Get contracts + defaults; scorecard_preferences Grape params; entity ScorecardPreference (FE-svc + gpt-svc); Upsert/FindBy repos; entity Entities::FrontendServices::Gpt::ScorecardPreference + DEFAULT_AI_PASSING_GRADE; OpenAPI. FE: AutoScoreToggle.vue; store/scorecard state/actions; scorecard.ts service + endpoint.ts; Vuelidate 0–100; mixpanel scorecard_settings_updated/_save_failed. | UASC-S01/AC-1,2 persist+roundtrip; AC-3 0–100 validation rejects; ERR-1 error+retry, no partial state, log; NEG-1 Starter/Free hidden (flag off). | §2.3 · §2.4 r1 · §2.A · §4.C c1,c3,c6 |
| UASC-S02 | FE + BE | BE: migration widen prompt; add prompt to custom-param Grape params (POST+PATCH) + Create/Update Dry contracts + validate_prompt_length macro; repos ScorecardCustomParameter::Create/Update persist prompt; entity ScorecardCustomParameter expose prompt + auto_scorable; OpenAPI. FE: CustomParamEditor.vue (textarea + length counter + auto-scorable chip); store actions; service+endpoint; Vuelidate max-len; mixpanel scorecard_custom_param_saved/_save_failed. | S02/AC-1 non-empty→auto_scorable; AC-2 shows rubric+state; AC-3 empty→manual-only; AC-4 over-limit rejected; ERR-1 error+retry+log; NEG-1 empty NOT auto-scorable. | §2.3 · §2.4 r2 · §2.A · §4.C c2,c4,c7 |
| UASC-S03 | FE + BE | BE: new ScorecardAiDefaultRubric Grape resource + use case reading Constants::ScorecardAiDefaultRubric; entity; mount in frontend_service/gpt_api.rb (+ optionally gpt_service); OpenAPI. FE: DefaultRubricViewer.vue; store action; service+endpoint; "PROPOSED" + veto badges; mixpanel default_rubric_viewed/default_rubric_load_failed. | S03/AC-1 9 metrics read-only; AC-2 PROPOSED note; AC-3 veto flag on Groundedness+Policy; ERR-1 load error+retry+log. | §2.4 r3 · §2.A · §4.C c5,c8 |

2. Technical Design

Detail 2.0 — Repo Reading Guide

Repo Map (slice this RFC touches)

flowchart LR
subgraph FE["chatbot-fe (Nuxt 4)"]
page["pages/settings/scorecard/*.vue (NEW)"]
views["modules/settings/views/* (pattern: ai-assist.vue)"]
store["store/scorecard/* (NEW, pattern: store/ai-assist)"]
svc["common/services/main/v1/scorecard.ts (NEW)"]
ep["common/services/main/endpoint.ts (+scorecard)"]
flag["plugins/botSubscriptionFeature.ts ($hasSubscription)"]
end
subgraph BE["chatbot (Rails 7.1 / Grape)"]
apipref["app/api/frontend_service/v1/gpt/omnichannel/scorecard_preferences.rb"]
apicp["app/api/frontend_service/v1/gpt/scorecard_custom_parameter.rb"]
apirub["app/api/frontend_service/v1/gpt/scorecard_ai_default_rubric.rb (NEW)"]
uc["app/core/use_cases/api/frontend_service/v1/gpt/scorecard_*"]
repo["app/core/repositories/gpt/scorecard_*"]
ent["app/core/entities/frontend_services/gpt/scorecard_preference.rb"]
auth["app/api/frontend_service/middlewares/auth.rb -> hub-service /users/me"]
end
db[("chatbot_gpt DB (Postgres)\nscorecard_preferences\nscorecard_custom_parameters")]
mp["SendMixpanelEventWorker -> Mixpanel"]

page --> store --> svc --> ep -->|"$apiMain /api"| apipref & apicp & apirub
flag -.gates.-> page
apipref & apicp & apirub --> auth
apipref --> uc --> repo --> db
apicp --> uc
uc --> ent
page -.fires.-> mp

Existing Code Anchors (read before writing)

| # | Path | What to learn |
|---|---|---|
| 1 | chatbot/app/api/frontend_service/v1/gpt/omnichannel/scorecard_preferences.rb | Grape GET/PATCH shape, set_role, Dry::Matcher::ResultMatcher, mount target. |
| 2 | chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_preference/patch.rb | Dry contract + rule(:passing_grade) 1–99; how AI cols/rule are added. |
| 3 | chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_preference/get.rb | Default-fill pattern via Entities::...::ScorecardPreference::DEFAULT_*. |
| 4 | chatbot/app/core/repositories/gpt/scorecard_preferences/upsert.rb | Find-by-org upsert; where to set the new columns. |
| 5 | chatbot/app/core/entities/frontend_services/gpt/scorecard_preference.rb | DEFAULT_PASSING_GRADE=75, DEFAULT_AUTO_SCORE=true; add DEFAULT_AI_*. |
| 6 | chatbot/app/api/frontend_service/v1/gpt/scorecard_custom_parameter.rb | POST/PATCH params (no prompt today); where to add it. |
| 7 | chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_custom_parameter/create.rb | register_macro(:validate_*_length) pattern → model validate_prompt_length. |
| 8 | chatbot/db/chatbot_gpt_migrate/20241113041150_create_scorecard_custom_parameters.rb | Migrator dialect for the chatbot_gpt DB; prompt is t.string. |
| 9 | chatbot/app/models/chatbot_gpt_record.rb | ChatbotGptRecord base — migrations target :chatbot_gpt connection. |
| 10 | chatbot-fe/modules/settings/views/ai-assist.vue + store/ai-assist/* | Canonical settings page + Pinia store + Vuelidate + toast + $hasSubscription pattern. |

Patterns to Follow

| Concern | Reference file (opened) | Pattern |
|---|---|---|
| Grape endpoint + auth | chatbot/app/api/frontend_service/v1/gpt/omnichannel/scorecard_preferences.rb | format :json; set_role; Dry::Matcher::ResultMatcher success/failure. |
| Use case + validation | .../scorecard_preference/patch.rb; .../scorecard_custom_parameter/create.rb | contract do params … rule … register_macro end; Success/Failure(build_*_params). |
| Repository upsert | chatbot/app/core/repositories/gpt/scorecard_preferences/upsert.rb | find-or-build → assign → save! → Builders::...build. |
| Entity defaults | chatbot/app/core/entities/frontend_services/gpt/scorecard_preference.rb | dry-struct attributes + public_constant :DEFAULT_*. |
| Grape response entity | chatbot/app/api/frontend_service/v1/entities/gpt/scorecard_preference.rb | Grape::Entity expose with documentation. |
| External LLM call (Phase 2 ref only) | chatbot/app/core/use_cases/gpt/omnichannel/auto_agent_scoring.rb:160-179 | OpenAI::Client.new(request_timeout: 240), max_attempts = 2, Rollbar.error. |
| Async analytics | chatbot/app/workers/send_mixpanel_event_worker.rb + app/core/use_cases/system/receive_webhook.rb:76-89 | SendMixpanelEventWorker.perform_async(org, event, props.as_json). |
| Feature flag (BE) | chatbot/app/core/repositories/organization_features/find_feature.rb; usage in app/core/repositories/ai_knowledge_sources/search.rb | OrganizationFeatures::FindFeature.new(feature_code:, organization_id:).call_by_organization. |
| Settings page (FE) | chatbot-fe/modules/settings/views/ai-assist.vue | MpFormControl/MpInput/MpButton; useVuelidate; $toast; isFetch* computed. |
| Pinia store (FE) | chatbot-fe/store/ai-assist/{state,actions,getters,types}.ts | fetchStatus: idle/pending/resolved/rejected; service via mainService. |
| API service (FE) | chatbot-fe/common/services/main/v1/ai-assist.ts + common/services/main/endpoint.ts | $apiMain(endpoint, {method, body, signal}); AbortController. |
| Feature flag (FE) | chatbot-fe/plugins/botSubscriptionFeature.ts | $hasSubscription('code') boolean. |
| Analytics (FE) | chatbot-fe/common/contants/mixpanel-events.ts + ai-assist.vue:925 | mixpanel.track(MIXPANEL_EVENTS.X, props). |

Reading Order for the Agent

  1. chatbot/AGENTS.md (§Workflow Commands + §API Specification Rules)
  2. Anchor #1 (preference Grape) → #2 (patch UC) → #3 (get UC) → #4 (upsert) → #5 (entity)
  3. Anchor #6 (custom-param Grape) → #7 (create UC macros)
  4. Anchor #8 + #9 (chatbot_gpt migration dialect + base record)
  5. chatbot/app/api/frontend_service/gpt_api.rb (FE-facing mount paths) and chatbot/app/api/gpt_service/api.rb (gpt-svc mount paths)
  6. Anchor #10 (chatbot-fe settings page + store + service)
  7. chatbot-fe/common/services/main/endpoint.ts, plugins/api/apiMain.ts, plugins/botSubscriptionFeature.ts

Existing-Endpoint Check (reuse / extend / new)

| Endpoint | Surface(s) | Tag | Evidence |
|---|---|---|---|
| PATCH /v1/gpt/omnichannel/scorecard_preferences (+ PATCH /v1/scorecards/preferences) | frontend_service + gpt_service | extended | gpt_api.rb:28, gpt_service/api.rb:21; adding is_ai_auto_score/ai_passing_grade. |
| GET same path | both | extended | scorecard_preferences.rb get '/'; add AI fields to response. |
| POST /v1/gpt/scorecard_custom_parameters + PATCH :id (+ /v1/scorecards/parameters/custom) | both | extended | gpt_api.rb:33, gpt_service/api.rb:24; adding prompt. |
| GET .../scorecard_ai_default_rubric | frontend_service (+ gpt_service optional) | new-with-justification | No endpoint serves AI default metrics today (grep scorecard in app/api — only categories/parameters/custom/preferences). The 9 AI metrics are a new concept with no table; a static-config read endpoint is the single source of truth Phase 2 reuses, and the PRD defines default_rubric_load_failed (a fetch failure mode). |

Source Verification

| Claim | Evidence (file:line / identifier) |
|---|---|
| scorecard_preferences has is_auto_score (bool, default false) + passing_grade (float) | chatbot/db/chatbot_gpt_schema.rb:454-465; migration db/chatbot_gpt_migrate/20240206095006_create_scorecard_preference.rb |
| Human passing_grade validated 1–99 | chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_preference/patch.rb:18-22 rule(:passing_grade) |
| Preference upsert keyed by organization_id | chatbot/app/core/repositories/gpt/scorecard_preferences/upsert.rb:13-25; model validates_uniqueness_of :organization_id |
| Defaults DEFAULT_PASSING_GRADE=75, DEFAULT_AUTO_SCORE=true | chatbot/app/core/entities/frontend_services/gpt/scorecard_preference.rb:7-9 |
| scorecard_custom_parameters.prompt exists as string, unused by API | chatbot/db/chatbot_gpt_schema.rb:418-441 (t.string "prompt"); custom-param Grape params omit prompt (app/api/frontend_service/v1/gpt/scorecard_custom_parameter.rb POST/PATCH params) |
| Length validation macro pattern | chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_custom_parameter/create.rb:39-59 |
| auto_agent_scoring.rb scores the human agent on room resolve | chatbot/app/core/use_cases/gpt/omnichannel/auto_agent_scoring.rb:6,76-79; trigger app/core/use_cases/api/internal_service/v1/webhook/room_resolve_interactions.rb:48-63; worker app/workers/auto_agent_scoring_worker.rb |
| OpenAI client pattern (Phase 2 ref) | auto_agent_scoring.rb:160-179 OpenAI::Client.new(request_timeout: 240), max_attempts = 2 |
| Role enum {owner,admin,supervisor,agent,member} (single role) | hub-core app/core/domains/models/user.rb:38-44; surfaced via hub-service /api/core/v1/users/me; consumed chatbot/app/api/frontend_service/middlewares/auth.rb:15-24 → env['user'] → current_user['role']; checked app/api/frontend_service/helpers/authorization_helpers.rb:6-10 |
| Existing scorecard endpoints gate owner/admin/supervisor | scorecard_preferences.rb (get/patch) + scorecard_custom_parameter.rb (post/patch/delete) set_role(%w[owner admin supervisor]) |
| Feature-flag mechanism (BE) | chatbot/app/core/repositories/organization_features/find_feature.rb:4-22; usage app/core/repositories/ai_knowledge_sources/search.rb |
| paper_trail already on both models | chatbot/app/models/chatbot_gpt/scorecard_preference.rb:5; scorecard_custom_parameter.rb:6 |
| Mixpanel async worker | chatbot/app/workers/send_mixpanel_event_worker.rb; config/initializers/mixpanel.rb |
| OpenAPI mandatory workflow | chatbot/AGENTS.md:235-247 |
| FE has no scorecard code today | grep scorecard\|passing_grade\|is_auto_score in chatbot-fe/{common,store,pages,modules} → 0 hits |
| FE settings/store/service/flag patterns | chatbot-fe/modules/settings/views/ai-assist.vue; store/ai-assist/*; common/services/main/v1/ai-assist.ts; common/services/main/endpoint.ts:123-157; plugins/botSubscriptionFeature.ts:23-40; common/contants/mixpanel-events.ts |
| chatbot_gpt DB connection | chatbot/app/models/chatbot_gpt_record.rb connects_to database: { writing: :chatbot_gpt, reading: :chatbot_gpt } |

Detail 2.1 — Infrastructure Topology

flowchart TB
user(["QA Lead / Supervisor / Owner / Admin (web)"])
lb["LB / Ingress"]
fe["chatbot-fe pods (Nuxt 4 SSR/SPA)"]
api["chatbot pods (Puma · Grape FrontendService)"]
hub["hub-service /api/core/v1/users/me (auth)"]
pg[("Postgres — chatbot_gpt DB\n(writing+reading)")]
redis[("Redis")]
sidekiq["Sidekiq workers"]
mp["Mixpanel (external)"]
oai["OpenAI (external) — Phase 2 only"]

user --> lb --> fe -->|"$apiMain /api (Bearer + X-Auth-Token)"| lb
lb --> api
api -->|"validate token"| hub
api -->|"read/write settings + rubric"| pg
fe -.->|"track events"| mp
api -->|"enqueue analytics"| redis --> sidekiq --> mp
sidekiq -. "Phase 2 AutoAgentScoring" .-> oai

Per-service responsibilities

| Service | Use cases (this RFC) | Internal calls (owner) | External APIs |
|---|---|---|---|
| chatbot-fe | Render settings + rubric editor + viewer; client validation; fire events | chatbot API (BOT) | Mixpanel (browser) |
| chatbot (Grape) | Persist AI preference; persist custom-param rubric; serve default rubric; authz; feature-gate | hub-service /users/me (Core team); Mixpanel worker | — (Phase 1); OpenAI in Phase 2 |
| hub-service / hub-core | Token validation, issues single role | | |
| chatbot_gpt DB | Store scorecard_preferences, scorecard_custom_parameters (+ paper_trail versions) | | |

Detail 2.1a — Sequence Diagrams (happy + failure paths)

S01 — Save AI preference (authz + validation failure)

sequenceDiagram
participant U as User
participant FE as chatbot-fe
participant LB as LB
participant API as chatbot (Grape)
participant HUB as hub-service /users/me
participant DB as chatbot_gpt (Postgres)
participant MP as Mixpanel (async)
U->>FE: toggle AI on, set ai_passing_grade
FE->>LB: PATCH scorecard_preferences (Bearer+X-Auth)
LB->>API: forward
API->>HUB: validate token
HUB-->>API: {role: supervisor, organization_id}
alt role not in {owner,admin,supervisor}
API-->>FE: 403 Permission denied
else authorized
API->>API: Dry coerce :float + rule ai_passing_grade in 0..100
alt non-numeric or out of range
API-->>FE: 422 "AI passing grade only between 0 - 100"
FE-->>U: inline error, nothing saved
else valid
API->>DB: upsert by organization_id (save!)
DB-->>API: ok
API-->>FE: 200 {is_ai_auto_score, ai_passing_grade}
FE-)MP: track scorecard_settings_updated
FE-->>U: "Change saved"
end
end

S02 — Save custom param + rubric (happy + DB failure)

sequenceDiagram
participant FE as chatbot-fe
participant API as chatbot (Grape)
participant DB as chatbot_gpt
participant MP as Mixpanel (async)
FE->>API: POST scorecard_custom_parameters {name, prompt}
API->>API: validate_prompt_length (<=4000) + name rules
alt prompt > 4000
API-->>FE: 422 "AI judging rubric cannot exceed 4000 characters."
else valid
API->>DB: create (save!) — company_id from token
alt save! raises
API->>API: Rollbar.error(e)
API-->>FE: 500 "Something went wrong"
FE-)MP: track scorecard_custom_param_save_failed
FE-->>FE: $toast error + Retry (no partial state)
else ok
API-->>FE: 200 {prompt, auto_scorable: prompt.present?}
FE-)MP: track scorecard_custom_param_saved {has_rubric}
end
end
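
The length branch of the S02 flow can be sketched in plain Ruby. This is a minimal illustration, not the Dry contract macro itself: `MAX_PROMPT_LENGTH` and `validate_prompt` are hypothetical names, the 4,000 limit is the A3 build default, and the `[status, body]` return shape is an assumption made only to mirror the 422 / 200 branches in the diagram.

```ruby
# Hypothetical constant; 4000 is the A3 build default and is meant to be tunable.
MAX_PROMPT_LENGTH = 4000

# Returns a [status, body] pair mirroring the diagram's 422 and 200 branches.
# A nil prompt is treated as an empty rubric and passes the length check.
def validate_prompt(prompt)
  if prompt.to_s.length > MAX_PROMPT_LENGTH
    [422, { error: "AI judging rubric cannot exceed #{MAX_PROMPT_LENGTH} characters." }]
  else
    [200, { prompt: prompt }]
  end
end
```

The check is on character count, so a rubric of exactly 4,000 characters is accepted and 4,001 is rejected.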

S03 — Load default rubric (happy + fetch failure)

sequenceDiagram
participant FE as chatbot-fe
participant API as chatbot (Grape)
FE->>API: GET scorecard_ai_default_rubric
alt success
API-->>FE: 200 {status:PROPOSED, metrics:[9 + veto]}
FE-->>FE: render list + veto badges
else 500 / network
FE-->>FE: "Couldn't load the default rubric." + Retry
FE-)FE: track default_rubric_load_failed
end

Detail 2.1b — Rubric Gate Branch

flowchart TD
A[Save custom parameter] --> B{prompt non-empty?}
B -->|Yes| C[auto_scorable = true]
B -->|No| D[auto_scorable = false — manual-only]
C --> E[Persist + return auto_scorable]
D --> E

Detail 2.2 — Technical Decisions (ADR-format)

ADR-1 — Store AI scoring as new columns, not overloaded human columns

  • Context. scorecard_preferences.is_auto_score/passing_grade already drive auto_agent_scoring.rb (human). PRD requires AI on-switch + AI threshold "with the existing human auto-score untouched."
  • Options.
    • A. New columns is_ai_auto_score + ai_passing_grade — clean separation; human path provably unchanged; Phase 2 reads AI cols explicitly. Con: one migration + two columns.
    • B. Overload is_auto_score/passing_grade for both lenses — Con: entangles human and AI, high regression risk on a live path; a single threshold can't differ per lens.
    • C. Reuse is_auto_score switch, add only ai_passing_grade. Con: can't enable AI without enabling human auto-scoring and vice-versa.
  • Decision. Option A (confirmed by DRI).
  • Rationale. Strongest guarantee that human auto-scoring is byte-for-byte unchanged; matches PRD's two-lens intent.
  • Consequences. Migration adds is_ai_auto_score (bool, default false) + ai_passing_grade (float, nullable). Contracts/entities/upsert extended. Phase 2 reads the AI columns.
  • Reversibility. High — drop the two columns; no human-path coupling.
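
The separation argued for in ADR-1 can be illustrated with a small in-memory sketch. `PreferenceStore`, `upsert_ai`, and `seed` are hypothetical stand-ins for the real upsert repository (which is not reproduced here); the point is only that an AI-side write whitelists the two new columns and leaves the human columns alone.

```ruby
# Hypothetical in-memory stand-in for scorecard_preferences, keyed by organization_id.
class PreferenceStore
  AI_KEYS = %i[is_ai_auto_score ai_passing_grade].freeze

  def initialize
    @rows = {}
  end

  # Test helper: preload an existing row for an organization.
  def seed(organization_id, row)
    @rows[organization_id] = row.dup
  end

  # Find-or-build by organization, then assign only the AI columns.
  # Any human-side keys in attrs are dropped by the slice.
  def upsert_ai(organization_id, attrs)
    row = (@rows[organization_id] ||= {
      is_auto_score: false, passing_grade: nil,      # human (existing)
      is_ai_auto_score: false, ai_passing_grade: nil # AI (new)
    })
    row.merge!(attrs.slice(*AI_KEYS))
    row.dup
  end
end
```

In the real PATCH contract both human and AI fields share one endpoint; the whitelist here is only a way to show that writing the AI columns cannot alter `is_auto_score` or `passing_grade`.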

ADR-2 — Widen scorecard_custom_parameters.prompt string → text

  • Context. A real judging rubric (PRD ~4,000 chars) does not fit a single-line string.
  • Options. A. change_column … :text (Postgres in-place, no rewrite for varchar→text). B. Add a new rubric text column and dual-write — Con: duplicate field, migration of an unused column for no benefit.
  • Decision. Option A. change_column :scorecard_custom_parameters, :prompt, :text.
  • Rationale. prompt is already the PRD's named field and is currently unused, so the widen is non-destructive (varchar→text widening preserves data).
  • Consequences. Migration in db/chatbot_gpt_migrate/; chatbot_gpt_schema.rb regen.
  • Reversibility. Low/risky (text→string truncates) — treat as forward-only; rollback is the flag, not the column type.

ADR-3 — auto_scorable is derived, not stored

  • Context. PRD: "non-empty rubric marks the param auto-scorable."
  • Options. A. Compute auto_scorable = prompt.present? at read time. B. Store a boolean column kept in sync — Con: drift risk, redundant with the source of truth.
  • Decision. Option A — expose auto_scorable in the response entity/builder.
  • Rationale. Single source of truth (prompt); no sync bug; Phase 2 re-derives the same way.
  • Consequences. Response entity gains a computed auto_scorable field; no schema change.
  • Reversibility. High.
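
A minimal sketch of the derivation, assuming plain Ruby without ActiveSupport. `auto_scorable?` approximates `prompt.present?` (nil, empty, and ASCII-whitespace-only prompts count as "no rubric"); `custom_parameter_response` is a hypothetical builder, not the project's Grape entity.

```ruby
# Plain-Ruby approximation of ActiveSupport's `prompt.present?`.
# nil, "" and whitespace-only strings are all treated as "no rubric".
def auto_scorable?(prompt)
  !prompt.to_s.strip.empty?
end

# Hypothetical response builder; field names follow the RFC.
# auto_scorable is computed on every read and never stored.
def custom_parameter_response(name:, prompt:)
  { name: name, prompt: prompt, auto_scorable: auto_scorable?(prompt) }
end
```

Because the flag is recomputed from `prompt` each time, clearing the rubric flips the parameter back to manual-only without any second write.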

ADR-4 — Default rubric via a new read-only endpoint from static config

  • Context. The 9 AI metrics (PROPOSED, DSAI-owned) have no table; PRD defines a default_rubric_viewed/default_rubric_load_failed fetch.
  • Options.
    • A. New read-only endpoint serving a Ruby constant/YAML (Constants::ScorecardAiDefaultRubric).
    • B. FE static constant — Con: no real load-failure mode; duplicates the list Phase 2 needs server-side.
    • C. Seed the 9 metrics into scorecard_parameters/categories. Con: mixes AI metrics into human-scorecard tables; risks human auto-scorer picking them up.
  • Decision. Option A (confirmed by DRI).
  • Rationale. Server is the single source of truth; Phase 2 scoring reads the same constant; honors the PRD's fetch + failure event; no schema entanglement.
  • Consequences. New Grape resource + use case + entity; content carries status: PROPOSED.
  • Reversibility. High — delete endpoint + constant.
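
As a rough sketch of what "static config behind a read-only endpoint" could look like, the module below serves a frozen constant. The module name, the `name` and `veto` keys, and the `payload` method are all assumptions for illustration. Only the two metrics this RFC names (Groundedness and Policy, both veto) are listed; the other seven live in PRD Appendix A and are deliberately not reproduced.

```ruby
# Hypothetical shape for the static default rubric. Not the real
# Constants::ScorecardAiDefaultRubric, whose structure this RFC does not show.
module ScorecardAiDefaultRubricSketch
  # Only the two metrics named in this RFC appear here; the remaining
  # seven are defined in PRD Appendix A and are intentionally omitted.
  METRICS = [
    { name: 'Groundedness', veto: true },
    { name: 'Policy', veto: true }
  ].map(&:freeze).freeze

  # Read-only payload: the same frozen data on every call.
  def self.payload
    { status: 'PROPOSED', metrics: METRICS }
  end
end
```

Freezing the constant keeps the viewer and any later server-side reader from mutating the shared rubric by accident.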

ADR-5 — Feature gate on ai_qa_unified_scorecard

  • Context. Surfaces must ship dark until Phase 2; Pro+Ent only.
  • Options. A. Reuse org-feature mechanism (BE OrganizationFeatures::FindFeature, FE $hasSubscription). B. New bespoke flag system — Con: reinvents an existing pattern.
  • Decision. Option A. BE guards the three surfaces (return 404/empty or feature_enabled:false); FE hides routes/controls via $hasSubscription('ai_qa_unified_scorecard').
  • Rationale. Matches existing AI-assist gating; plan-gating piggybacks on provisioning (A2).
  • Consequences. Feature row must be provisioned per org; default OFF.
  • Reversibility. High — toggle the feature off.
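
The backend guard can be sketched as a wrapper around each surface. `FeatureLookup` is a hypothetical in-memory stand-in for the org-feature lookup (the real `OrganizationFeatures::FindFeature` is not called here), and returning 404 is just one of the responses the Decision above allows.

```ruby
FEATURE_CODE = 'ai_qa_unified_scorecard'

# Hypothetical stand-in for the org-feature lookup: maps an
# organization id to the list of feature codes provisioned for it.
class FeatureLookup
  def initialize(enabled_by_org)
    @enabled_by_org = enabled_by_org
  end

  def enabled?(organization_id, feature_code)
    @enabled_by_org.fetch(organization_id, []).include?(feature_code)
  end
end

# Runs the block only when the flag is on for the org; otherwise the
# surface stays dark and the block is never evaluated.
def with_scorecard_feature(lookup, organization_id)
  return [404, { error: 'Not found' }] unless lookup.enabled?(organization_id, FEATURE_CODE)

  [200, yield]
end
```

Orgs with no provisioned row fall through to the empty default, which matches the "default OFF" consequence.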

ADR-6 — Synchronous writes; async analytics

  • Context. Save P95 ≤ 500ms; events must not block saves.
  • Options. None considered for the write itself: a single-row DB write is already within budget, so it needs no async path. Async analytics follows the existing pattern.
  • Decision. Settings/rubric writes are synchronous single-row upserts (well under 500ms); Mixpanel events are enqueued via SendMixpanelEventWorker.perform_async.
  • Consequences. Event delivery is best-effort and never fails the save.
  • Reversibility. High.
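
The "best-effort, never fails the save" consequence can be sketched as follows. `save_then_track` and its keyword arguments are hypothetical; the lambdas stand in for the repository write and for the worker enqueue, neither of which is invoked here.

```ruby
# Runs the save synchronously, then attempts the analytics enqueue.
# A failure while enqueuing is reported through on_error and swallowed,
# so it can never turn a successful save into an error response.
def save_then_track(save:, enqueue:, on_error: ->(_error) {})
  result = save.call # errors here propagate to the caller
  begin
    enqueue.call(result)
  rescue StandardError => e
    on_error.call(e)
  end
  result
end
```

A usage example under the same assumptions: `save_then_track(save: -> { repo_write }, enqueue: ->(row) { enqueue_event(row) })`, where `repo_write` and `enqueue_event` are placeholders for the real calls.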

ADR-7 — Authorization reuses the existing role enum

  • Context. No qa_lead/bot_admin role exists; single-role enum {owner,admin,supervisor,agent,member} (hub-core user.rb:38-44).
  • Options. A. Keep set_role(%w[owner admin supervisor]); map QA Lead→supervisor, Bot/AI Admin→owner/admin. B. Introduce new roles — Con: cross-cutting change to hub-core + hub-service token issuance, far outside this initiative.
  • Decision. Option A (confirmed by DRI).
  • Rationale. Matches every existing scorecard endpoint; agent/member excluded (= "end CS agents: no access").
  • Consequences. Read + write on all three surfaces gate owner/admin/supervisor.
  • Reversibility. High; revisit if a QA role lands platform-wide.
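
A small sketch of the persona-to-role mapping and the resulting gate. `PERSONA_ROLES` and `authorize` are hypothetical names; the real check is the existing `set_role` helper, which is not reproduced here. The `[status, message]` return shape is an assumption.

```ruby
# Roles allowed on all three surfaces, as passed to set_role today.
ALLOWED_ROLES = %w[owner admin supervisor].freeze

# Hypothetical lookup restating how ADR-7 maps PRD personas onto the enum.
PERSONA_ROLES = {
  'QA Lead / Supervisor' => %w[supervisor],
  'Bot / AI Admin' => %w[owner admin],
  'End CS agent' => %w[agent member]
}.freeze

# current_user is the hash derived from the validated token.
def authorize(current_user)
  return [403, 'Permission denied'] unless ALLOWED_ROLES.include?(current_user['role'])

  [200, 'ok']
end
```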

ADR-8 — ai_passing_grade range 0–100 (diverges from human 1–99)

  • Context. PRD §6/§9 say AI threshold 0–100; existing human rule is 1–99 (patch.rb:18-22).
  • Options. A. Validate the new field 0–100 inclusive per PRD. B. Match human 1–99 for consistency — Con: contradicts PRD's stated bar (0 and 100 both meaningful).
  • Decision. Option A. rule(:ai_passing_grade) { key.failure unless (0..100).cover?(value) }.
  • Rationale. New field, follow the PRD spec; 0 ("any pass") and 100 ("perfect only") are legitimate.
  • Consequences. Two different valid ranges in one table — documented; surfaced as a minor follow-up to align (§5 Open Q#4).
  • Reversibility. High — change the rule bound.
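
The two ranges can be compared side by side in plain Ruby. This is an illustrative sketch rather than the Dry contract: `coerce_float` and `validate_ai_passing_grade` are hypothetical helpers, and returning an array of messages is an assumption. The error text is the one shown in the S01 sequence diagram.

```ruby
HUMAN_RANGE = (1..99)  # existing human passing_grade rule
AI_RANGE = (0..100)    # new ai_passing_grade rule, inclusive at both ends

# Returns a Float, or nil when the input is not numeric.
def coerce_float(value)
  Float(value)
rescue ArgumentError, TypeError
  nil
end

# Returns a list of error messages; empty means valid.
# nil is accepted because the column is nullable.
def validate_ai_passing_grade(value)
  return [] if value.nil?

  number = coerce_float(value)
  return ['AI passing grade only between 0 - 100'] if number.nil? || !AI_RANGE.cover?(number)

  []
end
```

The boundary values are where the two rules disagree: 0 and 100 pass the AI rule but fall outside the human 1–99 range.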

Minimum-coverage checklist

  • Storage — chatbot_gpt Postgres; new cols + widened prompt (ADR-1,2).
  • Sync vs async — sync writes, async analytics (ADR-6).
  • Caching — n/a — single-row reads, no cache; default rubric is a static constant.
  • Third-party — Mixpanel via existing worker (ADR-6); OpenAI is Phase 2.
  • Consistency — strong (single-row upsert, unique per org).
  • Multi-tenancy — org-scoped by organization_id (preference) / company_id (custom param) from the validated token; never client-supplied (§3 Security).
  • Reuse vs new — 2 extended endpoints + 1 new (ADR-4, Existing-Endpoint Check).

Detail 2.3 — Database Model

erDiagram
    SCORECARD_PREFERENCES {
        bigint id PK
        string organization_id UK "unique where deleted_at IS NULL"
        boolean is_auto_score "human (existing), default false"
        float passing_grade "human (existing), 1-99"
        boolean is_ai_auto_score "NEW, default false"
        float ai_passing_grade "NEW, nullable, 0-100"
        string company_id
        datetime deleted_at
    }
    SCORECARD_CUSTOM_PARAMETERS {
        uuid id PK
        string name
        string code
        string description
        text prompt "WIDENED string->text (AI judging rubric)"
        string company_id "UK [code, company_id] where deleted_at IS NULL"
        datetime deleted_at
    }
    SCORECARD_CATEGORIES_PARAMETERS }o--|| SCORECARD_CUSTOM_PARAMETERS : references

DDL (Rails DSL, chatbot_gpt connection — pattern: db/chatbot_gpt_migrate/20241113041150_*):

# db/chatbot_gpt_migrate/<ts>_add_ai_scoring_to_scorecard_preferences.rb
class AddAiScoringToScorecardPreferences < ActiveRecord::Migration[7.1]
  def change
    add_column :scorecard_preferences, :is_ai_auto_score, :boolean, null: false, default: false
    add_column :scorecard_preferences, :ai_passing_grade, :float
    add_index :scorecard_preferences, :is_ai_auto_score
  end
end

# db/chatbot_gpt_migrate/<ts+1>_widen_scorecard_custom_parameter_prompt.rb
class WidenScorecardCustomParameterPrompt < ActiveRecord::Migration[7.1]
  def up
    change_column :scorecard_custom_parameters, :prompt, :text
  end

  def down
    change_column :scorecard_custom_parameters, :prompt, :string # WARNING: truncates >255
  end
end

No data backfill (PRD §11). New AI columns default to "off/unset"; existing rows unaffected. Regenerate db/chatbot_gpt_schema.rb after migrating.

Per-status lifecycle: n/a — no status enum introduced (no new state machine; acts_as_paranoid soft-delete + paper_trail versioning already exist on both models and are unchanged).

State Surface Contract:

EntitySurfaced toField(s)VisibilityAudit
scorecard_preferencessettings pageis_ai_auto_score, ai_passing_grade (+ existing human)owner/admin/supervisor; flag onpaper_trail (existing)
scorecard_custom_parameterseditor + listprompt, derived auto_scorableowner/admin/supervisor; flag onpaper_trail (existing)
default rubric (static)viewer9 metrics + veto + PROPOSEDowner/admin/supervisor; flag onn/a (read-only constant)

Detail 2.4 — APIs (Outbound the FE consumes)

Base: chatbot frontend_service surface, called by FE $apiMain at /api (gpt_api.rb:28,33). Mirror the same Grape classes on the gpt_service surface (/v1/scorecards/..., gpt_service/api.rb:21,24) for parity. Auth: Bearer access-token + X-Auth-Token (validated via middlewares/auth.rb → hub-service /users/me).

Row 1 — extended — Preference (AI fields)

GET /api/v1/gpt/omnichannel/scorecard_preferences # role: owner|admin|supervisor; flag-gated
PATCH /api/v1/gpt/omnichannel/scorecard_preferences

Request (PATCH):

{
  "passing_grade": 75, // existing human (required by current contract)
  "is_auto_score": true, // existing human (required)
  "is_ai_auto_score": true, // NEW (optional; default false)
  "ai_passing_grade": 80 // NEW (optional; validated 0..100 when present)
}

Response (GET/PATCH 200):

{
  "data": {
    "is_auto_score": true, "passing_grade": 75,
    "is_ai_auto_score": true, "ai_passing_grade": 80
  },
  "message": "OK"
}

Errors: 422 ai_passing_grade non-numeric → coercion failure; 422 outside 0–100 → "AI passing grade only between 0 - 100 are allowed"; 403 role; 401 auth; 500 save fail.

New AI fields are optional in the contract so existing callers sending only human fields keep working (backward compat). Contract: optional(:is_ai_auto_score).maybe(:bool), optional(:ai_passing_grade).maybe(:float). Dry coerces the type before the range rule(:ai_passing_grade) runs, so a non-numeric value returns 422 instead of raising. On read, an absent ai_passing_grade → DEFAULT_AI_PASSING_GRADE (75) and an absent is_ai_auto_score → false.
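The coerce-then-range-check order and the read-side defaults can be sketched in plain Ruby as follows. This is a stand-in for the Dry contract, not the contract itself: validate_ai_passing_grade and present_preference are hypothetical names, and the "must be numeric" text is illustrative. The range, the 0–100 error message and the default of 75 come from this RFC.

```ruby
DEFAULT_AI_PASSING_GRADE = 75 # value from the RFC

# Hypothetical stand-in for the contract: coerce first, then range-check.
# Returns [value, error]; error is nil when the input is acceptable.
def validate_ai_passing_grade(raw)
  return [nil, nil] if raw.nil? # optional field omitted or null

  value = Float(raw, exception: false) # nil when the input is not numeric
  return [nil, 'must be numeric'] if value.nil? # illustrative text; maps to 422
  unless (0..100).cover?(value)
    return [nil, 'AI passing grade only between 0 - 100 are allowed']
  end

  [value, nil]
end

# Hypothetical read-side presenter: unset columns fall back to the defaults.
def present_preference(row)
  {
    is_ai_auto_score: row[:is_ai_auto_score] || false,
    ai_passing_grade: row[:ai_passing_grade] || DEFAULT_AI_PASSING_GRADE
  }
end

p validate_ai_passing_grade(80)
p validate_ai_passing_grade('abc')
p present_preference({})
```

Because the non-numeric case returns before the range check, the range check only ever sees a Float, which is the property the paragraph above relies on.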

Row 2 — extended — Custom parameter (rubric)

POST /api/v1/gpt/scorecard_custom_parameters # role: owner|admin|supervisor; flag-gated
PATCH /api/v1/gpt/scorecard_custom_parameters/:id

Request adds:

{ "name": "BANT capture", "prompt": "Score how completely … 0-100 + which were missed." }

Response 200 adds:

{ "data": { "id": "<uuid>", "name": "BANT capture", "prompt": "…", "auto_scorable": true }, "message": "…" }

Errors: 422 prompt length > 4000 → "AI judging rubric cannot exceed 4000 characters."; existing 422 name rules; 403/401/500 as today.

Create vs update / duplicate handling (REV-5): POST creates and PATCH :id updates; neither is an upsert, so adding prompt does not change create semantics. Duplicate names are already rejected by the existing Repositories::Gpt::ScorecardCustomParameter::NameUniquenessValidator#validate_create (create.rb → 422 "The name field is already exist or name cannot be the same as default parameter"). The new prompt field is orthogonal to uniqueness; no new collision surface is introduced.
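The two prompt rules for this row (the 4000-character cap and the derived auto_scorable flag) can be sketched in plain Ruby. The helper names prompt_length_error and auto_scorable? are hypothetical, and the blank check is a plain-Ruby approximation of ActiveSupport's present?; the cap and the error message come from this RFC.

```ruby
PROMPT_MAX_LENGTH = 4000 # cap from the RFC

# Hypothetical stand-in for the prompt length validation step.
# Returns the 422 message when the prompt is too long, otherwise nil.
def prompt_length_error(prompt)
  return nil if prompt.nil? || prompt.length <= PROMPT_MAX_LENGTH

  'AI judging rubric cannot exceed 4000 characters.'
end

# auto_scorable is derived from the prompt: true only for a non-blank rubric.
# Approximates prompt.present? without ActiveSupport.
def auto_scorable?(prompt)
  !prompt.to_s.strip.empty?
end

p prompt_length_error('a' * 4001)
p auto_scorable?('Score how completely the agent captured BANT.')
p auto_scorable?('')
```

Deriving the flag at serialization time, rather than storing it, keeps it from drifting out of sync with the prompt column.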

Row 3 — new-with-justification — Default AI rubric (read-only)

GET /api/v1/gpt/scorecard_ai_default_rubric # role: owner|admin|supervisor; flag-gated

Response 200:

{
  "data": {
    "status": "PROPOSED",
    "group": "Qontak AI Quality (default)",
    "metrics": [
      { "code": "groundedness", "name": "Groundedness / factual accuracy", "description": "Claims backed by KB sources or customer data; no invented product facts", "veto": true },
      { "code": "resolution", "name": "Resolution / task completion", "description": "Did it resolve the goal (skill_completed signal)", "veto": false },
      { "code": "relevance", "name": "Relevance / intent understanding", "description": "Addressed the real intent, not a different question", "veto": false },
      { "code": "policy", "name": "Policy & safety adherence", "description": "Stayed within 'what to avoid'; no unsafe content / PII leak", "veto": true },
      { "code": "tone", "name": "Tone & brand voice", "description": "Matched configured tone_of_voice; courteous", "veto": false },
      { "code": "language", "name": "Language quality (Bahasa)", "description": "Fluent target language; no broken/mixed language", "veto": false },
      { "code": "handoff", "name": "Handoff appropriateness", "description": "No false handover (Pattern A); no missed escalation", "veto": false },
      { "code": "tool", "name": "Tool / action correctness", "description": "Right action, right params, not skipped (Pattern B)", "veto": false },
      { "code": "efficiency", "name": "Conversation efficiency", "description": "No loops / re-asking; resolved within turn budget", "veto": false }
    ]
  },
  "message": "OK"
}

Errors: 500 → FE shows default_rubric_load_failed; 403/401.
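One possible shape for the static constant behind this endpoint is sketched below. Constants::ScorecardAiDefaultRubric is named in this RFC, but its internal layout here (a module with STATUS, GROUP, METRICS and a veto_codes helper) is an assumption. Names and descriptions are omitted for brevity and would be seeded from PRD Appendix A.

```ruby
# Hypothetical layout; only the codes, veto flags, status and group come from the RFC.
module Constants
  module ScorecardAiDefaultRubric
    STATUS = 'PROPOSED'
    GROUP  = 'Qontak AI Quality (default)'

    # metric code => veto flag, in the order of the response above
    METRICS = {
      'groundedness' => true,
      'resolution'   => false,
      'relevance'    => false,
      'policy'       => true,
      'tone'         => false,
      'language'     => false,
      'handoff'      => false,
      'tool'         => false,
      'efficiency'   => false
    }.freeze

    # Codes of the metrics flagged as veto.
    def self.veto_codes
      METRICS.select { |_code, veto| veto }.keys
    end
  end
end

puts Constants::ScorecardAiDefaultRubric::METRICS.size
puts Constants::ScorecardAiDefaultRubric.veto_codes.inspect
```

A frozen in-memory constant needs no database read, which is consistent with the "no cache" entry in the minimum-coverage checklist.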

APIs (Inbound — other services → us): n/a — Phase 1 adds no inbound webhook (the existing room-resolve webhook is unchanged).

Detail 2.A — UI Contract (FE)

Design status: Figma Pending (PRD). Interim spec = PRD Appendix B Stitch prompts. Components use Pixel3 (Mp*). New page under pages/settings/scorecard/; logic in modules/settings/views/ mirroring ai-assist.vue.

| Component | File (new) | Purpose | Key Pixel3 elements | Backing endpoint |
| --- | --- | --- | --- | --- |
| ScorecardSettingsPage | pages/settings/scorecard/index.vue | Container; flag guard | layout + MpTabs/sections | preference + default rubric |
| AutoScoreToggle | modules/settings/views/scorecard/auto-score-toggle.vue | AI on-switch + ai_passing_grade (0–100) | MpSwitch/MpFormControl/MpInput+MpFormErrorMessage/MpButton(is-loading) | PATCH .../scorecard_preferences |
| DefaultRubricViewer | modules/settings/views/scorecard/default-rubric-viewer.vue | Read-only 9 metrics + 🛑 veto + PROPOSED note | MpText/MpBadge/skeleton | GET .../scorecard_ai_default_rubric |
| CustomParamEditor | pages/settings/scorecard/custom-parameters.vue + modules/settings/views/scorecard/custom-param-editor.vue | Add/edit param + rubric textarea + length counter + auto-scorable chip | MpInput/MpTextarea/MpBadge/MpButton | POST/PATCH .../scorecard_custom_parameters |

Design ↔ Code Mapping: n/a — Figma pending; tokens follow the existing settings shell (ai-assist.vue). Any deviation re-checked once frames land (§5 Open Q#1).

Detail 2.B — Data-Fetching Strategy (FE)

  • New Pinia store chatbot-fe/store/scorecard/{state,actions,getters,types,index}.ts, mirroring store/ai-assist (fetchStatus: idle|pending|resolved|rejected).
  • New service chatbot-fe/common/services/main/v1/scorecard.ts using $apiMain + AbortController (pattern: ai-assist.ts:151-184). Endpoints added to common/services/main/endpoint.ts:
    scorecard: {
      preference: { get: "/v1/gpt/omnichannel/scorecard_preferences", update: "/v1/gpt/omnichannel/scorecard_preferences" },
      customParam: { create: "/v1/gpt/scorecard_custom_parameters", update: "/v1/gpt/scorecard_custom_parameters", list: "/v1/gpt/scorecard_custom_parameters" },
      defaultRubric: { get: "/v1/gpt/scorecard_ai_default_rubric" },
    }
  • Fetch on page mount; optimistic UI not used (single Save action), matching ai-assist.vue.

Casing convention (REV-6): the BE returns snake_case keys; the FE consumes them directly without transformation, matching the existing pattern (e.g. store/ai-assist reads state.reply_limit straight off the API). Do not introduce a camelCase mapping layer for these endpoints — keep the snake_case field names end-to-end so the contract stays 1:1.

Typed contracts (REV-1) — store/scorecard/types.ts + common/services/main/v1/scorecard.ts:

// API request/response shapes (snake_case, matching BE Grape entities)
export interface ScorecardPreference {
  is_auto_score: boolean // human (existing)
  passing_grade: number // human (existing, 1–99)
  is_ai_auto_score: boolean // NEW
  ai_passing_grade: number | null // NEW (0–100; null → default 75 on read)
}
export interface CustomParam {
  id: string
  name: string
  prompt: string // "" when manual-only
  auto_scorable: boolean // derived = prompt non-empty
}
export interface DefaultRubricMetric {
  code: string; name: string; description: string; veto: boolean
}
export interface DefaultRubric {
  status: "PROPOSED"; group: string; metrics: DefaultRubricMetric[]
}
// Pinia store slice (mirrors store/ai-assist fetchStatus pattern)
type FetchStatus = "idle" | "pending" | "resolved" | "rejected"
export interface ScorecardState {
  preference: { data?: ScorecardPreference; fetchStatus: FetchStatus }
  preferenceUpdate: { fetchStatus: FetchStatus }
  customParams: { data?: CustomParam[]; fetchStatus: FetchStatus }
  customParamSave: { fetchStatus: FetchStatus }
  defaultRubric: { data?: DefaultRubric; fetchStatus: FetchStatus }
}

Component props/emits types are deferred to implementation, inferred from the ai-assist.vue family (RFC §5 #7) — low risk, single owning module.

Detail 2.C — UI State Matrix

stateDiagram-v2
    [*] --> Loading: Open Scorecard settings
    Loading --> Empty: No custom params yet
    Loading --> Success: Saved config loaded
    Loading --> Error: Load / save fails
    Error --> Loading: Retry
    Empty --> Success: Add first custom param
    Success --> [*]: Config saved

| State | AutoScoreToggle | CustomParamEditor | DefaultRubricViewer |
| --- | --- | --- | --- |
| Loading | fields disabled + spinner | textarea disabled + spinner | skeleton list |
| Empty | defaults (off / 75) | "No custom parameters…" + add hint | n/a (9 defaults always exist) |
| Error | $toast error + Retry; log scorecard_settings_save_failed | $toast + Retry; log scorecard_custom_param_save_failed | "Couldn't load the default rubric." + Retry; log default_rubric_load_failed |
| Success | "Change saved" | "Saved — will be auto-scored when scoring ships"; chip lit if rubric present | 9 metrics listed, veto badges |

Detail 2.D — Scope Boundaries

| In scope | Out of scope |
| --- | --- |
| AI cols + prompt widen; 3 endpoints; 4 FE components + store/service; flag gating; analytics events; OpenAPI | Any scoring/computation; in-room panel; report; gate; the room-resolve is_custom_parameter skip guard; new roles; plan-tier code; i18n introduction |

Detail 2.E — Branch & Skip Catalog

| Branch / skip | Condition | Behavior | Owner |
| --- | --- | --- | --- |
| Rubric auto-scorable gate | prompt.present? | non-empty → auto_scorable:true; empty → manual-only (false) | BE (custom-param entity); §2.1b flowchart, S02/AC-3, NEG-1 |
| Flag-off skip | ai_qa_unified_scorecard disabled for org | FE hides routes/controls ($hasSubscription); BE returns flag-gated empty/404 | FE + BE (ADR-5), S01/NEG-1 |
| Plan-not-eligible skip | Starter/Free org (feature not provisioned) | Same as flag-off (no surface) | Provisioning (A2), S01/NEG-1 |
| Unauthorized skip | role ∈ {agent, member} | controls not rendered (FE); 403 (BE) | §3 Role matrix |
| AI-enable-without-Phase-2 | is_ai_auto_score=true pre-Phase-2 | recorded preference, no scores produced (inert) | A1; PRD Open Q#3 |
| Room-resolve AI skip (Phase 2, NOT built here) | existing unless is_custom_parameter \|\| scorecard_exists guard | unchanged in Phase 1; Phase 2 must revisit | BE (forward note §1) |

Detail 2.G — Cross-Layer Contract Verification

| Endpoint | PRD-to-Schema row (§1.A.5) | Interim design (App. B) | Match? |
| --- | --- | --- | --- |
| PATCH preference (AI fields) | rows 1–2 (AI on-switch, AI threshold) | Stitch #1 | yes |
| POST/PATCH custom param (prompt) | rows 3–4 (org rubric, derived auto_scorable) | Stitch #2 | yes |
| GET default rubric | row 5 (9 default metrics) | Stitch #1 (viewer block) | yes |

3. High-Availability & Security

Performance Requirement

Save P95 ≤ 500ms (PRD §6). Single-row upsert on an org-unique index; default-rubric is an in-memory constant. No N+1 (custom-param list already paginated/scoped by company_id).

Monitoring & Alerting

Reuse Mixpanel + the squad dashboard (owner: BOT). Events (PRD §12): scorecard_settings_updated, scorecard_settings_save_failed, scorecard_custom_param_saved, scorecard_custom_param_save_failed, default_rubric_viewed, default_rubric_load_failed. Alert: scorecard_settings_save_failed + scorecard_custom_param_save_failed rate > 5% in 1h → Slack #bot-ai-oncall. (Naming mirrors existing [CHATBOT] Mixpanel events in chatbot-fe/common/contants/mixpanel-events.ts.)
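The alert condition can be read as a failure ratio over a one-hour window. The sketch below assumes the denominator is failures plus the matching success events, which this RFC does not specify. The helper names save_failure_rate and alert? are hypothetical; the event names and the 5% threshold come from the text above.

```ruby
ALERT_THRESHOLD = 0.05 # 5% from the RFC

FAILURE_EVENTS = %w[scorecard_settings_save_failed scorecard_custom_param_save_failed].freeze
# Assumption: successes are counted from the two "saved/updated" events.
SUCCESS_EVENTS = %w[scorecard_settings_updated scorecard_custom_param_saved].freeze

# counts: event name => number of occurrences within the last hour
def save_failure_rate(counts)
  failed    = FAILURE_EVENTS.sum { |name| counts.fetch(name, 0) }
  succeeded = SUCCESS_EVENTS.sum { |name| counts.fetch(name, 0) }
  total = failed + succeeded
  return 0.0 if total.zero? # no traffic, nothing to alert on

  failed.to_f / total
end

def alert?(counts)
  save_failure_rate(counts) > ALERT_THRESHOLD
end

hourly = { 'scorecard_settings_updated' => 90, 'scorecard_settings_save_failed' => 10 }
puts alert?(hourly)
```

If the dashboard instead measures failures against all requests, only the denominator in save_failure_rate would change.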

Logging

Server errors via Rollbar.error(e) (existing pattern in the use cases); structured request logs via lograge. Never log full prompt content at error level (it may contain the org's intellectual property); log org_id, custom_param_id, reason only (matches PRD event props).
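One way to enforce the logging rule is an allow-list applied before anything reaches the logger. A minimal sketch, assuming a hypothetical helper safe_log_context and a plain Hash as the log context; only the three permitted keys come from this RFC.

```ruby
LOGGABLE_KEYS = %i[org_id custom_param_id reason].freeze

# Hypothetical helper: keep only allow-listed keys so rubric text
# (prompt) can never be written to error logs by accident.
def safe_log_context(context)
  context.slice(*LOGGABLE_KEYS)
end

context = {
  org_id: 'org-1',
  custom_param_id: 'param-1',
  reason: 'validation_failed',
  prompt: 'confidential rubric text'
}
p safe_log_context(context)
```

An allow-list fails closed: a field added to the context later stays out of the logs until someone adds it to LOGGABLE_KEYS.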

Tracing (REV-3)

The BE already runs ddtrace (Datadog) + Aegis/OpenTelemetry (chatbot/Gemfile). The three new / extended endpoints are ordinary Grape requests, so they inherit existing request spans automatically — no new instrumentation needed. Distributed FE→API→BE trace correlation is explicitly out of scope for Phase 1 (no new correlation-id propagation is added); on-call follows an FE error to the BE via the existing per-request Datadog span + the *_save_failed Mixpanel event's org_id. Revisit cross-tier trace stitching with the Phase-2 scoring pipeline, where the async OpenAI call makes it materially useful.

Security Implications

  • AuthN: every endpoint behind middlewares/auth.rb (Bearer + X-Auth-Token → hub-service).
  • AuthZ: set_role(%w[owner admin supervisor]) on all three (read + write). agent/member → 403.
  • Tenancy (critical): organization_id (preference) and company_id (custom param) are taken only from current_user (the validated token) — never the request body. This matches the existing endpoints: preference passes current_user[:organization_id] (scorecard_preferences.rb get/patch), custom param passes current_user.try(:[], 'company_id') (scorecard_custom_parameter.rb post/patch). The new prompt/AI fields must not introduce a body-supplied org/company id. Add a request-spec assertion that a token for org A cannot read/write org B's preference or params (cross-tenant write → scoped to token org).
  • Input validation: ai_passing_grade coerced to :float then range-checked 0–100 (non-numeric → 422, never a raised exception). prompt capped 4,000 chars server-side; strip null bytes / control characters before persist so Phase-2 prompt assembly can't be broken by injected control chars.
  • Injection / XSS: prompt is stored and rendered as text (an LLM instruction, not HTML). Custom-param name/description already pass sanitize_html (scorecard_custom_parameter.rb before_save); prompt does not need HTML sanitization but all custom-param text fields (name, description, prompt) must render via Vue interpolation, never v-html (Vue escapes by default).
  • Prompt-injection (forward-looking): the prompt becomes part of an LLM system prompt in Phase 2; Phase 1 only stores it. Note for Phase 2: treat stored rubric as untrusted input.
  • Audit: paper_trail already records versions on both models; ensure whodunnit is populated from current_user on these write paths (verify the existing PaperTrail.request.whodunnit wiring covers Grape requests — if not, set it from the token user in the use case).
  • Secrets / PII in logs: never log full prompt (org IP). Rollbar.error(e) is already used; add prompt (and system_prompt) to the Rollbar param scrub list so request bodies aren't captured. Events log only org_id / custom_param_id / has_rubric / reason (PRD §12).
  • DoS / size: prompt capped at 4,000 chars server-side (not just client); writes rely on the platform's existing request rate limiting (no new endpoint-specific limiter introduced).
  • AuthZ on default-rubric endpoint: serves only static, non-tenant config but still requires auth + set_role (no anonymous access to the rubric).
  • Data governance / retention (REV-7): the custom-param prompt is org-authored configuration (org IP), not end-customer PII — Phase 1 stores no conversation/customer data. Retention follows the existing acts_as_paranoid soft-delete on scorecard_custom_parameters (deleting a param soft-deletes its rubric); paper_trail versions persist for audit. The rubric is therefore out of scope for end-customer DSAR/export (it is account config, handled by normal account-deletion processes), and stays within the existing chatbot_gpt data boundary — no new data export, no new third-party data egress in Phase 1 (the rubric reaches OpenAI only in Phase 2, which owns that data-flow review).
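Two of the rules above, token-derived tenancy and control-character stripping, can be sketched in plain Ruby. Both helper names are hypothetical. The choice to keep tab, newline and carriage return is an assumption: this RFC only says to strip null bytes and control characters.

```ruby
# Assumption: tab (\x09), newline (\x0A) and carriage return (\x0D) are kept;
# all other C0 control characters and DEL are removed.
CONTROL_CHARS = /[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/.freeze

# Hypothetical sanitizer applied to the rubric before it is persisted.
def sanitize_prompt(prompt)
  prompt.to_s.gsub(CONTROL_CHARS, '')
end

# Tenancy: the organization id is read from the validated token user and the
# request params are deliberately ignored, even if they carry an org id.
def scoped_organization_id(current_user, _request_params)
  current_user.fetch(:organization_id)
end

p sanitize_prompt("Score\x00 tone\x07\nnext line")
p scoped_organization_id({ organization_id: 'org-a' }, { organization_id: 'org-b' })
```

The second call is the behavior the cross-tenant request spec should pin: a body-supplied id for org B has no effect on a token issued for org A.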

Role × Endpoint Authorization

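| Endpoint | owner | admin | supervisor | agent | member |
| --- | --- | --- | --- | --- | --- |
| GET/PATCH preference | ✅ | ✅ | ✅ | ❌ 403 | ❌ 403 |
| POST/PATCH custom param | ✅ | ✅ | ✅ | ❌ 403 | ❌ 403 |
| GET default rubric | ✅ | ✅ | ✅ | ❌ 403 | ❌ 403 |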
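The role matrix reduces to a single membership check shared by all three endpoints. A minimal plain-Ruby sketch, assuming a hypothetical helper authorization_status in place of the real set_role gate; the allowed role list comes from this RFC.

```ruby
ALLOWED_ROLES = %w[owner admin supervisor].freeze # roles passed to set_role

# Hypothetical stand-in for the gate: the status an authenticated request
# would receive from any of the three scorecard endpoints.
def authorization_status(role)
  ALLOWED_ROLES.include?(role.to_s) ? 200 : 403
end

%w[owner admin supervisor agent member].each do |role|
  puts "#{role}: #{authorization_status(role)}"
end
```

Unknown or missing roles fall through to 403, so the gate fails closed.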

Detail 3.A — Failure Mode Catalog

| Failure | Detection | Behavior | Recovery |
| --- | --- | --- | --- |
| ai_passing_grade out of 0–100 | Dry rule | 422, nothing saved | FE inline error (Vuelidate mirror) |
| prompt > 4000 | Dry macro | 422, nothing saved | FE length counter blocks + server reject |
| DB write fails | save! raises → rescued | 500, no partial state (single-row tx) | FE $toast + Retry; log *_save_failed |
| Concurrent saves (two admins) | org-unique index + single-row upsert | last-write-wins; no partial row; paper_trail keeps both versions | acceptable for a config row; no lock needed |
| hub-service down / slow | middlewares/auth.rb → Repositories::ChatService::Users::Me returns nil | 401 "User service unavailable" | FE re-auth. Timeout/retry of this auth call is inherited from the existing middleware; changing it is out of scope here. |
| default-rubric load fails | 500 / network | FE error state | Retry; default_rubric_load_failed |
| flag off | BE gate + FE $hasSubscription | surfaces not rendered / 404 | n/a (by design) |

Detail 3.B — Error Message Catalog

| Code | Message | Surface |
| --- | --- | --- |
| 422 | "AI passing grade only between 0 - 100 are allowed" | toggle |
| 422 | "AI judging rubric cannot exceed 4000 characters." | editor |
| 500 | "Couldn't save. Try again." | toggle/editor |
| 500 | "Couldn't load the default rubric." | viewer |
| 403 | "Permission denied" (existing) | all |

Detail 3.C — Accessibility

Pixel3 components are used as-is (existing settings a11y). New textarea has an associated MpFormLabel; veto status conveyed by text + badge (not color alone); length counter has aria-live=polite. Keyboard: Save reachable via tab; errors announced via MpFormErrorMessage.

Detail 3.D — Browser Support & FE Performance Budget (REV-4)

  • Browser support: inherits the existing chatbot-fe Nuxt 4 target — no new matrix introduced; the new pages must work on the same browsers the current /settings/* pages support (no new polyfills, no APIs beyond what ai-assist.vue already uses).
  • FE performance budget: the scorecard settings route is a lazy-loaded page (Nuxt route-level code-split, like other pages/settings/*), so it adds no weight to the initial bundle. The default-rubric list is 9 static rows and the custom-param list is already paginated server-side, so there is no large client render. No new heavy dependency is added (reuses Pixel3 + Vuelidate + mixpanel-browser already in package.json).

4. Backwards Compatibility and Rollout Plan

Compatibility

Additive only. New AI columns default off/unset; existing GET/PATCH callers that omit AI fields keep working (AI fields optional). Human auto-scoring (auto_agent_scoring.rb) reads only is_auto_score/passing_grade — untouched. prompt widen preserves existing data.

Rollout Strategy

Flag ai_qa_unified_scorecard default OFF. Stage 1 internal QA (3–5 accounts); Stage 2 closed beta (TransGo, Talenta LMS + 3 partners, dark); held for customer GA with Phase 2 scoring (PRD §11/§14). Detailed scheduling lives in delivery/ (not here).

Cross-Layer Rollout Compatibility

| Order | Step | Safe if FE not yet shipped? | Safe if BE not yet shipped? |
| --- | --- | --- | --- |
| 1 | BE migration (add cols, widen prompt) | yes (inert columns) | |
| 2 | BE API + OpenAPI | yes (flag-gated, optional fields) | |
| 3 | FE behind $hasSubscription | | FE no-ops (feature off) |
| 4 | Provision feature for beta orgs | | |

Deploy BE before FE. Rollback FE before BE (FE depends on BE fields, not vice-versa).

Detail 4.A — Configuration Contract

| Key | Type | Default | Where |
| --- | --- | --- | --- |
| ai_qa_unified_scorecard | org feature (ChatbotGpt::Feature + OrganizationFeature) | OFF | BE OrganizationFeatures::FindFeature; FE $hasSubscription |
| DEFAULT_AI_PASSING_GRADE | constant (entity) | 75 (decided for build; mirrors human default) | entities/frontend_services/gpt/scorecard_preference.rb |
| prompt max length | constant in Dry macro | 4000 (decided for build; PRD Open Q#2 confirmation is advisory; tunable) | custom-param create/update use cases |
| ai_passing_grade valid range | Dry rule | 0–100 inclusive (decided, ADR-8) | scorecard_preference/patch.rb |
| default rubric content | Constants::ScorecardAiDefaultRubric | 9 metrics, content status:"PROPOSED" (served as-is) | new constant/config (seed from PRD App A) |

These values are locked for the Phase-1 build so the agent has no ambiguity. DSAI rubric confirmation (Open Q#2) and the max-length confirmation (Open Q#3) are advisory follow-ups that change config constants only — they do not block implementation.

Detail 4.B — Test Plan (commands from the repos)

Backend (chatbot/AGENTS.md:56-187,235-247):

# migrate the chatbot_gpt DB in test, then run specs
RAILS_ENV=test bundle exec rails db:migrate
bundle exec rspec spec/api/frontend_service/v1/gpt_spec.rb \
  spec/core/use_cases/api/frontend_service/v1/gpt
bundle exec rubocop
bundle exec brakeman
bundle exec fasterer && bundle exec reek
# OpenAPI (MANDATORY when endpoints change):
ruby scripts/openapi/split.rb
npx --yes @apidevtools/swagger-cli validate docs/openapi/openapi.yaml
npx --yes @stoplight/spectral-cli lint docs/openapi/openapi.yaml --fail-severity=error

Frontend (chatbot-fe/package.json:10-22):

pnpm lint
pnpm test # vitest run
pnpm test:e2e # playwright (visual/e2e)
pnpm build

Cross-boundary contract test (REV-2). Because FE and BE land in separate repos/PRs, pin the contract on both sides so a casing/shape drift fails CI rather than production:

  • BE (RSpec request spec): assert the PATCH-preference response and the custom-param response serialize exactly {is_auto_score, passing_grade, is_ai_auto_score, ai_passing_grade} and {id, name, prompt, auto_scorable} (snake_case, auto_scorable derived) — this is the authoritative contract. Add to spec/api/frontend_service/v1/gpt_spec.rb.
  • FE (Vitest service test): assert common/services/main/v1/scorecard.ts parses a fixture whose shape is copied verbatim from the BE spec's expected JSON into the ScorecardPreference / CustomParam / DefaultRubric interfaces (§2.B), and that the store maps auto_scorable → the chip. The shared fixture is the contract anchor: if the BE entity changes a key, the BE spec changes the fixture, and the FE test (using the same fixture) breaks — catching the drift.
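The BE half of the contract check amounts to comparing the serialized key set against a pinned list. The sketch below uses only Ruby's stdlib JSON rather than RSpec. The helper name contract_drift is hypothetical, while the expected key lists are the ones stated in the BE bullet above.

```ruby
require 'json'

PREFERENCE_KEYS   = %w[is_auto_score passing_grade is_ai_auto_score ai_passing_grade].freeze
CUSTOM_PARAM_KEYS = %w[id name prompt auto_scorable].freeze

# Hypothetical helper: reports keys missing from, or unexpected in, the
# "data" object of a response body. Both lists are empty when the shape matches.
def contract_drift(json_body, expected_keys)
  actual = JSON.parse(json_body).fetch('data').keys
  { missing: expected_keys - actual, unexpected: actual - expected_keys }
end

fixture = '{"data":{"is_auto_score":true,"passing_grade":75,' \
          '"is_ai_auto_score":true,"ai_passing_grade":80},"message":"OK"}'
p contract_drift(fixture, PREFERENCE_KEYS)
```

A camelCase regression such as isAiAutoScore would show up in both lists at once, which makes the failure message point directly at the casing drift.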

Detail 4.C — Agent Execution Plan

Order respects dependencies (migration → BE API → OpenAPI → FE). Each chunk has files + commands + assertable acceptance. Use the chatbot repo's openapi-spec-sync skill for chunks touching endpoints.

| # | Chunk | Files | Commands | Acceptance |
| --- | --- | --- | --- | --- |
| 1 | Migration: AI columns | chatbot/db/chatbot_gpt_migrate/<ts>_add_ai_scoring_to_scorecard_preferences.rb; regen db/chatbot_gpt_schema.rb | RAILS_ENV=test bundle exec rails db:migrate | schema shows is_ai_auto_score (bool default false) + ai_passing_grade (float) |
| 2 | Migration: widen prompt | chatbot/db/chatbot_gpt_migrate/<ts>_widen_scorecard_custom_parameter_prompt.rb; regen schema | same | prompt column type = text |
| 3 | BE preference AI fields | app/api/frontend_service/v1/gpt/omnichannel/scorecard_preferences.rb; .../scorecard_preference/patch.rb,get.rb; repositories/gpt/scorecard_preferences/upsert.rb,find_by.rb; entities/frontend_services/gpt/scorecard_preference.rb (+DEFAULT_AI_PASSING_GRADE); api/.../entities/gpt/scorecard_preference.rb & get_scorecard_preference_response.rb | bundle exec rspec spec/.../gpt | new request spec: PATCH with ai_passing_grade:80 persists & GET returns it; ai_passing_grade:150→422 |
| 4 | BE custom-param prompt | app/api/frontend_service/v1/gpt/scorecard_custom_parameter.rb; .../scorecard_custom_parameter/create.rb,update.rb (+validate_prompt_length); repositories/gpt/scorecard_custom_parameter/create.rb,update.rb; entities/.../gpt/scorecard_custom_parameter.rb + response entity (prompt,auto_scorable) | bundle exec rspec | POST with prompt → auto_scorable:true; empty→false; 4001 chars→422 |
| 5 | BE default-rubric endpoint | app/api/frontend_service/v1/gpt/scorecard_ai_default_rubric.rb (NEW); use case + entity; Constants::ScorecardAiDefaultRubric (seed the 9 metrics' code/name/description/veto verbatim from PRD Appendix A Tier-1, exactly as in §2.4 Row 3; the per-metric LLM judge prompts stay out; they belong to Phase 2 scoring); mount in app/api/frontend_service/gpt_api.rb (+ app/api/gpt_service/api.rb) | bundle exec rspec | GET returns the 9 metrics with the §2.4 descriptions, veto:true on groundedness+policy, status:"PROPOSED" |
| 6 | OpenAPI sync | docs/openapi/openapi.yaml + dist/{frontend,gpt}.yaml; docs/openapi/SESSION-LOG.md | ruby scripts/openapi/split.rb; swagger-cli + spectral | both validators pass; roleScopedAuth overlay present for the 3 ops |
| 7 | FE store+service+endpoints | chatbot-fe/store/scorecard/*; common/services/main/v1/scorecard.ts; common/services/main/endpoint.ts; register in mainService index | pnpm test | store action unit tests: pending→resolved/rejected transitions |
| 8 | FE settings page + toggle | pages/settings/scorecard/index.vue; modules/settings/views/scorecard/auto-score-toggle.vue; Vuelidate 0–100; $hasSubscription guard; mixpanel events | pnpm test; pnpm lint | component test: out-of-range shows error; flag off → not rendered; save fires scorecard_settings_updated |
| 9 | FE custom-param editor | pages/settings/scorecard/custom-parameters.vue; modules/settings/views/scorecard/custom-param-editor.vue; length counter + auto-scorable chip | pnpm test | chip lights when textarea non-empty; >4000 blocked; saved toast |
| 10 | FE default-rubric viewer | modules/settings/views/scorecard/default-rubric-viewer.vue; veto badges + PROPOSED note; error+retry | pnpm test | renders 9 metrics + veto badges; error state shows retry + logs event |
| 11 | Cross-boundary contract test | BE: add expected-JSON assertions to spec/api/frontend_service/v1/gpt_spec.rb; FE: tests/unit/.../scorecard.service.test.ts parsing the same fixture into §2.B interfaces | bundle exec rspec; pnpm test | shared fixture → BE serializes it, FE parses it; a key/casing change breaks both (REV-2) |
| 12 | Test specs doc | documents/chatbot/unified-agent-scorecard/tests/phase-1-settings-and-rubric-config.md | n/a | covers_acceptance_criteria lists every UASC-S0x/AC-n |

Detail 4.D — Verification & Rollback Recipe

Pre-merge (in order): BE rails db:migrate → rspec → rubocop/brakeman → OpenAPI split.rb + swagger-cli + spectral; FE pnpm lint → pnpm test → pnpm build.

Post-deploy signals:

  • scorecard_settings_updated events appear for beta orgs; save-failure events < 5%/h.
  • Manual: enable flag for one internal org → toggle AI scoring + set threshold + add a custom param with a rubric → reload → values persist; default rubric lists 9 metrics.
  • Regression: human scorecard config + a resolved-room human auto-score still behaves as before.
  • Adoption leading indicators (PRD §13 — settings save success ≥ 99%): track config-readiness = % beta Pro+Ent orgs that enabled is_ai_auto_score + accepted the default rubric or added ≥ 1 custom param (target ≥ 80% before Phase-2 GA), and scorecard_custom_param_saved count (target ≥ 1 per beta org) via the scorecard_settings_updated / scorecard_custom_param_saved events. (These are PM-owned program metrics; the RFC only ensures the events exist to compute them.)

Rollback (numbered):

  1. Disable ai_qa_unified_scorecard for affected orgs (instant; surfaces vanish).
  2. If BE bug: revert the BE PR (AI fields optional → no caller breaks).
  3. If FE bug: revert the FE PR (BE inert without FE).
  4. Columns/prompt widen are forward-only — do not run the prompt down in prod (truncation). Leave columns; they are inert when the flag is off.
  5. Confirm scorecard_settings_save_failed returns to baseline and human auto-scoring intact.

5. Concerns, Questions, or Known Limitations

| # | Item | Type | Owner | Status |
| --- | --- | --- | --- | --- |
| 1 | Figma frames for both surfaces are pending; FE built against PRD Appendix B Stitch prompts until frames land; pixel deviations re-checked then. | Blocker (FE fidelity) | Design | Open (PRD Dep "Design — YES") |
| 2 | Confirm the 9 default metric definitions/order with DSAI (Appendix A is PROPOSED). | Open (accuracy) | DSAI | PRD Open Q#1, due 2026-07-15 |
| 3 | prompt max length = 4,000 adopted; confirm. | Open | BOT+PM | PRD Open Q#2 |
| 4 | ai_passing_grade (0–100) diverges from human passing_grade (1–99). Align later? | Known limitation | BOT | New (ADR-8) |
| 5 | No plan-tier check exists in code; plan-gating relies on provisioning the org-feature only for Pro+Ent. Confirm provisioning owner/path. | Open | Billing/Provisioning + PM | New (A2) |
| 6 | Forward-looking: Phase 2 must change the room-resolve skip guard (is_custom_parameter) so AI scoring with custom params runs. | Forward note | BOT | Phase 2 |
| 7 | REV-1/REV-9: decide whether the new components' props/emits get explicit TS contracts in this RFC or are inferred at implementation time (store + service types are specified in §2.B). rfc-reviewer R3 note: the reference target modules/settings/views/ai-assist.vue is <script setup> with no defineProps/defineEmits, so it offers no prop contract to copy; typing the 3 components in §2.A is the recommended close. | Open (low risk) | BOT FE | From rfc-reviewer R1 (sharpened R3) |
| 8 | REV-8: plugins/botSubscriptionFeature.ts ($hasSubscription, cited by ADR-5/§2.0) is marked @deprecated in favor of the useSubscription composable. Confirm whether new FE uses $hasSubscription for parity or migrates to useSubscription. | Open (low risk) | BOT FE | From rfc-reviewer R3 |

6. Comment Log

| Date | Author | Note |
| --- | --- | --- |
| 2026-06-20 | rfc-starter (Claude) | Initial draft from PRD v1.2; grounded against chatbot, chatbot-fe, hub-core, hub-service. Corrected PRD persona/role premise (no qa_lead/bot_admin role) and confirmed auto_agent_scoring.rb scores the human agent. |
| 2026-06-20 | rfc-reviewer (Claude) | R1 review → 8.0 (Strong/PROCEED). Applied R2 fixes: typed store/service contracts + casing convention (§2.B, REV-1/REV-6), FE↔BE contract test (§4.B/§4.C ch.11, REV-2), tracing scope (§3, REV-3), browser/perf budget (§3.D, REV-4), custom-param dedup note (§2.4 r2, REV-5), data-governance/retention (§3, REV-7). Re-score → 8.5. Open: REV-1 (component prop typing) carried to §5 #7. See rfc-phase-1-settings-and-rubric-config-review.md. |

7. Ready for Agent Execution

  • Every PRD story + composite AC id traced in §1.A.4 / §1.C
  • Architecture, ER, sequence (see below), state, and branch diagrams — happy + failure paths
  • Every endpoint tagged reused/extended/new with evidence (Existing-Endpoint Check)
  • Source Verification table backs every "existing" claim with file:line
  • ADRs cover storage, sync/async, caching, third-party, consistency, multi-tenancy, reuse/new
  • Test/build commands sourced from chatbot/AGENTS.md + chatbot-fe/package.json
  • Agent Execution Plan: ordered chunks with files + commands + assertable acceptance
  • Rollback recipe concrete (flag → revert → forward-only columns)
  • Figma frames for both surfaces (Open Q#1) — FE chunks proceed on interim Stitch spec; re-verify on frames
  • Infosec approver sign-off at review (Metadata)
  • DSAI confirmation of the 9 metrics (Open Q#2) — non-blocking for build (served PROPOSED)

Ready for agent execution: yes (backend + FE logic). FE visual fidelity is gated on Figma (Open Q#1); build the components against the interim Stitch spec and reconcile when frames land. Optionally hand to rfc-reviewer for a second-pass score.