RFC: Unified Agent Quality Scorecard — Phase 1: Scorecard Settings & Rubric Config

Document Conventions (do not remove)

This RFC follows the Qontak RFC Template format for governance — the metadata table, sections 1–6, and Comment log are mandatory.

It is also agent-execution-ready: §1 PRD-to-Schema Derivation (BE half) + §2.A UI Contract (FE half), §2.0 Repo Reading Guide for both layers, mermaid diagrams, §2.G Cross-Layer Contract Verification, and §4 Agent Execution Plan + Verification & Rollback Recipe are complete.

Agent-execution-ready RFC derived 1:1 from ../prds/phase-1-settings-and-rubric-config.md. Phase 1 builds the config layer only — no scoring, in-room panel, report, or gate. Backend = Qontak Chatbot (chatbot, Rails 7.1 / Grape / Clean Architecture). Frontend = Qontak Chatbot FE (chatbot-fe, Nuxt 4 / Vue 3 / Pinia / Pixel3).

Metadata

Field	Value	Notes
Status	`RFC`	Working vocabulary `IDEA`/`RFC`/`AGREED`/`ABANDON`. YAML `status:` uses linter enum (`RFC→in-review`); kept `draft` until reviewed.
Owner (DRI)	Dimas Fauzi Hidayat	Mirrors frontmatter `dri`. Single accountable owner; staffing lives in `delivery/`.
Source PRD	`../prds/phase-1-settings-and-rubric-config.md`	PRD v1.2.
Anchor	`../unified-agent-scorecard-anchor.md`	Initiative master index.
Delivery	`not yet handed to delivery`	Timeline/effort/rollout scheduling lives in `delivery/`.
Type	full-stack	Backend (Grape API + chatbot_gpt DB) + Frontend (Nuxt settings UI).
Squad	BOT — Bot, AI & Automation
Infosec approver	required at review — see §7	Touches auth-gated org config + audited PII-adjacent settings.
Last Updated	2026-06-20

Sections at a Glance

§	Section	Type hint
§1	Overview, traceability, decisions index	PRD coverage, AC map, schema derivation
§2	Technical design	Repo reading guide, infra topology, ADRs, ER + sequence diagrams, API + UI contracts
§3	HA & Security	Perf, auth matrix, failure catalog, error catalog, a11y
§4	Backwards compat & rollout	Flag contract, test plan, agent execution plan, rollback recipe
§5	Concerns / open questions	Carried from PRD + grounding gaps
§6	Comment log
§7	Ready for agent execution	The readiness gate

1. Overview

Phase 1 ships the configuration layer for AI-agent quality scoring, behind the ai_qa_unified_scorecard feature flag. It does three things and nothing more:

Extends scorecard_preferences with an AI on-switch + AI pass threshold (is_ai_auto_score, ai_passing_grade), leaving the existing human auto-score (is_auto_score / passing_grade, which already drive auto_agent_scoring.rb) untouched.
Wires the dormant scorecard_custom_parameters.prompt field into the API + UI as an "AI judging rubric", widening it string → text. A non-empty rubric marks a custom parameter auto-scorable (a derived attribute).
Surfaces a read-only Default AI Rubric viewer (the 9 Qontak-calibrated metrics), served by a new read-only backend endpoint from static config.

No scores are produced in Phase 1. The persisted config is consumed by the Phase 2 scoring pipeline.

Success Criteria

AI scoring preference (is_ai_auto_score + ai_passing_grade) persists per org and round-trips through GET; settings save success ≥ 99% (PRD §13).
A custom parameter with a non-empty prompt persists and is returned auto_scorable: true; empty prompt returns auto_scorable: false.
The 9-metric default rubric loads read-only with veto flags on Groundedness + Policy.
All three surfaces are invisible unless ai_qa_unified_scorecard is enabled for the org.
Save P95 ≤ 500ms (PRD §6).
Human manual scorecard config and human auto-scoring behavior are byte-for-byte unchanged.

Out of Scope

Phase 2 scoring pipeline, in-room panel, multi-actor scoring, Analytics report (P3), validation harness (P4), go-live gate (P5), mobile, billing/packaging, any change to human manual scoring or auto_agent_scoring.rb runtime behavior. See PRD §5.

Forward-looking note (not built here): the room-resolve trigger at chatbot/app/core/use_cases/api/internal_service/v1/webhook/room_resolve_interactions.rb:48-63 currently skips AutoAgentScoringWorker when custom parameters exist (unless is_custom_parameter || scorecard_exists). Phase 2 must revisit this guard so AI scoring with custom params actually runs. Phase 1 changes nothing here.

Document	Path	What was taken from it
Phase 1 PRD v1.2	`../prds/phase-1-settings-and-rubric-config.md`	All requirements, ACs, rubric content.
Initiative anchor	`../unified-agent-scorecard-anchor.md`	Phase map; confirms `auto_agent_scoring.rb` scores the human agent only.
Phase 2 PRD	`../prds/phase-2-auto-scoring-and-in-room-scorecard.md`	Reviewed — consumer of this config; no Phase 1 impact.
chatbot AGENTS.md API rules	`chatbot/AGENTS.md` §"API Specification Rules"	Mandatory OpenAPI bundle/split/validate workflow (§4.B).

Assumptions

A1 — Enabling is_ai_auto_score before Phase 2 exists is a recorded preference with no customer-visible effect (PRD Open Q#3). Grounded: GET/PATCH already persist a preference with no runtime side effect beyond the human path; the new AI columns are inert until Phase 2 reads them.
A2 — Plan-gating (Pro+Ent only) is enforced by provisioning the ai_qa_unified_scorecard org-feature only for eligible plans; the build only checks the flag. (No plan-tier check exists in code to reuse — see §5 Open Q#5.)
A3 — prompt max length is 4,000 chars (PRD Open Q#2 proposed value). Adopted as the build default; tunable via the validation macro.
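
A minimal plain-Ruby sketch of the A3 length rule, assuming the limit counts characters rather than bytes. `PROMPT_MAX_LENGTH` and `prompt_length_error` are hypothetical names used only for illustration; the real check is expected to live in the `validate_prompt_length` Dry macro. The message text is the one shown in the S02 sequence diagram.

```ruby
# Hypothetical constant; A3 adopts 4,000 characters as the build default.
PROMPT_MAX_LENGTH = 4_000

# Returns nil when the rubric is acceptable, otherwise the error message.
# A nil or empty prompt is valid: it leaves the parameter manual-only.
def prompt_length_error(prompt, max: PROMPT_MAX_LENGTH)
  return nil if prompt.nil?
  return nil if prompt.length <= max # String#length counts characters, not bytes

  "AI judging rubric cannot exceed #{max} characters."
end
```

Keeping the limit as a keyword argument is one way to make it tunable without touching call sites.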

Dependencies

Dependency	Owner	Needed	Blocking?
`prompt` widen `string→text` migration	BOT (this RFC)	DDL	NO — in scope (§2.3).
Design frames (settings + rubric editor)	Design squad	Figma for CHG-001/CHG-002	YES for FE pixel-faithfulness — see §5 Open Q#1. Stitch prompts in PRD Appendix B are the interim spec.
DSAI 9-metric definitions	DSAI	Confirm default rubric content	NO for build · advisory for accuracy (rubric served as PROPOSED).
`ai_qa_unified_scorecard` org-feature provisioning	Billing/Provisioning	Feature rows for Pro+Ent orgs	NO for build (defaults OFF).

Detail 1.A — Coverage Matrices

1.A.1 — PRD Section Coverage

PRD §	Title	Covered in
2	One-liner + Problem	§1 Overview
3	What happens if we don't build	§1 (motivation)
4	Target users + persona	§3 Role × Endpoint matrix
5	Non-Goals	§1 Out of Scope
6	Constraints	§3 Performance, §4 Flag contract, §3 Role matrix
7	Feature Changes (CHG-001/002)	§2.3 DDL, §2.4 APIs, §2.A UI
8	New Features (editor + viewer)	§2.A UI Contract, §2.4 (new endpoint)
9	API & Webhook Behavior	§2.4 APIs
10	System Flow + Stories + ACs	§1.C, §2.1a Sequence, §1.A.4 AC map
11	Rollout	§4 Rollout
12	Observability	§3 Monitoring & Logging
13	Success Metrics	§1 Success Criteria, §4.D signals
14	Launch Plan & Stage Gates	§4 Rollout (technical view; scheduling → delivery/)
15	Dependencies	§1 Dependencies
16	Key Decisions + Alternatives	§1.B + §2 ADRs
17	Open Questions	§5
App. A	AI Scoring Rubric	§2.4 default-rubric endpoint payload
App. B	Stitch UI Prompts	§2.A interim design spec

1.A.2 — UI / Consumer Surface Coverage

Surface	PRD ref	Backing read endpoint	RFC anchor
Scorecard settings page `/settings/scorecard`	CHG-001, S01, S03	`GET .../scorecard_preferences` + `GET .../scorecard_ai_default_rubric`	§2.A, §2.4
Custom-parameter editor `/settings/scorecard/custom-parameters`	CHG-002, S02	`GET .../scorecard_custom_parameters` (existing list)	§2.A, §2.4
Default Rubric viewer (within settings)	S03	`GET .../scorecard_ai_default_rubric` (new)	§2.A, §2.4

1.A.3 — Role Coverage

PRD persona	Grounded role (hub-core `user.rb:38-44` enum)	Access
QA Lead / Supervisor	`supervisor`	Read all; write threshold + custom rubric
Bot / AI Admin (Agent Owner)	`owner` / `admin`	Read all; write threshold + custom rubric
End CS agent	`agent` / `member`	No access (controls not rendered; API 403)

Grounding correction (PRD vs code): the PRD names "QA Lead" and "Bot/AI Admin" roles. The platform has no such roles. The single-role enum issued by hub-service /users/me is {owner, admin, supervisor, agent, member} — supervisor is the closest to QA Lead. Per confirmed decision, scorecard writes keep set_role(%w[owner admin supervisor]) and map the personas onto these roles (ADR-7).

1.A.4 — Acceptance-Criteria → Design Element Map

PRD Story	Composite AC ids	Design element	Test spec ref
`UASC-S01 — Enable AI auto-scoring + threshold`	`UASC-S01/AC-1`, `/AC-2`, `/AC-3`, `/ERR-1`, `/NEG-1`	§2.3 new cols · §2.4 row 1 (PATCH pref) · §2.A AutoScoreToggle · §4.C chunks 1,3,6	`tests/phase-1-settings-and-rubric-config.md`
`UASC-S02 — Custom param + rubric`	`UASC-S02/AC-1..AC-4`, `/ERR-1`, `/NEG-1`	§2.3 `prompt` text · §2.4 row 2 (custom param) · §2.A CustomParamEditor · §4.C chunks 2,4,7	″
`UASC-S03 — View default rubric`	`UASC-S03/AC-1`, `/AC-2`, `/AC-3`, `/ERR-1`	§2.4 row 3 (new endpoint) · §2.A DefaultRubricViewer · §4.C chunks 5,8	″

1.A.5 — PRD-to-Schema Derivation (BE)

PRD entity/attribute/rule	table.column	Exposed by	Enforced at	PRD ref
AI auto-score on-switch	`scorecard_preferences.is_ai_auto_score` (new, bool, default false)	GET/PATCH preference	Dry contract (optional bool), upsert repo	CHG-001
AI pass threshold	`scorecard_preferences.ai_passing_grade` (new, float, nullable)	GET/PATCH preference	Dry contract rule 0–100	CHG-001, S01/AC-2,3
Org-specific AI rubric	`scorecard_custom_parameters.prompt` (`string→text`)	POST/PATCH custom param; list GET	Dry contract length ≤ 4000	CHG-002, S02
Auto-scorable flag	derived `prompt.present?` (not stored)	custom param response `auto_scorable`	computed in entity/builder	S02/AC-1,3; NEG-1
9 default AI metrics + veto flags	static config (no table)	new read-only endpoint	constant + Grape entity	S03, App. A

Detail 1.B — Decisions Closed (index → §2 ADRs)

#	Decision	ADR
1	New columns `is_ai_auto_score` + `ai_passing_grade`, not overloading existing human cols	ADR-1
2	Widen `prompt` `string→text` (Postgres `change_column`)	ADR-2
3	`auto_scorable` is derived from `prompt.present?`, not a stored boolean	ADR-3
4	Default rubric served by a new read-only endpoint from static Ruby config	ADR-4
5	Gate all surfaces on `ai_qa_unified_scorecard` (BE: `OrganizationFeatures::FindFeature`; FE: `$hasSubscription`)	ADR-5
6	Settings/rubric writes are synchronous; analytics fired async via `SendMixpanelEventWorker`	ADR-6
7	Reuse `set_role(%w[owner admin supervisor])`; map PRD personas onto the existing enum	ADR-7
8	`ai_passing_grade` validated 0–100 inclusive (PRD), diverging from human `passing_grade` 1–99	ADR-8

Detail 1.C — Per-Story Change Map

Story	Layer scope	Changes (concrete artifacts)	Acceptance criteria	RFC anchors
UASC-S01	FE + BE	BE: migration add 2 cols; `ScorecardPreference::Patch`/`Get` contracts + defaults; `scorecard_preferences` Grape params; entity `ScorecardPreference` (FE-svc + gpt-svc); `Upsert`/`FindBy` repos; entity `Entities::FrontendServices::Gpt::ScorecardPreference` + `DEFAULT_AI_PASSING_GRADE`; OpenAPI. FE: `AutoScoreToggle.vue`; `store/scorecard` state/actions; `scorecard.ts` service + `endpoint.ts`; Vuelidate 0–100; mixpanel `scorecard_settings_updated`/`_save_failed`.	`UASC-S01/AC-1,2` persist+roundtrip; `AC-3` 0–100 validation rejects; `ERR-1` error+retry, no partial state, log; `NEG-1` Starter/Free hidden (flag off).	§2.3 · §2.4 r1 · §2.A · §4.C c1,c3,c6
UASC-S02	FE + BE	BE: migration widen `prompt`; add `prompt` to custom-param Grape params (POST+PATCH) + `Create`/`Update` Dry contracts + `validate_prompt_length` macro; repos `ScorecardCustomParameter::Create`/`Update` persist `prompt`; entity `ScorecardCustomParameter` expose `prompt` + `auto_scorable`; OpenAPI. FE: `CustomParamEditor.vue` (textarea + length counter + auto-scorable chip); store actions; service+endpoint; Vuelidate max-len; mixpanel `scorecard_custom_param_saved`/`_save_failed`.	`S02/AC-1` non-empty→auto_scorable; `AC-2` shows rubric+state; `AC-3` empty→manual-only; `AC-4` over-limit rejected; `ERR-1` error+retry+log; `NEG-1` empty NOT auto-scorable.	§2.3 · §2.4 r2 · §2.A · §4.C c2,c4,c7
UASC-S03	FE + BE	BE: new `ScorecardAiDefaultRubric` Grape resource + use case reading `Constants::ScorecardAiDefaultRubric`; entity; mount in `frontend_service/gpt_api.rb` (+ optionally gpt_service); OpenAPI. FE: `DefaultRubricViewer.vue`; store action; service+endpoint; "PROPOSED" + veto badges; mixpanel `default_rubric_viewed`/`default_rubric_load_failed`.	`S03/AC-1` 9 metrics read-only; `AC-2` PROPOSED note; `AC-3` veto flag on Groundedness+Policy; `ERR-1` load error+retry+log.	§2.4 r3 · §2.A · §4.C c5,c8

2. Technical Design

Detail 2.0 — Repo Reading Guide

Repo Map (slice this RFC touches)

flowchart LR
  subgraph FE["chatbot-fe (Nuxt 4)"]
    page["pages/settings/scorecard/*.vue (NEW)"]
    views["modules/settings/views/* (pattern: ai-assist.vue)"]
    store["store/scorecard/* (NEW, pattern: store/ai-assist)"]
    svc["common/services/main/v1/scorecard.ts (NEW)"]
    ep["common/services/main/endpoint.ts (+scorecard)"]
    flag["plugins/botSubscriptionFeature.ts ($hasSubscription)"]
  end
  subgraph BE["chatbot (Rails 7.1 / Grape)"]
    apipref["app/api/frontend_service/v1/gpt/omnichannel/scorecard_preferences.rb"]
    apicp["app/api/frontend_service/v1/gpt/scorecard_custom_parameter.rb"]
    apirub["app/api/frontend_service/v1/gpt/scorecard_ai_default_rubric.rb (NEW)"]
    uc["app/core/use_cases/api/frontend_service/v1/gpt/scorecard_*"]
    repo["app/core/repositories/gpt/scorecard_*"]
    ent["app/core/entities/frontend_services/gpt/scorecard_preference.rb"]
    auth["app/api/frontend_service/middlewares/auth.rb -> hub-service /users/me"]
  end
  db[("chatbot_gpt DB (Postgres)\nscorecard_preferences\nscorecard_custom_parameters")]
  mp["SendMixpanelEventWorker -> Mixpanel"]

  page --> store --> svc --> ep -->|"$apiMain /api"| apipref & apicp & apirub
  flag -.gates.-> page
  apipref & apicp & apirub --> auth
  apipref --> uc --> repo --> db
  apicp --> uc
  uc --> ent
  page -.fires.-> mp

Existing Code Anchors (read before writing)

#	Path	What to learn
1	`chatbot/app/api/frontend_service/v1/gpt/omnichannel/scorecard_preferences.rb`	Grape GET/PATCH shape, `set_role`, `Dry::Matcher::ResultMatcher`, mount target.
2	`chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_preference/patch.rb`	Dry contract + `rule(:passing_grade)` 1–99; how AI cols/rule are added.
3	`chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_preference/get.rb`	Default-fill pattern via `Entities::...::ScorecardPreference::DEFAULT_*`.
4	`chatbot/app/core/repositories/gpt/scorecard_preferences/upsert.rb`	Find-by-org upsert; where to set the new columns.
5	`chatbot/app/core/entities/frontend_services/gpt/scorecard_preference.rb`	`DEFAULT_PASSING_GRADE=75`, `DEFAULT_AUTO_SCORE=true`; add `DEFAULT_AI_*`.
6	`chatbot/app/api/frontend_service/v1/gpt/scorecard_custom_parameter.rb`	POST/PATCH params (no `prompt` today); where to add it.
7	`chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_custom_parameter/create.rb`	`register_macro(:validate_*_length)` pattern → model `validate_prompt_length`.
8	`chatbot/db/chatbot_gpt_migrate/20241113041150_create_scorecard_custom_parameters.rb`	Migrator dialect for the `chatbot_gpt` DB; `prompt` is `t.string`.
9	`chatbot/app/models/chatbot_gpt_record.rb`	`ChatbotGptRecord` base — migrations target `:chatbot_gpt` connection.
10	`chatbot-fe/modules/settings/views/ai-assist.vue` + `store/ai-assist/*`	Canonical settings page + Pinia store + Vuelidate + toast + `$hasSubscription` pattern.

Patterns to Follow

Concern	Reference file (opened)	Pattern
Grape endpoint + auth	`chatbot/app/api/frontend_service/v1/gpt/omnichannel/scorecard_preferences.rb`	`format :json`; `set_role`; `Dry::Matcher::ResultMatcher` success/failure.
Use case + validation	`.../scorecard_preference/patch.rb`; `.../scorecard_custom_parameter/create.rb`	`contract do params … rule … register_macro end`; `Success/Failure(build_*_params)`.
Repository upsert	`chatbot/app/core/repositories/gpt/scorecard_preferences/upsert.rb`	find-or-build → assign → `save!` → `Builders::...build`.
Entity defaults	`chatbot/app/core/entities/frontend_services/gpt/scorecard_preference.rb`	`dry-struct` attributes + `public_constant :DEFAULT_*`.
Grape response entity	`chatbot/app/api/frontend_service/v1/entities/gpt/scorecard_preference.rb`	`Grape::Entity` `expose` with documentation.
External LLM call (Phase 2 ref only)	`chatbot/app/core/use_cases/gpt/omnichannel/auto_agent_scoring.rb:160-179`	`OpenAI::Client.new(request_timeout: 240)`, `max_attempts = 2`, `Rollbar.error`.
Async analytics	`chatbot/app/workers/send_mixpanel_event_worker.rb` + `app/core/use_cases/system/receive_webhook.rb:76-89`	`SendMixpanelEventWorker.perform_async(org, event, props.as_json)`.
Feature flag (BE)	`chatbot/app/core/repositories/organization_features/find_feature.rb`; usage in `app/core/repositories/ai_knowledge_sources/search.rb`	`OrganizationFeatures::FindFeature.new(feature_code:, organization_id:).call_by_organization`.
Settings page (FE)	`chatbot-fe/modules/settings/views/ai-assist.vue`	`MpFormControl`/`MpInput`/`MpButton`; `useVuelidate`; `$toast`; `isFetch*` computed.
Pinia store (FE)	`chatbot-fe/store/ai-assist/{state,actions,getters,types}.ts`	`fetchStatus: idle/pending/resolved/rejected`; service via `mainService`.
API service (FE)	`chatbot-fe/common/services/main/v1/ai-assist.ts` + `common/services/main/endpoint.ts`	`$apiMain(endpoint, {method, body, signal})`; `AbortController`.
Feature flag (FE)	`chatbot-fe/plugins/botSubscriptionFeature.ts`	`$hasSubscription('code')` boolean.
Analytics (FE)	`chatbot-fe/common/contants/mixpanel-events.ts` + `ai-assist.vue:925`	`mixpanel.track(MIXPANEL_EVENTS.X, props)`.

Reading Order for the Agent

chatbot/AGENTS.md (§Workflow Commands + §API Specification Rules)
Anchor #1 (preference Grape) → #2 (patch UC) → #3 (get UC) → #4 (upsert) → #5 (entity)
Anchor #6 (custom-param Grape) → #7 (create UC macros)
Anchor #8 + #9 (chatbot_gpt migration dialect + base record)
chatbot/app/api/frontend_service/gpt_api.rb (FE-facing mount paths) and chatbot/app/api/gpt_service/api.rb (gpt-svc mount paths)
Anchor #10 (chatbot-fe settings page + store + service)
chatbot-fe/common/services/main/endpoint.ts, plugins/api/apiMain.ts, plugins/botSubscriptionFeature.ts

Existing-Endpoint Check (reuse / extend / new)

Endpoint	Surface(s)	Tag	Evidence
`PATCH /v1/gpt/omnichannel/scorecard_preferences` (+ `PATCH /v1/scorecards/preferences`)	frontend_service + gpt_service	extended	`gpt_api.rb:28`, `gpt_service/api.rb:21`; adding `is_ai_auto_score`/`ai_passing_grade`.
`GET` same path	both	extended	`scorecard_preferences.rb get '/'`; add AI fields to response.
`POST /v1/gpt/scorecard_custom_parameters` + `PATCH :id` (+ `/v1/scorecards/parameters/custom`)	both	extended	`gpt_api.rb:33`, `gpt_service/api.rb:24`; adding `prompt`.
`GET .../scorecard_ai_default_rubric`	frontend_service (+ gpt_service optional)	new-with-justification	No endpoint serves AI default metrics today (grep `scorecard` in `app/api` — only categories/parameters/custom/preferences). The 9 AI metrics are a new concept with no table; a static-config read endpoint is the single source of truth Phase 2 reuses, and the PRD defines `default_rubric_load_failed` (a fetch failure mode).

Source Verification

Claim	Evidence (file:line / identifier)
`scorecard_preferences` has `is_auto_score` (bool, default false) + `passing_grade` (float)	`chatbot/db/chatbot_gpt_schema.rb:454-465`; migration `db/chatbot_gpt_migrate/20240206095006_create_scorecard_preference.rb`
Human `passing_grade` validated 1–99	`chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_preference/patch.rb:18-22` `rule(:passing_grade)`
Preference upsert keyed by `organization_id`	`chatbot/app/core/repositories/gpt/scorecard_preferences/upsert.rb:13-25`; model `validates_uniqueness_of :organization_id`
Defaults `DEFAULT_PASSING_GRADE=75`, `DEFAULT_AUTO_SCORE=true`	`chatbot/app/core/entities/frontend_services/gpt/scorecard_preference.rb:7-9`
`scorecard_custom_parameters.prompt` exists as `string`, unused by API	`chatbot/db/chatbot_gpt_schema.rb:418-441` (`t.string "prompt"`); custom-param Grape params omit `prompt` (`app/api/frontend_service/v1/gpt/scorecard_custom_parameter.rb` POST/PATCH params)
Length validation macro pattern	`chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_custom_parameter/create.rb:39-59`
`auto_agent_scoring.rb` scores the human agent on room resolve	`chatbot/app/core/use_cases/gpt/omnichannel/auto_agent_scoring.rb:6,76-79`; trigger `app/core/use_cases/api/internal_service/v1/webhook/room_resolve_interactions.rb:48-63`; worker `app/workers/auto_agent_scoring_worker.rb`
OpenAI client pattern (Phase 2 ref)	`auto_agent_scoring.rb:160-179` `OpenAI::Client.new(request_timeout: 240)`, `max_attempts = 2`
Role enum `{owner,admin,supervisor,agent,member}` (single role)	hub-core `app/core/domains/models/user.rb:38-44`; surfaced via hub-service `/api/core/v1/users/me`; consumed `chatbot/app/api/frontend_service/middlewares/auth.rb:15-24` → `env['user']` → `current_user['role']`; checked `app/api/frontend_service/helpers/authorization_helpers.rb:6-10`
Existing scorecard endpoints gate `owner/admin/supervisor`	`scorecard_preferences.rb` (get/patch) + `scorecard_custom_parameter.rb` (post/patch/delete) `set_role(%w[owner admin supervisor])`
Feature-flag mechanism (BE)	`chatbot/app/core/repositories/organization_features/find_feature.rb:4-22`; usage `app/core/repositories/ai_knowledge_sources/search.rb`
paper_trail already on both models	`chatbot/app/models/chatbot_gpt/scorecard_preference.rb:5`; `scorecard_custom_parameter.rb:6`
Mixpanel async worker	`chatbot/app/workers/send_mixpanel_event_worker.rb`; `config/initializers/mixpanel.rb`
OpenAPI mandatory workflow	`chatbot/AGENTS.md:235-247`
FE has no scorecard code today	grep `scorecard\|passing_grade\|is_auto_score` in `chatbot-fe/{common,store,pages,modules}` → 0 hits
FE settings/store/service/flag patterns	`chatbot-fe/modules/settings/views/ai-assist.vue`; `store/ai-assist/*`; `common/services/main/v1/ai-assist.ts`; `common/services/main/endpoint.ts:123-157`; `plugins/botSubscriptionFeature.ts:23-40`; `common/contants/mixpanel-events.ts`
chatbot_gpt DB connection	`chatbot/app/models/chatbot_gpt_record.rb` `connects_to database: { writing: :chatbot_gpt, reading: :chatbot_gpt }`

Detail 2.1 — Infrastructure Topology

flowchart TB
  user(["QA Lead / Supervisor / Owner / Admin (web)"])
  lb["LB / Ingress"]
  fe["chatbot-fe pods (Nuxt 4 SSR/SPA)"]
  api["chatbot pods (Puma · Grape FrontendService)"]
  hub["hub-service /api/core/v1/users/me (auth)"]
  pg[("Postgres — chatbot_gpt DB\n(writing+reading)")]
  redis[("Redis")]
  sidekiq["Sidekiq workers"]
  mp["Mixpanel (external)"]
  oai["OpenAI (external) — Phase 2 only"]

  user --> lb --> fe -->|"$apiMain /api (Bearer + X-Auth-Token)"| lb
  lb --> api
  api -->|"validate token"| hub
  api -->|"read/write settings + rubric"| pg
  fe -.->|"track events"| mp
  api -->|"enqueue analytics"| redis --> sidekiq --> mp
  sidekiq -. "Phase 2 AutoAgentScoring" .-> oai

Per-service responsibilities

Service	Use cases (this RFC)	Internal calls (owner)	External APIs
chatbot-fe	Render settings + rubric editor + viewer; client validation; fire events	chatbot API (BOT)	Mixpanel (browser)
chatbot (Grape)	Persist AI preference; persist custom-param rubric; serve default rubric; authz; feature-gate	hub-service `/users/me` (Core team); Mixpanel worker	— (Phase 1); OpenAI in Phase 2
hub-service / hub-core	Token validation, issues single `role`	—	—
chatbot_gpt DB	Store `scorecard_preferences`, `scorecard_custom_parameters` (+ paper_trail versions)	—	—

Detail 2.1a — Sequence Diagrams (happy + failure paths)

S01 — Save AI preference (authz + validation failure)

sequenceDiagram
  participant U as User
  participant FE as chatbot-fe
  participant LB as LB
  participant API as chatbot (Grape)
  participant HUB as hub-service /users/me
  participant DB as chatbot_gpt (Postgres)
  participant MP as Mixpanel (async)
  U->>FE: toggle AI on, set ai_passing_grade
  FE->>LB: PATCH scorecard_preferences (Bearer+X-Auth)
  LB->>API: forward
  API->>HUB: validate token
  HUB-->>API: {role: supervisor, organization_id}
  alt role not in {owner,admin,supervisor}
    API-->>FE: 403 Permission denied
  else authorized
    API->>API: Dry coerce :float + rule ai_passing_grade in 0..100
    alt non-numeric or out of range
      API-->>FE: 422 "AI passing grade only between 0 - 100"
      FE-->>U: inline error, nothing saved
    else valid
      API->>DB: upsert by organization_id (save!)
      DB-->>API: ok
      API-->>FE: 200 {is_ai_auto_score, ai_passing_grade}
      FE-)MP: track scorecard_settings_updated
      FE-->>U: "Change saved"
    end
  end
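
The S01 branches above can be sketched in plain Ruby as a single decision function. This is an illustration under stated assumptions, not the Grape/Dry implementation: `patch_ai_preference`, `PREFERENCES`, and the hash shapes are hypothetical, and an in-memory Hash stands in for the `chatbot_gpt` table.

```ruby
ALLOWED_ROLES = %w[owner admin supervisor].freeze

# In-memory stand-in for scorecard_preferences, keyed by organization_id.
PREFERENCES = {}

# `user` stands in for the hub-service /users/me payload.
# Returns [status, body], mirroring the 403 / 422 / 200 branches of S01.
def patch_ai_preference(user, params)
  return [403, { error: 'Permission denied' }] unless ALLOWED_ROLES.include?(user[:role])

  grade = nil
  if params.key?(:ai_passing_grade)
    # Float(..., exception: false) returns nil for non-numeric input.
    grade = Float(params[:ai_passing_grade], exception: false)
    unless grade && (0..100).cover?(grade)
      return [422, { error: 'AI passing grade only between 0 - 100' }]
    end
  end

  # Upsert by the organization_id from the validated token; nothing is
  # written before validation passes, so a 422 leaves no partial state.
  row = (PREFERENCES[user[:organization_id]] ||= { is_ai_auto_score: false, ai_passing_grade: nil })
  row[:is_ai_auto_score] = params[:is_ai_auto_score] if params.key?(:is_ai_auto_score)
  row[:ai_passing_grade] = grade unless grade.nil?
  [200, row.dup]
end
```

Ordering the checks as authorization, then validation, then write matches the diagram and keeps rejected requests side-effect free.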

S02 — Save custom param + rubric (happy + DB failure)

sequenceDiagram
  participant FE as chatbot-fe
  participant API as chatbot (Grape)
  participant DB as chatbot_gpt
  participant MP as Mixpanel (async)
  FE->>API: POST scorecard_custom_parameters {name, prompt}
  API->>API: validate_prompt_length (<=4000) + name rules
  alt prompt > 4000
    API-->>FE: 422 "AI judging rubric cannot exceed 4000 characters."
  else valid
    API->>DB: create (save!) — company_id from token
    alt save! raises
      API->>API: Rollbar.error(e)
      API-->>FE: 500 "Something went wrong"
      FE-)MP: track scorecard_custom_param_save_failed
      FE-->>FE: $toast error + Retry (no partial state)
    else ok
      API-->>FE: 200 {prompt, auto_scorable: prompt.present?}
      FE-)MP: track scorecard_custom_param_saved {has_rubric}
    end
  end

S03 — Load default rubric (happy + fetch failure)

sequenceDiagram
  participant FE as chatbot-fe
  participant API as chatbot (Grape)
  FE->>API: GET scorecard_ai_default_rubric
  alt success
    API-->>FE: 200 {status:PROPOSED, metrics:[9 + veto]}
    FE-->>FE: render list + veto badges
  else 500 / network
    FE-->>FE: "Couldn't load the default rubric." + Retry
    FE-)FE: track default_rubric_load_failed
  end

Detail 2.1b — Rubric Gate Branch

flowchart TD
  A[Save custom parameter] --> B{prompt non-empty?}
  B -->|Yes| C[auto_scorable = true]
  B -->|No| D[auto_scorable = false — manual-only]
  C --> E[Persist + return auto_scorable]
  D --> E

Detail 2.2 — Technical Decisions (ADR-format)

ADR-1 — Store AI scoring as new columns, not overloaded human columns

Context. scorecard_preferences.is_auto_score/passing_grade already drive auto_agent_scoring.rb (human). PRD requires AI on-switch + AI threshold "with the existing human auto-score untouched."
Options.
- A. New columns is_ai_auto_score + ai_passing_grade — clean separation; human path provably unchanged; Phase 2 reads AI cols explicitly. Con: one migration + a few columns.
- B. Overload is_auto_score/passing_grade for both lenses — Con: entangles human and AI, high regression risk on a live path; a single threshold can't differ per lens.
- C. Reuse is_auto_score switch, add only ai_passing_grade — Con: can't enable AI without enabling human auto-scoring and vice-versa.
Decision. Option A (confirmed by DRI).
Rationale. Strongest guarantee that human auto-scoring is byte-for-byte unchanged; matches PRD's two-lens intent.
Consequences. Migration adds is_ai_auto_score (bool, default false) + ai_passing_grade (float, nullable). Contracts/entities/upsert extended. Phase 2 reads the AI columns.
Reversibility. High — drop the two columns; no human-path coupling.

ADR-2 — Widen `scorecard_custom_parameters.prompt` `string → text`

Context. A real judging rubric (PRD ~4,000 chars) does not fit a single-line string.
Options. A. change_column … :text (Postgres in-place, no rewrite for varchar→text). B. Add a new rubric text column and dual-write — Con: duplicate field, migration of an unused column for no benefit.
Decision. Option A. change_column :scorecard_custom_parameters, :prompt, :text.
Rationale. prompt is already the PRD's named field and is currently unused, so the widen is non-destructive (varchar→text widening preserves data).
Consequences. Migration in db/chatbot_gpt_migrate/; chatbot_gpt_schema.rb regen.
Reversibility. Low/risky (text→string truncates) — treat as forward-only; rollback is the flag, not the column type.

ADR-3 — `auto_scorable` is derived, not stored

Context. PRD: "non-empty rubric marks the param auto-scorable."
Options. A. Compute auto_scorable = prompt.present? at read time. B. Store a boolean column kept in sync — Con: drift risk, redundant with the source of truth.
Decision. Option A — expose auto_scorable in the response entity/builder.
Rationale. Single source of truth (prompt); no sync bug; Phase 2 re-derives the same way.
Consequences. Response entity gains a computed auto_scorable field; no schema change.
Reversibility. High.
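
A small sketch of the derivation in ADR-3, written in plain Ruby so it runs without Rails. `auto_scorable?` approximates ActiveSupport's `prompt.present?` (it treats nil, empty, and ASCII-whitespace-only strings as "no rubric"); `custom_parameter_response` and its hash keys are hypothetical stand-ins for the real entity/builder.

```ruby
# Plain-Ruby approximation of `prompt.present?`.
def auto_scorable?(prompt)
  !prompt.to_s.strip.empty?
end

# Hypothetical response builder: auto_scorable is computed on every read
# and never persisted, so it cannot drift from `prompt`.
def custom_parameter_response(record)
  {
    name: record[:name],
    prompt: record[:prompt],
    auto_scorable: auto_scorable?(record[:prompt])
  }
end
```

Because the flag is a pure function of `prompt`, clearing the rubric flips the parameter back to manual-only with no extra write.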

ADR-4 — Default rubric via a new read-only endpoint from static config

Context. The 9 AI metrics (PROPOSED, DSAI-owned) have no table; PRD defines a default_rubric_viewed/default_rubric_load_failed fetch.
Options.
- A. New read-only endpoint serving a Ruby constant/YAML (Constants::ScorecardAiDefaultRubric).
- B. FE static constant — Con: no real load-failure mode; duplicates the list Phase 2 needs server-side.
- C. Seed the 9 metrics into scorecard_parameters/categories — Con: mixes AI metrics into human-scorecard tables; risks human auto-scorer picking them up.
Decision. Option A (confirmed by DRI).
Rationale. Server is the single source of truth; Phase 2 scoring reads the same constant; honors the PRD's fetch + failure event; no schema entanglement.
Consequences. New Grape resource + use case + entity; content carries status: PROPOSED.
Reversibility. High — delete endpoint + constant.
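
One possible shape for the static config in ADR-4, sketched in plain Ruby. The module name, the keys (`:code`, `:name`, `:veto`), and the codes are assumptions for illustration. Only the two metrics this RFC names (Groundedness and Policy, both veto) are listed; the other seven come from PRD App. A and are deliberately not guessed here.

```ruby
# Illustrative only; not the real Constants::ScorecardAiDefaultRubric.
module ScorecardAiDefaultRubricSketch
  STATUS = 'PROPOSED'

  METRICS = [
    { code: 'groundedness', name: 'Groundedness', veto: true },
    { code: 'policy', name: 'Policy', veto: true }
    # ... remaining seven metrics from PRD App. A
  ].map(&:freeze).freeze

  # Payload shape follows the S03 diagram: {status:, metrics: [...]}.
  def self.payload
    { status: STATUS, metrics: METRICS }
  end
end
```

Freezing both the array and its entries keeps the constant read-only at runtime, which suits a value that the endpoint and Phase 2 are both meant to share.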

ADR-5 — Feature gate on `ai_qa_unified_scorecard`

Context. Surfaces must ship dark until Phase 2; Pro+Ent only.
Options. A. Reuse org-feature mechanism (BE OrganizationFeatures::FindFeature, FE $hasSubscription). B. New bespoke flag system — Con: reinvents an existing pattern.
Decision. Option A. BE guards the three surfaces (return 404/empty or feature_enabled:false); FE hides routes/controls via $hasSubscription('ai_qa_unified_scorecard').
Rationale. Matches existing AI-assist gating; plan-gating piggybacks on provisioning (A2).
Consequences. Feature row must be provisioned per org; default OFF.
Reversibility. High — toggle the feature off.
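
A hedged sketch of the backend guard in ADR-5. The real lookup goes through `OrganizationFeatures::FindFeature`; here a plain Hash of organization id to feature codes stands in for it, and `feature_enabled?` / `with_feature_gate` are hypothetical helper names. The sketch picks the 404 variant, which is only one of the responses the decision leaves open.

```ruby
FEATURE_CODE = 'ai_qa_unified_scorecard'

# `enabled_features` is a stand-in for the org-feature lookup:
# a Hash of organization_id => array of feature codes.
def feature_enabled?(enabled_features, organization_id)
  enabled_features.fetch(organization_id, []).include?(FEATURE_CODE)
end

# Runs the block only for flagged orgs; every other org gets a 404.
# Orgs with no feature row are treated as OFF (the documented default).
def with_feature_gate(enabled_features, organization_id)
  return [404, { error: 'Not found' }] unless feature_enabled?(enabled_features, organization_id)

  yield
end
```

Wrapping each of the three surfaces in the same guard keeps the default-OFF behavior in one place.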

ADR-6 — Synchronous writes; async analytics

Context. Save P95 ≤ 500ms; events must not block saves.
Options. None considered for the write itself: a single-row DB write is already under budget, so it stays synchronous. Async analytics is the existing pattern.
Decision. Settings/rubric writes are synchronous single-row upserts (well under 500ms); Mixpanel events are enqueued via SendMixpanelEventWorker.perform_async.
Consequences. Event delivery is best-effort and never fails the save.
Reversibility. High.
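
The "never fails the save" consequence can be sketched as follows. `store` and `enqueue` are hypothetical callables standing in for the repository upsert and `SendMixpanelEventWorker.perform_async`; the event name is borrowed from the S01 story purely as an example.

```ruby
# Synchronous write first, then a best-effort analytics enqueue.
def save_with_analytics(store:, enqueue:, organization_id:, attrs:)
  saved = store.call(organization_id, attrs) # errors here propagate to the caller

  begin
    enqueue.call(organization_id, 'scorecard_settings_updated', attrs)
  rescue StandardError => e
    # An analytics outage is logged and swallowed, never re-raised.
    warn "analytics enqueue failed: #{e.message}"
  end

  saved
end
```

Only the enqueue is wrapped in `rescue`, so a database failure still surfaces as an error while an analytics failure does not.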

ADR-7 — Authorization reuses the existing role enum

Context. No qa_lead/bot_admin role exists; single-role enum {owner,admin,supervisor,agent,member} (hub-core user.rb:38-44).
Options. A. Keep set_role(%w[owner admin supervisor]); map QA Lead→supervisor, Bot/AI Admin→owner/admin. B. Introduce new roles. Con: cross-cutting change to hub-core + hub-service token issuance, far outside this initiative.
Decision. Option A (confirmed by DRI).
Rationale. Matches every existing scorecard endpoint; agent/member excluded (= "end CS agents: no access").
Consequences. Read + write on all three surfaces gate owner/admin/supervisor.
Reversibility. High; revisit if a QA role lands platform-wide.

ADR-8 — `ai_passing_grade` range 0–100 (diverges from human 1–99)

Context. PRD §6/§9 say AI threshold 0–100; existing human rule is 1–99 (patch.rb:18-22).
Options. A. Validate the new field 0–100 inclusive per PRD. B. Match human 1–99 for consistency — Con: contradicts PRD's stated bar (0 and 100 both meaningful).
Decision. Option A: rule(:ai_passing_grade) { key.failure('AI passing grade only between 0 - 100') unless (0..100).cover?(value) }.
Rationale. New field, follow the PRD spec; 0 ("any pass") and 100 ("perfect only") are legitimate.
Consequences. Two different valid ranges in one table — documented; surfaced as a minor follow-up to align (§5 Open Q#4).
Reversibility. High — change the rule bound.

Minimum-coverage checklist

Storage — chatbot_gpt Postgres; new cols + widened prompt (ADR-1,2).
Sync vs async — sync writes, async analytics (ADR-6).
Caching — n/a — single-row reads, no cache; default rubric is a static constant.
Third-party — Mixpanel via existing worker (ADR-6); OpenAI is Phase 2.
Consistency — strong (single-row upsert, unique per org).
Multi-tenancy — org-scoped by organization_id (preference) / company_id (custom param) from the validated token; never client-supplied (§3 Security).
Reuse vs new — 2 extended endpoints + 1 new (ADR-4, Existing-Endpoint Check).

Detail 2.3 — Database Model

erDiagram
  SCORECARD_PREFERENCES {
    bigint id PK
    string organization_id UK "unique where deleted_at IS NULL"
    boolean is_auto_score "human (existing), default false"
    float passing_grade "human (existing), 1-99"
    boolean is_ai_auto_score "NEW, default false"
    float ai_passing_grade "NEW, nullable, 0-100"
    string company_id
    datetime deleted_at
  }
  SCORECARD_CUSTOM_PARAMETERS {
    uuid id PK
    string name
    string code
    string description
    text prompt "WIDENED string->text (AI judging rubric)"
    string company_id "UK [code, company_id] where deleted_at IS NULL"
    datetime deleted_at
  }
  SCORECARD_CATEGORIES_PARAMETERS }o--|| SCORECARD_CUSTOM_PARAMETERS : references

DDL (Rails DSL, chatbot_gpt connection — pattern: db/chatbot_gpt_migrate/20241113041150_*):

# db/chatbot_gpt_migrate/<ts>_add_ai_scoring_to_scorecard_preferences.rb
class AddAiScoringToScorecardPreferences < ActiveRecord::Migration[7.1]
  def change
    add_column :scorecard_preferences, :is_ai_auto_score, :boolean, null: false, default: false
    add_column :scorecard_preferences, :ai_passing_grade, :float
    add_index  :scorecard_preferences, :is_ai_auto_score
  end
end

# db/chatbot_gpt_migrate/<ts+1>_widen_scorecard_custom_parameter_prompt.rb
class WidenScorecardCustomParameterPrompt < ActiveRecord::Migration[7.1]
  def up
    change_column :scorecard_custom_parameters, :prompt, :text
  end

  def down
    change_column :scorecard_custom_parameters, :prompt, :string  # WARNING: truncates >255
  end
end

No data backfill (PRD §11). New AI columns default to "off/unset"; existing rows unaffected. Regenerate db/chatbot_gpt_schema.rb after migrating.

Per-status lifecycle: n/a — no status enum introduced (no new state machine; acts_as_paranoid soft-delete + paper_trail versioning already exist on both models and are unchanged).

State Surface Contract:

Entity	Surfaced to	Field(s)	Visibility	Audit
`scorecard_preferences`	settings page	`is_ai_auto_score`, `ai_passing_grade` (+ existing human)	owner/admin/supervisor; flag on	paper_trail (existing)
`scorecard_custom_parameters`	editor + list	`prompt`, derived `auto_scorable`	owner/admin/supervisor; flag on	paper_trail (existing)
default rubric (static)	viewer	9 metrics + veto + PROPOSED	owner/admin/supervisor; flag on	n/a (read-only constant)

Detail 2.4 — APIs (Outbound the FE consumes)

Base: chatbot frontend_service surface, called by FE $apiMain at /api (gpt_api.rb:28,33). Mirror the same Grape classes on the gpt_service surface (/v1/scorecards/..., gpt_service/api.rb:21,24) for parity. Auth: Bearer access-token + X-Auth-Token (validated via middlewares/auth.rb → hub-service /users/me).

Row 1 — extended — Preference (AI fields)

GET  /api/v1/gpt/omnichannel/scorecard_preferences      # role: owner|admin|supervisor; flag-gated
PATCH /api/v1/gpt/omnichannel/scorecard_preferences

Request (PATCH):

{
  "passing_grade": 75,          // existing human (required by current contract)
  "is_auto_score": true,        // existing human (required)
  "is_ai_auto_score": true,     // NEW (optional; default false)
  "ai_passing_grade": 80        // NEW (optional; validated 0..100 when present)
}

Response (GET/PATCH 200):

{
  "data": {
    "is_auto_score": true, "passing_grade": 75,
    "is_ai_auto_score": true, "ai_passing_grade": 80
  },
  "message": "OK"
}

Errors: 422 ai_passing_grade non-numeric → coercion failure; 422 outside 0–100 → "AI passing grade only between 0 - 100 are allowed"; 403 role; 401 auth; 500 save fail.

New AI fields are optional in the contract so existing callers sending only human fields keep working (backward compat). Contract: optional(:is_ai_auto_score).maybe(:bool), optional(:ai_passing_grade).maybe(:float) — Dry coerces type before the range rule(:ai_passing_grade), so a non-numeric value 422s instead of raising. On read, absent ai_passing_grade → DEFAULT_AI_PASSING_GRADE (75), is_ai_auto_score → false.
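
The coerce-then-range order and the read-time default described above could be mirrored on the FE roughly as follows. This is a minimal sketch, not the actual Dry contract or Vuelidate rule: the function names and the wording of the non-numeric message are assumptions for illustration; only the 0–100 bound, the default of 75, and the range message come from this RFC.

```typescript
// Hypothetical FE-side mirror of the BE contract; names are illustrative only.
const DEFAULT_AI_PASSING_GRADE = 75; // value from §4.A

type GradeResult =
  | { ok: true; value: number | null }
  | { ok: false; error: string };

// Step 1: coerce. Anything that is not a number or a numeric string becomes NaN.
function coerceToNumber(input: unknown): number {
  if (typeof input === "number") return input;
  if (typeof input === "string" && input.trim() !== "") return Number(input);
  return NaN;
}

// Step 2: range-check only after coercion succeeded, so bad input is a
// validation failure (422 on the BE) rather than a thrown exception.
function validateAiPassingGrade(input: unknown): GradeResult {
  if (input === undefined || input === null) {
    return { ok: true, value: null }; // field is optional
  }
  const n = coerceToNumber(input);
  if (!Number.isFinite(n)) {
    // Message text is an assumption; the RFC only says "coercion failure".
    return { ok: false, error: "AI passing grade must be a number" };
  }
  if (n < 0 || n > 100) {
    return { ok: false, error: "AI passing grade only between 0 - 100 are allowed" };
  }
  return { ok: true, value: n };
}

// Read path: an unset threshold falls back to the default; 0 is a real value.
function effectiveAiPassingGrade(stored: number | null | undefined): number {
  return stored ?? DEFAULT_AI_PASSING_GRADE;
}
```

Using `??` rather than `||` matters here because 0 ("any pass") is a legitimate threshold under ADR-8 and must not be replaced by the default.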

Row 2 — extended — Custom parameter (rubric)

POST  /api/v1/gpt/scorecard_custom_parameters           # role: owner|admin|supervisor; flag-gated
PATCH /api/v1/gpt/scorecard_custom_parameters/:id

Request adds:

{ "name": "BANT capture", "prompt": "Score how completely … 0-100 + which were missed." }

Response 200 adds:

{ "data": { "id": "<uuid>", "name": "BANT capture", "prompt": "…", "auto_scorable": true }, "message": "…" }

Errors: 422 prompt length > 4000 → "AI judging rubric cannot exceed 4000 characters."; existing 422 name rules; 403/401/500 as today.

Create vs update / duplicate handling (REV-5): POST creates, PATCH :id updates — these are not an upsert, so adding prompt does not change create semantics. Duplicate names are already rejected by the existing Repositories::Gpt::ScorecardCustomParameter::NameUniquenessValidator#validate_create (create.rb → 422 "The name field is already exist or name cannot be the same as default parameter"). The new prompt field is orthogonal to uniqueness; no new collision surface is introduced.
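
The two rubric rules for Row 2 (the 4,000-character cap and the derived auto_scorable flag) can be sketched as below. The helper names are hypothetical. Two assumptions are baked in: a whitespace-only prompt counts as empty (matching the `prompt.present?` condition in §2.E), and length is counted in characters rather than UTF-16 code units so the FE counter agrees with a character-based server check.

```typescript
const PROMPT_MAX_LENGTH = 4000; // value from §4.A

// Derived flag: a rubric is auto-scorable only when the prompt has content.
// Assumption: whitespace-only is treated as empty, like Rails `present?`.
function deriveAutoScorable(prompt: string | null | undefined): boolean {
  return (prompt ?? "").trim().length > 0;
}

// Count code points, not UTF-16 units, so an emoji counts as one character.
function promptLength(prompt: string): number {
  return Array.from(prompt).length;
}

// Returns the catalog error message when the cap is exceeded, else null.
function validatePromptLength(prompt: string): string | null {
  return promptLength(prompt) > PROMPT_MAX_LENGTH
    ? "AI judging rubric cannot exceed 4000 characters."
    : null;
}
```

If the BE ends up counting bytes or UTF-16 units instead, the FE counter should follow the BE so the two never disagree at the boundary.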

Row 3 — new-with-justification — Default AI rubric (read-only)

GET /api/v1/gpt/scorecard_ai_default_rubric             # role: owner|admin|supervisor; flag-gated

Response 200:

{
  "data": {
    "status": "PROPOSED",
    "group": "Qontak AI Quality (default)",
    "metrics": [
      { "code": "groundedness", "name": "Groundedness / factual accuracy", "description": "Claims backed by KB sources or customer data; no invented product facts", "veto": true },
      { "code": "resolution",   "name": "Resolution / task completion",    "description": "Did it resolve the goal (skill_completed signal)", "veto": false },
      { "code": "relevance",    "name": "Relevance / intent understanding", "description": "Addressed the real intent, not a different question", "veto": false },
      { "code": "policy",       "name": "Policy & safety adherence",        "description": "Stayed within 'what to avoid'; no unsafe content / PII leak", "veto": true },
      { "code": "tone",         "name": "Tone & brand voice",              "description": "Matched configured tone_of_voice; courteous", "veto": false },
      { "code": "language",     "name": "Language quality (Bahasa)",       "description": "Fluent target language; no broken/mixed language", "veto": false },
      { "code": "handoff",      "name": "Handoff appropriateness",         "description": "No false handover (Pattern A); no missed escalation", "veto": false },
      { "code": "tool",         "name": "Tool / action correctness",       "description": "Right action, right params, not skipped (Pattern B)", "veto": false },
      { "code": "efficiency",   "name": "Conversation efficiency",         "description": "No loops / re-asking; resolved within turn budget", "veto": false }
    ]
  },
  "message": "OK"
}

Errors: 500 → FE shows default_rubric_load_failed; 403/401.
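
As an illustration of how a consumer might read the Row 3 payload, the sketch below extracts the veto metrics and sanity-checks the metric count. The type and function names are hypothetical; the nine codes and their veto flags are copied from the response above, with name and description left out for brevity.

```typescript
// Narrow view of a rubric metric: only the fields this sketch needs.
interface RubricMetricFlags {
  code: string;
  veto: boolean;
}

// Codes and veto flags as listed in the Row 3 response body.
const DEFAULT_RUBRIC_FLAGS: readonly RubricMetricFlags[] = [
  { code: "groundedness", veto: true },
  { code: "resolution", veto: false },
  { code: "relevance", veto: false },
  { code: "policy", veto: true },
  { code: "tone", veto: false },
  { code: "language", veto: false },
  { code: "handoff", veto: false },
  { code: "tool", veto: false },
  { code: "efficiency", veto: false },
];

// Metrics the viewer should badge as veto.
function vetoCodes(metrics: readonly RubricMetricFlags[]): string[] {
  return metrics.filter((m) => m.veto).map((m) => m.code);
}

// Guard against a truncated or duplicated list: nine unique codes expected.
function isCompleteDefaultRubric(metrics: readonly RubricMetricFlags[]): boolean {
  const uniqueCodes = new Set(metrics.map((m) => m.code));
  return metrics.length === 9 && uniqueCodes.size === 9;
}
```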

APIs (Inbound — other services → us): n/a — Phase 1 adds no inbound webhook (the existing room-resolve webhook is unchanged).

Detail 2.A — UI Contract (FE)

Design status: Figma Pending (PRD). Interim spec = PRD Appendix B Stitch prompts. Components use Pixel3 (Mp*). New page under pages/settings/scorecard/; logic in modules/settings/views/ mirroring ai-assist.vue.

Component	File (new)	Purpose	Key Pixel3 elements	Backing endpoint
ScorecardSettingsPage	`pages/settings/scorecard/index.vue`	Container; flag guard	layout + `MpTabs`/sections	preference + default rubric
AutoScoreToggle	`modules/settings/views/scorecard/auto-score-toggle.vue`	AI on-switch + `ai_passing_grade` (0–100)	`MpSwitch`/`MpFormControl`/`MpInput`+`MpFormErrorMessage`/`MpButton(is-loading)`	`PATCH .../scorecard_preferences`
DefaultRubricViewer	`modules/settings/views/scorecard/default-rubric-viewer.vue`	Read-only 9 metrics + 🛑 veto + PROPOSED note	`MpText`/`MpBadge`/skeleton	`GET .../scorecard_ai_default_rubric`
CustomParamEditor	`pages/settings/scorecard/custom-parameters.vue` + `modules/settings/views/scorecard/custom-param-editor.vue`	Add/edit param + rubric textarea + length counter + auto-scorable chip	`MpInput`/`MpTextarea`/`MpBadge`/`MpButton`	`POST/PATCH .../scorecard_custom_parameters`

Design ↔ Code Mapping: n/a — Figma pending; tokens follow the existing settings shell (ai-assist.vue). Any deviation re-checked once frames land (§5 Open Q#1).

Detail 2.B — Data-Fetching Strategy (FE)

New Pinia store chatbot-fe/store/scorecard/{state,actions,getters,types,index}.ts, mirroring store/ai-assist (fetchStatus: idle|pending|resolved|rejected).
New service chatbot-fe/common/services/main/v1/scorecard.ts using $apiMain + AbortController (pattern: ai-assist.ts:151-184). Endpoints added to common/services/main/endpoint.ts:

scorecard: {
  preference: { get: "/v1/gpt/omnichannel/scorecard_preferences", update: "/v1/gpt/omnichannel/scorecard_preferences" },
  customParam: { create: "/v1/gpt/scorecard_custom_parameters", update: "/v1/gpt/scorecard_custom_parameters", list: "/v1/gpt/scorecard_custom_parameters" },
  defaultRubric: { get: "/v1/gpt/scorecard_ai_default_rubric" },
}

Fetch on page mount; optimistic UI not used (single Save action), matching ai-assist.vue.

Casing convention (REV-6): the BE returns snake_case keys; the FE consumes them directly without transformation, matching the existing pattern (e.g. store/ai-assist reads state.reply_limit straight off the API). Do not introduce a camelCase mapping layer for these endpoints — keep the snake_case field names end-to-end so the contract stays 1:1.

Typed contracts (REV-1) — store/scorecard/types.ts + common/services/main/v1/scorecard.ts:

// API request/response shapes (snake_case, matching BE Grape entities)
export interface ScorecardPreference {
  is_auto_score: boolean        // human (existing)
  passing_grade: number         // human (existing, 1–99)
  is_ai_auto_score: boolean     // NEW
  ai_passing_grade: number | null // NEW (0–100; null → default 75 on read)
}
export interface CustomParam {
  id: string
  name: string
  prompt: string                // "" when manual-only
  auto_scorable: boolean        // derived = prompt non-empty
}
export interface DefaultRubricMetric {
  code: string; name: string; description: string; veto: boolean
}
export interface DefaultRubric {
  status: "PROPOSED"; group: string; metrics: DefaultRubricMetric[]
}
// Pinia store slice (mirrors store/ai-assist fetchStatus pattern)
type FetchStatus = "idle" | "pending" | "resolved" | "rejected"
export interface ScorecardState {
  preference: { data?: ScorecardPreference; fetchStatus: FetchStatus }
  preferenceUpdate: { fetchStatus: FetchStatus }
  customParams: { data?: CustomParam[]; fetchStatus: FetchStatus }
  customParamSave: { fetchStatus: FetchStatus }
  defaultRubric: { data?: DefaultRubric; fetchStatus: FetchStatus }
}

Component props/emits types are deferred to implementation, inferred from the ai-assist.vue family (RFC §5 #7) — low risk, single owning module.

Detail 2.C — UI State Matrix

stateDiagram-v2
  [*] --> Loading: Open Scorecard settings
  Loading --> Empty: No custom params yet
  Loading --> Success: Saved config loaded
  Loading --> Error: Load / save fails
  Error --> Loading: Retry
  Empty --> Success: Add first custom param
  Success --> [*]: Config saved

State	AutoScoreToggle	CustomParamEditor	DefaultRubricViewer
Loading	fields disabled + spinner	textarea disabled + spinner	skeleton list
Empty	defaults (off / 75)	"No custom parameters…" + add hint	n/a — 9 defaults always exist
Error	`$toast` error + Retry; log `scorecard_settings_save_failed`	`$toast` + Retry; log `scorecard_custom_param_save_failed`	"Couldn't load the default rubric." + Retry; log `default_rubric_load_failed`
Success	"Change saved"	"Saved — will be auto-scored when scoring ships"; chip lit if rubric present	9 metrics listed, veto badges
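
The state diagram above can be expressed as a small transition table, which keeps component logic and tests aligned with the matrix. This is a sketch under assumptions: the event names are invented for illustration, and ignoring an event that has no edge in the diagram is a design choice of the sketch, not something the RFC specifies.

```typescript
type UiState = "Loading" | "Empty" | "Success" | "Error";

// Hypothetical event names; the diagram only labels the edges in prose.
type UiEvent =
  | "loaded_with_config"
  | "loaded_without_params"
  | "failed"
  | "retry"
  | "param_added";

// One entry per edge in the stateDiagram; states with no outgoing edge are empty.
const TRANSITIONS: Record<UiState, Partial<Record<UiEvent, UiState>>> = {
  Loading: {
    loaded_with_config: "Success",
    loaded_without_params: "Empty",
    failed: "Error",
  },
  Empty: { param_added: "Success" },
  Error: { retry: "Loading" },
  Success: {},
};

// Events without a matching edge leave the state unchanged (assumption).
function nextUiState(state: UiState, event: UiEvent): UiState {
  return TRANSITIONS[state][event] ?? state;
}
```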

Detail 2.D — Scope Boundaries

In scope	Out of scope
AI cols + `prompt` widen; 3 endpoints; 4 FE components + store/service; flag gating; analytics events; OpenAPI	Any scoring/computation; in-room panel; report; gate; the room-resolve `is_custom_parameter` skip guard; new roles; plan-tier code; i18n introduction

Detail 2.E — Branch & Skip Catalog

Branch / skip	Condition	Behavior	Owner
Rubric auto-scorable gate	`prompt.present?`	non-empty → `auto_scorable:true`; empty → manual-only (`false`)	BE (custom-param entity) — §2.1b flowchart, S02/AC-3, NEG-1
Flag-off skip	`ai_qa_unified_scorecard` disabled for org	FE hides routes/controls (`$hasSubscription`); BE returns flag-gated empty/404	FE + BE (ADR-5), S01/NEG-1
Plan-not-eligible skip	Starter/Free org (feature not provisioned)	Same as flag-off (no surface)	Provisioning (A2), S01/NEG-1
Unauthorized skip	role ∈ {agent, member}	controls not rendered (FE); 403 (BE)	§3 Role matrix
AI-enable-without-Phase-2	`is_ai_auto_score=true` pre-Phase-2	recorded preference, no scores produced (inert)	A1; PRD Open Q#3
Room-resolve AI skip (Phase 2, NOT built here)	existing `unless is_custom_parameter \|\| scorecard_exists` guard	unchanged in Phase 1; Phase 2 must revisit	BE (forward note §1)

Detail 2.G — Cross-Layer Contract Verification

Endpoint	PRD-to-Schema row (§1.A.5)	Interim design (App. B)	Match?
PATCH preference (AI fields)	rows 1–2 (AI on-switch, AI threshold)	Stitch #1	yes
POST/PATCH custom param (`prompt`)	rows 3–4 (org rubric, derived auto_scorable)	Stitch #2	yes
GET default rubric	row 5 (9 default metrics)	Stitch #1 (viewer block)	yes

3. High-Availability & Security

Performance Requirement

Save P95 ≤ 500ms (PRD §6). Single-row upsert on an org-unique index; default-rubric is an in-memory constant. No N+1 (custom-param list already paginated/scoped by company_id).

Monitoring & Alerting

Reuse Mixpanel + the squad dashboard (owner: BOT). Events (PRD §12): scorecard_settings_updated, scorecard_settings_save_failed, scorecard_custom_param_saved, scorecard_custom_param_save_failed, default_rubric_viewed, default_rubric_load_failed. Alert: scorecard_settings_save_failed + scorecard_custom_param_save_failed rate > 5% in 1h → Slack #bot-ai-oncall. (Naming mirrors existing [CHATBOT] Mixpanel events in chatbot-fe/common/contants/mixpanel-events.ts.)
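
To make the alert condition concrete, the sketch below computes an hourly failure rate and compares it with the 5% threshold. The denominator is an assumption: the RFC does not say whether the rate is failures over all save attempts or over some other base, so this version uses failed / (saved + failed). The names are hypothetical.

```typescript
const SAVE_FAILURE_ALERT_THRESHOLD = 0.05; // "> 5% in 1h"

// Event counts for one surface within a one-hour window.
interface HourlySaveCounts {
  saved: number;  // e.g. scorecard_settings_updated
  failed: number; // e.g. scorecard_settings_save_failed
}

// Failure share of all attempts; an idle hour yields 0 rather than NaN.
function saveFailureRate(counts: HourlySaveCounts): number {
  const attempts = counts.saved + counts.failed;
  return attempts === 0 ? 0 : counts.failed / attempts;
}

function shouldAlert(counts: HourlySaveCounts): boolean {
  return saveFailureRate(counts) > SAVE_FAILURE_ALERT_THRESHOLD;
}
```

With very low traffic a single failure can exceed 5%, so a minimum-attempts floor may be worth adding when the alert is configured.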

Logging

Server errors via Rollbar.error(e) (existing pattern in the use cases); structured request logs via lograge. Never log full prompt content at error level, because it may contain org intellectual property (org IP); log only org_id, custom_param_id, and reason (matches PRD event props).

Tracing (REV-3)

The BE already runs ddtrace (Datadog) + Aegis/OpenTelemetry (chatbot/Gemfile). The three new / extended endpoints are ordinary Grape requests, so they inherit existing request spans automatically — no new instrumentation needed. Distributed FE→API→BE trace correlation is explicitly out of scope for Phase 1 (no new correlation-id propagation is added); on-call follows an FE error to the BE via the existing per-request Datadog span + the *_save_failed Mixpanel event's org_id. Revisit cross-tier trace stitching with the Phase-2 scoring pipeline, where the async OpenAI call makes it materially useful.

Security Implications

AuthN: every endpoint behind middlewares/auth.rb (Bearer + X-Auth-Token → hub-service).
AuthZ: set_role(%w[owner admin supervisor]) on all three (read + write). agent/member → 403.
Tenancy (critical): organization_id (preference) and company_id (custom param) are taken only from current_user (the validated token) — never the request body. This matches the existing endpoints: preference passes current_user[:organization_id] (scorecard_preferences.rb get/patch), custom param passes current_user.try(:[], 'company_id') (scorecard_custom_parameter.rb post/patch). The new prompt/AI fields must not introduce a body-supplied org/company id. Add a request-spec assertion that a token for org A cannot read/write org B's preference or params (cross-tenant write → scoped to token org).
Input validation: ai_passing_grade coerced to :float then range-checked 0–100 (non-numeric → 422, never a raised exception). prompt capped 4,000 chars server-side; strip null bytes / control characters before persist so Phase-2 prompt assembly can't be broken by injected control chars.
Injection / XSS: prompt is stored and rendered as text (an LLM instruction, not HTML). Custom-param name/description already pass sanitize_html (scorecard_custom_parameter.rb before_save); prompt does not need HTML sanitization but all custom-param text fields (name, description, prompt) must render via Vue interpolation, never v-html (Vue escapes by default).
Prompt-injection (forward-looking): the prompt becomes part of an LLM system prompt in Phase 2; Phase 1 only stores it. Note for Phase 2: treat stored rubric as untrusted input.
Audit: paper_trail already records versions on both models; ensure whodunnit is populated from current_user on these write paths (verify the existing PaperTrail.request.whodunnit wiring covers Grape requests — if not, set it from the token user in the use case).
Secrets / PII in logs: never log full prompt (org IP). Rollbar.error(e) is already used; add prompt (and system_prompt) to the Rollbar param scrub list so request bodies aren't captured. Events log only org_id / custom_param_id / has_rubric / reason (PRD §12).
DoS / size: prompt capped at 4,000 chars server-side (not just client); writes rely on the platform's existing request rate limiting (no new endpoint-specific limiter introduced).
AuthZ on default-rubric endpoint: serves only static, non-tenant config but still requires auth + set_role (no anonymous access to the rubric).
Data governance / retention (REV-7): the custom-param prompt is org-authored configuration (org IP), not end-customer PII — Phase 1 stores no conversation/customer data. Retention follows the existing acts_as_paranoid soft-delete on scorecard_custom_parameters (deleting a param soft-deletes its rubric); paper_trail versions persist for audit. The rubric is therefore out of scope for end-customer DSAR/export (it is account config, handled by normal account-deletion processes), and stays within the existing chatbot_gpt data boundary — no new data export, no new third-party data egress in Phase 1 (the rubric reaches OpenAI only in Phase 2, which owns that data-flow review).
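
The "strip null bytes / control characters" rule from the input-validation item above could look like the following. It is written in TypeScript only to illustrate the rule; the authoritative version would live in the BE use case. Which characters to keep is an assumption of this sketch: tab, line feed, and carriage return are preserved because multi-line rubrics are likely, and everything else in the C0 range plus DEL is removed.

```typescript
// Matches NUL and other C0 control characters plus DEL,
// but deliberately skips \t (U+0009), \n (U+000A), and \r (U+000D).
const DISALLOWED_CONTROL_CHARS = /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g;

// Hypothetical helper: clean a rubric prompt before it is persisted.
function stripControlChars(prompt: string): string {
  return prompt.replace(DISALLOWED_CONTROL_CHARS, "");
}
```

Running the length check after stripping keeps the stored value and the validated value identical.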

Role × Endpoint Authorization

Endpoint	owner	admin	supervisor	agent	member
GET/PATCH preference	✅	✅	✅	❌ 403	❌ 403
POST/PATCH custom param	✅	✅	✅	❌ 403	❌ 403
GET default rubric	✅	✅	✅	❌ 403	❌ 403
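
The matrix above, combined with the flag gate from ADR-5, reduces to one predicate that an FE guard could use before rendering any scorecard control. This is a minimal sketch with hypothetical names; the real check would read the role from the session and the flag from the subscription plugin or composable.

```typescript
type Role = "owner" | "admin" | "supervisor" | "agent" | "member";

// Roles allowed on all three surfaces, per the matrix.
const SCORECARD_ROLES: ReadonlySet<Role> = new Set<Role>([
  "owner",
  "admin",
  "supervisor",
]);

// Both conditions must hold: the org feature is on and the role is allowed.
function canAccessScorecardSettings(role: Role, featureEnabled: boolean): boolean {
  return featureEnabled && SCORECARD_ROLES.has(role);
}
```

The FE predicate only hides controls; the BE still enforces the same roles and returns 403 on its own.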

Detail 3.A — Failure Mode Catalog

Failure	Detection	Behavior	Recovery
`ai_passing_grade` out of 0–100	Dry rule	422, nothing saved	FE inline error (Vuelidate mirror)
`prompt` > 4000	Dry macro	422, nothing saved	FE length counter blocks + server reject
DB write fails	`save!` raises → rescued	500, no partial state (single-row tx)	FE `$toast` + Retry; log `*_save_failed`
Concurrent saves (two admins)	org-unique index + single-row upsert	last-write-wins; no partial row; paper_trail keeps both versions	acceptable for a config row; no lock needed
hub-service down / slow	`middlewares/auth.rb` → `Repositories::ChatService::Users::Me` returns nil	401 "User service unavailable"	FE re-auth. Timeout/retry of this auth call is inherited from the existing middleware — out of scope to change here.
default-rubric load fails	500 / network	FE error state	Retry; `default_rubric_load_failed`
flag off	BE gate + FE `$hasSubscription`	surfaces not rendered / 404	n/a (by design)

Detail 3.B — Error Message Catalog

Code	Message	Surface
422	"AI passing grade only between 0 - 100 are allowed"	toggle
422	"AI judging rubric cannot exceed 4000 characters."	editor
500	"Couldn't save. Try again."	toggle/editor
500	"Couldn't load the default rubric."	viewer
403	"Permission denied" (existing)	all
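
One way to keep the FE copy in step with this catalog is a single lookup from status and surface to message. The sketch below is hypothetical: it covers only the 500 and 403 rows, and assumes 422 messages are shown as returned by the server rather than hard-coded on the client.

```typescript
type Surface = "toggle" | "editor" | "viewer";

// Returns the catalog message for statuses the FE words itself,
// or null when the server-provided message should be used instead (e.g. 422).
function catalogMessage(status: number, surface: Surface): string | null {
  if (status === 403) return "Permission denied";
  if (status === 500) {
    return surface === "viewer"
      ? "Couldn't load the default rubric."
      : "Couldn't save. Try again.";
  }
  return null;
}
```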

Detail 3.C — Accessibility

Pixel3 components are used as-is (existing settings a11y). New textarea has an associated MpFormLabel; veto status conveyed by text + badge (not color alone); length counter has aria-live=polite. Keyboard: Save reachable via tab; errors announced via MpFormErrorMessage.

Detail 3.D — Browser Support & FE Performance Budget (REV-4)

Browser support: inherits the existing chatbot-fe Nuxt 4 target — no new matrix introduced; the new pages must work on the same browsers the current /settings/* pages support (no new polyfills, no APIs beyond what ai-assist.vue already uses).
FE performance budget: the scorecard settings route is a lazy-loaded page (Nuxt route-level code-split, like other pages/settings/*), so it adds no weight to the initial bundle. The default-rubric list is 9 static rows and the custom-param list is already paginated server-side, so there is no large client render. No new heavy dependency is added (reuses Pixel3 + Vuelidate + mixpanel-browser already in package.json).

4. Backwards Compatibility and Rollout Plan

Compatibility

Additive only. New AI columns default off/unset; existing GET/PATCH callers that omit AI fields keep working (AI fields optional). Human auto-scoring (auto_agent_scoring.rb) reads only is_auto_score/passing_grade — untouched. prompt widen preserves existing data.

Rollout Strategy

Flag ai_qa_unified_scorecard default OFF. Stage 1 internal QA (3–5 accounts); Stage 2 closed beta (TransGo, Talenta LMS + 3 partners, dark); held for customer GA with Phase 2 scoring (PRD §11/§14). Detailed scheduling lives in delivery/ (not here).

Cross-Layer Rollout Compatibility

Order	Step	Safe if FE not yet shipped?	Safe if BE not yet shipped?
1	BE migration (add cols, widen prompt)	yes (inert columns)	—
2	BE API + OpenAPI	yes (flag-gated, optional fields)	—
3	FE behind `$hasSubscription`	—	FE no-ops (feature off)
4	Provision feature for beta orgs	—	—

Deploy BE before FE. Rollback FE before BE (FE depends on BE fields, not vice-versa).

Detail 4.A — Configuration Contract

Key	Type	Default	Where
`ai_qa_unified_scorecard`	org feature (`ChatbotGpt::Feature` + `OrganizationFeature`)	OFF	BE `OrganizationFeatures::FindFeature`; FE `$hasSubscription`
`DEFAULT_AI_PASSING_GRADE`	constant (entity)	75 (decided for build; mirrors human default)	`entities/frontend_services/gpt/scorecard_preference.rb`
`prompt` max length	constant in Dry macro	4000 (decided for build — PRD Open Q#2 confirmation is advisory; tunable)	custom-param create/update use cases
`ai_passing_grade` valid range	Dry rule	0–100 inclusive (decided — ADR-8)	`scorecard_preference/patch.rb`
default rubric content	`Constants::ScorecardAiDefaultRubric`	9 metrics, content `status:"PROPOSED"` (served as-is)	new constant/config (seed from PRD App A)

These values are locked for the Phase-1 build so the agent has no ambiguity. DSAI rubric confirmation (§5 #2) and the max-length confirmation (§5 #3) are advisory follow-ups that change config constants only; they do not block implementation.

Detail 4.B — Test Plan (commands from the repos)

Backend (chatbot/AGENTS.md:56-187,235-247):

# migrate the chatbot_gpt DB in test, then run specs
RAILS_ENV=test bundle exec rails db:migrate
bundle exec rspec spec/api/frontend_service/v1/gpt_spec.rb \
  spec/core/use_cases/api/frontend_service/v1/gpt
bundle exec rubocop
bundle exec brakeman
bundle exec fasterer && bundle exec reek
# OpenAPI (MANDATORY when endpoints change):
ruby scripts/openapi/split.rb
npx --yes @apidevtools/swagger-cli validate docs/openapi/openapi.yaml
npx --yes @stoplight/spectral-cli lint docs/openapi/openapi.yaml --fail-severity=error

Frontend (chatbot-fe/package.json:10-22):

pnpm lint
pnpm test            # vitest run
pnpm test:e2e        # playwright (visual/e2e)
pnpm build

Cross-boundary contract test (REV-2). Because FE and BE land in separate repos/PRs, pin the contract on both sides so a casing/shape drift fails CI rather than production:

BE (RSpec request spec): assert the PATCH-preference response and the custom-param response serialize exactly {is_auto_score, passing_grade, is_ai_auto_score, ai_passing_grade} and {id, name, prompt, auto_scorable} (snake_case, auto_scorable derived) — this is the authoritative contract. Add to spec/api/frontend_service/v1/gpt_spec.rb.
FE (Vitest service test): assert common/services/main/v1/scorecard.ts parses a fixture whose shape is copied verbatim from the BE spec's expected JSON into the ScorecardPreference / CustomParam / DefaultRubric interfaces (§2.B), and that the store maps auto_scorable → the chip. The shared fixture is the contract anchor: if the BE entity changes a key, the BE spec changes the fixture, and the FE test (using the same fixture) breaks — catching the drift.
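
The "serialize exactly" assertion can be approximated on the FE side with an exact-key check against the shared fixture. This is a sketch, not the actual Vitest test: the helper name is hypothetical, and the fixture is the preference payload from §2.4 Row 1.

```typescript
// True only when `value` is a plain object whose keys match `expected`
// exactly: no missing keys, no extras, order ignored.
function hasExactKeys(value: unknown, expected: readonly string[]): boolean {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const actual = Object.keys(value).sort();
  const wanted = [...expected].sort();
  return actual.length === wanted.length && actual.every((k, i) => k === wanted[i]);
}

const PREFERENCE_KEYS = [
  "is_auto_score",
  "passing_grade",
  "is_ai_auto_score",
  "ai_passing_grade",
] as const;

// Fixture shape copied from the Row 1 response `data` object.
const preferenceFixture = {
  is_auto_score: true,
  passing_grade: 75,
  is_ai_auto_score: true,
  ai_passing_grade: 80,
};
```

A camelCase key such as `isAiAutoScore` fails this check, which is the kind of casing drift the contract test is meant to catch.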

Detail 4.C — Agent Execution Plan

Order respects dependencies (migration → BE API → OpenAPI → FE). Each chunk has files + commands + assertable acceptance. Use the chatbot repo's openapi-spec-sync skill for chunks touching endpoints.

#	Chunk	Files	Commands	Acceptance
1	Migration: AI columns	`chatbot/db/chatbot_gpt_migrate/<ts>_add_ai_scoring_to_scorecard_preferences.rb`; regen `db/chatbot_gpt_schema.rb`	`RAILS_ENV=test bundle exec rails db:migrate`	schema shows `is_ai_auto_score` (bool default false) + `ai_passing_grade` (float)
2	Migration: widen `prompt`	`chatbot/db/chatbot_gpt_migrate/<ts+1>_widen_scorecard_custom_parameter_prompt.rb`; regen schema	same	`prompt` column type = `text`
3	BE preference AI fields	`app/api/frontend_service/v1/gpt/omnichannel/scorecard_preferences.rb`; `.../scorecard_preference/patch.rb`,`get.rb`; `repositories/gpt/scorecard_preferences/upsert.rb`,`find_by.rb`; `entities/frontend_services/gpt/scorecard_preference.rb` (+`DEFAULT_AI_PASSING_GRADE`); `api/.../entities/gpt/scorecard_preference.rb` & `get_scorecard_preference_response.rb`	`bundle exec rspec spec/.../gpt`	new request spec: PATCH with `ai_passing_grade:80` persists & GET returns it; `ai_passing_grade:150`→422
4	BE custom-param `prompt`	`app/api/frontend_service/v1/gpt/scorecard_custom_parameter.rb`; `.../scorecard_custom_parameter/create.rb`,`update.rb` (+`validate_prompt_length`); `repositories/gpt/scorecard_custom_parameter/create.rb`,`update.rb`; `entities/.../gpt/scorecard_custom_parameter.rb` + response entity (`prompt`,`auto_scorable`)	`bundle exec rspec`	POST with `prompt`→`auto_scorable:true`; empty→false; 4001 chars→422
5	BE default-rubric endpoint	`app/api/frontend_service/v1/gpt/scorecard_ai_default_rubric.rb` (NEW); use case + entity; `Constants::ScorecardAiDefaultRubric` (seed the 9 metrics' `code`/`name`/`description`/`veto` verbatim from PRD Appendix A Tier-1, exactly as in §2.4 Row 3; the per-metric LLM judge prompts stay out — they belong to Phase 2 scoring); mount in `app/api/frontend_service/gpt_api.rb` (+ `app/api/gpt_service/api.rb`)	`bundle exec rspec`	GET returns the 9 metrics with the §2.4 descriptions, `veto:true` on groundedness+policy, `status:"PROPOSED"`
6	OpenAPI sync	`docs/openapi/openapi.yaml` + `dist/{frontend,gpt}.yaml`; `docs/openapi/SESSION-LOG.md`	`ruby scripts/openapi/split.rb`; swagger-cli + spectral	both validators pass; `roleScopedAuth` overlay present for the 3 ops
7	FE store+service+endpoints	`chatbot-fe/store/scorecard/*`; `common/services/main/v1/scorecard.ts`; `common/services/main/endpoint.ts`; register in `mainService` index	`pnpm test`	store action unit tests: pending→resolved/rejected transitions
8	FE settings page + toggle	`pages/settings/scorecard/index.vue`; `modules/settings/views/scorecard/auto-score-toggle.vue`; Vuelidate 0–100; `$hasSubscription` guard; mixpanel events	`pnpm test`; `pnpm lint`	component test: out-of-range shows error; flag off → not rendered; save fires `scorecard_settings_updated`
9	FE custom-param editor	`pages/settings/scorecard/custom-parameters.vue`; `modules/settings/views/scorecard/custom-param-editor.vue`; length counter + auto-scorable chip	`pnpm test`	chip lights when textarea non-empty; >4000 blocked; saved toast
10	FE default-rubric viewer	`modules/settings/views/scorecard/default-rubric-viewer.vue`; veto badges + PROPOSED note; error+retry	`pnpm test`	renders 9 metrics + veto badges; error state shows retry + logs event
11	Cross-boundary contract test	BE: add expected-JSON assertions to `spec/api/frontend_service/v1/gpt_spec.rb`; FE: `tests/unit/.../scorecard.service.test.ts` parsing the same fixture into §2.B interfaces	`bundle exec rspec`; `pnpm test`	shared fixture → BE serializes it, FE parses it; a key/casing change breaks both (REV-2)
12	Test specs doc	`documents/chatbot/unified-agent-scorecard/tests/phase-1-settings-and-rubric-config.md`	n/a	`covers_acceptance_criteria` lists every `UASC-S0x/AC-n`

Detail 4.D — Verification & Rollback Recipe

Pre-merge (in order): BE rails db:migrate → rspec → rubocop/brakeman → OpenAPI split.rb + swagger-cli + spectral; FE pnpm lint → pnpm test → pnpm build.

Post-deploy signals:

scorecard_settings_updated events appear for beta orgs; save-failure events < 5%/h.
Manual: enable flag for one internal org → toggle AI scoring + set threshold + add a custom param with a rubric → reload → values persist; default rubric lists 9 metrics.
Regression: human scorecard config + a resolved-room human auto-score still behaves as before.
Adoption leading indicators (PRD §13; settings save success ≥ 99%): track config-readiness, defined as the % of beta Pro+Ent orgs that enabled is_ai_auto_score and either accepted the default rubric or added ≥ 1 custom param (target ≥ 80% before Phase-2 GA), plus the scorecard_custom_param_saved count (target ≥ 1 per beta org). Both are computed from the scorecard_settings_updated / scorecard_custom_param_saved events. (These are PM-owned program metrics; the RFC only ensures the events exist to compute them.)

Rollback (numbered):

1. Disable ai_qa_unified_scorecard for affected orgs (instant; surfaces vanish).
2. If BE bug: revert the BE PR (AI fields optional → no caller breaks).
3. If FE bug: revert the FE PR (BE inert without FE).
4. The new columns and the prompt widen are forward-only: do not run the prompt down migration in prod (truncation). Leave the columns; they are inert when the flag is off.
5. Confirm scorecard_settings_save_failed returns to baseline and human auto-scoring is intact.

5. Concerns, Questions, or Known Limitations

#	Item	Type	Owner	Status
1	Figma frames for both surfaces are pending; FE built against PRD Appendix B Stitch prompts until frames land — pixel deviations re-checked then.	Blocker (FE fidelity)	Design	Open (PRD Dep "Design — YES")
2	Confirm the 9 default metric definitions/order with DSAI (Appendix A is PROPOSED).	Open (accuracy)	DSAI	PRD Open Q#1, due 2026-07-15
3	`prompt` max length = 4,000 adopted; confirm.	Open	BOT+PM	PRD Open Q#2
4	`ai_passing_grade` (0–100) diverges from human `passing_grade` (1–99). Align later?	Known limitation	BOT	New (ADR-8)
5	No plan-tier check exists in code; plan-gating relies on provisioning the org-feature only for Pro+Ent. Confirm provisioning owner/path.	Open	Billing/Provisioning + PM	New (A2)
6	Forward-looking: Phase 2 must change the room-resolve skip guard (`is_custom_parameter`) so AI scoring with custom params runs.	Forward note	BOT	Phase 2
7	`REV-1`/`REV-9` — decide whether the new components' `props`/`emits` get explicit TS contracts in this RFC or are inferred at implementation time (store + service types are specified in §2.B). rfc-reviewer R3 note: the reference target `modules/settings/views/ai-assist.vue` is `<script setup>` with no `defineProps`/`defineEmits`, so it offers no prop contract to copy — typing the 3 components in §2.A is the recommended close.	Open (low risk)	BOT FE	From rfc-reviewer R1 (sharpened R3)
8	`REV-8` — `plugins/botSubscriptionFeature.ts` (`$hasSubscription`, cited by ADR-5/§2.0) is marked `@deprecated` in favor of the `useSubscription` composable. Confirm whether new FE uses `$hasSubscription` for parity or migrates to `useSubscription`.	Open (low risk)	BOT FE	From rfc-reviewer R3
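Items 3 and 4 above describe three numeric bounds. A minimal client-side sketch of those bounds follows; the function name, the input shape, and the error strings are hypothetical, and it assumes the 4,000 limit is counted in string length units (the RFC does not say whether the limit is in characters or bytes).

```typescript
// Bounds taken from §5 items 3 and 4. Everything else here is illustrative.
const AI_PASSING_GRADE_RANGE = { min: 0, max: 100 } as const;   // ai_passing_grade
const HUMAN_PASSING_GRADE_RANGE = { min: 1, max: 99 } as const; // human passing_grade
const PROMPT_MAX_LENGTH = 4000; // assumption: measured with String.length

interface NumericRange {
  readonly min: number;
  readonly max: number;
}

// Hypothetical input shape; real payload field names live in §2.4 / §2.B.
interface ScorecardBoundsInput {
  aiPassingGrade: number;
  passingGrade: number;
  prompt: string;
}

function inRange(value: number, range: NumericRange): boolean {
  return Number.isFinite(value) && value >= range.min && value <= range.max;
}

// Returns the list of violated fields; an empty list means all bounds hold.
function validateScorecardBounds(input: ScorecardBoundsInput): string[] {
  const errors: string[] = [];
  if (!inRange(input.aiPassingGrade, AI_PASSING_GRADE_RANGE)) {
    errors.push("ai_passing_grade must be between 0 and 100");
  }
  if (!inRange(input.passingGrade, HUMAN_PASSING_GRADE_RANGE)) {
    errors.push("passing_grade must be between 1 and 99");
  }
  if (input.prompt.length > PROMPT_MAX_LENGTH) {
    errors.push("prompt must be at most 4000 characters");
  }
  return errors;
}
```

Keeping the two grade ranges as separate constants makes the ADR-8 divergence explicit: a value of 0 or 100 is accepted for ai_passing_grade but rejected for the human passing_grade, so aligning them later is a one-line change.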

6. Comment Log

Date	Author	Note
2026-06-20	rfc-starter (Claude)	Initial draft from PRD v1.2; grounded against `chatbot`, `chatbot-fe`, `hub-core`, `hub-service`. Corrected PRD persona/role premise (no qa_lead/bot_admin role) and confirmed `auto_agent_scoring.rb` scores the human agent.
2026-06-20	rfc-reviewer (Claude)	R1 review → 8.0 (Strong/PROCEED). Applied R2 fixes: typed store/service contracts + casing convention (§2.B, REV-1/REV-6), FE↔BE contract test (§4.B/§4.C ch.11, REV-2), tracing scope (§3, REV-3), browser/perf budget (§3.D, REV-4), custom-param dedup note (§2.4 r2, REV-5), data-governance/retention (§3, REV-7). Re-score → 8.5. Open: REV-1 (component prop typing) carried to §5 #7. See `rfc-phase-1-settings-and-rubric-config-review.md`.

7. Ready for Agent Execution

Ready for agent execution: yes (backend + FE logic). FE visual fidelity is gated on the pending Figma frames (§5 #1); build the components against the interim Stitch spec and reconcile when the frames land. Optionally hand to rfc-reviewer for a second-pass score.
