RFC: Unified Agent Quality Scorecard — Phase 1: Scorecard Settings & Rubric Config
Document Conventions (do not remove)
This RFC follows the Qontak RFC Template format for governance — the metadata table, sections 1–6, and Comment log are mandatory.
It is also agent-execution-ready: §1 PRD-to-Schema Derivation (BE half) + §2.A UI Contract (FE half), §2.0 Repo Reading Guide for both layers, mermaid diagrams, §2.G Cross-Layer Contract Verification, and §4 Agent Execution Plan + Verification & Rollback Recipe are complete.
Agent-execution-ready RFC derived 1:1 from
../prds/phase-1-settings-and-rubric-config.md. Phase 1 builds the config layer only — no scoring, in-room panel, report, or gate. Backend = Qontak Chatbot (chatbot, Rails 7.1 / Grape / Clean Architecture). Frontend = Qontak Chatbot FE (chatbot-fe, Nuxt 4 / Vue 3 / Pinia / Pixel3).
Metadata
| Field | Value | Notes |
|---|---|---|
| Status | RFC | Working vocabulary IDEA/RFC/AGREED/ABANDON. YAML status: uses linter enum (RFC→in-review); kept draft until reviewed. |
| Owner (DRI) | Dimas Fauzi Hidayat | Mirrors frontmatter dri. Single accountable owner; staffing lives in delivery/. |
| Source PRD | ../prds/phase-1-settings-and-rubric-config.md | PRD v1.2. |
| Anchor | ../unified-agent-scorecard-anchor.md | Initiative master index. |
| Delivery | not yet handed to delivery | Timeline/effort/rollout scheduling lives in delivery/. |
| Type | full-stack | Backend (Grape API + chatbot_gpt DB) + Frontend (Nuxt settings UI). |
| Squad | BOT — Bot, AI & Automation | |
| Infosec approver | required at review — see §7 | Touches auth-gated org config + audited PII-adjacent settings. |
| Last Updated | 2026-06-20 | |
Sections at a Glance
| § | Section | Type hint |
|---|---|---|
| §1 | Overview, traceability, decisions index | PRD coverage, AC map, schema derivation |
| §2 | Technical design | Repo reading guide, infra topology, ADRs, ER + sequence diagrams, API + UI contracts |
| §3 | HA & Security | Perf, auth matrix, failure catalog, error catalog, a11y |
| §4 | Backwards compat & rollout | Flag contract, test plan, agent execution plan, rollback recipe |
| §5 | Concerns / open questions | Carried from PRD + grounding gaps |
| §6 | Comment log | |
| §7 | Ready for agent execution | The readiness gate |
1. Overview
Phase 1 ships the configuration layer for AI-agent quality scoring, behind the
ai_qa_unified_scorecard feature flag. It does three things and nothing more:
- Extends scorecard_preferences with an AI on-switch + AI pass threshold (is_ai_auto_score, ai_passing_grade), leaving the existing human auto-score (is_auto_score/passing_grade, which already drive auto_agent_scoring.rb) untouched.
- Wires the dormant scorecard_custom_parameters.prompt field into the API + UI as an "AI judging rubric", widening it string → text. A non-empty rubric marks a custom parameter auto-scorable (a derived attribute).
- Surfaces a read-only Default AI Rubric viewer (the 9 Qontak-calibrated metrics), served by a new read-only backend endpoint from static config.
No scores are produced in Phase 1. The persisted config is consumed by the Phase 2 scoring pipeline.
Success Criteria
- AI scoring preference (is_ai_auto_score + ai_passing_grade) persists per org and round-trips through GET; settings save success ≥ 99% (PRD §13).
- A custom parameter with a non-empty prompt persists and is returned auto_scorable: true; empty prompt returns auto_scorable: false.
- The 9-metric default rubric loads read-only with veto flags on Groundedness + Policy.
- All three surfaces are invisible unless ai_qa_unified_scorecard is enabled for the org.
- Save P95 ≤ 500ms (PRD §6).
- Human manual scorecard config and human auto-scoring behavior are byte-for-byte unchanged.
Out of Scope
Phase 2 scoring pipeline, in-room panel, multi-actor scoring, Analytics report (P3),
validation harness (P4), go-live gate (P5), mobile, billing/packaging, any change to
human manual scoring or auto_agent_scoring.rb runtime behavior. See PRD §5.
Forward-looking note (not built here): the room-resolve trigger at chatbot/app/core/use_cases/api/internal_service/v1/webhook/room_resolve_interactions.rb:48-63 currently skips AutoAgentScoringWorker when custom parameters exist (unless is_custom_parameter || scorecard_exists). Phase 2 must revisit this guard so AI scoring with custom params actually runs. Phase 1 changes nothing here.
Related Documents
| Document | Path | What was taken from it |
|---|---|---|
| Phase 1 PRD v1.2 | ../prds/phase-1-settings-and-rubric-config.md | All requirements, ACs, rubric content. |
| Initiative anchor | ../unified-agent-scorecard-anchor.md | Phase map; confirms auto_agent_scoring.rb scores the human agent only. |
| Phase 2 PRD | ../prds/phase-2-auto-scoring-and-in-room-scorecard.md | Reviewed — consumer of this config; no Phase 1 impact. |
| chatbot AGENTS.md API rules | chatbot/AGENTS.md §"API Specification Rules" | Mandatory OpenAPI bundle/split/validate workflow (§4.B). |
Assumptions
- A1: Enabling is_ai_auto_score before Phase 2 exists is a recorded preference with no customer-visible effect (PRD Open Q#3). Grounded: GET/PATCH already persist a preference with no runtime side effect beyond the human path; the new AI columns are inert until Phase 2 reads them.
- A2: Plan-gating (Pro+Ent only) is enforced by provisioning the ai_qa_unified_scorecard org-feature only for eligible plans; the build only checks the flag. (No plan-tier check exists in code to reuse; see §5 Open Q#5.)
- A3: prompt max length is 4,000 chars (PRD Open Q#2 proposed value). Adopted as the build default; tunable via the validation macro.
Dependencies
| Dependency | Owner | Needed | Blocking? |
|---|---|---|---|
| prompt widen string→text migration | BOT (this RFC) | DDL | NO: in scope (§2.3). |
| Design frames (settings + rubric editor) | Design squad | Figma for CHG-001/CHG-002 | YES for FE pixel-faithfulness — see §5 Open Q#1. Stitch prompts in PRD Appendix B are the interim spec. |
| DSAI 9-metric definitions | DSAI | Confirm default rubric content | NO for build · advisory for accuracy (rubric served as PROPOSED). |
| ai_qa_unified_scorecard org-feature provisioning | Billing/Provisioning | Feature rows for Pro+Ent orgs | NO for build (defaults OFF). |
Detail 1.A — Coverage Matrices
1.A.1 — PRD Section Coverage
| PRD § | Title | Covered in |
|---|---|---|
| 2 | One-liner + Problem | §1 Overview |
| 3 | What happens if we don't build | §1 (motivation) |
| 4 | Target users + persona | §3 Role × Endpoint matrix |
| 5 | Non-Goals | §1 Out of Scope |
| 6 | Constraints | §3 Performance, §4 Flag contract, §3 Role matrix |
| 7 | Feature Changes (CHG-001/002) | §2.3 DDL, §2.4 APIs, §2.A UI |
| 8 | New Features (editor + viewer) | §2.A UI Contract, §2.4 (new endpoint) |
| 9 | API & Webhook Behavior | §2.4 APIs |
| 10 | System Flow + Stories + ACs | §1.C, §2.1a Sequence, §1.A.4 AC map |
| 11 | Rollout | §4 Rollout |
| 12 | Observability | §3 Monitoring & Logging |
| 13 | Success Metrics | §1 Success Criteria, §4.D signals |
| 14 | Launch Plan & Stage Gates | §4 Rollout (technical view; scheduling → delivery/) |
| 15 | Dependencies | §1 Dependencies |
| 16 | Key Decisions + Alternatives | §1.B + §2 ADRs |
| 17 | Open Questions | §5 |
| App. A | AI Scoring Rubric | §2.4 default-rubric endpoint payload |
| App. B | Stitch UI Prompts | §2.A interim design spec |
1.A.2 — UI / Consumer Surface Coverage
| Surface | PRD ref | Backing read endpoint | RFC anchor |
|---|---|---|---|
| Scorecard settings page /settings/scorecard | CHG-001, S01, S03 | GET .../scorecard_preferences + GET .../scorecard_ai_default_rubric | §2.A, §2.4 |
| Custom-parameter editor /settings/scorecard/custom-parameters | CHG-002, S02 | GET .../scorecard_custom_parameters (existing list) | §2.A, §2.4 |
| Default Rubric viewer (within settings) | S03 | GET .../scorecard_ai_default_rubric (new) | §2.A, §2.4 |
1.A.3 — Role Coverage
| PRD persona | Grounded role (hub-core user.rb:38-44 enum) | Access |
|---|---|---|
| QA Lead / Supervisor | supervisor | Read all; write threshold + custom rubric |
| Bot / AI Admin (Agent Owner) | owner / admin | Read all; write threshold + custom rubric |
| End CS agent | agent / member | No access (controls not rendered; API 403) |
Grounding correction (PRD vs code): the PRD names "QA Lead" and "Bot/AI Admin" roles. The platform has no such roles. The single-role enum issued by hub-service /users/me is {owner, admin, supervisor, agent, member}; supervisor is the closest to QA Lead. Per confirmed decision, scorecard writes keep set_role(%w[owner admin supervisor]) and map the personas onto these roles (ADR-7).
1.A.4 — Acceptance-Criteria → Design Element Map
| PRD Story | Composite AC ids | Design element | Test spec ref |
|---|---|---|---|
| UASC-S01: Enable AI auto-scoring + threshold | UASC-S01/AC-1, /AC-2, /AC-3, /ERR-1, /NEG-1 | §2.3 new cols · §2.4 row 1 (PATCH pref) · §2.A AutoScoreToggle · §4.C chunks 1,3,6 | tests/phase-1-settings-and-rubric-config.md |
| UASC-S02: Custom param + rubric | UASC-S02/AC-1..AC-4, /ERR-1, /NEG-1 | §2.3 prompt text · §2.4 row 2 (custom param) · §2.A CustomParamEditor · §4.C chunks 2,4,7 | ″ |
| UASC-S03: View default rubric | UASC-S03/AC-1, /AC-2, /AC-3, /ERR-1 | §2.4 row 3 (new endpoint) · §2.A DefaultRubricViewer · §4.C chunks 5,8 | ″ |
1.A.5 — PRD-to-Schema Derivation (BE)
| PRD entity/attribute/rule | table.column | Exposed by | Enforced at | PRD ref |
|---|---|---|---|---|
| AI auto-score on-switch | scorecard_preferences.is_ai_auto_score (new, bool, default false) | GET/PATCH preference | Dry contract (optional bool), upsert repo | CHG-001 |
| AI pass threshold | scorecard_preferences.ai_passing_grade (new, float, nullable) | GET/PATCH preference | Dry contract rule 0–100 | CHG-001, S01/AC-2,3 |
| Org-specific AI rubric | scorecard_custom_parameters.prompt (string→text) | POST/PATCH custom param; list GET | Dry contract length ≤ 4000 | CHG-002, S02 |
| Auto-scorable flag | derived prompt.present? (not stored) | custom param response auto_scorable | computed in entity/builder | S02/AC-1,3; NEG-1 |
| 9 default AI metrics + veto flags | static config (no table) | new read-only endpoint | constant + Grape entity | S03, App. A |
Detail 1.B — Decisions Closed (index → §2 ADRs)
| # | Decision | ADR |
|---|---|---|
| 1 | New columns is_ai_auto_score + ai_passing_grade, not overloading existing human cols | ADR-1 |
| 2 | Widen prompt string→text (Postgres change_column) | ADR-2 |
| 3 | auto_scorable is derived from prompt.present?, not a stored boolean | ADR-3 |
| 4 | Default rubric served by a new read-only endpoint from static Ruby config | ADR-4 |
| 5 | Gate all surfaces on ai_qa_unified_scorecard (BE: OrganizationFeatures::FindFeature; FE: $hasSubscription) | ADR-5 |
| 6 | Settings/rubric writes are synchronous; analytics fired async via SendMixpanelEventWorker | ADR-6 |
| 7 | Reuse set_role(%w[owner admin supervisor]); map PRD personas onto the existing enum | ADR-7 |
| 8 | ai_passing_grade validated 0–100 inclusive (PRD), diverging from human passing_grade 1–99 | ADR-8 |
Detail 1.C — Per-Story Change Map
| Story | Layer scope | Changes (concrete artifacts) | Acceptance criteria | RFC anchors |
|---|---|---|---|---|
| UASC-S01 | FE + BE | BE: migration add 2 cols; ScorecardPreference::Patch/Get contracts + defaults; scorecard_preferences Grape params; entity ScorecardPreference (FE-svc + gpt-svc); Upsert/FindBy repos; entity Entities::FrontendServices::Gpt::ScorecardPreference + DEFAULT_AI_PASSING_GRADE; OpenAPI. FE: AutoScoreToggle.vue; store/scorecard state/actions; scorecard.ts service + endpoint.ts; Vuelidate 0–100; mixpanel scorecard_settings_updated/_save_failed. | UASC-S01/AC-1,2 persist+roundtrip; AC-3 0–100 validation rejects; ERR-1 error+retry, no partial state, log; NEG-1 Starter/Free hidden (flag off). | §2.3 · §2.4 r1 · §2.A · §4.C c1,c3,c6 |
| UASC-S02 | FE + BE | BE: migration widen prompt; add prompt to custom-param Grape params (POST+PATCH) + Create/Update Dry contracts + validate_prompt_length macro; repos ScorecardCustomParameter::Create/Update persist prompt; entity ScorecardCustomParameter expose prompt + auto_scorable; OpenAPI. FE: CustomParamEditor.vue (textarea + length counter + auto-scorable chip); store actions; service+endpoint; Vuelidate max-len; mixpanel scorecard_custom_param_saved/_save_failed. | S02/AC-1 non-empty→auto_scorable; AC-2 shows rubric+state; AC-3 empty→manual-only; AC-4 over-limit rejected; ERR-1 error+retry+log; NEG-1 empty NOT auto-scorable. | §2.3 · §2.4 r2 · §2.A · §4.C c2,c4,c7 |
| UASC-S03 | FE + BE | BE: new ScorecardAiDefaultRubric Grape resource + use case reading Constants::ScorecardAiDefaultRubric; entity; mount in frontend_service/gpt_api.rb (+ optionally gpt_service); OpenAPI. FE: DefaultRubricViewer.vue; store action; service+endpoint; "PROPOSED" + veto badges; mixpanel default_rubric_viewed/default_rubric_load_failed. | S03/AC-1 9 metrics read-only; AC-2 PROPOSED note; AC-3 veto flag on Groundedness+Policy; ERR-1 load error+retry+log. | §2.4 r3 · §2.A · §4.C c5,c8 |
2. Technical Design
Detail 2.0 — Repo Reading Guide
Repo Map (slice this RFC touches)
flowchart LR
subgraph FE["chatbot-fe (Nuxt 4)"]
page["pages/settings/scorecard/*.vue (NEW)"]
views["modules/settings/views/* (pattern: ai-assist.vue)"]
store["store/scorecard/* (NEW, pattern: store/ai-assist)"]
svc["common/services/main/v1/scorecard.ts (NEW)"]
ep["common/services/main/endpoint.ts (+scorecard)"]
flag["plugins/botSubscriptionFeature.ts ($hasSubscription)"]
end
subgraph BE["chatbot (Rails 7.1 / Grape)"]
apipref["app/api/frontend_service/v1/gpt/omnichannel/scorecard_preferences.rb"]
apicp["app/api/frontend_service/v1/gpt/scorecard_custom_parameter.rb"]
apirub["app/api/frontend_service/v1/gpt/scorecard_ai_default_rubric.rb (NEW)"]
uc["app/core/use_cases/api/frontend_service/v1/gpt/scorecard_*"]
repo["app/core/repositories/gpt/scorecard_*"]
ent["app/core/entities/frontend_services/gpt/scorecard_preference.rb"]
auth["app/api/frontend_service/middlewares/auth.rb -> hub-service /users/me"]
end
db[("chatbot_gpt DB (Postgres)\nscorecard_preferences\nscorecard_custom_parameters")]
mp["SendMixpanelEventWorker -> Mixpanel"]
page --> store --> svc --> ep -->|"$apiMain /api"| apipref & apicp & apirub
flag -.gates.-> page
apipref & apicp & apirub --> auth
apipref --> uc --> repo --> db
apicp --> uc
uc --> ent
page -.fires.-> mp
Existing Code Anchors (read before writing)
| # | Path | What to learn |
|---|---|---|
| 1 | chatbot/app/api/frontend_service/v1/gpt/omnichannel/scorecard_preferences.rb | Grape GET/PATCH shape, set_role, Dry::Matcher::ResultMatcher, mount target. |
| 2 | chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_preference/patch.rb | Dry contract + rule(:passing_grade) 1–99; how AI cols/rule are added. |
| 3 | chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_preference/get.rb | Default-fill pattern via Entities::...::ScorecardPreference::DEFAULT_*. |
| 4 | chatbot/app/core/repositories/gpt/scorecard_preferences/upsert.rb | Find-by-org upsert; where to set the new columns. |
| 5 | chatbot/app/core/entities/frontend_services/gpt/scorecard_preference.rb | DEFAULT_PASSING_GRADE=75, DEFAULT_AUTO_SCORE=true; add DEFAULT_AI_*. |
| 6 | chatbot/app/api/frontend_service/v1/gpt/scorecard_custom_parameter.rb | POST/PATCH params (no prompt today); where to add it. |
| 7 | chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_custom_parameter/create.rb | register_macro(:validate_*_length) pattern → model validate_prompt_length. |
| 8 | chatbot/db/chatbot_gpt_migrate/20241113041150_create_scorecard_custom_parameters.rb | Migrator dialect for the chatbot_gpt DB; prompt is t.string. |
| 9 | chatbot/app/models/chatbot_gpt_record.rb | ChatbotGptRecord base — migrations target :chatbot_gpt connection. |
| 10 | chatbot-fe/modules/settings/views/ai-assist.vue + store/ai-assist/* | Canonical settings page + Pinia store + Vuelidate + toast + $hasSubscription pattern. |
Patterns to Follow
| Concern | Reference file (opened) | Pattern |
|---|---|---|
| Grape endpoint + auth | chatbot/app/api/frontend_service/v1/gpt/omnichannel/scorecard_preferences.rb | format :json; set_role; Dry::Matcher::ResultMatcher success/failure. |
| Use case + validation | .../scorecard_preference/patch.rb; .../scorecard_custom_parameter/create.rb | contract do params … rule … register_macro end; Success/Failure(build_*_params). |
| Repository upsert | chatbot/app/core/repositories/gpt/scorecard_preferences/upsert.rb | find-or-build → assign → save! → Builders::...build. |
| Entity defaults | chatbot/app/core/entities/frontend_services/gpt/scorecard_preference.rb | dry-struct attributes + public_constant :DEFAULT_*. |
| Grape response entity | chatbot/app/api/frontend_service/v1/entities/gpt/scorecard_preference.rb | Grape::Entity expose with documentation. |
| External LLM call (Phase 2 ref only) | chatbot/app/core/use_cases/gpt/omnichannel/auto_agent_scoring.rb:160-179 | OpenAI::Client.new(request_timeout: 240), max_attempts = 2, Rollbar.error. |
| Async analytics | chatbot/app/workers/send_mixpanel_event_worker.rb + app/core/use_cases/system/receive_webhook.rb:76-89 | SendMixpanelEventWorker.perform_async(org, event, props.as_json). |
| Feature flag (BE) | chatbot/app/core/repositories/organization_features/find_feature.rb; usage in app/core/repositories/ai_knowledge_sources/search.rb | OrganizationFeatures::FindFeature.new(feature_code:, organization_id:).call_by_organization. |
| Settings page (FE) | chatbot-fe/modules/settings/views/ai-assist.vue | MpFormControl/MpInput/MpButton; useVuelidate; $toast; isFetch* computed. |
| Pinia store (FE) | chatbot-fe/store/ai-assist/{state,actions,getters,types}.ts | fetchStatus: idle/pending/resolved/rejected; service via mainService. |
| API service (FE) | chatbot-fe/common/services/main/v1/ai-assist.ts + common/services/main/endpoint.ts | $apiMain(endpoint, {method, body, signal}); AbortController. |
| Feature flag (FE) | chatbot-fe/plugins/botSubscriptionFeature.ts | $hasSubscription('code') boolean. |
| Analytics (FE) | chatbot-fe/common/contants/mixpanel-events.ts + ai-assist.vue:925 | mixpanel.track(MIXPANEL_EVENTS.X, props). |
Reading Order for the Agent
- chatbot/AGENTS.md (§Workflow Commands + §API Specification Rules)
- Anchor #1 (preference Grape) → #2 (patch UC) → #3 (get UC) → #4 (upsert) → #5 (entity)
- Anchor #6 (custom-param Grape) → #7 (create UC macros)
- Anchor #8 + #9 (chatbot_gpt migration dialect + base record)
- chatbot/app/api/frontend_service/gpt_api.rb (FE-facing mount paths) and chatbot/app/api/gpt_service/api.rb (gpt-svc mount paths)
- Anchor #10 (chatbot-fe settings page + store + service)
- chatbot-fe/common/services/main/endpoint.ts, plugins/api/apiMain.ts, plugins/botSubscriptionFeature.ts
Existing-Endpoint Check (reuse / extend / new)
| Endpoint | Surface(s) | Tag | Evidence |
|---|---|---|---|
| PATCH /v1/gpt/omnichannel/scorecard_preferences (+ PATCH /v1/scorecards/preferences) | frontend_service + gpt_service | extended | gpt_api.rb:28, gpt_service/api.rb:21; adding is_ai_auto_score/ai_passing_grade. |
| GET same path | both | extended | scorecard_preferences.rb get '/'; add AI fields to response. |
| POST /v1/gpt/scorecard_custom_parameters + PATCH :id (+ /v1/scorecards/parameters/custom) | both | extended | gpt_api.rb:33, gpt_service/api.rb:24; adding prompt. |
| GET .../scorecard_ai_default_rubric | frontend_service (+ gpt_service optional) | new-with-justification | No endpoint serves AI default metrics today (grep scorecard in app/api finds only categories/parameters/custom/preferences). The 9 AI metrics are a new concept with no table; a static-config read endpoint is the single source of truth Phase 2 reuses, and the PRD defines default_rubric_load_failed (a fetch failure mode). |
Source Verification
| Claim | Evidence (file:line / identifier) |
|---|---|
| scorecard_preferences has is_auto_score (bool, default false) + passing_grade (float) | chatbot/db/chatbot_gpt_schema.rb:454-465; migration db/chatbot_gpt_migrate/20240206095006_create_scorecard_preference.rb |
| Human passing_grade validated 1–99 | chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_preference/patch.rb:18-22 rule(:passing_grade) |
| Preference upsert keyed by organization_id | chatbot/app/core/repositories/gpt/scorecard_preferences/upsert.rb:13-25; model validates_uniqueness_of :organization_id |
| Defaults DEFAULT_PASSING_GRADE=75, DEFAULT_AUTO_SCORE=true | chatbot/app/core/entities/frontend_services/gpt/scorecard_preference.rb:7-9 |
| scorecard_custom_parameters.prompt exists as string, unused by API | chatbot/db/chatbot_gpt_schema.rb:418-441 (t.string "prompt"); custom-param Grape params omit prompt (app/api/frontend_service/v1/gpt/scorecard_custom_parameter.rb POST/PATCH params) |
| Length validation macro pattern | chatbot/app/core/use_cases/api/frontend_service/v1/gpt/scorecard_custom_parameter/create.rb:39-59 |
| auto_agent_scoring.rb scores the human agent on room resolve | chatbot/app/core/use_cases/gpt/omnichannel/auto_agent_scoring.rb:6,76-79; trigger app/core/use_cases/api/internal_service/v1/webhook/room_resolve_interactions.rb:48-63; worker app/workers/auto_agent_scoring_worker.rb |
| OpenAI client pattern (Phase 2 ref) | auto_agent_scoring.rb:160-179 OpenAI::Client.new(request_timeout: 240), max_attempts = 2 |
| Role enum {owner,admin,supervisor,agent,member} (single role) | hub-core app/core/domains/models/user.rb:38-44; surfaced via hub-service /api/core/v1/users/me; consumed chatbot/app/api/frontend_service/middlewares/auth.rb:15-24 → env['user'] → current_user['role']; checked app/api/frontend_service/helpers/authorization_helpers.rb:6-10 |
| Existing scorecard endpoints gate owner/admin/supervisor | scorecard_preferences.rb (get/patch) + scorecard_custom_parameter.rb (post/patch/delete) set_role(%w[owner admin supervisor]) |
| Feature-flag mechanism (BE) | chatbot/app/core/repositories/organization_features/find_feature.rb:4-22; usage app/core/repositories/ai_knowledge_sources/search.rb |
| paper_trail already on both models | chatbot/app/models/chatbot_gpt/scorecard_preference.rb:5; scorecard_custom_parameter.rb:6 |
| Mixpanel async worker | chatbot/app/workers/send_mixpanel_event_worker.rb; config/initializers/mixpanel.rb |
| OpenAPI mandatory workflow | chatbot/AGENTS.md:235-247 |
| FE has no scorecard code today | grep scorecard|passing_grade|is_auto_score in chatbot-fe/{common,store,pages,modules} → 0 hits |
| FE settings/store/service/flag patterns | chatbot-fe/modules/settings/views/ai-assist.vue; store/ai-assist/*; common/services/main/v1/ai-assist.ts; common/services/main/endpoint.ts:123-157; plugins/botSubscriptionFeature.ts:23-40; common/contants/mixpanel-events.ts |
| chatbot_gpt DB connection | chatbot/app/models/chatbot_gpt_record.rb connects_to database: { writing: :chatbot_gpt, reading: :chatbot_gpt } |
Detail 2.1 — Infrastructure Topology
flowchart TB
user(["QA Lead / Supervisor / Owner / Admin (web)"])
lb["LB / Ingress"]
fe["chatbot-fe pods (Nuxt 4 SSR/SPA)"]
api["chatbot pods (Puma · Grape FrontendService)"]
hub["hub-service /api/core/v1/users/me (auth)"]
pg[("Postgres — chatbot_gpt DB\n(writing+reading)")]
redis[("Redis")]
sidekiq["Sidekiq workers"]
mp["Mixpanel (external)"]
oai["OpenAI (external) — Phase 2 only"]
user --> lb --> fe -->|"$apiMain /api (Bearer + X-Auth-Token)"| lb
lb --> api
api -->|"validate token"| hub
api -->|"read/write settings + rubric"| pg
fe -.->|"track events"| mp
api -->|"enqueue analytics"| redis --> sidekiq --> mp
sidekiq -. "Phase 2 AutoAgentScoring" .-> oai
Per-service responsibilities
| Service | Use cases (this RFC) | Internal calls (owner) | External APIs |
|---|---|---|---|
| chatbot-fe | Render settings + rubric editor + viewer; client validation; fire events | chatbot API (BOT) | Mixpanel (browser) |
| chatbot (Grape) | Persist AI preference; persist custom-param rubric; serve default rubric; authz; feature-gate | hub-service /users/me (Core team); Mixpanel worker | — (Phase 1); OpenAI in Phase 2 |
| hub-service / hub-core | Token validation, issues single role | — | — |
| chatbot_gpt DB | Store scorecard_preferences, scorecard_custom_parameters (+ paper_trail versions) | — | — |
Detail 2.1a — Sequence Diagrams (happy + failure paths)
S01 — Save AI preference (authz + validation failure)
sequenceDiagram
participant U as User
participant FE as chatbot-fe
participant LB as LB
participant API as chatbot (Grape)
participant HUB as hub-service /users/me
participant DB as chatbot_gpt (Postgres)
participant MP as Mixpanel (async)
U->>FE: toggle AI on, set ai_passing_grade
FE->>LB: PATCH scorecard_preferences (Bearer+X-Auth)
LB->>API: forward
API->>HUB: validate token
HUB-->>API: {role: supervisor, organization_id}
alt role not in {owner,admin,supervisor}
API-->>FE: 403 Permission denied
else authorized
API->>API: Dry coerce :float + rule ai_passing_grade in 0..100
alt non-numeric or out of range
API-->>FE: 422 "AI passing grade only between 0 - 100"
FE-->>U: inline error, nothing saved
else valid
API->>DB: upsert by organization_id (save!)
DB-->>API: ok
API-->>FE: 200 {is_ai_auto_score, ai_passing_grade}
FE-)MP: track scorecard_settings_updated
FE-->>U: "Change saved"
end
end
S02 — Save custom param + rubric (happy + DB failure)
sequenceDiagram
participant FE as chatbot-fe
participant API as chatbot (Grape)
participant DB as chatbot_gpt
participant MP as Mixpanel (async)
FE->>API: POST scorecard_custom_parameters {name, prompt}
API->>API: validate_prompt_length (<=4000) + name rules
alt prompt > 4000
API-->>FE: 422 "AI judging rubric cannot exceed 4000 characters."
else valid
API->>DB: create (save!) — company_id from token
alt save! raises
API->>API: Rollbar.error(e)
API-->>FE: 500 "Something went wrong"
FE-)MP: track scorecard_custom_param_save_failed
FE-->>FE: $toast error + Retry (no partial state)
else ok
API-->>FE: 200 {prompt, auto_scorable: prompt.present?}
FE-)MP: track scorecard_custom_param_saved {has_rubric}
end
end
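The length check in the S02 diagram can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not the repo's validate_prompt_length macro (which follows the dry-validation macro pattern in Anchor #7): `MAX_PROMPT_LENGTH` and `validate_prompt` are hypothetical names, and the 4,000 limit is the A3 build default.

```ruby
# Hypothetical stand-in for validate_prompt_length; plain Ruby, no dry-validation.
MAX_PROMPT_LENGTH = 4000 # A3: build default, tunable

# Returns nil when the prompt is acceptable, otherwise the 422 message
# shown in the S02 sequence diagram.
def validate_prompt(prompt)
  return nil if prompt.nil?                        # prompt is optional
  return nil if prompt.length <= MAX_PROMPT_LENGTH # String#length counts characters, not bytes

  "AI judging rubric cannot exceed #{MAX_PROMPT_LENGTH} characters."
end
```

A prompt of exactly 4,000 characters passes; 4,001 is rejected before anything reaches the database.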
S03 — Load default rubric (happy + fetch failure)
sequenceDiagram
participant FE as chatbot-fe
participant API as chatbot (Grape)
FE->>API: GET scorecard_ai_default_rubric
alt success
API-->>FE: 200 {status:PROPOSED, metrics:[9 + veto]}
FE-->>FE: render list + veto badges
else 500 / network
FE-->>FE: "Couldn't load the default rubric." + Retry
FE-)FE: track default_rubric_load_failed
end
Detail 2.1b — Rubric Gate Branch
flowchart TD
A[Save custom parameter] --> B{prompt non-empty?}
B -->|Yes| C[auto_scorable = true]
B -->|No| D[auto_scorable = false — manual-only]
C --> E[Persist + return auto_scorable]
D --> E
Detail 2.2 — Technical Decisions (ADR-format)
ADR-1 — Store AI scoring as new columns, not overloaded human columns
- Context. scorecard_preferences.is_auto_score/passing_grade already drive auto_agent_scoring.rb (human). PRD requires AI on-switch + AI threshold "with the existing human auto-score untouched."
- Options.
  - A. New columns is_ai_auto_score + ai_passing_grade: clean separation; human path provably unchanged; Phase 2 reads AI cols explicitly. Con: one migration + a few columns.
  - B. Overload is_auto_score/passing_grade for both lenses. Con: entangles human and AI, high regression risk on a live path; a single threshold can't differ per lens.
  - C. Reuse is_auto_score switch, add only ai_passing_grade. Con: can't enable AI without enabling human auto-scoring and vice-versa.
- Decision. Option A (confirmed by DRI).
- Rationale. Strongest guarantee that human auto-scoring is byte-for-byte unchanged; matches PRD's two-lens intent.
- Consequences. Migration adds is_ai_auto_score (bool, default false) + ai_passing_grade (float, nullable). Contracts/entities/upsert extended. Phase 2 reads the AI columns.
- Reversibility. High: drop the two columns; no human-path coupling.
ADR-2 — Widen scorecard_custom_parameters.prompt string → text
- Context. A real judging rubric (PRD ~4,000 chars) does not fit a single-line string.
- Options. A. change_column … :text (Postgres in-place, no rewrite for varchar→text). B. Add a new rubric text column and dual-write. Con: duplicate field, migration of an unused column for no benefit.
- Decision. Option A. change_column :scorecard_custom_parameters, :prompt, :text.
- Rationale. prompt is already the PRD's named field and is currently unused, so the widen is non-destructive (varchar→text widening preserves data).
- Consequences. Migration in db/chatbot_gpt_migrate/; chatbot_gpt_schema.rb regen.
- Reversibility. Low/risky (text→string truncates); treat as forward-only; rollback is the flag, not the column type.
ADR-3 — auto_scorable is derived, not stored
- Context. PRD: "non-empty rubric marks the param auto-scorable."
- Options. A. Compute auto_scorable = prompt.present? at read time. B. Store a boolean column kept in sync. Con: drift risk, redundant with the source of truth.
- Decision. Option A: expose auto_scorable in the response entity/builder.
- Rationale. Single source of truth (prompt); no sync bug; Phase 2 re-derives the same way.
- Consequences. Response entity gains a computed auto_scorable field; no schema change.
- Reversibility. High.
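The derivation in ADR-3 can be sketched without Rails. The helper below approximates ActiveSupport's `present?` with plain Ruby (it treats nil, empty, and ASCII-whitespace-only prompts as "no rubric"); `auto_scorable?` and `custom_parameter_response` are hypothetical names, and only the `prompt` and `auto_scorable` fields come from this RFC.

```ruby
# Plain-Ruby approximation of prompt.present? for this sketch.
def auto_scorable?(prompt)
  !prompt.nil? && !prompt.strip.empty?
end

# Hypothetical response builder: auto_scorable is computed at read time
# from prompt and is never stored.
def custom_parameter_response(name:, prompt:)
  { name: name, prompt: prompt, auto_scorable: auto_scorable?(prompt) }
end
```

Because the flag is recomputed on every read, clearing the rubric flips the parameter back to manual-only with no second column to keep in sync.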
ADR-4 — Default rubric via a new read-only endpoint from static config
- Context. The 9 AI metrics (PROPOSED, DSAI-owned) have no table; PRD defines a default_rubric_viewed/default_rubric_load_failed fetch.
- Options.
  - A. New read-only endpoint serving a Ruby constant/YAML (Constants::ScorecardAiDefaultRubric).
  - B. FE static constant. Con: no real load-failure mode; duplicates the list Phase 2 needs server-side.
  - C. Seed the 9 metrics into scorecard_parameters/categories. Con: mixes AI metrics into human-scorecard tables; risks human auto-scorer picking them up.
- Decision. Option A (confirmed by DRI).
- Rationale. Server is the single source of truth; Phase 2 scoring reads the same constant; honors the PRD's fetch + failure event; no schema entanglement.
- Consequences. New Grape resource + use case + entity; content carries status: PROPOSED.
- Reversibility. High: delete endpoint + constant.
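One possible shape for the static config in Option A is a frozen Ruby constant. The sketch below is illustrative only: the module name `RubricSketch` and the field names `key` and `veto` are assumptions, and only Groundedness and Policy are named in this RFC, so the other seven entries are explicit placeholders rather than real metric names.

```ruby
# Sketch of a static default-rubric constant (hypothetical layout).
module RubricSketch
  # The two veto metrics named in this RFC.
  VETO_METRICS = %w[groundedness policy].freeze

  # PLACEHOLDERS: the remaining seven metric names are not given in this section.
  PLACEHOLDER_METRICS = (3..9).map { |i| "placeholder_metric_#{i}" }.freeze

  METRICS = (VETO_METRICS + PLACEHOLDER_METRICS).map { |key|
    { key: key, veto: VETO_METRICS.include?(key) }.freeze
  }.freeze

  # Payload the read-only endpoint would serialize.
  DEFAULT_RUBRIC = { status: "PROPOSED", metrics: METRICS }.freeze
end
```

Freezing the structure keeps the endpoint and any later consumer reading the same immutable data.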
ADR-5 — Feature gate on ai_qa_unified_scorecard
- Context. Surfaces must ship dark until Phase 2; Pro+Ent only.
- Options. A. Reuse org-feature mechanism (BE OrganizationFeatures::FindFeature, FE $hasSubscription). B. New bespoke flag system. Con: reinvents an existing pattern.
- Decision. Option A. BE guards the three surfaces (return 404/empty or feature_enabled:false); FE hides routes/controls via $hasSubscription('ai_qa_unified_scorecard').
- Rationale. Matches existing AI-assist gating; plan-gating piggybacks on provisioning (A2).
- Consequences. Feature row must be provisioned per org; default OFF.
- Reversibility. High: toggle the feature off.
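The backend half of this gate can be sketched with an in-memory stand-in for the org-feature lookup. `FeatureGateSketch` and its methods are hypothetical; the real lookup is OrganizationFeatures::FindFeature, whose return shape is not shown in this RFC, and the 404 result is just one of the response options the decision lists.

```ruby
FEATURE_CODE = "ai_qa_unified_scorecard"

# Hypothetical in-memory gate; stands in for the org-feature repository.
class FeatureGateSketch
  # features_by_org example: { "org-1" => ["ai_qa_unified_scorecard"] }
  def initialize(features_by_org)
    @features_by_org = features_by_org
  end

  # Default OFF: an org with no provisioned row is treated as disabled.
  def enabled?(organization_id)
    @features_by_org.fetch(organization_id, []).include?(FEATURE_CODE)
  end

  # Runs the block only when the flag is on; otherwise returns a 404-style result.
  def guard(organization_id)
    return { status: 404, body: nil } unless enabled?(organization_id)

    { status: 200, body: yield }
  end
end
```

Toggling the feature row off is then the whole rollback: the guarded block never runs for that org.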
ADR-6 — Synchronous writes; async analytics
- Context. Save P95 ≤ 500ms; events must not block saves.
- Decision. Settings/rubric writes are synchronous single-row upserts (well under 500ms); Mixpanel events enqueued via SendMixpanelEventWorker.perform_async.
- Options. No async needed for the write itself (no alternative considered; single-row DB write under budget). Analytics async is the existing pattern.
- Consequences. Event delivery is best-effort and never fails the save.
- Reversibility. High.
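The "best-effort, never fails the save" consequence can be sketched as follows. Everything here is a hypothetical stand-in: `store` is a plain Hash in place of the Postgres upsert, and `enqueue` is any callable in place of SendMixpanelEventWorker.perform_async.

```ruby
# Synchronous write first, then a best-effort analytics enqueue.
def save_with_analytics(store:, enqueue:, organization_id:, attrs:)
  # Synchronous single-row "upsert" keyed by organization.
  store[organization_id] = (store[organization_id] || {}).merge(attrs)

  begin
    enqueue.call(organization_id, "scorecard_settings_updated", attrs)
  rescue StandardError
    # Best-effort: an analytics failure must never fail the save.
  end

  store[organization_id]
end
```

The write completes before the enqueue is attempted, so a broken queue can lose an event but cannot lose or roll back the saved preference.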
ADR-7 — Authorization reuses the existing role enum
- Context. No qa_lead/bot_admin role exists; single-role enum {owner,admin,supervisor,agent,member} (hub-core user.rb:38-44).
- Options. A. Keep set_role(%w[owner admin supervisor]); map QA Lead→supervisor, Bot/AI Admin→owner/admin. B. Introduce new roles. Con: cross-cutting change to hub-core + hub-service token issuance, far outside this initiative.
- Decision. Option A (confirmed by DRI).
- Rationale. Matches every existing scorecard endpoint; agent/member excluded (= "end CS agents: no access").
- Consequences. Read + write on all three surfaces gate owner/admin/supervisor.
- Reversibility. High; revisit if a QA role lands platform-wide.
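The persona mapping in ADR-7 reduces to a membership test on the token's single role. The sketch below is a plain-Ruby illustration; `authorize` is a hypothetical stand-in for set_role, and `PERSONA_ROLES` simply restates the mapping from the Role Coverage table.

```ruby
ALLOWED_ROLES = %w[owner admin supervisor].freeze

# PRD persona -> platform roles, as mapped in ADR-7.
PERSONA_ROLES = {
  "QA Lead / Supervisor" => %w[supervisor],
  "Bot / AI Admin"       => %w[owner admin],
  "End CS agent"         => %w[agent member]
}.freeze

# Hypothetical stand-in for set_role: 403 unless the single role is allowed.
def authorize(role)
  ALLOWED_ROLES.include?(role) ? 200 : 403
end
```

Every role under the first two personas resolves to 200, and both roles under "End CS agent" resolve to 403, which is the access split the table describes.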
ADR-8 — ai_passing_grade range 0–100 (diverges from human 1–99)
- Context. PRD §6/§9 say AI threshold 0–100; existing human rule is 1–99 (patch.rb:18-22).
- Options. A. Validate the new field 0–100 inclusive per PRD. B. Match human 1–99 for consistency. Con: contradicts PRD's stated bar (0 and 100 both meaningful).
- Decision. Option A: rule(:ai_passing_grade) { key.failure unless (0..100).cover?(value) }.
- Rationale. New field, follow the PRD spec; 0 ("any pass") and 100 ("perfect only") are legitimate.
- Consequences. Two different valid ranges in one table — documented; surfaced as a minor follow-up to align (§5 Open Q#4).
- Reversibility. High — change the rule bound.
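The two ranges in ADR-8 are easy to confuse on the FE, so a client-side mirror may help keep them apart. The following is a minimal TypeScript sketch, assuming hypothetical helper and constant names (none of them exist in the repos); the server-side Dry rule stays authoritative and this only mirrors it for inline errors.

```typescript
// Hypothetical FE mirror of the server rules; all names are illustrative.
interface GradeRange {
  min: number;
  max: number;
}

// ADR-8: the AI threshold is 0–100 inclusive; the existing human rule is 1–99.
const AI_GRADE_RANGE: GradeRange = { min: 0, max: 100 };
const HUMAN_GRADE_RANGE: GradeRange = { min: 1, max: 99 };

// True when the value is a finite number inside the inclusive range.
function isWithinRange(value: number, range: GradeRange): boolean {
  return isFinite(value) && value >= range.min && value <= range.max;
}

// ai_passing_grade is optional in the contract, so null ("unset") is accepted.
function isValidAiPassingGrade(value: number | null): boolean {
  return value === null || isWithinRange(value, AI_GRADE_RANGE);
}

function isValidHumanPassingGrade(value: number): boolean {
  return isWithinRange(value, HUMAN_GRADE_RANGE);
}
```

Under these assumptions, 0 and 100 pass the AI check but fail the human check, which is the documented divergence.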
Minimum-coverage checklist
- Storage: chatbot_gpt Postgres; new cols + widened prompt (ADR-1,2).
- Sync vs async: sync writes, async analytics (ADR-6).
- Caching: n/a. Single-row reads, no cache; default rubric is a static constant.
- Third-party: Mixpanel via existing worker (ADR-6); OpenAI is Phase 2.
- Consistency: strong (single-row upsert, unique per org).
- Multi-tenancy: org-scoped by organization_id (preference) / company_id (custom param) from the validated token; never client-supplied (§3 Security).
- Reuse vs new: 2 extended endpoints + 1 new (ADR-4, Existing-Endpoint Check).
Detail 2.3 — Database Model
erDiagram
SCORECARD_PREFERENCES {
bigint id PK
string organization_id UK "unique where deleted_at IS NULL"
boolean is_auto_score "human (existing), default false"
float passing_grade "human (existing), 1-99"
boolean is_ai_auto_score "NEW, default false"
float ai_passing_grade "NEW, nullable, 0-100"
string company_id
datetime deleted_at
}
SCORECARD_CUSTOM_PARAMETERS {
uuid id PK
string name
string code
string description
text prompt "WIDENED string->text (AI judging rubric)"
string company_id "UK [code, company_id] where deleted_at IS NULL"
datetime deleted_at
}
SCORECARD_CATEGORIES_PARAMETERS }o--|| SCORECARD_CUSTOM_PARAMETERS : references
DDL (Rails DSL, chatbot_gpt connection — pattern: db/chatbot_gpt_migrate/20241113041150_*):
# db/chatbot_gpt_migrate/<ts>_add_ai_scoring_to_scorecard_preferences.rb
class AddAiScoringToScorecardPreferences < ActiveRecord::Migration[7.1]
def change
add_column :scorecard_preferences, :is_ai_auto_score, :boolean, null: false, default: false
add_column :scorecard_preferences, :ai_passing_grade, :float
add_index :scorecard_preferences, :is_ai_auto_score
end
end
# db/chatbot_gpt_migrate/<ts+1>_widen_scorecard_custom_parameter_prompt.rb
class WidenScorecardCustomParameterPrompt < ActiveRecord::Migration[7.1]
def up
change_column :scorecard_custom_parameters, :prompt, :text
end
def down
change_column :scorecard_custom_parameters, :prompt, :string # WARNING: truncates >255
end
end
No data backfill (PRD §11). New AI columns default to "off/unset"; existing rows unaffected. Regenerate db/chatbot_gpt_schema.rb after migrating.
Per-status lifecycle: n/a — no status enum introduced (no new state machine; acts_as_paranoid soft-delete + paper_trail versioning already exist on both models and are unchanged).
State Surface Contract:
| Entity | Surfaced to | Field(s) | Visibility | Audit |
|---|---|---|---|---|
| scorecard_preferences | settings page | is_ai_auto_score, ai_passing_grade (+ existing human) | owner/admin/supervisor; flag on | paper_trail (existing) |
| scorecard_custom_parameters | editor + list | prompt, derived auto_scorable | owner/admin/supervisor; flag on | paper_trail (existing) |
| default rubric (static) | viewer | 9 metrics + veto + PROPOSED | owner/admin/supervisor; flag on | n/a (read-only constant) |
Detail 2.4 — APIs (Outbound the FE consumes)
Base: chatbot frontend_service surface, called by FE $apiMain at /api (gpt_api.rb:28,33). Mirror the same Grape classes on the gpt_service surface (/v1/scorecards/..., gpt_service/api.rb:21,24) for parity. Auth: Bearer access-token + X-Auth-Token (validated via middlewares/auth.rb → hub-service /users/me).
Row 1 — extended — Preference (AI fields)
GET /api/v1/gpt/omnichannel/scorecard_preferences # role: owner|admin|supervisor; flag-gated
PATCH /api/v1/gpt/omnichannel/scorecard_preferences
Request (PATCH):
{
"passing_grade": 75, // existing human (required by current contract)
"is_auto_score": true, // existing human (required)
"is_ai_auto_score": true, // NEW (optional; default false)
"ai_passing_grade": 80 // NEW (optional; validated 0..100 when present)
}
Response (GET/PATCH 200):
{
"data": {
"is_auto_score": true, "passing_grade": 75,
"is_ai_auto_score": true, "ai_passing_grade": 80
},
"message": "OK"
}
Errors: 422 ai_passing_grade non-numeric → coercion failure; 422 outside 0–100 → "AI passing grade only between 0 - 100 are allowed"; 403 role; 401 auth; 500 save fail.
New AI fields are optional in the contract so existing callers sending only human fields keep working (backward compat). Contract: optional(:is_ai_auto_score).maybe(:bool), optional(:ai_passing_grade).maybe(:float). Dry coerces type before the range rule(:ai_passing_grade), so a non-numeric value 422s instead of raising. On read, absent ai_passing_grade → DEFAULT_AI_PASSING_GRADE (75), is_ai_auto_score → false.
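The read-side defaulting described above can be sketched as follows. This is an illustration in TypeScript only; the real defaulting lives in the Ruby entity, and the interface and function names here (StoredPreference, PreferenceView, applyReadDefaults) are hypothetical.

```typescript
// Value taken from the RFC's Configuration Contract (§4.A).
const DEFAULT_AI_PASSING_GRADE = 75;

// Hypothetical shape of a stored row: the AI fields may be absent or null.
interface StoredPreference {
  is_auto_score: boolean;
  passing_grade: number;
  is_ai_auto_score?: boolean | null;
  ai_passing_grade?: number | null;
}

// Hypothetical shape after defaults are applied on read.
interface PreferenceView {
  is_auto_score: boolean;
  passing_grade: number;
  is_ai_auto_score: boolean;
  ai_passing_grade: number;
}

function applyReadDefaults(row: StoredPreference): PreferenceView {
  return {
    is_auto_score: row.is_auto_score,
    passing_grade: row.passing_grade,
    // ?? only falls back on null/undefined, so a stored false or 0 survives.
    is_ai_auto_score: row.is_ai_auto_score ?? false,
    ai_passing_grade: row.ai_passing_grade ?? DEFAULT_AI_PASSING_GRADE,
  };
}
```

The sketch uses `??` rather than `||` because 0 is a legitimate threshold under ADR-8 and must not be replaced by the default.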
Row 2 — extended — Custom parameter (rubric)
POST /api/v1/gpt/scorecard_custom_parameters # role: owner|admin|supervisor; flag-gated
PATCH /api/v1/gpt/scorecard_custom_parameters/:id
Request adds:
{ "name": "BANT capture", "prompt": "Score how completely … 0-100 + which were missed." }
Response 200 adds:
{ "data": { "id": "<uuid>", "name": "BANT capture", "prompt": "…", "auto_scorable": true }, "message": "…" }
Errors: 422 prompt length > 4000 → "AI judging rubric cannot exceed 4000 characters."; existing 422 name rules; 403/401/500 as today.
Create vs update / duplicate handling (REV-5): POST creates, PATCH :id updates. These are not an upsert, so adding prompt does not change create semantics. Duplicate names are already rejected by the existing Repositories::Gpt::ScorecardCustomParameter::NameUniquenessValidator#validate_create (create.rb → 422 "The name field is already exist or name cannot be the same as default parameter"). The new prompt field is orthogonal to uniqueness; no new collision surface is introduced.
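For the 4000-character cap on prompt, the FE length counter and the server check should count the same unit. The sketch below assumes the server counts characters the way Ruby's String#length does (Unicode code points), whereas JavaScript's .length counts UTF-16 code units; that assumption and the helper names are not confirmed by the source.

```typescript
const PROMPT_MAX_CHARS = 4000;

// Counts Unicode code points so an emoji counts once, not twice.
function countCodePoints(text: string): number {
  let count = 0;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    // A high surrogate followed by a low surrogate forms one code point.
    if (unit >= 0xd800 && unit <= 0xdbff && i + 1 < text.length) {
      const next = text.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++;
      }
    }
    count++;
  }
  return count;
}

// Returns the catalog message (§3.B) when the rubric is too long, else null.
function promptLengthError(prompt: string): string | null {
  return countCodePoints(prompt) > PROMPT_MAX_CHARS
    ? "AI judging rubric cannot exceed 4000 characters."
    : null;
}
```

If the server turns out to count bytes or code units instead, only countCodePoints would need to change.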
Row 3 — new-with-justification — Default AI rubric (read-only)
GET /api/v1/gpt/scorecard_ai_default_rubric # role: owner|admin|supervisor; flag-gated
Response 200:
{
"data": {
"status": "PROPOSED",
"group": "Qontak AI Quality (default)",
"metrics": [
{ "code": "groundedness", "name": "Groundedness / factual accuracy", "description": "Claims backed by KB sources or customer data; no invented product facts", "veto": true },
{ "code": "resolution", "name": "Resolution / task completion", "description": "Did it resolve the goal (skill_completed signal)", "veto": false },
{ "code": "relevance", "name": "Relevance / intent understanding", "description": "Addressed the real intent, not a different question", "veto": false },
{ "code": "policy", "name": "Policy & safety adherence", "description": "Stayed within 'what to avoid'; no unsafe content / PII leak", "veto": true },
{ "code": "tone", "name": "Tone & brand voice", "description": "Matched configured tone_of_voice; courteous", "veto": false },
{ "code": "language", "name": "Language quality (Bahasa)", "description": "Fluent target language; no broken/mixed language", "veto": false },
{ "code": "handoff", "name": "Handoff appropriateness", "description": "No false handover (Pattern A); no missed escalation", "veto": false },
{ "code": "tool", "name": "Tool / action correctness", "description": "Right action, right params, not skipped (Pattern B)", "veto": false },
{ "code": "efficiency", "name": "Conversation efficiency", "description": "No loops / re-asking; resolved within turn budget", "veto": false }
]
},
"message": "OK"
}
Errors: 500 → FE shows default_rubric_load_failed; 403/401.
APIs (Inbound — other services → us): n/a — Phase 1 adds no inbound webhook (the existing room-resolve webhook is unchanged).
Detail 2.A — UI Contract (FE)
Design status: Figma Pending (PRD). Interim spec = PRD Appendix B Stitch prompts. Components use Pixel3 (Mp*). New page under pages/settings/scorecard/; logic in modules/settings/views/ mirroring ai-assist.vue.
| Component | File (new) | Purpose | Key Pixel3 elements | Backing endpoint |
|---|---|---|---|---|
| ScorecardSettingsPage | pages/settings/scorecard/index.vue | Container; flag guard | layout + MpTabs/sections | preference + default rubric |
| AutoScoreToggle | modules/settings/views/scorecard/auto-score-toggle.vue | AI on-switch + ai_passing_grade (0–100) | MpSwitch/MpFormControl/MpInput+MpFormErrorMessage/MpButton(is-loading) | PATCH .../scorecard_preferences |
| DefaultRubricViewer | modules/settings/views/scorecard/default-rubric-viewer.vue | Read-only 9 metrics + 🛑 veto + PROPOSED note | MpText/MpBadge/skeleton | GET .../scorecard_ai_default_rubric |
| CustomParamEditor | pages/settings/scorecard/custom-parameters.vue + modules/settings/views/scorecard/custom-param-editor.vue | Add/edit param + rubric textarea + length counter + auto-scorable chip | MpInput/MpTextarea/MpBadge/MpButton | POST/PATCH .../scorecard_custom_parameters |
Design ↔ Code Mapping: n/a — Figma pending; tokens follow the existing settings shell
(ai-assist.vue). Any deviation re-checked once frames land (§5 Open Q#1).
Detail 2.B — Data-Fetching Strategy (FE)
- New Pinia store chatbot-fe/store/scorecard/{state,actions,getters,types,index}.ts, mirroring store/ai-assist (fetchStatus: idle|pending|resolved|rejected).
- New service chatbot-fe/common/services/main/v1/scorecard.ts using $apiMain + AbortController (pattern: ai-assist.ts:151-184). Endpoints added to common/services/main/endpoint.ts:
scorecard: {
preference: { get: "/v1/gpt/omnichannel/scorecard_preferences", update: "/v1/gpt/omnichannel/scorecard_preferences" },
customParam: { create: "/v1/gpt/scorecard_custom_parameters", update: "/v1/gpt/scorecard_custom_parameters", list: "/v1/gpt/scorecard_custom_parameters" },
defaultRubric: { get: "/v1/gpt/scorecard_ai_default_rubric" },
}
- Fetch on page mount; optimistic UI not used (single Save action), matching
ai-assist.vue.
Casing convention (REV-6): the BE returns snake_case keys; the FE consumes them
directly without transformation, matching the existing pattern (e.g. store/ai-assist
reads state.reply_limit straight off the API). Do not introduce a camelCase mapping layer
for these endpoints — keep the snake_case field names end-to-end so the contract stays 1:1.
Typed contracts (REV-1) — store/scorecard/types.ts + common/services/main/v1/scorecard.ts:
// API request/response shapes (snake_case, matching BE Grape entities)
export interface ScorecardPreference {
is_auto_score: boolean // human (existing)
passing_grade: number // human (existing, 1–99)
is_ai_auto_score: boolean // NEW
ai_passing_grade: number | null // NEW (0–100; null → default 75 on read)
}
export interface CustomParam {
id: string
name: string
prompt: string // "" when manual-only
auto_scorable: boolean // derived = prompt non-empty
}
export interface DefaultRubricMetric {
code: string; name: string; description: string; veto: boolean
}
export interface DefaultRubric {
status: "PROPOSED"; group: string; metrics: DefaultRubricMetric[]
}
// Pinia store slice (mirrors store/ai-assist fetchStatus pattern)
type FetchStatus = "idle" | "pending" | "resolved" | "rejected"
export interface ScorecardState {
preference: { data?: ScorecardPreference; fetchStatus: FetchStatus }
preferenceUpdate: { fetchStatus: FetchStatus }
customParams: { data?: CustomParam[]; fetchStatus: FetchStatus }
customParamSave: { fetchStatus: FetchStatus }
defaultRubric: { data?: DefaultRubric; fetchStatus: FetchStatus }
}
Component props/emits types are deferred to implementation, inferred from the ai-assist.vue family (RFC §5 #7): low risk, single owning module.
Detail 2.C — UI State Matrix
stateDiagram-v2
[*] --> Loading: Open Scorecard settings
Loading --> Empty: No custom params yet
Loading --> Success: Saved config loaded
Loading --> Error: Load / save fails
Error --> Loading: Retry
Empty --> Success: Add first custom param
Success --> [*]: Config saved
| State | AutoScoreToggle | CustomParamEditor | DefaultRubricViewer |
|---|---|---|---|
| Loading | fields disabled + spinner | textarea disabled + spinner | skeleton list |
| Empty | defaults (off / 75) | "No custom parameters…" + add hint | n/a — 9 defaults always exist |
| Error | $toast error + Retry; log scorecard_settings_save_failed | $toast + Retry; log scorecard_custom_param_save_failed | "Couldn't load the default rubric." + Retry; log default_rubric_load_failed |
| Success | "Change saved" | "Saved — will be auto-scored when scoring ships"; chip lit if rubric present | 9 metrics listed, veto badges |
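The state diagram and matrix above can be read as a small transition function over the store's fetchStatus. The sketch below is one possible reading, not the store implementation: the event names, the treatment of idle as Loading, and both function names are assumptions.

```typescript
type FetchStatus = "idle" | "pending" | "resolved" | "rejected";
type FetchEvent = "start" | "success" | "failure";

// Pure transition: start (including Retry) moves to pending; success and
// failure only apply while a request is in flight.
function nextFetchStatus(current: FetchStatus, event: FetchEvent): FetchStatus {
  if (event === "start") {
    return "pending";
  }
  if (current !== "pending") {
    return current; // ignore late or duplicate results
  }
  return event === "success" ? "resolved" : "rejected";
}

type UiState = "Loading" | "Empty" | "Error" | "Success";

// Hypothetical mapping from a slice to the matrix rows.
function toUiState(status: FetchStatus, customParamCount: number): UiState {
  if (status === "idle" || status === "pending") {
    return "Loading";
  }
  if (status === "rejected") {
    return "Error";
  }
  return customParamCount === 0 ? "Empty" : "Success";
}
```

Keeping the transition pure makes the pending→resolved/rejected acceptance check in chunk 7 straightforward to unit test.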
Detail 2.D — Scope Boundaries
| In scope | Out of scope |
|---|---|
| AI cols + prompt widen; 3 endpoints; 4 FE components + store/service; flag gating; analytics events; OpenAPI | Any scoring/computation; in-room panel; report; gate; the room-resolve is_custom_parameter skip guard; new roles; plan-tier code; i18n introduction |
Detail 2.E — Branch & Skip Catalog
| Branch / skip | Condition | Behavior | Owner |
|---|---|---|---|
| Rubric auto-scorable gate | prompt.present? | non-empty → auto_scorable:true; empty → manual-only (false) | BE (custom-param entity) — §2.1b flowchart, S02/AC-3, NEG-1 |
| Flag-off skip | ai_qa_unified_scorecard disabled for org | FE hides routes/controls ($hasSubscription); BE returns flag-gated empty/404 | FE + BE (ADR-5), S01/NEG-1 |
| Plan-not-eligible skip | Starter/Free org (feature not provisioned) | Same as flag-off (no surface) | Provisioning (A2), S01/NEG-1 |
| Unauthorized skip | role ∈ {agent, member} | controls not rendered (FE); 403 (BE) | §3 Role matrix |
| AI-enable-without-Phase-2 | is_ai_auto_score=true pre-Phase-2 | recorded preference, no scores produced (inert) | A1; PRD Open Q#3 |
| Room-resolve AI skip (Phase 2, NOT built here) | existing unless is_custom_parameter || scorecard_exists guard | unchanged in Phase 1; Phase 2 must revisit | BE (forward note §1) |
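The rubric auto-scorable gate in the first row hinges on prompt.present?, which in Rails is false for nil, empty, and whitespace-only strings. If the FE chip previews the same rule before save, a mirror could look like the sketch below; the function name is hypothetical and the server-derived auto_scorable field remains the source of truth.

```typescript
// Approximates Rails' present?: null, undefined, empty, and whitespace-only
// prompts are all treated as "no rubric" (manual-only).
function isAutoScorable(prompt: string | null | undefined): boolean {
  return typeof prompt === "string" && prompt.trim().length > 0;
}

// Hypothetical chip label derived from the same rule.
function chipLabel(prompt: string | null | undefined): string {
  return isAutoScorable(prompt) ? "auto-scorable" : "manual-only";
}
```

A textarea containing only spaces or newlines would therefore keep the chip unlit, matching what the server is expected to derive.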
Detail 2.G — Cross-Layer Contract Verification
| Endpoint | PRD-to-Schema row (§1.A.5) | Interim design (App. B) | Match? |
|---|---|---|---|
| PATCH preference (AI fields) | rows 1–2 (AI on-switch, AI threshold) | Stitch #1 | yes |
| POST/PATCH custom param (prompt) | rows 3–4 (org rubric, derived auto_scorable) | Stitch #2 | yes |
| GET default rubric | row 5 (9 default metrics) | Stitch #1 (viewer block) | yes |
3. High-Availability & Security
Performance Requirement
Save P95 ≤ 500ms (PRD §6). Single-row upsert on an org-unique index; default-rubric is an
in-memory constant. No N+1 (custom-param list already paginated/scoped by company_id).
Monitoring & Alerting
Reuse Mixpanel + the squad dashboard (owner: BOT). Events (PRD §12):
scorecard_settings_updated, scorecard_settings_save_failed, scorecard_custom_param_saved,
scorecard_custom_param_save_failed, default_rubric_viewed, default_rubric_load_failed.
Alert: scorecard_settings_save_failed + scorecard_custom_param_save_failed rate > 5%
in 1h → Slack #bot-ai-oncall. (Naming mirrors existing [CHATBOT] Mixpanel events in
chatbot-fe/common/contants/mixpanel-events.ts.)
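The 5% alert condition can be made concrete with a small calculation. The source does not state the denominator, so the sketch below assumes it is total save attempts in the window, with scorecard_settings_updated and scorecard_custom_param_saved counted as successes; the interface and function names are hypothetical.

```typescript
// Hypothetical per-hour event counts pulled from the dashboard.
interface HourlySaveCounts {
  settingsUpdated: number;       // scorecard_settings_updated
  settingsSaveFailed: number;    // scorecard_settings_save_failed
  customParamSaved: number;      // scorecard_custom_param_saved
  customParamSaveFailed: number; // scorecard_custom_param_save_failed
}

const ALERT_THRESHOLD = 0.05;

// Assumed formula: failures / (failures + successes); 0 when there is no traffic.
function saveFailureRate(counts: HourlySaveCounts): number {
  const failed = counts.settingsSaveFailed + counts.customParamSaveFailed;
  const succeeded = counts.settingsUpdated + counts.customParamSaved;
  const attempts = failed + succeeded;
  return attempts === 0 ? 0 : failed / attempts;
}

// The RFC says "> 5%", so exactly 5% does not alert.
function shouldAlert(counts: HourlySaveCounts): boolean {
  return saveFailureRate(counts) > ALERT_THRESHOLD;
}
```

For example, 3 failures across 50 attempts is 6% and would alert, while 2 failures across 50 attempts is 4% and would not.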
Logging
Server errors via Rollbar.error(e) (existing pattern in the use cases); structured request
logs via lograge. Never log full prompt content at error level (may contain org IP) —
log org_id, custom_param_id, reason only (matches PRD event props).
Tracing (REV-3)
The BE already runs ddtrace (Datadog) + Aegis/OpenTelemetry (chatbot/Gemfile). The three new
/ extended endpoints are ordinary Grape requests, so they inherit existing request spans
automatically — no new instrumentation needed. Distributed FE→API→BE trace correlation is
explicitly out of scope for Phase 1 (no new correlation-id propagation is added); on-call
follows an FE error to the BE via the existing per-request Datadog span + the *_save_failed
Mixpanel event's org_id. Revisit cross-tier trace stitching with the Phase-2 scoring pipeline,
where the async OpenAI call makes it materially useful.
Security Implications
- AuthN: every endpoint behind middlewares/auth.rb (Bearer + X-Auth-Token → hub-service).
- AuthZ: set_role(%w[owner admin supervisor]) on all three (read + write). agent/member → 403.
- Tenancy (critical): organization_id (preference) and company_id (custom param) are taken only from current_user (the validated token), never the request body. This matches the existing endpoints: preference passes current_user[:organization_id] (scorecard_preferences.rb get/patch), custom param passes current_user.try(:[], 'company_id') (scorecard_custom_parameter.rb post/patch). The new prompt/AI fields must not introduce a body-supplied org/company id. Add a request-spec assertion that a token for org A cannot read/write org B's preference or params (cross-tenant write → scoped to token org).
- Input validation: ai_passing_grade coerced to :float then range-checked 0–100 (non-numeric → 422, never a raised exception). prompt capped 4,000 chars server-side; strip null bytes / control characters before persist so Phase-2 prompt assembly can't be broken by injected control chars.
- Injection / XSS: prompt is stored and rendered as text (an LLM instruction, not HTML). Custom-param name/description already pass sanitize_html (scorecard_custom_parameter.rb before_save); prompt does not need HTML sanitization but all custom-param text fields (name, description, prompt) must render via Vue interpolation, never v-html (Vue escapes by default).
- Prompt-injection (forward-looking): the prompt becomes part of an LLM system prompt in Phase 2; Phase 1 only stores it. Note for Phase 2: treat stored rubric as untrusted input.
- Audit: paper_trail already records versions on both models; ensure whodunnit is populated from current_user on these write paths (verify the existing PaperTrail.request.whodunnit wiring covers Grape requests; if not, set it from the token user in the use case).
- Secrets / PII in logs: never log full prompt (org IP). Rollbar.error(e) is already used; add prompt (and system_prompt) to the Rollbar param scrub list so request bodies aren't captured. Events log only org_id/custom_param_id/has_rubric/reason (PRD §12).
- DoS / size: prompt capped at 4,000 chars server-side (not just client); writes rely on the platform's existing request rate limiting (no new endpoint-specific limiter introduced).
- AuthZ on default-rubric endpoint: serves only static, non-tenant config but still requires auth + set_role (no anonymous access to the rubric).
- Data governance / retention (REV-7): the custom-param prompt is org-authored configuration (org IP), not end-customer PII. Phase 1 stores no conversation/customer data. Retention follows the existing acts_as_paranoid soft-delete on scorecard_custom_parameters (deleting a param soft-deletes its rubric); paper_trail versions persist for audit. The rubric is therefore out of scope for end-customer DSAR/export (it is account config, handled by normal account-deletion processes), and stays within the existing chatbot_gpt data boundary: no new data export, no new third-party data egress in Phase 1 (the rubric reaches OpenAI only in Phase 2, which owns that data-flow review).
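The control-character stripping named under Input validation could follow the rule sketched below. The source does not say which characters to keep, so this assumes tab, line feed, and carriage return are preserved (a rubric textarea will contain newlines) and everything else in the C0 range plus DEL is removed. The real step belongs in the BE before persist; the TypeScript form and the names are illustrative only.

```typescript
// NUL through BS, VT, FF, SO through US, and DEL. Tab (\u0009), LF (\u000A),
// and CR (\u000D) are deliberately left out of the class so they survive.
const STRIPPED_CONTROL_CHARS = /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g;

// Returns the prompt with disallowed control characters removed.
function stripControlChars(text: string): string {
  return text.replace(STRIPPED_CONTROL_CHARS, "");
}
```

Stripping before the length check or after it changes which inputs hit the 4,000-character cap; the order is a decision the implementation would need to make explicitly.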
Role × Endpoint Authorization
| Endpoint | owner | admin | supervisor | agent | member |
|---|---|---|---|---|---|
| GET/PATCH preference | ✅ | ✅ | ✅ | ❌ 403 | ❌ 403 |
| POST/PATCH custom param | ✅ | ✅ | ✅ | ❌ 403 | ❌ 403 |
| GET default rubric | ✅ | ✅ | ✅ | ❌ 403 | ❌ 403 |
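On the FE, the matrix above combines with the feature flag into a single render decision. A minimal sketch, assuming a hypothetical canAccessScorecardSettings helper and that the caller already has the user's role and the result of the subscription check; it does not replace the BE 403/404 behavior.

```typescript
type Role = "owner" | "admin" | "supervisor" | "agent" | "member";

// Roles allowed on all three surfaces per the matrix (ADR-7).
const SCORECARD_ROLES: readonly Role[] = ["owner", "admin", "supervisor"];

// Controls render only when the org feature is on AND the role is allowed.
function canAccessScorecardSettings(role: Role, featureEnabled: boolean): boolean {
  return featureEnabled && SCORECARD_ROLES.indexOf(role) !== -1;
}
```

Both conditions lead to the same FE outcome (controls not rendered), so the order of the two checks does not matter here.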
Detail 3.A — Failure Mode Catalog
| Failure | Detection | Behavior | Recovery |
|---|---|---|---|
| ai_passing_grade out of 0–100 | Dry rule | 422, nothing saved | FE inline error (Vuelidate mirror) |
| prompt > 4000 | Dry macro | 422, nothing saved | FE length counter blocks + server reject |
| DB write fails | save! raises → rescued | 500, no partial state (single-row tx) | FE $toast + Retry; log *_save_failed |
| Concurrent saves (two admins) | org-unique index + single-row upsert | last-write-wins; no partial row; paper_trail keeps both versions | acceptable for a config row; no lock needed |
| hub-service down / slow | middlewares/auth.rb → Repositories::ChatService::Users::Me returns nil | 401 "User service unavailable" | FE re-auth. Timeout/retry of this auth call is inherited from the existing middleware — out of scope to change here. |
| default-rubric load fails | 500 / network | FE error state | Retry; default_rubric_load_failed |
| flag off | BE gate + FE $hasSubscription | surfaces not rendered / 404 | n/a (by design) |
Detail 3.B — Error Message Catalog
| Code | Message | Surface |
|---|---|---|
| 422 | "AI passing grade only between 0 - 100 are allowed" | toggle |
| 422 | "AI judging rubric cannot exceed 4000 characters." | editor |
| 500 | "Couldn't save. Try again." | toggle/editor |
| 500 | "Couldn't load the default rubric." | viewer |
| 403 | "Permission denied" (existing) | all |
Detail 3.C — Accessibility
Pixel3 components are used as-is (existing settings a11y). New textarea has an associated
MpFormLabel; veto status conveyed by text + badge (not color alone); length counter has
aria-live=polite. Keyboard: Save reachable via tab; errors announced via MpFormErrorMessage.
Detail 3.D — Browser Support & FE Performance Budget (REV-4)
- Browser support: inherits the existing chatbot-fe Nuxt 4 target; no new matrix introduced. The new pages must work on the same browsers the current /settings/* pages support (no new polyfills, no APIs beyond what ai-assist.vue already uses).
- FE performance budget: the scorecard settings route is a lazy-loaded page (Nuxt route-level code-split, like other pages/settings/*), so it adds no weight to the initial bundle. The default-rubric list is 9 static rows and the custom-param list is already paginated server-side, so there is no large client render. No new heavy dependency is added (reuses Pixel3 + Vuelidate + mixpanel-browser already in package.json).
4. Backwards Compatibility and Rollout Plan
Compatibility
Additive only. New AI columns default off/unset; existing GET/PATCH callers that omit AI
fields keep working (AI fields optional). Human auto-scoring (auto_agent_scoring.rb) reads
only is_auto_score/passing_grade — untouched. prompt widen preserves existing data.
Rollout Strategy
Flag ai_qa_unified_scorecard default OFF. Stage 1 internal QA (3–5 accounts); Stage 2 closed
beta (TransGo, Talenta LMS + 3 partners, dark); held for customer GA with Phase 2 scoring
(PRD §11/§14). Detailed scheduling lives in delivery/ (not here).
Cross-Layer Rollout Compatibility
| Order | Step | Safe if FE not yet shipped? | Safe if BE not yet shipped? |
|---|---|---|---|
| 1 | BE migration (add cols, widen prompt) | yes (inert columns) | — |
| 2 | BE API + OpenAPI | yes (flag-gated, optional fields) | — |
| 3 | FE behind $hasSubscription | — | FE no-ops (feature off) |
| 4 | Provision feature for beta orgs | — | — |
Deploy BE before FE. Rollback FE before BE (FE depends on BE fields, not vice-versa).
Detail 4.A — Configuration Contract
| Key | Type | Default | Where |
|---|---|---|---|
| ai_qa_unified_scorecard | org feature (ChatbotGpt::Feature + OrganizationFeature) | OFF | BE OrganizationFeatures::FindFeature; FE $hasSubscription |
| DEFAULT_AI_PASSING_GRADE | constant (entity) | 75 (decided for build; mirrors human default) | entities/frontend_services/gpt/scorecard_preference.rb |
| prompt max length | constant in Dry macro | 4000 (decided for build; PRD Open Q#2 confirmation is advisory; tunable) | custom-param create/update use cases |
| ai_passing_grade valid range | Dry rule | 0–100 inclusive (decided, ADR-8) | scorecard_preference/patch.rb |
| default rubric content | Constants::ScorecardAiDefaultRubric | 9 metrics, content status:"PROPOSED" (served as-is) | new constant/config (seed from PRD App A) |
These values are locked for the Phase-1 build so the agent has no ambiguity. DSAI rubric confirmation (§5 #2) and the max-length confirmation (§5 #3) are advisory follow-ups that change config constants only; they do not block implementation.
Detail 4.B — Test Plan (commands from the repos)
Backend (chatbot/AGENTS.md:56-187,235-247):
# migrate the chatbot_gpt DB in test, then run specs
RAILS_ENV=test bundle exec rails db:migrate
bundle exec rspec spec/api/frontend_service/v1/gpt_spec.rb \
spec/core/use_cases/api/frontend_service/v1/gpt
bundle exec rubocop
bundle exec brakeman
bundle exec fasterer && bundle exec reek
# OpenAPI (MANDATORY when endpoints change):
ruby scripts/openapi/split.rb
npx --yes @apidevtools/swagger-cli validate docs/openapi/openapi.yaml
npx --yes @stoplight/spectral-cli lint docs/openapi/openapi.yaml --fail-severity=error
Frontend (chatbot-fe/package.json:10-22):
pnpm lint
pnpm test # vitest run
pnpm test:e2e # playwright (visual/e2e)
pnpm build
Cross-boundary contract test (REV-2). Because FE and BE land in separate repos/PRs, pin the contract on both sides so a casing/shape drift fails CI rather than production:
- BE (RSpec request spec): assert the PATCH-preference response and the custom-param response serialize exactly {is_auto_score, passing_grade, is_ai_auto_score, ai_passing_grade} and {id, name, prompt, auto_scorable} (snake_case, auto_scorable derived). This is the authoritative contract. Add to spec/api/frontend_service/v1/gpt_spec.rb.
- FE (Vitest service test): assert common/services/main/v1/scorecard.ts parses a fixture whose shape is copied verbatim from the BE spec's expected JSON into the ScorecardPreference/CustomParam/DefaultRubric interfaces (§2.B), and that the store maps auto_scorable → the chip. The shared fixture is the contract anchor: if the BE entity changes a key, the BE spec changes the fixture, and the FE test (using the same fixture) breaks, catching the drift.
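The "serialize exactly" assertion on the FE side can be expressed as an exact-key check against the shared fixture. The sketch below is framework-free so it stays self-contained; in the repo it would presumably sit inside a Vitest test, and the helper name hasExactKeys and the inline fixture are assumptions.

```typescript
// Key sets copied from the contract in this section.
const PREFERENCE_KEYS: readonly string[] = [
  "is_auto_score", "passing_grade", "is_ai_auto_score", "ai_passing_grade",
];
const CUSTOM_PARAM_KEYS: readonly string[] = ["id", "name", "prompt", "auto_scorable"];

// True only when payload is a plain object whose keys match exactly:
// a renamed, missing, extra, or camelCased key all fail.
function hasExactKeys(payload: unknown, expected: readonly string[]): boolean {
  if (typeof payload !== "object" || payload === null || Array.isArray(payload)) {
    return false;
  }
  const actual = Object.keys(payload).sort();
  const wanted = expected.slice().sort();
  return actual.length === wanted.length &&
    actual.every((key, index) => key === wanted[index]);
}

// Hypothetical fixture standing in for the BE spec's expected JSON.
const preferenceFixture: unknown = JSON.parse(
  '{"is_auto_score":true,"passing_grade":75,"is_ai_auto_score":true,"ai_passing_grade":80}'
);
const preferenceMatchesContract = hasExactKeys(preferenceFixture, PREFERENCE_KEYS);
```

TypeScript interfaces are erased at runtime, so a check of this kind is what actually fails CI when the BE renames a key; the interfaces alone would not.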
Detail 4.C — Agent Execution Plan
Order respects dependencies (migration → BE API → OpenAPI → FE). Each chunk has files + commands + assertable acceptance. Use the chatbot repo's openapi-spec-sync skill for chunks touching endpoints.
| # | Chunk | Files | Commands | Acceptance |
|---|---|---|---|---|
| 1 | Migration: AI columns | chatbot/db/chatbot_gpt_migrate/<ts>_add_ai_scoring_to_scorecard_preferences.rb; regen db/chatbot_gpt_schema.rb | RAILS_ENV=test bundle exec rails db:migrate | schema shows is_ai_auto_score (bool default false) + ai_passing_grade (float) |
| 2 | Migration: widen prompt | chatbot/db/chatbot_gpt_migrate/<ts>_widen_scorecard_custom_parameter_prompt.rb; regen schema | same | prompt column type = text |
| 3 | BE preference AI fields | app/api/frontend_service/v1/gpt/omnichannel/scorecard_preferences.rb; .../scorecard_preference/patch.rb,get.rb; repositories/gpt/scorecard_preferences/upsert.rb,find_by.rb; entities/frontend_services/gpt/scorecard_preference.rb (+DEFAULT_AI_PASSING_GRADE); api/.../entities/gpt/scorecard_preference.rb & get_scorecard_preference_response.rb | bundle exec rspec spec/.../gpt | new request spec: PATCH with ai_passing_grade:80 persists & GET returns it; ai_passing_grade:150→422 |
| 4 | BE custom-param prompt | app/api/frontend_service/v1/gpt/scorecard_custom_parameter.rb; .../scorecard_custom_parameter/create.rb,update.rb (+validate_prompt_length); repositories/gpt/scorecard_custom_parameter/create.rb,update.rb; entities/.../gpt/scorecard_custom_parameter.rb + response entity (prompt,auto_scorable) | bundle exec rspec | POST with prompt→auto_scorable:true; empty→false; 4001 chars→422 |
| 5 | BE default-rubric endpoint | app/api/frontend_service/v1/gpt/scorecard_ai_default_rubric.rb (NEW); use case + entity; Constants::ScorecardAiDefaultRubric (seed the 9 metrics' code/name/description/veto verbatim from PRD Appendix A Tier-1, exactly as in §2.4 Row 3; the per-metric LLM judge prompts stay out — they belong to Phase 2 scoring); mount in app/api/frontend_service/gpt_api.rb (+ app/api/gpt_service/api.rb) | bundle exec rspec | GET returns the 9 metrics with the §2.4 descriptions, veto:true on groundedness+policy, status:"PROPOSED" |
| 6 | OpenAPI sync | docs/openapi/openapi.yaml + dist/{frontend,gpt}.yaml; docs/openapi/SESSION-LOG.md | ruby scripts/openapi/split.rb; swagger-cli + spectral | both validators pass; roleScopedAuth overlay present for the 3 ops |
| 7 | FE store+service+endpoints | chatbot-fe/store/scorecard/*; common/services/main/v1/scorecard.ts; common/services/main/endpoint.ts; register in mainService index | pnpm test | store action unit tests: pending→resolved/rejected transitions |
| 8 | FE settings page + toggle | pages/settings/scorecard/index.vue; modules/settings/views/scorecard/auto-score-toggle.vue; Vuelidate 0–100; $hasSubscription guard; mixpanel events | pnpm test; pnpm lint | component test: out-of-range shows error; flag off → not rendered; save fires scorecard_settings_updated |
| 9 | FE custom-param editor | pages/settings/scorecard/custom-parameters.vue; modules/settings/views/scorecard/custom-param-editor.vue; length counter + auto-scorable chip | pnpm test | chip lights when textarea non-empty; >4000 blocked; saved toast |
| 10 | FE default-rubric viewer | modules/settings/views/scorecard/default-rubric-viewer.vue; veto badges + PROPOSED note; error+retry | pnpm test | renders 9 metrics + veto badges; error state shows retry + logs event |
| 11 | Cross-boundary contract test | BE: add expected-JSON assertions to spec/api/frontend_service/v1/gpt_spec.rb; FE: tests/unit/.../scorecard.service.test.ts parsing the same fixture into §2.B interfaces | bundle exec rspec; pnpm test | shared fixture → BE serializes it, FE parses it; a key/casing change breaks both (REV-2) |
| 12 | Test specs doc | documents/chatbot/unified-agent-scorecard/tests/phase-1-settings-and-rubric-config.md | n/a | covers_acceptance_criteria lists every UASC-S0x/AC-n |
Detail 4.D — Verification & Rollback Recipe
Pre-merge (in order): BE rails db:migrate → rspec → rubocop/brakeman → OpenAPI
split.rb + swagger-cli + spectral; FE pnpm lint → pnpm test → pnpm build.
Post-deploy signals:
- scorecard_settings_updated events appear for beta orgs; save-failure events < 5%/h.
- Manual: enable flag for one internal org → toggle AI scoring + set threshold + add a custom param with a rubric → reload → values persist; default rubric lists 9 metrics.
- Regression: human scorecard config + a resolved-room human auto-score still behaves as before.
- Adoption leading indicators (PRD §13; settings save success ≥ 99%): track config-readiness = % beta Pro+Ent orgs that enabled is_ai_auto_score + accepted the default rubric or added ≥ 1 custom param (target ≥ 80% before Phase-2 GA), and scorecard_custom_param_saved count (target ≥ 1 per beta org) via the scorecard_settings_updated/scorecard_custom_param_saved events. (These are PM-owned program metrics; the RFC only ensures the events exist to compute them.)
Rollback (numbered):
1. Disable ai_qa_unified_scorecard for affected orgs (instant; surfaces vanish).
2. If BE bug: revert the BE PR (AI fields optional → no caller breaks).
3. If FE bug: revert the FE PR (BE inert without FE).
4. Columns/prompt widen are forward-only: do not run the prompt down migration in prod (truncation). Leave columns; they are inert when the flag is off.
5. Confirm scorecard_settings_save_failed returns to baseline and human auto-scoring is intact.
5. Concerns, Questions, or Known Limitations
| # | Item | Type | Owner | Status |
|---|---|---|---|---|
| 1 | Figma frames for both surfaces are pending; FE built against PRD Appendix B Stitch prompts until frames land — pixel deviations re-checked then. | Blocker (FE fidelity) | Design | Open (PRD Dep "Design — YES") |
| 2 | Confirm the 9 default metric definitions/order with DSAI (Appendix A is PROPOSED). | Open (accuracy) | DSAI | PRD Open Q#1, due 2026-07-15 |
| 3 | prompt max length = 4,000 adopted; confirm. | Open | BOT+PM | PRD Open Q#2 |
| 4 | ai_passing_grade (0–100) diverges from human passing_grade (1–99). Align later? | Known limitation | BOT | New (ADR-8) |
| 5 | No plan-tier check exists in code; plan-gating relies on provisioning the org-feature only for Pro+Ent. Confirm provisioning owner/path. | Open | Billing/Provisioning + PM | New (A2) |
| 6 | Forward-looking: Phase 2 must change the room-resolve skip guard (is_custom_parameter) so AI scoring with custom params runs. | Forward note | BOT | Phase 2 |
| 7 | REV-1/REV-9 — decide whether the new components' props/emits get explicit TS contracts in this RFC or are inferred at implementation time (store + service types are specified in §2.B). rfc-reviewer R3 note: the reference target modules/settings/views/ai-assist.vue is <script setup> with no defineProps/defineEmits, so it offers no prop contract to copy — typing the 3 components in §2.A is the recommended close. | Open (low risk) | BOT FE | From rfc-reviewer R1 (sharpened R3) |
| 8 | REV-8 — plugins/botSubscriptionFeature.ts ($hasSubscription, cited by ADR-5/§2.0) is marked @deprecated in favor of the useSubscription composable. Confirm whether new FE uses $hasSubscription for parity or migrates to useSubscription. | Open (low risk) | BOT FE | From rfc-reviewer R3 |
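Items 3 and 4 describe value limits that are easy to get subtly wrong on the client. The sketch below illustrates them under stated assumptions: both grade ranges are treated as inclusive integers, and the prompt limit is counted in JavaScript string length (UTF-16 code units). The LIMITS constant, inRange, and validateAiSettings are hypothetical names, and the authoritative validation remains on the backend.

```typescript
// Limits taken from items 3 and 4 above; inclusivity is an assumption.
const LIMITS = {
  aiPassingGrade: { min: 0, max: 100 }, // ai_passing_grade
  humanPassingGrade: { min: 1, max: 99 }, // existing human passing_grade
  promptMaxLength: 4000, // pending confirmation (item 3)
} as const;

// Inclusive integer range check.
function inRange(value: number, range: { min: number; max: number }): boolean {
  return Number.isInteger(value) && value >= range.min && value <= range.max;
}

// Returns a list of human-readable errors; an empty list means valid.
function validateAiSettings(input: { ai_passing_grade: number; prompt: string }): string[] {
  const errors: string[] = [];
  if (!inRange(input.ai_passing_grade, LIMITS.aiPassingGrade)) {
    errors.push("ai_passing_grade must be an integer from 0 to 100");
  }
  if (input.prompt.length > LIMITS.promptMaxLength) {
    errors.push("prompt must be at most 4000 characters");
  }
  return errors;
}

// The divergence from item 4: 0 and 100 are valid AI grades but not human ones.
console.log(inRange(0, LIMITS.aiPassingGrade), inRange(0, LIMITS.humanPassingGrade)); // true false
console.log(validateAiSettings({ ai_passing_grade: 101, prompt: "ok" }).length); // 1
```

Keeping the two ranges as separate named entries makes the divergence explicit, so a later alignment (ADR-8) becomes a one-line change.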
6. Comment Log
| Date | Author | Note |
|---|---|---|
| 2026-06-20 | rfc-starter (Claude) | Initial draft from PRD v1.2; grounded against chatbot, chatbot-fe, hub-core, hub-service. Corrected PRD persona/role premise (no qa_lead/bot_admin role) and confirmed auto_agent_scoring.rb scores the human agent. |
| 2026-06-20 | rfc-reviewer (Claude) | R1 review → 8.0 (Strong/PROCEED). Applied R2 fixes: typed store/service contracts + casing convention (§2.B, REV-1/REV-6), FE↔BE contract test (§4.B/§4.C ch.11, REV-2), tracing scope (§3, REV-3), browser/perf budget (§3.D, REV-4), custom-param dedup note (§2.4 r2, REV-5), data-governance/retention (§3, REV-7). Re-score → 8.5. Open: REV-1 (component prop typing) carried to §5 #7. See rfc-phase-1-settings-and-rubric-config-review.md. |
7. Ready for Agent Execution
- Every PRD story + composite AC id traced in §1.A.4 / §1.C
- Architecture, ER, sequence (see below), state, and branch diagrams — happy + failure paths
- Every endpoint tagged reused/extended/new with evidence (Existing-Endpoint Check)
- Source Verification table backs every "existing" claim with file:line
- ADRs cover storage, sync/async, caching, third-party, consistency, multi-tenancy, reuse/new
- Test/build commands sourced from chatbot/AGENTS.md + chatbot-fe/package.json
- Agent Execution Plan: ordered chunks with files + commands + assertable acceptance
- Rollback recipe concrete (flag → revert → forward-only columns)
- Figma frames for both surfaces (§5 #1): FE chunks proceed on interim Stitch spec; re-verify on frames
- Infosec approver sign-off at review (Metadata)
- DSAI confirmation of the 9 metrics (§5 #2): non-blocking for build (served PROPOSED)
Ready for agent execution: yes (backend + FE logic). FE visual fidelity is gated on
Figma (§5 #1); build the components against the interim Stitch spec and reconcile when
frames land. Optionally hand to rfc-reviewer for a second-pass score.