[PRD] Qontak CDP | Customers/Notes | Legacy Migration: CRM Contact Notes → CDP Notes (v2.0 — Grounded Rewrite)
Supersedes: [PRD] Legacy Migration CRM Contact Notes to CDP Notes (page 51182272878).
Why v2.0: v1.1 was validated against the actual code in contact-service, qontak.com, and qontak-customer-fe. Roughly half its load-bearing technical claims were ABSENT or CONTRADICTED — the target CDP endpoints, the idempotency field, caller-set timestamps, and the CRM extraction API don't exist as written; the CRM source schema was mis-modeled. This version corrects all of it and is grounded in code (Appendix A).
| Field | Value |
|---|---|
| PM | Zhelia Alifa |
| PRD Version | 2.1 |
| Status | DRAFT |
| PRD Type | NEW |
| Epic | TF-3183 |
| Squad | CDP — Task Force |
| RFC Link | TBD (the new CDP migrate/count endpoints + CRM extraction contract belong here) |
| Figma Master | N/A — backend migration; output via existing CDP Notes UI |
| Anchor | No — standalone single-squad feature |
| Labels | epic:qontak-cdp · module:customers · feature:crm-notes-migration |
| Last Updated | 2026-06-03 |
Table of Contents
- HEADER BLOCK
- 3. One-liner + Problem
- 4. What Happens If We Don't Build This
- 5. Target Users + Persona Context
- 6. Non-Goals
- 7. Constraints
- 8. New Features
- 9. API & Webhook Behavior
- 10. System Flow + User Stories + ACs
- 11. Rollout
- 12. Observability
- 13. Success Metrics
- 14. Launch Plan & Stage Gates
- 15. Dependencies
- 16. Key Decisions + Alternatives Rejected
- 17. Open Questions
- Appendix A — Grounded Code References
- PRD CHANGELOG
3. One-liner + Problem
One-liner: Migrate historical Contact (Person) notes from Legacy CRM into CDP Notes so migrated clients keep full customer context on day 1 of Qontak One — via new, purpose-built CDP ingestion + CRM extraction contracts (neither exists today).
Problem:
~130 CRM client accounts (~21,000+ notes) hold critical interaction history. When they migrate to Qontak One (CDP), that history is absent because no migration bridge exists. v1.1 assumed the bridge was mostly reuse; it is not — the CDP batch-insert and count endpoints, the legacy_crm_note_id idempotency field, and caller-settable timestamps do not exist, and the CRM extraction API the plan named (GET /crm/notes?organization_id&limit&offset) does not exist either. This PRD scopes that net-new work explicitly and corrects the CRM source schema (rich HTML, a third attachment type, geolocation check-ins, an activity-type taxonomy) that v1.1 mis-modeled.
4. What Happens If We Don't Build This
- CDP General Release is blocked for the ~130 Notes-using CRM accounts — they can't migrate without losing history.
- Agents arrive on CDP with blank customer context; they revert to legacy CRM or rebuild notes manually — destroying the migration value proposition.
- Trust/retention risk: clients discovering empty Notes post-migration escalate, delay, or churn.
5. Target Users + Persona Context
| Persona | Role | Goal | Pain | Workaround |
|---|---|---|---|---|
| Primary — Sales/Support Agent (post-migration) | Agent on a migrated CRM account | See full historical notes per contact in CDP, as in legacy CRM | After migration, notes are absent in CDP | Toggles back to legacy CRM, or loses context |
| Primary — Internal Ops / Migration Engineer | Qontak internal running migrations | Run, monitor, and validate the notes migration per CID before cutover | No tooling, and the assumed endpoints don't exist | Manual export/re-entry, or skip-and-flag |
Scope Changes
Engineering surfaces this PRD touches (controlled vocab). Kept in sync with the scope_changes frontmatter above.
- Backend — contact-service: net-new CDP migrate + count endpoints, a gocraft/work ingestion consumer, legacy_crm_note_id idempotency, caller-set timestamps, and Contact.source_id resolution; plus the net-new CRM extraction contract.
- Data — one-time historical migration of crm_notes (qontak.com Postgres, ~21k notes / ~130 CIDs) → CDP contact_notes (MongoDB), incl. rich-HTML content and attachment mapping.
6. Non-Goals
- No real-time/ongoing sync — one-time historical migration only.
- No migration of legacy @mentions as live mentions — embedded CRM mention anchors are stripped to plain text (D-8); native CDP mentions are a separate PRD.
- No dedup vs human-created CDP notes — idempotency is enforced only by legacy_crm_note_id.
- No client self-service trigger/monitor UI — Ops-triggered only.
- No deletion/archival of source CRM notes during the retention window.
- No migration of non-note CRM activity unless explicitly decided — the CRM crm_notes table also stores activity entries (Calls/Emails/Meetings/WhatsApp/SMS); scope decision in OQ-4.
- No notes for other Qontak products (Inbox/Campaign/Chatbot).
- S06 banner/"Legacy" tag is NOT a no-UI-change item — it is explicitly out of scope of this backend PRD (see D-9); if wanted, it is a separate FE+backend change.
7. Constraints
| Constraint | Value |
|---|---|
| Platform | Backend migration job. Output via existing CDP Notes UI (web + mobile). |
| Datastore | CDP Notes live in MongoDB (contact-service), not SQL. Idempotency uses a stored legacy_crm_note_id + unique index. |
| New endpoints required | POST /cdp/notes/migrate (batch, S2S) and a source-scoped count — both net-new (do not exist today). |
| Performance | ≥ 10,000 notes/hour/CID; ≤ 4h window/CID; batch insert ≤ 2s/500; attachment re-upload ≤ 30s/file P95. |
| Batch size | Default 500, max 1,000 notes/batch. |
| Idempotency | REQUIRED — stored legacy_crm_note_id + unique index (company_sso_id, legacy_crm_note_id); skip-on-conflict. |
| Timestamps | Migrate path must accept caller-set created_at/updated_at (today SetDefaults() overwrites them — D-2). |
| Auth/tenancy | Migrate is S2S with explicit per-batch company_sso_id (note CRUD today derives company from user IAG context — no system path exists; D-3). |
| Data integrity | Failure rate ≤ 1%/CID; halt + alert above; zero silent failures (every failure logged with reason). |
| Feature flag | crm_notes_migration_enabled; default: OFF, per CID. |
| Security | Note HTML sanitized server-side on the migrate write path (no server-side sanitization exists today). CRM attachment fetch via internal creds; CDP stores company-scoped proxy URLs, not raw S3. |
| Plan scope | Same plans that have CDP Notes — Growth and Enterprise (not Starter). Applies to all eligible CRM clients regardless of package. |
| Orchestration | Reuse the existing contact-service migration pattern — handler + service + gocraft/work consumer (as ActivityLogMigrationConsumer.ProcessUpdateUserIDJob), with a status endpoint under the house namespace GET /private/notes/migration/status. Trigger = job enqueue, not a synchronous HTTP call (D-11). |
| Backward compat | CRM source untouched; existing CDP notes unaffected. |
7.1 Data Lifecycle
| Artifact | Retention | Cleanup | Visibility |
|---|---|---|---|
| Migration job log (per CID) | 1 year | Annual cleanup | Internal |
| Failed-record queue (reason codes) | 30 days | Auto-expiry; manual retry window | Internal Ops |
| Audit map (legacy_crm_note_id → CDP note id) | Permanent (idempotency/compliance) | Manual archival | Internal |
| Source CRM notes | Per CRM policy (≥90d read-only) | CRM retention | Read-only in CRM |
| CRM attachment originals (S3) | Per CRM policy | CRM retention | None (CDP has its own copy) |
8. New Features
Backend-only pipeline. Output visible via existing CDP Notes UI.
Component tree:
CRM Notes Migration Pipeline
├── MigrationJobRunner — entry point; validates preconditions (flag, idempotency, S2S auth), orchestrates
│ ├── IdempotencyChecker — uses stored legacy_crm_note_id (net-new field)
│ ├── CRMExtractor — reads CRM Person notes (via the REAL CRM API/export — see D-5, OQ-7)
│ ├── SchemaTransformer — maps CRM note → CDP note
│ │ ├── ContactResolver — crm_person_id → CDP contact UUID (Person scope; multi-FK precedence per D-6)
│ │ ├── OwnerResolver — CRM creator_id → SSO UUID (fallback when unmappable — D-7)
│ │ ├── HtmlNormalizer — sanitize CRM rich HTML; strip mention anchors to text (D-8)
│ │ └── AttachmentProcessor — images + audios + DOCUMENTS (crm_note_attachments) → CDP company-scoped storage
│ ├── CDPNoteInserter — POST /cdp/notes/migrate (net-new batch S2S endpoint)
│ ├── ValidationRunner — count compare via the net-new source-scoped count
│ └── MigrationLogger — per-record success/failure; audit trail
└── MigrationMonitorAPI — read-only job status/progress/error log
Access: internal S2S / Ops only — not exposed to client admins or end users.
Monitor API states (the 4 states this backend feature exposes):
- Empty: { "status": "not_started" }
- In progress (loading): { "status": "in_progress", "progress_pct": N, "notes_processed": N, "notes_total": N }
- Error: { "status": "halted", "failure_rate": N, "error_log_url": "..." }
- Success: { "status": "completed_success", "notes_migrated": N, "validation": { "match_pct": N } }
(This is a backend pipeline — there is no end-user UI screen; output is rendered by the existing CDP Notes UI. The states above are the MigrationMonitorAPI's lifecycle.)
9. API & Webhook Behavior
| # | Behavior | Entity Affected | Triggered By | Expected Behavior | Failure Behavior |
|---|---|---|---|---|---|
| 1 | Trigger migration job | migration_job | Ops enqueues a gocraft/work job (S2S, explicit cid + company_sso_id) — mirroring ActivityLogMigrationConsumer.ProcessUpdateUserIDJob; not a synchronous HTTP call | Validate CID, flag ON, S2S auth, not already completed → enqueue job, return job_id, processed by the worker/consumer | Already migrated: 409 ALREADY_MIGRATED; flag OFF: 403; CID not found: 404 |
| 2 | Extract CRM Person notes | CRM notes | CRMExtractor | Read all Person notes for the CID via the real CRM mechanism (export endpoint or DB read — D-5/OQ-7) using the actual pagination (page/per_page), collecting note HTML, creator_id, images, audios, documents, checkin, timestamps | CRM API 5xx/timeout: retry 3× backoff; then halt + CRM_EXTRACT_FAILED |
| 3 | Resolve contact (Person) | CDP contact | ContactResolver | Resolve crm_person_id → CDP contact by querying the existing contact.source_id / crm_data.id on the CDP contact document (the CRM linkage already stored — D-12); for multi-FK notes apply precedence (D-6). Use an external mapping table only as a fallback if source_id coverage is incomplete | No match: log CONTACT_NOT_MAPPED; skip note; count failure; no halt |
| 4 | Resolve owner | SSO identity | OwnerResolver | CRM creator_id → SSO UUID; sets owner_id | Unmappable: owner_id = null; store a legacy_owner_label so the author still shows (D-7); non-blocking |
| 5 | Normalize content | note HTML | HtmlNormalizer | Sanitize CRM rich HTML server-side; strip mention anchors to plain @Name text (D-8); do NOT re-wrap in <p> | Malformed HTML: store sanitized best-effort; log warning |
| 6 | Re-link attachments | CDP storage | AttachmentProcessor | For images, audios, AND documents (crm_note_attachments): download from CRM S3 → re-upload to CDP company-scoped storage ({company_sso_id}/...) → store proxy URL + derived type/size; respect CDP allowlist + ≤1 voice_note rule (split/flag per OQ-8) | Download/upload fail: insert note without that attachment; log ATTACHMENT_*_FAILED; non-blocking |
| 7 | Batch insert | CDP notes | CDPNoteInserter | POST /cdp/notes/migrate (net-new, S2S, explicit company_sso_id) with array incl. legacy_crm_note_id, caller created_at/updated_at; skip if legacy_crm_note_id exists | 5xx: retry once; then BATCH_INSERT_FAILED, continue; failure rate >1%: halt + alert |
| 8 | Validate | count compare | ValidationRunner | Source count vs CDP migrated count (filtered by legacy_crm_note_id presence / a migrated marker — net-new); match_pct ≥99% → success | Count unavailable after retries: VALIDATION_SKIPPED, completed_with_errors, alert |
| 9 | Get status | migration_job | Ops GET /private/notes/migration/status?cid={cid} (house namespace, mirrors GET /private/activity_logs/migration/status) | Returns status + counts + failure_rate + validation | CID not found: 404; never started: not_started |
9.1 Schema Mapping: CRM Notes → CDP Notes
| CRM Field | CRM Reality (grounded) | CDP Field | Transformation |
|---|---|---|---|
| id | integer | (audit only) | Store as legacy_crm_note_id (net-new field) + unique index; CDP generates its own ObjectID |
| note | sanitized rich HTML (not plain text; before_save :sanitize_note) | note (HTML) | Sanitize server-side; preserve safe markup; strip mention anchors → @Name text. Do NOT wrap in <p> (v1.1 error) |
| type (Crm::PersonNote/DealNote/CompanyNote) | STI type; also nullable crm_person_id/company_id/deal_id/ticket_id can co-exist | (routing only) | This PRD migrates Person notes; multi-FK precedence per D-6 |
| crm_person_id | integer | contact_id | Resolve via the CDP contact's existing source_id / crm_data.id (CRM linkage already on the contact — D-12); fallback to a mapping table only if source_id coverage is incomplete; no match → CONTACT_NOT_MAPPED |
| creator_id | integer | owner_id (+ legacy_owner_label) | CRM user → SSO UUID; unmappable → owner_id=null + stored label (D-7) |
| crm_note_images | has_many Asset (S3) | attachments[] | Re-link to CDP company-scoped storage |
| crm_note_audios | has_many Asset (S3) | attachments[] | Re-link; respect ≤1 voice_note (OQ-8) |
| crm_note_attachments | has_one + has_many Crm::NoteAttachment (documents) — MISSED by v1.1 | attachments[] | Re-link documents (map to CDP doc/pdf/xls types) — or document the drop (OQ-5) |
| crm_checkin | has_one Crm::Checkin (geolocation: lat/long/address/time) — not a string | (not migrated) | Explicit data-loss decision; log per note (D-10) |
| crm_note_type_id | activity taxonomy (Notes/Calls/Emails/Meetings/WhatsApp/SMS) | (filter) | Decide migrate-all vs notes-only (OQ-4) |
| created_at | ISO8601 +TZ | created_at | Normalize to UTC; preserve (requires caller-set timestamp — D-2) |
| updated_at | ISO8601 +TZ | updated_at | Normalize to UTC; preserve actual value (not overwritten with created_at) |
| owner_name | — | computed live (not stored) | CDP resolves owner name live from Launchpad; unmappable owner → blank unless legacy_owner_label stored (D-7) |
| permission {update,delete} | — | computed live | CDP computes from Launchpad CRS; unmappable owner → edit/delete hidden (D-7 / OQ-6) |
10. System Flow + User Stories + ACs
10.1 System Flow
- Ops enqueues a gocraft/work migration job (S2S, cid + company_sso_id) → validate flag + S2S auth + idempotency → job accepted (mirrors ActivityLogMigrationConsumer).
- CRMExtractor reads Person notes via the real CRM mechanism (D-5) using actual pagination, capturing HTML, creator, images/audios/documents, checkin, timestamps.
- Per note: resolve contact (Person) via the CDP contact's source_id / crm_data.id (multi-FK precedence) → resolve owner (fallback label) → sanitize HTML + strip mentions → re-link attachments (incl. documents).
- CDPNoteInserter POST /cdp/notes/migrate (S2S, explicit company_sso_id, legacy_crm_note_id, caller timestamps); idempotent skip on conflict.
- If failure rate > 1% → halt + alert.
- ValidationRunner compares source count vs CDP migrated count → match_pct.
- Ops polls GET /private/notes/migration/status?cid={cid}.
📊 System Flow — CRM Notes Migration (corrected)
sequenceDiagram
participant Ops
participant Job as MigrationJobRunner
participant CRM as Legacy CRM (real API/export)
participant Map as Person↔Contact Map
participant SSO as Identity (Launchpad)
participant S3 as CRM S3
participant Store as CDP Storage (company-scoped)
participant CDP as CDP /cdp/notes/migrate (NET-NEW, S2S)
Ops->>Job: enqueue gocraft/work migration job {cid, company_sso_id}
Job->>Job: validate flag + S2S auth + idempotency
alt precondition fail
Job-->>Ops: 409/403/404
else ok
Job-->>Ops: {job_id} async
loop paginated (page/per_page)
Job->>CRM: read Person notes
CRM-->>Job: notes (HTML, images, audios, documents, checkin, ts)
end
loop per note
Job->>Map: crm_person_id -> contact UUID
alt not mapped
Job-->>Job: FAIL CONTACT_NOT_MAPPED (skip)
end
Job->>SSO: creator_id -> SSO UUID
SSO-->>Job: owner_id or null+legacy_owner_label
Job->>Job: sanitize HTML + strip mention anchors -> text
loop per attachment (image/audio/document)
Job->>S3: download
Job->>Store: re-upload (company-scoped) -> proxy URL
end
end
Job->>CDP: POST /cdp/notes/migrate (batch, legacy_crm_note_id, caller ts)
alt failure_rate > 1%
Job-->>Ops: HALT + alert
end
Job->>Job: validate count -> match_pct
Job-->>Ops: status + validation
end
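The batch loop and the halt rule in the flow above can be sketched as follows. The PRD does not define the denominator for the 1% failure rate, so this sketch assumes it is evaluated once per batch against all notes processed so far; `runBatches` and `clampBatchSize` are hypothetical names, and the insert callback stands in for the call to the net-new migrate endpoint.

```go
package main

import "fmt"

const (
	defaultBatchSize = 500  // default from the Constraints table
	maxBatchSize     = 1000 // upper bound from the Constraints table
)

// clampBatchSize applies the default and the maximum to a requested size.
func clampBatchSize(n int) int {
	if n <= 0 {
		return defaultBatchSize
	}
	if n > maxBatchSize {
		return maxBatchSize
	}
	return n
}

// runBatches walks ids in batches and stops once the cumulative failure rate
// exceeds threshold. insert reports whether a single note was written.
func runBatches(ids []int64, batchSize int, threshold float64, insert func(int64) bool) (processed, failed int, halted bool) {
	size := clampBatchSize(batchSize)
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		for _, id := range ids[start:end] {
			processed++
			if !insert(id) {
				failed++
			}
		}
		// Checked once per batch, against everything processed so far.
		if float64(failed)/float64(processed) > threshold {
			return processed, failed, true
		}
	}
	return processed, failed, false
}

func main() {
	ids := make([]int64, 1200)
	for i := range ids {
		ids[i] = int64(i + 1)
	}
	// Simulate a 2% failure rate: every 50th note fails.
	failEvery50 := func(id int64) bool { return id%50 != 0 }

	processed, failed, halted := runBatches(ids, 0, 0.01, failEvery50)
	fmt.Println(processed, failed, halted) // 500 10 true
}
```

Checking per batch rather than per note avoids halting on a single early failure, since one failure in a 500-note batch is only 0.2%.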
10.2 User Stories
| User Story | Importance | Mockup | Technical Notes | Acceptance Criteria |
|---|---|---|---|---|
| [NOTES-MIG-S01] — Run batch migration for a CID. As a CDP Engineer, I want to trigger a migration job for a CID that extracts CRM Person notes and inserts them into CDP, so that the client's history is available on day 1. | Must Have | Figma: N/A | Data Fields: cid (req), job_id, status, crm_notes_migration_enabled. Before-After Behavior: Before: no tooling and the target endpoints don't exist. After: an S2S batch job (a gocraft/work consumer mirroring ActivityLogMigrationConsumer) ingests CRM Person notes into CDP via the net-new /cdp/notes/migrate. | — Happy Path — • AC-1: Given flag ON and no completed job, when Ops enqueues a migration job (S2S, with cid + company_sso_id), then a job is created in_progress, returns job_id, and is processed by the worker/consumer. • AC-2: Given a job in progress, when Ops calls GET /private/notes/migration/status?cid={cid}, then it returns progress_pct, notes_processed, notes_total. • AC-3: Given all batches complete with failure ≤1% and match_pct ≥99%, then status = completed_success with counts + validation. — Error / Unhappy Path — • ERR-1: Given flag OFF, then 403 FLAG_DISABLED, no job. • ERR-2: Given an already-completed CID, then 409 ALREADY_MIGRATED. • ERR-3: Given failure rate > 1%, then halt, status halted, PagerDuty alert with job_id/cid. • ERR-4: Given the migrate call is not S2S-authenticated (no system token / no explicit company_sso_id), then 401/403 — a logged-in user IAG context is not accepted for bulk migrate. — Permission Model — • CAN: internal S2S service token only. • CANNOT: client admins/end users. |
| [NOTES-MIG-S02] — Transform CRM note to CDP schema (grounded). As the Migration Pipeline, I want to correctly transform each CRM Person note, so that migrated notes are complete and valid. | Must Have | Figma: N/A | Data Fields: legacy_crm_note_id, contact_id, owner_id + legacy_owner_label, note (sanitized HTML), attachments[] (image/audio/document), created_at/updated_at (preserved). Before-After Behavior: Before: incompatible schemas; the assumed fields/endpoints don't exist. After: each note is resolved, sanitized, and inserted with idempotency + preserved timestamps. | — Happy Path — • AC-1: Given a Person note with crm_person_id, when the resolver queries the CDP contact by source_id / crm_data.id, then CDP contact_id = the matched contact's UUID. • AC-2: Given CRM note rich HTML, then it is sanitized and stored as-is (safe markup preserved); it is not wrapped in <p>; mention anchors become plain @Name text. • AC-3: Given a crm_note_attachments document, then it is re-linked to CDP storage as a doc/pdf/xls attachment (not dropped). • AC-4: Given CRM created_at/updated_at with TZ offsets, then CDP stores the preserved UTC-normalized originals (not overwritten by insert time). • AC-5: Given the same legacy_crm_note_id already in CDP, then the insert is skipped (no duplicate). — Error / Unhappy Path — • ERR-1: Given crm_person_id has no mapping, then CONTACT_NOT_MAPPED, skip, count failure. • ERR-2: Given an attachment S3 download fails, then the note inserts without that attachment; ATTACHMENT_DOWNLOAD_FAILED logged; not counted as note failure. • ERR-3: Given creator_id unmappable to SSO, then owner_id=null + stored legacy_owner_label; non-blocking. • ERR-4: Given a note whose crm_note_type is an activity (Call/Email/etc.) and OQ-4 excludes activities, then it is skipped and counted as out-of-scope (not a failure). |
| [NOTES-MIG-S03] — Idempotent re-run (net-new field). As a CDP Engineer, I want safe re-runs with no duplicates, so that I can retry failed/interrupted jobs. | Must Have | Figma: N/A | Data Fields: legacy_crm_note_id (net-new, stored + unique index (company_sso_id, legacy_crm_note_id)). Before-After Behavior: Before: no idempotency field exists. After: insert skips notes whose legacy_crm_note_id already exists. | — Happy Path — • AC-1: Given a halted job, when re-triggered, then only notes absent from CDP (by legacy_crm_note_id) are inserted. • AC-2: Given a note already inserted, when re-attempted, then it is skipped (no duplicate), counted as already-migrated. • AC-3: Given a full re-run where all exist, then notes_migrated=0, notes_skipped=N, success. — Error / Unhappy Path — • ERR-1: Given two concurrent jobs for one CID, then only one runs (unique constraint on CID + in_progress); the other returns 409 JOB_ALREADY_RUNNING. |
| [NOTES-MIG-S04] — Validation & error reporting. As a CDP Engineer, I want count validation + a structured error log, so that I can confirm integrity before cutover. | Must Have | Figma: N/A | Data Fields: crm_total, cdp_inserted, match_pct, failure_rate, error log {legacy_crm_note_id, reason_code, details}. Before-After Behavior: Before: no count endpoint exists to measure this. After: validation uses the net-new source-scoped count + the legacy_crm_note_id marker. | — Happy Path — • AC-1: Given all batches done, then ValidationRunner compares source count vs CDP migrated count (by marker) and computes match_pct. • AC-2: Given match_pct ≥99%, then completed_success with full summary. — Error / Unhappy Path — • ERR-1: Given match_pct <99%, then completed_with_errors + alert + downloadable error log; no auto-retry. • ERR-2: Given the count source is unavailable after retries, then VALIDATION_SKIPPED + completed_with_errors + alert. |
| [NOTES-MIG-S05] — View migrated notes in CDP. As a migrated agent, I want my historical notes visible in CDP, so that I keep customer context. | Must Have | Figma: N/A — existing CDP Notes UI | Data Fields (rendered): note (HTML), author (owner_name or legacy_owner_label), created_at, attachments[]. Before-After Behavior: Before: empty notes post-migration. After: migrated notes appear with content, author, original timestamp, attachments. | — Happy Path — • AC-1: Given completed_success, when an agent opens a contact, then migrated notes show with original content, author, and original created_at, sorted reverse-chronological. • AC-2: Given a re-linked attachment, when clicked, then it downloads from CDP storage (not CRM S3). • AC-3: Given an unmappable owner, then the note shows the legacy_owner_label (e.g. "[Legacy CRM User]") rather than a blank author. — Error / Unhappy Path — • ERR-1: Given an attachment failed to re-link, then the note shows but the attachment shows "Attachment unavailable — could not be migrated"; note not hidden. • ERR-2: Given an unmappable owner, then edit/delete may be hidden (permission computed live yields false) — acceptable for historical notes (OQ-6). |
| [NOTES-MIG-S06-NEG] — Mentions are not live; activities not silently flooded (Guard Rail — from Non-Goals). As the pipeline, when a CRM note contains mention anchors or is an activity-type entry, then mentions become plain text and activity scope follows OQ-4. | Guard Rail | — | — | • NEG-1: Given a CRM note with <a data-user-id> mention markup, when migrated, then it renders as plain @Name text — no CDP mention, no notification. • NEG-2: Given OQ-4 = notes-only, when an activity entry (Call/Email) is encountered, then it is excluded (not migrated into the Notes panel). |
Dependencies: S01 → S2S migrate endpoint; S02/S04 → net-new fields/endpoints; S05 → timestamp preservation (D-2).
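The mention handling in S02 AC-2 and S06 NEG-1 can be illustrated with a regular expression that unwraps anchors carrying a data-user-id attribute. This is a sketch under assumptions: the exact CRM markup is only known here as `<a data-user-id>`, the anchor text is assumed to already contain the @Name, and a regex is used for brevity where a production path would more likely rely on an HTML parser alongside the server-side sanitizer. `stripMentions` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// mentionAnchor matches an <a> element whose opening tag carries a
// data-user-id attribute and captures the element's inner text.
var mentionAnchor = regexp.MustCompile(`(?is)<a\b[^>]*\bdata-user-id\s*=[^>]*>(.*?)</a>`)

// stripMentions replaces each mention anchor with its inner text and leaves
// every other element, including ordinary links, untouched.
func stripMentions(html string) string {
	return mentionAnchor.ReplaceAllString(html, "${1}")
}

func main() {
	in := `<p>Ping <a data-user-id="17" href="/users/17">@Budi</a> about the invoice</p>`
	fmt.Println(stripMentions(in))
	// <p>Ping @Budi about the invoice</p>
}
```

Because the surrounding markup is left in place, the note is still not re-wrapped in <p>; only the anchor element itself is removed.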
🧪 Test Coverage Matrix — [NOTES-MIG-S02]
| Dimension | Coverage | Notes |
|---|---|---|
| Boundary values | ✅ defined | AC-5 (dup), ERR-1 (no mapping); ⚠️ QA: note with 0 attachments, max attachments |
| State transitions | ✅ defined | AC-5 skip-on-conflict; halt at >1% |
| Data validation | ⚠️ partial | AC-2 sanitize/mention-strip; ⚠️ QA: malformed/oversized HTML, unsupported file types |
| Concurrency | ✅ defined | S03 ERR-1 concurrent jobs |
| Network/timeout | ✅ defined | ERR-2 attachment download fail; batch retry |
11. Rollout
| Field | Detail |
|---|---|
| Flag | crm_notes_migration_enabled; default: OFF, per CID. |
| Stage 1 — Internal QA | 2 synthetic CIDs (100 + 5,000 notes incl. images/audios/documents). Verify idempotency, timestamp preservation, sanitization, attachment re-link ≥95%. |
| Stage 2 — Pilot | 5–10 CSM-approved CRM clients with Notes. |
| Stage 3 — Batch | Remaining ~120 CIDs per schedule. |
| Backward compat | CRM source untouched; CDP native notes unaffected. |
11.1 Migration Transition Window
- In-progress: notes appear progressively; CSM informs agents population is underway.
- After success: all historical notes visible; new CDP notes use the standard write path.
- Coexistence (≥90d): CRM notes read-only in CRM; dual access; no auto-sync.
- End state: CDP is sole source of truth for notes.
12. Observability
| Event | Trigger | Properties |
|---|---|---|
| crm_notes_migration_started | Job triggered | job_id, cid, triggered_by |
| crm_notes_migration_batch_completed | Per batch | job_id, cid, batch_number, notes_in_batch, failed_in_batch |
| crm_notes_migration_note_failed | Note fail | job_id, cid, legacy_crm_note_id, reason_code, details |
| crm_notes_migration_attachment_failed | Attachment fail | job_id, cid, legacy_crm_note_id, attachment_type, reason_code |
| crm_notes_migration_owner_not_resolved | Owner unmappable | job_id, cid, legacy_crm_note_id, crm_user_id |
| crm_notes_migration_halted | Failure >1% | job_id, cid, failure_rate |
| crm_notes_migration_completed | Job done | job_id, cid, status, notes_migrated, notes_failed, match_pct, duration_seconds |
Owner: CDP Task Force. Alerts: halted → PagerDuty P1; match_pct<99% → P2; attachment-fail >20% → Slack. Cadence: per-CID review post-migration; weekly aggregate during Stage 3.
13. Success Metrics
| Metric | Definition | Baseline | Target |
|---|---|---|---|
| ⭐ Migration completeness | match_pct per CID | 0% | ≥ 99% before any CID cutover |
| ⭐ CIDs migrated | Notes-using CIDs at completed_success | 0 | 100% of ~130 by CDP GA |
| Attachment success | re-linked / total (incl. documents) | N/A | ≥ 95% |
| Halt rate | halted / triggered | N/A | < 2% in Stage 3 |
| Agent adoption | agents accessing migrated notes within 7d | N/A | ≥ 70% |
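The completeness metric depends on how match_pct is computed, which this PRD does not spell out. The sketch below assumes match_pct = migrated count / source count × 100 and that the ≥ 99% gate decides between completed_success and completed_with_errors; whether out-of-scope activity entries are excluded from the source count is left to OQ-4. All function names are hypothetical.

```go
package main

import "fmt"

// matchPct reports migrated notes as a percentage of source notes.
func matchPct(sourceCount, migratedCount int) float64 {
	if sourceCount == 0 {
		return 100 // assumption: an empty source counts as fully matched
	}
	return float64(migratedCount) * 100 / float64(sourceCount)
}

// passesGate applies the >= 99% rule with integer arithmetic so that the
// boundary case is exact and free of floating-point rounding.
func passesGate(sourceCount, migratedCount int) bool {
	return migratedCount*100 >= sourceCount*99
}

// finalStatus maps the gate result to the job status names used in this PRD.
func finalStatus(sourceCount, migratedCount int) string {
	if passesGate(sourceCount, migratedCount) {
		return "completed_success"
	}
	return "completed_with_errors"
}

func main() {
	source := 21000
	for _, migrated := range []int{21000, 20790, 20789} {
		fmt.Printf("%d/%d -> %.2f%% %s\n",
			migrated, source, matchPct(source, migrated), finalStatus(source, migrated))
	}
}
```

With roughly 21,000 source notes, the 99% gate tolerates at most 210 missing notes across the whole population, which is a useful number to keep in mind when reviewing per-CID error logs.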
14. Launch Plan & Stage Gates
| Stage | Audience | Duration | Success Gate | Owner |
|---|---|---|---|---|
| Internal QA | 2 synthetic CIDs (incl. documents + mentions + activities) | 1 wk | match_pct=100%; zero dup on re-run; timestamps preserved; mention-stripping verified; attachment ≥95% | QA |
| Pilot | 5–10 CSM-approved CIDs | 2 wk | match_pct≥99%; zero pipeline-bug halts; error log root-caused | PM + CSM |
| Batch | ~120 CIDs | Ongoing | Halt <2%; match_pct≥99% before each cutover | PM + Ops |
| GA | All Qontak One (post-migration) | Permanent | 100% of Notes CIDs completed_success | PM |
15. Dependencies
| Dependency | Owning Team | Deliverable | Blocking? |
|---|---|---|---|
| POST /cdp/notes/migrate (NET-NEW, S2S, batch) | CDP Backend | Batch insert (≤1,000) with legacy_crm_note_id, caller created_at/updated_at, explicit company_sso_id, server-side sanitization, skip-on-conflict | YES |
| Source-scoped migrated-note count (NET-NEW) | CDP Backend | Count of migrated notes per CID (by legacy_crm_note_id marker) for validation | YES |
| legacy_crm_note_id field + unique index | CDP Backend | New persisted field on ContactNote + (company_sso_id, legacy_crm_note_id) unique index | YES |
| CRM Person-notes extraction mechanism (real) | Legacy CRM Squad | A real export endpoint or DB read using actual pagination (page/per_page) — the assumed GET /crm/notes?organization_id&limit&offset does NOT exist | YES |
| Person→Contact resolution via contact.source_id / crm_data.id | CDP / Data Eng | Confirm the CDP contact's source_id holds the CRM crm_person_id and its coverage per CID; provide a fallback mapping table only where coverage is incomplete | YES |
| CDP storage (company-scoped re-upload) | CDP Infra | Re-upload to {company_sso_id}/...; respect allowlist; quota for full attachment volume incl. documents | YES |
| User identity (CRM user_id → SSO UUID) | Launchpad / Identity | Lookup for owner resolution | NO (degrades quality only) |
| CSM approval + window | CSM | Per-CID consent + maintenance window | YES (Stage 2+) |
📊 Dependency Graph
graph LR
M[CRM Notes Migration v2] -->|BLOCKING| CDPM[POST /cdp/notes/migrate net-new]
M -->|BLOCKING| CNT[migrated-note count net-new]
M -->|BLOCKING| LID[legacy_crm_note_id field+index]
M -->|BLOCKING| EXT[real CRM extraction]
M -->|BLOCKING| MAP[Person to Contact map]
M -->|BLOCKING| STORE[CDP company-scoped storage]
M -->|non-blocking| ID[CRM user to SSO]
M -->|Stage2+| CSM[CSM approval]
16. Key Decisions + Alternatives Rejected
16a — Decisions Made
All decisions below made 2026-06-03 (grounded code review).
| ID | Decision | Rationale (grounded) |
|---|---|---|
| D-1 | Build net-new /cdp/notes/migrate (batch, S2S) + a migrated-note count + legacy_crm_note_id field/index | None exist today — only single-CRUD notes under /iag/v1/contacts/{id}/notes (rest_router.go:150-158); no count, no legacy field (base.go:26-36) |
| D-2 | Migrate path accepts caller-set created_at/updated_at | SetDefaults() overwrites both with time.Now() (base.go:51-54, create.go:12); preservation is required for correct reverse-chron order |
| D-3 | Migrate is S2S with explicit per-batch company_sso_id | Note CRUD derives company from user IAG context (handler.go:75-79); bulk Ops migration has no user — needs S2S (mirror the field_properties migrate S2S pattern, rest_router.go:338) |
| D-4 | Note content treated as rich HTML; sanitized server-side; NOT re-wrapped in <p> | CRM content is already sanitized rich HTML (note.rb:42,378); v1.1's plain-text assumption was wrong; no server-side sanitization exists in CDP today |
| D-5 | Use the real CRM extraction contract (page/per_page export or DB read) | GET /crm/notes?organization_id&limit&offset and /crm/notes/count do not exist; real APIs are /api/v4/notes with page/per_page (api/v4/notes.rb:132-135) |
| D-6 | Person-scope migration; multi-FK notes routed by precedence (person first) | A CRM note can carry person/company/deal/ticket FKs simultaneously; STI type is metadata not a constraint (note.rb:5-8) |
| D-7 | Unmappable owner → owner_id=null + stored legacy_owner_label | owner_name/permission are computed live from Launchpad (service.go:131-137, handler.go:143-166); a null owner would render blank author + hidden edit/delete unless a label is stored |
| D-8 | Strip embedded CRM mentions to plain @Name text | CDP has no mention support; CRM data-user-id references CRM int IDs that don't resolve (note.rb:98-105). Native CDP mentions = separate PRD |
| D-9 | S06 banner/"Legacy" tag is OUT of this backend PRD's scope | FE has zero banner/dismiss/tag infra and CustomerNote has no metadata field (CustomerStore.ts) — it cannot be "no UI change"; re-scope separately if wanted |
| D-10 | Document attachments migrated; check-in geolocation explicitly dropped | crm_note_attachments (documents) exists and was missed by v1.1 (note.rb:14,19); crm_checkin is a geolocation has_one (note.rb:15) — dropping it is a deliberate data-loss decision |
| D-11 | Reuse the existing migration framework: handler + service + gocraft/work consumer + a /private/notes/migration/status endpoint; trigger = job enqueue (not synchronous HTTP) | contact-service already has ActivityLogMigrationHandler/Consumer/service and ContactMigrationHandler/service, with GET /private/activity_logs/migration/status (rest_router.go:74) and ProcessUpdateUserIDJob(job *work.Job) — mirror it rather than inventing a new orchestration |
| D-12 | Resolve Person→Contact via the CDP contact's existing source_id / crm_data.id, not a net-new external mapping table | The CDP contact already stores Source, SourceID, SourceName, and CrmData{ID} (base.go:67-69, 331-333) — the CRM linkage exists; a mapping table is only a fallback where coverage is incomplete |
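As an illustration of D-8, the mention stripping could look roughly like the sketch below. It assumes a CRM mention is a `<span>` or `<a>` element carrying a `data-user-id` attribute whose inner text is the visible `@Name`; that markup shape is an assumption and should be confirmed against note.rb:98-105. `StripMentions` and `mentionRe` are hypothetical names, not existing contact-service code, and a production version would more likely walk a parsed HTML tree than rely on a regular expression.

```go
package main

import (
	"fmt"
	"regexp"
)

// mentionRe matches an element that carries a data-user-id attribute and
// captures its inner text. The <span>/<a> shape is an assumption made for
// this sketch; it does not handle mentions that contain nested tags.
var mentionRe = regexp.MustCompile(
	`(?s)<(?:span|a)\b[^>]*\bdata-user-id="[^"]*"[^>]*>(.*?)</(?:span|a)>`)

// StripMentions replaces every mention element with its visible text, so
// "@Name" survives as plain text and the unresolvable CRM user ID is dropped.
func StripMentions(html string) string {
	return mentionRe.ReplaceAllString(html, "$1")
}

func main() {
	in := `<p>Ask <span class="mention" data-user-id="42">@Budi</span> to call back.</p>`
	fmt.Println(StripMentions(in))
}
```

Surrounding markup is left untouched, so the rich HTML handling decided in D-4 still applies to the rest of the note.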
16b — Alternatives Rejected
All rejections dated 2026-06-03.
| Alternative | Why Rejected |
|---|---|
| Reuse existing CDP note create endpoint (v1.1 assumption) | No batch/idempotency/timestamp/S2S support — single-CRUD only |
| Plain-text note transform (<p> wrap) | CRM content is rich HTML; wrapping corrupts markup and changes XSS posture |
| Keep CRM S3 URLs in CDP | Permanent legacy dependency; CDP storage is company-scoped proxy URLs |
| Migrate mentions as live CDP mentions | CDP has no mention feature; would create dangling links + false notifications |
| Treat crm_checkin as a droppable string | It's an associated geolocation record; mischaracterizing hides real data loss |
| Direct DB import into CDP Mongo | Bypasses validation/sanitization; schema divergence risks corruption |
17. Open Questions
| # | Type | Question | Mitigation / Plan | Owner | Deadline |
|---|---|---|---|---|---|
| OQ-1 | Decision | Migration ownership & mechanism: CDP backfill job vs Bifrost (Postgres crm_people/crm_notes → Mongo CDP)? | Decide at design kickoff; default to a contact-service gocraft/work consumer reusing the existing migration pattern. | CDP Eng + Platform | 2026-06-17 |
| OQ-2 | Risk | Person→Contact resolution coverage per CID via contact.source_id/crm_data.id may be incomplete → CONTACT_NOT_MAPPED. | Mitigation: run a pre-migration coverage report per CID; block job start if Person→Contact coverage < 99%; route unmatched notes to the failed-record queue for retry after source_id/mapping backfill. No CID cuts over below the threshold. | CDP / Data Eng | 2026-06-17 |
| OQ-3 | Open | Failed (CONTACT_NOT_MAPPED) notes → retry queue after mapping update, or permanent error log? | Default: 30-day failed-record queue (per §7.1) with manual retry. | PM + Eng | 2026-06-17 |
| OQ-4 | Decision | Migrate all crm_notes or notes-only (exclude Call/Email/Meeting/WhatsApp/SMS activity types)? | Default to notes-only (filter by crm_note_type) to avoid flooding the Notes panel; confirm with PM. | PM | 2026-06-17 |
| OQ-5 | Decision | Are document attachments (crm_note_attachments) in scope (recommended yes) or documented as dropped? | Default: in scope, mapped to CDP doc/pdf/xls types. | PM + Eng | 2026-06-17 |
| OQ-6 | Open | For unmappable-owner notes, confirm acceptable that edit/delete are hidden (permission computed live = false). | Default acceptable for historical notes; legacy_owner_label preserves author display. | PM | 2026-06-17 |
| OQ-7 | Risk | Real CRM extraction at bulk throughput: export endpoint vs DB read; rate limits with page/per_page. | Mitigation: load-test the chosen extraction path at realistic CID size in staging before Internal QA; add configurable inter-page delay if throttled. | Legacy CRM Squad | 2026-06-17 |
| OQ-8 | Decision | CDP allows ≤1 voice_note/note + a fixed file-extension allowlist; how to handle notes with multiple audios or unsupported types (split, skip, convert)? | Default: keep first voice_note + attach the rest as audio attachments where allowed; log + skip unsupported types. Confirm with PM. | PM + Eng | 2026-06-17 |
| OQ-9 | Risk | Voice-note S3 path/format may differ from documents/images. | Mitigation: confirm the CRM voice-note S3 path pattern and run a sample download before Internal QA. | Eng | 2026-06-17 |
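The OQ-2 gate (block job start when Person→Contact coverage is below 99%) can be sketched as a small check. The `CoverageReport` type and its fields are hypothetical, and the denominator used here (distinct CRM persons that own at least one note in the CID) is an assumption, since the PRD does not define what coverage is measured over. Integer arithmetic is used so the comparison at exactly 99% does not depend on floating-point rounding.

```go
package main

import "fmt"

// minCoveragePct is the 99% threshold named in OQ-2.
const minCoveragePct = 99

// CoverageReport is a hypothetical per-CID summary produced by the
// pre-migration coverage report; field names are illustrative only.
type CoverageReport struct {
	CID           string
	PersonsTotal  int // assumed denominator: CRM persons owning at least one note
	PersonsMapped int // of those, how many resolve to a CDP contact
}

// CanStart reports whether the CID meets the threshold. A CID with no
// persons passes trivially because there is nothing to migrate.
func (r CoverageReport) CanStart() bool {
	return r.PersonsMapped*100 >= r.PersonsTotal*minCoveragePct
}

func main() {
	reports := []CoverageReport{
		{CID: "cid-a", PersonsTotal: 1000, PersonsMapped: 990},
		{CID: "cid-b", PersonsTotal: 1000, PersonsMapped: 989},
	}
	for _, r := range reports {
		fmt.Printf("%s start allowed: %v\n", r.CID, r.CanStart())
	}
}
```

With these sample numbers, 990 of 1,000 mapped persons is exactly 99% and passes, while 989 of 1,000 is 98.9% and blocks the job.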
Appendix A — Grounded Code References
contact-service (CDP, Go/MongoDB)
- Notes are single-CRUD under /iag/v1/contacts/{contact_id}/notes; no /cdp/notes/migrate, no count, no source filter (internal/server/rest_router.go:150-158; internal/app/handler/contact_notes_handler.go).
- ContactNote has no legacy_crm_note_id (internal/app/repository/contact_notes/base.go:26-36).
- SetDefaults() overwrites CreatedAt/UpdatedAt = time.Now() on create (base.go:51-54, create.go:12).
- owner_name resolved live (internal/app/service/contact_notes/contact_notes_service.go:131-137); permission{update,delete} computed live (contact_notes_handler.go:143-166).
- Company derived from user IAG context, with no S2S path (contact_notes_handler.go:75-79); field_properties migrate uses S2S pattern (rest_router.go:338).
- Content validated by length only; no server-side sanitization (contact_notes_service.go:268-274).
- Existing migration framework to reuse: ActivityLogMigrationHandler + ActivityLogMigrationConsumer.ProcessUpdateUserIDJob(job *work.Job) + activity_log_migration_service.go; ContactMigrationHandler + contact_migration_service.go; status route GET /private/activity_logs/migration/status (internal/server/rest_router.go:74).
- Existing CRM linkage on the contact, the basis for Person→Contact resolution: Source, SourceID, SourceName, CrmData{ID} (internal/app/repository/contact/base.go:67-69, 331-333).
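To make the Person→Contact resolution from D-12 concrete, here is a minimal sketch built on the linkage fields listed above. The `Contact` struct is a trimmed, hypothetical stand-in for the real repository type; the string ID types, the `"crm"` value for Source, and the rule that SourceID wins over CrmData.ID are all assumptions for illustration. `ErrContactNotMapped` mirrors the CONTACT_NOT_MAPPED outcome discussed in OQ-2 and OQ-3.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrContactNotMapped mirrors the CONTACT_NOT_MAPPED outcome in OQ-2/OQ-3.
var ErrContactNotMapped = errors.New("CONTACT_NOT_MAPPED")

// sourceCRM is a placeholder; the real Source value is not stated in the PRD.
const sourceCRM = "crm"

// Contact is a hypothetical, trimmed view of the CDP contact.
type Contact struct {
	ID       string
	Source   string
	SourceID string
	CrmID    string // stands in for CrmData{ID}
}

// BuildPersonIndex maps CRM person ID to CDP contact ID. SourceID entries
// are indexed first; CrmID only fills keys that are still missing.
func BuildPersonIndex(contacts []Contact) map[string]string {
	idx := make(map[string]string, len(contacts))
	for _, c := range contacts {
		if c.Source == sourceCRM && c.SourceID != "" {
			idx[c.SourceID] = c.ID
		}
	}
	for _, c := range contacts {
		if c.CrmID == "" {
			continue
		}
		if _, ok := idx[c.CrmID]; !ok {
			idx[c.CrmID] = c.ID
		}
	}
	return idx
}

// Resolve returns the CDP contact ID for a CRM person, or a wrapped
// ErrContactNotMapped so the note can go to the failed-record queue.
func Resolve(idx map[string]string, personID string) (string, error) {
	if id, ok := idx[personID]; ok {
		return id, nil
	}
	return "", fmt.Errorf("person %s: %w", personID, ErrContactNotMapped)
}

func main() {
	idx := BuildPersonIndex([]Contact{
		{ID: "c-1", Source: sourceCRM, SourceID: "101"},
		{ID: "c-2", CrmID: "102"},
	})
	for _, person := range []string{"101", "102", "999"} {
		id, err := Resolve(idx, person)
		fmt.Println(person, id, err)
	}
}
```

Building the index once per CID keeps each note lookup constant-time; a real job would likely query Mongo per batch instead of loading every contact into memory.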
qontak.com (Legacy CRM, Rails)
- Note content is sanitized rich HTML (app/models/crm/note.rb:42,378); column t.text (db/schema.rb:1980).
- Three attachment types: images, audios, and documents, the last stored in crm_note_attachments / Crm::NoteAttachment (app/models/crm/note.rb:14,19; app/models/crm/note_attachment.rb).
- crm_checkin is a has_one Crm::Checkin holding geolocation (note.rb:15; app/models/crm/checkin.rb).
- Multi-FK note: person/company/deal/ticket FKs are nullable and type is metadata (note.rb:5-8).
- Real notes API /api/v4/notes uses page/per_page, with no org-scoped bulk/count (app/controllers/api/v4/notes.rb:132-135).
- Mentions via data-user-id, which holds CRM int IDs (note.rb:98-105).
- No deleted_at on crm_notes; hard-delete only (db/schema.rb:1980-2006).
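Because the real notes API is paginated with page/per_page, an extraction loop could follow the shape sketched below, including the configurable inter-page delay proposed as the OQ-7 mitigation. This is only a sketch: `Note`, `PageFetcher`, and `ExtractAll` are hypothetical names, the fetcher stands in for whichever extraction path OQ-7 settles on, and two behaviours are assumed rather than confirmed from the CRM code (pages are numbered from 1, and a page shorter than per_page marks the end).

```go
package main

import (
	"fmt"
	"time"
)

// Note is a hypothetical, trimmed CRM note.
type Note struct {
	ID      int
	Content string
}

// PageFetcher returns one page of notes. A real implementation would call
// the chosen extraction path with page and per_page.
type PageFetcher func(page, perPage int) ([]Note, error)

// ExtractAll pages through the source until a short page is returned,
// sleeping between pages so the source is not hammered. Page numbering
// from 1 and the short-page stop condition are assumptions.
func ExtractAll(fetch PageFetcher, perPage int, delay time.Duration) ([]Note, error) {
	if perPage <= 0 {
		return nil, fmt.Errorf("perPage must be positive, got %d", perPage)
	}
	var all []Note
	for page := 1; ; page++ {
		notes, err := fetch(page, perPage)
		if err != nil {
			return all, fmt.Errorf("page %d: %w", page, err)
		}
		all = append(all, notes...)
		if len(notes) < perPage {
			return all, nil
		}
		time.Sleep(delay)
	}
}

func main() {
	// Fake source with five notes, served two per page.
	source := []Note{{ID: 1}, {ID: 2}, {ID: 3}, {ID: 4}, {ID: 5}}
	fetch := func(page, perPage int) ([]Note, error) {
		start := (page - 1) * perPage
		if start >= len(source) {
			return nil, nil
		}
		end := start + perPage
		if end > len(source) {
			end = len(source)
		}
		return source[start:end], nil
	}
	notes, err := ExtractAll(fetch, 2, 10*time.Millisecond)
	fmt.Println(len(notes), err)
}
```

Returning the notes collected so far alongside the error lets a caller checkpoint progress before retrying the failed page.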
qontak-customer-fe (Nuxt 3)
- CustomerNote has no metadata/tag field (features/customers/store/CustomerStore.ts:37-50); no banner/dismiss/legacy/imported infra (grep empty).
- Renders via DOMPurify; author via owner_name; edit/delete gated on permission (features/customers/detail/components/Notes/components/NotesList/NotesList.vue).
PRD CHANGELOG
| Version | Date | By | Section | Type | Summary |
|---|---|---|---|---|---|
| 2.1 | 2026-06-03 | Score fixes | S4, S6, S7, S8, S13, S14, S15 | UPDATED | Post-score corrections: OQ-2 Risk given a mitigation (pre-migration coverage report + 99% gate) and all OQ deadlines dated (clears Gate 3); reframed trigger as a gocraft/work consumer job + /private/notes/migration/status (D-11); Person→Contact resolution via existing contact.source_id/crm_data.id (D-12); added plan scope + Monitor API states; dated decisions/alternatives. |
| 2.0 | 2026-06-03 | Grounded rewrite | All | REWRITE | Corrected against code: net-new /cdp/notes/migrate + count + legacy_crm_note_id (none existed); caller-set timestamps (SetDefaults overwrote); S2S tenant scoping; CRM content is rich HTML (not plain text); added crm_note_attachments documents; crm_checkin is geolocation; real CRM API is /api/v4/notes page/per_page; multi-FK routing; unmappable-owner label; mentions stripped to text; S06 banner/tag declared out-of-scope (no FE infra). |
| 1.1 | 2026-05-21 | (prior) | — | SUPERSEDED | Prior version (assumed endpoints/fields/source-schema that don't exist). |