Skip to main content

RFC Review: Legacy Migration — CRM Contact Notes → CDP Notes

Companion review for rfc-legacy-migration-crm-notes.md, produced by the rfc-reviewer skill. Lives beside the RFC; valid only for the RFC revision in reviewed_rfc_last_updated (2026-06-18, commit 7a24fd4).

Executive Summary

  • Overall Score: 8.5/10
  • Rating: Agentic-Ready
  • RFC Type: backend
  • Sub-Type: new-feature
  • Assessment Confidence: High
  • Applied Caps/Gates: none triggered (every scored category ≥ 7.5). The 9.0+ gate is not met — it requires DMS/DIC/FMC ≥ 8.5, and DIC/FMC sit at 8.0 — so the overall is held at 8.5.
  • Implementation Readiness Verdict: PROCEED — an agent can build the full pipeline now; two items (CRM extraction endpoint, crm_data.id coverage) are runtime prerequisites, not build blockers, and the RFC already isolates the extractor behind an interface.
  • Report Path: cdp/legacy-migration-crm-notes/rfcs/rfc-legacy-migration-crm-notes-review.md
  • RFC Author: Zhelia Alifa | Reviewed: 2026-06-18

An AI agent can read this RFC and produce correct backend code without a clarification meeting: every endpoint, field, decision, and execution chunk is pinned and grounded in verified path:line anchors, and the RFC mirrors a real in-repo framework (ActivityLogMigration*) rather than inventing one. The biggest strength is the decision grounding — 11 ADR blocks that each correct the PRD against the live code (no /cdp/notes/migrate, BasicAuth /private, partial unique index, crm_data.id over source_id). The biggest risk is OQ-2: the entire Person→Contact resolution (Decision 7) assumes crm_data.id stores the CRM Person id; if it actually stores a centralized-contact id, resolution silently fails for every note. That single assumption — not anything structural — is the one thing to confirm (with the per-CID coverage report) before a real migration runs.


Quick Verdict

Why this RFC can be implemented agentically:

  • Every decision names a chosen option, rejected alternatives, and a reference file; the 10-chunk Agent Execution Plan has files + commands + verifiable acceptance criteria.
  • Security and idempotency are specified at implementation depth (deny-by-default bluemonday policy; partial unique index with partialFilterExpression; per-CID Redis lock; SSRF host-allow-list).
  • It mirrors a verified existing pattern (ActivityLogMigrationConsumer.ProcessUpdateUserIDJob), so the agent has a concrete template for handler/service/consumer/worker wiring.

Why this RFC will cause agent guessing or rework:

  • crm_data.id semantics are unconfirmed (OQ-2/REV-1) — resolution may target the wrong id space.
  • A few failure-path numbers are not pinned (CRM call timeout/backoff in ms — REV-2), and §2.F reuses HTTP "5xx" wording for an in-process Mongo write (REV-3).
  • The notes-only crm_note_type_id set ships as a default (1,6) pending PM confirmation (OQ-4/REV-4).

Findings Ledger (carry-forward)

IDSeverityFinding (one line)RFC locationStatusFirst seenResolved inEvidence / fix
REV-1majorcrm_data.id might hold a centralized-contact id, not the CRM Person id; if so Decision 7 resolution fails for all notesDecision 7 · §1 Assumptions · OQ-2fixedR1R2Grounded: crm_data.id == crm_person_id (no separate centralized-contact id space) — contact_sync_request.go:104-105, params_mapper.rb:31, centralized_contacts_controller.rb:120-129, Crm::Lead < Crm::Person. Resolver switched to SearchByAppContactID("crm",…) (search.go:27); OQ-2 downgraded to a coverage-only gate.
REV-2minorCRM extraction failure policy lacks pinned timeout/backoff numbers§2.F · §3.A · Decision 10fixedR1R2Extractor now uses heimdall httpclient (WithHTTPTimeout, iag_mekari.go:69-71) + retrier: 10s timeout, 3× backoff 1s/3s/9s, config CRM_NOTES_EXTRACT_TIMEOUT. Noted current QontakCrmClient has no timeout.
REV-3minor§2.F row 7 used HTTP "5xx → retry once" wording for an in-process Mongo write§2.F row 7 · §3.AfixedR1R2Reworded to Mongo write-error semantics; insert is an idempotent upsert (retry-safe), no HTTP status involved.
REV-4minorNotes-only filter ships as default crm_note_type_id IN (1,6) pending PM confirmDecision 11 · OQ-4openR1PM scoping decision (not code-groundable) — remains OQ-4 with the grounded default; agent reads the set from config. Correctly flagged rather than guessed.
REV-5minorCDG right-to-delete path for migrated notes was implicit§3.DfixedR1R2Grounded: migrated notes inherit native soft-delete (is_deleted, read.go:36-37); no contact-delete→notes cascade exists today (verified) — out of scope, applies to native notes equally; new fields erased with the document.
REV-6minorBulk-insert mechanism under skip-on-conflict not pinnedDecision 4 · §2.EfixedR1R2Pinned to idempotent upsert via existing IDbRepo.BulkUpdateBulkWrite(SetOrdered(false)) (db.go:180-181), UpdateOneModel+SetUpsert+$setOnInsert on (company_sso_id, legacy_crm_note_id) — not CreateMany; MatchedCount=skipped, UpsertedCount=migrated.

Ledger summary: R2 — 5 fixed (REV-1 major + REV-2/3/5/6 minor), 1 open (REV-4, minor, PM-owned), 0 accepted-risk. All fixes are grounded in live code (citations above). REV-4 stays in the RFC Open-Questions table (OQ-4) as a product decision. A full re-score was not re-run this cycle; the fixes strengthen DIC/FMC/SAS and would, if anything, raise the score — see R2 note in Review History.


PRD → RFC Traceability Matrix

Standard format (PRD exists)

PRD ElementRFC SectionCoverage
NOTES-MIG-S01 (run migration)§2.4 rows 1-2, Decision 1/2, Detail 1.CFull
NOTES-MIG-S02 (transform note)§2.4 algorithm, Decisions 3/5/6/7/8, Detail 1.CFull
NOTES-MIG-S03 (idempotent re-run)Decision 4, §2.E, Detail 1.CFull
NOTES-MIG-S04 (validation)§2.4, §2.F, Detail 1.CFull
NOTES-MIG-S05 (view migrated notes)§1 Out-of-Scope #8, Decision 6Full (no FE work; render via existing UI)
NOTES-MIG-S06-NEG (mentions/activities)Decision 5, §2.4 filterFull
Constraint: idempotency unique indexDecision 4 (partial unique index)Full
Constraint: caller-set timestampsDecision 3Full
Constraint: S2S + per-batch company_sso_idDecision 2, §3Full
Constraint: server-side sanitizationDecision 5, §3Full
Constraint: notes-only scopeDecision 11 / OQ-4Partial — exact type-id set pending PM (REV-4)
Dependency: CRM extraction (real)Decision 10, OQ-7, §1 DepsFull (design); external build pending
§12 Observability events§3 Monitoring (names preserved)Full
§13 Success metrics§1 Success Criteria + §3 SLOFull
Reverse: legacy_owner_label field§1 PRD-to-Schema, Decision 6Justified (D-7)
Reverse: in-process batch write (no /cdp/notes/migrate)Decision 1Justified (D-1/D-11 grounding)

Summary: 13 of 14 PRD elements fully covered, 1 partial (notes-only type-id set), 0 missing. Every RFC decision traces to a PRD decision (D-1..D-12) or a grounded correction — no scope creep.


Scorecard

Backend Scorecard (12 core + 1 conditional)

CategoryScoreEvidence-Based Rationale
PRT — PRD Traceability9.0Bidirectional matrix (Detail 1.A forward+reverse), PRD §-coverage table, per-story map (Detail 1.C) covering all 6 stories once.
TDC — Technical Decisions9.011 ADR blocks, each with chosen option + rejected alternatives + reversibility, grounded in files; no "TBD"/dangling.
DMS — Data Model & Schema8.5Mongo (schemaless): erDiagram, two new fields, partial unique index JSON migration, per-status lifecycle, PII classification, bounded cardinality. Not classic SQL DDL but the migration is writable as-is.
ACV — API Contract & Versioning8.52 endpoints with method/path/auth (BasicAuth)/request/response/status codes/idempotency; internal calls + resolution algorithm specified; inbound n/a — puller.
DIC — Data Integrity & Consistency8.0Idempotency via partial unique index + skip-on-conflict; per-CID lock; concurrency map; timestamp preservation. Held <8.5 by REV-3 (HTTP wording on the Mongo write) and REV-6 (bulk-call not fully pinned).
FMC — Failure Mode & Retry Coverage8.0Failure Mode + Error Response catalogs; halt at >1%; async job spec with poison handling; attachment SSRF/size guard. Held <8.5: CRM timeout/backoff not numerically pinned (REV-2).
CSS — Concurrency & Scaling8.0Per-CID Redis lock; batch 500/max 1000; throughput (10k/hr, ≤4h); horizontal workers; indexed $in resolution.
SAS — Security & Authorization9.0Threat model (7 vectors incl. SSRF + storage DoS); BasicAuth S2S; explicit per-batch tenant scoping; deny-by-default sanitizer (no style/javascript:); SSRF host-allow-list + magic-byte; Role×Endpoint matrix; secrets via config.
MRP — Migration & Rollout Plan8.5Flag (per-CID, default OFF); 3 stages + coverage gate; rollback (flag + make migrate-down); config contract; deploy order (CRM→BE→Ops); partial-index up/down.
OBS — Observability Definition8.0PRD event names preserved exactly; structured slog fields enumerated; PagerDuty P1/P2 + Slack alerts; SLO targets. Metric naming is event-style (no RED/USE metric names) — acceptable for a one-time job.
SBC — Service Boundary & Coupling8.5contact-service owns it; CRM read-only (squad boundary, async pull); Launchpad non-blocking; per-service responsibility table + Detail 2.F boundary matrix.
CPA — Pattern Alignment9.0Patterns-to-Follow table citing reference files; explicitly mirrors ActivityLogMigration*; index-migration JSON pattern reused; deviations (e.g. /cdp/private) justified.
CDG — Compliance & Data Governance7.5Active (PII). §3.D classification + retention + residency OQ + no-PII-logs + checkin drop. Held at 7.5: right-to-delete path is implicit (REV-5).

Resource & Cost Advisory

  • One-time backfill: ~21k notes × ~130 CIDs, bounded. New attachment-storage egress (download + re-upload incl. documents) is the main cost — the RFC flags the storage-quota confirm (OQ-9) and a per-CID volume bound; route to CDP Infra for sizing. No new standing infra (reuses Redis/Mongo/workers). Not blocking.

Decision Closure Assessment

Decision Index

#DecisionStatusCritical Gaps
1In-process batch write (no /cdp/notes/migrate)Resolved
2gocraft/work trigger + /private BasicAuth + Redis statusResolved
3Caller-set timestamps (bypass SetDefaults)Resolved
4legacy_crm_note_id + partial unique indexResolvedbulk-call mechanism not fully pinned (REV-6)
5Server-side sanitize + strip mentionsResolved
6owner_id=null + legacy_owner_labelResolved
7Resolve via crm_data.id (indexed)Partialcrm_data.id may be the wrong id space (REV-1/OQ-2)
8Re-link images+audios+documentsResolved
9Drop crm_checkin geolocationResolved
10Extend QontakCrmClient; CRM builds endpointPartialexternal endpoint contract not yet defined (OQ-7); extractor stubbed behind interface
11Notes-only filterPartialexact type-id set pending PM (REV-4/OQ-4)

Aggregate: 8 of 11 Resolved, 3 Partial, 0 Dangling.


Decision: 7 — Resolve Person→Contact via crm_data.id

Status: Partial

What was decided "resolve by crm_data.id via SearchWithFilters(ctx, bson.M{"crm_data.id": {"$in": batchIDs}}, …), apply person-first precedence, fall back to source_id only where coverage is incomplete."

Alternatives considered Documented: source_id (rejected as primary — unindexed); net-new mapping table (rejected as primary — linkage already exists). Good.

Grounding in existing code Strong: contact/base.go:53,68-70,342-344, index crm_contact_index (001_create_contact.up.json), contact/search.go:125.

Interface specification The resolution algorithm is specified (batched $in, person-first precedence, CONTACT_NOT_MAPPED on miss).

Failure handling CONTACT_NOT_MAPPED → skip + count failure, no halt. Covered.

Challenge results

  • Scale: uses the existing index; $in batching bounds query size. Holds.
  • Reversibility: internal; mapping-table fallback is additive. Easy.
  • Consistency: consistent with §2.4 algorithm and OQ-2.
  • Agent implementability: the agent can implement the query, but if crm_data.id is the wrong id space the code runs and silently resolves nothing. This is why it's Partial, not Resolved.

Gaps and suggestions Missing: confirmation of what crm_data.id actually stores per CID. Suggested resolution: as OQ-2 states — run the coverage report on a sample CID first; if crm_data.id ≠ CRM Person id, switch the resolver key (or populate a mapping). Open question: does crm_data.id hold the CRM crm_person_id, or the centralized-contact id?


Decision: 10 — Extend QontakCrmClient; CRM squad builds the org-scoped endpoint

Status: Partial (external dependency)

The contact-service side is fully specified (extend qontak_crm.go with ListPersonNotes, reuse CRM_API_AUTH, 5xx/timeout handling). The CRM endpoint contract itself is not defined (request/response shape, pagination semantics, notes-only filter param) — correctly an external dependency (OQ-7) and isolated behind an interface so build can proceed with a stub. Add the agreed CRM contract before integration testing. REV-2 (pin timeout/backoff) applies here.


Decision: 11 — Notes-only filter

Status: Partial — default crm_note_type_id IN (1,6) pending PM (OQ-4/REV-4). Agent should read the set from config, not hardcode.

(Decisions 1-6, 8, 9 are Resolved with named options, rejected alternatives, grounded files, and failure handling — abbreviated per the ≥7.0 optimization rule.)


Data Integrity Deep-Dive

Write PathTransaction ScopePartial Failure BehaviorIdempotency KeyConsistency GuaranteeDuplicate Handling
CreateNotesBatch (note insert)per-batch bulk write; per-row independentrow error → count failure, continue; >1% → haltlegacy_crm_note_id (+ partial unique index)eventual (batched)E11000 → skip (notes_skipped++) — REV-6: pin the driver call
Redis status updatesingle key writeoverwrite-latestper-CID keyeventualidempotent (last write wins)
Attachment re-upload (CDP storage)single PutObjectdownload/upload fail → insert-without + logdeterministic key {company_sso_id}/{legacy_crm_note_id}/{asset}n/aoverwrite same key safely
Owner resolution(read-only; no write)unmappable → null + labeln/an/an/a

Solid. The only tightening is REV-6 (specify InsertMany(ordered:false) vs per-doc upsert so partial-batch dup counting is deterministic).


Concurrency Collision Map

#Shared ResourceWritersCollision ScenarioResolution MechanismLock Failure BehaviorAssessment
1Migration run (one CID)Ops (multiple enqueues)two jobs same CIDper-CID Redis in-progress lock2nd enqueue → 409 JOB_ALREADY_RUNNINGadequate
2contact_notes docconsumer (re-run / overlapping batch)same note twicepartial unique indexE11000 → skipadequate
3CDP storage objectconsumerre-upload on re-rundeterministic keyoverwrite-safeadequate

All collision points listed with mechanisms. No gaps.


API Contract Completeness Check

EndpointRequest SchemaResponse SchemaError TaxonomyAuth SpecIdempotencyExample PayloadsAssessment
POST /private/notes/migratecomplete ({cid, company_sso_id})complete ({job_id, status})complete (403/404/409/401)specific (mymiddleware.BasicAuth)defined (per-CID lock)partial (no inline JSON example)5.5/6
GET /private/notes/migration/statuscomplete (query cid)complete (status payload)complete (404→not_started)specific (BasicAuth)n/a (read)partial5/6

Both endpoints are consumer-ready; the only nit is no inline JSON example payload (the field lists are complete, so this is cosmetic).


Async Job / Event Consumer Spec

Job/ConsumerTriggerInput ShapeRetry PolicyDLQConcurrency LimitIdempotency KeyTimeoutAssessment
NotesMigrationConsumer.ProcessNotesMigrationJobEnqueueJob(NotesMigrationJobName)specified ({cid, company_sso_id} via job.Args["data"])specified (extraction 3× backoff — pin ms, REV-2; per-note counted not retried)gocraft dead pool (implied by framework; not named — minor)per-CID lock (single in-flight/CID)legacy_crm_note_id + per-CID lock≤4h/CID window6.5/7

Strong. Two minor tightenings: pin the extraction timeout/backoff (REV-2) and name the dead-pool/poison destination explicitly (the framework provides one; state it).


Compliance Trigger Check

TriggerFound?Data LocationClassificationAssessment
PII (name, content, attachments)yescontact_notes.note, .attachments[], legacy_owner_labelPIIhandled (§3.D) — right-to-delete implicit (REV-5)
Payment datano
Health datano
User content with retentionyesmigrated notesper CDP note lifecyclehandled
Auth/session datano
Cross-border data transferyes (potential)CDP attachment storage regionUU PDPflagged (OQ-9) — InfoSec to confirm region

CDG Status: Active — scored 7.5 (right-to-delete path should be made explicit).


Agentic Readiness Deep-Dive

Vague Word Audit

#Word/PhraseLocationImpactConcrete Replacement
1"retry 3× backoff" (no ms)§2.F row 2agent picks default delays"1s/3s/9s, 10s per-request timeout" (REV-2)
2"5xx → retry once" (on a Mongo write)§2.F row 7agent looks for an HTTP status that doesn't existMongo write-error wording (REV-3)

Total vague words in spec sections: 2 (both minor, both ledgered).

Dangling Alternatives

#AlternativesLocationImpact
noneevery decision names one chosen option

Total dangling alternatives: 0.

Task Decomposition Assessment

ChunkAcceptance CriteriaAssessment
1-10 (§4.C)each has files + commands + verifiable ACverifiable

The 10-chunk plan is the strongest agentic-readiness artifact — ordered, file-pathed, command-backed, with assertable criteria.


Strengths

  • Decision grounding (TDC/CPA): 11 ADR blocks that correct the PRD against verified code (Decision 1 /cdp→in-process write; Decision 2 BasicAuth /private; Decision 4 partial unique index). An agent inherits the why, not just the what.
  • Security depth (SAS §3): deny-by-default sanitizer policy, SSRF host-allow-list + magic-byte validation, explicit per-batch tenant scoping, Role×Endpoint matrix — well beyond "authenticated endpoint".
  • Executable plan (§4.C/§4.D): 10 chunks with files + repo-sourced commands (make test/make migrate-up) + verifiable acceptance criteria, plus a runnable rollback recipe.

Biggest Gaps

  • crm_data.id semantics (Decision 7 / REV-1): the whole resolution strategy rests on an unconfirmed assumption; confirm via OQ-2's coverage report before any real CID, or resolution silently no-ops.
  • Unpinned failure numbers (§2.F / REV-2) and HTTP-wording-on-Mongo-write (§2.F row 7 / REV-3): small but real spec drift that would make an agent guess delays or hunt for a non-existent status code.
  • CDG right-to-delete (§3.D / REV-5): PII deletion path for migrated notes + their re-uploaded attachments is implicit; name it.

Priority Actions

  1. Decision 7 / OQ-2 — confirm crm_data.id stores the CRM crm_person_id (run the per-CID coverage report on a sample CID); if not, switch the resolver key. Unblocks correct resolution.
  2. §2.F rows 2 & 7 — pin CRM extraction timeout/backoff in ms (REV-2) and reword the batch-insert failure path in Mongo write-error terms, not HTTP "5xx" (REV-3).
  3. Decision 11 / OQ-4 — get the confirmed crm_note_type_id set from PM; read it from config.
  4. §3.D — state the right-to-delete path for migrated notes + attachments (REV-5), and pin the bulk-insert driver call (REV-6).

Backend Contract Addendum

Endpoint Contract Details

EndpointMethod/PathAuthZRequest ContractResponse ContractError ContractIdempotency/VersioningStatus
migratePOST /private/notes/migratemymiddleware.BasicAuth (S2S){cid, company_sso_id}{job_id, status:"in_progress"}403 FLAG_DISABLED, 404 CID_NOT_FOUND, 409 ALREADY_MIGRATED/JOB_ALREADY_RUNNING, 401/403per-CID Redis lock; path-versioned /privateComplete (add inline JSON example)
statusGET /private/notes/migration/status?cid=BasicAuthquery cid{status, progress_pct, notes_processed, notes_total, failure_rate, match_pct, error_log_url?}404→not_startedreadComplete

Database Changes Details

ChangeTable/EntityDDL / Shape DiffData Migration PlanRollback PlanCompatibility WindowStatus
add fieldscontact_notes+legacy_crm_note_id string, +legacy_owner_label string (schemaless — no DDL)none (app struct only)remove fields (no data change)additive; existing reads unaffectedComplete
add indexcontact_notescreateIndexes unique + partialFilterExpression {legacy_crm_note_id:{$exists:true}}make migrate-upmake migrate-down (drop index; data untouched)partial filter ⇒ existing notes never indexedComplete

Implementation Readiness Checklist

Unblocked (agent can proceed)

  • PRD → RFC traceability matrix complete
  • All technical decisions resolved with alternatives rejected (8/11 Resolved, 3 Partial with adopted defaults)
  • Failure modes handled per external interaction with error catalog
  • Configuration contract (env vars, flag, defaults)
  • Pattern alignment verified
  • Rollout plan with flag + rollback
  • Observability events + alerts defined
  • Task decomposition with acceptance criteria per chunk
  • Schema changes at (Mongo) field + index precision
  • API contracts with request/response/error taxonomy/auth
  • Transaction boundaries + idempotency keys per write path
  • Concurrency collision points with resolution mechanisms
  • Security: auth, validation, injection (SSRF/XSS), tenancy isolation
  • Migration plan: additive, partial-index, rollback script
  • Service boundary + coupling documented
  • Compliance handled (PII) — right-to-delete to be made explicit

Blocked (must fix first)

  • none that block building; OQ-2 (crm_data.id) + OQ-7 (CRM endpoint) block running a real migration

Verdict: Ready to implement (build the pipeline now; gate the first real CID on OQ-2 + OQ-7).


Task Manifest

The RFC specifies a 10-chunk plan (§4.C); verified as ordered, file-pathed, command-backed, with verifiable AC. No re-derivation needed — use §4.C as the manifest. Suggested addition: a chunk-0 "confirm crm_data.id coverage on a sample CID (OQ-2)" gate before chunk 3 (resolver).


Dangling Decisions Log

#DecisionLocationOwnerDeadline
none (3 Partials carry adopted defaults + owners via OQ-2/4/7)

Open Questions

#QuestionCategorySeverity
1Does crm_data.id store the CRM crm_person_id or the centralized-contact id? (REV-1)DIC / SBCBlocking (for run, not build)
2Pin CRM extraction timeout + backoff (ms)? (REV-2)FMCImportant
3Reword batch-insert failure path from HTTP "5xx" to Mongo write-error? (REV-3)DICImportant
4Confirmed crm_note_type_id set for notes-only? (REV-4)TDCImportant
5Right-to-delete path for migrated notes + attachments? (REV-5)CDGImportant
6Bulk-insert driver call (InsertMany(ordered:false))? (REV-6)DICNice-to-have

Evidence Notes

  • Technical Decisions (§2): 11 grounded ADR blocks — the dominant evidence for the 9.0 TDC and the PROCEED verdict.
  • §2.0 Source Verification: every anchor carries verified path:line evidence across both worktrees — drives High confidence.
  • §2.F / §3.A: failure catalogs are thorough but carry the two minor wording/number gaps (REV-2/3) that hold DIC/FMC at 8.0.
  • §3.D: compliance present but right-to-delete implicit — holds CDG at 7.5.

Review History

CycleDateReviewed RFC revision (last_updated / commit)ScoreVerdictFindings open → fixedNotes
R12026-06-182026-06-18 / 7a24fd48.5PROCEED6 open (1 major, 5 minor), 0 fixedInitial review; backend new-feature; no caps triggered; 9.0+ gate missed on DIC/FMC < 8.5
R22026-06-182026-06-18 / fixes commit8.5 (not re-scored)PROCEED5 fixed, 1 open (REV-4, PM)Author applied grounded fixes for REV-1/2/3/5/6 (id-space confirmed, heimdall timeout, upsert idempotency, right-to-delete, resolver method). Score not formally re-run; fixes strengthen DIC/FMC/SAS. REV-4 remains a PM scoping decision.