Duplicate Detection Service
Component Detail
Description
Scheduled background service that compares newly submitted activities against recent history across configurable dimensions: peer mentor ID, contact ID, activity type, and date proximity within a tolerance window. It computes a similarity score (0-100), persists the score and duplicate_candidate_id on the activities table, then auto-flags high-confidence pairs via the activity-flagging subsystem with reason_code duplicate_suspected.
duplicate-detection-service
Sources & reasoning
Required by the blueprint as the core detection engine. NHF operates 1,400 local associations where multiple coordinators can register activities for the same contact, creating a concrete double-counting risk in Bufdir reports. Scheduled detection across the peer mentor, contact, activity type, and date dimensions is the primary technical mechanism for catching duplicates before export.
No source references — this artifact was included based on reasoning alone (see above).
Responsibilities
- Run scheduled background comparisons of new activities against recent history per organization
- Compute similarity scores across peer_mentor_id, contact_id, activity_type, and date proximity dimensions
- Persist similarity_score and duplicate_candidate_id on activity records
- Auto-flag high-confidence pairs via activity-flagging with reason_code duplicate_suspected
- Apply per-organization tolerance configuration for thresholds and date windows
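The scoring responsibility above could be sketched as a weighted match over the four dimensions. This is only an illustration under stated assumptions: the field names, the weights (30/30/20/20), the linear decay for date proximity, and the ScoringConfig shape are all hypothetical and do not come from the actual schema or configuration.

```typescript
// Hypothetical activity shape; the real column names may differ.
interface Activity {
  id: string;
  peerMentorId: string;
  contactId: string;
  activityType: string;
  date: string; // ISO date, e.g. "2024-03-01"
}

// Assumed per-organization scoring settings; weights are expected to sum to 100.
interface ScoringConfig {
  dateToleranceDays: number;
  weights: { peerMentor: number; contact: number; activityType: number; date: number };
}

// Illustrative defaults only, not values from the source.
const DEFAULT_CONFIG: ScoringConfig = {
  dateToleranceDays: 2,
  weights: { peerMentor: 30, contact: 30, activityType: 20, date: 20 },
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function computeSimilarityScore(
  a: Activity,
  b: Activity,
  config: ScoringConfig = DEFAULT_CONFIG,
): number {
  const { weights, dateToleranceDays } = config;
  let score = 0;

  // Exact-match dimensions contribute their full weight or nothing.
  if (a.peerMentorId === b.peerMentorId) score += weights.peerMentor;
  if (a.contactId === b.contactId) score += weights.contact;
  if (a.activityType === b.activityType) score += weights.activityType;

  // Date proximity: full credit on the same day, decaying linearly,
  // and no credit once the gap exceeds the tolerance window.
  const daysApart = Math.abs(Date.parse(a.date) - Date.parse(b.date)) / MS_PER_DAY;
  if (daysApart <= dateToleranceDays) {
    const proximity = 1 - daysApart / (dateToleranceDays + 1);
    score += weights.date * proximity;
  }

  // Clamp to the documented 0-100 range and round to an integer.
  return Math.round(Math.min(100, Math.max(0, score)));
}
```

With these assumed defaults, two activities that share peer mentor, contact, and type and fall on the same day score 100, while the same pair with a different contact scores 70. A weighted sum keeps the score easy to explain to coordinators reviewing flags, though the real service may weigh dimensions differently.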
Interfaces
runDetectionJob(organizationId: string): Promise<DetectionResult>
computeSimilarityScore(a: Activity, b: Activity): number
flagHighConfidencePairs(pairs: DuplicatePair[]): Promise<void>
getDetectionConfig(organizationId: string): Promise<DetectionConfig>
updateDetectionConfig(organizationId: string, config: Partial<DetectionConfig>): Promise<void>
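The interfaces above name DetectionResult, DuplicatePair, and DetectionConfig without defining them. The sketch below shows one way the core of a detection run might fit together, assuming hypothetical shapes for those types and an in-memory, synchronous helper called detectDuplicates (not part of the listed interface). A real runDetectionJob would presumably load activities per organization, write to the activities table, and hand the flagged pairs to flagHighConfidencePairs.

```typescript
// All shapes below are assumptions for illustration only.
interface ActivityRecord {
  id: string;
  similarityScore?: number;      // stands in for the similarity_score column
  duplicateCandidateId?: string; // stands in for the duplicate_candidate_id column
}

interface DuplicatePair {
  activityId: string;
  candidateId: string;
  score: number;
}

interface DetectionConfig {
  flagThreshold: number;     // minimum score for auto-flagging
  dateToleranceDays: number; // consumed by the scorer, unused in this helper
}

interface DetectionResult {
  compared: number;
  flagged: DuplicatePair[];
}

type Scorer = (a: ActivityRecord, b: ActivityRecord) => number;

function detectDuplicates(
  newActivities: ActivityRecord[],
  recentHistory: ActivityRecord[],
  config: DetectionConfig,
  score: Scorer,
): DetectionResult {
  const flagged: DuplicatePair[] = [];
  let compared = 0;

  for (const activity of newActivities) {
    let best: DuplicatePair | undefined;

    // Find the highest-scoring candidate in recent history.
    for (const candidate of recentHistory) {
      if (candidate.id === activity.id) continue; // never compare a record to itself
      compared++;
      const s = score(activity, candidate);
      if (!best || s > best.score) {
        best = { activityId: activity.id, candidateId: candidate.id, score: s };
      }
    }
    if (!best) continue;

    // "Persist" the best match on the record (a database update in practice).
    activity.similarityScore = best.score;
    activity.duplicateCandidateId = best.candidateId;

    // Only high-confidence pairs are queued for flagging.
    if (best.score >= config.flagThreshold) flagged.push(best);
  }

  return { compared, flagged };
}
```

Passing the scorer in as a function keeps the pairing logic independent of how similarity is computed, which makes per-organization tuning easier to test. Note that comparing every new activity against all recent history is quadratic; a production job would likely narrow candidates first, for example by contact or date window.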
Relationships
Dependencies (3)
Components this component depends on
Dependents (3)
Components that depend on this component
Related Data Entities (2)
Data entities managed by this component