Data deduplication

Administrators and data managers can create data maintenance subscriptions to identify duplicate HCP and HCO records in their Network instance. Data maintenance subscriptions are user-directed automated jobs that improve data quality by targeting specific issues, such as sub-object inactivation and duplicate detection. Duplicate records can exist because of poor match rules, DCR processing errors, or because records were added without first searching for existing records. Use data deduplication jobs to compare specific records in your Network instance against all other records in the instance. Data deduplication subscriptions are not supported for custom objects.

This feature is enabled by default in your Network instance.

Finding duplicate records

Similar to source subscriptions, data deduplication subscriptions use the following tools:

  • Data groups - to narrow the number of records being compared
  • Match rules - to determine if records are the same or not
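As a conceptual illustration of how these two tools work together, the following Python sketch groups records first and then applies a match rule only within each group. It is not Network's implementation: the record fields, the `data_group_key` function, and the `match_rule` function are hypothetical names chosen for this example, and real data groups and match rules are configured in the subscription rather than written as code.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records; the field names are illustrative only and do not
# reflect the actual Network data model.
records = [
    {"id": "A1", "first_name": "John", "last_name": "Smith", "postal_code": "10001"},
    {"id": "A2", "first_name": "john", "last_name": "Smith", "postal_code": "10001"},
    {"id": "A3", "first_name": "Jane", "last_name": "Smith", "postal_code": "10001"},
    {"id": "A4", "first_name": "John", "last_name": "Jones", "postal_code": "94105"},
]


def data_group_key(record):
    """Assumed data group: only records sharing a postal code are compared."""
    return record["postal_code"]


def match_rule(a, b):
    """Assumed match rule: same first and last name, ignoring case."""
    return (a["first_name"].lower(), a["last_name"].lower()) == (
        b["first_name"].lower(),
        b["last_name"].lower(),
    )


def find_candidate_duplicates(records):
    # Step 1: the data group narrows the number of records being compared.
    groups = defaultdict(list)
    for record in records:
        groups[data_group_key(record)].append(record)

    # Step 2: the match rule decides whether two records in a group are the same.
    pairs = []
    for members in groups.values():
        for a, b in combinations(members, 2):
            if match_rule(a, b):
                pairs.append((a["id"], b["id"]))
    return pairs


duplicates = find_candidate_duplicates(records)
print(duplicates)
```

Grouping first means that A4 is never compared with the other three records, which is the reason data groups reduce the work a job has to do.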

Match logs contain a summary of the records that were merged or suspect matched during the job. Comparisons are done at the entity level. Existing merge rules are used for sub-objects when records are merged.

All previous unmerges that have occurred in the instance are tracked, and you can exclude them from comparisons when you create a data deduplication job. However, rejected suspect match tasks and duplicates are tracked only from the time the feature was implemented onward.
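The idea of excluding previously unmerged records from comparison can be sketched as a simple filter over candidate pairs. This is an assumption-laden illustration, not a description of how Network stores or applies its tracking data: the `previously_unmerged` set, the record IDs, and the `filter_excluded` function are all hypothetical.

```python
# Hypothetical tracking data: pairs of record IDs that were unmerged earlier.
# frozenset is used so that the order of the two IDs in a pair does not matter.
previously_unmerged = {frozenset(("A1", "A2"))}

# Hypothetical candidate pairs produced by a deduplication comparison.
candidate_pairs = [("A2", "A1"), ("A1", "A3"), ("A3", "A4")]


def filter_excluded(pairs, excluded):
    """Drop any candidate pair that matches a tracked, previously unmerged pair."""
    return [pair for pair in pairs if frozenset(pair) not in excluded]


remaining = filter_excluded(candidate_pairs, previously_unmerged)
print(remaining)
```

In this sketch the pair A2 and A1 is dropped even though the IDs appear in the opposite order from the tracked pair, so records that were deliberately separated are not proposed for merging again.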

You can test a job thoroughly before you commit to merging any records.

Considerations for data deduplication subscriptions

Before you create a subscription, review the following considerations: