Deduplication policies
Duplicate types
Kernel assigns every account to one of five types:
Primary
Primary record
Duplicates of this account may exist, but this is the record recommended to survive the merge.
Exact
Exact match found after ‘cleaning’ and standardizing the URLs
Subdomain
Similar to exact match, e.g. shop.ccs.com vs. ccs.com
Regional
For example, fr.amazon.com, amazon.fr, and amazon.com are all regional duplicates, but apollo.de and apollo.com are not.
Potential
A catch-all category for the hard cases that require extensive work, e.g. corporate or careers sites, investor relations sites, or product/marketing sites.
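As a rough illustration of the Exact and Subdomain types, the sketch below reduces two URLs to bare hosts and compares them. The function names `normalize_host` and `classify_pair` and the exact cleaning steps (lowercasing, dropping the scheme, port, path, and a leading "www.") are assumptions made for this example, not Kernel's actual cleaning rules.

```python
from urllib.parse import urlparse


def normalize_host(url: str) -> str:
    """Reduce a URL to a bare lowercase host (hypothetical cleaning steps)."""
    url = url.strip().lower()
    if "://" not in url:
        url = "//" + url  # lets urlparse read the text as a network location
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host


def classify_pair(url_a: str, url_b: str) -> str:
    """Return 'exact', 'subdomain', or 'unknown' for two account URLs."""
    a, b = normalize_host(url_a), normalize_host(url_b)
    if not a or not b:
        return "unknown"
    if a == b:
        return "exact"
    # One host is the other with extra leading labels, e.g. shop.ccs.com vs. ccs.com
    if a.endswith("." + b) or b.endswith("." + a):
        return "subdomain"
    return "unknown"
```

String comparison alone cannot separate the Regional and Potential types: this sketch would label fr.amazon.com vs. amazon.com as a subdomain match, and it has no way to tell that apollo.de and apollo.com are unrelated.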
Regional account policy
You can treat regional duplicates in one of two ways:
Treat regional sites as subsidiaries (keep separate, e.g., amazon.fr is a child of amazon.com)
Treat regional sites as duplicates (collapse into the global parent)
This setting is used to determine the Cleaning action for regional duplicates.
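One way to picture the two options is as a single setting that maps to a different outcome for a regional duplicate. The enum, the function, and the action labels `link_as_child` and `merge_into_parent` are hypothetical names chosen for this sketch; they are not Kernel's actual setting or action identifiers.

```python
from enum import Enum


class RegionalPolicy(Enum):
    """Hypothetical representation of the regional account policy."""
    SUBSIDIARY = "subsidiary"  # keep regional sites separate, as children
    DUPLICATE = "duplicate"    # collapse regional sites into the global parent


def regional_action(policy: RegionalPolicy) -> str:
    """Map the policy to an illustrative action label for a regional duplicate."""
    if policy is RegionalPolicy.SUBSIDIARY:
        return "link_as_child"
    return "merge_into_parent"
```

Under the first option, amazon.fr would stay a separate record linked under amazon.com; under the second, it would be merged into amazon.com.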
Primary record selection
When the Cleaning Agent identifies duplicate records in your database, it automatically selects one record as the "primary" and marks the others as duplicates. The primary record becomes the master record that all duplicates will merge into or reference.
How primary records are selected
The system evaluates all duplicate records using the following criteria, applied in priority order until a clear winner emerges:
Domain authority
The system prioritizes established domain extensions in this order:
Commercial domains (.com) - highest priority
Government and educational domains (.gov, .edu)
Organizational and network domains (.org, .net)
Newer domains (.io, .ai, .tech)
Regional domains (.co.uk, .de, .fr) - lower priority
Risk score
The record with the highest Account risk is preserved.
Last updated
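The priority-ordered criteria above can be sketched as a tuple comparison, where a later criterion only matters when the earlier ones tie. This is a minimal sketch under stated assumptions, not the Cleaning Agent's implementation: the `Account` fields, the rank given to unlisted extensions, and the rule that the most recently updated record wins the final tie-break are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date

# Lower rank = higher priority, following the domain extension order above.
TLD_RANK = {
    ".com": 0,
    ".gov": 1, ".edu": 1,
    ".org": 2, ".net": 2,
    ".io": 3, ".ai": 3, ".tech": 3,
    ".co.uk": 4, ".de": 4, ".fr": 4,
}
UNLISTED_RANK = 5  # assumption: extensions not listed above rank last


@dataclass
class Account:
    """Hypothetical record shape used only for this example."""
    domain: str
    risk_score: float
    last_updated: date


def tld_rank(domain: str) -> int:
    """Rank a domain by its extension, checking longer suffixes first."""
    host = domain.lower().rstrip(".")
    for suffix in sorted(TLD_RANK, key=len, reverse=True):
        if host.endswith(suffix):
            return TLD_RANK[suffix]
    return UNLISTED_RANK


def select_primary(records: list[Account]) -> Account:
    """Pick the primary: best extension, then highest risk, then newest (assumed)."""
    return min(
        records,
        key=lambda r: (tld_rank(r.domain), -r.risk_score, -r.last_updated.toordinal()),
    )
```

For example, given amazon.fr with a risk score of 90 and amazon.com with a risk score of 10, this sketch selects amazon.com, because the domain extension is compared before the risk score.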

