Deduplication
Kernel scans all accounts in the CRM with its proprietary, AI-driven algorithm, identifies groups of duplicate accounts, and selects the master record to preserve in each group.
The following data points are provided:
Duplicate type: The category of duplicate (see Duplicate types below)
Duplicate group: A number that identifies the duplicate group the account belongs to
Duplicate of ID: Salesforce ID of the account that this account duplicates
Duplicate - Reasoning: Plain-text explanation of the logic behind the duplicate analysis
Duplicate types
Each record is associated with one of the following duplicate types:
Primary: The primary record. Duplicates of this account may exist, but this is the record recommended to survive the merge.
Exact: An exact match found after ‘cleaning’ and standardizing the URLs
Subdomain: Similar to an exact match, but one URL is a subdomain of the other, e.g. shop.ccs.com vs. ccs.com
Regional: Regional versions of the same website, e.g. fr.amazon.com, amazon.fr, and amazon.com are all regional duplicates, but apollo.de and apollo.com are not.
Potential: A catch-all category for the hard cases that require extensive work, e.g. corporate or careers sites, investor relations sites, or product/marketing sites.
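To make the Exact and Subdomain types concrete, here is a minimal Python sketch of URL cleaning and comparison. It is an illustration only, not Kernel's algorithm: the function names normalize_host and compare_hosts are hypothetical, and the cleaning rules (lowercasing, dropping the scheme, path, port, and a leading "www.") are assumptions. The sketch does not attempt the Regional or Potential types, which the text describes as context-dependent, so a regional pair such as fr.amazon.com and amazon.com would be labeled Subdomain by this naive check.

```python
from typing import Optional
from urllib.parse import urlsplit


def normalize_host(url: str) -> str:
    """Reduce a URL to a bare lowercase hostname (assumed cleaning rules)."""
    url = url.strip().lower()
    if "://" not in url:
        # Prefix with "//" so urlsplit parses the value as a network location.
        url = "//" + url
    host = urlsplit(url).hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    return host.rstrip(".")


def compare_hosts(url_a: str, url_b: str) -> Optional[str]:
    """Return "Exact", "Subdomain", or None for a pair of URLs."""
    a, b = normalize_host(url_a), normalize_host(url_b)
    if not a or not b:
        return None
    if a == b:
        return "Exact"
    # One host is the other with extra leading labels, e.g. shop.ccs.com vs. ccs.com.
    if a.endswith("." + b) or b.endswith("." + a):
        return "Subdomain"
    return None


if __name__ == "__main__":
    print(compare_hosts("https://www.ccs.com/", "http://ccs.com"))
    print(compare_hosts("shop.ccs.com", "ccs.com"))
    print(compare_hosts("apollo.de", "apollo.com"))
```

A production version would likely need a public-suffix list so that hosts such as example.co.uk are not compared against bare suffixes like co.uk.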
How Kernel identifies duplicates in your CRM
Kernel's deduplication works in two steps:
Candidate selection
Kernel crawls the websites of all candidates and uses data from the Website analysis to determine whether each pair is a true duplicate pair. The duplicate type and group are also calculated.
Kernel uses a contextual, AI-based approach to determine duplicate pairs, e.g. to decide that amazon.fr is a regional duplicate of amazon.com, but apollo.de is not a regional duplicate of apollo.com.
Master record selection
Kernel uses a variety of factors to determine the master record (Primary), including:
The top-level domain (TLD)
Redirecting domains
Website verification status (Website analysis)
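One way to picture how factors like these could combine is a simple scoring function. The sketch below is hypothetical and is not Kernel's selection logic: the Candidate fields, the preferred TLD list, the weights, and the assumption that a domain redirecting elsewhere is a weaker master candidate are all invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    account_id: str             # hypothetical account identifier
    host: str                   # cleaned hostname, e.g. "amazon.com"
    redirects_elsewhere: bool   # True if the domain redirects to another domain
    website_verified: bool      # assumed to come from the Website analysis

# Assumption: which TLDs to prefer and how much each factor counts.
PREFERRED_TLDS = (".com",)


def score(candidate: Candidate) -> int:
    """Sum illustrative weights for the three factors listed above."""
    total = 0
    if candidate.host.endswith(PREFERRED_TLDS):
        total += 1
    if not candidate.redirects_elsewhere:
        total += 2
    if candidate.website_verified:
        total += 2
    return total


def pick_primary(candidates: List[Candidate]) -> Candidate:
    """Return the highest-scoring candidate; ties go to the earliest one."""
    return max(candidates, key=score)


if __name__ == "__main__":
    group = [
        Candidate("A-1", "amazon.fr", redirects_elsewhere=True, website_verified=True),
        Candidate("A-2", "amazon.com", redirects_elsewhere=False, website_verified=True),
    ]
    print(pick_primary(group).account_id)
```

Additive weights keep the example easy to read; the real ranking may weigh these and other factors quite differently.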
Duplicate groups
All associated duplicates are assigned a Duplicate group ID. Each duplicate group can have only one Primary account.
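If you export the deduplication fields from your CRM, the one-Primary-per-group rule can be checked with a few lines of Python. This is a sketch under assumptions: the input is taken to be (Duplicate group, Duplicate type) pairs, the function name groups_without_single_primary is hypothetical, and groups with no Primary at all are also flagged, which goes slightly beyond the rule as stated.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Tuple


def groups_without_single_primary(
    records: Iterable[Tuple[Hashable, str]],
) -> List[Hashable]:
    """Return the sorted group IDs that do not have exactly one Primary record."""
    groups = set()
    primaries = Counter()
    for group_id, duplicate_type in records:
        groups.add(group_id)
        if duplicate_type == "Primary":
            primaries[group_id] += 1
    # Counter returns 0 for groups that never had a Primary record.
    return sorted(g for g in groups if primaries[g] != 1)


if __name__ == "__main__":
    exported = [
        (1, "Primary"), (1, "Exact"),
        (2, "Primary"), (2, "Primary"),
        (3, "Regional"),
    ]
    print(groups_without_single_primary(exported))
```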