Deduplication

Kernel identifies groups of duplicate accounts and, within each group, the primary record to preserve. To do this, it scans all accounts in the CRM with its proprietary, AI-driven algorithm.

The following data points are provided:

| Data | Definition |
| --- | --- |
| Duplicate type | See the Duplicate types table below |
| Duplicate group | A number used to group duplicate accounts into unique groups |
| Duplicate of ID | Salesforce ID of the account that this account is a duplicate of |
| Duplicate - Reasoning | Plain-text reasoning explaining the logic behind the duplicate analysis |
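As a rough illustration of how these four data points might be consumed downstream, the sketch below models one account's output as a Python dataclass. The class name, the field names, and the assumption that a Primary record carries no "Duplicate of ID" are all hypothetical and are not taken from Kernel's schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DuplicateResult:
    """One account's deduplication output (field names are illustrative)."""
    account_id: str                 # Salesforce ID of this account
    duplicate_type: str             # e.g. "Primary", "Exact", "Location"
    duplicate_group: int            # number shared by every account in the group
    duplicate_of_id: Optional[str]  # account this one duplicates (assumed empty for a Primary)
    reasoning: str                  # plain-text explanation of the analysis


def duplicates_of(rows: List[DuplicateResult], target_id: str) -> List[str]:
    """Return the IDs of all accounts flagged as duplicates of target_id."""
    return [r.account_id for r in rows if r.duplicate_of_id == target_id]


# Made-up example rows, for illustration only.
rows = [
    DuplicateResult("001A", "Primary", 7, None, "Recommended record to keep."),
    DuplicateResult("001B", "Exact", 7, "001A", "Same legal name and country."),
]
print(duplicates_of(rows, "001A"))
```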

Duplicate types

Each record is associated with one of the following duplicate types:

| Type | Definition |
| --- | --- |
| Primary | Primary record. Note that duplicates of this account may exist, but this is the record that is recommended to survive the merge. |
| Exact | Two accounts are an exact match when they share the same Kernel ID, or when their legal name, legal country, name, and trading country all match, or when their URL, name, and legal name all align. |
| Location | Physical establishments of the same legal entity sharing the same domain, for example hotel locations, offices, or stores operating at different URLs under the same root domain. One account must be classified as an Establishment. |
| Regional | The trading presence of a legal entity in a different country, for example the UK trading entity of a company whose legal registration is in Germany. Identified when two accounts share the same legal name but operate in different trading countries. |
| Trading | The trading entity linked to its legal entity within the same country, for example the operating brand of a holding company. Identified when two accounts share the same legal name, one as the legal identity and one as the trading identity. |
| Website | Accounts sharing the same URL and name. A softer match than Exact: legal name alignment is not required. Off by default; can be enabled per configuration. |
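The Exact and Website definitions above are stated as explicit field comparisons, so they can be sketched as simple predicates. This is only an illustration: the `Account` fields, the function names, and the use of strict string equality are assumptions. Kernel's actual matching is contextual and AI-based, and will not reduce to plain equality checks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Account:
    """Hypothetical account shape; field names are assumptions."""
    kernel_id: Optional[str]
    name: str
    legal_name: str
    legal_country: str
    trading_country: str
    url: str


def _same(a: Account, b: Account, *fields: str) -> bool:
    """True when every named field is non-empty and equal on both accounts."""
    return all(getattr(a, f) and getattr(a, f) == getattr(b, f) for f in fields)


def is_exact(a: Account, b: Account) -> bool:
    """Exact: same Kernel ID, or one of the two field combinations matches."""
    if a.kernel_id is not None and a.kernel_id == b.kernel_id:
        return True
    if _same(a, b, "legal_name", "legal_country", "name", "trading_country"):
        return True
    return _same(a, b, "url", "name", "legal_name")


def is_website(a: Account, b: Account, enabled: bool = False) -> bool:
    """Website: same URL and name, legal name not required. Off by default."""
    return enabled and not is_exact(a, b) and _same(a, b, "url", "name")
```

With this sketch, two accounts that share a URL and name but differ in legal name are reported as a Website match only when `enabled=True`, which mirrors the "off by default" behaviour described in the table.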

How Kernel identifies duplicates in your CRM

Kernel's deduplication works in two steps:

1. Candidate generation

For each account in your CRM, Kernel scans the full CRM to create a long list of potential duplicate candidates.

2. Candidate selection

Kernel crawls the websites of all candidates and uses data from the Website analysis to determine whether each pair is a true duplicate pair. The duplicate type and group are also calculated.

Kernel uses a contextual, AI-based approach to determine duplicate pairs. For example, it can decide that amazon.fr is a regional duplicate of amazon.com, but that apollo.de is not a regional duplicate of apollo.com.
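The two-step shape described above (a broad candidate pass followed by a stricter confirmation pass) can be sketched as follows. This is not Kernel's algorithm, which is proprietary. The blocking key (`domain_label`), the account dictionary shape, and the `confirm` callback that stands in for website analysis are all hypothetical, and the `analysis` values are placeholders chosen to mirror the amazon/apollo example in the text.

```python
from itertools import combinations
from typing import Callable, Dict, Iterable, List, Tuple

Account = Dict[str, str]  # hypothetical shape: {"id": ..., "url": ...}


def domain_label(url: str) -> str:
    """Crude blocking key: first label of the host, e.g. 'amazon' for amazon.fr."""
    host = url.lower().split("//")[-1].split("/")[0]
    if host.startswith("www."):
        host = host[4:]
    return host.split(".")[0]


def generate_candidates(accounts: Iterable[Account]) -> List[Tuple[Account, Account]]:
    """Step 1: build a long list of candidate pairs that share a blocking key."""
    buckets: Dict[str, List[Account]] = {}
    for acc in accounts:
        buckets.setdefault(domain_label(acc["url"]), []).append(acc)
    pairs: List[Tuple[Account, Account]] = []
    for bucket in buckets.values():
        pairs.extend(combinations(bucket, 2))
    return pairs


def select_candidates(
    pairs: Iterable[Tuple[Account, Account]],
    confirm: Callable[[Account, Account], bool],
) -> List[Tuple[str, str]]:
    """Step 2: keep only the pairs that the confirmation check accepts."""
    return [(a["id"], b["id"]) for a, b in pairs if confirm(a, b)]


accounts = [
    {"id": "1", "url": "https://www.amazon.com/"},
    {"id": "2", "url": "amazon.fr"},
    {"id": "3", "url": "apollo.com"},
    {"id": "4", "url": "apollo.de"},
]

# Placeholder for what a website analysis might conclude (illustrative labels only).
analysis = {"1": "company-A", "2": "company-A", "3": "company-B", "4": "company-C"}


def confirm(a: Account, b: Account) -> bool:
    return analysis[a["id"]] == analysis[b["id"]]


pairs = generate_candidates(accounts)
print(select_candidates(pairs, confirm))
```

The point of the split is that the first pass can be cheap and over-inclusive, because every pair it produces is re-checked in the second pass before being reported.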

Primary record selection

When Kernel identifies duplicates, it designates one record in each group as Primary. Selection is determined in the following order:

1. Duplicate type priority

The type with the highest priority in the group takes precedence. Where multiple types are present, the hierarchy applied is:

Exact > Location > Regional > Trading > Website

2. Identity type

For Regional and Trading groups, the legal entity is preferred over the trading entity.

For Location groups, the parent entity is preferred over the establishment.

3. CRM field values

CRM fields configured for primary selection are compared across remaining candidates. By default, risk score and last activity date are used. The record with the highest value is preferred.
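One possible reading of this ordering is sketched below: the highest-priority type present in the group decides which identity preference applies, and the CRM field values then break any remaining ties. This interpretation is an assumption, as are the `Candidate` fields, the identity labels, and the choice to compare risk score before last activity date.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Hierarchy from step 1, highest priority first.
TYPE_PRIORITY = ["Exact", "Location", "Regional", "Trading", "Website"]

# Preferred identity per governing type, from step 2 (other types have no preference).
PREFERRED_IDENTITY = {"Regional": "legal", "Trading": "legal", "Location": "parent"}


@dataclass
class Candidate:
    """Hypothetical record shape; field names are assumptions."""
    account_id: str
    match_type: str      # duplicate type that placed this record in the group
    identity: str        # "legal", "trading", "parent" or "establishment"
    risk_score: float
    last_activity: date


def governing_type(group: List[Candidate]) -> str:
    """Step 1: the highest-priority type present in the group."""
    return min((c.match_type for c in group), key=TYPE_PRIORITY.index)


def pick_primary(group: List[Candidate]) -> Candidate:
    """Steps 2 and 3: prefer the favoured identity, then the highest CRM values."""
    preferred = PREFERRED_IDENTITY.get(governing_type(group))

    def key(c: Candidate):
        return (c.identity == preferred, c.risk_score, c.last_activity)

    return max(group, key=key)


regional = [
    Candidate("A", "Regional", "trading", 90.0, date(2024, 5, 1)),
    Candidate("B", "Regional", "legal", 10.0, date(2023, 1, 1)),
]
print(pick_primary(regional).account_id)
```

In the example group, the legal entity is selected even though the trading entity has the higher risk score, because identity type is evaluated before CRM field values.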

Duplicate groups

All associated duplicates are assigned a Duplicate group ID. Each duplicate group can have only one Primary account.
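To make the grouping idea concrete, the sketch below turns confirmed duplicate pairs into numbered groups with a small union-find, then checks the one-Primary-per-group rule. Treating groups as connected components of pairs is an assumption about how groups are formed, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def assign_groups(pairs: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Map each account ID to a group number; linked accounts share a number."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    group_numbers: Dict[str, int] = {}
    result: Dict[str, int] = {}
    for acc in list(parent):
        root = find(acc)
        result[acc] = group_numbers.setdefault(root, len(group_numbers) + 1)
    return result


def has_single_primary(types: Dict[str, str], groups: Dict[str, int]) -> bool:
    """Check that every group contains exactly one record typed 'Primary'."""
    counts: Dict[int, int] = {}
    for acc, gid in groups.items():
        counts[gid] = counts.get(gid, 0) + (types[acc] == "Primary")
    return all(n == 1 for n in counts.values())


groups = assign_groups([("A", "B"), ("B", "C"), ("X", "Y")])
types = {"A": "Primary", "B": "Exact", "C": "Regional", "X": "Primary", "Y": "Exact"}
print(groups, has_single_primary(types, groups))
```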
