Website analysis
Kernel identifies and fixes accounts whose website is missing or incorrect (gmail.com, bit.ly, etc.). A correct mapping between the company and its website is a prerequisite for cleaning & enrichment.
Kernel uses the following data to make its recommendation:
Related contact domains
Billing/Shipping address
Company name
Alternative websites, billing emails, account notes, and existing LinkedIn profile
The output of the website analysis is the following data:
Website status
Whether the website is functional or not. A website counts as non-functional if it returns a 4xx or 5xx status code, shows a parked-domain page or an “out of business” message, or has no content
Resolved domain
The final website domain after following all redirects
Inferred domain
If the original website was incorrect, malformed, or invalid, the inferred domain shows the correct website
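As a rough illustration of how these three outputs fit together, the record below is a minimal sketch. The class name, field names, status labels, and the effective_domain helper are all assumptions made for this example and are not Kernel's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WebsiteAnalysisResult:
    """Hypothetical container for the three outputs described above."""

    # Assumed labels: "Working" or "Not working".
    website_status: str
    # Final domain after following all redirects, if the site could be reached.
    resolved_domain: Optional[str] = None
    # Populated only when the original website was incorrect, malformed, or invalid.
    inferred_domain: Optional[str] = None

    def effective_domain(self) -> Optional[str]:
        """Hypothetical helper: prefer the inferred domain when one exists."""
        return self.inferred_domain or self.resolved_domain
```

For an account whose website simply redirects, only resolved_domain would be set; for an account whose website field held gmail.com, inferred_domain would carry the corrected value.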


Website analysis
Kernel automatically cleans and corrects website data to ensure accuracy. The process begins by identifying and handling invalid entries that often originate from form submissions.
Step 1: Removing Invalid Domains
First, Kernel removes invalid domains by checking them against a static list of common errors that we maintain. This list targets entries such as:
Public email providers like gmail.com, mail.ru, and outlook.com.
Placeholder domains such as test.com.
Link shorteners like bit.ly and linktr.ee. Kernel will first attempt to follow these short links to see if they resolve to a valid corporate website before discarding them.
This step distinguishes between facebook.com, the company's own website, and facebook.com/user-profile, a page hosted on Facebook (e.g., an influencer's profile).
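The checks in Step 1 can be sketched roughly as follows. This is a simplified illustration, not Kernel's implementation: the domain sets contain only the examples named above, and the function names and return labels ("invalid", "follow_short_link", "keep") are assumptions.

```python
from urllib.parse import urlsplit

# Illustrative entries only; the real static list is maintained by Kernel.
PUBLIC_EMAIL_PROVIDERS = {"gmail.com", "mail.ru", "outlook.com"}
PLACEHOLDER_DOMAINS = {"test.com"}
LINK_SHORTENERS = {"bit.ly", "linktr.ee"}
SOCIAL_PLATFORMS = {"facebook.com"}


def split_website(raw: str) -> tuple:
    """Return (host without a leading 'www.', path without a trailing slash)."""
    value = raw.strip().lower()
    if "://" not in value:
        value = "//" + value  # lets urlsplit parse a bare domain as a host
    parts = urlsplit(value)
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host, parts.path.rstrip("/")


def classify_website(raw: str) -> str:
    """Classify a CRM website value (labels are hypothetical)."""
    host, path = split_website(raw)
    if not host:
        return "invalid"
    if host in PUBLIC_EMAIL_PROVIDERS or host in PLACEHOLDER_DOMAINS:
        return "invalid"
    if host in LINK_SHORTENERS:
        # Follow the link first; discard only if it does not resolve
        # to a valid corporate website.
        return "follow_short_link"
    if host in SOCIAL_PLATFORMS and path:
        # A profile page on the platform, not the platform's own site.
        return "invalid"
    return "keep"
```

With these sets, classify_website("facebook.com") is kept while classify_website("https://www.facebook.com/user-profile") is treated as invalid, which mirrors the company-versus-profile distinction described above.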
Step 2: Inferring Missing Domains
For any account that lacks a valid website after the cleansing step, Kernel automatically infers the correct one. Kernel combines several techniques to identify the company's website and assign the proper domain, turning an incomplete record into an actionable one.
Missing domain techniques
Check if the CRM name is actually a website
If the name is Acme.com, Kernel will use this as the website and update the Name to Acme
Existing LinkedIn company profile
If the account has an associated LinkedIn profile, Kernel will check if the profile has a valid domain associated with it.
Related contact domains
If the account has contacts with associated email addresses, Kernel will extract the domain from each address and use it as evidence, e.g. an address ending in @kernel.ai yields kernel.ai
URL typo
If the website appears to have been fat-fingered, e.g. delotte.com or hubsport.com, Kernel will replace it with the correctly typed version.
Alternative websites
If the CRM has custom account fields used to store domain names, Kernel will use these for evidence as well.
Address lookup
If the account has a valid billing or shipping address, Kernel will look up the company and find any associated domains.
Web search
Kernel searches across the Internet to find the company's website
Web search (LinkedIn)
Kernel will search across all LinkedIn accounts for a suitable match.
Based on these techniques, Kernel produces a list of candidate websites. Kernel crawls the websites of all candidates, feeding their content into a proprietary AI-based algorithm to determine if the pairing is accurate.
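Three of the techniques above (name as website, related contact domains, and URL typo) can be approximated with a few lines of code. The sketch below is illustrative only: the function names, the KNOWN_DOMAINS reference list, and the similarity cutoff are assumptions, and the typo fix uses Python's standard difflib fuzzy matching as a stand-in for whatever Kernel does internally. The LinkedIn, address lookup, and web search techniques, as well as the crawl and AI-based verification of each candidate, depend on external services and are not shown.

```python
import difflib
import re
from typing import Optional

DOMAIN_RE = re.compile(r"[a-z0-9-]+(\.[a-z0-9-]+)+")
# Hypothetical reference list of correctly spelled domains.
KNOWN_DOMAINS = ["deloitte.com", "hubspot.com"]
PUBLIC_EMAIL_PROVIDERS = {"gmail.com", "mail.ru", "outlook.com"}


def domain_from_name(name: str) -> Optional[str]:
    """Return the account name as a domain if it looks like one (e.g. Acme.com)."""
    candidate = name.strip().lower()
    return candidate if DOMAIN_RE.fullmatch(candidate) else None


def domains_from_emails(emails: list) -> list:
    """Extract unique, non-public domains from contact email addresses."""
    domains = []
    for email in emails:
        _, sep, domain = email.strip().lower().rpartition("@")
        if sep and domain and domain not in PUBLIC_EMAIL_PROVIDERS:
            if domain not in domains:
                domains.append(domain)
    return domains


def fix_typo(website: str) -> Optional[str]:
    """Return the closest known domain, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(
        website.strip().lower(), KNOWN_DOMAINS, n=1, cutoff=0.85
    )
    return matches[0] if matches else None


def candidate_domains(name: str, website: Optional[str], emails: list) -> list:
    """Collect de-duplicated candidates, in the order the techniques ran."""
    found = [domain_from_name(name), fix_typo(website or "")]
    found.extend(domains_from_emails(emails))
    candidates = []
    for domain in found:
        if domain and domain not in candidates:
            candidates.append(domain)
    return candidates
```

Each candidate produced this way is only evidence. In the process described above, it still has to pass the crawl-based verification before it is assigned to the account.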
Step 3: Website verification
Kernel comprehensively verifies each website in your CRM to determine its true operational status. Our multi-layered process goes far beyond a simple ping to deliver a verdict on a site's true business viability.
Key verification features include:
URL Path Resolution: Traces the website's path to its final destination, automatically following redirects (e.g., frontapp.com to front.com) and handling common URL variations like the www prefix.
Unrestricted Global Access: Bypasses regional firewalls and restrictions, such as GDPR-related geo-blocking, to ensure a reliable connection can be established from anywhere in the world.
Intelligent Content Analysis: Recognizes that a simple "200 OK" status code is not enough. The system analyzes page content to identify and flag non-operational sites, including:
Domain Parking: Generic landing pages from services like HugeDomains.
Business Closures: Explicit "out of business" or "service unavailable" messages.
Domain Misuse: Sites that have been acquired and repurposed for unrelated or illicit activities, such as gambling websites.
False Negative Prevention: Cross-references any unresponsive site against a curated database of known corporate domains. This safeguard prevents legitimate sites from being incorrectly flagged as 'Not working' due to a temporary server glitch or network outage.
This comprehensive approach ensures the final verdict reflects a website's true business viability, not just its momentary technical uptime.
The website verification result is a key factor in determining whether the recommended cleaning action is "Delete."
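The decision logic behind the verification step can be sketched as a single function that takes an already fetched response. This is a simplified sketch under several assumptions: the marker phrases, the KNOWN_CORPORATE_DOMAINS set, the "Working" label, and the choice to treat only failed requests and 5xx responses as "unresponsive" are all hypothetical. Fetching pages, following redirects, and detecting repurposed domains are outside its scope.

```python
from typing import Optional

# Illustrative phrases only; real content analysis is more involved.
PARKING_MARKERS = ("this domain is for sale", "hugedomains")
CLOSURE_MARKERS = ("out of business", "service unavailable")
# Hypothetical stand-in for the curated database of known corporate domains.
KNOWN_CORPORATE_DOMAINS = {"front.com"}


def website_status(domain: str, status_code: Optional[int], body: str) -> str:
    """Return "Working" or "Not working" for a fetched page.

    status_code is None when the request failed (timeout, DNS error, etc.).
    """
    unresponsive = status_code is None or status_code >= 500
    if unresponsive:
        # Safeguard: a known corporate domain is not flagged because of
        # what may be a temporary server glitch or network outage.
        if domain in KNOWN_CORPORATE_DOMAINS:
            return "Working"
        return "Not working"
    if status_code >= 400:
        return "Not working"
    text = body.strip().lower()
    if not text:
        return "Not working"  # a 200 response with no content
    if any(marker in text for marker in PARKING_MARKERS + CLOSURE_MARKERS):
        return "Not working"  # parked domain or business closure page
    return "Working"
```

Note that a 200 response alone never yields "Working" here: the page body still has to be non-empty and free of parking or closure markers.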

