HubSpot Gotcha #1: Identity De-duplication

This post is an excerpt from my upcoming book, Mastering HubSpot. Sign up for the pre-launch list to receive an exclusive discount.

This gotcha affects most HubSpot users, yet few are aware of it. Even if you take no action after reading this, it’s very important to know how identity de-duplication works.

Envision the following scenario, if you will.

You e-mail a list of prospects to promote an exclusive webinar. One of your prospects, Dwight Schrute, opens the email and thinks, “Wow! This is the most amazing webinar topic I’ve ever seen!” He clicks the link, fills out the web form, and registers for the event.

Then, it occurs to Dwight that his colleagues Jim and Pam would also love this webinar, so he reloads the landing page and registers them, too.

Not only did your champion Dwight reconvert, but now we have 2 new leads from Dunder Mifflin to nurture. Awesome, right?

Not so fast.

What actually happens

Behind the scenes, all 3 form submissions are merged into the original Dwight Schrute contact record! Because each of the form submissions happened within the same browser session, with the same cookies, we don’t get new contact records for Jim and Pam.

In fact, the very last form submission steals Dwight’s identity. Now Pam is the only Dunder Mifflin lead we have in HubSpot and all of Dwight’s prior activity is attributed to her.

Perhaps the worst of it is that you have no way of knowing when this happens. Brand new leads can evaporate and you wouldn’t even know it.

WHAT IS THIS I DON'T EVEN

5 minutes ago this was Dwight’s profile. Now it belongs to Pam:

Pam steals Dwight's identity

Common causes

As you can imagine, other scenarios trigger this problem, too. The two I see most often are:

  • People sharing computers
  • A salesperson or VAR registering on behalf of a customer/prospect (this is a big one)

The logic

HubSpot’s identity de-duplication logic works like this:

  • If the user token stored in the browser’s cookie matches an existing contact record, form submissions from that browser session will be merged into the existing contact record regardless of the data being submitted
  • If there’s no cookie match, form submissions will de-duplicate based on email address
  • If the email address being submitted doesn’t match an existing contact record, a new contact will be created
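The three rules above can be sketched as a toy model. This is only my illustration of the logic as described, not HubSpot's actual code: `Contact`, `ContactStore`, `submit`, the token string, and the Jim and Pam email addresses are all made-up names for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Contact:
    email: str
    name: str
    submissions: List[str] = field(default_factory=list)


class ContactStore:
    """Toy model of the three rules above (hypothetical, not HubSpot code)."""

    def __init__(self) -> None:
        self.contacts: List[Contact] = []
        self.by_token: Dict[str, Contact] = {}  # cookie user token -> contact

    def submit(self, token: Optional[str], email: str, name: str, form: str) -> Contact:
        contact = self.by_token.get(token) if token else None
        if contact is not None:
            # Rule 1: cookie match wins, and the submitted data
            # overwrites the existing record regardless of its content.
            contact.email, contact.name = email, name
        else:
            # Rule 2: no cookie match, so de-duplicate on email address.
            contact = next((c for c in self.contacts if c.email == email), None)
            if contact is None:
                # Rule 3: no email match either, so create a new contact.
                contact = Contact(email=email, name=name)
                self.contacts.append(contact)
            if token:
                self.by_token[token] = contact
        contact.submissions.append(form)
        return contact


# Dwight registers himself, then Jim and Pam, all from the same browser.
store = ContactStore()
for name in ("Dwight", "Jim", "Pam"):
    store.submit("dwight-browser", f"{name.lower()}@dundermifflin.com", name, "webinar")

print(len(store.contacts), store.contacts[0].email)  # 1 pam@dundermifflin.com
```

Three submissions, one contact record, and it now belongs to Pam. Pass `token=None` (the incognito-window case) and the same three submissions produce three separate contacts.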

The rationale

The rationale behind the way HubSpot de-duplicates is based on a concept called sandbagging.

The idea is this: Dwight browses your website a few times. You’re tracking his movement with a cookie. Eventually, he downloads an ebook with his dwight@gmail.com address because he doesn’t trust you yet.

At that moment, all of Dwight’s previously anonymous activity is associated with his new identity dwight@gmail.com.

The ebook is legit and two weeks later Schrute requests a live demo of your product from the sales team. This time, he uses dwight@dundermifflin.com.

Now, HubSpot doesn’t want to create a new contact record just because Dwight’s email address changed. Doing so would mean that we wouldn’t associate any of the activity (visits, submissions, etc.) that occurred prior to requesting the demo.

Alternatively, by using the cookie to de-duplicate the identities, HubSpot can group all of Dwight’s historical activity under one record.

In the example above, and in many cases, this behavior is exactly what we’d want.

How KISSmetrics does de-duplication

As an aside, KISSmetrics, an analytics app that focuses on tracking people, runs into the same exact problem, but they handle it differently. As soon as a form submission (or some other identifying event) occurs, KISSmetrics creates a new identity and begins attributing activity in the browser session to that new identity.

Best practices for preventing lead merging

First, tell salespeople and VARs to always use an incognito browser window or clear their cookies when submitting forms on behalf of prospects and customers. If they’ll listen, great! Unfortunately, this won’t work for people outside of your organization (e.g., the Dwight, Jim, Pam scenario), but I’ve found the biggest offenders are in-house.

Another option–which I would only advocate if this problem is killing you–would be to use your own forms (not HubSpot’s) and send the form submission data to HubSpot via the Forms API without passing the user token. There are major drawbacks to this workaround, mainly with respect to activity tracking, so I won’t detail exactly how to implement it. Visit the HubSpot API Google Group if you have questions about this.

The way forward

HubSpot opted to solve for the sandbagger problem over the lead merging problem. Personally, if I had to choose between the two evils, I’d have done the reverse (a la KISSmetrics).

The reality is, however, that in the vast majority of cases, all form submissions are done by a single person under a single email address and you don’t have to worry about lead evaporation or sandbagging.

My proposed solution, which I’ve discussed with the product manager at HubSpot, is to give portals a checkbox that forces de-duplication based on email address instead of cookie.

From this HubSpot Ideas thread, it looks like they might implement this suggestion at the form level, which I’m also cool with!
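To make the trade-off behind that checkbox concrete, here is a rough comparison of the two modes on the Dwight, Jim, and Pam scenario. It is a sketch under my own assumptions: the `email_only` flag stands in for the proposed setting, and `submit`, `run_scenario`, and the email addresses are hypothetical names, not anything HubSpot exposes.

```python
from typing import Dict, List


def submit(contacts: List[dict], tokens: Dict[str, dict], token: str,
           email: str, name: str, email_only: bool) -> dict:
    # With the (hypothetical) checkbox on, the cookie is ignored for matching.
    contact = None if email_only else tokens.get(token)
    if contact is None:
        # Fall back to matching on email address.
        contact = next((c for c in contacts if c["email"] == email), None)
    if contact is None:
        contact = {"email": email, "name": name, "forms": 0}
        contacts.append(contact)
    # The submitted data always lands on whichever record was matched.
    contact["email"], contact["name"] = email, name
    contact["forms"] += 1
    tokens[token] = contact
    return contact


def run_scenario(email_only: bool) -> List[dict]:
    contacts: List[dict] = []
    tokens: Dict[str, dict] = {}
    # All three registrations come from Dwight's browser session.
    for name in ("Dwight", "Jim", "Pam"):
        email = f"{name.lower()}@dundermifflin.com"
        submit(contacts, tokens, "dwight-browser", email, name, email_only)
    return contacts


cookie_first = run_scenario(email_only=False)
email_first = run_scenario(email_only=True)
print([c["name"] for c in cookie_first])  # ['Pam']
print([c["name"] for c in email_first])   # ['Dwight', 'Jim', 'Pam']
```

The flip side is the sandbagger case: with `email_only=True`, dwight@gmail.com and dwight@dundermifflin.com would end up as two separate records.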

Another terrific option would be to provide a way to quickly produce a report that shows you all contact records that have been aliased/de-duplicated with a side-by-side diff of the former and current data and a big button labeled SPLIT that let you untangle the contact records if it makes sense to do so.


2 Comments

  1. Thanks for the writeup, Rob! We are exploring a few solutions to this, actually, so I will keep you posted on progress. And regarding the idea you linked to, this has already been completed so it should provide you with another possible approach.

    Thanks,
    Maggie

  2. Jennifer Fournier

    This was very informative. Thanks for sharing it and I look forward to the book!
