Data Talks Episode 22: URL and Email Address Matching

Getting More Out of Your Match Exercise

Host: George L'Heureux, Principal Consultant, Data Strategy
Guest: Ron Stam, Data Strategy Consultant

There are two primary use cases where matching by domain is especially helpful: first, when we can’t get a match on the name and address provided; and second, when the company information provided is incomplete or depends heavily on domain-level data. In these cases, where a match might otherwise have been impossible, we’re able to identify a match candidate. Similarly, we can use domain matching to reinforce other match results or to get to a better record than would have been possible using imperfect data and name-and-address matching alone.

In this episode of Data Talks, our experts discuss some of the challenges, best practices, and considerations when using a domain approach in your matching exercise.

 

 


George L'Heureux:
Hello everyone, this is Data Talks presented by Dun & Bradstreet. I'm your host, George L'Heureux. I'm a Principal Consultant for Data Strategy on the Advisory Services team here at Dun & Bradstreet. In Advisory Services, our team is dedicated to helping our clients maximize the value of their relationship with Dun & Bradstreet through expert advice and consultation. On Data Talks, I chat every episode with one of the expert advisors at Dun & Bradstreet about a topic that can help consumers of our data and services get more value. Today's guest expert is Ron Stam, a Data Strategy Consultant at Dun & Bradstreet. And Ron, how long have you been with the company?

Ron Stam:
Hey, George, I've been with the company about 25 years.

George L'Heureux:
And Ron, we wanted to chat a little bit today about domain and email matching. Now, I think most people are familiar with the idea of trying to identify an entity by say its name or its address or phone number. What are cases where you'd want to go beyond that and search by domain or an email address?

Ron Stam:
There are two primary cases where domain matching is especially helpful. Number one is when we don't get a match on the name and address that's provided, and the second one is where the company information that comes in to us is not very complete or is heavily dependent on domain-level data itself.

George L'Heureux:
So when we do have situations like that, what sort of advantage can we gain by going ahead and doing a match on that URL domain or that email domain?

Ron Stam:
Oftentimes we can get a match where we didn't have a match previously. We can reinforce the match that we had and we can get to a better record at times with the domain match than we could with data that might not be presented perfectly with name and address matching.

George L'Heureux:
Now, is there a difference between maybe what you're able to get out of, say, a URL domain, a website address that someone gives you versus someone handing you an email address?

Ron Stam:
There are differences, and actually, you have to be a little bit careful. The two domains themselves are often the same; dnb.com, for example, is used both for the corporate domain and for our email convention. If a customer record shows a corporate domain, that data is more likely to be reflective of the company. The challenge with an email domain is that sometimes people don't provide their corporate or work email address. So there are certain instances where you have to be a little bit more careful and do a little bit more filtering up front with email domains than you might have to with a corporate domain.
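To make the two inputs concrete, here is a minimal Python sketch of reducing either a website address or an email address to a bare domain before matching. It is an illustration only, not Dun & Bradstreet's implementation; the function names `domain_from_url` and `domain_from_email` are hypothetical, and stripping a leading `www.` is an assumed normalization choice.

```python
from urllib.parse import urlsplit


def domain_from_url(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    url = url.strip()
    # urlsplit only recognizes the host when a scheme or '//' is present,
    # so bare inputs such as 'www.dnb.com' get a '//' prefix first.
    if "//" not in url:
        url = "//" + url
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def domain_from_email(email: str) -> str:
    """Return the lowercased part after the last '@', or '' if there is none."""
    _, sep, domain = email.strip().rpartition("@")
    return domain.lower() if sep else ""


if __name__ == "__main__":
    print(domain_from_url("https://www.dnb.com/about"))
    print(domain_from_email("someone@DNB.com"))
```

Both calls reduce to the same domain here, which matches the point above: the strings are often identical, and the difference lies in how much the source of the value can be trusted.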

George L'Heureux:
Can you go into that a little bit deeper? What are some of those examples? What are situations where that email domain question might come into play and you have to exercise an additional little bit of caution when you're looking at your match results?

Ron Stam:
Sure. The first one that's very visible is all of the ISP addresses, the Gmails, the Yahoos, the AOLs. If somebody registers with those kinds of emails, they're not necessarily reflective of the company that they work for. A couple of other things that you do have to be careful of, though: for example, .edu, which is colleges and universities. A college or university might extend those email addresses to alumni or other people who don't necessarily work for the organization. And as it relates to matching to Dun & Bradstreet's data, oftentimes the goal is to get the information on the business itself, where somebody works. If I'm an alumnus of a university and I'm using that email address but I work at a company not related at all to that university, that domain match on abcuniversity.edu would not be reflective of where I actually work. I'm just using that email address for professional or convenience reasons.
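The kind of caution described here could be expressed as a simple flag on each email domain. The sketch below is an assumption-laden illustration: the set `CONSUMER_MAIL_DOMAINS` is a deliberately tiny, incomplete sample, and the function name and return labels are hypothetical rather than part of any Dun & Bradstreet product.

```python
# Illustrative and deliberately incomplete; a real list would be much longer.
CONSUMER_MAIL_DOMAINS = {"gmail.com", "yahoo.com", "aol.com"}


def email_domain_caution(domain: str) -> str:
    """Label an email domain with the kind of caution it may need.

    Returns 'consumer' for known consumer mailbox providers,
    'education' for .edu domains (which may be held by alumni),
    and 'none' otherwise.
    """
    domain = domain.strip().lower()
    if domain in CONSUMER_MAIL_DOMAINS:
        return "consumer"
    if domain.endswith(".edu"):
        return "education"
    return "none"
```

A domain labeled `education` is not necessarily wrong; it only signals that the address holder may not work for the institution.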

George L'Heureux:
So it strikes me that compared to some other places where you can match, for example, an address in particular, or a name, there's going to be a lot of fuzziness around: is this name spelled correctly? Am I spelling this name the way I think it's pronounced? Is this street an avenue or a boulevard or a circle? It feels like URLs, domains, would be a lot more binary, that it's either yes or no. Is that the case or is there more to it than that?

Ron Stam:
It is almost always the case. There's a chance that a domain might be typed in incorrectly, but a domain such as dnb.com exists and it points to Dun & Bradstreet. The challenge is that if somebody did type in DMB instead of DNB, the domain for DMB could very well be valid and could point to a different company. But most of the time, we have to assume that the domains themselves are accurate as provided, and we can match to that domain on a one-to-one basis, or at least get to the right organization, because the domain itself could be reflective of all of the different locations of a business. A primary goal of our customers when they match on a domain is to get to the right corporate entity, as opposed to necessarily getting to the exact address or site for a record. Even though that's beneficial, it's not always a requirement.

George L'Heureux:
So obviously domain match, we're talking about it and it is a capability that we offer here at Dun & Bradstreet. When it comes time to actually perform a domain match, is it as simple as just supplying that URL or that email address and hitting go, or is that more complex?

Ron Stam:
It is kind of that simple, but if you are a customer who has a domain, it's beneficial to provide us with additional information, in particular geographic or location-based information, because that might point to a better record in the world. So if ibm.com is valid and relevant for all of the IBM locations worldwide and you are dealing with IBM in South Africa, it would be good to know that you're trying to get to an IBM record in South Africa, as opposed to just any IBM record or the top IBM record in New York.
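The effect of supplying a country alongside the domain can be sketched as a tie-break among candidate records that share the same domain. Everything in this Python example is hypothetical: the record fields (`id`, `country`, `is_top`), the ISO-style country codes, and the function `pick_candidate` are assumptions made for illustration, not a description of how Dun & Bradstreet selects records.

```python
from typing import Optional


def pick_candidate(candidates: list[dict], country: Optional[str] = None) -> Optional[dict]:
    """Pick one record among candidates that all share the same domain.

    Prefers a record in the requested country; otherwise falls back to the
    record flagged as the top of the family, then to the first candidate.
    """
    if not candidates:
        return None
    if country:
        wanted = country.strip().upper()
        for cand in candidates:
            if cand.get("country") == wanted:
                return cand
    for cand in candidates:
        if cand.get("is_top"):
            return cand
    return candidates[0]


if __name__ == "__main__":
    # Made-up records standing in for locations that share one domain.
    same_domain = [
        {"id": "us-hq", "country": "US", "is_top": True},
        {"id": "za-branch", "country": "ZA", "is_top": False},
    ]
    print(pick_candidate(same_domain, "ZA")["id"])
    print(pick_candidate(same_domain)["id"])
```

Without the country hint, the sketch can only fall back to the top-level record, which is the "top IBM record in New York" outcome described above.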

George L'Heureux:
And we've talked about domain matching versus traditional name and address type matching. Are there different ways that customers when they go to do this with Dun & Bradstreet can integrate the two in a way that makes sense for them? Is that variable or is it all kind of only one way?

Ron Stam:
There are a couple of different ways: you could do it in an integrated way, in a waterfall way, or in a validation way. So for example, a customer might try to match on a name and address, and especially if they have questionable or mid-level match results, they might want to also match on a domain that's provided, and they might make a better decision if they see that the domain match and the address match point to the same record. If they see that there are differences, they might choose one path versus another based on the type of information provided in either the address or the domain and the likelihood that one is more correct. So the domain match is, at a minimum, a value-add second step, but it can also be used as reinforcement of an existing match or identification of a record.
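The waterfall and validation patterns described here might be combined along the following lines. This is a sketch under stated assumptions: the 1-to-10 confidence scale, the `accept_at` threshold of 8, the reason labels, and the function `reconcile` are all hypothetical choices for illustration, and the right decision rules depend on each customer's data.

```python
from typing import Optional, Tuple


def reconcile(
    name_match: Optional[Tuple[str, int]],
    domain_match: Optional[str],
    accept_at: int = 8,
) -> Tuple[Optional[str], str]:
    """Combine a name-and-address match with a domain match.

    name_match is (entity_id, confidence) or None; domain_match is an
    entity_id or None. Returns (entity_id or None, reason).
    """
    if name_match is None:
        # Waterfall: no name-and-address match, so fall through to the domain.
        if domain_match is None:
            return None, "no_match"
        return domain_match, "domain_only"

    entity_id, confidence = name_match
    if confidence >= accept_at:
        # Strong name-and-address match; accepted on its own.
        return entity_id, "name_address"
    if domain_match is None:
        return None, "review_unconfirmed"
    if domain_match == entity_id:
        # Validation: a mid-level match reinforced by the domain.
        return entity_id, "confirmed_by_domain"
    # The two approaches disagree; leave the choice to a reviewer.
    return None, "review_conflict"
```

Routing conflicts to review rather than picking a side automatically reflects the point that which source is "more correct" depends on the information provided.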

Ron Stam:
And outside of Dun & Bradstreet, a domain match can be used internally by a customer or a company as well, to help link records that belong to the same organization. So using the ibm.com example again, if they have records around the world that are ibm.com, they can manage or link those records together as part of their data management. But it should be noted that this doesn't hold for every company, and I'll use Microsoft and LinkedIn in this case. LinkedIn is now a subsidiary of Microsoft, but its linkedin.com URL is in fact fully operational and used all the time, and it's different from Microsoft's. So there is a benefit in matching to the URLs, but a corporate hierarchy or family tree does not always share one consistent URL. It can vary.
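Linking internal records by domain can be as simple as grouping on a normalized domain value. The Python sketch below assumes hypothetical record dictionaries with `id` and `domain` keys; the function name `group_by_domain` and the sample records are made up for illustration.

```python
from collections import defaultdict


def group_by_domain(records: list[dict]) -> dict[str, list[str]]:
    """Group record ids by their lowercased domain, skipping blank domains."""
    groups: dict[str, list[str]] = defaultdict(list)
    for rec in records:
        domain = rec.get("domain", "").strip().lower()
        if domain:
            groups[domain].append(rec["id"])
    return dict(groups)


if __name__ == "__main__":
    # Made-up internal records.
    records = [
        {"id": "r1", "domain": "ibm.com"},
        {"id": "r2", "domain": "IBM.com"},
        {"id": "r3", "domain": "linkedin.com"},
        {"id": "r4", "domain": "microsoft.com"},
    ]
    print(group_by_domain(records))
```

Note that the linkedin.com and microsoft.com records land in separate groups. Grouping by domain alone cannot recover the parent and subsidiary relationship, which is why hierarchy data is still needed.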

George L'Heureux:
And you kind of hit on another aspect of data management that at Dun & Bradstreet we talk about quite a bit, and that is hierarchies. This idea that LinkedIn is a wholly owned subsidiary of Microsoft and they have different domains and they still all, however, are part of Microsoft. That goes for things like brands, right? If you're a CPG company, if you've got a lot of different consumer brands that each have their own URL, we're going to be able to resolve those to the correct company that actually manufactures those brands, right?

Ron Stam:
Often. Yeah, I won't say 100% of the time. Some of the brands themselves have their own unique URLs and some of them are going to point or redirect to a corporate site. But if that's what a customer has, that's what they should submit. And many times we'll be able to resolve or get a match to that record.

George L'Heureux:
I want to go back to something that you were talking about a few minutes ago, Ron, and that is this idea of these shared or ISP type domains that particularly for email matching, we have to take additional caution with. I imagine that this is a data stewardship, data governance type question, but can you talk about the additional steps that a consumer of that type of matching might want to take after getting results back to ensure that they're really putting the right high quality results into their database?

Ron Stam:
Yeah. And they might actually want to do it before they submit it as well. So what they could do is take a list of ISP and other email domains, and marketplaces like Facebook or Etsy or Amazon that people also use, and they could actually suppress those records from even being attempted to match if there's an awareness that the match might come up with an inconsistent answer. So, as an example, if ronstamplumbing@facebook.com was either my email address or my marketplace address, and we isolate that facebook.com, that would not get to the right organization related to my plumbing business. It's just a site that I might be using via a third party to have a presence on the web. So if you suppress those up front, knowing that you don't really want to match on facebook.com or gmail.com or Etsy or a bunch of other domains, you won't get records that you have to figure out and filter later. So it's usually better to do it up front.
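Suppressing shared domains before submission could look like the following Python sketch. The `SUPPRESSED_DOMAINS` set is a short illustrative sample, not an official or complete list, and `split_for_matching` is a hypothetical helper name.

```python
# Illustrative sample only; a working suppression list would be far longer
# and maintained as part of data governance.
SUPPRESSED_DOMAINS = {
    "gmail.com", "yahoo.com", "aol.com",
    "facebook.com", "etsy.com", "amazon.com",
}


def split_for_matching(emails: list[str]) -> tuple[list[str], list[str]]:
    """Split emails into (to_match, suppressed) by their domain.

    An input without '@' is treated as a bare domain.
    """
    to_match: list[str] = []
    suppressed: list[str] = []
    for email in emails:
        domain = email.rpartition("@")[2].strip().lower()
        if domain in SUPPRESSED_DOMAINS:
            suppressed.append(email)
        else:
            to_match.append(email)
    return to_match, suppressed


if __name__ == "__main__":
    keep, drop = split_for_matching(
        ["ronstamplumbing@facebook.com", "someone@dnb.com"]
    )
    print(keep, drop)
```

Keeping the suppressed addresses in their own list, rather than discarding them, leaves room to handle those records by another route, such as name-and-address matching.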

George L'Heureux:
So, Ron, as we wrap up here, do you have something that you are hoping that people who are watching this or listening to this will walk away with? What's the bottom line message that you'd like them to make sure that they learn from this discussion today?

Ron Stam:
Yeah, the primary benefit is that you will get matches that you might not have gotten previously using a standard name and address match that is common at Dun & Bradstreet. In addition, you will often get, I'll say a more certain answer with the domain match because it is a little bit more binary, as you mentioned, where a domain is a domain and that reports and reflects the particular company with those exceptions of some of the marketplaces, with some of the franchises, with some of the other kind of known shared domains. And again, with the email domains, you have to be a little bit more careful still.

George L'Heureux:
Ron, I really appreciate you taking the time to sit down and talk about this aspect of matching with me and sharing your expertise here.

Ron Stam:
Thank you, George.

George L'Heureux:
Our guest expert today has been Ron Stam, a Data Strategy Consultant at Dun & Bradstreet, and this has been Data Talks. I do hope you've enjoyed today's discussion. And if you have, I encourage you to please share it with a friend or a colleague. And for more information about what we discussed on today's episode, please visit www.dnb.com or talk to your company's Dun & Bradstreet specialist today. I'm George L'Heureux. Thanks for joining us. Until next time.