Visit booth 3171 at Google Cloud Next to see how to unlock real-time decisions at scaleMore info
Blog

What is identity resolution?

Discover how identity resolution combines technologies, data sources, and signals to create consistent records for entities like people and products, driving personalized advertising.

February 1, 2025 | 12 min read
Alex Patino
Alexander Patino
Content Marketing Manager

At a glance:

  • Identity resolution: Combining different technologies, data sources, and signals to create a consistent record for an entity, such as a person, household, or product.

  • Entity: A distinct, individual thing you might want to track or identify. This can include people, households, family groups, businesses, places, etc.

  • Golden record: A single, unified record for any entity that stays consistent across any device, platform, or channel. For example, connecting a customer from an app to an e-commerce website to an in-store purchase.

  • Why is identity resolution important? Identity resolution helps brands better understand their customers to give them more personalized experiences and advertise more effectively.

  • Also called: Unified ID, Unified ID solution, cross-device identification, cross-platform identification, UID

What is identity resolution?

In AdTech, identity resolution is the process of taking all of the information available from first, second, and third parties to create a unified customer profile of a specific consumer. As consumers move around the web (and even through the physical world), they leave behind pieces of information about themselves: who they are, what they like, how they make purchasing decisions, etc. 

Unfortunately for advertisers, this information is spread across hundreds of places, platforms, and providers. And all of them have different ways of referring to the person this information describes: some use emails, some credit card numbers, some user IDs, or loyalty club membership numbers. Identity resolution is a bit like playing data detective: using clues to match individual records to a record of an individual to create a unified profile.

Imagine trying to sell someone a shirt, except you don't know their size, their favorite colors, what kind of fit they like, or even their gender. You can show them shirts at random and hope you eventually get it right, or you can go around to the other associates and ask if they've helped that customer before. Maybe one knows Jane Consumer's size, while another one knows her favorite color, a third can show you her purchase history. You write this down in your notepad and label it "Jane Consumer." Congratulations, with these multiple identifiers, you've resolved an identity.

With identity resolution, advertisers strive for a "golden record": a single, unified, consistent record that works across platforms, devices, and channels for any entity.

However, identity resolution is becoming more difficult as moves to safeguard user privacy, especially ongoing work to eliminate third-party cookies, have made it harder to identify users effectively. Almost half of the open internet is already opaque to advertisers and advertising platforms. Modern identity resolution approaches will need advanced tools and techniques to gain visibility into these blind spots.

Why is identity resolution important?

Accurately identifying users and linking them to their demographics, interests, purchase histories, and other information creates highly effective targeting. When users see an ad they feel was created just for them, they are much more likely to click on it and buy the product; conversely, when they see an irrelevant ad, that impression is wasted, leading to higher costs for advertisers.

But identity resolution isn't just for gathering deeper insights into targeted users; it's also critical for identifying:

  • Businesses and organizations: Not all customers are people, and B2B advertisers might want to target specific businesses or types of businesses.

  • Websites: AdTech platforms need to be able to identify websites to better and faster match ads and placement spots, ensure brand safety for customers, and reduce fraud.

  • Places: For local and out-of-home (OOH), advertisers need to be able to identify physical locations. Place identity is also important for understanding where people live, work, shop, and relax.

  • Product: A complete product identity enables dynamic ad creative, precise audience matching, flexible pricing, and competitor analysis to optimize offers.

Webinar: Achieving the perfect golden record with graph data for identity resolution

Unlock the power of real-time identity resolution in the AdTech landscape with insights from AWS, Lineate, and Aerospike experts. Discover how graph data models and reference architectures can help you build the perfect golden record for seamless audience targeting.

Benefits of identity resolution across the advertising ecosystem

  • For advertisers

    • Unified consumer data lets advertisers build ads that precisely target them with personalized offers and messaging, which in turn creates better outcomes.

    • It also lets them track results to specific customers, opening up opportunities for smart, real-time optimizations, and attribute outcomes to advertising, no matter where the outcome took place (e.g., a customer clicking an ad and then buying in-store).

  • For AdTech platforms

    • Formulating a customer identity lets AdTech platforms create better matches for advertisers and demonstrate clear results.

    • It improves margins by allowing them to obtain detailed consumer information from one source rather than many.

  • For publishers

    • Better, more personalized ads reduce ad fatigue, letting publishers sell more ads.

    • Identifying high-value audience members means publishers can sell ad opportunities for more money, improving their revenues the customer relationship.

  • For consumers

    • Consumers might get annoyed by advertising, but they get less annoyed when the ad is something they are actually interested in.

    • Publishers can generate more ad revenue, which allows them to produce more and better content for consumers.

    • A solid identity resolution solution is better for privacy. Data is typically collected by first-party sources trusted by the consumer and not sold or reused en masse.

To put it simply: Identity resolution across all touched entities helps provide deep consumer insights that allow for the personalized digital experience consumers want and expect. It also gives AdTech platforms and advertisers the ability to attribute campaign outcomes — so they know which ads are working and which aren't.

How does identity resolution work?

All identity resolution techniques depend on a unique identifier (UID) that connects an entity to its properties, characteristics, and other data — no matter where the data is generated or stored. However, how those connections are made and where they come from have become more complicated over time.

Historically, IDs were based on simple, technical identifiers like cookies, IP addresses, or user-provided identifiers like loyalty club member numbers. These simple IDs were often siloed and not cross-linked; for example, a first-party website cookie would only identify a user on the website that set it. 

To identify users across different websites and devices, a need grew for a more universal identifier and a central place to keep this identity. This led to third-party cookies, which allowed a single service to track users from one site to another, and data brokers, which maintained these identities. But that proved problematic for consumer privacy.

What does identity resolution mean for privacy and data ownership?

As tracking became more universal and all-encompassing, consumer privacy concerns grew, creating a backlash of negative sentiment and privacy legislation. Laws like the EU's General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) limited third-party cookies and led to many companies, like Apple, phasing them out entirely or blocking them by default.

Modern approaches to identity resolution are built to work around these restrictions while respecting user privacy. In many ways, identity resolution has come full circle: starting with first party data and user-provided data, moving to centralized tracking, and returning to first-party and user-provided data.

In many ways, this makes modern identity resolution better for consumer privacy and data protections. Instead of buying and selling vast quantities of anonymized or pseudo-anonymized data (which can often be easily deanonymized), data is collected and stored by first parties. These are companies consumers directly interact with and trust, like the brands they buy from and the publications they read. 

This limits the ability for identifying data to be intercepted as it's transferred from broker to broker. It also gives consumers more power to control how their data is used since they only need to interact with a few known companies rather than hundreds of unknown data brokers. However, it also minimizes third-party cookies as a viable identity resolution component.

Modern identity resolution techniques and technologies

Identity resolution begins with a process called data onboarding, which collects external or offline identifiers (customer analytics, including memberships, in-store purchases, CRM data, etc.) and tries to match them with online identifiers, like user IDs, cookies, or device IDs. This process can get very involved, requiring the building of complex probabilistic models about what data points might belong to which customer. 

As data is matched and added to an identity graph, it can be revised and corrected when obvious mistakes become apparent (e.g., two people with the same name might be considered the same entity until it becomes clear they have two different birthdays or live in two distant addresses simultaneously).

Some common techniques and technologies used for identity resolution today include:

  • Graph databases: Uses nodes/vertices to store properties and characteristics (like name, address, or UID) and links them together with edges that represent relationships (like "belongs to" or "lives at"). An identity graph uses first party data, cleanrooms, and additional resources to build customer identity/identity resolution, personalization, targeting and retargeting, attribution, and other services without cookies.

  • Data sources: Includes first-party data as well as sources for third-party data like data management platforms (DMPs), customer data platforms (CDPs), and signal aggregators (SAs or SSAs for super signal aggregators).

  • Machine learning: Uses advanced algorithms and AI to link together, or "stitch," different identities and data points into a single, accurate entity.

  • Data processing/cleaning: Collects data from many different sources into a single format, combining it while removing duplicates and errors.

  • Monitoring and analysis: Processes identity information to turn data into actionable insights.

  • Data clean rooms: Allows consumers to be identified without violating privacy laws through specialized data processing tools.

These technologies can combine first- and third-party data into a single identity that can be used for targeting or personalization and can be tracked across any device they use while still respecting the privacy and safety of consumers.

What are the biggest challenges in identity resolution?

Consumer privacy and safety might be the identity resolution challenges getting the most attention, but they aren't the only roadblocks to building a unified identity. These challenges, often lumped together as “signal loss," make it harder for platforms and brands to get the information they need to build robust identities. 

Some of the biggest include:

Data inconsistencies

These can range from simple misspellings to using nicknames to entities with multiple values for the same property. For example: "john doe," "John Doe," and "Jonathan Doe" could all be the same person, or they may be different people. Different sources of data might have different ways of identifying the same person and could store them in different formats.

Shared identifiers

A single location, like a home or office, might have multiple people, and it can be challenging to determine which one performed an action, like buying a product or viewing an ad. Similarly, common first and last names might be shared among millions of people, making the process of linking a name to an entity difficult.

Data volumes

An average internet user generates almost 150 GB of data every day. Collecting, processing, storing, and using all this data is a monumental task that requires cutting-edge technology on both the processor and storage side. 

Just deduplicating this data in real time can be a herculean effort. Ingesting and then accessing this data is even harder — especially since graph databases require significant optimization to be fast at scale.

Deterministic identity resolution vs. probabilistic identity resolution

Identity resolution and outcome distribution are often divided into two broad categories: deterministic and probabilistic. Deterministic describes data you know, like a customer profile or account numbers, while probabilistic describes data you suspect, like anonymous ad views or predicted product preferences.

Even with deterministic data, there's some level of uncertainty—a customer may have given you one of their throwaway email addresses or used an anonymized payment method. Determining how confident you are in the accuracy of a data point is challenging, and small inaccuracies can cascade into much larger ones as more decisions are made based on them.

Regulatory differences

Instead of converging around a single standard, privacy regulations diverge across geographies. The multitude of requirements makes building a single, unified identity more challenging. It requires platforms and advertisers to change how they do identity resolution based on where they're located, where the user resides, and where the processing and data storage takes place.

Walled gardens

As consumers increasingly perform more of their transactions and activities inside apps rather than in browsers, more data becomes locked in proprietary ecosystems owned by publishers and developers. This gatekeeping makes building unified identities more expensive and cumbersome and can shut out some data entirely.

The future of identity resolution

Not that long ago, identity resolution seemed like a solved challenge — third-party cookies were working well and making the internet transparent and highly visible to brands and platforms. In a matter of a few years, everything changed. 

The rapid pace of evolution on the web and across the digital world makes it difficult to predict the future of identity resolution. Still, some things are more likely than others.

Privacy regulation could become stricter, further reducing advertisers' ability to build and use unified identities. Some jurisdictions are already talking about banning any identifying data — even clean rooms and anonymized records.

Signal loss is going to accelerate and make finding reliable data points harder. Paradoxically, this loss will actually increase the number of signals that companies will need to collect and process as they try to replicate one strong identity signal with many weaker ones. This will require advanced processing powered by AI and ML tools, making innovative data storage like graph databases mandatory to increase the technical complexity of resolving identity.

In short, identity resolution will get more complex, more technical, and more expensive. Learn more about how Aerospike's Graph Database for identity resolution can help you cut costs and complexity for your identity resolution needs.

Try Aerospike: Community or Enterprise Edition

Aerospike offers two editions to fit your needs:

Community Edition (CE)

  • A free, open-source version of Aerospike Server with the same high-performance core and developer API as our Enterprise Edition. No sign-up required.

Enterprise & Standard Editions

  • Advanced features, security, and enterprise-grade support for mission-critical applications. Available as a package for various Linux distributions. Registration required.