Achieving the perfect golden record with graph data

Learn about the challenges and solutions for entity resolution in the AdTech world, including the shift away from cookies and the importance of creating a comprehensive "golden record."

George Demarest
Director of Product Marketing
August 8, 2024 | 6 min read

Over the last decade, there has been a great deal of focus, anxiety, and speculation in the AdTech world about the impending deprecation of third-party cookies by the major purveyors of browsers: Apple, Mozilla, and Google. Aerospike’s position as a technology incumbent in the AdTech space has yielded some blogs on the subject, including The year of identity (in the AdTech industry), Graph databases and signal loss in AdTech, and Cookies, IDs, and beyond!

While Apple and Mozilla took early steps (in 2017 and 2019, respectively), Google has long been a holdout. Google first announced its intention to deprecate cookies in 2020 but postponed the change several times and is now backtracking. Google has been under pressure from advertisers who have long relied on cookies to provide a deterministic signal for tracking consumers, their activities, and their behaviors.

Given Google’s delays, some might conclude that doing nothing about cookie deprecation is okay for now. That is a dangerous road to take. Advertisers already face a world where 47% of the open internet is unaddressable because of cookie and device ID deprecation by Apple and others. And despite Google’s several changes in direction on the topic, it is still likely that cookies will eventually go away. Should that happen, AdTech companies will need a strategy and direction to get there. Without one, some say there could be revenue loss of over 50% and a corresponding loss of market share to competitors who have been more proactive.

The most forward-looking of AdTech companies are developing new ways to identify and model their target personas, using a much richer collection of data from a broader set of sources. They are also broadening their scope by looking beyond personal identity to model other important entities such as products, websites, businesses, and more.

Resolving user/customer identity has become increasingly sophisticated as AdTech companies try to rationalize how customers relate to products, websites, businesses, and other entities. There is also increasing pressure to resolve these relationships in real time to maintain a seamless end-user experience. 

In search of a golden record 

Databases that contain information about customers, businesses, or products tend to be increasingly fragmented and siloed across applications, channels, and data stores. This has given rise to increased attention on creating the so-called “golden record,” a term long used in the Master Data Management space. 

In the context of AdTech, a golden record is a unified, accurate, and consistent representation of an entity: a user, customer, product, or what-have-you. It's a comprehensive profile that aggregates data from a growing collection of sources to create a much more nuanced - and powerful - model of an entity.
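
To make the idea concrete, here is a minimal sketch of aggregating several source records into one golden record. It assumes the records have already been matched to the same entity and uses a simple "newest non-empty value wins" rule. The field names, sample data, and the build_golden_record helper are all hypothetical, and real systems typically apply richer survivorship rules.

```python
from datetime import date

# Hypothetical source records for one customer; field names are illustrative.
records = [
    {"source": "crm", "updated": date(2024, 3, 1),
     "email": "pat@example.com", "phone": "", "city": "Austin"},
    {"source": "web", "updated": date(2024, 6, 15),
     "email": "", "phone": "555-0100", "city": "Dallas"},
    {"source": "store", "updated": date(2023, 11, 20),
     "email": "pat@example.com", "phone": "555-0199", "city": ""},
]

def build_golden_record(records):
    """Merge records field by field; the newest non-empty value wins."""
    golden = {}
    # Process oldest first, so newer records overwrite earlier values.
    for record in sorted(records, key=lambda r: r["updated"]):
        for field, value in record.items():
            if field in ("source", "updated"):
                continue  # bookkeeping fields are not part of the profile
            if value:  # skip empty values so they never erase known data
                golden[field] = value
    return golden

golden = build_golden_record(records)
print(golden)
```

The rule is deliberately simple: a blank field in a newer record never erases a known value from an older one, which is one common way to keep a profile both current and complete.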

However, creating that golden record for your most important entities can seem daunting. The source data may be inconsistent, with disparate records containing incomplete or conflicting information, which makes matching difficult. There are also practical challenges in resolving different but equivalent data, such as street addresses (“2nd Avenue” and “2nd Ave”). Consequently, it’s not easy to link related records together to create a unified view and gain better insights.
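
The street address case can be sketched in a few lines. The example below assumes a tiny, hand-written abbreviation table (ABBREVIATIONS) and a hypothetical normalize_address helper. It is an illustration of normalizing values before comparison, not a complete address-standardization routine.

```python
import re

# Illustrative, deliberately small abbreviation table (an assumption,
# not a postal standard).
ABBREVIATIONS = {
    "avenue": "ave",
    "street": "st",
    "road": "rd",
    "boulevard": "blvd",
    "second": "2nd",
}

def normalize_address(raw):
    """Lowercase, strip punctuation, and map known words to one canonical form."""
    cleaned = re.sub(r"[^\w\s]", " ", raw.lower())
    tokens = cleaned.split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)

# Both spellings reduce to the same canonical string, so they compare equal.
print(normalize_address("2nd Avenue") == normalize_address("2nd Ave."))
```

Normalizing before comparison turns a fuzzy matching problem into an exact one for the cases the table covers. Anything outside the table still needs fuzzy or ML-based matching.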

Entity resolution is becoming crucial for effective ad targeting and personalization. By creating a rich golden record for all crucial entities, advertisers can create more precise and relevant audience segments for targeted campaigns, improve ad relevance to better align with user interests and preferences, and optimize campaign performance with more accurate metrics and measurements that support data-driven adjustments to their ad delivery.

Entity resolution at an industrial scale

AdTech companies and marketers are increasingly looking to run effective advertising campaigns to reach consumers across multiple applications and channels with personalized messaging. One way AdTech firms can address this problem is to build custom data resolution processes that may take the form of complex SQL queries interacting with multiple databases. Developers may also train machine learning (ML) models for record matching and resolving inconsistencies. 

But these solutions take months to build, consume dev resources, and cost a lot to maintain. AdTech needs a head start on state-of-the-art entity resolution in a cookie-less world. Such a system requires:

  • Parallelized rapid ingest of data 

  • Use of a graph data model to store entity records

  • Real-time delivery of entity data from a graph database

  • A scale-out solution for rationalizing disparate data sources

  • Sophisticated data record pattern matching of source records

Two major technology pieces are now available to AdTech firms. First, a packaged cloud solution for entity resolution from Amazon. Second, a scalable multi-model database from Aerospike that can support both real-time identity resolution via a key-value data model and a transactional graph database. You also need to have a plan.

AWS Entity Resolution

AWS Entity Resolution is an ML-powered service that helps you match and link related records stored across multiple applications, channels, and data stores. Amazon describes AWS Entity Resolution as a data matching service that gets to the core of the entity resolution challenge:

"AWS Entity Resolution helps you more easily match, link, and enhance related customer, product, business, or healthcare records stored across multiple applications, channels, and data stores. You can use flexible and configurable rules, machine learning, or data service provider matching techniques to optimize your records based on your business needs."

AWS Entity Resolution designers had the foresight to support different storage options from S3 buckets to NoSQL databases like Aerospike. 

Aerospike Graph for real-time graph data

Many AdTech companies are looking to approach identity and entity resolution using a graph data model to enable audience targeting across the open internet. Graph databases provide a model to help you accurately link related sets of customer information, product codes, and business data codes. If and when cookies finally become unavailable as a deterministic signal, graph databases and machine learning look most likely to be the technologies that step in to deliver entity data.
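
As a rough illustration of why a graph model suits this problem, the sketch below treats identifiers as vertices and "seen together" observations as edges, then groups connected identifiers into entities. It runs in plain Python using a union-find structure and does not use Aerospike Graph or any graph database API. The identifiers and edges are hypothetical sample data.

```python
from collections import defaultdict

# Hypothetical observations: each pair links two identifiers seen together.
edges = [
    ("email:pat@example.com", "device:A1"),
    ("device:A1", "loyalty:7731"),
    ("email:sam@example.com", "device:B2"),
]

parent = {}

def find(node):
    """Return the representative of node's group, compressing the path."""
    parent.setdefault(node, node)
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node

def union(a, b):
    """Merge the groups containing a and b."""
    root_a, root_b = find(a), find(b)
    if root_a != root_b:
        parent[root_b] = root_a

for a, b in edges:
    union(a, b)

# Collect identifiers by representative: each group is one resolved entity.
entities = defaultdict(set)
for node in list(parent):
    entities[find(node)].add(node)

resolved = sorted(sorted(group) for group in entities.values())
print(resolved)
```

Here the email and loyalty ID never appear together directly, yet they resolve to the same entity because both connect through the same device. That transitive linking is the property a graph model expresses naturally.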

Aerospike Graph is a developer-ready, highly scalable graph database designed to meet and exceed the requirements of the most demanding large-scale property graph database workloads. It can deliver real-time responses on mixed transactional workloads and offers a more expressive way to represent entity data.

Getting started 

The combination of AWS Entity Resolution and Aerospike Graph makes the typically cumbersome process of creating a modern golden record for various entity and identity resolution use cases more efficient and effective.

Webinar: Achieving the perfect golden record with graph data for identity resolution

Join experts from the AWS Entity Resolution team, Lineate, and Aerospike as we discuss how entity resolution and identity graphs can help achieve the perfect golden record.