AWS’ Entity Resolution service to help enterprises improve data quality

The machine learning-powered service, accessible via a no-code interface in the AWS Management Console, can be used to match data from multiple data lakes or AWS storage, the company said.


Amazon Web Services (AWS) has released a new service, AWS Entity Resolution, to help enterprises improve data quality for analytics and AI tasks.

The new service uses machine learning to help business users in an enterprise match data from multiple data lakes or AWS storage, said Davor Golac, general manager for the service at AWS. It can be accessed via a no-code interface in the AWS Management Console.

The service is aimed at cutting what enterprises spend on solving data quality problems, Golac said. In the US alone, enterprises spend around $3.1 trillion annually to improve data quality, he said.

Rather than developing, integrating and managing complex data pipelines for data reconciliation, users can work in the new service’s no-code interface to either adopt preconfigured workflows or create custom rule-based workflows, with machine learning helping to deliver the kind of record matching and the level of accuracy they need, the company said.

Customers can set a higher threshold to obtain exact matches, or a lower threshold to match data across a broader set of results, it said. Once the threshold is set, the service’s machine learning model takes over: it identifies data with similar attributes, clusters the matches together and generates normalized output in Amazon S3 storage, Golac said. He compared the underlying principles of the service to vector or semantic search.
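The threshold behavior described above can be illustrated with a minimal Python sketch. It is not AWS code and does not call the service: the plain string-similarity score from the standard library stands in for the service’s machine learning model, and the names `similarity` and `cluster_records` are hypothetical, chosen only for this illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1]; an assumption, not AWS's model."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_records(records: list[str], threshold: float) -> list[list[str]]:
    """Group records linked by a pairwise similarity at or above threshold."""
    parent = list(range(len(records)))  # union-find forest, one node per record

    def find(i: int) -> int:
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the clusters of every pair that meets the threshold.
    for i, j in combinations(range(len(records)), 2):
        if similarity(records[i], records[j]) >= threshold:
            parent[find(j)] = find(i)

    # Collect records under their cluster root, preserving input order.
    groups: dict[int, list[str]] = {}
    for idx, record in enumerate(records):
        groups.setdefault(find(idx), []).append(record)
    return list(groups.values())
```

In this sketch, a threshold of 1.0 groups only records that are identical after lowercasing, while lower values pull progressively looser matches into the same cluster, which mirrors the trade-off the company describes.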

The normalized output data can then be used for analytics or AI tasks, he said.

AWS said the Entity Resolution tool also negates the need for enterprises to hire developers and spend months developing their own machine learning model for data cleaning or normalization.

The service is generally available and can be accessed across AWS locations including US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (London), with availability for other regions to follow soon, the company said.

AWS Entity Resolution currently supports data stored in Amazon S3. The company charges by the number of records processed per workflow, at $0.25 per 1,000 records.

Support for other data sources is expected to follow in the coming months, Golac said.
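As a rough guide to what the quoted price means in practice, the short Python sketch below estimates the charge for one workflow run. It assumes the cost scales linearly with the record count, with no minimum charge and no rounding up to the next 1,000 records; the article does not say how partial blocks are billed, and the function name `estimate_cost_usd` is hypothetical.

```python
PRICE_PER_1000_RECORDS_USD = 0.25  # price quoted in the article


def estimate_cost_usd(records_processed: int) -> float:
    """Estimate the charge for one workflow run, assuming linear pricing."""
    if records_processed < 0:
        raise ValueError("records_processed must be non-negative")
    return records_processed / 1000 * PRICE_PER_1000_RECORDS_USD
```

Under these assumptions, a workflow that processes 1 million records would cost about $250.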

Additionally, the company has integrated the service into offerings from data connectivity platform LiveRamp, American consumer credit reporting agency TransUnion, and open-source advertising framework Unified ID 2.0.

While announcing its Data Clean Rooms service at its annual re:Invent conference last year, AWS hinted that it would launch a new service dubbed Identity Resolution.

“The planned Identity Resolution service has been rebranded and launched as Entity Resolution,” Golac said.

Copyright © 2023 IDG Communications, Inc.