The Importance of Data Integrity in the Age of AI/ML

Precisely Editor | October 11, 2022

Artificial intelligence and machine learning (AI/ML) hold tremendous promise for improved efficiencies, automation, and valuable insights that can drive business value. Machine learning (ML) is a subset of AI that enables machines to learn patterns from large volumes of data and ultimately make accurate predictions, without requiring explicit programming instructions. Because machine learning technology relies on large volumes of trusted data to ‘learn,’ viable machine learning and data integrity are inextricably entwined.

ML must be ‘trained’ to understand the specific domain about which it is to make predictions. In other words, ML models must look to the past to understand what is likely to happen in the future. ML algorithms require a sufficiently large set of data from which they can derive statistically valid predictions. For most organizations, coming up with an adequate volume of data isn’t really a problem. The real issue for most companies is poor data quality and siloed information that is disconnected from other data sources within the company and lacks the contextual richness provided by location intelligence (LI) and data enrichment. In other words, the real issue is data integrity.

At Precisely, our mission is to solve these problems by addressing four pillars of data integrity: data integration, data quality, data enrichment, and location intelligence. Companies embarking on AI/ML initiatives must get data integrity right from the very beginning. Effective use of machine learning rests upon having accurate, consistent, complete information upfront.
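As a rough illustration of what checking for "accurate, consistent, complete information upfront" can look like before any model is trained, here is a minimal Python sketch. It is not a Precisely API; the field names (`customer_id`, `email`, `postal_code`) and the `integrity_report` helper are hypothetical and chosen only for this example. It counts missing required values and flags IDs that appear more than once.

```python
from collections import Counter

# Hypothetical schema, assumed only for this illustration.
REQUIRED_FIELDS = ("customer_id", "email", "postal_code")


def integrity_report(records):
    """Count missing required values and repeated IDs in a list of dict records."""
    missing = Counter()
    for record in records:
        for field in REQUIRED_FIELDS:
            value = record.get(field)
            # Treat None and blank strings as missing.
            if value is None or str(value).strip() == "":
                missing[field] += 1

    # An ID that appears more than once suggests duplicate or conflicting rows.
    ids = [r.get("customer_id") for r in records if r.get("customer_id")]
    duplicate_ids = sorted(k for k, n in Counter(ids).items() if n > 1)

    return {
        "rows": len(records),
        "missing": dict(missing),
        "duplicate_ids": duplicate_ids,
    }


sample = [
    {"customer_id": "C1", "email": "a@example.com", "postal_code": "10001"},
    {"customer_id": "C2", "email": "", "postal_code": "94105"},
    {"customer_id": "C1", "email": "a@example.com", "postal_code": None},
]

report = integrity_report(sample)
print(report)
```

A report like this is only a starting point. It says nothing about whether the values present are correct, which is where validation against reference data and enrichment come in.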

The “Garbage In, Garbage Out” Problem at Scale

We have all heard the old saying “garbage in, garbage out,” or “GIGO” for short. There are two reasons this is especially true with AI/ML. First, if machine learning models are trained on data sets that lack integrity, they will fail to achieve their intended results and may even yield inaccurate predictions that result in poor business outcomes. “Garbage in, garbage out” therefore becomes “garbage in, garbage out… forever” because the ML model has “learned” from incorrect or incomplete data.

The second problem with the GIGO paradigm in the context of AI/ML is one of scale. If you are concerned with the accuracy and completeness of a single customer record, then poor-quality data has a relatively limited scope. If you are analyzing a broad spectrum of customers, poor data quality takes on greater significance. With AI/ML, enterprises have the power to truly leverage data at scale, driving both operational and strategic business decisions. “Garbage in” at scale has the power to yield “garbage out” at scale. For companies embarking on an AI/ML journey, or for those who have already started the process, this is a critical point. Data integrity matters more than ever.

Data Integrity = Huge Opportunity

You can just as easily turn this argument on its head and take a positive view of this data integrity challenge. While your competitors are struggling with poor data quality, siloed information, and lack of contextual richness, there is an opportunity to take the lead in leveraging AI/ML to achieve long-term competitive advantage. Precisely is helping companies across multiple industries such as telecommunications, banking and finance, insurance, health care, and retail to achieve that vision every day.

Insurance companies are using machine learning to make better policy pricing decisions, to understand risk at a more granular level than ever before, and to spot potential cases of fraud and abuse. They’re also using AI/ML with Precisely’s location intelligence technology to reach out to customers in advance of major weather events to warn them of the potential dangers, and to pre-position claims adjusters so they can respond rapidly to policyholders who are likely to need urgent assistance following a disaster.

Banks are improving their ability to assess lending risk and determine home valuations using enhanced location information and linking to third-party data sets. Using machine learning with enhanced data and cloud-native location intelligence technology, many Precisely banking customers have successfully reduced the time to produce trusted data from 13 hours to just over 3 hours.

Retailers are using machine learning to better analyze purchasing patterns and understand their customers’ behavior. AI/ML is helping businesses to improve site selection, joining in-house data with a vast array of location-based variables that can be used to calculate catchment areas, analyze traffic patterns, and understand populations along with their lifestyle preferences, income levels, and purchasing habits. Using the Precisely Data Integrity Suite, retailers are achieving a unified view of their customers, de-duplicating information in their CRM and ERP systems, and enriching consumer information for a better understanding of the customers they serve.
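To make the idea of de-duplicating customer records concrete, the following Python sketch merges records that share a normalized key. It is a simplified illustration under stated assumptions, not how the Precisely suite works: the `name`, `email`, and `phone` fields and the `normalize_key` and `deduplicate` helpers are hypothetical, and real entity resolution typically relies on fuzzy matching rather than exact keys.

```python
def normalize_key(record):
    """Build a match key: lowercased email, falling back to a normalized name."""
    email = (record.get("email") or "").strip().lower()
    name = " ".join((record.get("name") or "").lower().split())
    return email or name


def deduplicate(records):
    """Merge records with the same key, keeping the first non-empty value per field."""
    merged = {}
    for record in records:
        target = merged.setdefault(normalize_key(record), {})
        for field, value in record.items():
            # Only fill a field if this record has a value and none is stored yet.
            if value and not target.get(field):
                target[field] = value
    return list(merged.values())


crm_rows = [
    {"name": "Ann Lee", "email": "Ann.Lee@Example.com", "phone": ""},
    {"name": "ann  lee", "email": "ann.lee@example.com ", "phone": "555-0100"},
    {"name": "Bo Chen", "email": "bo@example.com", "phone": "555-0199"},
]

unique = deduplicate(crm_rows)
for row in unique:
    print(row)
```

Here the two "Ann Lee" rows collapse into one record that keeps the first spelling of the name and picks up the phone number from the second row, which is the "unified view" idea in miniature.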

Data Integrity is the Key

A holistic view of data integrity is that it includes accuracy, consistency, and context. When data is accurate and consistent, and when it incorporates geospatial context and third-party data, businesses are better positioned to see “the truth, the whole truth, and nothing but the truth.”

There is more data at your disposal than ever before. According to a recent Precisely webcast, an estimated 45 zettabytes of data were generated globally in 2019, and that figure is projected to reach 143 zettabytes by 2024. As enterprises encounter this avalanche of information, they face greater challenges than ever before. According to Forbes, 84% of CEOs are concerned about the integrity of the data they’re using to make decisions, and 68% say that they are negatively impacted by the existence of data silos in their organizations. More than 50% are missing out on the benefits of location intelligence, which holds the key to thousands of variables connected to any given location, and which can provide valuable insights about traffic and consumer behavior.

AI and machine learning hold great promise for companies that are intent on winning on the competitive battlefield. By getting data integrity right, business leaders can embark on these critically important AI/ML initiatives with confidence.

Precisely partnered with Drexel University’s LeBow College of Business to survey more than 450 data and analytics professionals worldwide about the state of their data programs. Now, we’re sharing the results in the 2023 Data Integrity Trends and Insights Report.