eBook

How "Good Enough" Quality is Eroding Trust in Your Data Insights

Explore key data quality insights from data professionals in the data quality survey

Read this eBook to explore key highlights from the survey and take a deeper look at the full survey results.

Survey background

Precisely’s Enterprise Data Quality Survey explores the challenges and opportunities for organizations looking to extend data quality across the enterprise as data volumes grow and new technologies emerge.

Respondent profile

Precisely polled 175 respondents, 69 percent of whom work for organizations with over 1,000 employees. Participants represented a range of industries, with the largest share coming from Financial Services (25%). They held positions from CDO to Data Analyst, with the largest share in data-focused roles (29%).

 

Good data isn’t good enough anymore

There is a disconnect around understanding, confidence, and trust in the data and how it informs business decisions.

Seventy-two percent of respondents rated the quality of the data used to run their business as good or better, and 69 percent stated that their leadership/C-suite trusts data insights enough to base business decisions on them. Yet they also reported that only 14 percent of stakeholders had a very good understanding of the data and that less than 60 percent of the data was well understood by stakeholders.

More than 70 percent also reported that sub-optimal data quality negatively impacted business decisions, and almost half found that untrustworthy results or inaccurate insights from analytics were due to a lack of quality in the data fed into systems such as AI and machine learning.

Data quality is a top challenge for machine learning

Poor data quality is enemy number one to the widespread, profitable use of machine learning. The phrase “garbage-in, garbage-out” has a multiplier effect with ML — first in the historical data used to train the predictive model and second in the new data used by that model to make future decisions.

With almost half reporting that untrustworthy results or inaccurate insights from analytics were due to a lack of quality in the data fed into systems such as AI and machine learning, it’s not surprising that “many sources of data” (69%) and “volume of data” (48%) are among the top three challenges companies face when ensuring high-quality data.

Three-quarters of respondents also reported challenges profiling or applying data quality to large data sets.

A Wall Street Journal article noted that a recent report by Forrester Research Inc. found data quality to be a top challenge for AI projects and that “companies pursuing such projects generally lack an expert understanding of what data is needed for machine-learning models and struggle with preparing data in a way that’s beneficial to those systems.”

Understanding data across the organization

How well do you (or other key stakeholders) understand the data that exists across your organization?

Response options: Very Good Understanding; Good Understanding; Partial Understanding; Minimal Understanding; Very Little or No Understanding.

Defining “good” understanding of data

If you answered, Very Good or Good, what percentage of your data is well understood by you/key stakeholders?

Response options: Greater than 70%; 70%-50%; 50%-30%; 30%-10%; 10% or less.

Data attributes lacking visibility

Of those who reported partial, minimal, or very little understanding of their data, the top three attributes they lacked visibility into were:

  • Relationship between data sets
  • Completeness of data
  • Validation of data against defined rules
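
To make these three attributes concrete, here is a minimal Python sketch of what each check might look like on a small data set. It is illustrative only and not drawn from the survey: the function names, the sample `customers` and `orders` records, and the simplistic email rule are all hypothetical assumptions.

```python
from typing import Callable, Optional

# Hypothetical record shape: field name -> string value (or None if missing).
Record = dict[str, Optional[str]]


def completeness(records: list[Record], field: str) -> float:
    """Share of records in which `field` is present and non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def rule_failures(
    records: list[Record], rules: dict[str, Callable[[str], bool]]
) -> dict[str, int]:
    """Count populated values that fail the rule defined for their field."""
    failures = {field: 0 for field in rules}
    for r in records:
        for field, rule in rules.items():
            value = r.get(field)
            if value not in (None, "") and not rule(value):
                failures[field] += 1
    return failures


def orphan_keys(child: list[Record], parent: list[Record], key: str) -> set[str]:
    """Key values in `child` with no match in `parent` (a relationship check)."""
    parent_keys = {r[key] for r in parent if r.get(key)}
    return {r[key] for r in child if r.get(key) and r[key] not in parent_keys}


# Made-up sample data, for illustration only.
customers: list[Record] = [
    {"customer_id": "C1", "email": "a@example.com"},
    {"customer_id": "C2", "email": ""},
    {"customer_id": "C3", "email": "not-an-email"},
]
orders: list[Record] = [
    {"order_id": "O1", "customer_id": "C1"},
    {"order_id": "O2", "customer_id": "C9"},
]

# Deliberately simplistic rule: an email must contain "@".
email_rules = {"email": lambda v: "@" in v}

print("email completeness:", round(completeness(customers, "email"), 2))
print("rule failures:", rule_failures(customers, email_rules))
print("orders with unknown customer:", orphan_keys(orders, customers, "customer_id"))
```

In practice, a data profiling tool or data catalog would typically run checks like these centrally and at scale rather than through ad hoc scripts.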

Use of data profiling tools

Fewer than 50% of respondents take advantage of a data profiling tool or data catalog, where insight can be provided centrally for broad access.

Instead, respondents rely on other methods to try to gain understanding of data, with more than 50% of respondents using SQL queries or similar and over 40% using a BI tool.

Only 17% are profiling data manually.

 

Profiling large data sets

Three-quarters of respondents reported challenges profiling or applying data quality to large data sets.

How would you rate the quality of the data used to run your business?

Only 8% of respondents reported having excellent data quality.

Rate the quality of the data

How would you rate your organization’s ability to get a single view of customer?

More than 30% of respondents lack the ability to get a single view of the customer.

A single view of customer

Challenges to ensuring data quality

Many sources of data (70%) and volume of data (48%) are among the top three challenges companies face when ensuring high-quality data.

Applying governance processes to manage and measure data quality is second with 50%.

 

Consequences of poor data quality

Those who reported Fair or Poor data quality cited Wasted Time as the number one result (92%), followed by Ineffective Business Decisions (72%) and Customer Dissatisfaction (67%).

 

Confidence in data sent to analytics platforms

70% of respondents are “Somewhat confident” in the data their organization sends to analytics and data visualization applications.

 

 

Poor data quality leads to inaccurate data insights

47% of respondents had untrustworthy or inaccurate insights from analytics due to a lack of data quality.

Only 16% are confident they aren’t feeding bad data into AI and ML applications.

 

Leadership trust in data insights

Although confidence in data sent to analytics systems is lukewarm and almost half reported they’d had untrustworthy results from analytic platforms, nearly 70 percent of respondents still state that their leadership trusts the insights enough to inform business decisions.

Data quality is growing in priority

Although levels of confidence and trust in data appear mixed, 75% of respondents cite data quality as a high or growing priority.

Only 4% feel data quality is not a priority.

Data quality in the cloud

Leveraging cloud computing for strategic workloads

Have partial to no understanding of the data that exists in the cloud

Rate the quality of their data in the cloud as Fair or Poor


Data quality in big data

Have a data lake or enterprise data hub leveraging distributed computing platforms like Hadoop or Spark

Do not have a process for applying data quality to the data in the data lake or enterprise data hub

Rate the quality of their data in the data hub as Fair or Poor, while 32% rate their data as “Good”

Responsibility for data quality

51% reported that IT is responsible for data quality, while business users and data stewards play a critical role.
