5 Characteristics of Data Quality
Data quality is crucial: it assesses whether information can serve its purpose in a particular context, such as data analysis. So how do you determine the quality of a given set of information? There are data quality characteristics you should be aware of.
There are five traits you'll find within data quality: accuracy, completeness, reliability, relevance, and timeliness. Read on to learn more about each.
| Characteristic | How it's measured |
|----------------|-------------------|
| Accuracy | Is the information correct in every detail? |
| Completeness | How comprehensive is the information? |
| Reliability | Does the information contradict other trusted sources? |
| Relevance | Do you really need this information? |
| Timeliness | How up to date is the information? Can it be used for real-time reporting? |
Accuracy
As the name implies, this data quality characteristic means that information is correct. To determine whether data is accurate, ask yourself whether the information reflects a real-world situation. For example, in financial services, does a customer really have $1 million in their bank account?
Accuracy is a crucial data quality characteristic because inaccurate information can cause significant problems with severe consequences. Using the example above: if there's an error in a customer's bank account balance, it could be because someone accessed the account without the customer's knowledge.
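Accuracy checks like the bank-balance question above can be expressed as simple validation rules. The sketch below is a minimal, hypothetical example; the `balance` field name and the plausibility threshold are illustrative assumptions, not a real banking schema:

```python
# Hypothetical sketch: flag account balances that fail basic
# real-world sanity checks. Field names and limits are illustrative.

def check_accuracy(record):
    """Return a list of accuracy problems found in one account record."""
    problems = []
    balance = record.get("balance")
    if not isinstance(balance, (int, float)):
        problems.append("balance is not numeric")
    elif balance < 0:
        problems.append("negative balance")
    elif balance > 10_000_000:
        problems.append("balance exceeds plausible maximum")
    return problems

accounts = [
    {"id": 1, "balance": 1_000_000},    # plausible
    {"id": 2, "balance": -250},         # suspicious
    {"id": 3, "balance": "1,000,000"},  # wrong type
]
for account in accounts:
    print(account["id"], check_accuracy(account))
```

Rules like these only catch values that *cannot* be right; confirming that a value *is* right still requires checking it against the real-world source.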
Completeness
"Completeness" refers to how comprehensive the information is. When assessing completeness, consider whether all of the data you need is available; you might need a customer's first and last name, while the middle initial may be optional.
Why does completeness matter as a data quality characteristic? If information is incomplete, it might be unusable. Say you're sending out a mailing: you need a customer's last name to ensure the mail goes to the right address, and without it, the data is incomplete.
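A completeness check comes down to distinguishing required fields from optional ones, as in the first-name/last-name/middle-initial example. Here is a minimal sketch; the field names are assumptions for illustration:

```python
# Hypothetical sketch: measure completeness of customer records.
# Required and optional field names are illustrative assumptions.

REQUIRED = {"first_name", "last_name", "address"}
OPTIONAL = {"middle_initial"}  # absent middle initial is still complete

def missing_fields(record):
    """Return the required fields that are absent or empty."""
    return sorted(f for f in REQUIRED if not record.get(f))

customers = [
    {"first_name": "Ada", "last_name": "Lovelace", "address": "12 High St"},
    {"first_name": "Alan", "address": "33 King's Rd"},  # no last name
]
incomplete = [c for c in customers if missing_fields(c)]
print(len(incomplete), "of", len(customers), "records are incomplete")
```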
Reliability
In the realm of data quality characteristics, reliability means that a piece of information doesn't contradict a piece of information in a different source or system. Take an example from healthcare: if a patient's birthday is January 1, 1970 in one system but June 13, 1973 in another, the information is unreliable.
Reliability is a vital data quality characteristic. When pieces of information contradict one another, you can't trust the data, and a resulting mistake could cost your firm money and damage its reputation.
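Checking reliability across systems means joining records on a shared key and flagging disagreements, as in the date-of-birth example. A minimal sketch, in which the system names, key, and `dob` field are illustrative assumptions:

```python
# Hypothetical sketch: cross-check a patient's date of birth across
# two systems and report contradictions. Names and fields are
# illustrative assumptions.

def find_contradictions(system_a, system_b, key="patient_id", field="dob"):
    """Yield (key, value_a, value_b) wherever the two systems disagree."""
    b_index = {rec[key]: rec for rec in system_b}
    for rec in system_a:
        other = b_index.get(rec[key])
        if other is not None and rec[field] != other[field]:
            yield rec[key], rec[field], other[field]

ehr = [{"patient_id": 101, "dob": "1970-01-01"}]
billing = [{"patient_id": 101, "dob": "1973-06-13"}]
for pid, a, b in find_contradictions(ehr, billing):
    print(f"patient {pid}: {a} vs {b}")
```

Finding the contradiction is the easy part; deciding which system holds the correct value still requires a trusted source of record.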
Relevance
When you're looking at data quality characteristics, relevance comes into play because there has to be a good reason why you're collecting the information in the first place. Consider whether you really need it, or whether you're collecting it just for the sake of it.
Why does relevance matter as a data quality characteristic? If you're gathering irrelevant information, you're wasting time as well as money, and your analyses won't be as valuable.
Timeliness
Timeliness, as the name implies, refers to how up to date information is. If it was gathered in the past hour, it's timely, unless new information has come in that renders the previous information useless.
Timeliness is an important data quality characteristic because information that isn't timely can lead people to make the wrong decisions, which in turn costs organizations time, money, and reputational damage.
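A timeliness check amounts to comparing a record's timestamp against a freshness window, such as the one-hour window mentioned above. A minimal sketch, assuming timezone-aware UTC timestamps:

```python
# Hypothetical sketch: flag records older than a freshness window
# (one hour here, matching the example above). Timestamps are assumed
# to be timezone-aware UTC datetimes.
from datetime import datetime, timedelta, timezone

def is_timely(record_time, now=None, max_age=timedelta(hours=1)):
    """True if the record was gathered within the freshness window."""
    now = now or datetime.now(timezone.utc)
    return now - record_time <= max_age

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc)  # 30 minutes old
stale = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)    # 3 hours old
print(is_timely(fresh, now), is_timely(stale, now))
```

The right window depends on the use case: real-time reporting may need minutes, while a quarterly analysis can tolerate much older data.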
In today’s business environment, data quality characteristics ensure that you get the most out of your information. When your information doesn’t meet these standards, it isn’t valuable. Precisely provides data quality solutions to improve the accuracy, completeness, reliability, relevance, and timeliness of your data.
Find out more in our eBook: 4 Ways to Measure Data Quality
FAQs for 5 Characteristics of Data Quality
How do you identify data quality issues?
More often than not, data quality issues show up on the front lines. Users who spend much of their time working with customer records and individual transactions will usually have an intimate knowledge of the problems within the datasets they use and update every day. Those who rely on detailed inventory records, likewise, will usually be aware that much of their information is incomplete or inaccurate. Engaging end users throughout your organization is a good first step toward understanding the scope and nature of your organization's potential data quality issues.
Poor data quality isn't always readily apparent to users who are focused on the big picture, though. Executives looking at customer analytics, for example, may be unaware of duplicate records or incomplete or inaccurate information because they're only looking at a high-level summary. Data quality issues can easily get lost in the details. Far-reaching data quality issues may show up as anomalies in the analytics, leading executives to ask questions that prompt further investigation.
To truly understand the kinds of data quality issues that can negatively impact your business, it's important to take a comprehensive and systematic approach to the problem. That means creating a data catalog and a prioritized list of data assets.
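Duplicate records are one of the issues a high-level summary can hide. A first pass at surfacing them is to group records by a normalized key; this is a minimal sketch in which the name and email fields are illustrative assumptions, and real deduplication tools use far more sophisticated matching:

```python
# Hypothetical sketch: surface duplicate customer records by a
# normalized (name, email) key. Field names are illustrative.
from collections import defaultdict

def find_duplicates(records):
    """Group records that share a normalized name+email key."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["name"].strip().lower(), rec["email"].strip().lower())
        groups[key].append(rec)
    return {k: v for k, v in groups.items() if len(v) > 1}

customers = [
    {"id": 1, "name": "Pat Smith", "email": "pat@example.com"},
    {"id": 2, "name": "pat smith ", "email": "PAT@example.com"},  # same person
    {"id": 3, "name": "Lee Wong", "email": "lee@example.com"},
]
dupes = find_duplicates(customers)
print(len(dupes), "duplicate group(s) found")
```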
What are the consequences of poor data quality?
If data is not fit for purpose, that can lead to costly mistakes, lost productivity, and poor business decisions. If customer data contains duplicate records or incomplete or inaccurate information, your company may waste valuable money sending redundant mailings or shipping packages to the wrong addresses. These errors also consume time during day-to-day business activities, as data entry personnel must grapple with the confusion they cause.
Perhaps most importantly, poor data quality drives bad business decisions. If your customer analytics indicate that the business should proceed with a new product idea, but those analytics later turn out to have been driven by bad data, your company could end up with an underperforming product, leading to lost revenue and declining market share. As AI and machine learning take on greater significance in driving operational decisions, the same concerns emerge: poor data quality leads AI/ML models to train on inaccurate or incomplete information, which undermines the intended results of those investments.
How do you improve data quality?
Improving data quality is an ongoing process, but it begins with understanding what data you have and how it is used. Data cataloging and profiling provide systematic methods and tools for taking a thorough inventory of your information assets so that you can begin to prioritize them. Next, define the data quality metrics that will guide your efforts. Engage stakeholders throughout your organization to help them understand the importance of your data quality initiative, and recruit data quality champions in each department to ensure that you fully understand your end users' concerns. Third, work with your stakeholders to establish the business rules that determine what good data quality looks like. To achieve data quality at scale, you need the right tools and framework to support this rules-based approach. Finally, monitor your data quality KPIs to verify that your efforts are producing the desired results. Keep in mind that data quality is an ongoing endeavor, not a "one-and-done" project. To maintain good data quality at scale, over the long term, make sure you have the right systems and technology in place.
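The rules-based approach described above can be sketched in a few lines: each business rule becomes a named predicate, and the KPI is the share of records passing every rule. The rule names and fields here are illustrative assumptions, not a production rules engine:

```python
# Hypothetical sketch of a rules-based data quality KPI.
# Rule names and record fields are illustrative assumptions.

RULES = {
    "has_last_name": lambda r: bool(r.get("last_name")),
    "valid_email": lambda r: "@" in r.get("email", ""),
}

def quality_kpi(records):
    """Fraction of records that satisfy every business rule."""
    passing = sum(all(rule(r) for rule in RULES.values()) for r in records)
    return passing / len(records) if records else 1.0

records = [
    {"last_name": "Ng", "email": "ng@example.com"},  # passes both rules
    {"last_name": "", "email": "no-at-sign"},        # fails both rules
]
print(f"data quality KPI: {quality_kpi(records):.0%}")
```

Tracking this KPI over time, rather than as a one-off score, is what makes the monitoring step meaningful.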
What's the difference between data quality and data integrity?
Many people confuse these two terms. Data quality refers to fitness for purpose, characterized by information that is accurate, complete, reliable, relevant, and timely. Data integrity, on the other hand, encompasses a much bigger picture. Together with data governance, data quality comprises one pillar of data integrity, but integrity also includes integration, enrichment, and location intelligence. Integration means eliminating the silos that prevent users in your organization from having a complete and holistic understanding of the important realities that affect your business. Data enrichment adds valuable context: by enriching your internal customer data with curated demographic information from trusted third-party sources, you can gain rich insights based on a 360-degree understanding of your customers and prospects. Location intelligence adds geospatial context, unlocking a vast array of additional data points that shed light on customers, competitors, and the physical world in which your business operates.
What should you look for in a data quality vendor?
A good data quality vendor should have a proven track record of working with companies of all sizes to deliver measurable improvements in data quality. To achieve the right outcomes as your business scales, the best data quality software should include comprehensive data cataloging and data profiling tools and should be based on a rules-based approach to monitoring and improving quality. It should also incorporate workflows to ensure that the right stakeholders are engaged in the quality improvement process at the right time. Look for integrated data governance as well. Vendors with a broader spectrum of data integrity tools can offer a one-stop shop with pre-integrated solutions proven to work well together. Look for software companies with expertise and tools for data integration, location intelligence, and data enrichment.