Data Quality Dimensions: How Do You Measure Up? (+ Downloadable Scorecard)
By now, you’ve heard how valuable data can be, how it can drive your company forward, how you can use it to make better decisions. There’s a caveat there, of course. Information is only valuable if it is of high quality.
How can you assess your data quality? Data quality is commonly assessed along six dimensions: accuracy, completeness, consistency, timeliness, validity, and uniqueness. Read on to learn the definitions of these data quality dimensions.
Six data quality dimensions to assess
|Dimension|How it’s measured|
|Accuracy|How well does a piece of information reflect reality?|
|Completeness|Does it fulfill your expectations of what’s comprehensive?|
|Consistency|Does information stored in one place match relevant data stored elsewhere?|
|Timeliness|Is your information available when you need it?|
|Validity|Is the information in the required format, and does it follow business rules?|
|Uniqueness|Is this the only instance in which this information appears in the database?|
The term “accuracy” refers to the degree to which information correctly reflects the event or object it describes. For example, if a customer’s age is 32, but the system says she’s 34, that information is inaccurate.
What steps can you take to improve accuracy? Ask whether the information represents the reality of the situation, and identify any incorrect data that needs to be fixed.
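One way to put a number on accuracy is to compare stored values against a source you trust. The sketch below is a minimal illustration, not a prescribed method: the `accuracy_rate` function, the `id` and `age` fields, and the trusted `reference` lookup are all assumptions made for the example.

```python
def accuracy_rate(records, reference, field):
    """Share of records whose `field` matches the trusted reference value.

    `reference` is a hypothetical lookup of verified values keyed by record id.
    Records with no reference entry are skipped, not counted as wrong.
    """
    checked = [r for r in records if r["id"] in reference]
    if not checked:
        return None  # nothing to compare against
    correct = sum(1 for r in checked if r[field] == reference[r["id"]][field])
    return correct / len(checked)


# The system says customer 1 is 34, but the verified age is 32.
records = [{"id": 1, "age": 34}, {"id": 2, "age": 45}]
reference = {1: {"age": 32}, 2: {"age": 45}}
print(accuracy_rate(records, reference, "age"))  # 0.5
```

In practice the hard part is obtaining the trusted reference; the comparison itself is simple.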
Data is considered “complete” when it fulfills expectations of comprehensiveness. Let’s say that you ask the customer to supply his or her name. You might make a customer’s middle name optional, but as long as you have the first and last name, the data is complete.
There are things you can do to improve this data quality dimension. You’ll want to assess whether all of the requisite information is available, and whether there are any missing elements.
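As a rough sketch of such an assessment, the snippet below treats first and last name as required and middle name as optional, following the example above. The field names and the `missing_fields` and `completeness_rate` helpers are hypothetical, and values are assumed to be strings or `None`.

```python
REQUIRED_FIELDS = ("first_name", "last_name")  # middle_name stays optional


def missing_fields(record, required=REQUIRED_FIELDS):
    """List the required fields that are absent, None, or blank."""
    return [f for f in required if not (record.get(f) or "").strip()]


def completeness_rate(records, required=REQUIRED_FIELDS):
    """Share of records that have every required field filled in."""
    if not records:
        return None
    complete = sum(1 for r in records if not missing_fields(r, required))
    return complete / len(records)


customers = [
    {"first_name": "Ana", "middle_name": None, "last_name": "Lopez"},
    {"first_name": "Sam", "last_name": ""},  # last name is blank
]
print(missing_fields(customers[1]))   # ['last_name']
print(completeness_rate(customers))   # 0.5
```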
At many companies, the same information may be stored in more than one place. If that information matches, it’s considered “consistent.” For example, if your human resources information system says an employee doesn’t work there anymore, yet your payroll system shows he’s still receiving a check, that’s inconsistent.
To resolve issues with inconsistency, review your data sets to see if they’re the same in every instance. Are there any instances in which the information conflicts with itself?
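The HR and payroll example can be sketched as a simple cross-check between two systems. The data shapes here are assumptions: `hr_status` is a hypothetical mapping of employee ID to status, and `payroll_active` is a hypothetical set of IDs currently being paid.

```python
def find_inconsistencies(hr_status, payroll_active):
    """Return IDs that payroll is paying but HR does not list as active.

    An ID that is missing from HR entirely is flagged too.
    """
    return sorted(
        emp_id for emp_id in payroll_active
        if hr_status.get(emp_id) != "active"
    )


hr_status = {"E1": "active", "E2": "terminated"}
payroll_active = {"E1", "E2"}
print(find_inconsistencies(hr_status, payroll_active))  # ['E2']
```

A check like this only finds the conflict; deciding which system holds the correct value is a separate step.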
Is your information available right when it’s needed? That data quality dimension is called “timeliness.” Let’s say that you need financial information every quarter; if the data is ready when it’s supposed to be, it’s timely.
Timeliness is defined by user expectations. If your information isn’t ready when you need it, it doesn’t fulfill that dimension.
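Because timeliness depends on when the data is needed, a check has to compare a delivery time against an agreed deadline. The sketch below assumes both are recorded as `datetime` values; the function names and the dates are illustrative only.

```python
from datetime import datetime, timedelta


def is_timely(delivered_at, needed_by):
    """Data is timely if it arrived on or before the agreed deadline."""
    return delivered_at <= needed_by


def lateness(delivered_at, needed_by):
    """How late the data arrived; zero means it was on time."""
    return max(delivered_at - needed_by, timedelta(0))


needed_by = datetime(2024, 4, 15, 9, 0)  # hypothetical quarterly deadline
late_delivery = datetime(2024, 4, 16, 8, 0)

print(is_timely(late_delivery, needed_by))  # False
print(lateness(late_delivery, needed_by))   # 23:00:00
```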
Validity is a data quality dimension that refers to whether information conforms to a specific format and follows business rules. A popular example is birthdays: many systems ask you to enter your birthday in a specific format, and if you don’t, it’s invalid.
To meet this data quality dimension, check whether all of your information follows the required format and business rules.
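For the birthday example, a validity check might combine a format test with a business rule. The sketch below assumes a `YYYY-MM-DD` format and a "not in the future" rule; both are illustrative choices rather than requirements from any particular system.

```python
from datetime import date, datetime


def is_valid_birthday(text, fmt="%Y-%m-%d", today=None):
    """Check format first, then the business rule.

    Format: the text must parse as a real calendar date in `fmt`.
    Rule (assumed for this example): a birthday cannot be in the future.
    """
    today = today or date.today()
    try:
        parsed = datetime.strptime(text, fmt).date()
    except ValueError:
        return False  # wrong format or an impossible date
    return parsed <= today


today = date(2024, 1, 1)  # fixed so the example is repeatable
print(is_valid_birthday("1992-07-04", today=today))  # True
print(is_valid_birthday("07/04/1992", today=today))  # False
print(is_valid_birthday("1992-02-30", today=today))  # False
print(is_valid_birthday("2999-01-01", today=today))  # False
```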
“Unique” information means that there’s only one instance of it appearing in a database. As we know, data duplication is a frequent occurrence. “Daniel A. Robertson” and “Dan A. Robertson” may well be the same person stored as two separate records.
Meeting this data quality dimension involves reviewing your information to ensure that none of it is duplicated.
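Exact-match comparison would miss near-duplicates such as the two Robertson entries, so a review often starts with approximate matching. The sketch below uses `difflib.SequenceMatcher` from Python's standard library; the `likely_duplicates` helper and the 0.85 threshold are assumptions for illustration, and flagged pairs are candidates for human review, not confirmed duplicates.

```python
from difflib import SequenceMatcher
from itertools import combinations


def likely_duplicates(names, threshold=0.85):
    """Return pairs of names whose similarity ratio meets the threshold."""
    pairs = []
    for a, b in combinations(names, 2):
        score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if score >= threshold:
            pairs.append((a, b))
    return pairs


names = ["Daniel A. Robertson", "Dan A. Robertson", "Maria Chen"]
print(likely_duplicates(names))
# [('Daniel A. Robertson', 'Dan A. Robertson')]
```

Comparing every pair grows quadratically with the number of records, so larger data sets usually need a way to narrow the candidates first.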
How does your data measure up?
Are you fulfilling all possible data quality dimensions? Download a free scorecard to assess your own data quality initiatives. Data quality solutions can help improve your score and ensure your data is accurate, consistent and complete for confident business decisions.
To learn more, read our eBook: 4 Ways to Measure Data Quality