Data Integrity Trends for 2024
In 2023, organizations dealt with more data than ever and witnessed a surge in demand for artificial intelligence use cases – particularly driven by generative AI. They relied on their data as a critical factor to guide their businesses to agility and success.
When we look at data integrity trends for 2024, agility remains a top focus for organizations across industries. Leaders increasingly rely on their data to support fast, data-driven decisions that shorten time to value and help them respond quickly to emerging opportunities and threats.
With these goals in mind, data with integrity – maximum accuracy, consistency, and context – is critical.
Here are some trends that will command the attention of business leaders in 2024 and promise to increase reliance on trusted data.
Trusted AI Outcomes Require a Focus on Data Integrity
Everyone’s talking about artificial intelligence (AI) for its unique ability to automate or accelerate user tasks, resulting in greater efficiency and productivity and a reduced dependence on manual labor.
The launch of ChatGPT in November 2022 sparked an interest in generative AI for its power to create new content and designs with remarkable efficiency, solve problems smarter and faster, and create new opportunities like never before. It’s easy to see why there’s been an astronomical rise in the usage of generative AI over the past year for consumers and business users alike.
This transformational technology has the world buzzing, and AI’s potential continues to grow. But few organizations have the data integrity required to power meaningful outcomes.
To make data AI-ready and maximize the potential of AI-based solutions, organizations will need to focus on the following areas in 2024:
- Access to all relevant data: When data is siloed, as data on mainframes or other core business platforms can often be, AI results are at risk of bias and hallucination. Organizations must focus on breaking down silos and integrating all relevant, critical data into on-premises or cloud storage for AI model training and inference. These more complete datasets will both reduce bias and increase accuracy.
- Increase in data trust: Data that doesn’t meet rigorous quality metrics, or that isn’t governed with a robust framework, puts AI systems at risk of generating inaccurate predictions, recommendations, and output. In 2024, organizations will prioritize increasing trust in the data used to train and fine-tune AI models by gaining transparency into its lineage, improving and observing its quality, and governing its management.
- Increased data context: Few organizations have the depth of context within their first-party data to optimize the accuracy and relevance of AI predictions and outputs. In 2024, organizations will increasingly turn to third-party data and spatial insights to augment their training and reference data for the most nuanced, coherent, and contextually relevant AI output.
When it comes to AI outputs, results will only be as strong as the data that’s feeding them.
Trusting your data is the cornerstone of successful AI and machine learning (ML) initiatives, and data integrity is the key that unlocks their full potential. Without data integrity, you risk compromising your AI and ML initiatives with unreliable inferences and biases that undermine business value.
But with data integrity, you gain more trustworthy and dependable AI results for confident data-driven decisions that help you grow the business, move quickly, reduce costs, and manage risk and compliance.
Data Integrity Supports Continued Modernization Momentum
Organizations are adopting cloud services for more cost-effective, agile, and scalable data analytics, artificial intelligence, and the development of new applications.
Mainframe and IBM i systems remain critical parts of the modern data center and are vital to the success of these data initiatives. They’re where the world’s transactional data originates – and because that essential data can’t remain siloed, organizations are undertaking modernization initiatives to provide access to mainframe data in the cloud.
AWS Mainframe Modernization Data Replication with Precisely (for mainframe and IBM i systems) enables organizations to break down data silos and provide real-time access to these complex data sources on the AWS cloud, where it can be used for analytics, AI, DevOps initiatives, and new applications.
Data Integrity Increases On-Time Last-Mile Deliveries
Online shopping and food delivery have become so ingrained in our day-to-day lives that we barely give them a second thought – until something goes wrong.
No one wants to track down a package that was mistakenly delivered to a neighbor’s home or to have a hot pizza delivered cold. Customer expectations are higher than ever (and constantly growing), and issues like these can spiral out of control if organizations don’t have a solid grasp on their last-mile delivery approach.
Last-mile delivery is the final step in an item’s delivery journey, typically when it moves from a distribution center to the end consumer. Challenges with these deliveries often stem from incomplete, low-quality address data. A minor omission, like a missing apartment unit number, can have significant consequences: failed deliveries, increased delivery times, negative customer experiences, and increased customer and driver attrition.
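To make the address-data problem concrete, here is a minimal sketch of a completeness check over delivery records. The field names (`street`, `unit`, `building_type`) are hypothetical, not any real geo-addressing service's schema; production systems would validate against authoritative address reference data instead.

```python
def check_address(addr: dict) -> list[str]:
    """Flag common last-mile problems in a delivery address record.

    Field names here ('street', 'unit', 'building_type', ...) are
    illustrative only; real geo-addressing services define their own schemas.
    """
    issues = []
    for field in ("street", "city", "postal_code"):
        if not addr.get(field):
            issues.append(f"missing {field}")
    # A multi-unit building with no unit number is a classic cause of
    # failed deliveries and costly re-delivery attempts.
    if addr.get("building_type") == "multi_unit" and not addr.get("unit"):
        issues.append("multi-unit building but no unit number")
    return issues

# A record that looks complete at a glance but lacks the apartment unit:
print(check_address({
    "street": "12 Main St", "city": "Springfield",
    "postal_code": "01101", "building_type": "multi_unit",
}))  # ['multi-unit building but no unit number']
```

Flagging records like this before they reach routing systems is far cheaper than absorbing a failed delivery.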
Accurate, consistent, and contextualized data powers flawless last-mile deliveries, and hyper-accurate geo addressing is critical to that success.
Property-specific context and intelligence help organizations achieve benefits like:
- more accurate and timely deliveries
- increased confidence among consumers and delivery drivers
- enhanced insights into opportunities for market penetration
- better, more informed decision-making that helps maximize market share
Data integrity is essential for better deliveries that save time, reduce costs, and boost customer loyalty.
Data Integrity for Compliance Remains in the Spotlight
Data privacy and security concerns remain top of mind for organizations across industries.
Landmark legislation like Europe’s General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), along with initiatives around ESG (environmental, social, and governance) and more, requires continued compliance with new and evolving laws and regulations around the globe.
As consumers’ standards for protecting their personally identifiable information (PII) grow, so do the consequences for organizations that don’t live up to those expectations. For instance, a data breach or violation of privacy standards can lead to liability, expensive fines, and a slew of negative publicity that damages brand reputation and trustworthiness.
Organizations with data integrity minimize these risks with various capabilities, including:
- data observability processes that proactively monitor for and flag anomalies in real time
- data cataloging for critical data assets across the enterprise to identify, classify, and manage data for compliance policy processes
- location-specific attributes and spatial insights that provide context to support stronger overall risk management
Reporting standards are also becoming increasingly stringent, and data integrity capabilities help ensure that metrics are clear, accurate, and readily accessible. For example:
- data integration captures the necessary data from diverse sources and makes it available in real time
- data governance provides positive control over data storage, access, use, etc., while maintaining privacy, security, and compliance with essential government regulations
- data quality controls help ensure data that’s accurate, complete, and consistent, wherever it’s stored and used throughout the organization
At the same time, there’s a demand to pragmatically analyze compliance requirements in ways that drive better business outcomes. With the proper data integrity tools and programs, organizations can ensure that their users get maximum value from their data assets while avoiding the penalties and reputational damage that come with non-compliance.
Scaling Democratized Data Access Requires Data Integrity
Many organizations are challenged to know where their data is and who owns it, to understand which projects use or need similar data, and, ultimately, to achieve economies of scale and alignment across the multiple data initiatives that spin up across their enterprise.
Going into 2024, we see a continued trend towards enabling democratized delivery of data across business functions with repeatable, scalable processes.
There are three emerging practices for achieving this. They are not mutually exclusive, although each requires data integrity for success:
- Data mesh
- Data fabric
- DataOps
Data mesh initiatives focus on building business-focused data products by empowering the subject matter experts and business units that own and manage the data to provide it to other departments, or even customers, as a product in its own right. This requires that data owners have the ability to find, explore, define, label, merge, and enrich their data – all key elements of data integrity.
Data fabrics are being explored by business and IT teams to connect data through its metadata and provide the knowledge layer required to identify the data products that will provide the most meaning and value. The ultimate goal of a fabric is to bring together structured and unstructured data and make it useful for humans and machines alike. Data integrity capabilities such as data cataloging, data integration, metadata management, and more are employed to create a fabric.
And DataOps, while not a data delivery method like a mesh or fabric, brings together people, processes, and technology across the data lifecycle to improve data quality with speed and agility, at scale. As with the other disciplines, this requires a focus on data discovery, data cataloging, data governance, and data quality and enrichment – all core data integrity capabilities – to automate the agile aggregation, enrichment, and delivery of data that creates value for the enterprise.
While both data mesh and data fabric are very early in the maturity lifecycle, they are anticipated to mature further in 2024 as organizations seek ways to accelerate the delivery of data across their enterprise for analytics, reporting, AI, automation, and more.
Build an Ongoing Discipline for Data Integrity
In the earlier stages of data maturity, many organizations view data integrity through the lens of data quality, and they tend to understand data quality improvement as a one-off exercise. If inconsistencies and inaccuracies in the customer database can be fixed, the organization’s data analytics initiatives can presumably proceed.
That approach assumes that good data quality will be self-sustaining. The one-off approach tends to deliver short-term improvements, followed by medium-term decline and long-term erosion of confidence.
To achieve lasting data quality at scale, organizations must build a framework to catalog data sources, establish a standard data dictionary, and develop clear business rules supported by a technology framework to help continuously manage data quality.
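The framework described above can be sketched as a small rule-driven check: a data dictionary defines each field's expectations, and a validator applies those business rules continuously rather than as a one-off cleanup. The fields and rules below are purely illustrative; real frameworks externalize these definitions as governed metadata.

```python
# Illustrative data dictionary: each entry names a field and the business
# rules it must satisfy. These fields and rules are hypothetical examples.
DATA_DICTIONARY = {
    "customer_id": {"required": True, "type": int},
    "email": {"required": True, "type": str, "must_contain": "@"},
    "country_code": {"required": False, "type": str, "max_len": 2},
}

def validate(record: dict) -> list[str]:
    """Apply the dictionary's rules to one record; return any violations."""
    errors = []
    for field, rules in DATA_DICTIONARY.items():
        value = record.get(field)
        if value is None:
            if rules.get("required"):
                errors.append(f"{field}: required field missing")
            continue
        if not isinstance(value, rules["type"]):
            errors.append(f"{field}: expected {rules['type'].__name__}")
            continue
        if "must_contain" in rules and rules["must_contain"] not in value:
            errors.append(f"{field}: must contain {rules['must_contain']!r}")
        if "max_len" in rules and len(value) > rules["max_len"]:
            errors.append(f"{field}: longer than {rules['max_len']} chars")
    return errors

print(validate({"customer_id": 42, "email": "a@example.com", "country_code": "US"}))  # []
print(validate({"email": "not-an-email"}))  # two violations
```

Because the rules live in one place, running this check on every load (rather than once) is what turns a cleanup project into an ongoing discipline.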
Beyond data quality, organizations further extend the business value of their data by taking additional steps in the data integrity journey – like eliminating data silos through data integration and adding context via data enrichment and location intelligence.
Organizations must achieve data competencies in all aspects of data integrity to compete effectively in today’s highly competitive global marketplace. To establish trust and confidence in data-driven insights, you must internalize data integrity as an ongoing discipline. Read our 2023 Data Integrity Trends and Insights Report to learn more.