Transform Your Mainframe Data for the Cloud with Precisely and Apache Kafka

Precisely Editor | November 4, 2021

Cloud migration projects are happening in virtually every large enterprise throughout the world, and in many small and midsize companies as well. For most, cloud data migration is an ongoing journey, likely to take a few years and often broken down into several major phases. Throughout that process, though, technology leaders must be mindful of the need to work smarter, not harder. According to a 2021 study by PwC, as much as 35% of cloud spend is wasted on inefficiencies.

Mainframes continue to be the transactional workhorses of most large organizations. They provide a stable, reliable, and proven platform for managing most business-critical operations for insurance companies, banks, government entities, healthcare organizations, and more. Most organizations do not want to throw away the value that has been accrued in those technology assets over several decades. The lingering question: How can enterprises take full advantage of cloud computing while continuing to leverage the investments they made in the past?

The Journey to the Cloud

First, technology leaders should focus on application transformation, initially with the highest value assets that serve the most important core business functions. By taking this approach, companies can gain a better understanding of which data is most important and what their data integration needs will be as their cloud migration project unfolds. A corollary to this rule is that companies must also develop a clear understanding of their existing on-premises data pipelines. These factors will inform key decisions with respect to integration throughout the project lifecycle.

Technology leaders should also consider how their landscape might evolve over time. This may include containerization, hybrid, or multi-cloud scenarios. The tools and approaches they adopt must be capable of accommodating an evolving mix of on-premises systems (including mainframes) alongside these various cloud operations.

This also leads to the question of which on-premises applications will need to be opened up to the cloud. For organizations running mainframe data systems, the answer to that question continues to evolve. Nevertheless, there are compelling reasons for companies to consider robust solutions for streaming data pipelines that connect their IBM i or other mainframe data sources to streaming targets built on Apache Kafka, Snowflake, Cloudera, or other widely used stream processing solutions.

Connecting On-Prem Systems to the Cloud

Given the trajectory of the cost/performance ratio for cloud storage and analytics, it’s no surprise that much of what drives the need for streaming connectivity between on-premises systems and the cloud centers on analytics, artificial intelligence, and machine learning. At the same time, we are seeing a stronger need to integrate transactional data into the cloud.

Mere connectivity is not enough, though. Speed, scale, accuracy, and reliability matter as well. Although many applications do not require real-time integration, we see an increasing range of scenarios that demand the immediate availability of information.

For obvious reasons, the ability to access transactional information as it happens is a critical business requirement. But real-time integration matters for analytics as well. Machine learning applications are widely used to detect fraud, for example. The timeliness of a company’s response to potential threats is essential; its ability to identify anomalies and respond rapidly is extraordinarily important.

Key Ingredients for Success

The first key ingredient of successful mainframe-to-cloud data streaming is log-based change data capture. By capturing changes from the mainframe database logs, integration can be driven by events as they occur rather than executed in periodic batches. This eliminates the need for any invasive action against the mainframe DBMS, and it supports data integrity by ensuring that no changes are lost if a connection drops while data is in transit.
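To make the contrast with batch processing concrete, here is a minimal, hypothetical sketch of a log-based CDC feed in Python. The record layout and the read_log_tail() generator are invented stand-ins for a product-specific log reader; they are not part of any Precisely or IBM API.

```python
# Minimal sketch of a log-based change data capture (CDC) feed.
# read_log_tail() and the record layout are hypothetical placeholders.
import json
import time
from typing import Dict, Iterator

def read_log_tail() -> Iterator[Dict]:
    """Simulate tailing a DBMS journal: yield one record per committed change."""
    sample_changes = [
        {"op": "INSERT", "table": "ACCOUNTS", "key": "10027",
         "after": {"BALANCE": "1520.00"}},
        {"op": "UPDATE", "table": "ACCOUNTS", "key": "10027",
         "before": {"BALANCE": "1520.00"}, "after": {"BALANCE": "1475.50"}},
    ]
    for change in sample_changes:
        change["commit_ts"] = time.time()
        yield change  # changes flow as they are committed, not in nightly batches

def handle(change: Dict) -> None:
    """Downstream handler: in a real pipeline this would publish to a streaming target."""
    print(json.dumps(change))

for change in read_log_tail():
    handle(change)  # event-driven: each committed change is processed immediately
```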

The second key ingredient to mainframe-to-cloud success is real-time data pipelines. Because it is built around log-based data capture and a streaming queue to manage individual changes, the Precisely Connect CDC solution provides for a pseudo-two-phase commit process, ensuring that data integrity is preserved and that no data is lost.
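As an illustration of how a streaming queue with acknowledged delivery helps preserve integrity, the sketch below publishes individual change events to Kafka using the confluent-kafka Python client with idempotent, fully acknowledged writes. This is a generic pattern under assumed broker and topic names, not a description of Connect CDC's internal commit protocol.

```python
# Generic confluent-kafka sketch: acknowledged, idempotent delivery of change events.
# Broker address and topic name are placeholders.
import json
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",  # placeholder broker
    "acks": "all",                # wait for full broker acknowledgement
    "enable.idempotence": True,   # avoid duplicate writes on retry
})

def on_delivery(err, msg):
    # Only after a successful acknowledgement should the source-side log position
    # be advanced -- the essence of a capture-then-confirm handoff.
    if err is not None:
        print(f"delivery failed, will not advance source position: {err}")

change = {"op": "UPDATE", "table": "ACCOUNTS", "key": "10027",
          "after": {"BALANCE": "1475.50"}}
producer.produce("mainframe.accounts.changes", key=change["key"],
                 value=json.dumps(change), on_delivery=on_delivery)
producer.flush()  # block until all outstanding events are acknowledged
```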

Third, look for flexible replication options. Most often we see a need for one-way point-to-point connectivity, distribution of data to two or more targets, or consolidation of multiple sources into a single target. It’s not uncommon, however, for companies to require bidirectional or cascading integration scenarios. Robust data streaming solutions should provide full flexibility in selecting which data is replicated, where it goes, and when.
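The topology definitions below are purely illustrative Python structures (not Connect CDC's configuration format) showing the three common shapes described above: point-to-point, distribution to multiple targets, and consolidation into one target. Source and target names are invented.

```python
# Illustrative replication topologies -- invented names, not a product configuration.
topologies = {
    "point_to_point": {                    # one source, one target
        "sources": ["DB2_ORDERS"],
        "targets": ["kafka:orders.changes"],
    },
    "distribution": {                      # one source fanned out to several targets
        "sources": ["VSAM_ACCOUNTS"],
        "targets": ["kafka:accounts.changes", "snowflake:ACCOUNTS_RAW"],
    },
    "consolidation": {                     # several sources merged into a single target
        "sources": ["DB2_ORDERS", "IMS_SHIPMENTS"],
        "targets": ["kafka:logistics.changes"],
    },
}

for name, topo in topologies.items():
    print(f"{name}: {', '.join(topo['sources'])} -> {', '.join(topo['targets'])}")
```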

Precisely Connect CDC

There is a fourth essential ingredient for successfully streaming mainframe data to the cloud, namely, an enterprise-grade integration solution that addresses those first three key ingredients while also providing the flexibility and scalability enterprises need to remain agile and adaptable. The ability to quickly and easily design and deploy an integration, or to design it once and deploy it across multiple instances, is a key enabling factor for scalability. By providing self-service tools through a browser-based interface, Connect CDC eliminates the need for users to develop an in-depth understanding of mainframe data sources.

Another key factor affecting scalability is the resilience of Connect CDC’s data delivery. The solution’s fault tolerance ensures that no data is lost, even when a connection is dropped. By integrating with Kafka’s schema registry, Connect CDC ensures that metadata integrity is maintained as well.
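For readers unfamiliar with how a schema registry preserves metadata integrity, here is a generic sketch using the confluent-kafka Python client. The registry URL, subject name, and Avro schema are placeholders; this shows the general concept rather than Connect CDC's specific integration.

```python
# Generic schema registry sketch: register a versioned schema for a change-event topic.
# URL, subject, and schema are placeholders.
from confluent_kafka.schema_registry import Schema, SchemaRegistryClient

ACCOUNT_CHANGE_SCHEMA = """
{
  "type": "record",
  "name": "AccountChange",
  "fields": [
    {"name": "op", "type": "string"},
    {"name": "key", "type": "string"},
    {"name": "balance", "type": "string"}
  ]
}
"""

client = SchemaRegistryClient({"url": "http://localhost:8081"})  # placeholder registry
schema_id = client.register_schema("mainframe.accounts.changes-value",
                                   Schema(ACCOUNT_CHANGE_SCHEMA, "AVRO"))
print(f"registered schema id {schema_id}")
# Producers serialize against this schema and consumers fetch it by id, so a change
# in the source layout surfaces at the registry instead of silently corrupting targets.
```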

Proven Mainframe-to-Cloud Integration

When a major Australian financial services company set out to improve its customer engagement with alerting, notifications, offers, and reminders, the company sought ways to stream transactional data to the cloud. Unlocking that data from VSAM was a critical step toward improving its services to customers.

The company chose to use Precisely Connect CDC to replicate changes to its Confluent-based Kafka event bus. For this use case, Connect CDC performs one-way, resilient replication to Confluent Kafka, ensuring data integrity in the process. Using Precisely’s technology, the company is able to vastly simplify the data mapping and transformation process, making the data fully intelligible within Kafka.
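To give a sense of the mapping and transformation work being simplified, here is a hypothetical example of decoding a fixed-width, EBCDIC-encoded VSAM-style record into a JSON-friendly event. The field layout is invented for illustration; in the case described above, Connect CDC performs this mapping rather than hand-written code.

```python
# Hypothetical VSAM-style record mapping: 30-byte EBCDIC record with an invented layout.
import json

def map_account_record(raw: bytes) -> dict:
    """Decode account id (10 chars), customer name (15 chars), balance in cents (5 chars)."""
    text = raw.decode("cp037")  # cp037 is a common EBCDIC code page
    return {
        "account_id": text[0:10].strip(),
        "customer": text[10:25].strip(),
        "balance": int(text[25:30]) / 100,  # simplistic: assumes unsigned zoned decimal
    }

# Build a sample EBCDIC record and run it through the mapping.
sample = "0000010027JANE CITIZEN   00152".encode("cp037")
print(json.dumps(map_account_record(sample)))
```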

As a result of deploying Connect CDC, the financial services company achieved its goals for increased customer engagement and adopted a platform that it can further exploit for future digital transformation initiatives.

If your organization is seeking stronger, faster, more flexible, and scalable streaming technology to connect your mainframe data to Apache Kafka, Snowflake, Cloudera, or other stream processing platforms, check out our free on-demand webinar, Transform Your Mainframe Data for the Cloud with Precisely and Apache Kafka.