Solution Sheet

Databricks and Precisely Connect

Breaking down data silos by integrating legacy, mainframe, and IBM i data into the Databricks Unified Data Analytics Platform for cloud-based AI and ML projects

Pipelines for Legacy Data to the Cloud

Liberate data from legacy sources for use within the Databricks Unified Data Analytics Platform and Delta Lake by building data pipelines with Connect. Connect offers a design-once, deploy-anywhere approach to ETL workflows, and its interface lets you define both batch and streaming ETL workflows from the same view. This visual approach to data integration means no reformatting or code generation is needed to construct high-performance data pipelines. With point-and-click transformations, you free up time and attention to focus on your rapidly expanding and evolving technology stack.

Build a Data Lakehouse with Databricks and Precisely Connect

Connect helps you build a data lakehouse by efficiently offloading data from legacy data stores to the Databricks Unified Data Analytics Platform. Onboard data from almost any source, including:

Mainframe data: VSAM, COBOL Copybooks, mainframe fixed and sequential files
RDBMS: Oracle, SQL, Db2, MySQL, Sybase, PostgreSQL
Semi-structured data: JSON, XML
Enterprise data warehouses: Teradata, IBM Netezza, Vertica, Greenplum
Cloud: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform
Big Data: Hadoop, Hive
Streaming platforms: Apache Kafka
Flat files: Fixed length, variable length, delimited

Connect takes an end-to-end managed approach to offloading data. Regardless of which source you choose, you can replicate hundreds or thousands of tables – including whole database schemas – into Databricks quickly and easily.
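
To make the fixed-length mainframe sources listed above more concrete, here is a minimal Python sketch of the kind of decoding such a record needs before it can land in a table. It is not Connect's API; Connect handles this step visually, without code. The record layout, the field names, and the choice of the cp037 EBCDIC code page are all assumptions made for illustration, and the sketch handles only character (display) fields, not packed-decimal ones.

```python
# Illustrative only: decode fixed-length EBCDIC records into Python dicts.
# The layout below is hypothetical; a real layout would come from a COBOL copybook.

# (field name, start offset, end offset) within each record
FIELDS = [
    ("customer_id", 0, 6),
    ("name", 6, 16),
    ("balance_cents", 16, 24),
]
RECORD_LENGTH = 24  # bytes per record in this assumed layout


def parse_record(raw: bytes) -> dict:
    """Decode one fixed-length record from EBCDIC (code page 037, assumed)."""
    if len(raw) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} bytes, got {len(raw)}")
    text = raw.decode("cp037")
    # Slice each field by position and trim the space padding
    row = {name: text[start:end].strip() for name, start, end in FIELDS}
    # Zero-padded numeric text becomes an integer
    row["balance_cents"] = int(row["balance_cents"])
    return row


def parse_file(data: bytes) -> list:
    """Split a byte buffer into records and decode each one."""
    if len(data) % RECORD_LENGTH != 0:
        raise ValueError("data length is not a multiple of the record length")
    return [
        parse_record(data[i:i + RECORD_LENGTH])
        for i in range(0, len(data), RECORD_LENGTH)
    ]


if __name__ == "__main__":
    # Build two sample records by encoding text to EBCDIC
    sample = (
        "000042" + "Ada".ljust(10) + "00001250"
        + "000043" + "Grace".ljust(10) + "00000075"
    ).encode("cp037")
    for row in parse_file(sample):
        print(row)
```

Production mainframe files usually also carry binary and packed-decimal fields, which is part of why a purpose-built tool is used for this step rather than hand-written parsing.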

Scale for Machine Learning Projects

Machine learning projects require collecting and integrating data from complex legacy data stores at scale. Connect collects the data you need from all your legacy data stores and sends it to Databricks, which provides a scalable framework for machine learning. Connect integrates natively with the Databricks Runtime: directly in your Databricks cluster, you can use Connect’s intelligent engine to cleanse, transform, and prepare petabyte-scale datasets for analytics within your tight SLAs.

Download this solution sheet to learn more.
