The Why and How of DataOps, A New Approach to Data Management

November 26, 2019

Christopher Tozzi

You’ve probably heard of DevOps, but you may not be familiar with DataOps, a related concept that has received much less attention so far. Here’s a primer on what DataOps means and how it can revolutionize your approach to Big Data.

DevOps is an approach to software production that emphasizes constant collaboration between developers, IT Ops teams, software testers and everyone else who plays a role in designing, writing, installing and managing software.

The core idea behind DevOps is that software can be produced and managed more quickly, efficiently and reliably when everyone collaborates. DevOps aims to break down the “silos” that have traditionally separated the various IT teams within an organization.

The DevOps concept originated about a decade ago. Since then, DevOps has been established as a mainstream practice in companies ranging from small startups to large enterprises.

DataOps: The DevOps cousin

DevOps focuses on software production. It may not seem to have much to offer people who specialize in working with data. They use software tools to work with data, but they are rarely responsible for creating or managing those tools.

Yet as observers like Andy Palmer have pointed out, data specialists have much to learn from the DevOps movement. (Related: Big Data and DevOps)

That’s because data management, like software production, usually involves multiple teams. One group sets up data storage infrastructure. Another runs the database software. A third works on data analytics using tools like Hadoop or Spark. And then there is the security team, which is responsible for keeping data safe and enforcing compliance policies.

Traditionally, these different teams have not always collaborated closely. The folks who set up MySQL databases usually don’t know much about using Hadoop, for example. By embracing the core ideas of DevOps, however, organizations can adopt DataOps and help these teams collaborate more effectively.

DataOps offers the following benefits:

Faster time to value when working with data, thanks to streamlined communication across the team.
Faster identification of problems that could lead to errors or delays. By detecting issues earlier, before they snowball into larger problems, you can fix them more easily.
Stronger security. When the security experts are in close contact with everyone else on the data team, keeping data secure becomes easier.
Better use of your staff’s time. When communication is streamlined, your experts can spend more time putting their expertise to work and less time passing information to each other through clunky or redundant channels.

Agility: The key to DataOps

If DataOps sounds like a good idea – and if you’ve read this far, it hopefully does – you might be wondering how you can achieve it.

Part of the answer involves organizational design. Just as you need to eliminate silos within your organization to implement DevOps, you need to remove silos to do DataOps. But technology is an important part of the journey to DataOps, too. You need to adopt technologies that enable you to be agile in the way you manage and work with data.

In the data context, being agile means being able to move data quickly between different environments, and to remain efficient no matter the scale of the data you must work with.

Agility also requires the ability to be flexible with your tools and frameworks. Maybe you use Hadoop today, but tomorrow you want to switch to Spark – or use both together. You don’t want to be locked into a particular framework.
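One way to picture this kind of flexibility is to keep the data logic separate from the engine that runs it. The Python sketch below is only an illustration under stated assumptions: the names Engine, LocalEngine, ThreadedEngine, normalize and run_pipeline are hypothetical, and the two engines are simple stand-ins rather than real Hadoop or Spark integrations.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Protocol


class Engine(Protocol):
    """Hypothetical interface; a real engine adapter would sit behind it."""

    def map_records(
        self, fn: Callable[[dict], dict], records: Iterable[dict]
    ) -> List[dict]:
        ...


class LocalEngine:
    """Runs the transform in-process, one record at a time."""

    def map_records(self, fn, records):
        return [fn(record) for record in records]


class ThreadedEngine:
    """Stand-in for a second engine: same contract, different execution."""

    def __init__(self, workers: int = 4):
        self.workers = workers

    def map_records(self, fn, records):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # Executor.map returns results in input order.
            return list(pool.map(fn, records))


def normalize(record: dict) -> dict:
    """Data logic written once, with no engine-specific imports."""
    return {
        "customer": record["customer"].strip().lower(),
        "amount": record["amount"],
    }


def run_pipeline(engine: Engine, records: Iterable[dict]) -> List[dict]:
    """Apply the same logic on whichever engine the caller passes in."""
    return engine.map_records(normalize, records)


# Illustrative sample data, not taken from any real system.
SAMPLE = [
    {"customer": "  Acme ", "amount": 10.5},
    {"customer": "GLOBEX", "amount": 3.0},
]
```

Because run_pipeline depends only on the small Engine contract, switching engines, or running two side by side, means changing the object passed in rather than rewriting normalize.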

Precisely’s solutions for moving data from legacy systems into modern ones can help you achieve DataOps agility by putting the tools of your choice to work when analyzing data – even if that data exists in obscure mainframe environments.

To learn more about working with legacy data, check out our eBook, How to Build a Modern Data Architecture with Legacy Data.