What’s In a Name? Mainframe GDGs Get the Job Done

December 23, 2025

Steve Pryor

Mainframes are different. They’re the largest, fastest, most capable systems on the planet, and they still run many, if not most, of the most important applications in the commercial world.

One reason mainframes remain dominant? Their long history of effectively dealing with business data.

In large-scale business data processing environments, a common requirement is the need to deal with many different successive sets of the same type of data. Daily files lead to weekly files, which then roll up into monthly and yearly files, all of which must be easily accessible either individually or as a group, in the desired order.

Since the early days of mainframes, the Generation Data Group (GDG) has been the means of managing these successive occurrences (or ‘generations’) of the same data. Simply by using the dataset name, applications can select a current or prior generation, create new generations, or use the entire collection of datasets (a ‘GDG-all’ request).

This powerful yet simple method of managing data is unique to z/OS.

How to Create a Mainframe GDG

Before the individual datasets that make up a GDG can be created, a GDG ‘base’ entry must be created in the catalog using IDCAMS. Once the GDG base exists, individual generation datasets (GDSs), which are normally ordinary sequential datasets, can be created.

A DEFINE statement, like the one below, accomplishes a few different things:

Creates the GDG base entry
Sets the number of generations to keep track of
Specifies what to do when the maximum number of generation datasets is reached

DEFINE GDG(NAME(MY.BUSINESS.DATA) LIMIT(255) NOEMPTY SCRATCH)

In the example above, each individual generation data set (GDS) is catalogued as it’s created, and a maximum of 255 generations (the LIMIT value) are retained in the GDG base catalog entry.

Once 255 datasets have been created, the GDG base entry is ‘full’. When the next generation is created, the oldest generation ‘rolls off’ the GDG and is deleted (SCRATCH). Optionally, if EMPTY is specified rather than NOEMPTY, all existing generations (not just the oldest) roll off when the limit is exceeded.
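
A DEFINE like the one above is typically submitted through a batch IDCAMS step. The following is a minimal sketch rather than a production job: the job name, accounting field, and job classes are placeholders that do not come from this article, and your installation's standards will differ.

```jcl
//DEFGDG   JOB (ACCT),'DEFINE GDG BASE',CLASS=A,MSGCLASS=X
//* Run IDCAMS to create the GDG base entry in the catalog.
//* Job name, accounting data, and classes are placeholders.
//DEFINE   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DEFINE GDG(NAME(MY.BUSINESS.DATA) -
             LIMIT(255)             -
             NOEMPTY                -
             SCRATCH)
/*
```

Note that all of the attributes sit inside the parentheses that follow GDG, and that a trailing hyphen continues the command onto the next line.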

The generation datasets belonging to the GDG above have dataset names of the form MY.BUSINESS.DATA.G0001V00, where the last qualifier (referred to as the ‘goovoo’ level) specifies the absolute generation number – which may range from 0001 to 9999 as generations are created, rolled off, and deleted. The version number (‘Vxx’) is rarely used.
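
To see which absolute generation names currently exist under a GDG base, one option is an IDCAMS LISTCAT of the base entry. This is a sketch of a single job step, assuming the base MY.BUSINESS.DATA has already been defined; the step name is a placeholder.

```jcl
//* List the GDG base entry and the generations associated with it.
//LISTGDG  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  LISTCAT ENTRIES(MY.BUSINESS.DATA) ALL
/*
```

The listing is written to SYSPRINT.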

While it’s possible to refer to a particular generation dataset by its absolute name and version, it’s more common to use relative generation numbers in the dataset name.

The relative generation is specified by placing it in parentheses following the GDG base name.

The most recent, or current generation, is generation zero – in our example this is MY.BUSINESS.DATA(0).

Older generation numbers are prefixed by a minus sign – so, the generation preceding the current generation would be MY.BUSINESS.DATA(-1).
New generations are created by specifying a plus sign: MY.BUSINESS.DATA(+1) and MY.BUSINESS.DATA(+2).
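
As a rough illustration of these naming forms, the DD statements below read existing generations of the example GDG. The DD names are arbitrary, and the absolute generation number G0042V00 is a made-up value used only to show the format.

```jcl
//* Current (most recent) generation
//CURRENT  DD DSN=MY.BUSINESS.DATA(0),DISP=SHR
//* Generation immediately before the current one
//PRIOR    DD DSN=MY.BUSINESS.DATA(-1),DISP=SHR
//* A specific generation by absolute name (hypothetical number)
//ABSOLUTE DD DSN=MY.BUSINESS.DATA.G0042V00,DISP=SHR
```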

When creating new generations, make sure to specify each new generation number in ascending order as the dataset names appear in the JCL, particularly in a multi-step job, so that the generations are catalogued correctly.
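
The ascending-order rule can be sketched with a two-step job. This is an illustration under assumptions, not a template: the job card values, UNIT, SPACE, and DCB attributes are placeholders, and at many sites the attributes would come from SMS data classes or a model dataset instead.

```jcl
//NEWGENS  JOB (ACCT),'CREATE GENERATIONS',CLASS=A,MSGCLASS=X
//* STEP1 creates the first new generation in this job as (+1).
//STEP1    EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DUMMY
//SYSUT1   DD *
SAMPLE INPUT RECORD
/*
//SYSUT2   DD DSN=MY.BUSINESS.DATA(+1),
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(15,15),RLSE),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=0)
//* STEP2 creates a second new generation, so it uses (+2).
//* Within this job, (+1) still refers to the data set from STEP1.
//STEP2    EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DUMMY
//SYSUT1   DD DSN=MY.BUSINESS.DATA(+1),DISP=SHR
//SYSUT2   DD DSN=MY.BUSINESS.DATA(+2),
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(15,15),RLSE),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=0)
```

Relative generation numbers stay fixed for the life of the job, which is why STEP2 reads the dataset created in STEP1 as (+1) rather than (0).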

Generation datasets are commonly used not only for ordinary business applications, but for system data as well – particularly SMF data. They are especially flexible, since any individual generation dataset within a GDG can reside on disk or tape, can be SMS-managed or non-SMS, and can have differing block sizes or other characteristics.

Yet, GDSs are easily and automatically managed by virtue of their naming convention. Generations are:

Kept in chronological order
Automatically deleted as necessary
Referred to individually or as a group

Processing All Generations

While most day-to-day processing will probably deal with generation datasets one at a time, applications that run weekly or monthly may want to deal with all of the generations belonging to a GDG at once.

This is accomplished by simply specifying the GDG base name in the JCL, without any relative or absolute generation number. For example:

//INPUT DD DSN=MY.BUSINESS.DATA,DISP=SHR

This ‘GDG-all’ processing treats the DD statement as if it were a combined input of all of the generations belonging to the GDG.
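
For instance, a roll-up step could copy every generation into a single dataset. The sketch below assumes all generations share the same record format and length; MY.MONTHLY.DATA is a hypothetical output name, and the step name, UNIT, SPACE, and DCB values are placeholders.

```jcl
//ROLLUP   EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DUMMY
//* Base name with no generation number: all generations are read.
//SYSUT1   DD DSN=MY.BUSINESS.DATA,DISP=SHR
//* MY.MONTHLY.DATA is a hypothetical output data set name.
//SYSUT2   DD DSN=MY.MONTHLY.DATA,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(10,10),RLSE),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=0)
```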

By default, the generations are processed from the most recent to the oldest (LIFO order). They can, however, be processed from oldest to newest by specifying FIFO either when the GDG base is defined or on the GDGORDER parameter in the JCL.
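
As a sketch of the JCL form, the DD statement below asks for oldest-to-newest order on a GDG-all request. It assumes a z/OS release that supports the GDGORDER keyword.

```jcl
//* Read every generation, oldest first, for this DD only.
//INPUT    DD DSN=MY.BUSINESS.DATA,DISP=SHR,GDGORDER=FIFO
```

Coding the order on the DD statement affects only that request. Specifying it when the GDG base is defined sets the order recorded for the GDG itself.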

In more recent releases of z/OS, IBM has continued to add features to GDG processing, including FIFO order and the use of Extended GDGs, which can keep track of up to 999 generations rather than the previous limit of 255.
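
A DEFINE that uses these newer options might look like the following sketch. MY.EXTENDED.DATA is a hypothetical base name, and whether EXTENDED is accepted depends on your z/OS release and on catalog settings at your installation, so treat this as an illustration rather than a tested job step.

```jcl
//DEFGDGE  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DEFINE GDG(NAME(MY.EXTENDED.DATA) -
             EXTENDED               -
             LIMIT(999)             -
             NOEMPTY                -
             SCRATCH                -
             FIFO)
/*
```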

Additional GDG-related parameters have been added to the JCL language. Defaults for GDG DEFINE can now be set in the IGGCATxx member of SYS1.PARMLIB. In addition, users of Precisely Syncsort™ Allocation Control Center (ACC) can take advantage of systemwide allocation standards enforced by the ACC Policy Rules Engine to set policies for the creation and characteristics of not only generation datasets, but all types of other SMS or non-SMS data.

GDG: A Core Strength of z/OS

GDG processing is a unique strength of the z/OS system.

With very little effort, multiple iterations of related data can be grouped together, tracked, and managed using ordinary batch job and catalog processing.

GDGs are simple to understand and useful for a wide range of both business and system data. They’re often the backbone of some of the most important applications that run on today’s z/OS.

To learn more and take the next step beyond GDGs, see how Syncsort™ Storage Management helps you optimize IBM Z storage and avoid costly space-related failures.