Migrating mainframe JCL jobs to serverless using AWS Step Functions

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/migrating-mainframe-jcl-jobs-to-serverless-using-aws-step-functions/

This post is written by Raghuveer Reddy Talakola, Sr. Modernization Architect, Sanjay Rao, Sr. Mainframe Consultant, and Aneel Murari, Solution Architect.

JCL (Job Control Language) is a scripting language used to program batch jobs on mainframe systems. A JCL can contain one or more job control statements. The condition code parameter checking syntax, which determines the order and conditions under which these statements run, can be challenging to understand.

If a JCL fails midway through execution, mainframe programmers have no visual aids to help them understand the flow of the JCL. They must examine text-based execution logs to manually correlate condition codes in the logs with condition check rules attached to JCL statements to understand the root cause of failure.

This post explains how AWS Step Functions can make it easier to maintain batch jobs migrated from mainframes to AWS.

Overview

The sample application shows how to use AWS Step Functions to address typical challenges when maintaining a batch workflow built using JCL. The sample business case validates a feed of new employee information against an existing employee database. It identifies discrepancies between the feed and the database and sends out notifications if it finds any.

The mainframe JCL supplied with this blog has seven steps. Each step applies condition code rules to check codes emitted by previous steps to decide if it must run. The Step Functions example achieves the same result. Using its graphical user interface, you can develop each step as an independent task, and link them visually. This makes it easier to understand how to decouple, reorder, or scale tasks if needed.

Visual tools for workflow analysis

A JCL controls its flow by using condition code checking, IF-ELSE statements, or both. A JCL condition code check defines the rules under which its associated JCL step will not run. Because each rule states when to skip a step rather than when to run it, developers often end up coding compound rules and double or even triple negatives into the flow.

Example of condition code check in JCL | What it means
//STEPTS2 EXEC PGM=XYZ,COND=(4,GT,STEPTST) | Do not execute PGM XYZ if 4 is greater than the return code of step STEPTST, that is, if STEPTST ended with a return code less than 4
//STEPTS3 EXEC PGM=XYZ,COND=EVEN | Execute PGM XYZ even if a preceding step ended abnormally
//STEPTS5 EXEC PGM=XYZ,COND=((6,EQ),(8,GT)) | Do not execute PGM XYZ if any preceding step ended with return code 6 or with a return code less than 8
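The rule behind these examples is that each test reads "code, operator, return code", with the literal code on the left of the comparison, and that a step is bypassed as soon as any one test is true. The following Python function is a minimal sketch of that rule, not a full JCL interpreter. The function name `is_bypassed` and the tuple layout are illustrative assumptions, and the sketch does not model COND=EVEN, COND=ONLY, or abnormal terminations.

```python
import operator
from typing import Iterable, Mapping, Optional, Tuple

# JCL comparison operators mapped to their Python equivalents.
_OPERATORS = {
    "GT": operator.gt, "GE": operator.ge, "EQ": operator.eq,
    "NE": operator.ne, "LT": operator.lt, "LE": operator.le,
}

# One COND test: (code, operator, step name). A step name of None means
# "compare against every preceding step".
CondTest = Tuple[int, str, Optional[str]]


def is_bypassed(tests: Iterable[CondTest], return_codes: Mapping[str, int]) -> bool:
    """Return True if any test is satisfied, which means the step is skipped.

    return_codes maps the names of steps that already ran to their return codes.
    """
    for code, op, step_name in tests:
        if step_name is None:
            candidates = list(return_codes.values())
        elif step_name in return_codes:
            candidates = [return_codes[step_name]]
        else:
            candidates = []  # the named step did not run, so the test is ignored
        # The literal code is the LEFT operand: COND=(4,GT,X) tests "4 > rc of X".
        if any(_OPERATORS[op](code, rc) for rc in candidates):
            return True
    return False
```

For COND=(4,GT,STEPTST), `is_bypassed([(4, "GT", "STEPTST")], {"STEPTST": 0})` returns True because 4 is greater than 0, while a return code of 8 from STEPTST lets the step run.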

The sample JCL illustrates the complexity of setting up a batch workflow using JCL condition code:

  1. The first step of this JCL deletes files from a previous run. If it ends with code 0, the second JCL step extracts employee data from Db2 using a COBOL program and ends with a return code 0 if it is successful or 4 if no records were found.
  2. The next step, coded with the condition check (4,LT), runs only if every preceding step ended with a return code of 4 or lower. It checks the external extract and emits a condition code of 8 if the external extract is empty.
  3. The next step compares the two files if the extract validation step produced a return code of zero.
  4. If this comparison step detects some records that are missing in the employee Db2 database, it creates a file with missing records. If that file is empty, it sets a return code of 8, which ends the program. If the mismatch file has data, it copies the mismatch file over to another system for processing.

With Step Functions, you define the same workflow more easily by using the Amazon States Language (ASL). The Step Functions console provides a graphical representation of the state machine, and its drag-and-drop interface lets you visualize and edit the application logic.

Step Functions Workflow Studio

  1. The first task fetches the employee file from Amazon S3. It does not need a cleanup task as S3 supports versioning.
  2. If the fetched file is not empty, control passes to the step that runs business logic code inside an AWS Lambda function to validate the employee feed.
  3. The workflow retrieves an environment variable from an external parameter store. This step shows how environment parameters can be externalized in a Step Functions workflow.
  4. It publishes an event to Amazon EventBridge to trigger the external processing needed if discrepancies are found and conditions are met.
  5. The final step is a Succeeded state that marks flow completion.
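To give a feel for what the ASL behind these steps might look like, here is a rough sketch that builds a state machine definition as a Python dictionary and checks that every transition points at a defined state. It is not the definition from the sample repository. The bucket, key, function name, parameter name, event fields, the `mismatchCount` field, and the `EmptyFeed` failure state are all hypothetical placeholders, and the response field `ContentLength` is an assumption you should verify against the S3 integration output.

```python
import json

# Hypothetical sketch of the workflow; every resource name is a placeholder.
definition = {
    "Comment": "Sketch of the employee feed validation workflow",
    "StartAt": "FetchEmployeeFile",
    "States": {
        "FetchEmployeeFile": {
            "Type": "Task",
            "Resource": "arn:aws:states:::aws-sdk:s3:getObject",
            "Parameters": {"Bucket": "example-feed-bucket", "Key": "employees.csv"},
            "ResultPath": "$.file",
            "Next": "IsFileEmpty",
        },
        "IsFileEmpty": {
            "Type": "Choice",
            "Choices": [{"Variable": "$.file.ContentLength",
                         "NumericGreaterThan": 0, "Next": "ValidateFeed"}],
            "Default": "EmptyFeed",
        },
        "ValidateFeed": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": "validate-employee-feed",
                           "Payload.$": "$.file"},
            "ResultPath": "$.validation",
            "Next": "GetConfig",
        },
        "GetConfig": {
            "Type": "Task",
            "Resource": "arn:aws:states:::aws-sdk:ssm:getParameter",
            "Parameters": {"Name": "/employee-feed/notify"},
            "ResultPath": "$.config",
            "Next": "HasDiscrepancies",
        },
        "HasDiscrepancies": {
            "Type": "Choice",
            "Choices": [{"Variable": "$.validation.Payload.mismatchCount",
                         "NumericGreaterThan": 0, "Next": "PublishDiscrepancies"}],
            "Default": "Done",
        },
        "PublishDiscrepancies": {
            "Type": "Task",
            "Resource": "arn:aws:states:::events:putEvents",
            "Parameters": {"Entries": [{
                "EventBusName": "default",
                "Source": "example.employee-feed",
                "DetailType": "EmployeeFeedDiscrepancy",
                "Detail.$": "$.validation.Payload",
            }]},
            "Next": "Done",
        },
        "EmptyFeed": {"Type": "Fail", "Error": "EmptyFeed"},
        "Done": {"Type": "Succeed"},
    },
}


def dangling_targets(defn: dict) -> list:
    """Return transition targets that do not name a defined state."""
    targets = {defn["StartAt"]}
    for state in defn["States"].values():
        targets.update(t for t in (state.get("Next"), state.get("Default")) if t)
        targets.update(choice["Next"] for choice in state.get("Choices", []))
    return sorted(targets - set(defn["States"]))


if __name__ == "__main__":
    assert dangling_targets(definition) == []
    print(json.dumps(definition, indent=2))  # JSON you could paste into the console
```

Each condition that JCL would express as a COND parameter on a later step becomes an explicit Choice state here, so the branch is visible at the point where the decision is made.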

The following image compares the sample JCL that is converted to a Step Functions workflow:

Sample JCL and Step Functions

Using a graphical interface instead of job control statements

In JCL, you define a batch process in a text editor as a series of job control statements, each of which runs a program, a utility, or a nested procedure. There is no visual aid. As a batch process becomes more complex, it becomes harder to understand the dependencies between the steps.

Step Functions makes it simpler for you to set up tasks, which are the equivalents of steps in JCL. It provides you with a graphical user interface (GUI) that enables you to configure and drag-and-drop steps into a state machine.

Decoupling tasks instead of deleting and commenting out code

To disable or change a step in a JCL, you examine the condition code logic associated with all preceding and succeeding steps of the job. Any mistake in editing these codes can lead to unintended consequences.

With Step Functions, removing or changing a step can be done using the visual editor or by updating the ASL code. This can help improve your agility and make it easier to implement change.
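As a rough illustration of why this is a contained change, the following Python sketch removes one state from an ASL definition held as a dictionary and points its predecessors at its successor. The helper name `remove_state` is an assumption made for this example, and it only handles a state that has a single "Next" transition. It also does not check whether later states depend on the removed state's output, which you would still need to review.

```python
import copy


def remove_state(definition: dict, name: str) -> dict:
    """Return a copy of the definition without `name`, rewiring its predecessors.

    Only supports states with a single "Next" transition; a terminal or
    Choice state raises KeyError because it has no single successor.
    """
    successor = definition["States"][name]["Next"]
    updated = copy.deepcopy(definition)  # leave the caller's definition untouched
    del updated["States"][name]
    if updated["StartAt"] == name:
        updated["StartAt"] = successor
    for state in updated["States"].values():
        if state.get("Next") == name:
            state["Next"] = successor
        if state.get("Default") == name:
            state["Default"] = successor
        for choice in state.get("Choices", []):
            if choice.get("Next") == name:
                choice["Next"] = successor
    return updated


# Hypothetical three-step workflow used only to demonstrate the helper.
workflow = {
    "StartAt": "Extract",
    "States": {
        "Extract": {"Type": "Pass", "Next": "Audit"},
        "Audit": {"Type": "Pass", "Next": "Load"},
        "Load": {"Type": "Pass", "End": True},
    },
}

without_audit = remove_state(workflow, "Audit")
```

After the call, "Extract" transitions directly to "Load", and no other state needs to change. In JCL, the equivalent edit can require revisiting the COND parameter of every later step that referenced the removed step.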

Using Parameter Store instead of editing parameters in code

To make JCL behave differently based on parameters, you must edit dynamic variables known as JCL symbols, either inside the JCL or in control cards. The following JCL code sample shows a parameter called REGN set to the value DEV. At runtime, DEV is substituted for REGN in every statement that references this parameter. To reuse this JCL in production, you change the value assigned to REGN to, for example, PROD.

//   SET REGN=DEV
//    -------
//******************************************************************
//*  RUN  Db2 COBOL Batch Program 
//******************************************************************
//EXTRDB2 EXEC PGM=IKJEFT01,COND=(0,NE)                                
//    -------
//FILEOUT  DD DSN=&REGN..AWS.APG.STEPDB2,                             
//******************************************************************
//*  RUN  VSAM COBOL Batch Program 
//******************************************************************
//    -------
//FILE2    DD DSN=&REGN..AWS.APG.STEPVSM,                             
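To make the substitution concrete, here is a simplified Python sketch of how a symbol such as &REGN is resolved. In `&REGN..AWS`, the first period marks the end of the symbol name and is consumed, and the second period stays in the data set name. The function name `substitute` is an assumption for this example, and the sketch ignores details of real symbol processing such as temporary data set names that start with `&&`.

```python
import re

# A symbol is "&" plus a name of up to eight characters, optionally ended by
# one period that is consumed during substitution.
_SYMBOL = re.compile(r"&([A-Z@#$][A-Z0-9@#$]{0,7})\.?")


def substitute(statement: str, symbols: dict) -> str:
    """Replace known &NAME references in a JCL statement with their values."""
    def replace(match):
        name = match.group(1)
        # Leave unknown symbols exactly as written.
        return symbols[name] if name in symbols else match.group(0)
    return _SYMBOL.sub(replace, statement)


dev = substitute("//FILEOUT  DD DSN=&REGN..AWS.APG.STEPDB2,", {"REGN": "DEV"})
prod = substitute("//FILEOUT  DD DSN=&REGN..AWS.APG.STEPDB2,", {"REGN": "PROD"})
```

With REGN set to DEV, the statement resolves to a data set name that starts with `DEV.AWS.APG`. Changing the single SET value to PROD changes every reference, but it still means editing the job itself.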

In Step Functions, configuration parameters can be decoupled from state machine code by managing them in an external data source such as Amazon DynamoDB or AWS Systems Manager Parameter Store. In the Step Functions workflow, the following step demonstrates retrieving a configuration value from Parameter Store and using it to perform branching logic:

Workflow example
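In ASL terms, a step like this could be expressed as a Task state that calls Parameter Store followed by a Choice state that branches on the returned value. The following Python sketch builds that pair of states. The state names, the parameter name, and the helper `config_branch` are hypothetical, and the sample repository may structure this differently.

```python
import json


def config_branch(parameter_name: str, prod_state: str, default_state: str) -> dict:
    """Build two ASL states: read a parameter, then branch on its value."""
    return {
        "GetRegion": {
            "Type": "Task",
            # AWS SDK service integration for Systems Manager GetParameter.
            "Resource": "arn:aws:states:::aws-sdk:ssm:getParameter",
            "Parameters": {"Name": parameter_name},
            # Keep only the parameter value from the API response.
            "ResultSelector": {"value.$": "$.Parameter.Value"},
            "ResultPath": "$.region",
            "Next": "BranchOnRegion",
        },
        "BranchOnRegion": {
            "Type": "Choice",
            "Choices": [{
                "Variable": "$.region.value",
                "StringEquals": "PROD",
                "Next": prod_state,
            }],
            "Default": default_state,
        },
    }


# Hypothetical parameter and target state names.
states = config_branch("/employee-feed/REGN", "NotifyProduction", "NotifyTest")

if __name__ == "__main__":
    print(json.dumps(states, indent=2))
```

Because the value lives in Parameter Store, moving from DEV to PROD means changing the stored parameter rather than editing and redeploying the state machine definition.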

Independent scaling of steps versus splitting and cloning JCLs

When a JCL takes a long time to run, mainframe programmers split the job into multiple jobs or steps. Each job is a replica addressing different ranges of data.

With Step Functions, you can run a step or a group of steps concurrently by using a parallel state or map state, without creating multiple jobs that do the same thing. This can help make maintenance easier.
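As a sketch of the map state approach, the following Python code splits a range of employee IDs into chunks and defines a Map state that processes each chunk concurrently. The function name `split_ranges`, the Lambda function name, and the input shape are assumptions made for this example.

```python
def split_ranges(first_id: int, last_id: int, parts: int) -> list:
    """Split the inclusive range [first_id, last_id] into up to `parts` chunks."""
    total = last_id - first_id + 1
    size, extra = divmod(total, parts)
    ranges, start = [], first_id
    for index in range(parts):
        # The first `extra` chunks take one additional item each.
        length = size + (1 if index < extra else 0)
        if length == 0:
            break
        ranges.append({"start": start, "end": start + length - 1})
        start += length
    return ranges


# Map state that runs the same task once per range, up to three at a time.
map_state = {
    "Type": "Map",
    "ItemsPath": "$.ranges",
    "MaxConcurrency": 3,
    "ItemProcessor": {
        "ProcessorConfig": {"Mode": "INLINE"},
        "StartAt": "ProcessRange",
        "States": {
            "ProcessRange": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {
                    "FunctionName": "process-employee-range",  # hypothetical
                    "Payload.$": "$",
                },
                "End": True,
            }
        },
    },
    "End": True,
}

# Input for one execution: the Map state iterates over "ranges".
execution_input = {"ranges": split_ranges(1, 10, 3)}
```

Here the ranges cover IDs 1 to 4, 5 to 7, and 8 to 10. Changing the degree of parallelism means changing the number of ranges in the input or the MaxConcurrency value, not cloning the job.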

Improved observability and automated retry

If a JCL fails, there are no visual aids to help debug the errors. You must log in to the mainframe and page through several screens of text in SDSF (System Display and Search Facility) to find the cause of the failure.

Step Functions provides visual information on failures, automated retry capabilities, and native integration with AWS services. This can make it easier to understand and recover from failed jobs compared with reading through lengthy logs.

JCL example

Workflow visualization
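Automated retry is configured on the task itself through the Retry and Catch fields. The following Python sketch shows one possible configuration and a small helper that computes the wait before each retry. The function name, the `NotifyFailure` state, and the helper `retry_delays` are hypothetical, and the helper ignores the optional MaxDelaySeconds and JitterStrategy fields.

```python
# Task state with a retrier for transient Lambda errors and a catch-all
# fallback. The function name and the NotifyFailure state are placeholders.
validate_state = {
    "Type": "Task",
    "Resource": "arn:aws:states:::lambda:invoke",
    "Parameters": {"FunctionName": "validate-employee-feed", "Payload.$": "$"},
    "Retry": [{
        "ErrorEquals": ["Lambda.ServiceException", "Lambda.TooManyRequestsException"],
        "IntervalSeconds": 2,
        "MaxAttempts": 3,
        "BackoffRate": 2.0,
    }],
    "Catch": [{
        "ErrorEquals": ["States.ALL"],
        "ResultPath": "$.error",
        "Next": "NotifyFailure",
    }],
    "Next": "GetConfig",
}


def retry_delays(retrier: dict) -> list:
    """Seconds to wait before each retry attempt, using the ASL defaults."""
    interval = retrier.get("IntervalSeconds", 1)
    rate = retrier.get("BackoffRate", 2.0)
    attempts = retrier.get("MaxAttempts", 3)
    # The wait grows by the backoff rate after every failed attempt.
    return [interval * rate ** attempt for attempt in range(attempts)]


delays = retry_delays(validate_state["Retry"][0])
```

With these settings the task is retried after 2, 4, and 8 seconds. If it still fails, the Catch field routes the execution to the fallback state instead of ending the whole job.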

Benefits for developers

Step Functions provides the following improvements over jobs written in JCL or migrated from JCL.

  • Visual analysis: Step Functions provides a graphical console that shows the status of each task, which developers and support staff can understand and debug more easily than the text logs of a failed JCL.
  • Decoupling: You can update each component in the workflow independently, unlike in a JCL, where changing a step requires redeployment of the entire batch job to production.
  • Low code: Step Functions workflows are defined with minimal code. You can use the workflow editor to drag and drop steps and edit workflows visually.
  • Independent scaling of steps: Step Functions is a serverless solution, and each step can scale independently. This opens up the possibility of scaling up resources for steps that are resource-intensive.
  • Automated retry capabilities: You can configure Step Functions to retry steps and recover from failures. This is much simpler than coding restart conditions in the JCL.
  • Improved logging and visibility: Step Functions can integrate with observability tools like Amazon CloudWatch and AWS X-Ray.

Conclusion

This conversion example shows how Step Functions can help you rewrite complex batch processes written in JCL to serverless workflows. It also shows how such a conversion provides maintenance and monitoring features that make it easier to simplify and scale these batch processes.

To learn more, download the sample JCL and Step Functions workflow from the GitHub repository. To learn more about our AWS Mainframe migration and modernization services, go here.

For more serverless learning resources, visit Serverless Land.