A Search and Deploy Mission, Part 2

In the conclusion of this two-part article, tips and techniques of software defect investigation and resolution are discussed.

ABEND Completion Codes and Some Typical Causes

While there is a wide variety of reasons for ABEND conditions ("WHYthings are the way they are") in production systems, it is possible and useful tocategorize and organize HOW certain conditions often lead to certain types of ABENDcompletion codes ­ in order to expedite or streamline your analysis and research (an80/20 approach to analysis). The following information on a few common MVS ABENDcompletion codes, and the conditions which generated them is included for you to makeeffective use of ABEND-AID listings and the above debugging, research and analysisprocess.

S0C1

  • Attempt to execute an invalid machine instruction.
  • S0C1s occur due to COBOL: Table-handling overlay (MOVEs to table subscripts/indexes which are out-of-range and which overwrite PROCEDURE DIVISION instructions); Statements referencing LINKAGE-SECTION fields incorrectly; and CALLs to an invalid subroutine name.
  • The COBOL compiler always generates valid machine instructions. S0C1s usually occur when populating tables beyond the valid OCCURS range.

Typical reasons for S0C1s (listed in italics) followed by their explanations, arelisted below.

Moving elements to a table using a subscript or index which contains a value beyondthe maximum OCCURS in the table declaration. This usually happens because of a loopthat is not terminated correctly ­ such as a routine which populates a table from aninput file containing more records than the table OCCURS declaration provides for. It canalso happen through a MOVE or invalid math statement which computes an invalidsubscript/index value.

Referencing incorrectly defined/passed LINKAGE SECTION fields. If thedefinitions of your LINKAGE SECTION fields do not match, or the definitions in the calledprogram are larger than the calling program, you could be attempting to reference dataoutside of valid storage when statements which reference those fields executed.

CALL to an invalid or unavailable module-name. If your program makes a dynamicCALL and the module-name being called is not found, you can get S806, S0C4 or S0C1 systemerrors. The reasons for invalid module-names include; misspelling the name, incorrectlyspecifying the STEPLIB/ JOBLIB DSN= in the JCL (or incorrectly concatenating theSTEPLIB/JOBLIB datasets) and leaving out apostrophes (or quotes) on a CALL literal. Thiswould cause the COBOL compiler to treat the statement as if it were a CALL identifier ­and if an identifier with that name exists in the Data Division, COBOL will attempt adynamic CALL to the value of the identifier.

S0C4

  • Attempt to reference an invalid storage address.
  • S0C4s occur due to COBOL: Table-handling overlay errors (MOVEs to table subscripts/indexes which are out-of-range ­ and which overwrite PROCEDURE DIVISION instructions); Statements referencing LINKAGE SECTION fields incorrectly; CALLs to an invalid subroutine name; STOP RUN or GOBACK in the INPUT or OUTPUT PROCEDURE when using the COBOL SORT verb; and attempt to access an unopened dataset.
  • Unless your program is executing with "bounds-checking" (supported by CA-Capex Optimizing, COBOL II and COBOL/370 ­ and generally not used in production), your table routines could overlay the contents of storage beyond the boundary of the OCCURS clause. This can cause S0C7s (see above) S0C1s and S0C4s by overwriting field values in the Data Division (S0C7s) or actually overwriting the instructions in your PROCEDURE DIVISION, producing invalid addresses (operands) for the executable (machine) code (which in turn can cause S0C1s and S0C4s).

Typical reasons for S0C4s (listed in italics), followed by their explanations, arelisted below.

Table subscript or index contains a zero value. Verify that all table-handlingsubscript/index references are within the allowable range of the table's OCCURS clause(>= 1, <= OCCURS max).

Moving elements to a table using a subscript or index which contains a value beyondthe maximum OCCURS in the table declaration. This usually happens because of a loopthat is not terminated correctly ­ such as a routine which populates a table from aninput file containing more records than the table OCCURS declaration provides for. It canalso happen through a MOVE or invalid math statement which computes an invalidsubscript/index value.

Referencing incorrectly defined/ passed LINKAGE SECTION fields. If thedefinitions of your LINKAGE SECTION fields do not match, or the definitions in the calledprogram are larger than the calling program, you could be attempting to reference dataoutside of valid storage when statements which reference those fields execute.

CALL to an invalid or unavailable module-name. If your program makes a dynamicCALL and the module-name being called is not found, you can get S806, S0C4 or S0C1 systemerrors. The reasons for invalid module-names include; misspelling the name, incorrectlyspecifying the STEPLIB/JOBLIB DSN= in the JCL (or incorrectly concatenating theSTEPLIB/JOBLIB datasets) and leaving out apostrophes (or quotes) on a CALL literal. Thiswould cause the COBOL compiler to treat the statement as if it were a CALL identifier ­and if an identifier with that name exists in the Data Division, COBOL will attempt adynamic CALL to the value of the identifier.

S0C7

  • Data exception: Invalid numeric data in numeric field caught by a Convert-to-Binary machine instruction during a mathematical operation or numeric compare.
  • S0C7s can occur on COBOL: Arithmetic instructions (ADD, SUBTRACT, MULTIPLY, DIVIDE, COMPUTE); Comparisons involving tests of numeric fields (which can occur with the following statements: IF, EVALUATE, PERFORM UNTIL, PERFORM VARYING, GO TO DEPENDING...); and MOVE statements when the receiving field is packed (COMP-3) or binary (COMP) and the sending field contains invalid numeric data.
  • S0C7s occur when MVS finds invalid numeric data in a field defined as PIC 9 (all PIC 9 fields ­ DISPLAY, COMP, COMP-3 and floating point) during arithmetic or compare operations.

Note: S0C7s do not occur on an IF statement comparing PIC X fields

Typical reasons for S0C7s (listed in italics), followed by their explanations, arelisted below.

Failure to initialize a WORKING-STORAGE field. Be sure all numeric work areascontain a VALUE clause at the elementary level or are correctly INITIALIZEd before theyare used (as other than receiving fields in MOVE statements) within your program. Beparticularly careful with "counters and accumulators." Also, always initializeelementary (rather than group) COMP-3 fields.

Non-numeric "input data" in a numeric field. May need "IF NUMERIC..." test or may need to browse output files produced in previous job step("input" to this program).

Fall-through logic or invalid branching sequence. Sometimes program logic errorsforce program execution into a paragraph out of sequence such as executing an edit routinebefore a record is READ, or after the file has been closed (and spaces or HIGH-VALUES havebeen moved to the record).

MOVE statements when the receiving field's definition is COMP or COMP-3.The type of MOVE statement generated by COBOL compiler is based on the datatype definitionof the receiving field. If the receiving field is COMP or COMP-3, COBOL generates an"algebraic MOVE." This will result in a S0C7 if the sending field containsinvalid numeric data. "IF NUMERIC ..." tests on the sending field may benecessary prior to the MOVE statement.

Table-loading overlay errors. This can happen if a table-loading processoverlays data beyond the table OCCURS range (i.e., non-numeric data can be moved tonumeric-defined fields that are adjacent to the storage area set aside for the tablethrough its OCCURS clause).

Referencing incorrectly defined/ passed LINKAGE SECTION fields. If thedefinitions of your LINKAGE SECTION fields do not match, your program may referencenon-numeric data through numeric field definitions.

S0CB

  • Attempt to divide by zero/decimal-divide overflow.
  • S0CBs occur due to COBOL: DIVIDE statements if the quotient in a division using a decimal operand is greater than the size of the receiving field; and division by zero.
  • Note that S0CB ABENDs may be intercepted by COBOL library subroutines (which automatically check for zero before dividing). If this is the case, zero-divide will result in "user" return-codes: U0203 ­ OSVS COBOL; and U1061 ­ VS COBOL II.

Typical reasons for S0CBs (listed in italics), followed by their explanations, arelisted below.

DIVIDE by zero. Program logic should always check to see if the divisor has beenproperly initialized or updated or, in the case of input edits and data validation, thatthe divisor is > zero before doing the division. Also check to see whether a fractionalvalue was MOVEd to an integer field, truncating the fractional value and resulting in zerodivide.

Decimal DIVIDE exception. Check the specification of the COMP-3 receiving field,the placement of the V in the receiving field definition and the overall definition of thereceiving field. Also, check to see if the ON SIZE ERROR condition should have been coded.

S001

  • Input/Output problem
  • S001s occur due to COBOL logic errors: File READ/WRITE error; and file OPEN/CLOSE error.
  • S001 errors occur primarily due to incorrect COBOL logic (fall-through errors, logic executed out of sequence, etc.)

Typical reasons for S001s (listed in italics), followed by their explanations arelisted below.

S001on a READ operation. Occurs if your program READs before opening a file orREADs after closing a file. (Place file OPEN/CLOSE statements in dedicated Initializationand Termination paragraphs.) Can also occur if your program READs past the end-of-filecondition (Create a unique end-of-file switch for each file your program reads, watch"switch" on READ statement and PERFORM UNTIL). Can also occur if your programattempts to READ from a file OPEN for OUTPUT.

S001on a WRITE operation. Occurs if you WRITE before opening a file or afterclosing a file (see above on Initialization/Termination routines). Can also occur if yourprogram attempts to WRITE to a file OPEN for INPUT.

S013

  • Conflict in DCB (Data Control Block) parameters.
  • S013s occur due to inconsistencies between COBOL file description statements in your program and: The DCB (data control block) parameter specified on the file DD statement in your JCL (for output files) or the DCB entry taken from the physical file DCB parameters, stored on the file's device header.

Typical reasons for S013s (listed in italics), followed by their explanations, arelisted below.

S013 on an OPEN statement for an input file. Occurs if your program's RECORDCONTAINS clause conflicts with the physical file's record length. Or if your program'sBLOCK CONTAINS clause conflicts with the physical file's blocking factor. Suggestion ­ oninput files, do not specify RECORD CONTAINS. Code BLOCK CONTAINS 0 RECORDS.

S013 on an OPEN statement for an output file. Occurs if your program's RECORDCONTAINS clause conflicts with the file's JCL (LRECL= size). Or if your program's BLOCKCONTAINS clause conflicts with the file's JCL BLKSIZE= parameter. Suggestion ­ on outputfiles, code BLOCK CONTAINS 0 RECORDS.

S213

  • File open error
  • S213s occur when an input file is not found. This can happen if: The file does not exist or the filename is misspelled on the JCL DSN= parameter.

Typical reasons for S213s (listed in italics), followed by their explanations, arelisted below.

S213 on an OPEN statement for an input file. Occurs on file OPEN when the systemcannot find the input filename as specified in your JCL. This can happen because of asimple typo in the JCL, or because a previous job failed to complete successfully.

S122/222/322

  • Operator cancel
  • S122/S222s occur when an operator cancels a job: 122 means the job was canceled and a storage dump was requested; and S222 means the job was canceled, but a dump was not requested (although, depending on which MVS routine was active when the job was canceled a dump may have been produced).
  • 322s occur when MVS cancels a job because the default or specified CPU time limit for a job step or procedure was exceeded.

Note on S122/222: It is important to note that S122/222 job cancellations are"judgment calls" by the system operator and that, in fact, there may be nothingwrong at all. Always begin your research by calling the operator and requesting anexplanation of why they canceled the job.

Note on S322: If a job that normally processes 100,000 records jumps to10,000,000, or if it is run on a slower CPU with slower external devices, S322 may simplysignify that you have to increase the CPU time in the JCL.

*However, it could be that S122/222/322s occur because of program logic or jobexecution errors.

Typical reasons for S122/222/322 (listed in italics), followed by their explanations,are listed below.

Job is deadlocked (program is in a Wait state). Occurs when a file your programrequests cannot be allocated to your process because some other program is using it. Thisgenerally occurs when jobs are initiated out of sequence.

Program is in an infinite loop. Occurs when your logic repeatedly executes thesame routines over and over. Generally due to incorrectly setting or checking switches andreturn-codes, or some type of fall-through error.

S806

  • Requested Load Module not found.
  • S806s occur when a called program (or system subroutine) is not found. This can happen if: The module name is misspelled on the CALL statement; The module was not successfully LINKed into the application; The program name is misspelled on the JCL EXEC PGM= parameter; and the STEPLIB/JOBLIB DD statements point to incorrect load libraries, or the libraries are incorrectly concatenated.

Typical reasons for S806s (listed in italics), followed by their explanations, arelisted below.

Module name is misspelled. If your program makes a dynamic CALL and themodule-name being called is not found, you can get S806, S0C4 or S0C1 system errors. Thereasons for invalid module-names include; misspelling the name, incorrectly specifying theSTEPLIB/JOBLIB DSN= in the JCL (or incorrectly concatenating the STEPLIB/JOBLIB datasets),leaving out apostrophes (or quotes) on a CALL literal - which would cause the COBOLcompiler to treat the statement as if it were a CALL identifier - and if an identifierwith that name exists in the Data Division, COBOL will attempt a dynamic CALL to the valueof the identifier.

B37/E37

  • Out of space condition
  • B37/E37s occur when there is insufficient space on an output device. This can occur because of: Insufficient SPACE allocated through the JCL for an output file - in which case you should re-estimate the SPACE requirements for your output file, and increase SPACE allocation; Insufficient SPACE on a particular DASD device - in which case you should either choose a different device, or remove some files from the pack; and A program logic error such as an infinite loop which includes WRITE statements.

Typical reasons for B37/E37s (listed in italics), followed by their explanations arelisted below.

Program is in an infinite loop in a WRITE routine. Occurs when a file your logicrepeatedly executes a WRITE statement over and over. Generally due to incorrectly settingor checking switches and return-codes, or some type of fall-through error.

About the Author:

Jonathan Sayles is Senior Technical Consultant, Micro Focus, PLC (Palo Alto,Calif.), and has published books and articles on topics such as relational database,client/server development, application development workbenches, Visual Basic,PowerBuilder, DB2, Oracle and SQL.