In-Depth
Data Encapsulation Within the Procedural Paradigm
The object-oriented (OO) paradigm is very popular today. Supposedly, the use of object technology can reduce the development and maintenance costs of application systems due to the OO properties of data encapsulation, polymorphism and inheritance. These properties lend themselves to the development of reusable object components.
The procedural development paradigm has been in existence for many years. Procedural systems are generally written in third generation languages, such as COBOL or PL/I. The maintenance costs associated with procedural legacy systems can be prohibitive. Business needs constantly change and legacy systems must be modified in order to support new requirements.
The procedural world might significantly reduce maintenance costs by borrowing one idea from the world of object technology: data encapsulation. This article discusses a technique for encapsulating data in a traditional procedural application system.
Discussion
Legacy systems cost a fortune to maintain. A group supporting a legacy procedural system receives new requirements on a regular basis from the user community and applies changes to the system to produce the new functionality. A fair portion of the maintenance effort involves the addition of new fields to a database or the introduction of new processing associated with existing fields within a database.
Legacy systems can be composed of millions of lines of code and hundreds of program modules. Much of the effort in maintaining a legacy system goes into determining where to change it.
The analyst often uses a utility, such as the Search capability of ISPF, to find every occurrence of a variable within the set of programs that constitute the system. Such a search can turn up references to the variable in a great many programs. Each of those programs must be examined to determine its relationship to the variable in question, and the changes needed to satisfy the new requirements must then be formulated.
The program modules are then modified and each program which contains a reference to the variable being modified must be tested. This process can be very complex. A change to the processing of a variable within a given program can potentially impact the use of that same variable at some other location in the system. The point is, in most legacy procedural application systems, references to a variable being modified are spread all over the set of programs which compose the system. Hence, all such programs must be examined, changed, and tested in order to effect a change.
The developers of the object-oriented paradigm certainly recognized this "data all over the place" phenomenon with regard to procedural systems. They came up with the idea of an object that bundles up data within a class. The only way to get to the data in a class is to call a class function or method. The system data is partitioned into logical objects. An object describes the characteristics and behavior of some entity within a system. In the object-oriented scheme, the maintenance programmer simply finds the object which contains the data to modify and makes changes within that single class. The data related to the enhancement is localized to the class. Therefore, the change is not "spread all over the system." Costs associated with such changes may indeed be less.
It is possible to create pseudo-objects in procedural programming languages such as COBOL or PL/I. In fact, any language which contains an ENTRY statement can facilitate the creation of "objects." An ENTRY statement is presently a part of both the COBOL and PL/I languages.
An ENTRY statement defines an entry point into a COBOL subprogram or a PL/I procedure. A subprogram or procedure can contain multiple entry points. Hence, many execution paths can exist in a single subprogram or procedure.
Each ENTRY statement has a matching GOBACK or RETURN statement. When a main routine calls an entry point, control is transferred to that entry point and execution begins. Execution continues from that point until the GOBACK or RETURN statement is encountered. When the GOBACK or RETURN is executed, control returns to the instruction immediately after the call in the calling routine. The calling routine can then perform other processing, call the same entry point once again with different input arguments, or call other entry points within the subprogram or procedure containing the multiple entries.
This action is similar to calling any of the various methods (functions) on an object in the object-oriented world. The use of ENTRY statements in COBOL or PL/I effectively creates a procedural "object." The attributes (variables) contained in a procedural "object" are the variables within the subprogram or procedure. All variables within the subprogram or procedure are accessible to the entry points (methods) contained within the procedural "object." Note that such variables are global to all entry points. Also, the procedural "object" can be said to contain state.
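As a loose analogy only (Python has no ENTRY statement), the calling pattern described above can be sketched with one Python module standing in for the subprogram and each function standing in for an entry point. The names `_balance`, `deposit`, `withdraw`, and `get_balance` are hypothetical and were chosen purely for illustration; they do not come from any COBOL or PL/I system.

```python
# One module plays the role of a subprogram; each function is an "entry point".
# All names here are hypothetical, chosen for illustration only.

_balance = 0  # data owned by the "subprogram"; callers never touch it directly


def deposit(amount):
    """Entry point: add to the balance, then return control to the caller."""
    global _balance
    _balance += amount


def withdraw(amount):
    """Entry point: subtract from the balance if funds allow."""
    global _balance
    if amount > _balance:
        return False  # the caller decides how to handle the refusal
    _balance -= amount
    return True


def get_balance():
    """Entry point: hand back the current value of the shared data."""
    return _balance


if __name__ == "__main__":
    # The "main routine": call one entry point, resume, call it again with
    # a different argument, then call other entry points.
    deposit(100)
    deposit(50)
    withdraw(30)
    print(get_balance())  # prints 120
```

Every function reads and updates the same `_balance`, which mirrors the way all entry points in a subprogram share that subprogram's variables.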
The state of a procedural "object," like its object-oriented counterpart, depends upon the sequence of entry points which have been called prior to a given entry point invocation. State is normally retained across entry point calls in COBOL because the data within a COBOL subprogram is static (a subprogram that is canceled or declared INITIAL starts over with fresh data). State may be retained across entry point calls in PL/I if the variables in the procedure are declared to be STATIC. If the PL/I variables are AUTOMATIC, the memory associated with the attributes (variables) is freed when the PL/I procedural object executes a return. Hence, if it is desired to retain state in a procedural PL/I object, declare the global class variables as STATIC. Doing so lets a PL/I "object" hold its state between calls much as an object in C++, Java or Smalltalk does.
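The difference between retained and discarded state can be sketched as follows. This is an analogy under stated assumptions, not PL/I: a module-level Python variable stands in for STATIC storage and a local variable stands in for AUTOMATIC storage. The function names are hypothetical.

```python
# Contrast between state that survives calls and state that does not.
# Hypothetical names; Python stands in for PL/I only as an analogy.

_static_count = 0  # like a STATIC variable: allocated once, kept between calls


def next_id_static():
    """Increment storage that outlives the call (the STATIC case)."""
    global _static_count
    _static_count += 1
    return _static_count


def next_id_automatic():
    """Increment storage created fresh on every call (the AUTOMATIC case)."""
    automatic_count = 0  # re-created on entry, discarded on return
    automatic_count += 1
    return automatic_count


if __name__ == "__main__":
    print([next_id_static() for _ in range(3)])     # prints [1, 2, 3]
    print([next_id_automatic() for _ in range(3)])  # prints [1, 1, 1]
```

Only the first function remembers what earlier calls did, which is the property a procedural "object" needs in order to carry state.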
The sample code below illustrates the architecture of a procedural "object."
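TABLE1: PROC;
  /* Attributes of the "object": a structure mirroring the table's columns. */
  DCL 1 TABLE1,
        5 FIELD1 CHAR(5),
        5 FIELD2 CHAR(10),
        /* . . . etc. */

ADD_ROW: ENTRY(TABLE1);
  /* Insert one row built from the structure passed by the caller. */
  INSERT INTO TABLE1 (TABLE1 VARS)
    VALUES (:TABLE1);
  RETURN;

UPDATE_ROW: ENTRY(TABLE1);
  /* Update the row whose key matches the structure's key. */
  UPDATE TABLE1
    SET TABLE1 VARS = :TABLE1
    WHERE KEY = :TABLE1.KEY;
  RETURN;

DELETE_ROW: ENTRY(TABLE1);
  DELETE FROM TABLE1 WHERE KEY = :TABLE1.KEY;
  RETURN;

SELECT_ROWS: ENTRY(TABLE1_LIST, NROWS);
  /* Count the matching rows first so the result list can be sized. */
  SELECT COUNT(*)
    INTO :NROWS
    FROM TABLE1
    WHERE KEY = :TABLE1.KEY;
  ALLOCATE TABLE1_LIST(NROWS);
  /* CURSOR1 is assumed to be declared over the same selection. */
  OPEN CURSOR1;
  DO I = 1 TO NROWS;
    FETCH CURSOR1 INTO :TABLE1_LIST(I);
  END;
  CLOSE CURSOR1;
  RETURN;

ANY_FUNCTION_ON_TABLE1_DATA: ENTRY(WHATEVER_ARGS);
  PERFORM DESIRED FUNCTION ON TABLE1 DATA;
  RETURN;
END TABLE1;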
The pseudo-code in this example presents an object which stores all fields from a relational database table. The database table (and the object) contain character fields FIELD1, FIELD2, etc. The functions or methods on this procedural object are ADD_ROW, UPDATE_ROW, DELETE_ROW, SELECT_ROWS, and ANY_FUNCTION_ON_TABLE1_DATA. Each function begins with an ENTRY declaration and ends with a RETURN statement. The attributes or variables within the "object" are the variables declared within the TABLE1 structure appearing at the top of the procedure.
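For readers who want something they can run, the same shape can be approximated in Python with the standard `sqlite3` module. This is a sketch under assumptions rather than a translation: the in-memory database, the lowercase table name `table1`, and the key column `row_key` are all invented for the example, since the pseudo-code leaves the key and the remaining columns unspecified.

```python
import sqlite3

# The connection and table layout are private to this module; callers use
# only the functions below. The column names (row_key, field1, field2) and
# the in-memory database are assumptions made for this sketch.
_conn = sqlite3.connect(":memory:")
_conn.execute(
    "CREATE TABLE table1 (row_key TEXT PRIMARY KEY, field1 TEXT, field2 TEXT)"
)


def add_row(row_key, field1, field2):
    """Counterpart of ADD_ROW: insert one row."""
    _conn.execute(
        "INSERT INTO table1 (row_key, field1, field2) VALUES (?, ?, ?)",
        (row_key, field1, field2),
    )
    _conn.commit()


def update_row(row_key, field1, field2):
    """Counterpart of UPDATE_ROW: rewrite the row with the given key."""
    _conn.execute(
        "UPDATE table1 SET field1 = ?, field2 = ? WHERE row_key = ?",
        (field1, field2, row_key),
    )
    _conn.commit()


def delete_row(row_key):
    """Counterpart of DELETE_ROW: remove the row with the given key."""
    _conn.execute("DELETE FROM table1 WHERE row_key = ?", (row_key,))
    _conn.commit()


def select_rows(row_key):
    """Counterpart of SELECT_ROWS: return the matching rows as a list."""
    cursor = _conn.execute(
        "SELECT row_key, field1, field2 FROM table1 WHERE row_key = ?",
        (row_key,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    add_row("K1", "AAAAA", "BBBBBBBBBB")
    update_row("K1", "CCCCC", "DDDDDDDDDD")
    print(select_rows("K1"))
    delete_row("K1")
    print(select_rows("K1"))
```

No SQL appears outside these four functions, so a change to the table's layout is confined to this one module and to the callers whose arguments change.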
Within a large-scale application system, a procedural "object" can be created for each relational database table employed by the application. In this way, procedural maintenance programmers can gain the big advantage of object-oriented data encapsulation. The data, stored in procedural "objects," is not spread "all over the system, across millions of lines of code."
Changes related to a data item on a procedural "object" are local to that object. A maintenance programmer who wants to know where to modify the activity or definition of a data item finds the procedural "object" containing the data item and makes the change there. The only other places within the system that might need to change are the application functions that retrieve the data item through one of the object's entry points. If a data item's field width changes, for instance, then routines using that data item will also have to be modified to redefine the field width. But the change starts in one place, within the procedural object. After the changes are made, the system is searched for function calls on that procedural "object."
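The field-width case can be sketched in the same hedged way. In this Python analogy the width of FIELD1 is defined once, inside the module that owns the data, and callers ask for it rather than repeating the number. The names `FIELD1_WIDTH`, `set_field1`, `get_field1`, and `field1_width` are hypothetical.

```python
# Hypothetical sketch: the width of FIELD1 is defined once, inside the module
# that owns the data. Python stands in for COBOL or PL/I only as an analogy.

FIELD1_WIDTH = 5  # mirrors FIELD1 CHAR(5); a width change starts here

_field1 = " " * FIELD1_WIDTH  # the encapsulated data item, blank-padded


def set_field1(value):
    """Store a value, enforcing the single width definition."""
    global _field1
    if len(value) > FIELD1_WIDTH:
        raise ValueError(f"FIELD1 holds at most {FIELD1_WIDTH} characters")
    _field1 = value.ljust(FIELD1_WIDTH)


def get_field1():
    """Return the stored value; callers never see the storage itself."""
    return _field1


def field1_width():
    """Let callers ask for the width instead of hard-coding it."""
    return FIELD1_WIDTH


if __name__ == "__main__":
    set_field1("AB")
    print(repr(get_field1()))
    print(field1_width())
```

Callers that size their own buffers from `field1_width()` pick up a new width automatically; callers that hard-code the old width are the ones found by searching for calls on the "object."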
Maintenance of large-scale application systems consisting of procedural "objects" related to relational database tables should be less costly due to data encapsulation. Object-oriented developers have been making that claim for years. And rightly so. Within the object-oriented paradigm, data encapsulation makes it easier to find and repair data items and related functionality. The use of multiple entry points within a procedural routine produces the same basic advantage.
Conclusion
The use of multiple entry points in a procedural routine can encapsulate application data. The benefits of data encapsulation, as found in object-oriented applications, can be realized in the procedural world also. Having maintained legacy procedural systems for years, I would find it a pleasure to simply pull up the "object" storing a variable which needs redefinition or extended functionality, make changes in one location, and change the application components which refer to the "object" to produce the desired new functionality.
Before developing your next procedural COBOL or PL/I application system, consider the use of multiple entry points in subprograms or external procedures to realize the benefits of data encapsulation and reuse within the procedural programming paradigm. Applying this idea should make the application system easier to maintain in the future.
About the Author: Richard Brodine has been a teacher, writer and software developer on mainframes and other platforms for 22 years. He holds a Master of Science in Computer Science, a Master of Science in Operations Research, and a Master's in Energy Resources. He can be reached at [email protected].