A Year 2000 Solution on a Legacy Database

The author gives a first-hand account of a practical solution to his organization's Y2K problem, based on the Unisys COBOL PIC 99 COMP field.

I am a COBOL programmer, on a UNISYS 2200 mainframe, running DMS. I am responsible for the County Welfare Database, known as the Welfare Management Base (WMB). Our heritage database schema is a hierarchical tree-structure arrangement (i.e., there are "owner" records, with their groups of "member" records stacked up underneath them, comprising a "set"). All members of a given set have identical record format, or "record type." This set structure is a bi-directional linked-list arrangement: a given record points to both the next and the prior member of the set. On our database, most of the 2000-sensitive data is found on sets of records that are in order by date (i.e., a date-field within the record establishes the order of the set). In these date-sensitive sets, the schema definition determines the sort order and the key, much as an active index file determines the order on a relational or xbase file, or in an isam file. In the example pictured below, successive changes in the name of a welfare case are recorded in a series of "name" records arranged in a set, in ascending order by "effective date."

FIG 1.  Record Changes

Note that our legacy COBOL DMS does not provide any of the newer "complex" data types, such as "date." It only provides the traditional simple data types (numeric, character, etc.). In the schema, the date fields, YY, MM and DD are simply defined as binary numbers, termed "computational numerics." Accordingly, date-manipulation and date-validation are not features of the system and must be carried out by application programs and subroutines: the DMS does not know that they are dates. Incidentally, all of the year fields in these sets are two-digit.

In late 1996, we were looking at ways to achieve millennium compliance on the mainframe. Unfortunately, 2000 conversion was on a collision course with a number of other high-priority items.

From my vantage point, as caretaker of a million-client database competing for limited resources with other bases, it was clear that we were not going to have time to revise the database schema, expanding the year fields. A WMB schema change for the purposes of Millennium conversion would be a major project management dilemma. Because of the extensive date changes needed, all programs accessing the WMB database would have to cut over to the new system at the same moment. That includes all major WMB subsystems (dailies, monthlies, etc.), as well as all ancillary programs on other bases that access the WMB base.

As a result, no WMB-related system could be put into production until all of the others were tested and ready. The following steps would be required:

  1. Develop a new Millennium WMB test schema, and copy the entire production database onto it.
  2. Develop and test all of the WMB systems and ancillary systems.
  3. At the appointed time (before the millennium horizon), cut all systems over at once (known as the "Big Bang" method).

Such a method had been employed in the past, but for a beleaguered unit now grappling with the dual pressures of welfare reform and technological revolution, it was not an option.

Eventually an idea emerged which gave some hope. I had known from previous experience that two-digit computational numerics, while defined as "PIC 9(02) COMP," could actually hold numbers greater than 99. I had exploited this once to define a special character for report-formatting. Such a field can actually hold up to a maximum integer value of 511 (or 111111111 binary). This is an established feature of COMP fields, documented in the UNISYS COBOL specifications.

From there, it was only a short step to our central algorithm: Because the date fields in the aforementioned sets are PIC 99 COMP, the years in the 2000 to 2099 range can be stored as 100 through 199. This is just a continuation of the pre-millennium convention in which we store years in the 1900s as 0 through 99. In other words, the storage value will continue to be 1900 less than the actual year. In fact, we could store up to the year 2411 in this fashion:

YEAR    STORED VALUE
1998     98
1999     99
2000    100
2001    101
2101    201
2201    301
2411    511
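
The convention in the table can be sketched in a few lines. This is only a Python model of the arithmetic, not Unisys COBOL; the names BASE_YEAR, MAX_STORED, year_to_stored and stored_to_year are hypothetical and do not come from the WMB code.

```python
# Model of the storage convention: stored value = actual year - 1900.
# All names here are illustrative assumptions, not WMB identifiers.
BASE_YEAR = 1900
MAX_STORED = 2**9 - 1  # 511, i.e. nine binary ones, the limit quoted above


def year_to_stored(year: int) -> int:
    """Return the value that would be placed in the year field."""
    stored = year - BASE_YEAR
    if not 0 <= stored <= MAX_STORED:
        raise ValueError(f"year {year} is outside 1900..2411")
    return stored


def stored_to_year(stored: int) -> int:
    """Return the four-digit year represented by a stored value."""
    if not 0 <= stored <= MAX_STORED:
        raise ValueError(f"stored value {stored} is outside 0..511")
    return stored + BASE_YEAR


if __name__ == "__main__":
    for year in (1998, 1999, 2000, 2001, 2101, 2201, 2411):
        print(year, year_to_stored(year))
```

Run as a script, the loop prints the same pairs as the table.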

Implementation of this method was strikingly easy, and it carried very little overhead. Only a few simple rules needed to be clarified, based on the behavior of the COMP fields:

Behavior of the Unisys PIC 99 COMP Field

A. A PIC 99 COMP field can store any value between 0 and 511 inclusive. The value will be available in any calculation or numeric comparison legal under COBOL.

Transferring a Value to Other Fields

B. It is important to note that if the value of a PIC 99 COMP field is 100 or more, a direct move of the COMP field into a numeric non-COMP field will NOT transfer the full value. Whatever the size of the destination field, it will receive only the last two digits. This is the only quirk to be aware of when retrieving from these fields. In all other respects the PIC 99 COMP field handles the values 100 through 511 just as a 3-digit COMP field would.

Example: FIELD1 is PIC 99 COMP with a value of 101.

FIELD2 is PIC 9999.

"MOVE FIELD1 to FIELD2" is executed.

FIELD2 then contains the value 1.
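
The quirk in rule B can be modeled as follows. This is a Python sketch of the behavior described in the text, not a reproduction of the Unisys compiler; the function name move_comp99_direct is hypothetical.

```python
def move_comp99_direct(comp_value: int) -> int:
    """Model a direct MOVE of a PIC 99 COMP field to a non-COMP numeric field.

    As described in rule B, the destination receives only the last two
    decimal digits, whatever its own size.
    """
    if not 0 <= comp_value <= 511:
        raise ValueError("a PIC 99 COMP field holds 0..511")
    return comp_value % 100


if __name__ == "__main__":
    print(move_comp99_direct(101))  # FIELD1 = 101 arrives in FIELD2 as 1
    print(move_comp99_direct(99))   # values below 100 are unaffected
```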

Fortunately, there are other ways to transfer the actual value of a COMP 99-year field into a non-COMP field. Here are some alternatives:

1) First, move the COMP 99 field to a COMP 999 field. Then move the COMP 999 field to the numeric field.

Example: FIELD1 is PIC 99 COMP with a value of 101.

FIELD2 is PIC 999 COMP.

FIELD3 is PIC 999.

These lines are executed:

"MOVE FIELD1 to FIELD2.

MOVE FIELD2 to FIELD3."

FIELD3 then contains the value 101.

2) Use any arithmetical operation to transfer the data.

Example: FIELD1 is PIC 99 COMP with a value of 102.

FIELD2 is PIC 999.

After executing the line

"SUBTRACT 0 from FIELD1 GIVING FIELD2"

FIELD2 will contain 102.
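
The difference between the direct move and the two alternatives can be summarized in a small model. Again this is a Python sketch under stated assumptions, not Unisys COBOL: the function names are hypothetical, and the three-digit destination is assumed to keep the value modulo 1000, which never truncates anything in the 0 to 511 range.

```python
def move_direct(comp99: int) -> int:
    """Direct MOVE to a non-COMP field: only the last two digits arrive."""
    return comp99 % 100


def move_via_comp999(comp99: int) -> int:
    """Alternative 1: MOVE to a PIC 999 COMP field, then to PIC 999."""
    intermediate = comp99          # the three-digit COMP field keeps 0..511 intact
    return intermediate % 1000     # three-digit destination, nothing lost


def subtract_zero_giving(comp99: int) -> int:
    """Alternative 2: SUBTRACT 0 FROM the field GIVING a PIC 999 field."""
    return (comp99 - 0) % 1000


if __name__ == "__main__":
    for value in (99, 101, 102):
        print(value, move_direct(value),
              move_via_comp999(value), subtract_zero_giving(value))
```

In this model both alternatives return the full stored value, while the direct move drops the hundreds digit once the value reaches 100.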

Applying this method is simply a matter of providing the logic for storage and retrieval on the base under the new rules. The operating tactic is to make existing programs aware that the year fields may be holding values greater than 99.

To get a better idea of the environment we are working in, it is important to mention that there is a great variety of programming styles in the WMB inventory. This is a result of the long life span of the legacy system, the variety of functions to which it has been put, and the great number of people who have worked on it. Seen by the novice perhaps as wild spaghetti-code, the storage process often involves some very intricate logic, expressing sophisticated interaction of date-fields, which served both the "business" process and the strictures of the arcane hierarchical data-system. As such, the uninitiated programmer coming in to "clean up" and redesign whole paragraphs or programs would easily wind up like a bull in a china shop. Other than its 2000 problems, WMB is a relatively stable system. In making changes, a light footprint is to be preferred, and a major rewrite should be undertaken only if it serves the Millennium goal. The most creative solution, moreover, is often the one that involves the least drastic change to the program code. The more fundamentally a program is changed, the more it has to be tested.

With the help of the new algorithm, the net code changes were modest, compared to the nightmares of field-expansion. For one thing, the year comparisons found in the program code could often be left alone if they involved direct comparisons between COMP fields, or between like groups of COMP fields. "IF AUTH-DATE < TRANSACTION-DATE" will continue to work if both fields are group levels composed of exactly three PIC 99 COMP sub-fields.

Note, incidentally, that years in the WMB database can be directly compared with one another. For instance, we know that 105 is six years later than 99. This feature provides "downward compatibility" with the old data on the base.
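
Why such comparisons keep working can be seen in a small model. This is a Python sketch, not COBOL: it assumes that comparing two like groups of three COMP sub-fields behaves like comparing (year, month, day) field by field, and the StoredDate type is hypothetical.

```python
from typing import NamedTuple


class StoredDate(NamedTuple):
    """A date as held on the base: year minus 1900, then month and day."""
    yy: int
    mm: int
    dd: int


# A pre-millennium date and a post-millennium date under the new convention.
auth_date = StoredDate(99, 12, 31)        # 31 December 1999
transaction_date = StoredDate(100, 1, 1)  # 1 January 2000

# Tuples compare field by field, so the 1999 date sorts first.
print(auth_date < transaction_date)       # True

# If 2000 were stored as 0 instead, the same test would give the wrong order.
wrapped = StoredDate(0, 1, 1)
print(auth_date < wrapped)                # False

# Stored years can also be subtracted directly.
print(105 - 99)                           # 6
```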

In instances where extensive adjustments had to be made, they were easy to isolate and centralize, even when coding style was less-than-ideal.

Data Retrieval

Often, retrieval of dates from the WMB database requires no more than a direct move to a display field on an end-use screen or report. In such cases, when there is no off-database date-sorting or date-calculation involved, the program did not need to be changed at all. For example, 100, moved to a display field, becomes '00,' which is perfectly acceptable because it represents 2000. 101 shows up as '01,' etc. There were many programs like this on WMB, because their screen displays relied directly on the date-sorted order of the sets.
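
The display case can be modeled in one line. This is a Python sketch of the behavior described above, with a hypothetical function name; it assumes a two-digit display field that shows the last two digits of the stored year.

```python
def display_year(stored: int) -> str:
    """Model a direct move of the stored year to a two-digit display field."""
    return f"{stored % 100:02d}"


if __name__ == "__main__":
    for stored in (98, 99, 100, 101):
        print(stored, display_year(stored))
```

Here the truncation works in the program's favor: the stored values 100 and 101 appear on the screen as 00 and 01 with no code change.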

Conclusion

It has been my intention in this article to describe a practical application of a 2000 solution. There are plenty of theoretical solutions, and that is good. But there is a difference between theory and practice. As a part-time violinist, I know that one could master the theory of violin-playing without ever being able to play a note. Similarly, the most elegant 2000 theory can collapse under real-world conditions.

As we move closer to the judgement threshold, it has become clear that the method outlined above has served its purpose very well. As of this date, the main update process has already been converted and is in production. Other analysts having similar databases have chosen to adopt the same strategy.

To summarize, the following advantages can be seen in the implementation of this method:

  1. There is no schema change on WMB for 2000 conversion purposes. Test and production database schemas are the same, simplifying testing. Conversion proceeds in phases, rather than in one massive cutover.
  2. Pre- and post-conversion data are compatible. No unloads, reloads, or data-conversion of WMB database data are necessary. Converted programs are able to read old data.
  3. Code-conversion involving storage of date-keyed records is simpler.
  4. The strategy is easier to comprehend than competing strategies.

It is perhaps ironic (and a testimony to frugality) that the COMP field which was introduced on the base to save space actually holds more data than the plain non-COMP numeric field.

The sorted date sets on the WMB base which are so convenient for date retrieval are the same sets which would be part of a major nightmare if we had had to contemplate changing the WMB schema for millennium conversion.

To be sure, we had a lot of other 2000 issues to unravel, including problems with sequential file formats, and unresolved questions about outside data-sources. That is why it is indeed fortunate that we had an alternative in the COMP storage capabilities of the existing fields.


About the Author:

John Bartley is a programmer and analyst at Erie County, New York, Department of Social Services. He can be reached via e-mail at bz910@freenet.buffalo.edu.