TV Guide Dials In Efficiency With Automated Error Detection
Anyone who manually monitors standard lists ($STDLIST) on HP 3000s to check for errors and abort messages knows how time-consuming it is. Every time a program is run, a $STDLIST is generated that provides a complete history of that job: how long it ran, the user who ran it, what accounts it ran on, various statistics involved in the process, and most importantly, error or job-abort messages. In a manual operation, each $STDLIST is printed out and the operators go through them line-by-line to locate the various warning, error and abort notations that signify a problem. When as many as 1,000 jobs per day are run, manual reviews and handling can become a full-time activity for several staff members.
Further, because batch processing is normally conducted after close of business, considerable night-shift resources must be invested in manually monitoring $STDLISTs. Typically, the machines have a multitude of vital off-hours transactions occurring, including daily, weekly and monthly reports. It's up to the night-shift operators to make sure these jobs run as planned, reacting to problems and working with on-call resources to resolve them before business resumes in the morning.
During the past ten years, TV Guide, Inc. (Tulsa, Okla.), formerly United Video Satellite Group, has grown from more than 200 to over 1,500 people, placing tremendous demands on the enterprise data center. In the early days, when we only required a few systems and had a low user count, the data center was run by six people, who were dedicated to reviewing job streams and handling error and abort messages manually. To assist with the time-consuming task, we made use of a contributed library utility known as SLEEPER (contributed library programs are written by the HP user community and supplied via tapes to members of the Interex user group). SLEEPER could handle fairly simple scheduling routines, but it had no ability to store and recall job listings. Consequently, the operators devoted many long hours to keeping the job stream flowing.
With the rapid expansion of the company, the data center has undergone significant change. Today, 11 HP 3000s operate 24x7. We handle a wide range of jobs, such as call center, back-office, information services, satellite programming, sales reports, customer statistics, corporate sales figures, month-end reporting, cash receipts and weekly reporting. The environment is primarily online and most batch jobs are scheduled for evening processing. Two 969/320s and one 959/300 support our call center back-office system; a 987 and a 978 support our communications link to satellite up-link facilities; and six other 9x7s and 9x8s support a variety of other functions, including development efforts. These systems are continuously monitored from a centralized data center by shifts of two operators.
Yet, despite the wide array of tasks being performed across the enterprise and the 750% expansion in the number of TV Guide users, the entire data center staff consists of only eight people split into three shifts (an increase of roughly 30% over the last decade). With so few personnel available, how is it possible to keep up with day-to-day demands?
Several years ago we introduced a software program known as JobRescue from Nobix, Inc. (Pleasanton, Calif.) to automate the error and abort message reporting process. JobRescue is an HP 3000 job management utility that automatically detects errors, exceptions and abort messages. It eliminates the need for manual review of $STDLISTs, significantly streamlining batch processing operations.
JobRescue is pre-configured to trap job messages and, once running, acts as an unattended batch job, tracking job performance statistics for easy analysis to help data center operations flow more efficiently. As jobs log off, the program automatically examines each $STDLIST, and it can process multiple $STDLISTs simultaneously. Additionally, it automatically compresses $STDLISTs, saves them to disk and makes them available online, eliminating fear of lost or misplaced information.
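For readers who want a concrete picture of this kind of automated review, the following is a minimal sketch in Python. It is not JobRescue's implementation and does not run on MPE. The message patterns, the function names `scan_listing` and `archive_listing`, and the sample listing are all invented for illustration. It shows only the general idea described above: examine a finished job's listing for warning, error and abort lines, then compress the listing for online storage.

```python
import gzip
import re

# Hypothetical patterns: a real site would tune these to the messages
# its own jobs produce. They are NOT taken from JobRescue.
PROBLEM_PATTERN = re.compile(r"\b(WARNING|ERROR|ABORT(?:ED)?)\b", re.IGNORECASE)


def scan_listing(text):
    """Return (line_number, line) pairs for lines that look like problems."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if PROBLEM_PATTERN.search(line)
    ]


def archive_listing(text):
    """Compress a listing so it can be kept on disk instead of on paper."""
    return gzip.compress(text.encode("utf-8"))


# Invented sample listing, for illustration only.
SAMPLE = """\
:JOB NIGHTLY,MGR.REPORTS
:RUN MONTHEND
Records processed: 1200
PROGRAM ABORTED IN SEGMENT 3
:EOJ
"""

if __name__ == "__main__":
    for number, line in scan_listing(SAMPLE):
        print(f"line {number}: {line}")
    print(f"archived size: {len(archive_listing(SAMPLE))} bytes")
```

In this sketch an operator would see only the flagged line instead of reading the whole listing, while the compressed copy remains available if the full history is needed.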
There are many benefits to such a system beyond the reduction of personnel levels, including lower printing and storage costs, return on investment in two to three months and improved data center efficiency. With hundreds of jobs running daily, printing and archiving costs can mount up. Printing is virtually eliminated as operators no longer have to search hard copy for error messages. And with the records available online, archiving is automatically handled.
Additionally, while data center staff must always react to notification of error messages, through automation they can respond much more quickly than before. This translates into more time being available for upgrading the overall capabilities of the data center. Operators are now able to invest time in productive tasks such as tape management and reporting as well as event logging and notification. For TV Guide, though, the primary benefit of JobRescue is quality control. System management and IS staff are free to enter the system at a high level to quickly locate the reasons why a program failed. This way, we are able to isolate the exact problem in a job stream in a timely manner and are more proactive in accomplishing our IT goals.
The globalization of the business world has made competition fiercer than ever, making every increase in efficiency count. Those who are still printing out $STDLISTs and checking errors and aborts manually are consuming time and money that could be better spent in assuring the future survival of the organization.
-- Phil Anthony is director, System Resources at TV Guide, Inc., a global diversified media and communications company that markets and distributes to over 100 million cable and satellite homes in the United States every week.