AS/400 Cluster: Not Just for Failover Anymore

The first cluster-aware applications for the AS/400--capable of restarting operations within minutes on a hot standby server--are rolling off the assembly line. However, with AS/400 downtime at a record low of a few hours a year, will clustered applications find a market?

Clustering provides benefits beyond recovery from system crashes, including easier system maintenance and better performance, says Wes Lucas, CEO of ALE Systems (Richmond, Va.), the first vertical application vendor to bring cluster-aware AS/400 software to market. "From a backup and recovery standpoint, clustering is a bit of a stretch for only about five hours of potential downtime a year," he admits. However, clustering capabilities enable IT managers to switch systems "for lots of routine types of situations, such as the application of PTFs and normal system maintenance."

ALE Systems' AS/400-based MLPS Loan Origination System and Delinquent Loan Collection Systems have recently been certified by IBM Rochester as "ClusterProven." IBM's ClusterProven program recognizes applications that are configured to be available on a backup server in the event of a failure. The applications take advantage of clustering capabilities in the V4R4 release of the operating system and in disk mirroring middleware. Leading providers of AS/400 mirroring middleware, including Vision Solutions Inc. (Irvine, Calif.), DataMirror Corp. (Toronto) and Lakeview Technology (Oak Brook, Ill.), are underpinning IBM's ClusterProven initiative.

The latest version of Vision Suite (6.3) from Vision Solutions, released in July, includes remote journaling support, as well as a GUI-based console for configuring and managing clustered machines. DataMirror's High Availability Suite--with its integrated cluster resource services APIs--is designed to enable continuous availability in an AS/400 environment, and lets clustered nodes be monitored from a single workstation. Lakeview Technology announced that its MIMIX Availability Management software will also support V4R4's clustering technology.

Even though the mirroring capabilities provided by these vendors have been available for some time, a failure can still require as much as 24 hours of downtime before a system is back up and operational. ClusterProven technology shortens that downtime, in most cases, to less than a minute.
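To put the quoted figures in perspective, the short Python sketch below converts them into yearly availability percentages. It is only illustrative arithmetic: the function name `availability_pct` is hypothetical, and the comparison assumes a single failure per year, which the article does not state.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap year


def availability_pct(downtime_hours: float) -> float:
    """Return yearly availability, in percent, for a given amount of downtime."""
    return 100.0 * (1.0 - downtime_hours / HOURS_PER_YEAR)


# Figures quoted in the article; "one failure per year" is an assumption.
typical = availability_pct(5)             # about five hours of downtime a year
long_recovery = availability_pct(24)      # one failure with a 24-hour recovery
fast_failover = availability_pct(1 / 60)  # one failure with a one-minute failover

for label, value in [
    ("5 hours/year", typical),
    ("24-hour recovery", long_recovery),
    ("1-minute failover", fast_failover),
]:
    print(f"{label}: {value:.4f}% available")
```

Under these assumptions, a single 24-hour recovery costs more availability than the roughly five hours of downtime a typical AS/400 sees in a year, while a one-minute failover is barely measurable.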

Lucas reports that ALE Systems worked with IBM Rochester for about six months to bring the technology to market. The first release is tweaked to run with Vision Suite. He adds, however, that ALE is also developing APIs to link into DataMirror and Lakeview MIMIX.

One client, a large financial services organization, is running ClusterProven ALE software to link AS/400 servers in Jacksonville, Fla., and Colorado Springs, Colo. The advantage is daily maintenance, Lucas says. "They can apply PTFs without waiting until the weekend. Plus, when the Jacksonville operation powers down at the end of the day, work continues in another time zone. Changed data is moved between the servers through remote journaling."

Workload balancing for high-performance computing is another benefit of clustering technology. Another ALE Systems client has 1,000 users processing 600 loans an hour on a three-way cluster of AS/400 650 machines. "We're recommending they put some part of those users on a West Coast machine, and others on an East Coast machine," Lucas says. "That divides the workload on an alternative processor, instead of just having a backup machine sitting there."

As these distances show, geography is not a barrier to clustering technologies, says Lucas. Rather, "Bandwidth and line speed is an issue. The heartbeat between the systems has to be very fast to detect a failure." The IBM High Availability APIs include a heartbeat function that continually assesses the state of the primary machine. If the primary doesn't respond for a specified number of milliseconds, the IBM APIs automatically fail over all the users to a second machine.
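The heartbeat-and-timeout logic described above can be sketched in a few lines of Python. This is a generic illustration of the technique, not IBM's High Availability API: the class `HeartbeatMonitor`, its methods, the node names and the 500-millisecond timeout are all hypothetical, and the clock is injectable so the behavior can be demonstrated without real waiting.

```python
import time
from typing import Callable


class HeartbeatMonitor:
    """Tracks the last heartbeat from a primary node and decides when to fail over.

    Hypothetical sketch; it does not reproduce any IBM API.
    """

    def __init__(self, timeout_ms: float,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_ms / 1000.0  # timeout is given in milliseconds
        self.clock = clock                    # returns the current time in seconds
        self.last_beat = clock()              # treat construction as the first beat
        self.active_node = "primary"

    def record_heartbeat(self) -> None:
        """Call whenever the primary responds."""
        self.last_beat = self.clock()

    def check(self) -> str:
        """Return the node that should serve users, failing over on a timeout."""
        silent_for = self.clock() - self.last_beat
        if self.active_node == "primary" and silent_for > self.timeout_s:
            self.active_node = "backup"       # primary stayed silent too long
        return self.active_node


# Demonstration with a fake clock, so no real time has to pass.
fake_now = [0.0]
monitor = HeartbeatMonitor(timeout_ms=500, clock=lambda: fake_now[0])

fake_now[0] = 0.3
monitor.record_heartbeat()          # primary answered at t = 0.3 s
fake_now[0] = 0.7
state_before = monitor.check()      # 0.4 s of silence, within the timeout
fake_now[0] = 0.9
state_after = monitor.check()       # 0.6 s of silence, beyond the timeout
print(state_before, state_after)
```

The timeout is the parameter Lucas's comment bears on: over a slow or congested line, heartbeats arrive late, so the timeout must either be lengthened, which delays detection, or risk a false failover.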

So far, bandwidth has not been a constraint, he adds. "A pair of T1 lines in a high transaction processing environment seems to work pretty well. We don't think we need ATM, even though ATM would sometimes be indicated."