Riding the RAIL: Public Utility Depends on RAIL Technology

An enterprisewide implementation of RAIL technology may seem like an enormous undertaking. However, for Ameren, the risks are too great to leave data protection to a system with lesser capabilities.

When your customers depend on you 24x7, you leave nothing to chance. Nowhere is this more apparent than in the public energy industry, where enormous effort goes into system and data availability. There’s nothing like the threat of millions of business and residential users thrown into sudden, total darkness to cause you to reevaluate your data protection plans.

Though the above is an overly dramatic scenario, Ameren (formerly Union Electric), one of the country’s largest public energy utilities, understands its responsibility to its 1 million-plus customers in Missouri, Illinois and Iowa. As with any utility company, Ameren is critically concerned with keeping operations running on a 24x7 basis.

Ameren wisely operates with a corporate policy that requires a duplicate copy of each backup tape, and keeps one copy offsite in case of a catastrophic data loss. But, too often in high-intensity business environments, a company that cannot afford data loss also cannot afford the time to adequately protect itself without compromising its 24x7 operation.

Ameren’s St. Louis headquarters includes a computer installation of 1,500 nodes with 120 heterogeneous file servers running NetWare, Windows NT, Sun Solaris and HP-UX. Customized applications developed for tasks such as time reporting, customer service and outage management generate a data load of 100 GB that must be backed up nightly, plus a weekly full system backup of 400 GB to 500 GB. All departmental data is backed up to two centralized tape servers, with each server handling backups for 60 file servers.

Each backup server connects to two Storage Technology 9714 tape libraries that contain 100 tapes and are equipped with four or six DLT7000 tape drives. Legato NetWorker software automates the backups.

Ameren’s IT staff satisfied the corporate policy of duplicate backup tapes by using the tape cloning feature within NetWorker to make offline tape-to-tape copies after the backup process was complete. While NetWorker’s tape cloning feature created sets of backup tapes for safekeeping, it was not the best approach for the size and scope of Ameren’s backup policy, according to Vance Bufalo, Senior Engineer in Networking Engineering at Ameren.

"My servers are doing so much work every night and the off-site tape had to be ready for pickup by 11 a.m. I did not have time to do the backup at night and do the tape-to-tape copy in the morning before the vendor showed up to pick up the tapes," Bufalo says.

"The other problem I ran into is that the two tape systems support about 1,500 users in our main office complex, and we do recovers on a daily basis. If the tape drives are busy doing tape-to-tape copies in the morning, then those drives are not available for tape recovers. I needed a solution where I was getting true tape mirroring – real-time copy, so that while the first tape is being written, the second tape is being created at the same time."

The need for real-time tape copying led Ameren to investigate a technology known as RAIL (Redundant Array of Independent Libraries) or RAIT (Redundant Array of Independent Tape), sometimes described simply, though less accurately, as tape RAID. RAIL architectures consist of tape arrays that advanced controllers optimize for throughput and fault tolerance. The controllers let tape devices behave much as individual disks do in RAID environments, with features such as parallel reads and writes, mirroring and parity. RAIT capabilities, like those of RAID, are often described in levels (0, 1, 2, 3, 4 and 5).
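To make the parity idea concrete, here is a minimal sketch of striping with XOR parity, the general technique behind the parity-based RAID levels. It is an illustration only: the function names are hypothetical, in-memory byte strings stand in for tapes, and nothing here describes how any particular tape array controller implements parity.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe_with_parity(data, n_data_tapes):
    """Split data into n equal stripes (zero-padded) and append an XOR parity stripe."""
    stripe_len = -(-len(data) // n_data_tapes)  # ceiling division
    padded = data.ljust(stripe_len * n_data_tapes, b"\0")
    stripes = [padded[i * stripe_len:(i + 1) * stripe_len]
               for i in range(n_data_tapes)]
    parity = reduce(xor_bytes, stripes)
    return stripes + [parity]


def rebuild_missing(stripes, missing_index):
    """Recover one lost stripe by XOR-ing all of the surviving stripes."""
    survivors = [s for i, s in enumerate(stripes) if i != missing_index]
    return reduce(xor_bytes, survivors)


# Hypothetical usage: three data "tapes" plus one parity "tape".
tapes = stripe_with_parity(b"nightly backup data", 3)
recovered = rebuild_missing(tapes, 1)  # pretend tape 1 was lost
```

Because XOR is its own inverse, any single lost stripe, data or parity, can be rebuilt from the others. Mirroring (level 1) skips this arithmetic and simply keeps a full second copy.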

Orange County, Calif.-based Ultera Systems was selected to provide the backbone for the installation, and Ameren built a solution around Ultera’s ShadowMaster tape array controller technology. ShadowMaster simultaneously writes data to a mirrored set of tape drives, automatically creating a duplicate set of tapes in the same time it takes to do the original backup. This is, in essence, a RAID-1 environment, and on its own it would have been an immediate fix for Ameren’s backup window issue.
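A rough back-of-the-envelope sketch shows why mirroring helps the backup window. The 100 GB nightly load comes from the figures above; the stream count and the per-drive throughput are assumptions chosen for illustration, and the model simply assumes that a tape-to-tape copy takes as long as the original backup.

```python
def stream_hours(data_gb, streams, mb_per_sec):
    """Hours needed to stream data_gb across `streams` parallel tape streams."""
    seconds = data_gb * 1024 / (streams * mb_per_sec)
    return seconds / 3600


def clone_after_backup_hours(data_gb, streams, mb_per_sec):
    # Backup first, then a tape-to-tape copy of the same data.
    # Assumption: the copy runs at the same rate as the backup.
    return 2 * stream_hours(data_gb, streams, mb_per_sec)


def mirrored_hours(data_gb, streams, mb_per_sec):
    # Both copies are written at the same time, so the duplicate adds no time.
    return stream_hours(data_gb, streams, mb_per_sec)


# Assumed values for illustration: 4 parallel streams at 5 MB/s each.
nightly_clone = clone_after_backup_hours(100, 4, 5.0)
nightly_mirror = mirrored_hours(100, 4, 5.0)
print(f"clone after backup: {nightly_clone:.2f} h, mirrored: {nightly_mirror:.2f} h")
```

Whatever the real throughput is, the cloned approach costs roughly twice the drive time of the mirrored one under these assumptions, and the extra time is spent in the morning, when the drives are needed for recoveries.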

However, beyond simple tape drive mirroring, Ameren and Ultera designed a RAIL environment that actually mirrors pairs of STK 9714 libraries, with one library dedicated to storing onsite tapes and the other used exclusively to make tapes for offsite storage. Each pair of mirrored drives is driven by a dedicated ShadowMaster controller, and another ShadowMaster manages the robotic media changers in each library.

Ameren’s new RAIL setup makes it impossible to perform any backup operation without creating the required offsite set of tapes. Read operations take data from tapes in the primary library’s drives, but if a write command is issued, ShadowMaster automatically writes the data to both mirrored drives. If the second drive is not ready, an error message is generated and the backup is aborted. "We never want to have a broken pair on a mirrored tape," Bufalo says.
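The read and write rules described here can be modeled in a few lines. This is a toy sketch, not Ultera’s controller logic: the class and exception names are hypothetical, and in-memory lists stand in for tape media.

```python
class DriveNotReady(Exception):
    """Raised when a mirrored write cannot proceed."""


class TapeDrive:
    """Hypothetical stand-in for a tape drive; keeps written blocks in memory."""

    def __init__(self, name, ready=True):
        self.name = name
        self.ready = ready
        self.blocks = []


class MirroredPair:
    """Toy model of one mirrored pair (primary = onsite, secondary = offsite)."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def write(self, block):
        # Check both drives before touching either one, so a failed write
        # never leaves the pair with mismatched contents.
        for drive in (self.primary, self.secondary):
            if not drive.ready:
                raise DriveNotReady(f"{drive.name} not ready; backup aborted")
        self.primary.blocks.append(block)
        self.secondary.blocks.append(block)

    def read(self, index):
        # Reads are served from the primary (onsite) drive only.
        return self.primary.blocks[index]


# Hypothetical usage
pair = MirroredPair(TapeDrive("onsite"), TapeDrive("offsite"))
pair.write(b"block-0")
```

Checking readiness before writing to either drive is the design choice that matters: the write either lands on both tapes or on neither, which matches the goal of never having a broken pair.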

This solution not only delivered the real-time tape mirroring that Ameren required but also made media management a painless process, says Bufalo. "We have a 30-day tape rotation because that’s our data retention policy, and tapes never leave the onsite library. They’re in there all the time for tape recovers and also for the writes at night, and the offsite tapes are rotated on a daily basis.

"What that gives us is real-time mirroring, which met our backup window requirements; offsite copies which are ready by 11 a.m., and we can still do all of our data recovers from the onsite tape library for the last 30 days without any tape handling by the operators."

The Ameren/Ultera configuration is completely software transparent, allowing the company to use Legato NetWorker for backup with no changes – and no learning curve – required.

Ameren also implemented an Ultera-based library solution on a grander scale with an identical mirrored-library system of Ultera controllers and an HP9000 K-Class Enterprise Server running HP-UX. Two STK 9710 libraries, each accommodating 512 tape slots and 10 DLT7000 drives, are configured as a RAIL environment. "We’re doing the same thing on a much larger scale, only we’re doing it with UNIX and the bigger libraries," Bufalo notes.

The larger solution, built for ever-expanding data requirements, is not yet slated to back up the corporation’s 5,000 total nodes. That project, planned for a future date, will incorporate mirroring controllers from Ultera with large-scale Storage Technology libraries.

Though the enterprisewide implementation of RAIL technology may seem like an enormous undertaking, for a company like Ameren, the risks are too great to leave data protection to a system with lesser capabilities.
