Continental Soars with NAS

In moving from direct-attached to network-attached storage, Continental Airlines has found high availability, a flexible storage infrastructure and backup times that have been reduced from days to hours.

For Continental Airlines' Houston-based Planning, Pricing and Scheduling Division, high availability is vital. Some 12 Oracle databases receive data on schedules, numbers of passengers per flight, total numbers flying per day, as well as pricing information on every flight from every competitor. This information is stored and analyzed to model and generate globally distributed schedules and pricing plans.

"Using a direct-attached storage architecture with a SCSI high-speed interface, we were only able to do a cold backup every Saturday," says Vinod Kaila, Manager of Unix Network Operations at Continental. "By switching to a network-attached storage architecture, we perform a complete backup each night, and it takes us about five minutes per database compared to six hours before."

Simply put, network-attached storage (NAS) is shared storage on a network. In Unix environments like Continental's (running Sun Solaris 8), servers reach that storage over the Network File System (NFS) protocol. NAS devices are typically dedicated, high-performance machines that serve specific storage needs. Continental uses a type of NAS device called a filer, which focuses all of its processing power on file service and storage.

Continental uses two F840 filers from Network Appliance Inc. (NetApp). They provide built-in RAID/4 support; fast online replication, backup, recovery and point-in-time copies; and rapid installation and reboot. The filers scale up to 12TB. LTL tape drives continue to be used for failover purposes.

Data Snapshots
Before implementing NetApp filers, Continental let software changes accumulate and tested them only every few months. Running tests more frequently would have involved too much restore time if modifications proved unacceptable. "For an application on a 150GB domain," Kaila says, "the backup could take six hours, and another six hours for the restore."

By switching to the F840s, the airline can use Network Appliance's proprietary Snapshot technology to take a point-in-time copy of that same domain's data in less than five minutes. If any changes cause difficulties, it takes only a few minutes to restore the previous image.

Continental currently runs two Sun Enterprise 10000 servers linked to two filers through a gigabit Ethernet switch. The eight applications supported in separate domains by this infrastructure include a flight scheduling application, three revenue management applications, an airline profitability modeling application, a cargo revenue management application and a pricing application. Collectively, these applications encompass more than 2TB of data.

"With this architecture, we can shut down any domain and bring up the associated database on another machine," says Kaila. "So we never have to bring an application down if there is a problem with one domain, and we can move domains around to optimize performance."

NAS in a Nutshell

Continental moved from direct-attached storage to NAS for the following reasons:

  • NAS filers can be located anywhere on a network.
  • They relieve more expensive general-purpose servers of many file management operations so they can focus on more CPU-intensive activities.
  • Expanding storage capacity is simple and non-intrusive.
  • Filer backup can be completed without affecting the performance of general-purpose or application servers.

The airline takes seven daily snapshots of each database volume. It copies the last of these to tape during the night. This provides the equivalent of an incremental database backup without ever having to shut down the database.

On Saturdays, the company performs a complete backup of the most recent snapshot. One database snapshot takes five minutes to complete, compared to six hours for a cold backup. Under the old weekly cold-backup scheme, if data loss occurred on a Friday, Continental had to use the previous week's backup along with archive logs to restore the data. According to Kaila, the company dumps as much as 50MB into archive logs per database every 15 minutes. He estimates that such a restore required 50 hours or more, compared to a couple of hours using a snapshot.
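To get a feel for why log-based recovery was slow, the archive log figure above can be turned into a rough volume estimate. The sketch below is illustrative only. It assumes the quoted peak rate of 50MB every 15 minutes holds around the clock, and that a Friday failure falls about six days after the Saturday backup. Neither assumption comes from Kaila, and the function name is invented for this example.

```python
# Figures quoted in the article (upper bound: "as much as 50MB").
LOG_MB_PER_INTERVAL = 50
INTERVAL_MINUTES = 15


def archive_log_mb(hours: float) -> float:
    """Upper-bound archive log volume in MB accumulated over `hours`."""
    intervals = hours * 60 / INTERVAL_MINUTES
    return intervals * LOG_MB_PER_INTERVAL


# Assumption: logs accumulate at the peak rate for the whole period.
per_day_mb = archive_log_mb(24)
# Assumption: Saturday backup to Friday failure is roughly six days.
six_days_mb = archive_log_mb(6 * 24)

print(f"Per database, per day:  {per_day_mb / 1000:.1f} GB")
print(f"Per database, six days: {six_days_mb / 1000:.1f} GB")
```

Under these assumptions, a late-week failure would mean replaying tens of gigabytes of logs per database on top of the six-hour restore of the cold backup, which is consistent in spirit with the long recovery times Kaila describes.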

"We now store the PC files on the F840 too and take a snapshot of home directories every hour," says Kaila. "If a document gets lost in the morning, the user logs in to the snapshot area, finds the file and drags it over."

Gradual Installation
While maintaining the earlier direct-attached storage system, Continental brought in the filers, plugged them in and created the volumes for each database in about one hour. The next stage, copying the databases from direct-attached storage to NAS, took considerably longer. Kaila cites data transfer rates of only 10GB per hour as the reason he didn't switch from direct-attached to NAS in one session. Within the F840s themselves, however, volume-to-volume copying took place at 50GB per hour. "We moved a database or two each night, and a few on the weekend," says Kaila. "One week later, we turned off the old system."
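The effect of those two transfer rates is easy to work out. As a rough sketch, the snippet below applies them to the 150GB domain Kaila mentions earlier. Pairing that domain size with the migration is an assumption made for illustration, since the article does not give the size of each database at the time of the move.

```python
def transfer_hours(size_gb: float, rate_gb_per_hour: float) -> float:
    """Time in hours to copy `size_gb` at a sustained `rate_gb_per_hour`."""
    return size_gb / rate_gb_per_hour


DOMAIN_GB = 150          # example domain size quoted earlier in the article
DAS_TO_NAS_RATE = 10     # GB per hour, direct-attached storage to filer
FILER_INTERNAL_RATE = 50 # GB per hour, volume to volume within a filer

das_to_nas_hours = transfer_hours(DOMAIN_GB, DAS_TO_NAS_RATE)
filer_internal_hours = transfer_hours(DOMAIN_GB, FILER_INTERNAL_RATE)

print(f"Direct-attached to NAS: {das_to_nas_hours:.0f} hours")
print(f"Within the filer:       {filer_internal_hours:.0f} hours")
```

At 10GB per hour, a single domain of that size would occupy most of a night, which fits the pace of a database or two per night that Kaila describes.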

Once the filers were up and running, however, Continental's network struggled with high traffic volumes and greater storage demands. The original 100Mbps link between the Sun Solaris system and the NetApp filers wasn't enough. The airline set up a private Gigabit Ethernet subnet for storage. Each database has a dedicated 1Gbps path to the filer.

Kaila reports a total of two disk failures since the system was brought online a few years ago. Because of the built-in RAID/4 and hot-pluggable capabilities, neither failure caused any downtime or loss of data. Continental recently added a NetApp F880 filer, raising its capacity to 10TB. According to Kaila's numbers, it cost the airline $300,000 for 10TB of storage, and each additional TB is $25,000 to $28,000. In comparison, he received quotes as high as $2 million from direct-attached storage vendors for 10TB.
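Kaila's figures reduce to a simple cost-per-terabyte comparison. The snippet below is only arithmetic on the numbers quoted above. It uses the highest direct-attached quote he mentions, so the gap it shows is an upper bound rather than a typical case.

```python
# Figures quoted by Kaila in the article.
NAS_COST_USD = 300_000       # price paid for 10TB of NAS
DAS_QUOTE_USD = 2_000_000    # highest direct-attached quote for 10TB
CAPACITY_TB = 10
EXTRA_TB_RANGE_USD = (25_000, 28_000)  # each additional NAS terabyte

nas_per_tb = NAS_COST_USD / CAPACITY_TB
das_per_tb = DAS_QUOTE_USD / CAPACITY_TB
cost_ratio = DAS_QUOTE_USD / NAS_COST_USD

print(f"NAS:             ${nas_per_tb:,.0f} per TB")
print(f"Direct-attached: ${das_per_tb:,.0f} per TB (highest quote)")
print(f"Highest quote is {cost_ratio:.1f}x the NAS price")
print(f"Additional NAS TB: ${EXTRA_TB_RANGE_USD[0]:,} to ${EXTRA_TB_RANGE_USD[1]:,}")
```

On these numbers, the incremental price per terabyte is slightly below the average price of the initial 10TB.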