Medical Backup: Technology Initiatives at the NIH and Johns Hopkins University Demand Cross-Platform, Distributed Backup Solutions

The National Institutes of Health and Johns Hopkins University search for fast, distributed and flexible backup solutions - as well as automated tape libraries - that can function across many operating systems.

To perform their mission effectively, great research institutions must provide their faculties with the best computer technology. It's not surprising then that two of the most prestigious research institutions in America today, the National Institutes of Health (NIH) and Johns Hopkins University, have large-scale initiatives underway to bring the finest and most cost-effective technology to their respective research communities. The NIH has begun a major project to store medical images and make them easily accessible to researchers and physicians treating patients through the NIH's Clinical Center. Meanwhile, a team of network specialists at Johns Hopkins' Milton S. Eisenhower Library is re-organizing and updating the university's research technology to make it state-of-the-art for the new millennium. In both cases, backup and automated tape libraries are important components in the overall strategy.

NIH Uncovers New Knowledge

The goal of the NIH, which is part of the U.S. Department of Health and Human Services, is "to uncover new knowledge that will lead to better health for everyone." Located on a large campus of more than 300 acres in Bethesda, Md., the NIH has recently initiated an archive and retrieval system for medical images as part of the Multi-Modality Radiological Image Processing System (MRIPS) sponsored by the Laboratory of Diagnostic Radiological Research (LDRR). The project is called MARS (the MRIPS Archival and Retrieval System), and several aspects of the project, including the network backup, are managed by Shalom Nevet.

MARS is designed to be both easy to use and as reliable as possible. "We are constantly looking for ways to make MARS fail-safe," explains Nevet, "because the information we are storing is so critical." The MARS project receives copies of all of the CT, MRI, PET, and other scans of patients at the NIH, and losing any of them would not only delay diagnosis and treatment but also force seriously ill patients to repeat expensive and sometimes uncomfortable procedures. Archiving scans correctly is also important because some patients are treated at the NIH for many years. MARS must be ready to retrieve a series of scans from a particular patient so that physicians and researchers can use the scans to study a whole course of treatment.

Many complex security measures are built into the MARS system to make it reliable. Whenever a medical scan is performed at the NIH, a copy of the scan is sent directly to a buffer on a MARS server running a version of UNIX. The MARS project has several powerful servers running Sun Solaris, SunOS, HP-UX, SGI IRIX, Digital UNIX, and Linux. After it is logged in, the image is sent from the buffer to a RAID array where it may be stored for up to a full day, depending on how close it arrives to the time of the daily backup. Each day's images are held in a separate directory so that they can be easily identified, and currently, about eight gigabytes of new data is accumulated daily.

Daily Backup Is Linchpin

The daily backup is the linchpin of the system because it allows the daily set of images to move from the temporary storage of the RAID array to the permanent archival system, which consists of an HSM package and an automated tape library. Once the backup job has run successfully, the image is written to the read-only tape library, which has a current capacity of a terabyte of data. When the archival copy is complete, the image is finally deleted from the RAID array. Researchers can then retrieve images from the automated library for study and manipulation on their own workstations. And if anything goes wrong, MARS has a backup tape in storage.
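The ordering described above (back up first, then archive, then delete from the RAID array) can be sketched in a few lines of Python. This is only an illustration of the sequencing rule, not the actual MARS or backup-product code; the ImageRecord class and every function name here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ImageRecord:
    """Tracks where copies of one scan currently exist (hypothetical model)."""
    name: str
    on_raid: bool = True          # temporary copy on the RAID array
    on_backup_tape: bool = False  # copy written by the daily backup
    in_archive: bool = False      # copy in the archival tape library


def run_backup(image: ImageRecord) -> None:
    """Step 1: the daily backup writes the image to a backup tape."""
    image.on_backup_tape = True


def archive(image: ImageRecord) -> None:
    """Step 2: archive the image, but only after a successful backup."""
    if not image.on_backup_tape:
        raise RuntimeError(f"{image.name}: backup must succeed before archiving")
    image.in_archive = True


def release_from_raid(image: ImageRecord) -> None:
    """Step 3: free RAID space only when both tape copies exist."""
    if not (image.on_backup_tape and image.in_archive):
        raise RuntimeError(f"{image.name}: not yet safe to delete from RAID")
    image.on_raid = False


def copy_count(image: ImageRecord) -> int:
    """Number of locations that currently hold the image."""
    return sum([image.on_raid, image.on_backup_tape, image.in_archive])
```

The guards in archive and release_from_raid encode the point that a failed backup blocks everything downstream: the image stays on the RAID array until both tape copies exist.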

"As you can see," observes Nevet, "our key strategy is redundancy." At some point every day, MARS has three copies of each image safely stored on its system: one in the temporary storage on the RAID array, one on a backup tape, and one in the archival tape library. Since such a large amount of data moves through the system daily, the backup product has to be reliable. If a backup job should fail, or worse, the backup system become disabled, the RAID array would fill up with images very quickly. In addition, the new scans would not be available from the archive. While the system is under development, the scanners themselves are temporarily storing copies of the images. However, once MARS is thoroughly tested, the archival tape library will be the only source of images other than backup tapes, which, of course, would only be used in emergencies.

Must Process Seamlessly Across Platforms

How did Nevet choose a backup for such a critical operation? His first requirement was that the backup product must work seamlessly across platforms. It had to work equally well with the many different variations of UNIX on Sun, HP, and SGI, and with Windows NT. Although Nevet's main focus is the backups for the archive and retrieval system, he also has to be able to back up researchers' workstations, his own group's developmental scripting and "paper work," and a variety of related logs and reports. Increasingly, this data is held on a wide variety of machines, some running a variation of UNIX and others running Windows NT. "Our project keeps expanding," explains Nevet, "and we cannot predict what we will have to back up next month or next year. For example, we recently began to provide Web access to our images, and we have to back up those servers too. We cannot limit ourselves with a backup that doesn't work well on both UNIX and NT. We had to have a cross-platform backup. There was no question about it."

Distributed Devices Required

Another feature that security-conscious Nevet was looking for in his backup was the ability to distribute devices. "Ideally, I wanted the master server to be in one building and the backups themselves to be in a different building across campus. If I have a failure of any kind, I want to know that another copy of an image is available on a different media in another building. This is critical because I am responsible for actual patient data."

Nevet eventually chose a backup product that satisfied all his requirements, and had two other very attractive features. One was the very easy-to-use graphical user interface. "I looked at a lot of UNIX-based products that could also back up Windows NT, and it was obvious to me that there was only one that had a GUI that was developed with the product instead of one that was added later. You can see this in how easy it is to use and how intuitive it is, especially when managing automated libraries. Other products I tried had GUIs that were so frustrating to use that I gave up and went back to using their command lines because that was obviously how they were designed to be used."

The second strength that Nevet cites is the responsiveness of the backup product's technical support staff. "In our environment," explains Nevet, "we can't tolerate poor technical support. Our services are critical to NIH patient care, and we cannot deal with any vendor who would sweep any problem we were having under the rug, telling us they would get back to us in a few days. The response from our backup vendor has always been timely, and second to none in my experience."

Johns Hopkins Focuses on Backup

Johns Hopkins University, another prominent research institution, has also been focusing on backup solutions recently as part of its technology initiatives. At the center of these initiatives is the Milton S. Eisenhower Library, the University's invaluable main research resource. One of the newest innovations used at the library is the Internet-accessible HORIZON online computer system. The HORIZON online catalog has powerful search capabilities and a very fast response time that reflects the excellent use that the Hopkins staff is making of the newest digital and network technologies.

The keyword used by Don Pliska, a Network Operations Support Specialist in the Systems Office at the Eisenhower Library, is "evolution." Pliska searched for a backup product that was rich enough in features to grow along with the changing situation at Hopkins. Not surprisingly, the first criterion Pliska looked for was the same one sought by Mr. Nevet at the NIH: cross-platform compatibility. However, the reason for this requirement at JHU was quite different. Mr. Nevet at the NIH is in charge of a new project while Pliska is working on updating a huge system that is already in place. He wanted to run his backup from a Windows NT server, but some of the most important data he needed to back up was on the library's NetWare system. He also knew that his group would very likely be asked to back up UNIX machines in the near future too because many important campus-wide systems, such as email and HORIZON, are administered from powerful UNIX servers.

"We assumed that we would have to be ready to back up everything," explains Pliska, "NT, NetWare, and UNIX." And Pliska was right. The Systems Office is already running backups of NT and NetWare machines that average about 14 gigabytes a week, and the staff has recently added a configuration for a UNIX server that will be backed up by year end too.

Nevet's second criterion for backup at the NIH, the ability to distribute backup devices throughout a network but keep centralized control, is also the second criterion that Pliska used at Hopkins, but again for a totally different reason. In a distributed setup, only backup-related messages move across the network while the data is backed up directly to a backup device, or within a subnet, and Nevet was interested in this strategy mainly for security reasons. He wanted to locate backup devices in widely separated areas in case of power outages or other problems. But Pliska wanted distributed backup because of the Eisenhower Library's very tight backup window. His first concern was keeping overall backup elapsed time to a minimum.
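To illustrate the distributed arrangement described above, the sketch below picks, for each client, a backup device on the client's own subnet, so that bulk data stays local and only control messages would need to reach the master server. It is a minimal Python illustration and not the logic of any particular backup product; the device names, the subnets, and the pick_device function are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class BackupDevice:
    """A tape device and the subnet it serves (illustrative values only)."""
    name: str
    subnet: ipaddress.IPv4Network


def pick_device(client_ip: str,
                devices: Sequence[BackupDevice]) -> Optional[BackupDevice]:
    """Return the first device whose subnet contains the client, else None."""
    address = ipaddress.IPv4Address(client_ip)
    for device in devices:
        if address in device.subnet:
            return device
    return None  # caller decides on a fallback, e.g. a device at the master


# Hypothetical layout: one tape library per subnet.
DEVICES = [
    BackupDevice("library-a", ipaddress.IPv4Network("10.1.0.0/16")),
    BackupDevice("library-b", ipaddress.IPv4Network("10.2.0.0/16")),
]
```

A client at 10.2.3.4 would be matched to library-b, while a client outside both subnets gets None, leaving the fallback policy to the caller.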

Elapsed Time Is Critical

Although the public might perceive a university as a peaceful place for contemplation and study, the atmosphere at a research university like Johns Hopkins is very dynamic. The university library is not open for a normal "business day" during a normal "business week." Rather, the Eisenhower Library is "open for business" during the academic year from 8 a.m. until 2:30 a.m. four days a week, from 8 a.m. until midnight on Friday and Saturday, and from 10 a.m. until 2 a.m. on Sunday. During reading period and final exams, the library is open for study 24 hours a day. This leaves the library with a smaller backup window than many large corporations.
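As a rough worked example, the Python sketch below turns the opening hours quoted above into the number of hours the library is closed each night, which is an upper bound on the nightly backup window. It then estimates the throughput each backup device would need to sustain. The figures beyond the opening hours are assumptions: the 14-gigabyte value is the weekly average cited earlier, treated here as if it all had to be written in a single night, and the sketch assumes the "four days a week" are Monday through Thursday and that data divides evenly across devices. The function names are hypothetical.

```python
def to_minutes(hhmm: str) -> int:
    """Convert an 'HH:MM' string to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def closed_hours(closes: str, opens: str) -> float:
    """Hours between closing and the next opening, wrapping past midnight."""
    gap = (to_minutes(opens) - to_minutes(closes)) % (24 * 60)
    return gap / 60


def required_rate(data_gb: float, hours: float, devices: int = 1) -> float:
    """GB per hour each device must sustain if data is split evenly."""
    if hours <= 0 or devices < 1:
        raise ValueError("hours must be positive and devices at least 1")
    return data_gb / (hours * devices)


# (closing time, next opening time) for each kind of night in the schedule.
NIGHTS = {
    "Mon-Thu (assumed)": ("02:30", "08:00"),
    "Friday": ("00:00", "08:00"),
    "Saturday": ("00:00", "10:00"),
    "Sunday": ("02:00", "08:00"),
}

shortest = min(closed_hours(c, o) for c, o in NIGHTS.values())
rate_one_device = required_rate(14, shortest)       # single device
rate_two_devices = required_rate(14, shortest, 2)   # two devices in parallel
```

Under these assumptions the shortest closure is five and a half hours, and doubling the number of devices halves the rate each one has to sustain. The usable window may be shorter in practice, since systems are typically needed before the doors open.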

"Since the staff comes in at 7 in the morning," says Pliska, "we have very little time to back up the library's own systems. Ideally, we'd like to get everything done in the four-hour window between 2 and 6 in the morning. That's the main reason we wanted a distributed backup. Distributing backup devices gives us a lot of scheduling flexibility through enhanced concurrency and multi-tasking. It lets us run our backups faster."

Both LAN and WAN Support Needed

Pliska also notes that distributing devices gives a backup both LAN and WAN capabilities, something he needs in the "evolving" situation at Johns Hopkins. The Eisenhower Library is located on the main Homewood campus, a large local area network, but Johns Hopkins also has two satellite campuses that Pliska had to consider as part of a wide-area network. "We had to think in terms of backing up data on those two campuses too," explains Pliska, "and perhaps distributing some of the administrative duties while we kept centralized control. And distributing devices also lets us reduce the amount of data going over the WAN link."

A final issue that is just as critical to Pliska as it was to Nevet is technical support. "The technical support staff for one of the first backup products we tried took two to four days to get back to us about a problem, and we wanted faster response. In a sense, we need a vendor who will form an alliance with us, a kind of partnership, so the accessibility of the support staff was very important to us."

A Note on Cross-Platform Backup Configuration

Both the MARS project at the National Institutes of Health (NIH) and the Systems Office at the Eisenhower Library at Johns Hopkins University (JHU) are using Backup Express from Syncsort Inc. of Woodcliff Lake, N.J., as their network backup software.

The MARS project has a Sun SPARC 20 as its master backup server with an ADIC FastStor tape library and a single DLT tape drive attached. The FastStor holds backup images while the single DLT tape drive is used for catalog backup exclusively. This is necessary because of the large amount of new data (currently 8 gigabytes of new data per day and growing) backed up and cataloged daily. A StorageTek 9730 is also used for backup and is located across campus where it is connected to a Sun ULTRA 2. Client machines are running Sun Solaris, SunOS, HP-UX, SGI IRIX, Digital UNIX, Linux, and Windows NT.

JHU's master backup server is a Dell Pentium 266 running Windows NT Enterprise Server software. The master server is currently connected to two automated libraries: a Qualstar 4420 with two AIT tape drives and an Exabyte 10E with an 8mm tape drive. JHU first purchased the Exabyte and then the Qualstar as their backup project evolved and they needed more capacity. The Qualstar has a maximum capacity of four drives, and JHU intends to add the other two drives as needed. Client machines are currently running NetWare 4.1 and Windows NT. Machines running various versions of UNIX may be added to the backup by year end.

Analysis Reveals Common Criteria

Analyzing the technological initiatives at both the NIH and Johns Hopkins University reveals several common criteria for choosing a backup product for a modern network environment:

  • The backup must be cross-platform, allowing data to be backed up from UNIX, Windows NT and NetWare machines with equal ease.
  • The backup must allow devices to be distributed for security, speed, and scheduling flexibility.
  • The backup should be easy to use and have responsive technical support.

Although their situations are quite different, the NIH and Johns Hopkins University are both important research institutions whose initiatives need a scalable, fast, distributed, and flexible backup - and that backup has to work across the many operating system platforms in use today.

About the Author:

Ira Goodman is a Software Services Manager at Syncsort Inc. in Woodcliff Lake, N.J. With 24 years of experience in information processing, he currently manages support for Backup Express, a backup product for networks running UNIX, NT and NetWare, and advises customers on the setup and maintenance of distributed backup systems and automated libraries.