Wanted: A Host That Serves Reliability
Despite high-profile blowouts, managed service providers succeed at selling their high-availability message to large enterprises.
South Florida: Luxury condos, warm ocean water and pleasure-boat-lined docks. A place to escape your problems. Or, if you're John Wolf, a place where problems—also known as hurricanes—can escape to find you.
Wolf's the executive director of Océ Production Printing Systems, headquartered in Boca Raton, Fla. He's seen the darker side of working in paradise: Each year, from late summer through fall, Boca finds itself in prime, business-destroying hurricane territory.
Sometimes storms score a direct hit on the state, as in 1992 when Hurricane Andrew caused more than $25 billion in damage, knocked out power, and brought area businesses to a standstill. But even when Boca escapes a direct hit, its businesses can have trouble keeping the doors open. The industrial park that houses Océ requires all occupants to evacuate within 24 hours of a hurricane's expected landfall.
Keeping critical communications and financial systems up and running in this environment used to keep executives like Wolf up at night. For the past three years, though, he's slept better, thanks to the wide variety of outsourced redundant and backup systems that will keep Océ online if a natural or technological disaster turns Boca Raton into a bull's eye. So far it's working. The company was forced to evacuate last October in advance of yet another hurricane, but his business didn't miss a beat.
Outsourcing is nothing new for large-scale enterprises, of course. But in the past, mainframe-driven corporations typically contracted for shared time on someone's Big Iron machine in case of an emergency. Today, managed services providers (MSPs) are taking this concept to a new and more sophisticated level by offering diverse technology infrastructures that provide shared and dedicated mainframes, as well as similar services for such midrange systems as IBM S/390s and AS/400s.
In addition, MSPs can provide Web hosting, storage systems management, and a mirror of a customer's business applications. Rather than just kicking in when disaster strikes, MSPs can monitor storage, networking, CPU and Internet resources to uncover impending problems before they turn into a system-strangling meltdown. Some services even operate as co-production facilities, so MSP customers can take advantage of expensive and sophisticated outsourced resources in their day-to-day operations, not just when emergencies strike.
MSPs, of course, must take their own safeguards to maintain high availability in their datacenters. Infocrossing Inc. in Leonia, N.J., establishes multiple physical paths for networking communications traveling from its buildings out to the central offices of duplicate telecommunications providers. The company maintains redundant networking hardware, including routers and switches that access these multiple communications networks.
Well-honed pitches notwithstanding, Big Iron MSPs haven't had smooth sailing in recent years. Like most technology sectors, MSPs took an economic hit from the dotcom blowout. Once-mighty Exodus Communications, for example, filed for bankruptcy late last year. Nevertheless, by touting benefits like as-needed management services, low up-front costs, and access to IT staff, a consolidated MSP industry still successfully courts large enterprises.
MSP Market Growth
The U.S. hosting market grew 25 percent last year, to $4.8 billion, according to market research firm IDC. IDC expects growth of 40 percent this year and says the Web-hosting market could reach $20.8 billion within four years.
For Océ, being able to get a laundry list of services from Infocrossing was a big factor in helping it plan for many different contingencies. Boca is headquarters for Océ's production printing business, which serves a large number of telecommunications, banking and financial industry corporations. "They demand high availability of our printers," says Wolf. "We position ourselves against our other two major players in the U.S. market by our up-time." To do that, Océ's service dispatch and spare-parts-logistics systems must be available "24x7x365," he adds. Closely associated with those systems are the division's data and voice communications networks, which rely on the full range of messaging devices to keep everyone in sync, including cell phones, pagers, and e-mail systems.
Initially, Océ just contracted for mainframe resources, but today it takes advantage of the MSP's services with two tiers of high-availability safeguards. The first consists of mirrored Microsoft Exchange e-mail and Lotus Notes servers, which are replicated in real time and run in Boca Raton and at an Infocrossing datacenter in Atlanta. "Even in the worst case, our company will have the ability to communicate with e-mail," Wolf says.
Similar Boca/Atlanta redundancy exists for Océ's networking infrastructure, with a duplicate hub in Georgia that can keep Océ's 50 U.S. offices communicating via frame relay or dial-up services among themselves and with the parent company's headquarters in Germany if Boca's services go down. "The network is designed to have two hubs that are interwoven. If we lose the Boca hub, traffic is re-routed to Atlanta. The converse is true, too," Wolf explains. Similarly, Océ maintains a duplicate Internet presence in each of the two cities. Infocrossing's Atlanta facility also manages a frame-relay gateway to Europe "because we're more likely to vacate the building in Boca than in Atlanta," he says.
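The two-hub design Wolf describes can be sketched in a few lines of Python. This is only an illustration of the failover idea, not Océ's or Infocrossing's actual routing logic: the function name `route_via`, the `HUBS` tuple, and the up/down status dictionary are all hypothetical.

```python
# Hypothetical sketch of two interwoven hubs: traffic prefers one hub
# and is re-routed to the other if the preferred hub is down.
HUBS = ("Boca Raton", "Atlanta")


def route_via(preferred, hub_up):
    """Return the hub that should carry traffic.

    preferred -- the hub an office normally uses
    hub_up    -- dict mapping hub name to True (reachable) or False (down)
    """
    if hub_up.get(preferred, False):
        return preferred
    # Preferred hub is down: fall back to any other hub that is still up.
    for hub in HUBS:
        if hub != preferred and hub_up.get(hub, False):
            return hub
    raise RuntimeError("no hub available")


if __name__ == "__main__":
    # Boca has been evacuated, so traffic moves to Atlanta.
    print(route_via("Boca Raton", {"Boca Raton": False, "Atlanta": True}))
    # The converse also holds: an Atlanta outage sends traffic to Boca.
    print(route_via("Atlanta", {"Boca Raton": True, "Atlanta": False}))
```

The point of the sketch is the symmetry: neither hub is purely a standby, so the same rule covers an outage at either site.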
The second tier of availability includes Florida and Georgia mirrors of Océ's SAP and service records system. Rather than real-time replication, these servers are scheduled for less expensive weekly updates of dispatch and material movement records, and once-a-month replication of the accounting books. "Our disaster recovery plan is a living, breathing document," says Wolf. "As we discover more vulnerabilities and as our business expands, our plan becomes more critical."
Managed Service Providers
- ACS (Dallas, Texas)
- Computer Sciences Corp. (El Segundo, Calif.)
- EDS (Plano, Texas)
- Infocrossing Inc. (Leonia, N.J.)
- Sungard (Wayne, Pa.)
Not all companies judge high availability by their vulnerability to catastrophes like hurricanes. Sometimes, updating the infrastructure technology that runs a business can be just as disruptive. Close-out retailer Big Lots Inc. of Columbus, Ohio, swapped out its dial-up telecommunication lines last year for a new frame-relay system, in part to get its stores moving at a faster pace. "When we used dial-up for credit-card checks, it took 15 seconds. Now we can do it in four or five seconds," says Big Lots CIO Steve Bromet.
Big Lots isn't a stranger to IT muscle. To support its $3.5 billion in sales, headquarters facility, 1,344 retail stores, and four distribution centers, the company relies on scores of AS/400s and RS/6000 servers. With 110 people in his IT shop, Bromet can tap a lot of technical talent. But as the new communications link connected facilities across 46 states, Big Lots realized it had entered new technology territory. "Frame relay changed our network thought process," Bromet recalls. "All of a sudden, we were in a more sophisticated environment, and we needed someone looking at it 24x7."
Instead of spending time and money acquiring in-house expertise to run the new network, Big Lots decided to investigate MSPs. Bromet evaluated a handful of providers and finally chose IBM Global Services, for its technology and wide-ranging partnerships with out-service providers. For Big Lots' frame-relay service, IBM subcontracted with AT&T. Cisco is the prime supplier of network hardware for the project. A Canadian partner, Nuvo Network Management Inc., became the company that monitors the network for high availability when it offered something Bromet couldn't resist: an economical price. "Close-out retailers can be pretty cheap," he says, although he declines to give dollar specifics about the monthly charges for the service.
Nuvo, based in Ottawa, monitors Big Lots with remote network management via the Web to spot problems or identify ones just starting to develop, such as congestion on a particular router. "They're our first line of defense," Bromet says. "They'll contact Cisco if it's a hardware issue, or AT&T if it's in the frame-relay network. What I'm really looking for from them is a proactive orientation."
In a similar attempt to achieve high availability by off-loading the demands on its internal IT department, Blue Cross Blue Shield of Massachusetts signed on for MSP services from Plano, Texas, systems integrator EDS to quickly launch a new, Web-based transaction system. To accomplish this, EDS first installed a development and testing platform using 10 Sun Microsystems servers that EDS hosted at Blue Cross' Boston datacenter. A complementary production and staging resource at EDS' Plano operations used 27 servers and a total of 144GB of hosted storage space to get the new application ready for prime time.
Today, the insurer runs the transaction system using these resources. As part of its seven-year EDS contract, Blue Cross also receives management support for its mainframe and midrange computers, WAN, and desktop computers. "Mainframe support is still a big part of what we do," says Steve Lapekas, EDS's global service executive for Web and applications hosting. "What's needed today [is] a full range of services from Web hosting and application hosting all the way to mainframe management."
Service-Level Agreements: Iron Fists in an IT Manager's Glove
Choosing an MSP is relatively easy. The real work begins after that. Because of the complexity of the technology and systems involved, partnering with a service host tends to be a long-term relationship, with contracts often spanning five to seven years or more. Thus, to get off on the right foot and ensure the marriage stays solid over the years, enterprises must spend up-front time on the service level agreement (SLA).
The contract covers a wide range of contingencies, but the two most important aspects are up-time (what percentage of the time services are guaranteed to be available) and response time (how quickly an MSP service kicks in). Both areas include hardware and software the hosting organization has direct control over, such as the servers in its own data center, as well as third-party services like telecommunications networks and electrical power provided by other companies.
Up-time guarantees for high-availability applications often come in two levels, 99.5 percent and 99.9 percent. Companies often reserve the highest and most costly level for lifeline-class business systems, such as internal communications, that require immediate switchover to active duty. Important but less critical applications, such as inventory management systems, may exist in standby mode, which allows for switchovers to happen within minutes or hours. SLAs come into play by defining the consequences, often a discount in the monthly service fee, imposed on the MSP if it doesn't meet these agreed-upon availability levels.
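To make the two up-time levels concrete, the short Python sketch below converts a guaranteed percentage into the downtime it permits and shows one way a fee discount might be triggered. The function names, the 10 percent credit rate, and the sample fee are assumptions chosen for illustration; actual SLA terms vary by contract.

```python
# Illustrative arithmetic only: the credit rate and fee are hypothetical.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap year


def allowed_downtime_hours(uptime_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime a given up-time guarantee permits in a period."""
    return period_hours * (1 - uptime_percent / 100)


def monthly_credit(measured_uptime, guaranteed_uptime, monthly_fee,
                   credit_rate=0.10):
    """Discount owed by the MSP when measured up-time misses the guarantee.

    credit_rate is an assumed flat discount, not a figure from any real SLA.
    """
    if measured_uptime >= guaranteed_uptime:
        return 0.0
    return monthly_fee * credit_rate


if __name__ == "__main__":
    for level in (99.5, 99.9):
        hours = allowed_downtime_hours(level)
        print(f"{level}% up-time allows about {hours:.2f} hours down per year")
    # A month measured at 99.7% against a 99.9% guarantee earns a credit.
    print(monthly_credit(99.7, 99.9, monthly_fee=10_000))
```

Over a full year, a 99.5 percent guarantee tolerates roughly 43.8 hours of downtime, while 99.9 percent tolerates about 8.8 hours, which is why the higher level is usually reserved for lifeline-class systems.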
To track service performance, some MSPs are beginning to offer customers real-time statistics. For example, EDS recently completed a pilot with 500 clients that tested a dashboard application that continuously shows up-time statistics within a browser interface. The company plans to roll out the application to its customers nationwide this year.
Because business needs change over time, especially over multiyear MSP contracts, an SLA shouldn't be only about numbers. The document should reflect the flexibility inherent in the partnership. "An SLA shouldn't be an all-or-nothing-at-all proposition," says Michael Bendit, vice president of product management at Infocrossing, an MSP in Leonia, N.J. "A good service provider will give you the ability to grow your contract over time as needed and as the relationship and trust grows."
Because the high-availability needs of large enterprises are so diverse, ranging from protection against catastrophic events to avoiding downtime from the assaults of network congestion, companies need to set clear goals before they start evaluating MSP offerings. Infocrossing's Michael Bendit, vice president of product management, suggests that CIOs draft a one-page list of needs and goals that becomes a starting point for high-availability discussions with other business and technical people within the company. CIOs should then take the evolving goal sheet to IT people at other large enterprises for their suggestions. "[They] can give you some good ideas on how to address those needs based on real-world experience," Bendit says. "Then refine your requirements based on the information you gather and go back to the vendors who seem to be the best fit."
An early decision about MSPs involves choosing between a large provider that can supply a hosted environment entirely from in-house, versus an MSP that may subcontract services with other providers. Depending on a corporate customer's requirements, one-stop shopping may be less expensive, but the trade-off could be less flexibility in service offerings.
For Wolf, deciding between several vendors to find the right one takes more than a cold, hard evaluation of their datacenters. "You want an MSP that can accommodate your twists and turns." He says that the efforts the Infocrossing staff took to learn and understand Océ's applications and infrastructure convinced him the two companies could establish a workable relationship. "We talk to them in terms of processes and applications, and they're good at translating that into technical capabilities in their centers."
No matter what decision a corporation makes about whether or how to use an MSP to maintain high availability, it should never let its guard down. "A lot of companies develop a three-ring binder and say, 'We have a business recovery plan in place,'" says Wolf. "But to me it's a constant process that requires a partnership between you and your provider." That process can help defend a company against the darker side of IT life, even in sunny Florida.