Clustering: What Some Vendors Don’t Tell You

In the chicken-and-egg world of vendor marketing and user need, clustering has achieved "hot topic" status. Depending on individual needs, a variety of vendors manufacture appliances to target specific points of failure. To be effective, however, the solutions you choose must provide simultaneous coverage for multiple points of failure.

When managing data I/O becomes problematic, many network managers look to server clustering technology as their solution.

However, through the hype that clouds clustering solutions, we can dimly make out the actual problems the technology addresses. The hue and cry for server clustering is to handle exploding storage requirements, enable faster data access and increase data availability. Clustering accommodates these demands with two mechanisms. To expand along with increasing data and speed requirements, it harnesses unused system capacity and bandwidth on selected network servers. And because a clustered system involves multiple servers, a single server crash no longer affects data accessibility.

As vendors rush to custom-design clustering solutions within these guidelines, various architectures are available with varying degrees of efficacy. A hungry man has many options for what to eat; not all options are nutritious or satisfying.

A network manager’s dream is a networked server solution with no single points of failure, where end users have no downtime (continuous access to data and applications). In order for clustering to make this dream come true, it must resolve several issues.

Making Every Link in the Chain Strong

Depending on individual needs, a variety of vendors manufacture appliances that target specific points of failure. Because real networks have multiple points of failure, the solutions chosen must provide simultaneous coverage and still be cost-effective.

There are several high-availability configurations with no single point of failure within the data subsystem. How well a solution resolves multiple points of failure is the larger concern for the network manager. The ideal solution may need to take into account any number of these common failure points:

  • The server, including the operating system, applications, CPU and RAM
  • Disk drives (failure is minimized by mirroring or RAID 5 configurations)
  • Disk controllers (RAID 5 controllers do not commonly support dual adapters); on external RAID subsystems, the SCSI fan-out – the multiple-channel SCSI cabling connection – is a common point of failure even when redundant RAID controllers are used, and redundant RAID controllers often share a common battery that is, itself, a single point of failure
  • Power supplies (failure is minimized by redundant UPSs on redundant circuits)
  • Fans (if a fan fails, CPUs and drives can fail due to improper cooling)
  • Network interface cards (a path must exist from a client to the server, or the server is effectively down)
  • Multiple locations (failure due to power outage, fire, flood, earthquake, etc., is minimized by keeping backups in separate geographic locations)

When using clustering to "fix" all evident problems in a storage network, the true solution is to provide continuous access to all data and applications. In a multi-server environment, when one server is down, the other servers must fill in for the failed unit to fulfill all data/application requests.

Clustering Architecture

Clustering goes beyond hardware installation. In a complete clustering solution, the proper hardware and software must be combined to resolve network issues; otherwise, the system will be down at least part of the time. For some network managers, part of the time is much too often.

With the correct configuration of hardware and software, each of these failure points can be compensated for and data/application availability can be maintained. Clustering also allows components of the system to be changed without affecting users. A network manager can sleep more soundly knowing that when one node fails, access to files, printing and other applications is automatically picked up by a different clustered node.

Unlike RAID, where a number of vendors claim to manufacture a single, superior product that meets all needs, effective clustering solutions depend on a collection of components that must suit individual uses.

Different clustering solutions are distinguished by examining each component and how the complete solution operates as a whole. A typical multiple-node cluster providing continuous data/application availability includes the following configuration:

  • Two or more servers
  • A server operating system
  • A multi-initiator disk subsystem that supports two or more servers
  • Clustering/automatic failover software
  • Multiple network adapters for "heartbeat/keep alive" networks and client networks
However, a system that provides connectivity to one or more servers and increased disk capacity is not, in itself, a complete cluster. Critical to proper operation are cluster software, cluster-capable servers, the clustering interconnect and a disk storage solution. Without a knowledgeable I/O engineer on their side, network managers may be surprised to find that certain elements of the clustering solution itself can create a potential point of failure.
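The heartbeat/keep-alive network listed above is what lets surviving nodes detect a failed peer and take over its services. A minimal sketch of that failover decision, assuming a simple last-heartbeat-timestamp model (the node names, service names and timeout below are hypothetical, not from any vendor's product):

```python
import time

HEARTBEAT_TIMEOUT = 5.0  # seconds of silence before a node is presumed dead

def choose_active(last_seen, now, services):
    """Return a service -> node assignment after checking heartbeats."""
    # A node is alive if its last heartbeat arrived within the timeout.
    alive = [n for n, t in last_seen.items() if now - t <= HEARTBEAT_TIMEOUT]
    if not alive:
        return {}  # total outage: nothing left to fail over to
    assignment = {}
    for i, svc in enumerate(sorted(services)):
        # Surviving nodes pick up all services round-robin.
        assignment[svc] = alive[i % len(alive)]
    return assignment

now = time.time()
# node-b has missed several heartbeats; node-a must cover file and print.
last_seen = {"node-a": now - 1.0, "node-b": now - 30.0}
print(choose_active(last_seen, now, {"file", "print"}))
# → {'file': 'node-a', 'print': 'node-a'}
```

In a real cluster the heartbeat runs over the dedicated network described above, and services are reassigned by the clustering software rather than by a loop like this.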

Currently, NT and NetWare seem to be the primary markets for failover technology. File access and print services are the services most commonly supported for failover, and both are critical to proper operation. The software vendor and the clustering architecture in place each make a difference when implementing additional applications.

Manufacturers must understand that their varying clustering architectures inherently affect hardware connectivity for the user. How much hardware can be shared and how much hardware must be duplicated? How many nodes per cluster are supported?

Software has to provide for the creation and management of the cluster, with manual failover to let the user upgrade, add software, change hardware or perform other maintenance while data/applications remain available on another node. Under an automated failover system, additional cluster software must be purchased and implemented to allow the same flexibility as manual failover.

Cluster-Capable Servers

Clustering is impossible for some systems due to I/O bottlenecks and a finite number of PCI slots on a finite number of buses. These are vital considerations for network managers who know their clustered server must, for example, have a minimum of six PCI slots: one for a hard drive controller for booting, one for a tape backup subsystem, one for the disk subsystem shared with another server, one for connection to the other server's shared disk, one for a network card to connect to users and one for a network card for a dedicated heartbeat link to the other clustered servers.

Clustering solution vendors know that too much activity on a particular slot creates an I/O bottleneck. By current design, PCI (v2.1) allows only four full-speed PCI slots (33 MHz/132 MB/s) per bus without a bridge. In most PC servers with "dual-peer PCI," the primary bus has a secondary bus with either ISA or EISA slots, along with an embedded real-time clock. The clock services interrupts for devices such as the keyboard, and these 8-MHz devices slow a bus dramatically.
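The 132-MB/s figure follows directly from the bus clock and width. A quick check of that arithmetic:

```python
# PCI v2.1 full-speed bus: 33 MHz clock, 32-bit (4-byte) transfers.
clock_mhz = 33
bus_width_bytes = 4
# Peak throughput, shared by every slot on the bus.
peak_mb_per_s = clock_mhz * bus_width_bytes
print(peak_mb_per_s)  # → 132
```

Because all slots on one bus share that 132 MB/s, piling several busy adapters onto a single bus is exactly the bottleneck the paragraph describes.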

Clustered NT solutions may run on servers like the NetFrame 9000 (4 CPUs, 3 PCI buses, 8/16 PCI slots), the Amdahl/Fujitsu M700I (6 CPUs, 2 PCI buses, 6 PCI slots) or the Tandem/Compaq 7000 (4-8 CPUs, 2 PCI buses, 9 PCI slots plus 2 EISA slots).

Although clustering keeps data available, data/application access is still slowed in failover mode. Redundant system components then become an important aspect of the clustering solution. To remain operational, a system may also need to include hot-swappable components such as fans, power supplies or drives. If a component fails, hot-swappable elements allow continued access while the component is replaced and reconnected to the system.

Clustering Interconnect

There is much to be said for the current preference for the fibre channel interface, typically a 100-MB/sec (200-MB/sec full-duplex) data transfer protocol with as many as 126 nodes per loop; a node can be a disk or a server. Additionally, a built-in fibre channel interface used in clustering solutions can cost significantly less in the long run.

At present, only a few manufacturers offer clustering solutions with redundant host adapters, leaving the operation vulnerable to an additional point of failure. When considering an investment of this magnitude, network managers must be informed that the most trouble-free solutions have as many as four redundant host bus adapters.

However, when a fibre channel "fabric" connection is in place, the fabric itself becomes a point of failure. A point-to-point fibre channel connection, on the other hand, enables instant re-routing of data paths without the fabric needing to be fixed immediately, thereby maximizing data availability. The problem with typical SCSI disk connections is that each SCSI channel is limited to 13 drives; SCSI Wide provides 15 usable SCSI IDs, with one ID consumed by each controller. The use of a SCSI interconnect also requires bus termination. If a node fails, the SCSI bus's termination point changes or even disappears, which usually causes disk array failure.

A point-to-point fibre channel connection's success is based on a design that supports up to 64 drives per fibre channel adapter (with 18 GB drives, this allows for more than one TB of data per PCI slot used). The system also has eight slots for host connections, allowing the connection of either eight concurrent servers or four servers with dual-path connections. It is engineered to connect to any server that has a free PCI (v2.1) slot, which allows an otherwise unclusterable server to be interconnected. With many other external disk subsystems, a company's existing servers cannot be interconnected and new cluster-capable equipment must be purchased, significantly raising the cost of the solution.
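The capacity claim above is simple arithmetic. A quick check, using the stated figures of 64 drives per adapter and 18-GB drives:

```python
# Point-to-point fibre channel design described in the article.
drives_per_adapter = 64
drive_size_gb = 18
total_gb = drives_per_adapter * drive_size_gb
print(total_gb)          # → 1152 GB behind one adapter
print(total_gb > 1000)   # → True: more than one (decimal) TB per PCI slot
```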

Users need to make sure the clustering interconnect is load balanced to reduce I/O bottlenecks. With load balancing, I/O is continually directed to the least-used host bus adapter. Another important consideration is the ease of adding security and sharing features; a large distance between hosts allows these options to be implemented reasonably and inexpensively. The solution should be tested with multi-mode fibre to one km between hosts and with single-mode fibre to 10 km between hosts. At greater distances, shared storage can also be flexibly designed.
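Load balancing as described amounts to routing each new I/O request to the host bus adapter with the lightest queue. A minimal sketch, with hypothetical adapter names and queue depths:

```python
def pick_adapter(outstanding_io):
    """Route the next I/O to the host bus adapter with the fewest
    outstanding requests (the 'least-used' adapter)."""
    return min(outstanding_io, key=outstanding_io.get)

# Hypothetical current queue depth per adapter.
queues = {"hba0": 12, "hba1": 3, "hba2": 7}
print(pick_adapter(queues))  # → hba1
```

Real driver-level load balancing also re-routes around a failed adapter, which is how the redundant host bus adapters mentioned earlier remove that point of failure.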

Despite the preference of some manufacturers to promote an Active/Passive cluster, this configuration allows only one-way failover. When a failure occurs, the passive server – the designated target – assumes the identity of the active server. In this Active/Passive configuration, users who were connected to the passive server lose their connections and are not able to access data until the active server is restored.

In an Active/Passive cluster model, the second server has a mirror of the data but does not share access to all equipment. Again, the user is obligated to purchase twice as much hardware as will actually be used. When all nodes are active, all equipment is used on a dynamic basis and the total cost of implementation is lower.

Disk Storage Solutions

All storage appliances are not created equal. Speed, capacity, flexibility, scalability, mirroring capability and data striping are all relevant to the clustering configuration, and not all products enable clustering with the same success.

For example, a shortcoming of some RAID storage devices is the inability of two RAID systems to mirror data without draining the resources of the attached server OS. As clustering technology advances, users can expect RAID processors that mirror data directly from one RAID to another for maximum data availability.

Another quirk of some RAIDs is the requirement that all striped drives be the same size. Still others allow drives of different sizes, with the caveat that usable capacity is limited to that of the smallest drive.

The best choice for clustering is an array that easily stripes across different-sized drives while using the complete data capacity of each drive.
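The difference between these two striping policies is easy to quantify. A sketch, assuming a hypothetical mix of 9-GB and 18-GB drives:

```python
def usable_capacity(drive_sizes_gb, full_capacity=True):
    """Usable striped capacity: either each drive's full size, or every
    drive limited to the smallest member (the common RAID restriction)."""
    if full_capacity:
        return sum(drive_sizes_gb)
    return min(drive_sizes_gb) * len(drive_sizes_gb)

drives = [9, 9, 18]  # hypothetical mixed drive sizes, in GB
print(usable_capacity(drives, full_capacity=False))  # → 27 (smallest-drive rule)
print(usable_capacity(drives))                       # → 36 (full capacity of each)
```

With the smallest-drive restriction, half of the 18-GB drive is simply wasted, which is why arrays that stripe to the full capacity of each drive are the better fit.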

Backup of data is vital to an organization. Backup options typically run between 5 and 15 GB per hour and slow the server enormously because of the increased network traffic. Making backup copies – including offsite backups – without interrupting the primary server's operation is a priority for most users.
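At the cited 5-15-GB/hour rates, backup windows grow quickly with data size. A back-of-the-envelope calculation, assuming a hypothetical 100-GB data set:

```python
# Backup window at the article's cited throughput range.
data_gb = 100  # hypothetical data set size
for rate_gb_per_hr in (5, 15):
    hours = data_gb / rate_gb_per_hr
    print(f"{rate_gb_per_hr} GB/hr -> {hours:.1f} hours")
# → 5 GB/hr -> 20.0 hours
# → 15 GB/hr -> 6.7 hours
```

Even at the fast end of the range, a backup of this size runs for most of a working day, which is why copying backups without interrupting the server matters so much.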

The need for cost-effective approaches to data access and management seems to indicate that server clustering will have an ongoing place in the industry. Since jumping on a high-priced technology bandwagon rarely pays off, network managers must be certain that their individual points of failure are addressed and the disk subsystem, servers, software and connections are most appropriate for their networks.

Clustering can raise as many questions as it answers. One thing is for certain: vendors' products and services must meet the needs of the user both today and tomorrow.

About the Author: David G. Swift is a Senior Systems Engineer for XIOtech Corporation (Eden Prairie, Minn.; www.XIOtech.com). David is certified as MCSE, MCPS, MCNE, ECNE, CNI, CNE, CNA, OS/2E, LSE, LSA and AIX-CSA.