In-Depth
The Fibre Channel Infrastructure: The Next Highway System for Heavy Data Traffic
Leading the way in next-level networking topologies is Fibre Channel, a high-performance, multiple-protocol data transfer technology. Fibre channel is a serial interconnect standard that allows servers, storage devices and workstation users to share or move large amounts of data almost instantaneously, delivering speeds 2.5 to 250 times faster than existing communication and I/O interfaces.
There are three fibre channel connection schemes:
• Point-to-Point, which provides a single connection between two servers or between a server and its RAID storage system.
• Switched Fabric, which uses a Fibre Channel Switch, allowing each server or RAID storage system to be connected point-to-point through a central switch. This method allows for the construction of massive data storage and server networks.
• Arbitrated Loop, which connects up to 126 servers, RAID systems and/or other storage devices in a loop topology.
Arbitrated Loop is the natural choice for connecting servers and storage systems to support business networks. Fibre Channel Arbitrated Loop (FC-AL) refers to a shared (or arbitrated) bandwidth version of fibre channel used in high-speed data storage applications. FC-AL supports up to 126 devices per loop (127 ports in all, counting an optional connection to a switched fabric) and cabling runs of up to 10 kilometers between nodes.
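The loop limit above is easy to picture with a small model. The following sketch (plain Python, with hypothetical class and method names chosen purely for illustration) simply tracks devices attached to an arbitrated loop and refuses to attach more than 126 of them, plus one optional fabric-attach port:

class ArbitratedLoop:
    """Toy model of an FC-AL loop: up to 126 node ports (NL_Ports),
    plus one optional fabric-attach port (FL_Port), for 127 in all."""

    MAX_NL_PORTS = 126

    def __init__(self):
        self.nl_ports = []      # servers, RAID systems, other storage devices
        self.fl_port = None     # optional connection into a switched fabric

    def attach(self, device_name):
        if len(self.nl_ports) >= self.MAX_NL_PORTS:
            raise ValueError("loop is full: 126 NL_Ports already attached")
        self.nl_ports.append(device_name)

    def attach_fabric(self, switch_name):
        self.fl_port = switch_name  # ties the loop into a switched fabric


loop = ArbitratedLoop()
for i in range(4):
    loop.attach(f"server-{i}")
loop.attach("raid-array-0")
print(len(loop.nl_ports), "devices sharing the loop's bandwidth")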
Storage Area Networks
Fibre channel extends beyond being the best infrastructure for multiple high-speed peripheral connections; it allows for the development of Storage Area Networks (SANs). SANs differ from LANs in that they are created specifically to move large amounts of data. SAN refers to a network that connects servers and data storage devices at 1-Gbps speeds. Most SANs are fibre channel-based storage infrastructures that create storage groupings and server groupings to reach across multiple departments of an organization. Even a remote office can be rolled into the mix. Platform issues, file formats and other incompatibilities can be overcome with proper structuring, and the ability to share various types of data is greatly expanded.
Network nodes on a SAN may include storage devices, servers, routers and switches and, in some cases, workstations, such as graphics terminals that need high-bandwidth access to data. Servers and workstations equipped with both fibre channel connectors and Ethernet adapter cards usually supply the points of connection to an Ethernet network.
Clusters – Virtual Mega-Servers
Fibre channel offers an ideal protocol for clusters – groups of independent servers that function as, appear to users as, and are managed as a single system. Microsoft announced NT clusters, SCO announced UnixWare clusters, Sun announced Solaris/Intel clusters and Novell announced Wolf Mountain clusters, indicating a growing trend.
Last year, approximately 2 million Intel servers shipped, with about 100,000 of them used in clusters. According to the IDC forecast, in 2001, 3 million Intel servers will ship with 1 million of them to be used in clusters.
The IT market is moving to clustering to accommodate dramatic growth in online applications, such as e-commerce, OLTP, Web servers and real-time manufacturing. Clustering is important to applications where the server must be online 24 hours a day, seven days a week, 365 days a year, requiring new levels of fault tolerance and performance scalability. Virtually any business dependent upon high-volume data transfer can benefit from this technology.
There are two types of clusters: high-availability clusters, which might be created using Microsoft’s Wolfpack 1 or Compaq’s Recovery Server; and load-balancing clusters, also known as parallel application clusters, which might use Microsoft’s Wolfpack 2 or Compaq’s VAXClusters. Load-balancing clusters are actually a superset of high-availability clusters.
In a high-availability cluster, each node of a two-node cluster is seen as a single server. During normal operation, both servers do useful work. When a node fails, its applications fail over to the surviving node, which assumes the workload of both nodes.
In a load-balancing cluster, the nodes are likewise seen as single servers, and the cluster rebalances the workload when a node dies. If different applications are running on each server, they fail over to the least busy surviving server, or as directed by preset failover policies.
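As a rough illustration of the two behaviors just described, the sketch below (plain Python, with hypothetical function and node names) reassigns a failed node's applications either to the single surviving partner or to the least busy survivor, unless a preset policy directs otherwise:

def fail_over(cluster, failed_node, policy=None):
    """Reassign a dead node's applications to surviving nodes.

    cluster: dict mapping node name -> list of running applications.
    policy:  optional dict mapping application -> preferred target node.
    """
    orphans = cluster.pop(failed_node)
    for app in orphans:
        if policy and app in policy and policy[app] in cluster:
            target = policy[app]                                  # preset failover policy
        else:
            target = min(cluster, key=lambda n: len(cluster[n]))  # least busy survivor
        cluster[target].append(app)
    return cluster


# Two-node high-availability cluster: the survivor takes the whole workload.
print(fail_over({"node-a": ["web"], "node-b": ["oltp"]}, "node-a"))

# Four-node load-balancing cluster: work spreads to the least busy survivors.
print(fail_over({"n1": ["web", "mail"], "n2": ["oltp"], "n3": [], "n4": ["erp"]}, "n1"))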
RAID is Essential
Directing data traffic is the RAID controller. The intelligence supplied by RAID ensures data availability 24x7. RAID controllers designed to support fibre-to-SCSI, fibre-to-fibre, or any other fibre channel configuration make sure that data traffic is properly managed. Moving, storing, protecting and managing critical data takes on ever-increasing value as systems expand into clusters and SANs.
Clustered servers always have a client network interconnect, typically Ethernet, to talk to users, plus at least one cluster interconnect to talk to other nodes and to disks. Often there are two cluster interconnects: one for the nodes to talk to each other, called the "Heartbeat Interconnect," which is typically Ethernet; and one for the nodes to talk to the disks, called the "Shared Disk Interconnect," which is typically SCSI or fibre channel.
In one configuration, each node has an Ethernet NIC (which serves as the Heartbeat Interconnect), a private system disk (generally on an HBA) and a PCI-based RAID controller (SCSI or Fibre). The nodes share access to data disks, but do not share data.
In an alternative configuration, each node has an Ethernet NIC (again the Heartbeat Interconnect) and multi-channel HBAs connecting the boot disk and the external array, with a shared external RAID controller on the SCSI bus serving as the Shared Disk Interconnect.
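A minimal sketch of how the Heartbeat Interconnect is typically used (the timing values and function names here are assumptions, not any vendor's implementation): each node timestamps the heartbeats it receives from its partner, and a missed-heartbeat window is what triggers the failover described earlier.

import time

HEARTBEAT_INTERVAL = 1.0    # seconds between heartbeats (assumed value)
FAILURE_THRESHOLD = 3.0     # declare the partner dead after this long

last_heard = {}             # node name -> time of last heartbeat received


def record_heartbeat(node):
    """Called whenever a heartbeat arrives over the Ethernet interconnect."""
    last_heard[node] = time.monotonic()


def partner_is_alive(node):
    """True if the partner has been heard from recently enough."""
    last = last_heard.get(node)
    return last is not None and (time.monotonic() - last) < FAILURE_THRESHOLD


record_heartbeat("node-b")
print("partner alive:", partner_is_alive("node-b"))   # True: heard recently
print("partner alive:", partner_is_alive("node-c"))   # False: never heard from, begin failover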
About the Author: Suzanne Eaton is the Senior Marketing Communications Manager at Mylex Corporation (Fremont, Calif.).
***
SIDEBAR:
SAN Maps for Simplified Data Security
Shared access to a consolidated pool of storage promises to dramatically reduce total cost of ownership by decreasing the storage resources that need to be purchased and lowering the cost to manage the data stored on those resources. But shared access to disks is not without risks. A security barrier is generally required for sensitive data, such as the HR and corporate financial databases. Some corporations require limiting access to other types of data, such as product design or customer databases. On the other hand, enabling related departments, such as engineering and manufacturing, to share access to data can dramatically enhance productivity.
RAID controllers present all of their logical disks or LUNs (Logical Unit Numbers) to all attached nodes. Hence, an access control or "mapping" mechanism that provides a security barrier by defining node-to-logical disk affinities is required in most corporate SAN environments.
Mapping Tables. A mapping table is a simple data structure that uniquely identifies nodes by their worldwide names (WWNs) and specifies their access privileges to LUNs. There are three general requirements for mapping tables.
First, mirrored copies of the table must be maintained in multiple devices to avoid losing the entire SAN if a piece of hardware (the one holding the table) fails. Second, a locking protocol is required to ensure that simultaneous updates from different initiators to the table, or to different copies of the table, are not allowed. Third, the mapping table needs to be accessible through a management interface that provides a consistent view of all node-to-logical disk relationships, and the interface should be integrated within a broader storage management system to avoid increasing management complexity.
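In outline, such a table is little more than a dictionary keyed by worldwide name. The sketch below (plain Python, with made-up WWNs and LUN numbers) shows the basic structure and the access check; the mirrored copies and locking protocol described above are noted only in comments:

# Mapping table: worldwide name (WWN) -> set of LUNs the node may access.
# In a real SAN this table is mirrored across devices and updated under a
# lock so that concurrent initiators cannot leave the copies inconsistent.
mapping_table = {
    "50:06:0b:00:00:c2:62:00": {0, 1, 2},   # HR server: HR database LUNs only
    "50:06:0b:00:00:c2:62:04": {3, 4},      # engineering server
}


def may_access(wwn, lun):
    """Return True if the node identified by wwn may see the given LUN."""
    return lun in mapping_table.get(wwn, set())


def grant(wwn, lun):
    """Add a node-to-logical disk affinity (would be done under the lock)."""
    mapping_table.setdefault(wwn, set()).add(lun)


print(may_access("50:06:0b:00:00:c2:62:00", 2))   # True
print(may_access("50:06:0b:00:00:c2:62:04", 2))   # False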
SAN Design Strategies. SAN mapping tables can be implemented at the server, switch or storage array levels. Storage arrays for SANs generally include server-independent RAID controllers packaged in disk storage enclosures.
Host-Based Mapping
Implementing mapping tables at the system software level, while feasible, is problematic. SANs are typically heterogeneous collections of servers. Departmental SANs often combine some flavor of UNIX with NT servers, and in some environments AS/400 or NetWare servers are thrown into the mix. A host-based mapping strategy uses filter drivers that restrict each server's access to the logical disks or LUNs it is allowed to "see." There are several disadvantages to this strategy. First, incremental cost is added to each SAN node. Second, a different filter driver is required for each type of OS participating in the SAN, and the filter drivers may need to be upgraded with each release of the operating system. Third, complexity increases unless the filter drivers are integrated into a common management framework. Fourth, the filter drivers add a layer to the I/O stack that translates into additional processing overhead.
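Conceptually, the filter driver sits between the operating system and the HBA and hides every LUN the mapping does not grant. A rough sketch of that filtering step (hypothetical names, not any particular driver interface):

def filter_discovered_luns(discovered_luns, allowed_luns):
    """Host-side filtering: report only the LUNs this server may 'see'.

    discovered_luns: LUNs the HBA actually found on the SAN.
    allowed_luns:    LUNs granted to this host in the mapping table.
    """
    return [lun for lun in discovered_luns if lun in allowed_luns]


# The HBA discovers every logical disk the array presents...
all_luns = [0, 1, 2, 3, 4, 5]
# ...but the filter driver exposes only this host's share to the OS.
print(filter_discovered_luns(all_luns, allowed_luns={3, 4}))   # [3, 4]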
Switch-Based Mapping
The second approach uses "zoned" fibre switches that map server ports to array ports. Zoning may work well for partitioning the storage management task, but a finer-grained mapping scheme than "port-to-port" is required to share the logical disks behind an array controller. Server port-to-logical disk mapping is needed to partition the logical disks among the SAN nodes; for example, to isolate the HR logical disks to the server hosting the HR applications. This example points out the security benefit of mapping at the logical disk level: keeping a database with sensitive information isolated from unauthorized users. Since departmental fibre switches do not recognize the notion of logical disks, zoning does not provide the required level of granularity. To deliver server-to-logical disk mapping, SAN switches would have to open every packet passing through the switch, parse its contents and validate its logical disk access rights if the packet is addressed to an array controller port. This process would add considerably to switch latency.
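The granularity gap can be seen in miniature in the sketch below (hypothetical port and LUN names): because a zone only relates server ports to array ports, any server zoned to an array port sees every logical disk behind that port.

# Port-level zoning: a zone is just a set of switch ports.
zones = [{"server_hr", "server_eng", "array_port_0"}]

# Every LUN behind array_port_0, regardless of which server asks.
luns_behind = {"array_port_0": [0, 1, 2, 3, 4]}


def visible_luns(server_port, array_port):
    """With zoning alone, membership in the zone grants all LUNs or none."""
    for zone in zones:
        if server_port in zone and array_port in zone:
            return luns_behind[array_port]      # all-or-nothing visibility
    return []


print(visible_luns("server_hr", "array_port_0"))    # sees LUNs 0-4
print(visible_luns("server_eng", "array_port_0"))   # also sees LUNs 0-4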
Array-Based Mapping
The third implementation strategy is to have the array controller maintain the mapping tables. Mapping physical disks to logical disks and presenting them to all attached servers is intrinsic to RAID controllers; mapping logical disks to SAN nodes is a natural extension of that technology. An array-based mapping strategy does not impose overhead on the SAN nodes, since the mapping is done by the controllers. I/O latency remains unchanged, since the array controllers have to open every packet anyway, and the mapping function can easily be integrated into the array’s management utility. Most SAN designs use external RAID controllers; hence, there is little or no additional cost with this approach. This strategy provides a security barrier at the logical disk level: with server port-to-logical disk mapping, sensitive data can be isolated to authorized user groups.
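Because the controller already decodes every command it receives, the access check amounts to a single table lookup in that path. A minimal sketch, reusing the kind of mapping table shown earlier (the names and the rejection message are illustrative assumptions):

def handle_io(initiator_wwn, lun, operation, mapping_table):
    """Array-side enforcement: the controller consults its mapping table
    while decoding the command it was going to decode anyway."""
    allowed = mapping_table.get(initiator_wwn, set())
    if lun not in allowed:
        return "rejected: logical disk not accessible to this initiator"
    return f"servicing {operation} on LUN {lun}"


table = {"50:06:0b:00:00:c2:62:00": {0, 1, 2}}
print(handle_io("50:06:0b:00:00:c2:62:00", 1, "read", table))   # serviced
print(handle_io("50:06:0b:00:00:c2:62:99", 1, "read", table))   # rejected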
SAN RAID Controllers
External RAID controllers for mission-critical environments are designed to operate in dual, active pairs that function analogously to high availability cluster nodes. When both controllers are operational, each does useful work servicing I/O requests. If a controller fails, its partner emulates its loop address and services I/Os directed to its native address and its failed partner’s address. Aside from a short failover transition period while the surviving controller cleans up any partially completed writes, controller failover is transparent to the SAN nodes.
In a SAN topology whose nodes support failover HBA drivers, and which uses dual, active RAID controllers with mirrored, coherent data caches and redundant switches, a failover anywhere in the I/O path will not cause a loss of data accessibility.
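The address takeover described above can be sketched in a few lines (hypothetical class and address values): when one controller of the pair fails, the survivor simply begins answering for both loop addresses.

class ControllerPair:
    """Dual, active RAID controllers: each services its own loop address
    until a failure, after which the survivor answers for both."""

    def __init__(self, addr_a, addr_b):
        self.serving = {"A": {addr_a}, "B": {addr_b}}

    def fail(self, dead):
        survivor = "B" if dead == "A" else "A"
        # The survivor first cleans up the partner's partially completed
        # writes (mirrored write cache), then emulates the failed address.
        self.serving[survivor] |= self.serving.pop(dead)
        return survivor


pair = ControllerPair(addr_a=0x01, addr_b=0x02)
survivor = pair.fail("A")
print(survivor, "now services loop addresses", sorted(pair.serving[survivor]))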
About the Author: Kevin Smith is Senior Director of Business Management and Marketing for External Products at Mylex Corporation (Fremont, Calif.).