72-Node Cluster Shows Sort, Expansion Potential

What does the future of large-system Windows NT configurations look like? Ask the folks at Tandem. The Tandem Division of Compaq Computer Corp. (Cupertino, Calif., www.tandem.com) helped construct a 72-node, 144-processor Windows NT cluster for Sandia National Laboratories (Albuquerque, N.M., www.sandia.gov). The configuration was targeted at simulations and other non-business and secret governmental tasks, but it provided a glimpse of what large-system, data-centric configurations built on Windows NT could look like in the future.

Before turning the keys of the system -- named Kudzu for the perennial vine that tends to grow in every direction -- over to Sandia researchers, Tandem and Sandia programmers configured the system to run a benchmark test called the 1 Terabyte (TB) Sort. The benchmark gauged the configuration's speed against the only other machine -- a 32-processor Silicon Graphics Origin2000 system -- that had been configured to run such a test.

Although Kudzu includes no database structure, Sandia researchers contend the sorting demonstration is relevant to commercial interests, particularly for data warehousing and decision support purposes.

Carl Diegert, a scientist at Sandia, says the cluster will be used to sort through the data produced by supercomputer-based simulations being conducted on other machines at the research center. "What we’re going to do is data management and visualization of the results of these simulations. Because the simulation results are so large, we can’t warehouse the results for 20 simulations [on the machine that runs the simulations]. That’s what Kudzu does for us."

The Kudzu machine is built using ServerNet I technology supplied by Tandem. ServerNet I is an interconnect system that is normally used in configurations of six nodes or fewer and more typically supports Tandem’s Windows NT-based clusters used for commercial transaction processing. ServerNet I includes software extensions that allow it to support the Intel Virtual Interface (VI) architecture, and provides the high-speed node-to-node communication technology that enabled the configuration of a 72-node cluster.

During the sorting test, only 68 of the 72 nodes had been installed. The system ran Windows NT Workstation 4.0 with Service Pack 3 on most of the nodes and Windows NT Server on the controlling node.

The NT cluster, built with 400 MHz Pentium II Compaq ProLiant 1850R dual-processor systems, completed the 1 TB sorting operation in less than 50 minutes. The Silicon Graphics Origin2000 configuration performed essentially the same 1 TB sort in 2 hours and 32 minutes.

"We have bettered that [previous benchmark result] by 300 percent and two-thirds the price, for an overall price performance measurement [improvement of] five times," says Pauline Nist, vice president of products and technology at Tandem.
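Nist's figures can be checked with back-of-the-envelope arithmetic. The sketch below uses only numbers from the article (run times and the quoted "two-thirds the price"); the actual system prices are not given, so the price is expressed as a ratio:

```python
# Rough check of the price/performance claim using figures from the article.
sgi_minutes = 2 * 60 + 32     # Origin2000: 2 hours, 32 minutes
kudzu_minutes = 50            # Kudzu: just under 50 minutes

speedup = sgi_minutes / kudzu_minutes       # roughly 3x faster
relative_price = 2 / 3                      # "two-thirds the price"
price_perf_gain = speedup / relative_price  # roughly 4.6x, rounded up to "five times"

print(round(speedup, 1), round(price_perf_gain, 1))
```

The ~3x speedup at two-thirds the price works out to about a 4.6x price/performance improvement, consistent with the "five times" figure quoted.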

Nist says that even with 68 nodes to share the work, the Kudzu machine was I/O bound during the test, leading to average processor utilization of about 46 percent. "If you go through the technical paper, you will find the bottleneck with this kind of system is not compute -- the bottleneck is I/O." She says the price/performance ratio could have been better had the system been configured with uniprocessor nodes, given the moderate processor utilization levels the two-way SMP boxes experienced.

The Kudzu system’s results were audited by Chris Nyberg, president of sort software vendor Ordinal Technology Corp. (Orinda, Calif., www.ordinal.com). "The Terabyte Sort really isn’t a defined benchmark," Nyberg says. "The only other one that has been published is the one that my company and Silicon Graphics ran last December. Sandia asked me to audit it -- I think they wanted the results audited for their own credibility." Nyberg adds, "They did indeed sort 1 terabyte of data."

Several benefits already have come out of the project, particularly for Tandem. David Cossock, software performance specialist with Compaq’s Tandem Labs, says the company has applied for a patent on the parallel merge algorithm used to perform the test. "Given the projections for future enhancements, I don’t believe any other platform can sort as fast as this one can -- and that includes Tandem’s NSK. Sorting is clearly the dominant application for a COBOL programmer. It is hard for me to believe that superior capacity is not meaningful for commercial applications," Cossock says.

Microsoft Corp. senior researcher Jim Gray, a noted database expert and one of the creators of Microsoft’s TerraServer image database (www.teraserver.com), points out that the demonstration typifies one requirement placed on any large database. "Sorting is a metaphor for many kinds of commercial operations. In the database world, you [have] the join operator. The join operator includes traffic very similar to sorting."
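Gray's point -- that a database join generates work much like a sort -- can be seen in a minimal sort-merge join, the classic join algorithm whose cost is dominated by sorting both inputs. The function and data below are illustrative, not from any system mentioned in the article:

```python
# Minimal sort-merge join: both inputs are sorted on the join key first,
# which is why join workloads stress a system much the way a big sort does.
def sort_merge_join(left, right):
    """Join two lists of (key, value) pairs on their keys."""
    left = sorted(left)    # the sort phases dominate the work,
    right = sorted(right)  # just as in the benchmark
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # emit every right-side match for this key
            jj = j
            while jj < len(right) and right[jj][0] == lk:
                out.append((lk, left[i][1], right[jj][1]))
                jj += 1
            i += 1
    return out

print(sort_merge_join([(1, "a"), (2, "b")], [(2, "x"), (3, "y")]))
# -> [(2, 'b', 'x')]
```

Once both sides are sorted, the merge itself is a single linear pass -- the expensive, data-movement-heavy part is the sorting.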

Still, some outside observers question what commercial value the system offers. "There is actually no database -- from any vendor -- inside this benchmark. It is a high-volume sorting test, more applicable to supercomputing purposes," contends Jeff Jones, program manager, data management marketing for IBM Software Solutions. "Given that I believe databases are a mandatory part of almost all commercial applications, I’m not sure this 72-node demo is indeed very representative of typical commercial use."

Proponents note that the configuration represents another example of scalability demonstrated by Windows NT-based systems. Microsoft’s Gray observes, "This is fundamentally a clustering application. If you have twice as much money, you could build twice as much computer."

On that point IBM’s Jones agrees: "What’s good [here] is that Compaq/Tandem did this large sorting test using the Intel VI Architecture to enable 72 nodes to be interconnected. This is cool, and further endorses this emerging standard [for] connecting multiple servers together into a cluster for high-end computing purposes."

Microsoft’s Gray acknowledges that critics will emerge who will dispute the NT scalability aspect of the Kudzu machine. He adds, "The only way to argue with such people is to show them that they’re wrong."

How the 1 TB Sort was conducted:

  • 1 TB of data was broken into 67 pieces
  • Each of 67 nodes sorted its block of data locally
  • The sorted blocks were partitioned in preparation for the parallel merge
  • A parallel merge finalized the sort
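The steps above amount to a classic distributed external sort: local sorts on each node, then a merge of the sorted runs. A minimal single-process sketch of the same idea follows; the node count and data are illustrative, and the partition-plus-parallel-merge steps are collapsed into one k-way merge rather than reproducing the actual Kudzu implementation:

```python
import heapq

def distributed_sort_sketch(records, num_nodes=4):
    """Toy sketch of the benchmark's phases."""
    # 1. Break the data into one block per node.
    blocks = [records[i::num_nodes] for i in range(num_nodes)]
    # 2. Each "node" sorts its block locally.
    runs = [sorted(b) for b in blocks]
    # 3.+4. Partitioning plus parallel merge, collapsed here
    #        into a single k-way merge of the sorted runs.
    return list(heapq.merge(*runs))

print(distributed_sort_sketch([5, 3, 8, 1, 9, 2, 7]))
# -> [1, 2, 3, 5, 7, 8, 9]
```

Because each run is already sorted, the final merge is a streaming operation -- which is consistent with Nist's observation that I/O, not compute, was the bottleneck.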