Lower Costs Pushing Adoption of High-Performance Computing

Lower HPC costs make it possible for more companies to be cutting edge in their market.

by John Lee

Many challenges face the high-performance computing (HPC)/supercomputing industry today. Professionals are consistently asking themselves: How do we drive down prices while increasing performance? How do we curb high power costs? How can we manage bigger machines to support the production environment? What is the most efficient way to leverage the multi-core and many-core processors of today and tomorrow? How do we use and manage the heterogeneous computing environment necessary for the multi-petascale and exascale computing of the future?

One thing is certain: high-performance computing costs are dropping -- a move that is greatly expanding the relevance of HPC and offering new applications in other industries.

The forces driving down the cost of HPC systems are making HPC more accessible to organizations in the same way the widespread adoption of cell phones worked in the consumer market. Suddenly, industries that seemed to have fewer applications for HPC systems are finding ways to leverage them to improve their products, services, and solutions. Industry experts expected that the dropping cost would level off -- but the industry has instead responded by expanding the use of HPC systems into new applications.

Because HPC is no longer exclusively about the traditional scientific community, supercomputing has become a way for businesses to stay cutting-edge in the market. The benefits for new product development (such as modeling and simulation), together with HPC that has grown more affordable over the past 10 years, make being out in front an achievable goal for organizations of all sizes.

The Emergence of Multi-Core Processor Technology

Multi-core technology is following Moore’s Law of processor performance. These same technologies have allowed for the advent of virtualization, as well as the ability for each system to increase functionality. With each system’s enhanced functionality, even at the same scale (i.e., the same physical number of managed servers), HPC system performance continues increasing exponentially.

It wasn’t long ago that we went from single core to dual core. Today, we have 4-, 6-, 8-, and 12-core processors. With the same number of nodes, we can in principle have up to twelve times the peak performance of yesterday’s single-core systems. Since the actual number of managed nodes can stay the same, the complexity of managing the HPC system remains largely unchanged. Having multiple cores adds complexity at the application level, but from a system management perspective, it is less of an issue as you scale out the system.

In terms of pure LINPACK performance, HPC systems have scaled well, as evidenced by the world’s first petascale system and how quickly it superseded the world’s first terascale systems. However, sustaining that performance on real applications across a large number of multi-core processor systems is still difficult.
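To illustrate why sustained performance tends to lag behind peak figures, here is a minimal sketch of Amdahl's law, a standard textbook model that is brought in here purely for illustration. The function name `amdahl_speedup` and the parallel fractions used below are assumptions chosen for the example, not measurements of any real application.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup on `cores` cores when only `parallel_fraction` of the work scales.

    Amdahl's law: speedup = 1 / ((1 - p) + p / n)
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative (hypothetical) parallel fractions for a 12-core processor.
    for fraction in (0.50, 0.90, 0.99):
        speedup = amdahl_speedup(fraction, cores=12)
        print(f"{fraction:.0%} parallel code on 12 cores: {speedup:.2f}x")
```

Under this model, only code that is almost entirely parallel comes close to the twelvefold gain that twelve cores suggest on paper, which is why the burden shifts to the application.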

When you examine the performance of an HPC system, the word that comes to mind is “balance” -- balance between processor, memory, and IO, as well as applications and software tools that are smart enough to leverage these pieces properly. We have witnessed enormous computing capacity increases in the last 10 years. It will be up to the software programmers and the ISVs to leverage these advances to substantially impact application performance.

Advancements in Parallel Computing

All new supercomputers are clusters of some kind and therefore are distributed parallel systems. The shift towards parallel computing happened because it is more cost-effective to build large systems based on commodity building blocks than to build out the monolithic systems of the past. At the same time, it is impossible to build a single monolithic machine big enough to tackle the world’s biggest problems. Ultimately, it is easier to expand a distributed system to tackle these problems.

Parallel systems account for more than 95 percent of the top 500 machines today. The questions the industry is trying to answer now have to do with the choice of processors for parallel computing. Is it better to use traditional serial processors such as x86 CPUs or to use massively parallel processors such as GPUs and accelerators?

There are other difficulties: Distributed parallel computing has put the burden squarely on programmers to rework their software to take advantage of distributed parallel systems. Even with advancements in software tools, this is difficult. Another paradigm shift is coming with the advent of GPGPUs with many cores, which poses even greater challenges for programmers who must rewrite existing code to take advantage of the hardware's characteristics.

This scale-out model also presents management and networking problems. The network in essence becomes your computer, and the performance of your system becomes dependent on the performance of your network. Additional problems involving power, cooling, space, and density occur when you start to scale out to very large systems.
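The dependence on the network can be sketched with a simple latency-plus-bandwidth cost model for a single message exchange per compute step. This is a rough approximation under stated assumptions: the function names, the two interconnect profiles, and all numbers below are hypothetical and do not describe any particular product.

```python
def transfer_time(message_bytes: float, latency_s: float,
                  bandwidth_bytes_per_s: float) -> float:
    """Time to send one message: fixed latency plus size divided by bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def parallel_efficiency(compute_s: float, message_bytes: float,
                        latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Fraction of each step spent computing when communication is not overlapped."""
    comm_s = transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s)
    return compute_s / (compute_s + comm_s)


if __name__ == "__main__":
    compute_s = 1e-3        # assumed: 1 ms of computation per step
    message_bytes = 1e6     # assumed: 1 MB exchanged per step

    # Hypothetical interconnects: (latency in seconds, bandwidth in bytes/second).
    interconnects = {
        "low-latency fabric": (2e-6, 4e9),
        "commodity Ethernet": (50e-6, 1.25e8),
    }
    for name, (latency_s, bandwidth) in interconnects.items():
        eff = parallel_efficiency(compute_s, message_bytes, latency_s, bandwidth)
        print(f"{name}: {eff:.0%} of each step is useful computation")
```

With identical processors in every node, the share of time spent on useful work in this sketch is set almost entirely by the interconnect.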

Performance Improvements

From a hardware standpoint, AMD and Intel are both vying for x86-64 supremacy, and HPC end customers have benefitted much like PC consumers did in the late 1990s. As expected, with the standardization and commoditization of the server space, we are seeing HPC machines with truly amazing price/performance. AMD deserves much of the credit for ushering in the x86-64 architecture, integrating the memory controller into the CPU package, and riding on the DDR memory path.

With Intel and AMD trying to outdo one another on how many CPU cores they can fit in a CPU package, we have a third viable option with the emergence of nVIDIA and its Fermi GPUs. The efforts that nVIDIA has made in the HPC space cannot be overstated.

Massively parallel processing is here to stay. With AMD’s acquisition of ATi and their upcoming Fusion technology -- as well as Intel’s revival of the Larrabee project -- hardware innovations will continue to drive the HPC space and bring a lot of excitement to the industry.

Memory is probably the single most important factor affecting the performance of an HPC system. Think of it like the law of diminishing returns: the further you go down the IO hierarchy, the less impact you (generally) have on overall system performance. With the increasing number of CPU cores in the same CPU package, having access to very fast, low-latency, high-bandwidth memory is critical to ensuring that HPC applications run optimally. This is one of several reasons why GPU computing is showing so much promise in HPC.
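A rough way to picture why memory bandwidth matters so much is the roofline model, a common rule of thumb introduced here only as an illustration. The sketch below assumes a hypothetical processor with a peak of 100 GFLOP/s and 25 GB/s of memory bandwidth; the function name and the kernel intensities are likewise made up for the example.

```python
def attainable_flops(peak_flops: float, memory_bandwidth_bytes_per_s: float,
                     flops_per_byte: float) -> float:
    """Roofline estimate: performance is capped by compute or by memory traffic.

    `flops_per_byte` is the kernel's arithmetic intensity, i.e. how many
    floating-point operations it performs per byte moved from memory.
    """
    memory_bound_limit = memory_bandwidth_bytes_per_s * flops_per_byte
    return min(peak_flops, memory_bound_limit)


if __name__ == "__main__":
    peak = 100e9        # assumed peak: 100 GFLOP/s
    bandwidth = 25e9    # assumed memory bandwidth: 25 GB/s

    # Hypothetical kernels with low and high arithmetic intensity.
    for name, intensity in (("streaming kernel", 0.25), ("dense kernel", 8.0)):
        flops = attainable_flops(peak, bandwidth, intensity)
        print(f"{name}: {flops / 1e9:.2f} GFLOP/s attainable of {peak / 1e9:.0f} peak")
```

In this model, adding cores raises only the compute ceiling. A kernel that moves a lot of data per operation stays pinned to the memory limit unless bandwidth grows as well.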

For the Future

The future of HPC is never written in stone. For now, expect that systems will continue to scale out to larger numbers of processors and nodes as countries vie for the world’s first exascale system. Together with the growing use of accelerators such as GPUs and FPGAs, this trend will continue to put pressure on software developers and hardware integrators to figure out a way to make all these elements work well together.

Expect systems to continue on their current path of improving performance by leaps and bounds while maintaining a very attractive price/performance. The key will be extracting real application performance out of these hybrid machines. The vendor who can best put together a solution that addresses all of these challenges while making the system manageable and user-friendly will position itself as the HPC leader.

John Lee is the vice president of the advanced technology solutions group at Appro, where he is responsible for leading the company's hardware product development engineering team. He also leads the project management team responsible for deploying Appro’s complex cluster solutions. For questions and comments, you can contact Appro at
