Enterprise Grid Computing: The Value of Resource Pooling—A Transportation Analogy (Part 2 of 7)

A great way to understand the power and benefits of grids is to look at the philosophy and two key properties they share with large transportation systems.

Transportation systems and grids share a philosophy: they make large-scale resources available to a large user community on a shared basis. Jet aircraft may cost anywhere from a few million to more than 200 million dollars. A private aircraft can provide excellent service to its owner on a coast-to-coast flight. The obvious shortcoming of this solution, however, is that the cost of the plane and the fuel it takes to fly it across the continent makes this solution economically impractical for most people, and probably does not represent the best use of capital for general-purpose transportation. In spite of these realities, millions of passengers travel by air every year because aircraft resources are shared, and any single user pays only for the seat used, not for a complete jet and the infrastructure behind it.

There are tradeoffs in the sharing scenario. Shared-resource models come with overhead: users need to make reservations and manage their time to predetermined schedules, and they must wait in line to get a seat. The actual route may not be optimized for point-to-point performance: passengers may have to transfer through a hub, and the departure and destination airports may not be convenient relative to the passenger’s travel plans.

Aircraft used for shared transportation are architected for this purpose. Aircraft designed for personal transportation are significantly smaller, and would not be very efficient as a shared resource. Likewise, workstations repurposed into grids may impose certain limitations.

Transportation systems are heterogeneous, and sharing exists on a continuum. In an air-transportation system, users choose among a variety of resources, ranging from dedicated general-aviation and executive aircraft, through time-shared and commuter aircraft, to the very large aircraft used on long-haul flights. Computing grids are similar, with a variety of nodes. This heterogeneity needs to be managed to minimize its effects on performance.

While the air-transportation system is an instructive instantiation of a grid, it is so embedded in the fabric of society that we scarcely consider it as such. Grid computing will likely evolve much as aviation did sixty years ago, gradually gravitating toward an environment of networked, shared resources as technology and processes improve.

Two Key Grid Properties

Ideally, the resources in a computing grid should be fungible and virtualized. Two resources in a system are fungible if one can be used instead of the other with no loss of functionality. Two single dollar bills are fungible, in the sense that they will each purchase the same amount of goods. If one of the bills is lost or destroyed, another one can be used instead for the same purpose. In contrast, in most computer systems today, if one of two physically identical servers breaks, the second is not likely to be able to take over smoothly. The second server may not be in the right place, or the broken server may contain critical data on one of its hard drives, thus halting processing.

A system can be architected to attain fungibility, for instance, by keeping data separate from the servers that process it. A long-running computation can checkpoint its data frequently, so if a host breaks, the new host picks up the computation at the last checkpoint. If the server was running an enterprise application, it could unwind any uncommitted transactions and proceed. An online user may notice a hiccup, but the computations are correct.
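The checkpoint-and-resume idea can be illustrated with a minimal Python sketch. It assumes the checkpoint file lives on storage that any host can reach; the function name `run_with_checkpoints`, the `fail_at` failure hook, and the JSON file layout are hypothetical and chosen only for illustration.

```python
import json
import os
import tempfile


def run_with_checkpoints(total_steps, checkpoint_path, fail_at=None):
    """Sum the integers 0..total_steps-1, saving progress after every step.

    fail_at is a hypothetical hook that simulates a host breaking at that step.
    """
    # A replacement host starts here: resume from the last checkpoint if any.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    else:
        state = {"next_step": 0, "total": 0}

    for step in range(state["next_step"], total_steps):
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated host failure")
        state["total"] += step
        state["next_step"] = step + 1
        # Write to a temporary file, then rename it into place, so a crash
        # cannot leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)
    return state["total"]


def demo():
    """Run the job on a 'host' that breaks, then finish it on a second one."""
    with tempfile.TemporaryDirectory() as shared_dir:
        path = os.path.join(shared_dir, "job.ckpt")
        try:
            run_with_checkpoints(10, path, fail_at=6)
        except RuntimeError:
            pass  # the first host is gone; its data is not
        return run_with_checkpoints(10, path)
```

Because the job's state is kept apart from the host that processes it, the second call picks up at step 6 rather than starting over, and the final result matches an uninterrupted run.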

Fungibility helps improve operational behaviors. A node operating in a fungible fashion can be taken out of operation and replaced by another one on the fly. In a lights-out environment, malfunctioning nodes can be left in the rack until the next scheduled maintenance.
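As a rough sketch of what replacing a node on the fly might look like, the Python class below tracks healthy and failed nodes and moves work off a node when it is reported broken. The `NodePool` class, its method names, and the node and job names are hypothetical; a real grid scheduler would be considerably more involved.

```python
class NodePool:
    """A pool of interchangeable (fungible) nodes."""

    def __init__(self, node_ids):
        self.healthy = set(node_ids)
        self.failed = set()      # left "in the rack" until maintenance
        self.assignments = {}    # job name -> node id

    def assign(self, job):
        """Place a job on any idle healthy node; which one does not matter."""
        busy = set(self.assignments.values())
        idle = sorted(self.healthy - busy)  # sorted only to be deterministic
        if not idle:
            raise RuntimeError("no idle healthy node available")
        self.assignments[job] = idle[0]
        return idle[0]

    def report_failure(self, node_id):
        """Retire a node and move its jobs elsewhere; return the moves."""
        self.healthy.discard(node_id)
        self.failed.add(node_id)
        moved = {}
        for job, node in list(self.assignments.items()):
            if node == node_id:
                del self.assignments[job]
                moved[job] = self.assign(job)
        return moved


pool = NodePool(["n1", "n2", "n3"])
first_node = pool.assign("payroll")
moves = pool.report_failure(first_node)
```

The failed node simply stays in the `failed` set; nothing in the pool requires it to be repaired before the job continues on another node.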

Virtualization lets applications ignore physical differences among systems. For instance, from an application perspective, systems with different amounts of physical memory behave identically, so the application runs without modification; the only difference is in performance. This capability is essential in a grid environment.

Resources that are fungible and virtualized are easier to allocate from a common pool according to an organization’s policy. Resource pooling in a grid environment blurs traditional distinctions between clients, servers, and even mainframes: resources are drawn from clients through cycle scavenging. It is also possible to have servers running nothing but grid jobs, which can be likened to extensions to programs running on a client. In this landscape mainframe-hosted applications become additional resources on a grid. Virtualization would make accessing these resources transparent relative to allocation and billing settlements.
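One way to picture policy-driven allocation from a common pool is the Python sketch below. It assumes an organization's policy can be reduced to an ordered list of preferred resource kinds, which is a deliberate simplification; the `Resource` type, the `allocate` function, and the resource names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    kind: str       # "client" (scavenged cycles), "server", or "mainframe"
    free_cpus: int


def allocate(pool, cpus_needed, policy):
    """Grant CPUs from the pool, preferring kinds in the order policy lists them."""
    rank = {kind: i for i, kind in enumerate(policy)}
    candidates = sorted(
        (r for r in pool if r.kind in rank and r.free_cpus > 0),
        key=lambda r: rank[r.kind],
    )
    grants = []
    remaining = cpus_needed
    for r in candidates:
        if remaining == 0:
            break
        take = min(r.free_cpus, remaining)
        grants.append((r.name, take))
        remaining -= take
    if remaining > 0:
        raise RuntimeError("pool cannot satisfy the request")
    # Commit only after the whole request is known to fit.
    by_name = {r.name: r for r in pool}
    for name, take in grants:
        by_name[name].free_cpus -= take
    return grants


pool = [
    Resource("desk-17", "client", 2),
    Resource("rack-03", "server", 8),
    Resource("mf-01", "mainframe", 16),
]
policy = ["server", "client", "mainframe"]
first = allocate(pool, 6, policy)
second = allocate(pool, 5, policy)
```

The caller asks only for a number of CPUs. Whether they come from a desktop, a rack server, or a mainframe is decided by the policy, not by the job.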

In a highly virtualized, fungible, and modularized environment, it is possible to deploy computing resources in small increments to respond to correspondingly small variations in demand. Procurement cycles are compressed from months or even years to almost real time (i.e., on a per-job basis), which significantly improves business agility. Running a very large job does not require purchasing extra servers: the grid infrastructure can search for computing resources to run the job somewhere on the Internet. The user is billed for the resources used, not for the capital cost of new computers. This is the utility model.
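The difference between paying per job and buying capacity comes down to simple arithmetic, sketched here in Python. Every figure in the example (node count, hours, hourly rate, server price) is invented for illustration and does not come from any real price list.

```python
def utility_charge(nodes, hours, rate_per_node_hour):
    """Cost of one job when billed only for the resources it uses."""
    return nodes * hours * rate_per_node_hour


def purchase_cost(nodes, price_per_server):
    """Capital cost of buying enough servers to run the same job in-house."""
    return nodes * price_per_server


# Hypothetical job: 50 nodes for 8 hours at 0.50 per node-hour,
# compared with buying 50 servers at 3,000 each.
per_job = utility_charge(50, 8, 0.5)
capital = purchase_cost(50, 3000)
jobs_to_break_even = capital / per_job
```

With these made-up numbers, the purchased servers would need to run several hundred such jobs before they cost less than paying per job, and that is before power, space, and administration are counted.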

A pure utility model is not yet practical, because the infrastructure is still primitive and evolving, a pervasive network of service providers is not yet available, and no one has yet figured out how to run jobs safely and charge fairly for the work.

About the Author

Enrique Castro-Leon is an enterprise architect and technology strategist at Intel Solution Services, where he is responsible for incorporating emerging technologies into deployable enterprise IT business solutions.