Enterprise Grid Computing: Parallel Distributed Computing (Part 5 of 7)

A core capability of the grid is parallel distributed computation—scaling application performance beyond what is possible with one computer or grid “node.” Unfortunately, most applications today don’t take advantage of parallel computing.

In an Enterprise context, grid computing represents one more technique for increasing equipment utilization. It extends the continuum of server virtualization from multiple logical servers running in one physical server to one application running in multiple physical servers. If a job can be decomposed into multiple tasks, and each task carried out on a different node, the time it takes to run the job is shortened correspondingly, minus the overhead of parceling out the work. Equipment utilization improves further in a grid environment with dynamic allocation, where CPU time is allocated only while a job needs it; when the job no longer needs a CPU, it is released and made available to another job.
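This tradeoff between decomposition and parceling overhead can be sketched with a toy model; the linear per-task scheduling cost is an assumption for illustration, not a property of any particular grid scheduler:

```python
def grid_speedup(job_time_s, n_nodes, overhead_s_per_task):
    """Speedup of a job split into n equal tasks, one per node,
    charging a fixed cost for parceling out each task."""
    parallel_time = job_time_s / n_nodes + n_nodes * overhead_s_per_task
    return job_time_s / parallel_time

# A 1000-second job on 10 nodes with 1 second of dispatch overhead per task
# runs in 110 seconds: roughly a 9.1x speedup rather than the ideal 10x.
print(grid_speedup(1000, 10, 1.0))
```

As the overhead term grows relative to the task size, adding nodes eventually stops helping, which is one reason small jobs are better kept on a single machine.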

Because moving data around has an inherent cost, computational resources are easier to utilize if they are co-located in a single data center. As we saw in the discussion about the hierarchical grid framework, technical considerations (i.e., Visible Grid level usage models) may be overridden by considerations at the Business Grid level: it may be less expensive to host grid services in countries like China or India. The deciding factor is balancing the inherent cost of managing distributed data against the savings from running applications across countries.

An interesting topic for speculation is whether the grid will mirror the patterns for outsourcing seen elsewhere in the information industry. Because a grid architecture tends to blur the effect of geographical distance, and labor is a significant component of the operation of a data center, it would not be surprising to see grid data centers migrate to countries with lower labor costs. This effect may be tempered by security and privacy concerns. Security technology will continue improving, although privacy is a non-technical issue likely to remain. These concerns may limit initial grid deployments to multinationals where presence in multiple countries keeps a grid within organizational boundaries.

In spite of business considerations, physical laws must be taken into account in the way grids are planned and operated. The enormous build-out of fiber optics has lowered the cost of moving data across continents. However, latency remains an issue for parallel computations: it takes up to 0.3 seconds for a signal to travel halfway around the world over a satellite link, and somewhat less over a fiber-optic link, before accounting for additional equipment delays. This delay, or latency, determines the minimum unit of work, or working set, that can be processed efficiently by the system.

For instance, assume a hypothetical case in which a computer in the United States requests a transaction be processed at a Chinese data center; the transaction takes one millisecond to execute, but the round-trip latency is one second. Furthermore, assume that the results of the first transaction are needed before the second one can be sent. In this setup, grid utilization is 1 millisecond per second, or a mere 0.1 percent, which is probably unacceptable. Circumventing this problem often requires clever programming: if the application could be re-engineered to keep 1,000 transactions in transit simultaneously, the remote node would stay fully utilized.
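A back-of-the-envelope model makes this arithmetic concrete; the function and its parameters are illustrative, not part of any grid API:

```python
def utilization(service_time_s, round_trip_s, in_flight):
    """Fraction of time the remote node is busy when up to `in_flight`
    transactions can be outstanding on a link with the given latency."""
    busy = in_flight * service_time_s
    cycle = max(round_trip_s, busy)  # latency or compute, whichever dominates
    return min(busy / cycle, 1.0)

# One transaction at a time: 1 ms of work per 1 s round trip, ~0.1 percent.
print(utilization(0.001, 1.0, 1))
# 1000 transactions in transit keep the remote node essentially saturated.
print(utilization(0.001, 1.0, 1000))
```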

The same inference can be made about data: if a grid can consume 6.4 GB of data per second and the round-trip latency is one second, data sets need to be at least 6.4 GB in size. If the data sets are smaller than that, the system starves and utilization goes down. The product of processing speed, in bytes per second, and latency, in seconds, yields the characteristic working set for a given problem (or the characteristic message size, if a communications link is involved), measured in bytes. This is the smallest problem set that will fully utilize the grid. The inefficiency of a grid working on undersized data sets is analogous to that of a 747 jetliner flying with too many empty seats. Thus, small problems are still better processed locally on a single computer.
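The working-set rule of thumb is simply the product of two numbers:

```python
def working_set_bytes(throughput_bytes_per_s, latency_s):
    """Characteristic working set: throughput times latency, the smallest
    data set that keeps the grid fully utilized."""
    return throughput_bytes_per_s * latency_s

# A grid consuming 6.4 GB/s behind one second of latency needs
# data sets of at least 6.4 GB to avoid starving.
print(working_set_bytes(6.4e9, 1.0) / 1e9, "GB")
```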

Parallel computation needs to be harnessed at each level in the hierarchical model. The parallelism we have been discussing so far is geographically distributed parallelism, where parts of an application get doled out for execution across continents.

Applications also need to consider taking advantage of cluster-level parallelism across machines co-located in a data center, node-level parallelism with parts of a job running across multiple CPUs in a server or grid node, and parallelism across multiple cores and functional units within a CPU.

Regarding multi-core parallelism: while Moore's Law continues unabated in terms of gates per chip, another turning point has been reached in this decade. Until very recently, extra performance came from ever-higher processor clock speeds and from additional functional units that uncover parallelism within the instruction stream. Continuing on this path has led to increasing heat-dissipation problems.

At this point, it becomes more power-efficient to run two or more processor cores on the same CPU chip. One core of a dual-core CPU may be slightly less powerful than the prior-generation’s single-core version. However, when the two cores are used together, they are significantly faster than the single-core version.
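The arithmetic behind this tradeoff can be sketched with hypothetical figures; the 85 percent per-core speed is an assumption for illustration, not a published benchmark:

```python
# Assume each core of a dual-core CPU runs at 85% of the speed of the
# prior-generation single-core part (hypothetical figure).
single_core = 1.00
dual_core_per_core = 0.85

serial_app = dual_core_per_core        # uses only one core: slower than before
parallel_app = 2 * dual_core_per_core  # uses both cores: well ahead

print(serial_app, parallel_app)
```

The serial application regresses to 0.85x of its old performance, while the parallelized one reaches 1.7x, which is exactly the user-resistance scenario described below for single-core applications.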

This situation will create a powerful motivation for both hardware suppliers and application vendors to incorporate parallelism into their solutions. Vendors may experience significant user resistance to migrating to a multi-core environment if an application can use only one of the cores, because its performance on that single core would be lower than on a prior-generation single-core CPU.

Over the long term, application vendors and consumers will become increasingly comfortable with building parallelism into their applications. This familiarity with parallelism will also make it easier, eventually, to port and run these applications in a grid environment.

It is fair to say that a large proportion of applications today do not take advantage of parallel computation. However, it is also fair to say that most applications need not be re-architected for parallelism—only those that are mission critical to an organization.

Applications such as word processors do not have performance requirements that justify running them on multiple CPUs. These applications are inherently "parallel" in an office setting, as multiple users run multiple instances at any given time.

The same applies to the front end in a three-tier application. Additional capacity can be added to the front end by bringing in additional Web servers in a scale-out setting, with the Web servers front-ended by a load balancer. This is essentially an embarrassingly parallel program, because transactions coming from outside are mostly independent of each other. If the business logic in the mid-tier is not designed for parallelism, however, all incoming transactions must be queued behind one server. When this serialization is merely an implementation artifact, it presents an opportunity for parallelism in a grid setting, although the modifications require investment from the software vendor.
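A minimal sketch of this embarrassingly parallel pattern, using hypothetical server names, is a round-robin load balancer: because the transactions are independent, adding a Web server adds capacity with no coordination between requests.

```python
import itertools

# Round-robin over a pool of Web servers (names are hypothetical).
servers = itertools.cycle(["web-1", "web-2", "web-3"])

def route(transaction_id):
    """Hand each independent transaction to the next server in the pool;
    no transaction ever waits on the result of another."""
    return next(servers)

assignments = [route(t) for t in range(6)]
print(assignments)  # six transactions spread evenly across three servers
```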

Likewise, if most database updates do not overlap, in theory they could be done in parallel, but legacy restrictions in the implementation usually prevent this parallelism from happening. Database vendors have made significant strides in implementing back-end parallelism, with a single database serviced from multiple nodes, but the problem has not been fully addressed.

Parallel operation is possible today, but the expectation for grids goes beyond enabling parallel operation to dynamic provisioning. Unfortunately, most parallel databases require careful manual configuration. The dream of on-the-fly, dynamic provisioning is still a goal, not a reality.

About the Author

Enrique Castro-Leon is an enterprise architect and technology strategist at Intel Solution Services, where he is responsible for incorporating emerging technologies into deployable enterprise IT business solutions.