Enterprise Grids: A Framework (Part 3 of 7)
We explore what grids are, their usefulness, and how they should be implemented and deployed.
Reaching an industry consensus on the concept of computing grids has been a slow and painful process. One factor has been the sheer breadth and scope of grids. The first challenge is defining grids: discovering what grids are, understanding their usefulness, and exploring how they should be implemented and deployed.
It is difficult to come up with a meaningful definition for grids that will not become obsolete as technology rapidly evolves. Instead, we take a different approach: a flexible framework that can be adjusted as technology evolves. The framework will also serve as the basis for a discussion of usage models in the next article in this series. The framework reflects an inherent hierarchical structure in grids; that is, grids can be understood in terms of multiple levels.
Much of the confusion stems from the fact that each level exhibits distinct traits. Quite often two people come to different conclusions about grids, both of which are correct, because they are unwittingly operating at two different levels of discourse.
In a grid ecosystem we can clearly distinguish three domains: the Physical Grid, the Visible Grid, and the Business Grid. These are not different grids; they are different logical overlays analogous to the ISO OSI model used to explain the workings of the Internet through multiple layers.
The Visible Grid encapsulates the commonly understood notion of grids, namely a collection of nodes forming a co-located cluster, and geographically distributed clusters (and individual nodes as well) forming a computing grid.
Below the Visible Grid lies the Physical Grid, comprising the elements and technologies used to build grid nodes.
The upper layer of the Visible Grid also marks the boundary between the hardware and the Business Grid region. The layers of the Business Grid are defined by the size of the organizations that deploy or use a grid. Grids owned by a single organization are departmental grids. A grid large enough to cross organizational boundaries within a company is an intra-grid, which lies behind the company's firewall. Still larger grids may span more than one company. These are extra-grids (see Note 1, below), which can be a group of intra-grids running an application collaboratively in an outsourcing or other business arrangement. The highest abstraction considered in this model is that of Virtual Organizations (VOs), which are essentially a business abstraction on top of extra-grids. The concept of Virtual Organizations is described in the paper “The Anatomy of the Grid” (see Note 2).
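The organizational taxonomy above can be captured in a small classification sketch. This is purely illustrative; the class, field names, and thresholds below are our own assumptions, not part of any grid standard.

```python
from dataclasses import dataclass

# Hypothetical model of the organizational grid taxonomy described above.
# Names and fields are illustrative assumptions, not a standard API.
@dataclass
class Grid:
    departments: int   # organizational units within one company
    companies: int     # distinct companies participating

def classify(grid: Grid) -> str:
    """Map a grid's organizational span to the taxonomy in the text.

    A Virtual Organization would be a business abstraction layered on
    top of an extra-grid, not a separate category of hardware span.
    """
    if grid.companies > 1:
        return "extra-grid"        # spans more than one company
    if grid.departments > 1:
        return "intra-grid"        # crosses departments behind one firewall
    return "departmental grid"     # owned by a single organization

print(classify(Grid(departments=1, companies=1)))   # departmental grid
print(classify(Grid(departments=3, companies=1)))   # intra-grid
print(classify(Grid(departments=5, companies=4)))   # extra-grid
```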
The notion of grid VOs crosses enterprise boundaries. The closest examples today are represented by companies such as Boeing or Airbus Industrie. Building a new aircraft “platform” today (such as the 787) is an exercise in collaboration between the primary organization (namely, the Boeing Company) and a large number of technology partners and suppliers. VOs provide a collaborative environment to build a product well beyond the reach of the resources of a single company. The nominal platform “owner” reaches out to other companies (such as Northrop Grumman for building parts of the airframe or Raytheon and Allied Signal for avionics).
In this process the companies involved may share the grid-computing and data resources needed to get the job done, in a manner similar to that of business partners in a supply chain management (SCM) arrangement. Planning future platforms around this collaborative model eases the transition between platform development and production. At this point, technical and enterprise computing converge, becoming parts of a continuum in a development process. Grid computing facilitates these relationships through a computing environment that makes it easy to exchange data and share computing resources.
Because computing grids are by definition inherently distributed entities, an essential ingredient to understanding grids is the technology used to interconnect or bind together the grid entities at each level. These technologies are specific to each level. The interconnect technologies and software for each layer have been captured in Figure 1.
We won’t cover each layer in detail due to space limitations. Sampling a few layers, at the lowest level the CPUs and chipsets in a baseboard are “glued” together with conductive traces running on the baseboard with protocols such as HubLink. Add-on cards may be connected through PCI technology, such as PCI-X or PCI Express.
The nodes in a cluster may be connected through Ethernet or InfiniBand technology, whereas the nodes in a grid are tied together through the WAN or some type of MAN or ATM/SONET link.
For each layer in the hierarchical model there is a characteristic set of software that makes the layer run. Microcode is the “software” associated with CPUs or co-processors, whereas firmware (the BIOS or EFI) is used to control and configure baseboards. Higher up, the operating system runs on nodes. Grid middleware is used to extend the reach of the operating system to integrate nodes into actual grids. Grid end-user applications run on top of the operating system.
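The layer-by-layer pairing of interconnect and software can be summarized in a small lookup table. The rows below merely echo the examples given in the text (Figure 1 holds the full picture); the data-structure form and helper function are our own sketch.

```python
# Illustrative summary of the layered model described in the text.
# Rows echo the examples given; Figure 1 in the article is authoritative.
LAYERS = [
    # (layer,         interconnect example,          software example)
    ("CPU/chipset",   "baseboard traces (HubLink)",  "microcode"),
    ("baseboard",     "PCI-X / PCI Express",         "firmware (BIOS/EFI)"),
    ("cluster",       "Ethernet / InfiniBand",       "operating system"),
    ("grid",          "WAN / MAN / ATM-SONET",       "grid middleware"),
]

def software_for(layer: str) -> str:
    """Look up the characteristic software for a given layer."""
    for name, _interconnect, software in LAYERS:
        if name == layer:
            return software
    raise KeyError(layer)

print(software_for("cluster"))   # operating system
print(software_for("grid"))      # grid middleware
```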
It is possible to think of IT processes as the “software” for the upper grid abstraction layers. These IT processes allow the configuration, procurement, deployment and management of all types of grids.
The highest form of “software” is made up of the business processes used to run a grid. The business processes provide guidance on how composite applications are built and interfaced under SOA guidelines.
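In SOA terms, a composite application chains independently owned services behind stable interfaces. A minimal sketch follows; the service names and catalog figures are invented for illustration and do not represent any real pricing system.

```python
# Toy composite application in the SOA spirit described above.
# Service names and payloads are invented for illustration.

def pricing_service(part_id: str) -> float:
    """Stand-in for a partner-owned pricing service."""
    catalog = {"wing-spar": 120_000.0, "avionics-bay": 450_000.0}
    return catalog[part_id]

def quote_service(part_ids: list) -> float:
    """Composite service: aggregates results from pricing_service."""
    return sum(pricing_service(p) for p in part_ids)

print(quote_service(["wing-spar", "avionics-bay"]))   # 570000.0
```

In a real deployment each function would sit behind a network interface owned by a different business partner; the business processes described above govern how such interfaces are composed.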
Finally, looking at this model from a business strategy perspective, an important consideration is the “granularity” of each entity in a layer (that is, the economic impact per unit for each of the layers). The approximate order of magnitude has been captured in the second column of Figure 1. For instance, a CPU can be had for a price that ranges from a few dollars to a few thousand dollars, whereas a baseboard can cost anywhere from a hundred to a few thousand dollars. Likewise, the cost of a node may range from thousands of dollars to hundreds of thousands, and grids and clusters may cost several million. Economic granularity determines the business models to be used at each level: purchasing a single server can be approached almost casually, whereas purchasing one thousand may require a carefully crafted technology deployment strategy. Likewise, building an extra-grid infrastructure may require several months of procurement negotiations.
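The order-of-magnitude argument can be made concrete with a back-of-the-envelope tally. The node price and counts below are hypothetical figures chosen only to show how per-unit granularity compounds at grid scale.

```python
# Back-of-the-envelope cost roll-up; all figures are hypothetical.
node_cost = 5_000          # a mid-range node, in dollars
nodes_per_cluster = 200
clusters_in_grid = 10

cluster_cost = node_cost * nodes_per_cluster    # 1,000,000
grid_cost = cluster_cost * clusters_in_grid     # 10,000,000

print(f"cluster: ${cluster_cost:,}")   # cluster: $1,000,000
print(f"grid:    ${grid_cost:,}")      # grid:    $10,000,000
```

Even with a modest per-node price, the grid total lands three orders of magnitude above the single-server purchase, which is why the procurement process changes character at each layer.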
Understanding grids all at once constitutes an intractable task. The framework described in this article allows analyzing grids one layer at a time in a divide-and-conquer approach. It also allows adjusting the level of detail to the task at hand, from strategic business planning all the way down to the minutest technical implementation details. We will use this model in the next article when we discuss grid usage models. Usage models tell us what grids are useful for, a precursor to ROI analysis.
¹ The notion of intra-grids and extra-grids is attributed to IBM (Grids 2004: From Rocket Science to Business Service, The 451 Group.)
² The Anatomy of the Grid: Enabling Scalable Virtual Organizations. I. Foster, C. Kesselman, S. Tuecke. International Journal of Supercomputer Applications, 15(3), 2001. Available at http://www.globus.org/alliance/publications/papers.php.
Enrique Castro-Leon is an enterprise architect and technology strategist at Intel Solution Services, where he is responsible for incorporating emerging technologies into deployable enterprise IT business solutions.