Distributed Capacity Planning, Part 1

Editor's note: In part one of this two-part article, we review a brief history of Capacity Planning and explain how to prioritize your CP needs and control CP in scalable, distributed environments.


Capacity Planning has been a discipline practiced by IT since before the invention of the mainframe. It encompasses different measurement, analysis, modeling and reporting techniques and strategies. But with the advent of distributed computing, with hardware resources competing in what may be termed a commodities market, should Capacity Planning be treated with the same reverence as in the past?

We will discuss Capacity Planning's history, so that we understand and can sing the hymn "But That's How We've Always Done It." But, more importantly, we want to point out:

  • The role of IT as a mission-critical Service Provider to Lines of Business in a company, and why it is so important to focus on Service Delivery for the survival of the company;
  • Whether it makes sense in this decade to keep doing Capacity Planning the same old way because "That's How We've Always Done It";
  • What's really behind the mainframe costing model (long lead times, millions of capital equipment dollars being budgeted, necessary facilities, etc.) and why it may no longer make sense today;
  • When it does make sense to do Capacity Planning and how it aligns with the more important concept of Service Delivery;
  • That there are alternatives to complex tools that model down to the disk revolution.

Capacity Planning's Beginnings

In 1965, Gordon Moore, who later co-founded Intel, predicted that the capacity of a computer chip would double every year. Moore had looked at the price/performance ratio of computer chips -- the amount of performance available per dollar -- over the previous three years and simply projected it forward. Moore himself didn't believe that this rate of improvement would last long. But ten years later, his forecast proved true, and Moore then predicted that chip capacity would double every two years. To this day, Moore's predictions have held up, and engineers now call the average rate of capacity increase -- a doubling about every 18 months -- Moore's Law.

Moore's Law is likely to hold up for at least another 20 years. Storing all those bits shouldn't be a big problem either. Consider 1983, the year that IBM released the PC/XT. This was the first PC with an internal 10 Megabyte hard disk. Customers who wanted to add the 10 Megabyte drive to existing systems needed to buy a $3000 kit -- which made the cost per Megabyte $300. Thanks to the exponential growth described by Moore's Law, by the end of 1997 6.1 Gbyte hard drives were selling for close to $400. That comes to less than $0.07 per Mbyte. And holographic memory is a new technology that can hold terabytes of data in less than a cubic inch of volume -- implying that all of the volumes in the Library of Congress could fit in a holographic memory about the size of a fist.
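
The cost-per-Megabyte arithmetic above can be checked with a short sketch. The prices and capacities are the ones quoted in the text; the function name is ours, and we assume 1 Gbyte is counted as 1000 Mbytes.

```python
def cost_per_megabyte(price_dollars: float, capacity_mb: float) -> float:
    """Return the price of one Megabyte of storage, in dollars."""
    return price_dollars / capacity_mb


# 1983: the $3000 kit for a 10 Megabyte drive (figures from the text).
cost_1983 = cost_per_megabyte(3000, 10)

# End of 1997: a 6.1 Gbyte drive for about $400 (figures from the text).
# Assumption: 1 Gbyte is treated as 1000 Mbytes.
cost_1997 = cost_per_megabyte(400, 6.1 * 1000)

# How many times cheaper a Megabyte became over those years.
price_drop = cost_1983 / cost_1997

print(f"1983: ${cost_1983:.2f} per Mbyte")
print(f"1997: ${cost_1997:.3f} per Mbyte")
print(f"A Megabyte became roughly {price_drop:,.0f} times cheaper")
```

Counting a Gbyte as 1024 Mbytes instead would lower the 1997 figure slightly, so the "less than $0.07" claim holds either way.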

Service Delivery

Today's customers are not concerned with the underlying hardware, operating system, or application software that brings the information to them. Their only concern is getting timely, useful information. For capacity planners, the same philosophy should hold true -- the key is to identify useful information, and then display it quickly and inexpensively. Useful information is information that gives the global perspective rather than detail. Cost-effective delivery may imply auditing existing IT resources to determine the true cost of delivering business services and information. This must include not only the cost of hardware and software, but the cost of people as well. Buying a new piece of hardware may be cheaper than paying several people to study the purchase and then buying something that is only slightly less expensive.

The cost of maintaining a Capacity Planning staff should be examined for today's IT environments, if for no other reason than to understand the return on investment. In the past, if equipment was purchased that was incorrectly sized for an application, that mistake could cost tens of millions of dollars. Today, a sizing error for a single server, for example, would cost only several thousand dollars. Yes, large systems could require purchasing lots of servers. But if there isn't enough capacity, very often the solution is to buy more servers or to increase the memory and disk capacities of the ones already purchased. The cost of making similar mistakes now differs by orders of magnitude.

What's the Real Reason to Do Capacity Planning?

If mission-critical business applications become overloaded, the poor performance that results could have a very serious consequence: revenue can be lost if dissatisfied customers move to the competition. This consideration, and this consideration alone, is enough for organizations to want to provide sufficient capacity for the mission-critical applications to perform well. It is interesting to note here that when IT is not perceived as providing adequate service levels to different lines of business, outsourcing becomes a very serious alternative. If you can't do it right yourself, pay someone else to do it for you.

Note, too, that the cost of providing too much capacity is just as critical a concern. What company can afford to tie up millions in hardware simply because this equipment might be needed at some time in the future? And does senior management really know how much it costs to obtain the benefits from their IT investments in hardware, software and personnel? IT managers today want to be assured that their sizeable investments are not only under control, but are being responsibly managed.

When is Capacity Planning done? Simply, it's when a business manager realizes that the new application (or new release of an older application) may have a negative impact on customers. If customers see that the performance of the new system is significantly worse than the old, they'll want to go back to the old system! Why is Capacity Planning done? A business succeeds if it is responsive to customer needs. Thus, new functions/features of applications need to be added to keep the customer satisfied. The Capacity Planner, in close harmony with the Application Designer, is responsible for assessing the new functionality being built, and finding the appropriate software and hardware architecture that will work efficiently and as inexpensively as possible. And that also means assessing future application functionality and its impact on application design and system architecture so that the useful life of the application is as long as possible.

What about business vs. technical requirements? The business requirements demand functionality to keep existing customers satisfied and to attract new customers. The business needs this functionality to be available as soon as possible, and as cheaply as possible. But the technical requirements to build that functionality might be very demanding. That implies identifying a software and hardware architecture that might be very costly. It is precisely here that the capacity planner can offer some assistance. On a broad level, each proposed architecture should be assessed for its potential performance limitations (response time) and corresponding service level (amount of actual work done).

With a model of the entire system, bottlenecks can be identified, and substitute components inserted and evaluated. At this level, it is not necessary to get very precise predictions of throughput, utilization, and response time. We only need to identify the broad capacity limits and compare them to the anticipated business volumes of work. That way, the business will know, with some degree of confidence, that the architecture being built will have sufficient capacity on day 1, and should be usable until the work volume increases beyond what was anticipated.
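
One way to sketch this kind of broad capacity check is the bottleneck bound from operational analysis: if each transaction needs D seconds of service at a component, that component can sustain at most 1/D transactions per second, and the component with the largest D caps the whole system. This is a technique we are bringing in for illustration, not one the article prescribes. The component names, service demands and anticipated volume below are all hypothetical.

```python
def find_bottleneck(service_demands: dict[str, float]) -> tuple[str, float]:
    """Return the component with the largest service demand (seconds per
    transaction) and the system-wide throughput ceiling it imposes
    (transactions per second)."""
    name = max(service_demands, key=service_demands.get)
    return name, 1.0 / service_demands[name]


# Hypothetical service demands, in seconds per transaction.
demands = {
    "client": 0.010,
    "network": 0.025,
    "server_cpu": 0.020,
    "server_disk": 0.040,
}

# Hypothetical anticipated business volume, in transactions per second.
anticipated_volume = 30.0

bottleneck, ceiling = find_bottleneck(demands)
print(f"Bottleneck: {bottleneck}, ceiling {ceiling:.1f} transactions/sec")
if anticipated_volume > ceiling:
    print("Anticipated volume exceeds the ceiling; evaluate a substitute.")
else:
    print(f"Headroom: {ceiling - anticipated_volume:.1f} transactions/sec")

# Insert a substitute component: a hypothetical disk with half the demand.
upgraded = dict(demands, server_disk=0.020)
new_bottleneck, new_ceiling = find_bottleneck(upgraded)
print(f"After substitution: {new_bottleneck}, "
      f"ceiling {new_ceiling:.1f} transactions/sec")
```

The bound is deliberately coarse. It says nothing about response time below the ceiling, but it is enough to compare a proposed architecture against the expected volume of work and to see where the next bottleneck appears once a component is replaced.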

Scalability and Compatibility

What would seem to be a perfect business plan or the latest technology today may soon be as out-of-date as an air-cooled car engine, an 8-track tape player, or core-type memory. Companies contemplating making large investments in IT will try to avoid repeating the mistakes made previously in the computer industry. History, it would seem, is a good teacher, and observing many companies over a long period of time can teach us principles that will help us with strategies for the years ahead. The key to understanding mistakes is the need to initiate rather than to follow trends. Let's look at some actual history.

In the 1950s and the first half of the 1960s, many companies were trying to establish themselves as leaders in the computer industry. At each company, even within a single product line, every model had a unique design and required its own operating system and application software. Computers at different price levels had different designs -- some were dedicated to scientific applications, others to business applications. It took a great deal of energy and time to get software that ran on one computer to run on another. But this was the trend ... just keep building different machines and operating systems.

But the initiative that revolutionized the industry came out of seeing a real business need. Organizations did not want to keep re-inventing the wheel as their capacity needs grew bigger. And certainly they did not want to keep converting software so that they could say that they were at the "leading edge" or that they were at "the state-of-the-art." The key, found by Tom Watson of IBM, was to develop a scalable architecture. All of the computers in IBM's System/360 family, no matter what size, would respond to the same set of instructions. All of the models would run the same operating system.

During a similar period, the minicomputer industry was created by Ken Olsen when he founded Digital Equipment Corporation (Digital). He offered the world the first small computer -- a PDP-1. Purchasers now had a choice: they could pay millions for IBM "Big Iron" System/360, or pay about $120,000 for a PDP-1. Not as powerful as the mainframe, it could still be used for a wide variety of applications that didn't need all that computing power. In 1977, Digital introduced its own scalable-architecture platform, the VAX, which ranged from desktop systems to mainframe clusters, and again, scalability did for Digital in minicomputers what it had done for IBM in mainframes.

What's the lesson here? Companies like IBM and Digital were successful then because they saw a need that business had ... to fill incremental computing needs in different ways, without having to waste prior investments in IT. This same need is still with us today. If a company needs more computing power, it ought to be able to get more power so long as its mission-critical application software can still run.

Computers were once intentionally designed to be incompatible with those from other companies -- the manufacturer's objective was to make it difficult and expensive for existing customers to switch brands. Amdahl, Hitachi and other mainframe clone companies ended the mainframe monopoly IBM held. In addition, a cottage industry emerged in the storage arena where companies like StorageTek or EMC could supply completely compatible disk drives for the generic mainframe. Market-driven compatibility proved to be an important lesson for the computer industry.

This notion of market-driven compatibility extended into software and operating systems. While UNIX was once only the darling operating system of the academic community, it became embraced by many hardware manufacturers, including Digital and HP. With its proliferation on many machines, even IBM could not ignore its presence. We see today that MVS, the proprietary operating system of the mainframe, now includes many functions and features to make communication with UNIX-based systems seamless.

Perhaps the most critical business IT problem has been solved too -- software portability across platforms -- by Sun's JAVA. Efforts like IBM's Systems Application Architecture (SAA) and consortia efforts like those from the Open Software Foundation (OSF) tried to define infrastructure common to all. But all of these efforts failed miserably. With JAVA, an object may be defined on a Sun SparcStation, clipped to a Web page on an HP 9000, cached on NT, and fired on a Mac or Network Computer. JAVA makes dynamic distributed systems possible, where we can readily move objects around for optimal placement during development time, deployment time, and even run time. Perhaps compatibility across platforms is really here.

Scalable architectures and market-driven compatibility are concepts that drive capacity planning for distributed systems. The key here is the network -- the glue that connects the seemingly different components of the architecture. Thus, capacity planning becomes less a function of, say, counting available MIPS, and more a function driven by anticipated new business that has to be processed:

  • We want to scale our applications up to process more work;
  • We want them to run on the new hardware we acquire;
  • We need to connect applications (i.e., data) that currently exist on different platforms; and
  • We don't want to re-invent or convert anything, if we can help it, to keep our costs down and our productivity up.

Capacity Planning is going through a re-engineering process. And Capacity Planning's "cousin," Performance Management, is also going through a similar transition. Sam Greenblatt, Senior Vice President of Advanced Technology at Computer Associates, recently said, "Integrating application, system and network management is helpful only if it yields useful business information. Nobody cares whether or not a system is down if it doesn't impact their business." Again, the key -- Service Delivery -- is that Capacity Planning is a methodology used to ensure that maximum service is delivered to customers.

Is Capacity Planning a Checkoff Item?

While many organizations would consider their mainframe capacity planning efforts mature, these same organizations are at an earlier stage as far as planning for distributed environments is concerned. Perhaps because budgets were departmentalized, and because of the dropping cost of hardware, individual department managers went to a vendor and made the "buy" decision for client/server gear. Here, the traditional capacity planner was not involved in the process. The next few years may see the planning function return to central IS management in many organizations.

We must ask why, in recent times, the planning function was not housed in central IS. The primary reason is cost: the hardware has, for the most part, become a commodity. A 16 Megabyte upgrade to the mainframe once required a two-month justification study; today, we can buy 16 Megabytes of memory for a PC at our local computer store for about $100. So rather than burden the planner with commodity shopping, users took on that responsibility.

This notion of hardware being a commodity has thrust capacity planning under the microscope -- do we even need CP any more? With prices dropping and technology advancing, it might be easier to just buy new gear when you need it, period, and not do any Capacity Planning at all. Consider, too, the cost of doing a Capacity Planning study -- people with specialized analytic skills are needed; the study will take some serious time before it's completed; and dedicated machine time may be needed to collect measurements from specific performance experiments. In today's commodity market, Capacity Planning may not always be a viable option.

Again, we need a broader focus with less precision. It is no longer critical to obtain the exact utilization of each server in the network. What becomes more important are questions like "Is the network providing adequate performance? What should we get/do if it isn't?" We need to understand distributed applications and, more importantly, how to grow distributed applications; that is, how to build distributed applications that are scalable. How many more users can be added while preserving acceptable performance (i.e., response time)? These questions force the capacity planner to seek a new perspective -- one that is more closely tied to application-specific measurement.
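
The question of how many users can be added while response time stays acceptable can be explored with a rough model. The sketch below uses exact Mean Value Analysis for a closed queueing network, a standard technique that this article does not itself name. It assumes a single class of users and queueing at every resource. The service demands, think time and response-time target are hypothetical numbers, and the function names are ours.

```python
from typing import Iterator


def mva_response_times(demands: list[float],
                       think_time: float) -> Iterator[tuple[int, float]]:
    """Yield (users, response time in seconds) for 1, 2, 3, ... users,
    using exact single-class Mean Value Analysis. `demands` holds the
    service demand of each resource in seconds per interaction."""
    if not demands or max(demands) <= 0:
        raise ValueError("at least one resource needs a positive demand")
    queue_lengths = [0.0] * len(demands)
    users = 0
    while True:
        users += 1
        # Time spent at each resource: its own service plus waiting behind
        # the average queue found there with one fewer user in the system.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue_lengths)]
        response = sum(residence)
        yield users, response
        throughput = users / (think_time + response)
        queue_lengths = [throughput * r for r in residence]


def response_time(demands: list[float], think_time: float, users: int) -> float:
    """Predicted response time in seconds for a given number of users."""
    for n, response in mva_response_times(demands, think_time):
        if n == users:
            return response
    raise ValueError("users must be at least 1")


def max_users(demands: list[float], think_time: float, target: float) -> int:
    """Largest user count whose predicted response time is within target."""
    supported = 0
    for n, response in mva_response_times(demands, think_time):
        if response > target:
            break
        supported = n
    return supported


# Hypothetical demands (seconds per interaction): network, CPU, disk.
demands = [0.025, 0.020, 0.040]
think_time = 5.0   # hypothetical seconds a user pauses between requests
target = 0.5       # hypothetical acceptable response time, in seconds

supported = max_users(demands, think_time, target)
print(f"Users supported within {target} sec: {supported}")
print(f"Response time at that load: "
      f"{response_time(demands, think_time, supported):.3f} sec")
```

A model this simple trades precision for breadth, which is the point being made here: it needs only a handful of application-level measurements, and it answers the scaling question without tracking the utilization of every server.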

Capacity Planning is not done for the same old reasons anymore. Capacity Planning is done for business reasons. The key to success for a business is ample delivery of necessary IT services, of which scalability and compatibility are critical components. So the traditional capacity planner must now become more application savvy ... and must understand the network. You must focus attention on identifying the parts of an application that won't scale up well ... and then offer solutions. You must always keep compatibility across platforms at the forefront of your thinking. And you must be able to anticipate bottlenecks -- either in a server, the network, or a client -- and propose alternative components that will avoid broad performance problems. Deploying new applications on a specific architecture may be wonderful today, but may become disastrous tomorrow if that architecture becomes a dinosaur and new/faster/cheaper gear is available.


About the Author:

Dr. Bernie Domanski is a Professor of Computer Science at the Staten Island campus of the City University of New York (CUNY) with nearly 25 years of experience in data processing. He is the author of over 50 papers, has lectured internationally and is CIO of the Computer Measurement Group (CMG).