12 Essential Steps to Successful VMware-based Virtualization
Virtualization planning can be complex, but our one dozen guidelines can help you get the most out of the technology.
by Andrew Hillier
VMware’s virtualization solutions promise greater hardware utilization and flexibility. Yet, like all virtualization technologies, they carry a certain amount of risk. Stack too many servers onto a host and you achieve significant financial gain but incur the risk of operational incidents and performance problems in your production environment. Ignore the business-related aspects of your production environment that aren’t present in the lab and you may put critical apps at great risk. Rely too heavily on some of the advanced automation features of VMware without proper planning and you could wind up with bigger problems than you were trying to solve in the first place.
Proper planning is at the heart of any technology optimization initiative, and this applies to VMware as well. Large-scale virtualization demands a data-driven approach that carefully evaluates business considerations, technical constraints, and workload patterns. The VMware world is very fluid, so it’s important to achieve an optimal initial placement of virtual machines and to understand how to keep the environment optimized over time.
Virtualization planning can be very complex if you don’t use the proper planning tools. Regardless of the approach, organizations should ensure that they are following basic guidelines—we offer a dozen below—during the process.
1. Watch for Technical Factors that May Introduce Risk
Be careful when combining servers that have differing configurations, diverse underlying platforms, or varying network/storage connectivity. Combining servers that touch too many networks onto a single physical host can drive up costs by increasing the number of NICs and PCI extenders required (blade racks are particularly sensitive to this). Uncover any hardware or configurations of interest, such as SAN controllers, token ring cards, IVRs, proprietary daughterboards, direct-connect printers, or other items that are not part of the standard build.
This process, called variance analysis, reveals hardware configuration affinities and "outliers," ultimately helping you avoid interruptions to critical business services during the virtualization process.
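As a rough illustration, variance analysis can be sketched as grouping servers by a configuration signature and flagging singletons. The inventory records and field names below are hypothetical, not part of any real tool:

```python
from collections import Counter

# Hypothetical inventory records; field names are illustrative only.
servers = [
    {"name": "web01", "nics": 2, "san": False, "special_hw": []},
    {"name": "web02", "nics": 2, "san": False, "special_hw": []},
    {"name": "db01",  "nics": 4, "san": True,  "special_hw": ["SAN controller"]},
    {"name": "fax01", "nics": 2, "san": False, "special_hw": ["token ring card"]},
]

def config_key(s):
    """Reduce a server record to a comparable configuration signature."""
    return (s["nics"], s["san"], tuple(sorted(s["special_hw"])))

def find_outliers(servers):
    """Flag servers whose configuration signature is unique in the pool."""
    counts = Counter(config_key(s) for s in servers)
    return [s["name"] for s in servers if counts[config_key(s)] == 1]

print(find_outliers(servers))  # -> ['db01', 'fax01']
```

Servers sharing the standard build cluster together; the SAN-attached database server and the token-ring host surface as items needing special handling.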
2. Consider the Key Business Constraints that Govern Your Environment
Consider real-world business constraints, such as availability targets, maintenance windows, application owners, compliance restrictions, disaster recovery strategies, and other business sensitivities. Most small-scale virtualization planning doesn’t go beyond simple workload analysis, yet any foray into larger production environments will show that it is very important to dig much deeper. For example, it’s not unheard of to combine virtualization candidates based solely on utilization data and end up with a dysfunctional environment where there is not a single time in the calendar when the physical server can actually be shut down for maintenance.
Considering the maintenance windows of the applications during the planning phase avoids such problems, and it is not always wise to rely on VMotion to get out of a jam. Likewise, mixing different availability levels can either create risk or waste expensive hardware.
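The maintenance-window problem above can be checked mechanically: intersect the windows of every guest proposed for a host and see whether any shared downtime remains. This is a minimal sketch, treating windows as hour ranges within a 168-hour week; the example windows are invented:

```python
def common_window(windows):
    """Intersect per-application maintenance windows (hours 0-167 of the week).
    Returns the set of hours during which every guest can be down, i.e. when
    the shared physical host could be taken offline for maintenance."""
    hours = set(range(168))
    for start, end in windows:
        hours &= set(range(start, end))
    return hours

# Illustrative windows: app A Sun 02:00-06:00, app B Sun 04:00-08:00
print(sorted(common_window([(2, 6), (4, 8)])))  # -> [4, 5]
# Add an app with a Wednesday-only window and no host-wide window remains:
print(sorted(common_window([(2, 6), (4, 8), (74, 78)])))  # -> []
```

An empty result is exactly the dysfunctional outcome the article warns about: no time in the calendar when the physical server can be shut down.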
3. Tackle Political and Financial Ramifications of Infrastructure Sharing
Prepare to address the politics of your consolidation decisions. Application owners may have real or perceived reasons why they cannot share infrastructure and these often require additional constraints and/or what-if analysis to resolve the issues.
In addition, most chargeback models aren’t sophisticated enough to deal with a virtualized infrastructure and will break down if resource sharing crosses certain boundaries. Using affinity regions based on departments and application owners may be a wise decision in cases where political or financial considerations pose a challenge.
4. Exhaustively Examine Workload Patterns and Personalities
Everyone wants to maximize savings, but there is a trade-off between risk and return when virtualizing existing environments. What is acceptable in a lab is usually not the same as what is required in production, and the risk of performance degradation is often a key consideration when determining the target utilization in a virtual environment. Organizations must understand this and properly evaluate workload patterns to determine their own comfort with savings, stacking ratios and operational risk levels.
Some of the most important aspects of workload analysis, such as complementary pattern detection and time-shift what-if analysis, are often overlooked when determining if workloads can be combined. Looking at these areas in depth and across all the major CPU, I/O and resource capacity food groups, helps ensure that you’ve maximized utilization while leaving enough headroom to cushion peak demands on the infrastructure.
5. Understand Virtualization Overhead
When analyzing VMware consolidation, look at the overhead created by the hypervisor. Unlike physical servers, VMware virtual machines incur CPU overhead when data is sent to disk or over the network. Organizations typically build a fixed-percentage overhead into their plans for virtual environments, but this approach can sell systems short. The best approach is to analyze actual I/O rates and project a more accurate utilization curve that factors in the application workload as well as the true overhead introduced by virtualization.
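The difference between a flat-percentage model and an I/O-driven one can be sketched as below. The per-megabyte CPU cost factors are illustrative placeholders only; real values must be measured for the specific hypervisor version and hardware in use:

```python
def projected_cpu(cpu_pct, disk_mbps, net_mbps,
                  cpu_per_disk_mb=0.08, cpu_per_net_mb=0.05):
    """Project guest CPU utilization including virtualization I/O overhead.
    The cost factors are made-up placeholders, not measured VMware figures."""
    overhead = disk_mbps * cpu_per_disk_mb + net_mbps * cpu_per_net_mb
    return cpu_pct + overhead

# Two servers with identical raw CPU utilization but very different I/O rates:
print(projected_cpu(30, disk_mbps=5, net_mbps=10))    # -> 30.9 (near-flat overhead)
print(projected_cpu(30, disk_mbps=200, net_mbps=150)) # -> 53.5 (flat % would miss this)
```

The I/O-heavy server consumes far more CPU once virtualized, which a fixed-percentage uplift applied uniformly across the estate would badly underestimate.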
6. Analyze Constraints Together, Not in Isolation
Don’t plan virtualization based on any one constraint viewed in isolation. Consider all critical constraints together when choosing targets. Relying on a one-dimensional analysis of workload, for example, will not only limit your success but can cause critical performance, security, and compliance issues.
Organizations should take a multi-dimensional look at the net effect of all of the key constraints applied to the pool of potential resources to determine the optimal path to a virtualized infrastructure.
7. Always Go Forward in Security and Compliance
Ensure that as machines are virtualized they do not breach compliance rules. For example, regulations about information sharing between divisions within financial services or health-care operations necessitate that certain applications and databases be kept separate. Keeping systems apart from their disaster-recovery counterparts or cluster/replication peers is also critical. In addition, security zones should be maintained unless the organization has a clear mandate to redefine what can cohabitate in an environment and/or on a physical system.
8. Understand the New Roles Introduced by Virtualization
ESX administrators are a new breed of IT professional, and their frequent access to disk images of multiple virtual servers tends to give them broad visibility into applications and their data. This sometimes creates a "super super" user role that is unprecedented in many environments and has the potential to violate regulatory and internal compliance rules. Proper virtualization analysis and planning looks for these vulnerabilities and provides a risk matrix that helps the organization ensure continued compliance.
9. Don’t Abuse VMotion
VMotion is an extremely powerful technology that will likely revolutionize the way many environments are managed. Even so, do not use it as a crutch and rely on it to compensate for poor planning or inadequate management of an environment.
Purposefully creating sub-optimal VM placements with the expectation that you can VMotion your way out of trouble is rarely a good strategy, particularly in production environments. It fosters a "try it and see" culture in which people test different combinations and assume they can simply back the changes out if they don’t work.
10. Lay Explicit Ground Rules for DRS
VMware’s Distributed Resource Scheduler (DRS) automatically migrates virtual machines according to workload-balancing criteria. Because it is not inherently aware of the technical and business constraints on an environment, it can tend to scramble systems from a technical and business perspective. To combat this effect, DRS supports affinity and anti-affinity rules that identify which systems should be kept together and which should be kept apart.
While good in principle, this system is difficult to program without a proper understanding of the relevant constraints. A convenient byproduct of the constraint-based analysis described above is a complete map of all relevant affinities and anti-affinities in a server cluster, providing rules that eliminate potential conflicts and ensure that security zones, business constraints, compliance issues, disaster recovery, and chargeback systems are all respected and that the virtualized infrastructure remains optimized over time.
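A constraint map of this kind can be expressed and checked quite simply. The sketch below validates a proposed placement against affinity and anti-affinity pairs; the server and host names are invented for illustration:

```python
# Validate a proposed placement against affinity/anti-affinity rules derived
# from constraint analysis. All names here are hypothetical examples.
placement = {"db01": "esx-a", "db01-dr": "esx-a", "web01": "esx-b", "web02": "esx-b"}
keep_apart = [("db01", "db01-dr")]    # DR pair must never share a host
keep_together = [("web01", "web02")]  # chatty tier benefits from co-location

def violations(placement, keep_apart, keep_together):
    """Return human-readable descriptions of every rule the placement breaks."""
    bad = []
    for a, b in keep_apart:
        if placement[a] == placement[b]:
            bad.append(f"anti-affinity: {a} and {b} share {placement[a]}")
    for a, b in keep_together:
        if placement[a] != placement[b]:
            bad.append(f"affinity: {a} and {b} are separated")
    return bad

print(violations(placement, keep_apart, keep_together))
# -> ['anti-affinity: db01 and db01-dr share esx-a']
```

In practice these pairs would be generated from the variance, business-constraint, and compliance analyses described earlier, then loaded into DRS as affinity rules rather than checked by hand.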
11. Model Plenty of "What-If" Scenarios
Organizations should test scenarios leveraging analysis of business, technology, and workload constraints to better manage their pool of resources. Virtualization allows capacity to be managed in aggregate—providing the potential to revolutionize capacity planning. This makes it possible for businesses to explore a variety of options for optimizing their environment. What would happen, for example, if I virtualized multiple data centers together? Which servers are good candidates for consolidation and will work best together? What is the difference between putting these servers on blades versus rack-mount systems?
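One dimension of such what-if modeling is the trade-off between target utilization and hardware savings. The sketch below uses a crude first-fit-decreasing packing of per-server peak demands under two different utilization ceilings; the numbers are illustrative:

```python
def hosts_needed(peaks, ceiling):
    """First-fit-decreasing estimate of how many physical hosts are needed to
    hold the given per-server peak CPU demands without any host exceeding the
    utilization ceiling. A rough what-if sketch, not a full placement model."""
    hosts = []  # current committed load per host
    for p in sorted(peaks, reverse=True):
        for i, load in enumerate(hosts):
            if load + p <= ceiling:
                hosts[i] += p
                break
        else:
            hosts.append(p)  # no existing host fits; add a new one
    return len(hosts)

peaks = [35, 30, 25, 20, 20, 15, 10, 5]
print(hosts_needed(peaks, ceiling=60))  # -> 3 hosts at an aggressive target
print(hosts_needed(peaks, ceiling=45))  # -> 4 hosts at a conservative target
```

Re-running the same model with different candidate pools, risk ceilings, or hardware profiles (blades versus rack-mount) is exactly the kind of scenario exploration the step describes.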
Altering pre-conceived notions of which servers should be included in an initiative or adjusting risk levels can reveal new opportunities for savings.
12. Don’t Get "Tunnel Vision" About Virtualization
Understand virtualization alternatives. Any virtualization initiative should be part of an overall optimization program. Organizations must recognize that virtualization is just one of several strategies that can be used. Java applications and J2EE content are already abstracted from their physical environment, and database instances reside in a database server that isolates them from the surrounding infrastructure.
It may not be necessary to virtualize these applications at the operating system level. Utilizing their inherent scaling/clustering strategy may be more effective, both technically and financially.
Virtualization planning is not just a sizing exercise. From a planning and management perspective, virtualization is a multi-faceted challenge that can quickly become political. A methodical and data-driven approach to assessing and planning virtualization opportunities is the best way to drive out risk, positively engage application owners, and ensure that success is achieved beyond the "low hanging fruit".
To that end, leveraging multi-dimensional analysis of all critical constraints and carefully planning for the specific technologies and platforms in use is key to assuring the success of virtualization initiatives.
- - -
Andrew Hillier is co-founder and CTO of CiRBA. You can reach the author through the company's Web site at http://www.cirba.com.