Q&A: Avoiding VM Stall

What gets in the way of managing and controlling your virtual environment effectively, and what best practices can you use to avoid VM stall?

Data center managers are under pressure to change process, management, and procedures to scale, manage, and control their virtual environment -- but problems all too often arrive to derail or stall such change. What's getting in the way of a more efficient data center, and what can you do about it? For answers, we turned to David M. Lynch, the vice president of marketing at Embotics (www.embotics.com), a company focused on automation and control for the virtual data center.

Enterprise Strategies: What is VM stall?

David M. Lynch: Virtualization is a new architecture in data centers, and one that crosses most of the traditional silos. It entered the data center in a different way from most technologies: it was driven by the potential economic savings associated with consolidation and by the value of the flexibility it brings to IT organizations. It was introduced as a top-down initiative aimed at decreasing the ongoing footprint of the data center and preparing for an internal cloud architecture. It came in fast and grew fast, but it did not go through the normal impact assessments that most external data center technologies undergo before deployment.

Its impact surfaces during the growth phase, leading to problems that eventually apply the brakes to the whole initiative -- a condition known as virtual stall.

Virtual stall has four main causes:

  • Scalability Issues: A single IT admin team often finds it difficult to scale beyond the 25-30 percent penetration range. This is due to the combination of a lack of automation and reporting in the virtualization management tools (creating time-consuming manual processes) and the lack of available experienced, trained staff.

  • Management Issues: The data center is not a place that can be managed manually; there are too many elements to be checked and too many interdependencies to manage. Although the levels of automation built into the virtualization platform can be difficult to define and implement, the lack of automated monitoring, alerting, and control becomes a greater problem as the overall level of virtualization penetration increases.

  • Process Issues: Enterprise virtualization impacts a wide range of existing data center processes, all of which need to be modified, replaced, or added to. As long as the virtual environments are small and self-contained, IT can work around these processes or ignore them. As the environment grows, it reaches a point at which the processes must be dealt with before efficiencies can be realized. The more “process mature” an organization is, the quicker this point is reached.

  • Coordination Issues: Virtualization crosses multiple silos and ultimately requires a level of cooperation and integration that is impossible to achieve with the traditional silo management structure. In addition, the first workloads to be virtualized tend to be less critical, but as environments grow (especially where “virtual first” policies are implemented), higher-risk, higher-impact services are virtualized. These tend to have more stakeholders, more politics, larger and more distributed infrastructures, as well as greater cost of failure and downtime. Consequently, they require more coordination and information sharing.

How are companies dealing with it?

Smaller and more flexible data centers haven’t experienced VM stall yet (although they will eventually), and the very large and process-mature organizations generally implemented virtualization initiatives in a very controlled and integrated manner, so they are seeing it less. However, it’s the firms in between these two extremes that need to be concerned. In our internal surveys, more than 30 percent of prospects are slowing down their virtualization initiatives due to management or control concerns.

These companies, first and foremost, are looking for tools and systems that can help increase visibility and simplify their environments. For some, this is all they will need to get back on track. For others, the cultural issues will need to be addressed: silos need to be broken down, and new processes and procedures need to be implemented.

Is there something they should consider to better combat this issue in the data center?

To better combat virtual stall, you need to improve the management and reporting systems in the environment and implement new approaches, procedures, and processes.

Reporting is critical, and not just the reporting that administrators need to get their jobs done. There are other stakeholders in this architecture that also need specific information in order to do their jobs. For example, the security team may need to review changes in the environment over a certain period of time. Business owners may need to understand how their services and associated levels of service are changing. IT managers may need to regularly review costs, capacity, and utilization metrics. Asset managers may need to validate software licenses.

Automation, alerts, and management are all elements that will help your virtualization administrators manage more efficiently and create a more consistent environment. They will also free up time for administrators to consider and manage the necessary interdepartmental coordination. It is critical to invest in virtualization-aware management tools that can handle enterprise-class virtualization deployments and allow your virtualization environment to grow.

In addition, enterprise virtualization needs new approaches to issues that have already been solved in the “physical world,” such as performance assurance, process modification, security and audit, software compliance, OEM support, configuration compliance, and more. The earlier you consider these impacts and changes, the better you will be able to avoid virtual stall.

How does it differ from VM sprawl?

Virtual stall is a completely different issue from VM sprawl. VM sprawl describes the prevalence of underutilized or wasted resources. It is an efficiency and capacity issue. Virtual stall, on the other hand, is really more of a governance issue caused by the need to change process, management, and procedures in order to scale, manage, and control the virtual environment effectively.

Which is a bigger consideration for the industry right now?

Although both are important, virtual stall has the bigger impact by far. The cost of virtual sprawl is underutilized resources that, if left unattended, will result in incremental spending in order to accomplish your growth objectives.

Virtual stall, on the other hand, is the emergence of issues that, if not dealt with, will prevent you from ever achieving your virtualization growth objectives. Even smaller organizations that don’t have the VM population, process maturity, or silo complexity that larger organizations have will run into virtual stall -- they’ll just do so later in the project.

What is a frequent mistake that enterprises are making during server virtualization deployment?

The most frequent mistake we see is treating the implementation of a virtual infrastructure as a tactical project limited to one or more IT operations teams rather than as a strategic initiative. We see this even in organizations that have implemented “virtual first” rules. The focus is on virtualizing and hitting penetration numbers instead of implementing a new data center architecture and dealing with the process and management issues as needed.

It is a reflection of the economic environment, but many virtualization teams know what information and management systems they need and lack the funds to put them in place, which results in virtual stall.

What best practices can you recommend to avoid these mistakes?

Companies must think beyond the mechanics of deploying a virtual infrastructure and transferring physical assets to virtual ones. Treat virtualization as the data center architecture that it actually is. Involve data center stakeholders, and incorporate the reporting, management, and asset tracking systems necessary to scale a virtual environment safely and effectively, and do so as early as possible. It’s the only way to ensure that your virtualization initiatives stay on track.

What are your recommendations to data center administrators looking to develop and advance the state of their virtualization deployments to achieve the associated ROI?

Virtual initiatives are not just about volume or the numbers virtualized. They are about implementing a new data center architecture. In addition to the virtualization platform, you need to improve your visibility into the virtual environment and start engaging key stakeholders (other IT teams, business managers, asset managers, etc.) and provide them with the reports and insight they need as early as possible in the process. Then you need to add enough management, alerts, and automation to free up enough time to consider the process and procedural impacts of the structure you are implementing.

Building a cross-silo team early -- to ensure that its members understand the impacts and that you understand the stakeholders’ needs -- is an essential step in moving from a tactical deployment to a strategic initiative.

What role does Embotics play in this market?

Embotics V-Commander is an enterprise virtualization management platform for midsize and large organizations that maximizes the value of virtual assets within the dynamic data center. It employs a lifecycle approach to continuously monitor, optimize, control, and integrate the highly variable and interconnected virtual asset portfolios that will ultimately form the foundation for cloud computing, and it provides the much-needed insight, automation, and management that virtual environments need to scale effectively and quickly.