Service Delivery: You Can't Manage What You Can't See
Best practices for using network and application baselines to help predict service delivery degradations
by Eileen Haggerty
Enterprise IT environments are becoming increasingly complex, making it harder for IT professionals to see what is happening as essential business services and applications traverse the global network. Technology approaches such as optimization, SOA, and virtualization promise reduced cost of operation and greater efficiency, but they may also introduce unintended application issues across the network or obscure how the target business service is functioning.
Just as important, the enterprise network must support bandwidth-intensive applications that may easily experience performance degradation, impacting the quality of experience for employees using them. Intermittent issues with jitter or dropped calls on the VoIP system or lack of synchronized audio and video on the expensive HD video conferencing solution can quickly spark user complaints or even cause abandonment of the service.
If enterprises are to take full advantage of the promise of the "modern IP network," IT operations must have a firm understanding of how these new services and technologies are impacting the network. The adage "you can't manage what you can't see" is important to keep in mind when choosing a solution to manage and optimize service delivery assurance. To gain that understanding, you need to identify and establish a baseline of what you're looking for, where to find it, and how to see it.
What You Should Watch
To effectively know what you should be looking for, begin by auditing the activity of business services across your global network. Depending on the size and complexity of your network and application mix, your audit plan will likely combine monitoring of critical applications and key network aggregation points as well as ensuring that you can perform packet-level problem identification and resolution.
The crucial and most often overlooked point is to create a baseline of the current performance of essential business services and network aggregation points during the planning stage of major new projects. Before rolling out any new IT project, ask: "Do we have the ability to analyze the application activity if it's needed?" Knowing which applications run on the network and which services they deliver will make your decision and planning process more effective and efficient.
Managing the Business Services Audit
During a business services audit, you must gather essential information about your existing network, including bandwidth, applications, and response times, and establish a network baseline. To create the audit:
- Create an inventory of the business services and applications running over the network. This includes key details for optimizing the use of the network resources, such as identifying all applications traversing the network.
- Evaluate bandwidth to ensure capacity availability for new business services. Rank most- and least-utilized network segments, in the campus LAN as well as over remote office WAN connections; examine activity trends for patterns in traffic behavior.
- Create response-time baselines of your business's essential services and applications. Measure typical response times for key applications so you know what your end users expect for predictable, repeatable performance.
- Identify ancillary performance issues. No network is perfect. Use this opportunity to clean house. Look for packet loss, high application retransmits, previously undetected worms or viruses, and router misconfigurations.
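Two of the audit steps above, ranking segments by utilization and recording response-time baselines, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the segment names, application names, and sample values are hypothetical, and in practice the samples would come from your monitoring tools. The choice of median and 95th percentile as baseline statistics is also an assumption, not something the audit prescribes.

```python
from statistics import mean, quantiles

# Hypothetical utilization samples (percent of link capacity) per segment.
utilization = {
    "campus-lan-core": [62, 70, 68, 75],
    "branch-wan-01": [88, 91, 85, 93],
    "branch-wan-02": [12, 15, 9, 14],
}

# Hypothetical response-time samples (milliseconds) per application.
response_ms = {
    "crm": [210, 190, 250, 230, 220, 205, 240, 215],
    "voip-signaling": [40, 42, 38, 45, 41, 39, 44, 43],
}


def rank_segments(samples):
    """Return (segment, mean utilization) pairs, busiest first."""
    averages = {name: mean(values) for name, values in samples.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


def response_baseline(samples):
    """Return the median and 95th percentile response time per application."""
    baseline = {}
    for app, values in samples.items():
        # n=20 yields 19 cut points: index 9 is the median, index 18 is p95.
        cuts = quantiles(values, n=20, method="inclusive")
        baseline[app] = {"median": cuts[9], "p95": cuts[18]}
    return baseline


ranked = rank_segments(utilization)
baseline = response_baseline(response_ms)

for segment, avg in ranked:
    print(f"{segment}: {avg:.1f}% average utilization")
for app, stats in baseline.items():
    print(f"{app}: median {stats['median']:.1f} ms, p95 {stats['p95']:.1f} ms")
```

Storing a high percentile alongside the median gives later comparisons a sense of normal variation, not just a typical value.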
Baseline best practice: Identify virtually all of your applications and services, including Web-based, complex, and even custom applications. Use a management solution that inspects the full packet header of the actual packets on your network to accurately reveal the application in use. Knowing which applications run on the network and which services they deliver will make predicting and identifying service degradations that affect end users easier and more complete.
Your Audit Baseline in Action
Despite your careful planning and having rock-solid application identification, it's inevitable that service delivery issues will occur with some applications. Predictive, early-warning analysis can isolate the application fault and get things back up and running in the most efficient manner possible.
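One simple form of early warning is a threshold derived from the baseline itself. The sketch below is an assumption for illustration, not a description of any particular product: it flags a service when the mean of recent response-time samples exceeds the baseline mean by more than `k` standard deviations. The function name, the sample values, and the default `k` are all hypothetical.

```python
from statistics import mean, stdev


def early_warning(baseline_samples, recent_samples, k=2.0):
    """Return True when the recent mean exceeds baseline mean + k * stdev."""
    threshold = mean(baseline_samples) + k * stdev(baseline_samples)
    return mean(recent_samples) > threshold


# Hypothetical baseline response times in milliseconds.
baseline_ms = [200, 210, 190, 205, 195, 215, 200, 198]

normal_window = [205, 199, 212]    # close to the baseline
degraded_window = [260, 275, 290]  # well above the baseline

print(early_warning(baseline_ms, normal_window))
print(early_warning(baseline_ms, degraded_window))
```

A smaller `k` warns earlier at the cost of more false alarms; the right value depends on how much variation is normal for the service.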
It seems that whenever application performance is compromised, finger pointing between network staff, application teams, and third parties (e.g., WAN providers) starts almost immediately. Eliminate the finger pointing and associated time loss by using your baseline as a starting point to resolve problems by ruling in (or out) network problems from the start. Running a simple health report will quickly determine if underlying network issues are contributing to the application issues users are experiencing.
If the report indicates that the flow between servers is behaving normally, the next step is to validate the end-user issue. Your baseline will reflect the applications normally present in the network, but perhaps there are now new or different applications running on the network that are impacting your end users. Your baseline will also have the typical response time for this service, for overall flight time, network time, and server "think" time, all of which point to whether there is real degradation and where it is manifested. Unfortunately, we've all been victims of random bad behavior by an application, which annoys the end user and can be difficult to track down or replicate. With proper tools, identifying the problem source will lead to follow-on steps for resolving the issue.
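To show how baseline components can point to where degradation is manifested, here is a minimal sketch that compares current timings with their baseline values and reports the components that have grown beyond a tolerance. The component names, the numbers, and the 25 percent tolerance are hypothetical; a real comparison would use the measurements your monitoring solution records.

```python
def locate_degradation(baseline, current, tolerance=0.25):
    """Return (component, fractional increase) pairs above tolerance, worst first."""
    findings = []
    for component, base_value in baseline.items():
        increase = (current[component] - base_value) / base_value
        if increase > tolerance:
            findings.append((component, round(increase, 2)))
    return sorted(findings, key=lambda item: item[1], reverse=True)


# Hypothetical per-component timings in milliseconds.
baseline_ms = {"network": 40.0, "server_think": 120.0, "transfer": 60.0}
current_ms = {"network": 42.0, "server_think": 310.0, "transfer": 63.0}

findings = locate_degradation(baseline_ms, current_ms)
for component, increase in findings:
    print(f"{component} is {increase:.0%} above baseline")
```

In this made-up case only the server "think" time stands out, which would rule out the network and direct attention to the server or application team.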
Baseline best practice: Tools that enhance and encourage collaboration will speed resolution. Ad hoc reports including information for the current problem can be compared and shared with the teams responsible for resolving the issue.
This is exactly the type of situation where a well-thought-out and well-executed baseline audit pays the most dividends. Monitoring and/or capturing traffic at strategic points throughout the network can provide the information necessary to measure overall application response times using a variety of metrics including server delay and transaction transfer times. This level of visibility arms you with the evidence you need to track down critical issues with service delivery in order to apply the right fix.
The "modern IP network" is helping today's businesses achieve more and accomplish things that were once thought impossible. However, for all the benefits, the added complexity has made monitoring, managing, and assuring service delivery of essential business applications distinctly more challenging. Enterprises can meet that challenge and achieve the visibility required to keep their networks and applications running smoothly by planning and implementing a comprehensive baseline audit strategy. The right mix of strategically deployed technology and real-time predictive analysis can give IT operations everything it needs to respond when application performance degrades.
Eileen Haggerty is a product director at NetScout Systems. You can reach the author at Eileen.firstname.lastname@example.org.