Q&A: Application Performance Management in Action (Case Study)
When Web apps get busy, IT must ensure performance doesn't suffer. We examine how the UK's Nisa Retail used APM to keep its members (and their customers) satisfied.
When Web pages are slow to load, customers tend to abandon their orders or move to a competitor's site. Conversely, the faster IT systems are, the more opportunities are created. Nisa Retail, a UK organization that helps its members (independent food and drink retailers) stay competitive, put an application performance management solution to the test to keep its members (and their customers) satisfied. To learn about the project, we contacted David Morris, head of IT at Nisa Retail.
Enterprise Strategies: What are APM solutions and what benefits do they offer?
David Morris: APM stands for "application performance management," and these solutions have developed to ensure applications (driving everything from payment transactions on Web sites to ad delivery and search performance on tablets to in-house order-tracking and fulfillment technology) are operating at peak performance. All these components of an application combine to create the customer experience, and, therefore, significantly impact the bottom line.
A great APM solution is strong in the "management" area, not limited to "monitoring" as some first-generation solutions were. Today's APM ensures that if there is a breakdown (or even if one is about to occur) in the applications that drive business, anywhere from the servers to the last mile, it will be immediately visible and correctable in real time -- before it can have a negative impact on business.
Tell us a little about Nisa Retail and the problems you were trying to solve with an APM solution.
Nisa Retail is a member-owned organization that helps independent retailers remain competitive in the food and drink markets. We are unique in that we are owned by our members -- over 3,000 independent retailers throughout the UK.
Nisa Retail represents over 1080 registered shareholders, and supports its members with over 250 staff at its Member Support Centre, who work in all aspects of the business to ensure the best products and support are always provided. We operate almost a million square feet of warehouses and move 108 million cases of goods annually.
The company is growing, and along with expansion comes an explosion of complexity that has impacted all stakeholders in our business. Visitors, from throughout the Nisa network to those outside the organization, interact with us online using different browsers and devices every day. Also, the way applications are architected, built, and deployed within our technology stack is increasingly complex.
In addition, user expectations continue to rise. The "Google Effect" is a term that describes how fast people expect even very complex applications to perform. In response to this, businesses expect IT to deliver more features, and better performance, and they want you to deliver it sooner.
We wanted to be sure that even though complexity and demand are rising, all our members would have an excellent experience every time they visited the Nisa Web site or transacted with the company, even during the busiest days of the year. We need to see the status of our application performance in real time, not just in samples and averages, so we can proactively address any potential issues before they impact our members.
What alternatives did you consider (software package, in-house development, etc.)?
High-performance applications produce best-of-breed user experience. If that's what you want to deliver, you need best-of-breed application performance management. We did evaluate a few alternatives, but it rapidly became apparent that Compuware APM was (and remains) the foremost solution of its type.
What criteria did you use in your evaluation? What ROI did you require? What solution did you choose and why?
We needed an APM solution that was comprehensive and able to track all aspects of our applications' performance end-to-end, and we needed it to be on 24x7 to ensure we were seeing a total picture. Other vendors offered a sampling approach, but we were very concerned about seeing only an occasional snapshot of our performance when we could have a complete understanding of what is happening at all times. If someone reports an issue and it wasn't captured in a sampling approach, we'd have a potentially serious problem on our hands that we may not be able to replicate and really understand. We were very impressed with the ease with which Compuware dynaTrace captures every transaction on our applications and with the one-click route to code-level issue identification -- it gives us full context, so we can make informed business decisions.
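The difference between sampling and capturing every transaction can be sketched in a few lines of Python. This is an illustration only, not how Compuware dynaTrace or any other vendor's product works: the response times, the position of the slow transaction, and the names `make_response_times` and `worst_seen` are all hypothetical.

```python
import random


def make_response_times(n_transactions=10_000, slow_index=4321, seed=0):
    """Build hypothetical response times (seconds) with one slow outlier."""
    rng = random.Random(seed)
    times = [rng.uniform(0.1, 0.5) for _ in range(n_transactions)]
    times[slow_index] = 12.0  # the single transaction a member complains about
    return times


def worst_seen(times, sample_every=1):
    """Return the slowest response time visible when keeping every Nth transaction."""
    return max(times[::sample_every])


times = make_response_times()
full_capture = worst_seen(times)               # every transaction is recorded
sampled = worst_seen(times, sample_every=100)  # only 1 in 100 is recorded

print(f"full capture sees a worst case of {full_capture:.1f}s")
print(f"1-in-100 sampling sees a worst case of {sampled:.2f}s")
```

In this made-up data set the slow transaction sits at an index that the 1-in-100 sample never visits, so the sampled view reports nothing slower than half a second while the full capture shows the 12-second outlier.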
Additionally, we were initially concerned about overhead in running a 24x7 APM solution in production. Compuware dynaTrace APM has extremely low overhead (2%), so we can confidently leave it on all the time for traces through Java, .Net, and mixed applications from the Web browser back to the database. This is the breadth and depth we required to manage our applications.
Lastly, we loved that Compuware dynaTrace provides a single system that can be used across the application life-cycle by all stakeholders: architects, developers, testers, and operations.
Tell us about the project itself -- how long did it take, how much did you customize the package, how many resources, how were IT and business users involved, etc.
We began using Oracle Retail Suite software in late 2008, and though it initially proved functional, it didn't have a friendly interface -- it was difficult for our members to use. We brought in specialist performance consultants who were also able to customize the front-end of the application.
Our implementation of Oracle Retail Suite was hardly simple to begin with -- the application wasn't simply retrieving basic product information but also availability, estimated dispatch date, and data on bulk purchasing for preferential pricing. With a friendlier front end, what had in reality been longstanding performance problems became evident. When we began using Compuware dynaTrace around 15 months ago, we connected business and IT concerns. Because we had full visibility into all transactions executed on the OCS, we could actually see carts being abandoned due to lengthy response times; or conversely, members shopping around for suggested products based on what they'd bought in the past. We connected application performance with business performance.
Once implemented, did Nisa Retail realize the benefits you expected? Did you enjoy any benefits you hadn't expected?
One example illustrating the everyday value from our APM deployment is when we received a phone call to say that a member was complaining of difficulties logging into OCS at 2:15pm -- which is approaching the afternoon cut-off for new orders, so this was a critical time for the business to run smoothly. Inability to place orders causes delays to the daily order and also has follow-on effects throughout the business: warehousing, stock control, just-in-time ordering, etc. We have a very efficient warehouse operation to save on costs, but this means everything needs to operate like clockwork -- it impacts everything down to tight schedules for deliveries and over 1000 vehicle movements per shift.
This screenshot was taken "after the event", as 5 minutes into the problem we clicked on the "red bar" you see here. We immediately drilled down to the errors page and saw we had experienced 16 errors in this 5-minute period. This wasn't a crisis at this stage but was definitely a cause for concern.
Drilling down into the errors to the "PurePaths" view showed all the errors were related to the home page. We drilled further and found that the issue was caused by a problem with Umbraco (which handles the customer-specific content). We restarted Umbraco (single application pool) and the problem was resolved. That's 15 minutes from problem identification to resolution.
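The "16 errors in a 5-minute period" signal described above is a count over a rolling time window. A minimal Python sketch of that idea follows, assuming errors arrive in time order. The `ErrorWindow` class and the threshold of 10 are hypothetical and are not taken from Compuware dynaTrace.

```python
from collections import deque


class ErrorWindow:
    """Count errors in a rolling time window (hypothetical helper)."""

    def __init__(self, window_seconds=300, threshold=10):
        self.window_seconds = window_seconds  # 5 minutes, as in the incident
        self.threshold = threshold            # assumed alert level, not from the source
        self._timestamps = deque()

    def record(self, timestamp):
        """Record an error at `timestamp` (seconds, non-decreasing).

        Returns True when the number of errors still inside the window
        has reached the threshold.
        """
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop errors that have aged out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold

    def count(self):
        """Number of errors currently inside the window."""
        return len(self._timestamps)


window = ErrorWindow()
# 16 hypothetical errors, one every 15 seconds, all inside 5 minutes.
alerts = [window.record(i * 15) for i in range(16)]
print(window.count(), "errors in the window; alert raised:", alerts[-1])
```

With these made-up timestamps the window holds all 16 errors, and the assumed threshold is crossed on the tenth one, which is the kind of early warning that lets a team look at a problem before it becomes a crisis.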
Here is a more strategic example of what we experienced, following the APM deployment.
This view shows the PurePath dashboard. In it, we can see the slowest transaction is "View Order", taking approximately 3 seconds. This is a critical interaction on an ecommerce site: if the page takes too long to load, users will lose confidence in your ability to correctly process their order. Member confidence is critical, as it helps retain members and ensure they keep coming back. Here we were able to drill down right away to the cause and resolve it within minutes.
When we'd made a number of performance improvements, the effects accumulated to produce a 46 percent reduction in response times. Our members certainly noticed: the volume of complaints fell drastically, and a few people were so delighted they even rang us up to tell us so!
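For readers who want to compute a comparable figure for their own systems, a percentage reduction is the drop in response time divided by the original response time. The sketch below uses hypothetical before and after values chosen only to illustrate the arithmetic; the interview gives the 46 percent result but not the underlying measurements.

```python
def percent_reduction(before, after):
    """Return the percentage drop from `before` to `after` (same units)."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Hypothetical average response times in seconds, not Nisa Retail's data.
before_seconds = 3.0
after_seconds = 1.62
reduction = percent_reduction(before_seconds, after_seconds)
print(f"response time reduced by about {round(reduction)} percent")
```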
What's next? For example, will you roll out the solution to more users, use it in more departments, turn on features you didn't use initially?
One of the big shifts I've seen in business and IT over the years is that people now expect to interact with Web services at any time, with any device. This is why we're building a mobile app for our members to order with, but with this new platform comes additional complexity. Most of our users currently use Web browsers to buy through OCS -- and our .Net back-end systems translate relatively simply to .aspx Web pages. On the other hand, iOS apps are written in Objective-C. A service whose transactions are executed through several different programming languages is a pretty typical locus for performance problems, which we'll need to be mindful of.
Fortunately, Compuware have a solution for monitoring performance inside mobile apps, and that's something we're looking into.