Q&A: End-to-End Transaction Tracking with Business Transaction Management
In today's environments, business transactions flow through multiple systems. Business transaction management eases your detective work when something goes awry.
Transaction failures mean money -- lost money. So does the time spent trying to find and remedy the cause of these failures. IT would love nothing more than to find and fix these business-critical issues as quickly as possible, but solving production problems in today's environments can be complicated when business transactions flow through multiple systems. Business transaction management (BTM) can make short work of costly failures by providing visibility into each transaction across distributed environments.
To explain BTM, the benefits it provides, which enterprises should use it, and how it works, we turned to Ed Horst, vice president of product strategy at AmberPoint and an authority on the subject.
Enterprise Systems: Let's start with the definition. What is business transaction management (BTM) and why is it important for an enterprise?
Ed Horst: In today's IT environments, business transactions flow through multiple systems and services, making them much more difficult to ride herd over. Although an IT team may have visibility into some steps of the overall transaction, they usually can't track the transaction end-to-end. Instead, transactions tend to "vanish" somewhere along the way.
You can't fix what you can't see, so when something goes wrong -- as invariably happens -- fixing the issue quickly becomes a much more onerous and expensive task. The problem could be anywhere, so it's a major undertaking to pinpoint the root cause.
As for the significance of the issue, it's important to remember that there's always one person who sees everything he or she needs to -- the business user. From the customer's vantage point, their transactions have failed and they don't care how, where, or why -- only that you fix it before they decide to take their business elsewhere.
Enterprises need a way to keep their business transactions on track. Business transaction management, or BTM, addresses this need by providing real-time monitoring of each transaction flowing end-to-end across distributed applications. It covers two different areas: performance management, which tracks the performance and throughput of the components and the transactions, and failure management, which calls for mechanisms to detect transaction failures and rapidly locate the root cause of the issue.
What, exactly, does BTM do?
Business transaction management provides instrumentation for tracking the transactions flowing across an application, as well as detection, alerting, and remediation of various types of unexpected business or technical conditions. It enables application support personnel to search for transactions based on message content and context -- such as time of arrival, message type, or client credentials.
These search capabilities facilitate root cause analysis for a wide range of issues common in business transactions, such as stalled transactions, missing steps, faults, and application exceptions, as well as low-level issues such as incorrect data values, boundary conditions, and so on. It provides application support teams with detailed information useful for debugging and repairing the problem, thus reducing the mean time to repair.
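To make the search idea concrete, here is a minimal Python sketch of filtering recorded messages by content and context. It is an illustration only, not any vendor's API: the `RecordedMessage` fields, the `search` function, and the sample `LOG` entries are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class RecordedMessage:
    """One captured message (hypothetical record layout)."""
    transaction_id: str
    message_type: str
    client: str
    arrived_at: datetime
    body: str


def search(
    messages: Iterable[RecordedMessage],
    *,
    message_type: Optional[str] = None,
    client: Optional[str] = None,
    arrived_after: Optional[datetime] = None,
    arrived_before: Optional[datetime] = None,
    body_contains: Optional[str] = None,
) -> List[RecordedMessage]:
    """Return messages matching every criterion that was supplied."""
    hits = []
    for m in messages:
        if message_type is not None and m.message_type != message_type:
            continue
        if client is not None and m.client != client:
            continue
        if arrived_after is not None and m.arrived_at < arrived_after:
            continue
        if arrived_before is not None and m.arrived_at > arrived_before:
            continue
        if body_contains is not None and body_contains not in m.body:
            continue
        hits.append(m)
    return hits


# Made-up sample data for illustration.
LOG = [
    RecordedMessage("t-1", "OrderRequest", "acme", datetime(2024, 1, 1, 9, 0), "sku-17 qty=2"),
    RecordedMessage("t-2", "OrderRequest", "acme", datetime(2024, 1, 1, 9, 5), "sku-42 qty=1"),
    RecordedMessage("t-3", "OrderFault", "globex", datetime(2024, 1, 1, 9, 6), "sku-42 out of stock"),
]

# Narrow the suspects: messages from client "acme" that mention sku-42.
matches = search(LOG, client="acme", body_contains="sku-42")
print([m.transaction_id for m in matches])
```

A support engineer working from a customer report would typically start with the context they have (who, roughly when) and then narrow by content, which is why every criterion here is optional.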
Ideally, BTM should track -- in real time -- each and every transaction flowing across the system. It's important to look at historical data, too, but without real-time detection, you lose the ability to fix the problem quickly, before the customer becomes disgruntled.
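As a rough illustration of real-time failure detection, the Python sketch below classifies a single transaction as complete, in progress, stalled, or missing a step. The step names in `EXPECTED_STEPS`, the five-minute stall window, and the `classify` function are assumptions made for this example; a real BTM product would define its own flow model and thresholds.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

# Hypothetical ordered steps of one business transaction.
EXPECTED_STEPS = ("order_received", "payment_authorized", "order_shipped")
_POSITION = {name: i for i, name in enumerate(EXPECTED_STEPS)}


def classify(
    steps_seen: Dict[str, datetime],
    now: datetime,
    stall_after: timedelta = timedelta(minutes=5),
) -> Tuple[str, List[str]]:
    """Classify a transaction from the steps observed so far.

    steps_seen maps step name -> time observed and must not be empty.
    Returns (status, steps still outstanding).
    """
    missing = [s for s in EXPECTED_STEPS if s not in steps_seen]
    if not missing:
        return "complete", []

    # A later step was observed while an earlier one never was: a skipped step.
    furthest = max((_POSITION[s] for s in steps_seen if s in _POSITION), default=-1)
    skipped = [s for s in missing if _POSITION[s] < furthest]
    if skipped:
        return "missing_step", skipped

    # Nothing skipped, but no activity for longer than the stall window.
    if now - max(steps_seen.values()) > stall_after:
        return "stalled", missing
    return "in_progress", missing


now = datetime(2024, 1, 1, 12, 0)
print(classify({"order_received": now - timedelta(minutes=10)}, now))
```

Running a check like this continuously, rather than only against historical data, is what allows an alert to fire while there is still time to repair the transaction.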
BTM can also provide insight into the business data of each message, enabling you to modify system behavior based on transaction-specific values. For example, if the services supporting platinum customers approach thresholds at which SLAs will be breached, your BTM system can redirect traffic via routing or load balancing.
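The platinum-customer example can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `Pool` type, the `choose_pool` function, the "platinum" tier label, and the 80 percent warning fraction are all hypothetical, and a real system would act through its router or load balancer rather than a function call.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """A group of service instances (hypothetical)."""
    name: str
    recent_latency_ms: float


def choose_pool(
    customer_tier: str,
    primary: Pool,
    reserve: Pool,
    sla_latency_ms: float,
    warn_fraction: float = 0.8,
) -> Pool:
    """Send platinum traffic to the reserve pool once the primary pool's
    latency reaches warn_fraction of the SLA limit; otherwise use primary."""
    approaching_breach = primary.recent_latency_ms >= warn_fraction * sla_latency_ms
    if customer_tier == "platinum" and approaching_breach:
        return reserve
    return primary


primary = Pool("primary", recent_latency_ms=850.0)
reserve = Pool("reserve", recent_latency_ms=200.0)

print(choose_pool("platinum", primary, reserve, sla_latency_ms=1000.0).name)
print(choose_pool("standard", primary, reserve, sla_latency_ms=1000.0).name)
```

The decision depends on a business value carried in the transaction (the customer tier) as well as a technical measurement, which is the linkage the answer above describes.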
Do distributed applications -- such as SOA or cloud-based systems -- make transaction management more difficult?
Very much so. In a distributed application environment, transactions are executed by arranging, or "orchestrating," applications and infrastructure to implement business processes. Because organizations are rarely in the position to rewrite everything to a single, common standard, the assembled application must meet each component on its terms. These systems inevitably incorporate a wide variety of technologies and protocols deployed across many platforms and organizational boundaries. The newer parts will often introduce new technology, such as firewalls and appliances, and new protocols, such as REST.
To perform transaction management effectively in these settings, you need visibility across the complete range of application interactions, including those facilitated by SOAP and XML services, messaging systems such as JMS and MQ, database calls, and RMI and EJB applications, as well as across different runtime infrastructures such as enterprise service buses (ESBs) and application servers.
This variety makes transaction tracking much more difficult. The cause of a failed or stalled transaction could be anywhere in a distributed application. It could be caused by components in the implementation layer of the service, or it might actually be located in one of many replicated services or in the infrastructure supporting the orchestrated services. Some business transactions may be long-running or asynchronous, involving human interaction and spanning multiple systems and several days. With distributed applications, you get this needle-in-a-haystack issue.
What price do organizations pay for the lack of visibility into their transaction flows?
The pain points are pretty clear -- lost customers, lost revenue, much slower mean time to repair, and a corresponding increase in employee costs. In fact, without adequate automation, your operations staff has to grow along with the complexity of the distributed application environment just to handle the issues that arise.
What types of organizations are typical users of business transaction management systems? What benefits do they typically realize from BTM systems?
Any organization relying on distributed applications -- including SOA and cloud-based systems -- should consider business transaction management. The need isn't specific to any industry. That said, common business applications supported by business transactions include operations and business support systems (OSS/BSS), account provisioning and activation, insurance claims processing, and procurement.
Done the right way, business transaction management will bring immediate and dramatic benefits. Its benefits include a significant reduction in failed transactions across services-based systems, much lower mean time to repair, and reduced IT maintenance overheads. From a big-picture point of view, the most important benefit is customer retention, as fewer customers are lost due to failed transactions.
How can an IT department sell business transaction management to its CTO? How does BTM link to overall business goals? Is ROI easy to calculate?
This raises an interesting point. Whereas IT solutions are usually tied to costs, business transaction management is tied closely to revenue. Done properly, BTM tracks each transaction flowing across the system and provides insight into the business data of each transaction, creating linkage between IT and the business units so you can do things like ensure the highest-value orders or most important customers receive prioritized service.
ROI is very simple to show, since you know the actual dollar value of the transactions you're tracking. This gets the attention of CTOs and presents them with a business-relevant way for the tech arm of the company to talk with the business. It enables a business discussion. If you consider that it's the business units that rate the effectiveness of any IT organization, BTM effectively improves that rating.
What should organizations keep in mind when choosing a BTM solution?
The key is to track and record each transaction flowing across the system. There are a few different ways to approach this. You could have the application tag every message with a unique ID, which is a great solution if it's available, but every component of the system has to "cooperate" for this to work. The problem is that some components -- for example, packaged applications like an SAP or Oracle application -- might not work with modified messages. You don't want to have to rewrite code for these applications, as that would be very expensive.
Alternatively, you could use a management system to tag each message, but many environments and applications won't tolerate message modifications. You'd have the same issue as the previous approach in that some components won't cooperate and pass on the tag. The challenge is to track end-to-end transactions without modifying any of the messages involved.
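The weakness of tagging can be shown with a short Python sketch. Everything here is hypothetical and simplified: messages are plain dictionaries, `correlation_id` is an invented field name, and `packaged_app_step` stands in for any component that rebuilds its outbound message from the fields it knows about.

```python
import uuid
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]


def tag(message: Message) -> Message:
    """Return a copy of the message carrying a unique correlation ID."""
    tagged = dict(message)
    tagged.setdefault("correlation_id", str(uuid.uuid4()))
    return tagged


def cooperative_step(message: Message) -> Message:
    # A component that forwards every field it receives, including the tag.
    return {**message, "status": "validated"}


def packaged_app_step(message: Message) -> Message:
    # A component that rebuilds its output from known fields only,
    # so the tag is silently dropped.
    return {"order_id": message["order_id"], "status": "booked"}


def trace(message: Message, steps: List[Callable[[Message], Message]]) -> List[Optional[str]]:
    """Run the message through each step and record the tag seen afterwards."""
    seen = []
    for step in steps:
        message = step(message)
        seen.append(message.get("correlation_id"))
    return seen


start = tag({"order_id": "A-1"})
ids = trace(start, [cooperative_step, packaged_app_step, cooperative_step])
print(ids)  # the tag survives the first step, then is None from the second step on
```

Once a single component drops the tag, every downstream step loses it as well, so the transaction can no longer be followed end-to-end by ID alone.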
What best practices can you recommend to avoid these problems?
Since the big obstacle is to monitor transactions across distributed applications without breaking them in the process, we recommend a non-invasive approach to business transaction management. Something we call "message fingerprinting" avoids the issue of breaking the transaction flows by not modifying the messages at all.
Instead, with this approach the management system calculates a unique fingerprint, similar to a universally unique identifier or UUID, for each message and then tracks each fingerprint. The message fingerprinting approach works in all situations since it does nothing to modify the messages.
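The interview does not describe how the fingerprint is calculated, so the Python sketch below is an assumption-laden illustration of the general idea rather than AmberPoint's method. It assumes a SHA-256 digest of the raw message bytes serves as the fingerprint; `fingerprint`, `FingerprintTracker`, and the observation point names are hypothetical.

```python
import hashlib
from typing import Dict, List


def fingerprint(raw_message: bytes) -> str:
    """Derive an identifier from the message itself (assumed: SHA-256 of
    the raw bytes). The message is only read, never modified."""
    return hashlib.sha256(raw_message).hexdigest()


class FingerprintTracker:
    """Records where each fingerprint has been observed."""

    def __init__(self) -> None:
        self._sightings: Dict[str, List[str]] = {}

    def observe(self, point: str, raw_message: bytes) -> str:
        fp = fingerprint(raw_message)
        self._sightings.setdefault(fp, []).append(point)
        return fp

    def path(self, fp: str) -> List[str]:
        return list(self._sightings.get(fp, []))


tracker = FingerprintTracker()
msg = b"<order id='A-1'><sku>42</sku></order>"

# Two observation points see the same unmodified bytes on the wire.
fp_gateway = tracker.observe("gateway", msg)
fp_esb = tracker.observe("esb", msg)

print(fp_gateway == fp_esb)
print(tracker.path(fp_gateway))
```

Because the identifier is computed by the observer, no component has to cooperate by carrying a tag. A content-only digest like this one cannot tell apart two byte-identical messages, and it does not follow a message whose bytes change between hops; a production system would need additional context to handle those cases, which this sketch does not attempt.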
What products or services does AmberPoint offer for business transaction management?
AmberPoint Management System provides comprehensive business transaction management capabilities without requiring modifications to the messages. It tracks each transaction flowing across the system and detects any failures or unexpected conditions in real time. AmberPoint Management System includes the Transaction Safety Net, a set of tools that help application teams quickly troubleshoot issues. Users can quickly search for transactions matching the information reported by a user, narrow the suspects, and identify problem transactions. Furthermore, they have the entire context available at their disposal to identify the problem-causing message and related service and to reproduce the problem.