Q&A: Real-time Data Integration No Longer a Luxury
Real-time data can turn into an addiction: once you get a taste of it, you'll want more.
- By Linda Briggs
"Real-time data is like a drug to our users. Once they have a taste for it, they just want more." That's Sami Akbay, vice president of marketing and product management for GoldenGate Software, quoting a data warehousing director at a customer site. Akbay offers the quote to explain just how important real-time data integration has become. That's why today's operational BI system needs to be continuously available, Akbay says, and must deliver good data nearly as soon as it is captured and processed.
In this interview, Akbay talks about the benefits and quick returns that can be realized with near-real-time data integration. Akbay's company, GoldenGate Software, provides real-time data integration and high-availability solutions.
BI This Week: What are the key drivers behind the interest in real-time data integration?
Sami Akbay: We live in an on-demand world. Day- or even hour-old data has become obsolete -- only low-latency solutions can deliver the kind of value and return on investment that today's enterprise demands. Businesses must be able to access, analyze, and act on business information faster than ever and without system interruption or downtime.
Additionally, in the current economic climate, companies are giving budget priority to IT projects that either increase revenue or contribute to overall cost savings. Implementing a real-time data integration solution gives users immediate access to the freshest information about the business. For example, customer service representatives have the ability to know what has just happened with an individual's account and can troubleshoot and offer incentives in the timeliest manner, helping to retain customers and increase satisfaction levels.
With low-latency data available in the business intelligence system, the business user can be confident that decisions are based on accurate and timely data. The call center representative no longer needs to wait for a nightly batch load to see what is happening with a particular customer (and also needn't ask the customer to call back the following day). Business processes become more efficient, directly impacting revenue and growth.
How does real-time data integration offer a quick return on investment?
Recent industry research surveys point to an increased demand for lower-latency data access for reporting and BI systems. In fact, late 2008 research statistics show that BI is becoming more pervasive because there is fresher data in the system, and because there is a greater number of domains feeding data into the BI system. Companies recognize the value of getting the right data to the right business groups in the right format, so that multiple functions across the enterprise can benefit from having a broad, accurate view of what is occurring in the business.
A good example of this is a large U.S. telecommunications and cable provider that initially deployed continuous data feeds from its central customer relationship management (CRM) system to an enterprise data warehouse. The project was initially deployed to improve reporting and efficiencies for field service, which proved successful. The deployment's second purpose was to reduce overall turnaround time for sending technicians into the field. This telecom provider found that it could use the same data to help reduce customer churn, since call center representatives now had immediate access to each customer's problem files. Proactively offering new pricing or extended service to the customer immediately helped reduce customer churn well below industry averages. The organization clearly benefitted by having access to data across various channels and distributors.
What about performance concerns with real-time data integration?
Actually, moving data continuously -- or with just a few seconds of latency -- is generally a much more efficient approach than moving large blocks of data in batches over a 24-hour period. Traditional ETL or batch methods require a full table scan to determine what has changed since the last extract was performed; in many cases, that impedes performance on the source system. Given the data explosion in most businesses today, a full table scan and extract can take many hours. In some cases, the batch window expires before the extract and load can be completed, causing problems for the business.
With change data capture (CDC) technology, only the changed data is moved across to the target BI or reporting system. That greatly reduces the burden on the source and target systems, and requires far less network bandwidth. In addition, with real-time CDC technology, the capture component on the source system reads from the native database logs in a non-intrusive manner, so it does not interfere with the core processing performed on the OLTP system.
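The contrast between the two approaches can be illustrated with a minimal Python sketch. It is not GoldenGate's implementation: in-memory dictionaries stand in for tables, a plain list stands in for the database log, and the names `LogRecord`, `full_scan_delta`, `apply_changes`, and `demo` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogRecord:
    """One change entry, standing in for a record in a database log."""
    op: str               # "insert", "update", or "delete"
    key: int
    row: Optional[dict]   # new row image; None for deletes


def full_scan_delta(source, last_snapshot):
    """Batch-style delta: read every source row and compare it with the last extract."""
    rows_read = 0
    changes = []
    for key, row in source.items():
        rows_read += 1
        if last_snapshot.get(key) != row:
            op = "update" if key in last_snapshot else "insert"
            changes.append(LogRecord(op, key, row))
    for key in last_snapshot:
        if key not in source:
            changes.append(LogRecord("delete", key, None))
    return changes, rows_read


def apply_changes(target, log_records):
    """CDC-style apply: touch only the rows named in the change log."""
    for rec in log_records:
        if rec.op == "delete":
            target.pop(rec.key, None)
        else:
            target[rec.key] = dict(rec.row)
    return len(log_records)


def demo():
    # Assumed toy data: 1,000 accounts, already loaded into the warehouse.
    source = {k: {"balance": 100} for k in range(1000)}
    snapshot = {k: dict(v) for k, v in source.items()}
    warehouse = {k: dict(v) for k, v in source.items()}

    # Three OLTP changes; each one is also written to the log.
    log = []
    source[7] = {"balance": 250}
    log.append(LogRecord("update", 7, {"balance": 250}))
    source[1000] = {"balance": 10}
    log.append(LogRecord("insert", 1000, {"balance": 10}))
    del source[42]
    log.append(LogRecord("delete", 42, None))

    batch_changes, rows_scanned = full_scan_delta(source, snapshot)
    records_read = apply_changes(warehouse, log)
    return rows_scanned, len(batch_changes), records_read, warehouse == source
```

In this toy run, the batch approach reads all 1,000 source rows to find three changes, while the log-based approach reads only the three log records and leaves the warehouse identical to the source.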
How does an operational BI framework tie into real-time integration?
Companies that have implemented an operational BI strategy discover that the business quickly comes to rely heavily on the most up-to-date information for reporting and analysis. Once users have this data at their fingertips, living without it isn't an option. In these instances, companies look to keep their data warehouses highly available, so that in the event of an outage, they can fail over to a secondary system, allowing report processing and queries to continue as usual. In many situations, the service-level agreements (SLAs) for an operational BI system are stringent.
That's especially true when the BI system is relied on by a wide group of users spanning many parts of a business -- from executive management to marketing and sales, from inventory management and customer call center representatives to front-line staff who serve thousands of customers every day. If the BI system goes down for any length of time without a contingency plan, an organization can face heavy penalties, lost revenue, and decreased customer satisfaction. Therefore, today's operational BI system needs to be continuously available. In the words of one director of data warehousing for a large online retailer, "Real-time data is like a drug to our end-users. Once they have a taste for it, they just want more."
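The failover idea described above, where queries continue against a secondary system during an outage, can be sketched in a few lines of Python. This is an illustrative assumption rather than any vendor's mechanism: `ReportingRouter`, `primary_down`, and `standby` are hypothetical names, and each "system" is just a callable that either answers a query or raises `ConnectionError`.

```python
class ReportingRouter:
    """Sends each report query to the primary system, falling back to a secondary."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def run(self, query):
        try:
            return self.primary(query)
        except ConnectionError:
            # Primary is unreachable: continue on the secondary system.
            return self.secondary(query)


def primary_down(query):
    # Simulates an outage on the primary warehouse.
    raise ConnectionError("primary warehouse unavailable")


def standby(query):
    # Simulates a secondary system that is kept current by continuous feeds.
    return f"standby result for: {query}"


router = ReportingRouter(primary_down, standby)
result = router.run("daily churn report")
```

The fallback is only useful if the secondary holds current data, which is why the interview pairs high availability with continuous data feeds.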
From a cost perspective, what best practices can you suggest for real-time data integration?
The IT group is focused on enabling the business to access rich transactional data while simultaneously avoiding any impact or outage to that source system. Some IT groups quickly and cost-effectively deploy a dedicated server for reporting. However, in some cases, rather than keeping a backup system idle and waiting, reports are pulled off the live standby or disaster recovery system. That can save costs. Also, it's often a best practice to roll out a departmental ODS or reporting server before embarking on a higher-cost enterprise data warehouse.
When cost is especially important, organizations should deploy a real-time data integration solution that supports the widest range of database and hardware platforms. They're then free to choose the most appropriate associated environment while keeping cost top of mind. Many large enterprises have dedicated operational data stores and warehouses that are not as expensive to maintain as their core OLTP systems.
Standardizing on a single data integration technology solution further reduces overall costs. For example, the IT team needs to be trained on only one software solution, which reduces maintenance cycles and licensing costs.
What does GoldenGate offer in terms of what we've talked about today?
GoldenGate solutions leverage real-time change data capture technology to enable operational and analytical data integration for the real-time enterprise. Our solutions offer continuous data feeds across heterogeneous environments with transaction integrity to facilitate decision-making with accurate, real-time information. GoldenGate can also augment existing ETL tools to provide real-time ETL solutions. Our data integration offerings include:
- Real-Time data warehousing provides real-time capture and delivery of changed data from OLTP systems to the data warehouse or operational data store (ODS) for enhanced strategic and operational BI
- Real-Time change data capture (CDC) for ETL eliminates batch windows and provides a continuous feed of changed data from OLTP systems to existing ETL systems with very low infrastructure overhead
- Live Reporting allows highly scalable reporting with real-time data while off-loading the report processing from the primary production database to any number of lower-cost secondary databases