In-Depth
Winning the Quest for Reliable and Timely Data
In a 24/7 marketplace, you need reliable and timely information. We explain the data integration challenges and how to overcome them.
by Irem Radzik
More than ever, consumers are demanding better, faster goods and services at the lowest possible price. In response, it is becoming essential for businesses to continuously enhance and differentiate their offerings while improving cost efficiency. To meet these evolving market pressures, a new set of data governance requirements is needed to improve visibility into critical business operations, minimize costs, and enhance customer insight.
Why Deliver Continuous Access to Reliable and Timely Data?
Data governance teams must build a flexible and efficient data infrastructure that can deliver 24/7 access to reliable and timely data that supports the business through improved operations and customer service. There are three key components in this new data infrastructure model:
- Low data latency
- High system availability
- Reliability
Low Data Latency
Timely data, or data freshness, plays an important role in both operational and analytical environments. Sharing low-latency data between operational systems enables businesses to meet customers' demand for timely access to information (e.g., via customer portals). In analytical environments, low-latency data keeps analysis relevant and accurate, delivering real business insights and allowing organizations to develop solutions that more effectively meet shifting customer demands.
A great example of how timely data helps businesses can be seen in customer data analysis for call center operations. Many leading organizations provide customer representatives with near real-time customer background information and recommend personalized promotions that address the customer's immediate needs. These personalized offers achieve a higher success rate and even help companies keep customers from moving their business to the competition. Having seen the benefits of low-latency data, many businesses now set specific data-latency limits (e.g., a maximum of 15 minutes) for sharing data across the enterprise.
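To make such a latency limit concrete, the following minimal Python sketch checks whether the newest replicated change still falls within a 15-minute freshness target. The is_within_sla helper and the way the last-replication timestamp is obtained are illustrative assumptions, not part of any specific product.

# A minimal sketch of a data-freshness check against a 15-minute latency limit.
# How the last-replicated timestamp is obtained is an assumption; in practice it
# would come from the target system's metadata or replication tooling.
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_LATENCY = timedelta(minutes=15)

def is_within_sla(last_replicated: datetime, now: Optional[datetime] = None) -> bool:
    """Return True if the newest replicated change is fresh enough."""
    now = now or datetime.now(timezone.utc)
    return (now - last_replicated) <= MAX_LATENCY

# Example: a change replicated 9 minutes ago meets the 15-minute target.
nine_minutes_ago = datetime.now(timezone.utc) - timedelta(minutes=9)
print(is_within_sla(nine_minutes_ago))  # True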
High System Availability
In building a robust data infrastructure, high system availability is required for multiple reasons. Obviously, it is important for running uninterrupted transactions around the clock, as customers want to do business online from anywhere in the world. It is also critical for data integration purposes. As employees and enterprise systems access and act on data from other systems more frequently, and increasingly with very low latency, a source system's availability has an impact well beyond its direct users. This paradigm is even more prominent in service-oriented architectures (SOAs), where application logic and data are shared by an increasing number of end users. Because of these interdependencies, many mission-critical systems have strict service level agreements (SLAs) for availability and performance levels.
Reliability
A reliable data infrastructure is linked to the demand for 24/7 system availability. However, reliability also refers to data being trustworthy and accurate. Poor data quality, including a lack of transactional integrity, obscures the business context and leads to poor analysis and decision making. For example, if a business has multiple accounts for an individual customer, the marketing organization will not have a complete view of that customer's portfolio. In this scenario, marketers may offer customers products they already own, or miss up-sell opportunities that a complete view would have revealed.
Turning Challenges into Opportunities
IT organizations are tasked with supporting 24/7 access to timely and reliable data -- in some cases with strict SLAs -- while facing additional challenges ranging from data architecture complexity to finding ways to reduce total cost of ownership. For data warehousing and business intelligence solutions in particular, the 24/7 access requirement means that shrinking batch windows often make it impossible to move data from source systems to the data warehouse within the allocated time using traditional ETL methods.
What do these challenges mean to data governance teams, especially in their efforts to achieve 24/7 access to timely and reliable data? Moreover, what are some of the architectural approaches to data integration that can enable access to timely and reliable data?
Reducing Architecture Complexity and Cost of Ownership
Many organizations work in environments with siloed, redundant data sources on legacy systems and a long list of data integration tools and products to bring data together. This picture hurts the bottom line by creating unnecessary redundancy that increases operational costs. Moreover, productivity and data quality suffer, as employees are forced to work with "multiple versions of the truth." Therefore, consolidating data to create complete, timely, and clean information (thereby eliminating unnecessary data silos) is a top goal for IT organizations in their efforts to enable agile and cost-effective data infrastructures.
Standardizing on heterogeneous data integration solutions that support existing systems, meet different data-latency requirements, and offer integrated data quality functions creates management efficiencies and allows an open and streamlined architecture to be established. Furthermore, IT teams should consider reducing multiple point-to-point integration instances in favor of efficient solutions that capture changed data only once -- with minimal overhead on the source -- and then distribute it to one or more targets that need the data. This "capture once, deliver to many" strategy reduces the impact on source systems, facilitates better performance, and extends the lifetime of source production systems.
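The following minimal Python sketch illustrates the "capture once, deliver to many" idea: each change event is read a single time and then handed to every registered target. The in-memory change list, the deliver_to_many helper, and the sample targets are illustrative assumptions; a real implementation would read a database transaction log or a message queue.

# A minimal sketch of "capture once, deliver to many": change events are read
# once from the source and fanned out to every downstream consumer.
from typing import Callable, Dict, List

ChangeEvent = Dict[str, object]
Target = Callable[[ChangeEvent], None]

def deliver_to_many(change_log: List[ChangeEvent], targets: List[Target]) -> None:
    """Read each change a single time, then hand it to every registered target."""
    for event in change_log:           # one pass over the captured changes
        for apply_change in targets:   # fan out to warehouse, cache, audit, ...
            apply_change(event)

# Hypothetical targets: a reporting store and an operational cache.
warehouse: List[ChangeEvent] = []
cache: Dict[object, ChangeEvent] = {}

changes = [
    {"op": "INSERT", "key": 101, "customer": "Acme", "status": "active"},
    {"op": "UPDATE", "key": 101, "customer": "Acme", "status": "gold"},
]
deliver_to_many(changes, [warehouse.append, lambda e: cache.update({e["key"]: e})])
print(len(warehouse), cache[101]["status"])  # 2 gold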
Another innovative approach to lowering TCO and improving performance within data integration architectures is to leverage the database engine's power for transformations (i.e., extract, load, and then transform within the target and/or source database). This ELT architecture allows organizations to work with a very thin middle-tier server for transformations, which lowers TCO, reduces complexity, and achieves very fast data movement and transformation performance for timely data access.
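As a rough illustration of the ELT pattern, the sketch below uses SQLite (chosen only so the example is self-contained) to load raw rows into a staging table and then run the transformation as a single set-based SQL statement inside the database engine rather than in a middle tier. Table names such as stg_orders and dim_customer_revenue are hypothetical.

# A minimal ELT sketch: extract and load raw rows into staging, then push the
# transformation down to the database engine as set-based SQL.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_orders (customer TEXT, amount REAL);
    CREATE TABLE dim_customer_revenue (customer TEXT, total REAL);
""")

# Extract + Load: raw rows land in staging with no transformation.
conn.executemany("INSERT INTO stg_orders VALUES (?, ?)",
                 [("Acme", 120.0), ("Acme", 80.0), ("Globex", 45.5)])

# Transform: runs entirely inside the database engine.
conn.execute("""
    INSERT INTO dim_customer_revenue (customer, total)
    SELECT customer, SUM(amount) FROM stg_orders GROUP BY customer
""")

print(conn.execute("SELECT * FROM dim_customer_revenue ORDER BY customer").fetchall())
# [('Acme', 200.0), ('Globex', 45.5)]
conn.close()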
Shrinking Batch Windows
As many critical systems are expected to support global 24/7 operations, the time available for batch data extracts and loads shrinks significantly. With the growth in data volumes, IT teams are increasingly investing in faster ETL solutions to complete data extracts and loads within the short amount of time available to them. However, that approach is only a temporary solution, as data volumes continue to grow and batch windows continue to shrink. Instead, the right strategy is to eliminate batch window dependency altogether and move to a non-invasive change data capture and delivery solution.
There are other benefits of using low-impact change data movement, such as minimizing source overhead and enabling high performance and availability. This is especially useful for mission-critical systems with strict SLAs. Continuous change data feeds are also one of the key methods of providing the low-latency data that consumers demand and businesses strive to use for improved decision making.
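The sketch below illustrates, under simplified assumptions, how continuous change delivery removes the batch-window dependency: the consumer tracks the last change-sequence number it applied and processes only what is new on each pass, so data keeps flowing without waiting for a nightly extract. The journal list and the apply_new_changes function are hypothetical stand-ins for a transaction-log reader.

# A minimal sketch of continuous change delivery: apply only changes newer than
# the last applied sequence number instead of re-extracting in a batch window.
from typing import Dict, List, Tuple

Change = Tuple[int, str, dict]  # (sequence number, operation, row)

def apply_new_changes(journal: List[Change], target: Dict[int, dict], last_seq: int) -> int:
    """Apply only changes newer than last_seq; return the new position."""
    for seq, op, row in journal:
        if seq <= last_seq:
            continue                      # already applied on an earlier pass
        if op == "DELETE":
            target.pop(row["id"], None)
        else:                             # INSERT or UPDATE
            target[row["id"]] = row
        last_seq = seq
    return last_seq

target_table: Dict[int, dict] = {}
journal = [(1, "INSERT", {"id": 7, "status": "new"}),
           (2, "UPDATE", {"id": 7, "status": "shipped"})]

pos = apply_new_changes(journal, target_table, last_seq=0)
journal.append((3, "DELETE", {"id": 7}))
pos = apply_new_changes(journal, target_table, pos)   # only change 3 is applied
print(pos, target_table)  # 3 {}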
Recipe for Success: Reliable and Flexible Data Infrastructure
Information continues to be a source of power and a competitive advantage in today’s fast-paced world. A flexible and reliable data infrastructure that provides continuous access to reliable and timely data is one key element in the quest for business success. Well-thought-out data integration architectures can allow organizations to turn today’s IT challenges into opportunities that move the business forward.
Irem Radzik is the director of product marketing at Oracle Data Integration. You can contact the author at irem.radzik@oracle.com.