Q&A: Managing the Risks of Offshore Data Warehousing

Moving a data warehouse offshore can bring significant cost savings, but such a move introduces elements of risk that must be managed.

In this interview, we discuss the advantages and challenges of offshore data warehousing with John Ward, who leads delivery of HP's business intelligence best-shore centers, including offshore centers in China and India and near-shore centers in the U.S., Spain, and Bulgaria.

As manager of best-shore delivery centers for HP's Enterprise Information Solutions group, Ward is responsible for building offsite delivery capacity and capabilities to help clients cost-effectively meet growing demands for BI and data warehouse development. Ward, who has over 20 years of experience in consulting services, leads the service offering development for related services including HP's Data Integration Factory, BI Testing, and BI Operations and Support offerings.

Ward joined analyst Krish Krishnan to discuss offshore data warehousing at the September 27 TDWI Webinar, "Managing the Risks of Offshore Data Warehousing."

BI This Week: Let's start with a definition of terms. What is offshoring versus near-shoring?

John Ward: In the data warehousing space, we typically use the term "offshore" to mean transferring work to any country outside the client's country. This includes what we call "far-shore" and "near-shore." Far-shore centers provide scale and the lowest possible costs. In HP terms, this typically means countries such as India, China, the Philippines, and Malaysia. Near-shore centers have other characteristics that make them a better fit for particular client requirements, while still costing less than the client's own location. For example, we have near-shore centers within the European Union, which help us take on work that cannot be outsourced outside the EU. Our data warehousing center in Sofia has a great combination of data warehousing technical skill, multilingual capabilities, and, for European clients, a convenient time zone and easy travel for meetings. Time zones are also an important consideration for many of our U.S.-based clients, which is why we have significant "near-shore" delivery centers in Central and South America.

HP uses the term "best-shore" to reflect our global delivery strategy, directing work to locations that are best suited to support our clients' requirements at the lowest cost. We also recognize that some activities are best provided onsite at client locations or in onshore centers within the same country.

These terms are starting to evolve as multi-nationals become truly global companies with business leadership distributed around the world. Key business functions can be distributed worldwide, blurring what is onshore, near-shore, and far-shore. As an example, HP has a global oil and gas client with finance leadership in South America, while sales and marketing is led from Asia Pacific. In this case, classic "onshore" activities, such as specifications and design, were performed in traditionally offshore locations and vice-versa.

Why might offshoring (or near-shoring) be an attractive proposition for data warehousing? What are some of the biggest potential benefits? Is it mainly a cost consideration?

By far the main benefit is lower costs, but secondary benefits of using large offshore centers include flexible resource supply to handle spikes in demand, access to alternate technology pools in the same centers (such as Java for report distribution portals), rigorous process compliance, and simply getting work done at night when traditional data warehouse batch processing occurs.

Even though cost is typically the main driver for offshoring data warehouse development, I caution people to look at the overall project cost, not just the hourly rate. You have to consider risk mitigation factors in the cost -- for example, allocation of senior resources with experience in data warehousing and in your industry, or flying people between onshore and offshore locations. Both of these may cost more, but without them, a project is more prone to mistakes and delays -- and thus, higher cost.

Speaking of risks and risk mitigation, what are some of the risks of offshore data warehouse development?

The most common risks are lower data quality and missed deadlines, both of which can cause users to reject the data warehouse and continue using operational or departmental data sources.

I like to break the risks into two general areas: project management and communication. Management becomes more difficult when the team supporting your project or application is half a world away. It is hard to gauge progress, quality, or confusion from a distance.

Communication can be challenging because of the distance but also because of language and cultural differences. By cultural differences, I mean two things. First, how do people interact? For example, do they have a propensity to say "no problem" all the time? Conversely, do they always perceive problems? Second, I mean a basic cultural understanding of how business operates in different parts of the world. For example, I have had to support value-added tax (VAT) systems and have always found this confusing because we rarely have this type of taxation in the U.S. I use this example to demonstrate that cultural issues cross all shores. (As an aside, learning about other cultures is the best part of my job.)

A third risk is that in the quest for lower rates, it is easy to find firms that are really good at other types of development but that just don't "get" data warehousing. I have seen situations in which an offshore bid comes in significantly lower than other bids. Yes, the cost may be low, but it generally isn't because of the lower rates; it's because they didn't understand the project and thus grossly shortchanged requirements, architecture, and analysis phases.

What are some ways to lessen some of those risks?

I'll start with communication. It is difficult to quantify, but most will agree that getting people together in person has significant benefits to communication and the success of any project or activity. It facilitates openness, gives the offshore team leaders the ability to "hear it themselves" about business needs, and builds a foundation for better communication going forward.

Also key to improving communication is using design and specification templates that are specific to data warehousing, containing areas that are unique to our domain, such as traceability, balancing methods, data lineage, and job control. Onsite coordinators serve as a completeness check for specifications, so it is critical that they have a cultural understanding of the offshore country in order to catch documentation weaknesses that stem from assumed understanding. Coordinators also need relevant data warehousing skills so they can knowledgeably resolve issues for the offshore team during the day.

To handle the difficulties of managing a remote team, divide work into very small units and measure progress down to the specific ETL [extract, transform, and load] mapping, graph, or report. The sooner delays are identified, the faster they can be resolved, so don't send whole sections of the data model to be "filled" offshore. Also, measure changes with an eye on late-breaking data issues. These typically signal weakness in data profiling, which should be addressed with the highest urgency. Finally, track defects to the specific developer. Although defects are most often caused by poor upstream specifications, there are some instances in which programmers just may not be qualified for the work assigned.
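As a rough illustration of the tracking described above, the following Python sketch records each small work unit (one mapping, graph, or report), flags unfinished units that are past due, and totals defects per developer. The `WorkUnit` fields, unit names, developer IDs, and dates are all hypothetical, chosen only for the example; a real project would pull this from its own tracking tool.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class WorkUnit:
    """One small, separately tracked deliverable (hypothetical structure)."""
    name: str        # a single ETL mapping, graph, or report
    developer: str
    due: date
    done: bool = False
    defects: int = 0


def overdue_units(units, today):
    """Return the names of unfinished units whose due date has passed."""
    return [u.name for u in units if not u.done and u.due < today]


def defects_by_developer(units):
    """Total the defects logged against each developer."""
    totals = Counter()
    for u in units:
        totals[u.developer] += u.defects
    return dict(totals)


# Illustrative data only.
units = [
    WorkUnit("map_customer_dim", "dev_a", date(2024, 3, 1), done=True, defects=1),
    WorkUnit("map_sales_fact", "dev_b", date(2024, 3, 4), defects=4),
    WorkUnit("rpt_monthly_revenue", "dev_a", date(2024, 3, 10)),
]

print(overdue_units(units, today=date(2024, 3, 6)))
print(defects_by_developer(units))
```

Because each unit is small, a slipped date shows up within days rather than at the end of a phase. A high defect count for one developer is a prompt to look first at the upstream specification and only then at the assignment.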

What are some critical success factors to offshoring, especially as it pertains to data warehousing and BI?

Critical success factors start with a partner who has long experience in data warehousing. Most of the team needs experience. Project leads should have more than 15 years of experience with multiple implementations in your industry. Team leads should have more than five years specifically focused on data warehousing skills. As on any data warehousing team, only a fraction of the resources can be newly trained on ETL or reporting tools. For those newly trained resources, make sure the training program covers more than just technology and includes data warehouse architecture, modeling, and basic data analysis.

Additionally, onshore and offshore project teams need to be integrated. Although there needs to be a tiered leadership structure offshore, the overall team structure should be singular -- not duplicating responsibilities by dividing major work items between centers.

Finally, the success of a data warehouse is often measured by the adaptability to meet future business analytics requirements. This happens when the underlying data structure is built not just for current needs but also for future flexibility. Although future requirements are difficult to predict, data warehousing experts who focus on particular industries and have experience with multiple data warehouse environments often understand the most flexible and useful structure. Probably the most critical success factor is to make sure your offshore provider has industry expertise as part of the team. This can be hard to find in an offshore provider.

Is it best to start small, perhaps offshoring one division's data mart or warehouse, or is it better to pick a large data warehouse, perhaps one with mature processes in place, and begin there?

We have found that the best place to start with offshore data warehousing is with the code and unit test portions of ETL and report development. As capabilities and experience mature, you can move to earlier phases in the life cycle. This applies to both project work and enhancements performed as part of the support team.

Make sure clear deliverable templates, architectural standards, code review, and promotion processes are well defined and explained to offshore teams. Make sure architectural standards address what we refer to as audit, balance, and control -- metadata describing the processing environment, including error checking, error reporting, and dependencies and restart logic. Finally, monitor progress to identify issues around meeting delivery expectations for timeliness and quality.
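To make the "balance" part of audit, balance, and control concrete, here is a minimal Python sketch that compares row counts and a summed amount between a source extract and the loaded target. The function name, the `amount` field, and the sample rows are assumptions made for illustration and do not come from HP's methodology.

```python
from decimal import Decimal


def balance_check(source_rows, target_rows, amount_key="amount"):
    """Compare row counts and amount totals between source and target.

    Amounts are converted through str() to Decimal so that totals are
    compared exactly rather than with floating-point rounding.
    """
    src_total = sum(Decimal(str(r[amount_key])) for r in source_rows)
    tgt_total = sum(Decimal(str(r[amount_key])) for r in target_rows)
    return {
        "row_count_ok": len(source_rows) == len(target_rows),
        "amount_ok": src_total == tgt_total,
        "source_total": src_total,
        "target_total": tgt_total,
    }


# Illustrative data only: one row was dropped during the load.
source = [{"amount": "10.50"}, {"amount": "4.25"}]
target = [{"amount": "10.50"}]

result = balance_check(source, target)
if not (result["row_count_ok"] and result["amount_ok"]):
    print("Load out of balance:", result)
```

In practice a check like this would run after each load, and its result would be written to the control metadata so that error reporting and restart logic can act on it.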

Is security really a top risk or is that more a perception?

Outside of defense and public sector, we have seen few situations in which security is such an issue that data cannot be processed offshore. This includes countries with strict regulations around customer, financial, and healthcare information. One approach to securing data is data masking -- for example, changing Social Security numbers across the data warehouse -- but because of the nature of data warehousing work, this can be prohibitively expensive. As a result, it is usually necessary to give the offshore team access to your sensitive data. This requires "barrier" approaches designed to prevent data from being moved to unsecured platforms, along with physical security, intrusion prevention, and testing and monitoring data movement to ensure data is secured. In any case, a comprehensive approach to technical and physical security ensures multiple layers of controls to facilitate offshore services.
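For readers unfamiliar with data masking, the sketch below shows one common style: deterministic masking, in which the same input always produces the same masked value so joins across tables still line up. It uses Python's standard `hmac` and `hashlib` modules. The function name, key, and sample value are hypothetical, and this is an illustration of the idea rather than a vetted masking product or the approach HP uses.

```python
import hashlib
import hmac


def mask_ssn(ssn: str, secret_key: bytes) -> str:
    """Replace an SSN with a keyed, repeatable 9-digit stand-in.

    The same SSN and key always yield the same masked value, so the
    masked column can still be used as a join key. Distinct SSNs could
    in rare cases collide, which a production tool would need to handle.
    """
    normalized = ssn.replace("-", "")
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
    digits = f"{int(digest, 16) % 10**9:09d}"  # fold the hash into 9 digits
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


# Illustrative key and value only; a real key would be kept onshore.
key = b"demo-key-not-for-production"
print(mask_ssn("123-45-6789", key))
```

Applying even a simple function like this consistently to every table, file, and test extract in a warehouse is where the expense mentioned above tends to arise.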

How long before the benefits of offshoring a data warehousing project are typically realized?

Realistically, maturation of the offshore unit can take up to two years. A number of strategies can greatly improve and accelerate this process and should be part of all offshore BI programs. These include:

  • Staffing the offshore team with resources experienced in industries similar to yours.

  • Staffing the offshore team with more senior resources.

  • Assigning offshore resources to rotational assignments onshore so they can work with business subject matter experts (this includes involving offshore staff in upfront requirements gathering).

  • Integrating onshore and offshore resources tightly; this facilitates direct training and mentorship of offshore resources so they are more knowledgeable about your data and business processes.

  • Providing comprehensive onboarding processes or "client universities"; make sure the resources assigned to your project will get training on your business processes, data models, and technology. We make sure our resources get comprehensive training -- not just on the technology but also on why and how they would use the technology and on client business processes.

Are there particular countries that are easier to work with, or is the most important factor simply finding the right partners?

I have found people to be more similar than different around the globe. When selecting a location, first and foremost consider the availability of skilled resources. India and China lead in skills availability, primarily due to their size and education systems. Other considerations include language, legal and IP protections, time-zone similarity, visa regulations to allow for onsite travel, and proximity to client business operations.

How does HP help reduce the risk of offshore data warehousing projects?

First, HP takes communication very seriously. We've built robust steps in our methodology to ensure requirements, specifications, architecture, and design are complete and clear for offshore teams. This up-front rigor leads to a higher quality result. Also, our approach includes detailed tracking as well as regular and frequent communications so our clients always know exactly how their projects are progressing.

We also take training very seriously. Our offshore resources receive weeks of training on integration-specific methods, tools, architecture, and data. Our project onboarding prepares our staff on client-specific business processes, data requirements, and implementation standards.

We use our own business intelligence-specific methodology, which includes data-specific templates and deliverables that typically are absent in generic methodologies. We also have global scale, with data warehouse practitioners across client locations and our nine global BI delivery centers.
