Q&A: Managing the Risks of Offshore Data Warehousing
Moving a data warehouse offshore can bring significant cost savings, but such a move introduces elements of risk that must be managed.
- By Linda L. Briggs
In this interview, we discuss the advantages and challenges of offshore data warehousing with John Ward, who leads delivery of HP's business intelligence best-shore centers, including offshore centers in China and India and near-shore centers in the U.S., Spain, and Bulgaria.
As manager of best-shore delivery centers for HP's Enterprise Information Solutions group, Ward is responsible for building offsite delivery capacity and capabilities to help clients cost-effectively meet growing demands for BI and data warehouse development. Ward, who has over 20 years of experience in consulting services, leads the service offering development for related services including HP's Data Integration Factory, BI Testing, and BI Operations and Support offerings.
Ward joined analyst Krish Krishnan in discussing the topic of offshore data warehousing at the TDWI Webinar on September 27, Managing the Risks of Offshore Data Warehousing.
BI This Week: Let's start with a definition of terms: what is offshoring versus near-shoring?
John Ward: In the data warehousing space, we typically use the term "offshore" to mean transferring work to any country outside the client's country. This includes what we call "far-shore" and "near-shore." Far-shore centers provide scale and the lowest possible costs. Near-shore centers have other characteristics, or satisfy business requirements, that make them a more appropriate location for meeting client requirements, still at a lower cost than the client's location.
For example, we have near-shore centers within the European Union, which help us take on work that cannot be outsourced outside the EU. Our data warehousing center in Sofia, Bulgaria has a great combination of data warehousing technical skill, multi-lingual capabilities, and, for European clients, a convenient time zone and easy travel for meetings. HP uses the term "best-shore" to reflect our global delivery strategy, directing work to locations that are best suited to support our clients' requirements at the lowest cost. We also recognize that some activities are best provided onsite at client locations or in onshore centers within the same country.
These terms are starting to evolve as multi-nationals become truly global companies with business leadership distributed around the world. Key business functions can be distributed worldwide, blurring what is onshore, near-shore, and far-shore. As an example, HP has a global oil and gas client with finance leadership in South America, while sales and marketing is led from Asia Pacific. In this case, classic "onshore" activities, such as specifications and design, were performed in traditionally offshore locations and vice-versa.
Why might offshoring (or near-shoring) be an attractive proposition for data warehousing? What are some of the biggest potential benefits? Is it mainly a cost consideration?
By far the main benefit is lower costs, but secondary benefits of using large offshore centers include flexible resource supply to handle spikes in demand, access to alternate technology pools in the same centers (such as Java for report distribution portals), rigorous process compliance, and simply getting work done at night when traditional data warehouse batch processing occurs.
Even though cost is typically the main driver for offshoring data warehouse development, I caution people to look at the overall project cost, not just the hourly rate. You have to consider risk mitigation factors in the cost -- for example, allocation of senior resources with experience in data warehousing and in your industry, or flying people between onshore and offshore locations. Both of these may cost more, but without them, a project is more prone to mistakes and delays -- and thus, higher cost.
Speaking of risks and risk mitigation, what are some of the risks of offshore data warehouse development?
The most common risks are lower data quality and missed deadlines, both of which can cause users to reject the data warehouse and continue using operational or departmental data sources.
I like to break the risks into two general areas: project management and communication. Management becomes more difficult when the team supporting your project or application is half a world away. It is hard to gauge progress, quality or confusion from a distance.
Communication can be challenging because of the distance, but also due to language and cultural differences. By cultural differences, I mean two things. First, how do people interact? For example, do they have a propensity to say "no problem" all the time? Conversely, do they always perceive problems? Second, I mean a basic cultural understanding of how business operates in different parts of the world. For example, I have had to support value-added tax (VAT) systems and have always found this confusing because we rarely have this type of taxation in the U.S. I use this example to demonstrate that cultural issues cross all shores. (As an aside, learning about other cultures is the best part of my job.)
A third risk is that, in the quest for lower rates, it is easy to find firms that are really good at other types of development but that just don't "get" data warehousing. I have seen situations in which an offshore bid comes in significantly lower than other bids. Yes, the cost may be low, but it generally isn't because of the lower rates; it's because they didn't understand the project and thus grossly shortchanged requirements, architecture, and analysis phases.
What are some ways to lessen some of those risks?
I'll start with communication. It is difficult to quantify, but most will agree that getting people together in person has significant benefits to communication and the success of any project or activity. It facilitates openness, gives the offshore team leaders the ability to "hear it themselves" about business needs, and builds a foundation for better communication going forward.
Also key to improving communication is using design and specification templates that are specific to data warehousing, containing areas that are unique to our domain, such as traceability, balancing methods, data lineage, and job control. Coordinators serve as an onsite completeness check for specifications, so it is critical that they have cultural understanding of the offshore country to catch documentation weaknesses around assumed understanding. Coordinators also need to have relevant data warehousing skills so they can knowledgeably resolve issues for the offshore team during the day.
To handle difficulties managing a remote team, divide work into very small units and measure progress down to the specific ETL mapping, graph, report, and so forth. The sooner delays are identified, the faster they can be resolved, so don't send whole sections of the data model to be "filled" offshore. Also, measure changes with an eye on late-breaking data issues. These typically signal weakness in data profiling, which should be addressed with the highest urgency. Finally, track defects to the specific developer. Although defects are most often caused by poor upstream specifications, there are some instances in which programmers just may not be qualified for the work assigned.
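A minimal sketch of this kind of fine-grained tracking, assuming each work unit is a single ETL mapping or report with one assigned developer. The `WorkUnit` fields and both helper functions are hypothetical names chosen for illustration, not part of any HP tool or process.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class WorkUnit:
    """One small, separately tracked deliverable (hypothetical structure)."""
    name: str        # e.g. a single ETL mapping or report
    developer: str   # person the unit (and its defects) is attributed to
    due: date
    done: bool = False
    defects: int = 0


def late_units(units, today):
    """Return names of unfinished units whose due date has already passed."""
    return [u.name for u in units if not u.done and u.due < today]


def defects_by_developer(units):
    """Total the defects logged against each developer's units."""
    counts = Counter()
    for u in units:
        counts[u.developer] += u.defects
    return dict(counts)


units = [
    WorkUnit("map_customer_dim", "dev_a", date(2024, 1, 10), done=True, defects=1),
    WorkUnit("map_orders_fact", "dev_b", date(2024, 1, 12), defects=3),
    WorkUnit("rpt_daily_sales", "dev_a", date(2024, 1, 20)),
]
print(late_units(units, date(2024, 1, 15)))   # ['map_orders_fact']
print(defects_by_developer(units))            # {'dev_a': 1, 'dev_b': 3}
```

Because each unit is small, a slipped due date surfaces within days rather than at the end of a phase. A high per-developer defect count is only a prompt to investigate, since the cause is more often a weak upstream specification.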
What are some critical success factors to offshoring, especially as it pertains to data warehousing and BI?
Critical success factors start with a partner who has long experience in data warehousing. In fact, many of the critical success factors for offshore data warehouse development are the same as for onshore projects. For example, you need people with years of experience in data warehousing and your specific domain, along with junior developers who are really well trained. The training needs to be effective. Other things to look at include methodology -- you need a process so everyone understands what to do and how to do it.
Additionally, onshore and offshore project teams need to be integrated. Although you need a tiered leadership structure offshore, the overall team structure should be singular -- not duplicating responsibilities by dividing major work items between centers.
Finally, there has to be a good process for management oversight. You need to have visibility into all aspects of the project so you are aware of any problems and can help fix them right away to avoid project delays.
Is it best to start small, perhaps offshoring one division's data mart or warehouse? Or is it better to pick a large data warehouse, perhaps one with mature processes in place, and begin there?
We have found that the best place to start with offshore data warehousing is with code and unit test portions of ETL and report development. As capabilities and experience mature, you can move to earlier phases in the life cycle. This applies to both project work and enhancements performed as part of the support team. Make sure clear deliverable templates, architectural standards, code review, and promotion processes are well defined and explained to offshore teams. Make sure architectural standards address what we refer to as audit, balance, and control -- metadata describing the processing environment, including error checking, error reporting, dependencies, and restart logic. Finally, monitor progress to identify issues around meeting delivery expectations for timeliness and quality.
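One possible reading of the "balance" part of audit, balance, and control is a reconciliation of row counts and control totals between a source extract and the loaded target. The sketch below works under that assumption; the `balance_check` function and the `amount` field are hypothetical, and a real standard would also cover the audit and control metadata mentioned above.

```python
from decimal import Decimal


def balance_check(source_rows, target_rows, amount_field="amount"):
    """Compare row counts and a summed control total between two row sets.

    Rows are dicts; amounts are converted via str() to Decimal so that
    the totals are compared exactly rather than as floats.
    """
    source_total = sum(
        (Decimal(str(r[amount_field])) for r in source_rows), Decimal("0")
    )
    target_total = sum(
        (Decimal(str(r[amount_field])) for r in target_rows), Decimal("0")
    )
    return {
        "source_count": len(source_rows),
        "target_count": len(target_rows),
        "counts_match": len(source_rows) == len(target_rows),
        "totals_match": source_total == target_total,
        "difference": source_total - target_total,
    }


source = [{"amount": "10.50"}, {"amount": "4.25"}]
target = [{"amount": "10.50"}]          # one row was dropped during the load
result = balance_check(source, target)
print(result["counts_match"], result["totals_match"], result["difference"])
# False False 4.25
```

A check like this would typically run at the end of each load job, with a mismatch feeding the error reporting and restart logic rather than failing silently.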
Is security really a top risk or is that more a perception?
Outside of defense and public sector, we have seen few situations in which security is such an issue that data cannot be processed offshore. This includes countries with strict regulations around customer, financial, and healthcare information. One approach to securing data is data masking -- for example, changing Social Security numbers across the data warehouse -- but because of the nature of data warehousing work, this can be prohibitively expensive. As a result, it is usually necessary to give the offshore team access to your sensitive data. This requires "barrier" approaches designed to prevent data from being moved to unsecured platforms, along with physical security, intrusion prevention, and testing and monitoring data movement to ensure data is secured. In any case, a comprehensive approach to technical and physical security ensures multiple layers of controls to facilitate offshore services.
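To make the data-masking idea concrete, here is a minimal sketch of deterministic masking, assuming the goal is for a given Social Security number to map to the same substitute value everywhere it appears so that joins across tables still work. The `mask_ssn` function and the key handling are hypothetical. This is a keyed hash, not format-preserving encryption, so distinct inputs can occasionally collide, and it cannot be reversed.

```python
import hashlib
import hmac


def mask_ssn(ssn: str, key: bytes) -> str:
    """Replace an SSN with a stable 9-digit substitute derived from a secret key.

    The same SSN and key always yield the same masked value, whether or not
    the input contains dashes.
    """
    digits = "".join(ch for ch in ssn if ch.isdigit())
    digest = hmac.new(key, digits.encode("ascii"), hashlib.sha256).digest()
    # Reduce the first 8 bytes of the HMAC to a 9-digit number.
    number = int.from_bytes(digest[:8], "big") % 10**9
    masked = f"{number:09d}"
    return f"{masked[:3]}-{masked[3:5]}-{masked[5:]}"


key = b"example-only-key"  # assumption: a real key would stay onshore
print(mask_ssn("123-45-6789", key) == mask_ssn("123456789", key))  # True
```

Even with a function this small, every table, file, and test fixture that carries the value has to pass through it, which is one reason masking across a whole warehouse becomes expensive.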
How long before the benefits of offshoring a data warehousing project are typically realized?
Realistically, maturation of the offshore unit can take up to two years. A number of strategies can greatly improve and accelerate this process and should be part of all offshore BI programs. These include:
- Staffing the offshore team with resources experienced in industries similar to yours.
- Staffing the offshore team with more senior resources.
- Assigning offshore resources to rotational assignments onshore so they can work with business subject matter experts (this includes involving offshore staff in upfront requirements gathering).
- Tightly integrating onshore and offshore resources; this facilitates direct training and mentorship of offshore resources so they are more knowledgeable about your data and business processes.
- Providing comprehensive onboarding processes or "client universities"; make sure the resources assigned to your project will get training on your business processes, data models, and technology. We make sure our resources get comprehensive training -- not just on the technology but also on why and how they would use the technology and on client business processes.
Are there particular countries that are easier to work with, or is the most important factor simply finding the right partners?
I have found people to be more similar than different around the globe. When selecting a location, first and foremost consider the availability of skilled resources. India and China lead in skills availability, primarily due to their size and education systems. Other considerations include language, legal and IP protections, time-zone similarity, visa regulations to allow for onsite travel, and proximity to client business operations.
How does HP help reduce the risk of offshore data warehousing projects?
HP has a long history of solving some of the most complex data challenges our clients present to us. We are really good at business intelligence, and our experience lets us manage the risks of offshoring. For example, we have what I think is the best training offshore. Not only that, but we also have rigorous processes and are constantly working on ways to improve the quality of our delivery. We have a complete team of senior people onshore -- people who have walked in your shoes and trained in your industry -- and junior developers offshore.
With HP, clients get the benefits of the world's largest technology company with a team specialized in offshore BI delivery. Our consultants have delivered hundreds of offshore and offsite BI development projects to customers ranking among the largest firms in banking, retail, cable, oil and gas, and healthcare. We have global scale, with data warehouse practitioners across client locations and our nine global BI delivery centers.