In-Depth
Data Warehousing on a Shoestring Budget (Part 2 of 3)
You can implement data warehouse solutions on a small budget by focusing on system, database, ETL, and reporting technologies that work in concert with requirements gathering, development, testing, and training. In the second part of our series, we look at how to keep human/personnel costs low.
by Nathan Rawling
As we explained in the first part of this series last week (http://www.tdwi.org/News/display.aspx?ID=8950), organizations can maximize a limited budget when purchasing hardware and software simply by staying focused on immediate organizational needs and wisely selecting tools that meet these exact requirements. However, there is another important aspect of a data warehousing project where organizations must be accountable and, frankly, careful, in their budgeting -- the human costs.
What are human costs? Simply, they are costs inherent in needing and having individuals select data warehousing tools, implement these tools, and train staff in the proper use of this technology. To support your data warehouse project you will need to find individuals to fit these roles in your organization, hire additional staff, or outsource these needs.
Individual Roles
Traditionally, a data warehouse implementation requires many roles: a sponsor, business analysts, database administrators (DBAs), data modelers and/or data architects, ETL developers, reporting/business intelligence developers, and testers. In a complex and sizeable project, many individuals may be required in each role. For a smaller project, however, a large staff clearly would not match the project's needs. It is simply enough to ensure that every role is covered with sufficient personnel to complete the effort in a timely manner. In planning a data warehousing project, these roles and subsequent responsibilities must be considered when balancing your human costs.
Sponsor: A DW sponsor serves as the business representative from the office or department of an organization that seeks to benefit from the data warehouse. The sponsor is active throughout the effort but is focused on facilitating the relationship between the warehouse builders and the end users of the project. The time required for this important task cannot be underestimated. From organizing interviews for requirements gathering to reviewing proposed designs to approving testing paradigms and organizing training, the sponsor is responsible for some of the most critical activities within a data warehousing implementation.
Business Analyst: The tasks of gathering requirements from the business, assimilating them, and producing coherent documentation for the team members is traditionally performed by a business analyst (BA). In some environments, the BAs can take on even more responsibility, such as data profiling, data mapping, and testing. In many ways, however, the role of the business analyst is most critical in successful data warehousing projects because they translate the needs of the business to the technical team.
Database Administrator (DBA): The database administrator manages the database and, in some cases, the server which houses the database. In a DW project, the DBA is likely to perform capacity planning, create databases and schemas, and recommend/create indices. DBAs are also generally responsible for the security of the database, so the DBAs need to be involved in developing and implementing access requirements and policies.
Data Modeler/Data Architect: In a DW project, data from many source systems is taken and delivered into the warehouse or data mart. Understanding the source systems and developing the target schema requires a strong background in data structures and knowledge of data warehousing strategies for data storage (star and snowflake schemas). Depending upon the complexity of the source systems, it's likely that an organization will need a small effort from this role relative to other tasks. Most DW projects will not require a full-time data architect, but instead will require data architecture activities be completed at certain points throughout the project.
ETL Developer: The ETL developer creates the data transformations using the requirements from the business analyst. Again, depending upon the complexity and number of the necessary transformations, this is frequently one of the longest-running efforts in a DW project. Use of ETL tools such as Informatica or Data Stage can certainly speed development but can drive up software costs as well. Conversely, manually coding ETL using procedural languages can be very slow. Carefully balance these two expenses to minimize the overall total cost for the organization.
BI Developer: The BI or reporting developer, the counterpart for the ETL developer, is responsible for the front-end reporting solution that delivers data to the end users. Using the requirements provided by the business analyst, the BI developer designs and develops reports, analytical tools and dashboards so that end users can easily access the data warehouse. Frequently, this activity is second only to the ETL development in effort.
Tester: The ETL and BI developers generally perform unit testing -- testing the individual components in development. However, they rarely perform end-to-end validation, which is testing of the whole system from source to final output reports. Such stringent testing requires test plans and careful execution to ensure that the DW solution effectively meets the business requirements, which is where the position of tester comes into play on the DW team. In many cases, testing does not require substantial effort, but unfortunately the amount of time allocated for testing is nearly always underestimated. Sufficient testing to ensure that the requirements have been met exactly presents a significant undertaking.
End User: While the role of the end user seems obvious, it is not uncommon for them to participate in requirements gathering sessions, late acceptance testing and systems training. Although important, time spent to aid in DW implementation can certainly add up and reflects time taken away from other job responsibilities.
Combining Roles
Small teams can produce stunning results when faced with appropriately sized projects. The key to success, though, is that the implementation team must have the right composition, with the right mix of skills, and time to complete all necessary tasks. When attempting to consolidate roles into individual team personnel, the most important considerations are the following:
- Total effort required by role in current project
- Concurrency of efforts
- Available skill combinations
- Traditional responsibilities/skill overlaps
Three activities within the typical data warehousing project constitute the bulk of both effort and time: requirements-gathering and analysis, ETL development, and BI development. ETL and BI developments cannot truly begin until the requirements are gathered completely. Therefore, it is possible to have the same resources perform the requirements-gathering and other roles in the implementation without necessarily impacting the project timeline. However, it is not uncommon for the ETL development and BI development to execute concurrently to speed project completion. You eliminate this type of project plan if you combine the ETL developer and BI developer roles into a single team member.
When building a small team to implement a data warehouse, it is possible to effectively complete this type of project with as few as two people in addition to the sponsor and the end users. Yet, while it is possible to run a single-person project, it is rarely advantageous to do so, especially since timelines tend to be very long.
The DBA/Architect/ETL Developer: Aside from the previously mentioned option of ETL developer and/or BI developer, a common scenario is to combine the various intensive database roles. Many ETL developers have some prior experience with database administration and data warehousing methodologies. However, not every ETL developer has all of these skills. Yet, finding one that does possess these skills can potentially save adding two full-time employees to your payroll. The typical data warehouse schema is not a complex data model to design and the task of creating tables within the database is typically within the capabilities of most developers.
There are risks to this approach, however. If a complex database issue arises, it will take valuable time away from ETL development. It may also simply be beyond the database administration ability of an ETL developer. Furthermore, even as the target schema in a DW implementation tends toward star-schema simplicity, the source databases can have varying degrees of complexity, and therefore an experienced data architect might be needed. Still, in a typical data warehousing project, merging these roles makes sense if you do not require a full-time data architect and DBA on your project.
The Business Analyst/BI Developer/Tester: Although ETL developers tend to have a somewhat stronger technical background, the BI developers tend to come from a business background. For this specific reason, it is sometimes simpler for BI developers to take over the responsibilities of interacting with the business. Moreover, as end-to-end testing cannot take place until the reporting solution is completed, it makes the most sense to allocate that specific role to this person as well. When troubleshooting, the combination of these roles will be beneficial to the organization as these resources, typically, must work together to correct any deficiencies.
When a DW project involves multiple, complex source systems or a sophisticated business transformation, it is not always appropriate to ask a business-focused individual to complete all of the requirements' analysis outside their area of expertise. In those circumstances, assigning requirements-gathering duties to the database person or sharing those responsibilities will likely yield superior results.
Outsourcing
Going outside of the organization is a different yet excellent way to access DW professionals. Although per-hour rates can seem steep at times, the savings from hiring a dedicated team can be substantial. Outsourcing can be a terrific way for an organization to begin a data warehousing initiative as it later takes a strong first step toward building an in-house team to manage the solution.
Another scenario where outsourcing is beneficial is when an organizational project requires extra help to fill a particular role or requires overflow capacity. If you have a business analyst, DBA, ETL developer, and BI developer available, but your organization still needs a data modeler who is capable of mapping complex source systems, outsourcing makes sense. Similarly, if the ETL development will be more substantial than your existing resources can accomplish, bringing in additional experts to supplement your team members can allow your organization to meet demands within the project plan without being concerned about having excess/idle team members at other times in the project.
Checkpoint consulting (consulting provided by an expert on a part-time basis over the life of the project) has proven to be an excellent way of benefiting from the vast experience of an enterprise DW team while paying for only a fraction of a consultant. An experienced consultant with broad and deep knowledge and experience can evaluate your existing data warehouse and business intelligence solutions, make recommendations, and, ultimately, help you leverage your existing assets.
Each of these roles is critical to the proper implementation of a data warehouse, but each is also a key aspect of the human costs associated with each DW project. Clearly, these roles are important and as much as an organization attempts to maximize its budget through cost-effective decision-making, one area where an organization must be willing to make an investment is in its DW implementation team. This is an investment, although seemingly too significant at first, which makes all of the difference when considering a project's success. When operating on a limited budget, it is the organization's responsibility to find a way to maximize its people for the benefit of a project as opposed to the detriment of it.
- - -
Nathan Rawling is a data integration/business intelligence consultant with ISA Consulting, a systems integration firm. You can contact him at [email protected].