Why Agile BI Needs an Agile Architecture
To react quickly to a changing regulatory environment, BI projects need a flexible and robust -- and agile -- underlying architecture.
By John O'Brien, CBIP, President, Zukeran Technologies Corp.
[Editor’s note: John O’Brien is leading several seminars at TDWI’s 2011 Government Summit being held this April in Crystal City, Virginia, including Architecture and Technologies for Agile OLAP on April 8. Visit http://tdwi.org/dc2011 for details.]
Agile development has been sweeping through application development teams (and more recently BI teams) across the public and private sectors alike. We’ve seen development methodologies come and go over the decades; today we’re adopting terms such as “scrum,” “sprints,” and “refactoring” into our everyday vocabulary. With each of these methodologies, the driving force has always been to improve aspects of our value delivery. Information consumers are interested in quicker delivery of information, and government agencies need to act quickly to meet regulatory requirements.
Organizations of all types and sizes want to lower the risk of projects not meeting intended business needs. Projects delivered in smaller increments reduce risk, deliver value faster, and manage ever-changing project requirements effectively in a fast-paced world.
Such needs drive development methodologies to incorporate short delivery cycles, partnerships, and fast failures (to reduce losses) while not worrying about formal requirements processes that can be rigid. However, when the organization wants the benefits of being agile, the whole system of information delivery -- not just the developers’ methodology -- has to be agile.
Agile architecture doesn’t immediately come to mind when we think about agile BI. Some might even jokingly say that it’s an oxymoron. However, information delivery does create and run on an information platform and architecture. Architectures represent solid foundations and standards that emphasize consistency and stand the test of time. This may sound as though it’s the opposite of being agile and tactical -- or of delivering quickly. We need to understand that being agile is a philosophy, and that the same benefits enjoyed by agile developers can be gained in building an information architecture.
I want to outline just a few of the principles that I have found successful in bringing architecture and platform evolution into the agile model.
Data warehouse architects need to recognize that just as information delivery projects add value to customers, their DW architectures make it possible to deliver those projects efficiently and reliably. Similarly, just as customer demands change rapidly, so must the underlying architecture. It must continually change and evolve.
We grow into our vision of data warehouse architecture with each project delivered and by mercilessly refactoring and revising the architecture. Classic architectures and principles should be the guidelines along the way, of course. TDWI’s BI Maturity Survey reinforces that data warehouses naturally evolve over time -- from the infant stage to the sage stage. Conforming dimensions, populating an EDW subject area, and consolidating staging areas are examples of refactoring. A mindset open to such changes allows agile BI projects to be completed quickly without concern that developers will ignore architectures, create information stovepipes, or run into dead ends.
Managing the physical architecture is similar in some ways to raising children. Sometimes you know ahead of time they need new winter clothes; sometimes you watch them outgrow their existing clothes when you put them on the school bus in the morning. A physical data warehouse architecture typically starts with a single server and a few hundred gigabytes of data. By the time it’s a teenager, your warehouse architecture must have matured to handle dozens of special-purpose servers and hundreds of terabytes that serve an ever-growing variety of enterprise users tackling an ever-growing variety of BI tasks.
Architecture-specific projects need to be treated the same as agile BI projects, but they should use separate sprints and be defined as distinct projects that follow BI project implementations. Typically, refactoring happens about every six months for younger data warehouse architectures that are focused on initial deliveries and every 12 to 18 months for more mature data warehouses focused on efficiency and consistency. Always look for opportunities to simplify the architecture, maintain or improve consistency, and consolidate patterns in the architecture without removing any functionality. Keep in mind that refactoring architecture can be opportunistic or, when there’s enough pain, reactive.
Thin Slicing Your Architecture
Agile BI methodologies usually incorporate the concept of thin slicing a data warehouse’s architecture as part of the sprints. Within your vision for the data warehouse, you’ll typically have data architecture layers and logical data stores with data flows. This is more common in hub-and-spoke architectures, but we also see that bus architectures will rely on staging data architectures for persistence of conformed dimensions and autonomous event data.
Thin slicing is how your sprints will slice across all those layers as needed to support the final information deliverables: data marts. When you start thin slicing across two, three, four (or more) data layers and stores, the associated effort for data profiling and modeling, ETL development, and job scheduling adds up quickly, and you find yourself struggling to organize your agile sprints around these tasks (or getting rid of them altogether). This is where the value of architecture delivery gets questioned in the middle of information delivery to users. This is nothing new; it was a primary criticism of the strategic enterprise DW projects in the 1990s.
When you thin slice the architecture horizontally rather than vertically across all the layers, you will reduce the amount of work being done in information delivery projects. Let’s build only the layers of the data architecture as needed over time to fill in the vision architecture like a puzzle, recognizing that architecture-based projects deliver business value by allowing new development to be more efficient and leveraging as much of the information architecture as possible.
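One way to read the puzzle analogy is as simple bookkeeping: track which pieces of the vision architecture already exist, and for each new project build only the pieces that are missing. The sketch below is an illustration under stated assumptions, not a prescribed method. The layer names ("staging", "edw", "data_mart"), the subject areas, and the function `plan_slice` are all hypothetical.

```python
from typing import Dict, Set, Tuple

# A "piece" of the vision architecture: (layer, subject area).
# Both the layer names and the subject areas are hypothetical examples.
Piece = Tuple[str, str]


def plan_slice(built: Set[Piece], needed: Set[Piece]) -> Dict[str, Set[Piece]]:
    """Split what a project needs into pieces to reuse and pieces to build."""
    return {
        "reuse": needed & built,   # already part of the platform
        "build": needed - built,   # new work for this project only
    }


# Pieces delivered by earlier projects.
built = {("staging", "customer"), ("edw", "customer")}

# Pieces a new (hypothetical) sales reporting project depends on.
needed = {
    ("staging", "customer"),
    ("edw", "customer"),
    ("staging", "sales"),
    ("data_mart", "customer_sales"),
}

plan = plan_slice(built, needed)
built |= plan["build"]  # the puzzle fills in as projects deliver
```

In this example the project reuses the two customer pieces and builds only the sales staging area and the new data mart, so the share of reused architecture grows with each delivery.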
We can say that refactoring the architecture is all about minimizing rework and throwaway efforts, and about reusing and leveraging the architecture as an information platform. Thin slicing may seem to mean that you must constantly deal with issues and fix your architecture, but the benefits include quicker deliveries and an architecture that evolves as needed.
The Importance of Strategic Road Maps
As in any agile project, communication is key. This is especially true when you’re building and maintaining a data warehouse architecture.
One key component of an agile BI approach will be active management and communication of strategic architecture road maps. You start by performing an assessment and inventory of your data warehouse assets and capabilities. If the inventory is point A, then the vision of your data warehouse architecture will be point B. A good architecture road map shows snapshots of your expected architecture evolution for the next three to five years. The current year will have quarterly snapshots that (at a minimum) reflect your current planned project deliveries; years 2 through 5 will offer only an annual snapshot that shows which part of the architecture you plan to focus on.
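The snapshot cadence described above (quarterly for the current year, annual for the following years) can be sketched as a small helper that lists the periods a road map should cover. This is a minimal illustration only; the function name `roadmap_periods` and its parameters are assumptions made for the example, not part of any TDWI method or tool.

```python
from typing import List


def roadmap_periods(start_year: int, horizon_years: int = 5) -> List[str]:
    """Return snapshot labels: quarterly for year 1, annual for later years."""
    if horizon_years < 1:
        raise ValueError("horizon_years must be at least 1")
    # Current year: one snapshot per quarter, tied to planned project deliveries.
    labels = [f"{start_year} Q{quarter}" for quarter in range(1, 5)]
    # Remaining years: one snapshot per year, showing the architecture focus.
    labels += [str(start_year + offset) for offset in range(1, horizon_years)]
    return labels


periods = roadmap_periods(2011)
```

With a five-year horizon starting in 2011, this yields four quarterly labels for 2011 followed by one label each for 2012 through 2015. Each label would then be paired with the planned deliveries or focus area for that period.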
At least once a year, preferably quarterly, formally update this road map and communicate it throughout your organization. Executive sponsors prefer this level of detail and progress tracking, and it gives them confidence that you have a well-thought-out plan. The annual review will also include the past year’s accomplishments and typically coincide with the budget planning process for the next year. While a physical architecture road map assists with the capital budget process, the logical data architecture deals more with the BI deliverables and capabilities.
A Final Word
There are more principles for agile architectures that I will share in my presentation at the TDWI Washington, D.C. world conference. Some of these include the use of semantic layers, data persistence principles, sandboxes, and metadata self-service architecture. At the root of all of them is the desire to be agile and to always keep in mind the value the agile philosophy delivers.
All aspects of your agile BI program will need to be agile: the people, process, architecture, data, and technology. Agile architecture keeps the data warehouse evolving naturally while increasing the value of the information platform and minimizing rework.
John O'Brien, CBIP, is president of Zukeran Technologies Corp. and a recognized architecture and analytics visionary. In 2004, John pioneered the first massively scalable transparent data warehouse appliance in response to a market need for more scalable data environments. You can contact the author at firstname.lastname@example.org.