Q&A: Agile BI Architectures

What is an agile BI architecture, and how does it fit into an organization's overall BI approach?

What does it mean to develop an agile BI architecture, and how does such an architecture fit in with an organization's overall BI approach? To learn more about this emerging topic, we turned to John O'Brien, a long-time industry information architect and the CTO of Dataupia, a data warehouse appliance company delivering massively parallel processing data warehouse architectures.

BI This Week: What is your perspective on today's agile BI movement?

John O'Brien: Each major trend or era in data warehousing was in response to something that wasn't working well or failed to deliver business value with less risk. It's sometimes hard to remember what the challenges were at a given point in BI history that created a shift.

In the case of agile BI, I believe we're seeing a fundamental shift in application development in IT organizations, and you have to ask yourself, if it's so good in other environments, then why aren't we leveraging it in the BI world, too?

We share several challenges with other development teams: the business is changing faster to remain competitive, requirements are less well known or understood, and teams are realizing that co-developing with the business yields better solutions. Development should be about lowering the investment and risk to try new processes and business models.

An agile BI approach attempts to overcome these challenges with a combination of agile development processes, agile BI/IT organizations, and agile technologies. Agile BI architectures must ensure consistent information access through governance, with the flexibility to support different database technologies, proper data model representations, near-real-time integrations, and faster-than-ever changing environments.

Where should a BI team start when beginning to take an agile approach?

I've seen that most teams are already doing some form of agile BI, and they recognize this. I remind BI teams to make sure that they're not looking only at agile development methodologies such as Scrum. To achieve agility in BI for the business, they should assess how all the other parts of BI delivery are embracing agile. Do you have processes that allow for communication and quick changes? Are people empowered to make decisions? Is the business ready to work with you? Are your technologies flexible and easy to adapt? Do you have an architecture roadmap, standards, and governance in place?

To deliver agile BI, you need all aspects of the delivery embracing agile principles. Also remember that successful agile BI teams took years to get there and learned what worked for them along the way. With agile BI maturity, we can learn from what others are doing today, but there's still a lot to learn and realize ahead of us.

How do agile architectures relate to the maturity of the DW/BI architecture?

I've always felt that mature DW/BI environments, typically 7 or more years old, tend to find agility much easier because they've made that shift from "DW" to "BI" or from building DWs to leveraging DWs. Agile BI in the front-end delivery or user-interface development is much easier than agile data integration or agile data architecture, but this comparison is missing the point of agile development.

Agile BI brings more frequent deliveries of business value that are more closely aligned to business needs and priorities. Agile BI means dealing with requirements that are vague and reducing risks inherent in changing environments. You should always balance tactical and strategic architecture decisions with the goal of minimizing rework and throwaways through an approach of refactoring the architecture over the years.

What is an agile BI architecture?

We could talk all day about this question, and I know many of us will in San Diego this fall. I'm looking forward to case studies and presentations on everything from thin-slicing BI delivery in sprints, to leveraging a metadata-driven semantic layer on top of the databases for flexibility, to people who have leveraged software-as-a-service and cloud computing for agile architectures.

For me, DW/BI architectures are really information management platforms or environments that are part of the business, like operational or mission-critical systems. For a BI architecture to be considered agile, it must be able to move quickly with business changes, it must always leverage as much as possible of what you've already delivered, and it must ensure information consistency for decision making. We are building the decision-making environment for companies so that better, faster, and more confident decisions can be made.

We work closely with the business for quick delivery, but always in a direction that increases the value of the architecture so that more of it can be leveraged next time. The more architecture you can leverage next time, the less that needs to be built or changed. Reuse is a fundamental concept in service-oriented architectures as well. I like to say that we used to architect for "built to last" and now we architect for "built to change."

What are some of the risks in agile architectures?

My concern is that when BI teams embrace agile architectures, they become so focused on the here-and-now delivery that they lose sight of the objective of DW architectures. That objective is to be a managed environment of integrated enterprise information with an active data governance program.

As our pendulum swings once again towards agility, speed, and tactics, we can do so successfully while also embracing the flexibility of strategic enterprise BI programs. I remind BI teams that the business asks for information and BI capabilities to be delivered, not an architecture or methodology. Recognize that the business won't ask for an agile methodology or a 3NF data warehouse or a hub-and-spoke architecture because they are not concerned with "how" it is delivered but rather "what" is being delivered and "when" it is being delivered. They ask for capabilities and access such as information, analytics, metrics, or data mining. BI teams must recognize that in order to deliver these in an agile way, we create our own requirements for an agile architecture and delivery process. The risk arises when you look only to the business's "what" requirements as input to building a data warehouse.

How do data models play a role in agile architectures?

I think this is a growing discussion and debate within the BI industry. In San Diego this fall, the Executive Summit is hosting a panel discussion titled "State of the Data Model" to hear differing points of view from industry veterans. As a long-time data architect, I have always held that there are many data modeling techniques available and that we should use the best technique for the given situation. I believe there are two fundamental logical data modeling paradigms: the entity-relationship approach and the dimensional approach.

Beyond that, there are many derivatives of these two, from dynamic attribution, null-valued pairs, and aggressive fifth- and sixth-normal forms to snowflaking and multi-type dimensions. There are also many physical and structural data modeling techniques that we use, such as synchronizing materialized views, detail and summary tables, and denormalization, as well as a choice of platforms: RDBMS, cubes, columnar, and MPP databases.

At the heart of it, all the variations are techniques we've used to meet performance or scalability requirements. We used these techniques because the technologies and infrastructure we had available at the time were not able to meet the requirements alone. Are today's technologies able to overcome our need for techniques? Some people think so, but I believe we are always looking to rebalance this equation of technique versus technology.

What technologies do you see helping agile architectures be successful?

The BI industry continues to be an exciting place thanks to the many technologies that we have available now. We always have an appetite for more data and information to consume than we have time and infrastructure to deliver.

Agile BI will be leveraging emerging technologies that help us to configure, deploy, and manage faster than ever before. These technologies include virtualization, cloud computing, SaaS, and DW appliances. Scalability is no longer reserved for the Wal-Marts of the world because every company can affordably manage DW architectures capable of hundreds of terabytes of near-real-time data, without ever having to manage purging of detail data again.

Today's scalability mantra is "what new business models can you create now that you have all the data you want?" Massively parallel processing (MPP) databases and appliances are widely available and accepted now. Our business customers are also more savvy and technically competent than ever before: they know how to use search engines, perform analytics, and build their own interfaces.

Fortunately, our current BI technologies enable users like never before with dashboards, desktop analytic apps, and collaborative technologies. After all, most desktop computers now have more horsepower than our early data warehouse servers many years ago. We may finally be out of the report-building service as our users co-develop with us now. That's a good thing, too, because we need to focus our efforts on building and maintaining an adaptable, clear, consistent, and always-available information platform for them to access.

Can you tell me what agile architecture is not?

An agile BI architecture should not mean point solutions and going fast without a roadmap and strategy. There should be a great deal of thought put into a data architecture that can be leveraged the most and most often. Agile BI is simply a better way of building, learning, refactoring, and leveraging that architecture as you go.

If the word "chaos" comes to mind when you review your architecture in an agile BI approach, then you're missing the point. The shortest distance between two points is where you can leverage more of what you've already built. That's why mature data warehouses have it a little easier. There should be an architecture blueprint for your data warehouse and you should be referring to that map frequently as you drive your agile trip to make sure that you're not getting too far off course. Build fast, build with the business, and keep glancing at that roadmap and asking yourself, "Am I leveraging an enterprise data warehouse for increasing value?"