Q&A: Combining Search with BI Yields Powerful Results

Endeca chief strategist Paul Sonderegger talks about combining search and BI to help people find what they need and understand what they’ve found.

"How do you help people ask for information they don't know exists?" asks Paul Sonderegger. As business intelligence faces the challenge of moving beyond "super users" and deeper into the enterprise, Sonderegger, chief strategist with search and BI company Endeca, focuses on that basic challenge, and how answering it with the right tools can improve users' daily decision-making and discovery. The goal: to help people find what they need and to understand what they've found.

In this interview, Sonderegger discusses some current trends in BI, what they mean for the overall landscape, and what combining "the simplicity of search and the power of BI" can mean for users. Prior to joining Endeca, Sonderegger was a principal analyst at Forrester Research, where he focused on information retrieval and the user experience. He blogs about the bigger context around decision-making, information, and the competitive advantage at Endeca's Web site.

Endeca is a search and BI company that takes a somewhat different approach to BI. According to analyst and TDWI faculty member Cindi Howson, Endeca's product "combines both search and a columnar database to allow users to search, navigate, and analyze information that is in the data warehouse or in transaction systems and document files."

BI This Week: What's the biggest challenge that BI vendors -- and BI users -- face today?

Paul Sonderegger: [In a nutshell], the big change we see happening in the world of BI is the grand challenge of dealing with what we call "diversity at scale." What makes that so challenging is that there are two kinds of diversity. There's a below-the-waterline diversity of data and content that must be unified, along with an above-the-waterline diversity of people and the daily decisions they are trying to make. That's the grand challenge in front of BI and what we [at Endeca] are working on into 2011.

Let's start with that below-the-waterline integration. How good a job does today's BI software do of integrating different data sources?

It's important to remember that the BI industry has fought long and hard over these problems, and they have succeeded in coming up with a way to integrate diverse data sources. However, the approach is one of standardizing the incoming data to conform to the model. That works really well when you know ahead of time what questions are going to be asked of the data, and what data you will have at the time the questions will be asked. That approach from the traditional BI world is extremely effective for achieving efficiency at scale.

Now there's a new problem, however, and it's this: How do you deal with diversity at scale? This is where the traditional technology, which requires a lot of upfront modeling, runs into trouble. In the world of diverse data, a quality team might have to pull together, say, warranty claims from the field -- which combine structured and unstructured content -- plus data out of the procurement system about suppliers, plus data from parts catalogs about parts, plus data from customer blogs talking about those products. In that environment, the traditional approaches to BI really struggle to unify that diversity of data.

Do you see that changing as we move forward?

Absolutely. We're seeing a number of different attempts to reduce or eliminate the barriers posed by a traditional approach to BI.

One example is the use of in-memory databases, which take advantage of the dramatic increase in hardware performance to get around traditional demands to model the data up front. That's because in-memory databases give you such abundant processing power that you no longer have to model the data perfectly up front. That's a growing trend.

We also see some companies taking new approaches such as incorporating XML into the data warehouse, or working with more XML-like data structures -- in that way, we're getting away from the relational world. That also helps in reducing the demand to model the data very carefully up front.

We see innovations in delivering a more compelling user experience, thus encouraging people to explore. That's very effective when it's put in front of people who have to make many diverse decisions using varied data and content, which they somehow have to make sense of.

In short, we see a number of different trends, all trying to address this issue of diversity at scale.

When you talk about the challenge of diversity at scale, can that also be applied to mobile devices and their ability to take advantage of BI applications?

Absolutely. Below the waterline, you have more and more data and content; above the waterline, you have more people making more decisions and relying on more information to do it. In fact, a lot of that decision-making doesn't happen at the desk; it happens out in the world. The massive proliferation of mobile devices now gives us a new sort of in-the-moment, on-the-go capability for decision-making.

We do need to make an important distinction between the different types of decision support that mobile devices can offer, though. One type is mobile reporting and mobile dashboards, and that's very good for looking up facts and figures on the run. However, that's very different from the exploration and discovery that may be necessary to truly discover something new.

Let's say a relationship manager from an auto company is visiting dealerships to find out how things are going. On her way to Chicago, somebody at corporate says, "I've been hearing rumors about repairs on brake calipers on vehicles with low mileage, so ask around while you're out there." The relationship manager says, "Ask what and ask whom?" At corporate, however, they don't know any specifics.

When the manager is out in the world, she's going to need the capability, right there on the spot, to start asking more penetrating questions at individual dealerships, perhaps based on their repair records for particular customers or particular vehicles. That's going to require some exploration and discovery to support that on-the-ground activity of investigating what's actually going on. That's the power of being able to explore and discover [on the fly].

How important is it to not only give people better ways to manipulate data, but to help people understand what they have after they've found an answer?

Helping people understand what they've found is a fascinating challenge from a design perspective. It really takes a different set of technological capabilities to enable it. I can tell you how Endeca addresses the issue, but the basic problem is this: How do you help people ask for information they don't know exists? Of course, in this world of growing diversity, that's a bigger and bigger problem.

One of the ways you do this is by encouraging better questions, questions that reveal to users the attributes that already exist in the data. When you bring together highly diverse databases and documents, and put them all together, it's very powerful to reveal to users: Here are the attributes in this set of data and documents. That inspires new questions; it inspires new thoughts about perspectives to take on the data; and as a result, it guides people to a new understanding of the relationships in the data. That, in turn, leads to more effective decision-making.

For example, if you think about design engineers at a manufacturing facility, their job is to innovate in designing the new product, and yet whenever possible, the company would like them to re-use parts that are in inventory and that come from preferred suppliers. Combining those two challenges is a tough job. You're supposed to create something new but using only what you already have.

Engineers have an understanding of the form and function attributes for the part they're designing … for that water valve they want to put into a dishwasher, for example. But the company would like them to ask other questions. The company would like them to ask questions about which of these water valves come from preferred suppliers. Which of these water valves are currently used in our products?

Here's the problem: If the design engineers don't know that data is in the system, they certainly won't ask for it. The system, then, needs to prompt them that those attributes are there, and furthermore, it has to be exceptionally easy to take a quick cut into the data and get what you need out.

So the system itself knows to show the user data that he or she normally wouldn't know to even ask about?

To give a more accessible example, think about redesigning your kitchen. You want a new faucet. It has to be stainless steel to match all the new stainless steel appliances you're going to put in, and it has to fit with the beautiful Silestone sink you want to put in. You go to Home Depot and search for faucets, but there are nearly 5,000 of them. What do you do now? The system would actually show you, "Here are the attributes of those faucets -- brands, prices, number of holes in the installation." That's when you realize, "Oh, right, number of holes in the installation, I didn't even think about that!" You look at your sink and sure enough, it has three holes.

Now you've discovered another question you need to ask; it's not just, how many of these are under $400? It's a completely new question that leads to a different decision, a different choice than what you would have reached last year. It's better for you because it's what you actually need, and it's also better for Home Depot because you won't return it.
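
[Editor's illustration: The faucet example describes what is commonly called faceted navigation: after a search, the system lists the attributes present in the result set so the user can see questions worth asking. The Python sketch below shows the general idea only. It is not Endeca's implementation, and the catalog records, field names, and the helper functions `refine` and `facet_counts` are all hypothetical.]

```python
from collections import Counter

# Hypothetical catalog records; every brand, field name, and value is made up.
FAUCETS = [
    {"brand": "Acme", "finish": "stainless steel", "price": 250, "holes": 1},
    {"brand": "Acme", "finish": "chrome", "price": 120, "holes": 3},
    {"brand": "Birch", "finish": "stainless steel", "price": 380, "holes": 3},
    {"brand": "Birch", "finish": "stainless steel", "price": 450, "holes": 3},
    {"brand": "Cobalt", "finish": "stainless steel", "price": 310, "holes": 1},
]


def refine(records, **selected):
    """Keep only the records that match every selected attribute value."""
    return [
        r for r in records
        if all(r.get(key) == value for key, value in selected.items())
    ]


def facet_counts(records, attributes):
    """For each attribute, count how many records carry each value."""
    return {
        attr: Counter(r[attr] for r in records if attr in r)
        for attr in attributes
    }


# The shopper starts with the one thing they know to ask for.
stainless = refine(FAUCETS, finish="stainless steel")

# The system answers with the attributes found in that result set,
# including one the shopper had not thought about: installation holes.
facets = facet_counts(stainless, ["brand", "holes"])

# Having seen the "holes" facet, the shopper narrows to a three-hole sink.
matches = refine(stainless, holes=3)
```

[In this sketch the counts are recomputed from whatever result set the user is currently looking at, so the attributes shown always describe the data in front of them rather than a fixed, pre-modeled menu.]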

Looking ahead, do you see us at a turning point where we're really about to take a big bite out of the complexity of BI for the average user?

We absolutely are, and it comes in part because of this newly abundant processing power that allows innovative companies to take a completely new approach to unifying data and making it available to people with no training in the technology.

We're also on the cusp of a new approach here because the market is getting new ideas about the challenge it faces. Coming out of the recession, what we are seeing among our customers is that they are looking for new ways of doing business. They recognize that their supply chains, which were born out of a time of inexpensive money and cheap oil, really, are not necessarily the supply chains that they need for the world that they are headed into.

They are looking for new ways of working that allow functional departments to collaborate more closely. They are looking for new ways of working that make the information assets they have already invested in work hard. They are looking for new ways of working that take advantage of the growing information assets out on the Internet for customers. The market is actually developing a new set of requirements, and that's what you have to have to open up the opportunity for innovation.

We see the market recognizing this emerging set of requirements. We see manufacturers recognizing that they have a huge diversity of daily decision-making taking place among their engineers. They have a huge amount of data and content about parts, products, and suppliers, but it's isolated in separate systems. They're starting to see a real need to put some perspective on this diverse data in front of all these millions of daily decisions.

These engineers are asking themselves, how are we going to do that? They're beginning to realize that if they have to model the data ahead of time, they'll never be able to answer that demand, so they're starting to look around, saying, "Who else? What technology companies have recognized this emerging set of requirements and are taking an innovative approach to solving this problem?"

So that's how Endeca ties into what we've talked about?

Right. We are a search and business intelligence software company. We make a platform that combines the simplicity of search and the power of BI -- the whole point is to help people find what they need and understand what they've found.

There are three principles that guide everything that we do. First, no data left behind. We're combining structured data and unstructured content, inside and outside the company. Second, consumer ease of use -- reaching people who are underserved by traditional BI. We provide zero-training interfaces that people can start using immediately. Third, agile delivery. We're helping business and IT collaborate in a way they've never been able to before by using an agile deployment methodology, so that roughly every two weeks in our deployment cycles, the business can see a new working prototype of the application they'll get when it's all done. This is IT and the business in an entirely new collaboration.