Q&A: 2011 Promises More Struggles with Big Data

BI pros will face more big data challenges this year.

One of the biggest trends -- and challenges -- in BI today is the exploding amount of data that customers are trying to track and manage, according to Dan Lahl, senior director of product marketing at Sybase. Lahl has extensive experience in data management, data warehousing, and analytics, and has worked in technology for over 30 years. He has been with Sybase since 1995, working in areas including data federation, data integration, and cloud computing. In this interview, he discusses the challenge of managing big data and other key issues in data warehousing and BI into 2011 and beyond.

BI This Week: What do you see as some of the biggest trends in business intelligence this year?

Dan Lahl: One of the biggest trends across BI and all of data warehousing is this: The amount of data that customers are tracking is absolutely exploding. … Customers are telling us this, and it’s putting a lot of pressure on data systems. One reason is that the sort of long-tail analysis people want to do isn’t just three months or a year anymore -- they’re demanding four to seven years, or in the case of the financial industry, 10 to 15 years. We’re seeing the amount of data that has to be analyzed going from a billion rows of data in a table to 20 or 30 billion rows.

Beyond longer-tail analysis, we’re seeing a surge of new data sources: RFID and bar code readers, tracking at different places through the supply chain, as well as social media and blogs.

All of that is adding to the data explosion. It’s massive, and we hear this from virtually all of our customers. This isn’t a new trend in 2010 and 2011, but it’s certainly an accelerating one, and we see it moving now past the innovators and further into mid-market companies.

The three vectors -- a lengthening of time that data is needed to do better trending analysis, along with the vast amount of data coming in from new and different sources, and, finally, the addition of unstructured data -- those are the three big trends we see contributing to data explosion.

Another trend we see in requests for proposals in 2010, certainly accelerating in the second half of the year, is a demand for systems that can handle a greater number of users. Companies are pushing the boundaries; they want more line-of-business users doing analytics. Rather than a few hundred users, we’re seeing mention of 2,000, 3,000, even 4,000 users.

Finally, there’s one more mega-trend we’re seeing, and it has to do with data size. Companies are trying to understand both their transactional information and their unstructured information. We’re seeing a number of companies, for example, working to merge their transactional systems with their customer technical support systems in order to discover whether customers are happy with customer support by analyzing keywords and unstructured data -- so-called sentiment analysis.

Despite the interest around and spending on BI, actual BI adoption isn’t growing much within companies. What needs to change for user adoption rates to increase?

People are still spending on BI, but because the big vendors are struggling, adoption isn’t growing as much as we think it should. We’re doing fine, and some of the small companies are doing fine, but the big vendors are just figuring it out now. It goes to those trends we’ve discussed -- being able to do “big data,” to get it to more people, to deal with structured and unstructured data, the long-tail factor, and so forth. All of those things are putting a lot of stress on IT people, who are trying to keep the plates spinning.

The problem is getting up and running much more quickly. In most of these analytic applications from the big vendors, the problem is that it’s a nine- to twelve-month journey to get your first report out. … It’s too complex.

What needs to change? There are lots of interesting ideas out there. MapReduce and Hadoop are basically being used as pre-filtering devices for big data, so that’s one direction. Another is columnar databases, along with MPP being used by some of the smaller vendors.

We’re seeing that the traditional approaches that people have taken just aren’t going to work anymore. People are trying column-based, they’re trying MapReduce, they’re trying the integration of structured and unstructured queries and capabilities into the same database engine … all kinds of things. As vendors, I think we’re all pressing on the box a little bit, trying to get faster and better implementation, trying to make the right things happen for customers.
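To make the "pre-filtering" idea concrete, here is a minimal single-process sketch of the MapReduce pattern in Python. It is not Hadoop and not anything Sybase ships; the event records, source names, and function names are all hypothetical, chosen only to show raw events being filtered and collapsed into summary rows before they would reach a warehouse.

```python
from collections import defaultdict

# Hypothetical raw events: (source, customer_id, amount). Illustrative only.
RAW_EVENTS = [
    ("rfid", "c1", 10.0),
    ("blog", "c2", 0.0),
    ("rfid", "c2", 5.5),
    ("pos", "c1", 4.5),
    ("blog", "c1", 0.0),
]


def map_phase(events, keep_sources):
    """Emit (customer_id, amount) pairs, dropping sources we do not want."""
    for source, customer_id, amount in events:
        if source in keep_sources:
            yield customer_id, amount


def shuffle(pairs):
    """Group mapped values by key, as a MapReduce framework would do."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's values into a single summary value."""
    return {key: sum(values) for key, values in grouped.items()}


def prefilter(events, keep_sources=frozenset({"rfid", "pos"})):
    """Run map, shuffle, and reduce in sequence on one machine."""
    return reduce_phase(shuffle(map_phase(events, keep_sources)))


if __name__ == "__main__":
    # Only the reduced rows would be loaded into the analytic database.
    print(prefilter(RAW_EVENTS))
```

In a real deployment the map and reduce steps would run in parallel across many nodes; the point of the sketch is only that the warehouse receives two summary rows rather than five raw events.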

There have been many advances in mobile devices, but BI vendors seem a bit slow in taking advantage of mobile BI. How important will BI on mobile devices be? What can we expect to see in the next year or so?

What we see in mobile, and where the world is headed with mobile analytics, is this: There are some really sexy visualization tools on mobile devices, like Roambi, that can draw some pretty graphs on a mobile device. The problem is that most of those tools are just visualizing a small spreadsheet on the device. To me, that’s not real mobile analytics. [A better approach would be] taking a visualization tool that’s device-specific, having a pipe that connects to the back end, and having an industrial-strength analytics server on the back end, properly secured. That would give more of an interactive piping between the data and the mobile user who wants an exact answer in an interactive environment.

A user could actually drill down, but it means an interactive pipe and a very robust server on the back end -- you’d be sending result sets only down through the pipe. I don’t think the vendor world has figured that out yet. We’re on a path to do that [here at Sybase].
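The "result sets only" approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Sybase's design: an in-memory SQLite database stands in for the back-end analytics server, and the `sales` table, its columns, and the function names are hypothetical. The server aggregates the data, and each drill-down returns only a few summarized rows for the device to draw.

```python
import sqlite3


def build_server_db():
    """Stand-in for the back-end analytics server (in-memory SQLite here)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("east", "widget", 100.0),
            ("east", "gadget", 50.0),
            ("west", "widget", 70.0),
            ("west", "widget", 30.0),
        ],
    )
    return conn


def summary(conn):
    """Top-level view: one aggregated row per region."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()


def drill_down(conn, region):
    """Drill into one region: one aggregated row per product."""
    return conn.execute(
        "SELECT product, SUM(amount) FROM sales WHERE region = ? "
        "GROUP BY product ORDER BY product",
        (region,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_server_db()
    print(summary(conn))             # what the device renders first
    print(drill_down(conn, "east"))  # what it receives after a tap
```

The detail rows never leave the server; the device only ever holds the handful of rows it needs for the current view, which is the contrast with visualizing a spreadsheet stored on the device.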

In 2011, mobile analytics will be a trend that vendors will start to get a handle on. We won’t have the solutions yet, but we’ll start to get a handle on it.

Predictive analytics is another hot topic. How will analytics be used going forward, and will the hardware challenges that seem to be holding analytics back be addressed?

If you have the right infrastructure, we definitely think that predictive analytics will drive the market. Whether you define that as building statistical models with SAS, SPSS, or some other product, or rather just as the ability to do really good data mining and good ad hoc analytics -- to me, it’s all predictive analytics.

It’s a place where only the innovators and early adopters have been doing really good predictive analytics. In 2011, with what’s happening with vendors and the market, and with the move toward using columnar databases and MapReduce among many vendors … predictive analytics will become much more mainstream.

[At Sybase], when we looked at customers doing classic reporting and classic KPIs versus real predictive analytics and modeling and data mining, we saw our revenue mix changing in 2010 toward more predictive analytics, and we think that trend will continue. It used to be around 35 percent of our business; it’s now up around 50 percent and growing, so we think that will be really big.

We also think the technology piece that enables it will be in database analytics. Instead of pulling it out into different tools such as SPSS or SAS, the BI tools will have to get that much smarter. They’ll have to work around their semantic layers, their extract processes, and actually allow the database to do the heavy lifting on the big data. That’s where we believe the market is going to go. …

What will Sybase, now that it is part of SAP, bring to the BI/DW table in 2011?

It goes back to your question on adoption rates. Basically, users are frustrated that it takes so long to get the answers. If you look at what we have now with SAP Business Objects, we have a great way to extract the data, an optimal place to put it, and a set of visualization tools that are best of breed in the market.

We have the EIM (Enterprise Information Management), or data integration, tools from Business Objects that will be optimized for Sybase IQ; we get the data into IQ, which is a best-of-breed analytics server for serving up the visualization; and then we have the visualization tools in Business Objects that we’re going to optimize for IQ. We’re actually putting the whole stack together and optimizing it -- our answer to users is: we’ll be a lot faster.

The bottom line is that users will be able to go from questions to answers more quickly with the integrated stack.