Enterprise Insights

Survey Sheds Light on Big Data Challenges, Practices

With data growing every day, how are IT managers coping? A survey conducted by RainStor from mid-July through mid-August acknowledged the importance of big data -- three-quarters (75.7 percent) of the mid- to senior-level managers responsible for big data infrastructure and analytics environments agree that “managing their big data and making it available across the enterprise was important to improve overall business value.”

Of course, knowing it’s important to manage big data successfully doesn’t mean you’re doing so. Of particular interest, 10.8 percent of respondents didn’t know if big data helped their organization make better business decisions, and 8.1 percent said it didn’t.

When RainStor asked 110 executives in specific industry sectors, including banking, financial services, telecommunications, and manufacturing, about the biggest challenges of managing big data, 37 percent said it was analyzing the data; 25 percent claimed it was the speed of data creation (the “velocity” aspect of the familiar “3 Vs of big data” -- volume, velocity, and variety).

What happens when big data gets too big? According to RainStor, “Surprisingly, almost 30 percent of respondents look to less expensive data warehouses when they reach capacity in their existing enterprise data warehouse.” Actually, that’s not so surprising, which RainStor admits in its report, pointing to limited IT budgets.

More than a quarter of respondents (25.7 percent) move data to tape when it becomes too much to handle. Despite a tendency for enterprises to store data indefinitely (especially after adopting a big data mindset), archiving is used out of necessity. Of course, this poses additional problems, such as not having the data available for immediate query to meet regulatory compliance inquiries. RainStor also points out that in the banking and financial services industries, regulations created under the Dodd-Frank legislation specify that tape isn’t “an acceptable medium for data that must be stored for 10+ years.” The company says that 12.5 percent of respondents point out that it can take one to two weeks (sometimes more) to “reinstate the data for online query.” (The most popular response -- at 37.5 percent -- was “multiple days.”)

Hadoop is often mentioned in the same breath as big data, and this survey is no exception. Respondents “seriously looking” at Hadoop are split right down the middle between “being considered as an augmentation to existing data warehouse/database environments versus a replacement strategy” (that is, as a standalone solution). If it’s so popular, what’s holding enterprises back? Over half of those surveyed claim they lack the required skilled resources or they can’t take on new technologies or projects.

I asked Deirdre Mahon, vice president of marketing at RainStor, if the survey results offered any surprises. “Although Hadoop and big data are almost synonymous today, we see that half are looking at it as an augmentation to their existing database and data warehouse environments. That makes a lot of sense, especially with the level of investment that has been poured in over the last few decades. What surprised us was the 25 plus percent of respondents that still put data on offline tape. It just seems so antiquated now. We do believe Hadoop as a platform will pretty much replace offline tape in the not so distant future. It just makes sense.”

Did Mahon see any trends from previous surveys? “It’s interesting that there have been a number of big data surveys conducted of late by other sponsors, and many of the findings point to the same trends. For example, using Hadoop as an augmentation strategy, such as for ETL or prototyping, and the fact that standard SQL as a query language is not going away any time soon. Our survey said 85 percent still rely upon it to run daily queries.”

-- James E. Powell
Editorial Director, ESJ

Posted on 08/24/2012 at 11:53 AM


Wed, Sep 5, 2012 Keith Lynn Atlanta

To expand on your comment on tape storage, when companies start to see their data explode, it just isn't effective in either cost or performance to keep it in the primary data store(s). Commercial databases get very expensive when you try to host TBs worth of data and keep performance levels high. The alternative is slower, cheaper storage. The common methods are low-cost disk and tape. Since these disks can, and will, fail, the data still needs to be backed up to tape. We need an affordable database technology that can address large volumes of data in a near-line state, that doesn't take a week to restore/backup, and that doesn't break the bank.
