In-Depth
Data Management: The Year Past, The Year Ahead
In a year when in-memory analytics was memorable, what data management trends were the most important in 2011, and what can we expect in 2012?
By Conor O’Mahony, Program Director for Database Software, IBM.
Big Data, NoSQL, and in-memory analytics captured many of the data management headlines in the last 12 months. However, enterprises are still in the early stages of evaluating these technologies, and it is too early to refer to them as trends, so let’s have a look at the actual trends from this year.
2011 Trend #1: Making life easier
With IT departments managing an ever-increasing footprint, and IT head count remaining rather flat, something had to give. How could IT staff be continually asked to manage more systems? Thankfully, technology came to the rescue.
More organizations are taking advantage of autonomic features in the major database products. These features have matured to the point where they can often outperform even the leading database experts when it comes to tuning database performance. IT staff are also taking advantage of time-saving features that automate tasks such as statistics collection, reorganizations, and storage allocation.
Many enterprises are also taking advantage of the easy deployment and management associated with data warehouse appliances. These appliances allow IT staff to quickly and easily deploy data marts. In some cases, organizations have reported being able to run reports against data within 24 hours of a data warehouse appliance arriving at their premises. When you consider that these appliances save the time and effort involved in assembling the different parts of a system, integrating them, and then optimizing them, it soon becomes apparent why they are becoming so popular.
The leading trend is taking advantage of technological advances to do more with the same staff.
2011 Trend #2: It’s all about the money
The recent economic downturn forced many enterprises to closely examine IT budgets. With between 60 and 80 percent of a typical budget going directly to existing systems, freeing budget for the new initiatives necessary to fuel business growth became a challenge. This situation led many enterprises to look at ways to reduce the budget for existing systems, especially for database systems.
There has been strong adoption of features such as database compression that offer hard return on investment (ROI) by lowering storage-related costs, and soft ROI by reducing database administrator time requirements for backup and replication. However, the most significant contribution to lowering costs in 2011 has been the effort by many enterprises to reduce the database maintenance fees paid to vendors. Many enterprises are achieving this with consolidation, virtualization, and migration projects.
By consolidating non-mission-critical database instances onto fewer servers and using virtualization, organizations are reducing the number of database licenses needed. Even when working with database vendors that do not allow a reduction in license count, organizations are still re-allocating those licenses for new projects, thereby having a positive impact on their IT budget.
Many organizations are also taking advantage of what Forrester is calling the Database Compatibility Layer (DCL) to migrate easily from one database product to another, in some cases significantly lowering their database maintenance fees. Several organizations claim that such projects have cut their database maintenance costs in half or more.
2011 Trend #3: A new data warehouse paradigm emerged
For almost a decade, enterprises have strived to create an enterprise data warehouse (EDW). The benefits of having a single, centralized data repository for the entire enterprise include having both trusted information and a “single version of the truth” for reporting and business intelligence. Recently, however, enterprises have been discovering a new and improved approach: the concept of a logical data warehouse.
More enterprises are choosing, either consciously or unconsciously, to move away from the monolithic architecture of an EDW. They are creating a logical data warehouse environment where they offload analytics from the EDW to a purpose-built system (typically an analytic appliance), accelerating queries and freeing compute cycles on the EDW for reporting and operational querying. As a result, these organizations are improving the performance of their EDW while rapidly and efficiently meeting the analytics needs of their business.
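To make the division of labor concrete, here is a minimal sketch of how workloads might be split in a logical data warehouse. It assumes only what is described above: analytics go to a purpose-built system, while reporting and operational querying stay on the EDW. The names Workload, System, Query, route, and plan are hypothetical and exist purely for illustration; they do not correspond to any vendor's product or API.

```python
from dataclasses import dataclass
from enum import Enum


class Workload(Enum):
    REPORTING = "reporting"
    OPERATIONAL = "operational"
    ANALYTICS = "analytics"


class System(Enum):
    EDW = "edw"
    ANALYTIC_APPLIANCE = "analytic_appliance"


@dataclass(frozen=True)
class Query:
    name: str
    workload: Workload


def route(query: Query) -> System:
    """Send analytics to the purpose-built system; everything else stays on the EDW."""
    if query.workload is Workload.ANALYTICS:
        return System.ANALYTIC_APPLIANCE
    return System.EDW


def plan(queries):
    """Group query names by the system that would run them."""
    placement = {system: [] for system in System}
    for query in queries:
        placement[route(query)].append(query.name)
    return placement
```

In a real environment the routing decision would depend on far more than a workload label, but the sketch shows the basic idea: the EDW keeps its reporting and operational duties, and the heavy analytic work lands elsewhere.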
2012 Predictions
What’s ahead for 2012? Here are my predictions for all of those exciting new technologies that we’ve been hearing about.
2012 Prediction #1: Hurrah for Apache Hadoop
Probably the hottest topic in enterprise data management is Apache Hadoop. Hadoop was originally created to meet the Web-scale data processing needs of Yahoo. After Yahoo turned it over to the open source community, a number of Web-based businesses started using it for their Web-scale data processing systems. It didn’t take long for smart people to realize that Hadoop could be used to address the challenges associated with analyzing “big data” in the enterprise.
Although Web-based businesses have been quick to adopt Hadoop, general enterprise penetration has been modest to date, with a relatively limited number of early adopters leading the charge. However, there is no shortage of interest in Hadoop from organizations that see it as a way to cost-effectively tackle problems that had been difficult or impossible to solve. For instance, Hadoop offers the promise of analyzing the wealth of information in semi-structured or unstructured formats. I expect 2012 to be the year when Hadoop penetrates the enterprise in a meaningful way, and for the industry to converge on an early set of concrete use cases for Hadoop in the enterprise.
2012 Prediction #2: Are some NoSQL technologies going NoWHERE?
Initially, NoSQL stood for “No SQL.” It gave the illusion of representing a rebellion against the SQL language. Now, as a number of technologies under the NoSQL banner adopt SQL interfaces, NoSQL stands for “Not Only SQL.” To be fair, NoSQL should probably have been named NoRelational, as that would have more accurately reflected its nature, but NoRelational is not as catchy as NoSQL, is it?
As you have gathered, the NoSQL movement is really about using the most appropriate data model for each challenge. The movement is a reaction to the typical shoe-horning of each challenge into the relational model. I’ve already discussed Hadoop, which is on a good trajectory, but what about the other NoSQL technologies such as key-value stores and graph stores?
This apparent challenge from NoSQL is not the first time that the relational database has been challenged. A few years ago, many predicted that object databases would conquer the relational database. However, the relational database added stored procedures, user-defined functions, and a number of other object-like features, and it has gone from strength to strength. Object databases are now just a bit player in the overall database market.
I predict that the major relational database vendors will, where it makes sense, add certain NoSQL capabilities to their products. For instance, this makes sense for both name-value pair and graph-store capabilities. Of course, this has already happened for XML data, which the major relational products support.
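As a rough illustration of why this is plausible, the sketch below stores name-value pairs and a small graph in an ordinary relational engine. It uses SQLite through Python's standard sqlite3 module purely because it is convenient; it is not how any major vendor implements such capabilities, and the table names (kv, edges) and helper functions are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a name-value table and an edge table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT NOT NULL);
        CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL);
        """
    )
    return conn


def kv_put(conn, key, value):
    """Insert or overwrite the value stored under a key."""
    conn.execute(
        "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
    )


def kv_get(conn, key):
    """Return the value for a key, or None if the key is absent."""
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None


def add_edge(conn, src, dst):
    """Record a directed edge from src to dst."""
    conn.execute("INSERT INTO edges (src, dst) VALUES (?, ?)", (src, dst))


def reachable(conn, start):
    """Return every node reachable from start, using a recursive query.

    UNION (rather than UNION ALL) discards nodes already visited,
    so the traversal terminates even when the graph contains cycles.
    """
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT ?
            UNION
            SELECT e.dst FROM edges e JOIN reach r ON e.src = r.node
        )
        SELECT node FROM reach WHERE node <> ? ORDER BY node
        """,
        (start, start),
    ).fetchall()
    return [row[0] for row in rows]
```

A dedicated key-value or graph store will typically offer more than this, particularly at scale, but the sketch suggests how much of the basic behavior a relational product can already absorb.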
2012 Prediction #3: How memorable will in-memory analytics be?
In the past year, all major vendors have announced either exciting new in-memory products or product visions. Subsequent press and analyst coverage has resulted in hype for in-memory analytics. However, we should not expect 2012 to be a big year for the actual deployment of in-memory analytics. The reality is that many of the in-memory products being hyped have yet to prove themselves. Once they are proven, the hype may become reality. This is not going to happen in 2012, and until they are proven, most organizations will wait on the sidelines to see how these products evolve.
The Final Word
With the wealth of exciting new technology in the data management world today, it is a fantastic time for everyone involved. Now it’s time to sit back and see what actually happens.
Conor O’Mahony is the program director for database software at IBM. He can be found blogging at http://www.database-diary.com. You can contact the author at conor@us.ibm.com.