As CIOs plan for 2012, they may be hearing more buzz about Big Data than about cloud (see this good write-up from Charlie Bess). While cloud computing has a significant impact on infrastructure – enterprises can either transform internal IT to a services model or outsource – Big Data’s impact on existing environments has not yet been well defined. Wikibon and SiliconAngle provided coverage of the recent Hadoop World conference in NYC (full collection of videos and articles here). Real customers of the technology offered proof points of how data scientists and data analytics tools yield new insights that can transform a business. While Internet bellwethers like Facebook, LinkedIn and Twitter are prominent early adopters, if Hortonworks CEO Eric Baldeschwieler’s prediction that Apache Hadoop will process half of the world’s data in five years is even close to the mark, this is a trend that cannot be ignored.
Big Data is partially an evolution of traditional business intelligence technologies, arriving at the intersection of the mobile and cloud waves. While the volume of “big” data gets plenty of attention, we know as an industry that today’s large amount of information will be considered small in a couple of years; the important thing to consider is that the speed (moving towards real-time), location (distributed), and type (trending towards unstructured) of the data require new tools and methods to extract information. While much of this change is in software, there are optimizations needed in the underlying infrastructure. See my discussion of the impact on networking architectures. Similarly, as Colin Mahoney of HP’s Vertica group stated (see video here), HDFS (the default storage layer for a Hadoop environment) is a threat to traditional storage architectures. While the ripple effects of Hadoop and other Big Data tools are important, the biggest gap facing companies that want to leverage these tools is finding qualified data scientists. Training was a major focus at Hadoop World, and since there is a shortage of trained people in this nascent field, CIOs should be sure to allocate ample budget to educate their workforce so that it can grow into the new technologies.
Additional references on Big Data:
- Wikibon definition of Enterprise Big Data: http://wikibon.org/wiki/v/Enterprise_Big-data
- Big Data Manifesto from Jeff Kelly and the Wikibon Community: http://wikibon.org/wiki/v/Big_Data:_Hadoop,_Business_Analytics_and_Beyond