Actionable Data that generates revenue, reduces risk and cost. It could be 10k or 10PB.
Welcome to the ECF, Clive. If it ranges from 10k to 10 PB, it is truly big data. I think the challenge is understanding which data is actionable.
Step one: identify all sources of data. Step two: decide which combination is actionable. Step three: decide how frequently the business needs it. It's about the data you don't know you have, for example cash registers and security video combined.
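As a rough illustration of combining two sources that are rarely looked at together, the sketch below pairs cash-register transactions with security-camera events by timestamp. The record layouts, the field names, the sample data, and the 30-second matching window are all assumptions made up for this example, not anything a real point-of-sale or video system is known to produce.

```python
from datetime import datetime, timedelta

# Hypothetical register log: (timestamp, register id, sale amount)
register_events = [
    (datetime(2013, 1, 5, 10, 0, 0), "R1", 25.00),
    (datetime(2013, 1, 5, 10, 5, 0), "R1", -40.00),  # a refund
    (datetime(2013, 1, 5, 10, 9, 0), "R2", 12.50),
]

# Hypothetical video analytics output: (timestamp, register id, customers seen)
video_events = [
    (datetime(2013, 1, 5, 10, 0, 10), "R1", 1),
    (datetime(2013, 1, 5, 10, 5, 5), "R1", 0),
    (datetime(2013, 1, 5, 10, 9, 20), "R2", 2),
]

def refunds_without_customer(registers, videos, window=timedelta(seconds=30)):
    """Return refunds whose closest video event (within the window) shows no customer."""
    flagged = []
    for ts, reg, amount in registers:
        if amount >= 0:
            continue  # only refunds are of interest here
        # Video events for the same register that fall inside the time window
        nearby = [v for v in videos if v[1] == reg and abs(v[0] - ts) <= window]
        if nearby:
            closest = min(nearby, key=lambda v: abs(v[0] - ts))
            if closest[2] == 0:
                flagged.append((ts, reg, amount))
    return flagged

flagged = refunds_without_customer(register_events, video_events)
print(flagged)
```

Neither source says much on its own here; the refund only looks worth a second look once the two are lined up.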
An amazing number of devices are generating data. From Wikipedia: "...ubiquitous information-sensing mobile devices, aerial sensory technologies (remote sensing), software logs, cameras, microphones, radio-frequency identification readers, and wireless sensor networks."
I read last night that credit card companies produce 40 somethings every day. I can't find where I read it, and terabytes seem low, but petabytes and exabytes seem high. But I'd bet it's petabytes.
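To get a feel for what each candidate unit would mean, here is a quick back-of-the-envelope conversion of 40 units per day into a sustained data rate. It assumes decimal (SI) units and says nothing about which figure is actually the right one.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Decimal (SI) sizes in bytes; an assumption, binary units would differ slightly
UNITS = {"terabytes": 10**12, "petabytes": 10**15, "exabytes": 10**18}

def daily_to_gb_per_second(amount, unit):
    """Convert a daily volume into an average rate in gigabytes per second."""
    return amount * UNITS[unit] / SECONDS_PER_DAY / 10**9

rates = {unit: daily_to_gb_per_second(40, unit) for unit in UNITS}
for unit, rate in rates.items():
    print(f"40 {unit}/day is about {rate:,.2f} GB/s sustained")
```

On these assumptions, 40 terabytes a day works out to a bit under half a gigabyte per second, and each step up in unit multiplies that by a thousand.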
A set of technologies that allow organizations to rapidly process enormous sets of data, using inductive statistics, so as to predict the future behaviour of general populations, as opposed to describing the facts about a limited sample.
Great definition...especially in the field of medical research. How evolved would you say those tools are today?
It depends on who is asking. Generally speaking, given how many competitors and alternatives there still are, we are in the early stages of determining who the winners and losers will be. If you are a programmer or data scientist who knows how to write code, there are lots of options. However, we seem to be moving from batch parallel processing to real-time processing, so the tool sets are in flux for developers too. In other words, we may have already moved beyond Hadoop before most organizations have even heard of it. If you are a business user who doesn't want to deal with the details of the technology, the options are fewer. We also don't know what the ultimate resolution will be on the database front. Some would argue that, at a semantic level, structure is required to derive any meaning. It isn't surprising to them that one of the biggest uses of Big Data today is turning unstructured data into structured data. SQL databases run fast too, if you stick the entire file into memory and turn off all the atomicity features. Success is also a function of domain. It is easier to predict the future behaviour of natural objects than of humans. However, we seem to be experiencing success at predicting the ripple effects of crime (e.g. the LA police department), even if we can't predict the first event, and humans tend to be very predictable when it comes to shopping habits.
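As a small illustration of the "unstructured into structured" point, the sketch below pulls fields out of free-text log lines with a regular expression and loads them into an in-memory SQLite table, where they can be queried with ordinary SQL. The log format, the field names, and the table layout are invented for the example.

```python
import re
import sqlite3

# Hypothetical raw text lines; the second one has no recognisable fields
raw_lines = [
    "2013-01-05 10:00:01 user=alice action=purchase amount=25.00",
    "garbled line with no recognisable fields",
    "2013-01-05 10:05:42 user=bob action=purchase amount=12.50",
]

PATTERN = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"user=(?P<username>\w+) action=(?P<action>\w+) amount=(?P<amount>[\d.]+)"
)

def parse(lines):
    """Yield one (ts, username, action, amount) tuple per line that matches."""
    for line in lines:
        m = PATTERN.match(line)
        if m:
            yield m["ts"], m["username"], m["action"], float(m["amount"])

# An in-memory database: nothing is written to disk
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (ts TEXT, username TEXT, action TEXT, amount REAL)"
)
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", parse(raw_lines))

row_count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
total = conn.execute("SELECT SUM(amount) FROM events").fetchone()[0]
print(row_count, total)
```

Lines that do not fit the assumed pattern are simply skipped, which is one of the trade-offs of imposing structure after the fact.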
Bill offers a good definition, but I would add to it: Big Data alone is just data, to which you could say, so what?
So maybe we should start talking about Big Insight leading to Business Actions! Unless we derive insight from the data and then use it to drive action, why have we been playing with all this data in the first place?
And by the way, the whole Big Data bandwagon is a very useful platform to re-energise the use of analytics generally.
Part of the challenge is to understand the tools well enough to know what's useless and what's valuable. Tools that detect patterns and trends from sensors in a jetliner are going to be vastly different from tools that look for consumer sentiment on Twitter. Companies need to understand what's important to them and dig into the data that can provide them with relevant insights.
The phrase has no single definition. But generally I think of it as all the data you are going to need to access, store, process and analyze that your current systems are very likely not able to handle. These data will consist largely of unstructured and semi-structured data, the processing and analysis of which clearly exceed the boundaries of traditional systems and platforms today. Also, many new tools and operating environments are evolving, for which there will be a need for new skills and staff that are not easily found. Oy, it's going to be a mess!
There's so much confusion around what constitutes Big Data, which I analyzed in this blog post. What do you think it is?
If you say it's everything (all data), it's really nothing, right?