First published December 13, 2013 in Mediapost’s Search Insider
Data is ubiquitous, and that is true pretty much everywhere. It was certainly true at the Search Insider Summit, where every panel and presentation talked about data. And not just any data — this was “Big Data.” But what exactly is Big Data — just more data? Or is there a fundamental shift happening here?
I believe there is. When I think about Big Data, I think about an emerging data ecosystem, where the explosion of available data will exponentially increase the complexity of the ecosystem. This is not just more data, but a different environment that will require different strategies.
Typically, the data we currently use is either first-party data — the data that emerges as part of our business process — or structured third-party data, available from a rapidly growing number of data vendors. This is probably what most people think of when they think of Big Data. But I don’t consider data in this form a departure from the data we’re used to using. There’s more of it, true, but the process is already identified. It just needs to be scaled to deal with increased volumes.
Let me use one example from the recent Search Insider Summit. The Weather Company has recently launched a new division called Weather FX, aimed at taking the vast amount of weather data it has to create predictive models to help companies add weather-based variables to their own data sets. For example, ad targeting can now be weather-sensitive, ramping up campaigns and changing messaging based on predicted changes in weather patterns. While pretty impressive, this is a relatively straightforward use of data. The data feeds are well structured and have been “predigested” by Weather FX to make them easy to implement.
Big Data, at least in my interpretation, is a different beast altogether. Here, data is messy, often unstructured, hard to find and in raw form. To further complicate matters, it lives in disparate silos that often have no market-facing interface. It’s an organic ecosystem that bears more than a passing similarity to how we think of natural resources. This data needs to be identified, nurtured and harvested (or mined, if you’d prefer).
It’s this data that will lead to a true view of Big Data, a world of vast data nodes that require significant development before they can be used. Think of how the world was a century and a half ago, when a lot of raw stuff — wood, minerals, water, crops, livestock — lay scattered about our planet. At the time, there was little in the way of established manufacturing and distribution chains that transformed that raw stuff into consumable products. Over time, the chain emerged, but a lot of logistical challenges had to be addressed along the way. The same is true, I believe, for data.
But there’s another challenge with Big Data: It’s not always clear how to use it. It needs a framework. You can’t dump a ton of various metals and a couple barrels of oil into a big black box, shake it and expect a Ford Focus to drop out. You need to have a pretty clear idea of what your expected outcome is. And you need to have a long chain that moves your raw material towards your end product. In the early days of creating physical goods, these chains were often verticalized within a single organization, but as the ecosystem evolved, the markets became more horizontal. I would expect the same pattern to emerge in the data ecosystem.
If you create a conceptual framework within which to use data, you can determine which data is required and how that data will be used. You can pick your data sources, identify the gaps, and allocate resources as required to address those gaps. Often, because we’re in the earliest stages of this process, we will need to explore, guess and iteratively test before the data will provide value.
This definition of Big Data requires new rules and strategies. It requires a commitment to mining raw data and integrating it in useful ways. It will mean dynamically adapting to the continuing data explosion. It will require blood, sweat and tears. This is not a “plug and play” exercise. When I think of Big Data, that’s what I think about.