Explore new ways to use your data

Let all the data you collect bear fruit. Explorers in the physical world do not know what they will find beyond these forests and swamps. Explore your data lakes and follow your intuitions to unearth the new riches of digitalization!

Image: swamp in the Amazon rainforest. Credit: Ivan Mlinaric (CC BY 2.0)

In my recent post summarizing the 5 steps for transforming your business using data, I looked at how important it is to find new sources of data to feed your analytics engines, which in turn power turbocharged business processes. But doing the same old things with new data isn’t always going to cut it. You also need to look at new ideas and new processes to get more value out of your data assets.

A proper mindset

To achieve this, you need to put in place not only the right technology environment (more on this below) but, more importantly, the proper mindset to foster innovation. It is very hard to predict the return on investment of innovation, so it requires a leap of faith. If you don’t believe you will be rewarded for exploring new avenues, chances are you won’t go very far and will stick to the beaten path. Like the ancient explorers, to uncover new sources of wealth you need to venture into uncharted territories and do things that neither you nor your competition has done before.

There are a variety of ways this innovation mindset can be fostered. Large digital companies such as Google, Apple, Facebook and LinkedIn have more-or-less official programs, ranging from "20 percent time," "InCubator" and "Prototype Forum" to corporate hackathons, that encourage their employees to experiment, tinker, think outside the box and come up with new ideas -- products or processes. At Google, Gmail and AdSense are great examples of how letting employees be creative brought tremendous rewards.

Now, of course, even the most creative employee in the most open-minded organization won’t be able to express their creativity if they don’t have the proper tools. When dealing with data, this is where technology comes into play.

An easy place to find all data

The first element of this technology stack is an easy place to find all the data that’s available. There is no single answer here. In some organizations, a data lake will be the perfect answer: its very nature, containing raw and unfiltered data from many sources, makes it the right place to explore ideas or intuitions. In other cases, you may have to virtualize the data lake by logically augmenting it with access to transactional databases that can’t practically be replicated in the data lake.

Two elements are essential though. The first one is that the data should be as raw as possible. Any enrichment, cleansing or filtering rule is biased by the judgment of whoever specifies it. And while this judgment is probably a good one for their existing use cases, it may impair the ability to detect new patterns.
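
To make the point concrete, here is a minimal Python sketch. It assumes a handful of invented readings and a hypothetical "drop anything above 30" cleansing rule written for an existing report; neither comes from a real system. The idea it illustrates is to keep the raw data in the lake and apply such rules only when reading, so that the filtered-out values remain available to whoever wants to explore them.

```python
# Hypothetical readings, invented purely for illustration.
raw_readings = [12.1, 11.8, 12.4, 48.9, 12.0, 49.3, 11.9, 50.1]


def cleanse(readings, ceiling=30.0):
    """A typical 'drop the outliers' rule, written for an existing report."""
    return [r for r in readings if r <= ceiling]


def summarize(readings):
    """Return the count and the mean of a list of readings."""
    return {"count": len(readings), "mean": sum(readings) / len(readings)}


# The existing report applies the rule at read time and sees only the low values.
cleansed_summary = summarize(cleanse(raw_readings))

# An explorer working on the raw data still sees everything,
# including the cluster of high values the rule would have discarded.
raw_summary = summarize(raw_readings)
high_cluster = [r for r in raw_readings if r > 30.0]

print(cleansed_summary)
print(raw_summary)
print(high_cluster)
```

Whether that second cluster is noise or a new pattern worth investigating is exactly the kind of question a pre-filtered data set would never let you ask.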

The second element is cataloging. Metadata management is one of the weaknesses of newer big data environments, which can be both a curse and a blessing. Some degree of cataloging is required to help navigate the intricacies of sources and understand what the data represents. But too much cataloging can have the same perverse effect as data cleansing or filtering, by adding bias.
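
As a rough idea of what "some degree of cataloging" could look like, the sketch below keeps only a name, a source, a one-line description and the field names for each data set, plus a keyword search. The `CatalogEntry` class, the `find` helper and the two sample entries are all hypothetical and are not taken from any particular catalog product.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """A deliberately thin description of one data set (hypothetical structure)."""
    name: str
    source: str
    description: str
    fields: dict = field(default_factory=dict)  # field name -> declared type


# Invented sample entries, for illustration only.
catalog = [
    CatalogEntry(
        name="web_clickstream",
        source="web server logs",
        description="Raw page-view events from the public website",
        fields={"visitor_id": "string", "url": "string", "ts": "timestamp"},
    ),
    CatalogEntry(
        name="crm_accounts",
        source="CRM export",
        description="Customer account records, one row per account",
        fields={"account_id": "string", "customer_name": "string", "region": "string"},
    ),
]


def find(entries, keyword):
    """Return entries whose name, description or field names mention the keyword."""
    keyword = keyword.lower()
    return [
        e for e in entries
        if keyword in e.name.lower()
        or keyword in e.description.lower()
        or any(keyword in f.lower() for f in e.fields)
    ]


print([e.name for e in find(catalog, "customer")])
```

Note what the entries leave out: no quality scores, no "approved use" labels, no interpretation. They help people find their way around without telling them what to conclude.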

The right tooling

Call it data mining, data exploration, data wrangling, call it business intelligence, advanced analytics, predictive analytics -- a myriad of techniques and tools exist to examine, slice, dice and squeeze the data. Let each person pick and use the tools that are best for what they want to do. Some will be comfortable writing statistical algorithms in R or Python. Some will prefer a spreadsheet-style view from one of the modern data wrangling tools that let them rearrange their data dynamically. Others will use visual ETL tools to build data mapping and cleansing routines. This is research, and your research lab should have a variety of equipment at its users’ disposal.

Proper processes and governance

In an ideal world, you shouldn’t have to worry about getting the wrong data into the wrong hands. But the world is not ideal, and some industries are highly regulated.

While staying within the rules, your default should be to trust people to do the right thing (or rather, to not do the wrong thing -- such as leaking personal information, or stalking their ex based on data they find in your data lake). Too much control, too many restrictions and too much red tape to cut through just to follow a hunch are innovation killers.

This article is published as part of the IDG Contributor Network.