Your data-driven business is fueled by insights obtained from processing and analyzing data from various sources. Have you considered what would happen if one -- or several -- of the data sources that feed this digital engine were to dry up? What if you were suddenly unable to access the critical data that makes your business run?
Let's look at where your data comes from, and consider which concrete actions you can take to secure its supply.
Internal transactional data
Internal data, especially data whose primary purpose differs from the use your digital business makes of it, is both the easiest and the trickiest to secure. It's easy because you don't have to negotiate a formal contract with a third party, and if there is executive buy-in for what you do, getting the data owner to provide access should not be a problem. But it's also tricky precisely because of this lack of a formal contract: people change, and priorities shift. Whether by accident or not, you may find your access cut off overnight, with restoring it far from a top priority for the data owner. Or data schemas may change and force you to rebuild your entire collection processes.
Action: make sure the proper processes and SLAs are in place, and follow organizational and staff changes very closely so that you can explain to new stakeholders why your access to the data must remain safe.
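One way to notice a schema change before it breaks your collection processes is to check incoming records against the schema you expect. The following is a minimal sketch only: the field names and types in `EXPECTED_SCHEMA` are hypothetical, and a real feed would need its own definition.

```python
# Hypothetical expected schema for an internal feed: field name -> Python type.
EXPECTED_SCHEMA = {
    "order_id": int,
    "customer_id": int,
    "amount": float,
    "currency": str,
}


def schema_drift(record, expected=EXPECTED_SCHEMA):
    """Return a list of differences between a record and the expected schema."""
    problems = []
    # Fields we rely on: are they still present, and still of the same type?
    for field, expected_type in expected.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            actual = type(record[field]).__name__
            problems.append(
                f"type changed: {field} is {actual}, expected {expected_type.__name__}"
            )
    # Fields we did not expect: often the first sign of a renamed column.
    for field in record:
        if field not in expected:
            problems.append(f"unexpected field: {field}")
    return problems
```

An empty list means the record still matches. Anything else is worth raising with the data owner before the next collection run fails.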
Connected objects data
If you process data from the Internet of Things, and especially from consumer connected devices, your challenge in securing access is primarily legal. There are two questions you need to consider:
- Who owns the data? Does it belong to the owner of the device, the account holder, or to your organization?
- What can you do with the data? Surely you can use it to render a service to your subscriber, but can you aggregate it with data from other subscribers? Can you resell this data (anonymized or not)? Can you derive insights and resell them?
Syndicated data
Syndicated data is usually the easiest to control. Because you are paying a service provider to deliver data to you, you have a contract with this provider. This contract will cover service level agreements, licensing and usage limitations, and should ensure continued access.
However, you still need to consider what will happen if the service provider goes out of business or changes its business model (as with Twitter's recent announcement that it is shutting down its firehose to better control its supply chain).
Action: review whether alternate sources are available, and keep these options at hand in case you need them.
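Keeping alternate sources at hand can be as simple as an ordered list of providers that your collection code tries in turn. The sketch below assumes each source is wrapped in a function that either returns data or raises an exception. `primary_feed` and `backup_feed` are hypothetical stand-ins, not real provider APIs.

```python
def fetch_with_fallback(sources):
    """Try each (name, fetch) pair in order; return (name, data) from the first success."""
    errors = {}
    for name, fetch in sources:
        try:
            return name, fetch()
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            errors[name] = exc
    raise RuntimeError(f"all sources failed: {sorted(errors)}")


# Hypothetical sources, for illustration only.
def primary_feed():
    raise ConnectionError("provider unavailable")  # simulate an outage


def backup_feed():
    return [{"symbol": "ABC", "price": 10.0}]


name, data = fetch_with_fallback([("primary", primary_feed), ("backup", backup_feed)])
```

Here the primary provider fails, so the data comes from the backup. Returning the name of the source that answered lets you track how often the fallback is actually used.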
Trading partners data
The case of trading partners data is very similar to that of syndicated data, except that the data is usually not provided as a standalone service but as part of a broader relationship -- for example between a retailer and a manufacturer. Enforcing service level agreements can become tricky if it puts an otherwise profitable relationship at risk.
Action: as with syndicated data, always keep alternate sources in mind, if applicable.
Open data
Action: find multiple sources, and do not build your business on the assumption that open data feeds will remain available in the long run.
Harvested data
Harvesting data from web sites (screen scraping) or public APIs is common practice, but it is also the least secure source of data you can consider.
From a legal standpoint, this practice is often borderline, since no licensing agreement permits you to use data harvested this way.
From a data availability standpoint, web sites change all the time, and your scraping routines will become obsolete in no time.
Action: stay away from data harvesting! If data harvesting is your only option, be prepared to suffer outages and to redevelop your routines again and again. And maybe get a lawyer...
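If harvesting really is your only option, one way to be prepared is to make a routine fail loudly when the page no longer looks the way it expects, rather than silently returning nothing. This is a minimal sketch using Python's standard `html.parser` module. The page layout it assumes (prices inside `<span class="price">` elements) is hypothetical.

```python
from html.parser import HTMLParser


class PriceExtractor(HTMLParser):
    """Collect the text of every <span class="price"> element (a hypothetical layout)."""

    def __init__(self):
        super().__init__()
        self.prices = []
        self._in_price = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "price" in classes:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(data.strip())


def extract_prices(html):
    """Return the prices found, or raise if the expected structure is gone."""
    parser = PriceExtractor()
    parser.feed(html)
    parser.close()
    if not parser.prices:
        # An empty result most likely means the site changed, not that there is no data.
        raise ValueError("no price elements found; the page layout may have changed")
    return parser.prices
```

Treating an empty result as an error turns a silent data gap into an outage you can see and fix.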
This article is published as part of the IDG Contributor Network.