In 2015, big data will slowly permeate the borders of the enterprise

Despite obstacles, integration with the outside world will enter the scope of big data projects in 2015 and visible use cases will start to be publicized.

Big data has proven its usefulness. This last year has seen more big data projects go live than ever, more proofs of concept turning productive, more value being derived from new insights on new data sets. Technology has helped, with the advent of Hadoop 2.0 and the YARN framework, which played a critically important role, turning Hadoop into a multi-purpose, multi-workload and multi-latency data platform.

There are still gaps in Hadoop that prevent it from becoming a prevalent enterprise computing platform, the most glaring probably being its security shortcomings. But we know these are being addressed. Securing a computing platform has been done before, and there is nothing in Hadoop that makes security a more complex proposition than in other environments. It's just that Hadoop has evolved so quickly that security hasn't had time to catch up. Vendors will see to this in 2015, no worry!

Big data projects essentially fall into two major categories:

  • The first category includes improving and accelerating existing business processes. By accessing new data sources, using more records, applying new algorithms and running them faster for right-time insight, these projects provide value through faster and better insight, resulting in faster decision making and better-oiled processes. A lot of the early (and most publicized) successes in big data fall in this category -- which makes sense since it's not about reinventing the wheel, just making it more round!
  • The second category of big data projects enables organizations to invent a new business model based on data. These range from entirely new companies born in the digital age (connected objects vendors, or a new breed of service providers such as Uber or Airbnb, typically belong here) to "traditional" players enriching their business model with value-added services (aircraft or turbine manufacturers, for example) and to organizations spinning off a "data" business unit to sell the data they collect in the normal course of their business (a telco operator selling geolocation data on consumers, for example).

Many of these projects cannot be contained within the borders of the enterprise's firewall. By definition, business models based on data require that big data become accessible and visible to trading partners, customers and consumers of the service provided. These new data suppliers no longer live in a "one to one" or "one to few" environment, where ad-hoc interfaces could be established with a few key suppliers or customers. They have hundreds, thousands, even millions of users who should be able to gain access to data in a few clicks, and "onboarding" them one at a time is not an option. Most of the time, a short-lived "one to many" world soon becomes a "many to many" world where several (digital) businesses fight for the same customers, and hence ease of access and speed of configuration become critical.

Even in big data projects that focus on improving or accelerating existing business processes, integration with the outside world is slowly entering the project scope. A huge source of customer insight comes from social media, for example. More and more sources of third-party reference and enrichment data are available. Big data technologies make it possible to incorporate such data into projects.

However, big data platforms are not well equipped to communicate with the outside world. The current standard, APIs based on the REST architectural style, answers all the requirements listed above. Such APIs can of course be implemented easily on top of Hadoop or NoSQL data sets, but integration at the metadata layer would be an interesting capability for platform vendors to offer. And since security is one of the strongest requirements of any external integration, big data platforms will have to rely (for now) on the fairly advanced security features of APIs, which are still not as good as native security.
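
To make the idea concrete, here is a minimal sketch of what a REST-style API in front of a data set might look like, with security handled at the API layer rather than natively by the platform. Everything in it is an assumption for illustration: the in-memory RECORDS list stands in for data that would really sit in Hadoop or a NoSQL store, and the /records resource, the X-API-Key header, the demo key and the handle_request and serve names are all hypothetical. It uses only the Python standard library.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# Hypothetical stand-in for records that would really live in Hadoop or NoSQL.
RECORDS = [
    {"id": 1, "region": "north", "readings": 120},
    {"id": 2, "region": "south", "readings": 87},
    {"id": 3, "region": "north", "readings": 45},
]

# Hypothetical API keys issued to external users. Onboarding a new partner
# means adding a key here, not building a custom interface for that partner.
API_KEYS = {"demo-key-123": "partner-a"}


def handle_request(path, api_key):
    """Return (status_code, payload) for a GET request on the given path."""
    # Security lives in the API layer: reject callers without a known key.
    if api_key not in API_KEYS:
        return 401, {"error": "missing or invalid API key"}
    parsed = urlparse(path)
    if parsed.path != "/records":
        return 404, {"error": "unknown resource"}
    # Optional ?region=... filter; no filter returns every record.
    region = parse_qs(parsed.query).get("region", [None])[0]
    rows = [r for r in RECORDS if region is None or r["region"] == region]
    return 200, {"count": len(rows), "records": rows}


class Handler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper that delegates to handle_request."""

    def do_GET(self):
        status, payload = handle_request(self.path, self.headers.get("X-API-Key"))
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Start the sketch server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling serve() and then requesting /records?region=north with the header X-API-Key: demo-key-123 would return the matching records as JSON, while a request without a valid key gets a 401. Keeping the request logic in a plain function, separate from the HTTP wrapper, is just a convenience for testing; a production API would sit behind a proper gateway and query the platform instead of a list.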

Still, in 2015, big data will slowly but surely start to expand beyond the IT systems owned and operated by the enterprise and permeate into the outside world.

Copyright © 2015 IDG Communications, Inc.