IoT and the data-driven enterprise: How to dive into the data flood

Virtually every kind of company will eventually have to become a data-driven enterprise. Experts share strategies for making the most of IoT data

It's an Internet of Things (IoT) world, with everything from heating systems to manufacturing control systems to RFID tags collecting data -- and if you're in an enterprise, that data is coming your way, if it's not already there.

That's both good and bad. Good because locked up in that data is information that can help your company improve efficiency, work smarter, find new sources of revenue and more. Bad because few companies are prepared for the incoming data flood.

This article provides an in-depth look at the issue and possible solutions to it.

A look at the issue

Before we get advice from the pros, let's take a brief look at the scope of the issue. Datameer, which sells a big-data analytics platform, says that by 2019 there will be 35 billion devices connected to the Internet. Some 40,000 exabytes of data will be generated by sensors that will be built into Internet-connected devices.

What type of data are we talking about? It could be anything. For example, manufacturing companies use sensors to check on their factories and equipment, to make sure everything runs smoothly, and to improve the manufacturing process. Retailers can use sensor data to better track sales and tie sales information to the supply chain. Everyday appliances will collect data as well.

So it's no surprise that a joint survey by Accenture and GE found that big data is top of mind for enterprises. Eighty-eight percent of executives surveyed said that it was one of their top three priorities, and 82 percent said they would build or add to their existing big data platform or their analytics capabilities in the next three years.

The long view from GE

By all accounts, GE is one of the IoT-and-big-data pioneers, both using the technology in its own businesses and providing services for companies that want to take advantage of IoT data. GE is in many businesses, from aviation to energy management, healthcare, oil and gas, transportation and more, with factories spread around the world, so it has had to face the IoT data flood before most other companies. Based on its experiences, it sells products and services for IoT and data, notably through its GE Intelligent Platforms division.

Rich Carpenter, Executive Chief Software Architect for GE Intelligent Platforms Software, says that the first challenge for most companies looking to make use of the IoT data flood is to gather the data -- and it's a tougher task than you might imagine.

"We face this problem a lot in our own business," he says. "We've got 400 global factories and a surprisingly large amount of that equipment is not connected, because a lot of equipment was installed before the Internet became popular."

He says that GE breaks down its equipment into three categories: completely unconnected equipment, equipment that is capable of being connected but needs work to complete the connection, and equipment that is either already connected or can easily be connected. GE then devises data-collection strategies for each type.
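As a rough illustration of that three-way triage, the sketch below sorts an equipment inventory into the categories Carpenter describes. It is not GE's method or software: the `Asset` fields (`connectable`, `needs_work`) and the sample machines are hypothetical, chosen only to show how such a breakdown might be recorded.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Connectivity(Enum):
    UNCONNECTED = "completely unconnected"
    NEEDS_WORK = "connectable, but needs work"
    READY = "already connected or easily connected"


@dataclass
class Asset:
    name: str
    connectable: bool  # hypothetical flag: can the equipment be connected at all?
    needs_work: bool   # hypothetical flag: does connecting it require extra work?


def classify(asset: Asset) -> Connectivity:
    """Map one piece of equipment to one of the three categories."""
    if not asset.connectable:
        return Connectivity.UNCONNECTED
    if asset.needs_work:
        return Connectivity.NEEDS_WORK
    return Connectivity.READY


def triage(assets):
    """Group asset names by category so each group can get its own strategy."""
    groups = defaultdict(list)
    for asset in assets:
        groups[classify(asset)].append(asset.name)
    return dict(groups)


# Made-up inventory, for illustration only.
inventory = [
    Asset("legacy stamping press", connectable=False, needs_work=False),
    Asset("CNC mill with serial port", connectable=True, needs_work=True),
    Asset("networked packaging line", connectable=True, needs_work=False),
]

for category, names in triage(inventory).items():
    print(f"{category.value}: {', '.join(names)}")
```

Grouping the inventory first makes it possible to plan a separate data-collection strategy per category rather than per machine.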

But merely gathering data from IoT devices isn't enough. IoT data can come in many different formats, which might not be compatible with one another or with data analytics software.

In industrial settings, GE installs data-collection appliances that it calls field agents, which have secure, authenticated connections to a public or private cloud for data storage. Not only do the devices send the data securely, but they also determine what kinds of data to collect, what protocols to use to collect them, and how the data should be stored.

Once the data is collected, companies need to make sense of it and mine it for useful information. That's difficult enough. Even more challenging is to take that information and then use it to make changes in the way a company works, such as by making a manufacturing plant more efficient.

Carpenter warns that many companies get stuck in this phase, and he has advice on how to get past it.

"Some companies start by picking one manufacturing plant, and try to make data collection and analysis perfect before moving on. But it can take forever to solve all problems, even in one plant. We've learned that a more prescribed solution works. Get yourself 70 percent of the way there in a plant, and then scale that across your entire enterprise. That brings you significantly more value much more quickly."

Carpenter also says virtually every kind of company will eventually have to take IoT data into account and become a data-driven enterprise.

"This isn't just for manufacturers or companies that already know they need to get into IoT," he says. "All companies need it, whether it's for asset management maintenance, ERP, supply chain, or helping a mobile workforce work more efficiently."

Intel advises: Look at your business goals first

Vin Sharma, Intel's Director of Strategy for Big Data Analytics, Data Center Group, agrees with Carpenter that just about any enterprise will one day need to make use of IoT data.

"Agriculture, manufacturing, healthcare, there are obvious reasons why all of them want and need IoT data," he says. "But our expectation is that every organization will want to make use of all the data available to them, which means IoT data. I'm struggling to imagine an industry that won't need this kind of information. Retail, for example, can gain a lot of value by monitoring its inventory of goods with RFID tags and beacons. Ultimately, the goal for many companies will be to get a 360-degree view of their customers, whether it's a patient in the healthcare industry, a farmer in the agricultural industry, or a consumer in the retail industry."

Sharma says that perhaps the biggest mistake that companies make with IoT data has nothing to do with technology, and everything to do with understanding their own business goals.

"A common problem is that companies don't have a very clear definition of their business objectives before starting, and of the analytics problem they want to solve," he says. "There's a nebulousness, and that translates into long delays for deployment. But with companies that have a very crisp and clear idea of what they want to accomplish, things tend to move very quickly."

Sharma uses the apparel industry as an example of the importance of clearly defining the business problem before embarking on any IoT project.

"Let's say the accuracy of the inventory in my stores isn't where I want it to be," he says. "That forces me to overstock clothing, which generates waste and reduces my margin. And this goes all the way down the supply chain. So I know that improving the accuracy of my in-store inventory will improve my profitability. That gives me a very clear definition of the problem I want to solve."

With that goal in mind, the company can design a system to get more granular and accurate data about its inventory of in-store goods, for example by using RFID sensor arrays.

The second major issue, he says, is with the scope of the IoT projects that companies undertake -- often, they're too large and become unwieldy and very difficult to deploy and manage.

"We see many companies succeed when they carve out a very specific, measured scope, first for a proof of concept, and then for a small pilot. After that, they can scale it both horizontally and vertically across their business."

He points back to the apparel example of needing a more accurate view of inventory. He suggests first running a pilot at a single store in a single location and working out all the issues. After that, he says, the company can scale to all of its 300 stores, and then add more types of data collection to the deployment.

Using cloud-based Hadoop platforms

Even enterprises that have clear definitions of the business problems they want to solve won't be able to make use of IoT data unless they have the analytics platform to handle it. Increasingly, open source Apache Hadoop is being recognized as a premier platform for that. The reason: it offers distributed storage and processing for very large data sets by using computer clusters that can be built from low-cost, commodity hardware.

But Hadoop is not easy to deploy, and is beyond the technical expertise of many enterprises. In addition, many companies don't want to build the massive platforms that the flood of IoT data can require. So a number of companies have sprung up that offer cloud-based, end-to-end Hadoop platforms, built for handling big data, including IoT data. That way, enterprises can focus on data analysis, rather than wrangling with building, deploying, and managing an entire platform.

Datameer offers one of those platforms. Datameer first built its platform in 2009, and Andrew Brust, the company's Senior Director of Technical Product Marketing and Evangelism, warns companies not to get caught up in the current IoT hype.

"Right now, IoT is in the prime of its hype cycle, so it sounds as if the data problems that enterprises face are entirely new. But at its core, it's not really a clean slate. What we're talking about in general is streaming data and analytics. The main difference is that there are a lot more things from which we can gather data now, and there's a greater frequency of gathering that data."

One of the biggest issues with IoT data, he says, is that it comes from many different devices using many different protocols and data standards that aren't necessarily compatible with one another. In some cases the data is highly structured, and in other cases, it's not.

"The biggest piece of advice I can give people is to look for technology and tools that let you create an abstraction layer on top of all of the IoT data. That way, when you get many different data types, you'll still be able to handle them, because the platform will be able to handle new standards as they come down the pipe. And look for a product that will be able to integrate data from as many different sources as possible."
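To make the idea of an abstraction layer concrete, here is a minimal sketch, not Datameer's product or API. It assumes two hypothetical device payload formats (a JSON document and a comma-separated line) and converts both into one common `Reading` record. The names `Reading`, `register`, and `normalize`, and the payload layouts, are illustrative assumptions.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Reading:
    """Common record that downstream analytics code works with."""
    device_id: str
    metric: str
    value: float


# Registry of format name -> parser; supporting a new format means adding one entry.
PARSERS: Dict[str, Callable[[str], Reading]] = {}


def register(fmt: str):
    """Decorator that registers a parser function for a payload format."""
    def wrap(fn: Callable[[str], Reading]) -> Callable[[str], Reading]:
        PARSERS[fmt] = fn
        return fn
    return wrap


@register("json")
def parse_json(payload: str) -> Reading:
    # Assumed layout: {"id": ..., "metric": ..., "value": ...}
    doc = json.loads(payload)
    return Reading(doc["id"], doc["metric"], float(doc["value"]))


@register("csv")
def parse_csv(payload: str) -> Reading:
    # Assumed layout: device_id,metric,value
    device_id, metric, value = payload.strip().split(",")
    return Reading(device_id, metric, float(value))


def normalize(fmt: str, payload: str) -> Reading:
    """Convert a raw payload in a known format into a Reading."""
    try:
        parser = PARSERS[fmt]
    except KeyError:
        raise ValueError(f"no parser registered for format {fmt!r}") from None
    return parser(payload)


print(normalize("json", '{"id": "thermostat-1", "metric": "temp_c", "value": 21.5}'))
print(normalize("csv", "rfid-gate-7,tags_seen,42"))
```

Code that consumes `Reading` objects never sees the device-specific formats, so a new protocol or data standard only requires registering another parser.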

Brust also says that it's important to hire the right people with the right analytics skills. Data scientists are in short supply, he acknowledges, but he believes that it's not necessary to hire people with that job title.

"The whole notion of the data scientist has a lot of mystique around it, but you shouldn't get boxed into a corner and think that you need to hire someone with that exact skill set," he says. "If you have good technical people who are skilled in data warehousing and IT work, you can provide them with the training and expertise they need to get the job done. Not only will you then get the right resources, but by offering that kind of opportunity for people in your enterprise, you'll have much better retention as well."

Altiscale also offers a cloud-based Hadoop platform. Mike Maciag, Altiscale's Chief Operating Officer, believes that working with IoT data differs in significant ways from working with big data in the past.

"In many cases, IoT data is an aggregation of many pieces of small data into humongous data," he says. "There's a constant stream that grows into hundreds of terabytes and then petabytes. Also, it's very often unstructured data, and so it may need a lot of manipulation before it can become useful. What's also unique is that much of the data is born in the cloud and coming from the cloud to you."

He says that this, in part, changes the way companies need to think about data. In the past, he says, companies would extract data, transform it, and then load it into a database. With IoT, he continues, "That has changed to extract it, load it, and then transform it."

Because of that, he recommends, "Make sure to store all the data that comes in, and don't throw it away, even if you don't know yet what to do with it. It may become valuable one day, when your company comes up with new strategies and ways of doing business."
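The extract-load-transform pattern and the "store everything" advice can be sketched in a few lines. This is an illustration under stated assumptions, not Altiscale's platform: it uses Python's built-in `sqlite3` as a stand-in for a large-scale store, and the table name `raw_events`, the `temp_c` field, and the sample payloads are hypothetical.

```python
import json
import sqlite3


def load_raw(conn: sqlite3.Connection, payloads) -> None:
    """Load step: store every payload exactly as received, with no filtering."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_events ("
        "id INTEGER PRIMARY KEY, payload TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO raw_events (payload) VALUES (?)",
        [(p,) for p in payloads],
    )
    conn.commit()


def average_temperature(conn: sqlite3.Connection):
    """Transform step, run later: parse only what this question needs."""
    temps = []
    for (payload,) in conn.execute("SELECT payload FROM raw_events"):
        try:
            doc = json.loads(payload)
        except json.JSONDecodeError:
            continue  # unparseable for now, but it stays in storage
        if isinstance(doc, dict) and "temp_c" in doc:
            temps.append(float(doc["temp_c"]))
    return sum(temps) / len(temps) if temps else None


conn = sqlite3.connect(":memory:")
load_raw(conn, [
    '{"temp_c": 20}',
    '{"temp_c": 22}',
    '{"humidity": 40}',      # no use for this yet; kept anyway
    "garbled sensor output",  # not valid JSON; kept anyway
])
print(average_temperature(conn))
```

Because the raw payloads are never discarded, a transform written later (for the humidity readings, say) can still be run over the full history.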

And that -- coming up with new strategies and new ways of doing business -- is at the core of why enterprises need to begin developing an IoT big data strategy now, or else improve their existing one. As GE's Carpenter says, "It's a matter of competition. You need to run your business based on real data rather than what you imagine. Your competitors are going to be doing it. If you don't, you're going to be left behind."

This story, "IoT and the data-driven enterprise: How to dive into the data flood" was originally published by ITworld.

Copyright © 2015 IDG Communications, Inc.
