How machine learning ate Microsoft

Yesterday's announcement of Azure Machine Learning offers the latest sign of Microsoft's deep machine learning expertise -- now available to developers everywhere

The new Azure ML service started out as an MSR Excel demo that sent data to experimental machine-learning-driven analytics running on Azure. A couple of years before he became CEO, Satya Nadella came across the demo and immediately saw the potential of turning it into a product for business customers. He persuaded the researcher, Roger Barga, to join a team inside his cloud division. “Satya got excited, and he got me excited,” Barga remembers.

The idea was to combine the machine learning tools from the research team with the expertise that product teams across Microsoft had gained by implementing machine learning algorithms. Making machine learning work well isn’t only about having a good algorithm or even making it perform at scale. You also need to make it consistent. The same algorithm in different machine learning packages often delivers different answers; using heuristics to find the model that best matches your data takes a lot of experience.

That experience offers a unique advantage, says Barga: “These algorithms have been hardened and proven over the years. We're able to draw on that expertise to implement them again in Azure ML. We know what the best practices are, what the heuristics are, what we should do to ensure this will be a robust, scalable, and performant implementation we can deliver to our customers.”

But Azure ML didn’t merely take the machine learning algorithms MSR had already handed over to product teams and stick them into a drag-and-drop visual designer. Microsoft has made the functionality available to developers who know the R statistical programming language and Python, which together are widely used in academic machine learning. Microsoft plans to integrate Azure ML closely with Revolution Analytics, the R startup it recently acquired.
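
For developers who would rather work in code than in a designer, consuming a model published from Azure ML comes down to a REST call. The sketch below is illustrative only: it uses Python's requests library, and the scoring URL, API key, and column names are placeholders -- the real values and payload shape come from the web service you actually publish.

```python
# Illustrative sketch: calling a published Azure ML web service from Python.
# The endpoint URL, API key, and input schema below are placeholders -- the
# real values come from the service you publish from Azure ML Studio.
import requests

SCORING_URL = "https://example.azureml.net/workspaces/<id>/services/<id>/execute"  # placeholder
API_KEY = "<your-api-key>"  # placeholder

payload = {
    "Inputs": {
        "input1": {
            "ColumnNames": ["age", "income", "num_orders"],  # hypothetical features
            "Values": [[34, 52000, 7]],                       # one row to score
        }
    },
    "GlobalParameters": {},
}

response = requests.post(
    SCORING_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
response.raise_for_status()
print(response.json())  # scored results; the shape depends on the published experiment
```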

Developers can also design a machine learning system using the Azure ML Studio tool. That’s popular even with experienced machine learning developers like the team at Mendeley, Elsevier’s academic research network, which built a new recommendation system in a third of the time it took them with other tools. JJ Food Service in the United Kingdom used it to build a predictive shopping cart that fills in products for you; customers like the convenience, and revenue is up 5 percent.

A machine that trains itself

To make it easier to use multiple machine learning algorithms together, Microsoft needed to build a suitable platform. That meant creating a system for moving new algorithms from research into production; as new techniques are developed, they can be plugged into Azure ML, keeping the service current as machine learning continues to advance.

A common problem with older machine learning systems (and one of the issues that deep learning will address) is "ML rot": you spend a long time training your system, and it works for a while after you roll it out -- but then it falls out of date and you have to train it all over again. One way to avoid that is to retrain your model as you use it.

During the preview, customers were so keen on that idea that Microsoft added programmatic training and retraining. “They want to upload data to an API and have machine learning models do the learning, so we added that,” explains Sirosh. “Once you have an API in place, you can keep uploading data and the model will update itself and stay fresh and be constantly learning."
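
Azure ML's own retraining API isn't reproduced here, but the underlying idea -- keep feeding new data in so the model stays fresh -- can be sketched with scikit-learn's incremental learners as a stand-in. Everything in this example (the simulated weekly batches, the feature count) is hypothetical.

```python
# Conceptual stand-in for "upload data and the model keeps learning":
# an incremental learner updated batch by batch as fresh data arrives.
# (scikit-learn here is an illustration, not the Azure ML retraining API.)
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(loss="log_loss")  # logistic regression trained by SGD
classes = np.array([0, 1])

def weekly_batches(n_weeks=10, n_rows=500, n_features=20, seed=0):
    """Simulate batches of new data arriving over time (placeholder data)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n_features)
    for _ in range(n_weeks):
        X = rng.normal(size=(n_rows, n_features))
        y = (X @ w + 0.5 * rng.normal(size=n_rows) > 0).astype(int)
        yield X, y

for X_new, y_new in weekly_batches():
    model.partial_fit(X_new, y_new, classes=classes)  # model stays fresh
    print(f"batch accuracy: {model.score(X_new, y_new):.2f}")
```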

That’s what eBay used to train its translation system on the terms used in women’s fashion. If you’re selling handbags, dresses, shoes, or other fashion items on eBay, you might see much better sales overseas because automatic translations of listings are more accurate -- and available in all 45 languages Azure ML supports.

This week, Microsoft added a new machine learning algorithm used by Bing Ads that can handle very large amounts of data. “We can learn at a terabyte-sized data set,” boasts Sirosh. “I don’t know if any cloud service allows you to learn in terabyte sizes today other than Azure ML.” That’s useful for big data, where the signal you’re looking for may only show up across an enormous data set.

Microsoft has a range of services that work together for big data scenarios. You can load data into HDInsight, Microsoft’s Hadoop cloud service, or pull in data from websites and sensors with Event Hubs, then process that stream of data with Azure Stream Analytics or with Apache Storm, which Azure now supports.
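
To make the ingest step concrete, here is a minimal sketch of pushing sensor readings into Event Hubs using the azure-eventhub Python package; the connection string, hub name, and the reading itself are placeholders, and a production pipeline would batch many events at a time.

```python
# Minimal sketch: pushing sensor readings into Azure Event Hubs, where they can
# be picked up by Stream Analytics or Storm. The connection string, hub name,
# and reading are placeholders.
import json
from azure.eventhub import EventHubProducerClient, EventData

CONNECTION_STR = "<event-hubs-connection-string>"  # placeholder
EVENTHUB_NAME = "sensor-readings"                  # placeholder

producer = EventHubProducerClient.from_connection_string(
    conn_str=CONNECTION_STR, eventhub_name=EVENTHUB_NAME
)

reading = {"sensor_id": "line-3-temp", "celsius": 78.4, "ts": "2015-02-18T10:00:00Z"}

with producer:
    batch = producer.create_batch()
    batch.add(EventData(json.dumps(reading)))  # one event; real code would add many
    producer.send_batch(batch)
```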

"From that you can call the machine learning APIs to detect anomalies or fraud," explains Sirosh. "You can take enormous amounts of data using, say, HD Insight and use that distilled with Azure ML to learn models that can be deployed in an application. But big learning is a lot more than that. Say fraud is high in certain postal codes and not in others. There are millions of postal codes in the world. These techniques allow you to take these patterns into account; you’re able to use very fine-grained information and be very precise about it."

Sirosh clearly believes his platform will accelerate machine learning adoption. “Today businesses hire data scientists and they painfully custom-build their own machine learning apps. With a platform like Azure ML it becomes so easy to create custom apps ... Only when you have a special set of needs will you need to set up a team of data scientists to build an API for you.”

Walk into a Chili’s restaurant and you might find a tablet on each table for ordering food, watching videos, paying the bill, and giving feedback. The system, built by Ziosk, uses HDInsight to track how customers use the tablets in 1,400 restaurants -- and Azure ML to customize which offers and content they see. It can even change the interface on the tablet based on how they use it.

Sirosh thinks everyone should be building that kind of system. "This is the birth of the intelligent cloud in many ways. Any application you build, you should now consider using the data generated from the app, or any other data you have, to create a better customer experience, to create efficiencies you wouldn't have otherwise tapped into."

Microsoft’s big machine learning future

CEO Satya Nadella called out machine learning -- and the big data that powers it -- as a key development in his memo to Microsoft last July. “Billions of sensors, screens, and devices -- in conference rooms, living rooms, cities, cars, phones, PCs -- are forming a vast network and streams of data that simply disappear into the background of our lives. This computing power will digitize nearly everything around us and will derive insights from all of the data being generated by interactions among people and between people and machines. We are moving from a world where computing power was scarce to a place where it now is almost limitless, and where the true scarce commodity is increasingly human attention.”

That sounds rather more achievable when you talk to Peter Lee about the advances he believes Microsoft can make in the next decade.

Last year he showed off early work on a machine learning system that could use your phone camera not only to recognize a dog, but to identify the breed, or to tell you whether a plant was poisonous. That’s Project Adam, which is trying to apply cloud principles of scale to machine learning. Normally, training is synchronous, which keeps it confined to a single system or a tightly coupled cluster; with Project Adam, the learning can be asynchronous, so it can be spread across a whole data center.
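
A toy way to see the synchronous-versus-asynchronous distinction: in the sketch below (plain Python and NumPy, and emphatically not Project Adam's code), several workers update one shared set of weights without waiting for each other -- the property that, at data-center scale, lets training be spread across many machines.

```python
# Toy illustration of asynchronous, lock-free parameter updates. This is a
# single-machine analogue using threads and a shared NumPy array; it is not
# Project Adam's implementation.
import threading
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 8000, 20
X = rng.normal(size=(n_samples, n_features))
true_w = rng.normal(size=n_features)
y = (X @ true_w > 0).astype(float)

w = np.zeros(n_features)  # shared parameters, updated by all workers
lr = 0.05

def worker(shared_w, rows):
    """Run plain SGD on a slice of the data, writing straight into shared_w."""
    for i in rows:
        pred = 1.0 / (1.0 + np.exp(-X[i] @ shared_w))
        shared_w -= lr * (pred - y[i]) * X[i]  # no locks: updates may interleave

threads = [
    threading.Thread(target=worker, args=(w, range(k, n_samples, 4)))
    for k in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

accuracy = (((X @ w) > 0) == y.astype(bool)).mean()
print(f"accuracy after asynchronous updates: {accuracy:.3f}")
```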

Project Adam is only one of what Lee calls several machine learning "moonshots" -- “efforts that are truly aspirational but have really concrete, easy-to-assess goals so you know whether you’ve done it or not.” He’s very protective of them (“the pressure can be very distracting”), so he won’t name the other projects or say what the goals are -- but they’re big.

“Project Adam truly pertains to going beyond speech and vision to really a deep understanding of human discourse. Ultimately, it’s the next stage of a true AI where we really understand at scale how to get a machine to understand what human beings are talking about. The goals there are so interesting. From a scientific perspective there are tremendous implications for our understanding; from an engineering perspective the scale is really dazzling and from a commercial perspective the prospects for applications are incredibly enticing. We have very significant efforts in the foundations of speech and translation along the same lines.”

Lee is both excited and pragmatic about the potential of these big projects -- and the side benefits “that have started to dribble out already” -- from OneDrive (which now uses machine learning to tag your photographs) to the demonstrations of Skype Translator (where performance improvements from new techniques have left even researchers “stunned”). Plus, there’s a ready-made platform in Azure Machine Learning for bringing those new techniques to product groups inside Microsoft and developers elsewhere.

“With these large aspirational efforts, there's always a part of me that harbors some skepticism about whether we'll ever get there,” Lee admits. “Some of these things are so fantastical, but you never know! You get surprised. And as a research manager, I'm comfortable that whether we get there or not, there are going to be a tremendous number of spinoffs and new knowledge.”

Whether or not Microsoft makes more fundamental breakthroughs in AI, what it learns about using machine learning will carry on showing up in all the products you use -- including ones you build yourself.
