Add machine learning to your app with Microsoft Project Oxford

Want to play with advanced features like face recognition and language understanding? Microsoft makes these goodies available in the cloud for free.

Machine learning has caught fire lately, in part because the cloud has made it so much easier to implement. With cloud machine learning services from Amazon, Microsoft, and Google, you don't need to build and manage a farm of rules engines. All you need is a set of training data and a RESTful API.

That's fine for a certain set of problems -- say, where you're trying to guide consumers toward purchases, then check whether there might be a fraudster in the bunch. But what if you're trying to identify objects, translate languages, or recognize faces?

Microsoft has begun opening up its alternative machine learning tooling and APIs on the Project Oxford service. You may have seen some of its tools in action in the viral How-Old.net service, which attempts to guess a person's age based on an uploaded photograph.

Fun with artificial smarts

Project Oxford's APIs are easy enough to use once you register for a subscription key. Services are managed through the Azure portal, which also handles billing. At the moment the service is free, though there are transaction limits: 20 per minute for the Face API, for example, with a maximum of 5,000 transactions per month.
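Because the free tier is capped, it can help to count calls on the client before they ever reach the service. The following is a minimal sketch of that idea, not anything supplied by Microsoft: the class name `TransactionBudget` and its methods are made up for illustration, and the defaults simply mirror the Face API figures quoted above.

```python
import time
from collections import deque


class TransactionBudget:
    """Hypothetical client-side guard for per-minute and per-month quotas."""

    def __init__(self, per_minute=20, per_month=5000, clock=time.monotonic):
        self.per_minute = per_minute
        self.per_month = per_month
        self.clock = clock          # injectable so the guard can be tested
        self.recent = deque()       # timestamps of calls in the last 60 seconds
        self.month_count = 0        # caller resets this when a new month starts

    def try_acquire(self):
        """Record a call and return True if both limits allow it, else False."""
        now = self.clock()
        # Drop timestamps that have aged out of the 60-second window.
        while self.recent and now - self.recent[0] >= 60:
            self.recent.popleft()
        if len(self.recent) >= self.per_minute:
            return False
        if self.month_count >= self.per_month:
            return False
        self.recent.append(now)
        self.month_count += 1
        return True
```

Call `try_acquire()` before each request and skip or delay the request when it returns `False`. The service still enforces its own limits; this only keeps an app from burning through them by accident.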

With your subscription key in hand, you access the services through Azure, with APIs listed as subscribed Marketplace services. The service generates primary and secondary keys, which authenticate the calls you make through its SDKs.

Microsoft's API uses RESTful JSON exchanges to link to desktop or server code. While you can use JavaScript for some interactions, you'll probably prefer to work in a language like Java or C#. Client library code simplifies working with Project Oxford APIs, allowing you to quickly build machine learning features into your apps.
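If you would rather see the raw exchange than go through a client library, a keyed REST call can be sketched with nothing but the Python standard library. Treat the details here as assumptions: the endpoint URL is a placeholder, the `Ocp-Apim-Subscription-Key` header name follows the usual Azure API Management convention rather than anything stated in this article, and `build_request` is a hypothetical helper.

```python
import urllib.request

# Placeholder endpoint: substitute the URL listed for your subscription.
EXAMPLE_URL = "https://example.invalid/face/detections"


def build_request(body, subscription_key, url=EXAMPLE_URL,
                  content_type="application/octet-stream"):
    """Build (but do not send) a POST request that carries a subscription key."""
    headers = {
        "Content-Type": content_type,
        # Assumed header name for the subscription key.
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the result to `urllib.request.urlopen()` would send it and hand back the JSON response body. Building and sending are kept separate here so the request can be inspected without touching the network.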

For services such as the face detection tools, a simple asynchronous call uploads an image, then delivers an array of objects that can be used to draw overlay rectangles, showing detected faces.
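To give a feel for that last step, here is a small sketch that turns a detection response into corner coordinates ready for drawing. The JSON shape is an assumption made for illustration (a list of objects, each with a `faceRectangle` holding `left`, `top`, `width`, and `height`), so check it against the response you actually receive.

```python
import json


def face_boxes(response_body):
    """Convert a detection response (JSON text) into (left, top, right, bottom) tuples."""
    faces = json.loads(response_body)
    boxes = []
    for face in faces:
        rect = face["faceRectangle"]  # assumed field name
        left, top = rect["left"], rect["top"]
        boxes.append((left, top, left + rect["width"], top + rect["height"]))
    return boxes
```

Each tuple can be handed directly to a drawing routine that expects two opposite corners of a rectangle.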

Haven't I seen you somewhere before?

The face detection algorithm can work with a large range of face sizes, from 36 by 36 pixels to 4,096 by 4,096. But finding faces is only part of the story. Optional parameters estimate head pose, age, and gender (though the last two are experimental and not as accurate).
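Those size limits are easy to check locally before spending a transaction on an image that has no chance of matching. A minimal sketch, using the two bounds quoted above and a hypothetical helper name:

```python
MIN_FACE_PX = 36    # smallest face size quoted above, per side
MAX_FACE_PX = 4096  # largest face size quoted above, per side


def within_detectable_range(width, height):
    """Return True if a face region of this pixel size falls inside the quoted range."""
    return (MIN_FACE_PX <= width <= MAX_FACE_PX
            and MIN_FACE_PX <= height <= MAX_FACE_PX)
```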

The option of detecting face landmarks is interesting, if a little reminiscent of the fictional identification tools used on shows like "CSI" and "NCIS." The descriptions returned are semantic, allowing you to compare, say, the position of lips and eyebrows between two faces.

Once a face has been detected, you can use the data to compare it with other uploaded faces. Project Oxford generates a FaceID for every uploaded image (they are deleted after 24 hours). While the IDs are still stored, you can compare an image with an array of up to 100 candidate faces; the result is an array of similar faces, ranked by their degree of similarity. If you have a selection of images of one person, the service can group those faces together across different poses and lighting.
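If you have more than 100 candidates, one way to work within that cap is to split them into batches and merge the ranked results yourself. The sketch below assumes a caller-supplied function, here called `find_similar`, that wraps the actual service call and returns `(face_id, score)` pairs; both that function and `rank_similar` are hypothetical names, not part of any SDK.

```python
def rank_similar(query_id, candidate_ids, find_similar, batch_size=100):
    """Compare one face against many candidates, at most batch_size per call.

    find_similar(query_id, batch) is assumed to return (face_id, score) pairs.
    """
    matches = []
    for start in range(0, len(candidate_ids), batch_size):
        batch = candidate_ids[start:start + batch_size]
        matches.extend(find_similar(query_id, batch))
    # Merge all batches into one list, most similar first.
    return sorted(matches, key=lambda match: match[1], reverse=True)
```

Keep in mind that every batch counts as a separate transaction against the limits described earlier.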

That's the basis of the Face API's identification. With a set of images of a person, the Project Oxford system can be trained to recognize an individual.

Language processing and more

Another element of the suite is Language Understanding Intelligent Service (LUIS). Built using the tools Microsoft developed for Bing and Cortana search, LUIS helps understand user intent. You can start with prebuilt modules that have been tried and tested in Microsoft's services, then graduate to LUIS's custom module builder.

With LUIS, you can use an HTTPS cloud endpoint to identify entities in users' typed and spoken commands. LUIS can identify key actions and concepts, such as names and dates, then use them to drive your application. The latest release goes beyond English, adding Chinese language support (including phrases that mix both languages).
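What your app does with the answer is up to you. As a rough sketch, suppose the endpoint returns JSON with a list of scored `intents` and a list of typed `entities`; that shape, the field names, and the helper `top_intent` are assumptions for illustration, not a documented contract.

```python
import json


def top_intent(response_body, threshold=0.5):
    """Return (intent_name, entities_by_type), or (None, {}) if nothing scores high enough."""
    data = json.loads(response_body)
    intents = data.get("intents", [])
    if not intents:
        return None, {}
    best = max(intents, key=lambda item: item["score"])
    if best["score"] < threshold:
        return None, {}
    # Group entity text by entity type, e.g. {"date": ["friday"]}.
    entities = {}
    for item in data.get("entities", []):
        entities.setdefault(item["type"], []).append(item["entity"])
    return best["intent"], entities
```

The threshold gives the app a way to ask the user to rephrase rather than act on a low-confidence guess.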

Microsoft is making its Project Oxford SDKs available through GitHub, where you can download tools and sample code for working with the Face, Speech, and Computer Vision APIs. While much of the service uses RESTful APIs, sample code is available for both Windows and Android, letting you build machine learning into desktop and mobile apps, as well as Web and cloud services. Much of Project Oxford is focused on the tooling you need for back-end services, where you can handle pre- and postprocessing of results.

Bringing complex machine learning out of the laboratory is important. It lets developers get a feel for how these technologies can work in applications, and it helps researchers tune the algorithms they use -- taking them into the wider world and presenting them with a much larger set of data.

You can do a lot with tools like these, even if they aren't supremely accurate. While much of Project Oxford is in beta, it's worth trying out these APIs simply to see what modern cloud-hosted machine learning can add to your code.

Copyright © 2015 IDG Communications, Inc.
