Dec 19, 2017 3:00 AM

The new UI: Developing for wheel and voice in Windows 10

Human-computer interaction goes further than keyboard and mouse in modern Windows; be sure your apps take advantage of the new UI methods


Microsoft has long been a driver of alternate user interaction models on the PC; its first mouse shipped back in 1983, two years before Windows appeared and a year before the Apple Macintosh made the mouse mainstream. Then came the first pen- and stylus-driven operating systems, with pen support for Windows 3.1 in 1992. Windows XP and Tablet PC took that model a lot further, while touch support arrived in Windows Vista.

Now, with the current generation of Windows, those familiar computer interactions are joined by cameras (both the familiar 2D and newer depth-based devices), dials, voice, and even eye-tracking hardware. That’s a lot to think about when developing new applications, especially when supporting a flexible, mobile workforce whose members often bring their own hardware and want to use every feature they’ve paid for.

Like much of Windows’s new functionality, these features are not fully accessible from Win32. Instead, you’re going to need to use the Universal Windows Platform (UWP) APIs to add alternative input methods to your applications. That’s part of Microsoft’s plan to migrate Windows application development to the Windows 10 platform (and distribution to the Windows Store). Older Windows releases won’t get access to these features, which depend on new hardware, although in some cases third-party drivers will be usable on older versions of Windows.

Introducing the wheel, the Windows 10 UI for smooth contextual actions

One of the new device categories is the wheel, perhaps best known through Microsoft’s own Surface Dial. But Surface Dial is not the only device that supports these APIs; other wheel devices are now shipping from vendors like Dell.

The wheel supports two basic interaction modes: rotation and a single button with press, hold, and click actions. It’s not intended for fine control, but is more a way of using a dial as a selector while still interacting with an application via pen or another precision pointer. It’s a surprisingly useful way of working, because it doesn’t break your flow by sending you to a menu or another selector outside the work area. Instead, a tap opens a menu, rotating and clicking selects an option, and additional rotations set values for the chosen function.

Microsoft’s own Surface Dial offers another set of options, using the Surface Book 2 and Surface Studio screens to identify the position of the dial and supporting the drawing of UI elements around the Surface Dial itself. The onscreen menu elements grow to the size of the Surface Dial, integrating the hardware into the application. It’s an interesting approach to delivering an immersive user experience, especially on the large screen of a device like the Surface Studio, when you’re using it much like a physical drawing board with pen and dial in a CAD tool or an illustration package.

There are limitations to the wheel UX, so don’t expect to use it for a lot of items. There are seven slots on the wheel menu, so it’s important to choose what you use carefully. Not everything works well on a radial control; its best use is to provide continuous control over application features that might have required a slider in the past.

Because of its limitations, Microsoft suggests using a wheel as a contextual control, changing the features it controls based on what a user is doing. In a CAD application, it might handle 3D rotations in one mode, with clicks to select X, Y, or Z. In another mode, it could be selecting materials or choosing colors. You should think of the wheel less as a typical user interface element and more as an analog device with smooth operations instead of discrete steps, like the volume control on your stereo. There is the option of using the wheel as a step controller, but the form factor of the first wheel devices, with their big, heavy dials, is more attuned to smooth actions.

The UWP APIs make adding support for wheel devices relatively simple. Dedicated RadialController handlers set the menu items, button and rotation events then drive the controls, and you have the option of building custom XAML user interface elements to handle a rotating UI. There is also the option of using haptic feedback, an approach that works well if you’re using a wheel device as a menu selector but that can be distracting if you’re using it as an analog control.
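To make the event model more concrete, here is a minimal sketch in plain Python of a wheel used as a contextual control. It is a simulation, not the UWP API: WheelModel, WheelTool, and the seven-item cap taken from the figure quoted earlier are illustrative names and assumptions. In a real UWP app, the equivalent work would be done through RadialController menu items and its button and rotation events.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_MENU_ITEMS = 7  # slot count quoted in the article; treated as an assumption here


@dataclass
class WheelTool:
    """One menu entry: a label plus a handler that receives rotation deltas."""
    name: str
    on_rotate: Callable[[float], None]


@dataclass
class WheelModel:
    """Hypothetical stand-in for a wheel device (not the UWP RadialController)."""
    tools: List[WheelTool] = field(default_factory=list)
    selected: int = 0
    menu_open: bool = False

    def add_tool(self, tool: WheelTool) -> None:
        if len(self.tools) >= MAX_MENU_ITEMS:
            raise ValueError("wheel menu is full")
        self.tools.append(tool)

    def press(self) -> None:
        """A press toggles the menu: open it, or close it on the highlighted tool."""
        self.menu_open = not self.menu_open

    def rotate(self, delta_degrees: float) -> None:
        """Rotation moves the highlight while the menu is open, else drives the tool."""
        if not self.tools or delta_degrees == 0:
            return
        if self.menu_open:
            step = 1 if delta_degrees > 0 else -1
            self.selected = (self.selected + step) % len(self.tools)
        else:
            self.tools[self.selected].on_rotate(delta_degrees)


# Example: the same dial rotates a canvas in one mode and resizes a brush in another.
canvas = {"angle": 0.0, "brush": 10.0}


def rotate_canvas(delta: float) -> None:
    canvas["angle"] = (canvas["angle"] + delta) % 360


def resize_brush(delta: float) -> None:
    canvas["brush"] = max(1.0, min(100.0, canvas["brush"] + delta * 0.5))


wheel = WheelModel()
wheel.add_tool(WheelTool("Rotate canvas", rotate_canvas))
wheel.add_tool(WheelTool("Brush size", resize_brush))
```

The point of the design is that rotation carries a continuous delta to whichever tool is active, so the same physical gesture means different things in different contexts without the user leaving the work area.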

Using the speech UI in Windows 10 apps

Wheel controls are ideal for desk-based workers, especially if they’re working with creative applications. But they’re not suitable for mobile workers, who are likely to need hands-free tools or basic touch interactions. That’s where speech comes in.

UWP provides more than one way of using speech. Applications can offer their own APIs to Windows’s built-in assistant Cortana, adding new functions via the Cortana Skills Kit and taking advantage of its natural language recognition services. Alternatively, you can use the UWP speech features to add your own voice-control elements to your applications.

You won’t get the same level of speech recognition with the UWP speech APIs as you might through Cortana, because you’re limited to using speech as a replacement for text input or for basic interactions with a limited grammar.

It’s best to use the smallest grammar possible for control, because that reduces the risk of confusion or misunderstanding. When building speech recognition into an app, you should be prepared to fail gracefully, because audio conditions can significantly affect the quality of the recorded speech samples used for recognition.
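As a rough illustration of why a small grammar and graceful failure go together, the sketch below matches recognized text against a short command list and falls back to a helpful prompt when the phrase is unknown or the confidence is low. It is plain Python, not the UWP speech API; the COMMANDS table, the numeric confidence score, and the 0.6 threshold are all assumptions made for the example.

```python
from typing import Optional

# A deliberately tiny grammar: fewer phrases means fewer ways to mishear.
COMMANDS = {
    "next page": "NEXT",
    "previous page": "PREVIOUS",
    "stop": "STOP",
}

MIN_CONFIDENCE = 0.6  # assumed threshold; tune against real audio conditions


def interpret(heard: str, confidence: float) -> Optional[str]:
    """Return the action for a recognized phrase, or None if it should be rejected."""
    if confidence < MIN_CONFIDENCE:
        return None
    phrase = " ".join(heard.lower().split())  # normalize case and whitespace
    return COMMANDS.get(phrase)


def handle(heard: str, confidence: float) -> str:
    """Fail gracefully: tell the user what the app can understand."""
    action = interpret(heard, confidence)
    if action is None:
        return "Sorry, I didn't catch that. Try: " + ", ".join(sorted(COMMANDS))
    return "Running " + action
```

Rejecting low-confidence results outright, instead of guessing, keeps a noisy room from triggering the wrong action.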

Once users have triggered speech recognition, they’re presented with a customizable “listening” screen, followed by a “thinking” screen while the speech is processed and then a “heard you say” screen. One useful feature in the UWP speech development tools is the option of asking for confirmation. Using the “heard you say” view or UWP’s text-to-speech tools, you can repeat back to the user what the app recognized before triggering an action.
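The sequence of screens and the confirmation step can be sketched as a small flow. This is plain Python with hypothetical names (Stage, run_voice_command, and the three callables passed in); it only models the order of events described above and does not call any Windows API.

```python
from enum import Enum
from typing import Callable, List


class Stage(Enum):
    LISTENING = "listening"
    THINKING = "thinking"
    HEARD_YOU_SAY = "heard you say"
    EXECUTED = "executed"
    CANCELLED = "cancelled"


def run_voice_command(
    recognize: Callable[[], str],      # stand-in for the speech recognizer
    confirm: Callable[[str], bool],    # shows or speaks the prompt, returns yes/no
    execute: Callable[[str], None],    # the action to trigger once confirmed
) -> List[Stage]:
    """Walk through the stages and return them in the order they occurred."""
    trace = [Stage.LISTENING]
    raw = recognize()
    trace.append(Stage.THINKING)
    heard = raw.strip()
    trace.append(Stage.HEARD_YOU_SAY)
    # Repeat the result back and only act once the user agrees.
    if heard and confirm(f'Heard you say: "{heard}". Continue?'):
        execute(heard)
        trace.append(Stage.EXECUTED)
    else:
        trace.append(Stage.CANCELLED)
    return trace
```

Confirmation is most worthwhile for actions that are hard to undo; for harmless commands it may add more friction than it saves.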

Windows provides a basic default grammar that can recognize much of what a user says, but if you’re building an app that’s intended to handle specific user interactions, you’ll get better (and faster) recognition if you use a custom grammar that’s targeted at your specific use cases.

As Microsoft continues to add new APIs and new hardware support to UWP, you can expect to find more features like these in its platform. The message from Microsoft is a simple one: If you want to build code that supports new ways of working and more natural interactions, you should be moving away from Win32 as quickly as possible.