Bye-bye mouse, hello mind control

New interface methods will revolutionize how we interact with workplace computers

When workplace computers moved beyond command-line interfaces to the mouse-and-windows-based graphical user interface, that was a major advance in usability. And the command line itself was a big improvement over the punch cards and tape that came before.

We're now entering a new era of user interface design, with companies experimenting with everything from touch and voice to gestures and even direct mind control. But which of these new interfaces is appropriate for a corporate environment, and which are simply not ready for prime time?


Can you hear me now?

Voice recognition is one input technology that has made significant progress. A decade ago, accuracy was low and the technology required extensive training. Today, it is common to find voice recognition when calling customer support, and, of course, in the latest smartphones.

For general office use, however, voice recognition has made the biggest impact in specialized areas, like law and medicine. At the University of Pittsburgh Medical Center, for example, automated transcription has almost completely replaced human transcriptionists in the radiology department.

"The big thing in radiology is how can we go through as many studies as we possibly can," says Rasu Shrestha, the hospital's vice president for medical information technologies. "Turn-around time is incredibly important, as is the accuracy of the report."

The fact that the job itself is extremely routine is also important, he adds. "We sit down, we look at images, and we write reports," he says. "It's a fairly mundane task."

Shrestha says he began working with voice recognition a decade ago, and it was "horrendous" at first. "We had constant struggles, especially if you had any level of an accent. But things have come a long way. The Dragon Medical Engine [from Nuance] incorporates a lot of the medical ontology and vocabulary structures, so the platforms are intelligent."

As a result, accuracy has risen from roughly 70 to 80 percent a decade ago to close to 100 percent today. Meanwhile, human transcription has actually fallen in accuracy as hospitals have moved from dedicated secretaries, who would get to know a doctor's voice, to outsourced transcription services.

"There's no opportunity for you to build a bond with any particular person sitting at the back end of the transcription service," he says. Another reason that machine transcription is now better is that users can set up macros that automatically take care of a large chunk of work.

"If you have a normal chest X-ray, you could short-cut the entire documentation process," he says. "You can just turn on the mike and say, 'Template normal chest' and it automatically puts everything in, adds the context of the patient's name and age, and there you go - in seconds, you have created a full report that might have taken several minutes before. I would say that the days of the human transcriptionist are numbered."

Finally, machine transcription dramatically speeds up the workflow. "A decade ago, five years ago, when we were using a traditional transcription service, it used to be anywhere from a day to several days before the final report was sent back," he says. "Today, it's anywhere from seconds to a couple of minutes. The minute the patient is in the scanner and the scan is completed, it's in our work list. Sometimes within seconds or minutes of the study being available to us, the ordering clinician has the report available to them. It clearly increases our productivity and streamlines the process."

A more human approach to design

Increased accuracy of speech recognition is just the beginning of how new interfaces are transforming the way we interact with computers.

"The real power isn't that any of these new approaches is perfect," says Henry Holtzman, who heads the MIT Media Lab's Information Ecology group. "But together they can allow us to have a much more human experience, where the technology is approaching us on our terms, instead of us having to learn how to use the technology."

Voice recognition is one of the drivers of this change, which turns around the standard approach to interacting with a computer. "We can say, 'Remind me that I have a meeting at five,' and that's very different from turning on the phone, getting to the home screen, picking the clock application, putting it into alarm mode, and creating a new alarm," Holtzman says.

Traditionally, most interfaces are designed around the second approach: assemble a set of useful features and have the user learn how to use them. Even voice interfaces, such as those designed to improve accessibility for the handicapped, typically just add the ability to use voice commands to navigate the standard set of menus.

"But saying 'Remind me I have a meeting at five' is expressing a goal to the device, and having it do the steps for you," he says. That requires extra intelligence on the part of the computer.

Andrew Schrage, head of IT at MoneyCrashers, says he and other senior staff members at the company all use Siri, the virtual assistant on Apple's iPhone. "It has definitely improved productivity," he says. "We clearly get more things done on the go more expediently."

Siri can understand and carry out complex commands like "Remind me to call my assistant when I get home" and answer questions like "How deep is the Atlantic Ocean?"

"It has been somewhat of a game changer for us," Schrage says.

Intelligent agents

Apple's Siri is just one example of artificial intelligence being used to figure out what the user wants to do, and one of the most ambitious, since a user could potentially ask Siri about anything.

A slightly easier job is understanding spoken language in limited contexts, such as banking and telecom call centers.

"We start with a generic set of rules that we know work for, say, the telecommunications industry, and then use that in conjunction with their specific domain," says Chris Ezekiel, CEO of Creative Virtual, a company that processes spoken and written speech for companies like Verizon, Virgin Media, Renault, and the UK's National Rail.

"'Hannah,' for instance, for [UK's] M&S Bank, knows all about their credit cards, loans, and other financial service products," he says.

For companies that deploy virtual assistants like Hannah, the goal is to answer questions that normally are handled by human staff. According to Ezekiel, these virtual agents typically average 20 percent to 30 percent success rates, and the systems are continuously updated to learn from previous encounters so that they can handle more queries.
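Creative Virtual's engine is proprietary, but the layered approach Ezekiel describes, generic industry rules combined with a client's domain-specific ones, can be sketched with simple keyword matching. Every rule, answer, and phone number below is invented for illustration.

```python
# Two rule layers: a generic industry set and a client-specific set.
# Client rules take precedence. All content here is hypothetical.

GENERIC_TELECOM = {
    ("bill", "invoice"): "You can view your bill in your online account.",
    ("contact", "customer service"): "Call 0800-000-0000 or use live chat.",
}
CLIENT_SPECIFIC = {
    ("roaming",): "Roaming is enabled by default on all our pay-monthly plans.",
}

def answer(question):
    q = question.lower()
    for rules in (CLIENT_SPECIFIC, GENERIC_TELECOM):
        for keywords, reply in rules.items():
            if any(k in q for k in keywords):
                return reply
    # Unanswered questions get logged so the rule base can be updated,
    # which is how success rates climb over time.
    return "Sorry, I don't know that yet."

print(answer("How do I contact customer service?"))
```

The continuous-update loop the article mentions corresponds to mining the "Sorry" cases and writing new rules for the most frequent ones.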

One Creative Virtual client, Telefónica UK, found that its intelligent agent Lucy reduced customer service calls by 10 percent to 15 percent. That doesn't mean she understands only 10 percent to 15 percent of questions, says Telefónica knowledge base manager Richard Hagerty. "One of the key questions customers ask is, 'How do I contact customer service?'"

In other cases, Lucy might not yet know the answer, and the company will need to create one. "Maybe we wouldn't answer the question, anyway," he says.

What the company has learned over the past 12 months is that it's better to have one clear answer than to respond with several possible answers. In addition, Lucy needs to become a bit less human, he adds. For example, Lucy can handle a wide variety of personal questions. She says she likes Italian food, for example, has seen Titanic several times, and enjoys tennis and salsa dancing.

"There's a back story that allows a customer to ask personal questions," Hagerty explains. "She lives in Wimbledon, and is engaged to her boyfriend. But some customers believe they are having a chat with a human being. So we are looking at reducing some of the elements of personalization so that our customers' expectations are managed correctly. We want to make it clear to our customers that it's an automated service they're using, not a human being."

Gestures a tough nut to crack

Interface designers looking to translate spoken -- or written -- words into practical goals have a solid advantage over those designing interfaces for gestures or other non-traditional input methods.

That's because designers are already familiar with the use of spoken language. And if they aren't, there is a great deal of research out there about how people use language to communicate, says MIT Media Lab's Holtzman. The language of human gestures is much less understood and less studied.

"We've been playing around with browser interfaces that work with you moving your body instead of moving a mouse," he says. But there are no common gesture equivalents to the "pinch to shrink" and "swipe to flip page" touch commands.

There are some gestures that are universally identifiable, but they may be less appropriate for the workplace.

"We're at the beginning of the gesture phase," he says. "And not just the gestures, but everything we can do with some kind of camera pointing at us, such as moving our eyebrows and moving our mouths. For example, the screen saver on the laptop -- why doesn't it use the camera on the lid to figure out whether to screen save? If your eyes are open and you're facing the display it should stay lit up."

One company tracking hand motion is Infinite Z, which requires that users wear 3D glasses and use a stylus to touch objects that appear to float in the air in front of them.

"A virtual environment makes a lot of sense for computer-aided design, data visualization, pharmaceuticals, medicine, and oil and gas simulations," says David Chavez, the company's CTO. The products works with Unity 3D and other virtual environment engines, as well as the company's own Z-Space platform.

Another difficult technology to commercialize is eye tracking, which is commonly used to see which portions of an ad or website viewers look at first. It is also used to improve communication for the handicapped.

Reynold Bailey, a computer science professor at the Rochester Institute of Technology, uses eye-tracking technology to teach doctors to read mammograms better. The idea is to subtly highlight the areas a student should look at next, teaching them the scan patterns followed by experienced radiologists.

"If this works with mammograms, there are also other applications," he says. The same technology can be used to train pilots in how to check instruments, for example.

But he says he doesn't expect eye tracking to be used as an input device, to, say, replace a mouse for general-purpose use.

"The eye is not an input device," he says. "With the mouse, you can hover over a link and decide whether to click or not. With the eye, you might just be reading it, so you don't want to activate everything you look at. So you can do blink to click, but your eyes get tired from that. And we move our eyes around and blink involuntarily."

Limits of mind control

It may sound like science fiction, but mind-reading devices are already on the market -- and they don't require sensors or plugs to be implanted in your skull. Some work by sensing nerve signals sent to arms and legs, and are useful for helping restore mobility to the handicapped. Others read brain waves, such as the Intific, Emotiv, and NeuroSky headsets.

The Intific and Emotiv headsets can be used to play video games with your mind. But these mind-reading devices can do more than just connect with computers. NeuroSky, for example, is the maker of the technology behind the Star Wars Force Trainer and Mattel's MindFlex Duel game, both of which allow players to levitate balls with the power of their minds.

That doesn't mean that office workers can sit back, think about the sentences they want to write, and have them magically appear on the screen. "If you're an able-bodied individual, typing words on a keyboard is just so much quicker and more reliable than doing it with the brain control interfaces," says MIT Media Lab's Holtzman.

A paralyzed person may benefit greatly from being able to pick out letters or move a paintbrush simply by thinking about it, he says. And moving a racecar around a track with your mind is a fun parlor trick. But it's still easier just to use a real paintbrush, or simply pick up the car with your hands and move it around.

But where mind reading can directly benefit an office worker is in picking up the user's mood, he says.

For example, if the user is stressed or in a rush, a mailbox could sort itself to put the priority emails at the top -- then highlight the fun ones from friends when the user is relaxed.

"And if I'm tense and concentrating, it may delay my text alerts and messages instead of interrupting me," he says.

And it's not just facial expressions or mind scans that can be used to make computers switch modes. Today's smartphones and tablets are equipped with a variety of sensors, including GPS trackers, clocks, microphones, accelerometers, gyroscopes and compasses that can tell the device if it's moving, how it's being held, where it's located, what time of day it is, and much more.
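Combining those sensor readings into a guess about what the user is doing is a classification problem. The rules and thresholds below are entirely invented, and a production system would use the platform's sensor APIs and learned models rather than hand-written cutoffs, but the sketch shows how a handful of signals can drive a mode switch.

```python
# Hypothetical context inference from the sensors the article lists.
# All thresholds and category names are illustrative.

def infer_context(accel_magnitude, hour, gps_speed_kmh, ambient_db):
    if gps_speed_kmh > 20:
        return "commuting"   # moving fast: likely in a vehicle
    if accel_magnitude > 1.5:
        return "walking"     # device is bouncing in a pocket or hand
    if hour >= 22 or hour < 7:
        # Late and quiet suggests sleep; late and noisy, an evening out.
        return "sleeping" if ambient_db < 30 else "evening"
    return "at_desk"

# A device could then adjust behavior -- e.g., hold text alerts while
# "commuting" but surface them immediately "at_desk".
print(infer_context(accel_magnitude=0.1, hour=14, gps_speed_kmh=0, ambient_db=45))
```

This is the same pattern as the mood-driven mailbox Holtzman describes, just with motion and location standing in for facial expression or brain-wave data.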
