InfoWorld
Making speech technology mainstream

General Magic's CTO talks about speech technology, Web services, and choosing between .Net and J2EE

By Ephraim Schwartz
May 28, 2002
 

General Magic, a public company since 1995, proudly boasts another General -- General Motors -- as one of its premier partners and customers, but it wasn't always so. Founded in 1993 by Apple, AT&T, Sony, Motorola, Philips, Matsushita, NTT, France Telecom, and British Telecom, the company was targeting innovative GUIs for handheld devices. In 1997 the company shifted its focus to voice technology and developed one of the first unified voice messaging platforms, magicTalk, and its product Portico.

By the time the new century rolled in, however, it had become obvious that unified messaging was not going to set the world on fire. When a new CEO, Kathie Layton, was appointed to head up a turnaround, she in turn tapped Pat Haleftiras as the new CTO. Layton and Haleftiras saw the potential for an enterprise software play using Java componentry to take speech technology mainstream. Neither Layton nor Haleftiras has wavered from that vision, and now the CTO, with 25 years of software development experience under his belt, is in the thick of the action with products for Web application servers and Web services. InfoWorld sent Editor at Large Ephraim Schwartz, whose daily beat includes speech technology, to talk to Pat Haleftiras about the company's new direction and its newest speech-enabled solutions.

InfoWorld: General Magic's myTalk platform is currently being deployed by General Motors, is that correct?

Haleftiras: Yes, myTalk is the technology behind OnStar's Virtual Advisor, which is the largest in-production VXML [Voice XML] deployment in the world today. It is a marriage of Java and VXML. It was over the course of helping OnStar with the Virtual Advisor that we came to understand the dynamics of what it takes to build an enterprise-class, large-scale voice application. We took that experience, and it now manifests itself in the magicTalk Enterprise Platform, which we have taken to the general market.

InfoWorld: Then you selected J2EE (Java 2 Enterprise Edition) over, say, Microsoft technology?

Haleftiras: There are only two viable software component models in the enterprise: .Net or J2EE. We chose to go down the J2EE path.

InfoWorld: Why?

Haleftiras: The predominance of enterprise environments in our experience is J2EE. We are a small company and we need to pick our spots, and we focus on the marriage between VXML and J2EE. The solution can dynamically generate VXML [responses]. We can accommodate another standard [SALT] if one emerges, but the underlying engine that drives everything will be a J2EE platform.

InfoWorld: So it runs on all the standard J2EE application servers?

Haleftiras: BEA, IBM, and JBoss.

InfoWorld: Not Oracle?

Haleftiras: No. At the time when we first started, IBM and BEA were 60 percent of the marketplace. We brought in JBoss [an open-source app server] to have a lower-end, lightweight solution because there are a lot of call centers without a lot of infrastructure. For $500 you can get service and have access to source code.

InfoWorld: Voice technology hasn't exactly set the world on fire. What do you think is happening to change that?

Haleftiras: Voice evolved from a black box point solution to today, [where] you can take our software and layer it on top of a J2EE app server and it becomes IVR [Interactive Voice Response] on steroids.

InfoWorld: Steroids? Please explain.

Haleftiras: Traditionally, the voice would instruct a caller to press 1. The next level added a voice recognition box to the black box. Now a caller could "press or say 1," and that is still the predominant voice-enabled system out there today. But over the last several years they have gotten more sophisticated. "Press or say 1" is turning into "Welcome to the XYZ company, would you like sales or service?" We are beginning to craft an interface that has higher usability with more complex interactions with the user. And now we are just entering the third stage of development.

At this third stage the idea of standards-based implementations is emerging. The metaphor for VXML is Web development, where you have an HTML page with GIF files and headers and tags, and when an HTTP request comes in those components get pushed out to the browser. VXML is similar. Now a VXML page gets pushed out to a voice browser, and a voice interpreter resident in the network or in, say, an AT&T facility is used. We host Virtual Advisor in our facility and we have racks of voice gateways in our hosted facility. Those gateways house the VXML interpreter. As the VXML page is pushed to the server it is interpreted, turned into an analog signal, and sent out to the phone line, and that is what you hear -- pushed to wire line and cellular.
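
To make the Web analogy concrete, here is a minimal Java sketch of server-side code producing a VoiceXML page for the "sales or service" greeting mentioned above. It is an illustration only, not General Magic's implementation: the class and method names and the example.com URLs are hypothetical, and the only VoiceXML 2.0 elements assumed are `<vxml>`, `<menu>`, `<prompt>`, and `<choice>`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: builds a VoiceXML 2.0 menu page as a string, the way
// a Web application would build an HTML page for a browser.
final class VxmlMenuSketch {

    /** Escapes the characters that are significant in XML text and attributes. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /**
     * Renders a menu page. Each map entry pairs a spoken choice with the URL
     * of the next dialog page (placeholder URLs in this sketch).
     */
    static String renderMenu(String greeting, Map<String, String> choices) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<vxml version=\"2.0\" xmlns=\"http://www.w3.org/2001/vxml\">\n");
        sb.append("  <menu>\n");
        sb.append("    <prompt>").append(escape(greeting)).append("</prompt>\n");
        for (Map.Entry<String, String> e : choices.entrySet()) {
            sb.append("    <choice next=\"").append(escape(e.getValue())).append("\">")
              .append(escape(e.getKey())).append("</choice>\n");
        }
        sb.append("  </menu>\n");
        sb.append("</vxml>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> choices = new LinkedHashMap<>();
        choices.put("sales", "http://example.com/voice/sales.vxml");
        choices.put("service", "http://example.com/voice/service.vxml");
        System.out.print(renderMenu(
            "Welcome to the XYZ company. Would you like sales or service?", choices));
    }
}
```

In a real deployment a page like this would be returned in response to an HTTP request from the voice gateway, which interprets it and plays the prompt to the caller.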

InfoWorld: And how does a typical enterprise company benefit from instituting voice?

Haleftiras: The beauty for enterprise is it turns all wire line and cell phones into enterprise access devices. The benefit is to extend the reach of enterprise services to a much broader audience: customers, partners, suppliers, and employees.

InfoWorld: Where do you go from here?

Haleftiras: Remember, you used to have HTML pages that were static. Now with J2EE and .Net platforms you have server-side processing going on that ties into back-end systems and constructs the HTML dynamically, based on back-end systems or on information gained from the actual interaction with the user. So what we have done is construct a framework that brings all that to the voice world. Literally, we can adapt the voice user interface based either on back-end events or on info gleaned from conversations going on. We can have a conversation, and as we learn more about the individual based on these two things, we can change the conversation.
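
The idea of changing the conversation based on back-end events and on what the caller has said can be sketched in a few lines of Java. This is an assumption-laden illustration, not the magicTalk framework: `CallerContext`, its fields, and the prompt wording are all hypothetical.

```java
// Hypothetical sketch: picks the next spoken prompt from two inputs, a
// back-end fact (renewal due) and something learned during the call.
final class AdaptiveDialogSketch {

    /** Made-up snapshot of what the back end and the call so far tell us. */
    static final class CallerContext {
        final String name;           // from a back-end customer record; may be null
        final boolean renewalDue;    // back-end event
        final boolean askedForAgent; // learned during the conversation

        CallerContext(String name, boolean renewalDue, boolean askedForAgent) {
            this.name = name;
            this.renewalDue = renewalDue;
            this.askedForAgent = askedForAgent;
        }
    }

    /** Chooses the next prompt; conversational cues take priority over back-end ones. */
    static String nextPrompt(CallerContext ctx) {
        if (ctx.askedForAgent) {
            return "One moment while I connect you to an agent.";
        }
        String greeting = (ctx.name == null) ? "Welcome." : "Welcome back, " + ctx.name + ".";
        if (ctx.renewalDue) {
            return greeting + " Your account is due for renewal. Would you like to renew now?";
        }
        return greeting + " Would you like sales or service?";
    }

    public static void main(String[] args) {
        System.out.println(nextPrompt(new CallerContext("Pat", true, false)));
        System.out.println(nextPrompt(new CallerContext(null, false, false)));
        System.out.println(nextPrompt(new CallerContext("Pat", true, true)));
    }
}
```

The chosen text would then be wrapped in a generated VXML page, so the same server-side logic that personalizes a Web page can personalize a phone call.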

InfoWorld: But will the enterprise have to invest even more money in infrastructure?

Haleftiras: This represents bringing voice to the Web infrastructure. The enterprise already has spent huge amounts of money building their Web application infrastructure with racks of servers loaded with WebLogic and WebSphere and supporting both customer- and internal-facing applications. Now they are not only leveraging the hardware and software but also the people, bringing a new dimension of ROI to the infrastructure they invested in.

InfoWorld: So you can leverage the back end, but what does it bring to the front end?

Haleftiras: Even today, with the proliferation of Web and Internet technologies, 76 percent of stakeholder interaction with an enterprise occurs over the phone. Even with the expected growth rates of the Internet and the Web, the phone will remain the predominant channel for the enterprise, and that is why they need to pay close attention to this.

InfoWorld: Let's get into some of the details. You say the magicTalk Enterprise Platform is architected on the J2EE platform. Can you tell the readers what that means to them?

Haleftiras: We represent voice-specific application architecture. We've crafted everything from frameworks to best processes to libraries of reusable componentry that enable a Java developer to build voice applications that can be loaded onto the J2EE app server. J2EE wasn't designed for this, so we provide the bridge between standards-based J2EE and voice apps.

InfoWorld: So a Java developer can add speech components. But Java developers aren't speech technology experts. Don't you still need a speech engineer to design a program that uses a voice response interface?

Haleftiras: Manifest in the Java components are Talklets, dialog components, and within them we provide the prompts and grammars down to the smallest dialog component. If you want to build a credit card app, you need to get digits. We provide a component that is called Get Digits. We provide the No. 1, for example, in a wave file with seven different intonations. Remember, depending on where the one is in a number (at the beginning, middle, or end), it is pronounced in a different way, and that is brought to the Java developer. He doesn't need a Ph.D. in speech to know that. It is incorporated in the reusable componentry, and we are adding more tools to manipulate responses.
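
A rough Java sketch of the position-dependent recording idea follows. It is not the Get Digits component itself: the file-naming scheme is invented, and it simplifies the seven intonations mentioned above to three positions (beginning, middle, end), treating a lone digit as an "end" reading.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: chooses a recording for each digit according to where
// the digit falls in the number being read back to the caller.
final class DigitPromptSketch {

    /** Returns one made-up wave file name per digit, for example digit_1_end.wav. */
    static List<String> promptFiles(String digits) {
        List<String> files = new ArrayList<>();
        int n = digits.length();
        for (int i = 0; i < n; i++) {
            char d = digits.charAt(i);
            if (d < '0' || d > '9') {
                throw new IllegalArgumentException("not a digit: " + d);
            }
            String position;
            if (i == n - 1) {
                position = "end";      // last digit (also covers a single digit)
            } else if (i == 0) {
                position = "begin";    // first digit of a longer number
            } else {
                position = "middle";
            }
            files.add("digit_" + d + "_" + position + ".wav");
        }
        return files;
    }

    public static void main(String[] args) {
        System.out.println(promptFiles("4111"));
    }
}
```

The point of packaging this in a component is that the Java developer calls one method and never has to think about intonation.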

InfoWorld: Is there a J2EE developer kit that works with Sun's J2EE Developer?

Haleftiras: Those are the integrated development environments, like Forte for Java. Our Java components can be introduced into those environments, leveraging what the Java and the Web development community uses.

InfoWorld: Will there be a Web services standard for adding speech applications to a service?

Haleftiras: The standards that are evolving today represent a new programming model, and those standards that are evolving can be accommodated by the voice community today. We are already communicating with large Web services with SOAP [Simple Object Access Protocol].

InfoWorld: What do you see as the benefit of Web services?

Haleftiras: The idea is you can take application functionality from multiple places and string them together and build a virtual process that doesn't exist in any one place. Now think about that Web service with voice communications built around it. This is where it is all going -- as you start building inter-enterprise Web services applications, you wrap those Web services with a conversational voice interface.

InfoWorld: Companies are thinking now about their portal strategy and how they can add depth to portals by making them the integration point behind the UI for many applications. Where does magicTalk fit into that?

Haleftiras: The portal companies ought to be talking to us. Sun and the voice channel should be an addition to those channels. I believe it is imminent. I can't say more than that at this point. Voice should be strategic to any multi-channel addition, and that is what portals are all about. Now with the magicTalk platform it is easy to incorporate voice into an enterprise portal offering.

InfoWorld: Call centers and customer service in general are highly expensive undertakings. GM uses live agents to respond when a customer pushes the blue OnStar button, and it is rumored they are losing money big time on OnStar and looking for more ways to automate it. Where does General Magic fit in?

Haleftiras: I would love to talk to you about OnStar but I can't. Contractually we host OnStar's Virtual Advisor. We are precluded from talking about it. But the call center concept and reducing cost was the original value proposition for the IVR players. We can take that value proposition and make a quantum leap forward. If IVR offloaded some percentage of phone calls, we expect a significantly higher percentage of phone calls to be offloaded due to voice. The IVR players were able to handle fairly straightforward requests for information. Now with the ability to have more complex interactions we can offload those that previously required human interaction. For example, today if you want to do renewals and payments on accounts we turn the interaction into a transaction, when previously you had to get a human involved on this.

InfoWorld: So at the end of the day, is that the benefit of voice technology to the enterprise?

Haleftiras: The call center and automating a larger percentage of calls is the low-hanging fruit and is where everyone starts and is focused. But the bigger picture is about productivity and business process efficiencies, turning a 60-second process into an 8-second process with speed dial to access information. If you have a few hundred thousand employees and each employee has 10 interactions at 130 seconds per interaction per day, if you can save 100 seconds per call, per employee, per day [those are] pretty big numbers. The beauty for the enterprise is they can attack the call center in the near term, but the long term will be in other dimensions. [They can] get a short-term ROI and set themselves up for long-term ROI as they deploy more applications on the same platform.
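
The scale of those numbers is easy to work out. The Java sketch below uses the figures quoted above (10 interactions per employee per day, 100 seconds saved per interaction) and assumes 200,000 employees as a stand-in for "a few hundred thousand"; that head count is an illustrative assumption, not a figure from the interview.

```java
// Hypothetical sketch: back-of-the-envelope time savings per day.
final class TimeSavingsSketch {

    /** Total hours saved per day across the whole work force. */
    static double hoursSavedPerDay(long employees, int interactionsPerDay,
                                   int secondsSavedPerInteraction) {
        long seconds = employees * interactionsPerDay * secondsSavedPerInteraction;
        return seconds / 3600.0;
    }

    public static void main(String[] args) {
        // 200,000 employees is an assumed value; the other two come from the text.
        double hours = hoursSavedPerDay(200_000L, 10, 100);
        System.out.printf("Hours saved per day: %.0f%n", hours);
    }
}
```

Under those assumptions the saving is 200,000,000 seconds, or roughly 55,556 hours, per day.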

InfoWorld: What keeps you up at night?

Haleftiras: Well, in addition to my technology duties I've taken over the enterprise sales team. I am in the trenches and looking at the customer's face, and this is unique for a CTO. We are seeing a lot of activity, but the enterprise sales cycle is a good nine to 12 months plus. GM [General Magic] is 120 people, so the dynamics of all of that keep me up, but I've also learned that technology for the sake of technology doesn't cut it.

Ephraim Schwartz is an editor at large at InfoWorld.
