Microsoft Corp. is about to stir the speech recognition market with the launch of its Speech Server products next week. The
vendor promises speech recognition for the masses, but analysts warn that speech-enabling applications is not easy.
Microsoft Chairman and Chief Software Architect Bill Gates is scheduled to formally launch Speech Server 2004 Standard Edition
and Enterprise Edition at the SpeechTEK conference in San Francisco next week. The launch marks the Redmond, Washington-based
company's entry into the server-based speech recognition market, where it will compete with vendors including Nuance Communications
Inc., ScanSoft Inc. and IBM Corp.
"Our goal is to make speech recognition technologies mainstream," said James Mastan, director of marketing for Microsoft's
Speech Server group. Microsoft plans to do that by making speech recognition cheaper and easier to deploy,
manage, develop and maintain than competing products, he said.
The pitch is simple. Developers can add speech capabilities to existing Web applications based on Microsoft's ASP application
framework by adding code based on XML (Extensible Markup Language) and SALT (Speech Application Language Tags) technologies
using Visual Studio .NET. Speech Server takes calls, communicates with the Web server through XML and SALT, and makes applications
offered online available over the phone, Mastan said.
Speech Server runs on Windows Server 2003. The Enterprise Edition needs to run on a separate physical server while Standard
Edition, designed for small and medium-sized installations, can be placed on the same hardware as the Web server. Microsoft
will recommend configurations and resellers will offer fully configured systems, Mastan said.
Users will like Speech Server because it is familiar, Mastan said. Developers can use Visual Studio, and the server runs just like
any other Microsoft server product. "It is not some black box in a call center that you have to program for in some weird
language and you can't maintain yourself because you don't know how it works," he said.
Microsoft's entry will stir the speech recognition market, according to Yankee Group and Gartner Inc. analysts. However, Microsoft
has to prove itself in the market and users need to be aware that creating a speech recognition system is more complex than
Microsoft makes it sound in its marketing messages, they said.
"Speech applications and a voice user interface are pretty tricky to do. That may well get lost in the first version of the
Microsoft marketing hype that will go out there," said Steve Cramoysan, a principal analyst at Gartner. "If you're going to
use Microsoft Speech Server, use professional services people who know exactly what they're doing."
In a research note last year, Yankee Group Senior Analyst Art Schoeller issued the same warning to potential Speech Server
users. "It is dangerous to imply that any Web developer will speech-enable applications, because not all have proper training
in the best practices for dialog design," he wrote.
Still, Microsoft's entry into the speech recognition market is a significant event, Cramoysan said. "Microsoft will certainly
shake up this market, but I think we're going to be looking at the second and third version of this product when they will
become much more competitive than with this first release of the product," he said.
Nuance, named by Mastan as Microsoft's chief rival, agrees with the analysts and goes a step further. "Microsoft is developing
an inexpensive and easy way for developers to design really bad applications," said Kevin Chatow, principal product manager
at Nuance in Menlo Park, California. Adding speech to Web applications may not result in usable applications, he said.
While Microsoft may like to position Nuance's product as obscure, Chatow pointed out that Nuance supports VoiceXML 2.0, a
recognized standard, and not SALT, which is still making its way through the standards process. Furthermore, the Nuance product
isn't tied to Microsoft technologies; it also works with Java application servers.
Nuance on Tuesday plans to announce the third major release of its Nuance Voice Platform product. Release 3.0 adds support
for Linux, in addition to Windows and Solaris, and a new application design and deployment environment that promises to cut
development costs by about a third, the company said in a statement.
While the Nuance Voice Platform may cost more to acquire than Microsoft's Speech Server, the Microsoft
offering may end up costing more and taking longer to pay for itself because of technology upgrades, development quirks and other
costs associated with setting up and running the product, Chatow said.
Gartner's Cramoysan said that while Microsoft does plan to offer its product at a lower price, that does not necessarily mean it
will cost less over a longer timeframe. "Although Microsoft is talking about fairly aggressive pricing, it is
an unproven product. We would caution people in terms of assuming that it would be lower cost in terms of total cost of ownership,"
Cramoysan said.
About 600 customers participated in Microsoft's Speech Server beta test and 30 in an early adopter program. One early adopter
is Seattle-based Grange Insurance Group, which, with the help of a consultancy, developed a system its customers can
call to check the status of their payments.
Grange uses Microsoft software across its business, except for its policy management system, which runs on an IBM mainframe.
Going to Microsoft for speech recognition was a clear decision, said Ralph Carlile, chief information officer and vice president
of technology at Grange.
"I didn't see any technology out there that I was interested in going after. Other products had high failure rates and did
not offer integration into our back-end systems," he said.
Development of the speech recognition components took only two to three weeks, Carlile said. The company did have some issues
getting a telephony board for the server and hooking that up to its phone system, he said. But now Grange is testing the speech
applications with 750 of its policy holders and the results are good, he said.
Pricing for Microsoft's Speech Server products will be "an order of magnitude lower" than competing products, Mastan said.
Details will be announced next week. In his research note, Yankee Group's Schoeller predicted Microsoft would undercut the competition
by about 30 percent.
Microsoft will offer free 180-day trial versions of its Speech Server software, which will initially be available only in
U.S. English. General availability of the software is expected to be a few weeks after launch.