Mission accomplished

Microsoft InfoPath 2003 brings XML template design to noncoders

The next version of Microsoft Office is, among other things, a family of XML editors. I have discussed the XML modes of Word and Excel (see "XML for the rest of us" and "Exploring XML in Office 11"), and described the newest member of this family, InfoPath 2003, a tool for gathering XML data (see "Ten things to know about Xdocs"). Now that I've had a chance to work with Microsoft InfoPath 2003, Beta 2, its role and value are becoming clearer.

As thousands of recipients of the Office 2003 beta kit have already discovered, InfoPath has none of its behemoth siblings' heft. Yes, it's an Office app, but it's a new Office app, one that doesn't have to haul along 15 years' worth of accumulated cruft. You don't have to think twice about launching it. And it could probably work well as an embeddable component -- an option that many would welcome should Microsoft choose to offer it.

It's best to have a real project in hand when trying out new software, and in this case, I did. I'm working out the details of the XML API that is the Safari Books Online equivalent to the Google and Amazon query interfaces. InfoPath's creator Jean Paoli insists that the product is not a schema designer, and now I can see why. You can use the product to create a simple schema, and it will validate against any schema, but designing a complex schema is beyond its ken.

In my case and in a great many business-relevant scenarios, that detracts little from InfoPath's value. That value is simply stated. InfoPath enables nongeeks to define simple XML templates, then hand them to users who can fill them with real data. Through an iterative process of template refinement and data gathering, designers and users can collaboratively work out how the data wants to be shaped. No programming needs to happen until that cycle yields something worth the investment of programming effort. At that point, the pros can enhance the basic (but standard) XML Schema that InfoPath has produced, arrange fancy ways to distribute InfoPath documents, collect them, and weave those documents into business processes. InfoPath's genius lies not in its individual parts, but in its collaborative whole.

Starting points

When you launch InfoPath's form designer, there are a number of ways to jump-start a project. A prototype of the Safari XML API already exists, so I was able to point InfoPath at a query URL that returns XML results. Like Excel 2003, InfoPath can infer a schema from an XML sample retrieved from the file system or the Web. I separately tested InfoPath's ability to launch a new project based on data retrieved from a Web service. That works too, but only if the service uses document/literal rather than Remote Procedure Call (RPC) encoding.
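To make the idea of inferring a schema from a sample concrete, here is a minimal Python sketch. It is not InfoPath's algorithm, which the product does not expose; it only walks a sample document and records which child elements appear and which ones repeat. The element names (results, book, section, hit) are assumptions loosely modeled on the search-results structure described in this article, not the actual Safari API vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample instance; element names are illustrative only.
SAMPLE = """<results>
  <book><title>A</title><section><hit>x</hit><hit>y</hit></section></book>
  <book><title>B</title><section><hit>z</hit></section></book>
</results>"""


def _merge(target, source):
    """Fold one inferred structure into another, keeping 'repeats' sticky."""
    for tag, info in source.items():
        entry = target.setdefault(tag, {"repeats": False, "children": {}})
        entry["repeats"] = entry["repeats"] or info["repeats"]
        _merge(entry["children"], info["children"])


def infer_structure(elem):
    """Return {child_tag: {"repeats": bool, "children": {...}}} for elem."""
    merged = {}
    counts = {}
    for child in elem:
        counts[child.tag] = counts.get(child.tag, 0) + 1
        entry = merged.setdefault(child.tag, {"repeats": False, "children": {}})
        _merge(entry["children"], infer_structure(child))
    # A tag seen more than once under the same parent is treated as repeating.
    for tag, seen in counts.items():
        if seen > 1:
            merged[tag]["repeats"] = True
    return merged


structure = infer_structure(ET.fromstring(SAMPLE))
```

A real inference engine would also guess data types and optionality. The point here is only that a single sample is enough to recover the nesting and to flag repeating elements, which is the information a form designer needs to suggest tables and repeating sections.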

You can also just fire up a blank form. That's what I ended up doing, although in retrospect -- once I learned how to manipulate InfoPath forms, views, and data sources -- reuse of a sample instance would have been more productive. But starting from scratch was pretty easy, too. A lot of the engineering work that's gone into InfoPath has focused on making forms, views, and data sources work in an extremely natural way for both the designer and the user. That effort has paid off handsomely.

Screen 1 shows the form I built to collect a package of search results and the schema governing that form. The structure is a sequence of books (with metadata), each with a sequence of sections (with metadata), each with a sequence of hits. You create the form by dragging elements from the schema window to the canvas. If you've defined an element as repeating, InfoPath suggests an appropriate container such as a table or a repeating section. When you specify font and layout properties, InfoPath applies them to the rendered form through which users enter data and to the HTML views it can optionally emit.

You can also specify validation properties, though these, I'm sorry to say, do not become part of the schema that InfoPath writes. Compare the Pubdate and Publisher fields in Screen 1. Pubdate, which uses a calendar control to receive input, corresponds to an xsd:date in the schema. But Publisher, which is constrained to a list of names in the InfoPath designer, maps only to an xsd:string, not to the enumerated set of values that XML Schema is capable of representing.
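For readers who have not seen it, the enumerated type that XML Schema can express looks roughly like what the following Python sketch generates. This is the kind of fragment a developer might add by hand to the schema InfoPath writes; it is not something InfoPath Beta 2 emits. The type name PublisherType and the publisher values are placeholders.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
# Ask ElementTree to serialize the schema namespace with the usual xsd: prefix.
ET.register_namespace("xsd", XSD_NS)


def enumerated_string_type(type_name, allowed_values):
    """Build an xsd:simpleType that restricts xsd:string to a fixed list."""
    simple_type = ET.Element(f"{{{XSD_NS}}}simpleType", {"name": type_name})
    restriction = ET.SubElement(
        simple_type, f"{{{XSD_NS}}}restriction", {"base": "xsd:string"}
    )
    for value in allowed_values:
        ET.SubElement(restriction, f"{{{XSD_NS}}}enumeration", {"value": value})
    return simple_type


# Hypothetical type name and values, for illustration only.
publisher_type = enumerated_string_type(
    "PublisherType", ["Publisher A", "Publisher B"]
)
print(ET.tostring(publisher_type, encoding="unicode"))
```

With a type like this in the schema, any validating XML tool, not just InfoPath, could reject a Publisher value outside the list.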

When the first version of my form was ready, I saved it as an .xsn file, which is just another name for a .cab archive. It contains a manifest, sample data (if any), the schema, one or more XSLT transformations corresponding to the views defined in the project, and a JavaScript shell for scripted extensions. Everything conforms to well-known XML and Web standards. You can publish the .xsn file to a shared drive, SharePoint, or any Web site. An InfoPath user who launches the file can begin entering data, as shown in Screen 2. The form grows hierarchically. In my Safari query example the user can add a new package of results for another book, or another section's worth of results to an existing book, or more hits to an existing section. Choosing File>Save yields a pure XML data file, which can in turn be launched into InfoPath. The schema and view-producing XSLT transformations are referenced from the XML, but the separation of data from presentation is cleanly and strictly enforced.

The paragraphs of text in Screen 2 are bound to a rich-text edit control. It delivers well-formed XHTML output, which InfoPath stores in a properly namespaced element in the XML data file. That's the good news. The bad news is that this editor is otherwise not much of an improvement over the widely used and much-maligned DHTML edit control, which emits horrid HTML cluttered with inline font tags and other junk. (These complaints also apply to InfoPath's CSS-less HTML renderings of entire views.) I suppose InfoPath does not want to steal Word 2003's thunder, but a more competent editing widget would be a welcome addition. Perhaps Microsoft intends to leave that door open for third-party components from Ektron, Altova, and others.

Alternative views

One of InfoPath's delights is the ease with which you can create multiple live-editable views of your data. Screen 3 shows a simplified view that omits the book metadata to focus more clearly on the structure of the search results. Screen 4 shows a titles-only view. The user can switch among these views by selecting them from InfoPath's View menu. To create a view, you just open a new canvas, then repeat the process of dragging elements from the data source and binding them to controls. InfoPath automatically writes the XSLT transformations that produce the view. One caveat: If you restructure an element that's included in more than one view, you'll have to restructure it in every view. An XML element is reusable across a whole InfoPath project, but the binding of a data element to a display widget is not. The latter capability would be a nice enhancement.
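The notion of several views over one data file can be sketched in a few lines of Python. InfoPath does this with generated XSLT; the plain ElementTree code below merely stands in for those transformations to show that each view is a different projection of the same XML. The instance document and its element names (results, book, title, publisher, section, heading, hit) are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical data file; the real form's element names are not shown here.
INSTANCE = """<results>
  <book>
    <title>First Book</title>
    <publisher>Example Press</publisher>
    <section><heading>Chapter 1</heading><hit>alpha</hit><hit>beta</hit></section>
  </book>
  <book>
    <title>Second Book</title>
    <publisher>Example Press</publisher>
    <section><heading>Preface</heading><hit>gamma</hit></section>
  </book>
</results>"""


def titles_only(root):
    """Titles-only view: one title per book, nothing else."""
    return [book.findtext("title", default="") for book in root.iter("book")]


def results_outline(root):
    """Simplified view: skip book metadata, keep the shape of the results."""
    outline = []
    for book in root.iter("book"):
        for section in book.findall("section"):
            outline.append(
                (
                    book.findtext("title", default=""),
                    section.findtext("heading", default=""),
                    len(section.findall("hit")),
                )
            )
    return outline


root = ET.fromstring(INSTANCE)
titles = titles_only(root)
outline = results_outline(root)
```

Neither function touches the data file itself, which is the same separation of data from presentation that InfoPath enforces.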

Saving InfoPath-generated XML in a plain text file is extremely useful. It doesn't matter whether you post or e-mail that file; either way, another user who has access to the .xsn file can view and edit the data. InfoPath's schema-, XSLT-, and script-based validation tools conspire to preserve the data's fidelity. You can also arrange to post results back to a Web service or a server that receives raw HTTP POST requests. Using the latter (and simpler) strategy, I easily arranged to route InfoPath postings to an indexed column of a Virtuoso database.
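As a rough illustration of the simpler strategy, the Python sketch below shows a server that accepts a raw HTTP POST whose body is an XML document. It is not the setup I used (it stores nothing in a database), and the handler name, port, and response text are all assumptions; it only checks that the posted body is well-formed and reports what it received.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import xml.etree.ElementTree as ET


def summarize_submission(body):
    """Parse posted XML; return (root tag, number of direct child elements)."""
    root = ET.fromstring(body)
    return root.tag, len(root)


class FormPostHandler(BaseHTTPRequestHandler):
    """Hypothetical receiver for form data submitted as a raw XML POST."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            tag, count = summarize_submission(body)
        except ET.ParseError:
            self.send_error(400, "Body is not well-formed XML")
            return
        reply = f"received <{tag}> with {count} child elements\n".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)


def make_server(port=8000):
    """Create (but do not start) a local server using the handler above."""
    return HTTPServer(("127.0.0.1", port), FormPostHandler)
```

Calling make_server().serve_forever() would start listening on the assumed local port. A production receiver would go on to validate the document against the form's schema and hand it to a database or workflow.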

InfoPath is scriptable in a browserlike way. It presents a Document Object Model that you can manipulate from JavaScript. As has been widely noted, .NET programmers will be disappointed to find no managed-code hooks. There are lots of ways to get at InfoPath's XML data, though, and it's important to note that the product does not mainly target developers. Its unique mission is to empower business users to design, gather, view, and exchange packets of XML data. It succeeds on those terms, and in so doing, defines a new and strategic category of desktop software. Built on open standards, it invites competitors to step up to the plate -- and I hope they will. What InfoPath does will come to be seen as an essential function of the decentralized business Web.

Copyright © 2003 IDG Communications, Inc.
