November 15, 2002

XML for the rest of us

Microsoft Office 11's XML capabilities contain the seeds of a revolution in enterprise content management

Office 11 doesn't help you write your schemas. That is both a science and an art, and something that few outside the XML development community have attempted. But once you have a schema, no programming skill is needed to bind it to a document or to enforce the constraints expressed by the schema. In the résumé example, those constraints were trivial: A user of the document who typed nondigits into the YearFrom or YearTo elements would be alerted and could not save the document until these elements were written as the integers required by the schema. But this humble example has profound implications. Consider the InfoWorld story shown in the screen shot. It's written in Word but backed by a schema that enumerates the set of allowable author names, limits the length of headlines and of the main story, and disallows Greek symbols. The story as shown violates two of those constraints: It includes a Greek letter, and the author's name, misspelled, fails to match the enumerated set of allowed names. Word 11 reports the infractions as they occur and stops complaining as soon as they are corrected.
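To make the kind of constraint described above concrete, here is a minimal sketch in Python. It does not use XML Schema or anything from Office; it hand-codes three checks comparable to the ones the story's schema is said to express. The element names (Author, Headline, Body), the author list, and the headline limit are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical values; a real schema would declare these as an
# enumeration and a maxLength facet.
ALLOWED_AUTHORS = {"Pat Example", "Lee Sample"}
MAX_HEADLINE_CHARS = 60


def contains_greek(text):
    """True if any character falls in the Greek and Coptic block."""
    return any("\u0370" <= ch <= "\u03ff" for ch in text)


def check_story(xml_text):
    """Return a list of human-readable constraint violations."""
    root = ET.fromstring(xml_text)
    problems = []

    author = (root.findtext("Author") or "").strip()
    if author not in ALLOWED_AUTHORS:
        problems.append(f"Author {author!r} is not an allowed name")

    headline = root.findtext("Headline") or ""
    if len(headline) > MAX_HEADLINE_CHARS:
        problems.append("Headline is too long")

    body = root.findtext("Body") or ""
    if contains_greek(headline) or contains_greek(body):
        problems.append("Greek symbols are not allowed")

    return problems


# A story with a misspelled author name and a Greek letter in the body.
BAD_STORY = (
    "<Story><Author>Pat Exampel</Author>"
    "<Headline>XML for the rest of us</Headline>"
    "<Body>The \u03b1 release is promising.</Body></Story>"
)
GOOD_STORY = (
    "<Story><Author>Pat Example</Author>"
    "<Headline>XML for the rest of us</Headline>"
    "<Body>The alpha release is promising.</Body></Story>"
)

for problem in check_story(BAD_STORY):
    print(problem)
```

The point of the article stands in contrast to this sketch: with a schema bound to the document, the same rules are declared once and enforced by the editor, with no code like this to write or maintain.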

Once valid, the document can be saved as XML in two ways. The default is to create WordML, which preserves Word's styles and formatting in an XML namespace that's separate from the one bound to the schema-controlled data. You can optionally save through an XSLT transformation that, in a publish-to-the-Web scenario, could translate WordML formatting into HTML/CSS formatting. Alternatively, if you tick the Save as Data option, you can instead save just the raw XML data. In that case, you can bind one or more XSLT stylesheets to the document, each of which can generate WordML styles and formatting.
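The separation of formatting and data into distinct namespaces is what makes a data-only save possible in principle. The following Python sketch illustrates the idea on a toy document. Both namespace URIs and all element names are hypothetical stand-ins, not the actual WordML vocabulary, and the function is not how Word implements Save as Data.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaces, chosen for illustration only.
FORMAT_NS = "urn:example:formatting"
DATA_NS = "urn:example:resume"

MIXED = f"""
<f:doc xmlns:f="{FORMAT_NS}" xmlns:d="{DATA_NS}">
  <f:p><f:bold><d:Name>Ada Lovelace</d:Name></f:bold></f:p>
  <f:p><d:YearFrom>1999</d:YearFrom> to <d:YearTo>2002</d:YearTo></f:p>
</f:doc>
"""


def extract_data(mixed_xml, root_name="Resume"):
    """Keep only elements in the data namespace, dropping formatting.

    Simplification: data elements are copied as flat children of a new
    root, so this only suits documents whose data elements are leaves.
    """
    source = ET.fromstring(mixed_xml)
    out = ET.Element(root_name)
    prefix = "{" + DATA_NS + "}"
    for element in source.iter():
        if element.tag.startswith(prefix):
            child = ET.SubElement(out, element.tag[len(prefix):])
            child.text = element.text
    return out


print(ET.tostring(extract_data(MIXED), encoding="unicode"))
```

Because the two vocabularies never share a namespace, the formatting wrapper can be discarded or regenerated by a stylesheet without touching the schema-controlled data.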

The XML expertise needed to create schemas and XSLT transformations is scarce today. Once Office 11 hits the streets, its mainstream applications could arguably commoditize those XML skills more quickly and broadly than Web services technologies have. What's more, Office is positioned as a bridge between the worlds of desktop applications and Web services. In the emerging architecture of the business Web, XML-wrapped remote procedure calls are giving way to XML documents. SOAP, we'll soon see, isn't just a way for services to talk to one another. A purchase order acquired from a Web service by means of a SOAP call will sometimes need to be modified by a person. The application used to edit that purchase order will have to be a familiar tool. It will also have to guarantee that the document it passes along contains well-structured, valid, and thus enterprise-ready data.

Office 11 appears to meet both of these requirements. And it does so in ways that respect the inherent strengths of the applications. Displayed in Word, an electronic purchase order can reflect its paper-based legacy by exploiting Word's formatting power. Instances of that same document, brought into Excel, can feed the analytical functions that are Excel's specialty. When XML data has a regular structure that maps naturally to a grid, Excel 11 can make that data immediately available for columnwise sorting, charts, and pivot tables. Here, in fact, is a case where Microsoft has put XSLT's basic XML-shredding capability into the hands of a nonprogrammer. Absent a schema, Excel 11 can still infer structure from raw XML data. When we pointed it at an XML data dump taken from a back-office system, it automatically proposed a structure. We were then able to populate a spreadsheet template with selected elements, reorder them at will, and define a mapped region into which a subset of our data could be imported. We previously had to write XPath expressions to target elements and XSLT code to rearrange them. Excel 11 makes that an interactive task that any user can perform.
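For a sense of the hand-written work that this interactive mapping replaces, here is a small Python sketch of the same steps: infer column names from a schema-less dump, pick and reorder a subset of them, and sort the resulting grid by one column. The sample data and element names are invented for illustration; this is not Excel's inference algorithm.

```python
import xml.etree.ElementTree as ET

# Invented sample standing in for a back-office data dump.
DUMP = """
<Orders>
  <Order><Id>17</Id><Customer>Acme</Customer><Total>250.00</Total></Order>
  <Order><Id>4</Id><Customer>Zenith</Customer><Total>75.50</Total></Order>
  <Order><Id>9</Id><Customer>Acme</Customer><Total>120.25</Total></Order>
</Orders>
"""


def infer_columns(root):
    """Collect child element names across all records, in first-seen order."""
    columns = []
    for record in root:
        for field in record:
            if field.tag not in columns:
                columns.append(field.tag)
    return columns


def to_rows(root, columns):
    """Build one row per record, with a cell for each requested column."""
    return [
        [record.findtext(column, default="") for column in columns]
        for record in root
    ]


root = ET.fromstring(DUMP)
columns = infer_columns(root)      # proposed structure
selected = ["Customer", "Total"]   # the subset we chose, reordered at will
rows = to_rows(root, selected)
rows.sort(key=lambda row: float(row[1]), reverse=True)  # columnwise sort

print(columns)
for row in rows:
    print(row)
```

Each path passed to `findtext` is a tiny XPath expression, which is the kind of targeting that the mapped-region feature lets a user perform by pointing and dragging instead.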

Jean Paoli is wildly enthusiastic about what all this will mean. We share his excitement. Empowering ordinary users to create and interact with XML data is a huge step forward. It's too bad that Outlook hasn't been given the same treatment as Word and Excel. Most of us do a lot more communicating than document processing or number crunching. We'd like to see e-mail become a natively structured and manageable data type, too. Meanwhile, we'll have our hands full just exploring the new vistas opened up by the XML features of the new versions of Word and Excel.

©1994-2009 Infoworld, Inc.