INFOWORLD TEST CENTER

 
XML for the rest of us

By Jon Udell
November 15, 2002


JEAN PAOLI, XML architect at Microsoft, is a man on a mission. A former developer of SGML tools, he joined Microsoft in 1996 and co-edited the first XML specification in 1998. All along, he has dreamed of building software that would make it easy for ordinary folks to create, edit, and analyze structured and semistructured data. Now, finally, his vision is coming into focus.

The first public beta of Microsoft Office 11 demonstrates, as promised, that XML has become a native Office file format. What's more, Word 11 and Excel 11 can associate documents with data definitions written in XML Schema, and they can interactively validate documents against schemas. These are transforming achievements. Previous Office upgrades have been yawners, but version 11 should rivet the attention of IT planners.

We've known for many years that most of our vital information lives in documents, not databases. XML was supposed to help us capture the implicit structure of ordinary business documents (memos, expense reports) and make it explicit. Sets of such documents would then form a kind of virtual database. The cost to search, correlate, and recombine the XML-ized data would fall dramatically, and its value would soar. It was a great idea, but until the tools used to create memos and expense reports became deeply XML-aware, it was stillborn. XML did, of course, thrive in another and equally important way. It became the exchange format of enterprise databases and the lingua franca of Web services. Now Office 11 wants to erase the differences between XML documents written and read by people using desktop applications, and XML documents produced and consumed by databases and Web services. This is a really big deal.

The first beta of Office 11 doesn't include any demonstrations of the new XML features, but the Office team put together some examples for us, and Jean Paoli talked us through them. We started with a résumé template written in Word 11. Today we use such templates mainly to control the appearance of documents. If we also want to control their content, we can ask developers to write macros that enforce business rules. In principle, a company could publish a résumé template that would, for example, require job seekers to describe past experience in terms of a controlled vocabulary. In practice, that rarely happens. Procedural code to enforce such constraints is hard to write and even harder to reuse. With Word 11, you can attack this problem by defining a schema and mapping its elements to a résumé template.

In the résumé example, we associated a schema with a sample résumé, using the Templates and Add-ins dialog. A new task pane called XML Structure then appeared, displaying a single root element named Résumé. We selected it, and chose the option Apply to Whole Document. Now subelements named Objective, Experience, and Education appeared in the task pane. Mapping these to regions of the sample résumé revealed deeper structure until the entire schema was finally mapped.

Another example illustrated the same scenario for Excel. Here, the fields defining an expense report were captured in a schema, then mapped to an expense report. Once we saw how it worked, we were able to apply the same concept to our existing InfoWorld spreadsheet. After writing a simple schema, we dragged elements from the XML Structure pane onto the spreadsheet to bind named schema elements to numbered cells.
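To make the idea of binding named schema elements to numbered cells concrete, here is a minimal Python sketch. It is not how Excel 11 works internally; it only illustrates the concept. The element names (ExpenseReport, Employee, Date, Amount), the cell addresses, and the sample values are all hypothetical, since the schema we used is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical binding of schema element names to spreadsheet cell addresses.
CELL_MAP = {
    "Employee": "B1",
    "Date": "B2",
    "Amount": "B3",
}

def export_mapped_cells(cells, cell_map, root_tag="ExpenseReport"):
    """Build an XML element from the cells that the map binds to element names."""
    root = ET.Element(root_tag)
    for element_name, address in cell_map.items():
        ET.SubElement(root, element_name).text = str(cells[address])
    return root

# A toy worksheet: column A holds labels, column B holds the values.
sheet = {
    "A1": "Employee", "B1": "Pat Example",
    "A2": "Date", "B2": "2002-11-15",
    "A3": "Amount", "B3": 42.5,
}

report = export_mapped_cells(sheet, CELL_MAP)
print(ET.tostring(report, encoding="unicode"))
```

Only the mapped cells reach the XML; the label cells in column A are ignored because nothing binds them to an element.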

Office 11 doesn't help you write your schemas. That is both a science and an art, and something that few outside the XML development community have attempted. But once you have a schema, no programming skill is needed to bind it to a document or to enforce the constraints expressed by the schema. In the résumé example, those constraints were trivial: A user of the document who typed nondigits into the YearFrom or YearTo elements would be alerted and could not save the document until these elements were written as the integers required by the schema. But this humble example has profound implications. Consider the InfoWorld story shown in the screen shot. It's written in Word but backed by a schema that enumerates the set of allowable author names, limits the length of headlines and of the main story, and disallows Greek symbols. The story as shown violates two of those constraints: It includes a Greek letter and the author's name, misspelled, fails to match the enumerated set of allowed names. Word 11 reports the infractions as they occur and stops complaining as soon as they are corrected.
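The kinds of constraints described for the InfoWorld story can be sketched in a few lines of Python. This is a hand-rolled stand-in, not XML Schema validation and not what Word 11 does; in practice a schema would declare these rules and a validating parser would enforce them. The element names (Story, Author, Headline, Body), the allowed-author set, and the headline limit are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical constraint values; the real schema is not shown in the article.
ALLOWED_AUTHORS = {"Jon Udell"}
MAX_HEADLINE_CHARS = 60

def contains_greek(text):
    """True if any character falls in the Greek and Coptic Unicode block."""
    return any("\u0370" <= ch <= "\u03ff" for ch in text)

def validate_story(xml_text):
    """Return a list of human-readable constraint violations (empty if valid)."""
    root = ET.fromstring(xml_text)
    problems = []
    author = root.findtext("Author", default="").strip()
    if author not in ALLOWED_AUTHORS:
        problems.append(f"Author {author!r} is not in the allowed set")
    headline = root.findtext("Headline", default="")
    if len(headline) > MAX_HEADLINE_CHARS:
        problems.append("Headline is too long")
    if any(contains_greek(text) for text in root.itertext()):
        problems.append("Document contains a Greek letter")
    return problems

# A story with a misspelled author name and a Greek alpha (\u03b1) in the body.
STORY = """<Story>
  <Author>Jon Udel</Author>
  <Headline>XML for the rest of us</Headline>
  <Body>Office 11 is the \u03b1 release of a new idea.</Body>
</Story>"""

for problem in validate_story(STORY):
    print(problem)
```

As in the Word 11 example, the sample story breaks two rules, and the list of problems becomes empty once the author's name is spelled correctly and the Greek letter is removed.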

Once valid, the document can be saved as XML in two ways. The default is to create WordML, which preserves Word's styles and formatting in an XML namespace that's separate from the one bound to the schema-controlled data. You can optionally save through an XSLT transformation which, in a publish-to-the-Web scenario, could translate WordML formatting into HTML/CSS formatting. Alternatively, if you tick the Save as Data option, you can instead save just the raw XML data. In that case, you can bind one or more XSLT stylesheets to the document, each of which can generate WordML styles and formatting.
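The separation of formatting and data into two namespaces is what makes a data-only save possible. The Python sketch below illustrates the principle with a deliberately simplified document. The namespace URIs, the `document` and `para` elements, and the sample values are invented for the example and are not actual WordML.

```python
import xml.etree.ElementTree as ET

FMT_NS = "urn:example:formatting"  # hypothetical stand-in for the formatting namespace
DATA_NS = "urn:example:resume"     # hypothetical namespace of the custom schema

MIXED = f"""<f:document xmlns:f="{FMT_NS}" xmlns:r="{DATA_NS}">
  <r:Resume>
    <f:para style="Heading1"><r:Objective>Technical editor</r:Objective></f:para>
    <f:para><r:Education>B.A., English</r:Education></f:para>
  </r:Resume>
</f:document>"""

def data_only(element):
    """Copy the data-namespace elements under element, dropping formatting wrappers.

    Simplification: text that sits directly inside a formatting wrapper is discarded.
    """
    kept = []
    for child in element:
        if child.tag.startswith("{" + DATA_NS + "}"):
            copy = ET.Element(child.tag.split("}", 1)[1])
            copy.text = (child.text or "").strip() or None
            copy.extend(data_only(child))
            kept.append(copy)
        else:
            # A formatting wrapper: skip it, but keep any data elements inside.
            kept.extend(data_only(child))
    return kept

root = ET.fromstring(MIXED)
(resume,) = data_only(root)
print(ET.tostring(resume, encoding="unicode"))
```

Because every element declares which namespace it belongs to, the filter needs no knowledge of the formatting vocabulary. It keeps whatever belongs to the data namespace and discards the rest.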

The XML expertise needed to create schemas and XSLT transformations is scarce today. Once Office 11 hits the streets, its mainstream applications could arguably commoditize those XML skills more quickly and broadly than have Web services technologies. What's more, Office is positioned as a bridge between the worlds of desktop applications and Web services. In the emerging architecture of the business Web, XML-wrapped remote procedure calls are giving way to XML documents. SOAP, we'll soon see, isn't just a way for services to talk to one another. A purchase order acquired from a Web service by means of a SOAP call will sometimes need to be modified by a person. The application used to edit that purchase order will have to be a familiar tool. It will also have to guarantee that the document it passes along contains well-structured, valid, and thus enterprise-ready data.

Office 11 appears to meet both of these requirements. And it does so in ways that respect the inherent strengths of the applications. Displayed in Word, an electronic purchase order can reflect its paper-based legacy by exploiting Word's formatting power. Instances of that same document, brought into Excel, can feed the analytical functions that are Excel's specialty. When XML data has a regular structure that maps naturally to a grid, Excel 11 can make that data immediately available for columnwise sorting, charts, and pivot tables. Here, in fact, is a case where Microsoft has put XSLT's basic XML-shredding capability into the hands of a nonprogrammer. Absent a schema, Excel 11 can still infer structure from raw XML data. When we pointed it at an XML data dump taken from a back-office system, it automatically proposed a structure. We were then able to populate a spreadsheet template with selected elements, reorder them at will, and define a mapped region into which a subset of our data could be imported. We previously had to write XPath expressions to target elements and XSLT code to rearrange them. Excel 11 makes that an interactive task that any user can perform.
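For comparison, here is roughly what the programmer's route to the same result looks like: a short Python sketch that guesses the repeating record in a raw XML dump, flattens it into rows and columns, and sorts on one column. The heuristic (treat the most common child of the root as the row element) is our assumption and not a description of how Excel 11 infers structure. The Orders data is made up, and the sketch assumes a nonempty, regular dump.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# A made-up dump with a regular, grid-like structure.
DUMP = """<Orders>
  <Order><Id>17</Id><Customer>Acme</Customer><Total>250.00</Total></Order>
  <Order><Id>18</Id><Customer>Zenith</Customer><Total>75.50</Total></Order>
  <Order><Id>19</Id><Customer>Acme</Customer><Total>120.00</Total></Order>
</Orders>"""

def infer_grid(xml_text):
    """Guess the repeating row element and return (columns, rows)."""
    root = ET.fromstring(xml_text)
    # Assumption: the most common child tag of the root is the row element.
    row_tag, _ = Counter(child.tag for child in root).most_common(1)[0]
    columns, rows = [], []
    for record in root.findall(row_tag):
        row = {field.tag: (field.text or "") for field in record}
        for name in row:
            if name not in columns:
                columns.append(name)  # keep columns in first-seen order
        rows.append(row)
    return columns, rows

columns, rows = infer_grid(DUMP)

# Columnwise sort, largest total first.
by_total = sorted(rows, key=lambda r: float(r["Total"]), reverse=True)
for row in by_total:
    print([row[name] for name in columns])
```

Everything here (choosing the row element, picking columns, reordering) is what Excel 11 turns into drag-and-drop operations on a mapped region.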

Jean Paoli is wildly enthusiastic about what all this will mean. We share his excitement. Empowering ordinary users to create and interact with XML data is a huge step forward. It's too bad that Outlook hasn't been given the same treatment as Word and Excel. Most of us do a lot more communicating than document processing or number crunching. We'd like to see e-mail become a natively structured and manageable data type, too. Meanwhile, we'll have our hands full just exploring the new vistas opened up by the XML features of the new versions of Word and Excel.




  BOTTOM LINE
Microsoft Office 11 and XML
EXECUTIVE SUMMARY
In Office 11, Word and Excel can display, edit, and save XML documents. Using XML Schema definitions bound to these documents, enterprise architects can for the first time ensure that users of common desktop applications will create and maintain high-quality, integration-ready data.

TEST CENTER PERSPECTIVE
In a dramatic breakthrough, Office 11's XML features target end-users with no knowledge of XML. Users of Word and Excel will be most productive when supported by developers who can fluently define data models, using XML Schema, and write XML transformations, using XSLT.

