Mapping XML to Java, Part 1

Employ the SAX API to map XML documents to Java objects

Page 2 of 4

First, for each element we are interested in mapping to Java, we will reset our collection buffer in the startElement SAX event handler. Then, when startElement for a tag has occurred but endElement has not, we will collect the characters presented by the characters SAX event. Finally, when the endElement for the tag has occurred, we will store the collected characters in the appropriate field of a Java object.

Below you'll find the sample data for our hello world example:

<?xml version="1.0"?>
<simple date="7/7/2000" >
   <name> Bob </name>
   <location> New York </location>
</simple>

Here's the source listing for the XML mapping code of the hello world example:

import org.xml.sax.*;
import org.xml.sax.helpers.*;
import java.io.*;
public class Example2 extends DefaultHandler {
   // Local variables to store data
   // found in the XML document
   public  String  name       = "";
   public  String  location   = "";
   // Buffer for collecting data from
   // the "characters" SAX event.
   private CharArrayWriter contents = new CharArrayWriter();
   // Override methods of the DefaultHandler class
   // to gain notification of SAX Events.
   //
   // See org.xml.sax.ContentHandler for all available events.
   //
   public void startElement( String namespaceURI,
              String localName,
              String qName,
              Attributes attr ) throws SAXException {
      contents.reset();
   }
   public void endElement( String namespaceURI,
              String localName,
              String qName ) throws SAXException {
      if ( localName.equals( "name" ) ) {
         name = contents.toString();
      }
      if ( localName.equals( "location" ) ) {
         location = contents.toString();
      }
   }
   public void characters( char[] ch, int start, int length )
                  throws SAXException {
      contents.write( ch, start, length );
   }
   public static void main( String[] argv ){
      System.out.println( "Example2:" );
      try {
         // Create SAX 2 parser...
         XMLReader xr = XMLReaderFactory.createXMLReader();
         // Set the ContentHandler...
         Example2 ex2 = new Example2();
         xr.setContentHandler( ex2 );
         // Parse the file...
         xr.parse( new InputSource(
               new FileReader( "Example2.xml" )) );
         // Say hello...
         System.out.println( "Hello World from " + ex2.name
                              + " in " + ex2.location );
      }catch ( Exception e )  {
         e.printStackTrace();
      }
   }
}

The following is the output of our hello world example:

Example2:
Hello World from  Bob  in  New York

This is not the simplest hello world program ever written. As such, there are several things worth noting in the example code.

First, the code demonstrates some of the bad features of event-driven code. Things get tricky when event-driven code needs to respond to a pattern of events instead of just a single event. In this specific case, we are looking for a pattern of SAX events that mark the name and location of our simple XML document.

The tagged content is presented in the characters SAX event; the tags themselves are spread between the startElement and endElement SAX events. I got around this in the hello world example by coordinating around the contents buffer, which startElement always resets. The endElement handler assumes that the contents have been collected and assigns them to the appropriate field. This is not a bad pattern, but it assumes that no two fields of a Java object are fed by the same tag name, which is not always a valid assumption. We will address this issue later.

Another interesting feature of the example code is the use of a contents buffer, which guards against a little SAX gotcha. You could create a string directly in the characters SAX event instead of copying the characters to a buffer as in the example. But that ignores the fact that the SAX specification allows the XML parser to deliver one run of character data through multiple calls to characters(). Building the string from a single call will lose data if the text between two tags is large, or if the buffering of the stream feeding the XML parser breaks in the middle of the text you are collecting. Also, reusing a buffer is much more efficient than constantly creating new strings.
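The gotcha is easy to reproduce without a real parser. The sketch below is not part of the original examples: it drives two handlers by hand and deliberately splits the text "Bob Hustead" across two characters() calls, the way a parser is permitted to. The class and method names (CharactersChunkDemo, NaiveHandler, BufferedHandler, feed) are made up for this illustration.

```java
import java.io.CharArrayWriter;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class CharactersChunkDemo {

   // Naive approach: build the string directly in characters().
   // Each call overwrites whatever the previous call delivered.
   static class NaiveHandler extends DefaultHandler {
      String name = "";

      @Override
      public void characters( char[] ch, int start, int length ) {
         name = new String( ch, start, length );
      }
   }

   // Buffered approach, as in Example2: reset on startElement,
   // append on every characters() call, read on endElement.
   static class BufferedHandler extends DefaultHandler {
      String name = "";
      private final CharArrayWriter contents = new CharArrayWriter();

      @Override
      public void startElement( String namespaceURI, String localName,
                                String qName, Attributes attr ) {
         contents.reset();
      }

      @Override
      public void characters( char[] ch, int start, int length ) {
         contents.write( ch, start, length );
      }

      @Override
      public void endElement( String namespaceURI, String localName,
                              String qName ) {
         if ( localName.equals( "name" ) ) {
            name = contents.toString();
         }
      }
   }

   // Hypothetical stand-in for a parser: delivers the text of a
   // <name> element in two chunks, "Bob " and "Hustead".
   static void feed( DefaultHandler handler ) {
      char[] text = "Bob Hustead".toCharArray();
      try {
         handler.startElement( "", "name", "name", new AttributesImpl() );
         handler.characters( text, 0, 4 );
         handler.characters( text, 4, text.length - 4 );
         handler.endElement( "", "name", "name" );
      } catch ( SAXException e ) {
         throw new IllegalStateException( e );
      }
   }

   public static void main( String[] argv ) {
      NaiveHandler naive = new NaiveHandler();
      BufferedHandler buffered = new BufferedHandler();
      feed( naive );
      feed( buffered );
      System.out.println( "naive:    [" + naive.name + "]" );
      System.out.println( "buffered: [" + buffered.name + "]" );
   }
}
```

Run as written, the naive handler ends up holding only the last chunk ("Hustead"), while the buffered handler holds the complete "Bob Hustead". Where a real parser splits the text depends on the parser and on its input buffering, so the naive version can appear to work on small test files and then fail on larger ones.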

Mapping our first Java object

Now that we've gotten through hello world, let's try a more useful example that maps an XML document to a Java object. This example is similar to hello world, but maps data to a single object and has an accessor for the object, a useful SAX pattern that recurs in the rest of the examples. Unlike objects returned by a constructor or a factory method, objects mapped by a SAX handler are not available until parsing has finished. A clean way to deal with this difference is to provide access methods from the mapping class to the finished mapped object. That way, you create the mapping class, attach it to an XMLReader, parse the XML, and then call the accessor to get a reference to the mapped object. A variation of this theme is to supply a set method and hand the mapping class the object to be populated just before parsing.
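The set-method variation might look something like the following sketch. It is an illustration under assumptions, not one of the article's listings: the class name SetterMappingSketch, the nested Customer stand-in, and the mapInto helper are all hypothetical. It also obtains its XMLReader through SAXParserFactory rather than XMLReaderFactory (which is deprecated as of Java 9), parses from a string instead of a file, and trims the collected text, none of which the original examples do.

```java
import java.io.CharArrayWriter;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SetterMappingSketch extends DefaultHandler {

   // Hypothetical stand-in for the article's common.Customer class.
   public static class Customer {
      public String firstName = "";
      public String lastName  = "";
   }

   // The object to populate; supplied by the caller before parsing.
   private Customer cust;
   private final CharArrayWriter contents = new CharArrayWriter();

   public void setCustomer( Customer cust ) {
      this.cust = cust;
   }

   @Override
   public void startElement( String namespaceURI, String localName,
                             String qName, Attributes attr ) {
      contents.reset();
   }

   @Override
   public void characters( char[] ch, int start, int length ) {
      contents.write( ch, start, length );
   }

   @Override
   public void endElement( String namespaceURI, String localName,
                           String qName ) {
      // trim() is an addition for this sketch; the article keeps
      // the surrounding whitespace.
      if ( localName.equals( "FirstName" ) ) {
         cust.firstName = contents.toString().trim();
      }
      if ( localName.equals( "LastName" ) ) {
         cust.lastName = contents.toString().trim();
      }
   }

   // Hypothetical helper: populate an existing Customer from XML text.
   public static void mapInto( Customer target, String xml ) {
      try {
         SAXParserFactory factory = SAXParserFactory.newInstance();
         factory.setNamespaceAware( true ); // so localName is populated
         XMLReader xr = factory.newSAXParser().getXMLReader();
         SetterMappingSketch handler = new SetterMappingSketch();
         handler.setCustomer( target );     // supply the object first...
         xr.setContentHandler( handler );
         xr.parse( new InputSource( new StringReader( xml ) ) ); // ...then parse
      } catch ( Exception e ) {
         throw new IllegalStateException( "Mapping failed", e );
      }
   }

   public static void main( String[] argv ) {
      Customer cust = new Customer();
      mapInto( cust, "<customer><FirstName> Bob </FirstName>"
                   + "<LastName> Hustead </LastName></customer>" );
      System.out.println( cust.firstName + " " + cust.lastName );
   }
}
```

With this variation the caller owns the object from the start, so no accessor is needed afterward. The trade-off is that the caller holds a reference to a partially populated object while parsing is still in progress.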

Take a look at the sample XML document for the third example:

<?xml version="1.0"?>
<customer>
   <FirstName> Bob </FirstName>
   <LastName> Hustead </LastName>
   <CustId> abc.123 </CustId>
</customer>

Next, we see a simple class that will be mapped with data supplied by our XML document:

package common;
import java.io.*;
// Customer is a very simple class
// that holds fields for dummy customer
// data.
// It has a simple method to print
// itself to a print stream.
public class Customer {
   // Customer member variables.
   public String firstName = "";
   public String lastName  = "";
   public String custId    = "";
   public void print( PrintStream out ) {
      out.println( "Customer: " );
      out.println( "  First Name -> "  + firstName );
      out.println( "  Last Name -> "   + lastName  );
      out.println( "  Customer Id -> " + custId    );
   }
}

This is the source code that does the XML mapping for our third example:

import org.xml.sax.*;
import org.xml.sax.helpers.*;
import java.io.*;
import common.*;
public class Example3 extends DefaultHandler {
   // Local Customer object to collect
   // customer XML data.
   private  Customer cust = new Customer();
   // Buffer for collecting data from
   // the "characters" SAX event.
   private CharArrayWriter contents = new CharArrayWriter();
   // Override methods of the DefaultHandler class
   // to gain notification of SAX Events.
   //
   // See org.xml.sax.ContentHandler for all available events.
   //
   public void startElement( String namespaceURI,
              String localName,
              String qName,
              Attributes attr ) throws SAXException {
      contents.reset();
   }
   public void endElement( String namespaceURI,
              String localName,
              String qName ) throws SAXException {
      if ( localName.equals( "FirstName" ) ) {
         cust.firstName = contents.toString();
      }
      if ( localName.equals( "LastName" ) ) {
         cust.lastName = contents.toString();
      }
      if ( localName.equals( "CustId" ) ) {
         cust.custId = contents.toString();
      }
   }
   public void characters( char[] ch, int start, int length )
                  throws SAXException {
      contents.write( ch, start, length );
   }
   public Customer getCustomer()  {
      return cust;
   }
   public static void main( String[] argv ){
      System.out.println( "Example3:" );
      try {
         // Create SAX 2 parser...
         XMLReader xr = XMLReaderFactory.createXMLReader();
         // Set the ContentHandler...
         Example3 ex3 = new Example3();
         xr.setContentHandler( ex3 );
         // Parse the file...
         xr.parse( new InputSource(
               new FileReader( "Example3.xml" )) );
         // Display customer to stdout...
         Customer cust = ex3.getCustomer();
         cust.print( System.out );
      }catch ( Exception e )  {
         e.printStackTrace();
      }
   }
}

The following is the output generated by our simple Customer object, populated with data from our XML document:

Example3:
Customer:
  First Name ->  Bob
  Last Name ->  Hustead
  Customer Id ->  abc.123

A simple list of Java objects

For more complex XML documents, we will need to map lists of objects into Java. Mapping object lists is like bartending: when a bartender pours several beers in a row, he usually leaves the tap running while he quickly swaps glasses under the tap. This is exactly what we need to do to capture a list of objects. We have no control over incoming SAX events; they flow in like beer from a tap that we can't shut off. To solve the problem, we need to provide empty containers, allow them to fill up, and continually replace them.

Our next example highlights this technique. Using an XML document that represents some information about a fictional customer order, we will map the XML that represents a list of order items to a vector of Java order-item objects. The key to implementing this concept is the current item. We'll create a variable named currentOrderItem. Every time we get an event indicating a new order item (startElement for the OrderItem tag), we will create a new empty order-item object, add it to the list of order items, and assign it as the current order item. The XML parser does the rest.
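Stripped of everything else, the current-item idea can be sketched as follows. This is a minimal illustration under assumptions rather than the article's own listing: the class name CurrentItemSketch, the nested Item class, and the parse helper are hypothetical, only two of the order-item fields are mapped, an ArrayList stands in for the vector, the XMLReader comes from SAXParserFactory, and the input is a string instead of a file.

```java
import java.io.CharArrayWriter;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class CurrentItemSketch extends DefaultHandler {

   // Hypothetical, cut-down order item with two fields.
   public static class Item {
      public int    quantity    = 0;
      public String description = "";
   }

   private final List<Item> items = new ArrayList<>();
   private Item currentItem;   // the "glass under the tap"
   private final CharArrayWriter contents = new CharArrayWriter();

   @Override
   public void startElement( String namespaceURI, String localName,
                             String qName, Attributes attr ) {
      contents.reset();
      if ( localName.equals( "OrderItem" ) ) {
         // New item starting: swap in an empty container.
         currentItem = new Item();
         items.add( currentItem );
      }
   }

   @Override
   public void characters( char[] ch, int start, int length ) {
      contents.write( ch, start, length );
   }

   @Override
   public void endElement( String namespaceURI, String localName,
                           String qName ) {
      if ( currentItem == null ) {
         return; // not inside an OrderItem yet
      }
      // Whatever arrives fills the current item.
      if ( localName.equals( "Quantity" ) ) {
         currentItem.quantity = Integer.parseInt( contents.toString().trim() );
      }
      if ( localName.equals( "Description" ) ) {
         currentItem.description = contents.toString().trim();
      }
   }

   // Hypothetical helper: parse XML text and return the mapped items.
   public static List<Item> parse( String xml ) {
      try {
         SAXParserFactory factory = SAXParserFactory.newInstance();
         factory.setNamespaceAware( true ); // so localName is populated
         XMLReader xr = factory.newSAXParser().getXMLReader();
         CurrentItemSketch handler = new CurrentItemSketch();
         xr.setContentHandler( handler );
         xr.parse( new InputSource( new StringReader( xml ) ) );
         return handler.items;
      } catch ( Exception e ) {
         throw new IllegalStateException( "Mapping failed", e );
      }
   }

   public static void main( String[] argv ) {
      String xml = "<OrderItems>"
         + "<OrderItem><Quantity> 1 </Quantity>"
         + "<Description> Pet Rock </Description></OrderItem>"
         + "<OrderItem><Quantity> 12 </Quantity>"
         + "<Description> Bazooka Bubble Gum </Description></OrderItem>"
         + "</OrderItems>";
      for ( Item item : parse( xml ) ) {
         System.out.println( item.quantity + " x " + item.description );
      }
   }
}
```

Note that endElement never needs to know which item in the list it is filling. It always writes to currentItem, and the startElement for the next OrderItem tag swaps in a fresh one.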

First, here is the XML document representing our fictional customer order:

<?xml version="1.0"?>
<CustomerOrder>
   <Customer>
      <FirstName> Bob </FirstName>
      <LastName> Hustead </LastName>
      <CustId> abc.123 </CustId>
   </Customer>
   <OrderItems>
      <OrderItem>
         <Quantity> 1 </Quantity>
         <ProductCode> 48.GH605A </ProductCode>
         <Description> Pet Rock </Description>
         <Price> 19.99 </Price>
      </OrderItem>
      <OrderItem>
         <Quantity> 12 </Quantity>
         <ProductCode> 47.9906Z </ProductCode>
         <Description> Bazooka Bubble Gum </Description>
         <Price> 0.33 </Price>
      </OrderItem>
      <OrderItem>
         <Quantity> 2 </Quantity>
         <ProductCode> 47.7879H </ProductCode>
         <Description> Flourescent Orange Squirt Gun </Description>
         <Price> 2.50 </Price>
      </OrderItem>
   </OrderItems>
</CustomerOrder>

Again, here is our simple customer class:

package common;
import java.io.*;
// Customer is a very simple class
// that holds fields for dummy customer
// data.
// It has a simple method to print
// itself to a print stream.
public class Customer {
   // Customer member variables.
   public String firstName = "";
   public String lastName  = "";
   public String custId    = "";
   public void print( PrintStream out ) {
      out.println( "Customer: " );
      out.println( "  First Name -> "  + firstName );
      out.println( "  Last Name -> "   + lastName  );
      out.println( "  Customer Id -> " + custId    );
   }
}

Next, a simple class to represent an order item:

package common;
import java.io.*;
// OrderItem is a very simple class
// that holds fields for dummy order
// item data.
// It has a simple method to print
// itself to a print stream.
public class OrderItem {
   // OrderItem member variables.
   public int    quantity     = 0;
   public String productCode  = "";
   public String description  = "";
   public double price        = 0.0;
   public void print( PrintStream out ) {
      out.println( "OrderItem: " );
      out.println( "  Quantity -> "     + Integer.toString( quantity ) );
      out.println( "  Product Code -> " + productCode );
      out.println( "  Description -> "  + description );
      out.println( "  price -> "        + Double.toString( price ) );
   }
}

Now, we turn our attention to the SAX parser for example four, which maps customers and order items:
