Mapping XML to Java, Part 1

Employ the SAX API to map XML documents to Java objects

import org.xml.sax.*;
import org.xml.sax.helpers.*;
import java.io.*;
import java.util.*;
import common.*;
public class Example4 extends DefaultHandler {
   // Local Customer object to collect
   // customer XML data.
   private  Customer cust = new Customer();
   // Local list of order items...
   private Vector orderItems = new Vector();
   // Local current order item reference...
   private OrderItem currentOrderItem;
   // Buffer for collecting data from
   // the "characters" SAX event.
   private CharArrayWriter contents = new CharArrayWriter();
   // Override methods of the DefaultHandler class
   // to gain notification of SAX Events.
   //
   // See org.xml.sax.ContentHandler for all available events.
   //
   public void startElement( String namespaceURI,
               String localName,
               String qName,
               Attributes attr ) throws SAXException {
      contents.reset();
      // New twist...
      if ( localName.equals( "OrderItem" ) ) {
         currentOrderItem = new OrderItem();
         orderItems.addElement( currentOrderItem );
      }
   }
   public void endElement( String namespaceURI,
               String localName,
              String qName ) throws SAXException {
      if ( localName.equals( "FirstName" ) ) {
         cust.firstName = contents.toString();
      }
      if ( localName.equals( "LastName" ) ) {
         cust.lastName = contents.toString();
      }
      if ( localName.equals( "CustId" ) ) {
         cust.custId = contents.toString();
      }
      if ( localName.equals( "Quantity" ) ) {
         currentOrderItem.quantity = Integer.valueOf(contents.toString().trim()).intValue();
      }
      if ( localName.equals( "ProductCode" ) ) {
         currentOrderItem.productCode = contents.toString();
      }
      if ( localName.equals( "Description" ) ) {
         currentOrderItem.description = contents.toString();
      }
      if ( localName.equals( "Price" ) ) {
         currentOrderItem.price = Double.valueOf(contents.toString().trim()).doubleValue();
      }
   }
   public void characters( char[] ch, int start, int length )
                  throws SAXException {
      contents.write( ch, start, length );
   }
   public Customer getCustomer() {
      return cust;
   }
   public Vector getOrderItems() {
      return orderItems;
   }
   public static void main( String[] argv ){
      System.out.println( "Example4:" );
      try {
         // Create SAX 2 parser...
         XMLReader xr = XMLReaderFactory.createXMLReader();
         // Set the ContentHandler...
         Example4 ex4 = new Example4();
         xr.setContentHandler( ex4 );
         // Parse the file...
         xr.parse( new InputSource(
               new FileReader( "Example4.xml" )) );
         // Display customer to stdout...
         Customer cust = ex4.getCustomer();
         cust.print( System.out );
         // Display all order items to stdout...
         OrderItem i;
         Vector items = ex4.getOrderItems();
         Enumeration e = items.elements();
         while ( e.hasMoreElements() ) {
            i = (OrderItem) e.nextElement();
            i.print( System.out );
         }
      } catch ( Exception e ) {
         e.printStackTrace();
      }
   }
}

Here's the output generated by our Customer and OrderItem objects:

Example4:
Customer:
  First Name ->  Bob
  Last Name ->  Hustead
  Customer Id ->  abc.123
OrderItem:
  Quantity -> 1
  Product Code ->  48.GH605A
  Description ->  Pet Rock
  price -> 19.99
OrderItem:
  Quantity -> 12
  Product Code ->  47.9906Z
  Description ->  Bazooka Bubble Gum
  price -> 0.33
OrderItem:
  Quantity -> 2
  Product Code ->  47.7879H
  Description ->  Fluorescent Orange Squirt Gun
  price -> 2.5

When the structure of the XML document becomes more complex, the real task is managing the creation of the empty container objects that the flow of SAX events will fill. For something simple like a single list of objects, this management is straightforward. However, we will need to develop techniques to manage more complicated containment hierarchies, such as lists of lists and lists of objects that themselves contain lists.

Objects sharing tags

Before we get to the more advanced containment layouts, there is another difficulty with SAX that we will sometimes need to address. Occasionally, data at different places in the XML document is marked with the same tag but has to be mapped to different objects in Java. Suppose you have a customer section and a customer representative section in your XML document, and both sections have fields tagged FirstName and LastName. Because of this ambiguity, you can no longer be sure which object the contents buffer should be assigned to during the endElement SAX event. You must keep some information from the containing startElement SAX events to determine which object collects the contents during the shared endElement SAX event.

This problem can become dangerous, even with XML documents that don't initially have this structure, if the XML document doesn't have a DTD or if the DTD is changed without updating the mapping code. Without a DTD, a client can legally place any tag that you are mapping at an unexpected location within the XML document.

In truth, the only way to safely deal with the problem is to constantly track information about all open start tags. As a simple example, let's say you have the following XML document:

<?xml version="1.0"?>
<CustomerInformation>
   <Customer>
      <Name>
      Some Customer Name
      </Name>
      <Company>
         <Name>
         The customer's company name
         </Name>
      </Company>
   </Customer>
</CustomerInformation>

Even though the tag name Name is ambiguous, the full path to the name is not -- it's either CustomerInformation->Customer->Name or CustomerInformation->Customer->Company->Name. Keeping the full path available at all times guarantees that accidentally reusing a tag name won't fool your mapping code. It turns out that mapping recursive XML structures requires a solution to this problem; we will cover this issue in the next article.
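
One way to keep the full path available is to push each tag name onto a stack in startElement and pop it in endElement. The listing below is only a minimal sketch of that idea, not code from this series: the class name PathTrackingHandler, its getters, and the "/" path separator are assumptions made for illustration, and it obtains its parser through SAXParserFactory rather than the XMLReaderFactory call used in the examples.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that maps on the full element path, not the bare tag name.
public class PathTrackingHandler extends DefaultHandler {
   // Names of all currently open elements, outermost first.
   private final Deque<String> openTags = new ArrayDeque<>();
   private final StringBuilder contents = new StringBuilder();
   private String customerName;
   private String companyName;

   @Override
   public void startElement( String uri, String localName,
                             String qName, Attributes attr ) {
      contents.setLength( 0 );
      openTags.addLast( localName );
   }

   @Override
   public void characters( char[] ch, int start, int length ) {
      contents.append( ch, start, length );
   }

   @Override
   public void endElement( String uri, String localName, String qName ) {
      // Build the path while the closing element is still on the stack.
      String path = String.join( "/", openTags );
      if ( path.equals( "CustomerInformation/Customer/Name" ) ) {
         customerName = contents.toString().trim();
      } else if ( path.equals( "CustomerInformation/Customer/Company/Name" ) ) {
         companyName = contents.toString().trim();
      }
      openTags.removeLast();
   }

   public String getCustomerName() { return customerName; }
   public String getCompanyName()  { return companyName; }

   // Parses the given XML text and returns the populated handler.
   public static PathTrackingHandler parse( String xml ) {
      try {
         SAXParserFactory factory = SAXParserFactory.newInstance();
         factory.setNamespaceAware( true );   // needed for localName to be filled in
         XMLReader xr = factory.newSAXParser().getXMLReader();
         PathTrackingHandler handler = new PathTrackingHandler();
         xr.setContentHandler( handler );
         xr.parse( new InputSource( new StringReader( xml ) ) );
         return handler;
      } catch ( ParserConfigurationException | SAXException | IOException e ) {
         throw new IllegalStateException( "could not parse XML", e );
      }
   }

   public static void main( String[] argv ) {
      PathTrackingHandler h = parse(
         "<CustomerInformation><Customer>"
         + "<Name> Some Customer Name </Name>"
         + "<Company><Name> The customer's company name </Name></Company>"
         + "</Customer></CustomerInformation>" );
      System.out.println( "Customer: " + h.getCustomerName() );
      System.out.println( "Company:  " + h.getCompanyName() );
   }
}
```

Because the comparison uses the whole path, a Name tag that turns up anywhere else in the document is simply ignored instead of overwriting one of the two fields.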

Next, we'll examine two approaches for dealing with this situation. The first is a brute-force solution built from if statements. I will set flags during the containing element's startElement SAX event. Then, during the endElement event, I will test those flags with if statements to determine which object the contents should be assigned to.

Below you'll find our sample XML document demonstrating overlapping tag names:

<?xml version="1.0"?>
<Shapes>
   <Triangle name="tri1" >
      <x> 3 </x>
      <y> 0 </y>
      <height> 3 </height>
      <width> 5 </width>
   </Triangle>
   <Triangle name="tri2" >
      <x> 5 </x>
      <y> 0 </y>
      <height> 3 </height>
      <width> 5 </width>
   </Triangle>
   <Square name="sq1" >
      <x> 0 </x>
      <y> 0 </y>
      <height> 3 </height>
      <width> 3 </width>
   </Square>
   <Circle name="circ1" >
      <x> 10 </x>
      <y> 10 </y>
      <height> 3 </height>
      <width> 3 </width>
   </Circle>
</Shapes>

The following is a base class for all of our dummy shape classes:

package common;
// Dummy base class to hold values 
// common to shapes.
public class Shape {
   public int x = 0;
   public int y = 0;
   public int height = 0;
   public int width  = 0;
   
}

Here's a simple triangle class:

package common;
import java.io.*;
// Dummy triangle shape.
public class Triangle extends Shape {
   // Dummy Triangle specific stuff...
   public String name = "";   
   
   public void print( PrintStream out ){
      out.println( "Triange: " + name + 
            " x: " + x  +
            " y: " + y  +
            " width: " + width + 
            " height: " + height );    
   }
}

Next, we see a simple square class:

package common;
import java.io.*;
// Dummy square shape.
public class Square extends Shape {
   // Dummy Square specific stuff...
   public String name = "";   
   
   public void print( PrintStream out ){
      out.println( "Square: " + name + 
            " x: " + x  +
            " y: " + y  +
            " width: " + width + 
            " height: " + height );    
   }
}

Here's a simple circle shape:

package common;
import java.io.*;
// Dummy circle shape.
public class Circle extends Shape {
   // Dummy Circle specific stuff...
   public String name = "";
   public void print( PrintStream out ){
      out.println( "Circle: " + name +
            " x: " + x  +
            " y: " + y  +
            " width: " + width +
            " height: " + height );
   }
}

Next, we see the mapping code that represents the brute-force method of separating identical tag names associated with different objects:

import org.xml.sax.*;
import org.xml.sax.helpers.*;
import java.io.*;
import java.util.*;
import common.*;
public class Example5 extends DefaultHandler {
   // Flags to help us capture the contents
   // of a tagged element.
   private boolean inCircle      = false;
   private boolean inTriangle    = false;
   private boolean inSquare      = false;
   // Local list of different shapes...
   private Vector triangles = new Vector();
   private Vector squares = new Vector();
   private Vector circles = new Vector();
   // Local current shape references...
   private Triangle currentTriangle;
   private Circle   currentCircle;
   private Square   currentSquare;
   // Buffer for collecting data from
   // the "characters" SAX event.
   private CharArrayWriter contents = new CharArrayWriter();
   // Override methods of the DefaultHandler class
   // to gain notification of SAX Events.
   //
   // See org.xml.sax.ContentHandler for all available events.
   //
   public void startElement( String namespaceURI,
               String localName,
               String qName,
               Attributes attr ) throws SAXException {
      contents.reset();
      if ( localName.equals( "Circle" ) ) {
         inCircle = true;
         currentCircle = new Circle();
         currentCircle.name = attr.getValue( "name" );
         circles.addElement( currentCircle );
      }
      if ( localName.equals( "Square" ) ) {
         inSquare = true;
         currentSquare = new Square();
         currentSquare.name = attr.getValue( "name" );
         squares.addElement( currentSquare );
      }
      if ( localName.equals( "Triangle" ) ) {
         inTriangle = true;
         currentTriangle = new Triangle();
         currentTriangle.name = attr.getValue( "name" );
         triangles.addElement( currentTriangle );
      }
   }
   public void endElement( String namespaceURI,
               String localName,
               String qName ) throws SAXException {
      // Each shared tag is routed to the shape we are currently inside.
      // Testing inTriangle explicitly keeps a stray tag outside any
      // shape from being written to the last triangle.
      if ( localName.equals( "x" ) ) {
         if ( inCircle ) {
            currentCircle.x = Integer.valueOf(
               contents.toString().trim() ).intValue();
         }
         else if ( inSquare ) {
            currentSquare.x = Integer.valueOf(
               contents.toString().trim() ).intValue();
         }
         else if ( inTriangle ) {
            currentTriangle.x = Integer.valueOf(
               contents.toString().trim() ).intValue();
         }
      }
      if ( localName.equals( "y" ) ) {
         if ( inCircle ) {
            currentCircle.y = Integer.valueOf(
               contents.toString().trim() ).intValue();
         }
         else if ( inSquare ) {
            currentSquare.y = Integer.valueOf(
               contents.toString().trim() ).intValue();
         }
         else if ( inTriangle ) {
            currentTriangle.y = Integer.valueOf(
               contents.toString().trim() ).intValue();
         }
      }
      if ( localName.equals( "width" ) ) {
         if ( inCircle ) {
            currentCircle.width = Integer.valueOf(
               contents.toString().trim() ).intValue();
         }
         else if ( inSquare ) {
            currentSquare.width = Integer.valueOf(
               contents.toString().trim() ).intValue();
         }
         else if ( inTriangle ) {
            currentTriangle.width = Integer.valueOf(
               contents.toString().trim() ).intValue();
         }
      }
      if ( localName.equals( "height" ) ) {
         if ( inCircle ) {
            currentCircle.height = Integer.valueOf(
               contents.toString().trim() ).intValue();
         }
         else if ( inSquare ) {
            currentSquare.height = Integer.valueOf(
               contents.toString().trim() ).intValue();
         }
         else if ( inTriangle ) {
            currentTriangle.height = Integer.valueOf(
               contents.toString().trim() ).intValue();
         }
      }
      if ( localName.equals( "Circle" ) ) {
         inCircle = false;
      }
      if ( localName.equals( "Square" ) ) {
         inSquare = false;
      }
      if ( localName.equals( "Triangle" ) ) {
         inTriangle = false;
      }
   }
   public void characters( char[] ch, int start, int length )
                  throws SAXException {
      // accumulate the contents into a buffer.
      contents.write( ch, start, length );
   }
   public Vector getCircles() {
      return circles;
   }
   public Vector getSquares() {
      return squares;
   }
   public Vector getTriangles() {
      return triangles;
   }
   public static void main( String[] argv ){
      System.out.println( "Example5:" );
      try {
         // Create SAX 2 parser...
         XMLReader xr = XMLReaderFactory.createXMLReader();
         // Set the ContentHandler...
         Example5 ex5 = new Example5();
         xr.setContentHandler( ex5 );
         // Parse the file...
         xr.parse( new InputSource(
               new FileReader( "Example5.xml" )) );
         // Display all circles to stdout...
         Circle c;
         Vector items = ex5.getCircles();
         Enumeration e = items.elements();
         while ( e.hasMoreElements() ) {
            c = (Circle) e.nextElement();
            c.print( System.out );
         }
         // Display all squares to stdout...
         Square s;
         items = ex5.getSquares();
         e = items.elements();
         while ( e.hasMoreElements() ) {
            s = (Square) e.nextElement();
            s.print( System.out );
         }
         // Display all triangles to stdout...
         Triangle t;
         items = ex5.getTriangles();
         e = items.elements();
         while ( e.hasMoreElements() ) {
            t = (Triangle) e.nextElement();
            t.print( System.out );
         }
      } catch ( Exception e ) {
         e.printStackTrace();
      }
   }
}

The following is the output printed from the data collected in our shape objects:

Example5:
Circle: circ1 x: 10 y: 10 width: 3 height: 3
Square: sq1 x: 0 y: 0 width: 3 height: 3
Triange: tri1 x: 3 y: 0 width: 5 height: 3
Triange: tri2 x: 5 y: 0 width: 5 height: 3

The second solution takes advantage of the fact that you can replace the ContentHandler of a SAX parser while it's running. This allows us to divide our mapping tasks into modular pieces: each handler implements mapping code only for its own fragment of the XML document.

The endElement() method of the second example does not contain a network of nested if statements. This modularity becomes critical when processing more complex XML documents. It also ensures that this style of mapping code does not fail when duplicate tag names appear in unexpected locations within the XML document.
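
To make the handler-swapping idea concrete, here is a rough sketch under stated assumptions; it is not the article's own second example. The names HandlerSwapSketch, BoxHandler, and Box are hypothetical (Box stands in for the Circle, Square, and Triangle classes so the listing is self-contained), and the parser comes from SAXParserFactory. The only SAX feature it relies on is that XMLReader.setContentHandler() may be called in the middle of a parse.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class HandlerSwapSketch extends DefaultHandler {
   // Hypothetical stand-in for the article's shape classes.
   public static class Box {
      public String kind = "";
      public String name = "";
      public int x, y, width, height;
   }

   // Maps the children of one shape element, then restores the parent handler.
   private static class BoxHandler extends DefaultHandler {
      private final XMLReader reader;
      private final ContentHandler parent;
      private final Box box;
      private final StringBuilder contents = new StringBuilder();

      BoxHandler( XMLReader reader, ContentHandler parent, Box box ) {
         this.reader = reader;
         this.parent = parent;
         this.box = box;
      }

      @Override
      public void startElement( String uri, String localName, String qName, Attributes attr ) {
         contents.setLength( 0 );
      }

      @Override
      public void characters( char[] ch, int start, int length ) {
         contents.append( ch, start, length );
      }

      @Override
      public void endElement( String uri, String localName, String qName ) {
         String text = contents.toString().trim();
         switch ( localName ) {
            case "x":      box.x      = Integer.parseInt( text ); break;
            case "y":      box.y      = Integer.parseInt( text ); break;
            case "width":  box.width  = Integer.parseInt( text ); break;
            case "height": box.height = Integer.parseInt( text ); break;
            default:
               // The shape's own end tag: hand the parser back to the parent.
               if ( localName.equals( box.kind ) ) {
                  reader.setContentHandler( parent );
               }
         }
      }
   }

   private final XMLReader reader;
   private final List<Box> boxes = new ArrayList<>();

   public HandlerSwapSketch( XMLReader reader ) {
      this.reader = reader;
   }

   @Override
   public void startElement( String uri, String localName, String qName, Attributes attr ) {
      if ( localName.equals( "Circle" ) || localName.equals( "Square" )
            || localName.equals( "Triangle" ) ) {
         Box box = new Box();
         box.kind = localName;
         box.name = attr.getValue( "name" );
         boxes.add( box );
         // From here on, SAX events go to the child handler.
         reader.setContentHandler( new BoxHandler( reader, this, box ) );
      }
   }

   // Parses the given XML text and returns the shapes in document order.
   public static List<Box> parse( String xml ) {
      try {
         SAXParserFactory factory = SAXParserFactory.newInstance();
         factory.setNamespaceAware( true );   // needed for localName to be filled in
         XMLReader reader = factory.newSAXParser().getXMLReader();
         HandlerSwapSketch root = new HandlerSwapSketch( reader );
         reader.setContentHandler( root );
         reader.parse( new InputSource( new StringReader( xml ) ) );
         return root.boxes;
      } catch ( Exception e ) {
         throw new IllegalStateException( "could not parse XML", e );
      }
   }
}
```

Calling HandlerSwapSketch.parse() with the contents of the Shapes document shown earlier should yield one Box per shape. The child handler never needs to ask which shape it is inside, because it only ever receives the events for its own element. The sketch assumes shapes are not nested inside one another.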
