Herbert Spencer wrote, "Science is organized knowledge." The corollary might be that applications are organized objects. Let's take a moment to turn to some aspects of Java that are critical for developing applications rather than applets.
Of those who have heard of Java, the majority has learned about the language through the popular press. The statement that crops up a lot is that Java is for "programming small applications, or applets, that can be embedded on a Web page." While correct, this definition conveys only one aspect of the new language; it doesn't describe the whole picture. Perhaps Java can be better described as a language designed to build systems -- large systems -- of well-understood portable pieces of executable code that can be combined, in whole or in part, to produce a desirable whole.
In this column I will begin to look at the various tools you can use to build in Java. I will demonstrate how these tools can be combined to make a larger application, and how, once you have an application, you can further aggregate the application into still larger systems -- all possible because in Java there is no distinction between a complete application and a simple subroutine.
To provide source code fodder for this and past columns, I chose to build a BASIC interpreter. "Why BASIC?" you might ask, thinking no one uses BASIC anymore. This is not entirely true. BASIC lives on in Visual Basic and in other scripting languages. But more importantly, many people have been exposed to it and can make the following conceptual leap: If "applications" are programmed in BASIC, and BASIC can be written in Java, then applications can be written in Java. BASIC is just another interpreted language; the tools that we will be building can be modified to use any language syntax, thus the core concepts are the focus of these articles. Therefore, what starts as an application becomes a component of other applications -- even applets perhaps.
Generic classes and containers
Building generic classes is particularly relevant when creating applications because reusing classes provides tremendous leverage in reducing both complexity and time to market. In an applet, the value of a generic class is mitigated by the requirement of loading it over the network. The negative impact of loading generic classes over the network is demonstrated by Sun's Java Workshop (JWS). JWS augments the standard version of the abstract windowing toolkit (AWT) by using some very elegant "shadow" classes. The advantage is that applets are easy to develop and are rich in features; the downside is that loading these classes can take a lot of time on a slow network link. While this disadvantage will disappear eventually, what we find is that a system perspective on class development is often required to achieve the best solution.
Since we're starting to look a bit more seriously at application development, we will assume we have already determined that generic classes are a valid solution.
Java, like many general-purpose languages, provides several tools for creating generic classes. Different requirements will necessitate using
different tools. In this column I will use the development of a container
class as an example since it can accommodate nearly all of the tools a user might wish to use.
Containers: A definition
For those of you who are not yet familiar with things object-oriented, a container is a class that organizes other objects. Common containers are binary trees, queues, lists, and stacks. Java supplies three container classes with the JDK 1.0.2 release: java.util.Hashtable, java.util.Stack, and java.util.Vector.
Containers have both an organizing principle and an interface. Stacks, for example, may be organized as "first in, last out" (FILO), and their interface may be defined to have two methods -- push() and pop(). Simple containers can be thought of as having the standard methods add and remove. Further, they will have a means to enumerate the entire container, to check to see if a candidate object is already in the container and to test the number of elements being held by the container.
The Java container classes demonstrate some of the problems with containers, especially keyed containers (those containers that use a key to locate an object). The non-keyed containers like Stack and Vector simply stuff objects in and pull objects out. The keyed container Hashtable uses a key object to locate a data object. For the keying function to work, the key object must support a method HashCode that returns a unique hash code for every object. This keying ability works because the Object
class defines a HashCode method and thus is inherited by all objects, but it isn't always what you want. For example, if you are putting objects into your hash table container and you are indexing them with String objects, the default HashCode method simply returns a unique integer based on the object reference value. For strings, you really want the hash code to be a function of the string value, so String overrides HashCode and supplies its own version. This means that for any object you develop, and want to store in a hash table using an instance of the object as the key, you must override the HashCode method. This insures that identically constructed objects hash to the same code.
But what about sorted containers? The only sorting interface provided in the Object
class is equals(), and it's constrained to equating two objects as having the same reference, not having the same value. This is why, in Java, you can't write the following code:
if (someStringObject == "this") then { ... do something ... }
The above code compares the object references, notes that there are two different objects here, and returns false. You have to write the code as follows:
if (someStringObject.compareTo("this") == 0) then { ... do something ...}
This latter test uses knowledge encapsulated in the compareTo method of String to compare two string objects and return an indication of equality.
Using the tools in the box
As I mentioned earlier, generic program developers have two primary tools available to them: implementation inheritance (extending) and behavioral inheritance (implementing).
To use implementation inheritance, you extend (subclass) an existing class. By extension, all subclasses of the base class have the same capabilities as the root class. This is the basis for the HashCode
method in the Object
class. As all objects inherit from the java.lang.Object
class, all objects have a method HashCode
that returns a unique hash for that object. If you wish to use your objects as keys, however, keep in mind the caveat mentioned earlier about overriding HashCode
.
In addition to implementation inheritance, there is behavioral inheritance (implementing), which is achieved by specifying that an object implements a particular Java interface. An object that implements an interface can be cast to an object reference of that interface type. Then that reference can be used to invoke the methods specified by that interface. Typically, interfaces are used when a class may need to process several objects of different types in a common way. For example, Java defines the Runnable interface that is used by the thread classes to work with classes in their own thread.
Building a container
To demonstrate the tradeoffs in writing generic code, I will walk you through the design and implementation of a sorted container class.
As I mentioned earlier, in the development of general-purpose applications, in many cases a good container would be useful. In my example application I needed a container that was both keyed, meaning that I wanted to retrieve contained objects by using a simple key, and sorted so that I could retrieve the contained objects in a specific order based on the key values.
When designing systems, it is important to keep in mind what parts of the system use a particular interface. In the case of containers, there are two critical interfaces -- the container itself and the keys that index the container. User programs use the container to store and organize objects; the containers themselves use the key interfaces to help them organize themselves. When designing containers, we strive to make them easy to use and to store a wide variety of objects (thus increasing their utility). We design the keys to be flexible so that a wide variety of container implementation can use the same key structures.
To solve my behavioral requirements, keying and sorting, I turn to a useful tree data structure called a binary search tree (BST). Binary trees have the useful property of being sorted, so they can be efficiently searched and can be dumped out in sorted order. The actual BST code is an implementation of the algorithms published in the book Introduction to Algorithms, by Thomas Cormen, Charles Leiserson, and Ron Rivest.
java.util.Dictionary
The Java standard classes have taken a first step toward generic keyed containers with the definition of an abstract class named java.util.Dictionary
. If you look at the source code that comes with the JDK, you will see that Hashtable
is a subclass of Dictionary
.
The Dictionary
class attempts to define the methods common to all keyed containers. Technically, what is being described could more properly be called a store as there is no required binding between the key and the object it indexes. However, the name is appropriate as nearly everyone understands the basic operation of a dictionary. An alternative name might be KeyedContainer
, but that title gets tedious pretty quickly. The point is that the common superclass of a set of generic classes should express the core behavior being factored out by that class. The Dictionary
methods are as follows:
size( ) | This method returns the number of objects currently being held by the container. |
isEmpty( ) | This method returns true if the container has no elements. |
keys( ) | Return the list of keys in the table as an Enumeration. |
elements( ) | Return the list of contained objects as an Enumeration. |
get(Object k) | Get an object, given a particular key k. |
put(Object k, Object o) | Store an object o using key k. |
remove(Object k) | Remove an object that is indexed by key k. |
By subclassing Dictionary
, we use the tool of implementation inheritance to create an object that can be used by a wide variety of clients. These clients need know only how to use a Dictionary, and we can then substitute our new BST or a Hashtable without the client noticing. It is this property of abstracting out the core interface into the superclass that is crucial to reusability, general-purpose function, expressed cleanly.
Basically, Dictionary
gives us two groups of behavior, accounting and administration -- accounting in the form of how many objects we've stored and bulk reading of the store, and administration in the form of get, put, and remove.
If you look at the Hashtable
class source (it is included with all versions of the JDK in a file named src.zip), you will see that this class extends Dictionary
and has two private internal classes, one named HashtableEntry and one named HashtableEnumerator. The implementation is straightforward. When put is called, objects are placed into a HashtableEntry object and stored into a hash table. When get is called, the key passed is hashed and the hashcode is used to locate the desired object in the hash table. These methods keep track of how many objects have been added or removed, and this information is returned in response to a size request. The HashtableEnumerator
class is used to return results of the elements method or keys method.
First cut at a generic keyed container
The BinarySearchTree
class is an example of a generic container that subclasses Dictionary
but uses a different organizing principle. As in the Hashtable
class, I've added a couple of classes to support holding the stored objects and keys and for enumerating the table.
The first is BSTNode, which is equivalent to a HashtableEntry. It is defined as shown in the code outline below. You can also look at the source.
class BSTNode { protected BSTNode parent; protected BSTNode left; protected BSTNode right; protected String key; protected Object payload; public BSTNode(String k, Object p) { key = k; payload = p; } protected BSTNode() { super(); } BSTNode successor() { return successor(this); } BSTNode precessor() { return predecessor(this); } BSTNode min() { return min(this); } BSTNode max() { return max(this); } void print(PrintStream p) { print(this, p); } private static BSTNode successor(BSTNode n) { ... } private static BSTNode predecessor(BSTNode n) { ... } private static BSTNode min(BSTNode n) { ... } private static BSTNode max(BSTNode n) { ... } private static void print(BSTNode n, PrintStream p) { ... } }
Let's take a look at this code in order to clarify two things. First, there's the null-protected constructor, which is present so that subclasses of this class do not have to declare a constructor that overrides one of this class' constructors. Second, the methods successor, predecessor, min, max, and print are very short and merely call the same private equivalent so as to conserve memory space.