In my December 1996 column titled "Generically speaking," I discussed the difficulty of building a generic collection class in Java. Since that time two things have happened: Sun created inner classes to assist in the implementation of Java Beans, and Sun has proposed a basic collection class for Java. I like Sun's proposal, but I wish the company had included a way of doing type enforcement. For those of you not familiar with the term, type enforcement is a mechanism that ensures that objects placed into a collection are of the proper type. Currently, the collections use class Object as a generic reference. If a bug in your class stores an unexpected type in the collection, nothing complains until you retrieve the stored reference and cast it, at which point you get a class cast exception. For an example of a class that provides type enforcement, see the class ContainerOrganizer in the Resources section below.
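To make the idea concrete, here is a minimal sketch of type enforcement. It is not the ContainerOrganizer class mentioned above, nor part of Sun's proposal; the name TypedList and its methods are invented for illustration. The wrapper checks every object against a Class token when it is stored, so a bug shows up at the point of insertion rather than later at the cast.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; TypedList is not a class from this column.
class TypedList {
    private final Class<?> elementType;   // the only type this list accepts
    private final List<Object> items = new ArrayList<Object>();

    TypedList(Class<?> elementType) {
        this.elementType = elementType;
    }

    // Reject the wrong type when it is stored, not when it is retrieved.
    // Note that isInstance(null) is false, so null is rejected as well.
    void add(Object o) {
        if (!elementType.isInstance(o)) {
            throw new IllegalArgumentException(
                "expected " + elementType.getName() + " but got "
                + (o == null ? "null" : o.getClass().getName()));
        }
        items.add(o);
    }

    Object get(int index) {
        return items.get(index);
    }

    int size() {
        return items.size();
    }
}
```

With a list created as new TypedList(String.class), storing a String succeeds, while storing an Integer throws immediately, which is where you want to hear about the bug.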
If you think that type enforcement in collections is a Good Thing, I encourage you to write Sun at collection-comments.
For this column I thought it would be useful to revisit the class we designed last December and replace that example's "mixin" classes with inner classes.
What are inner classes?
Inner classes are classes whose "name" is scoped to be inside another class. What do I mean by "scope"? The scope of a name defines where that name can be referenced by Java code. When two classes share the same package, they are said to have "package scope." Classes outside the containing package cannot refer to classes inside the package unless those classes are declared public. Thus the names of the non-public classes are scoped to the package.
Inner classes are scoped to the class used to declare them -- thus they are effectively "invisible" to the other classes in the same package. This lack of visibility to other classes in the package gives the programmer the opportunity to create a set of classes within a containing class without cluttering up the name space of classes in their package.
According to Sun's design documents, the Sun engineers felt that inner classes removed what had been a wart in the definition of Java. The wart was that a Java class could only be defined as a peer of other classes. This created an artificial level that applied only to class definitions. Inner classes are primarily a change to Java compilers that gives them a mechanism to eliminate this wart. To support backwards compatibility with existing virtual machines, the inner classes are actually compiled to regular class files; only the names are changed to prevent them from colliding with existing classes. For more information on this "exchange of warts," I refer you to the complete design document.
The language change that supports inner classes is a change to the rules about where you can declare a class. If you have been programming in Java 1.0x, you know that you can declare multiple classes in a single source file (as long as only one of them is public) and that the declaration of the second and subsequent classes must follow the closing brace of the first class. That has changed in 1.1: now you can declare a class within another class. The following example illustrates this more clearly. In Java 1.0 you had to declare two classes in a file sequentially, as shown in the code below.
public class MyPublicClass {
    public MyPublicClass() { ... }

    void method1() { ... }
    int method2() { ... }
    public void method3(int x, int y) { ... }
}
The above class is the "base class" that is stored in a file named MyPublicClass.java. Suppose it needs a helper class in the method3 method to perform some function. Typically, this simple class is defined in the same file, and an example is shown below.
class MyHelperClass {
    MyHelperClass() { ... }

    int someHelperMethod(int z, int q) { ... }
}
Usually, classes that are stored in the same file are related in some way. In the class BinarySearchTree, for example, I created another class BSTEnumerator, which provided a way of returning all of the elements (or keys) in a binary search tree instance. In subclasses of Dictionary, such as Hashtable, it is fairly common to combine both of these classes into a single file.
With inner classes, you can actually declare this code in your file as follows.
public class MyPublicClass {
    public MyPublicClass() { ... }

    class MyHelperClass {
        MyHelperClass() { ... }

        int someHelperMethod(int z, int q) { ... }
    }

    void method1() { ... }
    int method2() { ... }
    public void method3(int x, int y) { ... }
}
As you can see, the only change above involved moving around some text. However, what it means when a class appears inside another class is more significant.
The name of the inner class in the above example is MyHelperClass, but in the first case you had two classes (probably in the default package): one called MyPublicClass, and one called MyHelperClass. In the second example, you still have two classes, except that one is called MyPublicClass, and the other is mapped to the name MyPublicClass$MyHelperClass. The advantage here is that a new space for names was created that holds just the name of the helper class. In the first example both the base class and the helper class were cluttering up the package name space; with the inner class, the helper class gets a name space all to itself.
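If you want to see the name mapping for yourself, a small sketch like the following will do it. The class names follow the example above, but the method bodies and the helperBinaryName method are invented here purely so that the sketch compiles and runs.

```java
// Class names follow the column's example; the bodies are invented
// stand-ins so the sketch is runnable. Declared non-public so it can
// live in any source file.
class MyPublicClass {
    // Inner class: its simple name exists only inside MyPublicClass.
    class MyHelperClass {
        int someHelperMethod(int z, int q) {
            return z + q;
        }
    }

    int method3(int x, int y) {
        // Inside the enclosing class the short name is enough.
        return new MyHelperClass().someHelperMethod(x, y);
    }

    // Reports the name the compiler actually gave the inner class.
    String helperBinaryName() {
        return MyHelperClass.class.getName();
    }
}
```

Compiled in the default package, helperBinaryName() should report the mapped name MyPublicClass$MyHelperClass, and the compiler writes a class file of the same name alongside MyPublicClass.class.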
How can I use inner classes?
To illustrate the use of inner classes, I took the BinarySearchTree class that I used in the December 1996 column and modified it to support an inner enumerator. This is the canonical example of an inner class since the enumerator is truly a use-and-discard type of object class. In object-oriented speak, we say this is a return value with behavior.
The modified source is as follows:
import java.io.PrintStream;
import java.util.Enumeration;
import java.util.NoSuchElementException;
import java.util.Dictionary;

public class BinarySearchTree extends Dictionary {
    BSTNode rootNode;
    private int elementCount;
    private ContainerOrganizer co;
Clearly, the first part of the class is the same as it was in the December column; however, following the class declaration, another class is declared: the enumerator. This is shown below.
    /**
     * Define an inner class to be the enumerator
     */
    class BSTEnumerator implements Enumeration {
        private BSTNode currentNode;
        private boolean keys;

        BSTEnumerator(BSTNode start, boolean doKeys) {
            super();
            currentNode = (start != null) ? start.min() : null;
            keys = doKeys;
        }

        public boolean hasMoreElements() {
            return (currentNode != null);
        }

        public Object nextElement() {
            if (currentNode == null)
                throw new NoSuchElementException();
            BSTNode n = currentNode;
            currentNode = n.successor();
            return (keys ? n.key : n.payload);
        }
    }
This BSTEnumerator class is identical to that in the December column, except that now it is defined to be inside the BinarySearchTree class. Thus, when you compile this version of the class with JDK 1.1, the compiler will create both a BinarySearchTree.class file and a BinarySearchTree$BSTEnumerator.class file. Notice that there is a $ character in the middle of the class name. This character was chosen to distinguish inner classes from the base class that contains them. After the inner class definition of BSTEnumerator, the definition of BinarySearchTree continues, unchanged from the original version.
    public BinarySearchTree(ContainerOrganizer c) {
        super();
        co = c;
        co.setDict(this);
    }

    ... and so on for the rest of the class ...
}
When using this form of inner classes, the main value is the ability to create a specialized enumeration class without cluttering up the class name space at the package level. When you consider that a package of collection classes may have many different behavior-based return values (Enumeration, SortedList, Localized, and so on), the number of specialized classes can become quite large. Previously, every one of those classes needed a name that was unique within the package. And yes, it is entirely possible to maintain these name spaces manually and achieve substantially the same effect as inner classes; that is why Sun claims it is simply syntactic sugar in the compiler.
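Because the listing above depends on BSTNode and ContainerOrganizer, it will not compile on its own. As a stand-in, here is a self-contained sketch of the same pattern: a container that hands out a named inner enumerator. WordBag and WordEnumerator are hypothetical names, and the sketch assumes a current JDK, so it uses the generic form Enumeration<String> rather than the raw type in the listing.

```java
import java.util.Enumeration;
import java.util.NoSuchElementException;

// Hypothetical stand-in for BinarySearchTree: a tiny fixed-content container.
class WordBag {
    private final String[] words;

    WordBag(String[] words) {
        this.words = words.clone();   // defensive copy of the caller's array
    }

    // Named inner class; compiled to WordBag$WordEnumerator.class, so its
    // name does not occupy a slot in the package name space.
    class WordEnumerator implements Enumeration<String> {
        private int next = 0;

        public boolean hasMoreElements() {
            return next < words.length;   // reads the enclosing instance's field
        }

        public String nextElement() {
            if (next >= words.length)
                throw new NoSuchElementException();
            return words[next++];
        }
    }

    Enumeration<String> elements() {
        return new WordEnumerator();
    }
}
```

Callers see only the Enumeration interface; the enumerator class itself stays tucked inside the container that creates it.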
Anonymous classes -- a kind of inner class
There is a more interesting version of inner classes: anonymous classes.
When you don't even need a name because you really just want to pass a method that does something (analogous to a C language callback function), then you can get away with creating an anonymous inner class. These classes are simply inner classes that are not given a specific name. Typically, a class is not named when it is used only once.
As an illustration of an anonymous class, I rewrote the keys and elements methods of BinarySearchTree, both of which used to return a BSTEnumerator object, as shown below. Note that the complete source is available via a link in the Resources section below.
private Enumeration doEnumerate(final boolean k) {
    return new Enumeration() {
        private BSTNode currentNode = rootNode;
        private boolean keys = k;

        public boolean hasMoreElements() {
            return (currentNode != null);
        }

        public Object nextElement() {
            if (currentNode == null)
                throw new NoSuchElementException();
            BSTNode n = currentNode;
            currentNode = n.successor();
            return (keys ? n.key : n.payload);
        }
    };
}

public Enumeration elements() {
    return doEnumerate(false);
}

public Enumeration keys() {
    return doEnumerate(true);
}
Check out the doEnumerate function; two things are weird about it. First, it returns a new instance of an interface. Second, its formal parameter k is final. What's going on here? The answer is that this technique is the other way to use inner classes.
The return statement looks like it is creating a new instance of an interface, but what it is really doing is creating a new class that doesn't have an explicit name and that implements the interface. In this case, the actual class name turns out to be BinarySearchTree$1, but this is only useful to compiler writers.
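A small runnable sketch may help show what the "new Interface() { ... }" expression produces. Countdown is a hypothetical class, not part of BinarySearchTree, and the sketch assumes a current JDK, hence the generic Enumeration<Integer>.

```java
import java.util.Enumeration;
import java.util.NoSuchElementException;

// Hypothetical example class; not from the column's source.
class Countdown {
    private final int start;

    Countdown(int start) {
        this.start = start;
    }

    Enumeration<Integer> values() {
        // Declares an unnamed class that implements Enumeration and
        // creates one instance of it, all in a single expression.
        return new Enumeration<Integer>() {
            // A field initializer does the job a constructor would do.
            private int current = start;

            public boolean hasMoreElements() {
                return current > 0;
            }

            public Integer nextElement() {
                if (current <= 0)
                    throw new NoSuchElementException();
                return current--;
            }
        };
    }
}
```

Asking the returned object for getClass().getName() should show a compiler-generated name ending in $1, and getClass().isAnonymousClass() reports true. Nothing in your source ever needs to spell that name.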
One of the constraints on creating a class in this way is that the new class can't define constructors (it has no name, remember), so there has to be some way to get the initial state set up inside the class. Anonymous classes do this with initializers on instance (non-static) variables: you initialize the variable where it is declared, just as you would a static variable, with no constructor and no static initializer block involved. What this means for programmers is that you can declare an instance value that is both non-static and pre-initialized -- such as "int foo = 10;".
Of course, when you create an inner class, typically you are interested in initializing the inner class from information contained in the enclosing class. (In the example, BinarySearchTree is the enclosing class.) This is accomplished by a third tweak to the rules: an inner class has access to any fields (variables) that a statement would have if that statement occurred where the definition of the class occurs. Thus in the example above, the initializer "currentNode = rootNode" refers to the instance variable rootNode in the enclosing class instance.
There is a second initializer in the anonymous class above, which is the one that says "keys = k;". C programmers who see this will immediately duck and run for cover screaming "You are using a value from the stack in your return value!" -- noting that, in this code, k is the formal parameter associated with doEnumerate. Therein lies the reason for the new semantics on the final keyword.
In the Java language, the attribute final on a variable means that its value can only be assigned once. Thus, once assigned, a final variable's value will never change. In the above example the compiler can safely make a copy of the value of k into stable storage.
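The two kinds of initializer discussed above, one reading a field of the enclosing instance and one copying a final parameter, can be seen working together in a short sketch. Greeter and messageFor are hypothetical names chosen for illustration, and the anonymous class here extends Object simply to keep the example free of extra types.

```java
// Hypothetical example; the names are not from the column's source.
class Greeter {
    private final String greeting = "Hello";   // field of the enclosing instance

    Object messageFor(final String name) {
        return new Object() {
            // Copied when the object is created. The parameter 'name'
            // belongs to messageFor's stack frame, which is gone by the
            // time toString() is called; the copy is what survives.
            private final String who = name;

            public String toString() {
                // Uses the enclosing instance's field plus the saved copy.
                return greeting + ", " + who;
            }
        };
    }
}
```

The returned object is still usable long after messageFor has returned, which is exactly the situation that would alarm a C programmer; because name is final, the copy can never disagree with the original.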
Java is a "pass-by-value" language, meaning that all parameters to methods contain a value, and modifying that value will have no impact on the invoking method's copy of that value. This semantic is a bit confusing when you consider that an object reference is also a value: if you pass a reference to an object and the method modifies that object, everyone holding the same reference will see the modified object. Confusing as this may seem, Java's semantics are actually consistent.
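The distinction is easier to see in code than in prose. In this sketch (the class and method names are invented for illustration), one method reassigns its parameter and the other modifies the object the parameter refers to.

```java
// Hypothetical demonstration class; not from the column's source.
class PassByValueDemo {
    // Reassigning the parameter changes only this method's copy of the
    // reference; the caller's variable still points at the original object.
    void reassign(StringBuffer sb) {
        sb = new StringBuffer("replaced");
    }

    // Modifying the object through the parameter changes the one object
    // that both the caller and this method refer to.
    void mutate(StringBuffer sb) {
        sb.append(" world");
    }
}
```

If the caller holds a StringBuffer containing "hello", it still reads "hello" after reassign, and reads "hello world" after mutate. In both cases the reference was passed by value; only the second method touched the shared object.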
With anonymous inner classes, the question arises: what would happen to the assignment from k if the value of k could change afterward? Consider for the moment that perhaps some twisted code was written as follows:
private Enumeration doEnumerate(boolean k) {
    Enumeration ee = new Enumeration() {
        private BSTNode currentNode = rootNode;
        private boolean keys = k;

        public boolean hasMoreElements() { ... }
        public Object nextElement() { ... }
    };
    k = !k;
    return ee;
}