This month's article continues the discussion of Java's security model begun in August's "Under the Hood." In that article, I gave a general overview of the security mechanisms built into the Java virtual machine (JVM). I also looked closely at one aspect of those security mechanisms: the JVM's built-in safety features. In September's "Under the Hood," I examined the class loader architecture, another aspect of the JVM's built-in security mechanisms. This month I'll focus on the third prong of the JVM's security strategy: the class verifier.
The class-file verifier
Every Java virtual machine has a class-file verifier, which ensures that loaded class files have a proper internal structure. If the class-file verifier discovers a problem with a class file, it throws an exception. Because a class file is just a sequence of binary data, a virtual machine can't know whether a particular class file was generated by a well-meaning Java compiler or by shady crackers bent on compromising the integrity of the virtual machine. As a consequence, all JVM implementations have a class-file verifier that can be invoked on untrusted classes, to make sure the classes are safe to use.
One of the security goals that the class-file verifier helps achieve is program robustness. If a buggy compiler or savvy cracker generated a class file that contained a method whose bytecodes included an instruction to jump beyond the end of the method, that method could, if it were invoked, cause the virtual machine to crash. Thus, for the sake of robustness, it is important that the virtual machine verify the integrity of the bytecodes it imports.
Although Java virtual machine designers are allowed to decide when their virtual machines will perform these checks, many implementations will do most checking just after a class is loaded. Such a virtual machine analyzes bytecodes (and verifies their integrity) once, before they are ever executed. As part of its verification of bytecodes, the Java virtual machine makes sure all jump instructions -- for example, goto
(jump always), ifeq
(jump if top of stack zero), etc. -- cause a jump to another valid instruction in the bytecode stream of the method. As a consequence, the virtual machine need not check for a valid target every time it encounters a jump instruction as it executes bytecodes. In most cases, checking all bytecodes once before they are executed is a more efficient way to guarantee robustness than checking each bytecode instruction every time it is executed.
A class-file verifier that performs its checking as early as possible most likely operates in two distinct phases. During phase one, which takes place just after a class is loaded, the class-file verifier checks the internal structure of the class file, including verifying the integrity of the bytecodes it contains. During phase two, which takes place as bytecodes are executed, the class-file verifier confirms the existence of symbolically referenced classes, fields, and methods.
Phase one: Internal checks
During phase one, the class-file verifier checks everything that's possible to check in a class file by looking at only the class file itself (without examining any other classes or interfaces). Phase one of the class-file verifier makes sure the imported class file is properly formed, internally consistent, adheres to the constraints of the Java programming language, and contains bytecodes that will be safe for the Java virtual machine to execute. If the class-file verifier finds that any of these are not true, it throws an error, and the class file is never used by the program.
Checking format and internal consistency
Besides verifying the integrity of the bytecodes, the verifier performs many checks for proper class file format and internal consistency during phase one. For example, every class file must start with the same four bytes, the magic number: 0xCAFEBABE
. The purpose of magic numbers is to make it easy for file parsers to recognize a certain type of file. Thus, the first thing a class-file verifier likely checks is that the imported file does indeed begin with 0xCAFEBABE
.
The class-file verifier also checks to make sure the class file is neither truncated nor enhanced with extra trailing bytes. Although different class files can be different lengths, each individual component contained inside a class file indicates its length as well as its type. The verifier can use the component types and lengths to determine the correct total length for each individual class file. In this way, it can verify that the imported file has a length consistent with its internal contents.
The verifier also looks at individual components to make sure they are well-formed instances of their type of component. For example, a method descriptor (the method's return type and the number and types of its parameters) is stored in the class file as a string that must adhere to a certain context-free grammar. One of the checks the verifier performs on individual components is to make sure each method descriptor is a well-formed string of the appropriate grammar.
In addition, the class-file verifier checks that the class itself adheres to certain constraints placed on it by the specification of the Java programming language. For example, the verifier enforces the rule that all classes, except class Object
, must have a superclass. Thus, the class-file verifier checks at runtime some of the Java language rules that should have been enforced at compile-time. Because the verifier has no way of knowing if the class file was generated by a benevolent, bug-free compiler, it checks each class file to make sure the rules are followed.