For me, the highlight of the JavaOne Developer Conference in San Francisco last March was Dallas Semiconductor's iButton with Java -- aka the Java Ring, a wearable computer that ran Java. It allegedly had a high-performance encryption engine, an exciting prospect indeed, until I discovered that the encryption unit wasn't accessible on the ring. Dallas Semiconductor later confirmed that it couldn't be enabled at all, which really dampened my enthusiasm for the whole concept.
I am resourceful though, and since Dallas Semiconductor had promised that a fully functional Java Ring was going to be available eventually, all I needed to do was wait. And while I was waiting, what better to do but learn the ins and outs of programming my new piece of "smart" jewelry?
Of course this pursuit of knowledge did raise the question of what, exactly, I should program into my ring.
I pondered this question for a while, until I came up with what was, for me, the ideal solution: I would program my ring to simulate the most important piece of cryptographic equipment used in the second World War -- the German ENIGMA machine. The ENIGMA was used by German commanders to encrypt all of their important plans and orders to the field marshalls. But first I had to figure out what the heck an APDU was!
The Java Card applet model
The Java Ring is in fact a Java smart card, and the ring's virtual machine is based on the Java virtual machine (JVM) that was proposed as the Java Card 2.0 standard. I've seen some wonderful technical discussions on how to program these devices (see the Resources section for pointers), but, to be perfectly honest, the descriptions were quite opaque to me. I was looking for linkages between the JVM on the ring and the JVM on a PC and finding nothing beyond descriptions of a rather peculiar serial interface that connected them. I then realized that what really connected the ring to the "outside" world was not a serial port but a network protocol. Allow me to explain.
The Java Card architecture has taken client/server architectures to a new place -- one where the "server" is a small piece of software on an extremely small system, and the client is a potentially huge piece of software on a potentially much larger system. The network protocol is encapsulated in packets that are called application program data units, or APDUs for short.
Unlike packets in the TCP/IP world, these APDU packets don't carry any sort of addressing information. Instead, they are implicitly addressed to the computer on the other end of the serial link. However, like their big-brother packets in the TCP/IP world, APDUs do carry a few bytes that are common to all packets. These can be used by the smart card infrastructure to decide when to send the APDUs to the server on the smart card, and when to interpret them directly.
This understanding provided an answer to one of my first questions about the Java Ring: How come a broken applet doesn't make the ring unusable? The answer is that the smart card runtime code gets the first crack at decoding the APDUs as they arrive on the serial interface. Further, there are predefined APDUs that tell the runtime to select an applet, delete applets, load applets, and so on. Thus, errant applets are simply deleted by the developer once it's ascertained that they aren't responding correctly to the APDUs they receive.
The structure of a smart card applet then became clear: The applet sits in a virtual loop, like a network service waiting for packets to arrive on its network interface. The method that handles this loop is named process
and is one of four required smart card applet methods. The other three are install
, select
, and deselect
.
The install
method must instantiate a new object that represents the applet. Like the main
method in a Java application class, the install
method is static so that it can be invoked by the smart card JVM before an object exists (it's invoked directly from the class definition).
In the simplest sense, the select
and deselect
methods tell the applet to either "get ready" or "go to sleep." Actually, the select
method has the option of refusing to be selected. For example, a personal identification number (PIN) or some other authorizing information might need to be passed from the client program to the method (by way of the APDU) before the applet would allow itself to be selected.
The deselect
method, on the other hand, doesn't have the ability to deny being deselected -- it provides more of a courtesy to the applet, telling it that it won't be getting any more packets. When a smart card is yanked from the reader before the client is finished, the deselect
method isn't even called. The next thing the applet will see will be another invocation of select
.
Missing from all of this was, in my mind, support of Java objects. I had hoped that in the APDU one could just dump an object or two and have them show up in the ring. But object support isn't part of the package. Instead you get a byte array that you can copy data into, send, receive, and then copy the data back out of. Not as sophisticated as I might hope, but it's still more than enough for my little encryption engine.
Substitution ciphers and the ENIGMA machine
After determining exactly what an APDU was and was not, I began to design the Java classes that would simulate an ENIGMA machine. Of course I first needed to figure out exactly how an ENIGMA machine worked, but that wasn't too difficult with the World Wide Web at my fingertips.
I surfed around and found pictures, a decent description of how it worked, a Java applet that was a pretty cool emulation of the machine, and even a patent relating to the machine (see Resources for links to these items). I found the patent to be particularly interesting because it was filed in 1944 but it didn't issue until 1976, over thirty years later! Curiously, 1976 is the same year that the U.S. Government officially adopted the Data Encryption Standard (DES) as its official encryption technology of choice. My guess is that the ENIGMA patent was classified as top secret until 1976, when the government stopped using cryptography based on the ENIGMA technology.
The basic algorithm behind the ENIGMA is known as a substitution cipher. A substitution cipher takes each input letter and substitutes a different letter for it. The resulting text looks like gibberish unless you know the secret substitution table that was originally used to scramble the message. This fact makes the ENIGMA a particularly good choice for my ringlet -- the secret decoder rings that we used to get in cereal boxes and Cracker Jacks-brand caramel popcorn were also based on substitution ciphers.
There are two kinds of simple substitution ciphers, algorithmic and mapping.
Algorithmic substitution cipher
In an algorithmic substitution cipher the "key" to the cipher is the algorithm. For example, an algorithmic cipher that has the key "Add 3" would operate as follows:
For each letter in the input message, substitute the letter that is three more letters along in the alphabet
- If the letter is X, Y, or Z, then substitute the letters A, B, and C respectively
When the above cipher is used, the input message, HELLO BOB (this is called the message text) would be translated into the output message KHOOR ERE (called the cipher text). Unfortunately, algorithmic ciphers are pretty easy to decode even without the key, as there are very common letter patterns in the English language. Further, the "rules" associated with English spelling (such as the necessity of having a vowel in every word) and certain common single-character words make the letter relationships fairly apparent. To obscure the obvious relationship of the letters in the output message to the letters in the input message, the output message was often reformatted in a fixed pattern of character groups. Thus the output KHOOR ERE would be reformatted using four-letter groups to be KHOO RERE. The reformatting makes it less obvious where the words begin and end, but it means that the clerk decoding the message has to read it to figure out where to reinsert the spaces.
Another problem with algorithmic ciphers is their construction. The construction of an algorithmic substitution cipher requires that the key be just one algorithm. Once that algorithm is known for one letter, it's known for all letters. This can be used by an adept cryptanalyst who first counts all of the letters in the cipher text, notes the most commonly occurring letter, combines this with the knowledge that the letter E is the most commonly occurring letter in the English language, and then figures out through deduction what the relationship is between the most common character in the cipher text and the letter E. In short order, the algorithm is known and the cipher is broken. The message is then easily decoded.
Mapping substitution cipher
This brings me to the other type of substitution cipher, the mapping cipher. In a mapping cipher, a fixed relationship exists between input characters and output characters, but that relationship is held by a translation table rather than by using an algorithm. For example, let us assume your alphabet consisted of exactly seven letters -- A, B, C, D, E, F, and G -- and you set up a translation table as shown below.
Input character | A | B | C | D | E | F | G |
---|---|---|---|---|---|---|---|
Output character | B | G | E | D | F | C | A |
Table 1: Character substitution map |
If you were to use the message text CAFE BABE and substitute the letters from the table, you would get the cipher text EBCF GBGF. This doesn't look all that different from the algorithmic cipher -- until the cryptanalyst tries to break the code. The advantage of using the table is that while the analyst could find the relationship between the message text letter E and the cipher text letter F using the same statistical method that I described above, that information wouldn't tell her the relationship between the message text letter B and the cipher text letter G. This property, the independence of the relationship between the letters of the message and the encrypted message, makes the mapping cipher stronger -- in the sense that it's harder to break than the algorithmic cipher.
Of course, neither form of these substitution ciphers is particularly strong in the face of a determined attack. And that brings us to the ENIGMA.
How the ENIGMA machine works
The user interface for the original ENIGMA machine consists of a keyboard with 26 keys inscribed with the 26 anglo-roman letters A through Z, and 26 lightbulbs behind transparent windows that are also inscribed with the letters A through Z. The ENIGMA operator presses a key, the ENIGMA machine performs its substitution, and a lightbulb under one of the characters illuminates. The code clerk then copies down the letter associated with the lightbulb and the next letter of the message is "typed." The process is repeated until all characters in the input message have been processed. The magic, of course, goes on inside the machine, which is where the rotors described in the patent come into play.
A rotor is simply a character-substitution map built into hardware. On one side of the rotor there are 26 contacts representing the 26 letters of the alphabet, and those 26 contacts are wired to 26 different contacts on the other side of the rotor. If the lamps had been connected to the 26 contacts on the other side of the rotor, that configuration would have been an implementation of a simple mapping cipher. However, instead of the lamps, the rotor's output connectors are connected to another rotor. The second rotor provides a second mapping. Then, that rotor is connected to yet a third rotor, providing three levels of substitution.
The third rotor in the series is connected to a special fourth rotor called the reflector. The reflector has two properties that make it unlike the other rotors in the system. First, it doesn't rotate (and I'll get to the rotation bit in a minute), and second, its substitution map is symmetric. A symmetric map is one in which the substitutions work bi-directionally. For example, in Table 1 above, the letter A maps to the letter B; however, the letter B does not map to the letter A. Thus, the mapping in Table 1 is asymmetric. In Table 2, shown below, I've created a symmetric mapping.
Input character | A | B | C | D | E | F | G | H |
---|---|---|---|---|---|---|---|---|
Output character | B | A | E | G | C | H | D | F |
Table 2: Symmetric character substitution map |
If you look closely at Table 1, you'll notice that I also extended the alphabet in Table 2 by one character, to make the total number of characters even at 8. This extension was necessary because with an odd number of characters, one of the mappings won't be unique (the character will map to itself).
In the case of the ENIGMA, the reflector maps the incoming character to another character, and sends the signal back out to the three rotors -- this time traveling in the opposite direction. I drew a picture of this to illustrate what is happening, shown below in Figure 1.