Note that the JVM offers very little support for Boolean values. The Java compiler transforms them into 32-bit values with 1 representing true and 0 representing false.
Character
The character type describes character values (for instance, the uppercase letter A, the digit 7, and the asterisk [*] symbol) in terms of their assigned Unicode numbers. (As an example, 65 is the Unicode number for the uppercase letter A.) Character values are represented in memory as 16-bit unsigned integer values. Operations performed on characters include classification, for instance classifying whether a given character is a digit.
Extending the Unicode standard beyond 16 bits (to accommodate more writing systems, such as Egyptian hieroglyphs) somewhat complicated the character type. It now describes Basic Multilingual Plane (BMP) code points, including the surrogate code points, or code units of the UTF-16 encoding. If you want to learn about the BMP, code points, and code units, study the Character class's Java API documentation. For the most part, however, you can simply think of the character type as accommodating character values.
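The following sketch illustrates these points using the standard Character and String classes. The CharDemo class and its codeOf() method are names invented for this illustration.

```java
class CharDemo
{
   // Returns the Unicode number (UTF-16 code unit value) of a character.
   static int codeOf(char ch)
   {
      return ch; // a char widens to an int without a cast
   }

   public static void main(String[] args)
   {
      System.out.println(codeOf('A'));               // Output: 65
      System.out.println(Character.isDigit('7'));    // Output: true
      System.out.println(Character.isDigit('*'));    // Output: false
      System.out.println((int) Character.MAX_VALUE); // Output: 65535

      // A character outside the BMP (here an Egyptian hieroglyph, code point
      // 0x13000) occupies two char values, known as a surrogate pair.
      String hieroglyph = new String(Character.toChars(0x13000));
      System.out.println(hieroglyph.length());       // Output: 2
      System.out.println(hieroglyph.codePointCount(0, hieroglyph.length())); // Output: 1
   }
}
```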
Integer types
Java supports four integer types for space and precision reasons: byte integer, short integer, integer, and long integer. Arrays based on shorter integers don't consume as much space. Calculations involving longer integers give you greater precision. Unlike the unsigned character type, the integer types are signed.
The byte integer type describes integers that are represented in 8 bits; it can accommodate integer values ranging from -128 through 127. As with the other integer types, byte integers are stored as two's-complement values. To negate a value in two's complement, all the bits are flipped, from one to zero and from zero to one, and then the number one is added to the result. The leftmost bit is referred to as the sign bit (0 for nonnegative values and 1 for negative values), and all other bits encode the number's magnitude (directly for nonnegative values and in complemented form for negative values). This representation is illustrated in Figure 2.
Figure 2. The internal representation of positive and negative 8-bit integers consists of sign and magnitude
Byte integers are most useful for storing small values in an array. The compiler generates bytecode to convert a byte integer value to an integer value before performing a mathematical operation such as addition. Java's byte reserved word identifies the byte integer type in source code.
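As a minimal sketch of two's-complement negation and of the conversion to integer that precedes arithmetic, consider the following. The ByteDemo class and its negate() and bits() helper methods are hypothetical names chosen for this example.

```java
class ByteDemo
{
   // Negates a value the two's-complement way: flip every bit, then add one.
   static byte negate(byte value)
   {
      return (byte) (~value + 1);
   }

   // Formats the 8 bits of a byte as a string of zeros and ones.
   static String bits(byte value)
   {
      String s = Integer.toBinaryString(value & 0xFF);
      return String.format("%8s", s).replace(' ', '0');
   }

   public static void main(String[] args)
   {
      byte five = 5;
      System.out.println(bits(five));         // Output: 00000101
      System.out.println(bits(negate(five))); // Output: 11111011
      System.out.println(negate(five));       // Output: -5

      byte a = 100, b = 27;
      int sum = a + b;                    // both operands are converted to int first
      byte narrowed = (byte) (a + b + 1); // 128 does not fit in a byte; the cast wraps it
      System.out.println(sum);            // Output: 127
      System.out.println(narrowed);       // Output: -128
   }
}
```

Because the addition is performed on int values, storing the result back into a byte variable requires an explicit cast.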
The short integer type describes integers that are represented in 16 bits; it can accommodate integer values ranging from -32,768 to 32,767. It possesses the same internal representation as byte integer, but with more bits to accommodate its larger magnitude. The compiler generates bytecode to convert a short integer value to an integer value before performing a mathematical operation. Java's short reserved word identifies the short integer type in source code.
The integer type describes integers that are represented in 32 bits; it can accommodate integer values ranging from -2,147,483,648 to 2,147,483,647. It possesses the same internal representation as byte integer and short integer, but with more bits to accommodate its larger magnitude. Java's int reserved word identifies the integer type in source code.
The long integer type describes integers that are represented in 64 bits; it can accommodate integer values ranging from -2⁶³ to 2⁶³-1. It possesses the same internal representation as byte integer, short integer, and integer, but with more bits to accommodate its larger magnitude. Java's long reserved word identifies the long integer type in source code.
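The ranges described above are available as constants in the standard Byte, Short, Integer, and Long classes. The following sketch prints them and shows what happens when a calculation exceeds the integer type's range. The RangeDemo class and its two increment methods are hypothetical names used only for illustration.

```java
class RangeDemo
{
   // Adds one using int arithmetic; the result wraps around on overflow.
   static int incrementAsInt(int value)
   {
      return value + 1;
   }

   // Adds one using long arithmetic, which has room for the larger result.
   static long incrementAsLong(int value)
   {
      return value + 1L;
   }

   public static void main(String[] args)
   {
      System.out.println(Byte.MIN_VALUE + " to " + Byte.MAX_VALUE);       // Output: -128 to 127
      System.out.println(Short.MIN_VALUE + " to " + Short.MAX_VALUE);     // Output: -32768 to 32767
      System.out.println(Integer.MIN_VALUE + " to " + Integer.MAX_VALUE); // Output: -2147483648 to 2147483647
      System.out.println(Long.MIN_VALUE + " to " + Long.MAX_VALUE);       // Output: -9223372036854775808 to 9223372036854775807

      System.out.println(incrementAsInt(Integer.MAX_VALUE));  // Output: -2147483648
      System.out.println(incrementAsLong(Integer.MAX_VALUE)); // Output: 2147483648
   }
}
```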
Floating-point types
Java supports two floating-point types for space and precision reasons. The smaller type is useful in an array context, but cannot accommodate as large a range of values. Although it occupies more space in an array context, the larger type can accommodate a greater range.
The floating-point type describes floating-point values that are represented in 32 bits; it can accommodate floating-point values ranging from approximately +/-1.18x10⁻³⁸ to approximately +/-3.4x10³⁸. It is represented in IEEE 754 format in which the leftmost bit is the sign bit (0 for positive and 1 for negative), the next eight bits hold the exponent, and the final 23 bits hold the mantissa, resulting in about 6-9 decimal digits of precision. Java's float reserved word identifies the floating-point type in source code.
The double precision floating-point type describes floating-point values that are represented in 64 bits; it can accommodate floating-point values ranging from approximately +/-2.23x10⁻³⁰⁸ to approximately +/-1.8x10³⁰⁸. It is represented in IEEE 754 format in which the leftmost bit is the sign bit (0 for positive and 1 for negative), the next 11 bits hold the exponent, and the final 52 bits hold the mantissa, resulting in about 15-17 decimal digits of precision. Java's double reserved word identifies the double precision floating-point type in source code.
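To make the 32-bit layout concrete, the following sketch splits a floating-point value into its sign, exponent, and mantissa fields by way of the standard Float.floatToIntBits() method. The FloatBitsDemo class and its three helper methods are hypothetical names. Note that the exponent field is stored with a bias of 127.

```java
class FloatBitsDemo
{
   // Leftmost bit: 0 for positive, 1 for negative.
   static int sign(float f)
   {
      return Float.floatToIntBits(f) >>> 31;
   }

   // Next eight bits: the exponent, stored with a bias of 127.
   static int exponent(float f)
   {
      return (Float.floatToIntBits(f) >>> 23) & 0xFF;
   }

   // Final 23 bits: the mantissa (the fraction after the implied leading 1).
   static int mantissa(float f)
   {
      return Float.floatToIntBits(f) & 0x7FFFFF;
   }

   public static void main(String[] args)
   {
      float value = -6.5f; // -6.5 equals -1.625 x 2 to the power 2
      System.out.println(sign(value));                         // Output: 1
      System.out.println(exponent(value));                     // Output: 129 (2 + 127)
      System.out.println(Integer.toHexString(mantissa(value))); // Output: 500000

      // The approximate limits of each type are available as constants.
      System.out.println(Float.MIN_NORMAL + " to " + Float.MAX_VALUE);
      System.out.println(Double.MIN_NORMAL + " to " + Double.MAX_VALUE);
   }
}
```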
Reference types
A reference type is a type from which objects are created or referenced, where a reference is some kind of pointer to the object. (A reference could be an actual memory address, an index into a table of memory addresses, or something else.) Reference types are also known as user-defined types because they are typically created by language users.
Java developers use the class feature to create reference types. A class is either a placeholder for an application's main() method (see the HelloWorld application in "Learn Java from the ground up") or various static methods, or it's a template for manufacturing objects, as demonstrated below.
class Cat
{
   String name; // String is a special reference type for describing strings

   Cat(String catName)
   {
      name = catName;
   }

   String name()
   {
      return name;
   }
}
This class declaration introduces a Cat class for describing felines. Its name field stores the cat's name as a string, its constructor initializes this data member to a cat name, and its name() method returns the cat's name. The following code snippet, which would be located in a main() method, shows how to manufacture a cat and obtain its name:
Cat cat = new Cat("Garfield");
System.out.println(cat.name()); // Output: Garfield
Referencing objects with interfaces
Java's interface feature lets you reference an object without concern for the object's class type. As long as the object's class implements the interface, the object is also considered to be a member of the interface type.
The following example declares a Shape interface along with Circle and Rectangle classes:
interface Shape
{
   void draw();
}

class Circle implements Shape
{
   // Interface methods are implicitly public, so each implementation must be public.
   public void draw()
   {
      System.out.println("I am a circle.");
   }
}

class Rectangle implements Shape
{
   public void draw()
   {
      System.out.println("I am a rectangle.");
   }
}
The next example instantiates Circle and Rectangle, assigns their references to a Shape variable, and asks them to draw themselves:
Shape shape = new Circle();
shape.draw(); // Output: I am a circle.
shape = new Rectangle();
shape.draw(); // Output: I am a rectangle.
You can use interfaces to abstract commonality from a set of otherwise dissimilar classes. As an example, an Inventory interface would extract commonality from Goldfish, Car, and Hammer classes, because each of these items can be inventoried. Interfaces offer considerable power when combined with arrays and loops, which you'll learn about later in this series.
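One way the Inventory idea might look in code is sketched below. The label() method, the InventoryDemo class, and its record() method are assumptions made for this illustration; the article names only the interface and the classes. The Car class is omitted for brevity.

```java
interface Inventory
{
   String label(); // hypothetical method: a short name for the inventoried item
}

class Goldfish implements Inventory
{
   public String label()
   {
      return "goldfish";
   }
}

class Hammer implements Inventory
{
   public String label()
   {
      return "hammer";
   }
}

class InventoryDemo
{
   // Accepts any object whose class implements Inventory, regardless of its class type.
   static String record(Inventory item)
   {
      return "Inventoried: " + item.label();
   }

   public static void main(String[] args)
   {
      System.out.println(record(new Goldfish())); // Output: Inventoried: goldfish
      System.out.println(record(new Hammer()));   // Output: Inventoried: hammer
   }
}
```

The record() method never mentions Goldfish or Hammer; it depends only on what the two classes have in common.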
Array types
Array is the last of our three types. An array type is a special reference type that denotes an array, which is a region of memory that stores values in slots that are of equal size and are (typically) contiguous. These values are commonly referred to as elements. The array type is composed of the element type (a primitive type or a reference type) and one or more pairs of square brackets that indicate the number of dimensions (extents) occupied by the array. A single pair of brackets signifies a one-dimensional array (a vector); two pairs of brackets signify a two-dimensional array (a table); three pairs of brackets signify a one-dimensional array of two-dimensional arrays (a vector of tables); and so on. For example, int[] signifies a one-dimensional array (with int as the element type), and String[][] signifies a two-dimensional array (with String as the element type).
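The following sketch declares variables of several array types and counts the bracket pairs in each type. The ArrayTypeDemo class and its dimensions() method are hypothetical names, and the array sizes are arbitrary.

```java
class ArrayTypeDemo
{
   // Counts the pairs of square brackets in an array type.
   static int dimensions(Class<?> type)
   {
      int count = 0;
      while (type.isArray())
      {
         count++;
         type = type.getComponentType();
      }
      return count;
   }

   public static void main(String[] args)
   {
      int[] vector = new int[3];                      // one-dimensional array
      String[][] table = new String[2][4];            // two rows of four columns
      double[][][] tables = new double[5][2][4];      // a vector of five tables

      System.out.println(vector.getClass().getSimpleName()); // Output: int[]
      System.out.println(table.getClass().getSimpleName());  // Output: String[][]
      System.out.println(dimensions(tables.getClass()));     // Output: 3

      System.out.println(vector.length);   // Output: 3
      System.out.println(table.length);    // Output: 2
      System.out.println(table[0].length); // Output: 4
   }
}
```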
Literals: Specifying values in your Java code
Java provides the literals language feature for embedding values in source code. A literal is a value's character representation. Each primitive type is associated with its own set of literals, as follows.
The Boolean primitive type is associated with the literals true and false.
The character primitive type is associated with character literals, which often consist of a single character placed between single quotes, as in capital letter A ('A'). Alternatively, you could specify an escape sequence or a Unicode escape sequence. Consider these options:
- An escape sequence is a representation for a character that cannot be expressed literally in a character literal or a string literal. An escape sequence begins with a backslash character (\) and is followed by one of \, ', ", b, f, n, r, or t. You must always escape a backslash that's to be expressed literally to inform the compiler that it isn't introducing an escape sequence. You must always escape a single quote expressed literally in a character literal to inform the compiler that the single quote isn't ending the character literal. Similarly, you must always escape a double quote expressed literally in a string literal to inform the compiler that the double quote isn't ending the string literal. The other escape sequences are for characters with no symbolic representation: \b represents a backspace, \f represents a form feed, \n represents a new-line, \r represents a carriage return, and \t represents a horizontal tab. Escape sequences appear between single quotes in a character literal context (e.g., '\n').
- A Unicode escape sequence is a representation for an arbitrary Unicode character. It consists of a \u prefix immediately followed by four hexadecimal digits. For example, \u0041 represents capital letter A, and \u3043 represents a Hiragana letter. Unicode escape sequences appear between single quotes in a character literal context (e.g., '\u3043').
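The character literal forms described above can be sketched as follows. The CharLiteralDemo class and its codeOf() method are hypothetical names introduced for this example.

```java
class CharLiteralDemo
{
   // Returns the Unicode number of a character.
   static int codeOf(char ch)
   {
      return ch;
   }

   public static void main(String[] args)
   {
      char letter = 'A';           // a single character between single quotes
      char sameLetter = '\u0041';  // the same character as a Unicode escape sequence
      char hiragana = '\u3043';    // a Hiragana letter
      char quote = '\'';           // an escaped single quote
      char backslash = '\\';       // an escaped backslash
      char tab = '\t';             // horizontal tab
      char newline = '\n';         // new-line

      System.out.println(letter == sameLetter);                  // Output: true
      System.out.println(Integer.toHexString(codeOf(hiragana))); // Output: 3043
      System.out.println(codeOf(quote));                         // Output: 39
      System.out.println(codeOf(backslash));                     // Output: 92
      System.out.println(codeOf(tab));                           // Output: 9
      System.out.println(codeOf(newline));                       // Output: 10

      // In a string literal, it is the double quote that must be escaped.
      System.out.println("She said \"hi\"");                     // Output: She said "hi"
   }
}
```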
The integer types are associated with literals consisting of sequences of digits, with optionally embedded underscore characters. By default, an integer literal is assigned the integer (int) type. You must suffix the literal with capital letter L (or lowercase letter l, which might be confused with digit 1) to represent a long integer value. Integer literals can be specified in binary, decimal, hexadecimal, and octal formats:
- Binary consists of the digits zero and one and is prefixed with 0b or 0B. Example: 0b01111010.
- Decimal consists of the digits zero through nine and has no prefix. Example: 2200.
- Hexadecimal consists of the digits zero through nine, lowercase letters a through f, and uppercase letters A through F. This literal is prefixed with 0x or 0X. Example: 0xAF.
- Octal consists of the digits zero through seven and is prefixed with 0. Example: 077.
To improve legibility, you can insert underscore characters between digits; for example, 1234_5678_9012_3456L. You cannot specify a leading underscore, as in _1234, because the compiler would assume that an identifier was being specified. You also cannot specify a trailing underscore.
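The following sketch gathers the example literals from this section and prints their decimal values. The IntLiteralDemo class and its constant names are hypothetical.

```java
class IntLiteralDemo
{
   static final int BINARY = 0b01111010;             // binary, prefixed with 0b
   static final int DECIMAL = 2200;                  // decimal, no prefix
   static final int HEX = 0xAF;                      // hexadecimal, prefixed with 0x
   static final int OCTAL = 077;                     // octal, prefixed with 0
   static final long GROUPED = 1234_5678_9012_3456L; // long integer with underscores

   public static void main(String[] args)
   {
      System.out.println(BINARY);  // Output: 122
      System.out.println(DECIMAL); // Output: 2200
      System.out.println(HEX);     // Output: 175
      System.out.println(OCTAL);   // Output: 63
      System.out.println(GROUPED); // Output: 1234567890123456
   }
}
```

The leading zero in 077 is significant: it makes the literal octal (decimal 63) rather than decimal 77.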