Using floating-point data types can cause some odd behavior in certain situations; see the sidebar, "When All is Not Equal," for more information.
Read-only objects
If an object is read-only (immutable), then the hash code can be computed ahead of time. When the object is created, all of the values will be passed in via a constructor, and the hash code can be calculated on that data. The hashCode() method can then just return the pre-computed value:
    public class Point {
        private final int x, y;
        private final int hashCode;

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
            // Computed once; x and y are final, so the value never goes stale.
            this.hashCode = x*31 ^ y*37;
        }

        public boolean equals(Object other) { ... }

        public int hashCode() {
            return hashCode;
        }
    }
Of course, care must be taken to ensure that the variables are not changed after their initialization; the final keyword helps to ensure that this cannot occur.
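As a minimal runnable sketch of the same idea, the class below caches its hash code in the constructor and is then used as a HashSet element. The names CachedPoint and CachedPointDemo are hypothetical, and the equals() body is an assumption (the listing above elides it); it follows the getClass() pattern used in the final listings of this article.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical immutable class; not part of the original article.
final class CachedPoint {
    private final int x, y;
    private final int hashCode; // computed once in the constructor

    CachedPoint(int x, int y) {
        this.x = x;
        this.y = y;
        this.hashCode = x * 31 ^ y * 37;
    }

    @Override
    public boolean equals(Object other) {
        // Assumed implementation: the article's listing elides this body.
        if (other == this) return true;
        if (other == null || getClass() != other.getClass()) return false;
        CachedPoint point = (CachedPoint) other;
        return x == point.x && y == point.y;
    }

    @Override
    public int hashCode() {
        return hashCode; // no recomputation on each call
    }
}

class CachedPointDemo {
    public static void main(String[] args) {
        Set<CachedPoint> points = new HashSet<>();
        points.add(new CachedPoint(1, 2));
        // A separately constructed but equal point is found in the set.
        System.out.println(points.contains(new CachedPoint(1, 2))); // true
    }
}
```

Because every field is final, there is no code path that could leave the cached value out of step with x and y.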
Using instanceof (or not)
In his otherwise excellent book Effective Java, Joshua Bloch recommends using instanceof as the test to determine the type of an object in equals(). Whilst on the surface this may seem a good idea, it has a fatal flaw: an equals() built on instanceof is not symmetric once subclasses are involved. Bloch's book recommends:
    public class BadPoint {
        private int x, y;

        public BadPoint(int x, int y) {
            this.x = x;
            this.y = y;
        }

        public boolean equals(Object other) {
            if (other == this) return true;
            if (!(other instanceof BadPoint)) return false; // BAD
            BadPoint point = (BadPoint)other;
            return (x == point.x && y == point.y);
        }

        public int hashCode() {
            return x + y;
        }
    }
Unfortunately, because this code is shorter (it combines the test for other == null and the other.getClass() comparison into a single statement, since null instanceof X is always false) and because it was recommended in Effective Java, it has become ingrained in many Java developers' style. This has become one of the most controversial points in the book, with many ongoing discussions; see the interview at Artima for more.
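The null-handling shortcut mentioned above can be checked with a small sketch. The class and method names here are hypothetical, and String is used only because it is a final class, so the two tests agree for every input.

```java
class NullCheckDemo {
    // instanceof form: a null argument simply fails the test.
    static boolean isStringByInstanceof(Object other) {
        return other instanceof String;
    }

    // getClass() form: null must be rejected first, otherwise
    // other.getClass() would throw a NullPointerException.
    static boolean isStringByGetClass(Object other) {
        return other != null && other.getClass() == String.class;
    }

    public static void main(String[] args) {
        System.out.println(isStringByInstanceof(null)); // false
        System.out.println(isStringByGetClass(null));   // false
        System.out.println(isStringByInstanceof("a"));  // true
        System.out.println(isStringByGetClass("a"));    // true
    }
}
```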
The big problem with using instanceof is that it isn't symmetric. The problem surfaces when creating subclasses:
    public class BadPoint3D extends BadPoint {
        private int z;

        public BadPoint3D(int x, int y, int z) {
            super(x, y);
            this.z = z;
        }

        public boolean equals(Object other) {
            if (!super.equals(other)) return false;
            if (!(other instanceof BadPoint3D)) return false; // BAD
            BadPoint3D point = (BadPoint3D)other;
            return (z == point.z);
        }
    }
The problem is that, given instances point (a BadPoint) and point3D (a BadPoint3D), we have point instanceof BadPoint3D == false, but point3D instanceof BadPoint == true.
This example shows a point in 3 dimensions, subclassing the 2D point we've already seen. By using the instanceof implementation, we can break symmetry:
    BadPoint p1 = new BadPoint(1,1);
    BadPoint p2 = new BadPoint3D(1,1,1);
    // p1.equals(p2) == true;  // incorrect
    // p2.equals(p1) == false; // correct
The assumption that all subclasses of BadPoint are equal if they have the same x and y values causes this problem. Using instanceof hides the fact that we no longer compare like-for-like; instead, we compare like-for-subtype.
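The difference between the two type tests can be seen in isolation, without any equals() logic. The sketch below uses two hypothetical empty classes, Base2D and Derived3D, standing in for the 2D and 3D points.

```java
// Hypothetical stand-ins for a 2D point and its 3D subclass.
class Base2D {}
class Derived3D extends Base2D {}

class TypeTestDemo {
    public static void main(String[] args) {
        Base2D point = new Base2D();
        Base2D point3D = new Derived3D();

        // instanceof depends on which object is on the left-hand side.
        System.out.println(point instanceof Derived3D); // false
        System.out.println(point3D instanceof Base2D);  // true

        // A getClass() comparison gives the same answer in either order.
        System.out.println(point.getClass() == point3D.getClass()); // false
        System.out.println(point3D.getClass() == point.getClass()); // false
    }
}
```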
Using the getClass() implementation gives the correct answer of false in both cases. In general, it isn't possible to compare a 3D point with a 2D point, and this is the key factor in the test. Using the getClass() implementation ensures that you only ever compare like with like.
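A compact sketch of the getClass() approach follows. The names GoodPoint and GoodPoint3D are hypothetical and chosen only to mirror the BadPoint classes; the equals() bodies follow the pattern used in the full listings at the end of this article.

```java
// Hypothetical counterparts to BadPoint and BadPoint3D.
class GoodPoint {
    private final int x, y;

    GoodPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (other == this) return true;
        if (other == null) return false;
        if (getClass() != other.getClass()) return false; // like-for-like only
        GoodPoint point = (GoodPoint) other;
        return x == point.x && y == point.y;
    }

    @Override
    public int hashCode() {
        return x ^ y;
    }
}

class GoodPoint3D extends GoodPoint {
    private final int z;

    GoodPoint3D(int x, int y, int z) {
        super(x, y);
        this.z = z;
    }

    @Override
    public boolean equals(Object other) {
        // super.equals() has already verified that other is a GoodPoint3D.
        if (!super.equals(other)) return false;
        GoodPoint3D point = (GoodPoint3D) other;
        return z == point.z;
    }

    @Override
    public int hashCode() {
        return super.hashCode() ^ z;
    }
}

class GetClassDemo {
    public static void main(String[] args) {
        GoodPoint p1 = new GoodPoint(1, 1);
        GoodPoint p2 = new GoodPoint3D(1, 1, 1);
        System.out.println(p1.equals(p2)); // false
        System.out.println(p2.equals(p1)); // false
        System.out.println(p2.equals(new GoodPoint3D(1, 1, 1))); // true
    }
}
```

Because getClass() is called on this, the subclass inherits a type check that already refers to its own class, which is why the cast in GoodPoint3D.equals() is safe.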
Note: The weak argument that hiding subtype information is desirable is based on the assumption that you would want to create a subclass of the data type for the sole purpose of adding or changing some methods (but not data) of the superclass. Whilst this happens in classes like Applet, this is a false argument; an Applet is not a data object, it's a code object. In fact, you never need to create a subclass of a data structure for the purpose of embellishing its methods; this is a common mistake by those new to object orientation, referred to as the "is-a/has-a" argument. In this case, you're not creating a subtype of the data structure; you just want to add to or modify its behavior. The correct solution is to write a separate class that delegates to the contained object. The rest of that discussion reaches beyond this article's scope.
Practical problems with symmetry
Does breaking the equals() method's symmetry matter in practice? Well, as noted earlier, equals() is used in many of the low-level libraries that make up the collections classes, and they depend implicitly on this behavior. If we use the instanceof variant, many odd behaviors can occur during use:
    Set<BadPoint> data = new HashSet<>();
    data.add(new BadPoint3D(1,1,1));
    data.add(new BadPoint(1,1));
    data.add(new BadPoint3D(1,1,2));
    data.add(new BadPoint3D(1,2,3));
    data.size();                      // gives 3, not 4
    data.contains(new BadPoint(1,2)); // returns true, not false
Of course, this behavior occurs when BadPoint3D does not define its own hashCode(). However, it is not required to define one; because BadPoint already overrides the default hashCode(), the inherited method fulfils the hashCode() method's contractual obligations.
Further, the order in which the BadPoint and BadPoint3D instances are added determines which ones end up in the final set. So although the broken symmetry may appear trivial, it can actually cause some deep-rooted erroneous behavior. You can see this from the samples in Resources.
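The order dependence can be reproduced with a short sketch. The BadPoint and BadPoint3D classes are repeated here so the example is self-contained; the constructors are an assumption inferred from how the classes are used in this article, and InsertionOrderDemo is a hypothetical name. The sizes shown assume a HashSet that tests the newly added element against the existing one by calling equals() on the new element, as the OpenJDK implementation does.

```java
import java.util.HashSet;
import java.util.Set;

class BadPoint {
    private final int x, y;

    BadPoint(int x, int y) { // assumed constructor
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (other == this) return true;
        if (!(other instanceof BadPoint)) return false; // BAD
        BadPoint point = (BadPoint) other;
        return x == point.x && y == point.y;
    }

    @Override
    public int hashCode() {
        return x + y;
    }
}

class BadPoint3D extends BadPoint {
    private final int z;

    BadPoint3D(int x, int y, int z) { // assumed constructor
        super(x, y);
        this.z = z;
    }

    @Override
    public boolean equals(Object other) {
        if (!super.equals(other)) return false;
        if (!(other instanceof BadPoint3D)) return false; // BAD
        BadPoint3D point = (BadPoint3D) other;
        return z == point.z;
    }
    // hashCode() is deliberately inherited from BadPoint.
}

class InsertionOrderDemo {
    static int sizeAfterAdding(BadPoint first, BadPoint second) {
        Set<BadPoint> data = new HashSet<>();
        data.add(first);
        data.add(second);
        return data.size();
    }

    public static void main(String[] args) {
        // 3D point first: the 2D point claims equality and is rejected.
        System.out.println(sizeAfterAdding(new BadPoint3D(1, 1, 1), new BadPoint(1, 1))); // 1
        // 2D point first: the 3D point denies equality and is accepted.
        System.out.println(sizeAfterAdding(new BadPoint(1, 1), new BadPoint3D(1, 1, 1))); // 2
    }
}
```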
Total violation
In fact, it's not just symmetry that's broken in the comparisons. If the subclass also defines a hashCode() method, then you can end up with a situation in which two objects are equal according to equals() but give different hash codes: a total violation of the equality contract. Instead of just breaking one method, using instanceof can actually break two of them:
    public class BadPoint3D extends BadPoint {
        private int z;

        public BadPoint3D(int x, int y, int z) {
            super(x, y);
            this.z = z;
        }

        public boolean equals(Object other) {
            if (!super.equals(other)) return false;
            if (!(other instanceof BadPoint3D)) return false; // BAD
            BadPoint3D point = (BadPoint3D)other;
            return (z == point.z);
        }

        public int hashCode() {
            return super.hashCode() + z;
        }
    }

    // BadPoint p1 = new BadPoint(1,2);
    // BadPoint p2 = new BadPoint3D(1,2,3);
    // p1.equals(p2) == true
    // p1.hashCode() == 3;
    // p2.hashCode() == 6;
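Both violations can be detected mechanically. The sketch below is one possible check, not an established utility: EqualsContractCheck and its violations() method are hypothetical, and the BadPoint classes are repeated (with assumed constructors) so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

class BadPoint {
    private final int x, y;

    BadPoint(int x, int y) { // assumed constructor
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (other == this) return true;
        if (!(other instanceof BadPoint)) return false; // BAD
        BadPoint point = (BadPoint) other;
        return x == point.x && y == point.y;
    }

    @Override
    public int hashCode() {
        return x + y;
    }
}

class BadPoint3D extends BadPoint {
    private final int z;

    BadPoint3D(int x, int y, int z) { // assumed constructor
        super(x, y);
        this.z = z;
    }

    @Override
    public boolean equals(Object other) {
        if (!super.equals(other)) return false;
        if (!(other instanceof BadPoint3D)) return false; // BAD
        BadPoint3D point = (BadPoint3D) other;
        return z == point.z;
    }

    @Override
    public int hashCode() {
        return super.hashCode() + z;
    }
}

// Hypothetical helper: reports which parts of the contract a pair breaks.
class EqualsContractCheck {
    static List<String> violations(Object a, Object b) {
        List<String> found = new ArrayList<>();
        boolean ab = a.equals(b);
        boolean ba = b.equals(a);
        if (ab != ba) found.add("symmetry");
        // Objects that compare equal in either direction must share a hash code.
        if ((ab || ba) && a.hashCode() != b.hashCode()) found.add("hashCode");
        return found;
    }

    public static void main(String[] args) {
        BadPoint p1 = new BadPoint(1, 2);
        BadPoint p2 = new BadPoint3D(1, 2, 3);
        System.out.println(violations(p1, p2));                 // [symmetry, hashCode]
        System.out.println(violations(p1, new BadPoint(1, 2))); // []
    }
}
```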
Summary
Writing an implementation of equals() and hashCode() is usually a case of following a sample and adapting it to the fields defined in the data object. A correct implementation can provide performance benefits with many collections and other low-level libraries.
The examples above also show that using instanceof in equals() is bad practice, because of the erroneous situations that can occur with subclasses and with any other objects that rely on the correct implementation of equals() and hashCode().
For completeness, the Point and Point3D classes and their full correct implementations are provided for comparison and convenience:
Point
    public class Point {
        // Static and transient fields take no part in equals() or hashCode().
        private static double version = 1.0;
        private transient double distance;
        private String name;
        private int x, y;

        public Point(String name, int x, int y) {
            this(x,y);
            this.name = name;
        }

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        public boolean equals(Object other) {
            if (other == this) return true;
            if (other == null) return false;
            if (getClass() != other.getClass()) return false;
            Point point = (Point)other;
            return (
                x == point.x &&
                y == point.y &&
                (name == point.name ||
                    (name != null && name.equals(point.name)))
            );
        }

        public int hashCode() {
            return x ^ y;
        }
    }
Point3D
    public class Point3D extends Point {
        private int z;

        public Point3D(String name, int x, int y, int z) {
            super(name,x,y);
            this.z = z;
        }

        public Point3D(int x, int y, int z) {
            super(x,y);
            this.z = z;
        }

        public boolean equals(Object other) {
            // super.equals() has already checked null and getClass().
            if (!super.equals(other)) return false;
            Point3D point = (Point3D)other;
            return (z == point.z);
        }

        public int hashCode() {
            return super.hashCode() ^ z;
        }
    }
This story, "Object equality," was originally published by JavaWorld.