Encapsulation is not information hiding

The principles of information hiding go beyond the Java language facility for encapsulation

Words are slippery. As Humpty Dumpty proclaimed in Lewis Carroll's Through the Looking Glass, "When I use a word, it means just what I choose it to mean -- neither more nor less." The common usage of the words encapsulation and information hiding certainly seems to follow that logic. Authors rarely distinguish between the two and often claim outright that they are the same.

Does that make it so? Not for me. Were it simply a matter of words, I wouldn't write another word on the matter. But there are two distinct concepts behind these terms, concepts engendered separately and best understood separately.

Encapsulation refers to the bundling of data with the methods that operate on that data. Often that definition is misconstrued to mean that the data is somehow hidden. In Java, you can have encapsulated data that is not hidden at all.

However, hiding data is not the full extent of information hiding. David Parnas introduced the concept of information hiding around 1972. He argued that the primary criterion for system modularization should be the hiding of critical design decisions. He stressed hiding "difficult design decisions or design decisions which are likely to change." Hiding information in that manner frees clients from needing intimate knowledge of the design to use a module, and isolates them from the effects of changing those decisions.

In this article, I explore the distinction between encapsulation and information hiding through the development of example code. The discussion shows how Java facilitates encapsulation and investigates the negative ramifications of encapsulation without data hiding. The examples also show how to improve class design through the principle of information hiding.

Position class

With a growing awareness of the wireless Internet's vast potential, many pundits expect location-based services to provide the opportunity for the first wireless killer app. For this article's sample code, I've chosen a class representing the geographical location of a point on the earth's surface. As a domain entity, the class, named Position, represents Global Positioning System (GPS) information. A first cut at the class looks as simple as:

public class Position
{
  public double latitude;
  public double longitude;
}

The class contains two data items: GPS latitude and longitude. At present, Position is nothing more than a small bag of data. Nonetheless, Position is a class, and Position objects may be instantiated using the class. To utilize those objects, class PositionUtility contains methods for calculating the distance and heading -- that is, direction -- between specified Position objects:

public class PositionUtility
{
  public static double distance( Position position1, Position position2 )
  {
    // Calculate and return the distance between the specified positions.
  }
  public static double heading( Position position1, Position position2 )
  {
    // Calculate and return the heading from position1 to position2.
  }
}

I omit the actual implementation code for the distance and heading calculations.
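For readers who want something concrete behind those two method bodies, the following is a minimal sketch and not necessarily the code that produced the sample output below. It assumes a spherical earth, uses the haversine formula for the central angle between two points, and uses the standard initial-bearing formula for the heading. The class name GreatCircleSketch and its method signatures are hypothetical. The first method returns an angle in radians rather than a distance, so a caller would still have to multiply by an earth radius in whatever unit it wants.

```java
// Hypothetical helper, not part of the article's Position or PositionUtility.
final class GreatCircleSketch {

    // Central angle in radians between two points given in degrees,
    // computed with the haversine formula on a spherical earth.
    static double centralAngle(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinHalfDPhi = Math.sin(Math.toRadians(lat2 - lat1) / 2);
        double sinHalfDLambda = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        double a = sinHalfDPhi * sinHalfDPhi
                 + Math.cos(phi1) * Math.cos(phi2) * sinHalfDLambda * sinHalfDLambda;
        // Clamp guards against rounding pushing sqrt(a) slightly above 1.
        return 2 * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    // Initial heading in degrees from point 1 to point 2,
    // measured clockwise from north and reported in [0, 360).
    static double heading(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dLambda = Math.toRadians(lon2 - lon1);
        double y = Math.sin(dLambda) * Math.cos(phi2);
        double x = Math.cos(phi1) * Math.sin(phi2)
                 - Math.sin(phi1) * Math.cos(phi2) * Math.cos(dLambda);
        double degrees = Math.toDegrees(Math.atan2(y, x)); // in (-180, 180]
        return (degrees + 360.0) % 360.0;
    }
}
```

Note that the argument order matters for heading: swapping the two points yields the return heading rather than the outbound one.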

The following code represents a typical use of Position and PositionUtility:

// Create a Position representing my house
Position myHouse = new Position();
myHouse.latitude =    36.538611;
myHouse.longitude = -121.797500;
// Create a Position representing a local coffee shop
Position coffeeShop = new Position();
coffeeShop.latitude =    36.539722;
coffeeShop.longitude = -121.907222;
// Use a PositionUtility to calculate distance and heading from my house
// to the local coffee shop.
double distance = PositionUtility.distance( myHouse, coffeeShop );
double heading  = PositionUtility.heading(  myHouse, coffeeShop );
// Print results
System.out.println
  ( "From my house at (" +
    myHouse.latitude + ", " + myHouse.longitude +
    ") to the coffee shop at (" +
    coffeeShop.latitude + ", " + coffeeShop.longitude +
    ") is a distance of " + distance +
    " at a heading of " + heading + " degrees."
  );

The code generates the output below, which indicates that the coffee shop is almost due west (270.8 degrees) of my house at a distance of 6.09. Later discussion addresses the lack of distance units.

    ===================================================================
    From my house at (36.538611, -121.7975) to the coffee shop at
    (36.539722, -121.907222) is a distance of 6.0873776351893385 at a
    heading of 270.7547022304523 degrees.
    ===================================================================

Position, PositionUtility, and their code usage are a bit disquieting and certainly not very object-oriented. But how can that be? Java is an object-oriented language, and the code uses objects!

Though the code may use Java objects, it does so in a manner reminiscent of a bygone era: utility functions operating on data structures. Welcome to 1972! As President Nixon huddled over secret tape recordings, computer professionals coding in the procedural language Fortran excitedly used the new International Mathematics and Statistics Library (IMSL) in just this manner. Code repositories such as IMSL were replete with functions for numerical calculations. Users passed data to these functions in long parameter lists, which at times included not only the input but also the output data structures. (IMSL has continued to evolve over the years, and a version is now available to Java developers.)

In the current design, Position is a simple data structure and PositionUtility is an IMSL-style repository of library functions that operates on Position data. As the example above shows, modern object-oriented languages don't necessarily preclude the use of antiquated, procedural techniques.

Bundling data and methods

The code can be easily improved. For starters, why place data and the functions that operate on that data in separate modules? Java classes allow bundling data and methods together:

public class Position
{
  public double distance( Position position )
  {
    // Calculate and return the distance from this object to the specified
    // position.
  }
  public double heading( Position position )
  {
    // Calculate and return the heading from this object to the specified
    // position.
  }
  public double latitude;
  public double longitude;
}

Putting the position data items and the implementation code for calculating distance and heading in the same class obviates the need for a separate PositionUtility class. Now Position begins to resemble a true object-oriented class. The following code uses this new version that bundles the data and methods together:

Position myHouse = new Position();
myHouse.latitude =    36.538611;
myHouse.longitude = -121.797500;
Position coffeeShop = new Position();
coffeeShop.latitude =    36.539722;
coffeeShop.longitude = -121.907222;
double distance = myHouse.distance( coffeeShop );
double heading = myHouse.heading( coffeeShop );
System.out.println
  ( "From my house at (" +
    myHouse.latitude + ", " + myHouse.longitude +
    ") to the coffee shop at (" +
    coffeeShop.latitude + ", " + coffeeShop.longitude +
    ") is a distance of " + distance +
    " at a heading of " + heading + " degrees."
  );

The output is the same as before and, more importantly, the above code seems more natural. The previous version passed two Position objects to a function in a separate utility class to calculate distance and heading. In that code, calculating the heading with the method call PositionUtility.heading( myHouse, coffeeShop ) didn't clearly indicate the calculation's direction. A developer must remember that the utility function calculates the heading from the first parameter to the second.

In comparison, the above code uses the statement myHouse.heading(coffeeShop) to calculate the same heading. The call's semantics clearly indicate that the direction proceeds from my house to the coffee shop. Converting the two-argument function heading(Position, Position) to a one-argument function position.heading(Position) is known as currying the function. Currying effectively specializes the function on its first argument, resulting in clearer semantics.
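To make the term concrete outside the Position class, here is a small sketch of currying in its general functional form, using the standard java.util.function interfaces. The class CurryingSketch, the helper curry, and the stand-in function describe are all hypothetical names introduced only for illustration. The point is that fixing the first argument yields a one-argument function whose meaning is unambiguous.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical illustration; none of these names appear in the article's code.
final class CurryingSketch {

    // Turns a two-argument function into a chain of one-argument functions.
    static <A, B, R> Function<A, Function<B, R>> curry(BiFunction<A, B, R> f) {
        return a -> b -> f.apply(a, b);
    }

    // Stand-in for a two-argument "heading" style function: the direction
    // runs from the first argument to the second.
    static String describe(String from, String to) {
        return from + " -> " + to;
    }
}
```

Applying curry to describe and then supplying "myHouse" produces a function that can only mean "from my house to somewhere", which mirrors how myHouse.heading( coffeeShop ) reads.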

Placing the methods that use Position class data in the Position class itself makes currying the functions distance and heading possible. Changing the call structure of the functions in this way is a significant advantage over procedural languages. Class Position now represents an abstract data type that encapsulates data and the algorithms that operate on that data. As instances of a user-defined type, Position objects are also first-class citizens that enjoy all the benefits of the Java language type system.

The language facility that bundles data with the operations that operate on that data is encapsulation. Note that encapsulation guarantees neither data protection nor information hiding. Nor does encapsulation ensure a cohesive class design. Achieving those design qualities requires techniques beyond the encapsulation provided by the language. As currently implemented, class Position doesn't contain superfluous or unrelated data and methods, but Position does expose both latitude and longitude in raw form. That allows any client of class Position to directly change either internal data item without any intervention by Position. Clearly, encapsulation is not enough.
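The risk is easy to demonstrate. The sketch below uses hypothetical names (ExposedPosition, CarelessClient, hasPlausibleLatitude) and assumes only that a latitude outside -90 to 90 degrees is meaningless. Because the field is public, the class never gets a chance to object to the bad value.

```java
// Hypothetical stand-in for a class that bundles data and methods
// but leaves its fields public.
final class ExposedPosition {
    public double latitude;
    public double longitude;

    // The class can detect a bad value after the fact, but cannot prevent it.
    boolean hasPlausibleLatitude() {
        return latitude >= -90.0 && latitude <= 90.0;
    }
}

// Hypothetical client that writes straight into the exposed field.
final class CarelessClient {
    static void corrupt(ExposedPosition position) {
        position.latitude = 1000.0; // accepted silently: no code in the class runs
    }
}
```

The data and methods are bundled, so the class is encapsulated in the language sense, yet nothing protects its state.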

Defensive programming

To further investigate the ramifications of exposing internal data items, suppose I decide to add a bit of defensive programming to Position by restricting the latitude and longitude to ranges specified by GPS. Latitude falls in the range [-90, 90] and longitude in the range (-180, 180]. The exposure of the data items latitude and longitude in Position's current implementation renders this defensive programming impossible.

Making the attributes latitude and longitude private data members of class Position and adding simple accessor and mutator methods, commonly called getters and setters, provides a simple remedy to exposing raw data items. In the example code below, the setter methods screen incoming values before storing them in latitude and longitude. Rather than throw an exception, I specify performing modulo arithmetic on input values to keep the internal values within specified ranges. For example, attempting to set the longitude to 181.0 results in an internal setting of -179.0 for longitude.
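As a rough sketch of what such modulo arithmetic might look like for longitude, consider the helper below. The class and method names are hypothetical, and this is only one way to wrap a value into the range (-180, 180]. How latitude should be folded into [-90, 90] is a separate design question that this sketch does not attempt to answer.

```java
// Hypothetical helper illustrating range wrapping for longitude only.
final class LongitudeSketch {

    // Wraps any finite longitude in degrees into the range (-180, 180].
    // NaN and infinite inputs yield NaN, because Java's % does so.
    static double normalize(double longitude) {
        double wrapped = longitude % 360.0;   // now strictly between -360 and 360
        if (wrapped > 180.0) {
            wrapped -= 360.0;                 // e.g. 181 becomes -179
        } else if (wrapped <= -180.0) {
            wrapped += 360.0;                 // e.g. -180 becomes 180
        }
        return wrapped;
    }
}
```

A setter could call such a helper before assigning the instance variable, so that every stored longitude already satisfies the range constraint.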

The following code adds getter and setter methods for accessing the private data members latitude and longitude:

public class Position
{
  public Position( double latitude, double longitude )
  {
    setLatitude( latitude );
    setLongitude( longitude );
  }
  public void setLatitude( double latitude )
  {
    // Ensure -90 <= latitude <= 90 using modulo arithmetic.
    // Code not shown.
    // Then set instance variable.
    this.latitude = latitude;
  }
  public void setLongitude( double longitude )
  {
    // Ensure -180 < longitude <= 180 using modulo arithmetic.
    // Code not shown.
    // Then set instance variable.
    this.longitude = longitude;
  }
  public double getLatitude()
  {
    return latitude;
  }
  public double getLongitude()
  {
    return longitude;
  }
  public double distance( Position position )
  {
    // Calculate and return the distance from this object to the specified
    // position.
    // Code not shown.
  }
  public double heading( Position position )
  {
    // Calculate and return the heading from this object to the specified
    // position.
  }
  private double latitude;
  private double longitude;
}

Using the above version of Position requires only minor changes. As a first change, since the above code specifies a constructor that takes two double arguments, the default constructor is no longer available. The following example uses the new constructor, as well as the new getter methods. The output remains the same as in the first example.

Position myHouse = new Position( 36.538611, -121.797500 );
Position coffeeShop = new Position( 36.539722, -121.907222 );
 
double distance = myHouse.distance( coffeeShop );
double heading = myHouse.heading( coffeeShop );
 
System.out.println
  ( "From my house at (" +
    myHouse.getLatitude() + ", " + myHouse.getLongitude() +
    ") to the coffee shop at (" +
    coffeeShop.getLatitude() + ", " + coffeeShop.getLongitude() +
    ") is a distance of " + distance +
    " at a heading of " + heading + " degrees."
  );

Choosing to restrict the acceptable values of latitude and longitude through setter methods is strictly a design decision. Encapsulation does not play a role. That is, encapsulation, as manifested in the Java language, does not guarantee protection of internal data. As a developer, you are free to expose the internals of your class. Nevertheless, you should restrict access to and modification of internal data items through the use of getter and setter methods.

Isolating potential change

Protecting internal data is only one of many concerns driving design decisions on top of language encapsulation. Isolation from change is another. Modifying the internal structure of a class should not, if at all possible, affect client classes.

For example, I previously noted that the distance calculation in class Position did not indicate units. To be useful, the reported distance of 6.09 from my house to the coffee shop clearly needs a unit of measure. I may know the direction to take, but I don't know whether to walk 6.09 meters, drive 6.09 miles, or fly 6.09 thousand kilometers.
