Internationalize your software, Part 1

Learn how to develop software for the global marketplace

Locales

I originally defined internationalization as the process of designing an application that automatically adapts to different regions and countries without the need to recompile the application. When we talk about "different regions and countries," we're really talking about locales. A locale is a geographical, political, or cultural region (possibly an entire country) that shares some combination of common geography, politics, or culture.

Java treats locales as objects. A locale object is nothing more than an identifier (made up of a language and a region/country code) that is used by locale-sensitive classes -- classes containing locale-specific functionality (for example, Calendar). Locale objects are instantiated from the Locale class. Detailed information about Locale is available in the following class reference, located at Sun's Java Web site: http://java.sun.com/products/jdk/1.1/docs/api/java.util.Locale.html.

The concept of a global locale is not present in Java. Therefore, it's possible for different parts of a program to use different locales. This makes it possible to create multilingual programs. For example, imagine a Java program that displays a spreadsheet of financial transactions. Each column displays the same transactions based on locale-specific currencies using locale-specific formats.
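As a minimal sketch of the spreadsheet idea, the class below formats one amount for several locales within the same program. The class name MultiLocaleAmounts and the helper asCurrency are hypothetical, and the sketch only changes the currency symbol and number format; it does not convert between currencies.

```java
import java.text.NumberFormat;
import java.util.Locale;

public class MultiLocaleAmounts {
    // Formats one amount using the currency conventions of the given locale.
    // No exchange-rate conversion is attempted; only the presentation changes.
    static String asCurrency(double amount, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(amount);
    }

    public static void main(String[] args) {
        double amount = 1234.5;
        // Each "column" of the imagined spreadsheet uses its own locale.
        Locale[] columns = { Locale.US, Locale.FRANCE, Locale.GERMANY };
        for (Locale locale : columns) {
            System.out.println(locale + "\t" + asCurrency(amount, locale));
        }
    }
}
```

Because each call receives its own locale object, no shared setting has to be switched between columns.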

A program can determine the default locale by calling the Locale class's static getDefault () method; it can set the default locale by calling the static setDefault (Locale) method. However, attempting to set the default locale from within an applet while running under a Netscape browser results in a security violation (since an attempt is being made to change a fundamental property on a user's machine).
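The following sketch shows one way to change the default locale temporarily and then restore it, assuming the code runs as an application that is allowed to call setDefault (Locale). The class DefaultLocaleDemo and its helper method are hypothetical names chosen for illustration.

```java
import java.util.Locale;

public class DefaultLocaleDemo {
    // Returns the display language of `subject` as seen while `temporary`
    // is the default locale, then restores the previous default.
    static String displayLanguageUnder(Locale temporary, Locale subject) {
        Locale saved = Locale.getDefault();
        try {
            Locale.setDefault(temporary);
            // The no-argument form relies on the default locale.
            return subject.getDisplayLanguage();
        } finally {
            Locale.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        System.out.println("Default locale: " + Locale.getDefault());
        System.out.println(displayLanguageUnder(Locale.ENGLISH, Locale.FRANCE));
        System.out.println(displayLanguageUnder(Locale.FRENCH, Locale.FRANCE));
    }
}
```

Restoring the saved value in a finally block keeps the change from leaking into the rest of the program if an exception is thrown.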

Figure 1 shows a Java applet that lets you change the applet's locale to English, French, German, or Italian. Once the locale has been changed, some of the Locale class's information methods are called and the results are displayed in a text area. The source code to this applet is located in example1.java.

Figure 1: Locale example

Below is a code fragment, taken from the applet shown above, that creates a locale object.

l = new Locale ("en", "US");

The first argument, en, identifies the language that this locale will use (English), while the second argument, US, identifies the region used by this locale (United States). Both arguments were obtained from lists of standard language and region/country codes that are defined and maintained by the International Organization for Standardization (ISO). For a complete list of codes, refer to the Resources section.

Below is a code fragment, taken from the applet shown in Figure 1, which calls some of the Locale class's information methods.

sb.append ("Language Code = " + l.getLanguage () + "\n");
sb.append ("Country Code = " + l.getCountry () + "\n");
sb.append ("Variant = " + l.getVariant () + "\n");
sb.append ("ISO 3-letter Language Abbreviation = " + l.getISO3Language () + "\n");
sb.append ("ISO 3-letter Country Abbreviation = " + l.getISO3Country () + "\n");
sb.append ("Display Language = " + l.getDisplayLanguage (l) + "\n");
sb.append ("Display Country = " + l.getDisplayCountry (l) + "\n");
sb.append ("Display Variant = " + l.getDisplayVariant (l) + "\n");
sb.append ("Display Name = " + l.getDisplayName (l) + "\n");

The getLanguage () method returns the lowercase, two-letter ISO-639 language code. The getCountry () method returns the uppercase, two-letter ISO-3166 region/country code. The getVariant () method returns the variant portion of the locale. (A variant is specified as the third argument in the Locale (String, String, String) constructor and is used to further differentiate a region -- such as North, South -- or to provide some vendor-specific code.) The getISO3Language () method returns the three-letter ISO abbreviation for the locale's language. The getISO3Country () method returns the three-letter ISO abbreviation for the locale's region/country. The getDisplay methods take a locale object argument and return information using the locale argument's language (français would be returned as the display language for the French locale while English would be returned as the display language for the English locale). Locale defines several equivalent getDisplay methods that take no arguments. These methods return values based on the default locale. The getDisplayName (Locale) method calls getDisplayLanguage (Locale), getDisplayCountry (Locale), and getDisplayVariant (Locale), and concatenates this information into a single value.
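To make the effect of the locale argument concrete, here is a small sketch that describes the same locale in two different languages. The class DisplayNames and the helper describe are hypothetical; the Locale methods are the ones discussed above.

```java
import java.util.Locale;

public class DisplayNames {
    // Describes `subject` using the language of `viewer`.
    static String describe(Locale subject, Locale viewer) {
        return subject.getDisplayLanguage(viewer) + " / "
             + subject.getDisplayCountry(viewer);
    }

    public static void main(String[] args) {
        Locale subject = Locale.FRANCE;
        // Codes do not depend on any viewer locale.
        System.out.println(subject.getLanguage() + "_" + subject.getCountry());
        System.out.println(subject.getISO3Language() + " " + subject.getISO3Country());
        // Display text depends on the locale passed as the argument.
        System.out.println(describe(subject, Locale.FRANCE));
        System.out.println(describe(subject, Locale.ENGLISH));
    }
}
```

Passing the locale explicitly, as the applet does with l, keeps the output independent of whatever default locale the machine happens to use.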

Suppose you want to dynamically determine which locales are available on a particular platform. What do you do? Many of the locale-sensitive classes define a getAvailableLocales () method that returns an array of locale objects. These objects represent all of the supported locales. Surprisingly, this method is not part of the Locale class in any JDK version prior to 1.2.

Figure 2 shows a Java applet that lists all locales that are available on the current platform. Each locale is shown on a separate line starting with a lowercase, two-letter ISO language code, followed by an underscore character ("_") and an uppercase, two-letter ISO region/country code. This is followed by a descriptive name. If you view this applet with the JDK 1.1.6 appletviewer program, you'll see all of the region/country codes. Because Netscape Navigator 4.06 contains version 1.1.5 of the Java runtime environment, you won't see all of the region/country codes when running the applet under this browser. The source code to this applet is located in example2.java.

Figure 2: Available locales

Below is a code fragment, taken from the applet shown in Figure 2, that calls the Calendar class's getAvailableLocales () method to return all locale objects that are available on the current platform. Each locale object's getLanguage (), getCountry (), and getDisplayName () methods are called to obtain the ISO language and region/country codes along with descriptive text. This information is concatenated together and appended to a StringBuffer object.

// Obtain all currently available locales.
Locale [] locales = Calendar.getAvailableLocales ();

// Create a buffer for holding locale text.
StringBuffer sb = new StringBuffer ();

// Populate the buffer with locale text.
for (int i = 0; i < locales.length; i++)
     sb.append (locales [i].getLanguage () + "_" +
                locales [i].getCountry () + "\t" +
                locales [i].getDisplayName () + "\n");

Resource bundles

When a program is localized, a set of locale-specific elements is created for each locale where this program will be used. These elements aren't stored in source code. Instead, they're stored in resource bundles. A resource bundle is a container that holds one or more locale-specific elements and is associated with one and only one locale.

A program works with one or more families of resource bundles. Each family contains resource bundles for all supported locales and differs from another family in the kind of elements that are stored in these bundles. For example, one family might hold text in its bundles while another family holds audio clips containing language-specific verbal instructions.

Each family shares a common family name, and each of a family's resource bundles has a unique locale designation appended to this family name. This designation is what differentiates one resource bundle from another within the family. For example, suppose that you plan to localize a financial applet for French-speaking investors who live in France and follow French customs, and German-speaking investors who live in different regions/countries. There will be one family of resource bundles, and FinRes will be the name of this family. You do your research and learn that the appropriate language code for French is fr, and the appropriate region/country code for France is FR. Appending these codes to the bundle's family name, you end up with FinRes_fr_FR. You then learn that de is the appropriate language code for German. Because your German-speaking investors live in different regions/countries, you append only this language code to your bundle's family name, and you end up with FinRes_de.

Resource bundles are instantiated from subclasses of Java's abstract ResourceBundle class. More detailed information about ResourceBundle is available in the following class reference, located at Sun's Java Web site: http://java.sun.com/products/jdk/1.1/docs/api/java.util.ResourceBundle.html

When your program needs a locale-specific element, it calls the ResourceBundle class's static "getBundle" methods, either getBundle (String) for the default locale or getBundle (String, Locale) for a specified locale, to return an object that allows access to the element. Below is a code fragment that illustrates a call to getBundle (String, Locale).

Locale currentLocale = Locale.FRANCE;
ResourceBundle resources = ResourceBundle.getBundle ("FinRes", currentLocale);

The first getBundle argument specifies the family name of the resource bundle family (shown as FinRes in the code sample above), while the second argument identifies the desired locale (shown as currentLocale in the code sample above). Both arguments are used by getBundle (String, Locale) to construct the locale-specific name of the desired resource bundle object.

The getBundle methods search for a resource bundle object in a specific order, as follows:

  • family name + "_" + language + "_" + country + "_" + variant
  • family name + "_" + language + "_" + country
  • family name + "_" + language
  • family name

The search begins by looking for a resource bundle whose name matches the family name followed by the language, country, and variant components. If there's no match, the search continues with the family name followed by the language and country components. If there's still no match, it continues with the family name and language component, and then with the family name alone. Finally, if no match is made, a MissingResourceException object is thrown. This search process is a graceful degradation algorithm: when the exact bundle requested doesn't exist or can't be found, it finds the bundle that most closely matches it.
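The search order described above can be sketched as a helper that lists the candidate bundle names from most to least specific. This is a simplified illustration, not the actual getBundle implementation: the class BundleNames and the method candidates are hypothetical, and the sketch only builds names without loading anything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BundleNames {
    // Lists candidate bundle names for a family, most specific first.
    // Components that the locale leaves empty are skipped.
    static List<String> candidates(String family, Locale locale) {
        String language = locale.getLanguage();
        String country = locale.getCountry();
        String variant = locale.getVariant();

        List<String> names = new ArrayList<String>();
        if (!language.isEmpty() && !country.isEmpty() && !variant.isEmpty()) {
            names.add(family + "_" + language + "_" + country + "_" + variant);
        }
        if (!language.isEmpty() && !country.isEmpty()) {
            names.add(family + "_" + language + "_" + country);
        }
        if (!language.isEmpty()) {
            names.add(family + "_" + language);
        }
        names.add(family); // the family name alone is the last resort
        return names;
    }

    public static void main(String[] args) {
        System.out.println(candidates("FinRes", Locale.FRANCE));
        System.out.println(candidates("FinRes", Locale.GERMAN));
    }
}
```

For the FinRes family, a request for the France locale would therefore be satisfied by FinRes_fr_FR if it exists, and otherwise by a less specific member of the family.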

If the resource bundle can be found, then the getBundle methods will return an object (shown as resources in the code sample above) that contains methods for extracting a locale-specific element. This object can extract an element from a property file, a class file, or some other developer-defined entity. It doesn't matter where this element is stored because the object returned by getBundle provides a common interface for accessing it.
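When no bundle can be found, the MissingResourceException mentioned earlier has to be handled somewhere. The sketch below shows one possible approach, which falls back to a built-in string; the class SafeBundleLookup and both helper methods are hypothetical, and other programs may prefer to report the error instead.

```java
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

public class SafeBundleLookup {
    // Returns the bundle for the family and locale, or null if none exists.
    static ResourceBundle findOrNull(String family, Locale locale) {
        try {
            return ResourceBundle.getBundle(family, locale);
        } catch (MissingResourceException e) {
            return null;
        }
    }

    // Returns the text stored under `key`, or `fallback` when the bundle
    // is absent or does not contain the key.
    static String text(ResourceBundle bundle, String key, String fallback) {
        if (bundle == null) {
            return fallback;
        }
        try {
            return bundle.getString(key);
        } catch (MissingResourceException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // "FinRes" is the family name used in this article; it is found
        // only if matching bundles are on the class path.
        ResourceBundle resources = findOrNull("FinRes", Locale.FRANCE);
        System.out.println(text(resources, "Hello", "Hello"));
    }
}
```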

Sun's Java Software Division has defined two kinds of resource bundles:

  • Property resource bundles
  • List resource bundles

A property resource bundle is a resource bundle that is based on a property file (a text-based list of key = value entries). This kind of bundle is useful for storing text. A common reason for a MissingResourceException object being thrown when working with property resource bundles is that the underlying property file is missing a .properties extension. A sample property file is shown below. Optional comments can appear in this file as long as they are prefaced by a # character.

Hello=Bonjour
Goodbye=Au revoir

Property resource bundles are implemented by Java's PropertyResourceBundle class. Detailed information about PropertyResourceBundle is available in the following class reference, located at Sun's Java Web site: http://java.sun.com/products/jdk/1.1/docs/api/java.util.PropertyResourceBundle.html
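For experimenting with the key = value format without creating a file, a PropertyResourceBundle can also be built directly from text. The sketch below assumes a Java release that offers the PropertyResourceBundle (Reader) constructor (Java 6 or later, so not the JDK 1.1 documented at the link above); the class name InlinePropertyBundle is hypothetical, and the entries are the ones from the sample property file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class InlinePropertyBundle {
    // Builds a bundle from text laid out exactly like a .properties file.
    static ResourceBundle frenchBundle() {
        String fileContents =
            "# French greetings\n" +
            "Hello=Bonjour\n" +
            "Goodbye=Au revoir\n";
        try {
            return new PropertyResourceBundle(new StringReader(fileContents));
        } catch (IOException e) {
            // Reading from an in-memory string is not expected to fail.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle resources = frenchBundle();
        System.out.println("Hello = " + resources.getString("Hello"));
        System.out.println("Goodbye = " + resources.getString("Goodbye"));
    }
}
```

In a real program the same entries would normally live in a file such as FinRes_fr_FR.properties and be located through getBundle.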

Figure 3 shows the results of running a Java applet that lets you change the applet's locale to English, French, German, or Italian, and view the words Hello and Goodbye in the languages represented by these four locales. Language-specific text is obtained from a property resource bundle (with a family name of ex3). The image on the left shows this text using the English/United States locale, while the image on the right shows this text using the Italian/Italy locale. The source code to this applet is located in example3.java.

[Screenshots: Example3 Applet (English - United States), left; Example3 Applet (Italian), right]

Figure 3: PropertyResourceBundle example

The example3 applet will not run under Netscape because this applet attempts to read from a property file stored on the user's machine and this read attempt results in a security violation. This security violation causes the Java runtime to throw a MissingResourceException object.

I've created an HTML file, example3.html, and placed it in this article's zip file (see the Resources section). You can use this HTML file with appletviewer to run this applet. Below is a code fragment, taken from the "example3" applet, that calls the resource bundle object's getString (String) method to obtain a text element from the resource bundle whose family name is ex3. The string argument passed to getString (String) identifies the key in the list of key=value entries that are stored in the property resource bundle's underlying property file.

ResourceBundle resources = ResourceBundle.getBundle ("ex3", l);

StringBuffer sb = new StringBuffer ();

sb.append ("Hello = " + resources.getString ("Hello") + "\n");
sb.append ("Goodbye = " + resources.getString ("Goodbye") + "\n");

// Populate text area control with locale information.

ta.setText (sb.toString ());

A list resource bundle is a resource bundle that is based on a Java class file. This kind of bundle is useful for storing nontext elements such as graphics and audio clips. A sample class file is shown below. This class must implement a single method, getContents ().

import java.util.*;

public class ex4_fr_FR extends ListResourceBundle {
    public Object [][] getContents () { return contents; }

    private Object [][] contents = {
        { "Hello", "Bonjour" },
        { "Goodbye", "Au revoir" }
    };
}

List resource bundles are implemented by Java's ListResourceBundle class. Detailed information about ListResourceBundle is available in the following class reference, located at Sun's Java Web site: http://java.sun.com/products/jdk/1.1/docs/api/java.util.ListResourceBundle.html
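Since a list resource bundle stores objects rather than only strings, a rough sketch of a bundle holding nontext elements might look like the class below. The class NonTextBundle and the keys MaxRetries and Weekdays are hypothetical stand-ins; a real bundle might hold images or audio clips instead. The bundle is instantiated directly here to keep the example self-contained, whereas a program would usually obtain it through getBundle.

```java
import java.util.ListResourceBundle;

public class NonTextBundle extends ListResourceBundle {
    // Each row pairs a key with a value of any object type.
    private static final Object[][] CONTENTS = {
        { "Hello", "Bonjour" },
        { "MaxRetries", Integer.valueOf(3) },
        { "Weekdays", new String[] { "lundi", "mardi" } }
    };

    @Override
    protected Object[][] getContents() {
        return CONTENTS;
    }

    public static void main(String[] args) {
        NonTextBundle resources = new NonTextBundle();
        // getString works for text; getObject returns any stored object.
        System.out.println(resources.getString("Hello"));
        Integer retries = (Integer) resources.getObject("MaxRetries");
        System.out.println(retries);
        System.out.println(resources.getStringArray("Weekdays").length);
    }
}
```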

Figure 4 shows a Java applet that lets you change the applet's locale to English, French, German, or Italian and view the words Hello and Goodbye in the languages represented by these four locales. This text is obtained from a list resource bundle (with a family name of ex4). The source code to this applet is located in example4.java.

Figure 4: ListResourceBundle example

Conclusion

Internationalizing your software is worth considering if access to the global marketplace is important, but this isn't a task to be undertaken lightly. Fortunately, Java has simplified the job, in part due to its platform-independent status and its internationalization and localization features.

In Part 1 of our three-part series, we've defined internationalization and localization. We've presented a list of elements that must be localized when creating international software. We encountered new characters and character sets, and explored the EBCDIC, ASCII, and Unicode character definition standards. We also examined locales and resource bundles. In Part 2, we'll explore text processing in a locale-sensitive manner along with formatters for messages, dates, times, numbers, and currencies.

If you have any questions about the material that's been presented in this article, please send me e-mail, using the link in my bio below. See you next month.

Jeff is a consultant working with various technologies including C++, digital signatures/encryption, Java, smart cards and Win32. He has worked for a number of technology-related consulting firms including EDS (Electronic Data Systems).

Learn more about this topic

  • An interesting overview article on internationalization is available at the following site: http://developer.java.sun.com/developer/technicalArticles/intl.html
  • The official Unicode Web site contains a wealth of information on this character definition standard: http://www.unicode.org
  • A complete list of ISO-639 language codes is available at the following site: http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt
  • A complete list of ISO-3166 country codes is available at the following site: http://www.ics.uci.edu/pub/ietf/http/related/iso3166.txt
  • An interesting article on localization with resource bundles is available at the following site: http://developer.java.sun.com/developer/technicalArticles/ResourceBundles.html
  • You can translate text from one language to another from the following site: http://babelfish.altavista.com/cgi-bin/translate?
  • Note: Two of the Resources links take you to the Java Software Division's Developer Connection site. You must be a member of the Developer Connection in order to view articles, and you will be prompted to enter a user ID and password the first time you access this site. There is no charge to become a member. You can register when prompted to enter a user ID/password.

This story, "Internationalize your software, Part 1" was originally published by JavaWorld.

Copyright © 1998 IDG Communications, Inc.
