Cover Story: A Practical Solution to Internationalization of a J2EE Web App

Making Web Applications Multilingual

As the Internet reaches ever more remote corners of the globe, the internationalization of Web applications exposes a plethora of challenges. As a real-world example, when an airline starts serving more destinations across international frontiers, the Web application that carries the airline's ecommerce must itself be internationalized, and that raises numerous challenges.

These challenges have many causes. For instance, one basic thing that differs from one country to another is the spoken language. Even among countries where the same language is spoken, colloquial usage differs. For example, in some parts of the world the word "baggage" is widely used for a traveler's personal belongings, such as suitcases; in other parts of the world, the same items are referred to as "luggage." In terms of ecommerce, there are numerous variations from country to country, such as different currencies, different tax laws, different forms of payment, and - most important - different business rules governing the application.

Essentially, there are two parts to the internationalization of a Web application. The first part is internationalization of the application code. This involves preparing the code so that it can adapt itself to new languages and regions. In practice, this preparation involves the separation of text, labels, display messages, and any other data that is sensitive to language and region of the world. This type of adaptation of code enables generalization of the product in such a way that it can handle new languages and countries without any re-design. The second part is localization of the application. This involves actual adaptation of the internationalized code to a specific language or region (aka locale). In practice, localization involves creation of translated text, labels, and messages, and the addition of any other application data that is specific to a certain locale.
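The split between the two parts can be sketched in a few lines of Java. This is a minimal illustration, not the article's implementation: `MessageCatalog`, the key `checkin.bags`, and the wording assigned to each locale are all hypothetical. The lookup code is the internationalized part and never changes; adding a locale is only a matter of registering more text. The fallback order comes from the standard `ResourceBundle.Control.getCandidateLocales` method.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.ResourceBundle;

// Hypothetical helper, not from the article: display text lives outside the
// page code, keyed by locale, and lookup falls back from specific to general.
final class MessageCatalog {
    private final Map<Locale, Map<String, String>> messages = new HashMap<>();

    // Localization step: register a translated text for one locale.
    void add(Locale locale, String key, String text) {
        messages.computeIfAbsent(locale, l -> new HashMap<>()).put(key, text);
    }

    // Internationalized lookup: the caller never hard-codes display text.
    String get(Locale locale, String key) {
        // Standard candidate order, for example en_GB, then en, then the root locale.
        List<Locale> candidates = ResourceBundle.Control
                .getControl(ResourceBundle.Control.FORMAT_DEFAULT)
                .getCandidateLocales("messages", locale);
        for (Locale candidate : candidates) {
            Map<String, String> table = messages.get(candidate);
            if (table != null && table.containsKey(key)) {
                return table.get(key);
            }
        }
        return "???" + key + "???"; // visible marker for a missing translation
    }

    // Illustrative content only; the wording per locale is an assumption.
    static MessageCatalog sample() {
        MessageCatalog catalog = new MessageCatalog();
        catalog.add(Locale.ROOT, "checkin.bags", "Baggage allowance");
        catalog.add(Locale.UK, "checkin.bags", "Luggage allowance");
        catalog.add(Locale.GERMAN, "checkin.bags", "Freigep\u00E4ck");
        return catalog;
    }

    public static void main(String[] args) {
        MessageCatalog catalog = sample();
        System.out.println(catalog.get(Locale.US, "checkin.bags"));
        System.out.println(catalog.get(Locale.UK, "checkin.bags"));
        System.out.println(catalog.get(Locale.GERMANY, "checkin.bags"));
    }
}
```

In this sketch a request for `Locale.GERMANY` finds no `de_DE` entry and falls back to the `de` text, while `Locale.US` falls through to the root text.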

Internationalization is a common problem that is often overlooked during design and development. It must be up-front work in the development life cycle, not an afterthought; as the saying goes, "a stitch in time saves nine." Designing for internationalization up front may not be easy, but retrofitting it after the application has been developed is far more difficult. Up-front planning for application internationalization can save significant amounts of time and money. There are many ways to address this problem; the following approaches are widely used:

  • Creating internationalized pages that retrieve locale-dependent content using custom tags. This approach is typically employed if all of the pages consistently follow the same structure and look and feel across different locales. This approach also provides for easy maintenance and future enhancements across all of the locales. This approach may also employ a single source for business logic components that process the logic based on the locale.
  • Creating separate locale-specific pages. This approach is typically employed if the structure and the look and feel of the pages differ significantly across locales. In this case, there may be separate business logic components for each locale.
  • Using portal technology. Vendors such as BEA provide support for portal technology. For example, BEA WebLogic Application Server has excellent support for developing internationalized portals in the form of a set of custom tags that can be incorporated within the portal pages. Portal technology is widely used and is definitely an excellent candidate for implementing internationalized Web applications.

This article will delve into a real-world application against the backdrop of a Web-based airline-booking engine that has internationalization requirements complemented by a content management system. The implementation approach is aligned with the first option described above, where a single set of JSP pages is developed to work with locale-dependent content. The presentation tier of the application was built using existing frameworks such as Struts and JSTL custom tags. Both the Struts framework and JSTL custom tags offer internationalization support through mechanisms built upon the standard Java internationalization classes such as Locale and ResourceBundle. The article will discuss in detail the technicalities involved in extending these frameworks to internationalize the Web presentation by retrieving localized content dynamically from the underlying content management system.

The Fundamental Concepts
Before we delve into the implementation details, it is worth reviewing some key concepts. Terms such as "character," "character set," "character code," and "character encoding" are often heard when people talk about internationalization.

A character is the smallest component of a written language that has a specific name and some semantic value. Each character can have more than one graphical representation. For example, the character "A" can be drawn in several typefaces or styles. Independent of the graphical representation, the meaning of the character remains the same. Each such graphical representation is called a glyph. A set of glyphs is called a font, so a character will have a different glyph in different fonts. A character set comprises a group of related characters that can be used for some purpose. All the characters on an English keyboard can be grouped into a character set because together they make it possible to write meaningful and informative documents in English.

Computers do not understand characters directly; they need a coded character set to process text. In a coded character set, each character is assigned an integer value, commonly referred to as a code point. The American Standard Code for Information Interchange (ASCII) is a good example: it is a small coded character set of 128 characters, with code points 0 through 127. There are other coded character sets such as ISO-8859-1 and Unicode. Essentially, the code point of a character in the coded character set is used to identify the right glyph to display on the computer screen.
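As a small illustration of code points, the following Java sketch reads the integer value assigned to a character and checks whether it falls in the ASCII range. The class and method names are hypothetical and chosen only for this example; `String.codePointAt` is the standard Java call.

```java
// Hypothetical demo class: shows that a character is identified by an integer.
final class CodePoints {
    // Returns the code point of the first character in the text.
    static int codePointOf(String text) {
        return text.codePointAt(0);
    }

    // ASCII assigns code points 0 through 127 only.
    static boolean isAscii(int codePoint) {
        return codePoint >= 0 && codePoint <= 127;
    }

    public static void main(String[] args) {
        int a = codePointOf("A");
        int eszett = codePointOf("\u00DF"); // the German sharp s
        System.out.printf("A      : U+%04X, ASCII: %b%n", a, isAscii(a));
        System.out.printf("eszett : U+%04X, ASCII: %b%n", eszett, isAscii(eszett));
    }
}
```

The letter "A" has code point 65 (U+0041) and is part of ASCII; the sharp s has code point 223 (U+00DF) and is not.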

Character set encoding is yet another term that is widely used. A character set encoding scheme is a set of rules for mapping character code values to byte sequences (aka octets) and vice versa. ISO-8859-1, UTF-8, and UTF-16 each define their own encoding scheme. For example, different schemes encode the character "ß" into different byte sequences, as shown in Table 1.
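The effect of the encoding scheme on a single character can be checked directly in Java. The sketch below is an illustration with a hypothetical class name; it encodes "ß" (written as the escape `\u00DF` so that the source file's own encoding does not matter) with three standard charsets. UTF-16BE is used rather than plain UTF-16 so that no byte-order mark is prepended.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one character, three encodings.
final class EncodingDemo {
    // Encodes the text with the given charset and formats the bytes as hex.
    static String hexBytes(String text, Charset charset) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        String eszett = "\u00DF";
        System.out.println("ISO-8859-1: " + hexBytes(eszett, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8     : " + hexBytes(eszett, StandardCharsets.UTF_8));
        System.out.println("UTF-16BE  : " + hexBytes(eszett, StandardCharsets.UTF_16BE));
    }
}
```

The same code point, U+00DF, becomes the single byte DF in ISO-8859-1, the two bytes C3 9F in UTF-8, and the two bytes 00 DF in big-endian UTF-16.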

The terms "coded character set" and "coded character set encoding" have different meanings and should not be used interchangeably. To avoid this confusion, the short name "charset" is usually used to represent coded character set encoding. Table 2 shows some of the charsets that support different languages.

Table 2 leads to a big question: which character set should be used to support multiple languages in an internationalized application? For example, the ISO-8859-1 character set does not support Chinese characters, which are supported by the GB2312 character set. Clearly, there should be a common character set that can encode the characters of all the world's languages. Unicode is one such coded character set; it aims to provide a unique code point for every character in every language. Java uses Unicode to represent characters, and JRE 1.4 supports Unicode 3.0. In that version every assigned character fits in the 16-bit range U+0000 to U+FFFF (65,536 code points, written in hexadecimal), so each character occupies one 2-byte Java char. Later Unicode versions assign characters beyond U+FFFF, up to U+10FFFF, which Java represents as pairs of 16-bit values.
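Whether a given charset can represent a piece of text can be tested with the standard `CharsetEncoder.canEncode` method. The following is a sketch under stated assumptions: the class name is hypothetical, and GB2312 is an optional charset that may be absent from some Java runtimes, so the code checks `Charset.isSupported` before using it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: which charsets can represent which characters.
final class CharsetCoverage {
    // True if every character of the text has a mapping in the charset.
    static boolean canEncode(Charset charset, String text) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String chinese = "\u4E2D"; // a common Chinese character
        System.out.println("ISO-8859-1: " + canEncode(StandardCharsets.ISO_8859_1, chinese));
        System.out.println("UTF-8     : " + canEncode(StandardCharsets.UTF_8, chinese));
        // GB2312 is not guaranteed to be installed, so guard the lookup.
        if (Charset.isSupported("GB2312")) {
            System.out.println("GB2312    : " + canEncode(Charset.forName("GB2312"), chinese));
        }
    }
}
```

ISO-8859-1 reports that it cannot encode the Chinese character, while UTF-8, being an encoding of Unicode, can.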

There is one more character set, known as the Universal Character Set (UCS), which can support the characters and symbols of all languages. However, UCS uses a 31-bit encoding scheme that most computer applications do not support, whereas 16-bit encoding is widely supported. To address this issue, transformation formats based on Unicode and UCS were created. One of them is UTF-8 (UCS Transformation Format, 8-bit), a multi-byte encoding that represents each character in 1 to 4 bytes (octets). UTF-8 preserves ASCII: every ASCII character is encoded as a single byte with its ASCII value.

UTF-8's support for a wide range of characters and its efficient encoding make it the de facto charset for displaying multiple languages. The application described in this article uses UTF-8 wherever content in different languages must be encoded.
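The variable length of UTF-8 is easy to observe by counting the bytes produced for characters from different ranges. This is an illustrative sketch with a hypothetical class name. The 4-byte case uses a character above U+FFFF, written as a surrogate pair, which postdates the Unicode 3.0 repertoire mentioned earlier and is included only to show the upper end of the range.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: UTF-8 uses 1 to 4 bytes per character.
final class Utf8Length {
    // Number of bytes UTF-8 needs for the given text.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println("U+0041  (Latin A)      : " + utf8Length("A"));
        System.out.println("U+00DF  (sharp s)      : " + utf8Length("\u00DF"));
        System.out.println("U+4E2D  (Chinese)      : " + utf8Length("\u4E2D"));
        // One character above U+FFFF, stored in Java as two char values.
        System.out.println("U+1F600 (supplementary): " + utf8Length("\uD83D\uDE00"));
    }
}
```

The four calls report 1, 2, 3, and 4 bytes respectively, and the single byte for "A" is 0x41, the same value ASCII assigns to it.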

The Internationalization Requirements for the Example Application
The application described in the article is a Web-based airline-booking engine that has points of sale (POS) in different countries. The requirement was to support a number of POS countries (24 altogether) such as the US, Germany, the UK, Japan, Korea, Brazil, Canada, China, and Uruguay, with room to expand to other countries of the world. Each POS country needed a preconfigured list of languages from which a user could select the language used to display content. The requirement was to support a number of languages (10 altogether) such as English, French, German, Chinese, and Japanese, with room to accommodate any other language in the future. By default, when a user lands on the application in a particular POS country, the content is expected to be displayed in the native language of that POS country.
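One way to picture these requirements is as a mapping from each POS country to an ordered list of locales, with the first entry acting as the default. The sketch below is an assumption made for illustration: the article does not show how the POS configuration is stored, the class `PointOfSaleLanguages` is hypothetical, and the languages assigned to each country are sample data, not the airline's actual configuration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the per-POS language configuration.
final class PointOfSaleLanguages {
    // POS country code mapped to its configured locales; the first entry is the default.
    private final Map<String, List<Locale>> localesByCountry = new HashMap<>();

    void configure(String countryCode, Locale defaultLocale, Locale... others) {
        List<Locale> locales = new ArrayList<>();
        locales.add(defaultLocale);
        locales.addAll(Arrays.asList(others));
        localesByCountry.put(countryCode, locales);
    }

    // The list a page would offer in its language selector.
    List<Locale> localesFor(String countryCode) {
        List<Locale> locales = localesByCountry.get(countryCode);
        if (locales == null) {
            throw new IllegalArgumentException("Unknown POS country: " + countryCode);
        }
        return locales;
    }

    // Returns the configured locale for the requested language, or the POS
    // default when no language was chosen or it is not offered at that POS.
    Locale resolve(String countryCode, String requestedLanguage) {
        List<Locale> locales = localesFor(countryCode);
        for (Locale locale : locales) {
            if (locale.getLanguage().equals(requestedLanguage)) {
                return locale;
            }
        }
        return locales.get(0);
    }

    // Illustrative assignments only; not taken from the article.
    static PointOfSaleLanguages sample() {
        PointOfSaleLanguages pos = new PointOfSaleLanguages();
        pos.configure("US", Locale.US);
        pos.configure("DE", Locale.GERMANY, Locale.forLanguageTag("en-DE"));
        pos.configure("CA", Locale.CANADA, Locale.CANADA_FRENCH);
        return pos;
    }
}
```

With the sample data, a Canadian user who selects French gets `fr_CA`, while a request for a language that is not configured for Canada falls back to the default `en_CA`. Adding a country or a language is a configuration change rather than a code change, which matches the "room to expand" requirement.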

More Stories By Murali Kashaboina

Murali Kashaboina leads Enterprise Architecture at United Airlines, Inc. He has 15+ years of enterprise software development experience utilizing a broad range of technologies, including JEE, CORBA, Tuxedo, and Web services. Murali previously published articles in WLDJ and SilverStream Developer Center. He has a master's degree in mechanical engineering from the University of Dayton, Ohio.

More Stories By Bin Liu

Bin Liu is a lead software engineer at United Airlines. Bin has more than seven years of experience developing distributed applications using J2EE technologies, WebLogic, Tuxedo, C++, and Web services. Bin has previously published articles in WLDJ.


Most Recent Comments
Raj Kumar Kundu 06/04/08 10:52:33 PM EDT

This content is very useful for all those people who are thinking about internationalization of J2EE/ Web Based applications. It explains and points out the areas which should be rather can be considered for this activity. This can help people start thinking in right direction.
But this can be made extremely useful by providing some example files (Resource Bundle related AppResource files and the java files which are using those property files) or snaps of the java codes.

Henry 10/23/07 08:34:21 PM EDT

Is database-centric internationalization with JSF similar with this article?
