Translations in Drupal 6

by Matt Butcher | February 2009 | AJAX Content Management Drupal Open Source

Drupal offers some enticing JavaScript tools, one of which is jQuery. The theming and behavior capabilities provided by drupal.js are other examples. Along with those cool tools comes a feature that has had a remarkable influence on the success of Drupal, but which provides far less glitz and glamour.

This tragic hero is the translation engine, which will be the subject of this article written by Matt Butcher.

Translations are important—one might even say vital—to the success of Drupal. Consequently, it is imperative that all Drupal developers become familiar with these tools. JavaScript written in Drupal 6 (and in later versions) should be translation-aware.

Here are the things we will cover in this article:

  • Get our bearings in the drupal.js library
  • Enable multi-language capabilities in Drupal
  • Learn the translation functions

Translations and drupal.js

There are four main families of tools in drupal.js:

  1. Theming functions.
  2. Translation functions.
  3. Utility functions.
  4. Support for Drupal behaviors.

Even if you don't think you need the translation functions, I advise you to read this article. The tools covered here play a very important role in Drupal, even providing additional security to your code.

Our focus in this article will be on the translation functions. When we talk about translation tools, what exactly are we talking about?

Translation functions provide language translation facilities to JavaScript. Text that would normally be hardcoded into the JavaScript is translated through this system to the user's preferred language.

As is the case with the theming system, the drupal.js translation system is designed to provide an API similar to the server-side PHP translation system.

The translation functions are designed to be simple for the developer's use. In fact, the developer needn't even turn on Drupal's translation module to use the JavaScript libraries. The idea is to make it painless enough for the developer to use, and train the developer to habitually use the translation features.

In order to show how things work, we will not only look at the translation functions, but also at how the larger translation system is used.

Translation and languages

One of Drupal's more distinguished features is its well-integrated support for multiple languages. Drupal has been translated into dozens of languages, and installing and enabling a translation is a simple process. For these reasons, Drupal has gained an international audience.

In earlier versions of Drupal, this language support was confined to server-side PHP code. JavaScript did not have access to the translation library. But with the release of Drupal 6, basic translation support was extended to JavaScript.

In order to see how translations work, we are going to walk through the process of enabling the translation system on the server. We will then return to the drupal.js library to see how it uses the system.

Translation functions are the portions of code that developers use to make it possible for code to perform translations when appropriate. The translation system is the part of Drupal that does the actual translation. We will start with this second part, the translation system, and then go back to the translation functions.

English is the default language for Drupal. In fact, it is the only one installed by default. But since Drupal provides a complete language translation subsystem, and Drupal code is developed to support translation, enabling multi-language support is a straightforward process.

We will begin by installing a new language.

There are three steps that must be performed the first time you install a language:

  1. Multi-language support must be turned on.
  2. Translation files must be downloaded and installed.
  3. Drupal's translation preferences must be configured.

We will briefly walk through this process.

Turning on translation support

By default, Drupal's translation support is disabled. It is disabled for the practical reason that if it is not needed, the performance hit incurred by the translation subsystem should be avoided.

Turning it on is a matter of enabling a couple of modules. These modules are included in the Drupal core, so there's no need to download anything. All you need to do is go to Administer | Site building | Modules, and then check the boxes next to the Locale and Content translation modules.

Once you've done that, click on the Save configuration button at the bottom of the screen. That should do it.

Getting and installing translations

Dozens of translations are available in the Translations repository on the official Drupal.org web site. To find and download a new language, go to http://drupal.org/project/Translations and download the desired language.

Once you have the translation archive, you can install it by uncompressing the file in the same directory where Drupal is installed. For example, if Drupal is installed in /var/www/drupal (a common location for it on Linux servers), you will want to uncompress the translation file in /var/www/drupal. The language files will automatically be placed in the correct location.

The next thing to do is to let Drupal know that you have a new language installed.

Configuring languages

Once we have downloaded and unpacked the desired language(s), we need to configure Drupal's language support to determine how to handle multiple languages.

There are two steps to this process:

  1. Add the new language.
  2. Configure the global language settings.

In the first step, we are going to let Drupal know about the new language.

Adding the language

We've already installed the language, but we also need to tell Drupal that we want it to go through the process of scanning the language files and compiling a translation database. This process is called adding a language.

To do this, we need to go to the Administer | Site configuration | Languages page and click on the Add language tab.

On this screen you will need to select the language from the Language name drop-down list. Unfortunately, this list is not limited to the languages you have already installed, so you will have to find the language in the list. Languages are indexed by their English name. Thus, you should look for German instead of Deutsch.

Once you've found the language, click Add language and sit back while Drupal parses all of the language files.

After the parsing is finished, we are ready to move on to the next step.

Configuring the global language settings

We now have multiple languages supported, but we still need to tell Drupal how to determine which language to show when we visit a page.

To configure this, we can click on the Configure tab on the Administer | Site configuration | Languages page. There is only one set of options on this page: Language negotiation.

These settings let us configure how Drupal will determine which language to display. By default, None is checked. This means only the default language will be used.

Path prefix only determines which language to use based on a language identifier string present at the beginning of the URL. For example, my site is running at http://localhost:8888/drupal/. I have English set as the default language, and the Spanish translation is also installed.

With these settings, if I type in the previous URL, I will see the page in English (the default language). However, if I type in the URL http://localhost:8888/drupal/es/, the site will be displayed in Spanish. The es identifier is a prefix to the Drupal portion of the URL. So if I want to view a node using the Spanish translation, the URL would look like this: http://localhost:8888/drupal/es/node/1.

Path translation and language prefixes
The URLs mentioned make use of Drupal's clean URLs. By using Apache's mod_rewrite module, data that would normally appear in a query string can be embedded in the URL. If you do not have clean URLs turned on, then the previous URL would look something like this: http://localhost:8888/drupal?q=es/node/1. With the query string clearly isolated, it's a little easier to see how es is treated as a prefix.
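The prefix check itself happens on the server in PHP, but the idea can be sketched in a few lines of JavaScript. This is only an illustration, not Drupal's own code: the installedLanguages list, the defaultLanguage value, and the languageFromPath() function are all hypothetical names.

```javascript
// Hypothetical list of languages that have been installed and added.
var installedLanguages = ['en', 'es'];
var defaultLanguage = 'en';

// Return the language for a Drupal path such as "es/node/1".
// The path is whatever follows the Drupal base URL (or the value of q=
// when clean URLs are turned off).
function languageFromPath(path) {
  var firstSegment = path.replace(/^\/+/, '').split('/')[0];
  for (var i = 0; i < installedLanguages.length; i++) {
    if (installedLanguages[i] === firstSegment) {
      return firstSegment;
    }
  }
  // No recognized prefix, so the default language is used.
  return defaultLanguage;
}

console.log(languageFromPath('es/node/1')); // es
console.log(languageFromPath('node/1'));    // en
```

Because only the first path segment is inspected, the same function works for the clean URL form and for the q=es/node/1 form.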

The Path prefix with language fallback option is similar to the previous option, except that it adds one more step.

If the path provides a language prefix, then that language is used (assuming the language has been installed and added). But if no prefix is found, Drupal then checks the language preferences that the web browser sends in its HTTP headers. These look something like this:

User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.1) Gecko/2008070206 Firefox/3.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: es,en-us;q=0.7,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cache-Control: max-age=0

This is a subset of the HTTP headers my browser sent when requesting a page from Drupal (and I viewed the headers using Firebug).

The Accept-Language line shows the language preferences. Spanish (es) is the first language, with US English (en-us) and generic English (en) set as my second and third choices.

With Path prefix with language fallback enabled, when I type in http://localhost:8888/drupal/, I will get the page in Spanish because Drupal will inspect the Accept-Language header and determine that Spanish is the best language to use.

If the Accept-Language header isn't available, or there is no language match, then Drupal will fall back to the site's default language.
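To make the fallback step concrete, here is a rough JavaScript sketch of how an Accept-Language value can be ranked and matched against the installed languages. Drupal does this on the server in PHP, and its matching rules may differ. The function names parseAcceptLanguage() and negotiateLanguage() are made up for this example, and the sketch only matches language codes exactly.

```javascript
// Parse an Accept-Language header value into language codes,
// ordered from most to least preferred.
function parseAcceptLanguage(header) {
  var entries = [];
  var parts = header.split(',');
  for (var i = 0; i < parts.length; i++) {
    var pieces = parts[i].split(';');
    var code = pieces[0].replace(/^\s+|\s+$/g, '').toLowerCase();
    var q = 1; // A missing q value means full preference.
    for (var j = 1; j < pieces.length; j++) {
      var match = /^\s*q=([0-9.]+)\s*$/.exec(pieces[j]);
      if (match) {
        q = parseFloat(match[1]);
      }
    }
    if (code) {
      entries.push({ code: code, q: q, position: i });
    }
  }
  // Highest q first; keep the header order for ties.
  entries.sort(function (a, b) {
    return (b.q - a.q) || (a.position - b.position);
  });
  var codes = [];
  for (var k = 0; k < entries.length; k++) {
    codes.push(entries[k].code);
  }
  return codes;
}

// Pick the first preferred language that is installed, else the default.
// Only exact matches count in this sketch ("en-us" does not match "en").
function negotiateLanguage(header, installed, defaultLanguage) {
  var preferred = header ? parseAcceptLanguage(header) : [];
  for (var i = 0; i < preferred.length; i++) {
    for (var j = 0; j < installed.length; j++) {
      if (installed[j] === preferred[i]) {
        return installed[j];
      }
    }
  }
  return defaultLanguage;
}

console.log(negotiateLanguage('es,en-us;q=0.7,en;q=0.3', ['en', 'es'], 'en')); // es
```

With the header from my browser, Spanish wins because it carries the implicit top preference of 1. A missing header or a header that lists only uninstalled languages falls through to the default.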

Finally, the last language negotiation type is Domain name only. In this case, the domain name portion of the URL is used to determine language. For example, http://es.example.com would resolve to the Spanish language, while http://en.example.com would resolve to English.
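Domain-based negotiation amounts to a lookup from host name to language. The following is a minimal sketch of that idea only. The languageDomains table and the languageFromHost() function are hypothetical, and on a real site the mapping lives in Drupal's language configuration rather than in JavaScript.

```javascript
// Hypothetical mapping of host names to language codes.
var languageDomains = {
  'es.example.com': 'es',
  'en.example.com': 'en'
};

// Return the language configured for a host, or the default language.
function languageFromHost(host, defaultLanguage) {
  // Host names are case-insensitive; drop any port number.
  var name = host.toLowerCase().split(':')[0];
  if (languageDomains.hasOwnProperty(name)) {
    return languageDomains[name];
  }
  return defaultLanguage;
}

console.log(languageFromHost('es.example.com', 'en'));  // es
console.log(languageFromHost('www.example.com', 'en')); // en
```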

For multi-language development work, I find the Path prefix only choice to be the easiest to work with.

The translation feature is used to translate the strings that appear in Drupal code. This is done manually by a dedicated team of translators. Consequently, enabling translation will not affect the content you create. For example, if you write content in English, it will not be translated to Spanish for you. Only the interface (built-in menus, module descriptions, and so on) will be translated.

We now have multi-language support enabled, and you should be able to configure your Drupal installation to use more than one language. It's time to take the developer's perspective again. First, we will look at the main JavaScript translation functions. Then, we will look at a developer's tool to create translations.

Using the translation functions

Regardless of whether or not you intend to translate your module, you should always use the translation functions where applicable. There are a few reasons for this:

  • By coding in a translation-friendly way, you pave the way for easy translations later. This is especially important for contributed modules, where your module may indeed be used by speakers of other languages.
  • The translation functions provide additional security. This might sound counterintuitive at first. How can adding translation support increase security? As we will see shortly, the translation functions also perform additional escaping on text. Untrusted text is automatically escaped for display. Escaping is one way of preventing a malicious user from performing Cross-Site Scripting (often called XSS) attacks or other code injection attacks.
  • Using translation functions is just good coding practice. As with many other aspects of Drupal coding, the developer community encourages (and in many cases enforces) clean, well-written, and portable code. Using the translation functions is one way of conforming to Drupal's coding guidelines.

The drupal.js file contains a pair of functions that can make use of Drupal's multi-language support. These two functions are Drupal.t() and Drupal.formatPlural().

If you've done any Drupal PHP coding, both of these should immediately be familiar to you. They are directly analogous to the t() and format_plural() functions in Drupal's core PHP library. Apart from the change in naming convention, they share names with their PHP counterparts. They also take essentially the same main arguments and return the same type of content.

Let's start out by looking at the Drupal.t() function.

The Drupal.t() function

As with all core Drupal JavaScript functions, this function uses the Drupal namespace. The t() function is a member of the Drupal library. This function's job is to take a string and perform any translation actions on it. Here's a simple example of this:

alert(Drupal.t('hello world'));

In this case, the translation function checks the language database for the user's preferred language to see if there is a translation available. If there is, then the function will return the translated string. If not, then hello world will be returned unaltered.
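The lookup-or-fallback behavior can be pictured with a small stand-in. This is not the code from drupal.js. The translations table and the translate() function are invented for illustration, and the table holds a single Spanish string.

```javascript
// Hypothetical translation table for the current language (Spanish here).
var translations = {
  'hello world': 'hola mundo'
};

// Return the translation if one exists, otherwise the original string.
function translate(str) {
  return translations.hasOwnProperty(str) ? translations[str] : str;
}

console.log(translate('hello world'));   // hola mundo
console.log(translate('goodbye world')); // goodbye world
```

Note that the original English string doubles as the lookup key, which is why an untranslated string can simply be returned as it is.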

Shortly, we will take a closer look at how the translation happens. It is a slightly more complex process than what initially meets the eye. But before we move on in that direction, let's look at a more complex use of the Drupal.t() function.

The Drupal.t() function can take up to two arguments. They are (in order):

  1. The string that should be translated.
  2. An object containing name/value pairs for substitution into the string.

Here's a brief example that makes use of both:

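// Map each placeholder to the value that should replace it.
var params = {
  "@siteName": "Example.Com",
  "!url": "http://example.com/"
};
// Translate the string, then substitute the placeholder values.
var txt = Drupal.t("The URL for @siteName is !url.", params);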

In the code, we first create the params object that contains a mapping of placeholders to text. What is this mapping for? Look at the call to Drupal.t() on the last line. The Drupal.t() function takes a string and the params mapping we created.

The string looks like this: The URL for @siteName is !url. There are two placeholders in this string, @siteName and !url. When the Drupal.t() function is executed, the placeholders will be replaced by values from the params object.

In this case, @siteName will be replaced by Example.Com, and !url will be replaced by http://example.com/. So the English rendering of the string would be The URL for Example.Com is http://example.com/.

But wait! There are a couple of details to fill in. First of all, why are we using placeholders in the first place? And second, what are the @ and ! signs for?

In answer to the first question, placeholders should be used for any values that should not be translated. The example uses a proper name for @siteName and a URL for !url. In cases like this, translation would be unnecessary. Presumably, the site name and URL are the same in all languages.

This is a simple case where placeholders might be used. However, it's not all that common in practical cases.

A more realistic use of placeholders is to substitute in values that are not known at translation time. To elaborate on the example, consider the case where the site name and site URL are retrieved from some other object. Let's say we have an object called SiteInfo that contains this information. (This is a fictional example. There is no such object.)

Our params object might look like this instead:

// SiteInfo is a fictional object, used here only for illustration.
var params = {
  "@siteName": SiteInfo.name,
  "!url": SiteInfo.url
};

Here, the values of these variables may not be known until runtime, long after the translation has been generated. So using placeholders clearly makes sense.

Translations are created by humans, and the process of translation is mostly handled manually. We will see this process in a few minutes. But nothing magical happens at runtime. Translated strings are simply substituted for the default (usually English-language) text.

Placeholders are then used in cases where values need to be inserted into a translated string, but where the values themselves should not be translated as part of that string.

In answer to the second question, placeholders can be demarcated by three different symbols: @, %, and !. Any word (alphanumeric characters surrounded by whitespace) inside a translation string that begins with one of those three characters will be treated as a placeholder.

Each of these three placeholder symbols serves a special purpose. Each indicates to Drupal.t() how the string should be substituted in, as explained here:

  • Placeholders that begin with the @ symbol are escaped for display in HTML. For example, if we have a param that looks like this: '@tag': '<p>', Drupal.t() will convert the value to &lt;p&gt; before substituting it into the target string. Most of the time, you should use this method of escaping to prevent security holes.
  • Placeholders that begin with ! are inserted verbatim. Drupal does not encode any of these. This should be used with care, for it could open security holes that might, for instance, allow XSS attacks.
  • Finally, placeholders that begin with % are first encoded (like @ placeholders), and then themed for emphasis. This means that, in the default Drupal configuration, the resulting string will be placed inside <em></em> tags. Using the example '%tag': '<p>', the output would be <em>&lt;p&gt;</em>.

So what should you use and when? Most of the time, placeholders should be prefixed with @. That will do the encoding, but without necessarily adding any additional format (like % does). Placeholders should only begin with ! when escaping content would damage the output, and when the value to be substituted is known. For example, you shouldn't take user-entered text and then use a ! placeholder.
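The three prefixes can be summarized in a short sketch. It imitates the behavior described above and is not the implementation found in drupal.js. The escapeHtml() and substitute() functions are hypothetical names, and the emphasis markup is hardcoded here rather than passed through the theme system.

```javascript
// Escape characters that are special in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Substitute placeholders into str according to their prefix.
function substitute(str, params) {
  for (var key in params) {
    if (!params.hasOwnProperty(key)) {
      continue;
    }
    var value;
    switch (key.charAt(0)) {
      case '@': // Escaped for display in HTML.
        value = escapeHtml(params[key]);
        break;
      case '!': // Inserted verbatim, so only use with trusted values.
        value = String(params[key]);
        break;
      case '%': // Escaped, then wrapped for emphasis.
        value = '<em>' + escapeHtml(params[key]) + '</em>';
        break;
      default:  // Not a placeholder; leave the string alone.
        continue;
    }
    // split/join replaces every occurrence of the placeholder.
    str = str.split(key).join(value);
  }
  return str;
}

var out = substitute('Use the @tag tag on !url, says %user.', {
  '@tag': '<p>',
  '!url': 'http://example.com/',
  '%user': 'A & B'
});
console.log(out);
// Use the &lt;p&gt; tag on http://example.com/, says <em>A &amp; B</em>.
```

The sketch also shows why ! is the risky choice: its value reaches the page untouched, so a value containing a script tag would be inserted as live markup.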

That's how the Drupal.t() function works.

When should a string be translated?

Ideally, every static piece of text in your application—labels, help text, descriptions, and so on—should be translated. Of course, there are exceptions. For example, proper nouns are usually not translated.

The Drupal.formatPlural() function

The second translation function is Drupal.formatPlural(). As you may have guessed from the name of the function, its job is to format a reference to singular and plural objects. This comes from the problem that in many languages (English and Spanish are good examples) single items and plural items have different suffixes. For example, we say "Johnny has 1 apple" and "Johnny has 2 apples". We also say, "Johnny has 0 apples."

So 1 is the only singular case in English. Not all languages work this way. French, for example, treats 0 as singular. To handle this in a translation-friendly way (not all languages add s to form a plural), Drupal contains a function Drupal.formatPlural() that can determine whether the current case needs a singular form or a plural form.

This function takes these arguments:

  • A number (If it is 1, then the singular will be used, otherwise the plural form will be used.)
  • A singular string (in English)
  • A plural string (in English)

Elaborating our example, we might have code that looks something like this:

for (var i = 0; i < 6; ++i) {
  alert(
    Drupal.formatPlural(i, "Johnny has 1 apple.", "Johnny has @count apples.")
  );
}

The important part is the call to Drupal.formatPlural(), which is wrapped in alert() so that each result is displayed.

When this script is run, it will loop six times and pop-up an alert message each time. Each time Drupal.formatPlural() is called, it will be passed i and the singular and plural strings.

If the value of i is 1 then the alert will say Johnny has 1 apple. In all other cases, the third parameter will be used: Johnny has @count apples. The @count placeholder is automatically replaced with the value of i. So for the first loop, we get Johnny has 0 apples. On the third loop, we get Johnny has 2 apples.

But this function doesn't just toggle between two strings. It uses the translation subsystem to translate the selected string too. So if the language is set to German and i is 0, the output should look something like this (assuming the German translation exists): Johnny hat 0 Äpfel.
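Putting the selection, translation, and @count substitution together gives something like the following sketch. It is a simplification for illustration and not the drupal.js source. The pluralTranslations table and the formatPluralSketch() function are invented, and the sketch assumes the simple two-form rule described above, so it would not cover languages with more than two plural forms.

```javascript
// Hypothetical German translation table for the two apple strings.
var pluralTranslations = {
  'Johnny has 1 apple.': 'Johnny hat 1 Apfel.',
  'Johnny has @count apples.': 'Johnny hat @count Äpfel.'
};

// Choose the singular or plural string, translate it if possible,
// then fill in the @count placeholder.
function formatPluralSketch(count, singular, plural) {
  var chosen = (count === 1) ? singular : plural;
  var translated = pluralTranslations.hasOwnProperty(chosen)
    ? pluralTranslations[chosen]
    : chosen;
  return translated.split('@count').join(String(count));
}

for (var n = 0; n < 3; ++n) {
  console.log(
    formatPluralSketch(n, 'Johnny has 1 apple.', 'Johnny has @count apples.')
  );
}
// Johnny hat 0 Äpfel.
// Johnny hat 1 Apfel.
// Johnny hat 2 Äpfel.
```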

That's all there is to the Drupal.formatPlural() function. The next thing we will look at is how to translate a string and make it available to your JavaScript.

Adding a translated string

When we create a translation for our content, we want to fulfill two goals:

  1. Build a translation in such a way that the Drupal.t() function can make use of it.
  2. Make this translation portable, so that we can use the same JavaScript on different servers. Even if we are only planning on using our JavaScript on a single site, we want it to be portable for ease of migration or rebuilding.

The easiest way to meet these two goals is to install a special module. This module is called the Translation template extractor. It basically analyzes our code, looking for the Drupal.t() calls. It then generates a template that we can easily modify to add our translation.
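To get a feel for what such an extractor does, here is a deliberately naive sketch. It is not how the Translation template extractor is implemented. The extractStrings() and toTemplate() functions are hypothetical, the pattern only handles Drupal.t() calls whose first argument is a simple string literal without escaped quotes, and Drupal.formatPlural() calls are ignored.

```javascript
// Find Drupal.t('...') and Drupal.t("...") calls and return the
// unique strings in the order they first appear.
function extractStrings(source) {
  var pattern = /Drupal\.t\(\s*(["'])(.*?)\1/g;
  var found = [];
  var seen = {};
  var match;
  while ((match = pattern.exec(source)) !== null) {
    var text = match[2];
    if (!seen.hasOwnProperty(text)) {
      seen[text] = true;
      found.push(text);
    }
  }
  return found;
}

// Render the strings as gettext-style template entries with empty
// translations, ready for a translator to fill in.
function toTemplate(strings) {
  var lines = [];
  for (var i = 0; i < strings.length; i++) {
    lines.push('msgid "' + strings[i].replace(/"/g, '\\"') + '"');
    lines.push('msgstr ""');
    lines.push('');
  }
  return lines.join('\n');
}

var source = "alert(Drupal.t('hello world'));\n" +
  'var txt = Drupal.t("The URL for @siteName is !url.", params);';
console.log(toTemplate(extractStrings(source)));
```

Each msgid holds the original string exactly as it appears in the code, placeholders included, and each empty msgstr is where the translation goes.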

To get this module, go to http://drupal.org/project/potx and get the latest release. The release contains both a module and a command-line tool. If you like, you can use the command-line version. However, the module version is very easy to use. It is installed simply by moving the potx/ folder in the downloaded module to your sites/all/modules directory, and then installing the module in the usual way by visiting Administer | Site building | Modules.

The main thing this module does is add a new tab to the Administer | Site building | Translate interface page.

The Extract tab (on the far right of the list of tabs) is the one that we use to parse our files and get the strings for translation.

This interface will generate a special file called a POT file, which maps the original untranslated text to translated strings.

Drupal uses the GNU gettext system for translation. Learn more about it at http://www.gnu.org/software/gettext/.

Once you've gone through the process of translating strings in this POT file, it is just a matter of putting the translation file in the right place in the theme (or module) directory.

Translating JavaScript, translating PHP
We are focused on the JavaScript here. However, PHP translations are done in exactly the same way. There's no need to learn two translation systems—the two are fully integrated.

Summary

In this article, we focused on the translation system in Drupal. We looked at installing and configuring multiple languages, and then at using the JavaScript Drupal.t() and Drupal.formatPlural() functions. We've seen an important and powerful aspect of Drupal: the translation system.

About the Author :


Matt Butcher

Matt is a web developer and author. He has previously written five other books for Packt, including two others on Drupal. He is a senior developer for the New York Times Company, where he works on ConsumerSearch.com, one of the most traffic-heavy Drupal sites in the world. He is the maintainer of multiple Drupal modules and also heads QueryPath, a jQuery-like PHP library. He blogs occasionally at http://technosophos.com.

 
