Practical tips on building an international presence
When we talk about sites that support multiple languages, we talk about internationalisation (and its short form, i18n). We also talk about localisation (l10n), locales, languages, and countries.
Internationalisation is the process of taking an application and making it work for more than one locale or language. For an existing application, it is the process of abstracting the language- and locale-specific material into some sort of configuration.
We replace static text with replaceable tokens (sometimes the tokens are English sentences).
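As a minimal sketch of token replacement, assume a small in-memory catalogue keyed by locale. The names `CATALOGUE`, `DEFAULT_LOCALE` and `translate` are hypothetical, invented for illustration; a real application would more likely load its strings from translation files.

```python
from string import Template

# Hypothetical catalogue: token -> translated template, per locale.
CATALOGUE = {
    "en-GB": {"greeting": "Welcome back, $name!"},
    "es-ES": {"greeting": "¡Bienvenido de nuevo, $name!"},
}
DEFAULT_LOCALE = "en-GB"  # assumption: the locale we fall back to


def translate(token: str, locale: str, **params: str) -> str:
    """Look up a token for a locale, falling back to the default locale,
    and finally to the token itself so missing strings stay visible."""
    messages = CATALOGUE.get(locale, {})
    text = messages.get(token) or CATALOGUE[DEFAULT_LOCALE].get(token, token)
    # safe_substitute leaves unknown placeholders in place instead of raising.
    return Template(text).safe_substitute(**params)


print(translate("greeting", "es-ES", name="Ana"))
print(translate("greeting", "fr-FR", name="Ana"))  # no fr-FR entry: falls back
```

Falling back to the token itself, rather than raising an error, is a design choice: an untranslated string shows up on the page where someone can spot it.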
We replace dates with methods that take a standard, locale-neutral date format (like ISO 8601) and render locale-appropriate date formats (25th June 2014).
We replace currency amounts with values that carry not just the right currency but also the right number formatting. Also keep an eye on measurement systems (metric versus imperial, or something else entirely).
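A rough sketch of that idea: store and pass dates as ISO 8601 strings, and only choose a display format at render time. The month tables and the three patterns below are hand-written assumptions covering just en-GB, en-US and es-ES; in practice this data would come from a locale library such as Babel, not from code like this.

```python
from datetime import date

# Hard-coded month names for two languages only (illustrative assumption).
MONTHS = {
    "en": ["January", "February", "March", "April", "May", "June",
           "July", "August", "September", "October", "November", "December"],
    "es": ["enero", "febrero", "marzo", "abril", "mayo", "junio",
           "julio", "agosto", "septiembre", "octubre", "noviembre", "diciembre"],
}


def ordinal(day: int) -> str:
    """Return an English ordinal such as 1st, 2nd, 3rd, 11th, 25th."""
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def format_long_date(iso_date: str, locale: str) -> str:
    """Render an ISO 8601 date (YYYY-MM-DD) in a locale's long format."""
    d = date.fromisoformat(iso_date)
    language = locale.split("-")[0]
    month = MONTHS[language][d.month - 1]
    if locale == "en-GB":
        return f"{ordinal(d.day)} {month} {d.year}"
    if locale == "en-US":
        return f"{month} {d.day}, {d.year}"
    if locale == "es-ES":
        return f"{d.day} de {month} de {d.year}"
    raise ValueError(f"no date pattern configured for {locale}")


print(format_long_date("2014-06-25", "en-GB"))  # 25th June 2014
print(format_long_date("2014-06-25", "en-US"))  # June 25, 2014
```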
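To illustrate why the currency and the number format are separate concerns, here is a small sketch that formats the same amount for two locales. The `FORMATS` and `SYMBOLS` tables are hypothetical and cover only en-US and de-DE; the sketch also assumes non-negative amounts with two decimal places.

```python
from decimal import Decimal

# Hypothetical per-locale number formatting rules (assumption, two locales only).
FORMATS = {
    "en-US": {"group": ",", "decimal": ".", "pattern": "{symbol}{number}"},
    # \u00a0 is a non-breaking space between the number and the symbol.
    "de-DE": {"group": ".", "decimal": ",", "pattern": "{number}\u00a0{symbol}"},
}
SYMBOLS = {"USD": "$", "EUR": "€"}


def format_currency(amount: Decimal, currency: str, locale: str) -> str:
    """Format a non-negative amount using the locale's separators and layout."""
    fmt = FORMATS[locale]
    # Start from a neutral "1,234.50" form, then swap in the locale's separators.
    integer_part, fraction_part = f"{amount:,.2f}".split(".")
    number = integer_part.replace(",", fmt["group"]) + fmt["decimal"] + fraction_part
    return fmt["pattern"].format(symbol=SYMBOLS[currency], number=number)


print(format_currency(Decimal("1234.5"), "USD", "en-US"))  # $1,234.50
print(format_currency(Decimal("1234.5"), "EUR", "de-DE"))  # 1.234,50 €
```

Note that the currency is passed separately from the locale: a German reader may still need to see a price in US dollars, formatted the German way.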
We move country-specific content into a framework or structure that recognises which locale it is in. The worst case is a set of if/then/else branches; a better case is a templating framework that allows specialisation by country and language; the best case is perhaps locale-specific feature flags.
There are even worse ways of doing locale-specific functionality, but always consider the ongoing maintenance costs of each approach and find the best long-term strategy.
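One way to picture locale-specific feature flags is a table of defaults plus per-locale overrides. This is only a sketch: the flag names and the `FEATURE_FLAGS` structure are made up for illustration.

```python
# Hypothetical flag table: defaults first, then per-locale overrides.
FEATURE_FLAGS = {
    "default": {"cookie_banner": False, "imperial_units": False},
    "en-GB": {"cookie_banner": True},
    "en-US": {"imperial_units": True},
}


def is_enabled(feature: str, locale: str) -> bool:
    """Return the locale's override if it has one, otherwise the default."""
    overrides = FEATURE_FLAGS.get(locale, {})
    return overrides.get(feature, FEATURE_FLAGS["default"].get(feature, False))


if is_enabled("imperial_units", "en-US"):
    print("show distances in miles")
```

Compared with if/then/else branches scattered through the code, the locale differences live in one place, so adding a new locale is a configuration change and not a code change.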
Localisation is the step of taking an internationalised codebase and creating the configuration necessary to adapt it for the chosen language or locale.
This involves populating translation strings and configuring number, date and currency formatting.
We also need to ensure the content is suitable for the new locale. Sometimes we need functionality specific to that locale, so our application framework needs to offer a simple way of enabling it.
We also need to consider styling. In Western cultures, black and white are neutral colours and red means warning. In China, white is symbolic of death, and red of luck and good fortune, so red error messages will be interpreted differently. There are also left-to-right and right-to-left writing systems to consider.
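Writing direction is one styling concern that can be derived from the locale. As a small sketch, assume a hand-picked set of right-to-left languages (the set below is deliberately incomplete) and return a value suitable for an HTML `dir` attribute:

```python
# A few right-to-left languages: Arabic, Hebrew, Persian, Urdu.
# This set is illustrative, not exhaustive.
RTL_LANGUAGES = {"ar", "he", "fa", "ur"}


def text_direction(locale: str) -> str:
    """Return "rtl" or "ltr" based on the language part of the locale."""
    language = locale.split("-")[0].lower()
    return "rtl" if language in RTL_LANGUAGES else "ltr"


print(text_direction("ar-EG"))  # rtl
print(text_direction("en-GB"))  # ltr
```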
We use the term locale instead of language. Yes, at its heart internationalisation is mostly about language, but language alone isn’t a sufficient identifier for localisation. We think of a locale as a country-plus-language configuration (a cultural identifier). Although it still overlooks minority groups and edge cases, this distinction gets us very far down the road to an internationalised application.
Not everyone in Spain speaks Spanish, so country-wide news coverage needs to consider not just Spanish, but Catalan and Basque too.
Is English a single language? Most of the time you can get away with US English in a UK market, but dates catch people out (is 04/06/2014 the 4th of June or the 6th of April?), and word usage differs too (trousers, car boots and moon-bags, for example). It depends on your locale. (Also, which sport does football refer to? Football, or American football?)
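The date ambiguity is easy to demonstrate. In this sketch the same string is parsed with a day-first pattern for en-GB and a month-first pattern for en-US; the `SHORT_DATE_PATTERNS` table is an assumption written by hand for these two locales.

```python
from datetime import date, datetime

# Hand-written short date patterns for two English locales (assumption).
SHORT_DATE_PATTERNS = {
    "en-GB": "%d/%m/%Y",  # day first
    "en-US": "%m/%d/%Y",  # month first
}


def parse_short_date(text: str, locale: str) -> date:
    """Parse a short numeric date according to the locale's convention."""
    return datetime.strptime(text, SHORT_DATE_PATTERNS[locale]).date()


print(parse_short_date("04/06/2014", "en-GB"))  # 2014-06-04
print(parse_short_date("04/06/2014", "en-US"))  # 2014-04-06
```

Same language, same input, two different days: the country half of the locale decides which one the user meant.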
The locale is a union of language and country. It recognises that Spanish in Spain might be culturally different to Spanish in the USA.
China has two main spoken languages (among hundreds of regional dialects): Mandarin (the predominant one) and Cantonese (spoken in the southern province of Guangdong and in Hong Kong).
But all literate Chinese can communicate with each other with the same written character sets, regardless of their spoken language.
Mainland China uses Simplified Chinese characters, while Taiwan still uses Traditional characters. Presenting the right characters to the right audience is a key part of internationalisation, as much as presenting the right language.
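As a sketch of choosing the character set from the locale and not from the language alone, the snippet below maps two locales to the ISO 15924 script codes Hans (Simplified) and Hant (Traditional). The mapping and the `welcome` helper are illustrative assumptions and cover only the two cases mentioned above.

```python
# Locale -> script code: Hans = Simplified, Hant = Traditional.
SCRIPT_BY_LOCALE = {
    "zh-CN": "Hans",
    "zh-TW": "Hant",
}

# The word "welcome" written in each script.
WELCOME = {
    "Hans": "欢迎",
    "Hant": "歡迎",
}


def welcome(locale: str) -> str:
    """Return the greeting in the script used by the given locale."""
    try:
        script = SCRIPT_BY_LOCALE[locale]
    except KeyError:
        raise ValueError(f"no script configured for {locale}") from None
    return WELCOME[script]


print(welcome("zh-CN"))
print(welcome("zh-TW"))
```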
That’s why we use both a language code and a country code to define a locale. It gives us two dimensions, a primitive but useful starting point for opening up to international markets.
It is not perfect. For example, how do we identify the growing community in the US that uses a mixture of Spanish and English (Spanglish)? Is it US English or US Spanish?
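The two dimensions can be modelled directly. The sketch below assumes simple tags of the form language-COUNTRY (no script or variant parts) and a hypothetical default of plain English; `Locale`, `parse_locale` and `fallback_chain` are names invented for this example.

```python
from typing import List, NamedTuple, Optional


class Locale(NamedTuple):
    language: str                  # e.g. "es"
    country: Optional[str] = None  # e.g. "US"; None means language only


def parse_locale(tag: str) -> Locale:
    """Parse tags like "es-US" or "es_us" into a normalised Locale."""
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    country = parts[1].upper() if len(parts) > 1 else None
    return Locale(language, country)


def fallback_chain(locale: Locale, default: Locale = Locale("en")) -> List[Locale]:
    """List the locales to try in order: exact match, language only, default."""
    chain = [locale]
    if locale.country is not None:
        chain.append(Locale(locale.language))
    if default not in chain:
        chain.append(default)
    return chain


print(fallback_chain(parse_locale("es_us")))
```

With a chain like this, Spanish in the USA can share most of its strings with generic Spanish while overriding only what differs.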