practical tips on building an international presence
Most of the time the HTML for a site doesn’t need to change as you localise an already internationalised site for a specific locale. Your website still needs the flexibility to specify locale-specific markup, though, so it is well worth having a system in place that allows that. Implemented properly, a mechanism for specialising templates along multiple dimensions (locale being just one of them) can be a very powerful tool.
The typical structure for localising markup is to split a page into multiple templates and components. Typically these reusable templates are derived from the structure of your website, but essentially any repeating pattern of markup should have its own template (be it a component or a partial).
The idea is that we have one place that defines how a particular design element is rendered.
An important feature of an effectively internationalised site is the ability to override specific templates or components for locale-specific requirements. An override might customise the markup of a particular component, suppress that component entirely, or display something not available in any other locale.
In its most basic form, what you want is a templating system that, when a template is required, does not look in one specific place but searches a series of locations (based on the requested locale) until it finds the most appropriate one, and then uses that. The most appropriate template can be anything from one specialised just for that locale right up to the generic default that applies to all locales.
Essentially, in a well internationalised site there’s a generic set of components that are applicable to most locales most of the time. Then there are locale-specific overrides that either customise a component, stop it from rendering, or insert something not offered in the generic set. In this way each locale can override any template at any point.
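The lookup described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the code of any particular framework: the names `GENERIC`, `search_path` and `resolve_template` are hypothetical, the template store is a plain in-memory dictionary, and the intermediate language-only level is just one possible choice of search path.

```python
from typing import Dict, List, Optional, Tuple

GENERIC = "generic"  # hypothetical name for the default level shared by all locales


def search_path(locale: str) -> List[str]:
    """Return the levels to search, most specific first, generic default last."""
    levels = [locale]
    if "-" in locale:
        # Assumption: a language-only level sits between the full locale and the default.
        levels.append(locale.split("-", 1)[0])
    levels.append(GENERIC)
    return levels


def resolve_template(
    name: str, locale: str, templates: Dict[Tuple[str, str], str]
) -> Optional[str]:
    """Return the first template found along the search path, or None if none exists."""
    for level in search_path(locale):
        body = templates.get((level, name))
        if body is not None:
            return body
    return None


# In-memory stand-in for a template store, keyed by (level, template name).
TEMPLATES = {
    ("generic", "header"): "<h1>Finance</h1>",
    ("generic", "footer"): "<p>Generic footer</p>",
    ("fr", "footer"): "<p>Pied de page</p>",
    ("fr-CA", "header"): "<h1>Finance Canada</h1>",
}
```

With this data, a request for the header in fr-CA finds the locale's own template, the footer falls back to the language level, and any locale with no overrides at all ends up with the generic default.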
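The three kinds of override (customise, suppress, insert) might look something like the following sketch. The `SUPPRESS` marker, the slot list and the component names are all hypothetical, chosen purely for illustration; a real templating system would have its own way of expressing "render nothing here".

```python
SUPPRESS = object()  # hypothetical marker meaning "this locale renders nothing here"


def lookup(name, locale, templates):
    """Find a component at the locale level first, then at the generic level."""
    for level in (locale, "generic"):
        if (level, name) in templates:
            return templates[(level, name)]
    return None


def render_page(slots, locale, templates):
    """Render each slot in order, skipping components that are missing or suppressed."""
    parts = []
    for name in slots:
        found = lookup(name, locale, templates)
        if found is None or found is SUPPRESS:
            continue
        parts.append(found)
    return "\n".join(parts)


SLOTS = ["header", "quotes", "notice"]

TEMPLATES = {
    ("generic", "header"): "<h1>Finance</h1>",
    ("generic", "quotes"): "<div>quotes</div>",
    ("en-GB", "header"): "<h1>Finance UK</h1>",  # customise a generic component
    ("en-CA", "quotes"): SUPPRESS,               # stop a generic component rendering
    ("en-CA", "notice"): "<p>CA notice</p>",     # insert something no other locale has
}
```

Note that the "notice" slot has no generic template, so it only appears for the one locale that defines it.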
One very common mistake in inheritance is to use a site’s primary locale as the default level of localisation. Yahoo US made this mistake time and time again, mainly because they built the US version of each media property first (the US being Yahoo’s primary market). When they created a Canadian version of Yahoo Finance (in English only: ca.finance.yahoo.com), for example, they did not refactor the code so that a generic localisation level sat at the top, with both the US and Canadian versions specialising from it. Instead they kept the US version at the top of the tree, and the Canadian version became a descendant of it.
This created a serious maintenance headache. Since the Canadian audience was minuscule in comparison to the US audience, there was no dedicated development team for it. Of course, with a proper internationalisation framework in place this isn’t a problem. But because Canada inherited from the US codebase, every change to the US codebase became immediately available to Canada, including features that could contractually only be offered in the US. That meant that for every change to the US codebase, some poor sap needed to undo that change for Canada by finding the previous version of the changed template and localising it down to Canada. This do/undo process wasted a large number of developer hours.
They realised this approach was a minefield, but instead of refactoring the Finance property to have a generic level, rather than the US, as the default localisation, they copied the templates for Canada into a separate codebase and had a dedicated team in Canada support their own offshoot.
Luckily, Yahoo in Europe had a tiny fraction of the resources the US had access to, and in that scarcity they adopted a much better approach: they built their own bespoke templating system with a generic localisation level at the top. That allowed Yahoo Europe to support five countries, and a dozen independent media sites, with one small team of developers maintaining the templates. (This worked well for close on a decade, until Yahoo US decided that global properties, one codebase with everything inheriting from the US locale, was the preferred solution.)
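The difference between the two inheritance layouts in this story can be shown with a toy example. Everything here is made up for illustration (the level names, the "promo" component, the data); it is not how Yahoo's code was organised, only a sketch of why the root of the inheritance chain matters.

```python
def resolve(name, chain, templates):
    """Walk an inheritance chain, most specific level first."""
    for level in chain:
        if name in templates.get(level, {}):
            return templates[level][name]
    return None


# Layout 1: the primary locale (US) is the root, and Canada inherits from it.
US_ROOTED_TEMPLATES = {
    "us": {"quotes": "quotes table", "promo": "US-only promotion"},
    "ca": {},
}
US_ROOTED_CHAINS = {"us": ["us"], "ca": ["ca", "us"]}

# Layout 2: a generic level is the root, and both locales specialise from it.
GENERIC_ROOTED_TEMPLATES = {
    "generic": {"quotes": "quotes table"},
    "us": {"promo": "US-only promotion"},
    "ca": {},
}
GENERIC_ROOTED_CHAINS = {"us": ["us", "generic"], "ca": ["ca", "generic"]}

# Under layout 1 the US-only component leaks into Canada...
leaked = resolve("promo", US_ROOTED_CHAINS["ca"], US_ROOTED_TEMPLATES)

# ...while under layout 2 Canada never sees it and needs no "undo" override.
contained = resolve("promo", GENERIC_ROOTED_CHAINS["ca"], GENERIC_ROOTED_TEMPLATES)
```

In the second layout Canada still gets the shared quotes component for free, which is the whole point of the generic level.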
At Yahoo! Europe we had a very powerful in-house template editing system (developed in the 20th century) that allowed us to build a generic Yahoo media site. Then we had the ability to specialise these templates to various dimensions:
From the perspective of the page, every template had a localisation path: a list of locations in which to look for a specific template, starting from the most specific (the template-specific level) and going up the list until an existing component was found.
The actual inheritance structure was a lot more complicated, and frankly ingenious. Proper localisation works along a set of dimensions, and each dimension varies independently of the others, which makes a single localisation tree impractical and limiting. Dimensions are the factors that can affect how a component displays. As I’ve mentioned, at Yahoo the site itself is one major dimension (a Sports site, as opposed to a News site); then we have dimensions for the country, the data provider, the site section, the type of page, the type of data, and even the individual template itself. New dimensions could be added fairly simply: when we needed to co-brand a section of the site for a particular advertiser/partner, we would introduce that as a new dimension and specialise any templates needing customisation to that level. Then, when the advertising campaign was over, we’d just remove the dimension from the specialism path.
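To make the idea of dimensions concrete, here is a deliberately simplified sketch. It is not how Jake worked internally (the real structure was more involved than a flat list); the function name, the data layout and the partner value "acme" are all assumptions of mine. It only shows how a specialism path made of independent dimensions can gain and lose a dimension without touching the templates themselves.

```python
def resolve(name, context, path, variants):
    """Pick the variant for the first dimension on the path that has one.

    context  -- dimension name -> value for the current request
    path     -- dimension names, most specific first
    variants -- (template name, dimension, value) -> body; the generic
                default is stored under (template name, None, None)
    """
    for dimension in path:
        value = context.get(dimension)
        if value is None:
            continue
        body = variants.get((name, dimension, value))
        if body is not None:
            return body
    return variants.get((name, None, None))


VARIANTS = {
    ("scoreboard", None, None): "generic scoreboard",
    ("scoreboard", "site", "sports"): "sports scoreboard",
    ("scoreboard", "country", "fr"): "French scoreboard",
    ("scoreboard", "partner", "acme"): "co-branded scoreboard",
}

CONTEXT = {"site": "sports", "country": "fr", "partner": "acme"}

NORMAL_PATH = ["country", "site"]
# During a campaign the partner dimension is added to the front of the path...
CAMPAIGN_PATH = ["partner"] + NORMAL_PATH
# ...and once it ends, going back to NORMAL_PATH switches the co-branding off.
```

The co-branded variant can stay in the store after the campaign; once "partner" is no longer on the path it is simply never consulted.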
Picking the right specialism level is not entirely straightforward; it needs careful thought. It’s far too easy to specialise a template right to the bottom so that you ensure there’s no impact elsewhere on the site. This approach isn’t ideal, and in the long term it proves more costly, despite being the simplest and safest way to implement a single change.
It takes knowledge of the available dimensions, and of what they are used for and where, to accurately identify the appropriate level of specialism for a change. You need confidence that making the change at a higher level won’t break parts of the site you are not immediately looking at. That means understanding the implications of your change, and being in a good position to identify and test the affected areas of the site.
Focusing just on the primary locale of the site is only viable when developers are very confident of the scope and impact of their changes. Developers do need to consider the localisation implications of their changes. It’s not enough to get it working in your preferred locale and just assume everywhere else will be fine. You need to know the impact, either before you make the change or by confirming it afterwards through thorough testing.
One weakness of dynamic, flexible templating systems is that it is hard to quickly identify which pages are affected by changing a template. Sometimes a grep isn’t going to be enough, if the template inclusion is something other than a static reference. A developer needs good tools that know about the templating framework and help identify affected pages.
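As a rough idea of what such a tool might do, the sketch below replays the template resolution for every page and locale and reports which combinations actually end up using the template you are about to change. The function names, the page-to-component mapping and the two-level lookup are all hypothetical simplifications; a real tool would need to hook into the framework's own resolution logic.

```python
def resolved_level(name, locale, templates):
    """Return the level whose copy of the template this locale would use."""
    for level in (locale, "generic"):
        if (level, name) in templates:
            return level
    return None


def affected_pages(changed, pages, locales, templates):
    """List (page, locale) pairs that render the changed (level, name) template."""
    level, name = changed
    hits = []
    for page, slots in pages.items():
        if name not in slots:
            continue
        for locale in locales:
            if resolved_level(name, locale, templates) == level:
                hits.append((page, locale))
    return sorted(hits)


# Which templates exist at which level (bodies are irrelevant here).
TEMPLATES = {
    ("generic", "header"),
    ("generic", "quotes"),
    ("en-CA", "quotes"),
}
PAGES = {"home": ["header", "quotes"], "about": ["header"]}
LOCALES = ["en-US", "en-CA"]
```

Here a change to the generic "quotes" template only touches the home page for en-US, because en-CA has its own override, whereas a change to the generic header touches every page in every locale.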
Although my experience of internationalisation-supporting frameworks is based on an internal Yahoo templating framework called Jake, it isn’t entirely Yahoo-specific. Jake is written in Perl. When Yahoo adopted PHP as its framework of choice, a team quickly got involved in creating a very flexible dimension-supporting templating system called r3. This got released as open source a few years ago. It’s very powerful, but it really needs someone who understands r3 to write us a guide on how to use it and wield it properly.
In the meantime, start with a templating system that allows you to define a generic default localisation for your website, plus a specialisation level for each locale, to which any template can be specialised for locale-specific customisation. This is the bare minimum for localisation of website templates. If you build it right and allow independent dimensions, though, you have a very powerful and flexible templating system that will do amazing things.
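On disk, that bare minimum can be as simple as one directory per level. The sketch below assumes a layout I have invented for illustration (a "generic" directory alongside one directory per locale, with identically named template files); it is not taken from Jake or r3.

```python
from pathlib import Path
from typing import Optional


def find_template(root: Path, locale: str, name: str) -> Optional[Path]:
    """Return the locale's own copy of a template if it exists, else the generic one.

    Assumed layout (hypothetical):
        root/generic/header.html
        root/en-CA/header.html
    """
    for level in (locale, "generic"):
        candidate = root / level / name
        if candidate.is_file():
            return candidate
    return None
```

A locale that needs no customisation does not even need a directory: every lookup simply falls through to the generic copy.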
At Yahoo we referred to the specialism path as the localisation path, although this path wasn’t strictly about localisation but also covered many mutually exclusive dimensions. Within the European webdev team we understood when we were talking about localisation of templates, and when we were talking about localisation in an internationalisation context. But using “localisation” for templates in this way can clearly be confusing, so calling this specific feature “specialism” and “specialising templates” makes the difference clearer. I’m adopting this nomenclature here.
The choice of country as the primary specialism level of localisation is a classic mistake of internationalisation: a locale or culture doesn’t necessarily map to a country. Yahoo’s Spanish News site contains stories in multiple languages, including Spanish, Catalan, Basque etc., but there’s only one Spanish news site. Instead of a country locale, Yahoo should have established a locale level (Spain-Spanish, Spain-Catalan, Spain-Basque): three separate locales, not one country with articles in three different languages. A specialism level for language is definitely needed, plus an extra level for cases where a geographic, natural or cultural boundary separates groups of people who share a country and a language but are still independent of each other. Yahoo’s insistence on countries defining locales is what limits their ability to accommodate citizens who do not use what Yahoo defines as that country’s acceptable language.
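One way to picture the alternative is to treat language and country as separate levels, so that the three locales within Spain share what is genuinely country-wide while differing in language. This is only a sketch: the `Locale` type, the level naming scheme and the order of the levels are my own assumptions, and the ordering in particular is a design choice that a real site would need to think through.

```python
from typing import Dict, List, NamedTuple, Optional


class Locale(NamedTuple):
    language: str  # e.g. "ca" for Catalan
    country: str   # e.g. "ES" for Spain


def specialism_path(locale: Locale) -> List[str]:
    """Most specific first: full locale, then language, then country, then generic."""
    return [
        f"locale:{locale.language}-{locale.country}",
        f"language:{locale.language}",
        f"country:{locale.country}",
        "generic",
    ]


def resolve(name: str, locale: Locale, templates: Dict[str, Dict[str, str]]) -> Optional[str]:
    for level in specialism_path(locale):
        if name in templates.get(level, {}):
            return templates[level][name]
    return None


TEMPLATES = {
    "generic": {"masthead": "News", "weather": "Weather"},
    "country:ES": {"weather": "Weather for Spain"},
    "language:es": {"masthead": "Noticias"},
    "language:ca": {"masthead": "Notícies"},
    "language:eu": {"masthead": "Albisteak"},
}

SPAIN_LOCALES = [Locale("es", "ES"), Locale("ca", "ES"), Locale("eu", "ES")]
```

Each of the three Spanish locales gets its own masthead from the language level, while all of them pick up the same country-level weather component.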