Browser Language Detection and Redirection [en]

[fr] Une explication de la méthode que j'ai suivie pour que http://stephanie-booth.com redirige le visiteur soit vers la version anglaise du site, soit la française, en fonction des préférences linguistiques définies dans son navigateur.

**Update, 29.12.2007: scroll to the bottom of this post for a more straightforward solution, using Multiviews.**

I’ve been working on [stephanie-booth.com](http://stephanie-booth.com) today. One of my objectives is the add an [English version](http://stephanie-booth.com/en/) to the previously [French-only site](http://stephanie-booth.com/fr/).

I’m doing this by using two separate installations of WordPress. The content in both languages should be roughly equivalent, and I’ll write a WordPress plugin which allows to “automate” the process of linking back and forth from equivalent content in different languages.

What I did today is solve a problem I’ve been wanting to attack for some time now: use people’s browser settings to direct them to the “correct” language for the site. Here is what I learnt in the process, and how I did it. It’s certainly not the most elegant way to do things, so let me know if you have a better solution by using the comments below.

First, what I needed to know was that the browser language preferences are sent in the HTTP_ACCEPT_LANGUAGE header (HTTP header). First, I thought of [capturing the information through PHP](http://www.webdeveloper.com/forum/showpost.php?s=d989341270ceae8820a3bc1c6273dc9e&p=217863&postcount=2), but I discovered that Apache (logical, if you think of it) could handle it directly.

[This tutorial was useful in getting me started](http://www.ibm.com/developerworks/library/wa-apac.html), though I think it references an older version of Apache. Out of the horses mouth, [Apache content negotiation](http://httpd.apache.org/docs/2.0/content-negotiation.html) had the final information I needed.

I’ll first explain the brief attempt I did with Multiviews (because it can come in handy) before going through the setup I currently have.

### Multiviews

In this example, you request a file, e.g. test.html which doesn’t physically exist, and Apache uses either test.html.en or test.html.fr depending on your language preferences. You’ll still see test.html in your browser bar, though.

To do this, add the line:

Options +Multiviews

to your .htaccess file. Create the files test.html.en and test.html.fr with sample text (“English” and “French” will do if you’re just trying it out).

Then, request the file test.html in your browser. You should see the test content of the file corresponding to your language settings appear. Change your browser language prefs and reload to see what happens.

This is pretty neat, but it forces you to open a file — and I wanted / to redirect either to /en/ or to /fr/.

It’s explained pretty well in [this tutorial I already linked to](http://www.ibm.com/developerworks/library/wa-apac.html), and [this page has some useful information](http://unix.org.ua/orelly/linux/apache/ch06_03.htm) too.

### Type maps

I used a type map and some PHP redirection magic to achieve my aim. A type map is not limited to languages, but this is what we’re going to use it for here. It’s a text file which you can name menu.var for example. In that case, you need to add the following line to your .htaccess so that the file is dealth with as a type map:

AddHandler type-map .var

Here is the content of my type-map, which I named menu.var:

URI: en.php
Content-Type: text/html
Content-Language: en, en-us, en-gb

URI: fr.php
Content-Type: text/html
Content-Language: fr, fr-ch, fr-qc

Based on my tests, I concluded that the value for URI in the type map cannot be a directory, so I used a little workaround. This means that if you load menu.var in the browser, Apache will serve either en.php or fr.php depending on the content-language the browser accepts, and these two PHP files redirect to the correct URL of the localized sites. Here is what en.php looks like:

And fr.php, logically:

Just in case somebody came by with a browser providing neither English nor French in the HTTP_ACCEPT_LANGUAGE header, I added this line to my .htaccess to catch [any 406 errors (“not acceptable”)](http://www.checkupdown.com/status/E406.html):

ErrorDocument 406 /en.php

So, if something goes wrong, we’re redirected to the English version of the site.

The last thing that needs to be done is to have menu.var (the type map) load automatically when we go to stephanie-booth.com. I first tried by adding a DirectoryIndex directive to .htaccess, but that messed up the use of index.php as the normal index file. Here’s the line for safe-keeping, if you ever need it in other circumstances, or if you want to try:

DirectoryIndex menu.var

Anyway, I used another PHP workaround. I created an index.php file with the following content:

And there we are!

### Accepted language priority and regional flavours

In my browser settings, I’ve used en-GB and fr-CH to indicate that I prefer British English and Swiss French. Unfortunately, the header matching is strict. So if the order of your languages is “en-GB, fr-CH, fr, en” you will be shown the French page (en-GB and fr-CH are ignored, and fr comes before en). It’s all explained in the Apache documentation:

> The server will also attempt to match language-subsets when no other match can be found. For example, if a client requests documents with the language en-GB for British English, the server is not normally allowed by the HTTP/1.1 standard to match that against a document that is marked as simply en. (Note that it is almost surely a configuration error to include en-GB and not en in the Accept-Language header, since it is very unlikely that a reader understands British English, but doesn’t understand English in general. Unfortunately, many current clients have default configurations that resemble this.) However, if no other language match is possible and the server is about to return a “No Acceptable Variants” error or fallback to the LanguagePriority, the server will ignore the subset specification and match en-GB against en documents. Implicitly, Apache will add the parent language to the client’s acceptable language list with a very low quality value. But note that if the client requests “en-GB; q=0.9, fr; q=0.8”, and the server has documents designated “en” and “fr”, then the “fr” document will be returned. This is necessary to maintain compliance with the HTTP/1.1 specification and to work effectively with properly configured clients.

Apache, Content Negotiation

This means that I added regional language codes to the type map (“fr, fr-ch, fr-qc”) and also that I changed the order of my language preferences in Firefox, making sure that all variations of one language were grouped together, in the order in which I prefer them:

Language Prefs in Firefox

### Catching old (now invalid) URLs

There are lots of incoming links to pages of the French site, where it used to live — at the web root. For example, the contact page address used to be http://stephanie-booth.com/contact, but it is now http://stephanie-booth.com/fr/contact. I could write a whole list of permanent redirects in my .htaccess file, but this is simpler. I just copied and modified the rewrite rules that WordPress provides, to make sure that the correct blog installation does something useful with those old URLs (**bold** is my modification):

# BEGIN WordPress

RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . **/fr**/index.php [L]

# END WordPress

In this way, as you can check, [http://stephanie-booth.com/contact](http://stephanie-booth.com/contact) is not broken.

### Next steps

My next mission is to write a small plugin which I will install on both WordPress sites (I’ve got to write it for a client too, so double benefit). This plugin will do the following:

– add a field to the write/edit post field in which to type the post slug of the correponding page/post in the other language *(e.g. “particuliers” in French will be “individuals” in English)
– add a link to each post pointing to the equivalent page in the other language

It’s pretty basic, but it beats manual links, and remains very simple. (I like simple.)

As I said, if you have a better (simpler!) way of doing all this, please send it my way.

### A simpler solution **[Added 29.12.2007]**

For each language, create a file named index.php.lg where “lg” is the language code. For French, you would create index.php.fr with the following content:

Repeat for each language available.

**Do not** put an index.php file in your root directory, just the index.php.lg files.

Add the two following lines to your .htaccess:

Options +Multiviews
ErrorDocument 406 /fr/

…assuming French is the default language you want your site to show up in if your visitor’s browser doesn’t accept any of the languages you provide your site in.

You’re done!

Similar Posts: