Requirements for a Multilingual WordPress Plugin [en]

[fr, translated] Some thoughts about a multilingual plugin for WordPress.

My blog has been bilingual for a long time now. I’ve hacked bilingualism into it and then plugged it in. Other plugins for multilingual bloggers have been written, and some unfortunately got stuck somewhere in the development limbo.

Defining specs is a hairy problem. They need to work for the person visiting the site (polyglot or monoglot). They need to work for the person (or people! translation often involves more than one person) writing the posts. They need to work for all the robots, search engines, and fancy browsers who deal with the site.

Here is what I would like a multiple language plugin to do (think “feature requirements”, suggestion, draft):

  1. Recognize the browser language preference of the visitor and serve “page furniture” and navigation in the appropriate language. This can be overridden by a cookie-set preference when clicking on a “language link”.
    • “WordPress” furniture can be provided by the normal localization files
    • how do we deal with other furniture content in the theme (navigation, taglines, etc.)? should the plugin provide guidelines for theme localization? do such guidelines already exist? extra information appreciated on this point
    • “language links” shouldn’t be flags, but language names in their respective languages; can this list be generated automatically based on present localization files? otherwise, can it be set in an admin panel?
    • upon “language change” (clicking on a language link), could the localization (action) be done in an AHAH- or AJAX-like way?
    • inevitable hairy problem: tag and category localization
  2. Manage “lazy multilingualism” in the spirit of the Basic Bilingual plugin and “true multilingualism” elegantly and on a per-post basis.
    • allow for “other language abstracts”
    • allow for actual other language version of the post
    • given the “general user language” defined above, show posts in that language if a version for that language exists, with mention of other language versions or abstracts
    • if no version exists in that language, show the post in the “main blog language” or “main post language” (worst case scenario: the WordPress install default) and show other language abstracts/versions alongside it
    • abstract in one language (would be “excerpt” in the “main” language) and existence of the post in that language are not mutually exclusive, both can coexist
    • does it make more sense to have one WordPress post per language version, or a single post with alternate language content in post_meta? For lazy multilingualism, it makes more sense to have a single WP post with meta content, but for “translation multilingualism”, it would make more sense to have separate posts with language relationships between them clearly defined in post_meta
  3. Use good markup. See what Kevin wrote sometime back. Make it nice for both polyglot and monoglot visitors. Inspiration?
    • use <div lang="xx"> and also rel attributes
  4. Provide a usable admin panel.
    • when I’m writing the other version of a post, I need access to the initial version for translation or abstracting
    • ideally, different language versions should be editable on the same admin panel, even if they are (in the WordPress database) different posts
    • languages in use in the blog should be defined in an options screen, and the plugin should use that information to adapt the writing and editing admin panels
    • idea: radio button to choose post language; N other language excerpt/abstract fields with radio buttons next to them too; abstract radio buttons change dynamically when main post language is set; in addition to other language abstract fields, another field which can contain a post id/url (would have to see what the best solution is) to indicate “this is an equivalent post in another language” (equivalent can be anything from strict translation to similar content and ideas); this means that when WP displays the blog, it must make sure it’s not displaying a post in language B which has an equivalent in language A (language A being the visitor’s preferred language as defined above)
  5. Manage URLs logically (whatever that means).
    • if one post in two languages means two posts in WP, they will each have their own slug; it could be nice, though, to be able to switch from one to another by just adding the two-letter language code on the end of any URL; a bit of mod_rewrite magic should do it
  6. Integrate into the WordPress architecture in a way that will not break with each upgrade (use post-meta table to define language relationships between different posts, instead of modifying the posts table too much, for example.)
    • one post translated into two other languages = 3 posts in the WP posts table
    • excerpts and post relationships stored in post_meta
    • language stored in post_meta
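
To make points 1, 2 and 6 a little more concrete, here is a minimal sketch in plain PHP. It is only an illustration under stated assumptions: the function names negotiate_language() and pick_post_version() are made up, and the $versions array (language code mapped to post ID) stands in for whatever a plugin would rebuild from post_meta relationships. Nothing here is an existing WordPress API.

```php
<?php
// Sketch only: function names and the $versions array shape are hypothetical,
// not part of WordPress or of any existing plugin.

/**
 * Pick the visitor's language: an explicit cookie choice wins, otherwise the
 * best match from the Accept-Language header, otherwise the given default.
 */
function negotiate_language(string $acceptHeader, ?string $cookieLang, array $available, string $default): string
{
    if ($cookieLang !== null && in_array($cookieLang, $available, true)) {
        return $cookieLang;
    }
    $best = $default;
    $bestQ = 0.0;
    foreach (explode(',', $acceptHeader) as $part) {
        $pieces = explode(';', trim($part));
        // Keep only the primary subtag: "fr-CH" becomes "fr".
        $lang = strtolower(explode('-', trim($pieces[0]))[0]);
        $q = 1.0; // a language listed without a q value counts as q=1
        foreach (array_slice($pieces, 1) as $param) {
            $param = trim($param);
            if (strpos($param, 'q=') === 0) {
                $q = (float) substr($param, 2);
            }
        }
        // Strict ">" means the first language listed wins a tie.
        if ($q > $bestQ && in_array($lang, $available, true)) {
            $best = $lang;
            $bestQ = $q;
        }
    }
    return $best;
}

/**
 * Choose which version of a post to show: the visitor's language first,
 * then the main blog language, then any version that exists.
 */
function pick_post_version(array $versions, string $visitorLang, string $blogLang): ?int
{
    foreach ([$visitorLang, $blogLang] as $lang) {
        if (isset($versions[$lang])) {
            return $versions[$lang];
        }
    }
    // Last resort: the first version stored, or null if there is none.
    return $versions ? reset($versions) : null;
}
```

A theme could call negotiate_language() once per request, with the header taken from $_SERVER['HTTP_ACCEPT_LANGUAGE'] and the cookie value set by the "language links", and then call pick_post_version() for each post it is about to display.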

I have an idea for plugin development. Once the specs are drafted out correctly, how about a bunch of us pool a few $ each to make a donation to (or “pay”) the person who would develop it? Who would be willing to contribute to the pool? Who would be willing to develop such a plugin (and not abandon the project half-way) in these conditions?

These specs need to be refined. We should start from the markup/reader end and get that sorted out first. Then, think about the admin panel/writer end. Then worry about code architecture. How does that sound?

We’ve started a discussion over on the microformats.org wiki. Please join us!

Update: this post is going to suffer from ongoing editing as I refine and add ideas.

Plugin Updates for WordPress 2.0 [en]

[fr, translated] The "Bunny's Technorati Tags" and "Basic Bilingual" plugins now work with WordPress 2.0.

It took me a couple of hours, but both Basic Bilingual and Bunny’s Technorati Tags are now WordPress 2.0-compatible.

A few minor tweaks have been made, most significant of which is that these two plugins can now be used for Pages in addition to normal posts.

They should work, though I haven’t tested them extensively; please ring the bell if you bump into any problems.

Basic Bilingual Plugin [en]

This is a simple plugin which wraps together my bilingual hacks to make day-to-day posting less of a hassle.

[fr, translated] This WordPress plugin brings together the hacks I have been using for a while now to manage the bilingualism of my weblog. It displays a summary of each post in "the other language" and applies per-language formatting via the lang attribute.

More details on the official plugin page.

Update 01.02.2007: This plugin broke badly with WordPress 2.1, but has now been (hopefully) updated. The wiki page on wp-plugins.org is frozen and may not be up-to-date anymore. Download here.

This post is the test run for my Basic Bilingual plugin.

It doesn’t add much functionality to what I already have through my hacks, but it’s cleaner from a code point of view, and it’s portable — you can use it too if you wish.

Be patient if the wiki page isn’t exactly up-to-date. It will be shortly — and the plugin will be available through the Plugin Manager as soon as I’ve made sure it’s functional enough (ie, when I press publish and hell doesn’t break loose).

This plugin basically allows you to do what you can see on this weblog: add lang attributes to your posts, excerpts in “the other language”, and localize the date. It also creates permanent fields in the admin pages to make entering the language and the “other language excerpt” easier.

I’d like to emphasize that this plugin is very simple. It is in no way a replacement of any sort for the larger-scale multilingual efforts going on these days. I wanted to get my code cleaned up and my hacks back in the admin interface (I lost them when I upgraded WP), and I’m making the result public.

Call to WordPress Plugin Developers [en]

Call for help to WordPress plugin developers. I have a bunch of hacks and modifications I’d like to turn into plugins, but I am unfortunately as plugin-challenged as ever.

[fr, translated] A description of the plugins I would write for WordPress if I didn't have a nasty mental block on the subject. Feel free to contribute!

If I was fluent in WordPress plugin coding, here are the plugins I’d write. If you feel like coding one of them yourself, or helping me get it done, you’re most welcome. Carthik has already pointed me to Plunge into Plugins, which I will have a close look at once I’ve finished writing this post.

Of course, if you know of a plugin which does precisely what I’m describing here, leave a link to it in the comments!

Keywords plugin

This would be a pretty straightforward one:

  • add a “keywords” text input to post.php
  • save the value of that text input to a custom field called “keywords”
  • add those keywords as an HTML meta tag on the individual post pages.
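
The last step of that list could look roughly like the sketch below. It is plain PHP with no WordPress calls. The helper name keywords_meta_tag() is hypothetical, and it assumes the keywords are stored as one comma-separated string. In a real plugin the string would be read from the "keywords" custom field and the result printed in the page head.

```php
<?php
// Hypothetical helper, not part of WordPress. Assumes a comma-separated
// keyword string such as the one typed into the "keywords" text input.
function keywords_meta_tag(string $rawKeywords): string
{
    // Split on commas, trim each keyword, and drop empty entries.
    $keywords = array_filter(
        array_map('trim', explode(',', $rawKeywords)),
        fn($keyword) => $keyword !== ''
    );
    if (!$keywords) {
        return '';
    }
    // Escape the value so quotes in a keyword cannot break the attribute.
    $content = htmlspecialchars(implode(', ', array_unique($keywords)), ENT_QUOTES, 'UTF-8');
    return '<meta name="keywords" content="' . $content . '" />';
}
```

Returning an empty string for posts without keywords lets the template print the result unconditionally.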

Excerpt plugin

This one would also be pretty straightforward, as all it would do is add the “excerpt” field to the “simple” post.php layout.

Customize post.php plugin

This would be more complex, but allow for more flexibility than the previous plugin. I don’t yet have a clear idea of how to make it work, but the basic principle would be to allow the user to select which fields should appear on the post.php page. Instead of having “simple” and “advanced” controls, this would add the option to have “custom” controls and define them.

TopicExchange plugin

As far as functionality is concerned, this plugin would do what my TopicExchange hack did:

  • add a “trackback TopicExchange channels” text input to post.php
  • store the space-separated list of keywords in a meta value named (e.g.) ite_topic (one record for each value)
  • for each value, trackback the appropriate TopicExchange channel
  • display the trackbacked channels (with link) on each post.
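
As a rough illustration of the second and third steps, the sketch below splits the space-separated input into individual channel names and builds one URL per channel. The helper name channel_trackback_urls() is made up, and the base URL is a placeholder argument rather than a documented TopicExchange address. Sending the trackback itself is left out.

```php
<?php
// Hypothetical helper; $baseUrl is a placeholder supplied by the caller,
// not a documented TopicExchange address.
function channel_trackback_urls(string $input, string $baseUrl): array
{
    // One entry per channel name, ignoring repeated spaces.
    $channels = preg_split('/\s+/', trim($input), -1, PREG_SPLIT_NO_EMPTY);
    $urls = [];
    foreach (array_unique($channels) as $channel) {
        $urls[$channel] = rtrim($baseUrl, '/') . '/' . rawurlencode($channel) . '/';
    }
    return $urls;
}
```

Each key of the returned array would become one meta record, and each value is what the post display could link to.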

Bilingual plugin

This would be a clean version of my language hacks:

  • add a small “language” text input to post.php (with a default value)
  • add an “other language excerpt” textarea, which posts to the corresponding custom value
  • display the “other language excerpt” at the top of each post
  • provide a function to return the post language, and the other-excerpt language (so it can be declared in a lang attribute, allowing the use of language-dependent CSS formatting, in addition to being semantically correct)
  • if this is not already possible with the date function in the WordPress core, provide an alternative date function which will format the date correctly according to the language of the post
  • optional: figure out a way to adapt text like “comments”, “categories” etc. to the post language; make the plugin usable with more than two languages.
Smart Linkroll plugin

I love the way ViaBloga manages blogrolls and would love to see a plugin for WordPress that does the same thing. In ViaBloga, you simply enter the URL of the site you want to add to your links. ViaBloga then retrieves the title, description, RSS feed address, and even (yes!) a screenshot for the site. No need to fill in fields manually anymore…

Wiki-Keywords plugin

I haven’t thought this through completely yet, but it seems to me that a plugin which would add wiki-like capability to WordPress, like ViaBloga does with keywords, could be an interesting idea to explore.

Technorati plugin

This is really a simple one: add a function which will allow easy display of the Technorati cosmos of each post, like I have done manually for this weblog.

On the subject of multilingual blogging, Kevin Marks has some interesting markup suggestions I need to look at more closely.

Life and Trials of a Multilingual Weblog [en]

Here is an explanation of how I set up WordPress to manage my bilingual weblog. I give all the code I used to do it, and announce some of the things I’d like to implement. A “Multilingual blogging” TopicExchange channel is now open.

[fr, translated] Here I explain the modifications I made to WordPress to manage the bilingualism of my weblog, with the PHP and CSS code to back it up. I also mention a few innovations I have in mind to make this weblog friendlier to my monolingual readers (this summary is one of them!). A channel for multilingual weblogging has been opened on TopicExchange, and you may find other writing on the same subject there. Use it (by sending a trackback) if you write posts about multilingualism in weblogs!

My weblog is bilingual, and has been since November 2000. Already then, I knew that I wouldn’t be capable of producing a site which duplicates every entry in two languages.

I think this would defeat the whole idea of weblogging: lowering the “publication barrier”. I feel like writing something, I quickly type it out, press “Publish”, and there we are. Imposing upon myself to translate everything just pushes it back up again. I have seen people try this, but I have never seen somebody keep it up for anything nearing four years (this weblog is turning four on July 13).

This weblog is therefore happily bilingual, as I am — sometimes in English, sometimes in French. This post is about how I have adapted the blogging tools I use to my bilingualism, and more importantly, how I can accommodate my monolingual readers so that they also feel comfortable here.

First thing to note: although weblogging tools are now ready to be used by people speaking a variety of languages (thanks to a process named “localization”), they remain monolingual. Language is determined at weblog-level.

With Movable Type, I used categories to emulate post-level language awareness. This wasn’t satisfying at all: I ended up with two monstrous categories, Français and English, which didn’t help keep rebuild times down.

With WordPress, the solution is far more satisfying: I store the language information as Post Meta, or “custom field”. No more category exploitation for something they shouldn’t be used for.

Before I really got started doing the exciting stuff, I made a quick change to the WordPress admin interface. If I was going to be adding a “language” custom field to each and every post of mine, I didn’t want to be doing it with the (imho) rather clumsy “Custom Fields” form.

In edit.php, just after the categorydiv fieldset, I inserted the following:

<fieldset id="languagediv">
	<legend><?php _e('Language') ?></legend>
	<div><input type="text" name="language" size="7"
		tabindex="2" value="en" id="language" /></div>
</fieldset>

(You’ll probably have to move around your tabindex values so that the tabbing order makes sense to you.)

I also tweaked the wp-admin.css file a bit to keep it looking reasonably pretty, adding the rule below:

#languagediv {
	height: 3.5em;
	width: 5em;
}

and adding #languagediv everywhere I could see #poststatusdiv, so that they obeyed the same rules.

In this way, I have a small text field to edit to set the language. I pre-set it to “en”, and have just to change it to “fr” if I am writing in French.

We just need to add a little piece of code in the form processing script, post.php, just after the line that says add_meta($post_ID):

// add language
if (isset($_POST['language'])) {
	// add_meta() picks up the key and value from these $_POST entries
	$_POST['metakeyselect'] = 'language';
	$_POST['metavalue'] = $_POST['language'];
	add_meta($post_ID);
}
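
Since the language field is free text, it may be worth tidying the value before storing it. The sketch below is only a suggestion, not something the hack above does: normalize_language_code() is a hypothetical helper, and the accepted pattern (a two- or three-letter code with an optional two-letter region) is an assumption about what a blog like this one would need.

```php
<?php
// Hypothetical helper, not part of WordPress or of the hack described here.
function normalize_language_code(string $raw, string $default = 'en'): string
{
    $code = strtolower(trim($raw));
    // Accept codes such as "fr" or "fr-ch"; anything else falls back to the default.
    return preg_match('/^[a-z]{2,3}(-[a-z]{2})?$/', $code) === 1 ? $code : $default;
}
```

With such a helper, the value copied into $_POST['metavalue'] could be normalize_language_code($_POST['language']) instead of the raw input.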

The first thing I do with this language information is styling posts differently depending on the language. I do this by adding a lang attribute to my post <div>:

<div class="post" lang="<?php $post_language=get_post_custom_values("language"); $the_language=$post_language['0']; print($the_language); ?>">

In the CSS, I add these rules:

div.post:lang(fr) h2.post-title:before {
  content: " [fr] ";
  font-weight: normal;
}
div.post:lang(en) h2.post-title:before {
  content: " [en] ";
  font-weight: normal;
}
div.post:lang(fr)
{
background-color: #FAECE7;
}

I also make sure the language of the date matches the language of the post. For this, I added a new function, the_time_lg(), to my-hacks.php. I then use the following code to print the date: <?php the_time_lg($the_language); ?>.
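
The code of the_time_lg() is not reproduced in this post. As a rough idea of what a language-aware date function might do, here is a self-contained sketch. The name format_date_for_language() and the chosen date layouts are assumptions for illustration, and it works in UTC for simplicity.

```php
<?php
// Sketch only: this is not the the_time_lg() function mentioned above.
function format_date_for_language(int $timestamp, string $lang): string
{
    $frenchMonths = [1 => 'janvier', 'février', 'mars', 'avril', 'mai', 'juin',
        'juillet', 'août', 'septembre', 'octobre', 'novembre', 'décembre'];
    if ($lang === 'fr') {
        $day = gmdate('j', $timestamp);
        // French writes the first day of the month as "1er".
        $day = $day === '1' ? '1er' : $day;
        return $day . ' ' . $frenchMonths[(int) gmdate('n', $timestamp)] . ' ' . gmdate('Y', $timestamp);
    }
    // Any other language falls back to an English-style date.
    return gmdate('F j, Y', $timestamp);
}
```

The month names are hard-coded so the output does not depend on which locales happen to be installed on the server.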

Can more be done? Yes! I know I have readers who are not bilingual in the two languages I use. I know that at times I write a lot in one language and less in another, and my “monolingual” readers can get frustrated about this. During a between-session conversation at BlogTalk, I suddenly had an idea: I would provide an “other language” excerpt for each of my posts.

I’ve been writing excerpts for each of my posts for the last six months now, and it’s not something that raises the publishing barrier for me. Quickly writing a sentence or two about my post in the “other language” is something I can easily do, and it will at least give my readers an indication about what is said in the posts they can’t understand. This is the first post I’m trying this with.

So, as I did for language above, I added another “custom field” to my admin interface (in edit-form.php). Actually, I didn’t stop there. I also added the field for the excerpt to the “simple controls” posting page that I use (set that in Options > Writing), and another field for keywords, which I also store for each post as meta data. Use at your convenience:

<!-- BEGIN BUNNY HACK -->
<fieldset style="clear:both">
<legend><a href="http://wordpress.org/docs/reference/post/#excerpt"
title="<?php _e('Help with excerpts') ?>"><?php _e('Excerpt') ?></a></legend>
<div><textarea rows="1" cols="40" name="excerpt" tabindex="5" id="excerpt">
<?php echo $excerpt ?></textarea></div>
</fieldset>
<fieldset style="clear:both">
<legend><?php _e('Other Language Excerpt') ?></legend>
<div><textarea rows="1" cols="40" name="other-excerpt"
tabindex="6" id="other-excerpt"></textarea></div>
</fieldset>
<fieldset style="clear:both">
<legend><?php _e('Keywords') ?></legend>
<div><textarea rows="1" cols="40" name="keywords" tabindex="7" id="keywords">
<?php echo $keywords ?></textarea></div>
</fieldset>
<!-- I moved around some tabindex values too -->
<!-- END BUNNY HACK -->

I inserted these fields just below the “content” fieldset, and styled the #keywords and #other-excerpt textarea fields in exactly the same way as #excerpt. Practical translation: open wp-admin.css, search for “excerpt”, and modify the rules so that they look like this:

#excerpt, #keywords, #other-excerpt {
	height: 1.8em;
	width: 98%;
}

instead of simply this:

#excerpt {
	height: 1.8em;
	width: 98%;
}

I’m sure by now you’re curious about what my posting screen looks like!

To make sure the data in these fields is processed, we need to add the following code to post.php (as we did for the “language” field above):

// add keywords
if (isset($_POST['keywords'])) {
	$_POST['metakeyselect'] = 'keywords';
	$_POST['metavalue'] = $_POST['keywords'];
	add_meta($post_ID);
}
// add other excerpt
if (isset($_POST['other-excerpt'])) {
	$_POST['metakeyselect'] = 'other-excerpt';
	$_POST['metavalue'] = $_POST['other-excerpt'];
	add_meta($post_ID);
}

Displaying the “other language excerpt” is done in this simple-but-not-too-elegant way:

<?php
$post_other_excerpt = get_post_custom_values("other-excerpt");
// posts without an "other language excerpt" have no value to read
$the_other_excerpt = isset($post_other_excerpt[0]) ? $post_other_excerpt[0] : "";
if ($the_other_excerpt != "") {
	// the excerpt is in whichever of the two languages the post is not in
	$the_other_language = ($the_language == "fr") ? "en" : "fr";
?>
	<div class="other-excerpt" lang="<?php print($the_other_language); ?>">
	<?php print($the_other_excerpt); ?>
	</div>
<?php
}
?>

accompanied by the following CSS:

div.other-excerpt:lang(fr)
{
background-color: #FAECE7;
}
div.other-excerpt:lang(en)
{
background-color: #FFF;
}
div.other-excerpt:before {
  content: " [" attr(lang) "] ";
  font-weight: normal;
}

Now that we’ve got the basics covered, what else can be done? Well, I’ve got some ideas. Mainly, I’d like visitors to be able to add “en” or “fr” at the end of any URL on my weblog, which would automatically filter out all the content that is not in that language, maybe using the trick Daniel describes? In addition to that, it would also change the language of what I call the “page furniture”: titles, footer, and even (let’s be ambitious) category names. Adding language sensitivity to trackbacks and comments could also be interesting.
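
The URL idea could start from something like the sketch below, which separates a trailing language code from the rest of a path. The helper name split_language_suffix() is hypothetical and nothing like it exists on this weblog yet. In practice mod_rewrite rules could do the same job before PHP ever sees the request.

```php
<?php
// Hypothetical helper: returns [path without the language suffix, language or null].
function split_language_suffix(string $path, array $languages): array
{
    $segments = explode('/', trim($path, '/'));
    $last = end($segments);
    if (in_array($last, $languages, true)) {
        // The last segment is a known language code: remove it from the path.
        array_pop($segments);
        return ['/' . implode('/', $segments), $last];
    }
    return ['/' . implode('/', $segments), null];
}
```

The returned language, when present, would then drive both the content filter and the choice of page furniture.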

A last thing I’ll mention in the multilingual department for this weblog is my styling of outgoing links if they are written in a language which is not my post language, using the hreflang attribute. It’s easy, and you should do it too!

Suw (who has just resumed blogging in Welsh) and I have just set up a “Multilingual blogging” channel on TopicExchange — please trackback it if you write about blogging in more than one language!