Geekeries du front [fr]

[en] Spent a lot of time fixing various server and device annoyances. Thanks to everyone who helped me for one thing or another! Everything seems to be running fine now, except for MySQL on the web server which continues to choke pretty regularly. Lesson learned in all these adventures: sometimes a clean reinstall is a better strategy than lengthy troubleshooting.

Ouh là, il faut vraiment que j’écrive un article. Je crois qu’un article sur deux de mon blog commence comme ça, ces temps.

Le temps.

Arbre et mauvais temps

Je suis perdue dedans. Les journées filent les unes après les autres. Déjà mi-janvier. Déjà six semaines depuis mon retour d’Inde. L’absence de structure, ça ne me réussit pas. Alors je suis en train de restructurer, mais c’est long, c’est long.

Je n’ai pas chômé, pourtant. J’ai passé des heures et des heures devant des écrans de machines récalcitrantes: mon petit serveur lubuntu qui sert surtout à recevoir des sauvegardes Crashplan, le NAS, dont les volumes partagés doivent être visibles par au moins trois machines avec des OS différents (la machine lubuntu, mon MacBook Air, et l’Android TV), l’Android TV Box, justement, dont le Kodi refusait de jouer quoi que ce soit, et pour finir mon serveur web, qui plante toujours allègrement.

Qu’est-ce que j’ai appris?

La principale leçon, vraiment la plus importante, c’est que c’est souvent moins de travail de réinstaller ou faire une mise à jour que de troubleshooter.

Sur l’Android TV Box, supprimer puis réinstaller Kodi a réglé le problème. Oui, il a fallu reconfigurer et reinstaller les extensions, mais c’est mieux de passer du temps à faire quelque chose qu’on a déjà fait et qui est du “terrain connu” que de rajouter encore des changements non maîtrisés dans le système.

Pour lubuntu, un problème de déconnexion du réseau local a été résolu par une mise à jour foireuse, qui a introduit d’autres problèmes, mais qui, après quelques commandes de réparation, de mise à jour partielle, de redémarrages, et de re-mise à jour, semblent avoir disparu.

Côté présence web, je bosse en priorité à terminer la migration de mon site web professionnel. Pas terrible, quand on désire de nouveaux mandats et développer des nouvelles offres, d’avoir un site web professionnel pas à jour et qui redirige une fois sur deux sur ce blog, suivant comment on en tape l’adresse.

Là, je me heurte à la difficulté suivante: un vieux WordPress et un vieux WPML, qu’il faut transférer dans un WordPress multisite à jour. L’importation toute bête ne marche pas. Il faut d’abord mettre à jour l’installation sur le vieux serveur avant de faire le transfert. Un post sur le forum de WPML me donne l’ordre pour les updates à faire. Je passe une demi-journée à faire tout ça, backups y compris. La home page se charge bien, je vérifie à chaque pas. Mais à la fin, misère, je réalise que tous les contenus “traduits” on disparu.

Le site contient une trentaine de pages. Au lieu de replonger pour tenter de voir quelle mise à jour “casse” l’installation, je vais copier-coller à la main les contenus. Ça me prendre quelques heures mais au moins je sais que ce sera réglé à la fin.

Côté serveur web qui se casse la figure, malgré quelques réglages à la configuration Apache faits déjà il y a un moment, puis l’installation de WordFence pour mettre un stop à d’éventuelles requêtes malicieuses (il y en a, mais pas de quoi en faire tout un fromage), ça continue. La dernière expédition spéléo a permis de mettre à jour une base de données WordPress de 250Mb (même pour 15 ans de blog et 8 “site” ça fait beaucoup).

En y regardant de plus près, c’est la table commentmeta qui pèse le plus lourd: près de 100Mb, si ma mémoire est bonne. 150’000 lignes. Cure d’amaigrissement grâce à un peu de googlage, ici et , déjà. Je remarque aussi que Antispam Bee, un plugin anti-spam que j’avais essayé à un moment donné et que je croyais avoir supprimé, est encore actif et a aussi laissé tout un tas de chenit. Bref, après nettoyage, et vidage du spam, la table est vide.

Bon, c’est visiblement pas ce qui causait les problèmes avec le serveur, puisque c’est à nouveau arrivé depuis le nettoyage, mais ça fait plaisir d’avoir une base de données qui fait moins de 100Mb. Merci au passage à PhpMyAdmin, qui malgré toutes les critiques qu’on peut lui faire, est quand même un outil vachement pratique pour repérer et régler ce genre de chose. Les informations sur le status du serveur MySQL, ainsi que les stats, sont aussi présentées de façon accessibles et me donnent des idées pour la suite: peut-être une histoire de taille de cache pour les tables qui n’est pas optimale.

Etape suivante, donc, du coup, avant de faire l’apprenti-sorcière, c’est de voir ce que raconte l’utilitaire mysqltuner. Je vous ferai un rapport si je trouve ce que c’est!

Disqus Plugin Aftermath: Removing Duplicate Comments [en]

[fr] Comment se débarrasser de 5000 commentaires à double dans sa base de données WordPress!

Now that Disqus integrates Friendfeed comments, I could be tempted to give it another try, if I hadn’t spent an hour yesterday cleaning up my database because of an earlier attempt to use Disqus on this blog. After the story, how I did it — in case you’re in the same mess and could use the help.

Back in August, I installed the Disqus plugin for WordPress. Things started off not too badly, though I was a bit concerned that the plugin seemed to have duplicated all the comments in my database. It didn’t seem to show up on the blog though, so I didn’t worry too much.

After a few months, I was a bit frustrated with Disqus and the plugin (which was clearly an older version than the Disqus plugin available now). Moderating comments through the WordPress interface seemed to work erraticly, and some spam just wouldn’t accept to stay in the spambox. I never really tried to identify the exact problems too closely, I have to admit, but things were not really working how I expected them to.

Then a few (unrelated) people told me they had completely failed to comment on my blog with the new system. At some point, I got fed up and uninstalled it. Unfortunately, the duplicate comments which had been hidden from view remained there after uninstalling the plugin, so all the old comments appeared on the blog twice. I let the problem sit for a long time before attempting to fix it — wild hope there might be a ready-made script out there I could just run& in vain.

Here’s how I tackled the problem this week-end and ended up removing the duplicate comments without too much trouble, through PhpMyAdmin (PMA for short).

  • In PMA, I made sure that duplication seemed constant — it was
  • I discovered that the duplicate comments had “DISQUS” in the user-agent field
  • I dug around until I identified the last duplicate comment (when I installed the Disqus plugin, actually; I sorted the database comments table by comment date to do that)
  • I did a search, selecting comments which were younger than the last duplicate comment date AND had “DISQUS” in the user-agent field (the date bit is important, because comments posted while the plugin were active have “DISQUS” in the user-agent field but are not duplicates)
  • Then, I deleted everything that came up in the search — about 5000 comments (it helps to tell PMA to display 3000 lines per page when doing that :-))

Hope this can help somebody, and remember: always back up your database first!

Invalid argument supplied for foreach() in wp-capabilities.php: Case Cracked! [en]

[fr] Le problème avec wp-capabilities.php qui fait qu'on peut se retrouver "exfermé" (enfermé dehors) de son blog WordPress (typiquement en cas de changement de serveur) semble avoir sa source dans le contenu du champ wp_user_roles dans la table wp_options. En particulier, pour la version française, "Abonné" est un rôle d'utilisateur, et en cas de problèmes d'encodage MySQL, le caractère accentué sera corrompu, causant ainsi l'erreur.

Il suffit de remplacer le caractère fautif dans PhpMyAdmin, et on retrouve l'accès à son blog. Bon, reste ensuite à régler les questions d'encodage... mais c'est déjà ça!

Finally. At last. Endlich. Enfin.

Once more, while trying to transfer a WordPress installation from one server to another, I found myself facing the dreaded problem which locks me out of my WordPress install with a rather cryptic message:

Warning: Invalid argument supplied for foreach() in /home/user/wp/wp-includes/capabilities.php on line 31

(Your lineage may vary.)

What happens is that WordPress cannot read user roles, and therefore, even though your password is accepted, you get a message telling you that you’re not welcome in the wp-admin section:

Vous n’avez pas les droits suffisants pour accéder à cette page.

Or, in English:

You do not have sufficient permissions to access this page.

A quick search on the WordPress forums told me that I was not alone in my fight with wp-capabilities.php, but that many problems had not been resolved, and more importantly, that suggested solutions often did not work for everyone.

I’ve bumped into this problem a couple of times before, and I knew that it was linked to encoding problems in the database. (I’ve had my share of encoding problems: once, twice, thrice — “once” being on of the most-visited posts on this blog, by the way, proof if needed that I’m not alone with mysql encoding issues either.)

I’ll leave the detailed resolution of how to avoid/cure the MySQL problems later (adding
mysql_query("SET NAMES 'utf8'");
to wp-db.php as detailed in this thread, and as zedrdave had already previously told me to do — should have listened! — should prevent them). So anyway, adding that line to my working WordPress install showed me that the problem was not so much in the database dumping process than in the way WordPress itself interacted with the database, because the dreaded wp-capabilities.php problem suddenly appeared on the original blog.

Now, this is where I got lucky. Browsing quickly through the first dozen or so of forum threads about wp-capability.php problems, this response caught my eye. It indicated that the source of the problem was the content of the wp_user_roles field (your prefix may vary). In this case, it had been split on more than one line.

I headed for the database, looked at the field, and didn’t see anything abnormal about it at first. All on one line, no weird characters… just before giving up, I moved the horizontal scrollbar to the end of the line, and there — Eurêka! I saw it.

Abonné

“Contributor”, in French, is “abonné”, with an accent. Accent which got horribly mangled by the MySQL problems which I’ll strive to resolve shorty. Mangled character which caused the foreach() loop to break in wp-capabilities.php, which caused the capabilities to not be loaded, which caused me to be locked out of my blog.

So, in summary: if you’re locked out of your blog and get a warning/error about wp-capabilities and some invalid foreach() loop thingy, head for PhpMyAdmin, and look carefully through the wp_user_roles field in the wp_options table. If it’s split over two or more lines, or contains funky characters, you have probably found the source of your problem.

Good luck!

Finally out of MySQL encoding hell [en]

[fr] Description de comment je me suis sortie des problèmes d'encodage qui résultaient en l'affichage de hiéroglyphes sur tous les sites hébergés sur mon serveur.

It took weeks, mainly because I was busy with a car accident and the end of school, but it also took about two real whole days of head-banging on the desk to get it fixed.

Here’s what happened: remember, a long time ago, I had trouble with stuff in my database which was supposed to be UTF-8 but seemed to be ISO-8859-1? And then, sometime later, I had a weird mixture of UTF-8 and ISO-8859-1 in the same database?

Well, somewhere along the line this is what I guess happened: my database installation must have been serving UTF-8 content as ISO-8859-1, leading me to believe it was ISO-8859-1 when it was in fact UTF-8. That led me to try to convert it to UTF-8 — meaning I took UTF-8 strings and ran them through a converter supposed to turn ISO-8859-1 into UTF-8. The result? Let’s call it “double-UTF-8” (doubly encoded UTF-8), for want of a better name.

Anyway, that’s what I had in my database. When we upgraded MySQL and PHP on the server, I suddenly started seeing a load of junk instead of my accented characters:

encoding-problem-2

What I was seeing looked furiously like UTF-8 looks when your server setup is messed up and serves it as ISO-8859-1 instead. But, as you can see on the picture above, this page was being served as UTF-8 by the server. How did I know it wasn’t ISO-8859-1 in my database instead of this hypothetical “double-UTF-8”? Well, for one, I knew the page was served as UTF-8, and I also know that ISO-8859-1 (latin-1) served as UTF-8 makes accented characters look like question marks. Then, if I wanted to be sure, I could just change the page encoding in Firefox to ISO-8859-1 (that should make it look right if it was ISO-8859-1, shouldn’t it?) Well, it made it look worse.

Another indication was that when the MySQL connection encoding (in my.cnf) was set back to latin-1 (ISO-8859-1), the pages seemed to display correctly, but WordPress broke.

The first post on the picture I’m showing here looks “OK”, because it was posted after the setup was changed. It really is UTF-8.

Now how did we solve this? My initial idea was to take the “double-UTF-8” content of the database (and don’t forget it was mixed with the more recent UTF-8 content) and convert it “from UTF-8 to ISO-8859-1”. I had a python script we had used to fix the last MySQL disaster which converted everything to UTF-8 — I figured I could reverse it. So I rounded up a bunch of smart people (dda_, sbp, bonsaikitten and Blackb|rd — and countless others, sorry if I forgot you!) and got to work.

It proved a hairier problem than expected. What also proved hairy was explaining the problem to people who wanted to help and insisted in misunderstanding the situation. In the end, we produced a script (well, “they” rather than “we”) which looked like it should work, only… it did nothing. If you’re really interested in looking at it, here it is — but be warned, don’t try it.

We tried recode. We tried iconv. We tried changing my.cnf settings, dumping the databases, changing them back, and importing the dumps. Finally, the problem was solved manually.

  1. Made a text file listing the databases which needed to be cured (dblist.txt).
  2. Dumped them all: for db in $(cat dblist.txt); do mysqldump --opt -u user -ppassword ${db} > ${db}-20060712.sql; done
  3. Sent them over to Blackb|rd who did some search and replace magic in vim, starting with this list of characters (just change the browser encoding to latin-1 to see what they look like when mangled)
  4. Imported the corrected dumps back in: for db in $(cat dblist.txt); do mysql -u user -ppassword ${db} < ${db}-20060712.sql; done

Blackb|rd produced a shell script for vim (?) which I’ll link to as soon as I lay my hands on the URL again. The list of characters to convert was produced by trial and error, knowing that corrupted characters appeared in the text file as A tilde or A circonflexe followed by something else. I’d then change the my.cnf setting back to latin-1 to view the character strings in context and allow Blackbr|d to see what they needed to be replaced with.

Thanks. Not looking forward to the next MySQL encoding problem. They just seem to get worse and worse. (And yes, I do use UTF-8 all over the place.)

Fixing Newline Abuse in WordPress [en]

Newlines had crept into some of my old posts and corrupted the formatting. Quick and dirty script I used to solve the problem.

[fr] Des retours à  la ligne excédentaires se sont glissés dans mes anciens posts à  un moment donné, cassant des liens et le formattage en général. J'ai utilisé un petit script pour supprimer tous les retours à  la ligne de ces posts.

I only realised right now how badly broken about 500 of my old posts where. Somewhere in the process, newlines got added at the end of each line, and not just at the end of each paragraph. As WordPress kindly and intelligently adds the necessary HTML markup for paragraph and line breaks in posts, this meant that lots of my posts had broken links (when the newline occurred in the middle of a link) and uneven lines.

Victims: about 500 posts.

Solution: luckily, all the concerned posts had complete HTML formatting in them. Purely and simply removing all the newlines with a short PHP script was the way to go.

Thanks to all those who helped. You probably won’t be able to use the script as is, but if you have a similar problem at some point, it might help. The script pulls out the posts from the WordPress database, removes the newlines, and puts the post back into the table.

Scripts for a WordPress Weblog Farm [en]

A first step to WordPress-farming: a shell script and a PHP script which allow you to easily install a whole lot of WordPress weblogs in only a few minutes (I installed over 30 in less than 5 minutes). Scripts require adapting to your environment, of course.

Update 03.11.06: Batiste made me realise I should point the many people landing here in the search of multi-user WordPress to WordPress MU. All that I describe in this post is very pretty, but nowadays completely obsolete.

Here is the best solution I’ve managed to come up with in half a day to finally install over 30 WordPress weblogs in under 5 minutes (once the preparation work was done).

A shell script copies the image of a WordPress install to multiple directories and installs them. A PHP script then changes a certain number of options and settings in each weblog. It can be used later to run as a “patch” on all installed weblog if a setting needs modifying globally.

Here are the details of what I did.

I first downloaded and unzipped WordPress into a directory.

wget http://wordpress.org/latest.tar.gz
tar -xzvf latest.tar.gz
mv wordpress wp-farm-image

I cleaned up the install (removing wp-comments-popup.php and the import*.php files, for example), added a language directory (as I’m wp-farming in French) and modified index.php to my liking; in particular, I edited the import statement for the stylesheet so that it looked like this:

@import url( http://edublogs.net/styles/toni/style.css );

The styles directory is a directory in which I place a bunch of WordPress styles. I don’t need the style switcher capability, but I do need to styles. Later, users will be able to change styles simply by editing that line in their index.php (or I can do it for them).

Another very important thing I did was rename wp-config-sample.php to config-sample and fill in the database and language information. I replaced wp_ by xxx_ so that I had $table_prefix = 'xxx_';.

To make it easier to install plugins for everyone, correct the language files, and edit whatever may be in wp-images, I moved these three directories out of the image install and replaced them with symbolic links, taking inspiration from Shelley’s method for installing multiple WordPress weblogs.

mv image/wp-content common
mv image/wp-images common
mv image/wp-includes/languages common
ln -s common/wp-content image/wp-content
ln -s common/wp-images image/wp-images
ln -s common/languages image/wp-includes/languages

I also added an .htaccess file (after some painful tweaking on a test install).

Once my image was ready, I compiled a list of all the users I had to open weblogs for (one username per line) in a file named names.txt, which I placed in the root directory all the weblog subfolders were going to go in.

I then ran this shell script (many thanks to all those of you who helped me with it — you saved my life):

for x in `cat names.txt`
do
cp -rv /home/edublogs/wp-farm/image/ $x
cat $x/wp-config.php | sed "s/xxx/${x}/" > config.tmp
mv config.tmp $x/wp-config.php
wget http://st-prex.edublogs.net/$x/wp-admin/install.php?step=1
wget http://st-prex.edublogs.net/$x/wp-admin/install.php?step=2
wget http://st-prex.edublogs.net/$x/wp-admin/install.php?step=3
done

This assumes that my WordPress install image was located in /home/edublogs/wp-farm/image/ and that the weblog addresses were of the form http://st-prex.edublogs.net/username/.

This script copies the image to a directory named after the user, edits wp-config to set the table prefix to the username, and then successively wgets the install URLs to save me from loading them all in my browser.

After this step, I had a bunch of installed but badly configured weblogs (amongst other things, as I short-circuited the form before the third install step, they all think their siteurl is example.com).

Entered the PHP patch which tweaks settings directly in the database. I spent some time with a test install and PHPMyAdmin to figure out which fields I wanted to change and which values I wanted to give them, but overall it wasn’t too complicated to do. You’ll certainly need to heavily edit this file before using it if you try and duplicate what I did, but the basic structure and queries should remain the same.

I edited the user list at the top of the file, loaded it in my browser, and within less than a few seconds all my weblogs were correctly configured. I’ll use modified versions of this script later on when I need to change settings for all the weblogs in one go (for example, if I want to quickly install a plugin for everyone).

In summary:

  1. compile list of users
  2. prepare image install
  3. run shell script
  4. run PHP script

If you try to do this, I suggest you start by putting only two users in your user list, and checking thoroughly that everything installs and works correctly before doing it for 30 users. I had to tweak the PHP script quite a bit until I had all my settings correctly configured.

Hope this can be useful to some!

Update 29.09.2005: WARNING! Hacking WordPress installs to build a farm like this one is neat, but it gets much less neat when your weblog farm is spammed with animal porn comments. You then realise (oh, horror!) that none of the anti-spam plugins work on your beautiful construction, so you weed them out by hand as you can, armed with many a MySQL query. And then the journalist steps in — because, frankly, “sex with dogs” on a school website is just too good to be true. And then you can spend half a day writing an angry reaction to the shitty badly-researched article.

My apologies for the bad language. Think of how you’re going to deal with spam beforehand when you’re setting up a school blog project.

Converting MySQL Database Contents to UTF-8 [en]

I finally managed to convert my WordPress database content to UTF-8. It’s easy to do, but it wasn’t easy to figure out.

[fr] Voici comment j'ai converti le contenu de ma base de données WordPress en UTF-8. C'est assez simple en soi, mais ça m'a pris longtemps pour comprendre comment le faire!

A few weeks ago, I discovered (to my horror) that my site was not UTF-8, as I thought it was. Checking my pages in the validator produced this error:

The character encoding specified in the HTTP header (iso-8859-1) is different from the value in the XML declaration (utf-8). I will use the value from the HTTP header (iso-8859-1).

In all likeliness, my server adds a default header (iso-8859-1) to the pages it serves. When I switched to WordPress, I was careful to save all my import files as UTF-8, and I honestly thought that everything I had imported into the database was UTF-8. Somewhere in the process, it got switched back to iso-8859-1 (latin-1).

The solution to make sure the pages served are indeed UTF-8, as specified in the meta tags of my HTML pages, is to add the following line to .htaccess:

AddDefaultCharset OFF

(If one wanted to force UTF-8, AddDefaultCharset UTF-8 would do it, but actually, it’s better to leave the possibility to serve pages with different encodings, isn’t it?)

Now, when I did that, of course, all the accented characters in my site went beserk — proof if it was needed that my database content was not UTF-8. Here is the story of what I went through (and it took many days to find the solution, believe me, although it takes only 2 minutes to do once everything is ready) to convert my database content from ISO-8859-1 to UTF-8. Thanks a lot to all those who helped me through this — and they are many!

First thing, dump the database. I always forget the command for dumps, so here it is:

mysqldump --opt -u root -p wordpress > wordpress.sql

As we’re going to be doing stuff, it might be wise to make a copy of the working wordpress database. I did that by creating a new database in PhpMyAdmin, and importing my freshly dumped database into it:

mysql -u root -p wordpress-backup < wordpress.sql

Then, conversion. I tried a PHP script, I tried BBEdit, and they seemed to mess up. (Though as I had other issues elsewhere, they may well have worked but I mistakenly thought the problem was coming from there.) Anyway, command-line conversion with iconv is much easier to do:

iconv -f iso-8859-15 -t utf8 wordpress.sql > wordpress-iconv.sql

Then, import into the database. I first imported it into another database, edited wp-config.php to point to the new database, and checked that everything was ok:

mysql -u root -p wordpress-utf8 < wordpress-iconv.sql

Once I was happy that it was working, I imported my converted dump into the WordPress production database:

mysql -u root -p wordpress < wordpress-iconv.sql

On the way there, I had some trouble with MySQL. The MySQL dump more or less put the content of all my weblog posts on one line. For some reason, it didn’t cause any problems when importing the dump before conversion, to create the backup database, but it didn’t play nice after conversion.

I got this error when trying to import:

ERROR 1153 at line 378: Got a packet bigger than 'max_allowed_packet'

Line 378 contained half my weblog posts… and was obviously bigger than the 1Mb limit for max_allowed_packet (the whole dump is around 2Mb).

I had to edit my.cnf (/etc/mysql/my.cnf on my system) and change the value for max_allowed_packet in the section titled [mysqld]. I set it to 8Mb. Then, I had to stop mysql and restart it: mysqladmin -u root -p shutdown to stop it, and mysqld_safe & to start it again (as root).

This is not necessarily the best way to do it, and it might not work like that on your system, but it’s what I did and the site is now back up again. Comments welcome, and hope this can be useful to others!