utf8 lessons


The other day, I was working on a Drupal site. I based the theme for the site on the Zen theme (ver 1.1).

I was experiencing a very strange problem: I could not save any of the changes I made on the theme configuration page. I tried to change the logo. I tried to make it display the slogan. Nothing worked.

I started editing the database by hand, throwing serialized PHP arrays in there to set the settings I wanted. I watched in horror as I SELECTed the row, loaded the page, and SELECTed the row again, only to find that somehow my settings had been overwritten with the defaults.

"Is this some bug in the Zen theme?" I thought. If so, it would be pretty major. Someone surely would have caught this by now.

No. No it wasn't. It turns out that my MySQL database was using latin1 encoding rather than utf8. The settings for the theme included a fancy Unicode character as the "breadcrumb separator". The database could not store that character faithfully, so it threw nonsense into the "variable" table. When I inspected the results of my SELECT statements carefully, I found that even the edit I made by hand had not been faithfully transferred into the database. This made the serialized PHP array invalid. Drupal then helped me by sticking the default values in there, just to have something that made sense.
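If you want to see why one mangled character is enough to sink the whole array, here is a rough sketch in Python. PHP's serialize() stores each string with its byte length in front, so if the stored bytes change length, the value no longer parses. Everything here is an assumption for illustration: the helper names are made up, the "›" separator is a stand-in for whatever character the theme actually used, and replacing the character with "?" is a crude model of a lossy charset conversion, not an exact reproduction of what MySQL did on my machine.

```python
def php_serialize_str(text: str) -> bytes:
    """Mimic PHP's serialize() for a single string: s:<byte length>:"<bytes>";"""
    raw = text.encode("utf-8")
    return b"s:" + str(len(raw)).encode("ascii") + b':"' + raw + b'";'


def is_valid_php_str(blob: bytes) -> bool:
    """Return True if the declared byte length matches the actual payload."""
    if not blob.startswith(b"s:"):
        return False
    head, sep, rest = blob[2:].partition(b':"')
    if not sep or not head.isdigit():
        return False
    length = int(head)
    # After exactly `length` payload bytes we must find the closing quote and semicolon.
    return rest[length:] == b'";'


def lossy_store(blob: bytes) -> bytes:
    """Hypothetical stand-in for a lossy conversion: unrepresentable characters become '?'."""
    return blob.decode("utf-8").encode("latin-1", errors="replace")


separator = " \u203a "                      # assumed breadcrumb separator: space, ›, space
original = php_serialize_str(separator)    # length prefix counts 5 bytes (› is 3 bytes in UTF-8)
stored = lossy_store(original)             # › collapses to one byte, prefix still says 5

print(original, is_valid_php_str(original))
print(stored, is_valid_php_str(stored))
```

The second value still claims five bytes but only holds three, which is the kind of mismatch that makes unserialize give up on the entire settings array.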

latin1-encoded databases cause nothing but problems. utf8 databases have nothing but solutions.

So I dumped my data, dropped the database, recreated it using utf8 explicitly, and then reloaded my data. I was saved.

This was the create-database syntax I used:

  CREATE DATABASE cooldrupaldatabase CHARACTER SET utf8;

Okay, but why? Why do I need to set that explicitly? We live in a utf8 world. Why wouldn't utf8 be the default?

It turns out that I had solved this problem a while ago on another computer, so you'll have to excuse me for not having the links anymore.

Debian feels that backwards compatibility is the most important thing with their MySQL package. They don't switch the default to utf8 because this will somehow break backward compatibility with Debian systems that have been running for a long time? I guess? This is what they say. Ubuntu carries forward the Debian MySQL package, including this default setting.

So, to fix it, you can add a file to your /etc/mysql/conf.d that tells it to default to utf8. I have attached the file. Once again, I'm sorry that I forgot where I got it from.

Attachment: utf8.cnf (348 bytes)


I heart Ryan, too!