Regression in MySQL server localization from 5.0 to 5.6

MySQL server has supported localization of error messages since the very beginning, and its implementation has gone through a few revisions:

  • Through 4.1, a separate language/errmsg.txt file for each language, with one message per line, and each file in a language-appropriate character set.
  • In 5.0 and 5.1, a single errmsg.txt file with a group of translations for each message, but with different character sets for each language (making the file very difficult to edit with any editor).
  • In 5.5 and 5.6, a single errmsg-utf8.txt file with the same structure as in 5.0 and 5.1, but with all messages in UTF-8 (whew!).

In the early days, folks at MySQL tried to translate all of the error messages fairly frequently, keeping most localizations relatively up to date. Many volunteers also translated the messages into their favorite language and contributed those files.

In recent years, however, the vast majority of error messages added to the message file are in English only, or at most English and one other language[1] (presumably the author’s native language, often German). While the number of unique error messages in English has increased from 481 to 862 between 5.0 and 5.6, all other translations with the exception of German, Swedish, and Japanese[2] have been almost entirely stagnant.

This chart shows the percentage of translated messages available by MySQL version from 5.0 through 5.6:

This chart shows a few of the major supported languages, showing the utter stagnation of translated messages for all languages since 5.0, and for German since 5.5:

(Click on the graphs to see the Google Docs Spreadsheet, which is rendered a lot better than its image export.)

The best-supported language (after English) is German, but even it has fallen from 94% translated in MySQL 5.0 to only 77% translated in MySQL 5.6. Swedish, which was once one of the sacred translations, has fallen from 53% translated in 5.0 to only 39% in 5.6. Eliminating English and German (as high outliers) and Bulgarian (as a low outlier), the average translation completeness in MySQL 5.6 is less than 25%.
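As a rough illustration of how completeness figures like these can be derived, here is a minimal Python sketch. It assumes a simplified version of the single-file layout (an unindented message name followed by indented lines, each holding a language code and a quoted message). The sample text, the message names, the function names, and the regular expressions are all illustrative assumptions; this is not the tooling used to produce the numbers above, and the real file has details (such as SQLSTATE codes and escaped quotes) that this sketch ignores.

```python
import re
from collections import Counter

# Made-up sample in the assumed layout; not copied from a real errmsg file.
SAMPLE = '''\
languages english=eng latin1, german=ger latin1, swedish=swe latin1
default-language eng
start-error-number 1000

ER_SAMPLE_NO
        eng "NO"
        ger "Nein"
        swe "NEJ"
ER_SAMPLE_YES
        eng "YES"
        ger "Ja"
ER_SAMPLE_ENGLISH_ONLY
        eng "A message that nobody has translated yet"
'''

# A message header starts in column 0 with an upper-case symbol.
HEADER = re.compile(r'^([A-Z][A-Z0-9_]*)\b')
# A translation line is indented: language short name, then a quoted message.
TRANSLATION = re.compile(r'^\s+([a-z][a-z-]*)\s+"(.*)"\s*$')


def parse_messages(text):
    """Return {message_name: {language_code: message_text}}."""
    messages = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = messages.setdefault(header.group(1), {})
            continue
        translation = TRANSLATION.match(line)
        if translation and current is not None:
            current[translation.group(1)] = translation.group(2)
    return messages


def completeness(messages, base="eng"):
    """Percentage of base-language messages that each other language covers."""
    base_names = [name for name, tr in messages.items() if base in tr]
    if not base_names:
        return {}
    counts = Counter(
        lang for name in base_names for lang in messages[name] if lang != base
    )
    return {lang: 100.0 * n / len(base_names) for lang, n in counts.items()}


def untranslated(messages, lang, base="eng"):
    """Names of messages that exist in the base language but not in lang."""
    return [name for name, tr in messages.items() if base in tr and lang not in tr]


if __name__ == "__main__":
    parsed = parse_messages(SAMPLE)
    for lang, pct in sorted(completeness(parsed).items()):
        print(f"{lang}: {pct:.0f}% translated")
    print("missing in swe:", untranslated(parsed, "swe"))
```

The `untranslated` helper is included because a per-language list of missing messages is the natural starting point for anyone who wants to bring a translation up to date.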

Is it useful to actually have multiple language support if it is this woefully incomplete[3]? For most of these languages, even if the user goes to the trouble to enable their alternate language, 75% on average of the messages they see will be in English. Is that really any better than 100% in English?

Has Oracle given up on maintaining the error message translations? Would community effort to get them all updated be welcome? Would it be useful to rip out this mess and start over with a more standardized and mature localization framework?
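To make the "standardized framework" option a little more concrete: one candidate that comes up in the comments is GNU gettext and its .po files. The following Python sketch shows roughly what exporting one language to .po entries could look like. The message data, the message names, and the function names are hypothetical, the input strings are assumed to be plain unescaped text, and a real .po file would also need a header entry (with the character set and language) that is omitted here.

```python
# Hypothetical sample data: {message_name: {language_code: message_text}}.
MESSAGES = {
    "ER_SAMPLE_NO": {"eng": "NO", "ger": "Nein"},
    "ER_SAMPLE_QUOTED": {"eng": 'Unknown table "%s"'},
}


def po_escape(text):
    """Escape a string for use inside a double-quoted PO string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def po_entry(message_name, english, translation=""):
    """Format one PO entry; an empty msgstr marks the message as untranslated."""
    return (
        f"#. {message_name}\n"
        f'msgid "{po_escape(english)}"\n'
        f'msgstr "{po_escape(translation)}"\n'
    )


def to_po(messages, lang, base="eng"):
    """Build PO entries for one language from every message that has base text."""
    entries = [
        po_entry(name, tr[base], tr.get(lang, ""))
        for name, tr in messages.items()
        if base in tr
    ]
    return "\n".join(entries)


if __name__ == "__main__":
    print(to_po(MESSAGES, "ger"))
```

Untranslated messages come out with an empty `msgstr`, which is how gettext-aware translation tools recognize work that still needs doing.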

[1] Bizarrely, in MySQL 5.6, Georgi Kodinov added Bulgarian as a supported language, with exactly one translated message supported.

[2] It appears that Japanese got a major overhaul by Yasufumi Kinoshita, removing the unused “jps” variant and adding a bunch more translations to the “jpn” variant. Alas, it is still quite incomplete at only 34% translated in 5.6.

[3] Leaving aside any discussion about the way that languages are implemented in MySQL currently, which is not awesome.

16 thoughts on “Regression in MySQL server localization from 5.0 to 5.6”

    • Wlad: As a largely monolingual American I would not be in the best position to answer that. My primary point was that either the localization should be kept up (to at least 90%+ of messages translated), or the whole thing should be removed. Enabling a non-English language and still getting 75% of the messages in English is a bad scenario, and kind of silly.

  1. I’m trilingual original Eastern European. I have a big problem with localized messages. They are ungooglebar. English messages, no matter how cryptic, are 1) googlebar and 2) in a language understood by every programmer worth their salt. As for removing the whole thing, I do not know. It is useful if someone checks that the English in error messages is not embarrassingly bad, and that is easiest to do if all messages are in one file, rather than spread with printfs through the whole source code.

  2. So, personally I would not mind removing all languages except for English. Unless there are volunteers who want to support their own language (like presumably Stefan was doing with German) in 100% of all messages. 75% English and 25% native does nobody good, indeed.

    • Wlad: I can see the issues with that. If you get a “rare” error, and you get the message in Estonian, it’s even far rarer. However, there are solutions for that: For example both the English normalized and localized message could be sent, solving both problems at the same time.

  3. Part of the problem is that translators are used to GNU gettext (po files) and not to this custom way of managing translations. Also most translation programs are not capable of using this type of file. Translation programs are often equipped with a translation cache which really speeds up translations. Also launchpad’s translation system can’t be used.

    There is documentation for the error messages:
    http://dev.mysql.com/doc/internals/en/error-message-adding.html

    I do think that localized error messages can help, especially for new users. If non-DBA, non-developer users use the MySQL Excel plugin they might encounter an error which is sent from the server. So it helps to make MySQL more user friendly.

    • Daniël: I think your argument is really for redoing the whole thing though. In your example case, since the language is set on the server side, there’s a very good chance that the user doesn’t have the opportunity to set the language at all, since he is not the administrator.

      • FWIW, in MySQL 5.5 and up there is a lc_messages variable, which a user can use to set the locale for error messages. Although I wonder if this gets automatically set according to the user’s locale.

  4. Error messages in a local language are better for non-technical users. So errors about SQL syntax problems etc are probably helpful for end users. There are a large number of users connecting to the databases I manage who access the system via ODBC. I can imagine they might prefer to see errors in a language they better understand.

    So if the translation can be done in multiple languages that’s probably better, but then it would indeed be helpful to have an easy way to identify in each language which error messages need translating and providing better non-technical infrastructure so people can more easily provide those translations.

  5. “So errors about SQL syntax problems etc are probably helpful for end users” seems to be a wrong conclusion. End-users are not technical, they do not write SQL. Error messages that MySQL spits out are not actionable for an end-user.

  6. Maybe it is about time to stop using the NIH errmsg*.txt format and to replace it with something gettext() based? There are plenty of online translation platforms, like e.g. transifex.net, or even the translation interface on launchpad itself (where mysql bzr mirrors and most of the “forks” main repositories are hosted), that can deal with the gettext() .po file format just fine …

  7. To me localized German error messages are usually confusing, too, but we simply can’t assume that everyone understands English as at least a 2nd language, even less so being comfortable with it.

    In former West Germany it shouldn’t be much of an issue as almost everyone learned English at school starting from fifth grade, and even here i know enough people in IT jobs who constantly ask for German documentation and localizations.

    Other western Europe countries are already more tricky, like France where nobody seems to be willing to freely admit being able to speak other languages at all … or Italy where MySQL AB never had many support customers as we were offering English support only … and as you get even further east in Eurasia you should rely on “they should understand English just fine” less and less …

    In a former life i didn’t understand the need for a German translation of the PHP (and MySQL) manuals either, but having one does indeed help a lot of people (so i spent quite some time writing for it even though i’ll never look up anything in it), and i’m meeting a “but this is English only, don’t you have something German that i could read instead” attitude even with programmers much younger than me … i don’t understand that attitude, but it exists, and we probably won’t get rid of it any time soon …

  8. Wlad: “end user” can as well be someone just “clever” enough to set up a WordPress or PhpFAQ or $whatever … instance on a LAMP or WAMP machine without too much in-depth knowledge … still an understandable error message can make the difference between “oh, i may be able to fix this” and “WTF? ok, i’ll rather use something else instead …”

    • Agreed. End-user in this case is not the literal “end” user of a website per se, they are someone who is either a DBA or database user or is pretending to be one and connecting directly to a database. They do in fact need error messages that are clear and actionable, and while I can’t speak to it personally given my monolingual state, I too have observed appreciation from folks who are quite good technically but can’t understand English error messages. (Many of the English error messages are not that good, anyway, but that is wholly another problem.)

    • Imagine – some German user will install WAMP following the instructions he read in a blog, and then get a message “You have an error in SQL syntax near ‘foo'”. WTF he thinks. It talks English to me. But suppose MySQL talks German to him and he gets instead “Fehler im SQL Syntax hier ‘foo'”. Will that be any more helpful to that guy?

      • From a logic point of view it shouldn’t make a difference (assuming that every German has learned English at school, and hasn’t forgotten everything about it right afterwards) …

        From a support experience point of view it sure does make a difference to some, even to those that are actually able to communicate in English when forced to … it is not only a language skill thing but also a user acceptance thing …
