On MySQL forks and MySQL’s non-Open Source documentation

All of this talk of Drizzle (a fork of the MySQL server by Brian and Monty) has reminded me of a topic I have wanted to discuss for quite some time…

One of the things that sets MySQL apart (in, IMHO, a very bad way) from other Open Source database projects/products such as PostgreSQL (license) and Firebird (license) is that the MySQL documentation is NOT Open Source. The MySQL documentation is and always has been copyright MySQL AB, and “… use of this documentation, in whole or in part, in another publication, requires the prior written consent from an authorized representative of MySQL AB”. This presents a major impediment to forking the server: who wants to rewrite many hundreds of pages of documentation on things that haven’t even changed? Even if you’re OK with all of your new code being GPL (and anyone forking ought to be), and with never being able to dual-license or re-license your MySQL fork (oh well), you will have to either start with no documentation or publish errata against the official MySQL documentation (which will only go so far).

Does this mean that MySQL is not really Open Source? I would say not exactly, although I could probably be convinced either way. Others may say yes. But it does go a long way to making the point that some things may not be quite as “open” as they initially appear. What do you think? Can a piece of software really be Open Source while its primary/only documentation is not? Did you know that the MySQL documentation was not Open Source?

MySQL Cluster splits from MySQL — not pluggable?

Kaj just announced that as of 5.1.25, MySQL Cluster will no longer be included in the normal MySQL packages. Instead, a new branch is being created for MySQL Cluster releases, where the first release is informally called “6.2.15” but the releases are really named “mysql-5.1.23-ndb-6.2.15”. This new branch is based on the MySQL Cluster Carrier Grade Edition of MySQL.
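To make the naming scheme concrete, here is a minimal Python sketch that splits a release name of the form quoted above into its MySQL and NDB version parts. The function and type names (parse_cluster_version, ClusterVersion) are my own, purely for illustration, and the pattern assumes that every release name follows exactly the “mysql-X.Y.Z-ndb-A.B.C” shape of the one example given here, which may not hold for all releases.

```python
import re
from typing import NamedTuple, Tuple


class ClusterVersion(NamedTuple):
    """Hypothetical container for the two version numbers in a release name."""
    mysql: Tuple[int, ...]  # version of the MySQL server the release is based on
    ndb: Tuple[int, ...]    # version of MySQL Cluster (NDB) itself


# Assumption: names look like "mysql-<dotted version>-ndb-<dotted version>".
_RELEASE_NAME = re.compile(r"^mysql-(\d+(?:\.\d+)*)-ndb-(\d+(?:\.\d+)*)$")


def parse_cluster_version(name: str) -> ClusterVersion:
    """Split a Cluster release name into MySQL and NDB version tuples."""
    match = _RELEASE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognized Cluster release name: {name!r}")
    mysql, ndb = (
        tuple(int(part) for part in group.split("."))
        for group in match.groups()
    )
    return ClusterVersion(mysql=mysql, ndb=ndb)


if __name__ == "__main__":
    version = parse_cluster_version("mysql-5.1.23-ndb-6.2.15")
    print("MySQL base:", ".".join(map(str, version.mysql)))
    print("NDB release:", ".".join(map(str, version.ndb)))
```

Keeping the two numbers as separate tuples makes it easy to compare Cluster releases by their NDB version while still knowing which MySQL server they were built on.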

Overall, this seems like a good idea: the needs for releases, schedules, pressures, etc., are at odds between MySQL Cluster and MySQL’s core. I am, however, baffled by the decision of how to release the new product: coupled with MySQL as a single monolithic package with a compiled-in storage engine. After all, 5.1 has long been touted to have this amazing new pluggable storage engine architecture. Why not use it?

With Oracle/InnoDB recently announcing that they are decoupling from MySQL and releasing their storage engine as a plugin, this would make so much sense. The only thing I can think of is that MySQL and/or the MySQL Cluster team must think that the pluggable storage engine concept is not workable in its current state or easy enough to use… which I would agree [1] with absolutely. However, having a team within MySQL really pushing a product and using the pluggable interface to make their releases would help dramatically to make the interface really usable for the rest of the world.

Why not do it? Let’s hear it…

[1] My short list of gripes, by no means complete: the error message interface sucks, there is no way to compile a plugin outside of the MySQL source (or even symlink it into the source tree), plugins are tied to an exact version of MySQL server, and there is no reasonable way to manage plugins in an OS context (RPMs, .debs, etc.).

Now available: Proven Scaling MySQL yum repository

Yum is an extremely popular system to download, install, and update RPM-based packages from multiple repositories. Proven Scaling has launched a set of repositories to augment the existing central distributions’ repositories with packages our customers need for deploying MySQL-based systems. We’ve been working on it for a while, and have had many people making use of it. We are providing:

  • RPMs of community and enterprise releases of MySQL for RHEL/CentOS, as built by MySQL and distributed on MySQL.com
  • RPMs of community tools such as maatkit and innotop and their dependencies.
  • Proven Scaling-created tools such as mysql_snapshot (an LVM snapshot-based backup utility).
  • Difficult-to-find RPMs of Perl libraries (dependencies for other scripts, such as innotop).

Here are the yum repositories we are providing:

To install these repositories, grab the .repo file and place it in /etc/yum.repos.d/. You should then be able to install packages using e.g. yum install maatkit. Here are the .repo files:

We hope you like them and find them useful! Let me know if there are any additional packages you think we should add.
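For anyone who has not written one before, a .repo file is just a small INI-style file with one section per repository. The Python sketch below generates one with the standard library’s configparser. Everything specific in it is a placeholder I made up for illustration: the repository id “example-repo”, the name, and the repo.example.com URL are NOT the real Proven Scaling values, and render_repo_file is a hypothetical helper, not part of yum.

```python
import configparser
import io


def render_repo_file(repo_id: str, name: str, baseurl: str,
                     enabled: bool = True, gpgcheck: bool = True) -> str:
    """Render the text of a single-repository yum .repo file."""
    # interpolation=None keeps characters such as '%' in URLs literal.
    config = configparser.ConfigParser(interpolation=None)
    config.optionxform = str  # preserve option-name case as written
    config[repo_id] = {
        "name": name,
        "baseurl": baseurl,
        "enabled": "1" if enabled else "0",
        "gpgcheck": "1" if gpgcheck else "0",
    }
    buffer = io.StringIO()
    # Write "key=value" without spaces, the style usually seen in .repo files.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder values only; substitute the details of the real repository.
    text = render_repo_file(
        repo_id="example-repo",
        name="Example repository",
        baseurl="http://repo.example.com/el5/$basearch/",
        gpgcheck=False,  # illustration only; a real setup should verify signatures
    )
    print(text)  # save as e.g. /etc/yum.repos.d/example-repo.repo
```

In practice you would simply download the .repo file rather than generate it; the point is only to show which fields yum expects to find in it.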

Bravo Oracle: InnoDB Plugin 1.0 released

Yesterday, Oracle’s Ken Jacobs and Heikki Tuuri, creator of InnoDB, announced the immediate release of InnoDB Plugin 1.0 for MySQL 5.1. I’ve already downloaded it and played around with it a bit. I haven’t had time to do any performance benchmarks or similar just yet. Those will come in due time.

This release is the beginning of two exciting things: InnoDB is now officially decoupled from MySQL release-wise, and lots of new features have been added to this new release. I will come to what the decoupling means in a moment, but first, the major new features in this release of InnoDB (from my perspective, and with my commentary):

  • Fast in-place index management: the ability to add and drop indexes in place, without rebuilding the entire table. This isn’t a complete implementation of the long-awaited “online ALTER TABLE”, as that is mostly a MySQL problem rather than an InnoDB problem.
  • Compression of data and indexes: this should allow data size on disk to be reduced substantially.
  • Storage of entire BLOB, TEXT, and VARCHAR data off-page: this can allow more efficient PRIMARY KEY indexes (where the data is stored in InnoDB because of the index clustering) on tables with BLOB, TEXT, or large VARCHAR columns.
  • New information_schema tables: InnoDB is now providing a lot more visibility into what is going on internally. I hope to see this extended even further.
  • InnoDB-specific “strict mode”: don’t allow InnoDB to fudge things internally; force it to raise an error in circumstances it can’t handle, rather than just give a warning or silently continue.

All of the above features look excellent, but one of the more interesting aspects of this announcement is the fact that MySQL and InnoDB are now decoupled for release. That is, Oracle can make a release of InnoDB without having to wait for MySQL to make their own release. While this will make it slightly more difficult to describe what version of MySQL/InnoDB you’re using (especially without a way to find that out from MySQL), it has the potential to make the release process much quicker and more efficient.

I’m quite hopeful that Oracle will be a lot more accepting of outside patches and code contributions to the InnoDB codebase than MySQL has been recently. What I have seen of the recent collaboration between Heikki, Ken, and Yasufumi (an outside contributor) on fixing InnoDB performance bugs makes me quite positive about that. I’ve got a lot of ideas for new features targeted at manageability that I’d like to get implemented. I’d love to hear Oracle’s comments on how they will accept patches to InnoDB now, and what they see the community interaction looking like in the future. Ken, any comment?

The only negative aspects I can see with this announcement are:

  • Oracle spent a couple of years working on this in silence, away from the community. Almost everyone was surprised by this release, as we hadn’t seen anything from Oracle about InnoDB in a long time, and to some extent we were sitting in the corner hoping that Oracle really didn’t plan on killing the project. While I understand the silence perfectly well from a business standpoint, I am hopeful that long periods of working away from the community can be minimized in the future, so that we can all stay more involved.
  • It is now a bit harder to get a new MySQL/InnoDB installation up and running, as the newest InnoDB is not part of the MySQL packages anymore. I’m sure this can be cleaned up with some smart RPM (etc.) packaging.

Overall, I am excited about this announcement, and quite happy that Oracle is making some serious contributions and commitments to maintaining InnoDB. Thanks for all your hard work, Ken and Heikki and the rest of the InnoDB team! Let me know if there’s anything I can do for you!