New Storage Engines: A welcome change

There’s been a lot of buzz lately about new storage engines (Solid’s SolidDB and Jim Starkey’s Falcon) being developed for MySQL. Quite a few people have asked me what I think about them, and whether switching storage engines is really a seamless process. Everybody still has Oracle’s acquisition of Innobase Oy fresh in their minds, so nobody is terribly surprised by the recent announcements. As for my opinion on the matter, well, it’ll take some discussion. I was quoted in ComputerWorld’s article “MySQL to encourage partners to build data storage engines”:

Jeremy Cole, who oversees about 8,000 installations of the open-source database at Yahoo Inc., said the Sunnyvale, Calif.-based Web firm uses MyISAM for applications mostly requiring the reading of data—and InnoDB when many users may be writing data simultaneously.

Cole called InnoDB “great,” but also said it is “somewhat poorly integrated” with MySQL, lacking several common features such as full-text search and online configuration changes, while poorly supporting “referential integrity,” which keeps the relationships between data tables consistent.

Furthermore, the only way to do “reasonably fast online hot backups” with InnoDB is a closed-source tool called ibbackup, which is now owned by Oracle.

“If a new storage engine offered InnoDB’s current feature set without the above problems, and was stable, I would switch in a heartbeat,” Cole said. However, he doesn’t expect any of the unannounced storage engines “to really be ready for use for another year or so.”

I wanted to follow up and provide a bit more depth and context, and some of the technical details that were not completely appropriate for ComputerWorld’s audience:

InnoDB has been great in that it has row-level locking and supports multi-versioning and transaction isolation. That solves quite a few problems for write-heavy transactional applications. However, it’s somewhat poorly integrated into MySQL, and has some problems of its own. To name a few:

  • InnoDB’s only option for reasonably fast online hot backup is a tool called ibbackup, which is closed-source, was previously owned and sold by Innobase Oy (for about $1400 per server) and is now owned by Oracle. I haven’t heard anything about what Oracle intends to do with ibbackup.
  • InnoDB doesn’t allow any online configuration changes, as the rest of MySQL (and MyISAM) does. (Technically: It doesn’t support using the SET command to change its configuration on-the-fly.)
  • InnoDB doesn’t support full-text search, the ability to search for words within text documents. This is a feature used by many web applications. Effectively, users must choose between no transactions, table-level locking, and full-text search (MyISAM), or transactions, row-level locking, and no full-text search (InnoDB). It’s sometimes a very painful choice.
  • Innobase Oy / InnoDB made a very “cowboy” effort to support foreign keys, a.k.a. “referential integrity”: instead of working with MySQL to support it as a built-in feature of MySQL itself, they basically duct-taped it onto InnoDB. This has caused a lot of headaches for many people, including but not limited to completely useless error messages and mysterious failures.
  • InnoDB’s tablespace management leaves a lot to be desired. There are no online tablespace management commands (CREATE TABLESPACE, DROP TABLESPACE, ALTER TABLESPACE). You have two options:

    • All data for all tables and databases is stored in a single common set of files. There is no online management, no ability to shrink the tablespace, and no ability to move data between the tablespace files, and the only way to add space is to set one of the files (and only one!) to auto-extend; or
    • Each table’s data is stored in its own .ibd file. This file is as large as the table, so if you have 500GB of data in one table, you have a single unmanageable 500GB file. It can never be shrunk; it can only grow.
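To make the auto-extend restriction in the first option concrete, here is a minimal Python sketch that composes a value for the innodb_data_file_path setting. It assumes the usual name:size[:autoextend] syntax with entries separated by semicolons, and it assumes the one auto-extending file has to be the last one listed. The helper name data_file_path and the file names and sizes are made up for illustration; check the manual for your MySQL version before relying on any of it.

```python
def data_file_path(files):
    """Compose an innodb_data_file_path value.

    `files` is a list of (name, size, autoextend) tuples in tablespace order.
    Assumption: only the last file in the list may be marked autoextend.
    """
    if not files:
        raise ValueError("at least one data file is required")
    parts = []
    for index, (name, size, autoextend) in enumerate(files):
        is_last = index == len(files) - 1
        if autoextend and not is_last:
            raise ValueError("only the last data file may auto-extend")
        parts.append(f"{name}:{size}" + (":autoextend" if autoextend else ""))
    return ";".join(parts)


if __name__ == "__main__":
    # Hypothetical file names and sizes, for illustration only.
    value = data_file_path([("ibdata1", "50M", False), ("ibdata2", "50M", True)])
    print(f"innodb_data_file_path = {value}")
```

The resulting line would go in the server’s configuration file, and the server has to be restarted to pick it up, which is exactly the lack of online management described above.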

Switching storage engines in MySQL is actually almost as easy as they claim. However, I don’t expect any of these new storage engines to really be ready for use for another year or so. There are a lot of integration issues to be had when pulling in a new storage engine. InnoDB had quite a lot of bugs in the first year due to this as well.
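For a sense of what the switch itself looks like, here is a rough Python sketch that generates one ALTER TABLE ... ENGINE statement per table. The helper names and the schema and table names are hypothetical, and the script only builds the SQL text; it does not connect to a server. Keep in mind that converting a table this way rebuilds it, so on large tables it can take a long time.

```python
import re


def quote_identifier(name):
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def engine_switch_statements(tables, engine):
    """Return one ALTER TABLE ... ENGINE statement per (schema, table) pair."""
    # Reject anything that does not look like a plain engine name.
    if not re.fullmatch(r"[A-Za-z0-9_]+", engine):
        raise ValueError(f"suspicious engine name: {engine!r}")
    return [
        f"ALTER TABLE {quote_identifier(schema)}.{quote_identifier(table)} ENGINE={engine};"
        for schema, table in tables
    ]


if __name__ == "__main__":
    # Hypothetical schema and table names, for illustration only.
    tables = [("shop", "orders"), ("shop", "customers")]
    for statement in engine_switch_statements(tables, "InnoDB"):
        print(statement)
```

The statement is the easy part; the integration issues mentioned above only show up once real workloads run against the converted tables.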

Will I use them? Absolutely.

One thought on “New Storage Engines: A welcome change”

  1. As for the full-text search, the one which is built into MySQL is in fact very poor, too. It’s next to impossible to search through a few hundred thousand records with MySQL FT.

    So in *both* cases, the solution would be to use some external search engine. And of those, I’d like to bring your attention to Sphinx ( http://sphinxsearch.com/ ), which is GPL, integrates with MySQL pretty well, and allows you to set up searching through both MyISAM and InnoDB easily.
