I’ve been working on a new project to fulfill a specific need: consistent, fast, cheap, and flexible backups for MySQL, for all storage engines [1]. To that end I’m creating a tool called dbsnapper, a plugin-based backup tool. The tool itself is very basic and handles a few jobs: getting configuration information from the user, running through a “run sheet” of different configurable tasks, and reporting status and errors to the user.
The tasks themselves (the actual backup steps) are fully configurable via plugins. In fact, the whole process isn’t even MySQL-specific, and can potentially be used for PostgreSQL [2] and other databases as well. Remember the requirements for backups (above):
- Consistent—We need to do some locking inside MySQL to make sure that the backups are consistent, for both MyISAM and InnoDB tables. This generally means the FLUSH TABLES WITH READ LOCK command.
- Fast—There are two ways to get fast, but they both involve snapshotting: either inside the database, or on the volume level. The best way to get a backup quickly is by using Linux’s LVM, the Logical Volume Manager, to take a snapshot of the whole filesystem. Using mysqldump for backups fails miserably on this point.
- Cheap—Well, backups should be free, and open source. Sorry ibbackup, sorry commercial utilities, become open source and we’ll talk.
- Flexible—Everyone wants to do something slightly different with their backups, and in order for them to use one common tool, that tool needs to be very flexible. Most backup tools for MySQL are completely inflexible (other than the destination of the backup files). People often have slightly different requirements, why not try to make a single tool work?
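To make the "consistent" and "fast" requirements concrete, here is a minimal sketch in Python of the lock, snapshot, unlock sequence they imply. It is an illustration only, not dbsnapper code: the `Step` type, the `lvm_snapshot_plan` function, and the volume names are all hypothetical, and the sketch only builds the list of commands rather than running them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    """One step of a backup plan (hypothetical type, for illustration)."""
    kind: str     # "sql" = run on the open MySQL connection, "shell" = run as a command
    command: str


def lvm_snapshot_plan(volume_group: str, logical_volume: str,
                      snapshot_name: str = "dbsnap",
                      snapshot_size: str = "1G") -> List[Step]:
    """Build the lock -> snapshot -> unlock sequence as data, without executing it.

    Assumes the MySQL data directory lives on the given LVM logical volume.
    """
    origin = f"/dev/{volume_group}/{logical_volume}"
    return [
        # Take the global read lock so MyISAM and InnoDB files are consistent.
        Step("sql", "FLUSH TABLES WITH READ LOCK"),
        # Snapshot the volume while the lock is still held.
        Step("shell",
             f"lvcreate --snapshot --size {snapshot_size} "
             f"--name {snapshot_name} {origin}"),
        # Release the lock; the snapshot can now be copied off at leisure.
        Step("sql", "UNLOCK TABLES"),
    ]


if __name__ == "__main__":
    for step in lvm_snapshot_plan("vg0", "mysql"):
        print(f"{step.kind:5} {step.command}")
```

One detail matters if you turn this into a real script: both SQL steps have to run on the same MySQL connection, and that connection must stay open while the snapshot is created, because the read lock is released as soon as the session that took it disconnects.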
It’s possible to meet all of the above requirements right now, but you would likely have to write your own backup script. When writing that script, you would likely do the minimum to make it work in your environment. Why should everyone write their own? My project [3], dbsnapper, is designed from the start to handle backups in a flexible and configurable way: the user decides what tools and processes to use, and dbsnapper carries them out.
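Since the code isn't published yet, here is a rough sketch in Python of what a plugin-driven "run sheet" could look like. None of this is the actual dbsnapper API: the `PLUGINS` registry, the `plugin` decorator, the `run` function, and the three example tasks are assumptions made up for illustration, and the tasks only describe what they would do.

```python
from typing import Callable, Dict, List, Tuple

# Registry of hypothetical task plugins, keyed by the name used in a run sheet.
PLUGINS: Dict[str, Callable[[dict], str]] = {}


def plugin(name: str):
    """Decorator that registers a function as a task plugin under `name`."""
    def register(func: Callable[[dict], str]) -> Callable[[dict], str]:
        PLUGINS[name] = func
        return func
    return register


@plugin("lock")
def lock_tables(config: dict) -> str:
    return f"would lock tables on {config['host']}"


@plugin("snapshot")
def take_snapshot(config: dict) -> str:
    return f"would snapshot {config['volume']}"


@plugin("unlock")
def unlock_tables(config: dict) -> str:
    return f"would unlock tables on {config['host']}"


def run(run_sheet: List[str], config: dict) -> List[Tuple[str, str, str]]:
    """Run each named task in order and collect (task, status, message) rows.

    The run stops at the first unknown or failing task.
    """
    report: List[Tuple[str, str, str]] = []
    for name in run_sheet:
        task = PLUGINS.get(name)
        if task is None:
            report.append((name, "error", "no such plugin"))
            break
        try:
            report.append((name, "ok", task(config)))
        except Exception as exc:  # report the failure instead of crashing
            report.append((name, "error", str(exc)))
            break
    return report


if __name__ == "__main__":
    config = {"host": "db1", "volume": "/dev/vg0/mysql"}
    for row in run(["lock", "snapshot", "unlock"], config):
        print(row)
```

The point of the design is that the core stays tiny: it knows how to read a list of task names, call them in order, and report what happened. Swapping the snapshot task for a dump task, or adding a copy-to-remote task, means changing the run sheet rather than the tool. Stopping at the first failure is just one possible policy; a real tool would also need a way to run cleanup steps (such as releasing a lock) after an error.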
Keep an eye out, I’ll blog again once I’ve published the code!
[1] Yes, yes, I know about the blue-sky internal online backup plans. Need I mention that online backup was originally planned for 4.0? Then it was moved to 4.1, where it would definitely get done… then to 5.0, a major new version, surely it will get done then. Now it’s likely not going to make it into 5.1, and is slotted for 5.2, as far as I know. In the end, even if it does get done, that still doesn’t help people who want to back up their 4.0 or 4.1 installations, which is very common.
[2] If someone is interested in working on the plugins for PostgreSQL, let me know, and I’ll give you a nudge once the plugin API is stable!
[3] It need not be only “my” project. Anyone interested in helping?