I love you, Akismet

Blogging used to be fun.

Then it started to suck. Spam sucked. Life sucked.

Now life is good. Spam is no more. Matt told me to use Akismet; I was skeptical. I am no longer skeptical. I love you, Akismet.

Akismet has caught 501,725 spam for you since you first installed it.

Yup. Since January 15.

On iostat, disk latency; iohist onward!

Just a little heads-up and a bit of MySQL-related technical content for all of you still out there following along…

At Proven Scaling, we take on MySQL performance problems pretty regularly, so I’m often in need of good tools to characterize current performance and find any issues. In the database world, you’re really looking for a few things of interest related to I/O: throughput in bytes, throughput in requests, and latency. The typical tool to get this information on Linux is iostat. You would normally run it like iostat -dx 1 sda and its output would be something like this, repeating every 1 second:

Device:     rrqm/s   wrqm/s      r/s      w/s   rsec/s   wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz    await    svctm    %util
sda           0.00     8.00     0.00     4.00     0.00    96.00     0.00    48.00    24.00     0.06    15.75    15.75     6.30

Most of the output of iostat is interesting and reasonable for its intended purpose, which is as a general-purpose way to monitor I/O. The really interesting things for most database servers (especially those in trouble) are:

avgrq-sz — Average request size, in sectors (512 bytes each).
avgqu-sz — Average I/O queue length, in requests.
await — Average time for a request to complete, in milliseconds. This includes time spent waiting in the queue and scheduler as well as the time spent servicing the request.
svctm — Average service time for I/O requests, in milliseconds. This excludes time spent queued, so it should never be higher than await. This is the most interesting number for any write-heavy transactional database server, as it largely determines transaction commit time.
%util — Approximate percent utilization for the device.
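To make the units concrete, here is a small sanity check against the sample row above. It is only a sketch of the arithmetic as I understand it: request size is sectors divided by requests, and svctm is the device's busy time divided by the number of requests. The function name and dictionary keys are made up for this example and are not part of iostat.

```python
SECTOR_BYTES = 512  # iostat reports rsec/s and wsec/s in 512-byte sectors


def derive(row):
    """Recompute a few iostat columns from the others (illustrative only)."""
    ios_per_sec = row["r/s"] + row["w/s"]
    busy_ms_per_sec = row["%util"] * 10  # 100% util == 1000 ms busy per second
    return {
        # sectors transferred per request
        "avgrq_sz": (row["rsec/s"] + row["wsec/s"]) / ios_per_sec,
        # sectors written per second, converted to kilobytes
        "wkB_per_sec": row["wsec/s"] * SECTOR_BYTES / 1024,
        # busy milliseconds per request
        "svctm": busy_ms_per_sec / ios_per_sec,
    }


# Values copied from the sample iostat row.
sample = {"r/s": 0.00, "w/s": 4.00, "rsec/s": 0.00, "wsec/s": 96.00, "%util": 6.30}

if __name__ == "__main__":
    for name, value in derive(sample).items():
        print(f"{name:12s} {value:.2f}")
```

The three results line up with the avgrq-sz, wkB/s, and svctm columns of the sample row (24.00, 48.00, and 15.75).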

There is one major problem with using iostat to monitor MySQL/InnoDB servers: svctm and await combine reads and writes. With a reasonably configured InnoDB, on a server with RAID with a battery-backed write cache (BBWC), reads and writes will have very different behaviour. In general, with a non-filled cache, writes should complete (to the BBWC) in just about zero milliseconds. Reads should take approximately the theoretical average time possible on the underlying disk subsystem.

I’ve often found myself scratching my head looking at a nonsensical svctm due to reads and writes being combined together. One day I was perplexed enough to do something about it: I opened up the code for iostat to see how it worked. It turns out that the core of what it does is quite simple (so much so, I wonder why it’s C instead of Perl): it opens /proc/diskstats and /proc/stat and does some magic to the contents.
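For the curious, here is roughly what that "magic" looks like when reads and writes are kept apart. This is a minimal sketch, not the code from iostat or from my own script: the names DiskStats, parse_diskstats, split_latency, and watch are invented for illustration. It relies on the documented /proc/diskstats layout, where each device line carries cumulative counts of completed reads and writes and the milliseconds spent on each. Note that those millisecond counters include time spent queued, so the averages below are closer to a per-direction await than to a true svctm.

```python
import time
from collections import namedtuple

# Cumulative counters for one device: completed I/Os and milliseconds spent.
DiskStats = namedtuple("DiskStats", "rd_ios rd_ticks wr_ios wr_ticks")


def parse_diskstats(text):
    """Map device name -> DiskStats from the contents of /proc/diskstats."""
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 14:  # skip short (old-style partition) or blank lines
            continue
        # f[3]=reads completed, f[6]=ms reading,
        # f[7]=writes completed, f[10]=ms writing
        stats[f[2]] = DiskStats(int(f[3]), int(f[6]), int(f[7]), int(f[10]))
    return stats


def split_latency(prev, cur):
    """Average ms per read and per write between two samples (None if idle)."""
    d_rd = cur.rd_ios - prev.rd_ios
    d_wr = cur.wr_ios - prev.wr_ios
    r_ms = (cur.rd_ticks - prev.rd_ticks) / d_rd if d_rd else None
    w_ms = (cur.wr_ticks - prev.wr_ticks) / d_wr if d_wr else None
    return r_ms, w_ms


def watch(device, interval=1.0):
    """Print separate read and write latency for one device until interrupted."""
    with open("/proc/diskstats") as fh:
        prev = parse_diskstats(fh.read())[device]
    while True:
        time.sleep(interval)
        with open("/proc/diskstats") as fh:
            cur = parse_diskstats(fh.read())[device]
        print("read ms: %s  write ms: %s" % split_latency(prev, cur))
        prev = cur
```

Calling watch("sda") on a Linux box prints one line per interval. An idle direction shows None rather than a misleading zero.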

What I really wanted was a histogram of the reads and writes (separately, please!) for the given device. I hacked up a quick script to do that, and noticed how incredibly useful it is. I recently had to extend it to address other customer needs, so I worked on it a bit more and now it looks pretty good. Here’s an example from a test machine (so not that realistic for a MySQL server):

util:   1.27% r_ios:     0  w_ios:     1  aveq:     0,
ms : r_svctm                     : w_svctm
 0 :                             :
 1 :                             :
 2 :                             :
 3 : x                           :
 4 : x                           :
 5 : xxx                         :
 6 : xxxx                        :
 7 :                             :
 8 : x                           : x
 9 : x                           : xx
10 : x                           : xxxxx
11 :                             : xxxxxxxxxxxxxxx
12 :                             : xxxxxxxxxxxxxxxxxxxxxxxxx
13 : xx                          : xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
14 :                             : xxxxxxxxxxxxxxxxxxxxx
15 : xx                          : xxxxxxxxxxx
16 : x                           : xxxxx
17 : x                           : xxxxxx
18 :                             : xxxx
19 : x                           : xx
20 :                             : x
21 :                             : x
22 :                             : x
23 :                             : x
24 :                             : x
25 :                             :
26 :                             :
27 :                             :
28 : x                           :
29 :                             :
30 :                             :
++ : 0                           : 250

It uses Curses now to avoid redrawing the entire screen, and I’ve got a ton of ideas on how to improve it. I have a few more must-haves before I release it formally to the world, but I wonder what other features people would want from it. It is Linux-only for the foreseeable future.

What do you think?

MySQL Scaling and High Availability Architectures

Eric and I gave a 3-hour tutorial on Monday morning at the MySQL Conference and Expo 2007 entitled “Scaling and High Availability Architectures”. The slides aren’t great, and of course you should have been there. :) Nonetheless many of you will find the slides useful, and we’re keen to provide them, so here they are:

MySQL Scaling and High Availability Architectures

If you have any questions or need help validating or understanding how our recommendations may fit into your environment, contact us.

DorsalSource: MySQL Community Build Site Launched

Back in late 2006, MySQL AB decided to split (or “fork” for the more common open source term) their source code and release structure into two parts: “Community” and “Enterprise”. This has caused quite a lot of stirring in the MySQL market, and a lot of confusion about what exactly the difference is and how the release structure works. It’s actually not easy to really explain the new structure, and I won’t try here. The key point for the purposes of this discussion is that MySQL is effectively no longer providing builds (binaries) of their community releases, and they don’t provide enterprise builds at all to the public. They are providing source releases of community, but fairly infrequently.

This is a big problem, because it means that there is no realistic way for the average developer or MySQL user to get regular builds (and ones that quickly address bugs) without paying MySQL for a support contract. Most of Proven Scaling’s customers do not have support contracts with MySQL AB, and have been quite unhappy about the change. Personally, I’ve lost a reasonable way to get new features pushed out to the public without excessive delays (up to 6 months to the next community release, and years for enterprise).

Back when MySQL originally polled me on this issue and told me of their plans, I told them that they would just force the community to repair the “damage” in the ecosystem, by providing the builds themselves. I even warned that the one doing the repairing could possibly be myself and Proven Scaling…

So, to get to the point…

This week, at the MySQL Conference and Expo, Solid Information Technology and Proven Scaling have announced a collaborative project to address the needs of the community for frequent releases with interesting features, bug fixes, and new patches: DorsalSource. Immediately, we have begun providing over 40 binaries of MySQL community and enterprise forks on Linux, Mac OS X, and Windows. We plan to add many additional platforms and variations of the builds, in addition to many more features to build a real developer community around MySQL.

We will continue the development of DorsalSource, adding many great new features—we have a ton of ideas, it’s just a matter of implementing them and getting them out there for users to use. If you have any questions, comments, or ideas for how to improve the site, or really anything at all, please feel free to contact me directly or leave a comment here.

MySQL Job Fair at MySQL Conf & Expo 2007

One of the main questions we get at Proven Scaling is “Where do we find MySQL DBAs, Architects, and Developers?”

Next week is the MySQL Conference and Expo in Santa Clara, California. Inspired by the success of our impromptu job board at MySQL Camp last year (although whiteboards with no moderation tend to get messy), we decided to make a more formal job board this year. Something much more prominent, more organized, cleaner, and easier for both employers and job seekers to use. With great thanks to Solid for allowing us to use some of their Expo Hall space, we will have a formal job board in the Expo Hall, which is open April 24th and 25th.

We will have several “wall” panels at the edge of Solid’s Expo Hall space, and a kiosk (with Internet access) where you can enter your own job postings on the spot. Each posting will be printed out as a nice PDF, which will be mounted on the wall, and take-away cards will be printed with your contact information for candidates to take with them.

Posting your open jobs is completely FREE as a service to the community.

If you’re not going to the conference yourself, you just want to get your posting in early (please do!), or you want to post from the comfort of your own computer, head over to ExpoJobFair and sign up to post your open jobs!

Jeremy Cole

Geek, electronics nerd, database nerd, aviation nerd, father of three.