Progress in MySQL Process List

Today I had a sort of short epiphany regarding getting progress of running statements in MySQL. MySQL already keeps a running count of rows touched in most multi-row statements (called thd->row_count1), so I figured there must be a way to make use of it. It was trivial to expose row_count through SHOW PROCESSLIST. After that, it was fairly obvious that another variable could be added: row_count_expected. For certain statements (currently only ALTER TABLE) it is easy to estimate how many rows will be touched, so that number can be used to calculate a Progress_percent column.

The Progress_rows number indicates progress within a given step of executing the query. For instance, if you run a SELECT with a GROUP BY that can’t be done using an index, you will see two cycles of Progress_rows: once with a State of “Copying to tmp table” and once with “Sending data”.

I implemented this all in a small patch to MySQL 5.0 (and backported to MySQL 4.1) which produces the following output from SHOW FULL PROCESSLIST:

mysql> show full processlist G
*************************** 1. row ***************************
              Id: 1
            User: jcole
            Host: localhost
              db: test
         Command: Query
            Time: 3
           State: copy to tmp table
            Info: alter table sclot__a type=myisam
   Progress_rows: 44141
Progress_percent: 76.09

This was really way, way too easy. Hopefully it can be one with MySQL Community soon.

1 Note that currently thd->row_count is a 32-bit unsigned integer, so it will wrap at about 4.2 billion rows. Someone should really think about fixing this. :)

Corrupted relay logs?

I just opened MySQL Bug #26123, to attempt to find out how many people are seeing this possible replication bug. A few Proven Scaling customers have seen the same bug, and I haven’t been able to reproduce it, so I opened a bug as a feeler. It appears to have something to do with using BLOB or TEXT fields in replication.

Are you seeing slaves stop with corrupted relay logs? Does restarting replication using CHANGE MASTER and the Exec_Master_Log_Pos from the stopped slave1 work just fine? Do the master’s binary logs look perfectly OK? Leave a comment on the bug.

1 This effectively forces the slave to re-download the exact same log events that it currently has in its relay logs. Since the corruption appears to happen either in the master’s slave thread, or the slave’s replication IO thread, this gets things going again.

MySQL Meetup Silicon Valley now at Google

For the past year I’ve been running the MySQL Meetup Silicon Valley, and it’s been fun. We normally either have open discussion, or a scheduled topic. I often present something. Starting with the February 12 Meetup we will be meeting at Google HQ in Mountain View, CA. (Thanks, Google!)

Feel free to come on down or up to Mountain View and hang out with us on the second Monday of each month! Wanna speak at one of the Meetups? Let me know!

Colbert on Bush on Healthcare

In reference to Bush’s suggestion during the State of the Union address that a tax break will help poor people afford health insurance, Steven Colbert said:

It’s simple: Most people who can’t afford health insurance are too poor to owe taxes, but if you give them a deduction from the taxes they don’t owe, they can use the money they’re not getting back from what they haven’t given to buy the health care they can’t afford.

Brilliant. Just brilliant.