Temporary tables for SHOW VARIABLES in MySQL 5.0

In MySQL 5.0, with the introduction of information_schema, the SHOW commands were changed to really be wrappers around SELECTs against information_schema. This means that when you issue e.g., a SHOW VARIABLES command, Created_tmp_tables is incremented once, and Handler_write and Handler_read_rnd_next are incremented about once per row in the result. Keep this in mind in your monitoring systems; these numbers could be wildly inaccurate if some part of your application is doing many SHOW commands.

This came to my attention because Connector/J’s ReplicationDriver runs SHOW VARIABLES LIKE 'tx_isolation' whenever you switch between setReadOnly(true) and setReadOnly(false), so in large, complex Java applications don’t be surprised if you see a high number1 of Created_tmp_tables. How can this be solved? Use SELECT instead of SHOW. If you need to access one variable, instead of doing:

SHOW VARIABLES LIKE 'tx_isolation'

This will copy all variables into a temporary table, and apply the LIKE filter to that temporary table.

Use this instead:

SELECT @@session.tx_isolation

This will just fetch the tx_isolation variable from memory directly. This is much more efficient and to-the-point.

1 Like, say, 8,000 per second. Creating so many temporary tables burns up a pretty big chunk of CPU time for nothing.

Progress in MySQL Process List

Today I had a sort of short epiphany regarding getting progress of running statements in MySQL. MySQL already keeps a running count of rows touched in most multi-row statements (called thd->row_count1), so I figured there must be a way to make use of it. It was trivial to expose row_count through SHOW PROCESSLIST. After that, it was fairly obvious that another variable could be added: row_count_expected. For certain statements (currently only ALTER TABLE) it is easy to estimate how many rows will be touched, so that number can be used to calculate a Progress_percent column.

The Progress_rows number indicates progress within a given step of executing the query. For instance, if you run a SELECT with a GROUP BY that can’t be done using an index, you will see two cycles of Progress_rows: once with a State of “Copying to tmp table” and once with “Sending data”.

I implemented this all in a small patch to MySQL 5.0 (and backported to MySQL 4.1) which produces the following output from SHOW FULL PROCESSLIST:

mysql> show full processlist G
*************************** 1. row ***************************
              Id: 1
            User: jcole
            Host: localhost
              db: test
         Command: Query
            Time: 3
           State: copy to tmp table
            Info: alter table sclot__a type=myisam
   Progress_rows: 44141
Progress_percent: 76.09

This was really way, way too easy. Hopefully it can be one with MySQL Community soon.

1 Note that currently thd->row_count is a 32-bit unsigned integer, so it will wrap at about 4.2 billion rows. Someone should really think about fixing this. :)

On Triggers, Stored Procedures, and Call Stacks

If you’re a frequent reader, you might have noticed that I’m on a roll contributing MySQL patches by now… there are many more to come. This is part of the reason that I founded Proven Scaling — to be able to spend my time solving interesting problems and making MySQL better. So what about triggers, stored procedures, and call stacks?

I’ve written a patch that implements three new functions in MySQL 5.01:

  • CALLER(level) — Returns the SQL statement in the call stack at level, where 0 is the level containing the call to CALLER() itself (which is nearly useless), and 1 and above are any stored procedure or trigger calls that got us here.
  • CALLER_DEPTH() — Returns the current depth of the call stack, not counting 0. CALLER(CALLER_DEPTH()) will always return the top-most level (i.e. the user-generated command).
  • CALLER_WS(separator, level) — Returns the entire call stack starting at level, with each level separated by separator.

What are these good for? Quite a bit! For a few examples:

  • Debugging — From any place in your stored procedure code, you can now find out how exactly you got there, no matter how complex the code is. In the future, this could be augmented by allowing you to see the values of the parameters at each step as well.
  • Auditing — This is much more interesting for most people, and while the current implementation isn’t perfect2, it’s getting a lot closer to what people need for auditing purposes. Using triggers and CALLER(), USER(), and NOW() you can get most of the auditing information you may need.

Now, for some examples of the output from these new functions:

Just a test of CALLER() and CALLER_DEPTH():

mysql> select caller(0);
| caller(0)        |
| select caller(0) | 
1 row in set (0.00 sec)

mysql> select caller_depth();
| caller_depth() |
|              0 | 
1 row in set (0.00 sec)

In order to get anything interesting, we need to generate some depth of the call stack:

delimiter //
create procedure stack() begin select caller_ws(' <called by> ', 1); end //
create procedure ts1() begin call stack(); end //
create procedure ts2() begin call ts1(); end //
create procedure ts3() begin call ts2(); end //

Now we can try it out for real:

mysql> call ts2()//
| caller_ws(' <called by> ', 1)                              |
| call stack() <called by> call ts1() <called by> call ts2() | 
1 row in set (0.00 sec)

As an example of what you can do inside stored procedures:

DROP PROCEDURE IF EXISTS test.dump_stack //
CREATE DEFINER='root'@'localhost' PROCEDURE test.dump_stack ()
  DECLARE caller_depth INT DEFAULT 0;
  SET caller_depth = CALLER_DEPTH();

    depth INT NOT NULL,
    query TEXT, 
    PRIMARY KEY (depth)
  WHILE i <= caller_depth
    INSERT INTO test_stack_t (depth, query) VALUES (i, caller(i+1));
    SET i = i + 1;
  SELECT depth, query FROM test_stack_t ORDER BY depth DESC;
  DROP TEMPORARY TABLE test_stack_t;

We can then redefine stack() to call dump_stack() instead, so that we can use the same tsX() procedures above:

drop procedure if exists stack //
create procedure stack() begin call dump_stack(); end //

And the same test:

mysql> call ts2()//
| depth | query        |
|     3 | call ts2()   | 
|     2 | call ts1()   | 
|     1 | call stack() | 
3 rows in set (0.00 sec)

With a little additional magic, we can use CALLER() for auditing2:

DROP TABLE IF EXISTS test.audit //
CREATE TABLE test.audit (
  id INT NOT NULL auto_increment,
  user CHAR(64), 
  call_stack TEXT, 
  PRIMARY KEY (id), 
  INDEX (ts), 
  INDEX (user)

CREATE DEFINER='root'@'localhost' PROCEDURE test.audit ()
  INSERT INTO test.audit (ts, user, call_stack)
  VALUES (now(), user(), caller_ws(' <called by> ', 2));

CREATE TABLE test.test (
  c CHAR(50)

CREATE DEFINER='root'@'localhost' TRIGGER test_test_b_i
FOR EACH ROW CALL test.audit()

CREATE DEFINER='root'@'localhost' PROCEDURE addtest(in in_c char(50))
  INSERT INTO test.test (c) VALUES (in_c);

And, let’s try it out:

mysql> INSERT INTO test.test (c) VALUES ('jeremy') //
Query OK, 1 row affected (0.01 sec)

mysql> SELECT * FROM test.audit //
| id | ts                  | user          | call_stack                                  |
|  4 | 2006-10-01 18:07:39 | root@dhcp-100 | INSERT INTO test.test (c) VALUES ('jeremy') |
1 row in set (0.00 sec)

mysql> CALL addtest('monty') //
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM test.audit //
| id | ts                  | user          | call_stack                                                                |
|  4 | 2006-10-01 18:07:39 | root@dhcp-100 | INSERT INTO test.test (c) VALUES ('jeremy')                               |
|  5 | 2006-10-01 18:11:26 | root@dhcp-100 | INSERT INTO test.test (c) VALUES (in_c) <called by> CALL addtest('monty') |
2 rows in set (0.01 sec)

Let me know what you think!

1 Specifically mysql-5.0.25-nightly-20060823.

2 Currently, triggers are missing three important features to make them really useful for auditing: a way to find the call stack (which this patch addresses), the ability to have multiple triggers per type per table (currently MySQL only allows one trigger per table per type [before update, after insert, etc.]), and per-statement triggers (currently MySQL only has per-row triggers).

Followup: On IPs, hostnames, and MySQL

Back in April, I wrote On IPs, hostnames, and MySQL, which described the (sometimes surprising) ways in which MySQL deals with IP addresses, hostnames, and privileges, as well as some basics about the host cache itself. In a footnote to that post, I mentioned a patch I had written against MySQL 4.1 to give some more visibility into the host cache.

Over the past two days, I have worked on porting that patch to MySQL 5.01, 2, and making some fairly large improvements to it. The patch implements a few things:

  1. Configurable Size — Without the patch, the size of the cache is fixed at 128 entries, and can only be changed by changing a #define and recompiling. You may now tune the size of the host cache using SET GLOBAL host_cache_size=X.
  2. Metrics — You may now see how well the cache is performing using SHOW GLOBAL STATUS LIKE 'Host%'. (See example below.)
  3. Display — You may now see the complete contents of the cache using SHOW HOST CACHE. (See example below.)

The above new features should give MySQL DBAs much better insight into what’s happening in their MySQL server, in an area where there has historically been very little information and a lot of misunderstandings.

Now, for the examples:


mysql> show status like 'host%';
| Variable_name      | Value |
| Host_cache_free    | 127   | 
| Host_cache_hits    | 2     | 
| Host_cache_inserts | 1     | 
| Host_cache_misses  | 1     | 
| Host_cache_prunes  | 0     | 
| Host_cache_used    | 1     | 
6 rows in set (0.01 sec)


mysql> show host cache;
| IP        | Hostname | Errors | Hits |
| | hiriko   |      0 |    2 | 
1 row in set (0.00 sec)


1 Hopefully this will be accepted into MySQL 5.0. Brian?

2 The patch is against mysql-5.0.25-nightly-20060823 specifically.

On 1U cases, PCI risers, and LSI MegaRAID

I’ve been working with one Proven Scaling customer that has had some interesting issues recently, involving InnoDB corruption, resulting in messages similar to these:

InnoDB: Page checksum 3156980109, prior-to-4.0.14-form checksum 577557610
InnoDB: stored checksum 741279449, prior-to-4.0.14-form stored checksum 577557610
InnoDB: Page lsn 0 2323869442, low 4 bytes of lsn at page end 2323869442
InnoDB: Page number (if stored to page already) 195716,
InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 0
InnoDB: Page may be an index page where index id is 0 2831
InnoDB: (index PRIMARY of table db/table)
InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 195716.

The problem was encountered when testing hardware for a move from software RAID to hardware RAID using LSI MegaRAID SCSI 320-2 cards. The servers are 1U machines with Tyan motherboards, and a PCI riser card which the MegaRAID plugged into.

They were receiving the same messages on several different machines, ruling out a single bad piece of hardware. After spending weeks trying to figure out what the problem could be, testing different configurations and isolating variables, it was tracked down to the PCI riser cards. Searching for “lsi pci riser” shows quite a few people having similar issues.

It turns out that LSI “does not support” using their cards with PCI risers, at all. Maybe they should reword things a bit—if their cards don’t work with PCI risers.

The scariest part of the whole exercise, though, is that the corruption was occurring completely silently: data comes in, is written to disk, but gets corrupted in flight. Since the OS wrote certain data it is now caching the correct copy of the data, but the disks contain something different. The only way the corruption is discovered is when the page is read back quite a bit later, after having been flushed from cache.

You’d think that somewhere along the line, the OS or the RAID card would catch the corruption?