InnoDB online index add and “The table ‘t’ is full” error

While trying to add an index to a fairly large table today, on a server1 I’d not worked on previously, I got the following error after some time (and while I was away from the computer):

mysql> ALTER TABLE t
    ->   ADD KEY `index_a` (`a`),
    ->   ADD KEY `index_b` (`b`),
    ->   ADD KEY `index_c` (`c`);
ERROR 1114 (HY000): The table 't' is full

The error log did not bring particular enlightenment (as usual, InnoDB is extremely verbose with the logs, without saying anything useful):

110105 16:22:30  InnoDB: Error: Write to file (merge) failed at offset 4 1387266048.
InnoDB: 1048576 bytes should have been written, only 888832 were written.
InnoDB: Operating system error number 0.
InnoDB: Check that your OS and file system support files of this size.
InnoDB: Check also that the disk is not full or a disk quota exceeded.
InnoDB: Error number 0 means 'Success'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.1/en/operating-system-error-codes.html
110105 16:23:00 [ERROR] /usr/sbin/mysqld: The table 't' is full

I had to re-run the command while keeping a close eye on things, and I discovered that it was writing significant amounts of data to the root file system (which isn’t very big, as usual). I looked in all the usual places, and didn’t see any files of note. However, on a hunch, I checked out /proc/<pid>/fd (which can be a lifesaver). I found these:

# ls -l /proc/`pidof mysqld`/fd | grep deleted
lrwx------ 1 root root 64 Jan  5 17:33 14 -> /var/tmp/ibEnEaSj (deleted)
lrwx------ 1 root root 64 Jan  5 17:33 5 -> /var/tmp/ibEoZHQc (deleted)
lrwx------ 1 root root 64 Jan  5 17:33 6 -> /var/tmp/ibHlWZb3 (deleted)
lrwx------ 1 root root 64 Jan  5 17:33 7 -> /var/tmp/ibUtVhxT (deleted)
lrwx------ 1 root root 64 Jan  5 17:33 8 -> /var/tmp/ibt1daDR (deleted)

I can only assume it’s one of these files that’s growing. Changing the setting of tmpdir fixed things up, and it’s writing its large data files to a place with significant space now (and on a much bigger and faster RAID array, to boot). However, this brings with it a couple of questions:

  1. Why does InnoDB need significant space in tmpdir? This is a new requirement with InnoDB plugin (due to online index addition only?), but I don’t see it documented anywhere.2
  2. Why are the files deleted while in use? This makes it very painful for a DBA to manage it and see what’s using space. I know it’s a typical Unix paradigm, but cleanup-on-start and leaving the files linked is much easier to manage.
  3. Why are the error messages useless? How else is a DBA supposed to track this down?

I could also note that using online index add makes useless the only previous way of getting some sort of a status update while adding indexes: watching the temporary file grow. Perhaps it’s time to bring back my patch to show progress (MySQL Bug #26182)?

1 Running MySQL-server-percona-5.1.42-9.rhel5 with innodb_version = 1.0.6-9.

2 Perhaps it should go on Fast Index Creation in the InnoDB Storage Engine: Limitations at very least?

Are we really living in the 1700s?

I stumbled upon an interesting theory today, which I hadn’t heard of or researched before: a supposition that the Dark Ages was not dark merely because there was so little political, cultural, archeological, scientific, etc. advancements, but rather that it’s because a period of approximately 300 years (AD 614-911) of the dark ages didn’t exist at all: A theory titled the Phantom time hypothesis.

A paper Did the Early Middle Ages Really Exist? by Dr. Hans-Ulrich Niemitz, lays out the theory (much of it originated from Heribert Illig in the early 1990s). An interesting FAQ written by Jan Beaufort rounds out some the questions. A summary and a few more bits of insight are provided by Gunnar Ries and Ruth Lelarge. The topic is also covered by a DamnInteresting post and thread of comments—some comical and some insightful. Of course any such theory is not without a lot of criticism and counterpoints, as it should have, and Phil John Kneis counters a lot of things nicely. I found the story of The Seven Sleepers of Ephesus as a potential link to possible truth of this story; not that the Seven Sleepers story was at all truth, but that in that time stories (fictions) were made up to relate truths or partial truths quite frequently.

Some of the claims made, often based at least tangentially on well-known reasons for having called them the “Dark Ages” in the first place (and in many cases refuted by others) are:

  • Overall lack of reputable, accurately dated literature and historical documents from the time period.
  • Much of the otherwise supporting evidence in support of the time period depending on Carbon-14 dating, which has had adjustments made in order to line up with perceived history, and which has been calibrated based on dendrochronology (tree-ring dating) which has had known flaws.
  • Architectural development discontinuities; the seemingly complete stoppage of forward progress in architectural style for 300 years, with buildings known to be constructed after the Dark Ages (based on calibration backwards from modern times) compared to buildings known to be constructed before them (based on calibration forwards from ancient times) showing little or no difference.
  • Farming, war, and scientific knowledge making practically no advancement during the period.
  • A huge spate of forgery of official documents and religious texts otherwise before, during, and after the time period.

It’s not easy to believe one way or the other, but I personally find it not that hard to believe that three hundred years may have been accidentally or intentionally inserted into the calendar for whatever reason. Remember that calendars are man-invented tools, not scientific absolutes, and particularly the points of reference used in them are almost entirely arbitrary. It wouldn’t change our daily lives1 in any way if the actual number of years having passed since the Roman empire was closer to 1700 than 2000. We’ve made a lot of adjustments to the calendar, and even switched points of reference and entire ways of counting several times during man’s history. Even today, not everyone uses the same calendar or agrees on the calendar.2

It’s easy to misinterpret the meaning of “missing years”, and it seems like many comment authors on the various articles have made the mistake of thinking that Illig and Niemitz are suggesting that the years themselves didn’t exist, which of course is nonsense. The only claim being made is that three centuries of already-sketchy history may have just been fabricated entirely, an offset to the calendar was introduced intentionally or not, and as a result of that, we’ve been mis-numbering years for the 1100 years following that.

If true, there would be no need to correct the current calendar date, just to note in history books, elapsed-time calculations involving the past and including those years, and other places that there is a gap. There are already other gaps in the Gregorian calendar, this would just be the largest.

What do you think? Do you find it plausible? Should I be signing this post January 2, 1714? If true, I guess that gives us an extra 298 years before we the Mayan-predicted end-times in 2012, so that’s a bonus!

1 With the possible exception that some churches, and some acts of various churches, may be on slightly shakier footings, especially if it could be proven that they had been maliciously involved in the fabrication of the calendar.

2 See Wikipedia’s List of calendars and Calendar pages for a few points of study. In addition to our Gregorian calendar, at least the Hebrew and Chinese calendars are still in widespread use.

Now a Database Architect at Twitter

Just a small announcement:

Starting today, I am now “MySQL Database Architect” at Twitter, where I am joining some old friends on the small but hard-working DBA and operations teams there. I’ll be working to help debug, support, and scale the MySQL databases, of course, and who knows what else. I’m looking forward to the challenges and fast paced operations again. I’m also looking forward to writing a lot more on this blog about MySQL. I’ve had a Twitter account for a long time, but I suppose I’ll write a lot more on it now:

Since I will now be commuting from Sunnyvale to San Francisco nearly daily, I think my old “Commuting” category will get more of a work-out too.

On early MySQL development hostnames

While reading through the manual I ran across something I had totally forgotten about from the early MySQL days. Early on, Monty (or was it Jani?) decided to name many development servers variants of “bitch” in different languages. I have no idea what the back-story was, but maybe Monty or Jani can fill it in. All of these names live on all over the place, such as in the MySQL and InnoDB documentation, bug reports, and mailing list messages. See:

  • bitch.mysql.fi — English, of course.
  • hundin.mysql.fiGerman
  • hynda.mysql.fiSwedish
  • narttu.mysql.fiFinnish
  • tik.mysql.fiSwedish
  • tramp.mysql.fi — English, probably. Similar in meaning to some slang uses of “bitch”.

There are a few honorable mentions, which I’m not sure are variants of “bitch”, but very well could be:

  • donna.mysql.fi
  • mashka.mysql.fi
  • mishka.mysql.fi

It’s funny to see how those names live on in “infamy” on Google. Try searching for any of them like “+mysql +hundin“.

How to find an errant MySQL client

A common story: You’ve got some connection, either it’s busy running something it shouldn’t be, it’s in Sleep but holding some important lock, or you just don’t know why it’s connected to your database server in the first place. You see it in your SHOW PROCESSLIST like so:

mysql> show processlist G
*************************** 1. row ***************************
     Id: 5979887
   User: root
   Host: localhost:55997
     db: NULL
Command: Sleep
   Time: 475
  State: 
   Info: NULL

How do you find that client, especially if it’s on another host? MySQL is providing you all the information you need above: localhost:55997. Of course localhost is the host or IP address, and 55997 is the source port of the socket; the port number (usually randomly assigned) on the far end of the socket, from the MySQL server’s perspective. You can turn that number into something useful—the PID and user—, by running the following command on the host that made the connection:

# lsof -nPi :55997
COMMAND  PID  USER   FD   TYPE    DEVICE SIZE NODE NAME
mysqld  5026 mysql 1654u  IPv4 303329996       TCP 127.0.0.1:3306->127.0.0.1:55997 (ESTABLISHED)
mysql   9146 jcole    3u  IPv4 303329995       TCP 127.0.0.1:55997->127.0.0.1:3306 (ESTABLISHED)

(Note that here you can see both sides of the socket, since I’m running these commands on localhost. Disregard the first entry, as it’s the half of the connection owned by the MySQL server.)

You can then find out the full command-line of the process with ps:

# ps -fp 9146
UID        PID  PPID  C STIME TTY          TIME CMD
jcole     9146  8740  0 12:53 pts/3    00:00:00 mysql -h 127.0.0.1 -u root -p

And see what it’s doing with strace (waiting on a read of stdin in this dumb test):

# strace -p 9146
Process 9146 attached - interrupt to quit
read(0,  <unfinished ...>
Process 9146 detached

I hope that’s helpful! It’s a pretty common debugging trick for me.