Exploring InnoDB page management with innodb_ruby

In On learning InnoDB: A journey to the core I introduced a new library and command-line tool in the innodb_ruby project. Later on in A quick introduction to innodb_ruby I walked through installation and a few quick demos of the innodb_space command-line tool.

In my last post, Page management in InnoDB space files, I described InnoDB’s extent, file segment, and free space management structures. Now I will provide a few demonstrations of using innodb_space to examine those structures in real tables.

(Note that if you have installed innodb_ruby before, you should upgrade it to version 0.7.6+, as I’ve made a number of changes while writing these posts over the past few days.)

A minimal, empty table

I created an empty table (the schema doesn’t matter) to illustrate the “minimal” state of InnoDB’s page management structures. The space-page-type-regions mode will summarize the type of all contiguous regions of the same page type:

$ innodb_space -f test/e.ibd space-page-type-regions
start       end         count       type                
0           0           1           FSP_HDR             
1           1           1           IBUF_BITMAP         
2           2           1           INODE               
3           3           1           INDEX               
4           5           2           ALLOCATED           

The table has allocated the standard pages for IBD space files: FSP_HDR, IBUF_BITMAP, INODE, and an INDEX page for the root of the (empty) index. There are also two ALLOCATED pages which are unused.

The space-lists mode can be used to summarize the extent descriptor and inode lists in the space:

$ innodb_space -f test/e.ibd space-lists
name                length      f_page      f_offset    l_page      l_offset    
free                0           0           0           0           0           
free_frag           1           0           158         0           158         
full_frag           0           0           0           0           0           
full_inodes         0           0           0           0           0           
free_inodes         1           2           38          2           38          

Only the free_frag extent descriptor list has any entries, and only a single extent is in it. The free_inodes list has the one INODE page seen above in it.

The contents of the free_frag list can be examined with the space-list-iterate mode, which will print a graphic illustrating the usage of pages within all extents in an extent list (“#” means the page is used, “.” means the page is free):

$ innodb_space -f test/e.ibd -L free_frag space-list-iterate
start_page  page_used_bitmap                                                
0           ####............................................................

The file segments in all indexes in the space can be summarized with the space-indexes mode:

$ innodb_space -f test/e.ibd space-indexes
id          root        fseg        used        allocated   fill_factor 
20          3           internal    1           1           100.00%     
20          3           leaf        0           0           0.00%       

Only the internal file segment has any pages allocated, and it only has a single page allocated. The index-fseg-internal-lists mode will summarize the extent lists in the internal file segment:

$ innodb_space -f test/e.ibd -p 3 index-fseg-internal-lists
name                length      f_page      f_offset    l_page      l_offset    
full                0           0           0           0           0           
not_full            0           0           0           0           0           
free                0           0           0           0           0           

All three lists are empty, because this empty table has not allocated any full extents. So where did the 1 page that is used go? It’s a “fragment” page, and those can be listed with the index-fseg-internal-frag-pages mode:

$ innodb_space -f test/e.ibd -p 3 index-fseg-internal-frag-pages
page        index   level   data    free    records 
3           20      0       0       16252   0       

That’s the minimal state of things — mostly empty bookkeeping structures, and a single INDEX page. Let’s take a look at a table with some real data in it.

A table with one million rows

In A quick introduction to innodb_ruby, I created a table with 1 million rows in it. We’ll use the same table in the examples here.

There are a total of 3,072 pages, with the majority of them type INDEX as a typical table would be:

$ innodb_space -f test/t.ibd space-page-type-regions
start       end         count       type                
0           0           1           FSP_HDR             
1           1           1           IBUF_BITMAP         
2           2           1           INODE               
3           37          35          INDEX               
38          63          26          ALLOCATED           
64          2163        2100        INDEX               
2164        2175        12          ALLOCATED           
2176        2176        1           INDEX               
2177        2239        63          ALLOCATED           
2240        2240        1           INDEX               
2241        2303        63          ALLOCATED           
2304        2304        1           INDEX               
2305        2367        63          ALLOCATED           
2368        2368        1           INDEX               
2369        2431        63          ALLOCATED           
2432        2432        1           INDEX               
2433        3071        639         ALLOCATED           

Note that there are a few gaps of ALLOCATED (free) pages in between blocks of INDEX pages. InnoDB does not guarantee that it uses free pages sequentially, and many optimizations around bulk data loading will cause pages to be used out of order. (More on page splitting and these optimizations in a future post.)

Looking at the space’s lists, there are actually a few extents in free, as well as the usual one extent in free_frag:

$ innodb_space -f test/t.ibd space-lists
name                length      f_page      f_offset    l_page      l_offset    
free                3           0           1718        0           1798        
free_frag           1           0           158         0           158         
full_frag           0           0           0           0           0           
full_inodes         0           0           0           0           0           
free_inodes         1           2           38          2           38          

The pages in the free extent descriptor list are all free, as expected:

$ innodb_space -f test/t.ibd -L free space-list-iterate
start_page  page_used_bitmap                                                
2496        ................................................................
2560        ................................................................
2624        ................................................................

The free_frag extent descriptor list shows a number of “fragment” pages are used as well:

$ innodb_space -f test/t.ibd -L free_frag space-list-iterate
start_page  page_used_bitmap                                                
0           ######################################..........................

The index file segments show the bulk of the used pages are allocated to the leaf file segment, also as expected (there are only 3 non-leaf internal pages to manage the 2,137 leaf pages in the B+tree):

$ innodb_space -f test/t.ibd space-indexes
id          root        fseg        used        allocated   fill_factor 
19          3           internal    3           3           100.00%     
19          3           leaf        2137        2464        86.73%      

You can also see that the leaf index file segment has more pages allocated than it’s actually using, showing an 86.73% fill factor. This is due to InnoDB’s “segment fill factor” which is fixed at 87.5% in stock MySQL, but is now configurable in Twitter MySQL thanks to a heads up from Facebook filing MySQL Bug 64673.

Since the internal index file segment has only three pages, as expected the file segment lists are all empty:

$ innodb_space -f test/t.ibd -p 3 index-fseg-internal-lists
name                length      f_page      f_offset    l_page      l_offset    
full                0           0           0           0           0           
not_full            0           0           0           0           0           
free                0           0           0           0           0           

The three used pages are allocated as fragment pages:

$ innodb_space -f test/t.ibd -p 3 index-fseg-internal-frag-pages
page        index   level   data    free    records 
3           19      2       26      16226   2       
36          19      1       13871   2057    1067    
37          19      1       13910   2016    1070    

The leaf index file segment lists are pretty busy, with 32 full extents and 6 not full extents:

$ innodb_space -f test/t.ibd -p 3 index-fseg-leaf-lists
name                length      f_page      f_offset    l_page      l_offset    
full                32          0           198         0           1438        
not_full            6           0           1478        0           1678        
free                0           0           0           0           0           

In addition the leaf index file segment has allocated all 32 fragment pages possible (before any of the full extents above):

$ innodb_space -f test/t.ibd -p 3 index-fseg-leaf-frag-pages
page        index   level   data    free    records 
4           19      0       10362   5720    471     
5           19      0       13134   2916    597     
6           19      0       8558    7562    389     
7           19      0       8118    8002    369     
8           19      0       10494   5588    477     
9           19      0       7920    8212    360     
10          19      0       9900    6198    450     
11          19      0       4730    11448   215     
12          19      0       14630   1386    665     
13          19      0       13552   2484    616     
14          19      0       9636    6462    438     
15          19      0       11000   5082    500     
16          19      0       11902   4162    541     
17          19      0       9108    7006    414     
18          19      0       8228    7894    374     
19          19      0       1980    14242   90      
20          19      0       11418   4656    519     
21          19      0       8668    7452    394     
22          19      0       13134   2906    597     
23          19      0       8426    7694    383     
24          19      0       8734    7378    397     
25          19      0       11374   4698    517     
26          19      0       9614    6486    437     
27          19      0       12078   3986    549     
28          19      0       8976    7132    408     
29          19      0       10714   5370    487     
30          19      0       11396   4676    518     
31          19      0       9526    6576    433     
32          19      0       8338    7778    379     
33          19      0       11770   4294    535     
34          19      0       10494   5594    477     
35          19      0       8800    7312    400     

The full extents, as expected, are all full:

$ innodb_space -f test/t.ibd -p 3 -L full index-fseg-leaf-list-iterate
start_page  page_used_bitmap                                                
64          ################################################################
128         ################################################################
192         ################################################################
256         ################################################################
320         ################################################################
384         ################################################################
448         ################################################################
512         ################################################################
576         ################################################################
640         ################################################################
704         ################################################################
768         ################################################################
832         ################################################################
896         ################################################################
960         ################################################################
1024        ################################################################
1088        ################################################################
1152        ################################################################
1216        ################################################################
1280        ################################################################
1344        ################################################################
1408        ################################################################
1472        ################################################################
1536        ################################################################
1600        ################################################################
1664        ################################################################
1728        ################################################################
1792        ################################################################
1856        ################################################################
1920        ################################################################
1984        ################################################################
2048        ################################################################

The not_full extents are all partially filled, as expected:

$ innodb_space -f test/t.ibd -p 3 -L not_full index-fseg-leaf-list-iterate
start_page  page_used_bitmap                                                
2112        ####################################################............
2176        #...............................................................
2240        #...............................................................
2304        #...............................................................
2368        #...............................................................
2432        #...............................................................

You can see the artifacts of InnoDB’s page split optimizations here: It has taken the first page out of an extent several times (due to page number “hinting” which is dubious at best) in an effort to lay out data in sequential order on disk. A deeper examination of this behavior will come in the future.

What do you think?

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s