Saturday, 25 August 2012

Speeding Optimize for InnoDB tables


Speeding up the OPTIMIZE table for InnoDB tables :

Recently we learnt about an incident with one of our client, where we tried to rebuild a huge InnoDB table (180G) to reclaim unused space using command "ALTER TABLE ... ENGINE = INNODB ;", which took forever to rebuild even after turning "fast_index_creation" server variable ON. Let’s dig into details as to why it took such a long time.

Purpose of fast_index_creation :- If a table is altered using ALTER TABLE ... ENGINE=INNODB; then this option will create a temporary table with only clustered index in it, then reload the data from original table and then create secondary indexes on top of it. However there is a serious limitation with this server variable, this works only for secondary indexes created with "ALTER TABLE ADD INDEX ..." and "CREATE INDEX indexname ...".

Percona has overcome this limitation by creating a new server variable known as "expand_fast_index_creation". This option is set to OFF by default.

Please see the below test results for performance improvement on using "expand_fast_index_creation" option over plain "OPTIMIZE TABLE" command. The innodb_buffer_pool_size is set to 256M.

mysql> set expand_fast_index_creation=OFF;
Query OK, 0 rows affected (0.00 sec)

mysql> set profiling=ON;
Query OK, 0 rows affected (0.00 sec)

mysql> optimize table employees;
+---------------------+----------+----------+-------------------------------------------------------------------+
| Table               | Op       | Msg_type | Msg_text                                                          |
+---------------------+----------+----------+-------------------------------------------------------------------+
| employees.employees | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| employees.employees | optimize | status   | OK                                                                |
+---------------------+----------+----------+-------------------------------------------------------------------+
2 rows in set (32.42 sec)

mysql> show profile;
+------------------------------+-----------+
| Status                       | Duration  |
+------------------------------+-----------+
| starting                     |  0.000418 |
| checking permissions         |  0.000023 |
| Opening tables               |  0.000034 |
| System lock                  |  0.000016 |
| Waiting for query cache lock |  0.000041 |
| init                         |  0.000008 |
| Opening tables               |  0.000124 |
| System lock                  |  0.000007 |
| setup                        |  0.000043 |
| creating table               |  0.396054 |
| After create                 |  0.000090 |
| copy to tmp table            | 31.557635 |
| rename result table          |  0.373926 |
| end                          |  0.000009 |
| Waiting for query cache lock |  0.000002 |
| end                          |  0.000017 |
| Opening table                |  0.000121 |
| System lock                  |  0.097834 |
| query end                    |  0.000007 |
| closing tables               |  0.000003 |
| freeing items                |  0.000035 |
| cleaning up                  |  0.000006 |
+------------------------------+-----------+
22 rows in set (0.00 sec)

mysql> set profiling=OFF;
Query OK, 0 rows affected (0.00 sec)

As you can notice from the PROFILE that the major part of time was spent in copying the data to tmp table. This involves inserting individual records into the tmp file and the corresponding indexes, after this the index will be sorted.

mysql> set expand_fast_index_creation=ON;     #### Server variable is turned ON ####
Query OK, 0 rows affected (0.00 sec)

mysql> show variables like '%expand%';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| expand_fast_index_creation | ON    |
+----------------------------+-------+
1 row in set (0.00 sec)

mysql> set profiling=ON;
Query OK, 0 rows affected (0.00 sec)

mysql> optimize table employees;
+---------------------+----------+----------+-------------------------------------------------------------------+
| Table               | Op       | Msg_type | Msg_text                                                          |
+---------------------+----------+----------+-------------------------------------------------------------------+
| employees.employees | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| employees.employees | optimize | status   | OK                                                                |
+---------------------+----------+----------+-------------------------------------------------------------------+
2 rows in set (15.31 sec)

mysql> show profile;
+------------------------------+----------+
| Status                       | Duration |
+------------------------------+----------+
| starting                     | 0.000055 |
| checking permissions         | 0.000021 |
| Opening tables               | 0.000031 |
| System lock                  | 0.000014 |
| Waiting for query cache lock | 0.000051 |
| init                         | 0.000019 |
| Opening tables               | 0.000141 |
| System lock                  | 0.000010 |
| setup                        | 0.000042 |
| creating table               | 0.334066 |
| After create                 | 0.000093 |
| copy to tmp table            | 9.129202 |
| restoring secondary keys     | 5.676336 |
| rename result table          | 0.164292 |
| end                          | 0.000011 |
| Waiting for query cache lock | 0.000003 |
| end                          | 0.000021 |
| Opening table                | 0.000146 |
| System lock                  | 0.003445 |
| query end                    | 0.000006 |
| closing tables               | 0.000003 |
| freeing items                | 0.000030 |
| cleaning up                  | 0.000005 |
+------------------------------+----------+
23 rows in set (0.00 sec)

In the above results we can see the records are first copied to a tmp table consisting only of clustered index and then secondary indexes are applied (Restoring secondary keys).

However there are some caveats at using this option, one of which is the FOREIGN KEY constraint. If the table has a foreign key constraint, then percona server will consider "expand_fast_index_creation=OFF", since this constraint cannot be disabled. (This is reason we did not notice any performance improvement while rebuilding the table for one of our client.)

Please see the results below, after a foreign key constraint was added to the table.

mysql> set expand_fast_index_creation=ON;
Query OK, 0 rows affected (0.00 sec)

mysql> set profiling=ON;
Query OK, 0 rows affected (0.00 sec)

mysql> optimize table employees;
+---------------------+----------+----------+-------------------------------------------------------------------+
| Table               | Op       | Msg_type | Msg_text                                                          |
+---------------------+----------+----------+-------------------------------------------------------------------+
| employees.employees | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| employees.employees | optimize | status   | OK                                                                |
+---------------------+----------+----------+-------------------------------------------------------------------+
2 rows in set (33.86 sec)

mysql> show profile;
+------------------------------+-----------+
| Status                       | Duration  |
+------------------------------+-----------+
| starting                     |  0.000050 |
| checking permissions         |  0.000022 |
| Opening tables               |  0.000031 |
| System lock                  |  0.000014 |
| Waiting for query cache lock |  0.000071 |
| init                         |  0.000009 |
| Opening tables               |  0.000121 |
| System lock                  |  0.000006 |
| setup                        |  0.000049 |
| creating table               |  0.353180 |
| After create                 |  0.000101 |
| copy to tmp table            | 33.049338 |
| rename result table          |  0.460403 |
| end                          |  0.000010 |
| Waiting for query cache lock |  0.000002 |
| end                          |  0.000019 |
| Opening table                |  0.000162 |
| System lock                  |  0.004451 |
| query end                    |  0.000004 |
| closing tables               |  0.000002 |
| freeing items                |  0.000021 |
| cleaning up                  |  0.000003 |
+------------------------------+-----------+
22 rows in set (0.00 sec)

As seen above there are no performance improvements while the table had foreign key constraint. Disabling through "foreign_key_checks" didnt work either. Same case is seen in terms of "UNIQUE" keys.