MySQL Enterprise Monitor Learns PERFORMANCE_SCHEMA

Leading up to my previous post, I had been doing some work to start integrating PERFORMANCE_SCHEMA data with MySQL Enterprise Monitor, including some new graphs based on the data that I talked about in that post.

A picture tells a thousand words:

[Image: io_graphs]

This is only scratching the surface – more to come, watch this space! 🙂

Tracking IO with PERFORMANCE_SCHEMA

Mark Callaghan over at Facebook wrote a note recently about InnoDB disk IO counters in SHOW STATUS, with some extra things that he wanted to track. I posted a quick comment over there, but I thought this deserved its own write-up.

MySQL 5.5’s PERFORMANCE_SCHEMA has had a fair bit written about it in terms of tracking synchronization point contention (mutexes etc.), but it currently tracks two orders within the wait class – these are /wait/synch and /wait/io.

Actually, allow me to take a short detour first, since this isn't clear from the documentation, though it is clear in the worklog. Each event within PERFORMANCE_SCHEMA is a specific instrumentation point, tracked in some way, such as recording the time taken to complete the event (a specific code section). The naming convention for each event follows a Linnaean taxonomy: /Class/Order/Family/Genus/Species. We currently have one major class – /wait.

The /wait/synch order is already fleshed out fully, including the /wait/synch/cond (condition signalling), /wait/synch/mutex (mutex locks) and /wait/synch/rwlock (read/write locks) families. Detour done.

The /wait/io order has one family within it currently – /wait/io/file, which instruments all file IO for the SQL layer (the /wait/io/file/sql genus), tracking binary logs, frm file access, info files, etc., as well as for all table/index file IO done by instrumented storage engines (such as the /wait/io/file/innodb / /wait/io/file/myisam genera).

Mark shows in his blog how you can already get some pretty good statistics on what is taking up IO on the server, so let’s look at how we can fix some of the deficiencies mentioned, such as:

“Innodb_data_writes describes the number of writes done but all writes are not created equal. Writes to the log file are small and use buffered IO. These are much less expensive that writes done to update InnoDB pages in place. But Innodb_data_writes does not distinguish between them.”

“Innodb_pages_written describes the number of pages written but not the number of IO requests used to write them as write requests for adjacent pages can be merged and some of the code that schedules pages to be written attempts to find adjacent dirty pages to be written at the same time.”

This is actually really easy to track with PERFORMANCE_SCHEMA now that InnoDB has been fully instrumented. InnoDB defines three individual species within its own genus:

/wait/io/file/innodb/innodb_data_file – file IO for central or single tablespace data files
/wait/io/file/innodb/innodb_log_file – redo log file IO
/wait/io/file/innodb/innodb_temp_file – IO against a sort merge file that is used when adding indexes

You can track how much time has been spent on these events with the EVENTS_WAITS_SUMMARY_* tables, which I won't talk about here; see my earlier post for examples.
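For instance, a quick sketch (assuming a 5.5 server with PERFORMANCE_SCHEMA enabled, and the post-5.5.6 summary table name) that pulls the aggregate wait times for just the InnoDB file IO events:

```sql
-- Sketch: total count and time waited per InnoDB file IO event.
-- PERFORMANCE_SCHEMA timers are in picoseconds, so divide by
-- 1000000000 to get milliseconds.
SELECT EVENT_NAME,
       COUNT_STAR,
       SUM_TIMER_WAIT/1000000000 AS WAIT_MS
  FROM performance_schema.EVENTS_WAITS_SUMMARY_GLOBAL_BY_EVENT_NAME
 WHERE EVENT_NAME LIKE 'wait/io/file/innodb/%'
 ORDER BY SUM_TIMER_WAIT DESC;
```

This needs a live server to run against, of course, so treat it as a starting point rather than a copy/paste answer.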

However, you can also track IO use in terms of bytes, using the FILE_SUMMARY_* tables. There are currently two of these:

FILE_SUMMARY_BY_EVENT_NAME – summarizes file IO events per instrumentation point
FILE_SUMMARY_BY_INSTANCE – summarizes file IO events per file instance

Mark's problem focuses on the high level – looking at IO as single counters for what are almost specific species of IO – which relates to FILE_SUMMARY_BY_EVENT_NAME. An example of the data that you can get (scroll to the right):

SELECT EVENT_NAME, 
       COUNT_READ, 
       IFNULL(ROUND(SUM_NUMBER_OF_BYTES_READ/1024/1024/1024, 2), 0.00) AS READ_GB,
       IFNULL(ROUND((SUM_NUMBER_OF_BYTES_READ/COUNT_READ)/1024, 2), 0.00) AS AVG_READ_KB,
       COUNT_WRITE, 
       IFNULL(ROUND(SUM_NUMBER_OF_BYTES_WRITE/1024/1024/1024, 2), 0.00) AS WRITE_GB,
       IFNULL(ROUND((SUM_NUMBER_OF_BYTES_WRITE/COUNT_WRITE)/1024, 2), 0.00) AS AVG_WRITE_KB,
       IFNULL(ROUND((SUM_NUMBER_OF_BYTES_READ + SUM_NUMBER_OF_BYTES_WRITE)/1024/1024/1024, 2), 0.00) AS TOTAL_GB, 
       IFNULL(100-((SUM_NUMBER_OF_BYTES_READ/(SUM_NUMBER_OF_BYTES_READ+SUM_NUMBER_OF_BYTES_WRITE))*100), 0.00) AS WRITE_PCT 
  FROM FILE_SUMMARY_BY_EVENT_NAME
 WHERE EVENT_NAME IN ('wait/io/file/innodb/innodb_log_file', 'wait/io/file/innodb/innodb_data_file');

+--------------------------------------+------------+---------+-------------+-------------+----------+--------------+----------+-----------+
| EVENT_NAME                           | COUNT_READ | READ_GB | AVG_READ_KB | COUNT_WRITE | WRITE_GB | AVG_WRITE_KB | TOTAL_GB | WRITE_PCT |
+--------------------------------------+------------+---------+-------------+-------------+----------+--------------+----------+-----------+
| wait/io/file/innodb/innodb_data_file |     359183 |    3.82 |       11.15 |    39642666 |  1438.62 |        38.05 |  1442.44 |   99.7353 |
| wait/io/file/innodb/innodb_log_file  |         12 |    0.00 |       11.33 |    23337913 |    21.43 |         0.96 |    21.43 |   99.9994 |
+--------------------------------------+------------+---------+-------------+-------------+----------+--------------+----------+-----------+
2 rows in set (0.00 sec)

But wait, it doesn’t end there. Mark also wants to be able to break out writes to things like the doublewrite buffer. This is hard to do when tracking data file IO alone as single counters as above, because all writes to the doublewrite buffer also go to the central tablespace’s data file.

Of course, you can SELECT the value for Innodb_dblwr_writes as Mark does and subtract that from the above counters if you want to, but I have a slightly different example using the FILE_SUMMARY_BY_INSTANCE table.

When using innodb_file_per_table (and if you are not, why not? ;)), you also get the added benefit that any data file IO for tables is tracked individually. Here's an example of finding the top 10 consumers:

SELECT SUBSTRING_INDEX(FILE_NAME, '/', -2) AS DATA_FILE, 
       COUNT_READ, 
       IFNULL(ROUND(SUM_NUMBER_OF_BYTES_READ/1024/1024/1024, 2), 0.00) AS READ_GB,
       IFNULL(ROUND((SUM_NUMBER_OF_BYTES_READ/COUNT_READ)/1024, 2), 0.00) AS AVG_READ_KB,
       COUNT_WRITE, 
       IFNULL(ROUND(SUM_NUMBER_OF_BYTES_WRITE/1024/1024/1024, 2), 0.00) AS WRITE_GB,
       IFNULL(ROUND((SUM_NUMBER_OF_BYTES_WRITE/COUNT_WRITE)/1024, 2), 0.00) AS AVG_WRITE_KB,
       IFNULL(ROUND((SUM_NUMBER_OF_BYTES_READ + SUM_NUMBER_OF_BYTES_WRITE)/1024/1024/1024, 2), 0.00) AS TOTAL_GB, 
       IFNULL(100-((SUM_NUMBER_OF_BYTES_READ/(SUM_NUMBER_OF_BYTES_READ+SUM_NUMBER_OF_BYTES_WRITE))*100), 0.00) AS WRITE_PCT 
  FROM FILE_SUMMARY_BY_INSTANCE 
 ORDER BY TOTAL_GB DESC LIMIT 10;

+---------------------------------------+------------+---------+-------------+-------------+----------+--------------+----------+-----------+
| DATA_FILE                             | COUNT_READ | READ_GB | AVG_READ_KB | COUNT_WRITE | WRITE_GB | AVG_WRITE_KB | TOTAL_GB | WRITE_PCT |
+---------------------------------------+------------+---------+-------------+-------------+----------+--------------+----------+-----------+
| data/ibdata1                          |       3201 |    0.05 |       17.26 |    22978247 |  1265.68 |        57.76 |  1265.74 |   99.9958 |
| mem/inventory_instance_attributes.ibd |       6015 |    0.09 |       16.00 |     2104313 |    32.11 |        16.00 |    32.20 |   99.7150 |
| mem/statement_examples.ibd            |      53171 |    0.81 |       16.00 |     1831051 |    29.12 |        16.68 |    29.94 |   97.2897 |
| mem/loghistogram_data.ibd             |      33687 |    0.51 |       16.00 |     1333759 |    20.51 |        16.13 |    21.03 |   97.5556 |
| data/ib_logfile0                      |         12 |    0.00 |       11.33 |    11735008 |    10.75 |         0.96 |    10.75 |   99.9988 |
| data/ib_logfile1                      |          0 |    0.00 |        0.00 |    11645624 |    10.71 |         0.96 |    10.71 |  100.0000 |
| mem/statement_summary_data.ibd        |      26205 |    0.40 |       16.00 |      547254 |     8.47 |        16.22 |     8.87 |   95.4898 |
| mem/dc_p_string#P#p4.ibd              |         61 |    0.00 |        8.13 |      462098 |     3.53 |         8.00 |     3.53 |   99.9866 |
| mem/dc_p_string#P#p5.ibd              |         59 |    0.00 |        8.14 |      457733 |     3.49 |         8.00 |     3.49 |   99.9869 |
| mem/dc_p_string#P#p6.ibd              |         46 |    0.00 |        8.00 |      452797 |     3.45 |         8.00 |     3.45 |   99.9898 |
+---------------------------------------+------------+---------+-------------+-------------+----------+--------------+----------+-----------+
10 rows in set (0.00 sec)

The database instance the above comes from is only using innodb_file_per_table tables, so why all the writes to the central tablespace? Two major reasons: the doublewrite buffer and the undo log.

Also note that we can get down to the partition level here as well, so we can find hot partitions too.
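As a sketch of that (assuming partitioned InnoDB tables with innodb_file_per_table, so each partition has its own .ibd file carrying the #P# naming convention, as in the output above):

```sql
-- Sketch: find the most write-heavy partitions by filtering on the
-- '#P#' marker that partitioned .ibd file names carry.
SELECT SUBSTRING_INDEX(FILE_NAME, '/', -2) AS DATA_FILE,
       COUNT_WRITE,
       IFNULL(ROUND(SUM_NUMBER_OF_BYTES_WRITE/1024/1024/1024, 2), 0.00) AS WRITE_GB
  FROM FILE_SUMMARY_BY_INSTANCE
 WHERE FILE_NAME LIKE '%#P#%'
 ORDER BY SUM_NUMBER_OF_BYTES_WRITE DESC LIMIT 10;
```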

Subtracting the doublewrite overhead from the central tablespace data file IO should also let you drill into undo log IO writes (perhaps with some extra fuzz factor for other writes to the central tablespace that happen).
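A rough sketch of that subtraction in SQL, with two loudly flagged assumptions: the doublewrite volume is estimated from the Innodb_dblwr_pages_written status counter, and the page size is assumed to be the default 16KB (16384 bytes):

```sql
-- Sketch: estimate bytes written to the central tablespace excluding
-- the doublewrite buffer, by subtracting pages_written * 16KB
-- (ASSUMED default page size) from the instrumented data file total.
SELECT f.SUM_NUMBER_OF_BYTES_WRITE
       - (s.VARIABLE_VALUE * 16384) AS NON_DBLWR_BYTES_WRITTEN
  FROM performance_schema.FILE_SUMMARY_BY_EVENT_NAME f
  JOIN INFORMATION_SCHEMA.GLOBAL_STATUS s
 WHERE f.EVENT_NAME = 'wait/io/file/innodb/innodb_data_file'
   AND s.VARIABLE_NAME = 'Innodb_dblwr_pages_written';
```

The fuzz factor mentioned above still applies – this is an estimate, not an exact accounting.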

5 years and counting

Yesterday marked my 5th anniversary since I signed on the dotted line and started working with MySQL AB! I celebrated yesterday with a BBQ with the family, and a few vodka shots for old times' sake. 🙂 Interestingly, it was also the 5 year anniversary of the first 5.0 RC being announced.

Since that time I’ve transferred through to Sun, and now find my home to be Oracle (funnily, the user community that I left to join MySQL).

I’ve met a huge number of fantastically talented people, made a lot of new friends, seen a lot of new colleagues come on board as we expanded so quickly, and sadly seen a number of those colleagues leave to go onwards (and pretty much in all cases, upwards) within the IT world. Happily, a large number remain, and continue to amaze me with the work they have been doing, and focus that they have.

I’ve also gotten to know a number of users and customers pretty well, and have to say they are among the most passionate IT community that I’ve ever come across. Your will for MySQL to keep improving is amazing, and remains my inspiration.

It’s also taken me to new places, I’ve gotten to see New York, Las Vegas, San Francisco, Santa Cruz, Washington DC, Santa Clara, Cupertino, Austin, Montreal, Amsterdam, Frankfurt, Heidelberg, Hamburg, and Riga (the less said the better on that one).

For my part, I’m still as jazzed as ever to be working with the MySQL group within Oracle. We’re taking great strides at the moment, with performance, instrumentation, usability, and high availability within MySQL 5.5, and the roadmap ahead has a very bright future for both the Server and the tools that I get to spend my day to day life on (MEM and MySQL Proxy).

And so I raise my coffee cup (hey, I’m at work now), to the next 5 years, and to all the great people that have made my working life so enjoyable!

Tracking mutex locks in a process list, MySQL 5.5’s PERFORMANCE_SCHEMA

Internally MySQL uses various methods to manage (or, block) concurrent access to shared structures within the server – the most common being mutexes (or Mutual Exclusion locks), RW Locks, or Semaphores. Each has slightly different properties on how different threads are allowed to interact when trying to access these synchronization points.

There has been much written on the various hot spots in the server around these locking/blocking constructs over the past couple of years, so I won’t go further in to that here. See the excellent blogs from Mikael or Dimitri from within Oracle, or those from the likes of Mark or Domas over at Facebook, for examples.

Visibility into what was happening in the server at this level has been very lacking however, with only SHOW MUTEX STATUS available in the standard server distribution – and that only tracked InnoDB mutexes, not those at the server level, which could often be the blocking factors for scalability.

I’ve blogged about PERFORMANCE_SCHEMA before, but never really shown what you can now do whilst trying to track down what sessions are doing, and what they are currently waiting for.

I also saw a hack recently that put the fact that a connection was trying to grab a Query Cache mutex into the thread state ("Waiting on query cache mutex"), to track whether the query cache was causing too much contention. I thought that was entirely the wrong way to track this kind of detail (it misuses the thread state, which should be stages of execution, not individual synchronization points, and doesn't give you any meaningful way to track the entire overhead). Hopefully I can now show why I thought that.

So with that, let's dive into an example of how it should really be done (you will probably have to scroll this to the right):

SELECT NAME, 
       IF(PPS.ID = 0, 'Internal Thread', CONCAT(IPS.USER, '@', IPS.HOST)) USER,
       DB,
       COMMAND,
       STATE,
       TIME,
       EVENT_NAME LAST_WAIT,
       IF(TIMER_WAIT IS NULL , 'Still Waiting', TIMER_WAIT/1000000000) LAST_WAIT_MS
  FROM performance_schema.EVENTS_WAITS_CURRENT 
  JOIN performance_schema.THREADS PPS USING (THREAD_ID) 
  LEFT JOIN INFORMATION_SCHEMA.PROCESSLIST IPS USING (ID);

+----------------------------------------+--------------------+--------------------+---------+--------------+------+--------------------------------------------------+---------------+
| NAME                                   | USER               | DB                 | COMMAND | STATE        | TIME | LAST_WAIT                                        | LAST_WAIT_MS  |
+----------------------------------------+--------------------+--------------------+---------+--------------+------+--------------------------------------------------+---------------+
| thread/sql/main                        | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/synch/mutex/sql/LOCK_thread_count           | 0.0001        |
| thread/innodb/io_handler_thread        | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/synch/mutex/innodb/buf_pool_mutex           | 0.0001        |
| thread/innodb/io_handler_thread        | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/synch/mutex/innodb/buf_pool_mutex           | 0.0001        |
| thread/innodb/io_handler_thread        | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/synch/mutex/innodb/buf_pool_mutex           | 0.0001        |
| thread/innodb/io_handler_thread        | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/synch/mutex/innodb/buf_pool_mutex           | 0.0001        |
| thread/innodb/io_handler_thread        | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/synch/mutex/innodb/buf_pool_mutex           | 0.0001        |
| thread/innodb/io_handler_thread        | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/synch/mutex/innodb/buf_pool_mutex           | 0.0002        |
| thread/innodb/io_handler_thread        | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/io/file/innodb/innodb_log_file              | Still Waiting |
| thread/innodb/io_handler_thread        | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/synch/mutex/innodb/buf_pool_mutex           | 0.0001        |
| thread/innodb/io_handler_thread        | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/synch/mutex/innodb/buf_pool_mutex           | 0.0004        |
| thread/innodb/srv_error_monitor_thread | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/synch/mutex/innodb/srv_innodb_monitor_mutex | 0.0002        |
| thread/innodb/srv_master_thread        | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/synch/mutex/innodb/fil_system_mutex         | 0.0028        |
| thread/innodb/srv_lock_timeout_thread  | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/synch/mutex/innodb/kernel_mutex             | 0.0024        |
| thread/sql/one_connection              | msandbox@localhost | performance_schema | Query   | executing    |    0 | wait/io/file/myisam/dfile                        | 0.0053        |
| thread/sql/signal_handler              | Internal Thread    | NULL               | NULL    | NULL         | NULL | wait/synch/mutex/sql/LOCK_thread_count           | 0.0030        |
| thread/sql/one_connection              | msandbox@localhost | test               | Query   | Sending data |  133 | wait/synch/rwlock/innodb/checkpoint_lock         | Still Waiting |
+----------------------------------------+--------------------+--------------------+---------+--------------+------+--------------------------------------------------+---------------+
16 rows in set (0.78 sec)

The first thing to notice is that not only can we see what external connections are doing – we can now see what internal threads are doing as well. All threads are easily identified by the PERFORMANCE_SCHEMA.THREADS.NAME column, with all client connections having the same name of thread/sql/one_connection (the internal function to manage database connections is handle_one_connection()), and each internal thread within the server and storage engines is tagged with its own identifier, such as thread/sql/main, the main MySQL Server thread, or thread/innodb/srv_master_thread, the main InnoDB storage engine thread.

So we show the name of each thread, and then figure out the user@host if the thread is an external client connection. We then show all of the normal output that you would expect of SHOW PROCESSLIST, with DB, COMMAND, STATE and TIME (I omit the INFO column, but you could add it if you would like to see the executing statement as well).

The last two columns come from the new EVENTS_WAITS_CURRENT table. As you can see, we can now show exactly what the current, or last completed, wait for each session was, and for exactly how long they waited (down to picosecond precision).

In my example above, the last thread listed is waiting for checkpointing to complete within InnoDB, whilst an internal thread is waiting on file IO against the InnoDB transaction logs (i.e., that's the thread working on the checkpoint) – oops, I was running a huge INSERT … SELECT on an instance with default settings for innodb_log_file_size. 🙂

If you start to see lots and lots of connections waiting on wait/synch/rwlock/innodb/checkpoint_lock like above, then you too know that you should probably be doing something about your log file sizes (or maybe check whether something else is causing too aggressive checkpointing, such as a very low innodb_max_dirty_pages_pct value).

Keep in mind this can only give you an indication that there are contention points that can be dealt with, as these statistics are very transient in nature.
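Given that transience, one quick sketch you could poll repeatedly (relying on the convention from the query above, where a NULL TIMER_WAIT in EVENTS_WAITS_CURRENT means the wait is still in progress):

```sql
-- Sketch: count threads currently blocked on the InnoDB checkpoint
-- lock right now; run it a few times, since these waits are very
-- transient and a single sample can easily mislead.
SELECT COUNT(*) AS WAITERS
  FROM performance_schema.EVENTS_WAITS_CURRENT
 WHERE EVENT_NAME = 'wait/synch/rwlock/innodb/checkpoint_lock'
   AND TIMER_WAIT IS NULL;
```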

Here’s an example of tracking down exactly what the contention points in your server are:

SELECT EVENT_NAME,
       SUM_TIMER_WAIT/1000000000 WAIT_MS,
       COUNT_STAR
  FROM performance_schema.EVENTS_WAITS_SUMMARY_GLOBAL_BY_EVENT_NAME 
 ORDER BY SUM_TIMER_WAIT DESC, COUNT_STAR DESC LIMIT 30;

+---------------------------------------------------------+-------------+------------+
| EVENT_NAME                                              | WAIT_MS     | COUNT_STAR |
+---------------------------------------------------------+-------------+------------+
| wait/io/file/innodb/innodb_data_file                    | 150562.0345 |     182750 |
| wait/io/file/innodb/innodb_log_file                     |  77795.5425 |       8913 |
| wait/synch/rwlock/innodb/checkpoint_lock                |  11636.7894 |       1284 |
| wait/synch/rwlock/innodb/btr_search_latch               |   7429.2204 |   39677465 |
| wait/io/file/myisam/kfile                               |   7353.2737 |       2011 |
| wait/synch/mutex/innodb/kernel_mutex                    |   4115.0534 |   59198510 |
| wait/synch/mutex/innodb/fil_system_mutex                |   3473.8341 |    1263809 |
| wait/io/file/myisam/dfile                               |   2940.5701 |        641 |
| wait/synch/mutex/innodb/buf_pool_mutex                  |   2650.8666 |   33982979 |
| wait/synch/mutex/innodb/rw_lock_mutex                   |   1261.8025 |   18877546 |
| wait/io/file/sql/FRM                                    |    116.6419 |        863 |
| wait/synch/cond/sql/COND_thread_count                   |      1.4073 |          1 |
| wait/io/file/sql/pid                                    |      1.2654 |          3 |
| wait/synch/mutex/innodb/mutex_list_mutex                |      0.7675 |      16727 |
| wait/synch/mutex/innodb/rw_lock_list_mutex              |      0.4023 |       8331 |
| wait/io/file/sql/dbopt                                  |      0.2745 |         12 |
| wait/io/file/sql/casetest                               |      0.2041 |          5 |
| wait/synch/mutex/innodb/thr_local_mutex                 |      0.2009 |       2050 |
| wait/synch/mutex/mysys/THR_LOCK_open                    |      0.1993 |        989 |
| wait/synch/rwlock/innodb/trx_purge_latch                |      0.1436 |        255 |
| wait/io/file/sql/ERRMSG                                 |      0.1432 |          5 |
| wait/synch/rwlock/sql/LOCK_grant                        |      0.1375 |        188 |
| wait/synch/rwlock/sql/MDL_lock::rwlock                  |      0.1013 |        481 |
| wait/synch/mutex/sql/Query_cache::structure_guard_mutex |      0.0923 |        628 |
| wait/synch/mutex/mysys/THR_LOCK_myisam                  |      0.0781 |        283 |
| wait/synch/rwlock/sql/Query_cache_query::lock           |      0.0676 |        198 |
| wait/io/file/mysys/charset                              |      0.0561 |          3 |
| wait/synch/mutex/sql/LOCK_open                          |      0.0529 |        543 |
| wait/synch/mutex/mysys/THR_LOCK_lock                    |      0.0504 |        244 |
| wait/synch/mutex/innodb/srv_innodb_monitor_mutex        |      0.0424 |         75 |
+---------------------------------------------------------+-------------+------------+
30 rows in set (0.00 sec)

All of the above is aggregated server wide, since server start. Of course IO is my largest issue – I was doing write heavy work.

But then you can see exactly what I was waiting on after that – the first culprit was as above, way too much checkpointing. Next is waiting for wait/synch/rwlock/innodb/btr_search_latch, the synchronization point for the InnoDB Adaptive Hash Index. I might also do well to disable it with innodb_adaptive_hash_index = 0 in my case.

I can also tell immediately that the query cache is not a contention point here, for example – just look for high values against wait/synch/mutex/sql/Query_cache::structure_guard_mutex for that.
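If you want to check just the query cache synchronization points directly, a sketch against the same summary table (post-5.5.6 table name, per the update note below):

```sql
-- Sketch: total wait time and count for the query cache mutex and
-- rwlock events; consistently high WAIT_MS here would point to
-- query cache contention.
SELECT EVENT_NAME,
       SUM_TIMER_WAIT/1000000000 AS WAIT_MS,
       COUNT_STAR
  FROM performance_schema.EVENTS_WAITS_SUMMARY_GLOBAL_BY_EVENT_NAME
 WHERE EVENT_NAME LIKE 'wait/synch/%/sql/Query_cache%'
 ORDER BY SUM_TIMER_WAIT DESC;
```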

5.5 is coming people, get downloading and trying! We want your feedback! 🙂

UPDATE: Note that in 5.5.6 and above, you should actually use performance_schema.THREADS instead of PROCESSLIST, and performance_schema.EVENTS_WAITS_SUMMARY_GLOBAL_BY_EVENT_NAME instead of EVENTS_WAITS_SUMMARY_BY_EVENT_NAME, these were renamed within Bug#55416.

Improving InnoDB Transaction Reporting

Everybody knows that parsing the output of SHOW ENGINE INNODB STATUS is hard, especially when you want to track the information historically, or want to aggregate any of the more dynamic sections such as the TRANSACTIONS one.

Within the InnoDB plugin the INFORMATION_SCHEMA.INNODB_TRX table was added, which allowed you to at least get some of the information on each transaction, but not the full breadth of information that SHOW ENGINE INNODB STATUS provided.

"This is nice," I thought, "but why not go the whole hog?" And so I set about doing that, and opened up Bug#53336. In a very short time, I was in a review process with the InnoDB team, the patch was cleaned up, and (after a little hiccup) everything has been pushed for the next milestone.

Here’s the docs notes that I wrote for it today (with a little more annotation):

This patch adds the following columns:

mysql> desc innodb_trx;
+----------------------------+---------------------+------+-----+---------------------+-------+
| Field                      | Type                | Null | Key | Default             | Extra |
+----------------------------+---------------------+------+-----+---------------------+-------+
...
| trx_operation_state        | varchar(64)         | YES  |     | NULL                |       |
| trx_tables_in_use          | bigint(21) unsigned | NO   |     | 0                   |       |
| trx_tables_locked          | bigint(21) unsigned | NO   |     | 0                   |       |
| trx_lock_structs           | bigint(21) unsigned | NO   |     | 0                   |       |
| trx_lock_memory_bytes      | bigint(21) unsigned | NO   |     | 0                   |       |
| trx_rows_locked            | bigint(21) unsigned | NO   |     | 0                   |       |
| trx_rows_modified          | bigint(21) unsigned | NO   |     | 0                   |       |
| trx_concurrency_tickets    | bigint(21) unsigned | NO   |     | 0                   |       |
| trx_isolation_level        | varchar(16)         | NO   |     |                     |       |
| trx_unique_checks          | int(1)              | NO   |     | 0                   |       |
| trx_foreign_key_checks     | int(1)              | NO   |     | 0                   |       |
| trx_last_foreign_key_error | varchar(256)        | YES  |     | NULL                |       |
| trx_adaptive_hash_latched  | int(1)              | NO   |     | 0                   |       |
| trx_adaptive_hash_timeout  | bigint(21) unsigned | NO   |     | 0                   |       |
+----------------------------+---------------------+------+-----+---------------------+-------+
22 rows in set (0.05 sec)

Most of these are self-explanatory, and a lot of them duplicate the information within SHOW ENGINE INNODB STATUS (so you no longer need to parse the output of that to get complete transaction information).

Below are some of my notes. For comparison, here’s an example of a transaction from the SHOW ENGINE INNODB STATUS statement:

---TRANSACTION 517, ACTIVE 1 sec, OS thread id 2958520320 inserting
mysql tables in use 2, locked 2
189 lock struct(s), heap size 27968, 54389 row lock(s), undo log entries 2406
MySQL thread id 2, query id 36 localhost root Sending data
insert ignore into t1 (select (i * rand())*100, 1 from t1)

trx_operation_state – corresponds to “inserting”, InnoDB’s internal transaction state
trx_tables_in_use – corresponds to “mysql tables in use 2”, the number of table locks requested by MySQL via external_lock()
trx_tables_locked – corresponds to “locked 2”, the number of actual table locks taken via external_lock()
trx_lock_structs – corresponds to “189 lock struct(s)”, the size of the lock struct list
trx_lock_memory_bytes – corresponds to “heap size 27968”, the number of bytes allocated to locks structs
trx_rows_locked – corresponds to “54389 row lock(s)”, an estimation of the number of rows locked (delete marked rows may make it imprecise)
trx_rows_modified – corresponds to “undo log entries 2406”, the number of rows modified in the transaction (inserted, updated, deleted)

Not seen in the transaction output above:

trx_concurrency_tickets – corresponds to “thread declared inside InnoDB 89” for a transaction in SHOW ENGINE INNODB STATUS, the number of concurrency tickets remaining for the transaction when innodb_thread_concurrency != 0
trx_isolation_level – the transaction's isolation level
trx_unique_checks – whether the transaction has “SET UNIQUE_CHECKS = 0”
trx_foreign_key_checks – whether the transaction has “SET FOREIGN_KEY_CHECKS = 0”
trx_last_foreign_key_error – if the last statement in the transaction resulted in an FK error, the error text is printed here
trx_adaptive_hash_latched – corresponds to “holds adaptive hash latch” being printed for a transaction in SHOW ENGINE INNODB STATUS
trx_adaptive_hash_timeout – when innodb_adaptive_hash_index is enabled (the default), statements that try to get the adaptive hash latch spin 10000 (BTR_SEA_TIMEOUT) times, re-trying getting the adaptive hash latch, before giving up. Lower numbers here for a lot of transactions may indicate contention on the adaptive hash latch

In summary – no more having to parse SHOW ENGINE INNODB STATUS output for transaction information, now you can just do it with SQL (with all its aggregation goodness as well, if you want)!
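As a taste of that aggregation goodness, a sketch rolling up running transactions by isolation level using the new columns:

```sql
-- Sketch: per-isolation-level roll-up of currently running InnoDB
-- transactions, summing the new rows-modified/locked and lock memory
-- columns added by this patch.
SELECT trx_isolation_level,
       COUNT(*) AS trx_count,
       SUM(trx_rows_modified) AS total_rows_modified,
       SUM(trx_rows_locked) AS total_rows_locked,
       SUM(trx_lock_memory_bytes) AS total_lock_memory
  FROM INFORMATION_SCHEMA.INNODB_TRX
 GROUP BY trx_isolation_level
 ORDER BY total_rows_modified DESC;
```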

Starting a new job!

I’ve had a wild ride over the past ~4.5 years, starting with MySQL AB as a “Support Engineer”, and working through to “Senior Support Engineer”, and then “Regional Support Manager, Americas” with the MySQL Support Team – truly one of the best product support teams I’ve ever known in the IT industry, even if I am biased.

I’ve always had a passion for helping people, which is why I think I did “OK” in Support. However I’ve always also had a second passion which has been bubbling away for me too – building solutions for diagnosing database issues. I started in the database world in the Oracle market, working on monitoring and management tools. MySQL “poached” me from there whilst I was building a MySQL monitoring module for the cross database monitoring tool that we had, as well as working in a supporting/consulting role for our customers.

Given my background when I joined, I was an obvious person to be put into an internal coordinator role back then, managing interaction between Support and the team that develops MySQL Enterprise Monitor (before MEM reached its first GA), so I also had the pleasure of working with that team pretty closely during my time in Support.

It satisfied my second passion for a long time. Now it’s time to take the leap back to that full time!

At the start of this week I started in my new role, as a “Software Development Manager” for the Enterprise Tools team, still within MySQL/Oracle, working full time on MySQL Enterprise Monitor and MySQL Proxy.

I’m pretty excited about the new role – I want to help make monitoring and management of MySQL easy, both by getting more of the “right” data from the server, and by representing it all in coherent ways for DBAs in our tools, so that they too can be more productive in making the right choices when managing their systems. I feel pretty well placed to do that now. 🙂

If you’re a customer that has used MEM and wants to give any feedback, or even just an interested user in the community – I’m all ears, either leave your comments here, or ping me by email (first_name.last_name@oracle.com) – I’d love to have some discussions with you about your needs, likes, or dislikes.

I thank every person in Support for my time with them, however I owe all of that time to one fairly anonymous guy in the “external” MySQL world, but a man mountain “internally” – Dean Ellis, the man who convinced me to join MySQL AB all those years ago, and has been a fantastic mentor and manager over the years to me. Mr Ellis – I salute you. Thanks for bringing me on the ride, I still don’t regret it, no matter how much you tried to make me! 😉

Enterprise Tools team – Look Out, you have me to whine at you full time now!

A morning hack – Com_change_user

So after I published my patch last night, another of my colleagues – the esteemed Shane Bester – pointed out that there is a related bug – Bug#28405 – which requests that Com_change_user is also split out from Com_admin_commands.

So I extended my patch this morning, to kill two birds with one stone:

=== modified file 'sql/mysqld.cc'
--- sql/mysqld.cc       revid:alik@sun.com-20100114090008-3rsdmlp1w2mqgrhg
+++ sql/mysqld.cc       2010-03-03 09:57:40 +0000
@@ -3131,6 +3131,7 @@
   {"call_procedure",       (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_CALL]), SHOW_LONG_STATUS},
   {"change_db",            (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_CHANGE_DB]), SHOW_LONG_STATUS},
   {"change_master",        (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_CHANGE_MASTER]), SHOW_LONG_STATUS},
+  {"change_user",          (char*) offsetof(STATUS_VAR, com_change_user), SHOW_LONG_STATUS},
   {"check",                (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_CHECK]), SHOW_LONG_STATUS},
   {"checksum",             (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_CHECKSUM]), SHOW_LONG_STATUS},
   {"commit",               (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_COMMIT]), SHOW_LONG_STATUS},
@@ -3174,6 +3175,7 @@
   {"load",                 (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_LOAD]), SHOW_LONG_STATUS},
   {"lock_tables",          (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_LOCK_TABLES]), SHOW_LONG_STATUS},
   {"optimize",             (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_OPTIMIZE]), SHOW_LONG_STATUS},
+  {"ping",                 (char*) offsetof(STATUS_VAR, com_ping), SHOW_LONG_STATUS},
   {"preload_keys",         (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_PRELOAD_KEYS]), SHOW_LONG_STATUS},
   {"prepare_sql",          (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_PREPARE]), SHOW_LONG_STATUS},
   {"purge",                (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_PURGE]), SHOW_LONG_STATUS},
@@ -3350,11 +3352,13 @@
     We have few debug-only commands in com_status_vars, only visible in debug
     builds. for simplicity we enable the assert only in debug builds
 
-    There are 8 Com_ variables which don't have corresponding SQLCOM_ values:
+    There are 10 Com_ variables which don't have corresponding SQLCOM_ values:
     (TODO strictly speaking they shouldn't be here, should not have Com_ prefix
     that is. Perhaps Stmt_ ? Comstmt_ ? Prepstmt_ ?)
 
       Com_admin_commands       => com_other
+      Com_change_user          => com_change_user
+      Com_ping                 => com_ping
       Com_stmt_close           => com_stmt_close
       Com_stmt_execute         => com_stmt_execute
       Com_stmt_fetch           => com_stmt_fetch
@@ -3368,7 +3372,7 @@
     of SQLCOM_ constants.
   */
   compile_time_assert(sizeof(com_status_vars)/sizeof(com_status_vars[0]) - 1 ==
-                     SQLCOM_END + 8);
+                     SQLCOM_END + 10);
 #endif
 
   if (get_options(&remaining_argc, &remaining_argv))

=== modified file 'sql/sql_class.h'
--- sql/sql_class.h     revid:alik@sun.com-20100114090008-3rsdmlp1w2mqgrhg
+++ sql/sql_class.h     2010-03-03 09:56:18 +0000
@@ -443,6 +443,8 @@
   ulong ha_discover_count;
   ulong ha_savepoint_count;
   ulong ha_savepoint_rollback_count;
+  ulong com_ping;
+  ulong com_change_user;
 
   /* KEY_CACHE parts. These are copies of the original */
   ulong key_blocks_changed;

=== modified file 'sql/sql_parse.cc'
--- sql/sql_parse.cc    revid:alik@sun.com-20100114090008-3rsdmlp1w2mqgrhg
+++ sql/sql_parse.cc    2010-03-03 09:56:19 +0000
@@ -979,7 +979,7 @@
 #endif
   case COM_CHANGE_USER:
   {
-    status_var_increment(thd->status_var.com_other);
+    status_var_increment(thd->status_var.com_change_user);
     char *user= (char*) packet, *packet_end= packet + packet_length;
     /* Safe because there is always a trailing \0 at the end of the packet */
     char *passwd= strend(user)+1;
@@ -1409,7 +1409,7 @@
     break;
   }
   case COM_PING:
-    status_var_increment(thd->status_var.com_other);
+    status_var_increment(thd->status_var.com_ping);
     my_ok(thd);                                // Tell client we are alive
     break;
   case COM_PROCESS_INFO:

Again, a quick test script:

<?php

$conn = mysqli_connect('127.0.0.1','root','msandbox', 'test', 5550);

if (!mysqli_change_user($conn, 'root', 'msandbox', 'mysql')) {
   echo "Change user failed!\n";
} else {
   echo "User changed!\n";
}

?>

And test:

Cerberus:msb_5_5_5 mark$ mysql -u root -pmsandbox -h 127.0.0.1 -P5550 -e "show global status like 'com_change_user'"
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| Com_change_user | 0     |
+-----------------+-------+
Cerberus:msb_5_5_5 mark$ php ~/Dev/tests/com_change_user.php
User changed!
Cerberus:msb_5_5_5 mark$ mysql -u root -pmsandbox -h 127.0.0.1 -P5550 -e "show global status like 'com_change_user'"
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| Com_change_user | 1     |
+-----------------+-------+
Cerberus:msb_5_5_5 mark$ php ~/Dev/tests/com_change_user.php
User changed!
Cerberus:msb_5_5_5 mark$ mysql -u root -pmsandbox -h 127.0.0.1 -P5550 -e "show global status like 'com_change_user'"
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| Com_change_user | 2     |
+-----------------+-------+

Yet Another Disclaimer: I’m not sure if/when/where this might get in! It is a change in behavior with regard to Com_admin_commands.

An evening hack – Com_ping

My boss vented about the lack of a Com_ping SHOW GLOBAL STATUS variable earlier, and I figured it would be dead easy to hack in.

A few minutes later:

Cerberus:mysql-next-mr mark$ bzr diff ./sql
=== modified file 'sql/mysqld.cc'
--- sql/mysqld.cc       revid:alik@sun.com-20100114090008-3rsdmlp1w2mqgrhg
+++ sql/mysqld.cc       2010-03-02 22:58:45 +0000
@@ -3174,6 +3174,7 @@
   {"load",                 (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_LOAD]), SHOW_LONG_STATUS},
   {"lock_tables",          (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_LOCK_TABLES]), SHOW_LONG_STATUS},
   {"optimize",             (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_OPTIMIZE]), SHOW_LONG_STATUS},
+  {"ping",                 (char*) offsetof(STATUS_VAR, com_ping), SHOW_LONG_STATUS},
   {"preload_keys",         (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_PRELOAD_KEYS]), SHOW_LONG_STATUS},
   {"prepare_sql",          (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_PREPARE]), SHOW_LONG_STATUS},
   {"purge",                (char*) offsetof(STATUS_VAR, com_stat[(uint) SQLCOM_PURGE]), SHOW_LONG_STATUS},
@@ -3350,11 +3351,12 @@
     We have few debug-only commands in com_status_vars, only visible in debug
     builds. for simplicity we enable the assert only in debug builds
 
-    There are 8 Com_ variables which don't have corresponding SQLCOM_ values:
+    There are 9 Com_ variables which don't have corresponding SQLCOM_ values:
     (TODO strictly speaking they shouldn't be here, should not have Com_ prefix
     that is. Perhaps Stmt_ ? Comstmt_ ? Prepstmt_ ?)
 
       Com_admin_commands       => com_other
+      Com_ping                 => com_ping
       Com_stmt_close           => com_stmt_close
       Com_stmt_execute         => com_stmt_execute
       Com_stmt_fetch           => com_stmt_fetch
@@ -3368,7 +3370,7 @@
     of SQLCOM_ constants.
   */
   compile_time_assert(sizeof(com_status_vars)/sizeof(com_status_vars[0]) - 1 ==
-                     SQLCOM_END + 8);
+                     SQLCOM_END + 9);
 #endif
 
   if (get_options(&remaining_argc, &remaining_argv))

=== modified file 'sql/sql_class.h'
--- sql/sql_class.h     revid:alik@sun.com-20100114090008-3rsdmlp1w2mqgrhg
+++ sql/sql_class.h     2010-03-02 22:56:05 +0000
@@ -443,6 +443,7 @@
   ulong ha_discover_count;
   ulong ha_savepoint_count;
   ulong ha_savepoint_rollback_count;
+  ulong com_ping;
 
   /* KEY_CACHE parts. These are copies of the original */
   ulong key_blocks_changed;

=== modified file 'sql/sql_parse.cc'
--- sql/sql_parse.cc    revid:alik@sun.com-20100114090008-3rsdmlp1w2mqgrhg
+++ sql/sql_parse.cc    2010-03-02 23:23:41 +0000
@@ -1409,7 +1409,7 @@
     break;
   }
   case COM_PING:
-    status_var_increment(thd->status_var.com_other);
+    status_var_increment(thd->status_var.com_ping);
     my_ok(thd);                                // Tell client we are alive
     break;
   case COM_PROCESS_INFO:

COM_PING has historically always been recorded under Com_admin_commands (which is tracked with com_other); however, this counter also includes a number of other commands:

COM_TABLE_DUMP
COM_CHANGE_USER
COM_BINLOG_DUMP
COM_SHUTDOWN
COM_DEBUG

COM_CHANGE_USER can also be used frequently with connection pools, as can COM_PING. It would be nice to differentiate them when tracking down issues within such environments.
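
To actually put the split-out counters to use in such environments, you could sample them periodically and watch the deltas. A minimal sketch, assuming the two SHOW GLOBAL STATUS snapshots have already been fetched into dicts (the function name is my own, for illustration):

```python
def com_rates(prev, curr, interval_seconds):
    """Compute per-second rates for selected Com_ counters between two
    SHOW GLOBAL STATUS snapshots (dicts of variable name -> value)."""
    rates = {}
    for var in ("Com_ping", "Com_change_user"):
        delta = int(curr.get(var, 0)) - int(prev.get(var, 0))
        rates[var] = delta / float(interval_seconds)
    return rates

# Example: two snapshots taken 10 seconds apart
prev = {"Com_ping": 100, "Com_change_user": 40}
curr = {"Com_ping": 160, "Com_change_user": 70}
print(com_rates(prev, curr, 10))  # {'Com_ping': 6.0, 'Com_change_user': 3.0}
```

A sudden spike in the Com_change_user rate relative to Com_ping, for example, would point at the pool recycling connections far more aggressively than expected.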

Here’s the bug that was opened:

http://bugs.mysql.com/bug.php?id=51667

And a quick test:

Cerberus:msb_5_5_5 mark$ mysql -u root -pmsandbox -h 127.0.0.1 -P 5550 -e "show global status like 'com_ping'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Com_ping      | 0     |
+---------------+-------+
Cerberus:msb_5_5_5 mark$ php ~/Dev/tests/com_ping.php
Connected!
Cerberus:msb_5_5_5 mark$ mysql -u root -pmsandbox -h 127.0.0.1 -P 5550 -e "show global status like 'com_ping'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Com_ping      | 1     |
+---------------+-------+
Cerberus:msb_5_5_5 mark$ php ~/Dev/tests/com_ping.php
Connected!
Cerberus:msb_5_5_5 mark$ mysql -u root -pmsandbox -h 127.0.0.1 -P 5550 -e "show global status like 'com_ping'"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Com_ping      | 2     |
+---------------+-------+

The test script:

<?php

$conn = mysql_connect('127.0.0.1:5550','root','msandbox');
mysql_select_db('test',$conn);

if (!mysql_ping ($conn)) {
   echo "Connection Dead!\n";
} else {
   echo "Connected!\n";
}

?>

Disclaimer: I’m not sure if/when/where this might get in! It is a change in behavior with regard to Com_admin_commands.

PERFORMANCE_SCHEMA hits Prime Time!

I’ve talked about PERFORMANCE_SCHEMA in the past – almost a year ago now.

Back then the feature was just coming together into something usable, and was on the cusp of moving towards code review. It entered code review, and went around, and around, for 9 months, whilst various refinements were made.

Nevertheless, Marc Alff persevered (much respect!), and yesterday pushed his final merge into the mysql-next-mr bzr tree. PERFORMANCE_SCHEMA is now awaiting the next milestone release, and will be a part of the next GA release of MySQL!

We have the initial documentation ready:

http://dev.mysql.com/doc/performance-schema/en/index.html

This first round adds the infrastructure to take monitoring of the MySQL Server to the next level, initially adding in instrumentation for sync points (mutexes, rw locks, etc.) and file IO, in the SQL layer, and most of the default storage engines (all those controlled by MySQL/Sun).

Here’s a shout out to the other storage engine developers – we’d love you to start looking at instrumenting your own engines as well. Ask away on the internals@ list – I’m sure Marc will be more than willing to help.

And now that we have the above in – what’s next? It’s a good question! Here’s a list of the major outstanding worklogs:

WL#4674 PERFORMANCE_SCHEMA Setup For Actors
WL#4895 PERFORMANCE_SCHEMA Instrumenting Table IO
WL#4896 PERFORMANCE_SCHEMA Instrumenting Net IO
WL#4878 PERFORMANCE_SCHEMA Trace

Personally, I’m voting for WL#4895 next (well, I’d like to see InnoDB instrumentation too!), but you can vote for whatever you are most interested in via the votes in Worklog as well.

Thanks for all the hard work Marc!

innodb_stats_on_metadata / innodb_adaptive_hash_index

It occurred to me at the MySQL UC that I hadn’t blogged about either innodb_stats_on_metadata or innodb_adaptive_hash_index – Morgan Tocker convinced me to do a lightning talk at the end of the Percona conference about them – so I thought I’d do a quick write up on them here as well.

First – what’s the point in them? Being the guy that wrote the originating patches for both of them, I have a little background:

innodb_stats_on_metadata

InnoDB is fairly lightweight when it gathers table and index statistics. In the default compiled-in InnoDB, for most users, it does 8 random(ish) dives into the index, sampling pages to try to ascertain the cardinality (selectivity) of the index for use by the optimizer. This being the case, statistics are gathered not only when you run ANALYZE TABLE on a table, but also at various other points:

  1. The first time a table is opened
  2. When more than 1/16th of the table’s rows have been modified since the last statistics update
  3. After a very large number (on the order of 2 billion) of individual row modifications
  4. When metadata statements such as SHOW TABLE STATUS and SHOW INDEX are run, or the INFORMATION_SCHEMA.TABLES / INFORMATION_SCHEMA.STATISTICS tables are queried

The first is easy to digest.

The second and third both occur for each row change within InnoDB. row_update_statistics_if_needed() is called from row_insert_for_mysql(), row_update_cascade_for_mysql(), and row_update_for_mysql().

The last in the list is interesting. In order to get an accurate(ish) count of the rows in the table, the cardinality results, etc. for the metadata commands, InnoDB runs the same procedure as an ANALYZE TABLE for all of the tables involved, because it doesn’t maintain accurate statistics in a table or index header somewhere like MyISAM does (being MVCC-based, getting an accurate row count is far more difficult, for example).
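
The random-dive sampling described above can be illustrated with a toy sketch – emphatically not InnoDB’s actual algorithm, just the general idea of extrapolating cardinality from a handful of sampled leaf pages (all names here are made up for illustration):

```python
import random

def estimate_cardinality(leaf_pages, dives=8, seed=None):
    """Estimate index cardinality by sampling a few leaf pages and
    extrapolating: avg distinct keys per sampled page * total pages.
    leaf_pages is a list of pages, each a list of key values."""
    rng = random.Random(seed)
    sampled = [rng.choice(leaf_pages) for _ in range(dives)]
    avg_distinct = sum(len(set(page)) for page in sampled) / float(dives)
    return int(avg_distinct * len(leaf_pages))
```

Because a different random sample can produce a different estimate, re-running the procedure (just like re-running ANALYZE TABLE) can shift the numbers the optimizer sees – which is exactly what makes frequent, implicit statistics recalculation a double-edged sword.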

This is great, if you run these commands periodically to check on their output yourself. Once you start trying to read these tables very often in a monitoring solution of any kind though, you will likely hit two distinct problems:

  1. God Damn that INFORMATION_SCHEMA access is SLLLOOOWWWWW
  2. If you are monitoring your queries, and their EXPLAIN plans, you may start to see “plan instability” – the optimizer randomly choosing different access paths for the queries

Behind door #1 there’s a couple of issues – we traverse the directories and table files in the data directory for a lot of the information used in the output, as well as the ANALYZE that is done for InnoDB. We’ve done work in Worklog #3732 to optimize certain accesses to these INFORMATION_SCHEMA tables, but this didn’t solve the statistics issue.

The second issue is a little more subtle. You may have benchmarked your application statements in Dev and QA environments – but data sets change when they are in production. If table and index statistics are constantly being recalculated, there is a high probability that the optimizer will in fact choose different access paths for certain queries and tables where the rows that need to be scanned could be satisfied by more than one index.

Unless you are closely monitoring individual statements for their run time, access paths (EXPLAIN), etc., this is far more difficult to catch (often you will see the “good” EXPLAIN plan a lot, and the “bad” one only surfaces every now and then – so interactive debugging is made harder as well).
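
If you do want to catch plan flips automatically, one cheap approach is to fingerprint the access-path-relevant columns of each statement’s EXPLAIN output and flag when the fingerprint changes between samples. A minimal sketch (the EXPLAIN rows are assumed to be already fetched as dicts, and all names are my own):

```python
import hashlib

def plan_fingerprint(explain_rows):
    """Hash the access-path-relevant columns of EXPLAIN output
    (table, join type, chosen key) so plan flips show up as
    fingerprint changes rather than needing a manual diff."""
    relevant = [(r["table"], r["type"], r["key"]) for r in explain_rows]
    return hashlib.sha1(repr(relevant).encode("utf-8")).hexdigest()

def plan_changed(prev_rows, curr_rows):
    """True when the optimizer picked a different access path."""
    return plan_fingerprint(prev_rows) != plan_fingerprint(curr_rows)
```

A monitoring job could store the fingerprint per normalized statement and alert on change – much cheaper than keeping (and eyeballing) full EXPLAIN history.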

So, at the request of a pretty large SaaS customer, who was having both problems, innodb_stats_on_metadata was born – originally in Bug #26598. If you see any of the situations above, give it a try, and see how it works for you, as well.

Originally it was created as a non-dynamic variable – you could only set it on the command line or in an option file. It was altered by the InnoDB team to be dynamic, and settable by SUPER-privileged users. It’s documented in the InnoDB plugin documentation under the “Changes for Flexibility, Ease of Use and Reliability” chapter.

I’m on the fence about this being settable dynamically, as it allows tools to change the setting as well (tools often have the SUPER privilege, to get at things like SHOW INNODB STATUS), when the DBA may have wanted it explicitly disallowed for all users.

Be good citizens tool vendors, and respect this setting please.

innodb_adaptive_hash_index

InnoDB has a really unique feature in the database world called “Adaptive Hash Indexes”. It monitors which rows are being accessed via index searches, and dynamically decides to cache those that are used most often, building an in-memory HASH index over the relevant parts of the BTREE index for quick row lookups. Historically, this has performed really well (and still does today in most workloads).

Prior to the fix for the infamous Bug #15815 (the true start of our multi-core scalability jaunt, imho), it just hummed along, with most people not noticing it was there. Once we had the fix though, we started to see mutex contention in other places, in certain cases leading to InnoDB locking up and stopping the server due to long semaphore waits – this happened within the code for adaptive hash indexes as well, and was reported as Bug #20358.

This has hit a pretty large number of users – many Support customers as well – and many engineer hours have gone into trying to reproduce it reliably (without result; the bug is still “Open”, not “Verified”). As it started to hit a number of customers, I wrote a pretty simple patch for innodb_adaptive_hash_index, to allow you to switch off the use of adaptive hash indexes altogether within InnoDB.

Now, for some this comes as a performance trade off – if you hit the above bug, it does allow you to work around it, but your application may benefit from Adaptive Hash Indexes too. If you favor reliability over performance, then use it – it fixes the problem (in a somewhat hackish way).

Yasufumi Kinoshita did a lot of great work in Bug #29560 and Bug #26442, and there have been other issues found and fixed by the InnoDB team – Bug #39483 (closed at the time of writing) – and by Mark over at Google – Bug #32149 (still verified at the time of writing). We hardly ever see Bug #20358 any more though, unless it’s on older versions.

This variable also gets a good mention in the InnoDB plugin documentation, under “Changes for Flexibility, Ease of Use and Reliability” again, and in its own section, “Controlling Adaptive Hash Indexing”, because the variable has a nice side effect as well – it can also help with scalability, specifically because of the kind of mutex contention that also led to Bug #20358.

The InnoDB manual mentions that you can monitor the contention via the SEMAPHORES section of SHOW INNODB STATUS by looking for “many threads waiting on an RW-latch created in btr0sea.c” – quite correctly.
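
A monitoring script could watch for that signature automatically by scanning the SEMAPHORES section text; a small sketch (the status text is assumed to have been fetched already, e.g. from SHOW ENGINE INNODB STATUS, and the sample lines below are illustrative, not verbatim server output):

```python
def count_btr0sea_waits(semaphores_text):
    """Count threads waiting on an RW-latch created in btr0sea.c --
    the adaptive hash index latch the InnoDB manual points at."""
    return sum(
        1
        for line in semaphores_text.splitlines()
        if "rw-latch" in line.lower() and "btr0sea.c" in line
    )

sample = """--Thread 1234 has waited at btr0sea.c line 980 for 0.00 seconds the semaphore:
X-lock on RW-latch at 0xdeadbeef created in file btr0sea.c line 139
--Thread 5678 has waited at buf0buf.c line 2400 for 0.00 seconds the semaphore:
Mutex at 0xcafebabe created file buf0buf.c line 597"""
print(count_btr0sea_waits(sample))  # 1
```

If that count stays high across samples, the adaptive hash index latch is a likely contention point, and disabling the feature is worth testing.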

You can also check out how much memory the hash index is using, and how well it’s being utilized, in the INSERT BUFFER AND ADAPTIVE HASH INDEX section:

-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 5, seg size 7,
16104 inserts, 16104 merged recs, 15379 merges
Hash table size 2212699, used cells 846306, node heap has 1574 buffer(s)
0.03 hash searches/s, 0.63 non-hash searches/s

The last two rows apply to the adaptive hash index. “Hash table size” and “used cells” are the number of cells (array slots) assigned to the hash table (always a prime number), and the number of those which are used, respectively. “used cells” was removed in 5.1.28 as part of Bug #36941, because computing it meant scanning the hash table to find the used cells, which caused performance issues – so you may not see it on your server.

“node heap has 1574 buffer(s)” is where you can find out how much memory the hash index is using – the number here is the number of pages that are allocated to the hash index (and a page is 16K in InnoDB by default).

The last row gives the number of searches both in the adaptive hash index, as well as those not satisfied by it – which can give you a hit ratio for hash read efficiency.
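
Putting the last few paragraphs together, both the memory used and the hit ratio can be derived directly from that section. A sketch (the regex assumes the exact output layout shown above, and the 16K default page size):

```python
import re

def adaptive_hash_stats(section_text, page_size=16 * 1024):
    """Parse the INSERT BUFFER AND ADAPTIVE HASH INDEX section for the
    node heap buffer count and the hash/non-hash search rates, then
    derive memory used and a hash hit ratio."""
    buffers = int(re.search(r"node heap has (\d+) buffer", section_text).group(1))
    hash_s, non_hash_s = (
        float(x)
        for x in re.search(
            r"([\d.]+) hash searches/s, ([\d.]+) non-hash searches",
            section_text,
        ).groups()
    )
    total = hash_s + non_hash_s
    return {
        "memory_bytes": buffers * page_size,
        "hit_ratio": hash_s / total if total else 0.0,
    }

sample = """Hash table size 2212699, used cells 846306, node heap has 1574 buffer(s)
0.03 hash searches/s, 0.63 non-hash searches/s"""
stats = adaptive_hash_stats(sample)
print(stats["memory_bytes"])         # 25788416 (about 24.6 MB)
print(round(stats["hit_ratio"], 3))  # 0.045
```

With the numbers from the output above, the adaptive hash index is holding roughly 24.6 MB of memory for a hit ratio of under 5% – a situation where disabling it would cost very little.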

I’ve made a few graphs for MEM to track these, so they can be monitored over time to get a better picture of adaptive hash index usage, to give you a better tool to judge for yourself what the impact may be for disabling it (see below). I’ll push these up today, and after some testing on a number of other servers, you should hopefully see them in a new advisor update soon.

InnoDB Adaptive Hash Index graphs for MEM