MySQL sys version 1.0.1 released

I’ve just released the 1.0.1 version of the MySQL sys schema. It contains a few bug fixes (including a contribution from Joe Grasse, thanks Joe!), and a number of new helper procedures around viewing and altering configuration for consumers and instruments, contributed by the MySQL QA team, as they’ve started to incorporate more Performance Schema trace data in to their automated testing.

Next up is the 1.1.0 release, that also includes a bunch of new host summary views that were contributed by Arnaud Adant (of the MySQL Support team). I have a number of new things in development to add as well before then though.

Let me know if there are things you’d like to see as well, maybe I can find time to work on those too.


MySQL Tech Day, Paris, Slides

I spoke at MySQL Tech Day in Paris yesterday, it was a fun day – apart from the early and late flights in and out on the same day, my boss told me I was crazy, he was probably right.

First Dimitri presented some recent benchmark findings:

Then I got my turn to speak for a few hours around Performance Schema and the MySQL sys schema:

Visible there on the left is also Marc Alff, the principle developer of Performance Schema, who was also able to interject some great points for people as well.

My presentation was a super-set of a few of my previous ones, but updated and with a little added content here and there (particularly around the MySQL sys updates from ps_helper), so it’s pretty long (I had 3 hours to fill), but probably the most complete that I’ve done.

Here are the slides:

MySQL 5.7: Performance Schema Improvements, Percona Live

I had a great time last week at Percona Live, meeting up with lots of old friends, and getting to know lots of new ones.

It was great to meet many of the people that hang around on DBHangOps face to face. Geoff even got a community award (well done)! Unfortunately I had to miss the lunch.

It was also good to see Oracle getting a community award. Our engineers are extremely hard working, and all want to help community and customers alike be successful with their MySQL environments. There was lots of great positive attitude towards the work we’ve been doing, it was pleasing to hear that we are on the right track.

I haven’t been to a conference at that venue since the “old school” MySQL UC. Well done Percona for putting on such a great event.

And as a bonus, I even got to sit in front of Shlomi Noach on my flight from SFO to Newark:

Here’s the slides for my talk, MySQL 5.7: Performance Schema Improvements:

MySQL 5.7: Performance Schema Improvements

I hope to be seeing many of you out at MySQL Connect later this year, not long now to submit your talk!

ps_helper and its migration to sys

MySQL’s Performance Schema is an incredibly rich and versatile instrumentation engine, but apparently, with great power, comes increased complexity for the user to understand up front.

I think this is pretty natural:

  • The more flexibility you provide to the user on what to monitor (and this is one of the major goals of Performance Schema) – the more there is to twiddle from a configuration perspective.
  • The more data we provide in as generic a ways as possible – the more data there is.

Performance Schema is also only going to grow over time. There are many many missing bits of instrumentation within MySQL that users want us to add, and for the most part, most of those new things are now going in to Performance Schema. Things such as memory instrumentation, stored programs, transactions, prepared statements, better replication monitoring … the list goes on, and the things I mention here are just the things that have gone in to MySQL 5.7 so far.

When I started ps_helper it was nothing more than a little playground for me to research both the breadth of the data that Performance Schema provides, as well as to try and nail down exactly the kinds of views that users want and need, based around various different use cases.

Judging by the feedback that I’ve gotten on ps_helper so far, it turns out that I was actually pretty successful with the latter goal. I’ve heard about it being used in all sorts of situations – from large websites, to small installations, everybody seems to have found something useful with it. MySQL Support also regularly directs customers to install it, and return the output from many of the views, to help customers tune and debug what their MySQL instances are doing.

One of the biggest pieces of feedback that I (and others) have heard is that users really want something like ps_helper to ship with the server itself – there is even a bug to have it included opened by my old colleague Valerii Kravchuk.

Getting something like ps_helper in to the server is not a small task. We need to know that what we are providing makes sense to users, commit to maintaining it “forever”, make sure that it is properly tested, is as bug free as possible, that it follows the right procedures for licenses and that it is properly structured for any future plans that we may have (of which there are many).

Recently, the MySQL Workbench team announced that they would be shipping a new schema that was something like ps_helper:

Additionally Workbench 6.1 is leveraging work from various teammates in MySQL Engineering in introducing a schema called “SYS” that provides simplified views on Performance Schema, Information Schema, and other areas.

This is the first step towards moving the kinds of views and procedures that ps_helper provides in to a shipping “MySQL product”. It started with the Workbench team taking ps_helper, performing a bunch of post processing on it (to rename schema etc.), and then integrating the resulting set of objects. In discussion with them after their first try at this, we decided instead to start a new branch, that didn’t need post-processing, and that starts following some of the other strict guidelines for getting a product shipped within MySQL/Oracle.

And so we get to the point of this blog finally, the mysql-sys repository was born on Github.

Why the name “sys“?

Firstly, and more importantly for some people – it is shorter and easier to type than “ps_helper”.

When you look at other database systems though, and at other similar things that people have done within the MySQL Community, it’s clear that there is room for a schema that can more generically provide a set of objects to administrators that help them with their day to day work.

SQLServer for example has a sys catalog, which comes with a number of dynamic management views, functions and procedures for administrators to use for various tasks.

DB2 has SYSIBM, which is a mixture of some DD and some runtime info. They also have syscat and sysstat that can be used too.

Oracle has a SYS catalog, owned by the SYS user. It’s a little different to our goals here (it’s more the DD owner). When comparing our goals with sys to Oracle, you could compare it to the V$ tables in many ways. However in Oracle SYS also includes a number of packages (routines) that can be used for administration tasks – the SYS.DBMS_* packages.

Looking a little closer to home, one only has to look at the excellent common_schema maintained by Shlomi Noach, which provides a whole host of views, procedures and functions to help manage and interact with MySQL, around objects, runtime metadata, and more (QueryScript for example is incredibly powerful).

With all of these in mind, we also wanted something a little more generic than ps_helper, which was created to make Performance Schema itself a little easier to understand. Something that could contain routines and views that are not necessarily Performance Schema specific (although that is what the bulk of the objects are for at the moment).

So what are the changes that have been made when migrating ps_helper to sys?

A quick summary would be:

  • mysql-sys now uses the GPLv2 license.
    • This is an important change, it aligns it with other MySQL products, and will follow the same procedures as other Oracle products. If you want to submit patches, you should follow the OCA. Unfortunately, this means that merges from mysql-sys to ps_helper can not happen. More on this shortly
  • All objects now fully specify all characteristic clauses
    • For instance, they all have a “root@localhost” DEFINER with SQL SECURITY INVOKER, and fully specify other things such as whether a routine is DETERMINISTIC, how it interacts (i.e. READS SQL DATA), or the ALGORITHM a view should use.
  • All [name]_raw views have been renamed to x$[name]
    • Within ps_helper, the “foo” and “foo_raw” tables all sort together, which makes it a little harder to scan the list of user friendly views from the output of SHOW TABLES. By renaming them all to x$foo, all of the views with some form of raw output sort at the end, and give the nice user friendly list individually first. Oracle also has something like this with their X$ vs V$ views, although X$ is “hidden” in that case.
  • A lot of the objects have had their names changed, as well as some columns
    • Many of the routines are now prefixed with ps_*, many of them have had their names changed for a little more clarity etc.
    • All “count” or “count_star” columns now use “total”, which was also used a lot. There was a lot of inconsistency there.
  • All stored routines (functions and procedures) now have a little documentation in their COMMENT clauses.
  • A number of new views were added, particularly around summarizing user activity, and making sure that each formatted view had an x$ counterpart.
  • A WHOLE BUNCH of bugs were fixed along the way.

Unfortunately, I can not really merge those bug fixes back to ps_helper – they happened as I was going through each of the objects for the clean up, but their commits were kind of mixed in with all these other changes (including the license header changes). If I try to merge this back to ps_helper, I have to re-license ps_helper as GPLv2. I’d still like to keep ps_helper as my little playground though, to try out views and routines that could be useful one day to the sys schema.

I’m not sure what to do with the dbahelper (ps_helper) repository at the moment. I am tempted to remove all of the objects that are within mysql-sys itself, and do a similar rename of the remaining objects to “sys” (so they would build on each other). I do still want to use it as my “hey this could be an interesting object to add” playground. Your thoughts on this from the community would be appreciated.

I’ve now updated the ps_helper page to remove the outdated examples, and to simply point at the Github pages directly instead. The README for the mysql-sys repository now contains the most up to date examples and documentation, and should be used instead.

I hope to blog some more about the changes, and the new things that came along in the near future, but this blog is long enough for now.

Finally, I’d like to take the opportunity to thank all in the community for their great feedback on ps_helper. It helped to turn it in to a much more mature project.

I’m looking forward to its new future as sys!

MySQL Connect and Oracle Open World Presentations Online

After a slight delay (travel and catching up with “real work”), I’ve now uploaded the talks that I gave at MySQL Connect and Oracle Open World.

They are available on my Presentations Page, and inline below for convenience. The “Introduction to MySQL Enterprise Monitor” talk was actually a full demo, but there are some screenshots of MEM 3.0 in there if you’re interested in seeing a high level picture of what it looks like now.

Thanks to all that attended my talks, I got a lot of good questions and feedback!

Oracle *Has Not* Stopped Publishing MySQL To Launchpad

Stewart over at Percona noticed that our trees on Launchpad were out of date, and this has been picked up and taken as gospel by others, notably Henrik today.

Unfortunately, Henrik doesn’t seem to have checked Launchpad before publishing his blog – otherwise he would have noticed that all of the trees are currently up to date with the current releases.

He would also have noticed that there is now a new MySQL 5.6 branch, so that we can continue with trunk as the future MySQL version after MySQL 5.6.

So let me repeat this again, Oracle has not stopped updating the MySQL branches on Launchpad.

Sorry about the blips in updating, however.

ALTER USER … PASSWORD EXPIRE – bitten by a bug

MySQL 5.6.6 was released yesterday – the list of changes is impressive.

One of the new things added within 5.6.6 was the ALTER USER … PASSWORD EXPIRE statement, which allows an administrator to expire the passwords for a user, so that they must update their password on next login.

Unfortunately, this was released within 5.6.6 with a serious problem – already noted by Kolbe Kegel – as it updated the password column to an empty string, as well as setting the password_expired flag. This obviously has serious implications, that really means you should not use this feature, if you are planning on using 5.6.6 in production (for whatever reason, given that it is not a GA release yet).

The good news is that this was caught and fixed around a month ago, but didn’t quite make the 5.6.6 cut off date (where we build, and put the release through QA testing) – here’s the same kind of test on my self-built 5.6.7 release:

$ mysql -uroot -pmysql -P3307
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 152259
Server version: 5.6.7 MySQL Community Server (GPL)

Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> grant all on *.* to mark@localhost identified by 'mark';
Query OK, 0 rows affected (0.00 sec)

mysql> select user, host, password, password_expired from mysql.user;
| user            | host      | password                                  | password_expired |
| root            | localhost | *E74858DB86EBA20BC33D0AECAE8A8108C56B17FA | N                |
| mark            | localhost | *E6ACCEDB2495496B191ED488F598F04239C85E73 | N                |
2 rows in set (0.00 sec)

mysql> alter user mark@localhost password expire;
Query OK, 0 rows affected (0.00 sec)

mysql> select user, host, password, password_expired from mysql.user;
| user            | host      | password                                  | password_expired |
| root            | localhost | *E74858DB86EBA20BC33D0AECAE8A8108C56B17FA | N                |
| mark            | localhost | *E6ACCEDB2495496B191ED488F598F04239C85E73 | Y                |
2 rows in set (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

mysql> select user, host, password, password_expired from mysql.user;
| user            | host      | password                                  | password_expired |
| root            | localhost | *E74858DB86EBA20BC33D0AECAE8A8108C56B17FA | N                |
| mark            | localhost | *E6ACCEDB2495496B191ED488F598F04239C85E73 | Y                |
2 rows in set (0.00 sec)

mysql> exit
$ mysql -umark -P3307
ERROR 1045 (28000): Access denied for user 'mark'@'localhost' (using password: NO)
$ mysql -umark -pmark -P3307
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 163413
Server version: 5.6.7

Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select * from mysql.user;
ERROR 1820 (HY000): You must SET PASSWORD before executing this statement
mysql> set password = password('newmark');
Query OK, 0 rows affected (0.00 sec)

mysql> show grants;
| Grants for mark@localhost                                                                                            |
| GRANT ALL PRIVILEGES ON *.* TO 'mark'@'localhost' IDENTIFIED BY PASSWORD '*00B6543E480F70E68EB0FE311882F1B32E7EEF43' |
1 row in set (0.00 sec)

mysql> select password('newmark');
| password('newmark')                       |
| *00B6543E480F70E68EB0FE311882F1B32E7EEF43 |
1 row in set (0.00 sec)

So, please don’t use this feature until 5.6.7 is released! We’re sorry that this problem slipped in to the milestone release, but thankfully it was caught before our GA release, both with our own internal testing, and external community interest (thanks Kolbe!).

The documentation will be updated to reflect this in the near future as well.

A MySQL Replication Load Average with Performance Schema

Monitoring MySQL’s replication has always been a bit hit and miss, especially when trying to detect whether a slave is becoming overloaded or not. There’s been a few ways to do this in the past:

  • Monitor the Seconds_behind_master variable from SHOW SLAVE STATUS, which is the difference between the timestamp passed down from the master that the event was executed, to the time that the SQL thread started executing the event.
  • Use something like mk-heartbeat, which operates in the same way as Seconds_Behind_Master (trying to show you the actual time difference between the master and slave), and is a little more robust in complex replication chains, and other situations where Seconds_Behind_Master falls down (such as when the IO thread is lagging).

These are fine for knowing how far out of date your slave is getting, but they don’t really tell you that your slave is becoming overloaded. In other words, this only lets you detect that the slave is lagging, not that it is going to lag.

You can also attempt to monitor replication lag according to the differences in where the two threads are reading / writing to, based on the binary log and relay log positions compared against the master. This allows you to dive a little more in to where the delays may be happening in the chain, rather than relying on a single time behind metric.

Here’s a (somewhat simplified and not entirely accurate – the IO thread reads via the master) representation of how an event moves through replication, and the points that can be monitored with current instrumentation (including time delayed replication within 5.6):

Now, I say “Delay” above, that’s not entirely accurate, as most of those metrics actually record the file, and position within that file, that each stage is at – they allow you to measure the number of bytes difference at each stage, not the latency of each of the stages.

This allows you to generate these metrics:

  • IO Thread Bytes Behind Master:
    • Master (File / Position) – Slave (Master_Log_File / Relay_Master_Log_Pos)
  • SQL Thread Bytes Behind IO Thread:
    • Slave (Master_Log_File / Relay_Master_Log_Pos) – Slave (Relay_Log_file / Relay_Log_Pos)
  • SQL Thread Bytes Behind Master:
    • Master (File / Position) – Slave (Relay_Log_file / Relay_Log_Pos)
  • Slave Bytes Behind Master:
    • Master (File / Position) – Slave (Relay_Master_Log_file / Exec_Master_Log_Pos)
  • SQL Thread Current Event Size:
    • Slave (Relay_Log_file / Relay_Log_Pos) – Slave (Relay_Master_Log_file / Exec_Master_Log_Pos)
  • Seconds Behind Overall (maybe):
    • Slave (Seconds_Behind_Master)

Actually generating this data consistently is problematic however. As it requires two statements to be executed on two entirely different systems, you will need to ensure that you execute these as close as possible to each other in time, so ntp will need to be installed on each system, and they will need to be kicked off on a precise clock based schedule on each server. Even then, they really can only be used as an indicator, not a source of truth, simply because of this.

The expressions above are slightly simplified as well. Logs roll over, so you also need to know the size that logs can get to using either max_binlog_size or max_relay_log_size (if non-zero), as appropriate, and calculate the difference depending on the number of logs between each position – not entirely simple.

But even with these metrics in hand, they don’t really tell us much – the size of an event within the replication stream can differ depending on whether statement based or row based binary logging is enabled. A “small” event could be a statement that takes a very long time to run, or a single row based event that executes in no time at all. YMMV.

So we can use these to profile the system, and graphing over time probably wouldn’t hurt, but relying on them as an indicator that you should start looking at a slave because of performance issues gets tough, unless you rely on something that tells you that you are already behind, in a reactive rather than proactive manner.

What we really need is an indicator that tells us that we should be looking in to performance problems before the slave even starts to lag at all.

And so we come back to the title of this blog. The idea is simple, we need a “Replication Load Average”, that can tell us how busy the slave really is – if it’s not executing 100% of the time, then the chances are it’s not really falling behind. If it’s starting to be busy 90-95% of the time, then the chances are, it could start to lag pretty soon. It’s not a new idea either, I’ve seen a couple of prominent people wishing for such a thing:

Peter Zaitsev: Idea: MySQL Should add “Replication Load Average” metric which would tell what time replication thread was busy processing events compared to being idle. This would help a lot to see when you’re close to the limit.

Jeremy Cole: I agree however that getting saturation information is important and is something we will potentially implement in Twitter MySQL. It would be useful to get essentially a %busy/%idle stat for the SQL thread.

I’m happy to say gentlemen, this is already within 5.5, and has been since it was released.

When the SQL thread has executed all events from the relay log, it enters MYSQL_BIN_LOG::wait_for_update_relay_log(), which then waits for a condition called update_cond to be raised. This is exactly where the slave goes “idle”. And luckily within Performance Schema in 5.5, we can track conditions. Conditions get timed within Performance Schema for the time they were waiting for the condition to be raised by another thread – so in the above case, the total time that the SQL thread was idle.

The condition waited for within wait_for_update_relay_log() can be mapped to the wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond event within Performance Schema – here’s a snapshot from a 5.5 server that is in the middle of a replication tree (it’s both a slave and a master, so has both relay logs and binary logs enabled – the wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond is a similar condition that is used by the IO thread logged in to a master, to signal that new events have been written to the binary log.):

mysql> select * from events_waits_summary_global_by_event_name order by sum_timer_wait desc limit 5;
| EVENT_NAME                                                 | COUNT_STAR | SUM_TIMER_WAIT     | MIN_TIMER_WAIT | AVG_TIMER_WAIT | MAX_TIMER_WAIT |
| wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond           |     595360 | 677596417470900396 |       15021936 |  1138128892553 | 20142735786288 |
| wait/synch/cond/sql/MYSQL_BIN_LOG::update_cond             |      60466 | 167695645597814052 |      187123608 |  2773387450762 | 84468739641228 |
| wait/io/file/myisam/kfile                                  |    3237163 |    844640545485132 |         696600 |      260919992 |  1122678434952 |
| wait/io/file/myisam/dfile                                  |    4156623 |    377275756973940 |         206712 |       90764968 |  2911791051108 |
| wait/synch/cond/sql/COND_queue_state                       |         11 |    321025543065288 |   166075717824 | 29184140278662 | 59224373303796 |

OK, so we can get to the raw data.. Great.. But how do we turn that in to a load average? The key is to read the values on a set interval, compute the deltas of the SUM_TIMER_WAIT statistic, and compare the delta – which is the idle time within the interval – to the overall interval.

As a quick proof of concept, I’ll use an event to do this, to call a procedure to gather the data, compute the deltas, and generate some extra statistics. Before boring you with the details of the implementation, let me first give a snapshot of the end result that we are striving for:

mysql> select tstamp, busy_pct, one_min_avg, five_min_avg, fifteen_min_avg
    ->   from slave_sql_load_average
    ->  order by tstamp desc limit 10;
| tstamp              | busy_pct | one_min_avg | five_min_avg | fifteen_min_avg |
| 2012-07-24 14:00:29 |    79.94 |       67.10 |        66.92 |           66.25 |
| 2012-07-24 14:00:24 |    39.97 |       67.10 |        66.93 |           66.15 |
| 2012-07-24 14:00:19 |    79.92 |       67.11 |        66.92 |           66.34 |
| 2012-07-24 14:00:14 |    84.31 |       67.09 |        66.93 |           66.24 |
| 2012-07-24 14:00:09 |    38.62 |       66.98 |        66.93 |           66.11 |
| 2012-07-24 14:00:04 |    83.26 |       67.03 |        66.95 |           66.31 |
| 2012-07-24 13:59:59 |    79.30 |       66.73 |        66.90 |           66.19 |
| 2012-07-24 13:59:54 |    39.97 |       66.81 |        66.91 |           66.09 |
| 2012-07-24 13:59:49 |    79.94 |       66.81 |        66.91 |           66.28 |
| 2012-07-24 13:59:44 |    79.95 |       66.80 |        66.91 |           66.18 |

This gives what both Jeremy and Peter were asking for – a single “busy” counter, with some load average like statistics over 3 different intervals, with a snapshot every 5 seconds in this example. This allows us to see that there are peaks and troughs in the load of the slave on the shorter interval, but flattens it’s overall load with the larger averaged intervals to give you a better baseline of how busy the slave really is over time.

So first, let’s start with the procedure that generates this data:

DROP PROCEDURE IF EXISTS compute_slave_load_average;


CREATE PROCEDURE compute_slave_load_average()
    DECLARE v_ps_enabled VARCHAR(3);
    DECLARE v_update_cond_enabled VARCHAR(3);
    DECLARE v_update_cond_timed VARCHAR(3);

    DECLARE v_wait_count BIGINT DEFAULT 0;
    DECLARE v_last_wait_count BIGINT DEFAULT 0;
    DECLARE v_wait_count_delta BIGINT DEFAULT 0;

    DECLARE v_wait_sum BIGINT DEFAULT 0;
    DECLARE v_last_wait_sum BIGINT DEFAULT 0;
    DECLARE v_wait_delta BIGINT DEFAULT 0;

    DECLARE v_last_tstamp DATETIME;
    DECLARE v_wait_sum_tstamp DATETIME;
    DECLARE v_time_diff BIGINT;

    DECLARE v_current_wait VARCHAR(128);
    DECLARE v_current_timer_end BIGINT;

    DECLARE v_busy_pct DECIMAL(5,2);
    DECLARE v_one_min_avg DECIMAL(5,2);
    DECLARE v_five_min_avg DECIMAL(5,2);
    DECLARE v_fifteen_min_avg DECIMAL(5,2);

    DECLARE v_insert_id BIGINT;
    /* Disable binary logging */
    SET sql_log_bin = 0;

    /* Check Performance Schema is enabled properly */
     WHERE VARIABLE_NAME = 'performance_schema';

    SELECT enabled, timed
      INTO v_update_cond_enabled, v_update_cond_timed
      FROM performance_schema.setup_instruments
     WHERE name = 'wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond';

    IF v_ps_enabled = 'OFF' THEN
        INSERT INTO ps_helper_logs (module, message)
        VALUES ('compute_slave_load_average', 'performance_schema is disabled');
    ELSEIF v_update_cond_enabled = 'NO' OR v_update_cond_timed = 'NO' THEN
        INSERT INTO ps_helper_logs (module, message)
        VALUES ('compute_slave_load_average', 
                CONCAT('performance_schema is not configured properly, 
                        the wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond event is currently
                        set to enabled: ', v_update_cond_enabled, ', timed: ', v_update_cond_timed, 
                        'within the setup_instruments table'));
        /* Get the latest MYSQL_RELAY_LOG::update_cond wait info for the slave SQL thread */
        SELECT his.sum_timer_wait, his.count_star, cur.event_name, cur.timer_end, SYSDATE() 
          INTO v_wait_sum, v_wait_count, v_current_wait, v_current_timer_end, v_wait_sum_tstamp 
          FROM performance_schema.events_waits_summary_by_thread_by_event_name his
          JOIN performance_schema.threads thr USING (thread_id)
          JOIN performance_schema.events_waits_current cur USING (thread_id)
         WHERE his.event_name = 'wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond'
           AND name = 'thread/sql/slave_sql';

        /* Get the last summary for comparison */
        SELECT idle_sum, events_sum, tstamp 
          INTO v_last_wait_sum, v_last_wait_count, v_last_tstamp
          FROM slave_sql_load_average
         ORDER BY id DESC LIMIT 1;

        /* Compute the interval and event count deltas */
        SET v_time_diff = (UNIX_TIMESTAMP(v_wait_sum_tstamp) - UNIX_TIMESTAMP(v_last_tstamp)) * 1000000000000;
        SET v_wait_count_delta = v_wait_count - v_last_wait_count;

        /* Compute the delta busy percentages */
        IF (v_wait_sum != v_last_wait_sum AND v_wait_count_delta > 0) THEN
            /* There have been waits during the period, calculate the stats */
            SET v_wait_delta = v_wait_sum - v_last_wait_sum;

            IF (v_wait_delta > v_time_diff) THEN
                /* The last wait was longer than our current period, estimate waits in period */
                SET v_wait_delta = v_wait_delta % (v_time_diff * FLOOR(v_wait_delta/v_time_diff));
                SET v_busy_pct = 100 - ((v_wait_delta / v_time_diff) * 100);
                /* In a normal period, calculate using raw wait delta */
                SET v_busy_pct = 100 - ((v_wait_delta / v_time_diff) * 100);
            END IF;
        ELSEIF (v_current_wait = 'wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond' 
                AND v_current_timer_end IS NULL) THEN
            /* Waiting 100% on a single event for the entire period, i.e 100% idle*/
            SET v_wait_delta = v_time_diff;
            SET v_busy_pct = 0.00;
            /* Waiting 100% on a single event for the entire period that is not update_cond */
            SET v_wait_delta = v_time_diff;
            SET v_busy_pct = 100.00;
        END IF;

        /* Log the initial stats */
        INSERT INTO slave_sql_load_average 
               (idle_sum, idle_delta, idle_avg, idle_delta_formatted, 
                events_sum, events_delta, busy_pct, tstamp, current_wait)
        VALUES (v_wait_sum, v_wait_delta, format_time(v_wait_delta / v_wait_count_delta),
                format_time(v_wait_delta), v_wait_count, v_wait_count_delta, 
                v_busy_pct, v_wait_sum_tstamp, v_current_wait);

        SELECT LAST_INSERT_ID() INTO v_insert_id;

        /* Compute the averages taking the last interval in to account */
        SELECT SUM(busy_pct)/COUNT(*) INTO v_one_min_avg
          FROM slave_sql_load_average
         WHERE busy_pct IS NOT NULL 
           AND tstamp > SYSDATE() - INTERVAL 1 MINUTE;

        SELECT SUM(busy_pct)/COUNT(*) INTO v_five_min_avg
          FROM slave_sql_load_average
         WHERE busy_pct IS NOT NULL 
           AND tstamp > SYSDATE() - INTERVAL 5 MINUTE;

        SELECT SUM(busy_pct)/COUNT(*) INTO v_fifteen_min_avg
          FROM slave_sql_load_average
         WHERE busy_pct IS NOT NULL 
           AND tstamp > SYSDATE() - INTERVAL 15 MINUTE;

        UPDATE slave_sql_load_average SET 
               one_min_avg = v_one_min_avg,
               five_min_avg = v_five_min_avg,
               fifteen_min_avg = v_fifteen_min_avg
         WHERE id = v_insert_id;

        /* Purge anything older than 2 hours */            
        DELETE FROM slave_sql_load_average 
         WHERE tstamp < NOW() - INTERVAL 2 HOUR;
    END IF;

    /* Re-enable binary logging */
    SET sql_log_bin = 1;


At a high level, this gathers the wait data for the wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond event from the events_waits_summary_by_thread_by_event_name table for only the slave SQL thread, which is exposed as thread/sql/slave_sql, then grabs the last snapshot of data for comparison. It then looks at the current and previous wait info to determine how to to calculate the actual busy percentage based on how the event data has changed over the period, computes the deltas and busy percent for that last interval appropriately, and inserts that in to a table.

Finally it computes the moving averages for the 3 sets of defined intervals and updates the just inserted row with that data (so the moving average also takes in to account the current snapshot), and finally purges anything older than two hours, to store a rolling two hour window of 5 second snapshots.

It also disables binary logging (because these are local statistics that should not be propagated to downstream systems), and does some checking whether performance schema is enabled, and logs appropriately within a logging table if not.

The table structures needed for the procedure are as follows:

DROP TABLE IF EXISTS slave_sql_load_average;

CREATE TABLE slave_sql_load_average (
  tstamp TIMESTAMP,
  idle_avg VARCHAR(12),
  idle_delta_formatted VARCHAR(12),
  busy_pct DECIMAL(5,2),
  one_min_avg DECIMAL(5,2),
  five_min_avg DECIMAL(5,2),
  fifteen_min_avg DECIMAL(5,2),
  idle_sum BIGINT,
  idle_delta BIGINT,
  events_sum INT,
  events_delta INT,
  current_wait VARCHAR(128),
  KEY (tstamp)
) ENGINE = InnoDB;

DROP TABLE IF EXISTS ps_helper_logs;

CREATE TABLE ps_helper_logs (
  tstamp TIMESTAMP,
  module VARCHAR(64),
  message TEXT
) ENGINE = InnoDB;

The full output from the table looks like this:

mysql> select * from slave_sql_load_average order by tstamp desc limit 5;
| id  | tstamp              | idle_avg  | idle_delta_formatted | busy_pct | one_min_avg | five_min_avg | fifteen_min_avg | idle_sum           | idle_delta    | events_sum | events_delta | current_wait                                     |
| 106 | 2012-07-24 13:20:34 | 491.11 ms | 982.22 ms            |    80.36 |       67.07 |        65.29 |           65.98 | 689661649765993320 |  982222907152 |     605561 |            2 | wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond |
| 105 | 2012-07-24 13:20:29 | 500.80 ms | 1.00 s               |    79.97 |       67.07 |        65.28 |           65.84 | 689655667543086168 | 1001609993416 |     605559 |            2 | wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond |
| 104 | 2012-07-24 13:20:24 | 1.50 s    | 3.01 s               |    39.90 |       67.05 |        65.28 |           65.70 | 689649665933092752 | 3005124404328 |     605557 |            2 | wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond |
| 103 | 2012-07-24 13:20:19 | 331.17 ms | 993.52 ms            |    80.13 |       67.07 |        65.28 |           65.96 | 689646660808688424 |  993523754020 |     605555 |            3 | wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond |
| 102 | 2012-07-24 13:20:14 | 192.27 ms | 769.07 ms            |    84.62 |       67.04 |        65.29 |           65.82 | 689640667284934404 |  769065072628 |     605552 |            4 | wait/synch/cond/sql/MYSQL_RELAY_LOG::update_cond |

And the event that kicks all of this off:

DROP EVENT IF EXISTS monitor_slave_load_average;


CREATE EVENT IF NOT EXISTS monitor_slave_load_average 
    CALL compute_slave_load_average();


You’ll also need the format_time() function from ps_helper.

Here’s a graph of the data over time, including a period where the slave IO thread was turned off, where you can see the moving average windows decay more slowly than the last intervals statistics:

Finally, some notes on this method:

  • Importantly, this doesn’t require any knowledge of timestamps across systems, or require any synchronization for collection across systems
  • Yes this is not exactly like the Linux load average – which uses an Exponential Moving Average, which weights snapshots differently the older they get, but that would also be possible
  • The time delta recorded within 5.5 is only possible at a 1 second granularity, which makes us lose quite a bit of precision when compared to picosecond precision wait information, within 5.6 we can use microsecond precision with SYSDATE(6), and get even more accurate reporting – and note, SYSDATE() is used instead of NOW(), as within procedures NOW() returns the timestamp for when the procedure started, not when the function is called
  • The first 15 minutes worth of aggregate data should be taken with a pinch of salt, wait for the data set to “warm up”
  • This doesn’t necessarily negate the need for the other statistics described above – which can all be used to trace where latency might be, and in the case of Seconds_Behind_Master or a heartbeating implementation can still give information on how far behind you are getting, once your slave does really start to lag, as well as using something like Jeremy’s velocity metric (linked via is name above), which can give an idea of the general direction of of a slave once it’s started lagging.

With all that said though, I think this is the *key* metric that anybody should be tracking first and foremost. It is far better to fix problems or spread load before you start lagging.

The joys of digging through the different events in Performance Schema to see what they actually record! There’s gold in there still to be mined, but I think this one is a genuine 14 oz nugget.

Monitoring Processes with Performance Schema in MySQL 5.6

I’ve written before about how you can monitor the last wait per connection with Performance Schema in MySQL 5.5.

That solution joined the performance_schema.events_waits_current with the performance_schema.threads and INFORMATION_SCHEMA.PROCESSLIST tables.

Joining to INFORMATION_SCHEMA.PROCESSLIST is a necessary evil within MySQL 5.5, to be able to get the user and host information etc. – however it is certainly not optimal, because using INFORMATION_SCHEMA.PROCESSLIST (and it’s SHOW [FULL] PROCESSLIST cousin) incurs a bunch of mutexes to be locked – the fill_schema_processlist() function attempts to lock LOCK_thread_count globally (which will block new connections whilst the entire processlist output is generated), as well as tmp->LOCK_thd_data per thread that it iterates over (which could in turn stall each thread if they try to access this) and mysys_var->mutex which protects the mysys_var struct (used around killing connections).

If you are polling the processlist output frequently (I’ve seen some solutions out there poll it once a second, or even lower, sometimes), then the chances are, you are stalling real application connections as well. This is why we don’t use these commands within MySQL Enterprise Monitor for things like Query Analysis.

With that in mind, we have now extended the performance_schema.threads table in MySQL 5.6 to include all of the information that is available within the INFORMATION_SCHEMA.PROCESSLIST table:

mysql> desc performance_schema.threads;
| Field               | Type             | Null | Key | Default | Extra |
| THREAD_ID           | int(11)          | NO   |     | NULL    |       |
| NAME                | varchar(128)     | NO   |     | NULL    |       |
| TYPE                | varchar(10)      | NO   |     | NULL    |       |
| PROCESSLIST_ID      | int(11)          | YES  |     | NULL    |       |
| PROCESSLIST_USER    | varchar(16)      | YES  |     | NULL    |       |
| PROCESSLIST_HOST    | varchar(60)      | YES  |     | NULL    |       |
| PROCESSLIST_DB      | varchar(64)      | YES  |     | NULL    |       |
| PROCESSLIST_COMMAND | varchar(16)      | YES  |     | NULL    |       |
| PROCESSLIST_TIME    | bigint(20)       | YES  |     | NULL    |       |
| PROCESSLIST_STATE   | varchar(64)      | YES  |     | NULL    |       |
| PROCESSLIST_INFO    | longtext         | YES  |     | NULL    |       |
| PARENT_THREAD_ID    | int(11)          | YES  |     | NULL    |       |
| ROLE                | varchar(64)      | YES  |     | NULL    |       |
| INSTRUMENTED        | enum('YES','NO') | NO   |     | NULL    |       |
14 rows in set (0.02 sec)

As with all other instrumentation within Performance Schema, this is entirely non-blocking. When you upgrade to MySQL 5.6, it’s time to start forgetting about SHOW PROCESSLIST, or INFORMATION_SCHEMA.PROCESSLIST. They’ve served us well for a long time, but it’s time to wave goodbye to their contentious ways.

There’s another bonus in the threads table within MySQL 5.6 as well, and that is the newly added INSTRUMENTED column. This allows you turn on and off instrumentation, per thread, dynamically. You do this by simply running an UPDATE:

mysql> select name, type, instrumented from threads;
| name                                   | type       | instrumented |
| thread/sql/main                        | BACKGROUND | YES          |
| thread/innodb/io_handler_thread        | BACKGROUND | YES          |
| thread/innodb/io_handler_thread        | BACKGROUND | YES          |
| thread/innodb/io_handler_thread        | BACKGROUND | YES          |
| thread/innodb/io_handler_thread        | BACKGROUND | YES          |
| thread/innodb/io_handler_thread        | BACKGROUND | YES          |
| thread/innodb/io_handler_thread        | BACKGROUND | YES          |
| thread/innodb/io_handler_thread        | BACKGROUND | YES          |
| thread/innodb/io_handler_thread        | BACKGROUND | YES          |
| thread/innodb/io_handler_thread        | BACKGROUND | YES          |
| thread/innodb/io_handler_thread        | BACKGROUND | YES          |
| thread/innodb/srv_lock_timeout_thread  | BACKGROUND | YES          |
| thread/innodb/srv_error_monitor_thread | BACKGROUND | YES          |
| thread/innodb/srv_monitor_thread       | BACKGROUND | YES          |
| thread/innodb/srv_master_thread        | BACKGROUND | YES          |
| thread/innodb/srv_purge_thread         | BACKGROUND | YES          |
| thread/innodb/page_cleaner_thread      | BACKGROUND | YES          |
| thread/sql/shutdown                    | BACKGROUND | YES          |
| thread/sql/manager                     | BACKGROUND | YES          |
| thread/sql/con_sockets                 | BACKGROUND | YES          |
| thread/sql/one_connection              | FOREGROUND | YES          |
| thread/sql/one_connection              | FOREGROUND | YES          |
| thread/sql/one_connection              | FOREGROUND | YES          |
| thread/sql/one_connection              | FOREGROUND | YES          |
| thread/sql/one_connection              | FOREGROUND | YES          |
| thread/sql/one_connection              | FOREGROUND | YES          |
| thread/sql/one_connection              | FOREGROUND | YES          |
27 rows in set (0.00 sec)

mysql> update threads set instrumented = 'NO' where name != 'thread/sql/one_connection';
Query OK, 20 rows affected (0.00 sec)
Rows matched: 20  Changed: 20  Warnings: 0

mysql> select name, type, instrumented from threads;
| name                                   | type       | instrumented |
| thread/sql/main                        | BACKGROUND | NO           |
| thread/innodb/io_handler_thread        | BACKGROUND | NO           |
| thread/innodb/io_handler_thread        | BACKGROUND | NO           |
| thread/innodb/io_handler_thread        | BACKGROUND | NO           |
| thread/innodb/io_handler_thread        | BACKGROUND | NO           |
| thread/innodb/io_handler_thread        | BACKGROUND | NO           |
| thread/innodb/io_handler_thread        | BACKGROUND | NO           |
| thread/innodb/io_handler_thread        | BACKGROUND | NO           |
| thread/innodb/io_handler_thread        | BACKGROUND | NO           |
| thread/innodb/io_handler_thread        | BACKGROUND | NO           |
| thread/innodb/io_handler_thread        | BACKGROUND | NO           |
| thread/innodb/srv_lock_timeout_thread  | BACKGROUND | NO           |
| thread/innodb/srv_error_monitor_thread | BACKGROUND | NO           |
| thread/innodb/srv_monitor_thread       | BACKGROUND | NO           |
| thread/innodb/srv_master_thread        | BACKGROUND | NO           |
| thread/innodb/srv_purge_thread         | BACKGROUND | NO           |
| thread/innodb/page_cleaner_thread      | BACKGROUND | NO           |
| thread/sql/shutdown                    | BACKGROUND | NO           |
| thread/sql/manager                     | BACKGROUND | NO           |
| thread/sql/con_sockets                 | BACKGROUND | NO           |
| thread/sql/one_connection              | FOREGROUND | YES          |
| thread/sql/one_connection              | FOREGROUND | YES          |
| thread/sql/one_connection              | FOREGROUND | YES          |
| thread/sql/one_connection              | FOREGROUND | YES          |
| thread/sql/one_connection              | FOREGROUND | YES          |
25 rows in set (0.00 sec)

This then enables/disables all instrumentation within Performance Schema for each thread appropriately.

Now we have all of this data directly within the threads table, we can also start joining it to some of the other tables within Performance Schema as well, to get even more data on what threads are currently doing within the database. Particularly, we can join to the new events_statements_current table as well, to get extra information on statement execution (you will need to ensure that the events_statements_current consumer is enabled within the setup_consumers table).

Here’s an example of another view I’m about to add to ps_helper:

DROP VIEW IF EXISTS processlist_full;

CREATE VIEW processlist_full AS
SELECT pps.thread_id AS thd_id,
       pps.processlist_id AS conn_id,
       IF( = 'thread/sql/one_connection', 
          CONCAT(pps.processlist_user, '@', pps.processlist_host), 
          REPLACE(, 'thread/', '')) user,
       pps.processlist_db AS db,
       pps.processlist_command AS command,
       pps.processlist_state AS state,
       pps.processlist_time AS time,
       format_statement(pps.processlist_info) AS current_statement,
       IF(esc.timer_wait IS NOT NULL,
          NULL) AS last_statement,
       IF(esc.timer_wait IS NOT NULL,
          NULL) as last_statement_latency,
       format_time(esc.lock_time) AS lock_latency,
       esc.created_tmp_tables AS tmp_tables,
       esc.created_tmp_disk_tables as tmp_disk_tables,
       IF(esc.no_good_index_used > 0 OR esc.no_index_used > 0, 
          'YES', 'NO') AS full_scan,
       ewc.event_name AS last_wait,
       IF(ewc.timer_wait IS NULL AND ewc.event_name IS NOT NULL, 
          'Still Waiting', 
          format_time(ewc.timer_wait)) last_wait_latency,
  FROM performance_schema.threads AS pps
  LEFT JOIN performance_schema.events_waits_current AS ewc USING (thread_id)
  LEFT JOIN performance_schema.events_statements_current as esc USING (thread_id)
ORDER BY pps.processlist_time DESC, last_wait_latency DESC;

And some sample output:

*************************** 3. row ***************************
                thd_id: 11617
               conn_id: 11596
                  user: root@localhost
                    db: mysql
               command: Sleep
                 state: cleaning up
                  time: 0
     current_statement: NULL
        last_statement: SELECT CAST(SUM_NUMBER_OF_BYTE ... nnodb/innodb_log_file' limit 1
last_statement_latency: 2.02 ms
          lock_latency: 0 ps
         rows_examined: 39
             rows_sent: 0
         rows_affected: 0
            tmp_tables: 0
       tmp_disk_tables: 0
             full_scan: YES
             last_wait: idle
     last_wait_latency: Still Waiting
*************************** 4. row ***************************
                thd_id: 12474
               conn_id: 12453
                  user: root@localhost
                    db: mem
               command: Sleep
                 state: cleaning up
                  time: 0
     current_statement: NULL
        last_statement: commit
last_statement_latency: 952.49 µs
          lock_latency: 0 ps
         rows_examined: 0
             rows_sent: 0
         rows_affected: 0
            tmp_tables: 0
       tmp_disk_tables: 0
             full_scan: NO
             last_wait: idle
     last_wait_latency: Still Waiting
*************************** 5. row ***************************
                thd_id: 12400
               conn_id: 12379
                  user: root@localhost
                    db: ps_helper
               command: Query
                 state: Copying to tmp table
                  time: 0
     current_statement: SELECT pps.thread_id AS thd_id ... e DESC, last_wait_latency DESC
        last_statement: NULL
last_statement_latency: NULL
          lock_latency: 1.00 ms
         rows_examined: 0
             rows_sent: 0
         rows_affected: 0
            tmp_tables: 1
       tmp_disk_tables: 0
             full_scan: YES
             last_wait: wait/synch/mutex/sql/THD::LOCK_thd_data
     last_wait_latency: 77.11 ns
                source: sql_class.h:3843

Listed here are two idle connections, and one active connection (which was the one gathering the data).

The events_statements_current table stores the currently executing, or last executed (if no currently executing), statement per thread. This means that this new form of processlist can not only tell you what is currently executing, but also what last executed for sleeping threads.

All of the statistics listed, such as lock latency, row stats, temporary table stats are for either the current or last statement too. So with “conn_id: 12453″ we can see that it last executed a COMMIT (in the last_statement column), which took 952.49 µs (in the last_statement_latency column).

I can’t count how many times I wanted to know the last statement that executed for sleeping connections (particularly those holding open a transaction), nor how many times customers and users asked that question whilst I was in the MySQL Support group, so this is a great step forward.

For idle connections, we now have a new “idle” wait type, which records how long sessions sit idle for as well. For sleeping connections this will always be the current wait (unless you disable the idle instrument). However for active connections, we show the last wait that happened within that statement’s execution.

You can see that the last session listed is copying to a temporary table, had lock latency of 1 ms, and last waited on the wait/synch/mutex/sql/THD::LOCK_thd_data event, which basically corresponds to the LOCK_thd_data mutex I was talking about above.

But hang on? I said above that SHOW PROCESSLIST etc. lock this in a blocking way as well right? And this is also grabbing this mutex? What gives? Well, performance_schema is true table storage engine, and the statement executed is still grabbing a read lock against the performance schema table. The statement itself is also creating a temporary table, doing it’s own thing within it’s own thread as well, and whilst doing that, threads tend to grab their own LOCK_thd_data mutex (there is one per thread), to protect access from other threads – for instance, we LOCK_thd_data to set the current query string, which protects SHOW FULL PROCESSLIST from reading a half written query string for it’s INFO. That doesn’t change for Performance Schema access – but it does help to show just the kind of contention that SHOW FULL PROCESSLIST can cause!

The above view only gets a subset of the available data per statement available within the events_statements_current table as well – I’ve omitted all of the sort / scan information for instance. You can add these to the view as you see fit.

I’ll be adding some views like this to ps_helper shortly. If you have any feedback in the meantime feel free to comment!