Google Patents

So I came across Google Patents the other day and thought it was a pretty cool idea! Searching to find out whether a patent already exists has been a pretty hard job in the past, so having a readily available resource such as Google is a fantastic step forward.

Especially in light of the recent Microsoft ‘FUD’.

Here’s a patent I came across whilst browsing around the various database related patents that exist out there:

http://www.google.com/patents?id=zLsYAAAAEBAJ&dq

Electronic shopping and merchandising system

Abstract
The present invention provides a merchant system for online shopping and merchandising. The merchant system architecture provides great flexibility for a merchant to adapt the merchant system to their existing business practices, promotions and databases. The merchant system includes a dynamic page generator, a configurable order processing module and a database module capable of retrieving data from the database without regard to its schema. The present invention enables merchants to create electronic orders which are easily adaptable for different sales situations. The order processing module includes multiple configurable stages to process a merchant’s electronic orders. The merchant system is capable of generating pages dynamically using templates having embedded directives. The database module and the dynamic page generator allow merchants to modify their databases and page displays without having to reengineer the merchant system.

Wow, it looks to me like Microsoft patented the online shopping cart! There are a lot of interesting patents that Microsoft seem to have taken out. Google makes searching for these relatively easy.

There are some other great patents out there too:

The OUIJA-BOARD
The FLYING-MACHINE

And every geek’s favourite!

The POCKET PROTECTOR

Aggregating SHOW STATUS Output

When I first joined MySQL, one of the things that was evident was that the Support Engineers spent quite some time on customer issues focused on performance tuning. Performance tuning issues generally start with an engineer requesting a bunch of information from the customer, such as:

  • SHOW GLOBAL VARIABLES
  • SHOW GLOBAL STATUS (a number of times, to give us some rate information)
  • SHOW FULL PROCESSLIST
  • SHOW INNODB STATUS (if InnoDB is being used heavily)
  • vmstat output for a number of short periods
  • iostat -dx output for a number of short periods

A lot of this output is pretty easy to go through when you know what you are looking for. However, where we spend a lot of our time is looking through the SHOW GLOBAL STATUS output – trying to piece together the rates of change and so on, so that we can get more insight into what is hurting the database.
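
To give a feel for the arithmetic involved, here's a minimal sketch (my own illustration, not StatPack itself) of deriving deltas and per-second rates from two SHOW GLOBAL STATUS snapshots taken a known number of seconds apart – the example numbers are lifted from the Questions line of the sample report further down:

# Illustration only: compute the delta and per-second rate for each counter
# between two SHOW GLOBAL STATUS snapshots, given the seconds between them.
def rates(first, second, interval_seconds):
    """first/second are dicts of {variable_name: value}."""
    report = {}
    for name, end_value in second.items():
        try:
            delta = int(end_value) - int(first.get(name, 0))
        except ValueError:
            continue  # skip non-numeric status variables
        report[name] = (delta, delta / float(interval_seconds))
    return report

# Questions went from 158,205,328 to 159,441,275 over a ~279 second period
snap1 = {'Questions': '158205328'}
snap2 = {'Questions': '159441275'}
for name, (delta, per_sec) in rates(snap1, snap2, 279).items():
    print("%s: delta = %d, per second = %.2f" % (name, delta, per_sec))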

I had also never used Python – and wanted to give it a try – so some time early last year I said, “One of these days I’m going to create a Python script that aggregates all of this stuff for us“. Well, towards the end of last year (with some extra bits added early this year), I finally took the plunge and created such a script. Now, I know there are already some scripts out there from other people that do some of this (such as the pretty good one put together over on hackmysql.com) – however, those scripts all connect to a running database and gather the stats to aggregate that way. That really didn’t help us in the Support Group, as we generally need customers to send us their stats in files (although sometimes we do connect and troubleshoot that way). It also didn’t serve the purpose of my wanting to try Python :)

I’ve now reached a point with this script where I’m ready to push it out for the world to use as they wish, and to ask for feedback on it. The current version I’m releasing here is actually a little behind – I’m also working on an ‘interactive’ version (more like mysqlreport et al.), but more on that later.

Here’s a sample of what the report spits out now:

[markleith@medusa:~/Development] $ ./statpack.py statsfiles/mysql_collect_200701071916.txt statsfiles/mysql_collect_200701071921.txt

Number of Snapshots: 2

====================================================================================================
Uptime: 12 hours 29 min Snapshot Period 1: 4 minute interval
====================================================================================================
Variable Delta/Percentage Per Second Total
====================================================================================================

=========================================Database Activity==========================================

Connections: 12,193 43.70 1,496,977
Threads Connected: 9 432
Threads Running: 26 284
Questions: 1,235,947 4,429.92 159,441,275
Bytes Received: 771,889,505 2,766,629.05 99,484,958,519
Bytes Sent: 4,531,361,061 16,241,437.49 575,614,342,973
Aborted Clients: 0 0.00 368
Aborted Connects: 0 0.00 1

=========================================Statement Activity=========================================

SELECT: 197,280 707.10 26,224,157
INSERT: 13,061 46.81 1,568,744
UPDATE: 14,932 53.52 1,774,077
DELETE: 12,848 46.05 1,585,598
REPLACE: 0 0.00 0
INSERT ... SELECT: 0 0.00 0
REPLACE ... SELECT: 0 0.00 0
Multi UPDATE: 0 0.00 0
Multi DELETE: 0 0.00 0
COMMIT: 21 0.08 2,459
ROLLBACK: 4 0.01 447

========================================Prepared Statements=========================================

Prepared Statement Count: Status variable not available in your MySQL version
PREPARE: 0 0.00 0
EXECUTE: 0 0.00 0
DEALLOCATE PREPARE: 0 0.00 0
Fetch Roundtrips: 0 0.00 0
Send Long Data: 0 0.00 0

===========================================Admin Commands===========================================

KILL: 0 0.00 0
FLUSH: 0 0.00 0
ANALYZE TABLE: 0 0.00 0
OPTIMIZE TABLE: 0 0.00 0
CHECK TABLE: 0 0.00 0
REPAIR TABLE: 0 0.00 0

==========================================Temporary Space===========================================

tmp_table_size Efficiency: 11.64%
Memory Temp Tables: 6,599 23.65 978,895
Disk Temp Tables: 5,912 21.19 864,973
Temp Files: 0 0.00 5

==========================================MyISAM Key Cache==========================================

Cache Read Efficiency: 99.98%
Cache Write Efficiency: 51.94%
Memory Used: 64Kb 57Mb
Memory Free: -64Kb 780Mb
Key Reads: 76 0.27 60,426
Key Read Requests: 2,241,530 8,034.16 283,675,567
Key Writes: 5,399 19.35 863,132
Key Write Requests: 15,455 55.39 1,795,774
Blocks Not Flushed: 12 0.04 15

============================================Table Cache=============================================

table_cache Efficiency: 99.49%
Open Tables: 0 0.00 1,357
Opened Tables: 0 0.00 1,363

=========================================InnoDB Buffer Pool=========================================

Buffer Pool Read Efficiency: 100.00%
Memory Total: 0Kb 1,024Mb
Memory Free: -1,280Kb 984Mb
Memory Data: 1,264Kb 38Mb
Memory Dirty: -592Kb 0Mb
Data Read: 1,232Kb 37Mb
Data Written: 16,665Kb 1,793Mb
Buffer Pool Reads: 76 0.27 1,178
Buffer Pool Read Requests: 460,617 1,650.96 66,374,804
Buffer Pool Write Requests: 1,584 5.68 183,697

==========================================InnoDB Log Files==========================================

Log Data Written: 114Kb 16Mb
Log Writes: 57 0.20 7,124
Log Write Requests: 182 0.65 32,096
Log Waits: 0 0.00 0

============================================Query Cache=============================================

QCache Hits / SELECT: 81.76%
QCache Hit/Qcache Insert: 83.71%
Qcache Hits/Invalidations: 23.00%
SELECTs: 197,280 707.10 26,224,157
Query Cache Hits: 913,101 3,272.76 117,563,266
Query Cache Inserts: 164,677 590.24 22,869,761
Queries Not Cached: 32,566 116.72 3,356,727
Cache Low Memory Prunes: 151,782 544.02 19,606,315
Total Cache Blocks: -1,126 -4.04 301,180
Queries In Cache: -280 -1.00 123,023
Cache Free Blocks: -539 -1.93 47,065

============================================Index Usage=============================================

Index Efficiency: 90.96%
Full Index Scans: 7,652 27.43 875,018
Full Table Scans: 30 0.11 4,289
Full Join Scans: 0 0.00 0
Handler_read_first: 7,652 27.43 875,018 (0.21%)
Handler_read_key: 841,736 3,016.97 116,933,614 (27.52%)
Handler_read_next: 2,569,564 9,209.91 268,699,389 (63.24%)
Handler_read_prev: 0 0.00 0 (0.00%)
Handler_read_rnd: 97,563 349.69 13,331,811 (3.14%)
Handler_read_rnd_next: 43,357 155.40 25,075,543 (5.90%)

==========================================Lock Contention===========================================

Percent of Locks Waited: 13.08%
Table Locks Waited: 47,482 170.19 4,394,859
Table Locks Immediate: 209,882 752.27 29,201,590

==============================================Sorting===============================================

Rows Sorted: 90,014 322.63 12,448,923
Sort Range: 32,914 117.97 4,372,453
Sort Scan: 6,279 22.51 924,212
Sort Merge Passes: 0 0.00 0
Full Range Joins: 0 0.00 0

Input files can contain many snapshots of SHOW STATUS within the same file, and you can also pass in any number of files in chronological order (or a mixture of the two). StatPack also understands status files with both “batch” and “non-batch” outputs. That is to say, it can currently parse files in both of the following formats:

[markleith@medusa:~/Development] $ mysql -u root -e "SHOW STATUS"
+-----------------------------------+----------+
| Variable_name | Value |
+-----------------------------------+----------+
| Aborted_clients | 16 |
| Aborted_connects | 0 |
| Binlog_cache_disk_use | 0 |
...

[markleith@medusa:~/Development] $ mysql -u root -B -e "SHOW STATUS"
Variable_name Value
Aborted_clients 16
Aborted_connects 0
Binlog_cache_disk_use 0
...

Currently it does not support the “\G” output.
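
For anyone curious how the two formats can be handled, here's a rough sketch of a parser that copes with both the table-style and the tab-separated output for a single snapshot – this is an assumption of mine about one way to do it, not a copy of StatPack's internals (which also have to split multiple snapshots out of one file):

# Rough sketch (not StatPack's actual parser): read one SHOW STATUS snapshot
# from a file in either 'batch' (tab separated) or 'non-batch' (| name | value |)
# format and return it as a dict.
def parse_show_status(path):
    status = {}
    for line in open(path):
        line = line.strip()
        if not line or line.startswith('+'):
            continue  # skip blank lines and +----+ separator rows
        if line.startswith('|'):
            parts = [p.strip() for p in line.strip('|').split('|')]
        else:
            parts = line.split()
        if len(parts) >= 2 and parts[0] != 'Variable_name':
            status[parts[0]] = parts[1]
    return status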

There will be one report for each period reported within the “Number of Snapshots” section (less one, as we initially need two snapshots to start aggregating). That is, if it reports 4 snapshots, you will get 3 reports based on the differences between each consecutive pair.
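
In other words, the snapshots are diffed pairwise. Reusing the rates() helper sketched earlier, that boils down to something like the following – and one reasonable way to get the length of each period (though I'm only assuming this is how it could be done, not how StatPack does it) is from the change in the Uptime status variable, which is measured in seconds:

# Illustration: N snapshots yield N - 1 reports, one per consecutive pair.
def build_reports(snapshots):
    reports = []
    for previous, current in zip(snapshots, snapshots[1:]):
        period = int(current['Uptime']) - int(previous['Uptime'])
        reports.append(rates(previous, current, period))
    return reports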

Now, as I’ve learned a little more about Python, I’ve realized I’ve done some things that could…. be done better… :) I’m working on ‘Version 2′, which changes a fair bit of the way things are done, and also adds the ability to connect to a running server to generate a report from. Currently this is what it supports:

[markleith@medusa:~/Development] $ ./statpack-interactive.py

Error: No arguments supplied
usage: ./statpack-interactive.py [list of arguments]

Non-interactive mode (aggregate txt files containing SHOW STATUS snapshots):

-f --files List of statistics files to aggregate (--files=stat1.txt[, stat2.txt..])
Argument must be first within list of arguments, on its own

Interactive mode (connect to running MySQL server for SHOW STATUS snapshots):

-h --host Host for MySQL server to connect to in interactive mode
-P --port Port the MySQL server is running on
-S --socket UNIX domain socket to use (instead of -h and -P)
-u --user User to connect to the MySQL server with
-p --password Password for the user
-d --defaults-file Defaults file to read options from (reads [client] group by default)
-i --interval Length of each collection interval
-c --report-count Count of intervals to collect and aggregate
-r --report-file File to output report data to
-s --status-file File to output raw SHOW STATUS data to
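
The interactive mode essentially just automates taking the snapshots. As a rough idea of what that involves – my own sketch using the MySQLdb module, not the actual Version 2 code – collecting SHOW GLOBAL STATUS from a live server at a fixed interval looks something like this:

import time
import MySQLdb  # assumes the MySQLdb (MySQL-python) module is available

# Sketch only: take `count` SHOW GLOBAL STATUS snapshots, `interval` seconds
# apart, returning a list of {variable_name: value} dicts.
def collect_snapshots(interval, count, **connect_args):
    conn = MySQLdb.connect(**connect_args)
    cursor = conn.cursor()
    snapshots = []
    for i in range(count):
        cursor.execute("SHOW GLOBAL STATUS")
        snapshots.append(dict(cursor.fetchall()))
        if i < count - 1:
            time.sleep(interval)
    cursor.close()
    conn.close()
    return snapshots

# e.g. two snapshots, 60 seconds apart:
# snaps = collect_snapshots(60, 2, host='localhost', user='root', passwd='secret')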

I’m still ironing out a few things, but wanted to get this last version out for any feedback that anybody would like to provide before I release the next one! You can download the ‘current’ version from http://markleith.co.uk/dl/statpack-v1.tar.gz.

Please do provide feedback!