Arjen Lentz (“Community Relations Manager” for MySQL AB) posted “The MySQL 5.0 Beta Challenge” on his blog today – so in the spirit of community I decided to give it a crack, by putting together some posts on the new features of MySQL 5.0 up here to my humble blog. Who knows – I may win a mug! Then I can have a list of ALL the people to bug over at MySQL AB!
I’m a DBA at heart, so whilst I’ll take a look at the new features that are available for developers, I’m going to start from the DBA tack, and take a look at INFORMATION_SCHEMA.
So what is the INFORMATION_SCHEMA? Those that have used other RDBMS’ in the past will recognize it as MySQL’s “Data Dictionary”. An RDBMS Data Dictionary is a collection of tables or views (that appear as tables), that utilize the database metadata to give easy access to information about the objects (tables, indexes, procedures, functions, views etc.) within the database, as well as information on things such security/privileges, performance statistics, auditing trails etc.
On installing MySQL 5.0, and issuing “
SHOW DATABASES;” you will immediately see “information_schema” listed. The INFORMATION_SCHEMA is implemented to look like a database in it’s own right. However, it doesn’t conform with the conventional database privilege system. You can “USE” it, but you can’t run any type of DML against it. GRANT does seem to work against it, but attempting to run a DELETE results in an error.
mysql> use information_schema; Database changed mysql> show grants; +------------------------------ | Grants for mark@localhost | +------------------------------ | GRANT USAGE ON *.* TO 'mark'@'localhost' | | GRANT SELECT, DELETE ON `information_schema`.* TO 'mark'@'localhost' | +------------------------------ 2 rows in set (0.00 sec) mysql> delete from information_schema.tables; ERROR 1288 (HY000): The target table tables of the DELETE is not updatable
“What good are they to me?”
Well, one question I’ve seen a lot around IRC is “Can I use the output of a SHOW command in a statement, such as a subquery?”. This mostly comes down to something to do with
SHOW TABLES – getting databases sizes, table sizes etc.
SHOW TABLES still works in 5.0, as do all other SHOW commands, however, INFORMATION_SCHEMA.TABLES now also parses the exact same output. Think of it of as view (more on these later) against the
SHOW TABLES command. The great thing about this is we can now use conventional SQL to use these values – we no longer have to write external scripts to do what SUM() can do, for example. So here’s one for those guys on IRC that want to find out how big their databases are from SQL:
SELECT s.schema_name, CONCAT(IFNULL(ROUND((SUM(t.data_length)+SUM(t.index_length))/1024/1024,2),0.00),'Mb') total_size, CONCAT(IFNULL(ROUND(((SUM(t.data_length)+SUM(t.index_length))-SUM(t.data_free))/1024/1024,2),0.00),'Mb') data_used, CONCAT(IFNULL(ROUND(SUM(data_free)/1024/1024,2),0.00),'Mb') data_free, IFNULL(ROUND((((SUM(t.data_length)+SUM(t.index_length))-SUM(t.data_free))/((SUM(t.data_length)+SUM(t.index_length)))*100),2),0) pct_used, COUNT(table_name) total_tables FROM information_schema.schemata s LEFT JOIN information_schema.tables t ON s.schema_name = t.table_schema WHERE s.schema_name != 'information_schema' GROUP BY s.schema_name ORDER BY pct_used DESCG *****************1. row ************** schema_name: test total_size: 0.06Mb data_used: 0.06Mb data_free: 0.00Mb pct_used: 100.00 total_tables: 2 *****************2. row ************** schema_name: flightstats total_size: 2950.14Mb data_used: 2949.71Mb data_free: 0.43Mb pct_used: 99.99 total_tables: 81 ...
So now we can get the information we want, in the format we want. We can also JOIN the INFORMATION_SCHEMA tables to create better reports – such as above, if there are no actual tables in a database, they won’t be shown within INFORMATION_SCHEMA.TABLES, so to get a full report on all databases, we LEFT JOIN from INFORMATION_SCHEMA.SCHEMATA.
This also leads us in to getting helpful information on any object optimizations easily. Perhaps you have performed an archive of a lot of data, and want to reclaim freespace. You want to find all tables with > 40% of the datafile free, so that you can perform an
OPTIMIZE TABLE against them:
SELECT table_name, ROUND(((t.data_length+t.index_length)-t.data_free)/(t.data_length+t.index_length)*100) pct_used FROM information_schema.tables t WHERE table_schema = 'flightstats' HAVING pct_used < 60 ORDER BY table_name;</blockquote>
Let’s take this one step further – now we can actually use MySQL to generate the scripts that we’ve had to use an external scripting language to achieve. Let’s focus on another question often asked on IRC – “How do I drop all my tables, without dropping my database?” – previous to MySQL 5.0 the answer is “get table_name from the output of SHOW TABLES, and loop through this with your scripting language of choice, running a DROP TABLE command for each one”.
Now let’s use the INFORMATION_SCHEMA, and SQL to do the job for us:
SELECT CONCAT('DROP TABLE ',table_name,';') INTO OUTFILE '/sql/drop_tables.sql' FROM information_schema.tables WHERE table_schema = 'test';
Now all we have to do is “SOURCE /sql/drop_tables.sql;” and all of the tables in the test database will be gone, and we didn’t have to leave the mysql command line interface! Don’t try this on your production schemas kids!
Or to go back to our OPTIMIZE TABLE:
SELECT CONCAT('OPTIMIZE TABLE ',table_name,';') INTO OUTFILE '/sql/optimize_tables.sql' FROM information_schema.tables WHERE table_schema = 'flightstats' AND ROUND(((data_length+index_length)-data_free)/(data_length+index_length)*100) < 60;<
This makes life a lot easier! There are a number of other INFORMATION_SCHEMA tables, but I won’t go in to all of those now. We can use these to get information on grants, build grant scripts, get information about indexes, columns, foreign keys etc. For instance – it’s probably possible now to perform “fuzzy” foreign key matches – searching for columns that are named the same within different tables, that may not necessarily be the InnoDB engine. As I put together more scripts in readiness for 5.0, I’ll post them here, and to the other site that I run – SQLDump.co.uk (which I haven’t really started filling yet!! But it’s wiki! )
It should be noted that all users have access to the INFORMATION_SCHEMA. They will however only be able to see the objects for which they have some privileges on.