Channel: Shinguz's blog

MySQL Vala Program Example


Summary: In this article we take a short look at a simple MySQL example program written in Vala.

Recently a customer pointed me to a programming language called Vala. Vala is a C-style programming language that generates C code, which can then be compiled and linked with the normal gcc.
I found this pretty useful: it lets me avoid messing around with pointers and all that stuff in C while still being able to write C programs for some projects I have had in mind for a long time.
Vala is mostly used around the Gnome Project.

Vala

Vala declares itself as:

Vala is a new programming language that allows modern programming techniques to be used to write applications ... Before Vala, the only ways to program for the platform were with the machine native C API, which exposes a lot of often unwanted detail, with a high level language that has an attendant virtual machine, such as Python or the Mono C# language, or alternatively, with C++ through a wrapper library.
Vala is different from all these other techniques, as it outputs C code which can be compiled to run with no extra library support beyond the GNOME platform. This has several consequences, but most importantly:

  • Programs written in Vala should have broadly similar performance to those written directly in C, whilst being easier and faster to write and maintain.
  • A Vala application can do nothing that a C equivalent cannot. Whilst Vala introduces a lot of language features that are not available in C, these are all mapped to C constructs, although they are often ones that are difficult or too time consuming to write directly.
  • As such, whilst Vala is a modern language with all of the features you would expect, it gains its power from an existing platform, and must in some ways comply with the rules set down by it.
[ 1 ]

MySQL Example

After playing around with some simple programs like hello.vala I wanted to write my own little MySQL program in Vala. But I did not find a single complete runnable example for MySQL. :-(
Some Vala documentation about MySQL is available here.
After searching around for a while and losing a lot of time I finally figured it out myself. I hope the following MySQL program example in Vala helps you get a faster start with this nice language and MySQL:

using Mysql;

int main (string[] args)
{

  int rc = 0;

  ClientFlag cflag    = 0;
  string     host     = "127.0.0.1";
  string     user     = "root";
  string     password = "";
  string     database = "test";
  int        port     = 3306;
  string?    socket   = null;   // must be nullable to hold null

  Database mysql = new Mysql.Database ();

  var isConnected = mysql.real_connect(host, user, password, database, port, socket, cflag);

  if ( ! isConnected ) {

    rc = 1;
    stdout.printf("ERROR %u: Connection failed: %s\n", mysql.errno(), mysql.error());
    return rc;
  }

  stdout.printf("Connected to MySQL server version: %s (%lu)\n"
              , mysql.get_server_info()
              , (ulong) mysql.get_server_version());

  string sql = "SELECT * FROM test LIMIT 10";
  rc = mysql.query(sql);
  if ( rc != 0 ) {

    stdout.printf("ERROR %u: Query failed: %s\n", mysql.errno(), mysql.error());
    return rc;
  }

  Result ResultSet = mysql.use_result();

  string[] MyRow;

  while ( (MyRow = ResultSet.fetch_row()) != null ) {

    stdout.printf("id: %s | data: %s | ts: %s\n", MyRow[0], MyRow[1], MyRow[2]);
  }
  // free_result is called automatically

  // mysql_close is called automatically
  return rc;
}

This code will only run with Vala v0.12 and newer.
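The program selects from a table called test with three columns (id, data, ts). The article does not show the table definition, but a minimal table matching what the example prints could look like this (a sketch only):

```sql
-- Hypothetical minimal table matching the columns the example prints
CREATE TABLE test (
  id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
, data VARCHAR(64)
, ts   TIMESTAMP
);

INSERT INTO test (data) VALUES ('Bla');
```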

How to install Vala

The following installation instruction [ 2 ] worked for me:

sudo apt-key adv --recv-keys --keyserver keyserver.ubuntu.com 7DAAC99C
sudo add-apt-repository ppa:vala-team
sudo apt-get update
sudo apt-get install valac vala-utils vala-doc valac-dbg
valac --version

sudo apt-get install libgee-dev
sudo apt-get install gedit-vala-plugin vala-gen-project
sudo apt-get install valide

Compile the MySQL Vala example program

I used the following command to compile my MySQL Vala Program Example:

valac --pkg=mysql --Xcc=-lmysqlclient mysql_ex1.vala --Xcc=-I/home/mysql/src/mysql-5.1.55 \
      --Xcc=-L/home/mysql/product/mysql-5.1.55/lib -v

If you want to see the generated C code you have to add the --ccode option.

The generated binary file in my case is just 13876 bytes long.
And now have fun playing around with Vala!


Building Galera Replication from Scratch


Introduction

MySQL/Galera synchronous Multi-Master Replication consists of 2 parts: the wsrep-patched MySQL server and the Galera Replication Plugin.

If you do not want to download the prepared binaries you can build them on your own.
First you have to download the native MySQL sources, then patch them with the Galera wsrep patch and compile them. In a second step you have to build the Galera Plugin.

This is especially useful because the standard Galera binary tar balls do not provide garbd (and possibly other tools).

The following steps describe how to do it:

Prepare a patched MySQL

Download MySQL Sources

Download the normal MySQL source code:

wget http://mirror.switch.ch/ftp/mirror/mysql/Downloads/MySQL-5.5/mysql-5.5.15.tar.gz

Download wsrep Patch

Download the wsrep Patch for MySQL:

wget http://launchpad.net/codership-mysql/5.5/5.5.15-21.2/+download/mysql-5.5.15-wsrep_21.2.patch

Patch MySQL

Patch MySQL as follows:

cd /tmp
tar xf /download/mysql-5.5.15.tar.gz
cd mysql-5.5.15
patch -p1 < /download/mysql-5.5.15-wsrep_21.2.patch

If you want to avoid this step you can also download the already patched codership-mysql directly as follows:

bzr branch lp:codership-mysql

or

bzr branch lp:codership-mysql/5.5

If you want to create your own wsrep patch:

bzr branch lp:codership-mysql/5.5
cd 5.5
bzr diff -p1 -v --diff-options " --exclude=.bzrignore " -r tag:mysql-5.5.15..branch:lp:codership-mysql/5.5 \
  > mysql-5.5.15-wsrep_21.2.patch

Compile MySQL with the wsrep patch

To compile the patched MySQL do the following:

chmod a+x BUILD/compile-amd64-wsrep
BUILD/compile-amd64-wsrep --prefix=/home/mysql/product/mysql-5.5.15-wsrep-21.2
make install

Up to here this was the first step to get a prepared MySQL working with wsrep. Now we have to build the Galera Plugin...

Prepare the Galera Replication Plugin

Download Galera Replication Plugin

You can get the source of the Galera Replication Plugin like this:

wget http://launchpad.net/galera/1.x/21.1.0/+download/galera-21.1.0-src.tar.gz

or if you want the most recent source take it directly from Launchpad:

bzr branch lp:galera/1.x

Compile the Galera Replication Plugin

cd /tmp/
tar xf /download/galera-21.1.0-src.tar.gz
cd galera-21.1.0-src/
scons

cp garb/garbd /home/mysql/product/mysql-5.5.15-wsrep-21.2/bin/
cp libgalera_smm.so /home/mysql/product/mysql-5.5.15-wsrep-21.2/lib/plugin/

This is the whole magic. The most difficult thing is to put everything in the right order and to get the right packages in place (for Ubuntu/Debian: libboost-dev, libboost-program-options-dev (>= v1.41), scons, libssl-dev, check, g++ (>= 4.4.0), etc.).
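On Ubuntu/Debian the build prerequisites mentioned above can be installed in one go (a sketch; exact package names may differ between releases):

```shell
# Install the build dependencies for the Galera Replication Plugin
sudo apt-get install libboost-dev libboost-program-options-dev scons libssl-dev check g++
```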

How MySQL behaves with many schemata, tables and partitions


Introduction

Recently a customer claimed that his queries were sometimes slow and sometimes fast.

First idea: A flipping query execution plan caused by InnoDB could be ruled out because mainly MyISAM tables were affected.

Second idea: Caching effects were examined, by either the file system cache caching MyISAM data or the MyISAM key buffer caching MyISAM indexes: The file system cache was huge and the MyISAM key buffer was only used up to 25%.

I was a bit puzzled...

Then we checked the table_open_cache and the table_definition_cache, which were still set to their default values. The corresponding status values Opened_tables and Opened_table_definitions clearly indicated that those caches were much too small...

The typical reaction is to increase open_files_limit (after increasing the operating system user limits) to a higher value. Unfortunately most Linux distributions have a default limit of 1024 open files, which is far too low for typical database systems.

Too many open files

But the customer claimed that he had tried this already (set open_files_limit to 50'000) and got error messages in the MySQL error log:

# perror 24
OS error code  24:  Too many open files

So we were even more puzzled.

After some further investigation we found that the customer had up to 600 schemata, in each schema 30 to 100 tables, and some of those tables even had monthly partitions going back up to 4 years (roughly 50 partitions).

A partition is internally handled similar to a table. So we have something between 18'000 and 3'000'000 tables/partitions in total. We already wrote about the problems of MySQL databases with many tables in Configuration of MySQL for Shared Hosting, but with partitions it is even worse. I remembered that having too many partitions is not a good idea with MySQL, so we investigated a bit deeper in this area:
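The rough range above follows from simple multiplication (shell arithmetic, using the numbers from the text):

```shell
# Lower bound: 600 schemata x 30 tables each, no partitions
echo $(( 600 * 30 ))          # -> 18000
# Upper bound: 600 schemata x 100 tables each x ~50 partitions per table
echo $(( 600 * 100 * 50 ))    # -> 3000000
```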

We first looked at the amount of file handles MySQL has open:

lsof -p <pid> | wc -l

This value clearly moved towards our new open_files_limit of 150'000 (with table_open_cache and table_definition_cache set to 32k each) and then we got Too many open files errors again. So we set table_open_cache and table_definition_cache both back to 2048 and got a stable number of file descriptors between 100'000 and 110'000.
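The final settings of this investigation as a my.cnf sketch (values taken from the text; adjust them to your own workload):

```
# my.cnf
[mysqld]
open_files_limit       = 150000
table_open_cache       = 2048
table_definition_cache = 2048
```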

This gives us an idea of how to extrapolate those values when we want to have bigger caches. But we do not know how a Linux system or MySQL will behave with much higher values...

Partition table test

But why do we get such high values?

As an example we took the following table with 14 partitions:

CREATE TABLE ptn_test (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT
, data VARCHAR(64)
, ts TIMESTAMP
, PRIMARY KEY (id, ts)
, INDEX (ts)
) ENGINE = MyISAM
PARTITION BY RANGE ( UNIX_TIMESTAMP(ts) ) (
  PARTITION p_2010    VALUES LESS THAN ( UNIX_TIMESTAMP('2011-01-01 00:00:00') )
, PARTITION p_2011_01 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-02-01 00:00:00') )
, PARTITION p_2011_02 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-03-01 00:00:00') )
, PARTITION p_2011_03 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-04-01 00:00:00') )
, PARTITION p_2011_04 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-05-01 00:00:00') )
, PARTITION p_2011_05 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-06-01 00:00:00') )
, PARTITION p_2011_06 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-07-01 00:00:00') )
, PARTITION p_2011_07 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-08-01 00:00:00') )
, PARTITION p_2011_08 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-09-01 00:00:00') )
, PARTITION p_2011_09 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-10-01 00:00:00') )
, PARTITION p_2011_10 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-11-01 00:00:00') )
, PARTITION p_2011_11 VALUES LESS THAN ( UNIX_TIMESTAMP('2011-12-01 00:00:00') )
, PARTITION p_2011_12 VALUES LESS THAN ( UNIX_TIMESTAMP('2012-01-01 00:00:00') )
, PARTITION p_max VALUES LESS THAN (MAXVALUE)
);

and inserted some rows in each partition:

INSERT INTO ptn_test VALUES
  (NULL, 'Bla', '2010-12-01 00:00:42')
, (NULL, 'Bla', '2011-01-01 00:00:42')
, (NULL, 'Bla', '2011-02-01 00:00:42')
, (NULL, 'Bla', '2011-03-01 00:00:42')
, (NULL, 'Bla', '2011-04-01 00:00:42')
, (NULL, 'Bla', '2011-05-01 00:00:42')
, (NULL, 'Bla', '2011-06-01 00:00:42')
, (NULL, 'Bla', '2011-07-01 00:00:42')
, (NULL, 'Bla', '2011-08-01 00:00:42')
, (NULL, 'Bla', '2011-09-01 00:00:42')
, (NULL, 'Bla', '2011-10-01 00:00:42')
, (NULL, 'Bla', '2011-11-01 00:00:42')
, (NULL, 'Bla', '2011-12-01 00:00:42')
, (NULL, 'Bla', '2012-01-01 00:00:42');

Just for running a simple EXPLAIN:

EXPLAIN PARTITIONS SELECT * FROM ptn_test WHERE ts = '2011-11-01 00:00:42';

+----+-------------+----------+------------+--------+---------------+------+---------+------+------+-------+
| id | select_type | table    | partitions | type   | possible_keys | key  | key_len | ref  | rows | Extra |
+----+-------------+----------+------------+--------+---------------+------+---------+------+------+-------+
|  1 | SIMPLE      | ptn_test | p_2011_11  | system | ts            | NULL | NULL    | NULL |    1 |       |
+----+-------------+----------+------------+--------+---------------+------+---------+------+------+-------+

MySQL already opens 28 file descriptors and uses one table_open_cache and one table_definition_cache entry. When doing the same query in a second session on a second schema MySQL has already opened 56 file descriptors and uses 2 table_open_cache and 2 table_definition_cache entries. What the MySQL manual states about this can be found here: How MySQL Opens and Closes Tables.

SHOW GLOBAL STATUS LIKE 'open_%s';

+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 56    |
| Open_table_definitions   | 2     |
| Open_tables              | 2     |
+--------------------------+-------+

By the way we found that the value of Open_files is pretty close to the result of the following Linux command:

lsof -p <pid> | grep '/var/lib/mysql' | grep -e 'MYD' -e 'MYI' -e 'ib' | wc -l

So Open_files is a good indication of how far away we are from open_files_limit. MySQL needs some more file descriptors for the error log file, binary log files etc. But this is typically fewer than 10.

Now we have already found why we got the Too many open files error message.

But it does not explain yet why our MyISAM key buffer was so badly used.

MyISAM key buffer is wiped out

I have not found this mentioned explicitly in the documentation (please correct me if I am wrong), but it looks like when MySQL has pressure on the table_open_cache and removes entries from it, it wipes out the blocks of those tables from the MyISAM key buffer as well.

A little experiment for this:

  • Clean up table_open_cache and table_definition_cache and MyISAM key buffer:
FLUSH TABLES;
FLUSH STATUS;
SET GLOBAL key_buffer_size=1024*1024;
SHOW GLOBAL STATUS LIKE 'open_%s';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 0     |
| Open_table_definitions   | 0     |
| Open_tables              | 0     |
| Opened_files             | 0     |
| Opened_table_definitions | 0     |
| Opened_tables            | 0     |
+--------------------------+-------+

SHOW GLOBAL STATUS LIKE 'key_blocks_un%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Key_blocks_unused | 837   |
+-------------------+-------+

  • Run a query from 2 sessions in 2 different schemata:
EXPLAIN
SELECT COUNT(*) FROM (
  SELECT * FROM ptn_test
   WHERE ts >= '2011-11-01 00:00:00' AND ts < '2011-12-01 00:00:00'
  ) AS x;

  • Show the status information again:
SHOW GLOBAL STATUS LIKE 'open_%s';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 56    |
| Open_table_definitions   | 2     |
| Open_tables              | 2     |
| Opened_files             | 62    |
| Opened_table_definitions | 2     |
| Opened_tables            | 2     |
+--------------------------+-------+

SHOW GLOBAL STATUS LIKE 'key_blocks_un%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Key_blocks_unused | 504   |
+-------------------+-------+

We have 56 file descriptors (2 tables x 14 partitions x (data + index = 2)), 2 table_open_cache and 2 table_definition_cache entries, and 333 MyISAM key buffer blocks used.
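These numbers can be verified with a quick calculation (shell arithmetic):

```shell
# 2 tables x 14 partitions x 2 files (.MYD data + .MYI index) per partition
echo $(( 2 * 14 * 2 ))    # -> 56
# Key buffer blocks in use: unused blocks before minus unused blocks after
echo $(( 837 - 504 ))     # -> 333
```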

  • Connect from a 3rd connection without the -A option:
SHOW GLOBAL STATUS LIKE 'open_%s';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 30    |
| Open_table_definitions   | 21    |
| Open_tables              | 16    |
| Opened_files             | 127   |
| Opened_table_definitions | 21    |
| Opened_tables            | 21    |
+--------------------------+-------+

SHOW GLOBAL STATUS LIKE 'key_blocks_un%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Key_blocks_unused | 669   |
+-------------------+-------+

26 file descriptors were released. We currently have 16 entries in the table_open_cache (its size was limited to 16) but in total 19 more tables were opened. Why 65 files were opened for this operation is unclear.

The current number of used MyISAM key blocks is 168, which is roughly half of the value before. So we can assume that one of our 2 partitioned tables was closed and its key buffer entries were wiped out.

  • Run the query again
EXPLAIN
SELECT COUNT(*) FROM (
  SELECT * FROM ptn_test
   WHERE ts >= '2011-11-01 00:00:00' AND ts < '2011-12-01 00:00:00'
  ) AS x;
SHOW GLOBAL STATUS LIKE 'open_%s';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 58    |
| Open_table_definitions   | 21    |
| Open_tables              | 16    |
| Opened_files             | 156   |
| Opened_table_definitions | 21    |
| Opened_tables            | 22    |
+--------------------------+-------+

SHOW GLOBAL STATUS LIKE 'key_blocks_un%';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Key_blocks_unused | 504   |
+-------------------+-------+

28 file descriptors were in use again (29 new opens in total) and one more table was opened. And our key buffer is back to 333 blocks used.

So it looks like real pressure on the table_open_cache also has an impact on the MyISAM key buffer.

How does it behave with InnoDB?

Nowadays InnoDB is used much more often than MyISAM. So let us have a look at the impact on InnoDB tables:

  • Starting point values:
SHOW GLOBAL STATUS LIKE 'open_%s';
SHOW GLOBAL STATUS LIKE 'innodb_buffer_pool_pages_%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 0     |
| Open_table_definitions   | 0     |
| Open_tables              | 0     |
| Opened_files             | 0     |
| Opened_table_definitions | 0     |
| Opened_tables            | 0     |
+--------------------------+-------+

+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Innodb_buffer_pool_pages_data    | 1012  |
| Innodb_buffer_pool_pages_free    | 1002  |
| Innodb_buffer_pool_pages_misc    | 34    |
| Innodb_buffer_pool_pages_total   | 2048  |
+----------------------------------+-------+

  • Run the query in 2 different connections on 2 different tables:
SHOW GLOBAL STATUS LIKE 'open_%s';
SHOW GLOBAL STATUS LIKE 'innodb_buffer_pool_pages_%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 0     |
| Open_table_definitions   | 2     |
| Open_tables              | 2     |
| Opened_files             | 6     |
| Opened_table_definitions | 2     |
| Opened_tables            | 2     |
+--------------------------+-------+

+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Innodb_buffer_pool_pages_data    | 1012  |
| Innodb_buffer_pool_pages_free    | 1002  |
| Innodb_buffer_pool_pages_misc    | 34    |
| Innodb_buffer_pool_pages_total   | 2048  |
+----------------------------------+-------+

  • Use a new connection:
SHOW GLOBAL STATUS LIKE 'open_%s';
SHOW GLOBAL STATUS LIKE 'innodb_buffer_pool_pages_%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| Open_files               | 2     |
| Open_table_definitions   | 21    |
| Open_tables              | 16    |
| Opened_files             | 71    |
| Opened_table_definitions | 21    |
| Opened_tables            | 21    |
+--------------------------+-------+

+----------------------------------+-------+
| Variable_name                    | Value |
+----------------------------------+-------+
| Innodb_buffer_pool_pages_data    | 1012  |
| Innodb_buffer_pool_pages_free    | 1002  |
| Innodb_buffer_pool_pages_misc    | 34    |
| Innodb_buffer_pool_pages_total   | 2048  |
+----------------------------------+-------+

So the phenomenon of wiping out buffered data definitely does not seem to happen with InnoDB tables. Further, InnoDB seems to use far fewer file descriptors than MyISAM.
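If file descriptors do become a concern with InnoDB, the number of open .ibd files can be capped separately (a configuration sketch; innodb_open_files is only relevant when innodb_file_per_table is enabled):

```
# my.cnf
[mysqld]
innodb_file_per_table = 1
innodb_open_files     = 300
```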

Conclusion

  • MyISAM uses a huge number of file descriptors. This is especially true when using partitions.
  • This requires a significant increase of open_files_limit. The impact of very high values is unknown.
  • A shortage of entries in the table_open_cache causes tables to be evicted from the table_open_cache.
  • This in turn wipes out the MyISAM key buffer blocks of the affected tables, which leads to slower queries when those tables are accessed the next time.
  • With InnoDB this behaviour is much more relaxed and problems should appear much later than with MyISAM.
  • Be careful when using a significant number of (partitioned) tables with MyISAM. It can have a serious impact on the performance of your system.

Migrating from MySQL Master-Master Replication to Galera Multi-Master Replication


Introduction

Galera is a synchronous Multi-Master Replication solution for MySQL. It is therefore in competition with several other MySQL High Availability architectures, which can very often easily be replaced by Galera's synchronous Multi-Master Replication for MySQL.

All those products have their advantages and disadvantages. Very often MySQL Master-Master Replication is used in the field because of its simplicity to set up. But after a while one faces its disadvantages, mainly data inconsistency between the 2 Masters. This is not only the fault of MySQL Replication, but MySQL Replication makes it easy to get such data inconsistencies.

In the following article we look at how you can replace a MySQL Master-Master Replication with Galera Multi-Master Replication, with the possibility to fall back if you do not like the solution or if you run into trouble.

Starting point

Some MySQL users have a typical Master-Master Replication set-up, either a) active-passive or b) active-active, for High Availability (HA) reasons, either in the same data center or even in remote data centers:

mm_to_galera_1.png
a) active-passive, b) active-active

Adding Galera synchronous Replication

As a first step you can add a Galera Replication Cluster as a simple Slave:

mm_to_galera_3.png

In this set-up you have to consider that ALL nodes participating in replication (Master 1, Master 2 and Galera 1) have the following parameters set:

#
# my.cnf
#
[mysqld]

default_storage_engine = InnoDB

log_slave_updates      = 1
log_bin                = bin-log
server_id              = <n>
binlog_format          = ROW

It is very important that the server_id is unique for all MySQL nodes BUT EQUAL for all Galera nodes. Example:

  • Master 1: server_id = 1
  • Master 2: server_id = 2
  • Galera 1: server_id = 3
  • Galera 2: server_id = 3
  • Galera 3: server_id = 3

This is to avoid conflicts during replication.

Galera is set up as described in Installing MySQL/Galera. Please make sure that you do not have any MyISAM tables anymore. Galera cannot cope with any Storage Engine other than InnoDB at the moment.
The following query helps you find out whether you are using any Storage Engine other than InnoDB:

SELECT table_schema, engine, COUNT(*)
  FROM information_schema.tables
 WHERE table_schema NOT IN ('information_schema', 'mysql', 'performance_schema')
 GROUP BY table_schema, engine;
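Any table the query reports can then be converted, for example (the table name here is hypothetical):

```sql
-- Hypothetical example: convert a remaining MyISAM table to InnoDB
ALTER TABLE foodmart.sales_fact ENGINE = InnoDB;
```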

Then you do a normal dump from Master 2 as follows:

mysqldump --user=root --password --master-data --single-transaction \
          --databases foodmart test > full_dump.sql

When you dump the database, avoid dumping the mysql schema, otherwise you will destroy your Galera nodes 2 and 3. Then restore the dump on ONE node of the Galera Cluster (preferably on node 1) after pointing it to its master:

CHANGE MASTER TO master_host='master1', master_port=3306
     , master_user='replication', master_password='secret';

mysql --user=root --password < full_dump.sql

Then you can attach the Galera node to Master 2.
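Attaching the node then comes down to starting the slave threads and checking that they run (a sketch):

```sql
START SLAVE;
SHOW SLAVE STATUS\G
-- Slave_IO_Running and Slave_SQL_Running should both show 'Yes'
```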

Now all data arriving at your MySQL Master(s) will automatically be replicated to the Galera Cluster as well.

Adding the Galera Cluster into the ring

In a second step you can add the Galera Cluster to the replication ring by pointing Master 1 to Galera Node 1:

mm_to_galera_4.png

Application Load Balancing for Galera

To have true High Availability (HA) it makes sense to put a load balancer in front of your Galera Cluster. This can be done, for example, with a software load balancer such as HAProxy or the Galera Load Balancer (glb), or with a hardware load balancer.

Now your Galera Replication Cluster is ready to put some load on it:

mm_to_galera_5.png

Once you are more familiar with Galera you can move the Virtual IP (VIP) from the MySQL Master-Master Replication to the Galera Replication Cluster:

mm_to_galera_6.png

And if you are happy with the synchronous replication and its scaling performance you can finally drop your old MySQL Master-Master set-up and bypass the VIP during the next downtime of your application.

mm_to_galera_8.png

Shortcuts

A shortcut on this path is to directly replace Master 2 with a Galera node:

mm_to_galera_7.png

Then you need one server less and can directly use the MySQL Master node as a base for starting with Galera. You just have to replace the MySQL binaries with the MySQL-Galera binaries and then add 2 more Galera nodes to the set-up.

Important notes

Currently Galera works only with InnoDB tables. So you have to make sure that you convert all your non-InnoDB tables to InnoDB tables (except the ones in the mysql Schema). Otherwise you will run into problems.

The described set-up works starting with Galera v1.1 and wsrep v22.3.

The memory footprint of the Galera node receiving the import grew by 1.5 Gbyte in one of our tests. So make sure the system has enough memory! In our first tests the system started to swap heavily, which caused high I/O load. In this situation Galera behaved erroneously...

mm_to_galera_9.png

If the Master is under very high load, the Galera Slave cannot catch up and starts lagging... This is not a problem if you run the load only on the Galera Cluster!

To avoid Split-Brain situations all cluster solutions need at least 3 nodes. This is the same with Galera. When you move from MySQL Master-Master Replication you therefore need one more server than before. Theoretically Galera can be run in a 2-node set-up, but this is strongly NOT recommended.

One way out of this situation is to use garbd, which acts as an arbitrator in such a scenario. This is called a 2 1/2-node set-up.

And now have fun with your synchronous Multi-Master Galera Replication for MySQL...

Rolling upgrade of Galera 1.0 to 1.1


A few days ago Codership announced their new version Galera v1.1 - synchronous Replication Cluster for MySQL. Before we look at the new feature of Rolling Online Schema Upgrade (OSU) we have a look at how to upgrade to the new Galera release.

A rolling upgrade of your synchronous Galera Replication Cluster from version 1.0 to 1.1 is quite easy when you stay at the same MySQL version (5.5).

To not lose the availability of your database service during the upgrade you should have at least 3 Galera nodes in your Cluster.

For further details please also look at MySQL/Galera cluster upgrade.

Hint: If you can do without a rolling upgrade, you are better off avoiding it and taking your whole Galera Cluster down for the upgrade.

Check the version

Check first the version you are currently running on:

SHOW GLOBAL VARIABLES LIKE 'version';
+---------------+-----------------------+
| Variable_name | Value                 |
+---------------+-----------------------+
| version       | 5.5.15-wsrep_21.1-log |
+---------------+-----------------------+

We can see that we are using MySQL 5.5.15 with the wsrep API Version 21 and the wsrep patch 21.1.

SHOW GLOBAL STATUS LIKE 'wsrep_provider_version';
+------------------------+-------------+
| Variable_name          | Value       |
+------------------------+-------------+
| wsrep_provider_version | 21.1.0(r86) |
+------------------------+-------------+

Here we see that we are using the Galera Replicator (plug-in) 1.0 (revision 86) based on the wsrep API Version 21.

Some rules about Galera versioning. We have 4 different version numbers to care about:

  • MySQL version (5.5.15)
  • Wsrep API version (21, 22, ...)
    The wsrep API versions will always be single monotonically increasing numbers: 21, 22, ... That indicates API compatibility between MySQL and Galera.
  • Wsrep Patch version (21.1)
    The wsrep patch versions have the form 21.1, where 21 represents the API version and 1 represents the bug-fix release of that API version.
  • Galera Replicator (= provider, plug-in) (1.0(r86))
    Galera versions have the form <major>.<minor>, where minor means bug-fixes and small features and major means major features which involve a lot of code change.

Galera 22.1.1 is backward compatible with Galera 21.1.0. I was told that Galera should be at least ONE version backward compatible. So 1.0 should be compatible with 0.8, 1.1 with 1.0, 1.2 with 1.1, 2.0 with 1.2, etc.

Preparation

Download the packages for your preferred installation method from here:

In my case only a binary tar ball was provided for Codership-MySQL but not for the Galera Plug-in v1.1. So I extracted the latter from the Debian package as follows:

dpkg-deb -x galera-22.1.1-amd64.deb /tmp/oli/
cp /tmp/oli/usr/bin/garbd /home/mysql/product/mysql-5.5.17-wsrep-22.3/bin
cp /tmp/oli/usr/lib/galera/libgalera_smm.so \
/home/mysql/product/mysql-5.5.17-wsrep-22.3/lib/plugin/

For RPMs it should work in a similar way:

rpm2cpio package.rpm | cpio -idmv

Precautions

Make sure that during an upgrade from 5.1 to 5.5 no DDL statements are executed!

Upgrade

Then upgrade your Galera Cluster as follows:

  • Shift load away from this node.
  • Shutdown node (/etc/init.d/mysql stop)
  • Uninstall or remove the old Galera plug-in.
  • Uninstall or remove the old Codership-MySQL Binaries
  • Install the new Codership-MySQL binaries with the wsrep API version 22
  • Install the new Galera plug-in v1.1
  • Check if wsrep_provider in my.cnf is pointing to the correct new location.
  • Start node (/etc/init.d/mysql start)
  • Check if node came up properly:
    SHOW GLOBAL STATUS LIKE 'wsrep%';
    +----------------------------+--------------------------------------+
    | Variable_name              | Value                                |
    +----------------------------+--------------------------------------+
    | wsrep_local_state          | 4                                    |
    | wsrep_local_state_comment  | Synced (6)                           |
    | wsrep_cluster_size         | 3                                    |
    | wsrep_cluster_status       | Primary                              |
    | wsrep_connected            | ON                                   |
    | wsrep_local_index          | 1                                    |
    | wsrep_provider_version     | 22.1.1(r95)                          |
    | wsrep_ready                | ON                                   |
    +----------------------------+--------------------------------------+
  • If this is the case shift load back to this node.
    If you already have troubles up to here we recommend solving the problems first and NOT continuing with the upgrade procedure. Otherwise you risk the loss of your complete service.
  • If you reached this step you can upgrade the next node in your Galera Cluster.

When you have upgraded all nodes in the Galera Cluster you should notice that the protocol version switches automatically from 1 to 2:

111212 17:32:24 [Note] WSREP: Quorum results:
        version    = 1,
        component  = PRIMARY,
        conf_id    = 6,
...
111212 17:34:33 [Note] WSREP: Quorum results:
        version    = 2,
        component  = PRIMARY,
        conf_id    = 7,

Configuration for Rolling restart

In Galera Cluster configurations you often see that the Cluster is still set to its initial start configuration, which is inappropriate for a rolling restart operation:

Galera configuration for an initial Cluster start

Galera node 1: wsrep_cluster_address: gcomm://
Galera node 2: wsrep_cluster_address: gcomm://192.168.1.101:4567
Galera node 3: wsrep_cluster_address: gcomm://192.168.1.102:4567

In this case Nodes 2 and 3 are OK for a rolling restart but Galera Node 1 will fail to restart.

Galera configuration for normal operations and a Cluster rolling restart

This is the way we recommend to have a Galera configuration for normal operations and a rolling restart:

Galera node 1: wsrep_cluster_address: gcomm://192.168.1.103:4567
Galera node 2: wsrep_cluster_address: gcomm://192.168.1.101:4567
Galera node 3: wsrep_cluster_address: gcomm://192.168.1.102:4567

Every Galera node points to its "left" neighbour.

Upgrade from MySQL 5.1/Galera 0.8 to MySQL 5.5/Galera 1.1

Upgrading from MySQL 5.1/Galera 0.8 to MySQL 5.5/Galera 1.1 has to be done in 2 steps because Codership only provides backwards compatibility for single minor version jumps (0.8 -> 1.0 -> 1.1 -> 1.2 -> 2.0).

We have 2 possibilities now:

5.1/0.8 -> 5.1/1.0 -> 5.5/1.1

or

5.1/0.8 -> 5.5/1.0 -> 5.5/1.1

Which one you choose is up to you.

A rolling upgrade on a running system is impossible without a snapshot state transfer (SST) at the moment. So be prepared that it takes a while and causes some load on the systems.

In our case we chose the way via 5.5/1.0 (2nd way).

To upgrade from 5.1/0.8 to 5.5/1.0 proceed as follows:

  • Shift load away from this node to the other 2 nodes.
  • Shutdown this node (/etc/init.d/mysql stop)
  • Set wsrep_provider = none in my.cnf
  • Uninstall or remove the old Galera plug-in.
  • Uninstall or remove the old Codership-MySQL Binaries
  • Install the new Codership-MySQL binaries
  • Install the new Galera plug-in
  • Start this node (/etc/init.d/mysql start)
  • Then you will get some error messages:
    111214 11:44:59 [ERROR] Missing system table mysql.proxies_priv; please run mysql_upgrade to create it
    111214 11:44:59 [ERROR] Native table 'performance_schema'.'events_waits_current' has the wrong structure
    ...
    111214 11:44:59 [ERROR] Native table 'performance_schema'.'file_instances' has the wrong structure
    111214 11:44:59 [Note] Event Scheduler: Loaded 0 events
    111214 11:44:59 [Note] WSREP: wsrep_load(): loading provider library 'none'
  • Run mysql_upgrade (see the MySQL upgrade instructions; note that a MySQL binary upgrade is not officially supported/recommended, but this is not a problem here because SST with mysqldump will do a logical restore anyway).
  • Set wsrep_provider in my.cnf to the new plug-in location.
  • Prepare SST upgrade script on (all) the donor(s) node(s).
    cp wsrep_sst_mysqldump wsrep_sst_mysqldump_upgrade
  • Change the script wsrep_sst_mysqldump_upgrade so that it dumps all databases except the mysql database:
    diff wsrep_sst_mysqldump wsrep_sst_mysqldump_upgrade
    59c59
    < --skip-comments --flush-privileges --all-databases"
    ---
    > --skip-comments --flush-privileges --databases test foodmart"
  • Caution: Be careful with Stored Procedures, Stored Functions, Triggers and Events! This upgrade procedure will NOT work completely if you use any of those MySQL features. It will further not work completely if you have differences in the mysql schema between your Galera nodes (for whatever reason).
  • Set wsrep_sst_method = mysqldump_upgrade in my.cnf
  • Start this node (/etc/init.d/mysql start). Keep in mind that one of the remaining Galera nodes will act as an SST donor and during the synchronization it is not available for queries!
  • Check if the node came up properly: SHOW GLOBAL STATUS LIKE 'wsrep%';
  • If this is the case shift load back to this node.
    If you run into troubles up to this point we recommend solving the problems first and NOT continuing with the upgrade procedure. Otherwise you risk losing your complete service.
  • Set wsrep_sst_method back to its original value (mysqldump).
  • If you reached this step you can upgrade the next node in your Galera Cluster.

If you finally managed to upgrade from MySQL 5.1/Galera 0.8 to MySQL 5.5/Galera 1.0 you can follow the procedure mentioned above to upgrade to Galera 1.1.

Findings

To identify our different Galera Clusters we name them:

wsrep_cluster_name             = "Galera-0.8 wsrep-21"

In this upgrade scenario this naming convention is suboptimal because the name of our Galera Cluster should change as well. But the value of wsrep_cluster_name has to be the same on all Galera nodes, otherwise a node is not able to join the Cluster (this is to make sure that a Galera node does not connect by accident to the wrong Galera Cluster).
To change the wsrep_cluster_name parameter you have to bring down the whole Galera Cluster. It cannot be changed during a rolling restart at the moment.
Hopefully this constraint will be lifted in a later Galera version.

Recover lost .frm files for InnoDB tables


Recently I found in a forum the following request for help:

My MySQL instance crashed because of free disk space fault. I saw in /var/lib/mysql all the files: ibdata1, ib_logfile* and all the folders containing frm files. Well, when i solved the problem and run successfully the instance, some databases disappeared. One of those is the most important, and i don't know how many tables had and their structures. Is there any way for recover the entire lost database (structure and data) only having the ibdata1 file?

First of all the observation sounds a bit strange because files do not just disappear. So I fear that it is not just the .frm files which are lost. But let's think positive and assume just the .frm files are gone...

Recovering the tables is a bit tricky because the .frm files contain the information about the table structure for MySQL.

If you have any old backup or even only a structure dump it would be very helpful.

InnoDB stores the table structure as well. You can extract it with the InnoDB Table Monitor as follows:

mysql> CREATE SCHEMA recovery;
mysql> use recovery;
mysql> CREATE TABLE innodb_table_monitor (id INT) ENGINE = InnoDB;

MySQL will write the output into its error log:

TABLE: name test/test, id 16, flags 1, columns 4, indexes 1, appr.rows 3
  COLUMNS: id: DATA_INT DATA_BINARY_TYPE len 4;
    DB_ROW_ID: DATA_SYS prtype 256 len 6;
    DB_TRX_ID: DATA_SYS prtype 257 len 6;
  DB_ROLL_PTR: DATA_SYS prtype 258 len 7;
  INDEX: name GEN_CLUST_INDEX, id 18, fields 0/4, uniq 1, type 1
   root page 312, appr.key vals 3, leaf pages 1, size pages 1
   FIELDS:  DB_ROW_ID DB_TRX_ID DB_ROLL_PTR id

With this information and some experience you can guesstimate the original table structure:

Schema and table name: test.test

id: DATA_INT DATA_BINARY_TYPE len 4;
  DB_ROW_ID: DATA_SYS prtype 256 len 6;
  DB_TRX_ID: DATA_SYS prtype 257 len 6;
DB_ROLL_PTR: DATA_SYS prtype 258 len 7;

The table has only 1 column called id which is a 4-byte INT; the other columns are InnoDB internal stuff (19 bytes!).

INDEX: name GEN_CLUST_INDEX, id 18, fields 0/4, uniq 1, type 1

The table has only one generated clustered index (no explicit index!).

So we can guess:

mysql> CREATE TABLE test.test (
  id INT UNSIGNED NOT NULL DEFAULT 0
) ENGINE = InnoDB CHARSET=utf8;

This table now has to be created on a second system. There we see with the InnoDB Table Monitor:

TABLE: name test/test, id 0 1269, columns 4, indexes 1, appr.rows 0
  COLUMNS: id: DATA_INT DATA_UNSIGNED DATA_BINARY_TYPE DATA_NOT_NULL len 4;
    DB_ROW_ID: DATA_SYS prtype 256 len 6;
    DB_TRX_ID: DATA_SYS prtype 257 len 6;
  DB_ROLL_PTR: DATA_SYS prtype 258 len 7;
  INDEX: name GEN_CLUST_INDEX, id 0 909, fields 0/4, uniq 1, type 1
   root page 3, appr.key vals 0, leaf pages 1, size pages 1
   FIELDS:  DB_ROW_ID DB_TRX_ID DB_ROLL_PTR id

This is not 100% correct yet.

id seems to be SIGNED and not UNSIGNED, and NULL seems to be allowed. So next try:

mysql> CREATE TABLE test.test (
  id INT SIGNED NULL
) ENGINE = InnoDB CHARSET=utf8;

TABLE: name test/test, id 0 1271, columns 4, indexes 1, appr.rows 0
  COLUMNS: id: DATA_INT DATA_BINARY_TYPE len 4;
    DB_ROW_ID: DATA_SYS prtype 256 len 6;
    DB_TRX_ID: DATA_SYS prtype 257 len 6;
  DB_ROLL_PTR: DATA_SYS prtype 258 len 7;
  INDEX: name GEN_CLUST_INDEX, id 0 911, fields 0/4, uniq 1, type 1
   root page 3, appr.key vals 0, leaf pages 1, size pages 1
   FIELDS:  DB_ROW_ID DB_TRX_ID DB_ROLL_PTR id

So this looks pretty much like it should. Do not be confused by some other details: the original table was created on MySQL 5.6.4 and the .frm recovery is done on 5.1.55.

Now copy the .frm file to the original database and check if you can access your data. If you can, repeat this table by table for all your zillions of tables...

When you are done, take a backup and ideally do a proper re-install of your database!

Just a little detail: I created the original table like this:

mysql> CREATE TABLE test.test (id INT) ENGINE = InnoDB;

mysql> SHOW CREATE TABLE test.test\G
CREATE TABLE `test` (
  `id` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1

MySQL seems to figure out the correct character set by itself...

I prefer MySQL binary tar balls with Galera...


In my set-ups I have different MySQL versions (MySQL 5.0, 5.1, 5.5 and 5.6, Percona Server 13.1 and 24.0, MariaDB 5.2.10, 5.3.3, Galera 1.0, 1.1 and 2.0) running in parallel at the same time.

Up to now I have not found a practical way to do this with RPM or DEB packages. If anybody knows how to do it I am happy to hear about it.

So I love and need only binary tar balls. Installation and removal are done within seconds and nothing is left behind after a removal. To operate the whole setup I use myenv.

Some software providers unfortunately do not provide binary tar balls at all, or not in the form I want and need them. Thus I was thinking about how to get them by extracting them from the packages. Up to now I have not had the time to write this down. But today was the right time...

RPM

rpm2cpio galera-22.1.1-1.rhel5.x86_64.rpm | cpio -vidm
tar czf galera-22.1.1-1.rhel5.x86_64.tar.gz usr
rm -rf usr

Extract with:

tar xf galera-22.1.1-1.rhel5.x86_64.tar.gz

DEB

ar vx galera-22.1.1-amd64.deb
mv data.tar.gz galera-22.1.1-amd64.deb.tar.gz
rm debian-binary control.tar.gz

Extract with:

tar -mxf galera-22.1.1-amd64.deb.tar.gz
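
The two recipes can be folded into one small helper. This is a sketch of my own (the function name `pkg2tar` is mine): it only decides which recipe applies and prints the commands so you can review them before running anything; note it extracts only the `data.tar.gz` member from a DEB instead of unpacking everything as above.

```shell
# Print the tarball-conversion commands for an RPM or DEB package.
pkg2tar() {
    pkg=$1
    case $pkg in
        *.rpm)
            # RPM: unpack the cpio payload, then tar up the usr tree.
            echo "rpm2cpio $pkg | cpio -vidm"
            echo "tar czf ${pkg%.rpm}.tar.gz usr"
            ;;
        *.deb)
            # DEB: the data member is already a tarball, just rename it.
            echo "ar x $pkg data.tar.gz"
            echo "mv data.tar.gz ${pkg%.deb}.deb.tar.gz"
            ;;
        *)
            echo "unsupported package type: $pkg" >&2
            return 1
            ;;
    esac
}

pkg2tar galera-22.1.1-1.rhel5.x86_64.rpm
```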

The packages are roughly the same size:

-rw-r--r-- 1 oli oli 6725416 2012-02-08 13:49 galera-22.1.1-1.rhel5.x86_64.rpm
-rw-r--r-- 1 oli oli 6769606 2012-02-08 14:18 galera-22.1.1-1.rhel5.x86_64.tar.gz

-rw-r--r-- 1 oli oli 1386762 2011-12-12 17:12 galera-22.1.1-amd64.deb
-rw-r--r-- 1 oli oli 1385994 2012-02-08 14:18 galera-22.1.1-amd64.deb.tar.gz

so I assume that nothing is lost.

The difference in size between DEB and RPM seems to come from the packaging itself:

usr_deb/lib/galera/libgalera_smm.so:   ELF 64-bit (SYSV), dynamically linked, stripped
usr_rpm/lib64/galera/libgalera_smm.so: ELF 64-bit (SYSV), dynamically linked, not stripped

So nothing to worry about. The programs themselves worked without any problems in the first tests. So I am optimistic that this is a good workaround until I can convince the software vendors to provide proper binary tar balls...

What can MySQL performance monitoring graphs tell you?


Many of you may monitor your databases for different purposes. Besides alerting, it is often good to also make some graphs from MySQL performance counters to see what is actually happening on your database.

The following graphs were made with our FromDual Performance Monitor for MySQL as a Service (MaaS) set-up. If you do not have the time to set up performance monitoring yourself please feel free to contact us for our MaaS solution.

Overview

First of all it is a good idea to have an overview of all the settings in your different databases and whether they are compliant with your standards.

Here it looks like two of our databases are still running with Statement Based Replication (SBR). Further, the MySQL General Query Log is enabled, which is not optimal for write performance, and the default Storage Engine is still set to MyISAM, which is not wanted in our case.

mpm_snapshot_01.png

InnoDB

This server is mostly an InnoDB server. We can see some write traffic because the InnoDB Buffer Pool constantly contains 15 - 20% dirty pages. Further we see a very constant level of dirty pages:

mpm_snapshot_02.png

Here I guess, the database was restarted on Wednesday:

mpm_snapshot_24.png

We can clearly see the positive impact of MySQL partitioning:

buffer_pool_2.png

Read and write are here more or less equally high with some strong spikes:

mpm_snapshot_03.png

If we have a closer look we can see that one typical spike is always at 06:00 in the morning. It is read and write so it is possibly NOT a backup but more likely a reporting or maintenance job:

mpm_snapshot_04.png

This system does mostly write with a heavy read phase during midnight:

mpm_snapshot_25.png

The read starts at 23:00 and ends at 03:30. It could be some nightly reporting?

mpm_snapshot_26.png

Extreme read and write spikes. Not good for a system that likes to have close to real time behavior:

mpm_snapshot_42.png

The consequences we can see immediately, Locking:

mpm_snapshot_43.png

When we look at the InnoDB locking we can see that this job at 06:00 in the morning causes some locking, up to 2.5 seconds. If this causes troubles we really have to investigate what is running at that time:

mpm_snapshot_05.png

If we look at the last 14 days we can see a huge read spike some time ago. What happened there? It has possibly influenced the whole system as well:

mpm_snapshot_06.png

Here we have a special InnoDB read pattern. Can you see it? It happens every 3 hours at xx:30. We should try to find out what it is:

mpm_snapshot_27.png

From time to time we can see some big transactions:

mpm_snapshot_44.png

MyISAM

Currently and over the last few days our MyISAM key buffer was mostly empty. But the high water mark Key_blocks_used indicates that it was used in the past. So we should try to find out if this key buffer is still used and, if not, whether we can free this memory (about 2.4 Gbyte):

mpm_snapshot_07.png
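
The 2.4 Gbyte figure can be cross-checked with a little shell arithmetic: the key buffer high-water mark is Key_blocks_used * key_cache_block_size (1024 bytes by default). The counter value below is an assumed sample, not read from the graph:

```shell
# Estimate the key buffer high-water mark from SHOW GLOBAL STATUS.
KEY_BLOCKS_USED=2457600        # assumed sample value of Key_blocks_used
KEY_CACHE_BLOCK_SIZE=1024      # MySQL default block size in bytes
echo "$((KEY_BLOCKS_USED * KEY_CACHE_BLOCK_SIZE / 1024 / 1024)) MB used at peak"
```

With this sample value the peak usage comes out at 2400 MB, i.e. about 2.4 Gbyte.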

From time to time we see some MyISAM key reads from memory. These could be caused by MyISAM table indexes or by temporary tables going to disk. So not really much MyISAM traffic at all, and it is mostly happening during weekdays:

mpm_snapshot_08.png

This is a mostly-MyISAM system but nevertheless we do not have MyISAM table locking issues:

mpm_snapshot_33.png

Connections

We are not far from hitting max_connections. So either we increase this value or we think about why we have so many connections open concurrently. On February 5th we had many concurrently running threads. That was surely not good for overall system performance:

mpm_snapshot_09.png

On February 12th there was possibly something wrong with the application: We see many aborted clients, which we never had before:

mpm_snapshot_10.png

It looks like the thread cache was always big enough:

mpm_snapshot_11.png

On this server we definitely had a problem and hit the roof with the number of connections. I am wondering if it would not be more useful to even lower max_connections here?

mpm_snapshot_28.png

Network traffic

Network traffic grew a bit last week but is now stable again. Our network should not have a problem with this load:

mpm_snapshot_12.png

This is a read-mostly database. The pattern comes from sorts and handler_read_rnd (see below):

mpm_snapshot_30.png

Handler Interface

Here we see that we do many handler_read_next and handler_read_rnd_next operations. This is equivalent to index range scans and full table scans. Our spike on February 11th seems to come from such an index range scan:

mpm_snapshot_13.png

This server does mostly reads by full table scan! There is a huge potential for optimization; the server could handle much more load if the queries were optimized:

mpm_snapshot_31.png

It seems there were some UPDATEs involved on the 8th:

mpm_snapshot_14.png

We see many INSERTs but few to no DELETEs. This means the database size is steadily growing. Should we think about archiving old data?

mpm_snapshot_15.png

OK. This seems to be a maintenance job. Luckily it was set to off-peak hours (08:00):

mpm_snapshot_45.png

Somebody does evil things here: handler_read_rnd is bad for performance. And it pretty much correlates with the network traffic from above:

mpm_snapshot_29.png

Sort

Sort behavior seems to have changed significantly on Monday. We should find out if the application has released a new version:

mpm_snapshot_16.png

On this server we see some sort_merge_passes, which is a sign of a too small sort buffer or just huge sorts:

mpm_snapshot_32.png

Queries

The majority of queries sent against this database are SELECTs:

mpm_snapshot_17.png

Temporary Tables

The use of temporary tables has changed. This is a sign again that something in the application was modified:

mpm_snapshot_18.png

If we look a bit closer we can see that the use of temporary disk tables has increased. We should keep an eye on this:

mpm_snapshot_19.png

Here we most probably have a problem with temporary tables on disk. It happens quite periodically and predictably, so we can investigate what causes them:

mpm_snapshot_34.png

If we look closer at it we can see that it is a similar pattern as with the data reads from above:

mpm_snapshot_35.png

MySQL Process information

The number of page faults has changed dramatically over the weekend. So what was changed there?

mpm_snapshot_20.png

Query Cache

There are no Query Cache hits but the Query Cache is enabled! There are several possible reasons for this. We should investigate, or disable the Query Cache altogether:

mpm_snapshot_21.png

We have some low memory prunes. Shall we increase the Query Cache size?

mpm_snapshot_36.png

Hmmm. It is already quite big. Better to defragment it from time to time to get rid of the free memory:

mpm_snapshot_37.png

Table Definition Cache

The Table Definition Cache is much too big here. A value of 256 (the default) would be far more than enough. Better to release the resources:

mpm_snapshot_22.png

This is the opposite case. Here the Table Definition Cache is much too small, at least when the job at 23:00 is running:

mpm_snapshot_38.png

Table (Open) Cache

The same situation here with the Table Open Cache. A value of 3 - 4k would be enough. Better to release the resources. There are some known cases where a too big table cache causes performance issues:

mpm_snapshot_23.png

Same here. Too small during the midnight jobs:

mpm_snapshot_39.png

Binary logging (Master)

The Binary Log has a size of about 100 Mbyte (Debian/Ubuntu?) and is filled up every hour (ca. 25 kbyte/s). During the night we have less traffic, during the day more:

mpm_snapshot_40.png

From time to time we have some spikes in binary log traffic. But the binary log cache seems to always be big enough:

mpm_snapshot_41.png

Replication Slave

The Slave was not lagging very often and only for a short time:

mpm_snapshot_46.png

If you want such performance graphs as well from your system, just let us know!


Does InnoDB data compression help with short disk space?


Because we are a bit short of disk space on one of our servers I had the idea to try out the MySQL Data Compression feature for InnoDB. This feature is useful if you have tables with VARCHAR, BLOB or TEXT attributes.

To make it not too simple, our table is partitioned as well. It looks like this:

CREATE TABLE `history_str` (
  `itemid` mediumint(8) unsigned NOT NULL DEFAULT '0',
  `clock` int(11) unsigned NOT NULL DEFAULT '0',
  `value` varchar(255) NOT NULL DEFAULT '',
  PRIMARY KEY (`itemid`,`clock`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY RANGE (clock)
(PARTITION p2012_kw05 VALUES LESS THAN (1328482800) ENGINE = InnoDB,
 PARTITION p2012_kw06 VALUES LESS THAN (1329087600) ENGINE = InnoDB,
 PARTITION p2012_kw07 VALUES LESS THAN (1329692400) ENGINE = InnoDB,
 PARTITION p2012_kw08 VALUES LESS THAN (1330297200) ENGINE = InnoDB,
 PARTITION p2012_kw09 VALUES LESS THAN (1330902000) ENGINE = InnoDB,
 PARTITION p2012_kw10 VALUES LESS THAN (1331506800) ENGINE = InnoDB,
 PARTITION p2012_kw11 VALUES LESS THAN (1332111600) ENGINE = InnoDB,
 PARTITION p2012_kw12 VALUES LESS THAN MAXVALUE ENGINE = InnoDB);

And the partitions use the following space on disk (not really much, I know!):

-rw-rw---- 1 mysql mysql 184549376 Mar  7 00:43 history_str#P#p2012_kw05.ibd
-rw-rw---- 1 mysql mysql 209715200 Mar 14 00:11 history_str#P#p2012_kw06.ibd
-rw-rw---- 1 mysql mysql 234881024 Mar 21 00:47 history_str#P#p2012_kw07.ibd
-rw-rw---- 1 mysql mysql 226492416 Mar 23 16:39 history_str#P#p2012_kw08.ibd
-rw-rw---- 1 mysql mysql 234881024 Mar 19 18:22 history_str#P#p2012_kw09.ibd
-rw-rw---- 1 mysql mysql 289406976 Mar 19 18:22 history_str#P#p2012_kw10.ibd
-rw-rw---- 1 mysql mysql 281018368 Mar 23 16:39 history_str#P#p2012_kw11.ibd
-rw-rw---- 1 mysql mysql 213909504 Mar 23 17:23 history_str#P#p2012_kw12.ibd

After the table was compressed with the following values:

ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4
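
For reference, the compression itself is a plain ALTER TABLE. A sketch, assuming file-per-table tablespaces and the Barracuda file format are not yet enabled on the 5.5 server (both settings are dynamic there):

```sql
-- Compressed tables require one tablespace file per table and the
-- Barracuda file format:
SET GLOBAL innodb_file_per_table = 1;
SET GLOBAL innodb_file_format = Barracuda;

-- Rebuilds the table (all partitions) compressed with 4 KB pages:
ALTER TABLE history_str ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4;
```

Note that the ALTER rebuilds the whole table, so plan for the time and disk space this needs.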

The space on disk was used as follows:

-rw-rw---- 1 mysql mysql   7340032 Mar 23 17:33 history_str#P#p2012_kw05.ibd
-rw-rw---- 1 mysql mysql   7340032 Mar 23 17:34 history_str#P#p2012_kw06.ibd
-rw-rw---- 1 mysql mysql   8388608 Mar 23 17:36 history_str#P#p2012_kw07.ibd
-rw-rw---- 1 mysql mysql  75497472 Mar 23 17:49 history_str#P#p2012_kw08.ibd
-rw-rw---- 1 mysql mysql 104857600 Mar 23 17:44 history_str#P#p2012_kw09.ibd
-rw-rw---- 1 mysql mysql 125829120 Mar 23 17:51 history_str#P#p2012_kw10.ibd
-rw-rw---- 1 mysql mysql 125829120 Mar 23 17:57 history_str#P#p2012_kw11.ibd
-rw-rw---- 1 mysql mysql 134217728 Mar 23 18:11 history_str#P#p2012_kw12.ibd

So we got a reduction of used disk space by 40 - 60%! Not too bad.

But we also want to see what impact it has on memory:

SHOW GLOBAL STATUS LIKE 'innodb_buffer%';
+---------------------------------------+----------------------+
| Variable_name                         | Value                |
+---------------------------------------+----------------------+
| Innodb_buffer_pool_pages_data         | 10769                |
| Innodb_buffer_pool_pages_dirty        | 6613                 |
| Innodb_buffer_pool_pages_free         | 644                  |
| Innodb_buffer_pool_pages_misc         | 18446744073709549802 |
| Innodb_buffer_pool_pages_total        | 9599                 |
+---------------------------------------+----------------------+

Those numbers do not really make sense. We also hit a known MySQL Bug #59550: Innodb_buffer_pool_pages_misc goes wrong.

Some fancy graphs

InnoDB Buffer Pool activity

compressed_tables1.png

Because our InnoDB Buffer Pool was too big we reduced it a bit. To enable the Barracuda file format we restarted the database afterwards. And then the numbers went amok...

InnoDB compression time

compressed_tables2.png

For the first time we can see InnoDB compression time in our Monitor... \o/

InnoDB Row operations

compressed_tables3.png

And here you can find out how we solved it technically... :-)

Troubles with MySQL 5.5 on FreeBSD 9


FreeBSD 9 seems to have some troubles with MySQL 5.5.20. A customer moved from MySQL 5.0 on Linux to MySQL 5.5 on FreeBSD 9. He experienced a lot of periodic slowdowns on the new, much stronger system which he had not seen on the old Linux box.

These slowdowns also showed up as high CPU system time, but we could not see any I/O going on.

When we looked into MySQL we have seen many threads in Opening tables state in the MySQL processlist.

The first idea was to increase table_open_cache to 2048 and later to 4096. This made the Opening tables state disappear but then we got a significant amount of threads hanging in the Copying to tmp table state in the processlist.

So we suspected that those tables were going to disk. But we did not see any I/O (with iostat) and Created_tmp_disk_tables did not change significantly, just Created_tmp_tables.

So I suspect some troubles with memory allocation on FreeBSD 9 with MySQL.

Then the customer set table_open_cache = 4!!! And suddenly the problems disappeared. I am a bit confused because this is exactly the opposite of what I would expect.

We played around a bit with table_open_cache, but the bigger we made the value the worse the situation became. So be warned if you see similar symptoms...

Does anybody have any clue what is going on? I have not found any bug in the MySQL bugs database which sounds similar to this...

FromDual Performance Monitor for MySQL (MPM) v0.9 released


On April 2nd 2012 FromDual released the new version v0.9 of its Performance Monitor for MySQL (mpm). The new version can be downloaded from here.

The Performance Monitor for MySQL (mpm) is an agent which is hooked into Zabbix. Zabbix is an integrated Enterprise Monitoring solution which can produce performance graphs and alerting.

The changes in the new release are:

New functionality

  • A new server module gathers MySQL database specific server information. This is especially interesting for the Monitoring as a Service customers.
  • You can monitor Galera Cluster for MySQL now. All important items of Galera Cluster for MySQL up to version 2.0 are gathered. The important Triggers and Graphs are available. FromDual Performance Monitor for MySQL becomes your indispensable tool for monitoring Galera Cluster!
  • A trigger was added on low open_files_limit.

Changed functionality

  • Item history was reduced from 90 to 30 days to save disk space.
  • InnoDB items were added and Graphs improved and cleaned-up.
  • MyISAM items were added and Graphs improved.
  • Query Cache items were added.
  • Some triggers were too verbose or complained when they should not. This should be fixed now.
  • MPM v0.9 was tested with Zabbix 1.8.11 and works without any problems.

Fixes

  • Some items were not reported correctly. Fixed them.
  • Many little bugs in different modules were fixed.

For more detailed information see the CHANGELOG.

Installation and upgrade documentation can be found here.

If you want to stay tuned about the progress of the next mpm release follow us on Twitter...

If you find any bugs please report them to our bugs database. If you have questions or want to exchange know-how related to the mpm please go to our Forum.

MySQL and Galera Load Balancer (GLB)


When you install a Galera Cluster for MySQL for High Availability (HA) it is not enough to install the database cluster alone to achieve this goal. You also have to make the application aware of the HA functionality. This is typically done with some kind of load balancing mechanism between the application and the database.

There are several possibilities to implement such load balancing:

  • We build such a load balancing mechanism directly into the application.
  • When we use Java or PHP we can use the fail-over functionality of the connectors (Connector/J, mysqlnd-ms).
  • If we cannot touch the application we can put a load balancing mechanism between the application and the database. This can be done with:

Building the Galera Load Balancer

As an example we look at the Galera Load Balancer (GLB). Its documentation can be found in the README file.

It can be built as follows:

wget http://www.codership.com/files/glb/glb-0.7.4.tar.gz
tar xf glb-0.7.4.tar.gz
cd glb-0.7.4
./configure
make
make install

Starting the Galera Load Balancer

The Galera Load Balancer will be started as follows:

./glbd --daemon --threads 6 --control 127.0.0.1:4444 127.0.0.1:3306 \
192.168.56.101:3306:1 192.168.56.102:3306:1 192.168.56.103:3306:1
Incoming address:       127.0.0.1:3306 , control FIFO: /tmp/glbd.fifo
Control  address:        127.0.0.1:4444
Number of threads: 6, source tracking: OFF, verbose: OFF, daemon: YES
Destinations: 3
   0:  192.168.56.101:3306 , w: 1.000
   1:  192.168.56.102:3306 , w: 1.000
   2:  192.168.56.103:3306 , w: 1.000

Querying the Galera Load Balancer

It can be queried as follows:

echo getinfo | nc -q 1 127.0.0.1 4444
Router:
----------------------------------------------------
        Address       :   weight   usage   conns
 192.168.56.101:3306  :    1.000   0.667     2
 192.168.56.102:3306  :    1.000   0.500     1
 192.168.56.103:3306  :    1.000   0.500     1
----------------------------------------------------
Destinations: 3, total connections: 4

and

echo getstats | nc -q 1 127.0.0.1 4444
in: 37349 out: 52598 recv: 89947 / 1989 send: 89947 / 1768 conns: 225 / 4
poll: 1989 / 0 / 1989 elapsed: 76.59987

Draining nodes with Galera Load Balancer

Let's assume we want to take node 192.168.56.101 out of the Load Balancer for maintenance purposes. This can be done as follows:

echo 192.168.56.101:3306:0 | nc -q 1 127.0.0.1 4444
echo getinfo | nc -q 1 127.0.0.1 4444
Router:
----------------------------------------------------
        Address       :   weight   usage   conns
 192.168.56.101:3306  :    0.000   1.000     0
 192.168.56.102:3306  :    1.000   0.667     2
 192.168.56.103:3306  :    1.000   0.667     2
----------------------------------------------------
Destinations: 3, total connections: 4

Removing and adding nodes from Galera Load Balancer

If you want to shrink or grow your database cluster, removing a node (weight -1) and adding a node (here with weight 2) works as follows:

echo 192.168.56.103:3306:-1 | nc -q 1 127.0.0.1 4444

echo 192.168.56.103:3306:2 | nc -q 1 127.0.0.1 4444
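
The `echo ... | nc` pattern above can be wrapped into a tiny helper. A sketch of my own (the names `glb_cmd` and `glb_send` are mine; the control address is the one from the example above):

```shell
GLB_CTRL=127.0.0.1     # GLB control address from the example
GLB_CTRL_PORT=4444

# Format a GLB control command: host port weight
# (weight 0 = drain, -1 = remove, > 0 = add or re-weight).
glb_cmd() {
    printf '%s:%s:%s\n' "$1" "$2" "$3"
}

# Push the command to the GLB control port.
glb_send() {
    glb_cmd "$1" "$2" "$3" | nc -q 1 "$GLB_CTRL" "$GLB_CTRL_PORT"
}

glb_cmd 192.168.56.101 3306 0    # -> 192.168.56.101:3306:0 (drain node)
```

For example, `glb_send 192.168.56.101 3306 0` would drain the first node and `glb_send 192.168.56.101 3306 1` would put it back with its original weight.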

And now have fun playing around with your Galera Load Balancer...

How to make the MySQL Performance Monitor work on Windows?


A customer recently asked why our MySQL Performance Monitor (MPM) is not working on Windows... The answer is short: It was developed on Linux and never tested on Windows...

But I was wondering how much effort it would take to make it work on Windows as well.

I was quite surprised how fast it went to get the basic functionality working on Windows. It took me less than one hour to install, configure and patch MPM.

Patch MPM

The file FromDualMySQLagent.pm has to be patched at 2 locations. The lock file name must be something Windows understands (for example C:\Temp\FromDualMySQLagent.lock). We will fix that in the next MPM release.

 40   # Should NOT be hard coded, tofix later!!!
 41   # Does not work on Windows!
 42   my $lAgentLockFile = '/tmp/FromDualMySQLagent.lock';
 43   # Check if lock file already exists and complain if yes
...
533   # Does not work on Windows!
534   my $lAgentLockFile = '/tmp/FromDualMySQLagent.lock';
535   if ( ! unlink($lAgentLockFile) ) {

There are at least 2 other parts in the code which cause trouble. But they can be circumvented by disabling the affected modules (server and process) or by configuring MPM accordingly.

A basic MPM configuration file on Windows

We have used the following basic configuration file:

[default]

LogFile       = C:\Users\oli\logs\mpm.log
Debug         = 2

CacheFileBase = C:\Users\oli\cache

MaaS          = on
Hash          = <your hash>
Methtode      = http
Url           = http://support.fromdual.com/maas/receiver.php


[FromDual.Win_laptop]

Modules       = mpm

[FromDual.Win_laptop.win_db]

In your case some more configuration is possibly needed. For details please look here.

Now we are quite confident that the next MPM release will work more or less out of the box on Windows. If you cannot wait, try it out with this hack. More details about installing MPM on Windows can be found here. If you run into problems please report them in the MPM installation on Windows forum. All paying customers can naturally use our support platform.

MySQL @ FrOSCon 7 in St. Augustin (Germany)


Also this year we will have a special track for MySQL, Galera, Percona and MariaDB at the FrOSCon in St. Augustin in Germany. The conference is scheduled for August 25 and 26 2012.

Together with the PostgreSQL people we are organizing a sub-conference for Open Source RDBMS there. Now we are looking for interesting talks about MySQL and related technologies like Galera, Percona and MariaDB. The only restriction for the talks is: They must be about an Open Source topic.

We encourage you to send your proposals.

After registering you can Submit a new event. Choose the Databases track. This makes it easier to assign the proposal.

Regarding the talks: Please do NOT add talks about NON Open Source solutions. They can be about new technical things or about user experience with MySQL technology.

Keep in mind that the audience is going to be technically driven. Think of the audience as colleagues, not as decision makers.

Please help spread the word for the conference by blogging and tweeting about it (#froscon)!

And now let us go...

Oli

Change MyISAM tables to InnoDB and handle SELECT COUNT(*) situation


It's a known problem that changing the Storage Engine from MyISAM to InnoDB can cause some problems [ 1 ] if you have queries of this type:

SELECT COUNT(*) from table;

Luckily this query is rare, and when it does occur it can often be omitted or worked around by estimating the number of rows in the table. For example with:

SHOW TABLE STATUS LIKE 'test';

But in some rare cases the customer really needs these values. To avoid exhausting the resources of the server with this query, which in some cases is fired quite often, we make use of the materialized view/shadow table technique [ 2 ].

The following example illustrates how to do this.

Our original situation

We have an offer table which is fed by a host system:

CREATE TABLE offer (
  id   int unsigned NOT NULL AUTO_INCREMENT
, `type` CHAR(3) NOT NULL DEFAULT 'AAA'
, data varchar(64) DEFAULT NULL
, PRIMARY KEY (`id`)
, INDEX (type)
) ENGINE=InnoDB;

INSERT INTO offer VALUES (NULL, 'AAA', 'Blablabla');
INSERT INTO offer VALUES (NULL, 'ABC', 'Blablabla');
INSERT INTO offer VALUES (NULL, 'ZZZ', 'Blablabla');

The query we want to perform looks like this:

SELECT COUNT(*) FROM offer;

This query becomes expensive when you have zillions of rows in your table.

The work around

To work around the problem we create a counter table where we count the rows which are inserted, updated or deleted on the offer table.

CREATE TABLE counter (
  `type` char(3) NOT NULL DEFAULT 'AAA'
, `count` MEDIUMINT UNSIGNED NOT NULL DEFAULT 0
, `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
, PRIMARY KEY (type)
) ENGINE=InnoDB;

To fill this counter table we need an initial snapshot:

INSERT INTO counter
SELECT type, COUNT(*), NULL
  FROM offer
 GROUP BY type;

SELECT * FROM counter;
SELECT COUNT(*) FROM counter;

Update the counter table

To keep the counter table up-to-date we need the following 3 triggers:

DROP TRIGGER IF EXISTS insert_offer_trigger;

delimiter //

CREATE TRIGGER insert_offer_trigger
AFTER INSERT ON offer FOR EACH ROW
BEGIN
  INSERT INTO counter
  VALUES (NEW.type, 1, NULL)
  ON DUPLICATE KEY
  UPDATE count = count + 1, ts = CURRENT_TIMESTAMP();
END;
//

delimiter ;


DROP TRIGGER IF EXISTS update_offer_trigger;

delimiter //

CREATE TRIGGER update_offer_trigger
AFTER UPDATE ON offer FOR EACH ROW
BEGIN
  IF NEW.type = OLD.type THEN
    UPDATE counter SET ts = CURRENT_TIMESTAMP() WHERE type = NEW.type;
  ELSE
    UPDATE counter
       SET count = count - 1, ts = CURRENT_TIMESTAMP()
     WHERE type = OLD.type;
    INSERT INTO counter
    VALUES (NEW.type, 1, NULL)
    ON DUPLICATE KEY
    UPDATE count = count + 1, ts = CURRENT_TIMESTAMP();
  END IF;
END;
//

delimiter ;


DROP TRIGGER IF EXISTS delete_offer_trigger;

delimiter //

CREATE TRIGGER delete_offer_trigger
AFTER DELETE ON offer FOR EACH ROW
BEGIN
  UPDATE counter SET count = count - 1 WHERE type = OLD.type;
END;
//

delimiter ;

Now we can test some cases and compare the results of both tables:

INSERT INTO offer VALUES (NULL, 'AAA', 'Blablabla');
INSERT INTO offer VALUES (NULL, 'AAA', 'Blablabla');

-- Single offer change
UPDATE offer SET data = 'Single offer change' WHERE id = 2;

-- Multi offer change
UPDATE offer SET data = 'Multi offer change' WHERE type = 'AAA';

-- Single offer delete
DELETE FROM offer WHERE id = 1;

-- REPLACE (= DELETE / INSERT)
REPLACE INTO offer VALUES (3, 'ZZZ', 'Single row replace');

-- New type
INSERT INTO offer VALUES (NULL, 'DDD', 'Blablabla');

-- Change of type
UPDATE offer SET type = 'ZZZ' where id = 2;

-- Change of type to new type
UPDATE offer SET type = 'YYY' where id = 3;

-- INSERT on DUPLICATE KEY UPDATE
INSERT INTO offer VALUES (7, 'DDD', 'ON DUPLICATE KEY UPDATE')
ON DUPLICATE KEY UPDATE type = 'DDD', data = 'INSERT ON DUPLICATE KEY';
INSERT INTO offer VALUES (7, 'DDD', 'ON DUPLICATE KEY UPDATE')
ON DUPLICATE KEY UPDATE type = 'DDD', data = 'UPDATE ON DUPLICATE KEY UPDATE';

SELECT * FROM offer;
SELECT COUNT(*) FROM offer;
SELECT * FROM counter;
SELECT SUM(count) FROM counter;

This solution has the additional advantage that we also get a very fast answer for the number of rows of a specific offer type, which would be expensive even for MyISAM tables...
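The same counter-table technique can be demonstrated end-to-end outside of MySQL. The following sketch uses Python with SQLite purely to show that the trigger-maintained counters stay in sync with the real row count; the trigger syntax differs slightly from MySQL (no ON DUPLICATE KEY UPDATE, so an INSERT OR IGNORE plus UPDATE is used) and the ts column is omitted:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE offer (
  id   INTEGER PRIMARY KEY AUTOINCREMENT,
  type TEXT NOT NULL DEFAULT 'AAA',
  data TEXT
);
CREATE TABLE counter (
  type  TEXT PRIMARY KEY,
  count INTEGER NOT NULL DEFAULT 0
);

-- AFTER INSERT: create the counter row if needed, then increment it
CREATE TRIGGER insert_offer_trigger AFTER INSERT ON offer
BEGIN
  INSERT OR IGNORE INTO counter VALUES (NEW.type, 0);
  UPDATE counter SET count = count + 1 WHERE type = NEW.type;
END;

-- AFTER UPDATE: move the count over if the type changed
CREATE TRIGGER update_offer_trigger AFTER UPDATE ON offer
WHEN NEW.type <> OLD.type
BEGIN
  UPDATE counter SET count = count - 1 WHERE type = OLD.type;
  INSERT OR IGNORE INTO counter VALUES (NEW.type, 0);
  UPDATE counter SET count = count + 1 WHERE type = NEW.type;
END;

-- AFTER DELETE: decrement
CREATE TRIGGER delete_offer_trigger AFTER DELETE ON offer
BEGIN
  UPDATE counter SET count = count - 1 WHERE type = OLD.type;
END;
""")

cur.executemany("INSERT INTO offer (type, data) VALUES (?, ?)",
                [("AAA", "x"), ("AAA", "y"), ("ZZZ", "z")])
cur.execute("UPDATE offer SET type = 'ZZZ' WHERE id = 1")
cur.execute("DELETE FROM offer WHERE id = 3")

total = cur.execute("SELECT SUM(count) FROM counter").fetchone()[0]
real  = cur.execute("SELECT COUNT(*) FROM offer").fetchone()[0]
assert total == real  # counters stay in sync with the real row count
```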


Deadlocks, indexing and Primary Keys


Recently a customer showed up with deadlocks occurring frequently. They were of the following type (I have shortened the output a bit):

*** (1) TRANSACTION:

TRANSACTION 22723019234, fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 7 lock struct(s), heap size 1216, 14 row lock(s)
update location set expires='2012-08-10 04:50:29' where username='12345678901' AND contact='sip:12345678901@192.168.0.191:5060' AND callid='945a20d4-8945dcb5-15511d3e@192.168.0.191'

*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 2203904 n bits 136 index `GEN_CLUST_INDEX` of table `location` trx id 22723019234 lock_mode X locks rec but not gap waiting


*** (2) TRANSACTION:

TRANSACTION 22723019222, fetching rows, thread declared inside InnoDB 225
mysql tables in use 1, locked 1
192 lock struct(s), heap size 30704, 9483 row lock(s)
delete from location where expires<'2012-08-10 04:49:30' AND expires!='1969-12-31 19:00:00'

*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 2203904 n bits 136 index `GEN_CLUST_INDEX` of table `location` trx id 22723019222 lock_mode X

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 2203951 n bits 136 index `GEN_CLUST_INDEX` of table `location` trx id 22723019222 lock_mode X waiting

*** WE ROLL BACK TRANSACTION (1)

They wanted us to fix this. And they did not like the answer that the application has to cope with deadlocks [ 1 ].
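Coping with deadlocks on the application side usually means catching the deadlock error and re-running the whole transaction. A minimal retry wrapper might look like the following sketch; the DeadlockError class here stands in for whatever exception your connector raises for MySQL error 1213:

```python
import random
import time

class DeadlockError(Exception):
    """Stands in for the connector's deadlock exception (MySQL error 1213)."""

def run_with_retry(txn, retries=3, base_delay=0.05):
    """Run a transaction callable, retrying it when it deadlocks.

    `txn` must be idempotent or fully rolled back on failure,
    because it is executed again from the start after each deadlock.
    """
    for attempt in range(retries):
        try:
            return txn()
        except DeadlockError:
            if attempt == retries - 1:
                raise
            # back off with jitter so the competing transaction
            # has a chance to finish first
            time.sleep(base_delay * (2 ** attempt) * random.random())
```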

But one thing looks suspicious here: the GEN_CLUST_INDEX! This basically means there was NO explicit Primary Key on the InnoDB table and InnoDB created its own internal Primary Key.

After some discussion we started to examine the whole situation. For this we transformed the UPDATE and the DELETE statements into SELECTs:

UPDATE

EXPLAIN
SELECT *
  FROM location
 WHERE username='12345678901'
   AND contact='sip:12345678901@192.168.0.191:5060'
   AND callid='945a20d4-8945dcb5-15511d3e@192.168.0.191'
;

+----+-------------+----------+------+---------------+----------+---------+-------+------+------------------------------------+
| id | select_type | table    | type | possible_keys | key      | key_len | ref   | rows | Extra                              |
+----+-------------+----------+------+---------------+----------+---------+-------+------+------------------------------------+
|  1 | SIMPLE      | location | ref  | username      | username | 66      | const |    9 | Using index condition; Using where |
+----+-------------+----------+------+---------------+----------+---------+-------+------+------------------------------------+

The first strange thing is that the MySQL optimizer expects 9 rows (on a long key of 66 bytes) for what theoretically should be a Primary Key access! This sounds non-optimal. And the shorter and faster transactions are, the less probable deadlocks become.

So we looked at the transaction with SHOW ENGINE INNODB STATUS:

START TRANSACTION;
SELECT *
  FROM location
 WHERE username='12345678901'
   AND contact='sip:12345678901@192.168.0.191:5060'
   AND callid='945a20d4-8945dcb5-15511d3e@192.168.0.191'
   FOR UPDATE
;

---TRANSACTION 14033
4 lock struct(s), heap size 1248, 11 row lock(s)
MySQL thread id 5, OS thread handle 0x7f3647b9e700, query id 526 localhost root cleaning up

We can see that the same query uses 4 lock structs and locks 11 rows in total. This is similar to what we have seen in the deadlock.

DELETE

EXPLAIN
SELECT *
  FROM location
 WHERE expires < '2012-08-10 04:49:30'
   AND expires != '1969-12-31 19:00:00'
;

+----+-------------+----------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table    | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+----------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | location | ALL  | NULL          | NULL | NULL    | NULL | 10754 | Using where |
+----+-------------+----------+------+---------------+------+---------+------+-------+-------------+

Uiii! The DELETE does not use an index at all but does a full table scan, which is not optimal performance-wise...

START TRANSACTION;
SELECT *
  FROM location
 WHERE expires < '2012-08-10 04:49:30'
   AND expires != '1969-12-31 19:00:00'
   FOR UPDATE
;

---TRANSACTION 14034
168 lock struct(s), heap size 31160, 11007 row lock(s)
MySQL thread id 6, OS thread handle 0x7f3647b9e700, query id 663 localhost root cleaning up

And we can see a huge amount of locked rows... The table contains 10840 rows in total. Those numbers differ a bit from the deadlock output, but that is OK because they do not represent the same point in time.

So we started looking at the table structure. The table which was provided by the customer looks as follows:

CREATE TABLE `location` (
  `username` varchar(64) NOT NULL DEFAULT '',
  `domain` varchar(128) NOT NULL DEFAULT '',
  `contact` varchar(255) NOT NULL DEFAULT '',
  `received` varchar(255) DEFAULT NULL,
  `path` varchar(255) DEFAULT NULL,
  `expires` datetime NOT NULL DEFAULT '2020-01-01 00:00:00',
  `q` float(10,2) NOT NULL DEFAULT '1.00',
  `callid` varchar(255) NOT NULL DEFAULT 'Default-Call-ID',
  `cseq` int(11) NOT NULL DEFAULT '42',
  `last_modified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `replicate` int(10) unsigned NOT NULL DEFAULT '0',
  `state` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `flags` int(11) NOT NULL DEFAULT '0',
  `cflags` int(11) NOT NULL DEFAULT '0',
  `user_agent` varchar(100) NOT NULL DEFAULT '',
  `socket` varchar(128) DEFAULT NULL,
  `methods` int(11) DEFAULT NULL,
  `id` int(10) NOT NULL DEFAULT '0',
  KEY `username` (`username`,`domain`,`contact`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
;

First I want to fix the problem of the full table scan:

ALTER TABLE location ADD INDEX (expires);

Then the DELETE looks much better:

+----+-------------+----------+-------+---------------+---------+---------+------+------+-----------------------+
| id | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | Extra                 |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-----------------------+
|  1 | SIMPLE      | location | range | expires       | expires | 5       | NULL |    2 | Using index condition |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-----------------------+

---TRANSACTION 14074
2 lock struct(s), heap size 1248, 1 row lock(s)
MySQL thread id 6, OS thread handle 0x7f3647b9e700, query id 671 localhost root cleaning up

But I do not know how realistic my test data are. This can change with another data set!

Now I want to see if there is any difference with the KEY declared as a Primary Key:

ALTER TABLE location DROP INDEX username, ADD PRIMARY KEY (username, domain, contact);

Long indexes are bad for InnoDB. See a blog post which will hopefully appear soon!

+----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table    | type | possible_keys | key     | key_len | ref   | rows | Extra       |
+----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+
|  1 | SIMPLE      | location | ref  | PRIMARY       | PRIMARY | 66      | const |    9 | Using where |
+----+-------------+----------+------+---------------+---------+---------+-------+------+-------------+

---TRANSACTION 14145
3 lock struct(s), heap size 1248, 10 row lock(s)
MySQL thread id 6, OS thread handle 0x7f3647b9e700, query id 684 localhost root cleaning up

The execution plan looks the same. OK, 1 row less is locked. Does this mean a 10% lower probability of deadlocks?

The effect was not as big as expected. So we roll back the last change (making at least a unique key out of it).

As already mentioned, short Primary Keys are good for InnoDB. And as we will show in a blog post soon, VARCHARs are bad performance-wise. So we try to use the apparently unused field id:

ALTER TABLE location DROP PRIMARY KEY, ADD UNIQUE KEY (username, domain, contact);

ALTER TABLE location MODIFY COLUMN id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY;

EXPLAIN
SELECT *
  FROM location
 WHERE id = 2984
;

+----+-------------+----------+-------+---------------+---------+---------+-------+------+-------+
| id | select_type | table    | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+-------+------+-------+
|  1 | SIMPLE      | location | const | PRIMARY       | PRIMARY | 4       | const |    1 | NULL  |
+----+-------------+----------+-------+---------------+---------+---------+-------+------+-------+

START TRANSACTION;
SELECT *
  FROM location
 WHERE id = 2984
   FOR UPDATE
;

---TRANSACTION 14336
2 lock struct(s), heap size 376, 1 row lock(s)
MySQL thread id 6, OS thread handle 0x7f3647b9e700, query id 701 localhost root cleaning up

And see there: the Query Execution Plan looks much better and the locks are far fewer and smaller.

As a result

I hope we can add the results on the impact on deadlock occurrence here soon.

Galera Cluster discussions at FrOSCon 2012


During and after Henrik's great talk about Galera Cluster at FrOSCon 2012 in St. Augustin we found out 2 important things related to Galera Cluster for MySQL:

  • The InnoDB double write buffer (innodb_doublewrite) should not be disabled anymore for Galera when using v2.0 and higher!!! The reason: When MySQL crashes, InnoDB pages might get corrupted during the crash. They would normally be fixed by the blocks from the double write buffer during auto-recovery. But if the double write buffer is disabled, those blocks are not available. With Galera v1.x this was not a problem because after a crash an SST would have happened and the corrupted InnoDB blocks would be corrected. But now with IST in Galera v2.0, MySQL will start without noticing the corruption (as usual) and only an IST is performed. This leads to a running MySQL database with possibly corrupted InnoDB blocks. And this might cause you trouble later, for example if this node is used as a donor: the corrupted page is passed on to other nodes (using rsync or Xtrabackup?). And in some bad cases the whole Cluster could then crash at once when hitting the corrupted page. Thanks to Monty W. for bringing this up!
    Recommendation: Do NOT disable the InnoDB double write buffer (innodb_doublewrite) with Galera Cluster >= v2.0 if you care about your data!
  • The second discussion was about the event sequence in the binary log (for those who were present: the A-B vs B-A discussion). Codership confirmed that the binary log sequence on 2 different Galera nodes of the same Galera Cluster should be the same (everything else is considered a bug). This leads to 2 different consequences:
    a) The binary log of node B can be used for a PiTR of node A in case we need it. Finding the right position is a bit tricky and needs some manual work (finding the XID for the binlog position of node B, then finding the binlog position of node A for that XID). But Codership told me they are planning a tool to automate this.
    b) The binary log of node B can be used for a channel fail-over in case we have 2 different Galera Clusters in 2 different data centers connected to each other through MySQL asynchronous replication... For more on this topic see also MySQL Cluster and channel failover...

Galera Cluster Nagios Plugin


Based on customer feedback we have decided to add a Galera Cluster for MySQL plugin to our MySQL Nagios/Icinga Plugins.

The module checks whether the node is in status Primary and whether the expected number of Galera Cluster nodes is available. If not, a warning or an alarm is returned.

The script is written in Perl and is Nagios Plugin API v3.0 compatible.
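The core decision logic of such a check can be sketched in a few lines. The real plugin is Perl; the following Python sketch only illustrates the mapping from the wsrep status values (as reported by SHOW GLOBAL STATUS) to the standard Nagios exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL):

```python
OK, WARNING, CRITICAL = 0, 1, 2

def check_galera(cluster_status, cluster_size, expected_nodes):
    """Map Galera status variables to a Nagios state and message.

    cluster_status: value of wsrep_cluster_status (e.g. 'Primary')
    cluster_size:   value of wsrep_cluster_size
    expected_nodes: number of nodes the cluster should have
    """
    if cluster_status != "Primary":
        return CRITICAL, "node is not in Primary state"
    if cluster_size < expected_nodes:
        return WARNING, "only %d of %d nodes available" % (cluster_size, expected_nodes)
    return OK, "cluster OK (%d nodes)" % cluster_size
```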

You can download it from our download page.

If you have suggestions for improvements, please contact us. Bugs can be reported at our bugs database.

The following modules are contained in the package:

  • check_db_mysql.pl
  • check_errorlog_mysql.pl
  • check_galera_nodes.pl
  • check_repl_mysql_cnt_slave_hosts.pl
  • check_repl_mysql_hearbeat.pl
  • check_repl_mysql_io_thread.pl
  • check_repl_mysql_read_exec_pos.pl
  • check_repl_mysql_readonly.pl
  • check_repl_mysql_seconds_behind_master.pl
  • check_repl_mysql_sql_thread.pl
  • perf_mysql.pl

MySQL tmpdir on RAM-disk


MySQL temporary tables are created either in memory (as MEMORY tables) or on disk (as MyISAM tables). How many tables went to disk and how many stayed in memory you can find out with:

mysql> SHOW GLOBAL STATUS LIKE 'Created_tmp%tables';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Created_tmp_disk_tables | 49094    |
| Created_tmp_tables      | 37842181 |
+-------------------------+----------+

Tables created in memory are typically faster than tables created on disk. Thus we want as many tables as possible to be created in memory.
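A quick way to judge the situation is the percentage of temporary tables that went to disk. A small sketch, using the status values from above:

```python
def tmp_disk_table_ratio(created_tmp_disk_tables, created_tmp_tables):
    """Percentage of implicit temporary tables that went to disk."""
    if created_tmp_tables == 0:
        return 0.0
    return 100.0 * created_tmp_disk_tables / created_tmp_tables

# with the SHOW GLOBAL STATUS values from above: about 0.13 %
ratio = tmp_disk_table_ratio(49094, 37842181)
```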

To achieve this we can configure the variables accordingly:

mysql> SHOW GLOBAL VARIABLES LIKE '%table_size';
+---------------------+----------+
| Variable_name       | Value    |
+---------------------+----------+
| max_heap_table_size | 25165824 |
| tmp_table_size      | 25165824 |
+---------------------+----------+

All result sets smaller than these values can be handled as MEMORY tables. All result sets bigger than these values are handled as MyISAM tables and go to disk.

But there is still another reason for tables going to disk: MEMORY tables cannot handle TEXT or BLOB attributes, which often occur in CMS like Typo3. In these cases MySQL has to create MyISAM tables on disk directly and they are counted as Created_tmp_disk_tables.

If these temporary disk tables are causing serious I/O performance problems one could consider using a RAM-disk instead of normal physical disks.

On Linux we have 2 possibilities to create a RAM-disk: ramfs and tmpfs [ 1 ].

We recommend to use tmpfs.

A RAM-disk can be created as follows:

shell> mkdir -p /mnt/ramdisk
shell> chown mysql:mysql /mnt/ramdisk
shell> mount -t tmpfs -o size=512M tmpfs /mnt/ramdisk

To make this persistent we have to add it to the fstab:

#
# /etc/fstab
#

tmpfs           /mnt/ramdisk     tmpfs   rw,mode=1777,size=512M    0       0

MySQL still writes to the default location which is found as follows:

mysql> SHOW GLOBAL VARIABLES LIKE 'tmpdir';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| tmpdir        | /tmp  |
+---------------+-------+

To change this value you have to configure your my.cnf accordingly and restart the database...
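For example, assuming the RAM-disk mounted at /mnt/ramdisk as shown above, the my.cnf entry would be:

```ini
[mysqld]
tmpdir = /mnt/ramdisk
```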

Resize XFS file system for MySQL


Important: Before you start any of the operations mentioned below, do a proper file system backup of the XFS file system you want to resize. If MySQL is running on this mount point, do this with mysqld stopped. Alternatively you can also use mysqldump to do the MySQL backup, but test the restore time before continuing to avoid ugly surprises...

All these operations have to be performed as the root user. First we want to see what mount points are available:

shell> df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             485M   77M  383M  17% /
/dev/sdb1             496M  314M  157M  67% /var/lib/mysql

Our MySQL data are located on /dev/sdb1.

After the file system backup, unmount /dev/sdb1 and resize the disk, partition or volume (works for VMware, NetApp filers and similar equipment; for LVM use lvextend):

shell> tar cvf /backup/mysql.tar /var/lib/mysql
shell> umount /var/lib/mysql
shell> fdisk /dev/sdb

Change the units in fdisk to get a better overview of the begin and end of your partition:

fdisk> u
Changing display/entry units to sectors

fdisk> p

Disk /dev/sdb: 1073 MB, 1073741824 bytes
139 heads, 8 sectors/track, 1885 cylinders, total 2097152 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xea17dfd0

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               8     1047503      523748   83  Linux

fdisk> d
Selected partition 1

fdisk> n
Command action
   e   extended
   p   primary partition (1-4)
fdisk> p
Partition number (1-4):
fdisk> 1
First sector (8-2097151, default 8):
Using default value 8
Last sector, +sectors or +size{K,M,G} (8-2097151, default 2097151):
Using default value 2097151
fdisk> w

shell> fdisk /dev/sdb
fdisk> p

Disk /dev/sdb: 1073 MB, 1073741824 bytes
139 heads, 8 sectors/track, 1885 cylinders, total 2097152 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xea17dfd0

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               8     2097151     1048572   83  Linux

Now the partition has the new size. The next step is to resize the XFS file system. Install the XFS tools if they are not already there:

shell> apt-get install xfsprogs    # Debian/Ubuntu

shell> yum install xfsprogs        # RedHat/CentOS

Then mount the file system again and extend it on-line (xfs_growfs works on a mounted file system):

shell> mount /var/lib/mysql
shell> xfs_growfs /var/lib/mysql

shell> df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             485M   77M  383M  17% /
/dev/sdb1             992M  314M  627M  34% /var/lib/mysql
