Channel: Shinguz's blog
Impact of column types on MySQL JOIN performance

In our MySQL trainings and consulting engagements we always tell our customers to use the smallest possible data type to get better query performance, especially for JOIN columns. This advice is also supported by the MySQL documentation in the chapter Optimizing Data Types:

Use the most efficient (smallest) data types possible. MySQL has many specialized types that save disk space and memory. For example, use the smaller integer types if possible to get smaller tables. MEDIUMINT is often a better choice than INT because a MEDIUMINT column uses 25% less space.

I remember that JOIN columns were explicitly mentioned somewhere, but I cannot find it any more.
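
Whether a narrower integer type is safe depends on the values actually stored in the column. A minimal sketch of such a check, assuming a hypothetical table `my_table` with an unsigned integer column `my_id` (both names are placeholders and not part of the test set-up in this post):

```sql
-- MEDIUMINT UNSIGNED stores 0 .. 16777215 in 3 bytes,
-- INT UNSIGNED stores 0 .. 4294967295 in 4 bytes.
-- my_table and my_id are hypothetical names used for illustration only.
SELECT MAX(my_id)             AS max_id
     , MAX(my_id) <= 16777215 AS fits_mediumint_unsigned
  FROM my_table;
```

If `fits_mediumint_unsigned` is 1 and the values are not expected to grow beyond that limit, the column is a candidate for the smaller type.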

Test set-up

To get numbers we have created a little test set-up:

CREATE TABLE `a` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT
, `data` varchar(64) DEFAULT NULL
, `ts` timestamp NOT NULL
, PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `b` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT
, `data` varchar(64) DEFAULT NULL
, `ts` timestamp NOT NULL
, `a_id` int(10) unsigned DEFAULT NULL
, PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Table a contains 1048576 rows, table b contains 16777216 rows.

The following query was used for the test:

EXPLAIN SELECT * FROM a JOIN b ON b.a_id = a.id WHERE a.id BETWEEN 10000 AND 15000;
+----+-------------+-------+--------+---------------+---------+---------+-------------+----------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref         | rows     | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-------------+----------+-------------+
|  1 | SIMPLE      | b     | ALL    | NULL          | NULL    | NULL    | NULL        | 16322446 | Using where |
|  1 | SIMPLE      | a     | eq_ref | PRIMARY       | PRIMARY | 4       | test.b.a_id |        1 | NULL        |
+----+-------------+-------+--------+---------------+---------+---------+-------------+----------+-------------+

And yes: I know this query could be improved by adding an index on b.a_id.
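
For completeness, such an index could be added as sketched below. The index name `a_id` is an arbitrary choice. The execution plan above shows that the measurements in this post were taken without it, so this is not part of the test itself:

```sql
-- Add a secondary index on the JOIN column of b.
-- The index name a_id is an arbitrary choice.
ALTER TABLE b ADD INDEX a_id (a_id);

-- Re-check the execution plan afterwards.
EXPLAIN SELECT * FROM a JOIN b ON b.a_id = a.id WHERE a.id BETWEEN 10000 AND 15000;
```

With the index in place the optimizer has the option to start from the range on a.id and look up the matching rows in b instead of scanning b completely.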

Results

The whole workload was executed completely in memory and was thus CPU-bound (we did not want to measure the speed of our I/O system).
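
The post does not say how this was verified. One possible way to check that an InnoDB workload is served from memory is sketched here, as an assumption and not as the method actually used:

```sql
-- Configured size of the InnoDB buffer pool in bytes.
SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_size';

-- Counter of logical reads that could not be served from the buffer pool
-- and had to go to disk. Note the value, run the test query, check again.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads';
```

If the counter does not increase between the two checks, the query did not cause physical reads from the InnoDB data files.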

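+--------+---------------+-------+------------+----------+------------+-------+---------------+
| Engine | JOIN column   | Bytes | Query time | Relative | Gain       | Space | Character set |
+--------+---------------+-------+------------+----------+------------+-------+---------------+
| InnoDB | MEDIUMINT     | 3     | 5.28 s     | 96%      | 4% faster  | 75%   |               |
| InnoDB | INT           | 4     | 5.48 s     | 100%     | 100%       | 100%  |               |
| InnoDB | BIGINT        | 8     | 5.65 s     | 107%     | 7% slower  | 200%  |               |
| InnoDB | NUMERIC(7, 2) | ~4    | 6.77 s     | 124%     | 24% slower | ~100% |               |
| InnoDB | VARCHAR(7)    | 7-8   | 6.44 s     | 118%     | 18% slower | ~200% | latin1        |
| InnoDB | VARCHAR(16)   | 7-8   | 6.44 s     | 118%     | 18% slower | ~200% | latin1        |
| InnoDB | VARCHAR(32)   | 7-8   | 6.42 s     | 118%     | 18% slower | ~200% | latin1        |
| InnoDB | VARCHAR(128)  | 7-8   | 6.46 s     | 118%     | 18% slower | ~200% | latin1        |
| InnoDB | VARCHAR(256)  | 8-9   | 6.17 s     | 114%     | 14% slower | ~225% | latin1        |
| InnoDB | VARCHAR(16)   | 7-8   | 6.96 s     | 127%     | 27% slower | ~200% | utf8          |
| InnoDB | VARCHAR(128)  | 7-8   | 6.82 s     | 124%     | 24% slower | ~200% | utf8          |
| InnoDB | CHAR(16)      | 16    | 6.85 s     | 125%     | 25% slower | 400%  | latin1        |
| InnoDB | CHAR(128)     | 128   | 9.68 s     | 177%     | 77% slower | 3200% | latin1        |
| InnoDB | TEXT          | 8-9   | 10.7 s     | 195%     | 95% slower | ~225% | latin1        |
| MyISAM | INT           | 4     | 3.16 s     | 58%      | 42% faster |       |               |
| TokuDB | INT           | 4     | 4.52 s     | 82%      | 18% faster |       |               |
+--------+---------------+-------+------------+----------+------------+-------+---------------+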
Some comments on the tests:

  • MySQL 5.6.13 was used for most of the tests.
  • TokuDB v7.1.0 was tested with MySQL 5.5.30.
  • The optimistic (best-case) measurements were taken as results. In reality the results can be slightly worse.
  • We did not take into consideration that bigger data types will eventually cause more I/O which is very slow!

Commands

ALTER TABLE a CONVERT TO CHARACTER SET latin1;
ALTER TABLE b CONVERT TO CHARACTER SET latin1;

ALTER TABLE a MODIFY COLUMN id INT UNSIGNED NOT NULL;
ALTER TABLE b MODIFY COLUMN a_id INT UNSIGNED NOT NULL;
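
The other variants were presumably produced with the same pattern. A sketch for the MEDIUMINT case follows. These statements are an assumption and are not quoted from the original test:

```sql
-- Same pattern as above, here for MEDIUMINT (an assumption, not quoted from the post).
-- MEDIUMINT UNSIGNED holds values up to 16777215, so all ids must stay within that range.
ALTER TABLE a MODIFY COLUMN id MEDIUMINT UNSIGNED NOT NULL;
ALTER TABLE b MODIFY COLUMN a_id MEDIUMINT UNSIGNED NOT NULL;

-- Then run the test query without EXPLAIN and note the elapsed time
-- reported by the mysql client.
SELECT * FROM a JOIN b ON b.a_id = a.id WHERE a.id BETWEEN 10000 AND 15000;
```

Both sides of the JOIN are changed together so that the compared columns keep the same type.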