Optimizing MySQL performance by using EXPLAIN — Query Execution Plan
This is the 2nd post in a series on MySQL performance.
The previous post discussed the types of indexes and how they help query performance.
The MySQL optimizer uses several techniques to optimize SQL lookups based on your tables, columns, and indexes as well as your WHERE clause. It is possible to perform a query on a large table without reading all its rows; a join involving several tables can be performed without comparing every combination of rows.
EXPLAIN — Query Execution Plan
The set of operations that the optimizer chooses to execute the most efficient query is known as the “query execution plan”, also known as the EXPLAIN plan. Your goal here is to identify the aspects of the EXPLAIN plan that indicate a query is optimized well and to learn the SQL syntax and indexing techniques needed to improve the plan if you see some inefficient operations.
Depending on the query, EXPLAIN output can be vast and varied.
EXPLAIN — Output
The following table describes the columns produced by EXPLAIN. Each row of output provides information about one table.
+---------------+------------------------------------------------+
| column | Meaning |
+---------------+------------------------------------------------+
| id | The SELECT identifier |
| select_type | The SELECT type |
| table | The table for the output row |
| partitions | The matching partitions |
| type | The join type |
| possible_keys | The possible indexes to choose |
| key | The index is actually chosen |
| key_len | The length of the chosen key |
| ref | The columns compared to the index |
| rows | Estimate of rows to be examined |
| filtered | Percentage of rows filtered by table condition |
| Extra | Additional information |
+---------------+------------------------------------------------+
- id: This is the sequential number of the SELECT within the query.
- select_type: The type of SELECT, which can be any of those shown below.
+----------------------+-------------------------------------------+
| select_type | Meaning |
+----------------------+-------------------------------------------+
SIMPLE | Simple SELECT
PRIMARY | Outermost SELECT
UNION | Second or later SELECT statement in a
UNION
DEPENDENT UNION | Second or later SELECT statement in a
UNION, dependent on outer query
UNION RESULT | Result of a UNION
SUBQUERY | First SELECT in subquery
DEPENDENT SUBQUERY | First SELECT in subquery, dependent on
outer query
DERIVED | Derived table
DEPENDENT DERIVED | Derived table dependent on another table
MATERIALIZED | Materialized subquery
UNCACHEABLE SUBQUERY | A subquery for which the result cannot be
cached and must be re-evaluated for each
row of the outer query
UNCACHEABLE UNION | The second or later select in a UNION that
belongs to an uncacheable subquery
+----------------------+-------------------------------------------+
- table: The name of the table to which the row of output refers.
- partitions: The partitions from which records would be matched by the query.
- type: The join type, describes how tables are joined.
+-----------------+------------------------------------------------+ | type | Meaning | +-----------------+------------------------------------------------+ system | The table has only one row const | The table has at most one matching row eq_ref | One row is read from this table for each combination of rows from the previous tables. ref | All rows with matching index values are read from this table for each combination of rows from the previous tables fulltext | The join is performed using a FULLTEXT index. ref_or_null | This join type is like ref, but with the addition that MySQL does an extra search for rows that contain NULL values. index_merge | This join type indicates that the Index Merge optimization is used. unique_subquery | This type replaces eq_ref for some IN subqueries index_subquery | This join type is similar to unique_subquery range | Only rows that are in a given range are retrieved, using an index to select the rows index | The index join type is the same as ALL, except that the index tree is scanned ALL | A full table scan is done for each combination of rows from the previous tables.
- possible_keys: Indicates the indexes from which MySQL can choose to find the rows in this table.
- key: Indicates the key (index) that MySQL actually decided to use.
- key_len: Indicates the length of the key that MySQL decided to use.
- ref: Shows which columns or constants are compared to the index named in the key column to select rows from the table.
- rows: Indicates the number of rows MySQL believes it must examine to execute the query.
- filtered: Indicates an estimated percentage of table rows that are filtered by the table condition.
- Extra: Contains additional information about how MySQL resolves the query.
Queries where MySQL can use an index:
- Find matching rows: WHERE
- Groups rows: GROUP BY
- Sort rows: ORDER BY
- Avoid reading rows: Covering Indexes
- Join table: Joins
To demonstrate the concept, we will use the following schema with 10 random records populated.
CREATE TABLE records (
id int unsigned NOT NULL,
a char(2) NOT NULL,
b char(2) NOT NULL,
c char(2) NOT NULL,
PRIMARY KEY (id),
KEY `idx_a_b` (a,b)
)
Find matching rows: WHERE
In a WHERE clause, MySQL can use an index to find rows that match table conditions. The use of indexes depends on several factors, including table conditions, indexes, and the leftmost prefix requirement.
A query must use the index’s left-most prefix in order to use it. The index definition specifies at least one index column, beginning with the left-most index column. The left-most prefix is necessary because the underlying index structure is ordered by the index column order, and it can only be traversed in that order.
Let’s take a simple example of WHERE.
EXPLAIN SELECT * FROM records WHERE id=1;
Key: PRIMARY confirms that the primary key is used by MySQL. It is not ALL (table scan) or index (index scan), as would be expected for a simple primary key lookup. There is no secondary index listed in the possible_keys field since MySQL cannot use it for this query, since column id is not the leftmost prefix of the secondary index on columns ‘a’ and ‘b’.
Type as const occurs only when all index columns of the primary key or unique secondary index contain constant conditions (ref: const).
Let’s take a more realistic example of WHERE.
EXPLAIN SELECT * from records where id > 1 and id < 5 and c = 'Bh';
While the query still uses the key: PRIMARY, the access type is a range, which uses an index to read rows between a range of values. The ref field is NULL since the conditions on column id are not constant. It estimates that it will examine three rows in the range (rows: 3). When “Using Where” is set in the Extra field, MySQL will find matching rows using the WHERE conditions, for each row read, a row matches if all WHERE conditions are true.
Let’s use both leftmost prefixes of the secondary index.
We will use only column ‘a’ in the WHERE clause.
EXPLAIN SELECT * FROM records where a = 'Ab';
For the WHERE clause, let’s use both columns ‘a’ and ‘b’.
EXPLAIN SELECT * FROM records WHERE a = 'Ab' AND b = 'Cd';
The key idx_a_b confirms that MySQL uses the secondary index, as it meets the leftmost prefix requirement. In the first WHERE clause, only the first index part, column ‘a’, is used. In the second WHERE clause, both index parts, columns ‘a’ and ‘b’, are used. Column ‘b’ alone would not satisfy the leftmost prefix requirement.
Access type ref is an equality lookup on the leftmost prefix of the index. In the extra field “using index condition”, MySQL can locate matching rows using an index.
Groups rows: GROUP BY
Because values are implicitly grouped by index order, MySQL can use an index to optimize GROUP BY.
EXPLAIN SELECT a, COUNT(*) FROM records GROUP BY a;
The key idx_a_b confirms that MySQL uses the index to optimize the GROUP BY. Since the index is ordered, MySQL is assured that each new value for column ‘a’ is a new group. “Using index” in the Extra field indicates that MySQL reads values only from the index for column ‘a’. The query uses an index, but not for index lookup, type: index indicates that the index is being scanned.
Sort rows: ORDER BY
By using an ordered index, MySQL can optimize ORDER BY. This optimization avoids sorting rows, which takes a little more time, by accessing rows in order. In the absence of this optimization, MySQL reads all matching rows, sorts them, then returns the sorted result set. In the Extra field of the EXPLAIN plan, MySQL prints “Using filesort” when sorting rows. Filesort means sorting rows.
There are three ways to use an index to optimize ORDER BY.
- The first and simplest way is to use the leftmost index prefix for the ORDER BY clause. For table records, this means
* ORDER BY id
* ORDER by a
* ORDER by a, b
EXPLAIN SELECT a,b from records ORDER BY id;
- A second method is to hold the leftmost part of the index constant and order it by the next index column.
EXPLAIN SELECT a, b FROM records WHERE a = 'Hm' ORDER BY b;
- The third way is a special case of the second way. The below query does not cause a filesort because the WHERE conditions are a leftmost prefix, but shouldn’t ORDER BY ID cause a filesort? A hidden primary key is added to the secondary index.
EXPLAIN SELECT * FROM records WHERE a = 'Hm' AND b = 'Bh' ORDER BY id;
- In order to demonstrate that the hidden primary key allows the ORDER BY to avoid filesort, let us remove the column ‘b’ condition to invalidate optimization.
EXPLAIN SELECT * FROM records WHERE a = 'Hm' ORDER BY id;
By removing the condition on column ‘b’, MySQL can no longer use the hidden primary key to optimize ORDER BY by using the leftmost prefix on the secondary index.
Avoid reading rows: Covering Indexes
All columns referenced in a query are included in a covering index.
EXPLAIN SELECT a,b FROM records WHERE a = 'Hm' AND b = 'Bh';
As usual, the WHERE conditions on columns ‘a’ and ‘b’ point to corresponding index columns, but these index columns also point to the corresponding columns in the SELECT clause to indicate that the values for these columns are read from the index.
Join table: Joins
A MySQL index is used to join tables, and this is fundamentally the same as using an index for anything else. The main difference is the source of values used in join conditions between the two tables.
Let’s create another table and populate it.
CREATE TABLE records_info (
id char(2) NOT NULL,
info varchar(16) DEFAULT NULL,
PRIMARY KEY (id)
)
So that a join can be performed on it, the value of ‘id’ is kept the same as in column ‘a’ of the records table.
EXPLAIN SELECT info
FROM records JOIN records_info ON (records.a = records_info.id)
WHERE a IN ('Ab', 'B', 'Hk');