MySQL :: MySQL 8.4 Reference Manual :: 10.8.2 EXPLAIN Output Format (original) (raw)

10.8.2 EXPLAIN Output Format

The EXPLAIN statement provides information about how MySQL executes statements.EXPLAIN works withSELECT,DELETE,INSERT,REPLACE, andUPDATE statements.

EXPLAIN returns a row of information for each table used in theSELECT statement. It lists the tables in the output in the order that MySQL would read them while processing the statement. This means that MySQL reads a row from the first table, then finds a matching row in the second table, and then in the third table, and so on. When all tables are processed, MySQL outputs the selected columns and backtracks through the table list until a table is found for which there are more matching rows. The next row is read from this table and the process continues with the next table.

EXPLAIN Output Columns

This section describes the output columns produced byEXPLAIN. Later sections provide additional information about thetype andExtra columns.

Each output row from EXPLAIN provides information about one table. Each row contains the values summarized inTable 10.1, “EXPLAIN Output Columns”, and described in more detail following the table. Column names are shown in the table's first column; the second column provides the equivalent property name shown in the output whenFORMAT=JSON is used.

Table 10.1 EXPLAIN Output Columns

Column JSON Name Meaning
id select_id The SELECT identifier
select_type None The SELECT type
table table_name The table for the output row
partitions partitions The matching partitions
type access_type The join type
possible_keys possible_keys The possible indexes to choose
key key The index actually chosen
key_len key_length The length of the chosen key
ref ref The columns compared to the index
rows rows Estimate of rows to be examined
filtered filtered Percentage of rows filtered by table condition
Extra None Additional information

Note

JSON properties which are NULL are not displayed in JSON-formatted EXPLAIN output.

EXPLAIN Join Types

The type column ofEXPLAIN output describes how tables are joined. In JSON-formatted output, these are found as values of the access_type property. The following list describes the join types, ordered from the best type to the worst:

SELECT * FROM tbl_name WHERE primary_key=1;  
SELECT * FROM tbl_name  
  WHERE primary_key_part1=1 AND primary_key_part2=2;  
SELECT * FROM ref_table,other_table  
  WHERE ref_table.key_column=other_table.column;  
SELECT * FROM ref_table,other_table  
  WHERE ref_table.key_column_part1=other_table.column  
  AND ref_table.key_column_part2=1;  
SELECT * FROM ref_table WHERE key_column=expr;  
SELECT * FROM ref_table,other_table  
  WHERE ref_table.key_column=other_table.column;  
SELECT * FROM ref_table,other_table  
  WHERE ref_table.key_column_part1=other_table.column  
  AND ref_table.key_column_part2=1;  
SELECT * FROM ref_table  
  WHERE key_column=expr OR key_column IS NULL;  

See Section 10.2.1.15, “IS NULL Optimization”.

value IN (SELECT primary_key FROM single_table WHERE some_expr)  

unique_subquery is just an index lookup function that replaces the subquery completely for better efficiency.

value IN (SELECT key_column FROM single_table WHERE some_expr)  
SELECT * FROM tbl_name  
  WHERE key_column = 10;  
SELECT * FROM tbl_name  
  WHERE key_column BETWEEN 10 and 20;  
SELECT * FROM tbl_name  
  WHERE key_column IN (10,20,30);  
SELECT * FROM tbl_name  
  WHERE key_part1 = 10 AND key_part2 IN (10,20,30);  

EXPLAIN Extra Information

The Extra column ofEXPLAIN output contains additional information about how MySQL resolves the query. The following list explains the values that can appear in this column. Each item also indicates for JSON-formatted output which property displays the Extra value. For some of these, there is a specific property. The others display as the text of the message property.

If you want to make your queries as fast as possible, look out for Extra column values of Using filesort and Using temporary, or, in JSON-formatted EXPLAIN output, forusing_filesort andusing_temporary_table properties equal totrue.

SELECT * FROM t1 LEFT JOIN t2 ON t1.id=t2.id  
  WHERE t2.id IS NULL;  

Assume that t2.id is defined asNOT NULL. In this case, MySQL scanst1 and looks up the rows int2 using the values oft1.id. If MySQL finds a matching row int2, it knows thatt2.id can never beNULL, and does not scan through the rest of the rows in t2 that have the same id value. In other words, for each row in t1, MySQL needs to do only a single lookup in t2, regardless of how many rows actually match in t2.
This can also indicate that a WHERE condition of the form NOT IN (_`subquery`_) orNOT EXISTS (_`subquery`_) has been transformed internally into an antijoin. This removes the subquery and brings its tables into the plan for the topmost query, providing improved cost planning. By merging semijoins and antijoins, the optimizer can reorder tables in the execution plan more freely, in some cases resulting in a faster plan.
You can see when an antijoin transformation is performed for a given query by checking theMessage column from SHOW WARNINGS following execution ofEXPLAIN, or in the output ofEXPLAIN FORMAT=TREE.
Note
An antijoin is the complement of a semijoin_`tablea`_ JOIN_`tableb`_ ON_`condition`_. The antijoin returns all rows from_tablea_ for which there is_no_ row in_tableb_ which matches_condition_.

SELECT  
  ...  
FROM  
  t,  
  LATERAL (derived table that refers to t) AS dt  
...  

The content of the derived table is rematerialized to bring it up to date each time a new row oft is processed by the top query.

SELECT MIN(c1), MIN(c2) FROM t1;  

Suppose that MIN(c1) can be retrieved by reading one index row and MIN(c2) can be retrieved by reading one row from a different index. That is, for each column c1 andc2, there exists an index where the column is the first column of the index. In this case, one row is returned, produced by reading two deterministic rows.
This Extra value does not occur if the rows to read are not deterministic. Consider this query:

SELECT MIN(c2) FROM t1 WHERE c1 <= 10;  

Suppose that (c1, c2) is a covering index. Using this index, all rows with c1 <= 10 must be scanned to find the minimumc2 value. By contrast, consider this query:

SELECT MIN(c2) FROM t1 WHERE c1 = 10;  

In this case, the first index row with c1 = 10 contains the minimum c2 value. Only one row must be read to produce the returned row.
For storage engines that maintain an exact row count per table (such as MyISAM, but notInnoDB), this Extra value can occur for COUNT(*) queries for which the WHERE clause is missing or always true and there is no GROUP BY clause. (This is an instance of an implicitly grouped query where the storage engine influences whether a deterministic number of rows can be read.)

EXPLAIN Output Interpretation

You can get a good indication of how good a join is by taking the product of the values in the rows column of the EXPLAIN output. This should tell you roughly how many rows MySQL must examine to execute the query. If you restrict queries with themax_join_size system variable, this row product also is used to determine which multiple-table SELECT statements to execute and which to abort. SeeSection 7.1.1, “Configuring the Server”.

The following example shows how a multiple-table join can be optimized progressively based on the information provided byEXPLAIN.

Suppose that you have theSELECT statement shown here and that you plan to examine it usingEXPLAIN:

EXPLAIN SELECT tt.TicketNumber, tt.TimeIn,
               tt.ProjectReference, tt.EstimatedShipDate,
               tt.ActualShipDate, tt.ClientID,
               tt.ServiceCodes, tt.RepetitiveID,
               tt.CurrentProcess, tt.CurrentDPPerson,
               tt.RecordVolume, tt.DPPrinted, et.COUNTRY,
               et_1.COUNTRY, do.CUSTNAME
        FROM tt, et, et AS et_1, do
        WHERE tt.SubmitTime IS NULL
          AND tt.ActualPC = et.EMPLOYID
          AND tt.AssignedPC = et_1.EMPLOYID
          AND tt.ClientID = do.CUSTNMBR;

For this example, make the following assumptions:

Initially, before any optimizations have been performed, theEXPLAIN statement produces the following information:

table type possible_keys key  key_len ref  rows  Extra
et    ALL  PRIMARY       NULL NULL    NULL 74
do    ALL  PRIMARY       NULL NULL    NULL 2135
et_1  ALL  PRIMARY       NULL NULL    NULL 74
tt    ALL  AssignedPC,   NULL NULL    NULL 3872
           ClientID,
           ActualPC
      Range checked for each record (index map: 0x23)

Because type isALL for each table, this output indicates that MySQL is generating a Cartesian product of all the tables; that is, every combination of rows. This takes quite a long time, because the product of the number of rows in each table must be examined. For the case at hand, this product is 74 × 2135 × 74 × 3872 = 45,268,558,720 rows. If the tables were bigger, you can only imagine how long it would take.

One problem here is that MySQL can use indexes on columns more efficiently if they are declared as the same type and size. In this context, VARCHAR andCHAR are considered the same if they are declared as the same size.tt.ActualPC is declared asCHAR(10) and et.EMPLOYID is CHAR(15), so there is a length mismatch.

To fix this disparity between column lengths, useALTER TABLE to lengthenActualPC from 10 characters to 15 characters:

mysql> ALTER TABLE tt MODIFY ActualPC VARCHAR(15);

Now tt.ActualPC andet.EMPLOYID are bothVARCHAR(15). Executing theEXPLAIN statement again produces this result:

table type   possible_keys key     key_len ref         rows    Extra
tt    ALL    AssignedPC,   NULL    NULL    NULL        3872    Using
             ClientID,                                         where
             ActualPC
do    ALL    PRIMARY       NULL    NULL    NULL        2135
      Range checked for each record (index map: 0x1)
et_1  ALL    PRIMARY       NULL    NULL    NULL        74
      Range checked for each record (index map: 0x1)
et    eq_ref PRIMARY       PRIMARY 15      tt.ActualPC 1

This is not perfect, but is much better: The product of therows values is less by a factor of 74. This version executes in a couple of seconds.

A second alteration can be made to eliminate the column length mismatches for the tt.AssignedPC = et_1.EMPLOYID and tt.ClientID = do.CUSTNMBR comparisons:

mysql> ALTER TABLE tt MODIFY AssignedPC VARCHAR(15),
                      MODIFY ClientID   VARCHAR(15);

After that modification,EXPLAIN produces the output shown here:

table type   possible_keys key      key_len ref           rows Extra
et    ALL    PRIMARY       NULL     NULL    NULL          74
tt    ref    AssignedPC,   ActualPC 15      et.EMPLOYID   52   Using
             ClientID,                                         where
             ActualPC
et_1  eq_ref PRIMARY       PRIMARY  15      tt.AssignedPC 1
do    eq_ref PRIMARY       PRIMARY  15      tt.ClientID   1

At this point, the query is optimized almost as well as possible. The remaining problem is that, by default, MySQL assumes that values in the tt.ActualPC column are evenly distributed, and that is not the case for the tt table. Fortunately, it is easy to tell MySQL to analyze the key distribution:

mysql> ANALYZE TABLE tt;

With the additional index information, the join is perfect andEXPLAIN produces this result:

table type   possible_keys key     key_len ref           rows Extra
tt    ALL    AssignedPC    NULL    NULL    NULL          3872 Using
             ClientID,                                        where
             ActualPC
et    eq_ref PRIMARY       PRIMARY 15      tt.ActualPC   1
et_1  eq_ref PRIMARY       PRIMARY 15      tt.AssignedPC 1
do    eq_ref PRIMARY       PRIMARY 15      tt.ClientID   1

The rows column in the output fromEXPLAIN is an educated guess from the MySQL join optimizer. Check whether the numbers are even close to the truth by comparing therows product with the actual number of rows that the query returns. If the numbers are quite different, you might get better performance by usingSTRAIGHT_JOIN in yourSELECT statement and trying to list the tables in a different order in theFROM clause. (However,STRAIGHT_JOIN may prevent indexes from being used because it disables semijoin transformations. SeeOptimizing IN and EXISTS Subquery Predicates with Semijoin Transformations.)

It is possible in some cases to execute statements that modify data when EXPLAIN SELECT is used with a subquery; for more information, see Section 15.2.15.8, “Derived Tables”.