8.3.7 InnoDB and MyISAM Index Statistics Collection (original) (raw)

8.3.7 InnoDB and MyISAM Index Statistics Collection

Storage engines collect statistics about tables for use by the optimizer. Table statistics are based on value groups, where a value group is a set of rows with the same key prefix value. For optimizer purposes, an important statistic is the average value group size.

MySQL uses the average value group size in the following ways:

(...) JOIN tbl_name ON tbl_name.key = expr  

As the average value group size for an index increases, the index is less useful for those two purposes because the average number of rows per lookup increases: For the index to be good for optimization purposes, it is best that each index value target a small number of rows in the table. When a given index value yields a large number of rows, the index is less useful and MySQL is less likely to use it.

The average value group size is related to table cardinality, which is the number of value groups. TheSHOW INDEX statement displays a cardinality value based on N/S, where_N_ is the number of rows in the table and S is the average value group size. That ratio yields an approximate number of value groups in the table.

For a join based on the <=> comparison operator, NULL is not treated differently from any other value: NULL <=> NULL, just as _`N`_ <=>_`N`_ for any other_N_.

However, for a join based on the = operator,NULL is different from non-NULL values:_`expr1`_ =_`expr2`_ is not true when_expr1_ or_expr2_ (or both) areNULL. This affectsref accesses for comparisons of the form _`tblname.key`_ =_`expr`_: MySQL does not access the table if the current value of_expr_ is NULL, because the comparison cannot be true.

For = comparisons, it does not matter how many NULL values are in the table. For optimization purposes, the relevant value is the average size of the non-NULL value groups. However, MySQL does not currently enable that average size to be collected or used.

For InnoDB and MyISAM tables, you have some control over collection of table statistics by means of theinnodb_stats_method andmyisam_stats_method system variables, respectively. These variables have three possible values, which differ as follows:

If you tend to use many joins that use<=> rather than =,NULL values are not special in comparisons and one NULL is equal to another. In this case, nulls_equal is the appropriate statistics method.

The innodb_stats_method system variable has a global value; themyisam_stats_method system variable has both global and session values. Setting the global value affects statistics collection for tables from the corresponding storage engine. Setting the session value affects statistics collection only for the current client connection. This means that you can force a table's statistics to be regenerated with a given method without affecting other clients by setting the session value ofmyisam_stats_method.

To regenerate MyISAM table statistics, you can use any of the following methods:

Some caveats regarding the use ofinnodb_stats_method andmyisam_stats_method: