17.8.10.3 Estimating ANALYZE TABLE Complexity for InnoDB Tables (original) (raw)

17.8.10.3 Estimating ANALYZE TABLE Complexity for InnoDB Tables

ANALYZE TABLE complexity forInnoDB tables is dependent on:

Using these parameters, an approximate formula for estimatingANALYZE TABLE complexity would be:

The value ofinnodb_stats_persistent_sample_pages * number of indexed columns in a table * the number of partitions

Typically, the greater the resulting value, the greater the execution time for ANALYZE TABLE.

For a more in-depth approach to estimating ANALYZE TABLE complexity, consider the following example.

In Big O notation, ANALYZE TABLE complexity is described as:

 O(n_sample
  * (n_cols_in_uniq_i
     + n_cols_in_non_uniq_i
     + n_cols_in_pk * (1 + n_non_uniq_i))
  * n_part)

where:

Now, consider the following table (table t), which has a primary key (2 columns), a unique index (2 columns), and two nonunique indexes (two columns each):

CREATE TABLE t (
  a INT,
  b INT,
  c INT,
  d INT,
  e INT,
  f INT,
  g INT,
  h INT,
  PRIMARY KEY (a, b),
  UNIQUE KEY i1uniq (c, d),
  KEY i2nonuniq (e, f),
  KEY i3nonuniq (g, h)
);

For the column and index data required by the algorithm described above, query themysql.innodb_index_stats persistent index statistics table for table t. Then_diff_pfx% statistics show the columns that are counted for each index. For example, columnsa and b are counted for the primary key index. For the nonunique indexes, the primary key columns (a,b) are counted in addition to the user defined columns.

mysql> SELECT index_name, stat_name, stat_description
       FROM mysql.innodb_index_stats WHERE
       database_name='test' AND
       table_name='t' AND
       stat_name like 'n_diff_pfx%';
  +------------+--------------+------------------+
  | index_name | stat_name    | stat_description |
  +------------+--------------+------------------+
  | PRIMARY    | n_diff_pfx01 | a                |
  | PRIMARY    | n_diff_pfx02 | a,b              |
  | i1uniq     | n_diff_pfx01 | c                |
  | i1uniq     | n_diff_pfx02 | c,d              |
  | i2nonuniq  | n_diff_pfx01 | e                |
  | i2nonuniq  | n_diff_pfx02 | e,f              |
  | i2nonuniq  | n_diff_pfx03 | e,f,a            |
  | i2nonuniq  | n_diff_pfx04 | e,f,a,b          |
  | i3nonuniq  | n_diff_pfx01 | g                |
  | i3nonuniq  | n_diff_pfx02 | g,h              |
  | i3nonuniq  | n_diff_pfx03 | g,h,a            |
  | i3nonuniq  | n_diff_pfx04 | g,h,a,b          |
  +------------+--------------+------------------+

Based on the index statistics data shown above and the table definition, the following values can be determined:

You can now calculateinnodb_stats_persistent_sample_pages * (2 + 4 + 2 * (1 + 2)) * 1 to determine the number of leaf pages that are scanned. Withinnodb_stats_persistent_sample_pages set to the default value of 20, and with a default page size of 16 KiB (innodb_page_size=16384), you can then estimate that 20 * 12 * 16384 bytes are read for table t, or about 4MiB.

Note

All 4 MiB may not be read from disk, as some leaf pages may already be cached in the buffer pool.