MySQL :: MySQL 8.0 Reference Manual :: 14.20.1 Window Function Descriptions (original) (raw)

14.20.1 Window Function Descriptions

This section describes nonaggregate window functions that, for each row from a query, perform a calculation using rows related to that row. Most aggregate functions also can be used as window functions; see Section 14.19.1, “Aggregate Function Descriptions”.

For window function usage information and examples, and definitions of terms such as the OVER clause, window, partition, frame, and peer, seeSection 14.20.2, “Window Function Concepts and Syntax”.

Table 14.30 Window Functions

Name Description
CUME_DIST() Cumulative distribution value
DENSE_RANK() Rank of current row within its partition, without gaps
FIRST_VALUE() Value of argument from first row of window frame
LAG() Value of argument from row lagging current row within partition
LAST_VALUE() Value of argument from last row of window frame
LEAD() Value of argument from row leading current row within partition
NTH_VALUE() Value of argument from N-th row of window frame
NTILE() Bucket number of current row within its partition.
PERCENT_RANK() Percentage rank value
RANK() Rank of current row within its partition, with gaps
ROW_NUMBER() Number of current row within its partition

In the following function descriptions,overclause represents theOVER clause, described inSection 14.20.2, “Window Function Concepts and Syntax”. Some window functions permit a nulltreatment clause that specifies how to handle NULL values when calculating results. This clause is optional. It is part of the SQL standard, but the MySQL implementation permits onlyRESPECT NULLS (which is also the default). This means that NULL values are considered when calculating results. IGNORE NULLS is parsed, but produces an error.

mysql> SELECT  
         val,  
         ROW_NUMBER()   OVER w AS 'row_number',  
         CUME_DIST()    OVER w AS 'cume_dist',  
         PERCENT_RANK() OVER w AS 'percent_rank'  
       FROM numbers  
       WINDOW w AS (ORDER BY val);  
+------+------------+--------------------+--------------+  
| val  | row_number | cume_dist          | percent_rank |  
+------+------------+--------------------+--------------+  
|    1 |          1 | 0.2222222222222222 |            0 |  
|    1 |          2 | 0.2222222222222222 |            0 |  
|    2 |          3 | 0.3333333333333333 |         0.25 |  
|    3 |          4 | 0.6666666666666666 |        0.375 |  
|    3 |          5 | 0.6666666666666666 |        0.375 |  
|    3 |          6 | 0.6666666666666666 |        0.375 |  
|    4 |          7 | 0.8888888888888888 |         0.75 |  
|    4 |          8 | 0.8888888888888888 |         0.75 |  
|    5 |          9 |                  1 |            1 |  
+------+------------+--------------------+--------------+  
mysql> SELECT  
         time, subject, val,  
         FIRST_VALUE(val)  OVER w AS 'first',  
         LAST_VALUE(val)   OVER w AS 'last',  
         NTH_VALUE(val, 2) OVER w AS 'second',  
         NTH_VALUE(val, 4) OVER w AS 'fourth'  
       FROM observations  
       WINDOW w AS (PARTITION BY subject ORDER BY time  
                    ROWS UNBOUNDED PRECEDING);  
+----------+---------+------+-------+------+--------+--------+  
| time     | subject | val  | first | last | second | fourth |  
+----------+---------+------+-------+------+--------+--------+  
| 07:00:00 | st113   |   10 |    10 |   10 |   NULL |   NULL |  
| 07:15:00 | st113   |    9 |    10 |    9 |      9 |   NULL |  
| 07:30:00 | st113   |   25 |    10 |   25 |      9 |   NULL |  
| 07:45:00 | st113   |   20 |    10 |   20 |      9 |     20 |  
| 07:00:00 | xh458   |    0 |     0 |    0 |   NULL |   NULL |  
| 07:15:00 | xh458   |   10 |     0 |   10 |     10 |   NULL |  
| 07:30:00 | xh458   |    5 |     0 |    5 |     10 |   NULL |  
| 07:45:00 | xh458   |   30 |     0 |   30 |     10 |     30 |  
| 08:00:00 | xh458   |   25 |     0 |   25 |     10 |     30 |  
+----------+---------+------+-------+------+--------+--------+  

Each function uses the rows in the current frame, which, per the window definition shown, extends from the first partition row to the current row. For theNTH_VALUE() calls, the current frame does not always include the requested row; in such cases, the return value is NULL.

mysql> SELECT  
         t, val,  
         LAG(val)        OVER w AS 'lag',  
         LEAD(val)       OVER w AS 'lead',  
         val - LAG(val)  OVER w AS 'lag diff',  
         val - LEAD(val) OVER w AS 'lead diff'  
       FROM series  
       WINDOW w AS (ORDER BY t);  
+----------+------+------+------+----------+-----------+  
| t        | val  | lag  | lead | lag diff | lead diff |  
+----------+------+------+------+----------+-----------+  
| 12:00:00 |  100 | NULL |  125 |     NULL |       -25 |  
| 13:00:00 |  125 |  100 |  132 |       25 |        -7 |  
| 14:00:00 |  132 |  125 |  145 |        7 |       -13 |  
| 15:00:00 |  145 |  132 |  140 |       13 |         5 |  
| 16:00:00 |  140 |  145 |  150 |       -5 |       -10 |  
| 17:00:00 |  150 |  140 |  200 |       10 |       -50 |  
| 18:00:00 |  200 |  150 | NULL |       50 |      NULL |  
+----------+------+------+------+----------+-----------+  

In the example, the LAG() andLEAD() calls use the default_N_ and_default_ values of 1 andNULL, respectively.
The first row shows what happens when there is no previous row for LAG(): The function returns the default value (in this case, NULL). The last row shows the same thing when there is no next row forLEAD().
LAG() andLEAD() also serve to compute sums rather than differences. Consider this data set, which contains the first few numbers of the Fibonacci series:

mysql> SELECT n FROM fib ORDER BY n;  
+------+  
| n    |  
+------+  
|    1 |  
|    1 |  
|    2 |  
|    3 |  
|    5 |  
|    8 |  
+------+  

The following query shows theLAG() andLEAD() values for the rows adjacent to the current row. It also uses those functions to add to the current row value the values from the preceding and following rows. The effect is to generate the next number in the Fibonacci series, and the next number after that:

mysql> SELECT  
         n,  
         LAG(n, 1, 0)      OVER w AS 'lag',  
         LEAD(n, 1, 0)     OVER w AS 'lead',  
         n + LAG(n, 1, 0)  OVER w AS 'next_n',  
         n + LEAD(n, 1, 0) OVER w AS 'next_next_n'  
       FROM fib  
       WINDOW w AS (ORDER BY n);  
+------+------+------+--------+-------------+  
| n    | lag  | lead | next_n | next_next_n |  
+------+------+------+--------+-------------+  
|    1 |    0 |    1 |      1 |           2 |  
|    1 |    1 |    2 |      2 |           3 |  
|    2 |    1 |    3 |      3 |           5 |  
|    3 |    2 |    5 |      5 |           8 |  
|    5 |    3 |    8 |      8 |          13 |  
|    8 |    5 |    0 |     13 |           8 |  
+------+------+------+--------+-------------+  

One way to generate the initial set of Fibonacci numbers is to use a recursive common table expression. For an example, seeFibonacci Series Generation.
Beginning with MySQL 8.0.22, you cannot use a negative value for the rows argument of this function.

mysql> SELECT  
         val,  
         ROW_NUMBER() OVER w AS 'row_number',  
         NTILE(2)     OVER w AS 'ntile2',  
         NTILE(4)     OVER w AS 'ntile4'  
       FROM numbers  
       WINDOW w AS (ORDER BY val);  
+------+------------+--------+--------+  
| val  | row_number | ntile2 | ntile4 |  
+------+------------+--------+--------+  
|    1 |          1 |      1 |      1 |  
|    1 |          2 |      1 |      1 |  
|    2 |          3 |      1 |      1 |  
|    3 |          4 |      1 |      2 |  
|    3 |          5 |      1 |      2 |  
|    3 |          6 |      2 |      3 |  
|    4 |          7 |      2 |      3 |  
|    4 |          8 |      2 |      4 |  
|    5 |          9 |      2 |      4 |  
+------+------------+--------+--------+  

Beginning with MySQL 8.0.22, the constructNTILE(NULL) is no longer permitted.

(rank - 1) / (rows - 1)  

This function should be used with ORDER BY to sort partition rows into the desired order. Without ORDER BY, all rows are peers.
overclause is as described inSection 14.20.2, “Window Function Concepts and Syntax”.
For an example, see theCUME_DIST() function description.

mysql> SELECT  
         val,  
         ROW_NUMBER() OVER w AS 'row_number',  
         RANK()       OVER w AS 'rank',  
         DENSE_RANK() OVER w AS 'dense_rank'  
       FROM numbers  
       WINDOW w AS (ORDER BY val);  
+------+------------+------+------------+  
| val  | row_number | rank | dense_rank |  
+------+------------+------+------------+  
|    1 |          1 |    1 |          1 |  
|    1 |          2 |    1 |          1 |  
|    2 |          3 |    3 |          2 |  
|    3 |          4 |    4 |          3 |  
|    3 |          5 |    4 |          3 |  
|    3 |          6 |    4 |          3 |  
|    4 |          7 |    7 |          4 |  
|    4 |          8 |    7 |          4 |  
|    5 |          9 |    9 |          5 |  
+------+------------+------+------------+