MySQL :: MySQL 8.4 Reference Manual :: 15.2.13.2 JOIN Clause (original) (raw)

15.2.13.2 JOIN Clause

MySQL supports the following JOIN syntax for the tablereferences part ofSELECT statements and multiple-table DELETE andUPDATE statements:

table_references:
    escaped_table_reference [, escaped_table_reference] ...

escaped_table_reference: {
    table_reference
  | { OJ table_reference }
}

table_reference: {
    table_factor
  | joined_table
}

table_factor: {
    tbl_name [PARTITION (partition_names)]
        [[AS] alias] [index_hint_list]
  | [LATERAL] table_subquery [AS] alias [(col_list)]
  | ( table_references )
}

joined_table: {
    table_reference {[INNER | CROSS] JOIN | STRAIGHT_JOIN} table_factor [join_specification]
  | table_reference {LEFT|RIGHT} [OUTER] JOIN table_reference join_specification
  | table_reference NATURAL [INNER | {LEFT|RIGHT} [OUTER]] JOIN table_factor
}

join_specification: {
    ON search_condition
  | USING (join_column_list)
}

join_column_list:
    column_name[, column_name] ...

index_hint_list:
    index_hint[ index_hint] ...

index_hint: {
    USE {INDEX|KEY}
      [FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
  | {IGNORE|FORCE} {INDEX|KEY}
      [FOR {JOIN|ORDER BY|GROUP BY}] (index_list)
}

index_list:
    index_name [, index_name] ...

A table reference is also known as a join expression.

A table reference (when it refers to a partitioned table) may contain a PARTITION clause, including a list of comma-separated partitions, subpartitions, or both. This option follows the name of the table and precedes any alias declaration. The effect of this option is that rows are selected only from the listed partitions or subpartitions. Any partitions or subpartitions not named in the list are ignored. For more information and examples, seeSection 26.5, “Partition Selection”.

The syntax of tablefactor is extended in MySQL in comparison with standard SQL. The standard accepts only tablereference, not a list of them inside a pair of parentheses.

This is a conservative extension if each comma in a list of_tablereference_ items is considered as equivalent to an inner join. For example:

SELECT * FROM t1 LEFT JOIN (t2, t3, t4)
                 ON (t2.a = t1.a AND t3.b = t1.b AND t4.c = t1.c)

is equivalent to:

SELECT * FROM t1 LEFT JOIN (t2 CROSS JOIN t3 CROSS JOIN t4)
                 ON (t2.a = t1.a AND t3.b = t1.b AND t4.c = t1.c)

In MySQL, JOIN, CROSS JOIN, and INNER JOIN are syntactic equivalents (they can replace each other). In standard SQL, they are not equivalent. INNER JOIN is used with an ON clause, CROSS JOIN is used otherwise.

In general, parentheses can be ignored in join expressions containing only inner join operations. MySQL also supports nested joins. See Section 10.2.1.8, “Nested Join Optimization”.

Index hints can be specified to affect how the MySQL optimizer makes use of indexes. For more information, seeSection 10.9.4, “Index Hints”. Optimizer hints and theoptimizer_switch system variable are other ways to influence optimizer use of indexes. SeeSection 10.9.3, “Optimizer Hints”, andSection 10.9.2, “Switchable Optimizations”.

The following list describes general factors to take into account when writing joins:

SELECT t1.name, t2.salary  
  FROM employee AS t1 INNER JOIN info AS t2 ON t1.name = t2.name;  
SELECT t1.name, t2.salary  
  FROM employee t1 INNER JOIN info t2 ON t1.name = t2.name;  
SELECT * FROM (SELECT 1, 2, 3) AS t1;  
SELECT left_tbl.*  
  FROM left_tbl LEFT JOIN right_tbl ON left_tbl.id = right_tbl.id  
  WHERE right_tbl.id IS NULL;  

This example finds all rows in left_tbl with an id value that is not present inright_tbl (that is, all rows inleft_tbl with no corresponding row inright_tbl). SeeSection 10.2.1.9, “Outer Join Optimization”.

a LEFT JOIN b USING (c1, c2, c3)  
SELECT left_tbl.*  
    FROM { OJ left_tbl LEFT OUTER JOIN right_tbl  
           ON left_tbl.id = right_tbl.id }  
    WHERE right_tbl.id IS NULL;  

You can use other types of joins within { OJ ... }, such as INNER JOIN orRIGHT OUTER JOIN. This helps with compatibility with some third-party applications, but is not official ODBC syntax.

Some join examples:

SELECT * FROM table1, table2;

SELECT * FROM table1 INNER JOIN table2 ON table1.id = table2.id;

SELECT * FROM table1 LEFT JOIN table2 ON table1.id = table2.id;

SELECT * FROM table1 LEFT JOIN table2 USING (id);

SELECT * FROM table1 LEFT JOIN table2 ON table1.id = table2.id
  LEFT JOIN table3 ON table2.id = table3.id;

Natural joins and joins with USING, including outer join variants, are processed according to the SQL:2003 standard:

CREATE TABLE t1 (i INT, j INT);  
CREATE TABLE t2 (k INT, j INT);  
INSERT INTO t1 VALUES(1, 1);  
INSERT INTO t2 VALUES(1, 1);  
SELECT * FROM t1 NATURAL JOIN t2;  
SELECT * FROM t1 JOIN t2 USING (j);  

In the first SELECT statement, column j appears in both tables and thus becomes a join column, so, according to standard SQL, it should appear only once in the output, not twice. Similarly, in the second SELECT statement, columnj is named in theUSING clause and should appear only once in the output, not twice.
Thus, the statements produce this output:

+------+------+------+  
| j    | i    | k    |  
+------+------+------+  
|    1 |    1 |    1 |  
+------+------+------+  
+------+------+------+  
| j    | i    | k    |  
+------+------+------+  
|    1 |    1 |    1 |  
+------+------+------+  

Redundant column elimination and column ordering occurs according to standard SQL, producing this display order:

COALESCE(x, y) = (CASE WHEN x IS NOT NULL THEN x ELSE y END)  

If the join operation is any other join, the result columns of the join consist of the concatenation of all columns of the joined tables.
A consequence of the definition of coalesced columns is that, for outer joins, the coalesced column contains the value of the non-NULL column if one of the two columns is always NULL. If neither or both columns are NULL, both common columns have the same value, so it doesn't matter which one is chosen as the value of the coalesced column. A simple way to interpret this is to consider that a coalesced column of an outer join is represented by the common column of the inner table of a JOIN. Suppose that the tables t1(a, b) andt2(a, c) have the following contents:

t1    t2  
----  ----  
1 x   2 z  
2 y   3 w  

Then, for this join, column a contains the values of t1.a:

mysql> SELECT * FROM t1 NATURAL LEFT JOIN t2;  
+------+------+------+  
| a    | b    | c    |  
+------+------+------+  
|    1 | x    | NULL |  
|    2 | y    | z    |  
+------+------+------+  

By contrast, for this join, column a contains the values of t2.a.

mysql> SELECT * FROM t1 NATURAL RIGHT JOIN t2;  
+------+------+------+  
| a    | c    | b    |  
+------+------+------+  
|    2 | z    | y    |  
|    3 | w    | NULL |  
+------+------+------+  

Compare those results to the otherwise equivalent queries with JOIN ... ON:

mysql> SELECT * FROM t1 LEFT JOIN t2 ON (t1.a = t2.a);  
+------+------+------+------+  
| a    | b    | a    | c    |  
+------+------+------+------+  
|    1 | x    | NULL | NULL |  
|    2 | y    |    2 | z    |  
+------+------+------+------+  
mysql> SELECT * FROM t1 RIGHT JOIN t2 ON (t1.a = t2.a);  
+------+------+------+------+  
| a    | b    | a    | c    |  
+------+------+------+------+  
|    2 | y    |    2 | z    |  
| NULL | NULL |    3 | w    |  
+------+------+------+------+  
a LEFT JOIN b USING (c1, c2, c3)  
a LEFT JOIN b ON a.c1 = b.c1 AND a.c2 = b.c2 AND a.c3 = b.c3  

With respect to determining which rows satisfy the join condition, both joins are semantically identical.
With respect to determining which columns to display forSELECT * expansion, the two joins are not semantically identical. The USING join selects the coalesced value of corresponding columns, whereas the ON join selects all columns from all tables. For the USING join,SELECT * selects these values:

COALESCE(a.c1, b.c1), COALESCE(a.c2, b.c2), COALESCE(a.c3, b.c3)  

For the ON join, SELECT * selects these values:

a.c1, a.c2, a.c3, b.c1, b.c2, b.c3  

With an inner join, COALESCE(a.c1, b.c1) is the same as eithera.c1 or b.c1 because both columns have the same value. With an outer join (such as LEFT JOIN), one of the two columns can be NULL. That column is omitted from the result.

CREATE TABLE t1 (i1 INT);  
CREATE TABLE t2 (i2 INT);  
CREATE TABLE t3 (i3 INT);  
SELECT * FROM t1 JOIN t2 ON (i1 = i3) JOIN t3;  

The statement fails with an Unknown column 'i3' in 'on clause' error because i3 is a column in t3, which is not an operand of the ON clause. To enable the join to be processed, rewrite the statement as follows:

SELECT * FROM t1 JOIN t2 JOIN t3 ON (i1 = i3);  
CREATE TABLE t1 (i1 INT, j1 INT);  
CREATE TABLE t2 (i2 INT, j2 INT);  
CREATE TABLE t3 (i3 INT, j3 INT);  
INSERT INTO t1 VALUES(1, 1);  
INSERT INTO t2 VALUES(1, 1);  
INSERT INTO t3 VALUES(1, 1);  
SELECT * FROM t1, t2 JOIN t3 ON (t1.i1 = t3.i3);  

The JOIN takes precedence over the comma operator, so the operands for the ON clause are t2 and t3. Because t1.i1 is not a column in either of the operands, the result is an Unknown column 't1.i1' in 'on clause' error.
To enable the join to be processed, use either of these strategies:

SELECT * FROM (t1, t2) JOIN t3 ON (t1.i1 = t3.i3);  
SELECT * FROM t1 JOIN t2 JOIN t3 ON (t1.i1 = t3.i3);  

The same precedence interpretation also applies to statements that mix the comma operator with INNER JOIN, CROSS JOIN, LEFT JOIN, and RIGHT JOIN, all of which have higher precedence than the comma operator.