innerjoin - Inner join between two tables or timetables - MATLAB (original) (raw)
Inner join between two tables or timetables
Syntax
Description
[T](#btx2hll-1-C) = innerjoin([Tleft](#mw%5F24bc4175-651b-40e7-a3ad-f7eae45a1fc7),[Tright](#mw%5Fbca4c01f-f2f8-4570-9d03-7d49c1b0e7da))
creates the table or timetable, T
, as the inner join of Tleft
and Tright
usingkey variables. An inner join combines rows where the key variables have matching values. For example, ifTleft
has variables named Key1
andVar1
, and Tright
has variablesKey1
and Var2
, thenT=innerjoin(Tleft,Tright)
uses Key1
as a key variable.
By default, the key variables are:
- Variables that have the same names in
Tleft
andTright
, if both inputs are tables, or ifTleft
is a timetable andTright
is a table. - Vectors of row times, if both
Tleft
andTright
are timetables.
The matching values of the key variables in the left and right tables do not have to be in the same order. Inner joins can perform one-to-many and many-to-one matches between the key variables of the two tables. That is, a value that occurs once in a key variable of the left table can have multiple matches in the right table. Similarly, a value that occurs once in a key variable of the right table can have multiple matches in the left table.
You can perform inner joins only on certain combinations of tables and timetables.
- If
Tleft
is a table, thenTright
must be a table.innerjoin
returnsT
as a table. - If
Tleft
is a timetable, thenTright
can be either a table or a timetable.innerjoin
returnsT
as a timetable for either combination of inputs.
[T](#btx2hll-1-C) = innerjoin([Tleft](#mw%5F24bc4175-651b-40e7-a3ad-f7eae45a1fc7),[Tright](#mw%5Fbca4c01f-f2f8-4570-9d03-7d49c1b0e7da),[Name,Value](#namevaluepairarguments))
performs the inner-join operation with additional options specified by one or moreName,Value
pair arguments.
For example, you can specify the variables to use as key variables.
[[T](#btx2hll-1-C),[ileft](#btx2hll-1-ia),[iright](#btx2hll-1-ib)] = innerjoin(___)
also returns index vectors, ileft
andiright
indicating the correspondence between rows inT
and rows in Tleft
andTright
respectively. You can use this syntax with any of the input arguments in the previous syntaxes.
Examples
Create a table, Tleft
.
Tleft = table([5;12;23;2;6],... {'cereal';'pizza';'salmon';'cookies';'pizza'},... 'VariableNames',{'Age','FavoriteFood'})
Tleft=5×2 table Age FavoriteFood ___ ____________
5 {'cereal' }
12 {'pizza' }
23 {'salmon' }
2 {'cookies'}
6 {'pizza' }
Create a table, Tright
, with one variable in common with Tleft
.
Tright = table({'cereal';'cookies';'pizza';'salmon';'cake'},... [110;160;140;367;243],... {'A-';'D';'B';'B';'C-'},... 'VariableNames',{'FavoriteFood','Calories','NutritionGrade'})
Tright=5×3 table FavoriteFood Calories NutritionGrade ____________ ________ ______________
{'cereal' } 110 {'A-'}
{'cookies'} 160 {'D' }
{'pizza' } 140 {'B' }
{'salmon' } 367 {'B' }
{'cake' } 243 {'C-'}
Use the innerjoin
function to create a new table, T
, with data from tables Tleft
and Tright
.
T = innerjoin(Tleft,Tright)
T=5×4 table Age FavoriteFood Calories NutritionGrade ___ ____________ ________ ______________
5 {'cereal' } 110 {'A-'}
2 {'cookies'} 160 {'D' }
12 {'pizza' } 140 {'B' }
6 {'pizza' } 140 {'B' }
23 {'salmon' } 367 {'B' }
Table T
is sorted by the key variable, FavoriteFood
.
Create a table, Tleft
.
Tleft = table({'a' 'b' 'c' 'e' 'h'}',[1 2 3 11 17]',... 'VariableNames',{'Key1' 'Var1'})
Tleft=5×2 table Key1 Var1 _____ ____
{'a'} 1
{'b'} 2
{'c'} 3
{'e'} 11
{'h'} 17
Create a table, Tright
, with common values in the variable Key1
between tables Tleft
and Tright
, but also containing rows with values of Key1
not present in Tleft
.
Tright = table({'a' 'b' 'd' 'e'}',[4 5 6 7]',... 'VariableNames',{'Key1' 'Var2'})
Tright=4×2 table Key1 Var2 _____ ____
{'a'} 4
{'b'} 5
{'d'} 6
{'e'} 7
Use the innerjoin
function to create a new table, T
, with data from tables Tleft
and Tright
. Retain only rows whose values in the variable Key1
match.
Also, return index vectors, ileft
and iright
indicating the correspondence between rows in T
and rows in Tleft
and Tright
respectively.
[T,ileft,iright] = innerjoin(Tleft,Tright)
T=3×3 table Key1 Var1 Var2 _____ ____ ____
{'a'} 1 4
{'b'} 2 5
{'e'} 11 7
Table T
is sorted by the values in the key variable, Key1
, and contains the horizontal concatenation of Tleft(ileft,:)
and Tright(iright,'Var2')
.
Create a table, Tleft
.
Tleft = table([10;4;2;3;7],[5;4;9;6;1],[10;3;8;8;4])
Tleft=5×3 table Var1 Var2 Var3 ____ ____ ____
10 5 10
4 4 3
2 9 8
3 6 8
7 1 4
Create a table, Tright
, with common values in the second variable as the first variable of table Tleft
.
Tright = table([6;1;1;6;8],[2;3;4;5;6])
Tright=5×2 table Var1 Var2 ____ ____
6 2
1 3
1 4
6 5
8 6
Use the innerjoin
function to create a new table, T
, with data from tables Tleft
and Tright
. Use the first variable of Tleft
and the second variable of Tright
as key variables.
[T,ileft,iright] = innerjoin(Tleft,Tright,'LeftKeys',1,'RightKeys',2)
T=3×4 table Var1_Tleft Var2 Var3 Var1_Tright __________ ____ ____ ___________
2 9 8 6
3 6 8 1
4 4 3 1
Table T
retains only the rows that match between Tleft
and Tright
with respect to the key variables.
Table T
contains the horizontal concatenation of Tleft(ileft,:)
and Tright(iright,'Var1')
.
Create two timetables, Tleft
and Tright
. They have some row times in common, but each also includes row times that are not in the other timetable.
Tleft = timetable(seconds([1;2;4;6]),[1 2 3 11]')
Tleft=4×1 timetable Time Var1 _____ ____
1 sec 1
2 sec 2
4 sec 3
6 sec 11
Tright = timetable(seconds([2;4;6;7]),[4 5 6 7]')
Tright=4×1 timetable Time Var1 _____ ____
2 sec 4
4 sec 5
6 sec 6
7 sec 7
Combine Tleft
and Tright
with an inner join. T
matches up the rows with common row times. T
does not contain any other rows from either timetable.
T = innerjoin(Tleft,Tright)
T=3×2 timetable Time Var1_Tleft Var1_Tright _____ __________ ___________
2 sec 2 4
4 sec 3 5
6 sec 11 6
Input Arguments
Left table, specified as a table or a timetable.
Right table, specified as a table or a timetable.
Name-Value Arguments
Specify optional pairs of arguments asName1=Value1,...,NameN=ValueN
, where Name
is the argument name and Value
is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name
in quotes.
Example: 'Keys',2
uses the second variable inTleft
and the second variable in Tright
as key variables.
Variables to use as keys, specified as the comma-separated pair consisting of 'Keys'
and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector.
You cannot use the 'Keys'
name-value pair argument with the 'LeftKeys'
and'RightKeys'
name-value pair arguments.
A vector of row labels can be a key, alone or in combination with other key variables. For more information, see the Algorithms section.
Example: 'Keys',[1 3]
uses the first and third variables in Tleft
and Tright
as a key variables.
Example: 'Keys',{'X','Y'}
uses the variables namedX
and Y
inTleft
and Tright
as key variables.
Example: 'Keys','Row'
uses the vectors of row names of Tleft
and Tright
as key variables, if both Tleft
andTright
are tables with row names.
Variables to use as keys in Tleft
, specified as the comma-separated pair consisting of 'LeftKeys'
and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector.
You must use the 'LeftKeys'
name-value pair argument in conjunction with the 'RightKeys'
name-value pair argument. 'LeftKeys'
and'RightKeys'
both must specify the same number of key variables. innerjoin
pairs key values based on their order.
A vector of row labels can be a key, alone or in combination with other key variables. For more information, see the Algorithms section.
Example: 'LeftKeys',1
uses only the first variable in Tleft
as a key variable.
Variables to use as keys in Tright
, specified as the comma-separated pair consisting of 'RightKeys'
and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector.
You must use the 'RightKeys'
name-value pair argument in conjunction with the 'LeftKeys'
name-value pair argument. 'LeftKeys'
and'RightKeys'
both must specify the same number of key variables. innerjoin
pairs key values based on their order.
A vector of row labels can be a key, alone or in combination with other key variables. For more information, see the Algorithms section.
Example: 'RightKeys',3
uses only the third variable in Tright
as a key variable.
Variables from Tleft
to include inT
, specified as the comma-separated pair consisting of 'LeftVariables'
and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector.
You can use 'LeftVariables'
to include or exclude key variables, as well as nonkey variables from the output,T
.
By default, innerjoin
includes all variables fromTleft
.
Variables from Tright
to include inT
, specified as the comma-separated pair consisting of 'RightVariables'
and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector.
You can use 'RightVariables'
to include or exclude key variables, as well as nonkey variables from the output,T
.
By default, innerjoin
includes all the variables from Tright
except the key variables.
Output Arguments
Inner join from Tleft
and Tright
, returned as a table or a timetable. The output table or timetable,T
, contains one row for each pair of rows inTleft
and Tright
that share the same combination of values in the key variables. If Tleft
and Tright
contain variables with the same name,innerjoin
adds a unique suffix to the corresponding variable names in T
.
In general, if there are m
rows inTleft
and n
rows inTright
that all contain the same combination of values in the key variables, then T
containsm*n
rows for that combination.
T
is sorted by the values in the key variables and contains the horizontal concatenation ofTleft(ileft,LeftVars)
andTright(iright,RightVars)
. By default,LeftVars
consists of all the variables ofTleft
, and RightVars
consists of all the nonkey variables from Tright
. Otherwise,LefttVars
consists of the variables specified by the'LeftVariables'
name-value pair argument, andRightVars
is the variables specified by the'RightVariables'
name-value pair argument.
You can store additional metadata such as descriptions, variable units, variable names, and row names in T
. For more information, see the Properties sections of table or timetable.
Index to Tleft
, returned as a column vector. Each element of ileft
identifies the row inTleft
that corresponds to that row in the output table or timetable, T
.
Index to Tright
, returned as a column vector. Each element of iright
identifies the row inTright
that corresponds to that row in the output table or timetable, T
.
More About
Variable used to match and combine data between the input tables,Tleft
and Tright
.
Algorithms
- The vector of row labels from an input table or timetable can be used as a key variable, alone or in combination with other key variables. Row labels are the row names of a table or the row times of a timetable. To use this vector as a key, specify it as
'Row'
(for the row names of a table), as the name of a timetable vector of row times, or as the value of_`T`_.Properties.DimensionNames{1}
, where_`T`_
is the table or timetable.
In general,innerjoin
copies row labels from the input tableTleft
to the output tableT
.- If
Tleft
has no row labels, thenT
has no row labels. - If
Tleft
has row labels, theninnerjoin
copies row labels fromTleft
to create row labels inT
.
* However, if bothTleft
andTright
are tables, but you do not specify either input’s row names as a key, theninnerjoin
does not create row names inT
.
* If bothTleft
andTright
are timetables, but you do not specify either input’s row times as a key, theninnerjoin
copies row times fromTleft
toT
.
- If
You cannot perform an inner join using the row labels ofTleft
as the left key and a variable ofTright
as the right key. To perform the inner join, convert the row labels of Tleft
to a table variable and use the new table variable as a key.
Extended Capabilities
Theinnerjoin
function fully supports tall arrays. For more information, see Tall Arrays.
Usage notes and limitations:
- In general, the input tables cannot have any nonkey variables with the same names. However, you can join subsets of the input tables if you specify the
'LeftVariables'
and'RightVariables'
name-value arguments. Specify these arguments so that no variable name appears in both'LeftVariables'
and'RightVariables'
. - The values of these name-value arguments must be constant:
'Keys'
'LeftKeys'
'RightKeys'
'LeftVariables'
'RightVariables'
- The values of these name-value arguments do not support pattern expressions:
'Keys'
'LeftKeys'
'RightKeys'
'LeftVariables'
'RightVariables'
- Nested tables and timetables are not supported.
Version History
Introduced in R2013b