Join Tables

Combine two tables using key variables in the Live Editor

Description

The Join Tables task lets you interactively combine two tables by performing joins or by concatenating the tables horizontally or vertically. The task automatically generates MATLAB® code for your live script.
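For comparison, horizontal and vertical concatenation can also be written directly with bracket syntax. This is a minimal sketch using small hypothetical tables (A, B, and C are not part of the example data on this page), and it is not necessarily the code the task generates.

```matlab
% Hypothetical tables for illustration only.
A = table([1; 2], ["x"; "y"], 'VariableNames', {'ID', 'Label'});
B = table([10; 20], [true; false], 'VariableNames', {'Score', 'Flag'});
C = table([3; 4], ["z"; "w"], 'VariableNames', {'ID', 'Label'});

% Horizontal concatenation: the tables need the same number of rows
% and must not share variable names.
wide = [A B];

% Vertical concatenation: the tables need the same variable names.
tall = [A; C];
```

Concatenation places rows or variables side by side without matching key values, which is what distinguishes it from a join.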

Open the Task

To add the Join Tables task to a live script in the MATLAB Editor, type the keyword join in a code block and select Join Tables when it appears in the menu.

Examples

Use the Join Tables Live Editor task to perform an inner join and an outer join on two tables.

First, load the orders table, which has order IDs, customer names, and order dates for a number of shipments.

orders=3×3 table
    OrderID    CustomerID     OrderDate
    _______    __________    ___________

     5120      "Sanchez"     23-Apr-2019
     1037      "Li"          18-Apr-2019
     8937      "Johnson"     16-Apr-2019

Then load the items table, which contains products that customers ordered, along with the quantity, price, and status of the shipment for that item. Each row of this table has an order ID, just like orders. Because a customer can order multiple items, several rows of items can refer to one order from orders.

items=5×5 table
    OrderID        Product         Quantity    Price     Status
    _______    ________________    ________    _____    _________

     6005      "Dozen Roses"           1       39.99    Shipped
     1037      "Petunia Basket"        1       23.99    Delivered
     5120      "Tulips"               12        0.99    Pending
     1037      "Gardenias"             1       17.99    Shipped
     1037      "Gerber Daisies"        6        1.99    Delivered

Open the Join Tables task. To open the task, type the keyword join in a code block and select Join Tables when it appears in the menu.

Use the task to perform an inner join of orders and items. When the task opens:

  1. Select orders and items as the left and right tables, respectively.
  2. Select OrderID as the merging variable for both tables.
  3. Click the Inner join button.
  4. To see the code that this task generates, expand the task display by clicking Show code at the bottom of the task parameter area.

joinedData=4×7 table
    OrderID    CustomerID     OrderDate         Product         Quantity    Price     Status
    _______    __________    ___________    ________________    ________    _____    _________

     1037      "Li"          18-Apr-2019    "Petunia Basket"        1       23.99    Delivered
     1037      "Li"          18-Apr-2019    "Gardenias"             1       17.99    Shipped
     1037      "Li"          18-Apr-2019    "Gerber Daisies"        6        1.99    Delivered
     5120      "Sanchez"     23-Apr-2019    "Tulips"               12        0.99    Pending

When you perform an inner join, the output table includes only those key values that appear in both the left and right tables.
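At the command line, a comparable inner join can be sketched with the innerjoin function. The table-construction code below is an assumption that recreates the displayed values (with Status stored as a categorical array), and the call is an illustration rather than the exact code the task generates.

```matlab
% Recreate the example tables from the values displayed above (assumed types).
orders = table([5120; 1037; 8937], ["Sanchez"; "Li"; "Johnson"], ...
    datetime(2019, 4, [23; 18; 16]), ...
    'VariableNames', {'OrderID', 'CustomerID', 'OrderDate'});
items = table([6005; 1037; 5120; 1037; 1037], ...
    ["Dozen Roses"; "Petunia Basket"; "Tulips"; "Gardenias"; "Gerber Daisies"], ...
    [1; 1; 12; 1; 6], [39.99; 23.99; 0.99; 17.99; 1.99], ...
    categorical(["Shipped"; "Delivered"; "Pending"; "Shipped"; "Delivered"]), ...
    'VariableNames', {'OrderID', 'Product', 'Quantity', 'Price', 'Status'});

% Keep only the rows whose OrderID appears in both tables.
joinedData = innerjoin(orders, items, 'Keys', 'OrderID');
```

Orders 6005 (only in items) and 8937 (only in orders) do not appear in joinedData.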

Next, use the task to perform a left outer join. Outer joins can include key values that appear in only one input table. For example, a left outer join includes all key values from the left table, even when the right table has no corresponding matches. If the right table has key values that do not have matches in the left table, then those key values are not included.

  1. Click the Left outer join button.
  2. Select the Combine merging variables check box. By default, outer joins copy the key variables from the left and right tables into separate variables in the output table. Selecting this check box merges them so that the output table has a single key variable.
  3. To see the code that this task generates, expand the task display by clicking Show code at the bottom of the task parameter area.

joinedData2=5×7 table
    OrderID    CustomerID     OrderDate         Product         Quantity    Price      Status
    _______    __________    ___________    ________________    ________    _____    ___________

     1037      "Li"          18-Apr-2019    "Petunia Basket"        1       23.99    Delivered
     1037      "Li"          18-Apr-2019    "Gardenias"             1       17.99    Shipped
     1037      "Li"          18-Apr-2019    "Gerber Daisies"        6        1.99    Delivered
     5120      "Sanchez"     23-Apr-2019    "Tulips"               12        0.99    Pending
     8937      "Johnson"     16-Apr-2019    <missing>             NaN         NaN    <undefined>

The output table now includes data for order 8937. However, because the items table has no rows for order 8937, the rest of that row is filled with missing values (such as <missing>, NaN, or <undefined>). Outer joins fill table elements with missing values when the left or right table has no data associated with a key value.
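A comparable left outer join can be sketched at the command line with the outerjoin function. The table-construction code is an assumption that recreates the displayed values, and the call is an illustration rather than the exact code the task generates.

```matlab
% Recreate the example tables from the values displayed above (assumed types).
orders = table([5120; 1037; 8937], ["Sanchez"; "Li"; "Johnson"], ...
    datetime(2019, 4, [23; 18; 16]), ...
    'VariableNames', {'OrderID', 'CustomerID', 'OrderDate'});
items = table([6005; 1037; 5120; 1037; 1037], ...
    ["Dozen Roses"; "Petunia Basket"; "Tulips"; "Gardenias"; "Gerber Daisies"], ...
    [1; 1; 12; 1; 6], [39.99; 23.99; 0.99; 17.99; 1.99], ...
    categorical(["Shipped"; "Delivered"; "Pending"; "Shipped"; "Delivered"]), ...
    'VariableNames', {'OrderID', 'Product', 'Quantity', 'Price', 'Status'});

% Keep every OrderID from the left table and merge the two key variables
% into a single OrderID variable in the output.
joinedData2 = outerjoin(orders, items, 'Type', 'left', ...
    'Keys', 'OrderID', 'MergeKeys', true);

% Locate the rows that had no match in items.
unmatched = ismissing(joinedData2.Product);
```

Here unmatched flags only the row for order 8937, whose Product, Quantity, Price, and Status values are all missing.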

Parameters

For the left table, specify the name from a list of all the nonempty tables and timetables that are in the workspace.

For the right table, specify the name from a list of all the nonempty tables and timetables that are in the workspace.

Specify the name of a variable from a list of variables in the left or right table.

When you specify a merging variable, or key variable, its values determine which rows are merged from the left and right tables. To specify multiple sets of merging variables, use the + button.
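Multiple pairs of merging variables correspond to passing several key names to a join function. The sketch below uses hypothetical tables (left, right, and their variable names are made up for illustration) and the LeftKeys and RightKeys options of innerjoin to pair keys whose names differ between the two tables.

```matlab
% Hypothetical tables whose key variables have different names.
left = table([1; 1; 2], ["E"; "W"; "E"], [10; 20; 30], ...
    'VariableNames', {'OrderID', 'Region', 'Amount'});
right = table([1; 2; 2], ["E"; "E"; "W"], ["a"; "b"; "c"], ...
    'VariableNames', {'ID', 'Area', 'Tag'});

% Pair OrderID with ID and Region with Area. A row is merged only when
% both key values match.
T = innerjoin(left, right, ...
    'LeftKeys', {'OrderID', 'Region'}, ...
    'RightKeys', {'ID', 'Area'});
```

Only the combinations (1, "E") and (2, "E") appear in both tables, so T has two rows.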

Combine corresponding merging variables when performing outer joins. By default, outer joins copy key variables from the left and right tables to separate variables in the output table. To combine corresponding key variables in the left and right tables into one variable in the output, select this check box.
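The effect of this check box can be illustrated with the MergeKeys option of outerjoin. This is a sketch with small hypothetical tables (L and R are made up for illustration), not the code the task generates.

```matlab
% Hypothetical tables that share the key variable OrderID.
L = table([1; 2; 3], ["a"; "b"; "c"], 'VariableNames', {'OrderID', 'Name'});
R = table([2; 3; 4], [9.5; 7.0; 1.25], 'VariableNames', {'OrderID', 'Price'});

% Default: the key from each input is copied to its own output variable,
% so the result has two key variables plus Name and Price.
separateKeys = outerjoin(L, R, 'Keys', 'OrderID');

% MergeKeys true: the two key variables are combined into one.
mergedKeys = outerjoin(L, R, 'Keys', 'OrderID', 'MergeKeys', true);
```

With separate keys, a row that exists in only one input has a missing value in the other input's key variable. With merged keys, the single key variable always holds a value.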

Version History

Introduced in R2019b

When the left input of the Join Tables task in the Live Editor is a timetable, you can sort the output timetable by row times even when you do not specify row times as key values. To sort by row times in this case, select the Sort result by row times check box.

This sorting option is available only when all three of these conditions are true:

If you do specify that row times are key values, then the output timetable is automatically sorted by row times.
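The difference between key order and row-time order can be sketched as follows. The timetable and table here are hypothetical, and sortrows is used as a stand-in for the sorting step; the source does not show what code the check box generates.

```matlab
% Hypothetical left timetable and right table joined on a non-time key.
Time = datetime(2019, 4, [23; 18; 16]);
ordersTT = timetable(Time, [5120; 1037; 8937], ["Sanchez"; "Li"; "Johnson"], ...
    'VariableNames', {'OrderID', 'CustomerID'});
status = table([1037; 5120; 8937], ...
    categorical(["Delivered"; "Pending"; "Shipped"]), ...
    'VariableNames', {'OrderID', 'Status'});

% The join returns a timetable whose rows follow the OrderID key order.
joined = innerjoin(ordersTT, status, 'Keys', 'OrderID');

% Sorting a timetable with no extra arguments orders it by row times.
sortedByTime = sortrows(joined);
```

In this sketch joined is ordered by OrderID (1037, 5120, 8937), while sortedByTime is ordered by date (16-Apr, 18-Apr, 23-Apr).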

If the Join Tables Live Editor task fails to automatically select the first pair of merging variables based on row labels or variable names, then it tries to select them based on a scoring algorithm described in "Auto-Suggest: Learning-to-Recommend Data Preparation Steps Using Data Science Notebooks." The Join Tables task selects and tests candidate pairs of merging variables using these steps:

  1. Select row names (in a table) or row times (in a timetable) as the first pair of merging variables.
  2. If step 1 fails, then select variables with names that exactly match as the first pair.
  3. If steps 1 and 2 fail, then score pairs of variables using the scoring algorithm. Select the pair of variables with the highest score as the first pair of merging variables.
  4. If all previous steps fail, then select the first items in the Merging variable drop-down lists as the first pair of merging variables.

In previous releases, step 3 was to select the pair of variables whose names gave the best partial match as the first pair of merging variables.
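The name-matching step (step 2) and the final fallback (step 4) can be sketched as follows. This is an illustration with hypothetical tables, not the task's implementation. Steps 1 and 3 are omitted because the source does not specify how candidates are tested or scored.

```matlab
% Hypothetical input tables.
left  = table([1; 2], ["a"; "b"], 'VariableNames', {'OrderID', 'Name'});
right = table([2; 3], [9.5; 7.0], 'VariableNames', {'OrderID', 'Price'});

% Step 2 (sketch): look for variable names that match exactly.
common = intersect(left.Properties.VariableNames, ...
    right.Properties.VariableNames, 'stable');

if ~isempty(common)
    % Use the first exactly matching name as the key pair.
    leftKey = common{1};
    rightKey = common{1};
else
    % Step 4 (sketch): fall back to the first variable of each table.
    leftKey = left.Properties.VariableNames{1};
    rightKey = right.Properties.VariableNames{1};
end
```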

This Live Editor task does not run automatically if the inputs have more than 1 million elements. In previous releases, the task always ran automatically for inputs of any size. If the inputs have a large number of elements, then the code generated by this task can take a noticeable amount of time to run (more than a few seconds).

When a task does not run automatically, the Autorun indicator is disabled. You can either run the task manually when needed or choose to enable the task to run automatically.