setdiff - Difference of two sets of data - MATLAB

Difference of two sets of data

Syntax

C = setdiff(A,B)

C = setdiff(A,B,setOrder)

C = setdiff(A,B,___,'rows')

C = setdiff(A,B,'rows',___)

[C,ia] = setdiff(___)

[C,ia] = setdiff(A,B,'legacy')

[C,ia] = setdiff(A,B,'rows','legacy')

Description

C = setdiff(A,B) returns the data in A that is not in B, with no repetitions. C is in sorted order.

C = setdiff(A,B,setOrder) returns C in a specific order. setOrder can be 'sorted' or 'stable'.

C = setdiff(A,B,___,'rows') and C = setdiff(A,B,'rows',___) treat each row of A and each row of B as single entities and return the rows from A that are not in B, with no repetitions. You must specify A and B; specifying setOrder is optional.

The 'rows' option does not support cell arrays, unless one of the inputs is either a categorical array or a datetime array.

[C,ia] = setdiff(___) also returns the index vector ia using any of the previous syntaxes.

[C,ia] = setdiff(A,B,'legacy') and [C,ia] = setdiff(A,B,'rows','legacy') preserve the behavior of the setdiff function from R2012b and prior releases.

The 'legacy' option does not support categorical arrays, datetime arrays, duration arrays, tables, or timetables.

Examples

Define two vectors with values in common.

A = [3 6 2 1 5 1 1]; B = [2 4 6];

Find the values in A that are not in B.
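
The call and its result are not shown above. A minimal sketch, assuming the default syntax with no extra flags; the expected value in the comment follows from the definitions of A and B:

```matlab
% Values of A that are not in B, without repetitions, in sorted order
A = [3 6 2 1 5 1 1];
B = [2 4 6];
C = setdiff(A,B)   % C = [1 3 5]
```

The values 6 and 2 are removed because they appear in B, and the repeated 1 appears only once in C.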

Define two tables with rows in common.

A = table([1:5]',['A';'B';'C';'D';'E'],logical([0;1;0;1;0]))

A=5×3 table
 Var1    Var2    Var3 
 ____    ____    _____

 1       A      false
 2       B      true 
 3       C      false
 4       D      true 
 5       E      false

B = table([1:2:10]',['A';'C';'E';'G';'I'],logical(zeros(5,1)))

B=5×3 table
 Var1    Var2    Var3 
 ____    ____    _____

 1       A      false
 3       C      false
 5       E      false
 7       G      false
 9       I      false

Find the rows in A that are not in B.

C = setdiff(A,B)

C=2×3 table
 Var1    Var2    Var3 
 ____    ____    _____

 2       B      true 
 4       D      true 

Define two vectors with values in common.

A = [3 6 2 1 5 1 1]; B = [2 4 6];

Find the values in A that are not in B as well as the index vector ia, such that C = A(ia).
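
The two-output call is not shown above. A short sketch, assuming the default 'sorted' order, that also checks the stated relationship C = A(ia):

```matlab
A = [3 6 2 1 5 1 1];
B = [2 4 6];

% ia holds the index of the first occurrence in A of each value in C
[C,ia] = setdiff(A,B)   % C = [1 3 5], ia = [4; 1; 5]

% Indexing A with ia recovers C
isequal(C,A(ia))        % returns logical 1 (true)
```

The value 1 occurs at positions 4, 6, and 7 of A; ia reports position 4, the first occurrence.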

Define a table, A, of gender, age, and height for five people.

A = table(['M';'M';'F';'M';'F'],[27;52;31;46;35],[74;68;64;61;64],...
    'VariableNames',{'Gender' 'Age' 'Height'},...
    'RowNames',{'Ted' 'Fred' 'Betty' 'Bob' 'Judy'})

A=5×3 table
         Gender   Age    Height
         ______   ___    ______

Ted        M       27       74  
Fred       M       52       68  
Betty      F       31       64  
Bob        M       46       61  
Judy       F       35       64  

Define a table, B, with the same variables as A.

B = table(['F';'M';'F';'F'],[64;68;62;58],[31;47;35;23],...
    'VariableNames',{'Gender' 'Height' 'Age'},...
    'RowNames',{'Meg' 'Joe' 'Beth' 'Amy'})

B=4×3 table
        Gender    Height   Age
        ______    ______   ___

Meg       F         64      31 
Joe       M         68      47 
Beth      F         62      35 
Amy       F         58      23 

Find the rows in A that are not in B, as well as the index vector ia, such that C = A(ia,:).

[C,ia] = setdiff(A,B)

C=4×3 table
        Gender   Age    Height
        ______   ___    ______

Judy      F       35       64  
Ted       M       27       74  
Bob       M       46       61  
Fred      M       52       68  

The rows of C are in sorted order first by Gender and next by Age.
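
The index vector for this example is not displayed above. The following sketch repeats the example end to end; the expected ia in the comment is inferred from the row names shown in C (Judy, Ted, Bob, Fred), so treat it as a reading of that output rather than as separately documented behavior:

```matlab
A = table(['M';'M';'F';'M';'F'],[27;52;31;46;35],[74;68;64;61;64],...
    'VariableNames',{'Gender' 'Age' 'Height'},...
    'RowNames',{'Ted' 'Fred' 'Betty' 'Bob' 'Judy'});

% B lists Height before Age; in this example the variables are matched by name
B = table(['F';'M';'F';'F'],[64;68;62;58],[31;47;35;23],...
    'VariableNames',{'Gender' 'Height' 'Age'},...
    'RowNames',{'Meg' 'Joe' 'Beth' 'Amy'});

[C,ia] = setdiff(A,B);
ia                      % expected [5; 1; 4; 2]

% Row-subscripting A with ia should reproduce C
isequal(C,A(ia,:))
```

Betty is the only row of A dropped: her Gender, Age, and Height match Meg's, even though the row names differ and B stores its variables in a different order.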

Define two matrices with rows in common.

A = [7 9 7; 0 0 0; 7 9 7; 5 5 5; 1 4 5]; B = [0 0 0; 5 5 5];

Find the rows from A that are not in B as well as the index vector ia, such that C = A(ia,:).

[C,ia] = setdiff(A,B,'rows')
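
The output of this call is not shown above. A sketch of what to expect, assuming the default 'sorted' order for rows:

```matlab
A = [7 9 7; 0 0 0; 7 9 7; 5 5 5; 1 4 5];
B = [0 0 0; 5 5 5];

% Each row is compared as a single entity
[C,ia] = setdiff(A,B,'rows')   % C = [1 4 5; 7 9 7], ia = [5; 1]

% Row-subscripting A with ia recovers C
isequal(C,A(ia,:))             % returns logical 1 (true)
```

The row [7 9 7] appears twice in A (rows 1 and 3) but only once in C, and ia points to its first occurrence.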

Use the setOrder argument to specify the ordering of the values in C.

Specify 'stable' or 'sorted' when the order of the values in C is important.

A = [3 6 2 1 5 1 1]; B = [2 4 6]; [C,ia] = setdiff(A,B,'stable')

Alternatively, you can specify 'sorted' order.

[C,ia] = setdiff(A,B,'sorted')
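
The outputs of the two calls are not shown above. A side-by-side sketch; the variable names Cstable, iaStable, Csorted, and iaSorted are introduced here only for illustration:

```matlab
A = [3 6 2 1 5 1 1];
B = [2 4 6];

% 'stable' keeps the order in which the values first appear in A
[Cstable,iaStable] = setdiff(A,B,'stable')   % Cstable = [3 1 5], iaStable = [1; 4; 5]

% 'sorted' (the default) returns the same values in ascending order
[Csorted,iaSorted] = setdiff(A,B,'sorted')   % Csorted = [1 3 5], iaSorted = [4; 1; 5]
```

Both calls return the same set of values; only the ordering of C, and therefore of ia, differs.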

Define two vectors containing NaN.

A = [5 NaN NaN]; B = [5 NaN];

Find the set difference of A and B.

setdiff treats NaN values as distinct.
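
A brief sketch of this behavior, using the vectors defined above:

```matlab
A = [5 NaN NaN];
B = [5 NaN];

% No NaN in A matches the NaN in B, and the two NaNs in A
% are not treated as repetitions of each other
C = setdiff(A,B)   % C = [NaN NaN]
```

Only the value 5 is removed; both NaN entries of A remain in C.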

Create a cell array of character vectors, A.

A = {'dog','cat','fish','horse'};

Create a cell array of character vectors, B, where some of the vectors have trailing white space.

B = {'dog ','cat','fish ','horse'};

Find the character vectors in A that are not in B.

C = setdiff(A,B)

C = 1×2 cell
    {'dog'}    {'fish'}

setdiff treats trailing white space in cell arrays of character vectors as distinct characters.
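
If trailing white space should not count as a difference, one possible workaround is to trim the inputs before comparing. This is an illustrative sketch, not part of the original example; it assumes strtrim is an acceptable normalization for your data, and the name Ctrim is introduced here:

```matlab
A = {'dog','cat','fish','horse'};
B = {'dog ','cat','fish ','horse'};

% Without trimming, 'dog' and 'dog ' are different character vectors
C = setdiff(A,B)                % {'dog'} {'fish'}

% strtrim removes leading and trailing white space from each cell element
Ctrim = setdiff(A,strtrim(B));
isempty(Ctrim)                  % returns logical 1 (true): no differences remain
```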

Create a character array, A.

A = ['cat';'dog';'fox';'pig']; class(A)

Create a cell array of character vectors, B.

B={'dog','cat','fish','horse'}; class(B)

Find the character vectors in A that are not in B.

C = setdiff(A,B)

C = 2×1 cell
    {'fox'}
    {'pig'}

The result, C, is a cell array of character vectors.
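
The whole example can be sketched as follows; the expected values in the comments are taken from the output shown above:

```matlab
A = ['cat';'dog';'fox';'pig'];        % 4-by-3 char array, one word per row
B = {'dog','cat','fish','horse'};     % cell array of character vectors

% Each row of A is compared with the character vectors in B
C = setdiff(A,B)                      % {'fox'; 'pig'}
class(C)                              % 'cell', even though A is of class char
```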

Use the 'legacy' flag to preserve the behavior of setdiff from R2012b and prior releases in your code.

Find the difference of A and B with the current behavior.

A = [3 6 2 1 5 1 1]; B = [2 4 6]; [C1,ia1] = setdiff(A,B)

Find the difference of A and B, and preserve the legacy behavior.

[C2,ia2] = setdiff(A,B,'legacy')
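
The outputs of the two calls are not shown above. A sketch of the differences to expect; the values in C match, but the index vectors differ in both shape and content:

```matlab
A = [3 6 2 1 5 1 1];
B = [2 4 6];

% Current behavior: ia1 is a column vector of first occurrences
[C1,ia1] = setdiff(A,B)            % C1 = [1 3 5], ia1 = [4; 1; 5]

% Legacy behavior: ia2 is a row vector of last occurrences
[C2,ia2] = setdiff(A,B,'legacy')   % C2 = [1 3 5], ia2 = [7 1 5]
```

Both index vectors recover the same values from A, since A(4) and A(7) are both 1.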

Input Arguments

Order flag, specified as 'sorted' or 'stable'. The flag indicates the order of the values (or rows) in C.

| Flag | Description |
| --- | --- |
| 'sorted' | The values (or rows) in C return in sorted order as returned by sort. Example: C = setdiff([4 1 3 2 5],[2 1],'sorted') returns C = 3 4 5 |
| 'stable' | The values (or rows) in C return in the same order as in A. Example: C = setdiff([4 1 3 2 5],[2 1],'stable') returns C = 4 3 5 |

Data Types: char | string

Output Arguments

Difference of A and B, returned as a vector, matrix, table, or timetable. If the inputs A and B are tables or timetables, then the order of the variables in C is the same as the order of the variables in A.

The following describes the shape of C when the inputs are vectors or matrices and when the 'legacy' flag is not specified:

The class of C is the same as the class of A, unless:

Index to A, returned as a column vector when the 'legacy' flag is not specified. ia identifies the values (or rows) in A that are not in B. If there is a repeated value (or row) appearing exclusively in A, then ia contains the index to the first occurrence of the value (or row).

Tips

Extended Capabilities

The setdiff function supports tall arrays with the following usage notes and limitations:

For more information, see Tall Arrays.

Usage notes and limitations:

The setdiff function supports GPU array input with these usage notes and limitations:

For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).

Usage notes and limitations:

For more information, see Run MATLAB Functions with Distributed Arrays (Parallel Computing Toolbox).

Version History

Introduced before R2006a