Creates a join data.table — J

Creates a data.table for use as the i argument in a [.data.table join.

Usage

# DT[J(...)]                          # J() only for use inside DT[...]
# DT[.(...)]                          # .() only for use inside DT[...]
# DT[list(...)]                       # same; .(), list() and J() are identical
SJ(...)                             # DT[SJ(...)]
CJ(..., sorted=TRUE, unique=FALSE)  # DT[CJ(...)]

Arguments

...

Each argument is a vector. Generally the vectors are all the same length; if they are not, the usual silent recycling is applied.

sorted

logical. Should [setkey()](setkey.html) be called on all the columns in the order they were passed to CJ?

unique

logical. When TRUE, only the unique values of each vector are used; duplicates are dropped automatically.

Details

SJ and CJ are convenience functions to create a data.table to be used in i when performing a data.table 'query' on x.
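The Examples section does not exercise SJ, so here is a minimal sketch of how it might be used. The table `x` and its columns `id` and `value` are illustrative names, not part of the package. The sketch assumes that SJ keys the join table it builds, so that its rows come back sorted whatever order the values were supplied in.

```r
library(data.table)

# Illustrative keyed lookup table (names are hypothetical).
x <- data.table(id = c("a", "b", "c", "d"), value = 1:4, key = "id")

# SJ() builds the join table and keys it, so the values end up sorted
# even though they were passed as d, a, c.
i_sorted <- SJ(c("d", "a", "c"))
haskey(i_sorted)     # the join table carries a key
i_sorted[[1L]]       # join values in sorted order

# Used in i, the result follows the sorted order of the join table.
res <- x[i_sorted]
res
```

With J or .() in place of SJ, the result would instead follow the order in which the values were written.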

x[data.table(id)] is the same as x[J(id)] but the latter is more readable. Identical alternatives are x[list(id)] and x[.(id)].
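The equivalence of the four spellings can be checked directly. This is a small sketch on a hypothetical keyed table. The lookup vector is deliberately named `ids` rather than `id`, because i is evaluated with the columns of x in scope and a variable sharing a column's name would resolve to that column instead.

```r
library(data.table)

# Hypothetical keyed table; 'ids' holds the key values to look up.
x <- data.table(id = c("a", "b", "c"), value = 1:3, key = "id")
ids <- c("c", "a")

r_dt   <- x[data.table(ids)]  # explicit data.table in i
r_J    <- x[J(ids)]           # J() alias
r_list <- x[list(ids)]        # plain list
r_dot  <- x[.(ids)]           # dot alias

# All four forms select the same rows, in the order given by 'ids'.
r_J$value
all.equal(r_J, r_dot)
all.equal(r_J, r_list)
```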

When using a join table in i, either x must be keyed or the on argument must be used to specify which columns of x and i are joined. See [[.data.table](data.table.html).
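The examples on this page all key the table first. As a complement, the following sketch shows the on argument with an unkeyed table. The table `lookup` and its column `letter` are hypothetical names introduced only for illustration.

```r
library(data.table)

x <- data.table(A = 5:1, B = letters[5:1])   # deliberately left unkeyed
haskey(x)                                    # FALSE

# An unnamed join column is matched to the column named in 'on'.
one <- x[.("b"), on = "B"]

# A named join column is matched by name.
two <- x[.(B = c("b", "d")), on = "B"]

# When the column names differ, map them as c(<column in x> = "<column in i>").
lookup <- data.table(letter = c("e", "a"))
mapped <- x[lookup, on = c(B = "letter")]
```

Because no key is set, x keeps its original row order and nothing is reordered by reference.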

Value

See also

Examples

DT = data.table(A=5:1, B=letters[5:1])
setkey(DT, B)   # reorders table and marks it sorted
DT[J("b")]      # returns the 2nd row
#> Key: <B>
#>        A      B
#>    <int> <char>
#> 1:     2      b
DT[list("b")]   # same
#> Key: <B>
#>        A      B
#>    <int> <char>
#> 1:     2      b
DT[.("b")]      # same using the dot alias for list
#> Key: <B>
#>        A      B
#>    <int> <char>
#> 1:     2      b

# CJ usage examples
CJ(c(5, NA, 1), c(1, 3, 2))                 # sorted and keyed data.table
#> Key: <V1, V2>
#>       V1    V2
#>    <num> <num>
#> 1:    NA     1
#> 2:    NA     2
#> 3:    NA     3
#> 4:     1     1
#> 5:     1     2
#> 6:     1     3
#> 7:     5     1
#> 8:     5     2
#> 9:     5     3
do.call(CJ, list(c(5, NA, 1), c(1, 3, 2)))  # same as above
#> Key: <V1, V2>
#>       V1    V2
#>    <num> <num>
#> 1:    NA     1
#> 2:    NA     2
#> 3:    NA     3
#> 4:     1     1
#> 5:     1     2
#> 6:     1     3
#> 7:     5     1
#> 8:     5     2
#> 9:     5     3
CJ(c(5, NA, 1), c(1, 3, 2), sorted=FALSE)   # same order as input, unkeyed
#>       V1    V2
#>    <num> <num>
#> 1:     5     1
#> 2:     5     3
#> 3:     5     2
#> 4:    NA     1
#> 5:    NA     3
#> 6:    NA     2
#> 7:     1     1
#> 8:     1     3
#> 9:     1     2
# using the 'unique=' argument
x = c(1, 1, 2)
y = c(4, 6, 4)
CJ(x, y)              # output columns are automatically named 'x' and 'y'
#> Key: <x, y>
#>        x     y
#>    <num> <num>
#> 1:     1     4
#> 2:     1     4
#> 3:     1     4
#> 4:     1     4
#> 5:     1     6
#> 6:     1     6
#> 7:     2     4
#> 8:     2     4
#> 9:     2     6
CJ(x, y, unique=TRUE) # unique(x) and unique(y) are computed automatically
#> Key: <x, y>
#>        x     y
#>    <num> <num>
#> 1:     1     4
#> 2:     1     6
#> 3:     2     4
#> 4:     2     6
CJ(x, y, sorted = FALSE) # retain input order for y
#>        x     y
#>    <num> <num>
#> 1:     1     4
#> 2:     1     6
#> 3:     1     4
#> 4:     1     4
#> 5:     1     6
#> 6:     1     4
#> 7:     2     4
#> 8:     2     6
#> 9:     2     4
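A common use of CJ in i is to expand a table to every combination of two identifiers, so that combinations missing from the data appear as rows with NA. The sketch below is illustrative only: the `sales` table and its columns are made up, and the grid is built outside the join with explicitly named arguments to keep the column names unambiguous.

```r
library(data.table)

# Hypothetical data: store "b" has no row for month 2.
sales <- data.table(
  store = c("a", "a", "b"),
  month = c(1L, 2L, 1L),
  units = c(10L, 12L, 7L)
)

# Every store/month combination; unique = TRUE drops the repeated values.
grid <- CJ(store = sales$store, month = sales$month, unique = TRUE)

# Join the data onto the full grid; unmatched combinations get NA units.
full <- sales[grid, on = c("store", "month")]
full
```

Here `full` has four rows, one per store and month, with `units` set to NA for the combination that was absent from `sales`.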