torch_frame.data.MultiNestedTensor — pytorch-frame documentation
class MultiNestedTensor(num_rows: int, num_cols: int, values: Tensor, offset: Tensor)[source]
Bases: _MultiTensor
A read-only PyTorch tensor-based data structure that stores data of shape [num_rows, num_cols, *], where the size of the last dimension can differ for each (row, column) cell. Internally, the object is stored in an efficient flattened format, (values, offset), where the PyTorch Tensor at (i, j) is accessed by values[offset[i*num_cols+j]:offset[i*num_cols+j+1]]. It supports advanced indexing, including slicing and list indexing along both rows and columns.
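The lookup rule above can be sketched without any dependencies. This is a minimal pure-Python illustration, not the library's implementation: plain lists stand in for tensors, and the helper name get_cell is hypothetical. The data matches the 3 x 2 example used elsewhere on this page.

```python
# Pure-Python sketch of the (values, offset) layout; lists stand in for tensors.
num_rows, num_cols = 3, 2
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
offset = [0, 2, 3, 4, 7, 9, 10]  # length is num_rows * num_cols + 1


def get_cell(i: int, j: int) -> list:
    """Return the variable-length entry stored at row i, column j (hypothetical helper)."""
    k = i * num_cols + j  # row-major position of the cell
    return values[offset[k]:offset[k + 1]]


cell_00 = get_cell(0, 0)  # [1, 2]
cell_11 = get_cell(1, 1)  # [5, 6, 7]
cell_20 = get_cell(2, 0)  # [8, 9]
```

Because consecutive offsets bound each cell, the length of cell (i, j) is offset[k + 1] - offset[k] with k = i*num_cols + j, and no per-cell padding is stored.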
Parameters:
- num_rows (int) – Number of rows.
- num_cols (int) – Number of columns.
- values (torch.Tensor) – The values torch.Tensor of size [numel,].
- offset (torch.Tensor) – The offset torch.Tensor of size [num_rows*num_cols+1,].
Example
>>> import torch
>>> from torch_frame.data import MultiNestedTensor
>>> tensor_mat = [
...     [torch.tensor([1, 2]), torch.tensor([3])],
...     [torch.tensor([4]), torch.tensor([5, 6, 7])],
...     [torch.tensor([8, 9]), torch.tensor([10])],
... ]
>>> mnt = MultiNestedTensor.from_tensor_mat(tensor_mat)
>>> mnt
MultiNestedTensor(num_rows=3, num_cols=2, device='cpu')
>>> mnt.values
tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
>>> mnt.offset
tensor([ 0,  2,  3,  4,  7,  9, 10])
>>> mnt[0, 0]
tensor([1, 2])
>>> mnt[1, 1]
tensor([5, 6, 7])
>>> mnt[0]  # Row integer indexing
MultiNestedTensor(num_rows=1, num_cols=2, device='cpu')
>>> mnt[:, 0]  # Column integer indexing
MultiNestedTensor(num_rows=3, num_cols=1, device='cpu')
>>> mnt[:2]  # Row integer slicing
MultiNestedTensor(num_rows=2, num_cols=2, device='cpu')
>>> mnt[[2, 1, 2, 0]]  # Row list indexing
MultiNestedTensor(num_rows=4, num_cols=2, device='cpu')
>>> mnt.to_dense(fill_value=-1)  # Map to a dense matrix with padding
tensor([[[ 1,  2, -1],
         [ 3, -1, -1]],
        [[ 4, -1, -1],
         [ 5,  6,  7]],
        [[ 8,  9, -1],
         [10, -1, -1]]])
classmethod from_tensor_mat(tensor_mat: list[list[torch.Tensor]]) → MultiNestedTensor[source]
Construct a MultiNestedTensor object from tensor_mat.
Parameters:
tensor_mat (List[List[Tensor]]) – A matrix of torch.Tensor objects. tensor_mat[i][j] contains the 1-dim torch.Tensor for the i-th row and j-th column; these tensors may vary in size.
Returns:
A MultiNestedTensor instance.
Return type:
MultiNestedTensor
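As a rough illustration of how a matrix of variable-length entries maps to the flattened format, the following pure-Python sketch builds (values, offset) from nested lists. The function flatten_mat is hypothetical, lists stand in for 1-dim tensors, and the check that all rows have the same number of columns is an assumption, not documented library behavior.

```python
def flatten_mat(mat):
    """Flatten a matrix of variable-length lists into (values, offset).

    Hypothetical helper: mirrors the documented layout, not the library code.
    """
    num_rows = len(mat)
    num_cols = len(mat[0]) if num_rows > 0 else 0
    values, offset = [], [0]
    for row in mat:
        if len(row) != num_cols:  # assumed requirement: rectangular matrix
            raise ValueError("every row must have the same number of columns")
        for cell in row:
            values.extend(cell)
            offset.append(len(values))  # running end position of this cell
    return num_rows, num_cols, values, offset


mat = [[[1, 2], [3]], [[4], [5, 6, 7]], [[8, 9], [10]]]
num_rows, num_cols, values, offset = flatten_mat(mat)
# values == [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# offset == [0, 2, 3, 4, 7, 9, 10]
```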
fillna_col(col_index: int, fill_value: int | float | Tensor) → None[source]
Fill the col_index-th column of the MultiNestedTensor with fill_value, in place.
Parameters:
- col_index (int) – A column index of the tensor to select.
- fill_value (Union[int, float, Tensor]) – Scalar values to replace NaNs.
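One way to picture a column-wise NaN fill on the flattened layout: the cells of column j sit at positions i*num_cols + j for each row i, so only those slices of values are touched. The sketch below is a pure-Python approximation with a hypothetical function name; it handles only a scalar fill_value, and how the library treats a Tensor fill_value is not covered here.

```python
import math


def fillna_col_sketch(values, offset, num_rows, num_cols, col_index, fill_value):
    """Replace NaNs in every cell of one column, mutating `values` in place.

    Hypothetical helper operating on plain lists instead of tensors.
    """
    for i in range(num_rows):
        k = i * num_cols + col_index  # flattened position of cell (i, col_index)
        for p in range(offset[k], offset[k + 1]):
            if math.isnan(values[p]):
                values[p] = fill_value


nan = float("nan")
# 2 rows x 2 cols: cells are [1.0, nan], [3.0], [nan], [5.0, nan]
values = [1.0, nan, 3.0, nan, 5.0, nan]
offset = [0, 2, 3, 4, 6]
fillna_col_sketch(values, offset, num_rows=2, num_cols=2, col_index=0, fill_value=0.0)
# Column 0 is filled; the NaN in column 1 (last element) is left untouched.
```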
to_dense(fill_value: int | float) → Tensor[source]
Map MultiNestedTensor into dense Tensor representation with padding.
Parameters:
fill_value (Union[int, float]) – Value used for padding.
Returns:
Padded PyTorch Tensor object with shape (num_rows, num_cols, max_length).
Return type:
Tensor
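The padding step can be sketched as follows, assuming max_length is the length of the longest cell. This is a pure-Python approximation with a hypothetical function name and nested lists in place of a dense tensor; the library's actual implementation may differ.

```python
def to_dense_sketch(values, offset, num_rows, num_cols, fill_value):
    """Pad every cell to the longest cell length and return nested lists.

    Hypothetical helper: the result has shape (num_rows, num_cols, max_length).
    """
    lengths = [offset[k + 1] - offset[k] for k in range(num_rows * num_cols)]
    max_length = max(lengths, default=0)
    dense = []
    for i in range(num_rows):
        row = []
        for j in range(num_cols):
            k = i * num_cols + j
            cell = values[offset[k]:offset[k + 1]]
            # Right-pad the cell with fill_value up to max_length.
            row.append(cell + [fill_value] * (max_length - len(cell)))
        dense.append(row)
    return dense


values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
offset = [0, 2, 3, 4, 7, 9, 10]
dense = to_dense_sketch(values, offset, num_rows=3, num_cols=2, fill_value=-1)
# dense[0] == [[1, 2, -1], [3, -1, -1]]
```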
static cat(xs: Sequence[MultiNestedTensor], dim: int = 0) → MultiNestedTensor[source]
Concatenates a sequence of MultiNestedTensor objects along the specified dimension.
Parameters:
- xs (Sequence[MultiNestedTensor]) – A sequence of MultiNestedTensor to be concatenated.
- dim (int) – The dimension to concatenate along.
Returns:
Concatenated multi nested tensor.
Return type:
MultiNestedTensor
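For row-wise concatenation (dim=0), the flattened layout makes the operation cheap in principle: values are appended and each later offset is shifted by the number of values already collected. The sketch below is a pure-Python illustration with a hypothetical function name. It assumes every input's offset starts at 0 and that all inputs share the same num_cols. Column-wise concatenation (dim=1) would require interleaving cells and is not shown.

```python
def cat_rows_sketch(parts):
    """Concatenate (num_rows, num_cols, values, offset) tuples along rows.

    Hypothetical helper; assumes each offset list starts at 0.
    """
    num_cols = parts[0][1]
    out_rows, out_values, out_offset = 0, [], [0]
    for num_rows, cols, values, offset in parts:
        if cols != num_cols:
            raise ValueError("all inputs must have the same number of columns")
        shift = len(out_values)  # values already collected from earlier parts
        out_values.extend(values)
        out_offset.extend(shift + o for o in offset[1:])
        out_rows += num_rows
    return out_rows, num_cols, out_values, out_offset


a = (1, 2, [1, 2, 3], [0, 2, 3])                       # cells: [1, 2], [3]
b = (2, 2, [4, 5, 6, 7, 8, 9, 10], [0, 1, 4, 6, 7])    # cells: [4], [5, 6, 7], [8, 9], [10]
result = cat_rows_sketch([a, b])
# result == (3, 2, [1, 2, ..., 10], [0, 2, 3, 4, 7, 9, 10])
```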