EmbeddingBag — PyTorch 2.7 documentation

class torch.nn.EmbeddingBag(num_embeddings, embedding_dim, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, mode='mean', sparse=False, _weight=None, include_last_offset=False, padding_idx=None, device=None, dtype=None)

Compute sums or means of ‘bags’ of embeddings, without instantiating the intermediate embeddings.

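For bags of constant length, no per_sample_weights, no indices equal to padding_idx, and with 2D inputs, this class with mode="sum" is equivalent to Embedding followed by torch.sum(dim=1), with mode="mean" is equivalent to Embedding followed by torch.mean(dim=1), and with mode="max" is equivalent to Embedding followed by torch.max(dim=1).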

However, EmbeddingBag is much more time and memory efficient than using a chain of these operations.
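The equivalence can be checked numerically. The following is a minimal sketch, assuming the "chain of these operations" is an nn.Embedding lookup followed by a reduction over dim=1; the sizes, the seed, and the helper name bags_match are illustrative choices, not part of the API.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes chosen only for illustration.
num_embeddings, embedding_dim = 10, 3
embedding = nn.Embedding(num_embeddings, embedding_dim)

# 2D input: 2 bags of constant length 4, no padding_idx, no per_sample_weights.
indices = torch.tensor([[1, 2, 4, 5], [4, 3, 2, 9]], dtype=torch.long)


def bags_match(mode):
    """Return True if EmbeddingBag agrees with Embedding plus a reduction over dim=1."""
    # Share the same weights so both paths look up identical vectors.
    bag = nn.EmbeddingBag.from_pretrained(embedding.weight.detach().clone(), mode=mode)
    looked_up = embedding(indices)  # shape (2, 4, 3): the intermediate embeddings
    if mode == "sum":
        expected = looked_up.sum(dim=1)
    elif mode == "mean":
        expected = looked_up.mean(dim=1)
    else:  # "max"
        expected = looked_up.max(dim=1).values
    return torch.allclose(bag(indices), expected, atol=1e-6)


results = {mode: bags_match(mode) for mode in ("sum", "mean", "max")}
print(results)
```

The explicit path materializes a (2, 4, 3) tensor before reducing, which is the intermediate that EmbeddingBag avoids.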

EmbeddingBag also supports per-sample weights as an argument to the forward pass. This scales the output of the Embedding before performing a weighted reduction as specified by mode. If per_sample_weights is passed, the only supported mode is "sum", which computes a weighted sum according to per_sample_weights.
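As a hedged illustration of the weighted sum, the sketch below uses a hand-built weight table (row i is filled with the value i, an assumption made only so the result can be verified by hand) and compares the module output with a manual computation.

```python
import torch
import torch.nn as nn

# Hypothetical weight table: row i is filled with the value i.
weight = torch.arange(5, dtype=torch.float).unsqueeze(1).repeat(1, 2)  # shape (5, 2)
bag = nn.EmbeddingBag.from_pretrained(weight, mode="sum")

indices = torch.tensor([1, 2, 3, 4], dtype=torch.long)
offsets = torch.tensor([0, 2], dtype=torch.long)  # bags: [1, 2] and [3, 4]
# One weight per index, same shape as indices and same dtype as the embedding weights.
per_sample_weights = torch.tensor([0.5, 2.0, 1.0, 0.0])

weighted = bag(indices, offsets, per_sample_weights=per_sample_weights)

# Manual check: scale each looked-up row, then sum within each bag.
scaled = weight[indices] * per_sample_weights.unsqueeze(1)
manual = torch.stack([scaled[0:2].sum(dim=0), scaled[2:4].sum(dim=0)])

print(weighted)
print(manual)
```

Here the first bag evaluates to 0.5 * 1 + 2.0 * 2 = 4.5 and the second to 1.0 * 3 + 0.0 * 4 = 3.0 in every column.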

Parameters

Variables

weight (Tensor) – the learnable weights of the module of shape (num_embeddings, embedding_dim), initialized from the standard normal distribution N(0, 1).

Examples:

>>> # an EmbeddingBag module containing 10 tensors of size 3
>>> embedding_sum = nn.EmbeddingBag(10, 3, mode='sum')
>>> # a batch of 2 samples of 4 indices each
>>> input = torch.tensor([1, 2, 4, 5, 4, 3, 2, 9], dtype=torch.long)
>>> offsets = torch.tensor([0, 4], dtype=torch.long)
>>> embedding_sum(input, offsets)
tensor([[-0.8861, -5.4350, -0.0523],
        [ 1.1306, -2.5798, -1.0044]])

>>> # Example with padding_idx
>>> embedding_sum = nn.EmbeddingBag(10, 3, mode='sum', padding_idx=2)
>>> input = torch.tensor([2, 2, 2, 2, 4, 3, 2, 9], dtype=torch.long)
>>> offsets = torch.tensor([0, 4], dtype=torch.long)
>>> embedding_sum(input, offsets)
tensor([[ 0.0000,  0.0000,  0.0000],
        [-0.7082,  3.2145, -2.6251]])

>>> # An EmbeddingBag can be loaded from an Embedding like so
>>> embedding = nn.Embedding(10, 3, padding_idx=2)
>>> embedding_sum = nn.EmbeddingBag.from_pretrained(
...     embedding.weight,
...     padding_idx=embedding.padding_idx,
...     mode='sum')

forward(input, offsets=None, per_sample_weights=None)

Forward pass of EmbeddingBag.

Parameters

Returns

Output tensor of shape (B, embedding_dim).

Return type

Tensor

Note

A few notes about input and offsets:
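As a hedged illustration of how input and offsets interact, the sketch below contrasts a 2D input (fixed-length bags, offsets left as None) with a 1D input whose bag boundaries come from offsets. The one-column weight table, where row i holds the value i, is an assumption made so the sums are easy to read.

```python
import torch
import torch.nn as nn

# Hypothetical weight table of shape (6, 1): row i holds the value i.
weight = torch.arange(6, dtype=torch.float).unsqueeze(1)
bag = nn.EmbeddingBag.from_pretrained(weight, mode="sum")

# 2D input of shape (B, N): B bags of fixed length N; offsets stays None.
out_2d = bag(torch.tensor([[1, 2], [3, 4]], dtype=torch.long))

# 1D input: a concatenation of bags; offsets holds each bag's start position.
flat = torch.tensor([1, 2, 3, 4, 5], dtype=torch.long)
offsets = torch.tensor([0, 2, 2], dtype=torch.long)  # bags: [1, 2], [] (empty), [3, 4, 5]
out_1d = bag(flat, offsets)

print(out_2d)  # one row per bag: shape (2, 1)
print(out_1d)  # one row per offset: shape (3, 1); the empty bag yields zeros
```

Both tensors use the same integer dtype (torch.long here), and the number of output rows follows the number of bags rather than the number of indices.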

classmethod from_pretrained(embeddings, freeze=True, max_norm=None, norm_type=2.0, scale_grad_by_freq=False, mode='mean', sparse=False, include_last_offset=False, padding_idx=None)

Create EmbeddingBag instance from given 2-dimensional FloatTensor.

Parameters

Return type

EmbeddingBag

Examples:

>>> # FloatTensor containing pretrained weights
>>> weight = torch.FloatTensor([[1, 2.3, 3], [4, 5.1, 6.3]])
>>> embeddingbag = nn.EmbeddingBag.from_pretrained(weight)
>>> # Get the mean embedding of one bag containing indices 1 and 0
>>> input = torch.LongTensor([[1, 0]])
>>> embeddingbag(input)
tensor([[ 2.5000,  3.7000,  4.6500]])
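The output above is the mean of rows 1 and 0, because mode defaults to 'mean' in from_pretrained. The following sketch, under that reading, recomputes the mean by hand and inspects the effect of the freeze argument; the variable names manual_mean, frozen, and trainable are illustrative.

```python
import torch
import torch.nn as nn

weight = torch.tensor([[1.0, 2.3, 3.0], [4.0, 5.1, 6.3]])
embeddingbag = nn.EmbeddingBag.from_pretrained(weight)  # defaults: freeze=True, mode='mean'

# One bag holding indices 1 and 0.
indices = torch.tensor([[1, 0]], dtype=torch.long)
output = embeddingbag(indices)

# Manual mean over the two selected rows, kept as shape (1, 3).
manual_mean = weight[[1, 0]].mean(dim=0, keepdim=True)

# With freeze=True the weights do not receive gradients.
frozen = not embeddingbag.weight.requires_grad

# Passing freeze=False leaves the pretrained weights trainable.
trainable = nn.EmbeddingBag.from_pretrained(weight, freeze=False)

print(output, manual_mean, frozen, trainable.weight.requires_grad)
```

Use freeze=False when the pretrained vectors should be fine-tuned along with the rest of the model.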