tf.nn.ctc_loss | TensorFlow v2.16.1

tf.nn.ctc_loss


Computes CTC (Connectionist Temporal Classification) loss.

tf.nn.ctc_loss(
    labels,
    logits,
    label_length,
    logit_length,
    logits_time_major=True,
    unique=None,
    blank_index=None,
    name=None
)

This op implements the CTC loss as presented in Graves et al., 2006.

Connectionist temporal classification (CTC) is a type of neural network output and associated scoring function, for training recurrent neural networks (RNNs) such as LSTM networks to tackle sequence problems where the timing is variable. It can be used for tasks like on-line handwriting recognition or recognizing phones in speech audio. CTC refers to the outputs and scoring, and is independent of the underlying neural network structure.
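To make the scoring function concrete, here is a minimal pure-Python sketch of the CTC forward pass for a single sequence. It is an illustration under stated assumptions, not the TensorFlow implementation: the function name ctc_neg_log_likelihood is hypothetical, the input is assumed to be already-normalized per-frame probabilities (tf.nn.ctc_loss takes unnormalized logits instead), and the arithmetic is done in probability space, which is only safe for short sequences.

```python
import math


def ctc_neg_log_likelihood(probs, label, blank=0):
    """Return -log p(label | probs) for one sequence (hypothetical helper).

    probs: list of per-frame probability lists, each summing to 1.
    label: list of non-blank class indices (may be empty).
    """
    # Interleave blanks around the labels: [a, b] -> [blank, a, blank, b, blank].
    ext = [blank]
    for c in label:
        ext += [c, blank]
    num_states = len(ext)

    # alpha[s] is the total probability of all frame-level paths that sit
    # at extended position s after the frames processed so far.
    alpha = [0.0] * num_states
    alpha[0] = probs[0][ext[0]]
    if num_states > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        prev = alpha
        alpha = [0.0] * num_states
        for s in range(num_states):
            total = prev[s]  # stay on the same symbol
            if s >= 1:
                total += prev[s - 1]  # advance by one position
            # Skipping the blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * probs[t][ext[s]]

    # Valid paths end on the last label or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if num_states > 1 else 0.0)
    # A label that cannot fit into the available frames has zero probability.
    return -math.log(likelihood) if likelihood > 0 else math.inf


# Two frames, two classes (blank=0, label=1), uniform probabilities.
# Three paths collapse to [1]: "1 1", "1 blank", "blank 1".
print(ctc_neg_log_likelihood([[0.5, 0.5]] * 2, [1]))  # -log(0.75), about 0.2877
```

The sum over paths is what makes the loss independent of the exact frame-level alignment: every alignment that collapses to the target label sequence contributes to the likelihood.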

Notes:

import tensorflow as tf

tf.random.set_seed(50)
batch_size = 8
num_labels = 6
max_label_length = 5
num_frames = 12

# Dense labels in [1, num_labels); class 0 is reserved for the blank.
labels = tf.random.uniform([batch_size, max_label_length],
                           minval=1, maxval=num_labels, dtype=tf.int64)
# Time-major logits: [frames, batch_size, num_labels].
logits = tf.random.uniform([num_frames, batch_size, num_labels])
label_length = tf.random.uniform([batch_size], minval=2,
                                 maxval=max_label_length, dtype=tf.int64)
# Zero out label positions beyond each sequence's length.
label_mask = tf.sequence_mask(label_length, maxlen=max_label_length,
                              dtype=label_length.dtype)
labels *= label_mask
logit_length = [num_frames] * batch_size

with tf.GradientTape() as t:
    t.watch(logits)
    ref_loss = tf.nn.ctc_loss(
        labels=labels,
        logits=logits,
        label_length=label_length,
        logit_length=logit_length,
        blank_index=0)
ref_grad = t.gradient(ref_loss, logits)

Args
labels Tensor of shape [batch_size, max_label_seq_length] or SparseTensor.
logits Tensor of shape [frames, batch_size, num_labels]. If logits_time_major == False, shape is [batch_size, frames, num_labels].
label_length Tensor of shape [batch_size]. None, if labels is a SparseTensor. Length of reference label sequence in labels.
logit_length Tensor of shape [batch_size]. Length of input sequence in logits.
logits_time_major (optional) If True (default), logits is shaped [frames, batch_size, num_labels]. If False, shape is [batch_size, frames, num_labels].
unique (optional) Unique label indices as computed by ctc_unique_labels(labels). If supplied, enables a faster, memory-efficient implementation on TPU.
blank_index (optional) Class index to use for the blank label. Negative values count back from num_labels, i.e., -1 reproduces the TF 1.x ctc_loss behavior of using num_labels - 1 for the blank symbol. Switching from the default of 0 carries some memory and performance overhead, because an additional shifted copy of logits may be created.
name A name for this Op. Defaults to "ctc_loss_dense".
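The following sketch exercises two of the arguments above: logits_time_major and a negative blank_index. The sizes, seed, and variable names are arbitrary choices for illustration. It assumes dense labels drawn from classes 0 to num_labels - 2, so that the last class is free to serve as the blank.

```python
import tensorflow as tf

tf.random.set_seed(0)
batch_size, num_frames, num_labels, max_label_length = 4, 10, 5, 3

# Time-major logits: [frames, batch_size, num_labels].
logits_tm = tf.random.normal([num_frames, batch_size, num_labels])

# Labels use classes 0..num_labels-2; the last class is kept for the blank.
labels = tf.random.uniform([batch_size, max_label_length],
                           minval=0, maxval=num_labels - 1, dtype=tf.int64)
label_length = tf.constant([3, 2, 3, 1], dtype=tf.int64)
# Zero out positions beyond each label sequence's length.
labels *= tf.sequence_mask(label_length, maxlen=max_label_length,
                           dtype=labels.dtype)
logit_length = [num_frames] * batch_size

# Time-major input with the blank given as a negative index.
loss_tm = tf.nn.ctc_loss(labels, logits_tm, label_length, logit_length,
                         logits_time_major=True, blank_index=-1)

# The same data in batch-major layout: [batch_size, frames, num_labels].
logits_bm = tf.transpose(logits_tm, [1, 0, 2])
loss_bm = tf.nn.ctc_loss(labels, logits_bm, label_length, logit_length,
                         logits_time_major=False, blank_index=-1)

# blank_index=-1 refers to the same class as num_labels - 1.
loss_explicit = tf.nn.ctc_loss(labels, logits_tm, label_length, logit_length,
                               blank_index=num_labels - 1)

print(loss_tm.numpy())
print(loss_bm.numpy())
print(loss_explicit.numpy())
```

All three calls describe the same problem, so the printed loss vectors should agree up to floating-point error.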
Returns
loss A 1-D float Tensor of shape [batch_size], containing negative log probabilities.
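Because the op returns one value per batch element, a training step usually has to reduce it to a scalar. The sketch below shows one possible reduction (a plain mean) and recovers each sequence probability from its negative log probability. The choice of tf.reduce_mean is an assumption made for illustration; the op itself does not prescribe a reduction.

```python
import tensorflow as tf

tf.random.set_seed(1)
batch_size, num_frames, num_labels, max_label_length = 4, 10, 5, 3

logits = tf.random.normal([num_frames, batch_size, num_labels])
labels = tf.random.uniform([batch_size, max_label_length],
                           minval=1, maxval=num_labels, dtype=tf.int64)
label_length = tf.constant([3, 2, 3, 1], dtype=tf.int64)
# Zero out positions beyond each label sequence's length.
labels *= tf.sequence_mask(label_length, maxlen=max_label_length,
                           dtype=labels.dtype)
logit_length = [num_frames] * batch_size

# One negative log probability per batch element: shape [batch_size].
per_example = tf.nn.ctc_loss(labels, logits, label_length, logit_length,
                             blank_index=0)

# Reduce to a scalar for optimization (the mean is an illustrative choice).
batch_loss = tf.reduce_mean(per_example)

# exp(-loss) is the probability assigned to each label sequence.
sequence_prob = tf.exp(-per_example)

print(per_example.shape, batch_loss.shape)
```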
Raises
ValueError Argument blank_index must be provided when labels is a SparseTensor.
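A short sketch of the error condition: passing a SparseTensor for labels while leaving blank_index at its default of None. The tensor values and names here are made up for illustration; the sparse labels are built with tf.sparse.from_dense, which drops zero entries, so 0 acts purely as padding in this example.

```python
import tensorflow as tf

num_frames, batch_size, num_labels = 6, 2, 4
logits = tf.random.normal([num_frames, batch_size, num_labels])

# Zero entries are treated as padding and dropped by from_dense.
dense_labels = tf.constant([[1, 2, 0], [3, 0, 0]], dtype=tf.int32)
sparse_labels = tf.sparse.from_dense(dense_labels)

raised = False
try:
    # blank_index is omitted, so it defaults to None.
    tf.nn.ctc_loss(labels=sparse_labels,
                   logits=logits,
                   label_length=None,
                   logit_length=[num_frames] * batch_size)
except ValueError as err:
    raised = True
    print(err)
```

Supplying an explicit blank_index avoids the error; which value is appropriate depends on how the label classes are laid out in logits.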
References
Connectionist Temporal Classification - Labeling Unsegmented Sequence Data with Recurrent Neural Networks: Graves et al., 2006
https://en.wikipedia.org/wiki/Connectionist_temporal_classification