tft.bag_of_words | TFX | TensorFlow
tft.bag_of_words
Computes a bag of "words" based on the specified ngram configuration.
tft.bag_of_words(
    tokens: tf.SparseTensor,
    ngram_range: Tuple[int, int],
    separator: str,
    name: Optional[str] = None
) -> tf.SparseTensor
A light wrapper around tft.ngrams. It first computes the ngrams, then converts the ngram representation (list semantics) into a bag of words (set semantics) for each row, so that each row holds the set of unique ngrams present in the corresponding input record.
See tft.ngrams for more information.
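The difference between list semantics and set semantics can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper names `ngrams_for_row` and `bag_of_words_for_row` are hypothetical, rows are plain Python lists rather than a SparseTensor, and the order in which ngrams are emitted is a choice made here, not a claim about tft.ngrams.

```python
from typing import List, Tuple


def ngrams_for_row(tokens: List[str],
                   ngram_range: Tuple[int, int],
                   separator: str) -> List[str]:
    """List semantics: every ngram occurrence is kept, duplicates included."""
    min_n, max_n = ngram_range  # both ends are inclusive
    out = []
    for start in range(len(tokens)):
        for n in range(min_n, max_n + 1):
            if start + n <= len(tokens):
                out.append(separator.join(tokens[start:start + n]))
    return out


def bag_of_words_for_row(tokens: List[str],
                         ngram_range: Tuple[int, int],
                         separator: str) -> List[str]:
    """Set semantics: each distinct ngram appears at most once per row."""
    # dict.fromkeys drops duplicates while keeping first-seen order. The real
    # op makes no ordering promise, so treat the order as unspecified.
    return list(dict.fromkeys(ngrams_for_row(tokens, ngram_range, separator)))


# Hypothetical input: two records, already tokenized.
rows = [["a", "b", "a", "b"], ["snake", "snake"]]
ngram_lists = [ngrams_for_row(r, (1, 2), " ") for r in rows]
bags = [bag_of_words_for_row(r, (1, 2), " ") for r in rows]
```

In this sketch the first record produces seven ngram occurrences but only four distinct ngrams, which is the reduction the bag-of-words step performs per row.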
Args | Description |
---|---|
tokens | A two-dimensional SparseTensor of dtype tf.string containing the tokens used to construct the bag of words. |
ngram_range | A pair of integers giving the inclusive range of ngram sizes to compute; for example, (1, 2) covers unigrams and bigrams. |
separator | A string inserted between tokens when ngrams are constructed. |
name | (Optional) A name for this operation. |
Returns |
---|
A SparseTensor containing the unique set of ngrams from each row of the input. Note: the original order of the ngrams may not be preserved. |
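Because rows can hold different numbers of unique ngrams, a sparse layout is a natural fit for the result. The following pure-Python sketch shows how per-row lists of unique ngrams would map onto the three components of a tf.SparseTensor (indices, values, dense_shape). The helper `rows_to_coo` is hypothetical, and the column positions and dense shape it produces are assumptions for illustration, not a statement of what the library emits.

```python
from typing import List, Tuple


def rows_to_coo(
    rows: List[List[str]],
) -> Tuple[List[List[int]], List[str], List[int]]:
    """Flatten ragged rows into (indices, values, dense_shape)."""
    indices, values = [], []
    for r, row in enumerate(rows):
        for c, ngram in enumerate(row):
            indices.append([r, c])  # [row, position within row]
            values.append(ngram)
    # Assumed here: the dense width is the length of the longest row.
    width = max((len(row) for row in rows), default=0)
    return indices, values, [len(rows), width]


# Hypothetical, already-deduplicated rows.
unique_rows = [["a", "a b", "b"], ["snake"]]
indices, values, dense_shape = rows_to_coo(unique_rows)
```

Since the original ngram order may not be preserved, downstream code should rely on which values appear in a row rather than on their column positions.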
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License.
Last updated 2024-11-01 UTC.