tf.raw_ops.StringToHashBucketStrong  |  TensorFlow v2.16.1

tf.raw_ops.StringToHashBucketStrong

Converts each string in the input Tensor to its hash modulo the number of buckets.

Compat aliases for migration

See Migration guide for more details.

tf.compat.v1.raw_ops.StringToHashBucketStrong

tf.raw_ops.StringToHashBucketStrong(
    input, num_buckets, key, name=None
)

The hash function is deterministic on the content of the string within the process. It is a keyed hash function: the attribute key defines the key of the hash function and is an array of 2 elements.
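A minimal sketch of the keyed, deterministic behaviour described above. The input strings and the key [1, 2] are illustrative values only; raw ops are called with keyword arguments.

```python
import tensorflow as tf

strings = tf.constant(["Hello", "TF", "Hello"])

# Raw ops accept keyword arguments only.
buckets_a = tf.raw_ops.StringToHashBucketStrong(
    input=strings, num_buckets=3, key=[1, 2]
)
buckets_b = tf.raw_ops.StringToHashBucketStrong(
    input=strings, num_buckets=3, key=[1, 2]
)

# Same key and same content give the same buckets within a process.
same_across_calls = bool(tf.reduce_all(buckets_a == buckets_b))
# Equal strings land in the same bucket (elements 0 and 2 are both "Hello").
same_for_equal_strings = int(buckets_a[0]) == int(buckets_a[2])
print(buckets_a.numpy(), same_across_calls, same_for_equal_strings)
```

Every returned value lies in the range [0, num_buckets).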

A strong hash is important when inputs may be malicious, e.g. URLs with additional components. Adversaries could try to make their inputs hash to the same bucket for a denial-of-service attack or to skew the results. A strong hash can be used to make it difficult to find inputs with a skewed hash value distribution over buckets. This requires that the hash function is seeded by a high-entropy (random) "key" unknown to the adversary.
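The idea of seeding with a secret, high-entropy key can be sketched without TensorFlow. In the sketch below, make_key and keyed_bucket are hypothetical helpers, and BLAKE2b from the Python standard library is a stand-in keyed hash, not the hash function this op uses. Limiting each key element to 63 bits is a cautious assumption that keeps the values inside the signed 64-bit range.

```python
import hashlib
import secrets
import struct


def make_key():
    """Return two random integers usable as a two-element hash key."""
    # Assumption: 63 bits per element keeps each value inside the signed
    # 64-bit range, a cautious choice for an integer-list attribute.
    return [secrets.randbits(63), secrets.randbits(63)]


def keyed_bucket(text, num_buckets, key):
    """Stand-in keyed hash (BLAKE2b), not TensorFlow's hash function."""
    key_bytes = struct.pack("<QQ", key[0], key[1])
    digest = hashlib.blake2b(
        text.encode("utf-8"), digest_size=8, key=key_bytes
    ).digest()
    return int.from_bytes(digest, "little") % num_buckets


secret_key = make_key()
urls = ["http://a.example/x", "http://a.example/x?pad=1", "http://b.example/"]
buckets = [keyed_bucket(u, 16, secret_key) for u in urls]
print(buckets)
```

Without knowing secret_key, an adversary cannot predict which bucket a given URL maps to, which is what makes deliberately colliding inputs hard to construct. A list produced this way could then be supplied as the key argument of the op.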

The additional robustness comes at a cost of roughly 4x higher compute time than tf.strings.to_hash_bucket_fast.
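The cost difference can be checked on a given machine with a rough timing sketch like the one below. The input size, bucket count, repeat count, and key are arbitrary assumptions, and the measured ratio will vary with hardware and string lengths, so no expected figure is given here.

```python
import timeit

import tensorflow as tf

# Synthetic input; real workloads may behave differently.
strings = tf.constant([f"item-{i}" for i in range(10000)])
num_buckets = 1000
key = [1, 2]  # Illustrative key only; use a random secret key in practice.


def run_fast():
    return tf.strings.to_hash_bucket_fast(strings, num_buckets)


def run_strong():
    return tf.strings.to_hash_bucket_strong(strings, num_buckets, key)


# Warm up once so one-time setup cost is not included in the timing.
fast_result = run_fast()
strong_result = run_strong()

fast_time = timeit.timeit(run_fast, number=20)
strong_time = timeit.timeit(run_strong, number=20)
print(f"fast: {fast_time:.4f}s  strong: {strong_time:.4f}s")
```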

Examples:

>>> tf.strings.to_hash_bucket_strong(["Hello", "TF"], 3, [1, 2]).numpy()
array([2, 0])

Args
input: A Tensor of type string. The strings to assign a hash bucket.
num_buckets: An int that is >= 1. The number of buckets.
key: A list of ints. The key used to seed the hash function, passed as a list of two uint64 elements.
name: A name for the operation (optional).

Returns
A Tensor of type int64.
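
A short sketch of the return value, using an illustrative 2x2 input and key. The result has the same shape as the input, and with num_buckets=1 every element must map to bucket 0.

```python
import tensorflow as tf

# A 2x2 string tensor; with a single bucket every hash maps to bucket 0.
words = tf.constant([["a", "b"], ["c", "d"]])
buckets = tf.raw_ops.StringToHashBucketStrong(
    input=words, num_buckets=1, key=[1, 2]
)

print(buckets.dtype)             # <dtype: 'int64'>
print(buckets.shape)             # (2, 2)
print(buckets.numpy().tolist())  # [[0, 0], [0, 0]]
```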