tft.get_num_buckets_for_transformed_feature | TFX | TensorFlow (original) (raw)
tft.get_num_buckets_for_transformed_feature
Stay organized with collections Save and categorize content based on your preferences.
Provides the number of buckets for a transformed feature if annotated.
tft.get_num_buckets_for_transformed_feature(
transformed_feature: common_types.TensorType
) -> tf.Tensor
This for example can be used for the direct output of tft.bucketize,tft.apply_buckets, tft.compute_and_apply_vocabulary,tft.apply_vocabulary. These methods annotate the transformed feature with additional information. If the given transformed_feature
isn't annotated, this method will fail.
Example:
def preprocessing_fn(inputs):
bucketized = tft.bucketize(inputs['x'], num_buckets=3)
integerized = tft.compute_and_apply_vocabulary(inputs['x'])
zeros = tf.zeros_like(inputs['x'], tf.int64)
return {
'bucketized': bucketized,
'bucketized_num_buckets': (
zeros + tft.get_num_buckets_for_transformed_feature(bucketized)),
'integerized': integerized,
'integerized_num_buckets': (
zeros + tft.get_num_buckets_for_transformed_feature(integerized)),
}
raw_data = [dict(x=3),dict(x=23)]
feature_spec = dict(x=tf.io.FixedLenFeature([], tf.int64))
raw_data_metadata = tft.DatasetMetadata.from_feature_spec(feature_spec)
with tft_beam.Context(temp_dir=tempfile.mkdtemp()):
transformed_dataset, transform_fn = (
(raw_data, raw_data_metadata)
| tft_beam.AnalyzeAndTransformDataset(preprocessing_fn))
transformed_data, transformed_metadata = transformed_dataset
transformed_data
[{'bucketized': 1, 'bucketized_num_buckets': 3,
'integerized': 0, 'integerized_num_buckets': 2},
{'bucketized': 2, 'bucketized_num_buckets': 3,
'integerized': 1, 'integerized_num_buckets': 2}]
Args | |
---|---|
transformed_feature | A Tensor or SparseTensor which is the direct output of tft.bucketize, tft.apply_buckets,tft.compute_and_apply_vocabulary or tft.apply_vocabulary. |
Raises | |
---|---|
ValueError | If the given tensor has not been annotated a the number of buckets. |
Returns |
---|
A Tensor with the number of buckets for the given transformed_feature. |