tfr.data.parse_from_example_in_example | TensorFlow Ranking
Parses an ExampleInExample batch to a feature map.
tfr.data.parse_from_example_in_example(
    serialized,
    list_size=None,
    context_feature_spec=None,
    example_feature_spec=None,
    size_feature_name=None,
    mask_feature_name=None,
    shuffle_examples=False,
    seed=None
)
An ExampleInExample is a tf.train.Example that has two fields:
- serialized_context is a scalar of bytes. The value is a serialized tf.train.Example that contains context features.
- serialized_examples is a repeated field of bytes. The value is a list of serialized tf.train.Example protos, each representing an example that contains example features.
For example:
serialized_context_string = Serialize({
  features {
    feature {
      key: "query_length"
      value { int64_list { value: 3 } }
    }
  }
})
serialized_examples_string = [
  Serialize({
    features {
      feature {
        key: "unigrams"
        value { bytes_list { value: "tensorflow" } }
      }
      feature {
        key: "utility"
        value { float_list { value: 0.0 } }
      }
    }
  }),
  Serialize({
    features {
      feature {
        key: "unigrams"
        value { bytes_list { value: ["learning", "to", "rank"] } }
      }
      feature {
        key: "utility"
        value { float_list { value: 1.0 } }
      }
    }
  })
]
serialized_context_string_2 = Serialize({
  features {
    feature {
      key: "query_length"
      value { int64_list { value: 2 } }
    }
  }
})
serialized_examples_string_2 = [
  Serialize({
    features {
      feature {
        key: "unigrams"
        value { bytes_list { value: "gbdt" } }
      }
      feature {
        key: "utility"
        value { float_list { value: 0.0 } }
      }
    }
  }),
  Serialize({
    features {
      feature {
        key: "unigrams"
        value { bytes_list { value: ["neural", "network"] } }
      }
      feature {
        key: "utility"
        value { float_list { value: 1.0 } }
      }
    }
  })
]
serialized = [
  {
    serialized_context: serialized_context_string,
    serialized_examples: serialized_examples_string,
  },
  {
    serialized_context: serialized_context_string_2,
    serialized_examples: serialized_examples_string_2,
  },
]
We can parse this batch with the following arguments:
context_feature_spec: {
  "query_length": tf.io.FixedLenFeature([1], tf.int64),
}
example_feature_spec: {
  "unigrams": tf.io.VarLenFeature(tf.string),
  "utility": tf.io.FixedLenFeature([1], tf.float32),
}
And the expected output is:
{
  "unigrams": SparseTensor(
    indices=array([[0, 0, 0], [0, 1, 0], [0, 1, 1], [0, 1, 2], [1, 0, 0],
                   [1, 1, 0], [1, 1, 1]]),
    values=["tensorflow", "learning", "to", "rank", "gbdt", "neural",
            "network"],
    dense_shape=array([2, 2, 3])),
  "utility": [[[ 0.], [ 1.]], [[ 0.], [ 1.]]],
  "query_length": [[3], [2]],
}
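
The layout of the expected output above can be reproduced by hand. The following is a minimal pure-Python sketch, not the library's implementation: it models the already-deserialized batch as plain dicts (the `batch` structure and the `to_feature_map` helper are hypothetical names introduced here) and shows how per-example features gain a list dimension while context features keep only the batch dimension. It assumes every list in the batch has the same number of examples, as in the example above, so no padding is needed.

```python
def to_feature_map(batch):
    """Stack a batch of {"context", "examples"} dicts into the documented layout."""
    indices, values = [], []
    utility, query_length = [], []
    max_list_size = max_tokens = 0
    for b, item in enumerate(batch):
        # Context features keep shape [batch_size, 1].
        query_length.append([item["context"]["query_length"]])
        # Dense per-example features get shape [batch_size, list_size, 1].
        utility.append([[ex["utility"]] for ex in item["examples"]])
        max_list_size = max(max_list_size, len(item["examples"]))
        for e, ex in enumerate(item["examples"]):
            max_tokens = max(max_tokens, len(ex["unigrams"]))
            # Sparse features record one [batch, example, token] index per value.
            for t, token in enumerate(ex["unigrams"]):
                indices.append([b, e, t])
                values.append(token)
    return {
        "unigrams": {
            "indices": indices,
            "values": values,
            "dense_shape": [len(batch), max_list_size, max_tokens],
        },
        "utility": utility,
        "query_length": query_length,
    }


# The two ExampleInExample entries from the example, as plain Python data.
batch = [
    {"context": {"query_length": 3},
     "examples": [{"unigrams": ["tensorflow"], "utility": 0.0},
                  {"unigrams": ["learning", "to", "rank"], "utility": 1.0}]},
    {"context": {"query_length": 2},
     "examples": [{"unigrams": ["gbdt"], "utility": 0.0},
                  {"unigrams": ["neural", "network"], "utility": 1.0}]},
]
features = to_feature_map(batch)
```

The third entry of `dense_shape` is the longest token list in the batch (3, from "learning to rank"), which is why shorter examples simply have fewer index rows.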
Args | |
---|---|
serialized | (Tensor) 1-D Tensor in which each entry is a serialized ExampleInExample proto that contains the context and the example list. |
list_size | (int) The number of examples for each list. If specified, truncation or padding is applied to make 2nd dim of output Tensors aligned to list_size. Otherwise, the 2nd dim of the output Tensors is dynamic. |
context_feature_spec | (dict) A mapping from feature keys to FixedLenFeature or VarLenFeature values for the context in the ExampleInExample proto. |
example_feature_spec | (dict) A mapping from feature keys to FixedLenFeature or VarLenFeature values for the examples in the ExampleInExample proto. |
size_feature_name | (str) Name of feature for example list sizes. Populates the feature dictionary with a tf.int32 Tensor of shape [batch_size] for this feature name. If None, which is default, this feature is not generated. |
mask_feature_name | (str) Name of feature for example list masks. Populates the feature dictionary with a tf.bool Tensor of shape [batch_size, list_size] for this feature name. If None, which is default, this feature is not generated. |
shuffle_examples | (bool) A boolean to indicate whether examples within a list are shuffled before the list is trimmed down to list_size elements (when list has more than list_size elements). |
seed | (int) A seed passed on to random_ops.uniform() to shuffle examples. |
Returns |
---|
A mapping from feature keys to Tensor or SparseTensor. |
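
The interaction of list_size, size_feature_name, mask_feature_name, and shuffle_examples described in the Args table can be illustrated for a single list. This is a pure-Python sketch of the documented behavior only, not the library's code: `pad_or_truncate` and `pad_value` are hypothetical names, the real function operates on batched Tensors, it shuffles with TensorFlow random ops (so the actual order for a given seed will differ), and reporting the size as clipped to list_size is an assumption of this sketch.

```python
import random


def pad_or_truncate(examples, list_size, pad_value=None, shuffle=False, seed=None):
    """Return (fixed_length_list, mask, size) for one list of examples."""
    examples = list(examples)
    if shuffle:
        # Shuffle before trimming, so truncation drops a random subset
        # instead of always dropping the tail of the list.
        random.Random(seed).shuffle(examples)
    kept = examples[:list_size]          # truncate lists longer than list_size
    size = len(kept)                     # assumption: size is clipped to list_size
    padded = kept + [pad_value] * (list_size - size)  # pad shorter lists
    mask = [i < size for i in range(list_size)]       # True marks a real example
    return padded, mask, size


# A list longer than list_size is truncated.
truncated, truncated_mask, truncated_size = pad_or_truncate(["d1", "d2", "d3"], list_size=2)

# A list shorter than list_size is padded, and the mask marks the padding.
padded, padded_mask, padded_size = pad_or_truncate(["d1"], list_size=3, pad_value="")

# With shuffling, the same seed gives the same selection.
first, _, _ = pad_or_truncate(["d1", "d2", "d3", "d4"], 2, shuffle=True, seed=7)
second, _, _ = pad_or_truncate(["d1", "d2", "d3", "d4"], 2, shuffle=True, seed=7)
```

Downstream losses and metrics can use the mask (or the size) to ignore padded positions, which is the usual reason to request these extra features.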