tf.tpu.experimental.embedding.AdagradMomentum

Stay organized with collections Save and categorize content based on your preferences.

Optimization parameters for Adagrad + Momentum with TPU embeddings.

View aliases

Compat aliases for migration

tf.compat.v1.tpu.experimental.embedding.AdagradMomentum

tf.tpu.experimental.embedding.AdagradMomentum(
    learning_rate: Union[float, Callable[[], float]] = 0.001,
    momentum: float = 0.0,
    use_nesterov: bool = False,
    exponent: float = 2,
    beta2: float = 1,
    epsilon: float = 1e-10,
    use_gradient_accumulation: bool = True,
    clip_weight_min: Optional[float] = None,
    clip_weight_max: Optional[float] = None,
    weight_decay_factor: Optional[float] = None,
    multiply_weight_decay_factor_by_learning_rate: Optional[bool] = None,
    slot_variable_creation_fn: Optional[SlotVarCreationFnType] = None,
    clipvalue: Optional[ClipValueType] = None,
    low_dimensional_packing_status: bool = False
)

Pass this to tf.tpu.experimental.embedding.TPUEmbedding via the optimizerargument to set the global optimizer and its parameters:

embedding = tf.tpu.experimental.embedding.TPUEmbedding(
    ...
    optimizer=tf.tpu.experimental.embedding.AdagradMomentum(0.1))

This can also be used in a tf.tpu.experimental.embedding.TableConfig as the optimizer parameter to set a table specific optimizer. This will override the optimizer and parameters for global embedding optimizer defined above:

table_one = tf.tpu.experimental.embedding.TableConfig(
    vocabulary_size=...,
    dim=...,
    optimizer=tf.tpu.experimental.embedding.AdagradMomentum(0.2))
table_two = tf.tpu.experimental.embedding.TableConfig(
    vocabulary_size=...,
    dim=...)

feature_config = (
    tf.tpu.experimental.embedding.FeatureConfig(
        table=table_one),
    tf.tpu.experimental.embedding.FeatureConfig(
        table=table_two))

embedding = tf.tpu.experimental.embedding.TPUEmbedding(
    feature_config=feature_config,
    batch_size=...
    optimizer=tf.tpu.experimental.embedding.AdagradMomentum(0.1))

In the above example, the first feature will be looked up in a table that has a learning rate of 0.2 while the second feature will be looked up in a table that has a learning rate of 0.1.

See 'tensorflow/core/protobuf/tpu/optimization_parameters.proto' for a complete description of these parameters and their impacts on the optimizer algorithm.

Args
learning_rate	The learning rate. It should be a floating point value or a callable taking no arguments for a dynamic learning rate.
momentum	Moving average parameter for the momentum accumulator.
use_nesterov	Whether to use the Nesterov variant of momentum. See Sutskever et al., 2013.
exponent	Exponent for the Adagrad accumulator.
beta2	Moving average parameter for the Adagrad accumulator.
epsilon	initial accumulator for Adagrad accumulator.
use_gradient_accumulation	setting this to False makes embedding gradients calculation less accurate but faster.
clip_weight_min	the minimum value to clip by; None means -infinity.
clip_weight_max	the maximum value to clip by; None means +infinity.
weight_decay_factor	amount of weight decay to apply; None means that the weights are not decayed.
multiply_weight_decay_factor_by_learning_rate	if true,weight_decay_factor is multiplied by the current learning rate.
slot_variable_creation_fn	If you wish do directly control the creation of the slot variables, set this to a callable taking three parameters: a table variable, a list of slot names to create for it, and a list of initializers. This function should return a dict with the slot names as keys and the created variables as values with types matching the table variable. When set to None (the default), uses the built-in variable creation.
clipvalue	Controls clipping of the gradient. Set to either a single positive scalar value to get clipping or a tuple of scalar values (min, max) to set a separate maximum or minimum. If one of the two entries is None, then there will be no clipping that direction.
low_dimensional_packing_status	Status of the low-dimensional embedding packing optimization controls whether to optimize the packing of 1-dimensional, 2-dimensional, and 4-dimensional embedding tables in memory.

Methods

`eq`

View source

__eq__(
    other: Any
) -> Union[Any, bool]

Return self==value.

tf.tpu.experimental.embedding.AdagradMomentum | TensorFlow v2.16.1 (original) (raw)

View aliases

Methods

__eq__

`eq`