tf.distribute.HierarchicalCopyAllReduce | TensorFlow v2.16.1
tf.distribute.HierarchicalCopyAllReduce
Hierarchical copy all-reduce implementation of CrossDeviceOps.
Inherits From: CrossDeviceOps
Compat aliases for migration
See Migration guide for more details.
tf.compat.v1.distribute.HierarchicalCopyAllReduce
tf.distribute.HierarchicalCopyAllReduce(
num_packs=1
)
Used in the notebooks
Used in the guide |
---|
Distributed training with TensorFlow |
This implementation reduces values to one GPU along the edges of a device hierarchy, then broadcasts the result back to each GPU along the same path. For the batch API, tensors are repacked or aggregated for more efficient cross-device transport.
This reduction was created for the Nvidia DGX-1 and assumes that GPUs are connected as they are on a DGX-1 machine. With a different GPU interconnect, it is likely to be slower than tf.distribute.ReductionToOneDevice.
For reductions that are not all-reduce, it falls back to tf.distribute.ReductionToOneDevice.
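The reduce-then-broadcast pattern described above can be illustrated with a toy, pure-Python sketch. It is not TensorFlow's implementation: the function name `hierarchical_all_reduce`, the `group_size` parameter, and the two-level grouping are assumptions made only for illustration, and plain floats stand in for per-GPU tensors.

```python
from typing import List


def hierarchical_all_reduce(values: List[float], group_size: int = 4) -> List[float]:
    """Toy sum all-reduce over a two-level hierarchy (hypothetical helper).

    values[i] is the value held by device i. Devices are split into
    contiguous groups of `group_size`; the grouping is an assumption,
    not the real DGX-1 topology.
    """
    if group_size < 1:
        raise ValueError("group_size must be positive")

    # Stage 1: every group reduces its members' values to a group leader.
    groups = [values[i:i + group_size] for i in range(0, len(values), group_size)]
    leader_sums = [sum(group) for group in groups]

    # Stage 2: the leaders reduce to a single root device.
    total = sum(leader_sums)

    # Stage 3: broadcast back along the same path (root, leaders, members),
    # so every device ends up with the fully reduced value.
    return [total for group in groups for _ in group]


# Eight devices, each holding one value.
result = hierarchical_all_reduce([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
```

Every device ends with the same reduced value, which is what distinguishes an all-reduce from a reduction to a single destination.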
Here is how you can use HierarchicalCopyAllReduce in tf.distribute.MirroredStrategy:
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy(
    cross_device_ops=tf.distribute.HierarchicalCopyAllReduce())
Args | |
---|---|
num_packs | a non-negative integer. The number of packs to split values into. If zero, no packing will be done. |
Raises |
---|
ValueError if num_packs is negative. |
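The documented contract of num_packs (non-negative, zero disables packing, negative raises ValueError) can be mirrored in a small pure-Python sketch. This is only an illustration of the idea of grouping values before transport: `split_into_packs` is a hypothetical helper, and the contiguous, evenly sized chunking is an assumption rather than the strategy TensorFlow uses internally.

```python
from typing import List, Sequence


def split_into_packs(values: Sequence[float], num_packs: int) -> List[List[float]]:
    """Group values into at most `num_packs` contiguous packs (toy helper)."""
    if num_packs < 0:
        # Mirrors the documented behavior: negative num_packs is an error.
        raise ValueError("num_packs must be a non-negative integer")
    if num_packs == 0:
        # Zero means no packing: each value is transported on its own.
        return [[v] for v in values]
    if not values:
        return []
    # Ceiling division so that no more than num_packs packs are produced.
    chunk = -(-len(values) // num_packs)
    return [list(values[i:i + chunk]) for i in range(0, len(values), chunk)]


grads = [1.0, 2.0, 3.0, 4.0, 5.0]
no_packing = split_into_packs(grads, 0)
one_pack = split_into_packs(grads, 1)
two_packs = split_into_packs(grads, 2)
```

Fewer, larger packs mean fewer cross-device transfers, at the cost of waiting until all values in a pack are ready.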
Methods
batch_reduce
batch_reduce(
reduce_op, value_destination_pairs, options=None
)
Reduce values to destinations in batches.
See tf.distribute.StrategyExtended.batch_reduce_to. This can only be called in the cross-replica context.
Args | |
---|---|
reduce_op | a tf.distribute.ReduceOp specifying how values should be combined. |
value_destination_pairs | a sequence of (value, destinations) pairs. See tf.distribute.CrossDeviceOps.reduce for descriptions. |
options | a tf.distribute.experimental.CommunicationOptions. See tf.distribute.experimental.CommunicationOptions for details. |
Returns |
---|
A list of tf.Tensor or tf.distribute.DistributedValues, one per pair in value_destination_pairs. |
Raises | |
---|---|
ValueError | if value_destination_pairs is not an iterable of tuples of tf.distribute.DistributedValues and destinations. |
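The shape of the batch_reduce contract (one result per (value, destinations) pair, and a ValueError for malformed pairs) can be sketched without TensorFlow. Everything here is a stand-in: `ToyReduceOp` is a hypothetical substitute for tf.distribute.ReduceOp, lists of floats stand in for per-replica values, and device names are plain strings.

```python
from enum import Enum
from typing import Dict, List, Sequence, Tuple


class ToyReduceOp(Enum):
    """Hypothetical stand-in for tf.distribute.ReduceOp."""
    SUM = "SUM"
    MEAN = "MEAN"


def toy_batch_reduce(
    reduce_op: ToyReduceOp,
    value_destination_pairs: Sequence[Tuple[List[float], List[str]]],
) -> List[Dict[str, float]]:
    """Reduce each pair's per-replica values and place the result on its destinations."""
    results = []
    for pair in value_destination_pairs:
        if not isinstance(pair, tuple) or len(pair) != 2:
            raise ValueError("each entry must be a (value, destinations) tuple")
        per_replica_values, destinations = pair
        combined = sum(per_replica_values)
        if reduce_op is ToyReduceOp.MEAN:
            combined /= len(per_replica_values)
        # One result per pair, replicated on every destination device.
        results.append({device: combined for device in destinations})
    return results


pairs = [
    ([1.0, 3.0], ["GPU:0", "GPU:1"]),
    ([2.0, 4.0, 6.0], ["GPU:0"]),
]
summed = toy_batch_reduce(ToyReduceOp.SUM, pairs)
averaged = toy_batch_reduce(ToyReduceOp.MEAN, pairs)
```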
broadcast
broadcast(
tensor, destinations
)
Broadcast tensor to destinations.
This can only be called in the cross-replica context.
Args | |
---|---|
tensor | a tf.Tensor-like object. The value to broadcast. |
destinations | a tf.distribute.DistributedValues, a tf.Variable, a tf.Tensor-like object, or a device string. It specifies the devices to broadcast to. Note that if it's a tf.Variable, the value is broadcast to the devices of that variable; this method doesn't update the variable. |
Returns |
---|
A tf.Tensor or tf.distribute.DistributedValues. |
reduce
reduce(
reduce_op, per_replica_value, destinations, options=None
)
Reduce per_replica_value to destinations.
See tf.distribute.StrategyExtended.reduce_to. This can only be called in the cross-replica context.
Args | |
---|---|
reduce_op | a tf.distribute.ReduceOp specifying how values should be combined. |
per_replica_value | a tf.distribute.DistributedValues, or a tf.Tensor-like object. |
destinations | a tf.distribute.DistributedValues, a tf.Variable, a tf.Tensor-like object, or a device string. It specifies the devices to reduce to. To perform an all-reduce, pass the same value to per_replica_value and destinations. Note that if it's a tf.Variable, the value is reduced to the devices of that variable, and this method doesn't update the variable. |
options | a tf.distribute.experimental.CommunicationOptions. See tf.distribute.experimental.CommunicationOptions for details. |
Returns |
---|
A tf.Tensor or tf.distribute.DistributedValues. |
Raises | |
---|---|
ValueError | if per_replica_value can't be converted to a tf.distribute.DistributedValues or if destinations is not a string, tf.Variable or tf.distribute.DistributedValues. |