tf.experimental.dtensor.initialize_accelerator_system  |  TensorFlow v2.16.1

Initializes accelerators and communication fabrics for DTensor.

Main aliases: tf.experimental.dtensor.initialize_multi_client, tf.experimental.dtensor.initialize_tpu_system

tf.experimental.dtensor.initialize_accelerator_system(
    device_type: Optional[str] = None,
    enable_coordination_service: Optional[bool] = True,
    num_logical_cpu_devices: Optional[int] = None,
    experimental_reset_context: Optional[bool] = False,
    experimental_enable_megcore: Optional[bool] = False
) -> str

DTensor configures TensorFlow to run in either local mode or multi-client mode. In local mode, a mesh can only use devices attached to the current process; in multi-client mode, a mesh can span devices across multiple clients.

If DTENSOR_JOBS is non-empty, DTensor configures TensorFlow to run in multi-client mode using the distributed runtime. In multi-client mode, devices on different clients can communicate with each other.

Environment variables such as DTENSOR_JOBS, DTENSOR_CLIENT_ID, and DTENSOR_JOB_NAME control the behavior of this function.
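A minimal local-mode sketch (not taken from this reference page): it assumes a single process, calls initialize_accelerator_system before any other TensorFlow API touches the context, and then builds a 1-D mesh over the initialized devices with dtensor.create_mesh. The "batch" dimension name is an arbitrary choice for illustration.

from tensorflow.experimental import dtensor

# Initialize first; returns the device type that was actually set up
# ("CPU", "GPU", or "TPU").
device_type = dtensor.initialize_accelerator_system()

# Build a 1-D mesh over all local devices of that type. The dimension
# name "batch" is only used later in Layout specifications.
mesh = dtensor.create_mesh(
    [("batch", dtensor.num_local_devices(device_type))],
    device_type=device_type,
)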

Args
device_type Type of accelerator to use; can be "CPU", "GPU", or "TPU". If None, uses tf.experimental.dtensor.preferred_device_type().
enable_coordination_service If True, enables the distributed coordination service so that workers know each other's devices when there is more than one client.
num_logical_cpu_devices The number of logical CPU devices per DTensor client. Defaults to the current number of logical CPUs (dtensor.num_local_devices("CPU")) when device_type is CPU; otherwise set automatically to match the number of local GPU/TPU devices.
experimental_reset_context Resets the TensorFlow context. The behavior of existing TensorFlow objects (e.g. Tensors) is undefined. Set this to True as an escape hatch if there is no clear way to refactor your code to call initialize_accelerator_system() before calling TensorFlow APIs that initialize the context.
experimental_enable_megcore Optionally enables megacore in the backend.
Returns
device_type The type of accelerator that was initialized.
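As an illustrative sketch (not part of the reference), num_logical_cpu_devices can be used to split the host CPU into several logical DTensor devices for local testing; the count of 8 below is arbitrary.

from tensorflow.experimental import dtensor

# Carve the host CPU into 8 logical DTensor devices (count is arbitrary).
device_type = dtensor.initialize_accelerator_system(
    device_type="CPU",
    num_logical_cpu_devices=8,
)

print(device_type)                       # "CPU"
print(dtensor.num_local_devices("CPU"))  # 8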