tf.distribute.Server | TensorFlow v2.16.1 (original) (raw)
tf.distribute.Server
Stay organized with collections Save and categorize content based on your preferences.
An in-process TensorFlow server, for use in distributed training.
View aliases
Compat aliases for migration
SeeMigration guide for more details.
tf.compat.v1.distribute.Server, tf.compat.v1.train.Server
tf.distribute.Server(
server_or_cluster_def,
job_name=None,
task_index=None,
protocol=None,
config=None,
start=True
)
Used in the notebooks
Used in the guide | Used in the tutorials |
---|---|
Migrate multi-worker CPU/GPU training | Parameter server training with ParameterServerStrategy |
A tf.distribute.Server instance encapsulates a set of devices and atf.compat.v1.Session target that can participate in distributed training. A server belongs to a cluster (specified by a tf.train.ClusterSpec), and corresponds to a particular task in a named job. The server can communicate with any other server in the same cluster.
Args | |
---|---|
server_or_cluster_def | A tf.train.ServerDef or tf.train.ClusterDefprotocol buffer, or a tf.train.ClusterSpec object, describing the server to be created and/or the cluster of which it is a member. |
job_name | (Optional.) Specifies the name of the job of which the server is a member. Defaults to the value in server_or_cluster_def, if specified. |
task_index | (Optional.) Specifies the task index of the server in its job. Defaults to the value in server_or_cluster_def, if specified. Otherwise defaults to 0 if the server's job has only one task. |
protocol | (Optional.) Specifies the protocol to be used by the server. Acceptable values include "grpc", "grpc+verbs". Defaults to the value in server_or_cluster_def, if specified. Otherwise defaults to"grpc". |
config | (Options.) A tf.compat.v1.ConfigProto that specifies default configuration options for all sessions that run on this server. |
start | (Optional.) Boolean, indicating whether to start the server after creating it. Defaults to True. |
Raises | |
---|---|
tf.errors.OpError | Or one of its subclasses if an error occurs while creating the TensorFlow server. |
Attributes | |
---|---|
server_def | Returns the tf.train.ServerDef for this server. |
target | Returns the target for a tf.compat.v1.Session to connect to this server.To create atf.compat.v1.Session that connects to this server, use the following snippet: server = tf.distribute.Server(...) with tf.compat.v1.Session(server.target): # ... |
Methods
create_local_server
@staticmethod
create_local_server( config=None, start=True )
Creates a new single-process cluster running on the local host.
This method is a convenience wrapper for creating atf.distribute.Server with a tf.train.ServerDef that specifies a single-process cluster containing a single task in a job called"local"
.
Args | |
---|---|
config | (Options.) A tf.compat.v1.ConfigProto that specifies default configuration options for all sessions that run on this server. |
start | (Optional.) Boolean, indicating whether to start the server after creating it. Defaults to True. |
Returns |
---|
A local tf.distribute.Server. |
join
join()
Blocks until the server has shut down.
This method currently blocks forever.
Raises | |
---|---|
tf.errors.OpError | Or one of its subclasses if an error occurs while joining the TensorFlow server. |
start
start()
Starts this server.
Raises | |
---|---|
tf.errors.OpError | Or one of its subclasses if an error occurs while starting the TensorFlow server. |