[RFC] Future of gpus/ipus/tpu_cores with respect to devices · Issue #10410 · Lightning-AI/pytorch-lightning

Proposed refactoring or deprecation

Currently we have two ways of specifying devices. Let's take GPUs as an example:

  1. The standard case that we've all grown used to and are mostly aware of.

trainer = Trainer(gpus=2)

  2. Introduced in 1.5, this makes the device specification agnostic to the accelerator type. For example, if you pass devices=2 with accelerator='tpu', we automatically know to use 2 TPU cores (both styles are shown side by side below).

trainer = Trainer(devices=2, accelerator='gpu')
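
For illustration, here are the two styles next to each other (a minimal sketch; it assumes the corresponding hardware is actually available on the machine):

```python
from pytorch_lightning import Trainer

# Device-specific argument: the argument name itself encodes the accelerator.
trainer = Trainer(gpus=2)

# Accelerator-agnostic style introduced in 1.5: the same `devices` argument
# works for any accelerator type.
trainer = Trainer(devices=2, accelerator="gpu")
trainer = Trainer(devices=2, accelerator="tpu")  # 2 TPU cores
trainer = Trainer(devices=2, accelerator="ipu")  # 2 IPUs
```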

Recently, it has come up in #10404 (comment) that we may want to deprecate the device-specific arguments and prevent further device-specific names (such as hpus) from appearing in the Trainer.

Related conversation #9053 (comment)

I see two options:

🚀 We keep both the device-specific arguments (gpus, tpu_cores, ipus) and devices in the Trainer.
👀 We drop gpus, tpu_cores, ipus in the future and rely fully on devices. (This would likely be done in Lightning 2.0 rather than after the usual 2 minor releases.)
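
If we went with the second option, the deprecation path could look roughly like the sketch below. The helper name and warning text here are hypothetical and not Lightning's actual implementation; the point is only that each legacy argument maps cleanly onto a (devices, accelerator) pair:

```python
import warnings


def _map_legacy_device_args(gpus=None, tpu_cores=None, ipus=None):
    """Hypothetical helper: translate a legacy device-specific Trainer argument
    into the accelerator-agnostic (devices, accelerator) pair, warning on use."""
    legacy = {"gpus": ("gpu", gpus), "tpu_cores": ("tpu", tpu_cores), "ipus": ("ipu", ipus)}
    for arg_name, (accelerator, value) in legacy.items():
        if value is not None:
            warnings.warn(
                f"`Trainer({arg_name}={value!r})` is deprecated; use "
                f"`Trainer(devices={value!r}, accelerator='{accelerator}')` instead.",
                DeprecationWarning,
            )
            return value, accelerator
    # No legacy argument passed; resolution is left to `devices`/`accelerator`.
    return None, None


# Example: Trainer(gpus=2) would resolve to devices=2, accelerator='gpu'.
devices, accelerator = _map_legacy_device_args(gpus=2)
```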

cc @kaushikb11 @justusschock @ananthsub @awaelchli