[RFC] Future of gpus/ipus/tpu_cores with respect to devices · Issue #10410 · Lightning-AI/pytorch-lightning
Proposed refactoring or deprecation
Currently we have two ways of specifying devices. Let's take GPUs as an example:
- The standard case that we've all grown used to and are mostly aware of:
  ```python
  trainer = Trainer(gpus=2)
  ```
- Introduced in 1.5, this tries to make the number of devices accelerator-agnostic:
  ```python
  trainer = Trainer(devices=2, accelerator='gpu')
  ```
  If you instead specify `accelerator='tpu'`, we automatically know to use 2 TPU cores (see the side-by-side sketch after this list).
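For concreteness, here is a side-by-side sketch of how the two spellings line up. It assumes the 1.5-era `Trainer` signature and that the requested hardware is actually available (otherwise the constructor raises a misconfiguration error):

```python
from pytorch_lightning import Trainer

# Device-specific argument: the accelerator is implied by the argument name.
trainer = Trainer(gpus=2)

# Accelerator-agnostic spelling (since 1.5): the count and the accelerator
# are passed orthogonally.
trainer = Trainer(devices=2, accelerator="gpu")

# Switching hardware only changes `accelerator`: per the behavior described
# above, this requests 2 TPU cores.
trainer = Trainer(devices=2, accelerator="tpu")
```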
Recently, it has come up in #10404 (comment) that we may want to deprecate these and prevent further device-specific names (such as `hpus`) from appearing in the Trainer.
Related conversation #9053 (comment)
I see two options:
🚀 We keep both the device-specific arguments (`gpus`, `tpu_cores`, `ipus`) and `devices` in the Trainer.
👀 We drop `gpus`, `tpu_cores`, and `ipus` in the future and fully rely on `devices` (see the deprecation sketch below). This would likely be done in Lightning 2.0 instead of after 2 minor releases.
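If the second option is taken, the transition could be smoothed by translating the legacy arguments into the `devices`/`accelerator` pair and emitting a deprecation warning for a few releases. The following is only a minimal sketch of such a shim; `_resolve_devices` and `_LEGACY_DEVICE_ARGS` are hypothetical names for illustration, not part of the actual Lightning codebase:

```python
import warnings

# Hypothetical mapping from legacy Trainer argument to accelerator flavor.
_LEGACY_DEVICE_ARGS = {"gpus": "gpu", "tpu_cores": "tpu", "ipus": "ipu"}

def _resolve_devices(devices=None, accelerator=None, **legacy):
    """Translate legacy device args (gpus/tpu_cores/ipus) into (devices, accelerator)."""
    for name, flavor in _LEGACY_DEVICE_ARGS.items():
        value = legacy.pop(name, None)
        if value is None:
            continue
        warnings.warn(
            f"`Trainer({name}={value!r})` is deprecated; use "
            f"`Trainer(devices={value!r}, accelerator='{flavor}')` instead.",
            DeprecationWarning,
        )
        if devices is not None:
            raise ValueError(f"Cannot pass both `{name}` and `devices`.")
        devices, accelerator = value, flavor
    if legacy:
        raise TypeError(f"Unknown arguments: {sorted(legacy)}")
    return devices, accelerator

# The legacy spellings resolve to the new one (with a DeprecationWarning):
assert _resolve_devices(gpus=2) == (2, "gpu")
assert _resolve_devices(tpu_cores=8) == (8, "tpu")
```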