DefaultSampler — mmengine 0.10.7 documentation
class mmengine.dataset.DefaultSampler(dataset, shuffle=True, seed=None, round_up=True)
The default data sampler for both distributed and non-distributed environments.
It differs from the PyTorch DistributedSampler in the following ways:
- This sampler supports non-distributed environments.
- The round-up behavior is slightly different.
  - If round_up=True, this sampler adds extra samples so that the number of samples is evenly divisible by the world size. This matches the behavior of DistributedSampler with drop_last=False.
  - If round_up=False, this sampler neither removes nor adds samples, whereas DistributedSampler with drop_last=True removes tail samples.
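The round-up behavior described above can be sketched in plain Python. This is a simplified illustration, not mmengine's implementation: the helper name shard_indices is hypothetical, shuffling is left out for readability, and it assumes that padding reuses indices from the start of the list and that each rank takes an interleaved slice.

```python
import math


def shard_indices(num_samples, rank, world_size, round_up=True):
    """Hypothetical helper: return the indices one rank would see."""
    indices = list(range(num_samples))  # no shuffling, for readability
    if round_up:
        # Pad so that the total is a multiple of world_size.
        # Assumption: padding wraps around to the start of the list.
        per_rank = math.ceil(num_samples / world_size)
        total = per_rank * world_size
        repeats = math.ceil(total / num_samples)
        indices = (indices * repeats)[:total]
    # Assumption: ranks take interleaved (strided) slices.
    return indices[rank::world_size]


for round_up in (True, False):
    shards = [shard_indices(10, r, 4, round_up) for r in range(4)]
    print(round_up, shards)
# True [[0, 4, 8], [1, 5, 9], [2, 6, 0], [3, 7, 1]]
# False [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
```

With 10 samples and a world size of 4, round_up=True gives every rank 3 indices by repeating two samples, while round_up=False leaves the last two ranks with 2 indices each and repeats nothing.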
Parameters:
- dataset (Sized) – The dataset.
- shuffle (bool) – Whether to shuffle the dataset. Defaults to True.
- seed (int, optional) – Random seed used to shuffle the sampler if shuffle=True. This number should be identical across all processes in the distributed group. Defaults to None.
- round_up (bool) – Whether to add extra samples to make the number of samples evenly divisible by the world size. Defaults to True.
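The following sketch suggests why seed should be identical across processes. It is a toy model rather than the library's code: rank_shard is a hypothetical helper, it uses Python's random module instead of a PyTorch generator, and it assumes each rank shuffles the full index list and then takes an interleaved slice.

```python
import random


def rank_shard(num_samples, rank, world_size, seed):
    """Hypothetical helper: shuffle with `seed`, then take this rank's slice."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    return indices[rank::world_size]


# Same seed on both ranks: every rank slices the same permutation,
# so the shards are disjoint and together cover the dataset.
same = rank_shard(8, 0, 2, seed=0) + rank_shard(8, 1, 2, seed=0)
print(sorted(same) == list(range(8)))  # True

# Different seeds: each rank slices a different permutation,
# so samples can be duplicated or skipped.
mixed = rank_shard(8, 0, 2, seed=0) + rank_shard(8, 1, 2, seed=1)
print(len(set(mixed)), "unique samples out of", len(mixed))
```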
set_epoch(epoch)
Sets the epoch for this sampler.
When shuffle=True, this ensures all replicas use a different random ordering for each epoch. Otherwise, the next iteration of this sampler will yield the same ordering.
Parameters:
epoch (int) – Epoch number.
Return type:
None
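The effect of setting the epoch can be illustrated with a small stand-in class. ToySampler is hypothetical and is not mmengine's implementation: it assumes the shuffle seed is derived from seed + epoch and uses Python's random module rather than a PyTorch generator.

```python
import random


class ToySampler:
    """Hypothetical stand-in used only to illustrate epoch-dependent shuffling."""

    def __init__(self, num_samples, seed=0):
        self.num_samples = num_samples
        self.seed = seed
        self.epoch = 0

    def set_epoch(self, epoch):
        self.epoch = epoch

    def __iter__(self):
        indices = list(range(self.num_samples))
        # Assumption: the shuffle seed is derived from seed + epoch.
        random.Random(self.seed + self.epoch).shuffle(indices)
        return iter(indices)


sampler = ToySampler(num_samples=10)
first = list(sampler)
repeat = list(sampler)   # epoch not updated: same ordering again
sampler.set_epoch(1)
second = list(sampler)   # epoch updated: a new ordering

print(first == repeat)  # True
print(first == second)
```

In a training loop, this corresponds to setting the epoch on the sampler at the start of each epoch, before iterating over the data loader.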