Flickr8k — Torchvision 0.22 documentation
class torchvision.datasets.Flickr8k(root: Union[str, pathlib.Path], ann_file: str, transform: Optional[Callable] = None, target_transform: Optional[Callable] = None, loader: Callable[[str], Any] = default_loader)
Flickr8k Entities Dataset.
Parameters:
- root (str or pathlib.Path) – Root directory where images are downloaded to.
- ann_file (string) – Path to annotation file.
- transform (callable, optional) – A function/transform that takes in a PIL image or torch.Tensor, depending on the given loader, and returns a transformed version, e.g. transforms.RandomCrop.
- target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
- loader (callable, optional) – A function to load an image given its path. By default, it uses PIL as its image loader, but users can also pass in torchvision.io.decode_image to decode image data into tensors directly.
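As a rough illustration of how these parameters fit together, the sketch below builds the dataset with an image transform and a target_transform. The directory and file paths are placeholders, and `pick_first_caption` and `build_flickr8k` are hypothetical helpers introduced only for this example; the particular transform pipeline is one possible choice, not a recommended default.

```python
from pathlib import Path
from typing import List


def pick_first_caption(captions: List[str]) -> str:
    """Hypothetical target_transform: reduce the caption list to one lowercase string."""
    return captions[0].strip().lower() if captions else ""


def build_flickr8k(root: str, ann_file: str):
    """Construct the dataset; both paths are supplied by the caller."""
    # Imported lazily so the helper above can be used without torchvision installed.
    from torchvision import datasets, transforms

    return datasets.Flickr8k(
        root=root,
        ann_file=ann_file,
        # The default loader returns a PIL image, so PIL-based transforms apply here.
        transform=transforms.Compose([
            transforms.Resize(256),
            transforms.RandomCrop(224),
            transforms.ToTensor(),
        ]),
        target_transform=pick_first_caption,
    )


if __name__ == "__main__":
    # Placeholder locations (assumptions); adjust to wherever the data was unpacked.
    root = Path("data/flickr8k/images")
    ann_file = Path("data/flickr8k/captions.html")
    if root.is_dir() and ann_file.is_file():
        dataset = build_flickr8k(str(root), str(ann_file))
        image, caption = dataset[0]
        print(len(dataset), tuple(image.shape), caption)
```

With this target_transform, each sample's target becomes a single string instead of the full caption list; omit it to keep every caption.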
Special-members:
__getitem__(index: int) → Tuple[Any, Any]
Parameters:
index (int) – Index
Returns:
Tuple (image, target). target is a list of captions for the image.
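Because target is a list of captions, one common pattern is to expand each sample into one (image, caption) pair per caption. The sketch below assumes no target_transform has changed the shape of the target; `iter_caption_pairs` is a hypothetical helper, and the small list stands in for a real dataset so the example runs without any image files.

```python
from typing import Any, Iterator, Sequence, Tuple


def iter_caption_pairs(
    dataset: Sequence[Tuple[Any, Sequence[str]]],
) -> Iterator[Tuple[Any, str]]:
    """Yield one (image, caption) pair for every caption of every sample."""
    for index in range(len(dataset)):
        image, captions = dataset[index]  # same shape as Flickr8k.__getitem__
        for caption in captions:
            yield image, caption


# Stand-in data with the (image, captions) shape; strings replace real images.
fake_dataset = [
    ("img_0", ["A dog runs.", "A brown dog on grass."]),
    ("img_1", ["Two children play."]),
]
pairs = list(iter_caption_pairs(fake_dataset))
print(len(pairs))  # 3
```

The same function should work on a real Flickr8k instance, since it relies only on len() and integer indexing.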
Return type: