detectron2.data.transforms — detectron2 0.6 documentation
Related tutorial: Data Augmentation.
class detectron2.data.transforms.
Transform
¶
Bases: object
Base class for implementations of deterministic transformations for image and other data structures. “Deterministic” requires that the output of all methods of this class are deterministic w.r.t their input arguments. Note that this is different from (random) data augmentations. To perform data augmentations in training, there should be a higher-level policy that generates these transform ops.
Each transform op may handle several data types, e.g.: image, coordinates, segmentation, bounding boxes, with its apply_*
methods. Some of them have a default implementation, but can be overwritten if the default isn’t appropriate. See documentation of each pre-defined apply_*
methods for details. Note that the implementation of these methods may choose to modify its input data in-place for efficient transformation.
The class can be extended to support arbitrary new data types with its register_type() method.
__repr__
()¶
Produce something like: “MyTransform(field1={self.field1}, field2={self.field2})”
apply_box
(box: numpy.ndarray) → numpy.ndarray¶
Apply the transform on an axis-aligned box. By default will transform the corner points and use their minimum/maximum to create a new axis-aligned box. Note that this default may change the size of your box, e.g. after rotations.
Parameters
box (ndarray) – Nx4 floating point array of XYXY format in absolute coordinates.
Returns
ndarray – box after applying the transformation.
Note
The coordinates are not pixel indices. Coordinates inside an image of shape (H, W) are in range [0, W] or [0, H].
This function does not clip boxes to force them inside the image. It is up to the application that uses the boxes to decide.
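The default corner-point behavior can be sketched in plain Python (this is an illustration of the logic described above, not detectron2's source; `apply_box_via_corners` and `rotate45_about_origin` are hypothetical names):

```python
import math

def apply_box_via_corners(box, transform_coords):
    # Default-style apply_box: transform the four corner points, then take
    # their min/max to form a new axis-aligned box (this may enlarge the box).
    x0, y0, x1, y1 = box
    corners = [(x0, y0), (x1, y0), (x0, y1), (x1, y1)]
    pts = [transform_coords(x, y) for x, y in corners]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))

def rotate45_about_origin(x, y):
    # A 45-degree counter-clockwise rotation, used to show box growth.
    c = s = math.sqrt(2) / 2
    return (c * x - s * y, s * x + c * y)

# A 10x10 box grows to roughly 14.14 wide after a 45-degree rotation.
new_box = apply_box_via_corners((0, 0, 10, 10), rotate45_about_origin)
```

This is why, as noted, the default may change the size of your box after rotations.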
abstract apply_coords
(coords: numpy.ndarray)¶
Apply the transform on coordinates.
Parameters
coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns
ndarray – coordinates after applying the transformation.
Note
The coordinates are not pixel indices. Coordinates inside an image of shape (H, W) are in range [0, W] or [0, H]. This function should correctly transform coordinates outside the image as well.
abstract apply_image
(img: numpy.ndarray)¶
Apply the transform on an image.
Parameters
img (ndarray) – of shape NxHxWxC, or HxWxC or HxW. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
Returns
ndarray – image after applying the transformation.
apply_polygons
(polygons: list) → list¶
Apply the transform on a list of polygons, each represented by a Nx2 array. By default will just transform all the points.
Parameters
polygons (list[ndarray]) – each is a Nx2 floating point array of (x, y) format in absolute coordinates.
Returns
list[ndarray] – polygons after applying the transformation.
Note
The coordinates are not pixel indices. Coordinates on an image of shape (H, W) are in range [0, W] or [0, H].
apply_segmentation
(segmentation: numpy.ndarray) → numpy.ndarray¶
Apply the transform on a full-image segmentation. By default will just perform “apply_image”.
Parameters
segmentation (ndarray) – of shape HxW. The array should have integer or bool dtype.
Returns
ndarray – segmentation after applying the transformation.
inverse
() → detectron2.data.transforms.Transform¶
Create a transform that inverts the geometric changes (i.e. change of coordinates) of this transform.
Note that the inverse is meant for geometric changes only. The inverse of photometric transforms that do not change coordinates is defined to be a no-op, even if they may be invertible.
Returns
Transform
classmethod register_type
(data_type: str, func: Optional[Callable] = None)[source]¶
Register the given function as a handler that this transform will use for a specific data type.
Parameters
- data_type (str) – the name of the data type (e.g., box)
- func (callable) – takes a transform and a data, returns the transformed data.
Examples:
# call it directly
def func(flip_transform, voxel_data):
    return transformed_voxel_data
HFlipTransform.register_type("voxel", func)

# or, use it as a decorator
@HFlipTransform.register_type("voxel")
def func(flip_transform, voxel_data):
    return transformed_voxel_data

...
transform = HFlipTransform(...)
transform.apply_voxel(voxel_data)  # func will be called
class detectron2.data.transforms.
TransformList
(transforms: List[detectron2.data.transforms.Transform])¶
Bases: detectron2.data.transforms.Transform
Maintain a list of transform operations which will be applied in sequence.
Attribute: transforms (list[Transform])
__add__
(other: detectron2.data.transforms.TransformList) → detectron2.data.transforms.TransformList¶
Parameters
other (TransformList) – transformation to add.
Returns
TransformList – list of transforms.
__iadd__
(other: detectron2.data.transforms.TransformList) → detectron2.data.transforms.TransformList¶
Parameters
other (TransformList) – transformation to add.
Returns
TransformList – list of transforms.
__init__
(transforms: List[detectron2.data.transforms.Transform])¶
Parameters
transforms (list[Transform]) – list of transforms to perform.
__len__
() → int¶
Returns
Number of transforms contained in the TransformList.
__radd__
(other: detectron2.data.transforms.TransformList) → detectron2.data.transforms.TransformList¶
Parameters
other (TransformList) – transformation to add.
Returns
TransformList – list of transforms.
apply_coords
(x)¶
apply_image
(x)¶
inverse
() → detectron2.data.transforms.TransformList¶
Invert each transform in reversed order.
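Why the inverse must run in reversed order can be sketched with a toy composition (plain Python, not the detectron2 API; `ToyTransform` and `ToyTransformList` are hypothetical stand-ins):

```python
# A minimal sketch of how a transform list composes, and why its inverse
# applies the inverted transforms in reversed order.
class ToyTransform:
    def __init__(self, fwd, inv):
        self.fwd, self.inv = fwd, inv
    def apply(self, x):
        return self.fwd(x)
    def inverse(self):
        return ToyTransform(self.inv, self.fwd)

class ToyTransformList:
    def __init__(self, transforms):
        self.transforms = transforms
    def apply(self, x):
        for t in self.transforms:
            x = t.apply(x)
        return x
    def inverse(self):
        # Invert each transform, in reversed order.
        return ToyTransformList([t.inverse() for t in self.transforms[::-1]])

scale2 = ToyTransform(lambda x: x * 2, lambda x: x / 2)
shift3 = ToyTransform(lambda x: x + 3, lambda x: x - 3)
tl = ToyTransformList([scale2, shift3])
y = tl.apply(5)            # (5 * 2) + 3 = 13
x = tl.inverse().apply(y)  # (13 - 3) / 2 = 5
```

Applying the inverses in the original order would compute (13 / 2) - 3 instead, which does not recover the input.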
class detectron2.data.transforms.
BlendTransform
(src_image: numpy.ndarray, src_weight: float, dst_weight: float)¶
Bases: detectron2.data.transforms.Transform
Transforms pixel colors with PIL enhance functions.
__init__
(src_image: numpy.ndarray, src_weight: float, dst_weight: float)¶
Blends the input image (dst_image) with the src_image using the formula: src_weight * src_image + dst_weight * dst_image
Parameters
- src_image (ndarray) – Input image is blended with this image. The two images must have the same shape, range, channel order and dtype.
- src_weight (float) – Blend weighting of src_image
- dst_weight (float) – Blend weighting of dst_image
apply_coords
(coords: numpy.ndarray) → numpy.ndarray¶
Apply no transform on the coordinates.
apply_image
(img: numpy.ndarray, interp: str = None) → numpy.ndarray¶
Apply blend transform on the image(s).
Parameters
- img (ndarray) – of shape NxHxWxC, or HxWxC or HxW. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
- interp (str) – kept for API consistency; blending does not require interpolation.
Returns
ndarray – blended image(s).
apply_segmentation
(segmentation: numpy.ndarray) → numpy.ndarray¶
Apply no transform on the full-image segmentation.
inverse
() → detectron2.data.transforms.Transform¶
The inverse is a no-op.
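The blend formula above can be checked per-pixel in plain Python (a sketch on scalar values rather than an ndarray; `blend` is a hypothetical helper):

```python
def blend(src_pixel, dst_pixel, src_weight, dst_weight):
    # The documented formula: src_weight * src_image + dst_weight * dst_image,
    # shown for a single pixel value.
    return src_weight * src_pixel + dst_weight * dst_pixel

# Blending a dst pixel of 200 halfway toward a constant src value of 128.
out = blend(128.0, 200.0, 0.5, 0.5)  # 164.0
```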
class detectron2.data.transforms.
CropTransform
(x0: int, y0: int, w: int, h: int, orig_w: Optional[int] = None, orig_h: Optional[int] = None)¶
Bases: detectron2.data.transforms.Transform
__init__
(x0: int, y0: int, w: int, h: int, orig_w: Optional[int] = None, orig_h: Optional[int] = None)¶
Parameters
- x0 (int) – crop the image(s) by img[y0:y0+h, x0:x0+w].
- y0 (int) – crop the image(s) by img[y0:y0+h, x0:x0+w].
- w (int) – crop the image(s) by img[y0:y0+h, x0:x0+w].
- h (int) – crop the image(s) by img[y0:y0+h, x0:x0+w].
- orig_w (int) – optional, the original width and height before cropping. Needed to make this transform invertible.
- orig_h (int) – optional, the original width and height before cropping. Needed to make this transform invertible.
apply_coords
(coords: numpy.ndarray) → numpy.ndarray¶
Apply crop transform on coordinates.
Parameters
coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns
ndarray – cropped coordinates.
apply_image
(img: numpy.ndarray) → numpy.ndarray¶
Crop the image(s).
Parameters
img (ndarray) – of shape NxHxWxC, or HxWxC or HxW. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
Returns
ndarray – cropped image(s).
apply_polygons
(polygons: list) → list¶
Apply crop transform on a list of polygons, each represented by a Nx2 array. It will crop the polygon with the box, therefore the number of points in the polygon might change.
Parameters
polygons (list[ndarray]) – each is a Nx2 floating point array of (x, y) format in absolute coordinates.
Returns
list[ndarray] – cropped polygons.
inverse
() → detectron2.data.transforms.Transform¶
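The coordinate arithmetic of a crop and its inverse can be sketched as a simple shift (plain Python, not detectron2's implementation; `crop_coords` and `pad_coords` are hypothetical helpers mirroring the documented img[y0:y0+h, x0:x0+w] slicing):

```python
def crop_coords(coords, x0, y0):
    # CropTransform shifts coordinates into the crop's frame.
    return [(x - x0, y - y0) for x, y in coords]

def pad_coords(coords, x0, y0):
    # The inverse (conceptually a pad back to orig_w/orig_h) shifts them back,
    # which is why orig_w/orig_h are needed for invertibility.
    return [(x + x0, y + y0) for x, y in coords]

pts = [(50.0, 40.0), (70.0, 90.0)]
cropped = crop_coords(pts, 30, 20)   # [(20.0, 20.0), (40.0, 70.0)]
restored = pad_coords(cropped, 30, 20)
```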
class detectron2.data.transforms.
PadTransform
(x0: int, y0: int, x1: int, y1: int, orig_w: Optional[int] = None, orig_h: Optional[int] = None, pad_value: float = 0, seg_pad_value: int = 0)¶
Bases: detectron2.data.transforms.Transform
__init__
(x0: int, y0: int, x1: int, y1: int, orig_w: Optional[int] = None, orig_h: Optional[int] = None, pad_value: float = 0, seg_pad_value: int = 0)¶
Parameters
- x0 – number of padded pixels on the left and top
- y0 – number of padded pixels on the left and top
- x1 – number of padded pixels on the right and bottom
- y1 – number of padded pixels on the right and bottom
- orig_w – optional, original width and height. Needed to make this transform invertible.
- orig_h – optional, original width and height. Needed to make this transform invertible.
- pad_value – the padding value to the image
- seg_pad_value – the padding value to the segmentation mask
apply_coords
(coords)¶
apply_image
(img)¶
apply_segmentation
(img)¶
inverse
() → detectron2.data.transforms.Transform¶
class detectron2.data.transforms.
GridSampleTransform
(grid: numpy.ndarray, interp: str)¶
Bases: detectron2.data.transforms.Transform
__init__
(grid: numpy.ndarray, interp: str)¶
Parameters
- grid (ndarray) – grid has x and y input pixel locations which are used to compute output. Grid has values in the range of [-1, 1], which is normalized by the input height and width. The dimension is N x H x W x 2.
- interp (str) – interpolation methods. Options include nearest and bilinear.
apply_coords
(coords: numpy.ndarray)¶
Not supported.
apply_image
(img: numpy.ndarray, interp: str = None) → numpy.ndarray¶
Apply grid sampling on the image(s).
Parameters
- img (ndarray) – of shape NxHxWxC, or HxWxC or HxW. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
- interp (str) – interpolation methods. Options include nearest and bilinear.
Returns
ndarray – grid sampled image(s).
apply_segmentation
(segmentation: numpy.ndarray) → numpy.ndarray¶
Apply grid sampling on the full-image segmentation.
Parameters
segmentation (ndarray) – of shape HxW. The array should have integer or bool dtype.
Returns
ndarray – grid sampled segmentation.
class detectron2.data.transforms.
HFlipTransform
(width: int)¶
Bases: detectron2.data.transforms.Transform
Perform horizontal flip.
apply_coords
(coords: numpy.ndarray) → numpy.ndarray¶
Flip the coordinates.
Parameters
coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns
ndarray – the flipped coordinates.
Note
The inputs are floating point coordinates, not pixel indices. Therefore they are flipped by (W - x, H - y), not (W - 1 - x, H - 1 - y).
apply_image
(img: numpy.ndarray) → numpy.ndarray¶
Flip the image(s).
Parameters
img (ndarray) – of shape HxW, HxWxC, or NxHxWxC. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
Returns
ndarray – the flipped image(s).
apply_rotated_box
(rotated_boxes)¶
Apply the horizontal flip transform on rotated boxes.
Parameters
rotated_boxes (ndarray) – Nx5 floating point array of (x_center, y_center, width, height, angle_degrees) format in absolute coordinates.
inverse
() → detectron2.data.transforms.Transform¶
The inverse is to flip again
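The (W - x) convention from the note above can be sketched directly (plain Python; `hflip_coords` is a hypothetical helper, not the detectron2 API):

```python
def hflip_coords(coords, width):
    # Floating point coordinates flip as (W - x, y), per the note above;
    # pixel indices would instead flip as (W - 1 - x, y).
    return [(width - x, y) for x, y in coords]

flipped = hflip_coords([(0.0, 0.0), (30.0, 10.0)], 100)
```

Note that x = 0 maps to x = W, which is the correct behavior for coordinates in the range [0, W].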
class detectron2.data.transforms.
VFlipTransform
(height: int)¶
Bases: detectron2.data.transforms.Transform
Perform vertical flip.
apply_coords
(coords: numpy.ndarray) → numpy.ndarray¶
Flip the coordinates.
Parameters
coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns
ndarray – the flipped coordinates.
Note
The inputs are floating point coordinates, not pixel indices. Therefore they are flipped by (W - x, H - y), not (W - 1 - x, H - 1 - y).
apply_image
(img: numpy.ndarray) → numpy.ndarray¶
Flip the image(s).
Parameters
img (ndarray) – of shape HxW, HxWxC, or NxHxWxC. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
Returns
ndarray – the flipped image(s).
inverse
() → detectron2.data.transforms.Transform¶
The inverse is to flip again
class detectron2.data.transforms.
NoOpTransform
¶
Bases: detectron2.data.transforms.Transform
A transform that does nothing.
apply_coords
(coords: numpy.ndarray) → numpy.ndarray¶
apply_image
(img: numpy.ndarray) → numpy.ndarray¶
apply_rotated_box
(x)¶
inverse
() → detectron2.data.transforms.Transform¶
class detectron2.data.transforms.
ScaleTransform
(h: int, w: int, new_h: int, new_w: int, interp: str = None)¶
Bases: detectron2.data.transforms.Transform
Resize the image to a target size.
__init__
(h: int, w: int, new_h: int, new_w: int, interp: str = None)¶
Parameters
- h (int) – original image size.
- w (int) – original image size.
- new_h (int) – new image size.
- new_w (int) – new image size.
- interp (str) – interpolation methods. Options include nearest, linear (3D-only), bilinear, bicubic (4D-only), and area. Details can be found in: https://pytorch.org/docs/stable/nn.functional.html
apply_coords
(coords: numpy.ndarray) → numpy.ndarray¶
Compute the coordinates after resize.
Parameters
coords (ndarray) – floating point array of shape Nx2. Each row is (x, y).
Returns
ndarray – resized coordinates.
apply_image
(img: numpy.ndarray, interp: str = None) → numpy.ndarray¶
Resize the image(s).
Parameters
- img (ndarray) – of shape NxHxWxC, or HxWxC or HxW. The array can be of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
- interp (str) – interpolation methods. Options include nearest, linear (3D-only), bilinear, bicubic (4D-only), and area. Details can be found in: https://pytorch.org/docs/stable/nn.functional.html
Returns
ndarray – resized image(s).
apply_segmentation
(segmentation: numpy.ndarray) → numpy.ndarray¶
Apply resize on the full-image segmentation.
Parameters
segmentation (ndarray) – of shape HxW. The array should have integer or bool dtype.
Returns
ndarray – resized segmentation.
inverse
() → detectron2.data.transforms.Transform¶
The inverse is to resize it back.
class detectron2.data.transforms.
ExtentTransform
(src_rect, output_size, interp=2, fill=0)¶
Bases: detectron2.data.transforms.Transform
Extracts a subregion from the source image and scales it to the output size.
The fill color is used to map pixels from the source rect that fall outside the source image.
See: https://pillow.readthedocs.io/en/latest/PIL.html#PIL.ImageTransform.ExtentTransform
__init__
(src_rect, output_size, interp=2, fill=0)¶
Parameters
- src_rect (x0 , y0 , x1 , y1) – src coordinates
- output_size (h , w) – dst image size
- interp – PIL interpolation methods
- fill – Fill color used when src_rect extends outside image
apply_coords
(coords)¶
apply_image
(img, interp=None)¶
apply_segmentation
(segmentation)¶
class detectron2.data.transforms.
ResizeTransform
(h, w, new_h, new_w, interp=None)¶
Bases: detectron2.data.transforms.Transform
Resize the image to a target size.
__init__
(h, w, new_h, new_w, interp=None)¶
Parameters
- h (int) – original image size
- w (int) – original image size
- new_h (int) – new image size
- new_w (int) – new image size
- interp – PIL interpolation methods, defaults to bilinear.
apply_coords
(coords)¶
apply_image
(img, interp=None)¶
apply_rotated_box
(rotated_boxes)¶
Apply the resizing transform on rotated boxes. For details of how these (approximation) formulas are derived, please refer to RotatedBoxes.scale().
Parameters
rotated_boxes (ndarray) – Nx5 floating point array of (x_center, y_center, width, height, angle_degrees) format in absolute coordinates.
apply_segmentation
(segmentation)¶
inverse
()¶
class detectron2.data.transforms.
RotationTransform
(h, w, angle, expand=True, center=None, interp=None)¶
Bases: detectron2.data.transforms.Transform
Returns a copy of the image, rotated the given number of degrees counter clockwise around its center.
__init__
(h, w, angle, expand=True, center=None, interp=None)¶
Parameters
- h (int) – original image size
- w (int) – original image size
- angle (float) – degrees for rotation
- expand (bool) – choose if the image should be resized to fit the whole rotated image (default), or simply cropped
- center (tuple (width, height)) – coordinates of the rotation center. If left as None, the center of each image is used. center has no effect if expand=True because it only affects shifting.
- interp – cv2 interpolation method, default cv2.INTER_LINEAR
apply_coords
(coords)¶
coords should be a N * 2 array-like, containing N couples of (x, y) points
apply_image
(img, interp=None)¶
img should be a numpy array, formatted as Height * Width * Nchannels
apply_segmentation
(segmentation)¶
create_rotation_matrix
(offset=0)¶
inverse
()¶
The inverse is to rotate it back with expand, and crop to get the original shape.
class detectron2.data.transforms.
ColorTransform
(op)¶
Bases: detectron2.data.transforms.Transform
Generic wrapper for any photometric transforms. These transformations should only affect the color space and not the coordinate space of the image (e.g. annotation coordinates such as bounding boxes should not be changed).
__init__
(op)¶
Parameters
op (Callable) – operation to be applied to the image, which takes in an ndarray and returns an ndarray.
apply_coords
(coords)¶
apply_image
(img)¶
apply_segmentation
(segmentation)¶
inverse
()¶
class detectron2.data.transforms.
PILColorTransform
(op)¶
Bases: detectron2.data.transforms.ColorTransform
Generic wrapper for PIL photometric image transforms, which affect the color space and not the coordinate space of the image.
__init__
(op)¶
Parameters
op (Callable) – operation to be applied to the image, which takes in a PIL Image and returns a transformed PIL Image. For reference on possible operations see: https://pillow.readthedocs.io/en/stable/
apply_image
(img)¶
class detectron2.data.transforms.
Augmentation
¶
Bases: object
Augmentation defines (often random) policies/strategies to generate Transform from data. It is often used for pre-processing of input data.
A “policy” that generates a Transform may, in the most general case, need arbitrary information from input data in order to determine what transforms to apply. Therefore, each Augmentation instance defines the arguments needed by its get_transform() method. When called with the positional arguments, the get_transform() method executes the policy.
Note that Augmentation defines the policies to create a Transform, but not how to execute the actual transform operations to those data. Its __call__() method will use AugInput.transform() to execute the transform.
The returned Transform object is meant to describe deterministic transformation, which means it can be re-applied on associated data, e.g. the geometry of an image and its segmentation masks need to be transformed together. (If such re-application is not needed, then determinism is not a crucial requirement.)
__call__
(aug_input) → detectron2.data.transforms.Transform¶
Augment the given aug_input in-place, and return the transform that’s used.
This method will be called to apply the augmentation. In most augmentations, it is enough to use the default implementation, which calls get_transform() using the inputs. But a subclass can overwrite it to have more complicated logic.
Parameters
aug_input (AugInput) – an object that has attributes needed by this augmentation (defined by self.get_transform). Its transform method will be called to in-place transform it.
Returns
Transform – the transform that is applied on the input.
__repr__
()¶
Produce something like: “MyAugmentation(field1={self.field1}, field2={self.field2})”
__str__
()¶
Produce something like: “MyAugmentation(field1={self.field1}, field2={self.field2})”
get_transform
(*args) → detectron2.data.transforms.Transform¶
Execute the policy based on input data, and decide what transform to apply to inputs.
Parameters
args – Any fixed-length positional arguments. By default, the name of the arguments should exist in the AugInput to be used.
Returns
Transform – Returns the deterministic transform to apply to the input.
Examples:
class MyAug:
    # if a policy needs to know both image and semantic segmentation
    def get_transform(image, sem_seg) -> T.Transform:
        pass

tfm: Transform = MyAug().get_transform(image, sem_seg)
new_image = tfm.apply_image(image)
Notes
Users can freely use arbitrary new argument names in a custom get_transform() method, as long as they are available in the input data. In detectron2 we use the following convention:
- image: (H,W) or (H,W,C) ndarray of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255].
- boxes: (N,4) ndarray of float32. It represents the instance bounding boxes of N instances. Each is in XYXY format in unit of absolute coordinates.
- sem_seg: (H,W) ndarray of type uint8. Each element is an integer label of pixel.
We do not specify a convention for other types and do not include builtin Augmentations that use other types in detectron2.
input_args
: Optional[Tuple[str]] = None¶
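The policy/transform split described above can be sketched outside detectron2 (plain Python; `ToyRandomShift` is a hypothetical policy, and the returned "transform" is modeled as a closure): get_transform() samples randomness once and returns a deterministic op, so the op can be re-applied consistently to associated data.

```python
import random

class ToyRandomShift:
    # A policy: randomness lives here, in get_transform().
    def __init__(self, max_shift):
        self.max_shift = max_shift

    def get_transform(self, image):
        dx = random.uniform(-self.max_shift, self.max_shift)
        # Return a deterministic op; dx is fixed from here on, so reusing
        # the op shifts all associated data by the same amount.
        return lambda coords: [(x + dx, y) for x, y in coords]

random.seed(0)
tfm = ToyRandomShift(5.0).get_transform(image=None)
a = tfm([(0.0, 0.0)])
b = tfm([(0.0, 0.0)])  # same transform => same deterministic result
```

This mirrors why the returned Transform is meant to be deterministic: an image and its segmentation mask must be transformed identically.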
class detectron2.data.transforms.
AugmentationList
(augs)¶
Bases: detectron2.data.transforms.Augmentation
Apply a sequence of augmentations.
It has a __call__ method to apply the augmentations.
Note that get_transform() is impossible (will throw an error if called) for AugmentationList, because in order to apply a sequence of augmentations, the kth augmentation must be applied first, to provide inputs needed by the (k+1)th augmentation.
__init__
(augs)¶
Parameters
augs (list[Augmentation or Transform]) –
class detectron2.data.transforms.
AugInput
(image: numpy.ndarray, *, boxes: Optional[numpy.ndarray] = None, sem_seg: Optional[numpy.ndarray] = None)¶
Bases: object
Input that can be used with Augmentation.__call__(). This is a standard implementation for the majority of use cases. This class provides the standard attributes “image”, “boxes”, “sem_seg” defined in __init__() and they may be needed by different augmentations. Most augmentation policies do not need attributes beyond these three.
After applying augmentations to these attributes (using AugInput.transform()), the returned transforms can then be used to transform other data structures that users have.
Examples:
input = AugInput(image, boxes=boxes)
tfms = augmentation(input)
transformed_image = input.image
transformed_boxes = input.boxes
transformed_other_data = tfms.apply_other(other_data)
An extended project that works with new data types may implement augmentation policies that need other inputs. An algorithm may need to transform inputs in a way different from the standard approach defined in this class. In those rare situations, users can implement a class similar to this class, that satisfies the following conditions:
- The input must provide access to these data in the form of attribute access (getattr). For example, if an Augmentation to be applied needs “image” and “sem_seg” arguments, its input must have the attributes “image” and “sem_seg”.
- The input must have a transform(tfm: Transform) -> None method which in-place transforms all its attributes.
__init__
(image: numpy.ndarray, *, boxes: Optional[numpy.ndarray] = None, sem_seg: Optional[numpy.ndarray] = None)¶
Parameters
- image (ndarray) – (H,W) or (H,W,C) ndarray of type uint8 in range [0, 255], or floating point in range [0, 1] or [0, 255]. The meaning of C is up to users.
- boxes (ndarray or None) – Nx4 float32 boxes in XYXY_ABS mode
- sem_seg (ndarray or None) – HxW uint8 semantic segmentation mask. Each element is an integer label of pixel.
transform
(tfm: detectron2.data.transforms.Transform) → None¶
In-place transform all attributes of this class.
By “in-place”, it means after calling this method, accessing an attribute such as self.image will return transformed data.
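The two conditions above can be sketched with a minimal AugInput-style container (plain Python; `ToyAugInput` is hypothetical, and the "transform" is modeled as a dict of callables rather than a real Transform object):

```python
class ToyAugInput:
    # Attribute access for the data an augmentation needs, plus an
    # in-place transform() method, per the conditions above.
    def __init__(self, image, boxes=None):
        self.image = image
        self.boxes = boxes

    def transform(self, tfm):
        # tfm is assumed to expose apply_image/apply_box-style callables.
        self.image = tfm["apply_image"](self.image)
        if self.boxes is not None:
            self.boxes = tfm["apply_box"](self.boxes)

# A toy horizontal flip for width 100: the "image" is a reversed list,
# and XYXY boxes map to (100 - x1, y0, 100 - x0, y1).
hflip_w100 = {
    "apply_image": lambda img: img[::-1],
    "apply_box": lambda boxes: [(100 - x1, y0, 100 - x0, y1)
                                for x0, y0, x1, y1 in boxes],
}
inp = ToyAugInput(image=[1, 2, 3], boxes=[(10, 0, 30, 5)])
inp.transform(hflip_w100)
```

After transform() returns, inp.image and inp.boxes hold the transformed data, matching the in-place contract described above.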
class detectron2.data.transforms.
FixedSizeCrop
(crop_size: Tuple[int], pad: bool = True, pad_value: float = 128.0, seg_pad_value: int = 255)¶
Bases: detectron2.data.transforms.Augmentation
If crop_size is smaller than the input image size, then it uses a random crop of the crop size. If crop_size is larger than the input image size, then it pads the right and the bottom of the image to the crop size if pad is True, otherwise it returns the smaller image.
__init__
(crop_size: Tuple[int], pad: bool = True, pad_value: float = 128.0, seg_pad_value: int = 255)¶
Parameters
- crop_size – target image (height, width).
- pad – if True, will pad images smaller than crop_size up to crop_size
- pad_value – the padding value to the image.
- seg_pad_value – the padding value to the segmentation mask.
get_transform
(image: numpy.ndarray) → detectron2.data.transforms.TransformList¶
class detectron2.data.transforms.
RandomApply
(tfm_or_aug, prob=0.5)¶
Bases: detectron2.data.transforms.Augmentation
Randomly apply an augmentation with a given probability.
__init__
(tfm_or_aug, prob=0.5)¶
Parameters
- tfm_or_aug (Transform, Augmentation) – the transform or augmentation to be applied. It can either be a Transform or Augmentation instance.
- prob (float) – probability between 0.0 and 1.0 that the wrapper transformation is applied
get_transform
(*args)¶
class detectron2.data.transforms.
RandomBrightness
(intensity_min, intensity_max)¶
Bases: detectron2.data.transforms.Augmentation
Randomly transforms image brightness.
Brightness intensity is uniformly sampled in (intensity_min, intensity_max).
- intensity < 1 will reduce brightness
- intensity = 1 will preserve the input image
- intensity > 1 will increase brightness
See: https://pillow.readthedocs.io/en/3.0.x/reference/ImageEnhance.html
__init__
(intensity_min, intensity_max)¶
Parameters
- intensity_min (float) – Minimum augmentation (1 preserves input).
- intensity_max (float) – Maximum augmentation (1 preserves input).
get_transform
(image)¶
class detectron2.data.transforms.
RandomContrast
(intensity_min, intensity_max)¶
Bases: detectron2.data.transforms.Augmentation
Randomly transforms image contrast.
Contrast intensity is uniformly sampled in (intensity_min, intensity_max).
- intensity < 1 will reduce contrast
- intensity = 1 will preserve the input image
- intensity > 1 will increase contrast
See: https://pillow.readthedocs.io/en/3.0.x/reference/ImageEnhance.html
__init__
(intensity_min, intensity_max)¶
Parameters
- intensity_min (float) – Minimum augmentation (1 preserves input).
- intensity_max (float) – Maximum augmentation (1 preserves input).
get_transform
(image)¶
class detectron2.data.transforms.
RandomCrop
(crop_type: str, crop_size)¶
Bases: detectron2.data.transforms.Augmentation
Randomly crop a rectangle region out of an image.
__init__
(crop_type: str, crop_size)¶
Parameters
- crop_type (str) – one of “relative_range”, “relative”, “absolute”, “absolute_range”.
- crop_size (tuple[float, float]) – two floats, explained below.
“relative”: crop a (H * crop_size[0], W * crop_size[1]) region from an input image of size (H, W). crop_size should be in (0, 1].
“relative_range”: uniformly sample two values from [crop_size[0], 1] and [crop_size[1], 1], and use them as in the “relative” crop type.
“absolute”: crop a (crop_size[0], crop_size[1]) region from the input image. crop_size must be smaller than the input image size.
“absolute_range”: for an input of size (H, W), uniformly sample H_crop in [crop_size[0], min(H, crop_size[1])] and W_crop in [crop_size[0], min(W, crop_size[1])]. Then crop a region (H_crop, W_crop).
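The four modes can be sketched in plain Python (an illustration of the behavior described above, not detectron2's implementation; `get_crop_size_sketch` is a hypothetical name):

```python
import random

def get_crop_size_sketch(crop_type, crop_size, image_size):
    # image_size and the return value are (height, width).
    h, w = image_size
    if crop_type == "relative":
        ch, cw = crop_size
        return int(h * ch + 0.5), int(w * cw + 0.5)
    if crop_type == "relative_range":
        # Sample relative sizes in [crop_size[i], 1], then behave like "relative".
        ch = random.uniform(crop_size[0], 1.0)
        cw = random.uniform(crop_size[1], 1.0)
        return int(h * ch + 0.5), int(w * cw + 0.5)
    if crop_type == "absolute":
        return min(crop_size[0], h), min(crop_size[1], w)
    if crop_type == "absolute_range":
        return (random.randint(crop_size[0], min(h, crop_size[1])),
                random.randint(crop_size[0], min(w, crop_size[1])))
    raise ValueError(crop_type)

rel = get_crop_size_sketch("relative", (0.5, 0.25), (400, 800))  # (200, 200)
```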
get_crop_size
(image_size)¶
Parameters
image_size (tuple) – height, width
Returns
crop_size (tuple) – height, width in absolute pixels
get_transform
(image)¶
class detectron2.data.transforms.
RandomExtent
(scale_range, shift_range)¶
Bases: detectron2.data.transforms.Augmentation
Outputs an image by cropping a random “subrect” of the source image.
The subrect can be parameterized to include pixels outside the source image, in which case they will be set to zeros (i.e. black). The size of the output image will vary with the size of the random subrect.
__init__
(scale_range, shift_range)¶
Parameters
- scale_range (l , h) – Range of input-to-output size scaling factor
- shift_range (x , y) – Range of shifts of the cropped subrect. The rect is shifted by [w / 2 * Uniform(-x, x), h / 2 * Uniform(-y, y)], where (w, h) is the (width, height) of the input image. Set each component to zero to crop at the image’s center.
get_transform
(image)¶
class detectron2.data.transforms.
RandomFlip
(prob=0.5, *, horizontal=True, vertical=False)¶
Bases: detectron2.data.transforms.Augmentation
Flip the image horizontally or vertically with the given probability.
__init__
(prob=0.5, *, horizontal=True, vertical=False)¶
Parameters
- prob (float) – probability of flip.
- horizontal (boolean) – whether to apply horizontal flipping
- vertical (boolean) – whether to apply vertical flipping
get_transform
(image)¶
class detectron2.data.transforms.
RandomSaturation
(intensity_min, intensity_max)¶
Bases: detectron2.data.transforms.Augmentation
Randomly transforms saturation of an RGB image. Input images are assumed to have ‘RGB’ channel order.
Saturation intensity is uniformly sampled in (intensity_min, intensity_max).
- intensity < 1 will reduce saturation (make the image more grayscale)
- intensity = 1 will preserve the input image
- intensity > 1 will increase saturation
See: https://pillow.readthedocs.io/en/3.0.x/reference/ImageEnhance.html
__init__
(intensity_min, intensity_max)¶
Parameters
- intensity_min (float) – Minimum augmentation (1 preserves input).
- intensity_max (float) – Maximum augmentation (1 preserves input).
get_transform
(image)¶
class detectron2.data.transforms.
RandomLighting
(scale)¶
Bases: detectron2.data.transforms.Augmentation
The “lighting” augmentation described in AlexNet, using fixed PCA over ImageNet. Input images are assumed to have ‘RGB’ channel order.
The degree of color jittering is randomly sampled via a normal distribution, with standard deviation given by the scale parameter.
__init__
(scale)¶
Parameters
scale (float) – Standard deviation of principal component weighting.
get_transform
(image)¶
class detectron2.data.transforms.
RandomRotation
(angle, expand=True, center=None, sample_style='range', interp=None)¶
Bases: detectron2.data.transforms.Augmentation
Returns a copy of the image, rotated the given number of degrees counter clockwise around the given center.
__init__
(angle, expand=True, center=None, sample_style='range', interp=None)¶
Parameters
- angle (list[float]) – If sample_style=="range", a [min, max] interval from which to sample the angle (in degrees). If sample_style=="choice", a list of angles to sample from.
- expand (bool) – choose if the image should be resized to fit the whole rotated image (default), or simply cropped
- center (list[[float, float]]) – If sample_style=="range", a [[minx, miny], [maxx, maxy]] relative interval from which to sample the center, [0, 0] being the top left of the image and [1, 1] the bottom right. If sample_style=="choice", a list of centers to sample from. Default: None, which means that the center of rotation is the center of the image. center has no effect if expand=True because it only affects shifting.
get_transform
(image)¶
class detectron2.data.transforms.
Resize
(shape, interp=2)¶
Bases: detectron2.data.transforms.Augmentation
Resize image to a fixed target size
__init__
(shape, interp=2)¶
Parameters
- shape – a (h, w) tuple or an int
- interp – PIL interpolation method
get_transform
(image)¶
class detectron2.data.transforms.
ResizeScale
(min_scale: float, max_scale: float, target_height: int, target_width: int, interp: int = 2)¶
Bases: detectron2.data.transforms.Augmentation
Takes target size as input and randomly scales the given target size between min_scale and max_scale. It then scales the input image such that it fits inside the scaled target box, keeping the aspect ratio constant. This implements the resize part of Google’s ‘resize_and_crop’ data augmentation: https://github.com/tensorflow/tpu/blob/master/models/official/detection/utils/input_utils.py#L127
__init__
(min_scale: float, max_scale: float, target_height: int, target_width: int, interp: int = 2)¶
Parameters
- min_scale – minimum image scale range.
- max_scale – maximum image scale range.
- target_height – target image height.
- target_width – target image width.
- interp – image interpolation method.
get_transform
(image: numpy.ndarray) → detectron2.data.transforms.Transform¶
class detectron2.data.transforms.
ResizeShortestEdge
(short_edge_length, max_size=9223372036854775807, sample_style='range', interp=2)¶
Bases: detectron2.data.transforms.Augmentation
Resize the image while keeping the aspect ratio unchanged. It attempts to scale the shorter edge to the given short_edge_length, as long as the longer edge does not exceed max_size. If max_size is reached, then downscale so that the longer edge does not exceed max_size.
__init__
(short_edge_length, max_size=9223372036854775807, sample_style='range', interp=2)¶
Parameters
- short_edge_length (list[int]) – If sample_style=="range", a [min, max] interval from which to sample the shortest edge length. If sample_style=="choice", a list of shortest edge lengths to sample from.
- max_size (int) – maximum allowed longest edge length.
- sample_style (str) – either “range” or “choice”.
static get_output_shape
(oldh: int, oldw: int, short_edge_length: int, max_size: int) → Tuple[int, int][source]¶
Compute the output size given input size and target short edge length.
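The resize rule described for this class can be sketched in plain Python (an illustration of the documented behavior, not detectron2's source; `get_output_shape_sketch` is a hypothetical name):

```python
def get_output_shape_sketch(oldh, oldw, short_edge_length, max_size):
    # Scale the shorter edge to short_edge_length, keeping aspect ratio;
    # if the longer edge would then exceed max_size, downscale so it fits.
    scale = short_edge_length * 1.0 / min(oldh, oldw)
    if oldh < oldw:
        newh, neww = short_edge_length, scale * oldw
    else:
        newh, neww = scale * oldh, short_edge_length
    if max(newh, neww) > max_size:
        scale = max_size * 1.0 / max(newh, neww)
        newh, neww = newh * scale, neww * scale
    return int(newh + 0.5), int(neww + 0.5)

a = get_output_shape_sketch(480, 640, 600, 1000)  # shorter edge -> 600
b = get_output_shape_sketch(480, 640, 800, 1000)  # capped by max_size
```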
get_transform
(image)¶
class detectron2.data.transforms.
RandomCrop_CategoryAreaConstraint
(crop_type: str, crop_size, single_category_max_area: float = 1.0, ignored_category: int = None)¶
Bases: detectron2.data.transforms.Augmentation
Similar to RandomCrop, but finds a cropping window such that no single category occupies a ratio of more than single_category_max_area in the semantic segmentation ground truth, which can cause instability in training. The function attempts to find such a valid cropping window at most 10 times.
__init__
(crop_type: str, crop_size, single_category_max_area: float = 1.0, ignored_category: int = None)¶
Parameters
- crop_type – same as in RandomCrop
- crop_size – same as in RandomCrop
- single_category_max_area – the maximum allowed area ratio of a category. Set to 1.0 to disable
- ignored_category – allow this category in the semantic segmentation ground truth to exceed the area ratio. Usually set to the category that’s ignored in training.
get_transform
(image, sem_seg)¶
class detectron2.data.transforms.
RandomResize
(shape_list, interp=2)¶
Bases: detectron2.data.transforms.Augmentation
Randomly resize image to a target size in shape_list
__init__
(shape_list, interp=2)¶
Parameters
- shape_list – a list of shapes in (h, w)
- interp – PIL interpolation method
get_transform
(image)¶
class detectron2.data.transforms.
MinIoURandomCrop
(min_ious=(0.1, 0.3, 0.5, 0.7, 0.9), min_crop_size=0.3, mode_trials=1000, crop_trials=50)¶
Bases: detectron2.data.transforms.Augmentation
Random crop the image & bboxes, the cropped patches have minimum IoU requirement with original image & bboxes, the IoU threshold is randomly selected from min_ious.
Parameters
- min_ious (tuple) – minimum IoU threshold for all intersections with bounding boxes
- min_crop_size (float) – minimum crop’s size (i.e. h,w := a*h, a*w, where a >= min_crop_size)
- mode_trials – number of trials for sampling min_ious threshold
- crop_trials – number of trials for sampling crop_size after cropping
get_transform
(image, boxes)¶
Call function to crop images and bounding boxes with minimum IoU constraint.
Parameters
boxes – ground truth boxes in (x1, y1, x2, y2) format
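The IoU quantity this augmentation constrains between the crop patch and the ground-truth boxes can be sketched in plain Python (a standard XYXY IoU, not detectron2's implementation; `iou` is a hypothetical helper):

```python
def iou(box_a, box_b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0

same = iou((0, 0, 10, 10), (0, 0, 10, 10))    # identical boxes -> 1.0
none = iou((0, 0, 10, 10), (20, 20, 30, 30))  # disjoint boxes  -> 0.0
```

A crop is accepted only when the IoUs with the ground-truth boxes meet the threshold sampled from min_ious.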