Skip to content

Minor updates in autoaugment, augment docstring v2 #7317

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 24, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 6 additions & 6 deletions torchvision/transforms/v2/_augment.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,22 +13,22 @@


class RandomErasing(_RandomApplyTransform):
"""[BETA] Randomly selects a rectangle region in the input image or video and erases its pixels.
"""[BETA] Randomly select a rectangle region in the input image or video and erase its pixels.

.. betastatus:: RandomErasing transform

This transform does not support PIL Image.
'Random Erasing Data Augmentation' by Zhong et al. See https://arxiv.org/abs/1708.04896

Args:
p: probability that the random erasing operation will be performed.
scale: range of proportion of erased area against input image.
ratio: range of aspect ratio of erased area.
value: erasing value. Default is 0. If a single int, it is used to
p (float, optional): probability that the random erasing operation will be performed.
scale (tuple of float, optional): range of proportion of erased area against input image.
ratio (tuple of float, optional): range of aspect ratio of erased area.
value (number or tuple of numbers): erasing value. Default is 0. If a single int, it is used to
erase all pixels. If a tuple of length 3, it is used to erase
R, G, B channels respectively.
If a str of 'random', erasing each pixel with random values.
inplace: boolean to make this transform inplace. Default set to False.
inplace (bool, optional): boolean to make this transform inplace. Default set to False.

Returns:
Erased input.
Expand Down
44 changes: 26 additions & 18 deletions torchvision/transforms/v2/_auto_augment.py
Original file line number Diff line number Diff line change
Expand Up @@ -167,14 +167,16 @@ class AutoAugment(_AutoAugmentBase):

.. betastatus:: AutoAugment transform

If the image is torch Tensor, it should be of type torch.uint8, and it is expected
This transformation works on images and videos only.

If the input is :class:`torch.Tensor`, it should be of type ``torch.uint8``, and it is expected
to have [..., 1 or 3, H, W] shape, where ... means an arbitrary number of leading dimensions.
If img is PIL Image, it is expected to be in mode "L" or "RGB".

Args:
policy (AutoAugmentPolicy): Desired policy enum defined by
policy (AutoAugmentPolicy, optional): Desired policy enum defined by
:class:`torchvision.transforms.autoaugment.AutoAugmentPolicy`. Default is ``AutoAugmentPolicy.IMAGENET``.
interpolation (InterpolationMode): Desired interpolation enum defined by
interpolation (InterpolationMode, optional): Desired interpolation enum defined by
:class:`torchvision.transforms.InterpolationMode`. Default is ``InterpolationMode.NEAREST``.
If input is Tensor, only ``InterpolationMode.NEAREST``, ``InterpolationMode.BILINEAR`` are supported.
fill (sequence or number, optional): Pixel fill value for the area outside the transformed
Expand Down Expand Up @@ -342,15 +344,17 @@ class RandAugment(_AutoAugmentBase):

.. betastatus:: RandAugment transform

If the image is torch Tensor, it should be of type torch.uint8, and it is expected
This transformation works on images and videos only.

If the input is :class:`torch.Tensor`, it should be of type ``torch.uint8``, and it is expected
to have [..., 1 or 3, H, W] shape, where ... means an arbitrary number of leading dimensions.
If img is PIL Image, it is expected to be in mode "L" or "RGB".

Args:
num_ops (int): Number of augmentation transformations to apply sequentially.
magnitude (int): Magnitude for all the transformations.
num_magnitude_bins (int): The number of different magnitude values.
interpolation (InterpolationMode): Desired interpolation enum defined by
num_ops (int, optional): Number of augmentation transformations to apply sequentially.
magnitude (int, optional): Magnitude for all the transformations.
num_magnitude_bins (int, optional): The number of different magnitude values.
interpolation (InterpolationMode, optional): Desired interpolation enum defined by
:class:`torchvision.transforms.InterpolationMode`. Default is ``InterpolationMode.NEAREST``.
If input is Tensor, only ``InterpolationMode.NEAREST``, ``InterpolationMode.BILINEAR`` are supported.
fill (sequence or number, optional): Pixel fill value for the area outside the transformed
Expand Down Expand Up @@ -423,13 +427,15 @@ class TrivialAugmentWide(_AutoAugmentBase):

.. betastatus:: TrivialAugmentWide transform

If the image is torch Tensor, it should be of type torch.uint8, and it is expected
This transformation works on images and videos only.

If the input is :class:`torch.Tensor`, it should be of type ``torch.uint8``, and it is expected
to have [..., 1 or 3, H, W] shape, where ... means an arbitrary number of leading dimensions.
If img is PIL Image, it is expected to be in mode "L" or "RGB".

Args:
num_magnitude_bins (int): The number of different magnitude values.
interpolation (InterpolationMode): Desired interpolation enum defined by
num_magnitude_bins (int, optional): The number of different magnitude values.
interpolation (InterpolationMode, optional): Desired interpolation enum defined by
:class:`torchvision.transforms.InterpolationMode`. Default is ``InterpolationMode.NEAREST``.
If input is Tensor, only ``InterpolationMode.NEAREST``, ``InterpolationMode.BILINEAR`` are supported.
fill (sequence or number, optional): Pixel fill value for the area outside the transformed
Expand Down Expand Up @@ -492,18 +498,20 @@ class AugMix(_AutoAugmentBase):

.. betastatus:: AugMix transform

If the image is torch Tensor, it should be of type torch.uint8, and it is expected
This transformation works on images and videos only.

If the input is :class:`torch.Tensor`, it should be of type ``torch.uint8``, and it is expected
to have [..., 1 or 3, H, W] shape, where ... means an arbitrary number of leading dimensions.
If img is PIL Image, it is expected to be in mode "L" or "RGB".

Args:
severity (int): The severity of base augmentation operators. Default is ``3``.
mixture_width (int): The number of augmentation chains. Default is ``3``.
chain_depth (int): The depth of augmentation chains. A negative value denotes stochastic depth sampled from the interval [1, 3].
severity (int, optional): The severity of base augmentation operators. Default is ``3``.
mixture_width (int, optional): The number of augmentation chains. Default is ``3``.
chain_depth (int, optional): The depth of augmentation chains. A negative value denotes stochastic depth sampled from the interval [1, 3].
Default is ``-1``.
alpha (float): The hyperparameter for the probability distributions. Default is ``1.0``.
all_ops (bool): Use all operations (including brightness, contrast, color and sharpness). Default is ``True``.
interpolation (InterpolationMode): Desired interpolation enum defined by
alpha (float, optional): The hyperparameter for the probability distributions. Default is ``1.0``.
all_ops (bool, optional): Use all operations (including brightness, contrast, color and sharpness). Default is ``True``.
interpolation (InterpolationMode, optional): Desired interpolation enum defined by
:class:`torchvision.transforms.InterpolationMode`. Default is ``InterpolationMode.NEAREST``.
If input is Tensor, only ``InterpolationMode.NEAREST``, ``InterpolationMode.BILINEAR`` are supported.
fill (sequence or number, optional): Pixel fill value for the area outside the transformed
Expand Down