
Current limitation on transforms #3224

Open
@voldemortX

The torchvision transforms now have two backends (PIL and Tensor). Below are some functional mismatches between them, along with some potentially useful features that neither of them supports. Details are listed in transforms.py and functional.py.

Supported by PIL but not Tensor:

  1. Fill value for pad and random crop.
  2. Tensor images do not support many image modes because tensors carry no mode metadata (this may not be possible to address). This affects, for instance, the adjust_* functions and the AutoAugment-related functions.
  3. Tensors only support 3 interpolation modes (nearest, bilinear, bicubic).
  4. Tensors only support transformations on RGB images.
  5. Crop with a crop size larger than the original image (#3297). Solved by #3333.
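
To illustrate item 1, here is a minimal sketch of a possible workaround for padding a tensor image with a fill value. It uses `torch.nn.functional.pad` directly rather than any torchvision API, and `pad_with_fill` is a hypothetical helper name, not something from the library:

```python
import torch
import torch.nn.functional as TF


def pad_with_fill(img: torch.Tensor, padding: int, fill: float) -> torch.Tensor:
    """Pad a (C, H, W) tensor image on all four sides with a constant value.

    Hypothetical helper for illustration only; not part of torchvision.
    """
    # The pad tuple is ordered from the last dimension backwards:
    # (left, right, top, bottom).
    pad = (padding, padding, padding, padding)
    return TF.pad(img, pad, mode="constant", value=fill)


img = torch.zeros(3, 4, 4)
padded = pad_with_fill(img, padding=2, fill=0.5)
print(padded.shape)
```

This only covers a single scalar fill; a per-channel fill, as PIL allows with an RGB tuple, would need extra handling.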

Supported by Tensor but not PIL:

  1. Normalize (probably of no use for PIL images).
  2. Erase.

Supported by neither:

  1. adjust_gamma() and adjust_hue() do not support images with transparency.
  2. Subpixel translations (see #3293, "Affine Transform: why is translate a list[int] when the code suggests it could be floating point?").

Not supported by TorchScript (mostly not possible given the current JIT support):

  1. Single-value inputs in Pad(fill), RandomCrop(padding), Resize(size), RandomResizedCrop(size).
  2. PIL and Tensor conversions.
  3. Compose, RandomOrder, RandomChoice.
  4. Lambda.

This is just a draft; let me know if I forgot anything. cc @vfdev-5 @datumbox
