
add automatic feature type dispatch to functional transforms #5323


Merged

Conversation

@pmeier (Collaborator) commented Jan 31, 2022

This PR adds support for dispatching to the functional API based on the type of the input Feature. This effectively separates the API into three parts:

  • low level: this part includes the kernels that only take vanilla tensors; the user is responsible for passing inputs of the correct type. This was explained and added in readd functional transforms API to prototype #5295.
  • mid level: these functions take a single Feature and dispatch to the respective low level kernel. They also handle converting the metadata into the required format as well as passing additional metadata to the kernel. This part of the API is not user facing and should not be called manually.
  • high level: these functions take any Feature and dispatch it to the mid level API.

I'll illustrate this through an example with inline comments.
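To make the three levels concrete, here is a minimal sketch for a horizontal flip. The kernel and dispatcher names are simplified assumptions; the actual generated code handles more feature types and metadata:

import torch
from torchvision.prototype import features

# low level: a kernel that only sees a vanilla tensor
def horizontal_flip_image(image: torch.Tensor) -> torch.Tensor:
    return image.flip(-1)

# mid level: takes a single Feature, calls the kernel, and re-wraps the output
def _horizontal_flip_image(input: features.Image) -> features.Image:
    output = horizontal_flip_image(input)
    return features.Image.new_like(input, output)

# high level: takes any supported Feature and dispatches to the mid level function
def horizontal_flip(input):
    if isinstance(input, features.Image):
        return _horizontal_flip_image(input)
    raise TypeError(f"horizontal_flip is not implemented for {type(input).__name__}")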


Implementing the dispatch logic, i.e. the mid and high level API explained above, involves a lot of boilerplate code. Thus, I've opted to autogenerate it from the kernels and a minimal accompanying configuration file. I'm well aware that autogenerated code is harder on the user so I tried to strike a balance. I wanted to avoid having only a configuration file for the user to see, because that makes it a lot harder to implement transformations outside of torchvision. At the same time I wanted to avoid manually written, error-prone code and documentation that we need to maintain.

My solution to this is to track the generated file in version control. This way, the full implementation is out in the open for everyone to see, but we don't have to write it manually. Let me know what you think.

@facebook-github-bot commented Jan 31, 2022

💊 CI failures summary and remediations

As of commit c7785b0 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

@vfdev-5 (Collaborator) commented Feb 1, 2022

The dispatcher approach looks interesting and rather complicated to me.
Given the meta-language introduced in the YAML plus a script that generates the final Python module, I wonder how quickly we could solve future issues. Basically, the question is: how developer-friendly is scripts/regenerate_transform_dispatch.py?

I also wonder why we could not use class inheritance. It looks like the majority of the ops would fit a pattern like:

class GenericTransform:

    def _op_image(self, input: features.Image, **kwargs) -> features.Image:
        output = self.op_image(input, **kwargs)

        return features.Image.new_like(input, output)

    def _op_bounding_box(self, input: features.BoundingBox, **kwargs) -> features.BoundingBox:
        intermediate_format = BoundingBoxFormat.XYXY
        converted_input = F.convert_bounding_box_format(input, old_format=input.format, new_format=intermediate_format)

        output = self.op_bbox(converted_input, image_size=input.image_size, **kwargs)
        output = F.convert_bounding_box_format(output, old_format=intermediate_format, new_format=input.format)

        return features.BoundingBox.new_like(input, output)

    def _op_segm_mask(self, input, **kwargs):
        output = self.op_segm_mask(input, **kwargs)

        return features.SegmentationMask.new_like(input, output)

Using that, we may reduce the size of the codebase. For specific ops, we can introduce specific subclasses. There could be more transparency, IMO.
What do you think?

@pmeier (Collaborator, Author) commented Feb 1, 2022

@vfdev-5

Given the meta-language introduced in the YAML plus a script that generates the final Python module, I wonder how quickly we could solve future issues. Basically, the question is: how developer-friendly is scripts/regenerate_transform_dispatch.py?

Let's split "developers" into three roles: downstream library maintainers, contributors to torchvision and torchvision maintainers.

  • downstream library maintainers: since the generated file is tracked in git, they don't need to know about the auto-generation. If they want to add a new transform of their own, they can simply use the contents of _dispatch.py as templates and alter them to fit their needs.
  • contributors to torchvision: contributing a new kernel is almost the same as before. The only extra step is to add an entry to the dispatch.yaml file. I feel this is on par with, if not easier than, adapting the dispatcher manually.
  • torchvision maintainers: IMO this is the only "controversial" bit. The code is dense and currently lacks comments. Even when the documentation is sufficient, the implementation is still complex and will take time to digest.

For me this is still an advantage for two main reasons:

  1. Given that we will probably add quite a few kernels in the near future to support bounding boxes, segmentation masks, and so on, auto-generating the dispatch will pay off quickly in terms of time invested.
  2. The documentation of the dispatchers is currently just a placeholder, but in the future it should link all kernels and all extra processing that happens before the kernel is called. If we autogenerate the dispatch, this information is already available and we could also autogenerate the docstrings. This would keep the documentation consistent across the board.

I also wonder why we could not use class inheritence ?

The idea was to have the dispatch functionality available at the functional level. For example, F.horizontal_flip(input) should work regardless of whether input is an Image or a BoundingBox. With your approach we would lose that.
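In other words, the intended usage is roughly the following (a sketch, using the prototype feature constructors):

import torch
from torchvision.prototype import features
import torchvision.prototype.transforms.functional as F

image = features.Image(torch.rand(3, 32, 32))
bbox = features.BoundingBox(torch.tensor([[0, 0, 10, 10]]), image_size=(32, 32), format="xyxy")

F.horizontal_flip(image)  # dispatches to the image kernel
F.horizontal_flip(bbox)   # dispatches to the bounding box kernel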

There could be more transparency, IMO.

Transparency with respect to what? How the dispatch works?

@vfdev-5 (Collaborator) commented Feb 3, 2022

The idea was to have the dispatch functionality available at the functional level. For example, F.horizontal_flip(input) should work regardless of whether input is an Image or a BoundingBox. With your approach we would lose that.

In this code, F.horizontal_flip is not a function object but an F.utils.Dispatcher (check type(F.horizontal_flip)). So basically you can do the same thing with inheritance:

Example with inheritance
from torchvision.prototype.transforms.functional import *
from torchvision.prototype import features
import torchvision.prototype.transforms.functional as F


class Dispatcher:

    def __init__(self, op_image, op_bbox, op_segm_mask):
        self.op_image = op_image
        self.op_bbox = op_bbox
        self.op_segm_mask = op_segm_mask

    def __call__(self, input, **kwargs):
        if isinstance(input, features.Image):
            return self._op_image(input, **kwargs)
        if isinstance(input, features.BoundingBox):
            return self._op_bbox(input, **kwargs)
        if isinstance(input, features.SegmentationMask):
            return self._op_segm_mask(input, **kwargs)

    def _op_image(self, input, **kwargs):
        print("Call Dispatcher._op_image")
        output = self.op_image(input, **kwargs)
        return output

    def _op_bbox(self, input, **kwargs):
        print("Call Dispatcher._op_bbox")
        output = self.op_bbox(input, **kwargs)
        return output

    def _op_segm_mask(self, input, **kwargs):
        print("Call Dispatcher._op_bbox")
        output = self.op_segm_mask(input, **kwargs)
        return output

#### functional ops module

def resize_image(*args, **kwargs):
    print("Call F.resize_image")
    return None


def resize_bbox(*args, **kwargs):
    print("Call F.resize_bbox")
    return None


def resize_segm_mask(*args, **kwargs):
    print("Call F.resize_segm_mask")
    return None

####

class ResizeFunctionalOp(Dispatcher):

    def __init__(self):
        super().__init__(resize_image, resize_bbox, resize_segm_mask)


resize = ResizeFunctionalOp()

print("---")

import torch

image_item = features.Image(torch.rand(3, 32, 32))
bbox_item = features.BoundingBox(torch.randint(0, 32, size=(10, 4)), image_size=(32, 32), format="xyxy")
segm_mask_item = features.SegmentationMask(torch.rand(3, 32, 32))

resize(image_item)
print("---")

resize(bbox_item)
print("---")

resize(segm_mask_item)
print("---")

@pmeier (Collaborator, Author) commented Feb 3, 2022

@vfdev-5 Thanks for the clarification, I misunderstood what you were referring to. That is indeed a possibility that would achieve feature parity, with the upside of eliminating the __torch_function__ complexity. Let me implement a version of that in this PR.

@datumbox (Contributor) left a comment

I think we should avoid autogenerating code because it adds a lot of extra complexity. Inheritance sounds like an idea worth investigating.

Also, perhaps we can make all kernels follow the same pattern:

@MY_METHOD.implements(features.Image, pil_kernel=_F.MY_METHOD)
def _MY_METHOD_image(input: features.Image, *, ...) -> features.Image:
    output = F.MY_METHOD(input, ...)

    return features.Image.new_like(input, output)

Then we could eliminate also their method bodies:

@MY_METHOD.implements(features.Image, pil_kernel=_F.MY_METHOD, tensor_kernel=F.MY_METHOD)
def _MY_METHOD_image(input: features.Image, *, ...) -> features.Image: pass

If that's true, then perhaps we don't even need those intermediate per-type kernels, and all we need is to define these directly as arguments to the @Dispatcher annotation. I don't know if that's possible; let me know your thoughts.
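For illustration, a rough sketch of how an implements-style registration could work; the names, signatures, and the handling of the extra kernel arguments are assumptions, not the implementation in this PR:

from torchvision.prototype import features

class Dispatcher:
    def __init__(self, name):
        self.name = name
        self._kernels = {}  # maps feature type -> registered kernel

    def implements(self, feature_type, **extra_kernels):
        # extra kernels (e.g. pil_kernel=...) are ignored in this sketch
        def decorator(fn):
            self._kernels[feature_type] = fn
            return fn
        return decorator

    def __call__(self, input, **kwargs):
        kernel = self._kernels.get(type(input))
        if kernel is None:
            raise TypeError(f"{self.name} has no kernel registered for {type(input).__name__}")
        return kernel(input, **kwargs)

horizontal_flip = Dispatcher("horizontal_flip")

@horizontal_flip.implements(features.Image)
def _horizontal_flip_image(input: features.Image) -> features.Image:
    output = input.flip(-1)  # stand-in for the low level image kernel
    return features.Image.new_like(input, output)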

@pmeier (Collaborator, Author) left a comment

I've completely removed the auto-generation and moved the dispatchers to the respective kernels. Please have another look.

@pmeier requested review from vfdev-5 and datumbox on February 9, 2022 22:01
@vfdev-5 (Collaborator) left a comment

@pmeier thanks for the updates, looks good to me.
Left a few nits and comments.

@datumbox (Contributor) left a comment

I love the new proposal. Prior to merging we need to close 3 things:

  1. Clarify what happened to the default values in the high-level kernels and why the low-level kernels retain them. I would expect the opposite.
  2. I propose to remove inplace from all kernels to align with the general policy of not modifying input. This is not related to your API and admittedly is something I wasn't convinced about previously (see Adding Mixup and Cutmix #4379 (comment); @fmassa I hope you feel vindicated).
  3. Resolve the conflict with the main branch.

@datumbox (Contributor) left a comment

More comments, since you pushed changes right before I submitted my previous ones.

@pmeier (Collaborator, Author) commented Feb 10, 2022

After some offline discussion with @datumbox, we are not sure how to handle the documentation of the dispatchers. In the regular case it is fine to just link the low-level kernels. But there are dispatchers that do not call the low-level kernels directly but rather an intermediate layer.

This intermediate layer is very thin and only performs some metadata access and maybe parameter mapping. For example, the low-level kernel horizontal_flip_bounding_box needs format and image_size since they are not available on vanilla tensors. The dispatcher, however, has access to a BoundingBox feature which stores them. Thus, we have this intermediate function:

def _horizontal_flip_bounding_box(input: features.BoundingBox) -> torch.Tensor:
    return kernels.horizontal_flip_bounding_box(input, format=input.format, image_size=input.image_size)

This is fine from a functionality standpoint, but we need to document it somehow. Otherwise, if we just link kernels.horizontal_flip_bounding_box, the user might (rightfully) assume that they need to pass format and image_size explicitly.
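To make the contrast concrete, a sketch of the two call sites (the import paths and the exact kernel signature are assumptions based on the snippet above):

import torch
from torchvision.prototype import features
from torchvision.prototype.transforms import functional as F, kernels  # assumed import paths

bbox = features.BoundingBox(torch.tensor([[0, 0, 10, 10]]), image_size=(32, 32), format="xyxy")

# low-level kernel: the caller passes the metadata explicitly
out = kernels.horizontal_flip_bounding_box(bbox, format=bbox.format, image_size=bbox.image_size)

# dispatcher: the metadata is read off the BoundingBox feature internally
out = F.horizontal_flip(bbox)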

We came up with two solutions:

  1. Make the intermediate layer public, document its behavior and link it from the dispatcher.
  2. Put the documentation of the intermediate layer into the documentation of the dispatcher and keep it private.

Thoughts?

@datumbox (Contributor) commented Feb 10, 2022

I'm in favour of option 1 because it provides a clean policy and reuses standard documentation tools in Python. The policy is simple:

  • The dispatcher documentation always links to the subscribed kernels for info on arguments.
  • When the intermediate kernel IS the subscribed kernel, we add the documentation of its parameters there. We also describe that it uses the low-level kernel and provide a link.
  • When the low-level kernel is subscribed, we just link to it as above.
  • No special cases, no need for extra handling.

Option 2 has its pros (it doesn't expose yet another kernel to the users), but I feel that by making it private you hide information from the users, and they have to figure out what is going on and why the low-level kernel isn't used directly. To answer that you need to bring more documentation to the high-level kernel, describe parameters, etc. This can create a disconnect between the documentation of the high-level kernel and the params of the intermediate one.

Another option might be to drop the dispatcher for the cases where we can't use it seamlessly, and instead do the handling inside the body of the high-level kernel. This way we use the dispatcher for the 90% of easy dispatches and for the difficult 10% we do it manually. This avoids making the dispatcher too complex for corner cases while at the same time eliminating unnecessary boilerplate code for the majority of cases.
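A sketch of what that manual handling could look like for one of the difficult cases; kernels.horizontal_flip_bounding_box follows the snippet earlier in the thread, while the import path and the exact signature are assumptions:

from torchvision.prototype import features
from torchvision.prototype.transforms import kernels  # assumed import path

def horizontal_flip(input):
    # the "difficult" case is handled manually inside the high-level function body
    if isinstance(input, features.BoundingBox):
        output = kernels.horizontal_flip_bounding_box(
            input, format=input.format, image_size=input.image_size
        )
        return features.BoundingBox.new_like(input, output)
    # the easy cases (Image, SegmentationMask, ...) would go through the regular dispatcher here
    raise TypeError(f"horizontal_flip is not implemented for {type(input).__name__}")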

Edit: Also, I think that the _resize_bounding_box kernel is a much better example to look at, because it clearly shows that sometimes more massaging is needed before calling the low-level kernel.

    bounding_box = bounding_box.view(-1, 4)
    bounding_box[:, [0, 2]] = image_size[1] - bounding_box[:, [0, 2]]
    return bounding_box.view(shape)

def _horizontal_flip_bounding_box(input: features.BoundingBox) -> torch.Tensor:
Collaborator left a comment

@pmeier why does this one return torch.Tensor, while in _resize_bounding_box the output is features.BoundingBox and the wrapping is explicit?

Contributor left a comment

I think that might actually be a bug. It should return a BoundingBox, not a Tensor.

@datumbox (Contributor) left a comment

The code looks good to me, only 2 nits, and we could consider it the final candidate for a solution. I would recommend merging it into the feature branch (to avoid making this PR longer) and following up with alternatives in new PRs.

@pmeier merged commit ce289d9 into pytorch:revamp-prototype-features-transforms on Feb 10, 2022
@pmeier deleted the transforms/dispatch branch on February 10, 2022 18:02
@datumbox (Contributor) left a comment

@pmeier I noticed a few more improvements we should bring on a new PR. Please check below and let me know what you think.

    return cls.new_like(args[0], output, dtype=output.dtype, device=output.device)
else:
    return output

Contributor left a comment

Not the right place to put the comment, but GitHub won't let me comment on the right spot. I think Feature is exposed publicly in the __init__.py file of the area. Given that it's an internal class (unlike Image, BoundingBox, etc.), I think it's worth keeping it private.

@pmeier (Collaborator, Author) left a comment

Although this does not need to be supported in the first version, Feature should not be an internal class. We want users to be able to create their own custom features if it is useful for their use case.

Contributor left a comment

It should be private for now.

datumbox pushed a commit that referenced this pull request Feb 11, 2022
* revamp prototype features (#5283)

* remove decoding from prototype datasets (#5287)

* remove decoder from prototype datasets

* remove unused imports

* cleanup

* fix readme

* use OneHotLabel in SEMEION

* improve voc implementation

* revert unrelated changes

* fix semeion mock data

* fix pcam

* readd functional transforms API to prototype (#5295)

* readd functional transforms

* cleanup

* add missing imports

* remove __torch_function__ dispatch

* readd repr

* readd empty line

* add test for scriptability

* remove function copy

* change import from functional tensor transforms to just functional

* fix import

* fix test

* fix prototype features and functional transforms after review (#5377)

* fix prototype functional transforms after review

* address features review

* make mypy more strict on prototype features

* make mypy more strict for prototype transforms

* fix annotation

* fix kernel tests

* add automatic feature type dispatch to functional transforms (#5323)

* add auto dispatch

* fix missing arguments error message

* remove pil kernel for erase

* automate feature specific parameter detection

* fix typos

* cleanup dispatcher call

* remove __torch_function__ from transform dispatch

* remove auto-generation

* revert unrelated changes

* remove implements decorator

* change register parameter order

* change order of transforms for readability

* add documentation for __torch_function__

* fix mypy

* inline check for support

* refactor kernel registering process

* refactor dispatch to be a regular decorator

* split kernels and dispatchers

* remove sentinels

* replace pass with ...

* appease mypy

* make single kernel dispatchers more concise

* make dispatcher signatures more generic

* make kernel checking more strict

* revert doc changes

* address Franciscos comments

* remove inplace

* rename kernel test module

* fix inplace

* remove special casing for pil and vanilla tensors

* address comments

* update docs

* cleanup features / transforms feature branch (#5406)

* mark candidates for removal

* align signature of resize_bounding_box with corresponding image kernel

* fix documentation of Feature

* remove interpolation mode and antialias option from resize_segmentation_mask

* remove or privatize functionality in features / datasets / transforms
facebook-github-bot pushed a commit that referenced this pull request Feb 16, 2022

Reviewed By: sallysyw

Differential Revision: D34265747

fbshipit-source-id: 569ed9f74ac0c026391767c3b422ca0147f55ead