add automatic feature type dispatch to functional transforms #5323
Conversation
The dispatcher approach looks interesting but rather complicated to me. I also wonder why we could not use class inheritance? Looks like the majority of the ops follow this pattern:

class GenericTransform:
    def _op_image(self, input: features.Image, **kwargs) -> features.Image:
        output = self.op_image(input, **kwargs)
        return features.Image.new_like(input, output)

    def _op_bounding_box(self, input: features.BoundingBox, **kwargs) -> features.BoundingBox:
        intermediate_format = BoundingBoxFormat.XYXY
        converted_input = F.convert_bounding_box_format(input, old_format=input.format, new_format=intermediate_format)
        output = self.op_bbox(converted_input, image_size=input.image_size, **kwargs)
        output = F.convert_bounding_box_format(output, old_format=intermediate_format, new_format=input.format)
        return features.BoundingBox.new_like(input, output)

    def _op_segm_mask(self, input, **kwargs):
        output = self.op_segm_mask(input, **kwargs)
        return features.SegmentationMask.new_like(input, output)

Using that we may reduce the size of the codebase. For specific ops, we can introduce specific classes. There could be more transparency, IMO.
Let's split "developers" into three roles: downstream library maintainers, contributors to torchvision and torchvision maintainers.
For me this is still an advantage for two main reasons:
The idea was to have the dispatch functionality available at the functional level. For example,
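A minimal sketch of what dispatch at the functional level could look like, using functools.singledispatch purely for illustration (the PR ships its own dispatch mechanism); horizontal_flip and the per-type kernels below are illustrative, not the actual torchvision implementations:

from functools import singledispatch

import torch
from torchvision.prototype import features


@singledispatch
def horizontal_flip(input):
    raise TypeError(f"horizontal_flip is not supported for {type(input).__name__}")


@horizontal_flip.register
def _(input: features.Image) -> features.Image:
    # flip the width dimension and re-wrap the plain tensor result as an Image
    return features.Image.new_like(input, input.flip(-1))


@horizontal_flip.register
def _(input: features.SegmentationMask) -> features.SegmentationMask:
    return features.SegmentationMask.new_like(input, input.flip(-1))


# the same functional name works for every registered feature type
horizontal_flip(features.Image(torch.rand(3, 32, 32)))
horizontal_flip(features.SegmentationMask(torch.zeros(1, 32, 32)))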
Transparency with respect to what? How the dispatch works?
Here is a code example with inheritance:

from torchvision.prototype.transforms.functional import *
from torchvision.prototype import features
import torchvision.prototype.transforms.functional as F


class Dispatcher:
    def __init__(self, op_image, op_bbox, op_segm_mask):
        self.op_image = op_image
        self.op_bbox = op_bbox
        self.op_segm_mask = op_segm_mask

    def __call__(self, input, **kwargs):
        if isinstance(input, features.Image):
            return self._op_image(input, **kwargs)
        if isinstance(input, features.BoundingBox):
            return self._op_bbox(input, **kwargs)
        if isinstance(input, features.SegmentationMask):
            return self._op_segm_mask(input, **kwargs)

    def _op_image(self, input, **kwargs):
        print("Call Dispatcher._op_image")
        output = self.op_image(input, **kwargs)
        return output

    def _op_bbox(self, input, **kwargs):
        print("Call Dispatcher._op_bbox")
        output = self.op_bbox(input, **kwargs)
        return output

    def _op_segm_mask(self, input, **kwargs):
        print("Call Dispatcher._op_segm_mask")
        output = self.op_segm_mask(input, **kwargs)
        return output


#### functional ops module
def resize_image(*args, **kwargs):
    print("Call F.resize_image")
    return None


def resize_bbox(*args, **kwargs):
    print("Call F.resize_bbox")
    return None


def resize_segm_mask(*args, **kwargs):
    print("Call F.resize_segm_mask")
    return None
####


class ResizeFunctionalOp(Dispatcher):
    def __init__(self):
        super().__init__(resize_image, resize_bbox, resize_segm_mask)


resize = ResizeFunctionalOp()
print("---")

import torch

image_item = features.Image(torch.rand(3, 32, 32))
bbox_item = features.BoundingBox(torch.randint(0, 32, size=(10, 4)), image_size=(32, 32), format="xyxy")
segm_mask_item = features.SegmentationMask(torch.rand(3, 32, 32))

resize(image_item)
print("---")
resize(bbox_item)
print("---")
resize(segm_mask_item)
print("---")
@vfdev-5 Thanks for the clarification, I misunderstood what you were referring to. That is indeed a possibility that would achieve feature parity with the upside of eliminating the
I think we should avoid autogenerating code because it adds a lot of extra complexity. Inheritance sounds like an idea worth investigating.
Also, perhaps if we can make all kernels follow the same pattern:

@MY_METHOD.implements(features.Image, pil_kernel=_F.MY_METHOD)
def _MY_METHOD_image(input: features.Image, *, ...) -> features.Image:
    output = F.MY_METHOD(input, ...)
    return features.Image.new_like(input, output)

Then we could also eliminate their method bodies:

@MY_METHOD.implements(features.Image, pil_kernel=_F.MY_METHOD, tensor_kernel=F.MY_METHOD)
def _MY_METHOD_image(input: features.Image, *, ...) -> features.Image: pass

If that's true, then perhaps we don't even need those intermediate per-type kernels and all we need is to define these directly as arguments to the @Dispatcher annotation. I don't know if that's possible; let me know your thoughts.
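As a hedged sketch of that last idea, the per-type kernels could be passed straight to a decorator that generates the dispatch, so no intermediate method bodies are needed. The dispatcher decorator below is illustrative rather than torchvision's implementation, and resize_image, resize_bbox and resize_segm_mask refer to the stub kernels from the inheritance example earlier in this thread:

from torchvision.prototype import features


def dispatcher(type_to_kernel):
    # illustrative decorator: route the call to the kernel registered for the
    # input's feature type; the decorated function only contributes its name
    def wrap(stub):
        def dispatch(input, **kwargs):
            for feature_type, kernel in type_to_kernel.items():
                if isinstance(input, feature_type):
                    return kernel(input, **kwargs)
            raise TypeError(f"{stub.__name__} does not support {type(input).__name__}")

        return dispatch

    return wrap


@dispatcher({
    features.Image: resize_image,
    features.BoundingBox: resize_bbox,
    features.SegmentationMask: resize_segm_mask,
})
def resize(input, **kwargs):
    ...  # public signature only; this body never runs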
I've completely removed the auto-generation and moved the dispatchers to the respective kernels. Please have another look.
@pmeier thanks for the updates, looks good to me.
Left a few nits and comments.
Conflicts: torchvision/prototype/transforms/functional/_geometry.py
I love the new proposal. Prior to merging we need to close 3 things:
- Clarify what happened to the default values in all high-level kernels and why the low-level kernels retain them; I would expect the opposite.
- I propose to remove inplace from all kernels to align with the general policy of not modifying the input. This is not related to your API and admittedly is something I wasn't convinced about previously (see Adding Mixup and Cutmix #4379 (comment); @fmassa I hope you feel vindicated).
- Resolve the conflict with the main branch.
More comments, since you pushed changes right before I submitted my previous ones.
After some offline discussion with @datumbox we are not sure how to handle the documentation of the dispatchers. In the regular case it is fine to just link the low-level kernels. But there are dispatchers that do not call the low-level kernels directly but rather go through an intermediate layer. This intermediate layer is very thin and only performs some meta data access and maybe parameter mapping. For example, this is how the low-level kernel is called:

def _horizontal_flip_bounding_box(input: features.BoundingBox) -> torch.Tensor:
    return kernels.horizontal_flip_bounding_box(input, format=input.format, image_size=input.image_size)

This is fine from a functionality standpoint, but we need to document this somehow. Otherwise, if we just link the low-level kernel, the intermediate layer stays invisible to users. We came up with two solutions: (1) make the intermediate layer public and document it like any other kernel, or (2) keep it private and put the extra documentation on the high-level kernel.
Thoughts?
I'm in favour of option 1 because it provides a clean policy and reuses standard documentation tools in Python. The policy is simple:
Option 2 has its pros (it doesn't expose yet another kernel to the users), but I feel that by making it private you hide information from the users, and they have to figure out what is going on and why the low-level kernel isn't used directly. To answer that you need to bring more documentation to the high-level kernel, describe parameters, etc. This can create a disconnect between the documentation of the high-level kernel and the parameters of the intermediate one.

Another option might be to remove the dispatcher altogether for the cases where we can't use it seamlessly and instead do the handling inside the body of the high-level kernel. This way we use the dispatcher for the 90% of easy dispatches, and for the difficult 10% we do it manually. This avoids making the dispatcher too complex for corner cases while at the same time eliminating unnecessary boilerplate code for the majority of cases.

Edit: Also I think that the
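A hedged sketch of that last idea, handling a hard case manually inside the body of the high-level kernel while the easy cases keep using the dispatcher. The flip math mirrors the bounding box kernel quoted just below, assumes xyxy boxes, and _dispatch is a placeholder name for the generic dispatcher:

import torch
from torchvision.prototype import features


def _dispatch(input, **kwargs):
    # placeholder for the generic dispatcher that covers the easy ~90% of cases
    raise NotImplementedError


def horizontal_flip(input, **kwargs):
    if isinstance(input, features.BoundingBox):
        # hard case handled manually: read the meta data off the feature and
        # apply the flip directly instead of going through the dispatcher
        shape = input.shape
        output = input.clone().view(-1, 4)
        output[:, [0, 2]] = input.image_size[1] - output[:, [0, 2]]
        return features.BoundingBox.new_like(input, output.view(shape))
    # every other feature type keeps going through the generic dispatcher
    return _dispatch(input, **kwargs)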
    bounding_box = bounding_box.view(-1, 4)
    bounding_box[:, [0, 2]] = image_size[1] - bounding_box[:, [0, 2]]
    return bounding_box.view(shape)

def _horizontal_flip_bounding_box(input: features.BoundingBox) -> torch.Tensor:
@pmeier why does this one return torch.Tensor, while in _resize_bounding_box the output is features.BoundingBox and the wrapping is explicit?
I think that might actually be a bug. It should return a BoundingBox, not a Tensor.
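A possible shape of that fix (a sketch, not necessarily the merged code): keep the same kernel call but wrap the output explicitly, mirroring _resize_bounding_box; features and kernels are the same namespaces as in the snippet quoted earlier:

def _horizontal_flip_bounding_box(input: features.BoundingBox) -> features.BoundingBox:
    output = kernels.horizontal_flip_bounding_box(input, format=input.format, image_size=input.image_size)
    # wrap the plain tensor result so callers get back a BoundingBox
    return features.BoundingBox.new_like(input, output)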
The code looks good to me; only 2 nits, and we could consider it a final candidate for the solution. I would recommend merging it to the feature branch (to avoid making this PR longer) and following up with alternatives on top of it in new PRs.
@pmeier I noticed a few more improvements we should make in a new PR. Please check below and let me know what you think.
    return cls.new_like(args[0], output, dtype=output.dtype, device=output.device)
else:
    return output
Not the right place to put the comment, but GitHub won't let me comment on the right spot. I think Feature is exposed publicly in the __init__.py file of the area. Given it's an internal class (unlike Image, BoundingBox, etc.), I think it's worth keeping it private.
Although this does not need to be supported in the first version, Feature should not be an internal class. We want users to be able to create their own custom features if it is useful for their use case.
It should be private for now.
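For illustration only, keeping it private for now could simply mean not advertising the class in the package's public namespace; the module and file names below are assumptions rather than the actual prototype layout:

# torchvision/prototype/features/__init__.py (hypothetical layout)
from ._bounding_box import BoundingBox
from ._feature import Feature  # still importable for internal use
from ._image import Image

# Feature is deliberately left out of the advertised public API
__all__ = ["BoundingBox", "Image"]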
Summary:
* revamp prototype features (#5283)
* remove decoding from prototype datasets (#5287)
* remove decoder from prototype datasets
* remove unused imports
* cleanup
* fix readme
* use OneHotLabel in SEMEION
* improve voc implementation
* revert unrelated changes
* fix semeion mock data
* fix pcam
* readd functional transforms API to prototype (#5295)
* readd functional transforms
* cleanup
* add missing imports
* remove __torch_function__ dispatch
* readd repr
* readd empty line
* add test for scriptability
* remove function copy
* change import from functional tensor transforms to just functional
* fix import
* fix test
* fix prototype features and functional transforms after review (#5377)
* fix prototype functional transforms after review
* address features review
* make mypy more strict on prototype features
* make mypy more strict for prototype transforms
* fix annotation
* fix kernel tests
* add automatic feature type dispatch to functional transforms (#5323)
* add auto dispatch
* fix missing arguments error message
* remove pil kernel for erase
* automate feature specific parameter detection
* fix typos
* cleanup dispatcher call
* remove __torch_function__ from transform dispatch
* remove auto-generation
* revert unrelated changes
* remove implements decorator
* change register parameter order
* change order of transforms for readability
* add documentation for __torch_function__
* fix mypy
* inline check for support
* refactor kernel registering process
* refactor dispatch to be a regular decorator
* split kernels and dispatchers
* remove sentinels
* replace pass with ...
* appease mypy
* make single kernel dispatchers more concise
* make dispatcher signatures more generic
* make kernel checking more strict
* revert doc changes
* address Franciscos comments
* remove inplace
* rename kernel test module
* fix inplace
* remove special casing for pil and vanilla tensors
* address comments
* update docs
* cleanup features / transforms feature branch (#5406)
* mark candidates for removal
* align signature of resize_bounding_box with corresponding image kernel
* fix documentation of Feature
* remove interpolation mode and antialias option from resize_segmentation_mask
* remove or privatize functionality in features / datasets / transforms

Reviewed By: sallysyw
Differential Revision: D34265747
fbshipit-source-id: 569ed9f74ac0c026391767c3b422ca0147f55ead
This PR adds support for dispatching different Feature's based on their type to the functional API. This effectively separates the API into three parts:
- Low-level kernels that perform the actual transformations on plain tensors.
- Mid-level functions that each operate on a specific Feature and dispatch to the respective low-level kernel. They also handle converting the meta data into the required format as well as passing additional meta data to the kernel. This part of the API is not user facing and should not be called manually.
- High-level functions that accept any supported Feature's and dispatch them to the mid-level API.

I'll illustrate this through an example with inline comments.
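A hedged sketch of those three layers, using the horizontal flip of bounding boxes referenced elsewhere in this PR; the signatures and wrapping details are illustrative rather than the exact merged code:

import torch
from torchvision.prototype import features


# low level: plain tensors in and out, all meta data passed explicitly
def horizontal_flip_bounding_box(bounding_box: torch.Tensor, *, format: str, image_size) -> torch.Tensor:
    # format is accepted for parity with the real kernel; this sketch assumes xyxy boxes
    shape = bounding_box.shape
    bounding_box = bounding_box.clone().view(-1, 4)
    bounding_box[:, [0, 2]] = image_size[1] - bounding_box[:, [0, 2]]
    return bounding_box.view(shape)


# mid level: bound to one Feature type; reads the meta data off the feature,
# forwards it to the low-level kernel, and re-wraps the plain tensor output
def _horizontal_flip_bounding_box(input: features.BoundingBox) -> features.BoundingBox:
    output = horizontal_flip_bounding_box(input, format=input.format, image_size=input.image_size)
    return features.BoundingBox.new_like(input, output)


# high level: user-facing entry point that dispatches on the feature type
def horizontal_flip(input):
    if isinstance(input, features.BoundingBox):
        return _horizontal_flip_bounding_box(input)
    raise TypeError(f"horizontal_flip is not supported for {type(input).__name__}")


bbox = features.BoundingBox(torch.tensor([[2.0, 4.0, 10.0, 20.0]]), format="xyxy", image_size=(32, 32))
flipped = horizontal_flip(bbox)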
Implementing the dispatch logic, i.e. the mid- and high-level API explained above, involves a lot of boilerplate code. Thus, I've opted to autogenerate it from the kernels and a minimal accompanying configuration file. I'm well aware that autogenerated code is harder on the user, so I tried to strike a balance. I wanted to avoid having only a configuration file for the user to see, because that makes it a lot harder to implement transformations outside of torchvision. At the same time, I wanted to avoid manually written, error-prone code and documentation that we need to maintain.
My solution to this is to track the generated file in version control. This way, the full implementation is out in the open for everyone to see, but we don't have to write it manually. Let me know what you think.