Skip to content

Conversation

datumbox
Copy link
Contributor

@datumbox datumbox commented Sep 22, 2022

This PR:

  • Moves all types definitions on a single file, mirroring what we do for funtional
  • Fixes incorrect type definitions on existing transforms
  • Aligns the definitions of fill types across Transforms (excluding the actual default value which would break BC)

Copy link
Contributor Author

@datumbox datumbox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some clarifications:

Comment on lines +111 to +114
if has_any(inputs, PIL.Image.Image, features.BoundingBox, features.Mask, features.Label):
raise TypeError(
f"{type(self).__name__}() does not support PIL images, bounding boxes, masks and plain labels."
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we previously missed PIL images which are not supported by this class.

@@ -223,7 +223,7 @@ def _copy_paste(
# This is something different to TF implementation we introduced here as
# originally the algorithm works on equal-sized data
# (for example, coming from LSJ data augmentations)
size1 = image.shape[-2:]
size1 = cast(List[int], image.shape[-2:])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy started to randomly complain once the image type was defined, so I choose the cast it. Happy to ignore for performance reasons.

@@ -52,7 +52,7 @@ def __init__(self, num_output_channels: Literal[1, 3] = 1) -> None:
super().__init__()
self.num_output_channels = num_output_channels

def _transform(self, inpt: DType, params: Dict[str, Any]) -> DType:
def _transform(self, inpt: ImageType, params: Dict[str, Any]) -> ImageType:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously defined incorrect type.

@@ -68,7 +68,7 @@ def forward(self, *inputs: Any) -> Any:

return super().forward(*inputs)

def _transform(self, inpt: Union[torch.Tensor, features._Feature], params: Dict[str, Any]) -> torch.Tensor:
def _transform(self, inpt: TensorImageType, params: Dict[str, Any]) -> torch.Tensor:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrectly defined features._Feature previously. It should have been features.Image

@datumbox datumbox requested review from vfdev-5 and pmeier September 22, 2022 11:08
Copy link
Collaborator

@vfdev-5 vfdev-5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with the approach. I think we can update fill type handling and avoid mutiple fill = self.fill[type(inpt)] if self.fill is not None else None -> fill = self.fill[type(inpt)]

@datumbox
Copy link
Contributor Author

I'm ok with the approach. I think we can update fill type handling and avoid mutiple fill = self.fill[type(inpt)] if self.fill is not None else None -> fill = self.fill[type(inpt)]

To do this, we need to force self.fill be a dictionary. It's possible but then the public fill and the self.fill will have different types and this will lead to different signatures between internal and public methods. Is this the direction you favour?

@vfdev-5
Copy link
Collaborator

vfdev-5 commented Sep 23, 2022

It's possible but then the public fill and the self.fill

They will already different if you pass a scalar right ?

@datumbox
Copy link
Contributor Author

Sounds good, let me try your proposal to force None into the dictionary

_check_fill_arg(fill)

if isinstance(fill, dict):
return fill

return defaultdict(lambda: fill) # type: ignore[return-value]
return defaultdict(lambda: fill) # type: ignore[return-value, arg-type]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't do the arg-type ignore we get:

torchvision/prototype/transforms/_utils.py:42: error: Argument 1 to
"defaultdict" has incompatible type
"Callable[[], Union[Union[int, float, Sequence[int], Sequence[float], None], Dict[Type[Any], Union[int, float, Sequence[int], Sequence[float], None]]]]";
expected "Optional[Callable[[], Union[float, Sequence[float], None]]]" 
[arg-type]
        return defaultdict(lambda: fill)  # type: ignore[return-value]
                           ^
torchvision/prototype/transforms/_utils.py:42: note: Error code "arg-type" not covered by "type: ignore" comment
Found 1 error in 1 file (checked 222 source files)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is worth investigating. At some point mypy thinks that fill can also be an int. Although that is part of the correct type, it shouldn't be for the actual type. I guess we forgot to update something. @datumbox I can look into this if you want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I've also felt that this was worth noting. I have no idea if this is a failure or mypy or it's seeing something I dont. If you have any input, I would love to get an advice.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a false alarm by mypy right before the line mypy thinks that fill has type

Union[builtins.float, typing.Sequence[builtins.float], None]

Note that for mypy float == Union[int, float] or to illustrate this more: for mypy the following is true while this will fail for Python isinstance(0, float). I'm not sure why they went for this, but generic numbers support is sketchy in general. This is why you don't see the int and Sequence[int] here because for mypy they are subtypes of float and Sequence[float] and therefore redundant.

However, at the time we arrive at the line, mypy suddenly thinks that fill can be a Dict again. I can't find the issue now, but this is longstanding. For some reason mypy doesn't forward the more concrete type to inner functions like this lambda.

Both

Suggested change
return defaultdict(lambda: fill) # type: ignore[return-value, arg-type]
return defaultdict(lambda: cast(FillType, fill))

and

Suggested change
return defaultdict(lambda: fill) # type: ignore[return-value, arg-type]
return defaultdict(cast(Callable[[], FillType], lambda: fill))

will eliminate both previously ignored errors, but will raise a new redundant-cast error 🙄

Thus, we have to ignore one either way and not having cast in there makes it more readable.

Copy link
Collaborator

@pmeier pmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment inline other LGTM. I'm kinda worried that we need more and more of these workaround to avoid import cycles in the future though.

_check_fill_arg(fill)

if isinstance(fill, dict):
return fill

return defaultdict(lambda: fill) # type: ignore[return-value]
return defaultdict(lambda: fill) # type: ignore[return-value, arg-type]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is worth investigating. At some point mypy thinks that fill can also be an int. Although that is part of the correct type, it shouldn't be for the actual type. I guess we forgot to update something. @datumbox I can look into this if you want.

@datumbox
Copy link
Contributor Author

I'm kinda worried that we need more and more of these workaround to avoid import cycles in the future though.

There is an organic way to solve this, which is leaving the definitions as part of features module (this is the case now on main). Let me give that a try and if that's not what we want, I can revert the commit.

@datumbox datumbox force-pushed the prototype/transform_types branch from a0e1673 to 1d6d12b Compare September 23, 2022 11:17
Comment on lines +235 to +236
InputType = Union[torch.Tensor, PIL.Image.Image, _Feature]
InputTypeJIT = torch.Tensor
Copy link
Contributor Author

@datumbox datumbox Sep 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously DType.

Comment on lines +12 to +13
FillType = Union[int, float, Sequence[int], Sequence[float], None]
FillTypeJIT = Union[int, float, List[float], None]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could move to another file if we don't want it here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is ok there. I would prefer to have everything in a _typing.py file, but as you explained this will give us circular imports due to the method signatures. Thus, let's take the way of the least resistance.

Copy link
Collaborator

@pmeier pmeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a lot better now, thanks a lot @datumbox! One question inline. I'll have a look at the mypy ignore now.

@datumbox datumbox force-pushed the prototype/transform_types branch from 1d6d12b to ad60b50 Compare September 23, 2022 11:22
@datumbox
Copy link
Contributor Author

There seems to be a breakage from TorchData on our tests.

@datumbox datumbox force-pushed the prototype/transform_types branch from 16e00b0 to f3cefdf Compare September 23, 2022 12:22
@datumbox
Copy link
Contributor Author

The tests are now passing. I'll restore the TorchData tests and merge.

@datumbox datumbox merged commit f725901 into pytorch:main Sep 23, 2022
@datumbox datumbox deleted the prototype/transform_types branch September 23, 2022 12:42
facebook-github-bot pushed a commit that referenced this pull request Sep 29, 2022
Summary:
* Align and Clean up transform types

* Move type definitions to `_utils.py`

* fixing error message on tests

* Apply code review suggestions

* Centralizing types and switching to always getting dicts.

* Fixing linter

* Refactoring typing definitions.

* Remove relative imports.

* Reuse type.

* Temporarily remove the TorchData tests.

* Restore the TorchData tests.

Reviewed By: YosuaMichael

Differential Revision: D39885420

fbshipit-source-id: bcc71cdea6dfd797eb8ac27ee9cb49deaf4dd95f

Co-authored-by: vfdev <[email protected]>
Co-authored-by: vfdev <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants