Currently, we have a lot of functionals that annotate a parameter with `List[int]`:

vision/torchvision/prototype/transforms/functional/_geometry.py
Lines 106 to 108 in ae83c9f

This annotation is wrong: the functions actually handle `Union[int, Tuple[int], Tuple[int, int], List[int]]`, i.e. integers as well as tuples and lists of one or two integers. Our preferred annotation for this is `Union[int, Tuple[int, int]]`.

The wrong annotation is a remnant from a time when JIT didn't support `Union`. The eager kernel supported integers all along; to emulate this behavior, we allowed sequences of one element to mean the same thing.

Our preferred annotation `Union[int, Tuple[int, int]]` communicates our intention a lot better: you can pass either one integer or two of them. Since this is Python, one can of course still use a list or any other sequence of two elements as a duck type for `Tuple[int, int]`. However, this no longer works with JIT. This discussion came up because we trusted the wrong annotations while developing, which led to bugs (#6636). Plus, in #6646 we added some metadata computation to the kernels, and there was a discussion about whether we should have different input (`List[int]`) vs. output (`Tuple[int, int]`) types. With our preferred annotation, the input type `Union[int, Tuple[int, int]]` is still more diverse, but it is a common idiom to allow the input to be less precise than the output.

Our prototype functional API is separated into two things: kernels and dispatchers. The actual computation happens in the former, while the latter only dispatches the input to the right kernel based on its type. The stable API has only the dispatchers, which gives us the freedom to choose what we want to do with the kernels without BC concerns.

Thus, before we roll out, we need to decide whether we want to continue with the wrong annotations when they are no longer needed, or break consistency between kernels and dispatchers.
Status quo
Currently, we are using the same wrong annotation for kernel and dispatcher. That means the elaborate parsing of the parameter has to happen on the kernel:
Proposal

We could also annotate the kernel correctly and perform the parsing only on the dispatchers.

The reason we keep the wrong annotations on the dispatcher is that annotations play a role in BC in the context of JIT: imagine someone scripted our operators and used them from C++. They passed the equivalent of a list of two elements and the operator worked. If we now change the annotation, the operator can still be scripted, but the old inputs will no longer work.
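Under this proposal, the split could look roughly like the sketch below. The names `resize` and `resize_image_kernel` are hypothetical stand-ins for a dispatcher/kernel pair; only the division of labor matters:

```python
from typing import List, Tuple, Union


def resize_image_kernel(image: object, size: Tuple[int, int]) -> Tuple[int, int]:
    # Kernel: correctly annotated, expects exactly two integers.
    height, width = size
    return height, width


def resize(image: object, size: List[int]) -> Tuple[int, int]:
    # Dispatcher: keeps the lenient (wrong) annotation for JIT BC and
    # normalizes the input before calling the kernel.
    if isinstance(size, int):
        parsed = (size, size)
    elif len(size) == 1:
        parsed = (size[0], size[0])
    else:
        parsed = (size[0], size[1])
    return resize_image_kernel(image, parsed)
```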
Implications
By design, `kernel2` (correctly annotated with `Union[int, Tuple[int, int]]`) is more restrictive than `kernel1` (annotated with `List[int]`). In eager mode, duck typing still works.

In contrast to the previous annotation, integers are supported by `kernel2` even when scripting:
```python
import itertools

for input_size, kernel in itertools.product(
    [0, (0,), (0, 0), [0], [0, 0]],
    [eager_kernel1, scripted_kernel1, eager_kernel2, scripted_kernel2],
):
    # Integers are not supported when annotating with `List[int]`.
    # `Tuple[int]` and `Tuple[int, int]` seem to be handled automagically.
    if kernel is scripted_kernel1 and isinstance(input_size, int):
        continue
    # Sequences of other than two elements are no longer supported.
    # `[0, 0]` can still be used as duck type for `Tuple[int, int]`.
    if kernel is eager_kernel2 and isinstance(input_size, (list, tuple)) and len(input_size) != 2:
        continue
    # Only `Union[int, Tuple[int, int]]` is supported.
    if kernel is scripted_kernel2 and not (
        isinstance(input_size, int)
        or isinstance(input_size, tuple) and len(input_size) == 2
    ):
        continue
    assert kernel(input_size) == (0, 0)
```
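The kernel definitions themselves are not shown in the snippet above. Purely for illustration, the eager variants could be emulated as below; the scripted variants would be obtained via `torch.jit.script`, which enforces the annotations. This is a sketch under those assumptions, not the actual torchvision code:

```python
from typing import List, Tuple, Union


def eager_kernel1(input_size: List[int]) -> Tuple[int, int]:
    # Old style: annotated `List[int]`, but leniently parses integers as
    # well as sequences of one or two elements.
    if isinstance(input_size, int):
        return input_size, input_size
    if len(input_size) == 1:
        return input_size[0], input_size[0]
    return input_size[0], input_size[1]


def eager_kernel2(input_size: Union[int, Tuple[int, int]]) -> Tuple[int, int]:
    # New style: only an int or a pair is valid; a two-element list still
    # works in eager mode as a duck type for `Tuple[int, int]`.
    if isinstance(input_size, int):
        return input_size, input_size
    if len(input_size) != 2:
        raise ValueError("expected an int or a sequence of two elements")
    return input_size[0], input_size[1]
```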
Unaffected parameters

Using `List[int]` or `List[float]` is not always the wrong choice. If we can't know the length statically, we should use it. Examples are the `fill` parameter on `resize_*` or `mean` and `std` on `normalize_*`:

vision/torchvision/prototype/transforms/functional/_misc.py
Lines 12 to 14 in 2907c49
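For instance, `mean` and `std` need one entry per channel, and the channel count is only known at runtime, so a length-generic annotation is the right one there. A pure-Python sketch (the `normalize` helper here is hypothetical, operating on nested lists instead of tensors):

```python
from typing import List


def normalize(channels: List[List[float]], mean: List[float], std: List[float]) -> List[List[float]]:
    # `mean` and `std` must have one value per channel; since the number of
    # channels is not known statically, `List[float]` is the correct annotation.
    if not (len(channels) == len(mean) == len(std)):
        raise ValueError("mean and std must have one entry per channel")
    return [
        [(value - m) / s for value in channel]
        for channel, m, s in zip(channels, mean, std)
    ]
```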
Conclusion

We can use correct annotations on the kernels. This makes the API inconsistent, but does not break BC, since the kernels were not public before. IMO, wrong annotations are worse than an inconsistent API, and thus I'm advocating for this change.
We had a lengthy offline discussion about this. The consensus is that we are not going to go for this to avoid a disparity between kernels and dispatchers. As proposed, the kernels and dispatchers don't just have different annotations, but actually different behavior. Accounting for that would mean that we have again wrong annotations on the kernel, which we tried to avoid in the first place.
However, there is a case to be made for breaking BC and correcting the annotations on the dispatchers and kernels together. We shouldn't do this in the first iteration of transforms v2 though, to keep the transition as smooth as possible.
Affected kernels and parameters

| kernel | parameter | current annotation | proposed annotation |
| --- | --- | --- | --- |
| `resize_*` | `size` | `List[int]` | `Union[int, Tuple[int, int]]` |
| `affine_*` | `translate` | `List[float]` | `Union[float, Tuple[float, float]]` |
| `affine_*` | `shear` | `List[float]` | `Tuple[float, float]` |
| `affine_*` | `center` | `Optional[List[float]]` | `Optional[Tuple[float, float]]` |
| `rotate_*` | `center` | `Optional[List[float]]` | `Optional[Tuple[float, float]]` |
| `pad_*` | `padding` | `List[int]` | `Union[int, Tuple[int, int], Tuple[int, int, int, int]]` |
| `perspective_*` | `perspective_coeffs` | `List[float]` | `Tuple[float, float, float, float, float, float, float, float]` |
| `center_crop_*` | `output_size` | `List[int]` | `Union[int, Tuple[int, int]]` |
| `resized_crop_*` | `size` | `List[int]` | `Union[int, Tuple[int, int]]` |
| `five_crop_*` | `size` | `List[int]` | `Union[int, Tuple[int, int]]` |
| `ten_crop_*` | `size` | `List[int]` | `Union[int, Tuple[int, int]]` |
| `gaussian_blur_*` | `kernel_size` | `List[int]` | `Union[int, Tuple[int, int]]` |
| `gaussian_blur_*` | `sigma` | `Optional[List[float]]` | `Optional[Union[float, Tuple[float, float]]]` |
cc @vfdev-5 @datumbox @bjuncek