Proposal: More complex type aliasing (subtype without subclass) #263

Closed
@ebolyen

Preface: this is based on discussion from python/mypy#2029. While I am familiar with types in other languages, I'm still getting started with typing in the context of Python, so please forgive my ignorance.

Basic Idea:

Function annotations (and perhaps in the future, variable definition annotations) are available at runtime (using __annotations__) and as a consequence it is possible to use them to augment behavior at runtime. It would be nice if there was a mechanism to allow static analysis to co-exist with runtime augmentation.
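As a small illustration of the point above, annotations are stored on the function object and can be read back at runtime. The function `scale` is a made-up example, not something from the original discussion.

```python
def scale(value: int, factor: float) -> float:
    """A made-up function whose only purpose is to carry annotations."""
    return value * factor


# The annotations are kept in a plain dictionary on the function object,
# so any decorator or framework can inspect them at runtime.
annotations = scale.__annotations__
print(annotations)
```

The dictionary maps each parameter name (plus `"return"`) to whatever object was written in the annotation, which is what makes runtime augmentation possible.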

The goal is not to encode arbitrary things in the type, only to fill in details which are compatible with static analysis. The annotations have more semantic depth with respect to the runtime behavior, but statically are equivalent to some simpler annotation. In other words, the runtime annotations are subtypes (but not subclasses) of the aliased static type.

Examples of why this might be useful would include arbitrary type predicates (such as in typecheck-decorator) or runtime data coercion (such as in QIIME 2).

Concrete Use Cases:

What follows is only a half-baked syntax; I'm in no way advocating for any particular syntax, only the behavior.

Suppose we wanted to have both runtime validation of predicates and static analysis via MyPy. Right now the two are mutually exclusive, because the annotation for the parameter is a "predicate expression", which the static analysis tool takes literally.

Suppose instead we had the following syntax:

Predicate = TypeAlias('Predicate', [TypeVar('T', Callable[[int], bool])], int)

(I'm not using NewType here because it is my understanding that it makes a stronger statement about the relationship of the alias, a subclass, instead of just a subtype.)

The above would mean that expressions like Predicate[some_function] resolve to an int for MyPy.

As an example:

# Predicates:
def is_even(value: int) -> bool:
    return value % 2 == 0

def is_odd(value: int) -> bool:
    return not is_even(value)

# Annotated function:
@some_runtime_type_checker
def my_predicated_function(value: Predicate[is_even]) -> Predicate[is_odd]:
    return value + 1

From the perspective of a static analysis tool such as MyPy, the above function would resolve to:

def my_predicated_function(value: int) -> int:
    ...

which is sufficient to verify the interface of value during static analysis (at runtime, value is a subtype according to the Liskov substitution principle). This syntax is useful because, at runtime, the predicated constraints can still be validated by some_runtime_type_checker: the __annotations__ dictionary contains the original type aliases, which are inspectable.
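The runtime half of this idea can be sketched as follows. This is only a minimal sketch under assumptions: `Predicate` here is a hypothetical marker class built with `__class_getitem__` (not the proposed `TypeAlias`), and this `some_runtime_type_checker` is one possible implementation that raises `ValueError` on a failed predicate.

```python
import functools
import inspect


class Predicate:
    """Hypothetical runtime marker: Predicate[fn] records a predicate function."""

    def __init__(self, check):
        self.check = check

    def __class_getitem__(cls, check):
        # Predicate[is_even] builds an instance that remembers is_even.
        return cls(check)


def some_runtime_type_checker(func):
    """Validate Predicate-annotated arguments and return values on each call."""
    signature = inspect.signature(func)
    hints = dict(func.__annotations__)  # original annotations, kept for runtime use

    def validate(label, value):
        hint = hints.get(label)
        if isinstance(hint, Predicate) and not hint.check(value):
            raise ValueError(
                f"{label}={value!r} does not satisfy {hint.check.__name__}"
            )

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            validate(name, value)
        result = func(*args, **kwargs)
        validate("return", result)
        return result

    return wrapper


def is_even(value: int) -> bool:
    return value % 2 == 0


def is_odd(value: int) -> bool:
    return not is_even(value)


@some_runtime_type_checker
def my_predicated_function(value: Predicate[is_even]) -> Predicate[is_odd]:
    return value + 1
```

With this sketch, `my_predicated_function(2)` passes both checks, while `my_predicated_function(3)` is rejected because 3 is not even. A static checker would not accept `Predicate[is_even]` as written here; closing that gap is exactly what the proposal asks for.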

A different example:

Suppose we had a framework which was extended via small function registrations (my immediate use-case). Assuming it was in control of the data, it might allow these small functions to request specific "views" of the data (for example the function may just be a wrapper around a simple CLI tool so it would be "simplest" if the wrapper could just request a filepath of a specific file-format).

Continuing with my terrible syntax:

FilePath = TypeAlias('FilePath', [TypeVar('T', framework.FileFormatBase)], str)  
# Or maybe pathlib.Path instead of str

Then such a registered function could say something like:

@register_with_framework
def my_extension(data: FilePath[SomeXMLSchema]):
    # `data` is still just a `str` to MyPy
    ...

register_with_framework would then be able to observe that my_extension takes a filepath to a SomeXMLSchema formatted file. So when the framework needs to invoke my_extension, it first writes the data to a temporary file path using the SomeXMLSchema class. Once again the set of all values which are a filepath to SomeXMLSchema (only known at runtime) is a subset of the set of all strings.

Summary

The unifying theme between these two examples is that each annotation is a type-hint: it very much reflects the kind of data that is expected. The problem is that the type-hint cannot be fully resolved statically. If there existed generic machinery to "map" a more complicated expression to a simpler one (its super-type), then these situations would become possible to check statically (and I hope there are more than these two). This is basically the opposite situation of issue #241: instead of allowing subclassing without subtyping, I would like the ability to subtype without subclassing.

I know that was a lot of text from a total stranger, so thanks for reading!
