Allow TypedDict.__required_keys__/__optional_keys__ to be used as Literal[...] #1394

Open
gh-andre opened this issue Apr 21, 2023 · 6 comments
Labels
topic: feature Discussions about new features for Python's type annotations

Comments

@gh-andre

gh-andre commented Apr 21, 2023

TypedDict instances work very well for NoSQL databases, like MongoDB, but any extended functionality requires a class, and marrying the two requires a few error-prone steps, described here.

Consider a MongoDB document described with this structure, which works well for the most part when passed to pymongo methods (with the exception of mypy failing to realize that a TypedDict is a mutable mapping).

from typing import NotRequired, TypedDict  # NotRequired: Python 3.11+


class X(TypedDict):
    i: int
    s: str
    f: NotRequired[float]

Compared to Mapping, using a typed dictionary allows one to catch field references and value type mismatches quite quickly.

For more advanced functionality for this entity, however, a class may be maintained, like the one below. I will omit getters, etc., and just show the relevant methods that set required and optional fields.

from typing import Any, Literal


class Y:
    def __init__(self, x: X) -> None:
        self._storage = x

    def set_required(self, key: Literal["i", "s"], value: Any) -> None:
        self._storage[key] = value

    def set_optional(self, key: Literal["f"], value: Any) -> None:
        if value is None:
            del self._storage[key]
        else:
            self._storage[key] = value

Herein lies the problem - I cannot use X.__required_keys__ to drive field references in the underlying storage because it's not a literal, and the fact that it's a frozenset doesn't help here, so I have to maintain copies of the field names separately.
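
To make the limitation concrete, here is a minimal sketch of what these attributes hold at runtime. It assumes a split into a total base class and a total=False subclass (the name _XRequired is made up for illustration) as a stand-in for NotRequired[float], so that it also runs on Python versions older than 3.11:

```python
from typing import TypedDict


class _XRequired(TypedDict):
    # Keys declared in a total (default) class are required.
    i: int
    s: str


class X(_XRequired, total=False):
    # Keys declared in a total=False class are optional.
    f: float


# Both attributes are ordinary runtime frozensets of str, not types.
required = X.__required_keys__
optional = X.__optional_keys__

print(type(required).__name__, sorted(required))
print(type(optional).__name__, sorted(optional))
```

Because these are plain runtime values, a static type checker has nothing to narrow a key parameter against.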

If X.__required_keys__ and X.__optional_keys__ operated similarly to constexpr in C++ and propagated the literal keys they are created with to where they are used, as in the methods above, it would be very straightforward to integrate typed dictionaries with classes and to use the former as storage with a well-defined schema that class methods can reference to enforce proper field references.

I will also note that @dataclass is not a good fit for this functionality, for a couple of reasons. First, just like any class, it has no concept of missing attributes, and an attribute set to None is interpreted by the database driver as null. Second, @dataclass is implemented via an auto-generated __init__, so it imposes an unreasonable field order to satisfy the rule that optional arguments follow required ones in the constructor. Lack of an alternative constructor doesn't help either.

EDIT: Replaced self.x_o with self._storage, which was a copy-and-paste error copying examples from a larger test case.

gh-andre added the topic: feature label Apr 21, 2023
@erictraut
Collaborator

Are you saying that you want to use the variable X.__required_keys__ in a type annotation, as in key: X.__required_keys__? This expression refers to a value, not a type. It's the difference between the expression 1 and Literal[1]. The former is a value, and the latter is a type. Plus, X.__required_keys__ is a variable, and variables are not allowed in type annotations. So I don't think that proposal would work.
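
The value-versus-type distinction can be seen at runtime. In this minimal sketch (the alias names RequiredKey and DynamicKey are made up for illustration), Python itself will build a Literal from the key set, but because the subscript is an arbitrary expression, static checkers such as mypy and pyright are expected to reject it as a type:

```python
from typing import Literal, TypedDict, get_args


class X(TypedDict):
    i: int
    s: str


# Hand-written literal type: fully visible to static type checkers.
RequiredKey = Literal["i", "s"]

# Built from a runtime value: the interpreter accepts the subscription,
# but a static checker cannot evaluate it and flags it as an invalid type.
DynamicKey = Literal[tuple(sorted(X.__required_keys__))]

print(get_args(DynamicKey))
print(RequiredKey == DynamicKey)
```

The two objects compare equal at runtime, which is roughly the equivalence the proposal is after, but only the first one carries any meaning for a type checker.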

You mentioned a couple of problems with using @dataclass. The second of the two problems can be overcome if you use the kw_only option introduced in Python 3.10 and you call the constructor using keyword arguments. I wonder if you could overcome the first problem by agreeing on some sentinel (other than None) that indicates to the DB driver that the field is not present?

@gh-andre
Author

@erictraut

Are you saying that you want to use the variable X.__required_keys__ in a type annotation

No. That's the difference between a requirement and an implementation. I'm saying that it would be good to have functionality that would behave as if Literal[...] were constructed from X.__required_keys__. This functionality doesn't exist and would need to be figured out.

a couple of problems with using @dataclass

I'm aware of possible workarounds that can be applied to @dataclass, but that's what they are - workarounds. I brought it up only as an inferior alternative to typed dictionaries in this context, and I suggest we avoid discussing ways to bend dataclasses into what they are not.

@erictraut
Collaborator

I'm saying that it would be good to have functionality that would behave as if Literal[...] were constructed from X.__required_keys__.

I don't understand what you mean by this statement. Can you elaborate? What would the resulting code look like if this worked the way you have in mind?

@gh-andre
Author

@erictraut

I don't have the exact syntax to suggest, if that's what you are asking. That would be new functionality to design, which would produce the equivalent of Literal with all typed dictionary keys listed as literal elements.

This is technically possible because instances of TypedDict are required to be constructed from literal keys, so it's a matter of being able to construct a list of literals from dictionary keys, mandated to be literals or finals by PEP 589.

Functionally, it would behave as if this was true:

assert(Literal["i", "s"] == Literal[X.RequireKeysFinal])

Note, though, I'm not suggesting this syntax. Just conveying the thought on how it would functionally behave.

@hmc-cs-mdrissi

hmc-cs-mdrissi commented Apr 22, 2023

class Foo(TypedDict):
  x: int
  y: str

def extract_field(d: Foo, field_name: Literal[Foo.__required_keys__]):
  ...

data: Foo = {"x": 2, "y": "hi"}
extract_field(data, "x") # Good
extract_field(data, "y") # Good
extract_field(data, "z") # Type checker error: z is not present in Literal[Foo.__required_keys__]

s: str = ...
extract_field(data, s) # Type checker error: str is not compatible with Literal[Foo.__required_keys__]

I'm guessing something like this is what you are requesting?

@gh-andre
Author

@hmc-cs-mdrissi Thanks for responding

Yes, something along these lines would allow people to reuse typed dictionary keys as literal sets and have extract_field, render_field, update_field, etc., without having to repeat Literal["x", "y"] in the parameters of these functions or maintain an alias for it.

__required_keys__ would still be needed for runtime lookups, so perhaps something like __required_keys_literal__, to retain the notion that it represents a set of literals or finals.
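
For comparison, the alias approach mentioned above can at least be guarded today. This is only a sketch of one possible stopgap, not the requested feature; the alias name FooRequiredKey is hypothetical, and the check runs at import time rather than in the type checker:

```python
from typing import Literal, TypedDict, get_args


class Foo(TypedDict):
    x: int
    y: str


# Hand-maintained alias (hypothetical name) that duplicates Foo's keys.
FooRequiredKey = Literal["x", "y"]

# Import-time guard: fail fast if the alias and the TypedDict drift apart.
assert frozenset(get_args(FooRequiredKey)) == Foo.__required_keys__


def extract_field(d: Foo, field_name: FooRequiredKey) -> object:
    # The key parameter is restricted to the literals listed in the alias.
    return d[field_name]


data: Foo = {"x": 2, "y": "hi"}
print(extract_field(data, "x"))
print(extract_field(data, "y"))
```

The duplication is still there, which is exactly what the proposal would remove, but a mismatch now surfaces as soon as the module is imported.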


3 participants