Skip to content

PyStructSequence constructor ignores unknown field names #110235

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
serhiy-storchaka opened this issue Oct 2, 2023 · 4 comments · Fixed by #110258
Closed

PyStructSequence constructor ignores unknown field names #110235

serhiy-storchaka opened this issue Oct 2, 2023 · 4 comments · Fixed by #110258
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@serhiy-storchaka
Copy link
Member

serhiy-storchaka commented Oct 2, 2023

Bug report

PyStructSequence constructor takes two arguments: tuple and dict. The tuple specifies values for "visible" fields (i.e. what you get when interpret PyStructSequence as a tuple), the dict specifies values for fields that are accessible only by name (they are like __slot__ attributes).

>>> import time
>>> time.struct_time((2023, 10, 2, 17, 50, 53, 0, 275, 0), {'tm_zone': 'GMT', 'tm_gmtoff': 0})
time.struct_time(tm_year=2023, tm_mon=10, tm_mday=2, tm_hour=17, tm_min=50, tm_sec=53, tm_wday=0, tm_yday=275, tm_isdst=0)

The problem is that all invalid names are silently ignored. It includes names of "visible" fields and typos.

>>> time.struct_time((0,)*9, {'tm_zone': 'GMT', 'tm_gmtoff': 0, 'tm_year': 2023, 'invalid': 1})
time.struct_time(tm_year=0, tm_mon=0, tm_mday=0, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=0, tm_yday=0, tm_isdst=0)

Linked PRs

@serhiy-storchaka serhiy-storchaka added type-bug An unexpected behavior, bug, or error interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Oct 2, 2023
@serhiy-storchaka
Copy link
Member Author

The original code was added by @mwhudson in bpo-526072 (#36205, ce358e3). Because it was only for pickling support, omitting some validity checks was acceptable.

@serhiy-storchaka
Copy link
Member Author

@XuehaiPan, it may be interesting to you, since you already do something similar in the __replace__().

The difference is that this code cannot modify the input dict, because it is not kwargs. But it can count the found names.

@XuehaiPan
Copy link
Contributor

XuehaiPan commented Oct 2, 2023

The problem is that all invalid names are silently ignored. It includes names of "visible" fields and typos.

Any thoughts on allowing the present visible fields in dict to override the values in the input sequence? Currently, we ignore visible field names in dict.

PyStructSequence.__new__(cls: Self, sequence: Iterable[Any], dict: dict[str, Any]) -> Self

Since we need n_visible_fields <= len(sequence) <= n_fields in __new__(), all visible fields should be present in the sequence argument. If a field value is not allowed to come from both sequence and dict, the dict argument should only contain invisible field names. Also, we need to check if len(sequence) > n_visible_fields, the given invisible field provided in sequence should not present again in dict.


For collections.namedtuple, it will raise TypeError for duplicate arguments:

namedtuple.__new__(cls: Self, *args: Any, **kwargs: Any) -> Self
>>> import collections
>>> MyTuple = collections.namedtuple('MyTuple', ['a', 'b', 'c', 'd'])
>>> MyTuple(1, 2, 3, 4, a=5, b=6)
TypeError: MyTuple.__new__() got multiple values for argument 'a'

@XuehaiPan
Copy link
Contributor

Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
Projects
None yet
2 participants