Skip to content

Complete type stubs for xml.dom.minidom (#6886) #10879

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

zackw
Copy link

@zackw zackw commented Oct 13, 2023

This patch completes all the type stubs for xml.dom.minidom. I did not look at the rest of the xml.dom directory except as required to make the typing of minidom complete, so it probably doesn’t completely resolve #6886.

xml.dom.minidom is very poorly documented—its authors seem to have assumed that one could refer to the W3C DOM spec. But at the same time the implementation diverges significantly from the spec, so I do not have complete confidence in all the type annotations. Still, it should be an improvement on what’s there now.

The following commands now pass with no errors when executed from the top of the typeshed tree:

mypy --custom-typeshed-dir . --strict --warn-incomplete-stub \
     -m xml.dom.minidom
stubtest --custom-typeshed-dir . xml.dom.minidom

However, attempting to do the same for xml.dom.minicompat produces the following error:

mypy --custom-typeshed-dir . --strict --warn-incomplete-stub \
     -m xml.dom.minicompat
stdlib/xml/dom/minicompat.pyi:11: error: All bases of a protocol must be protocols  [misc]

I don’t know how to fix this, because in the implementation, EmptyNodeList is not a subclass of NodeList, therefore if I don’t make NodeList a protocol, then xml.dom.minidom.Childless fails stubtest. This patch probably should not land until this is resolved. Any advice you could give would be welcome.

This patch completes all the type stubs for xml.dom.minidom.  I did
not look at the rest of the xml.dom directory except as required to
make the typing of minidom complete, so it probably doesn’t completely
resolve python#6886.

xml.dom.minidom is very poorly documented—its authors seem to have
assumed that one could refer to the W3C DOM spec.  But at the same
time the implementation diverges significantly from the spec, so I
do not have complete confidence in all the type annotations.  Still,
it should be an improvement on what’s there now.

The following commands now pass with no errors when executed from the
top of the typeshed tree:

```sh
mypy --custom-typeshed-dir . --strict --warn-incomplete-stub \
     -m xml.dom.minidom
stubtest --custom-typeshed-dir . xml.dom.minidom
```

**However**, attempting to do the same for xml.dom.minicompat produces
the following error:

```sh
mypy --custom-typeshed-dir . --strict --warn-incomplete-stub \
     -m xml.dom.minicompat
stdlib/xml/dom/minicompat.pyi:11: error: All bases of a protocol must be protocols  [misc]
```

I don’t know how to fix this, because in the implementation,
EmptyNodeList is not a subclass of NodeList, therefore if I don’t make
NodeList a protocol, then xml.dom.minidom.Childless fails stubtest.
This patch probably should not land until this is resolved.  Any
advice you could give would be welcome.
@github-actions
Copy link
Contributor

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

@zackw
Copy link
Author

zackw commented Oct 13, 2023

It looks like all the failures are due to the xml.dom.minicompat issue I mentioned up top.

For clarity, this is one of those places where the implementation diverges from the spec. All implementations of the Node abstract interface are required to have a childNodes property that implements the NodeList interface. xml.dom.minicompat defines two concrete implementations of that interface: NodeList, which is a subclass of list, and EmptyNodeList, which is a subclass of tuple. Neither is a subclass of the other. xml.dom.minidom uses xml.dom.minicompat.NodeList for childNodes for some types that implement Node, and xml.dom.minicompat.EmptyNodeList for other types that implement Node.

It ought to be perfectly fine for users of xml.dom.minidom to treat the type of childNodes as always being NodeList, no matter what. That's how the spec designers intended it to be used, and I can get xml.dom.minidom to typecheck correctly with

from xml.dom.minicompat import NodeList
class Node:
    childNodes: NodeList

if I make xml.dom.minicompat.NodeList be a Protocol in the type stubs. But tuple, list, and Sequence are not Protocols, and so that same change breaks the typechecking of minicompat itself. I can't change the relationship between EmptyNodeList and NodeList unilaterally from the stubs -- that would require a change to the actual stdlib. I could put minicompat back the way it was and have Node be typed like so instead

from xml.dom.minicompat import EmptyNodeList, NodeList
class Node:
    childNodes: NodeList | EmptyNodeList

but then every user of the childNodes property would have to know about EmptyNodeList, which the authors of the stdlib intended to be a pure implementation detail. The entire minicompat module is undocumented in the stdlib. So I don't like that option either.

Again, any advice you could give would be welcome.

Copy link
Contributor

@hamdanal hamdanal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note I am not a maintainer of typeshed, just a regular contributor

This is not a full review, just a few things I noticed after a quick spot check

class NodeList(list[_T]):
class NodeList(Sequence[_T], Protocol):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look right. At runtime NodeList inherits from list and it is indeed a list and its append method is used elsewhere in the code. Why was it changed?

@@ -95,74 +101,73 @@ class Node(xml.dom.Node):
) -> bytes: ...

def hasChildNodes(self) -> bool: ...
def insertBefore(self, newChild, refChild): ...
def insertBefore(self, newChild: _N, refChild: Node) -> _N: ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a if refChild is None check at runtime

Suggested change
def insertBefore(self, newChild: _N, refChild: Node) -> _N: ...
def insertBefore(self, newChild: _N, refChild: Node | None) -> _N: ...

Comment on lines +113 to +114
def getUserData(self, key: str) -> object: ...
def setUserData(self, key: str, data: object, handler: UserDataHandler) -> object | None: ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning object would require every use of the return value to be guarded by isinstance checks. Any is preferred for return values.

Suggested change
def getUserData(self, key: str) -> object: ...
def setUserData(self, key: str, data: object, handler: UserDataHandler) -> object | None: ...
def getUserData(self, key: str) -> Any: ...
def setUserData(self, key: str, data: object, handler: UserDataHandler) -> Any: ...


class Node(xml.dom.Node):
parentNode: Node | None
childNodes: NodeList
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To silence pyright. Need to import Incomplete from _typeshed.

Suggested change
childNodes: NodeList
childNodes: NodeList[Incomplete]

(same for the NodeList annotations below)

def insertBefore(self, newChild, refChild) -> NoReturn: ...
def removeChild(self, oldChild) -> NoReturn: ...
attributes: None
childNodes: NodeList
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an EmptyNodeList. If the type checker complains you can add a type: ignore comment

Suggested change
childNodes: NodeList
childNodes: EmptyNodeList

def writexml(self, writer: SupportsWrite[str], indent: str = "", addindent: str = "", newl: str = "") -> None: ...
def replaceWholeText(self, content): ...
def replaceWholeText(self, content: str) -> Self: ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This returns None if content is falsy.

def removeNamedItemNS(self, namespaceURI: str, localName) -> None: ...
def setNamedItem(self, node) -> None: ...
def setNamedItemNS(self, node) -> None: ...
def getNamedItem(self, name: str) -> Node: ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This returns None if the node is not found in the sequence

Suggested change
def getNamedItem(self, name: str) -> Node: ...
def getNamedItem(self, name: str) -> Node | None: ...

(same below)

def getNamedItem(self, name: str) -> Node: ...
def getNamedItemNS(self, namespaceURI: str, localName: str) -> Node: ...
def __getitem__(self, name_or_tuple: tuple[str, str] | str) -> Node: ...
def item(self, index: int) -> Node: ...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This explicitly returns None. It is similar in behavior to dict.get()

Suggested change
def item(self, index: int) -> Node: ...
def item(self, index: int) -> Node | None: ...

@srittau
Copy link
Collaborator

srittau commented Apr 6, 2024

Thanks for contributing! I'm closing this PR for now, because it still fails some tests and has unresolved review feedback after three months of inactivity. If you are still interested, please feel free to open a new PR (or ping us to reopen this one).

@srittau srittau closed this Apr 6, 2024
@JelleZijlstra
Copy link
Member

Also thanks to @hamdanal for the useful review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

xml typing incomplete
4 participants