Skip to content

Misleading comment "heap.sort() maintains the heap invariant" #114466

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
florensacc opened this issue Jan 23, 2024 · 7 comments
Closed

Misleading comment "heap.sort() maintains the heap invariant" #114466

florensacc opened this issue Jan 23, 2024 · 7 comments
Labels
docs Documentation in the Doc dir pending The issue will be closed if no feedback is provided

Comments

@florensacc
Copy link

florensacc commented Jan 23, 2024

Documentation

In the fourth paragraph of the documentation says:
"""
These two make it possible to view the heap as a regular Python list without surprises: heap[0] is the smallest item, and heap.sort() maintains the heap invariant!
"""
This seems to suggest that heap.sort() does not change heap, although that's clearly not true:

heap = [1, 2, 4, 3]  # this is already a valid heap
heapq.heapify(heap)   # doesn't change `heap`
heap.sort()  # this is now `[1, 2, 3, 4]`!!

Linked PRs

@florensacc florensacc added the docs Documentation in the Doc dir label Jan 23, 2024
@Eclips4
Copy link
Member

Eclips4 commented Jan 23, 2024

cc @rhettinger

@gaogaotiantian
Copy link
Member

I'm not the master of the language but "maintains the heap invariant" here means "it's still a heap" to me. I guess a possible ambiguity is that the variable itself is also heap - but all the variable heap is labeled specially (with dark background). So the "heap" here means the general concept "heap", not the variable heap.

@AA-Turner AA-Turner added the pending The issue will be closed if no feedback is provided label Jan 23, 2024
@terryjreedy
Copy link
Member

The confusion is interpreting 'heap invariant' as noun 'heap' that is kept adjective 'invariant', unchanged. It is meant be interpreted as a compount noun meaning the invariant that characterizes a heap. In other words, 'invariant' is the noun that in maintained and 'heap' is used as a noun-adjective that specified which invariant is maintained. 'Invariant' is a math and CS term meaning a condition that characterizes an math or information structure. Unless more than one person is confused, we could close this. The term 'heap invariant' should have been defined earlier in the doc. @florensacc , please check.

@hauntsaninja
Copy link
Contributor

hauntsaninja commented Jan 30, 2024

The docs do describe the invariant, but they don't use the term "heap invariant" when doing so. If. others are likely to be confused as well, maybe something like the following could be useful:

 Heaps are binary trees for which every parent node has a value less than or
-equal to any of its children.  This implementation uses arrays for which
+equal to any of its children.  We refer to this condition as the heap invariant.
+
+This implementation uses arrays for which

@florensacc
Copy link
Author

Thanks for the comments, I see how the sentence is technically correct when understanding "invariant" as the noun and "heap" its noun-adjective. Defining "heap invariant" earlier like @hauntsaninja suggests would definitely help! And maybe we could also rephrase "and heap.sort() maintains the heap invariant!" as "and heap.sort() also satisfies the heap invariant!" or plainly "and heap.sort() is also a heap!".

@pochmann3
Copy link
Contributor

"and heap.sort() is also a heap!".

It's not. It's None.

The Theory section btw already does use the term "invariant".

Wikipedia doesn't use "invariant" but "property", and "heap property" does seem to be the much more popular term, and "maintains the heap property" would avoid the ambiguity.

hauntsaninja added a commit to hauntsaninja/cpython that referenced this issue Apr 11, 2024
I think the choice of wording in these docs is great and doesn't
need to change. However, it could be useful to explicitly define
this term / the cost of doing so seems relatively low.
@hauntsaninja
Copy link
Contributor

hauntsaninja commented Apr 11, 2024

I think "invariant" is clear / standard enough, and since OP says it would help, I've opened a PR with my suggestion. Like terryjreedy says, unless we have evidence of more confusion, I don't see a need to make further changes.

hauntsaninja added a commit that referenced this issue Apr 13, 2024
I think the choice of wording in these docs is great and doesn't
need to change. However, it could be useful to explicitly define
this term / the cost of doing so seems relatively low.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Apr 13, 2024
I think the choice of wording in these docs is great and doesn't
need to change. However, it could be useful to explicitly define
this term / the cost of doing so seems relatively low.
(cherry picked from commit 37a4cbd)

Co-authored-by: Shantanu <[email protected]>
hauntsaninja added a commit that referenced this issue Apr 13, 2024
I think the choice of wording in these docs is great and doesn't
need to change. However, it could be useful to explicitly define
this term / the cost of doing so seems relatively low.
(cherry picked from commit 37a4cbd)

Co-authored-by: Shantanu <[email protected]>
diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
I think the choice of wording in these docs is great and doesn't
need to change. However, it could be useful to explicitly define
this term / the cost of doing so seems relatively low.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir pending The issue will be closed if no feedback is provided
Projects
None yet
Development

No branches or pull requests

7 participants