Skip to content

refactor tree iteration #296

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 22 commits into from
Closed

refactor tree iteration #296

wants to merge 22 commits into from

Conversation

molpopgen
Copy link
Member

This PR implements a "natural" iterator over trees:

  • New type for which we impl Iterator.
  • New function TreeSequence::trees returning the iterator.

@molpopgen
Copy link
Member Author

To make this viable, we need to extract
the public API for a tree into a separate
struct. Then, use Deref to access it.

@molpopgen molpopgen force-pushed the refactor_tree_iteration branch 2 times, most recently from 49add21 to c3f5d16 Compare July 30, 2022 00:37
@molpopgen molpopgen force-pushed the refactor_tree_iteration branch from c3f5d16 to 07319c7 Compare September 26, 2022 15:08
@molpopgen
Copy link
Member Author

molpopgen commented Sep 26, 2022

Due to the ownership semantics of a tree sequence on the C side, we may not be able to do this safely w/o using Rc to manage the lifetime of that C side allocation. Being able to collect into a vector of trees hides the fact that those data must have a lifetime tied to the tree sequence somehow.

@molpopgen
Copy link
Member Author

Difficulties for this PR include:

  • It is (or seems to be) very difficult to do this in a safe fashion.
  • On the C side, the memory for a tree is re-used.
  • In rusts, all Iterators implement FromIterator via a blanket implementation.
  • The blanket implementation allows one to construct a vector of trees
    as a one-liner.
    All trees except the last will be invalid.
  • Further, trees hold non-owning pointers to their tree sequences on the C side.
    Rust cannot be psychic here.
    So the tree sequence can be dropped and not invalidate code using a collected
    vector of trees.
    (A work around for this is API changes that error/return None if the pointer is NULL,
    but that is moot because all the tree arrays are invalid.)

The safest method is the current use of a lending iterator.
To get idiomatic iterator behavior w/o the blanket FromIterator behavior seems to mean
rebuilding a bunch of iterator traits from the ground up.
It currently seems silly to do that much work.

Ways to work around some of these difficulties:

  • Change the rust API for the trees to return None if elements like the parent array, etc., are NULL.
  • Likewise, have checks that the tree sequence pointer is not null for any tree functions referring
    back to the tree sequence.

All of these workarounds are too big for a single PR.

@molpopgen molpopgen closed this Sep 27, 2022
@molpopgen molpopgen deleted the refactor_tree_iteration branch August 21, 2023 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant