doc: book section on lending row iterators #406

Merged
merged 1 commit into from
Nov 10, 2022
49 changes: 48 additions & 1 deletion book/src/table_collection_row_access.md
@@ -1,6 +1,53 @@
## Accessing table rows

We may also access entire table rows by a row id:
The rows of a table contain two types of data:

* Numerical data consisting of a single value.
* Ragged data. Examples include metadata for all tables,
ancestral states for sites, derived states for mutations,
parent and location information for individuals, etc.

`tskit` provides two ways to access row data.
The first is by a "view", which contains non-owning references
to the ragged column data.
The second is by row objects containing *copies* of the ragged column data.

Views are more efficient when the ragged columns are populated, because no data are copied.
Row objects are more convenient to work with because the API is a standard
Rust iterator.

Because row views hold references, the usual borrowing rules apply to them.
The row objects, however, own their data and are thus independent of their parent
objects.
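
The difference between the two access styles can be sketched without `tskit` at all.
The following is a minimal illustration, assuming a hypothetical `RaggedColumn` that stores every row's bytes in one buffer with an offsets array.
The names `RaggedColumn`, `RowView`, and `Row` are invented for this sketch and are not the `tskit` types.

```rust
/// Hypothetical ragged column: every row's bytes live in one shared buffer,
/// and `offsets[i]..offsets[i + 1]` delimits the bytes of row `i`.
struct RaggedColumn {
    data: Vec<u8>,
    offsets: Vec<usize>,
}

/// Non-owning view: borrows the row's bytes from the column.
struct RowView<'a> {
    id: usize,
    metadata: Option<&'a [u8]>,
}

/// Owning row object: holds its own copy of the row's bytes.
struct Row {
    id: usize,
    metadata: Option<Vec<u8>>,
}

impl RaggedColumn {
    fn num_rows(&self) -> usize {
        self.offsets.len().saturating_sub(1)
    }

    /// Borrow row `id`; returns `None` if `id` is out of range.
    fn row_view(&self, id: usize) -> Option<RowView<'_>> {
        if id >= self.num_rows() {
            return None;
        }
        let bytes = &self.data[self.offsets[id]..self.offsets[id + 1]];
        Some(RowView {
            id,
            // An empty slice means the row has no ragged data.
            metadata: if bytes.is_empty() { None } else { Some(bytes) },
        })
    }

    /// Copy row `id` out of the column; the result does not borrow `self`.
    fn row(&self, id: usize) -> Option<Row> {
        self.row_view(id).map(|view| Row {
            id: view.id,
            metadata: view.metadata.map(|bytes| bytes.to_vec()),
        })
    }
}
```

In this sketch a `RowView` cannot outlive the column it borrows from, while a `Row` remains usable after the column is dropped, at the cost of one allocation per populated ragged field.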

### Row views

To generate a row view using a row id:

```rust, noplayground, ignore
{{#include ../../tests/book_table_collection.rs:get_edge_table_row_view_by_id}}
```

To iterate over all views we use *lending* iterators:

```rust, noplayground, ignore
{{#include ../../tests/book_table_collection.rs:get_edge_table_rows_by_lending_iterator}}
```

#### Lending iterators

The lending iterators are implemented using the [`streaming_iterator`](https://docs.rs/streaming-iterator/latest/streaming_iterator/) crate.
(The community now prefers the term "lending" over "streaming" for this concept.)
The `tskit` prelude includes the trait declarations that allow the code shown above to compile.
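
To show why a lending iterator is driven with `while let` rather than `for`, here is a minimal sketch of the advance-then-borrow pattern using only the standard library.
The trait `StreamingLike` and the types `EdgeView` and `EdgeViews` are hypothetical stand-ins written for this example; they mimic the general design of `streaming_iterator` but are neither that crate's definitions nor `tskit`'s implementation.

```rust
/// Hypothetical local trait mimicking an advance-then-borrow design.
trait StreamingLike {
    type Item;
    /// Move to the next element, overwriting any internal state.
    fn advance(&mut self);
    /// Borrow the current element, if any.
    fn get(&self) -> Option<&Self::Item>;
    /// Advance, then lend out the current element.
    fn next(&mut self) -> Option<&Self::Item> {
        self.advance();
        self.get()
    }
}

/// Hypothetical view of one edge row.
struct EdgeView {
    id: usize,
    left: f64,
    right: f64,
}

/// Iterator that owns a single view and rewrites it on every step.
struct EdgeViews<'t> {
    left: &'t [f64],
    right: &'t [f64],
    next_id: usize,
    current: Option<EdgeView>,
}

impl<'t> EdgeViews<'t> {
    fn new(left: &'t [f64], right: &'t [f64]) -> Self {
        Self { left, right, next_id: 0, current: None }
    }
}

impl StreamingLike for EdgeViews<'_> {
    type Item = EdgeView;

    fn advance(&mut self) {
        let id = self.next_id;
        self.current = match (self.left.get(id), self.right.get(id)) {
            (Some(&left), Some(&right)) => {
                self.next_id += 1;
                Some(EdgeView { id, left, right })
            }
            // Past the last row: nothing left to lend.
            _ => None,
        };
    }

    fn get(&self) -> Option<&EdgeView> {
        self.current.as_ref()
    }
}
```

Each item returned by `next` borrows from the iterator itself, so it must be released before the next call.
The standard `Iterator` trait cannot express that constraint, which is why a loop such as `while let Some(view) = views.next() { ... }` is used instead of `for`.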

Rust 1.65.0 stabilized generic associated types (GATs).
GATs allow lending iterators to be implemented directly, without the workarounds used in the `streaming_iterator` crate.
We have decided not to implement our own lending iterator using GATs.
Instead, we will see what the community settles on and decide later whether to adopt it.
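
For readers curious about what a GAT-based lending iterator looks like, the following is a minimal sketch that requires Rust 1.65.0 or later.
The trait `LendingIterator`, the type `WindowsMut`, and the helper `windows_mut` are hypothetical names chosen for this example; they follow the widely used "mutable windows" illustration of GATs and are not part of `tskit` or of any particular crate.

```rust
/// Hypothetical GAT-based lending iterator trait.
trait LendingIterator {
    /// The item may borrow from the iterator for the lifetime `'a`.
    type Item<'a>
    where
        Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

/// Overlapping mutable windows over a slice, which the standard
/// `Iterator` trait cannot hand out safely.
struct WindowsMut<'s, T> {
    slice: &'s mut [T],
    start: usize,
    size: usize,
}

/// Hypothetical convenience constructor.
fn windows_mut<T>(slice: &mut [T], size: usize) -> WindowsMut<'_, T> {
    WindowsMut { slice, start: 0, size }
}

impl<'s, T> LendingIterator for WindowsMut<'s, T> {
    type Item<'a> = &'a mut [T] where Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>> {
        // Returns `None` once fewer than `size` elements remain.
        let window = self.slice.get_mut(self.start..)?.get_mut(..self.size)?;
        self.start += 1;
        Some(window)
    }
}
```

As with the views above, each window must be released before the next one is requested, so the iterator is consumed with `while let Some(window) = windows.next() { ... }`.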

### Row objects

We may access entire table rows by a row id:

```rust, noplayground, ignore
{{#include ../../tests/book_table_collection.rs:get_edge_table_row_by_id}}
30 changes: 30 additions & 0 deletions tests/book_table_collection.rs
@@ -70,6 +70,7 @@ fn add_node_handle_error() {
#[test]
fn get_data_from_edge_table() {
    use rand::distributions::Distribution;
    use tskit::prelude::*;
    let sequence_length = tskit::Position::from(100.0);
    let mut rng = rand::thread_rng();
    let random_pos = rand::distributions::Uniform::new::<f64, f64>(0., sequence_length.into());
@@ -120,6 +121,35 @@ fn get_data_from_edge_table() {
    }
    // ANCHOR_END: get_edge_table_row_by_id

    // ANCHOR: get_edge_table_row_view_by_id
    if let Some(row_view) = tables.edges().row_view(edge_id) {
        assert_eq!(row_view.id, 0);
        assert_eq!(row_view.left, left);
        assert_eq!(row_view.right, right);
        assert_eq!(row_view.parent, parent);
        assert_eq!(row_view.child, child);
    } else {
        panic!("that should have worked...");
    }
    // ANCHOR_END: get_edge_table_row_view_by_id

    // ANCHOR: get_edge_table_rows_by_lending_iterator
    let mut edge_table_lending_iter = tables.edges().lending_iter();
    while let Some(row_view) = edge_table_lending_iter.next() {
        // there is only one row!
        assert_eq!(row_view.id, 0);
        assert_eq!(row_view.left, left);
        assert_eq!(row_view.right, right);
        assert_eq!(row_view.parent, parent);
        assert_eq!(row_view.child, child);
        assert!(row_view.metadata.is_none()); // no metadata in our table
    }
    // ANCHOR_END: get_edge_table_rows_by_lending_iterator

    assert!(tables
        .check_integrity(tskit::TableIntegrityCheckFlags::default())
        .is_ok());

    // ANCHOR: get_edge_table_rows_by_iterator
    for row in tables.edges_iter() {
        // there is only one row!