Windowing functions should be lazy #340
Currently, window variables (
For 1. we need a Dask implementation of 2 is more work, since we can't materialize the window arrays as we do now. Instead, the strategy is probably: rechunk the window arrays to match the target variable chunks (e.g. genotypes for computing some of the popgen stats), then use So it might be worth having a couple of utility functions,
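To make the chunk-aligned idea above more concrete, here is a minimal pure-Python sketch. It is not Dask code and not the project's implementation: the function names `windows_per_chunk` and `lazy_window_statistic` are hypothetical, and it assumes that windows are sorted, non-overlapping, and never straddle a chunk boundary. It only illustrates how each chunk of the target variable can be paired with the windows it contains, using chunk-relative offsets, so that no window result is computed until a consumer asks for it.

```python
from bisect import bisect_left
from typing import Callable, Iterator, List, Sequence, Tuple


def windows_per_chunk(
    window_start: Sequence[int],
    window_stop: Sequence[int],
    chunk_sizes: Sequence[int],
) -> Iterator[Tuple[int, int, List[Tuple[int, int]]]]:
    """Yield (chunk_offset, chunk_size, windows) for each chunk.

    Each window is a (start, stop) pair relative to the start of its chunk.
    Assumption (not from the issue): windows are sorted by start and never
    cross a chunk boundary; a ValueError is raised if one does.
    """
    offset = 0
    for size in chunk_sizes:
        end = offset + size
        # Indices of the windows whose start falls inside [offset, end).
        lo = bisect_left(window_start, offset)
        hi = bisect_left(window_start, end)
        relative = []
        for start, stop in zip(window_start[lo:hi], window_stop[lo:hi]):
            if stop > end:
                raise ValueError("window crosses a chunk boundary")
            relative.append((start - offset, stop - offset))
        yield offset, size, relative
        offset = end


def lazy_window_statistic(
    values: Sequence[float],
    window_start: Sequence[int],
    window_stop: Sequence[int],
    chunk_sizes: Sequence[int],
    statistic: Callable[[Sequence[float]], float] = sum,
) -> Iterator[float]:
    """Lazily yield one statistic per window, processing one chunk at a time."""
    chunks = windows_per_chunk(window_start, window_stop, chunk_sizes)
    for offset, size, relative in chunks:
        # Only the current chunk of the target variable is sliced out.
        block = values[offset:offset + size]
        for start, stop in relative:
            yield statistic(block[start:stop])


# Ten values, five windows of width 2, two chunks of sizes 4 and 6.
values = list(range(10))
starts = [0, 2, 4, 6, 8]
stops = [2, 4, 6, 8, 10]
result = list(lazy_window_statistic(values, starts, stops, [4, 6]))
print(result)  # [1, 5, 9, 13, 17]
```

In a Dask setting the per-chunk loop body would presumably become the function applied to each block, but that mapping is an assumption here rather than something the thread confirms.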
It's not obvious to me that there really is a scaling issue here. Have we done any experiments to see how long these windowing functions take? It sounds like the Dask versions are going to be fairly tricky, so it might be worth doing a few quick checks before embarking on them.
Closing this for the time being; we can re-open it if we see scalability issues.
See https://github.com/pystatgen/sgkit/pull/303#discussion_r507906940