Chunkwise iteration over arrays. #399

Merged
@alimanfoo merged 2 commits into zarr-developers:master from efficient-iter on Feb 6, 2019

@jeromekelleher (Member) commented Feb 5, 2019

Closes #398.

Thanks to @alimanfoo for giving me the formula for unit tests! The only tricky part is the zero-d case, where I've mirrored numpy's behaviour.
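As a rough illustration of the two points above, here is a minimal sketch in plain Python. `ChunkedArray` and `same_items` are hypothetical names invented for this example, not zarr's classes or the PR's actual tests. It assumes that "mirroring numpy" for the zero-d case means raising `TypeError` as soon as iteration is requested (which is what `iter()` on a 0-d NumPy array does), and that the unit tests compare the iterated items against a reference sequence.

```python
from itertools import zip_longest


class ChunkedArray:
    """Hypothetical stand-in for a chunked array (not zarr's real class)."""

    def __init__(self, data, shape, chunk_size):
        self.data = data              # flat list, or a scalar for the zero-d case
        self.shape = shape            # () for zero-d, (n,) for one-d
        self.chunk_size = chunk_size

    def __getitem__(self, selection):
        # Slicing a list returns a new list, standing in for a chunk read.
        return self.data[selection]

    def __iter__(self):
        if len(self.shape) == 0:
            # Raise eagerly, the way iter() on a 0-d NumPy array does.
            raise TypeError("iteration over a 0-d array")
        return self._iter_chunks()

    def _iter_chunks(self):
        for start in range(0, self.shape[0], self.chunk_size):
            # One slice read per chunk, then yield items from the local copy.
            for item in self[start:start + self.chunk_size]:
                yield item


def same_items(expected, actual):
    """Return True if both iterables yield equal items and have equal length."""
    missing = object()  # sentinel that flags a length mismatch
    return all(
        e is not missing and a is not missing and e == a
        for e, a in zip_longest(expected, actual, fillvalue=missing)
    )
```

Using `zip_longest` rather than `zip` in the comparison means a missing or extra trailing item is detected instead of being silently truncated.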

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • Docs build locally (e.g., run tox -e docs)
  • AppVeyor and Travis CI passes
  • Test coverage is 100% (Coveralls passes)

@alimanfoo added this to the v2.3 milestone Feb 5, 2019

@alimanfoo (Member) left a comment

Thanks @jeromekelleher, looks good.

Travis PY27 is failing because the function is called itertools.izip_longest in PY2 and itertools.zip_longest in PY3. I suggest adding something to the zarr.compat module so that zip_longest is available under PY2 as an alias for izip_longest.
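One common way to write such an alias is sketched below. The actual contents of zarr.compat are not shown in this thread, so the `PY2` flag name and the layout are assumptions for illustration only.

```python
import sys

# Assumed flag name; the real compat module may detect the version differently.
PY2 = sys.version_info[0] == 2

if PY2:
    # Python 2 only ships the lazy variant under the "i" prefix.
    from itertools import izip_longest as zip_longest
else:
    from itertools import zip_longest

# Callers can now use zip_longest on either interpreter.
pairs = list(zip_longest([1, 2, 3], [10, 20], fillvalue=None))
```

Code elsewhere in the package would then import `zip_longest` from the compat module instead of from `itertools` directly.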

Travis PY37 is failing with:

py37 runtests: commands[4] | flake8 zarr
zarr/storage.py:1452:25: E117 over-indented

...that has nothing to do with this PR, and I have no idea how it crept in, but the easiest thing would be to fix it in this PR by dedenting that line.
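For readers unfamiliar with E117, the sketch below shows the kind of code the check flags. The actual line at zarr/storage.py:1452 is not shown in this thread, so both functions are hypothetical examples. Python accepts either form; only the style check objects to the first.

```python
def over_indented(values):
        # Body sits eight spaces in from "def"; this is the shape E117 reports.
        return sum(values)


def dedented(values):
    # Same body at the conventional four spaces.
    return sum(values)
```

Dedenting changes nothing at runtime, which is why it is safe to fold into an unrelated PR.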

Also please add an entry in the release notes (docs/release.rst).

zarr/core.py Outdated
    chunk_size = self.chunks[0]
    for j in range(self.shape[0]):
        if j % chunk_size == 0:
            chunk = self[j: j + chunk_size][:]
Member

Suggested change
- chunk = self[j: j + chunk_size][:]
+ chunk = self[j: j + chunk_size]

Additional slice shouldn't be necessary.

Member Author

I thought you would need to take an extra copy to be sure the data was kept in a local numpy array. Changed now.

Member

Thanks, yeah just the single slice [j: j + chunk_size] is enough to bring data into a local numpy array. Adding the extra [:] is harmless but not necessary.
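The point of this thread can be sketched in plain Python. `CountingStore`, `iter_per_item` and `iter_per_chunk` are hypothetical names made up for this example and are not zarr's API. A list slice stands in for the in-memory array that slicing a chunked array returns, and a counter stands in for the cost of each read.

```python
class CountingStore:
    """Hypothetical array-like that counts reads (not zarr's API)."""

    def __init__(self, data):
        self._data = list(data)
        self.reads = 0

    def __len__(self):
        return len(self._data)

    def __getitem__(self, selection):
        self.reads += 1
        # A slice returns a fresh list, so the caller already holds a local copy.
        return self._data[selection]


def iter_per_item(store):
    """Naive iteration: one read for every element."""
    for j in range(len(store)):
        yield store[j]


def iter_per_chunk(store, chunk_size):
    """Chunkwise iteration: one read per chunk, items served locally."""
    for j in range(0, len(store), chunk_size):
        chunk = store[j:j + chunk_size]  # a single slice is enough
        for item in chunk:
            yield item
```

In this sketch a further `chunk[:]` would only copy data that is already local, which matches the review comment that the extra slice is harmless but unnecessary.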

@jeromekelleher (Member Author)

Thanks @alimanfoo, I've implemented those. Love the changelog format btw, I think I'll be copying that!

@alimanfoo (Member) left a comment

LGTM, thanks!

@alimanfoo self-assigned this Feb 6, 2019
@alimanfoo merged commit 8495469 into zarr-developers:master Feb 6, 2019
@jeromekelleher deleted the efficient-iter branch February 6, 2019 13:39
2 participants