Skip to content

Add ABSStore #288

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
Aug 6, 2018
Merged

Add ABSStore #288

merged 22 commits into from
Aug 6, 2018

Conversation

tjcrone
Copy link
Member

@tjcrone tjcrone commented Aug 3, 2018

This adds the ABSStore MutableMapping for reading/writing to Azure Blob Storage. I think this needs more testing so I am happy to have this live in a separate branch for a while. Whatever works best for you.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Unit tests and doctests pass locally under Python 3.6 (e.g., run tox -e py36 or
    pytest -v --doctest-modules zarr)
  • Unit tests pass locally under Python 2.7 (e.g., run tox -e py27 or
    pytest -v zarr)
  • PEP8 checks pass (e.g., run tox -e py36 or flake8 --max-line-length=100 zarr)
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Doctests in tutorial pass (e.g., run tox -e py36 or python -m doctest -o NORMALIZE_WHITESPACE -o ELLIPSIS docs/tutorial.rst)
  • Changes documented in docs/release.rst
  • Docs build locally (e.g., run tox -e docs)
  • AppVeyor and Travis CI passes
  • Test coverage to 100% (Coveralls passes)

@martindurant
Copy link
Member

I would like to copy across from the initial issue the idea that file-system implementations should follow a convention of some sort - not necessarily subclassing from my definition, but possibly.
Again, I realise that this is not a file-system exactly, but a mapping (it doesn't produce file-like objects or support generic file-system operations), but you might find it interesting to compare to the simplest file-system implementations I've been involved in: local and http.

It may not be worthwhile from the point of view of zarr, but I imagine it would be appreciated by a wider community to separate the one class into ABSMapping, ABSFileSystem, ABSFile (in this implementation, this only supports one-shot read() and mode='r' - although range requests are available for some blob types).

I will not push the issue further :)

For the PR as it stands, I would prefer the two utility functions to be static methods of the class, since I think they are very much tied to the functionality of this class.

For testing, you would either have to build a mock server (tricky) or use vcrpy as gcsfs does (awkward and too much additional complexity to live within zarr).

@alimanfoo
Copy link
Member

Just to say I'm about to go offline for 3 weeks and unfortunately have run out of time to take any action on this. @tjcrone can I suggest you go ahead and merge this into a new branch if that would be useful for you. I noticed you have this going into "zarr-developers:aws_store", it might be good to rename that to "abs_store" or "azure_store" just to make sure we don't get confused between our cloud vendors :-) I leave it to your decision whether to stick with a minimal Mapping implementation or follow @martindurant's suggestion. Either way, if you document the new features as experimental then you should have some latitude to change course in future.

@martindurant
Copy link
Member

not "azure_store", if that's ok - the datalake-store does remain a valid option, after all.

@alimanfoo
Copy link
Member

alimanfoo commented Aug 3, 2018 via email

@tjcrone
Copy link
Member Author

tjcrone commented Aug 3, 2018

@alimanfoo, thanks for catching that. i will change the name of the branch to abs_store, to refer to Azure Blob Storage, which is what it was supposed to be. I will go ahead and merge into that branch which I think will be helpful now that it seems we have people using this feature and it would be great to make it easier for them to test.

Regarding @martindurant's suggestion regarding the overall structure, I don't think that I am knowledgeable enough to make this sort of change myself. I have structured the MutableMapping based on the others there now. If we want to take this in a different direction that's very fine by me. For now I will just keep it simple.

Regarding the location of the utility functions, I am happy to move them into the class. However I think they were originally placed outside the class by @rabernat because they would be ostensibly useful for any blob storage classes. I will go ahead and move them in now. They can easily be moved out if that makes more sense at a later time.

I would greatly appreciate getting some guidance on building automated tests for this when you are back online @alimanfoo.

@tjcrone tjcrone changed the base branch from aws_store to abs_store August 3, 2018 16:53
@alimanfoo
Copy link
Member

alimanfoo commented Aug 3, 2018 via email

@tjcrone tjcrone merged commit 29a8697 into zarr-developers:abs_store Aug 6, 2018
@tjcrone
Copy link
Member Author

tjcrone commented Aug 6, 2018

@mzjp2, the ABSStore is now merged into the abs_store branch of this repo, so I recommend continuing your testing with the code in this repo going forward. Thanks.

@jakirkham
Copy link
Member

FWIW opened issue ( https://github.com/zarr-developers/zarr/issues/290 ) to discuss cloud support generally. Please feel free to share anything relevant there.

@jakirkham jakirkham mentioned this pull request Mar 1, 2019
@jakirkham
Copy link
Member

So PR ( #345 ) has integrated this into master with some changed. Do we still need the abs_store branch or can we delete it?

@shikharsg
Copy link
Contributor

I think we can delete it

@jakirkham jakirkham added this to the v2.3 milestone Mar 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants