
Commit b76b282

Joe Hamman authored and alimanfoo committed
MongoDB and Redis stores (#372)
* add mongodb and redis stores; still needs docs and some CI setup
* top level doc strings
* fix host kwarg
* pickle support
* different way of launching dbs on travis
* back to default travis configs
* fixes to mapping classes for both redis and mongodb stores
* default redis port
* pep8
* decode for py2?
* no decode for py2
* address comments
* cast to binary type in mongo getitem
* more doc strings
* more docs
* split release note into two bullets
* whitespace fix in .travis.yml
* lint after merge
* pin mongo/redis versions and a few doc changes
* use redis client.delete and check for deleted keys
* fix typo in requirements
* Update docs/release.rst Co-Authored-By: jhamman <[email protected]>
* Update docs/release.rst Co-Authored-By: jhamman <[email protected]>
* skip redis/mongodb tests when unable to connect
* fix pep8
1 parent be1c606 commit b76b282

8 files changed, with 258 additions and 4 deletions

.travis.yml

Lines changed: 7 additions & 0 deletions
@@ -11,6 +11,10 @@ addons:
     packages:
       - libdb-dev

+services:
+  - redis-server
+  - mongodb
+
 matrix:
   include:
     - python: 2.7
@@ -20,6 +24,9 @@ matrix:
       dist: xenial
       sudo: true

+before_script:
+  - mongo mydb_test --eval 'db.createUser({user:"travis",pwd:"test",roles:["readWrite"]});'
+
 install:
   - pip install -U pip setuptools wheel tox-travis coveralls

docs/api/storage.rst

Lines changed: 2 additions & 0 deletions
@@ -25,6 +25,8 @@ Storage (``zarr.storage``)

     .. automethod:: close

+.. autoclass:: MongoDBStore
+.. autoclass:: RedisStore
 .. autoclass:: LRUStoreCache

     .. automethod:: invalidate

docs/release.rst

Lines changed: 8 additions & 0 deletions
@@ -26,6 +26,14 @@ Enhancements
 * Efficient iteration over arrays by decompressing chunkwise.
   By :user:`Jerome Kelleher <jeromekelleher>`, :issue:`398`, :issue:`399`.

+* Adds the Redis-backed :class:`zarr.storage.RedisStore` class enabling a
+  Redis database to be used as the backing store for an array or group.
+  By :user:`Joe Hamman <jhamman>`, :issue:`299`, :issue:`372`.
+
+* Adds the MongoDB-backed :class:`zarr.storage.MongoDBStore` class enabling a
+  MongoDB database to be used as the backing store for an array or group.
+  By :user:`Joe Hamman <jhamman>`, :issue:`299`, :issue:`372`.
+
 Bug fixes
 ~~~~~~~~~

docs/tutorial.rst

Lines changed: 7 additions & 0 deletions
@@ -739,6 +739,13 @@ Python is built with SQLite support)::
     >>> z[:] = 42
     >>> store.close()

+Also added in Zarr version 2.3 are two storage classes for interfacing with client-server
+databases. The :class:`zarr.storage.RedisStore` class interfaces with `Redis <https://redis.io/>`_
+(an in-memory data structure store), and the :class:`zarr.storage.MongoDBStore` class interfaces
+with `MongoDB <https://www.mongodb.com/>`_ (a document-oriented NoSQL database). These stores
+respectively require the `redis <https://redis-py.readthedocs.io>`_ and
+`pymongo <https://api.mongodb.com/python/current/>`_ packages to be installed.
+
 Distributed/cloud storage
 ~~~~~~~~~~~~~~~~~~~~~~~~~
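
Since the tutorial text added here says that the two stores need the optional `redis` and `pymongo` packages, code that chooses a store at runtime may want to check for them first. The following is a minimal sketch using only the standard library. The helper names `has_package` and `usable_stores` and the `STORE_REQUIREMENTS` mapping are illustrative assumptions, not part of Zarr.

```python
import importlib.util

# Hypothetical mapping from store class name to the client package it needs,
# taken from the tutorial text in this diff; not part of the zarr API.
STORE_REQUIREMENTS = {
    "RedisStore": "redis",
    "MongoDBStore": "pymongo",
}


def has_package(name):
    """Return True if the top-level package `name` can be imported."""
    return importlib.util.find_spec(name) is not None


def usable_stores():
    """List the store names whose client package is installed."""
    return [store for store, pkg in STORE_REQUIREMENTS.items() if has_package(pkg)]


if __name__ == "__main__":
    for store, pkg in STORE_REQUIREMENTS.items():
        status = "available" if has_package(pkg) else "missing"
        print(f"{store}: requires {pkg!r} ({status})")
```

An installed client package only means the import will succeed. Whether a Redis or MongoDB server is actually reachable is a separate question, and the tests in this commit check it by connecting.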

requirements_dev_optional.txt

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
 # These packages are currently not available on Windows.
 bsddb3==6.2.6
 lmdb==0.94
+redis==3.0.1
+pymongo==3.7.1

zarr/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
                            ones_like, full_like, open_array, open_like, create)
 from zarr.storage import (DictStore, DirectoryStore, ZipStore, TempStore,
                           NestedDirectoryStore, DBMStore, LMDBStore, SQLiteStore,
-                          LRUStoreCache)
+                          LRUStoreCache, RedisStore, MongoDBStore)
 from zarr.hierarchy import group, open_group, Group
 from zarr.sync import ThreadSynchronizer, ProcessSynchronizer
 from zarr.codecs import *

zarr/storage.py

Lines changed: 183 additions & 1 deletion
@@ -37,7 +37,7 @@
                        normalize_storage_path, buffer_size,
                        normalize_fill_value, nolock, normalize_dtype)
 from zarr.meta import encode_array_metadata, encode_group_metadata
-from zarr.compat import PY2, OrderedDict_move_to_end
+from zarr.compat import PY2, OrderedDict_move_to_end, binary_type
 from numcodecs.registry import codec_registry
 from numcodecs.compat import ensure_bytes, ensure_contiguous_ndarray
 from zarr.errors import (err_contains_group, err_contains_array, err_bad_compressor,
@@ -2084,6 +2084,188 @@ def clear(self):
         )


+class MongoDBStore(MutableMapping):
+    """Storage class using MongoDB.
+
+    .. note:: This is an experimental feature.
+
+    Requires the `pymongo <https://api.mongodb.com/python/current/>`_
+    package to be installed.
+
+    Parameters
+    ----------
+    database : string
+        Name of database
+    collection : string
+        Name of collection
+    **kwargs
+        Keyword arguments passed through to the `pymongo.MongoClient` function.
+
+    Examples
+    --------
+    Store a single array::
+
+        >>> import zarr
+        >>> store = zarr.MongoDBStore('localhost')
+        >>> z = zarr.zeros((10, 10), chunks=(5, 5), store=store, overwrite=True)
+        >>> z[...] = 42
+        >>> store.close()
+
+    Store a group::
+
+        >>> store = zarr.MongoDBStore('localhost')
+        >>> root = zarr.group(store=store, overwrite=True)
+        >>> foo = root.create_group('foo')
+        >>> bar = foo.zeros('bar', shape=(10, 10), chunks=(5, 5))
+        >>> bar[...] = 42
+        >>> store.close()
+
+    Notes
+    -----
+    The maximum chunksize in MongoDB documents is 16 MB.
+
+    """
+
+    _key = 'key'
+    _value = 'value'
+
+    def __init__(self, database='mongodb_zarr', collection='zarr_collection',
+                 **kwargs):
+        import pymongo
+
+        self._database = database
+        self._collection = collection
+        self._kwargs = kwargs
+
+        self.client = pymongo.MongoClient(**self._kwargs)
+        self.db = self.client.get_database(self._database)
+        self.collection = self.db.get_collection(self._collection)
+
+    def __getitem__(self, key):
+        doc = self.collection.find_one({self._key: key})
+
+        if doc is None:
+            raise KeyError(key)
+        else:
+            return binary_type(doc[self._value])
+
+    def __setitem__(self, key, value):
+        value = ensure_bytes(value)
+        self.collection.replace_one({self._key: key},
+                                    {self._key: key, self._value: value},
+                                    upsert=True)
+
+    def __delitem__(self, key):
+        result = self.collection.delete_many({self._key: key})
+        if not result.deleted_count == 1:
+            raise KeyError(key)
+
+    def __iter__(self):
+        for f in self.collection.find({}):
+            yield f[self._key]
+
+    def __len__(self):
+        return self.collection.count_documents({})
+
+    def __getstate__(self):
+        return self._database, self._collection, self._kwargs
+
+    def __setstate__(self, state):
+        database, collection, kwargs = state
+        self.__init__(database=database, collection=collection, **kwargs)
+
+    def close(self):
+        """Cleanup client resources and disconnect from MongoDB."""
+        self.client.close()
+
+    def clear(self):
+        """Remove all items from store."""
+        self.collection.delete_many({})
+
+
+class RedisStore(MutableMapping):
+    """Storage class using Redis.
+
+    .. note:: This is an experimental feature.
+
+    Requires the `redis <https://redis-py.readthedocs.io/>`_
+    package to be installed.
+
+    Parameters
+    ----------
+    prefix : string
+        Name of prefix for Redis keys
+    **kwargs
+        Keyword arguments passed through to the `redis.Redis` function.
+
+    Examples
+    --------
+    Store a single array::
+
+        >>> import zarr
+        >>> store = zarr.RedisStore(port=6379)
+        >>> z = zarr.zeros((10, 10), chunks=(5, 5), store=store, overwrite=True)
+        >>> z[...] = 42
+
+    Store a group::
+
+        >>> store = zarr.RedisStore(port=6379)
+        >>> root = zarr.group(store=store, overwrite=True)
+        >>> foo = root.create_group('foo')
+        >>> bar = foo.zeros('bar', shape=(10, 10), chunks=(5, 5))
+        >>> bar[...] = 42
+
+    """
+    def __init__(self, prefix='zarr', **kwargs):
+        import redis
+        self._prefix = prefix
+        self._kwargs = kwargs
+
+        self.client = redis.Redis(**kwargs)
+
+    def _key(self, key):
+        return '{prefix}:{key}'.format(prefix=self._prefix, key=key)
+
+    def __getitem__(self, key):
+        return self.client[self._key(key)]
+
+    def __setitem__(self, key, value):
+        value = ensure_bytes(value)
+        self.client[self._key(key)] = value
+
+    def __delitem__(self, key):
+        count = self.client.delete(self._key(key))
+        if not count:
+            raise KeyError(key)
+
+    def keylist(self):
+        offset = len(self._key(''))  # length of prefix
+        return [key[offset:].decode('utf-8')
+                for key in self.client.keys(self._key('*'))]
+
+    def keys(self):
+        for key in self.keylist():
+            yield key
+
+    def __iter__(self):
+        for key in self.keys():
+            yield key
+
+    def __len__(self):
+        return len(self.keylist())
+
+    def __getstate__(self):
+        return self._prefix, self._kwargs
+
+    def __setstate__(self, state):
+        prefix, kwargs = state
+        self.__init__(prefix=prefix, **kwargs)
+
+    def clear(self):
+        for key in self.keys():
+            del self[key]
+
+
 class ConsolidatedMetadataStore(MutableMapping):
     """A layer over other storage, where the metadata has been consolidated into
     a single key.

zarr/tests/test_storage.py

Lines changed: 48 additions & 2 deletions
@@ -20,8 +20,8 @@
                           DirectoryStore, ZipStore, init_group, group_meta_key,
                           getsize, migrate_1to2, TempStore, atexit_rmtree,
                           NestedDirectoryStore, default_compressor, DBMStore,
-                          LMDBStore, SQLiteStore, atexit_rmglob, LRUStoreCache,
-                          ConsolidatedMetadataStore)
+                          LMDBStore, SQLiteStore, MongoDBStore, RedisStore,
+                          atexit_rmglob, LRUStoreCache, ConsolidatedMetadataStore)
 from zarr.meta import (decode_array_metadata, encode_array_metadata, ZARR_FORMAT,
                        decode_group_metadata, encode_group_metadata)
 from zarr.compat import PY2
@@ -900,6 +900,29 @@ def test_context_manager(self):
 except ImportError:  # pragma: no cover
     sqlite3 = None

+try:
+    import pymongo
+    from pymongo.errors import ConnectionFailure, ServerSelectionTimeoutError
+    try:
+        client = pymongo.MongoClient(host='127.0.0.1',
+                                     serverSelectionTimeoutMS=1e3)
+        client.server_info()
+    except (ConnectionFailure, ServerSelectionTimeoutError):  # pragma: no cover
+        pymongo = None
+except ImportError:  # pragma: no cover
+    pymongo = None
+
+try:
+    import redis
+    from redis import ConnectionError
+    try:
+        rs = redis.Redis("localhost", port=6379)
+        rs.ping()
+    except ConnectionError:  # pragma: no cover
+        redis = None
+except ImportError:  # pragma: no cover
+    redis = None
+

 @unittest.skipIf(sqlite3 is None, 'python built without sqlite')
 class TestSQLiteStore(StoreTests, unittest.TestCase):
@@ -930,6 +953,29 @@ def test_pickle(self):
             pickle.dumps(store)


+@unittest.skipIf(pymongo is None, 'test requires pymongo')
+class TestMongoDBStore(StoreTests, unittest.TestCase):
+
+    def create_store(self):
+        store = MongoDBStore(host='127.0.0.1', database='zarr_tests',
+                             collection='zarr_tests')
+        # start with an empty store
+        store.clear()
+        return store
+
+
+@unittest.skipIf(redis is None, 'test requires redis')
+class TestRedisStore(StoreTests, unittest.TestCase):
+
+    def create_store(self):
+        # TODO: this is the default host for Redis on Travis,
+        # we probably want to generalize this though
+        store = RedisStore(host='localhost', port=6379)
+        # start with an empty store
+        store.clear()
+        return store
+
+
 class TestLRUStoreCache(StoreTests, unittest.TestCase):

     def create_store(self):
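
The test changes above skip a whole test class when either the client library is missing or no server answers. The same idea can be written as a small reusable helper. This is a sketch under stated assumptions: `probe`, `connect_to_missing_service`, and `TestNeedsService` are hypothetical names, and the default `errors` tuple uses built-in exceptions, whereas real code would pass the client library's own exception classes, as the tests in this commit do.

```python
import unittest


def probe(connect, errors=(ImportError, OSError)):
    """Return whatever `connect()` returns, or None if it fails.

    `connect` is a zero-argument callable that imports the client library,
    opens a connection, and performs a cheap round trip (for example a ping).
    `errors` lists the exception types that mean "service not usable".
    """
    try:
        return connect()
    except errors:
        return None


def connect_to_missing_service():
    # Stand-in for a real connection attempt; always fails in this sketch.
    # The built-in ConnectionError is a subclass of OSError.
    raise ConnectionError("no server listening")


client = probe(connect_to_missing_service)


@unittest.skipIf(client is None, "test requires a running service")
class TestNeedsService(unittest.TestCase):

    def test_roundtrip(self):
        self.assertIsNotNone(client)


if __name__ == "__main__":
    unittest.main()
```

Probing once at import time means every test in the class is reported as skipped instead of each one failing with a connection error.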
