rasterio chunks argument causes loading from s3 to fail #1816


Closed
rabernat opened this issue Jan 10, 2018 · 1 comment · Fixed by #1817

@rabernat

Code Sample, a copy-pastable example if possible

import xarray as xr

url = 's3://landsat-pds/L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B1.TIF'
# This works
ds = xr.open_rasterio(url)
# This fails with FileNotFoundError
ds = xr.open_rasterio(url, chunks=512)

The error is

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-17-8b55d7e920b8> in <module>()
      6 # https://aws.amazon.com/public-datasets/landsat/
      7 # 512x512 chunking
----> 8 ds = xr.open_rasterio(url, chunks=512)
      9 ds

~/miniconda3/envs/geo_scipy/lib/python3.6/site-packages/xarray-0.10.0-py3.6.egg/xarray/backends/rasterio_.py in open_rasterio(filename, chunks, cache, lock)
    172         from dask.base import tokenize
    173         # augment the token with the file modification time
--> 174         mtime = os.path.getmtime(filename)
    175         token = tokenize(filename, mtime, chunks)
    176         name_prefix = 'open_rasterio-%s' % token

~/miniconda3/envs/geo_scipy/lib/python3.6/genericpath.py in getmtime(filename)
     53 def getmtime(filename):
     54     """Return the last modification time of a file, reported by os.stat()."""
---> 55     return os.stat(filename).st_mtime
     56 
     57 

FileNotFoundError: [Errno 2] No such file or directory: 's3://landsat-pds/L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B1.TIF'

Problem description

It is pretty clear that the current xarray code expects to receive a filename. (The name of the argument is filename.) But rasterio's open function accepts a much wider range of dataset identifiers. The tokenizing function should be updated to allow for this. Seems like it should be a pretty easy fix.
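
One possible shape for such a fix, as a minimal sketch only: look up the modification time when the identifier is a local path and fall back to None otherwise. The helper name get_mtime_or_none is hypothetical and not part of xarray, and this is not necessarily what the eventual fix does.

```python
import os


def get_mtime_or_none(dataset_id):
    """Return the modification time of a local path, or None otherwise.

    Hypothetical helper, not part of xarray. Identifiers such as
    's3://bucket/key.tif' are not local files, so os.path.getmtime
    raises FileNotFoundError (a subclass of OSError) for them.
    """
    try:
        return os.path.getmtime(dataset_id)
    except OSError:
        return None


url = 's3://landsat-pds/L8/139/045/LC81390452014295LGN00/LC81390452014295LGN00_B1.TIF'

# The S3 identifier no longer raises; it simply has no mtime.
remote_mtime = get_mtime_or_none(url)

# A path that exists locally (here, the current directory) still gets one.
local_mtime = get_mtime_or_none(os.curdir)
```

The result could then be passed to the existing tokenize(filename, mtime, chunks) call shown in the traceback. The trade-off is that, for non-local identifiers, the token would no longer change when the remote data changes.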

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.0
pandas: 0.20.3
numpy: 1.13.1
scipy: 0.19.1
netCDF4: 1.3.1
h5netcdf: 0.4.1
Nio: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.16.0
matplotlib: 2.1.0
cartopy: 0.15.1
seaborn: 0.8.1
setuptools: 36.3.0
pip: 9.0.1
conda: None
pytest: 3.2.1
IPython: 6.1.0
sphinx: 1.6.5

rabernat added the bug label Jan 10, 2018
@rabernat

Note that for this S3 example to work, you need boto installed and AWS credentials configured:
https://aws.amazon.com/developers/getting-started/python/

rabernat added a commit to rabernat/xarray that referenced this issue Jan 10, 2018
shoyer pushed a commit that referenced this issue Jan 23, 2018
* fixes #1816

* new and refactored rasterio tests