Subsetting not reducing memory consumption with mfdataset on Windows #1559

Closed
jorgsk opened this issue Sep 7, 2017 · 5 comments

@jorgsk

jorgsk commented Sep 7, 2017

On Windows, with the following code I run out of memory:

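import xarray as xr

# filepath points to a single file (see below)
d = xr.open_mfdataset(filepath)
sed = d.sediment_mass_per_unit_area.isel(time=-1)  # last time step only
sed.load()  # runs out of memory on Windows
d.close()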

but when using "open_dataset" I load the data instantly:

import xarray as xr

d = xr.open_dataset(filepath)
sed = d.sediment_mass_per_unit_area.isel(time=-1)  # last time step only
sed.load()  # returns almost immediately
d.close()

On Linux, both methods use minimal memory. The target file is 1.5 GB, and the target variable is 1460x862x900 floats. filepath is a single file, not multiple files. xarray 0.9.6 is used on both systems, installed with conda.
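
For scale, here is a rough sketch of the sizes involved. It assumes the values are 32-bit floats and that the first axis (1460) is time. Neither is stated in the report, so treat the numbers as an estimate only.

```python
import math

# Shape quoted in the report; which axis is time is an assumption here.
shape = (1460, 862, 900)
bytes_per_value = 4  # assumes float32; use 8 for float64


def nbytes(dims, itemsize):
    """Uncompressed size in bytes of an array with the given dimensions."""
    return math.prod(dims) * itemsize


full_variable = nbytes(shape, bytes_per_value)
one_time_step = nbytes(shape[1:], bytes_per_value)

print(f"full variable: {full_variable / 1e9:.2f} GB")  # 4.53 GB
print(f"one time step: {one_time_step / 1e6:.2f} MB")  # 3.10 MB
```

Under these assumptions a single time step is only a few megabytes, so running out of memory suggests that far more than the selected slice was being read.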

@jhamman
Member

jhamman commented Sep 7, 2017

Your two code snippets are identical.

@jorgsk
Author

jorgsk commented Sep 8, 2017

Sorry about that, I've updated the second snippet. Also added more information in case it helps.

@fmaussion
Member

Is the dask version also identical on both systems?

@jorgsk
Author

jorgsk commented Sep 14, 2017

I found that dask was version 0.11 on Windows but 0.15 on Linux. Upgrading the Windows version to 0.15 fixed the issue, indicating that the problem was the version of dask.
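
A quick way to compare environments like this is to print the installed package versions on each machine. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+). The helper name `installed_versions` is made up for illustration and is not part of xarray or dask.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


# Run this on both systems and compare the output.
for name, ver in installed_versions(["xarray", "dask"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

Running it on both Windows and Linux would show a mismatch such as the 0.11 versus 0.15 dask versions described above.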

@fmaussion
Member

Yes, I recall @rabernat worked extensively on this. Possibly related:

@shoyer, @rabernat, maybe if you have some time it would be good to revisit these issues and see if they can be closed?
