Deleting a recently opened netCDF4 #9410
Thanks for opening your first issue here at xarray! Be sure to follow the issue template! |
Could you make an MCVE to copy & paste, using the context manager? |
Do you mean copying the contents of the file or the file itself? |
The file should be created inline. Thanks! |
I am a bit lost here. What I am trying to do doesn't seem to be related to the creation of the file. There are two dimensions in the dataset, and I am trying to slice a portion from ds as in the code below, after which I have no use for the original file. I need to delete it because it is big. An MCVE of something I did would look like:

import xarray
import os

with xarray.open_dataset(filePath) as ds:
    cropped_ds = ds.sel(x=slice(x1, x2), y=slice(y1, y2))  # x and y are the dimensions in the dataset
os.remove(filePath)

Assuming the problem was caused by the processing that happened in between, I replaced it with just a print statement:

import xarray
import os

with xarray.open_dataset(filePath) as ds:
    print(ds)
os.remove(filePath)

However, the problem persisted. I hope I was able to give what you asked for in this comment. Please tell me if you need any other info. |
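A quick way to check whether the Python process still holds a handle on the netCDF file at the point where os.remove fails might look like the sketch below. This is only an illustration and assumes psutil is installed; the ".nc" filter is an assumption, not something from the thread:

import os
import psutil

# List every regular file the current process still has open; if the
# netCDF file appears here after the `with` block has exited, deleting
# it will typically fail on Windows with a PermissionError.
proc = psutil.Process(os.getpid())
print([f.path for f in proc.open_files() if f.path.endswith(".nc")])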
Sorry if I'm being unclear. Have a look at the docs for an MCVE in the issue template. The example should be able to be copy-pasted into a new python prompt. |
The issue is that we don't have access to your file (nor should we). Instead, what we're looking for is a dummy dataset that you can create and save to disk, so that we can reproduce your issue that way. For example:

import numpy as np
import xarray as xr

filepath = ...
ds = xr.Dataset(
    {"a": (["x", "y"], np.ones(shape=(10, 12), dtype="float64"))},
    coords={"x": range(10), "y": range(12)},
)
ds.to_netcdf(filepath)
... # code to reproduce your issue (you might have to adapt the dummy dataset to actually reproduce your issue, this is just an example) |
So, I was trying to write the MCVE for the issue I was facing. The code looks something like this:

import xarray as xr
import numpy as np
import os

# Create latitude and longitude arrays
lat = np.arange(-90, 90, 0.01)
lon = np.arange(-180, 180, 0.01)

# Create a 2D array for temperature, here using a simple sine function for variation
temperature = np.sin(np.sqrt(lat[:, np.newaxis]**2 + lon[np.newaxis, :]**2))

# Create an xarray Dataset
ds = xr.Dataset(
    {
        "TEMPERATURE": (["lat", "lon"], temperature)
    },
    coords={
        "lat": lat,
        "lon": lon
    }
)

# Save the created dataset to disk
ds.to_netcdf("sample.nc")

with xr.open_dataset("sample.nc") as ds:
    cropped_ds = ds.sel(lon=slice(-95, -94), lat=slice(30, 28))
os.remove("sample.nc")

I can delete the file. But when I try to do the same for the data I am working on, it throws an error. Thus, I am adding the link to the file, which is open-source data downloaded from Copernicus Land Services. The following is the code that gave the error:

filePath = r"c_gls_LAI300-RT0_201712310000_GLOBE_PROBAV_V1.0.1.nc"
with xr.open_dataset(filePath) as ds:
    print(ds)
os.remove("c_gls_LAI300-RT0_201712310000_GLOBE_PROBAV_V1.0.1.nc")
Some things that I have found:
filePath=r"c_gls_LAI300-RT0_201712310000_GLOBE_PROBAV_V1.0.1.nc"
with xr.open_dataset(filePath, engine="netcdf4") as ds:
print(ds)
ds.to_netcdf("sample2.nc")
The dataset I created at the very beginning also requires about 5GB of space and the code executed without any issues. If I don't specify the engine, it issues an error message saying 22 GB of space was required.
filePath=r"c_gls_LAI300-RT0_201712310000_GLOBE_PROBAV_V1.0.1.nc"
with xr.open_dataset(filePath) as ds:
cropped_ds=ds.sel(lon=slice(-95,-94), lat=slice(30,28))
cropped_ds.to_netcdf("sample3.nc")
with xr.open_dataset("sample3.nc") as ds:
print(ds)
os.remove("sample3.nc") Do tell me if you require further info. |
That is quite surprising! Without some repro that doesn't involve downloading 1.5G of data, it's unlikely to get much traction. Does making a smaller-but-not-tiny file — say 150MB — trigger the error? |
The size is not causing the problem. I tried creating both small and large files (the large one about 5 GB), and I could read and delete them without any issues. I even cropped the data for a particular region from the above file and saved it to a separate file; I could read and delete that as well. I can't think of a way to reproduce the same error here. |
OK so to confirm: this code fails for this specific file, but we can't find any other file where the problem occurs?
V surprising if so! Again my guess is that it's too specific a problem to get traction, but we can reopen if there's a more reproducible case... |
As a part of my task, I had to download, read, process, and finally delete the netCDF files after a certain number of them had been read, due to storage limitations. But even after manually closing the files or using a context manager, I still get an error when I try to delete the file.
I did refer to the previously reported issues #1629 and #2887, but using a context manager or changing the engine through which the netCDF file is read was of no help. Is there any way to work around this?
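A common pattern for this class of problem, assuming the error is a PermissionError raised because lazily loaded data keeps the underlying netCDF handle alive, is to pull the slice you need fully into memory before the file is closed and only then delete it. This is a sketch, not something confirmed in the thread; the path and slice bounds are placeholders:

import os
import xarray as xr

filePath = "downloaded_file.nc"  # placeholder path
with xr.open_dataset(filePath) as ds:
    # .load() reads the selected data into memory, so nothing keeps a
    # reference to the on-disk file once the context manager closes it.
    cropped_ds = ds.sel(lon=slice(-95, -94), lat=slice(30, 28)).load()
os.remove(filePath)
cropped_ds.to_netcdf("cropped.nc")  # the cropped data remains usable afterwards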