I've been running into some issues when using xr.full_like, when my other.data is a chunked dask array, and the fill_value is a numpy array.
Now, I just checked: full_like mentions only a scalar in its signature. However, this is a very convenient way to get all the coordinates and dimensions attached to an array like this, so it feels like desirable functionality. And as I mention below, both numpy and dask behave similarly, accepting much more than just scalars. https://xarray.pydata.org/en/stable/generated/xarray.full_like.html
Huite changed the title from "xr.full_like fails when other is chunked" to "xr.full_like (often) fails when other is chunked and fill_value is non-scalar" on Apr 16, 2020
* Avoid multiplication DeprecationWarning in rasterio backend
* full_like: error on non-scalar fill_value
Fixes #3977
* Added test
* Updated what's new
* core.utils.is_scalar instead of numpy.is_scalar
* More informative error message
* raises_regex for error test
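The commits above describe rejecting a non-scalar fill_value with an informative error. A minimal sketch of such a guard follows. The helper name `require_scalar_fill_value` is hypothetical, and `np.ndim` is used here as a stand-in scalar test; this is not xarray's actual implementation.

```python
import numpy as np


def require_scalar_fill_value(fill_value):
    """Hypothetical guard: accept only 0-dimensional fill values."""
    # np.ndim returns 0 for Python scalars, NumPy scalars, and 0-d arrays.
    if np.ndim(fill_value) != 0:
        raise ValueError(
            f"fill_value must be scalar. Received {fill_value!r} instead."
        )
    return fill_value


print(require_scalar_fill_value(1.5))  # scalars pass through unchanged

try:
    require_scalar_fill_value(np.array([1, 2, 3]))
except ValueError as err:
    print("ValueError:", err)
```

Failing early like this turns a confusing per-chunk broadcast error into a message that names the offending argument.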
MCVE Code Sample
This results in an error:
ValueError: could not broadcast input array from shape (1,3) into shape (1,4)
Expected Output
Expected is a DataArray with the dimensions and coords of other, and the numpy array of fill_value as its data.
Problem Description
The issue lies here: xarray/xarray/core/common.py, lines 1420 to 1436 in 2c77eb5
Calling dask.array.full with the given number of chunks results in it trying to apply the fill_value for every individual chunk.
As one would expect, if I set fill_value to the size of a single chunk it doesn't error.
It does fail on a similarly chunked dask array (since it's applying it for every chunk).
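The per-chunk behavior described above can be imitated with NumPy alone. The sketch below is a simplified stand-in, not dask's implementation: the hypothetical helper `blockwise_full` fills each block independently with `np.full` and then assembles the blocks. The chunk layout and fill values are assumptions chosen for illustration.

```python
import numpy as np


def blockwise_full(chunks, fill_value):
    """Hypothetical stand-in for a chunked `full`: fill each block on its own.

    `chunks` holds the block sizes per axis for a 2-D array,
    e.g. ((1, 1), (2, 2)) describes a 2x4 array split into four 1x2 blocks.
    """
    row_sizes, col_sizes = chunks
    blocks = [
        [np.full((r, c), fill_value) for c in col_sizes]
        for r in row_sizes
    ]
    return np.block(blocks)


chunks = ((1, 1), (2, 2))

# A scalar broadcasts into every block.
print(blockwise_full(chunks, 7).shape)

# An array shaped like a single block also works, since it matches each block.
per_block = np.array([[1, 2]])
print(blockwise_full(chunks, per_block))

# An array shaped like the whole result cannot broadcast into a single block.
whole = np.arange(8).reshape(2, 4)
try:
    blockwise_full(chunks, whole)
except ValueError as err:
    print("ValueError:", err)
```

Note that the block-shaped fill value does not raise, but it is repeated in every block rather than placed once, which is probably not what a caller passing an array expects.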
The most obvious solution would be to force it down the np.full_like route, since all the values already exist in memory anyway. So maybe another type check does the trick. However, full() accepts quite a variety of arguments for the fill value (scalars, numpy arrays, lists, tuples, ranges). The dask docs mention only a scalar in the signature for dask.array.full: https://docs.dask.org/en/latest/array-api.html#dask.array.full
As does numpy.full:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.full.html
However, in all cases, they still broadcast automatically...
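The broadcasting mentioned above can be checked directly with NumPy. A small sketch using only numpy.full; the shape and values are assumptions chosen for the example:

```python
import numpy as np

shape = (2, 3)

# Each non-scalar fill value has length 3, matching the last axis of `shape`,
# so NumPy broadcasts it across both rows.
from_list = np.full(shape, [1, 2, 3])
from_tuple = np.full(shape, (1, 2, 3))
from_range = np.full(shape, range(3))
from_array = np.full(shape, np.array([1, 2, 3]))

print(from_list)
print(from_range)
```

A fill value whose shape cannot be broadcast to `shape` raises a ValueError instead.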
So kind of undefined behavior of a blocked full?
Versions
Output of `xr.show_versions()`
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jan 7 2020, 21:48:41) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None
libhdf5: 1.10.5
libnetcdf: 4.7.3

xarray: 0.15.1
pandas: 0.25.3
numpy: 1.17.5
scipy: 1.3.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.2
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.9.2
distributed: 2.10.0
matplotlib: 3.1.2
cartopy: None
seaborn: 0.10.0
numbagg: None
setuptools: 46.1.3.post20200325
pip: 20.0.2
conda: None
pytest: 5.3.4
IPython: 7.13.0
sphinx: 2.3.1