I have an xr.DataArray with dimensions (time, latitude, longitude) that wraps a dask array, and I would like to apply a bandpass filter along the time dimension using scipy.signal.filtfilt(b, a, x, axis=0), where b and a are the filter coefficients and x is the xr.DataArray holding the data to be filtered. Since my array is big and this function only operates along one dimension, I want to apply it through xr.apply_ufunc() with dask='parallelized'.
I am getting the error message below. As far as I understand it, the NumPy arrays a and b have shape (7,), which does not match the chunk shape of arr, and this mismatch causes the problem.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-18-0dcd81577f52> in <module>
3 dask='parallelized',
4 output_dtypes=[arr.dtype],
----> 5 kwargs={'axis': 0})
6
7 filtered
~/Software/anaconda3/lib/python3.6/site-packages/xarray/core/computation.py in apply_ufunc(func, *args, **kwargs)
985 signature=signature,
986 join=join,
--> 987 exclude_dims=exclude_dims)
988 elif any(isinstance(a, Variable) for a in args):
989 return variables_ufunc(*args)
~/Software/anaconda3/lib/python3.6/site-packages/xarray/core/computation.py in apply_dataarray_ufunc(func, *args, **kwargs)
209
210 data_vars = [getattr(a, 'variable', a) for a in args]
--> 211 result_var = func(*data_vars)
212
213 if signature.num_outputs > 1:
~/Software/anaconda3/lib/python3.6/site-packages/xarray/core/computation.py in apply_variable_ufunc(func, *args, **kwargs)
559 raise ValueError('unknown setting for dask array handling in '
560 'apply_ufunc: {}'.format(dask))
--> 561 result_data = func(*input_data)
562
563 if signature.num_outputs == 1:
~/Software/anaconda3/lib/python3.6/site-packages/xarray/core/computation.py in func(*arrays)
553 return _apply_with_dask_atop(
554 numpy_func, arrays, input_dims, output_dims,
--> 555 signature, output_dtypes, output_sizes)
556 elif dask == 'allowed':
557 pass
~/Software/anaconda3/lib/python3.6/site-packages/xarray/core/computation.py in _apply_with_dask_atop(func, args, input_dims, output_dims, signature, output_dtypes, output_sizes)
654
655 return da.atop(func, out_ind, *atop_args, dtype=dtype, concatenate=True,
--> 656 new_axes=output_sizes)
657
658
~/Software/anaconda3/lib/python3.6/site-packages/dask/array/top.py in atop(func, out_ind, *args, **kwargs)
471 raise ValueError("Must specify dtype of output array")
472
--> 473 chunkss, arrays = unify_chunks(*args)
474 for k, v in new_axes.items():
475 chunkss[k] = (v,)
~/Software/anaconda3/lib/python3.6/site-packages/dask/array/core.py in unify_chunks(*args, **kwargs)
2568 for n, j in enumerate(i))
2569 if chunks != a.chunks and all(a.chunks):
-> 2570 arrays.append(a.rechunk(chunks))
2571 else:
2572 arrays.append(a)
~/Software/anaconda3/lib/python3.6/site-packages/dask/array/core.py in rechunk(self, chunks, threshold, block_size_limit)
1767 """ See da.rechunk for docstring """
1768 from . import rechunk # avoid circular import
-> 1769 return rechunk(self, chunks, threshold, block_size_limit)
1770
1771 @property
~/Software/anaconda3/lib/python3.6/site-packages/dask/array/rechunk.py in rechunk(x, chunks, threshold, block_size_limit)
223 for lc, rc in zip(chunks, x.chunks))
224 chunks = normalize_chunks(chunks, x.shape, limit=block_size_limit,
--> 225 dtype=x.dtype, previous_chunks=x.chunks)
226
227 if chunks == x.chunks:
~/Software/anaconda3/lib/python3.6/site-packages/dask/array/core.py in normalize_chunks(chunks, shape, limit, dtype, previous_chunks)
2013 for c, s in zip(map(sum, chunks), shape)):
2014 raise ValueError("Chunks do not add up to shape. "
-> 2015 "Got chunks=%s, shape=%s" % (chunks, shape))
2016
2017 return tuple(tuple(int(x) if not math.isnan(x) else x for x in c) for c in chunks)
ValueError: Chunks do not add up to shape. Got chunks=((10,),), shape=(7,)
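For context, a coefficient array of shape (7,) is what a third-order bandpass design produces. The sketch below assumes the coefficients came from scipy.signal.butter, which the post does not state; the filter order and cutoff frequencies are made up for illustration.

```python
from scipy import signal

# Hypothetical filter design (not taken from the original post):
# a 3rd-order Butterworth bandpass with arbitrary cutoffs,
# given as fractions of the Nyquist frequency.
b, a = signal.butter(3, [0.1, 0.3], btype="bandpass")

# A bandpass design of order N has 2 * N + 1 coefficients per array.
print(b.shape, a.shape)
```

Neither array shares a dimension with a (time, latitude, longitude) array, which is consistent with dask failing when it tries to rechunk a length-7 array into chunks of 10.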
Expected Output
I would expect a and b to simply be passed along with every chunk. I found a similar issue, #1697, but the workaround proposed there (passing non-chunked objects as kwargs) does not work for filtfilt, because a and b come before arr in the function call.
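As a rough illustration of that expectation, here is a plain NumPy sketch with no xarray or dask involved. It binds b and a up front with functools.partial, then checks that filtering separate longitude blocks gives the same result as filtering the whole array. The coefficients, array sizes, and the name bandpass are assumptions made for the example.

```python
from functools import partial

import numpy as np
from scipy import signal

# Hypothetical coefficients and data, for illustration only.
b, a = signal.butter(3, [0.1, 0.3], btype="bandpass")
rng = np.random.default_rng(0)
data = rng.standard_normal((200, 4, 6))  # (time, latitude, longitude)

# Bind the leading positional arguments so only the data remains.
bandpass = partial(signal.filtfilt, b, a, axis=0)

# Filter the whole array at once.
full = bandpass(data)

# Filter two longitude blocks separately, as a chunked run would.
blocks = [bandpass(data[:, :, :3]), bandpass(data[:, :, 3:])]
blockwise = np.concatenate(blocks, axis=2)

assert np.allclose(full, blockwise)
```

The check only holds because each block contains the full time axis; splitting along time would change the result at the block edges.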
The problem is that a and b don't have an aligned shape with arr -- the last dimension has the wrong shape. Basically, this function doesn't fit in the "ufunc" model.
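One way around this, sketched here under assumptions and not necessarily the exact code suggested in this thread, is to keep b and a out of the apply_ufunc() arguments entirely and capture them in a small wrapper function. The sketch also declares time as a core dimension, which differs from the axis=0 keyword used in the original call: apply_ufunc() moves core dimensions to the end, so the wrapper filters along axis=-1. The data, chunk sizes, coefficients, and the name bandpass are all hypothetical.

```python
import dask.array as da
import numpy as np
import xarray as xr
from scipy import signal

# Hypothetical coefficients and data, for illustration only.
b, a = signal.butter(3, [0.1, 0.3], btype="bandpass")
values = np.random.default_rng(0).standard_normal((200, 4, 6))
arr = xr.DataArray(
    da.from_array(values, chunks=(200, 2, 3)),  # a single chunk along time
    dims=("time", "latitude", "longitude"),
)


def bandpass(x):
    # b and a come from the enclosing scope; apply_ufunc never sees them.
    # Core dimensions are moved to the last axis, hence axis=-1.
    return signal.filtfilt(b, a, x, axis=-1)


filtered = xr.apply_ufunc(
    bandpass,
    arr,
    input_core_dims=[["time"]],
    output_core_dims=[["time"]],
    dask="parallelized",
    output_dtypes=[arr.dtype],
)

# The core dimension ends up last, so restore the original order.
filtered = filtered.transpose("time", "latitude", "longitude")
result = filtered.compute()
```

With dask='parallelized', the core dimension has to sit in a single chunk, so an array that is chunked along time would first need something like arr.chunk({"time": -1}).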
Thanks, @shoyer, the suggested solution works like a charm. I am just referencing #2808, in case this can serve as an example in an apply_ufunc() tutorial.