EA ops alignment with DataFrame #24301
Comments
Can you give a code example of what you mean? (isn't it expected that in |
This came up in #24282 trying to make use of the decorator for IntNA arithmetic ops. The relevant portion of the test that would be affected is: (see #24326, turning the tests into something copy/paste-able is much harder than it should be)
Instead of raising NIE, I want to make
I advocate option 4, as it matches numpy behavior that people are used to and would also let us avoid losing EA dtypes when transposing. |
I don't see how that example relates to the original post, which was about DataFrame + EA. Is the issue about DataFrame + Series[ea], in which case alignment matters? Or is it about DataFrame + EA, when there isn't any alignment (but maybe broadcasting?) To me that should behave the same as DataFrame + ndarray:
In [32]: df = pd.DataFrame({"A": [1, 2, 3]})
In [33]: df + pd.core.arrays.integer_array([1, 2, 3])
# raises ValueError, just like for ndarray
In [38]: df + np.array([1])
Out[38]:
   A
0  2
1  3
2  4
In [39]: df + pd.core.arrays.integer_array([1])
Out[39]:
   A
0  2
1  3
2  4
I don't think that last one is dispatching to the EA implementation though. |
@TomAugspurger It’s about DataFrame + EA. The example with a length-1 array is a special case. Try the same thing with an array with length matching len(df). Both the EA and ndarray cases should raise ValueError. But in the ndarray case it can be solved by reshaping, while the EA case cannot. |
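To make the ndarray half of that concrete, here is a minimal sketch of the reshape workaround, assuming a reasonably recent pandas and NumPy. The names `arr`, `column_like`, and `raised` are only illustrative, and the values are made up. It shows the 1-D array being rejected and the same data being accepted once reshaped to `(nrows, 1)`; a 1-D-only EA has no equivalent reshape.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})
arr = np.array([10, 20, 30])  # length matches len(df), not the number of columns

# A 1-D array is matched against the columns, so a length-3 array
# against a single-column frame is rejected.
try:
    df + arr
    raised = False
except ValueError:
    raised = True

# Reshaping to (nrows, 1) makes the ndarray column-like, and the op goes through.
column_like = arr.reshape(-1, 1)
result = df + column_like
print(raised)
print(result)
```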
I don't think that's the recommended way of doing it though. We'd direct users towards
In [17]: df.add(pd.core.arrays.integer_array(arr), axis=0)
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
<ipython-input-17-84452ee6a044> in <module>
----> 1 df.add(pd.core.arrays.integer_array(arr), axis=0)
~/sandbox/pandas/pandas/core/ops.py in f(self, other, axis, level, fill_value)
2016 return _combine_series_frame(self, other, pass_op,
2017 fill_value=fill_value, axis=axis,
-> 2018 level=level)
2019 else:
2020 if fill_value is not None:
~/sandbox/pandas/pandas/core/ops.py in _combine_series_frame(self, other, func, fill_value, axis, level)
1903 axis = self._get_axis_number(axis)
1904 if axis == 0:
-> 1905 return self._combine_match_index(other, func, level=level)
1906 else:
1907 return self._combine_match_columns(other, func, level=level)
~/sandbox/pandas/pandas/core/frame.py in _combine_match_index(self, other, func, level)
4925 # fastpath --> operate directly on values
4926 with np.errstate(all="ignore"):
-> 4927 new_data = func(left.values.T, right.values).T
4928 return self._constructor(new_data,
4929 index=left.index, columns=self.columns,
~/sandbox/pandas/pandas/core/arrays/integer.py in integer_arithmetic_method(self, other)
585 if getattr(other, 'ndim', 0) > 1:
586 raise NotImplementedError(
--> 587 "can only perform ops with 1-d structures")
588
589 if isinstance(other, IntegerArray):
NotImplementedError: can only perform ops with 1-d structures
So I would focus on getting that working I think. In this case, I think the issue is that |
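For comparison, a small sketch of the same flex-method call with a plain ndarray and with a Series, assuming a reasonably recent pandas. The frame, the values, and the names `by_index` and `by_series` are made up for illustration. It only shows what `axis=0` is meant to do (match the other operand against the index and apply it down each column); it says nothing about how the EA path behaves in the version from the traceback above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
arr = np.array([10, 20, 30])

# axis=0 matches the 1-D operand against the index, so it is
# applied down every column instead of across the columns.
by_index = df.add(arr, axis=0)

# Wrapping the values in a Series that shares df's index gives the same result.
ser = pd.Series(arr, index=df.index)
by_series = df.add(ser, axis=0)

print(by_index)
print(by_index.equals(by_series))
```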
One more reason why 2-D EAs should be supported: df + ea won't treat ea as column-like since it can't be reshaped to (nrows, 1). And operating with df.T has its own set of problems because transposing drops EA dtypes.