Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
>>> import pandas as pd
>>> df = pd.DataFrame(
...     {
...         'A': [0, 1, 2]
...     },
... )
For example, suppose you have a DataFrame like the one above.
>>> df[
...     pd.Series([-1, 1, pd.NA]) < 0
... ]
   A
0  0
This works as expected.
>>> df[
...     pd.Series([-1, 1, pd.NA]).convert_dtypes() < 0
... ]
   A
0  0
This also works as expected.
>>> df['A'].mask(
...     pd.Series([-1, 1, pd.NA]) < 0,
...     100
... )
0    100
1      1
2      2
Name: A, dtype: int64
This also works as expected.
>>> df['A'].mask(
...     pd.Series([-1, 1, pd.NA]).convert_dtypes() < 0,
...     100
... )
0    100
1      1
2    100  # !?
Name: A, dtype: int64
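Here row 2 is replaced even though its condition is pd.NA rather than True. This is inconsistent with the boolean indexing above, and it confuses a lot of people.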
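To see where the two calls diverge, it can help to inspect the two conditions directly. The snippet below is a minimal sketch, assuming a pandas version with nullable dtypes (1.0 or later). The variable names are illustrative only.

```python
import pandas as pd

values = pd.Series([-1, 1, pd.NA])

# Comparison on the original object-dtype Series, as in the first
# .mask() example above.
cond_object = values < 0

# Comparison after convert_dtypes(): the missing entry is propagated,
# so the result is a nullable "boolean" Series that contains <NA>.
cond_nullable = values.convert_dtypes() < 0

print(cond_object.dtype)
print(cond_nullable.dtype)
print(cond_nullable.isna().tolist())  # [False, False, True]
```

Only the second condition carries a missing value into .mask(), which is the entry that ends up being replaced.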
Feature Description
I think most people would expect the following result, with row 2 left unchanged.
>>> df['A'].mask(
...     pd.Series([-1, 1, pd.NA]).convert_dtypes() < 0,
...     100
... )
0    100
1      1
2      2
Name: A, dtype: int64
I think pandas should either make a comparison involving pd.NA evaluate to False
>>> pd.Series([-1, 1, pd.NA]).convert_dtypes() < 0
0     True
1    False
2    False
dtype: boolean
or change the .mask() method so that pd.NA in the condition is treated the same as False.
Alternative Solutions
Calling .fillna(False) on the condition gives the expected result, but I think having to do this on every call would be tedious for many people.
>>> df['A'].mask(
...     (pd.Series([-1, 1, pd.NA]).convert_dtypes() < 0).fillna(False),
...     100
... )
0    100
1      1
2      2
Name: A, dtype: int64
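Until the behavior changes, the workaround could be wrapped in a small helper so that it does not have to be repeated at every call site. This is only a sketch: mask_na_as_false is a hypothetical name, not part of pandas, and it assumes the condition is a Series.

```python
import pandas as pd


def mask_na_as_false(data, cond, other):
    """Hypothetical helper (not a pandas API): replace entries of `data`
    where `cond` is True, treating missing entries in `cond` as False."""
    # Fill missing condition entries with False before masking,
    # mirroring the .fillna(False) workaround shown above.
    return data.mask(cond.fillna(False), other)


df = pd.DataFrame({"A": [0, 1, 2]})
cond = pd.Series([-1, 1, pd.NA]).convert_dtypes() < 0

result = mask_na_as_false(df["A"], cond, 100)
print(result.tolist())  # [100, 1, 2]
```

This keeps the call sites short, but it is still a user-side patch. The request above is for .mask() itself to handle pd.NA this way.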
Additional Context
No response