Open
Labels: Bug, ExtensionArray (extending pandas with custom dtypes or arrays), Strings (string extension data type and string data)
Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
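import numpy as np
import pandas as pd

print(pd.__version__)

df = pd.DataFrame({"A": ["1", "", "3"]}, dtype="string")

# inplace=False: replace empty strings with np.nan and inspect the backing array
try:
    result = df.where(df != "", np.nan)
    arr = result["A"]._values
    print(arr)
    print(type(arr[1]))
except Exception as e:
    print(e)

# inplace=True: same replacement, applied to df itself
df.where(df != "", np.nan, inplace=True)
print(df)
arr = df["A"]._values
print(arr)
print(type(arr[1]))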
Issue Description
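Code sample based on #46366. The output below is from 1.4.1 first, then from main (1.5.0.dev0+595.gf99ec8bf80); each run starts with the printed version.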
1.4.1
StringArray requires a sequence of strings or pandas.NA
     A
0    1
1  NaN
2    3
<StringArray>
['1', nan, '3']
Length: 3, dtype: string
<class 'float'>
1.5.0.dev0+595.gf99ec8bf80
<StringArray>
['1', <NA>, '3']
Length: 3, dtype: string
<class 'pandas._libs.missing.NAType'>
     A
0    1
1  NaN
2    3
<StringArray>
['1', nan, '3']
Length: 3, dtype: string
<class 'float'>
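The outputs above differ only in which object stands in for the missing entry (a float nan versus pandas.NA). A small helper can make that visible without reading array reprs. This is a minimal sketch: `missing_value_types` is a hypothetical name, not a pandas function, and it uses the public `Series.array` accessor rather than `_values`.

```python
import pandas as pd


def missing_value_types(series: pd.Series) -> set:
    """Return the type names of the entries that pandas treats as missing.

    Hypothetical helper for illustration; not part of pandas.
    """
    values = series.array  # public accessor for the backing ExtensionArray
    return {
        type(value).__name__
        for value, is_missing in zip(values, series.isna())
        if is_missing
    }


# A well-formed string Series should hold only pandas.NA for missing entries.
clean = pd.Series(["1", pd.NA, "3"], dtype="string")
print(missing_value_types(clean))
```

Applied to `df["A"]` after the inplace=True call in the example, a result containing `float` would indicate the inconsistent state described in this report.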
Expected Behavior
The behavior for the inplace=False case has changed from 1.4.1 to main: on 1.4.1 it raises, while on main it returns a StringArray holding <NA>, since #45168 allows other NA values in the StringArray constructor.
Whether this is correct for the DataFrame.where case may need discussion. Either way, the results for the inplace=True case, which leaves a float nan inside the StringArray, look incorrect to me and should be consistent with the inplace=False case.
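Until the two paths agree, one possible way to sidestep the question is to pass pd.NA as the replacement value instead of np.nan. The sketch below assumes that both code paths store pd.NA when it is given explicitly; that assumption has not been checked against the exact versions in this report.

```python
import pandas as pd

df = pd.DataFrame({"A": ["1", "", "3"]}, dtype="string")

# Assumption: passing pd.NA (not np.nan) keeps the StringArray valid on both paths.
result = df.where(df != "", pd.NA)

df_inplace = df.copy()
df_inplace.where(df_inplace != "", pd.NA, inplace=True)

# Inspect the missing entry through the public .array accessor.
print(type(result["A"].array[1]))
print(type(df_inplace["A"].array[1]))
```

This only avoids the problem for callers; it does not address the inconsistency between the inplace=True and inplace=False cases when np.nan is passed.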
Installed Versions
.