[BUG] Index of the RHS is being ignored in assignments #7368

Closed
magnatelee opened this issue Feb 10, 2021 · 3 comments
Labels: bug (Something isn't working), Python (Affects Python cuDF API.)

Comments

@magnatelee
Contributor

Describe the bug
cuDF seems to be ignoring the index of the RHS when performing an assignment.

Steps/Code to reproduce bug

In the following example, assignment (1) treats y as if its index were [0, 1, 2], i.e. it assigns the values by position rather than by label, which deviates from the semantics of Pandas:

>>> import cudf
>>> x = cudf.Series([1, 2, 3])
>>> x
0    1
1    2
2    3
dtype: int64
>>> y = cudf.Series([3, 2, 1], index=[2, 1, 0])
>>> x.iloc[:] = y  # (1)
>>> x
0    3
1    2
2    1
dtype: int64

Here is the result of the same program in Pandas:

>>> import pandas as pd
>>> x = pd.Series([1, 2, 3])
>>> x
0    1
1    2
2    3
dtype: int64
>>> y = pd.Series([3, 2, 1], index=[2, 1, 0])
>>> x.iloc[:] = y
>>> x
0    1
1    2
2    3
dtype: int64

Expected behavior
cuDF should yield the same result as Pandas. I believe internally Pandas performs a left outer join between indices of the LHS and the RHS to find the values of the RHS that correspond to the LHS.

@magnatelee added the bug and Needs Triage labels Feb 10, 2021
@shwina added the Python label and removed the Needs Triage label Feb 12, 2021
@isVoid self-assigned this Feb 26, 2021
@isVoid
Contributor

isVoid commented Mar 2, 2021

As of Pandas 1.2.0, Pandas seems to have reintroduced the positional behavior, so it now gives the same result as the cuDF example above:

>>> x = pd.Series([1, 2, 3])
>>> y = pd.Series([3, 2, 1], index=[2, 1, 0])
>>> x.iloc[:] = y
>>> x
0    3
1    2
2    1
dtype: int64

@isVoid
Contributor

isVoid commented Mar 2, 2021

Reported: pandas-dev/pandas#40184

@isVoid
Contributor

isVoid commented Mar 3, 2021

According to the Pandas thread, it seems the reported behavior is expected. Closing.
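Given that positional assignment through `iloc` is the expected behavior, code that must not depend on the library version can state its intent explicitly. The following is a minimal sketch using Pandas only; whether the same pattern is the best fit for cuDF is an assumption that has not been checked here.

```python
import pandas as pd

x = pd.Series([1, 2, 3])
y = pd.Series([3, 2, 1], index=[2, 1, 0])

# Positional intent: drop the RHS index so no alignment can take place.
positional = x.copy()
positional.iloc[:] = y.to_numpy()

# Label intent: align the RHS to the LHS index first, then assign by position.
aligned = x.copy()
aligned.iloc[:] = y.reindex(aligned.index).to_numpy()

print(positional.tolist())
print(aligned.tolist())
```

Converting the RHS to a plain array removes any ambiguity about whether its index takes part in the assignment.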

@isVoid closed this as completed Mar 3, 2021