Skip to content

Conversation

spencerkclark
Copy link
Member

The remaining failing cftime resample tests in #8091 happen to be related to a bug that was fixed in the pandas implementation, pandas-dev/pandas#55283, leading answers to change in some circumstances. This PR ports that bug fix to xarray's implementation of resample for data indexed by a CFTimeIndex.

A simple example where answers change in pandas is the following:

Previously

>>> import numpy as np; import pandas as pd
>>> index = pd.date_range("2000", periods=5, freq="5D")
>>> series = pd.Series(np.arange(index.size), index=index)
>>> series.resample("2D", closed="right", label="right", offset="1s").mean()
2000-01-01 00:00:01    0.0
2000-01-03 00:00:01    NaN
2000-01-05 00:00:01    1.0
2000-01-07 00:00:01    NaN
2000-01-09 00:00:01    NaN
2000-01-11 00:00:01    2.0
2000-01-13 00:00:01    NaN
2000-01-15 00:00:01    3.0
2000-01-17 00:00:01    NaN
2000-01-19 00:00:01    NaN
2000-01-21 00:00:01    4.0
Freq: 2D, dtype: float64

Currently

>>> import numpy as np; import pandas as pd
>>> index = pd.date_range("2000", periods=5, freq="5D")
>>> series = pd.Series(np.arange(index.size), index=index)
>>> series.resample("2D", closed="right", label="right", offset="1s").mean()
2000-01-01 00:00:01    0.0
2000-01-03 00:00:01    NaN
2000-01-05 00:00:01    NaN
2000-01-07 00:00:01    1.0
2000-01-09 00:00:01    NaN
2000-01-11 00:00:01    2.0
2000-01-13 00:00:01    NaN
2000-01-15 00:00:01    NaN
2000-01-17 00:00:01    3.0
2000-01-19 00:00:01    NaN
2000-01-21 00:00:01    4.0
Freq: 2D, dtype: float64

This PR allows us to reproduce this change in xarray for data indexed by a CFTimeIndex. The bin edges were incorrect in the previous case; see pandas-dev/pandas#52064 (comment) for @MarcoGorelli's nice explanation as to why.

@dcherian
Copy link
Contributor

Thanks Spencer, the RTD failure looks like bad rst formatting.

/home/docs/checkouts/readthedocs.org/user_builds/xray/checkouts/8393/doc/whats-new.rst:52: ERROR: Unexpected indentation.
/home/docs/checkouts/readthedocs.org/user_builds/xray/checkouts/8393/doc/whats-new.rst:46: WARNING: Inline interpreted text or phrase reference start-string without end-string.

@keewis keewis added the run-upstream Run upstream CI label Oct 31, 2023
Co-authored-by: Justus Magin <[email protected]>
@dcherian dcherian added the plan to merge Final call for comments label Oct 31, 2023
@dcherian dcherian merged commit d933578 into pydata:main Nov 2, 2023
@spencerkclark spencerkclark deleted the port-pandas-resample-fix branch November 2, 2023 09:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
plan to merge Final call for comments run-upstream Run upstream CI topic-pandas-like
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants