Implement BusinessWindowIndexer for non-fixed offsets #24

@mroeschke mroeschke commented Sep 10, 2019

@@ -395,6 +395,8 @@ def _get_window_indexer(self, index_as_array):
-------
VariableWindowIndexer or FixedWindowIndexer
"""
if isinstance(self.window, BaseIndexer):
Collaborator

So we ignore the arg when using a custom indexer? Or should it still be passed (and then ignored inside the custom indexer if it isn't needed)?

Imagine writing VariableWindowIndexer and FixedWindowIndexer as a BaseIndexer (which I think we should do at some point).

Collaborator Author

All custom indexers should subclass BaseIndexer, so self.window is the custom indexer itself, and it accepts the arguments passed down from mean (via _apply).

VariableWindowIndexer and FixedWindowIndexer already subclass BaseIndexer.

@mroeschke
Collaborator Author

Question, @DiegoAlbertoTorres and @jreback, on the definition of a rolling business day.

Given the example below:

In [3]: import pandas as pd

In [4]: from pandas.core.window.indexers import BusinessWindowIndexer

In [5]: index = pd.date_range("2019-09-06", "2019-09-13", freq="D")

In [12]: df = pd.DataFrame(range(len(index)), index=index)

In [14]: df['day_names'] = df.index.day_name()

In [15]: df
Out[15]:
            0  day_names
2019-09-06  0     Friday
2019-09-07  1   Saturday
2019-09-08  2     Sunday
2019-09-09  3     Monday
2019-09-10  4    Tuesday
2019-09-11  5  Wednesday
2019-09-12  6   Thursday
2019-09-13  7     Friday

Are values that fall on non-business days aggregated in the window? (Currently they are)

In [16]: df[0].rolling('D').mean()
Out[16]:
2019-09-06    0.0
2019-09-07    1.0
2019-09-08    2.0
2019-09-09    3.0
2019-09-10    4.0
2019-09-11    5.0
2019-09-12    6.0
2019-09-13    7.0
Freq: D, Name: 0, dtype: float64


# 2019-09-09 window = (Friday -> Monday] = Saturday + Sunday + Monday = (3 + 2 + 1) / 3 = 2
# 2019-09-08 window = (Friday -> Sunday] = Saturday + Sunday = (2 + 1) / 2 = 1.5
In [17]: df[0].rolling(BusinessWindowIndexer(index=df.index, offset="B")).mean()
Out[17]:
2019-09-06    0.0
2019-09-07    1.0
2019-09-08    1.5
2019-09-09    2.0
2019-09-10    4.0
2019-09-11    5.0
2019-09-12    6.0
2019-09-13    7.0
Freq: D, Name: 0, dtype: float64
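
For reference, the window rule described in the two comments above, (previous business day, current day], can be reproduced without the indexer. The following is only a plain-Python sketch under stated assumptions: business days are taken to be Monday through Friday with no holiday calendar, and `previous_business_day` and `business_window_means` are hypothetical helper names, not part of this PR or of pandas.

```python
from datetime import date, timedelta


def previous_business_day(day):
    """Return the latest Monday-Friday date strictly before `day`.

    Assumption: holidays are ignored; only weekends are skipped.
    """
    prev = day - timedelta(days=1)
    while prev.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        prev -= timedelta(days=1)
    return prev


def business_window_means(days, values):
    """Mean of the values dated in (previous business day, current day]."""
    means = []
    for current in days:
        start = previous_business_day(current)
        # Left-open, right-closed window, as in the comments above.
        window = [v for d, v in zip(days, values) if start < d <= current]
        means.append(sum(window) / len(window))
    return means


# Same data as the example: 2019-09-06 (Friday) through 2019-09-13 (Friday).
days = [date(2019, 9, 6) + timedelta(days=i) for i in range(8)]
values = list(range(len(days)))
result = business_window_means(days, values)

for d, m in zip(days, result):
    print(d, m)
```

Under these assumptions the weekend values are included in the windows for Saturday, Sunday, and Monday, which is the behavior the question above is asking about.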

@mroeschke mroeschke changed the title WIP: Implement BusinessWindowIndexer for non-fixed offsets Implement BusinessWindowIndexer for non-fixed offsets Sep 12, 2019
@mroeschke
Collaborator Author

Will document this in the POC doc after this PR.

@mroeschke
Collaborator Author

Closing this example. It requires a non-trivial reworking of the internals for a feature that can already be done today.

@mroeschke mroeschke closed this Sep 17, 2019
@mroeschke mroeschke deleted the feature/rolling_business_day branch November 5, 2019 06:32