BUG: ewma with time gives strange results with adjust = False vs. True, difference is too large to make sense #40098

Open
1 task
jasonzhang2s opened this issue Feb 27, 2021 · 4 comments
Labels
Bug, Needs Discussion, Window

Comments

@jasonzhang2s

  • [ yes ] I have checked that this issue has not already been reported.

  • [ yes ] I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.

Code Sample, a copy-pastable example

import numpy as np
import pandas as pd

np.random.seed(0)
idx = pd.date_range('20000101', '20201231', periods=50000)
df = pd.DataFrame(data=np.random.normal(0, 1, 50000), index=idx)

# exclude first 1000 to avoid un-primed periods
df.ewm(halflife=pd.Timedelta('10d'), times=df.index, adjust=True).mean().iloc[1000:].plot()
df.ewm(halflife=pd.Timedelta('10d'), times=df.index, adjust=False).mean().iloc[1000:].plot()

@jasonzhang2s added the Bug and Needs Triage labels on Feb 27, 2021
@mroeschke
Member

@DiegoAlbertoTorres do you happen to know offhand how the ewm with times formula changes with adjust=False? (Formula found https://pandas.pydata.org/docs/user_guide/window.html#exponentially-weighted-window)

@mroeschke added the Needs Discussion and Window labels and removed the Needs Triage label on Mar 8, 2021
@DiegoAlbertoTorres
Contributor

I am not sure. The implementation of adjust=False rests on the identity that the denominator of EWMA when adjust=True, 1 + (1-a)^1 + (1-a)^2 + ..., converges to 1/a, so dividing by it is equivalent to simply multiplying by a (see image below, substitute a with alpha). This is expressed in the docs here:
[screenshot of the formula from the docs]
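The identity can be checked numerically for the case without times. The following is a minimal sketch in plain Python, not the pandas implementation; `ewma_adjust_true` and `ewma_adjust_false` are hypothetical helper names, and the two formulas are assumed from the user guide linked above.

```python
import random


def ewma_adjust_true(xs, alpha):
    """Weighted average with weights (1 - alpha)**i, recomputed at every step."""
    out = []
    for t in range(len(xs)):
        weights = [(1 - alpha) ** i for i in range(t + 1)]
        num = sum(w * xs[t - i] for i, w in enumerate(weights))
        out.append(num / sum(weights))
    return out


def ewma_adjust_false(xs, alpha):
    """Recursion y_0 = x_0, y_t = (1 - alpha) * y_(t-1) + alpha * x_t."""
    out = [xs[0]]
    for x in xs[1:]:
        out.append((1 - alpha) * out[-1] + alpha * x)
    return out


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(500)]
a_true = ewma_adjust_true(xs, alpha=0.1)
a_false = ewma_adjust_false(xs, alpha=0.1)

# Early steps can differ because the denominator has not yet approached 1 / alpha.
print(abs(a_true[5] - a_false[5]))
# Late steps should agree closely because the leftover term decays like (1 - alpha)**t.
print(abs(a_true[-1] - a_false[-1]))
```

With evenly weighted steps the two variants differ only during the warm-up period, which is why a persistent gap in the time-based case is surprising.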

However, when time is provided, the weight looks as below:
[image of the time-based weight formula]

This is not a geometric series, so you cannot assume that normalizing by it is equivalent to simply multiplying by alpha. This is easy to show with a time vector that simply repeats the same timestamp to infinity, which yields an infinite total weight. I am not sure what we should do here.
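The counterexample can be sketched in a few lines of plain Python. This is not pandas code: `geometric_weight_sum` and `time_weight_sum` are hypothetical helpers, timestamps are plain floats, and the time-based weight 0.5 ** ((t_last - t_i) / halflife) is assumed from the user guide formula linked above.

```python
def geometric_weight_sum(alpha, n):
    """Sum of the first n weights (1 - alpha)**i used when no times are given."""
    return sum((1 - alpha) ** i for i in range(n))


def time_weight_sum(times, halflife):
    """Sum of the weights 0.5**((t_last - t_i) / halflife) for numeric timestamps."""
    t_last = times[-1]
    return sum(0.5 ** ((t_last - t) / halflife) for t in times)


alpha = 0.1
# Geometric case: the partial sums approach 1 / alpha.
for n in (10, 100, 1000):
    print(n, geometric_weight_sum(alpha, n), 1 / alpha)

# Repeated timestamp: every weight is 0.5**0 == 1, so the sum equals
# the number of observations and never settles on a fixed value.
for n in (10, 100, 1000):
    print(n, time_weight_sum([0.0] * n, halflife=10.0))
```

The first loop settles near 1 / alpha, while the second grows with the number of observations, so no single constant can replace the denominator.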

I initially suspected that the adjustment we make to the iteration (for the adjust parameter) should be the same whether time is set or not. But the fact that the proof breaks down with my counterexample, plus Jason's discovery, suggests this might not hold at all. I have not run Jason's example; how big is the difference? I think if we double-check the code and construct large enough counterexamples, we should be able to show empirically that the math behind adjust=False does not hold when the weights do not follow a geometric series.
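One possible way to build such a counterexample without touching pandas internals is to look at the normalized weight that the weighted-average definition gives to the newest observation. A recursion of the form y_t = (1 - alpha) * y_(t-1) + alpha * x_t presumes that this weight is a constant alpha. In the sketch below, `newest_weight` and `newest_weights_over_time` are hypothetical helpers, timestamps are floats in days, and the random gaps are made up purely for illustration.

```python
import random


def newest_weight(times, halflife):
    """Normalized weight of the latest observation under the weighted-average definition."""
    t_last = times[-1]
    total = sum(0.5 ** ((t_last - t) / halflife) for t in times)
    return 1.0 / total


def newest_weights_over_time(times, halflife, start):
    """Newest-observation weight at each step from index `start` onward."""
    return [newest_weight(times[: k + 1], halflife) for k in range(start, len(times))]


halflife = 10.0  # days
n = 2000

# Regular spacing: the weights form a geometric series with ratio 0.5**(dt / halflife).
dt = 0.5
regular = [i * dt for i in range(n)]
alpha = 1 - 0.5 ** (dt / halflife)
reg = newest_weights_over_time(regular, halflife, start=1500)
print("regular:  ", min(reg), max(reg), "alpha =", alpha)

# Irregular spacing: gaps drawn at random, so there is no common ratio.
rng = random.Random(0)
t, irregular = 0.0, []
for _ in range(n):
    irregular.append(t)
    t += rng.uniform(0.01, 2.0)
irr = newest_weights_over_time(irregular, halflife, start=1500)
print("irregular:", min(irr), max(irr))
```

With regular spacing the newest-observation weight settles at a single alpha, whereas with irregular spacing it keeps moving from step to step, so a fixed-alpha recursion cannot reproduce the weighted average.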

@mroeschke
Member

Here's a plot of the diff of Jason's data (adjust=True - adjust=False) for reference

[plot: adjust_true_minus_false]

I think the safest thing to do would be to raise a NotImplementedError for times and adjust=False for now.
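Such a guard could look roughly like the following. This is only a sketch of the idea: `check_ewm_arguments` is a hypothetical helper, and both its name and the error message are assumptions rather than actual pandas code.

```python
def check_ewm_arguments(times=None, adjust=True):
    """Hypothetical validation helper: reject times combined with adjust=False."""
    if times is not None and not adjust:
        raise NotImplementedError("times is not supported with adjust=False.")


# Accepted combinations pass silently.
check_ewm_arguments(times=None, adjust=False)
check_ewm_arguments(times=[0.0, 1.0], adjust=True)

# The unsupported combination fails loudly instead of returning questionable numbers.
try:
    check_ewm_arguments(times=[0.0, 1.0], adjust=False)
except NotImplementedError as exc:
    print("rejected:", exc)
```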

@MarcoGorelli
Member

Relevant issue: #54328

I'm no expert here, but I think the solution might be to do the opposite of #40314, i.e. bring back adjust=False and raise on adjust=True?
