Skip to content

BUG: loffset not applied when using resample with agg() (GH13218) #13861

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

wcwagner
Copy link
Contributor

@wcwagner wcwagner commented Aug 1, 2016

df = DataFrame({'A': vals, 'B': vals}, index=rng)

result = df.resample('10S', loffset='2h').agg({'A': 'count',
'B': 'mean'})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u test for a series aggregation as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u add PeriodIndex also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback @sinhrks I was way too naive with the fix, it ends up applying loffset twice when not passing a dict to agg(). I will try to get a better fix along with more tests tonight.

@sinhrks sinhrks added Bug Resample resample method labels Aug 1, 2016
@sinhrks sinhrks added this to the 0.19.0 milestone Aug 1, 2016
@codecov-io
Copy link

codecov-io commented Aug 2, 2016

Current coverage is 85.29% (diff: 100%)

Merging #13861 into master will decrease coverage by <.01%

@@             master     #13861   diff @@
==========================================
  Files           139        139          
  Lines         50164      50168     +4   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          42788      42791     +3   
- Misses         7376       7377     +1   
  Partials          0          0          

Powered by Codecov. Last update 257ac88...dbb054f

@@ -310,6 +310,9 @@ def aggregate(self, arg, *args, **kwargs):
*args,
**kwargs)

if isinstance(arg, dict):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should do this check inside _apply_loffset

@jreback
Copy link
Contributor

jreback commented Aug 8, 2016

in any event, pls add the test for timedelta index, but assert it fails (and put a TODO for the xref issue #7687 )

@wcwagner
Copy link
Contributor Author

wcwagner commented Aug 9, 2016

@jreback I'm not quite sure what you mean by put a TODO, could you please clarify?

Anyways, doing the tests was kind of tricky, so please lmk if I went in the right direction.

@jreback
Copy link
Contributor

jreback commented Aug 9, 2016

what I mean is can leave the timedelta fixes for another PR to fix
but u can out the tests in for them (and since they will fail u can just assert that they fail and add a comment, a todo for fixing later)

@wcwagner
Copy link
Contributor Author

wcwagner commented Aug 9, 2016

@jreback for future reference, when I update my PRs should I @ someone with write access to let them know? Or should I just wait, and check on them intermittently, rebasing when they get out of date?

thanks

result = df.resample('2D', loffset='2H').count()

# GH 13022, 7687 resample w/ TimedeltaIndex results in incorrect index
if isinstance(df.index, TimedeltaIndex):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no - put in the separate test classes as I indicated

@jreback
Copy link
Contributor

jreback commented Aug 9, 2016

you can ping me if u want me to look

I added some comments

@wcwagner
Copy link
Contributor Author

wcwagner commented Aug 9, 2016

@jreback split the tests into separate test classes

@@ -368,7 +370,7 @@ def _gotitem(self, key, ndim, subset=None):
return grouped

def _groupby_and_aggregate(self, how, grouper=None, *args, **kwargs):
""" revaluate the obj with a groupby aggregation """
""" reevaluate the obj with a groupby aggregation """
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-evaluate

the result of resample
arg : string or dict, optional
only passed in when agg() is used w/ resample
if string is passed to agg, loffset is already applied
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is VERY confusing. pls explain.

@wcwagner
Copy link
Contributor Author

@jreback I think I explained the use of arg sufficiently.

I also didn't realize that loffset was broken when passing in a list to agg, e.g. df.resample('2d', loffset='2h').agg(['mean'])., so I added a fix and put in the corresponding tests.

@wcwagner wcwagner force-pushed the bug/13218 branch 2 times, most recently from 76ebcf6 to e2c3549 Compare August 10, 2016 11:44
@wcwagner
Copy link
Contributor Author

@jreback when you have time can you take a look at this, thanks

----------
result : Series or DataFrame
the result of resample
arg : string, list, or dict, optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this needs to be refactored to not do this. It is unlike anything else in pandas to pass an argument to determine state.

@wcwagner
Copy link
Contributor Author

wcwagner commented Aug 13, 2016

@jreback I've tried, but can't think of any more alternative solutions. I went back to what I had originally, with the logic in the aggregate() functions.

if this is insufficient, please lmk if you want me to close the PR

@jreback
Copy link
Contributor

jreback commented Aug 13, 2016

well, then reopen if you have a better way.

@jreback jreback closed this Aug 13, 2016
@wcwagner
Copy link
Contributor Author

wcwagner commented Aug 22, 2016

@jreback I force pushed but I don't have the option to reopen. I don't wanna create a dupe, so is there any way you can reopen this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Resample resample method
Projects
None yet
Development

Successfully merging this pull request may close these issues.

loffset doesn't work in resample if used with agg()
4 participants