BUG: loffset not applied when using resample with agg() (GH13218) #14213

Closed
wants to merge 1 commit into from

Conversation

wcwagner
Contributor

@wcwagner wcwagner commented Sep 13, 2016

_apply_loffset is called in resampler's downsample and groupby_and_agg methods, defined here and here

When a str is passed into aggregate, the above methods are called here, and consequently the loffset is applied.

However, if anything other than a str is passed into aggregate, a different code path is taken and the loffset is never applied, so it has to be applied explicitly. A simple check in aggregate is enough to apply the loffset correctly.
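To make the two code paths concrete, here is a minimal toy model of the dispatch described above. It is only a sketch: `ToyResampler`, its `bins` dictionary, and its method bodies are hypothetical stand-ins written for illustration and are not the pandas implementation. Only the shape of the check in `aggregate` mirrors the change proposed in this PR.

```python
from datetime import datetime, timedelta


class ToyResampler:
    """Hypothetical stand-in for a resampler; not pandas' implementation."""

    def __init__(self, bins, loffset=None):
        # bins maps a bin label (datetime) to the values that fall in that bin
        self.bins = bins
        self.loffset = loffset

    def _apply_loffset(self, result):
        # Shift every bin label by loffset, if one was given
        if self.loffset is None:
            return result
        return {label + self.loffset: value for label, value in result.items()}

    def _downsample(self, func):
        # Named aggregations end up here, so the labels are shifted here
        result = {label: func(values) for label, values in self.bins.items()}
        return self._apply_loffset(result)

    def mean(self):
        return self._downsample(lambda values: sum(values) / len(values))

    def aggregate(self, arg):
        if isinstance(arg, str):
            # String path: dispatch to the named method, which already shifts
            return getattr(self, arg)()
        # Non-string path: _downsample is bypassed, so shift the labels here
        result = {label: arg(values) for label, values in self.bins.items()}
        return self._apply_loffset(result)


bins = {datetime(2000, 1, 1): [0, 1], datetime(2000, 1, 3): [2, 3]}
resampler = ToyResampler(bins, loffset=timedelta(hours=2))

by_name = resampler.aggregate("mean")
by_callable = resampler.aggregate(lambda values: sum(values) / len(values))
```

In this toy model both calls return labels shifted by two hours. Dropping the final `_apply_loffset` call in `aggregate` reproduces the reported symptom: the callable path returns unshifted labels.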

note: my last PR was closed, and I can't reopen it by a force push, hence the new one

@codecov-io

codecov-io commented Sep 13, 2016

Current coverage is 84.78% (diff: 100%)

Merging #14213 into master will increase coverage by <.01%

@@             master     #14213   diff @@
==========================================
  Files           145        145          
  Lines         51090      51094     +4   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43315      43321     +6   
+ Misses         7775       7773     -2   
  Partials          0          0          

Powered by Codecov. Last update e27b296...8619e9e

@@ -323,6 +323,11 @@ def aggregate(self, arg, *args, **kwargs):
*args,
**kwargs)

# if arg was a string, _aggregate called resampler's _downsample or
# _groupby_and_agg methods, which would've already applied the loffset
if not isinstance(arg, compat.string_types):
Contributor

unclear why this is happening here, and not in .downsample (and .upsample) instead. checking the arg type is not very robust (and doesn't follow a good pattern).

Contributor Author

It's not happening in there because .downsample is not called when anything other than a string is passed into agg.

Resampler has downsample & groupby_and_agg methods (e.g. count, mean), which in turn call _downsample; when a string is passed into agg, these methods are called. However, if a string isn't passed in, these methods are never called, and consequently neither is _downsample.

Contributor Author

And as for the implementation, for something as small as loffset, is it really worth refactoring a lot of code to change the way it is applied?

I've gone through the stack trying to find another place to apply it, but this is by far the clearest & simplest way I've found.
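For comparison, here is a sketch of one alternative that avoids tying the offset to the argument type: compute the aggregate without shifting anywhere, then shift once at a single exit point. This is purely illustrative and is not the change that was eventually merged. `SingleExitResampler`, `shift_labels`, and the small name table are hypothetical names introduced for this example.

```python
from datetime import datetime, timedelta


def shift_labels(result, loffset):
    # Return a copy of result with every label moved by loffset (if given)
    if loffset is None:
        return dict(result)
    return {label + loffset: value for label, value in result.items()}


class SingleExitResampler:
    """Hypothetical design sketch; not pandas' implementation."""

    # Named aggregations supported by this toy example
    _named = {
        "sum": sum,
        "count": len,
        "mean": lambda values: sum(values) / len(values),
    }

    def __init__(self, bins, loffset=None):
        self.bins = bins
        self.loffset = loffset

    def _compute(self, func):
        # Pure computation: never touches the labels
        return {label: func(values) for label, values in self.bins.items()}

    def aggregate(self, arg):
        # The string check only resolves a name to a function;
        # it no longer decides whether the offset is applied
        func = self._named[arg] if isinstance(arg, str) else arg
        return shift_labels(self._compute(func), self.loffset)


bins = {datetime(2000, 1, 1): [0, 1], datetime(2000, 1, 3): [2, 3]}
resampler = SingleExitResampler(bins, loffset=timedelta(hours=2))

named_result = resampler.aggregate("sum")
callable_result = resampler.aggregate(max)
```

The trade-off is the one raised in the comment above: a single exit point is harder to get wrong, but retrofitting it onto existing methods that already shift their own labels means touching more code than a small check does.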

@jreback jreback added Resample resample method Bug labels Sep 13, 2016
@jreback
Contributor

jreback commented Nov 25, 2016

@wcwagner if you want to update to 0.19.2 and rebase, we can take another look.

@jreback
Contributor

jreback commented Dec 26, 2016

can you move release note to 0.20.0

@wcwagner
Contributor Author

Sorry for the delay, I will update everything tomorrow.

@jreback
Contributor

jreback commented Dec 28, 2016

pls add example referenced in #15002 as well (your changes should cover it)

@wcwagner
Contributor Author

@jreback do you want me to add the example in #15002 into the unittests? I suppose it would be a good idea to do so, seeing as it tests the how parameter (even though it's deprecated).

@jreback
Contributor

jreback commented Dec 29, 2016

yep

@wcwagner
Contributor Author

Seems like resample's loffset tests were removed from TestPeriodIndex and TestTimedeltaIndex. Should I remove my tests from those two as well?

@@ -1188,6 +1188,29 @@ def test_resample_loffset_count(self):

assert_series_equal(result, expected)

def test_resample_loffset_arg_type(self):
# GH 13218, 15002
Contributor

would be better if you can put this in Base, so that the code is not duplicated. Or at least do it partially (e.g. if a specific part can go there, put it there, then add another test in the sub-classes to cover the rest).
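The pattern being suggested can be sketched with plain `unittest`: a shared test lives in a mixin, and each subclass only supplies its own index type. Everything below is a hypothetical illustration. `create_labels` is an invented hook name and the assertion is deliberately trivial, so this is not the test that ended up in pandas.

```python
import unittest
from datetime import datetime, timedelta


class Base:
    """Shared tests; subclasses supply create_labels (a hypothetical hook)."""

    def create_labels(self):
        raise NotImplementedError

    def test_loffset_shifts_every_label(self):
        labels = self.create_labels()
        loffset = timedelta(hours=2)
        shifted = [label + loffset for label in labels]
        # Every label must move by exactly loffset, whatever its type
        for original, moved in zip(labels, shifted):
            self.assertEqual(moved - original, loffset)


class TestDatetimeLabels(Base, unittest.TestCase):
    def create_labels(self):
        start = datetime(2000, 1, 1)
        return [start + timedelta(days=2 * i) for i in range(5)]


class TestTimedeltaLabels(Base, unittest.TestCase):
    def create_labels(self):
        return [timedelta(days=2 * i) for i in range(5)]
```

Because `Base` does not inherit from `unittest.TestCase`, the test runner does not collect it on its own. The shared test runs once per concrete subclass, and type-specific checks can still be added to a subclass as separate tests.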

expected_index = self.create_index(df.index[0],
periods=len(df.index) / 2,
freq='2D')
# loffset coerces PeriodIndex to DatetimeIndex
Contributor

hmm that seems odd that a PI coerces when loffset is applied. does this happen in previous versions of pandas?

Contributor Author

@wcwagner wcwagner Dec 30, 2016

@jreback I'm pretty sure in v0.18 the same thing happened. Is this intended?

In [19]: df = pd.DataFrame(data=list(range(10)), index=pd.period_range("1/1/2000", freq="D", periods=10))

In [20]: df.resample("2D").mean().index
Out[20]:  DatetimeIndex(['2000-01-01', '2000-01-03', '2000-01-05', '2000-01-07',
               '2000-01-09'],
              dtype='datetime64[ns]', freq='2D')

Contributor

so I added this to the master issue: #12871, so that when PI is more compat (IOW it returns PI when PI is input, though not always possible) for resampling this will be tested.

@jreback jreback mentioned this pull request Dec 31, 2016
13 tasks
@jreback jreback added this to the 0.20.0 milestone Dec 31, 2016
@jreback jreback closed this in b2cdc02 Dec 31, 2016
@jreback
Contributor

jreback commented Dec 31, 2016

thanks @wcwagner

I reworked this a bit to make it cleaner in 3662413

love for you to tackle some more issues!

Labels
Bug Resample resample method
Development

Successfully merging this pull request may close these issues.

loffset doesn't work in resample if used with agg()
3 participants