-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
bug in tz_localize when summer/winter time change #8531
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
try this using 0.15.0rc1 http://pandas.pydata.org/ their is now an
cc @rockg |
I have updated the version - your example work fine.
|
How can I 'set' ambiguous equal True in my previous example ? |
Also:
|
@gbakalian out of curiosity why are you resampling over a DST transition with hourly data? |
LOL indeed can sounds weird ! In my database I received several curves with different frequencies and different timezones from different weather providers. I need to get them at the same time zone and resample them hourly. but it can happen that among all the curves I retrieve some which are already hourly. if you prefer the following example also crash :
|
thanks! that sounds like a good use case. if you'd like to dig-in would welcome a PR to fix! It prob just not passing the argument thru |
May I ask you a question: The ambiguous parameters is set in which element ? - and is it kept ? (sorry if I am asking dumb questions) because in the previous example I do not tell him anywhere that I want ambiguous=True, as I made the tz localisation using pytz and not tz_localize on pd.Series. So maybe we should be able to pass ambiguous through resample ?
this does not work neither so I guess in tz_localize_to_utc in the tslib.pyx file the ambiguous parameters is not passed or read correctly ? Also I guess the 4 tz_localize between lines 418-428 in the index.py file should have ambiguous=ambiguous. |
that sounds about right. here's some help for doing a pull-request: https://github.com/pydata/pandas/wiki this prob has several aspects which are not handling this: have it fail, then fix. |
True/False are not valid arguments to ambiguous when dealing with a DatetimeIndex. Use |
If you want to explicitly tell what times are ambiguous with multiple DST switches you need to pass in an array of True/False values. Now, as to passing in ambiguous into the start/end times I think the right way to handle this is to pass in a non-ambiguous time into the constructor of the DatetimeIndex and then when it's localized further below everything should work properly. Otherwise you will maybe have multiple arguments that need to be passed with ambiguous. |
Passing True/False into ambiguous is ambiguous in itself. I agree that the naming in the Timestamp case is a bit misleading but in the DatetimeIndex case it should not be. We tried to be as explicit as possible by forcing the user to fully specify DST/non-DST flags or using the infer/raise/NaT arguments to specify how those ambiguous times should be handled. In the case of the Timestamp, True/False are analogous to the DST/non-DST flags of the DatetimeIndex (and you will see that they are cast to an array when passed into the underlying localize function). |
You should not need an
Yeah, I've been kicking myself for fighting so hard for that name; I was forgetting that practicality beats purity. |
I like |
I am ok with de-ambiguitizing (is that a word?) the arguments, and bringing back The resample issue (this issue) can wait for 0.15.1. its prob just doing something dumb somewhere. |
Well there really are two uses. When you have one timestamp we should just try and replicate what tz.localize does in pytz. So that can be done in Timestamp's constructor/tz_localize. Ambiguous should remain for DatetimeIndex and is_dst should not be used. I'm out of dodge for awhile so somebody else would have to do it. |
I really like having a consistent interface, though. Maybe the best answer is to use a categorical? Then you would have |
currently
you are refering to the |
@ischwabacher you want to accept |
Yes. It would just be for clarity; IIUC a |
haha, no you can accept a ok, sounds like we can do this for 0.15.1 then (as you don't need to bring back @ischwabacher why don't you create a new issue for this change (if you want to add in |
I still don't understand what the point of I completely agree that Categorical is too involved. |
Okay, I think I understand. For Timestamps, |
Are there no vectorized operations on
I posted that and was immediately confronted with the horrible vision of |
I was thinking when constructing a single Timestamp, one should always know whether it's a dst or non-dst Timestamp and so it would never make sense to have those other keywords. However, when doing vectorized operations then I guess it does make sense. I still am not sold on having more than argument. I don't know of a case that we can't handle with the current one. I like your new naming for Timestamp but I don't see how that would apply to a DatetimeIndex (unless you have an array of 'dst', 'std' but that seems overkill). We could do something like |
Solution: named constants |
I'm happy with that. |
@ischwabacher I don't understand your last point on named constants. Can you give an example of how that would look like? |
I want to make the parameter grokkable. If you read On the other hand, I'm also reconsidering the use case for an
If we're dealing with vectorized operations in such cases, we probably want either On the other hand, my original argument for
|
I am getting some data from the populate MetaTrader 5 platform, the data is in Cyprus time, which has DST. The clocks went forward on March 12th this year, I expected the data from there to be Passing in GMT is the same, It all sticks in +0 instead of moving forward. |
Works with datetime.datetime, does not work with pandas.tslib.Timestamp.
Encounter the issue when trying to resample hourly a series.
The text was updated successfully, but these errors were encountered: