-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
PERF: improve DTI string parse #13692
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -1046,7 +1046,12 @@ def _get_binner_for_grouping(self, obj): | |||
l = [] | |||
for key, group in grouper.get_iterator(self.ax): | |||
l.extend([key] * len(group)) | |||
grouper = binner.__class__(l, freq=binner.freq, name=binner.name) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe isolate this a bit with some functions like _get_binner_for_resample does (maybe we should abstract this out even a bit more and have some PeriodIndexGrouper, DatetimeinexGrouper, which subclass TimeGrouper, but this might require some effort).
Current coverage is 84.54%@@ master #13692 diff @@
==========================================
Files 141 141
Lines 51185 51159 -26
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
- Hits 43275 43250 -25
+ Misses 7910 7909 -1
Partials 0 0
|
if not (is_datetime64_dtype(data) or is_datetimetz(data) or | ||
is_integer_dtype(data)): | ||
data = tools.to_datetime(data, dayfirst=dayfirst, | ||
yearfirst=yearfirst) | ||
|
||
if issubclass(data.dtype.type, np.datetime64) or is_datetimetz(data): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason not to use is_datetime64_dtype
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to normal datetime64
, datetime64tz (DatetimeIndex)
and int
can be directly converted to DatetimeIndex
.
@sinhrks this cleans up lots of junk! just a couple of clarifications. |
lgtm. ready to go? |
yes, it's ready:) |
git diff upstream/master | flake8 --diff
cleaned up
DatetimeIndex
constructor removing slower string-parsing path.Performance Improvement
related to #7599, internally use
to_datetime
always as it tries some fastpath.Bug Fixes
The cleanup fixed these 2 kind of issues:
1. #11169 and #11287 Invalid string parsing may raise
TypeError
I met the same issue on travis and fixed with try-except clause (I can't reproduce it on my local Mac).
2. Index may incorrectly coerces mismatched tz
on current master,
DatetimeIndex
and normalIndex
behaves differently.after the PR both behave the same, showing understandable error.