Commit bc84cdb

Lift expensive Regex construction from DateFormat method body.
Constructing the Regex touched in this commit can represent a significant fraction (e.g. half or better) of the runtime of the DateFormat method touched in this commit. To make this DateFormat method more efficient, let's lift that Regex construction out of that method body.
1 parent a3c2798 commit bc84cdb
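
The cost claim in the commit message can be checked with a rough timing sketch like the one below. It is an illustration, not part of the commit: `LETTERS` is a hypothetical stand-in for the keys of `CONVERSION_SPECIFIERS`, and `count_rebuilding`, `count_prebuilt`, and `time_calls` are names invented here. Results will vary by machine and Julia version, so no numbers are quoted.

```julia
# Hypothetical specifier letters; the real set comes from keys(CONVERSION_SPECIFIERS).
const LETTERS = "yYmuUHIMSsEedp"
const PATTERN = "(?<!\\\\)([\\Q$LETTERS\\E])\\1*"
const PREBUILT = Regex(PATTERN)

# Count matches while constructing the Regex on every call (the pre-commit shape).
count_rebuilding(f) = count(_ -> true, eachmatch(Regex(PATTERN), f))

# Count matches with a Regex constructed once, up front (the post-commit shape).
count_prebuilt(f) = count(_ -> true, eachmatch(PREBUILT, f))

# Time n calls of g(f), after one warm-up call so compilation is not measured.
function time_calls(g, f, n)
    g(f)
    return @elapsed for _ in 1:n
        g(f)
    end
end

t_rebuilding = time_calls(count_rebuilding, "yyyy-mm-dd HH:MM", 1_000)
t_prebuilt = time_calls(count_prebuilt, "yyyy-mm-dd HH:MM", 1_000)
println("rebuilding: ", t_rebuilding, " s; prebuilt: ", t_prebuilt, " s")
```

Comparing the two timings gives a rough sense of how much of the per-call work is Regex construction rather than matching.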

1 file changed: +24 -2 lines changed
stdlib/Dates/src/io.jl

Lines changed: 24 additions & 2 deletions
@@ -332,6 +332,21 @@ const CONVERSION_TRANSLATIONS = IdDict{Type, Any}(
     Time => (Hour, Minute, Second, Millisecond, Microsecond, Nanosecond, AMPM),
 )
 
+# The `DateFormat(format, locale)` method just below consumes the following Regex.
+# Constructing this Regex is fairly expensive; doing so in the method itself can
+# consume half or better of `DateFormat(format, locale)`'s runtime. So instead we
+# construct and cache it outside the method body. Note, however, that when
+# `keys(CONVERSION_SPECIFIERS)` changes, the cached Regex must be updated
+# accordingly; hence the Ref-ness of the cache, the helper method with which
+# to populate the cache, and cache of the hash of `keys(CONVERSION_SPECIFIERS)`
+# to facilitate checking for changes.
+function compute_dateformat_regex(conversion_specifiers)
+    letters = String(collect(keys(conversion_specifiers)))
+    return Regex("(?<!\\\\)([\\Q$letters\\E])\\1*")
+end
+const DATEFORMAT_REGEX_CACHE = Ref(compute_dateformat_regex(CONVERSION_SPECIFIERS))
+const CONVERSION_SPECIFIERS_KEYS_HASH = Ref(hash(keys(CONVERSION_SPECIFIERS)))
+
 """
     DateFormat(format::AbstractString, locale="english") -> DateFormat
 
@@ -379,8 +394,15 @@ function DateFormat(f::AbstractString, locale::DateLocale=ENGLISH)
     prev = ()
     prev_offset = 1
 
-    letters = String(collect(keys(CONVERSION_SPECIFIERS)))
-    for m in eachmatch(Regex("(?<!\\\\)([\\Q$letters\\E])\\1*"), f)
+    # To understand this block, please see the comments attached to the
+    # definitions of DATEFORMAT_REGEX_CACHE and CONVERSION_SPECIFIERS_KEYS_HASH above.
+    conversion_specifiers_keys_hash = hash(keys(CONVERSION_SPECIFIERS))
+    if conversion_specifiers_keys_hash != CONVERSION_SPECIFIERS_KEYS_HASH[]
+        DATEFORMAT_REGEX_CACHE[] = compute_dateformat_regex(CONVERSION_SPECIFIERS)
+        CONVERSION_SPECIFIERS_KEYS_HASH[] = conversion_specifiers_keys_hash
+    end
+
+    for m in eachmatch(DATEFORMAT_REGEX_CACHE[], f)
         tran = replace(f[prev_offset:prevind(f, m.offset)], r"\\(.)" => s"\1")
 
         if !isempty(prev)
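
The caching pattern in this diff can be tried in isolation with a minimal sketch such as the following. It assumes a small hypothetical table `SPECIFIERS` in place of `CONVERSION_SPECIFIERS`. The names `build_regex`, `REGEX_CACHE`, `KEYS_HASH`, `cached_regex`, and `tokens` are invented for illustration and do not appear in `stdlib/Dates/src/io.jl`.

```julia
# Hypothetical stand-in for CONVERSION_SPECIFIERS; keys are format letters.
const SPECIFIERS = Dict{Char, Symbol}('y' => :year, 'm' => :month, 'd' => :day)

# Build a regex matching an unescaped specifier letter followed by repeats of it.
function build_regex(specifiers)
    letters = String(collect(keys(specifiers)))
    return Regex("(?<!\\\\)([\\Q$letters\\E])\\1*")
end

# Cache the regex together with the hash of the key set it was built from.
const REGEX_CACHE = Ref(build_regex(SPECIFIERS))
const KEYS_HASH = Ref(hash(keys(SPECIFIERS)))

# Return the cached regex, rebuilding it only if the key set has changed.
function cached_regex()
    h = hash(keys(SPECIFIERS))
    if h != KEYS_HASH[]
        REGEX_CACHE[] = build_regex(SPECIFIERS)
        KEYS_HASH[] = h
    end
    return REGEX_CACHE[]
end

# Collect the specifier runs found in a format string.
tokens(f) = [m.match for m in eachmatch(cached_regex(), f)]

println(tokens("yyyy-mm-dd"))
SPECIFIERS['H'] = :hour        # registering a new specifier invalidates the cache
println(tokens("HH:yy"))
```

Each call still pays for hashing the key set, but that is a short loop over a handful of characters rather than a regex compilation. The hash check is what lets code that registers new specifiers after load time keep working without an explicit invalidation step.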
