-
Notifications
You must be signed in to change notification settings - Fork 110
Implement Instant parsing in common module #106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
||
/* | ||
* The algorithm for parsing time and zone offset was adapted from | ||
* https://github.com/square/moshi/blob/aea17e09bc6a3f9015d3de0e951923f1033d299e/adapters/src/main/java/com/squareup/moshi/adapters/Iso8601Utils.java |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please advise on the proper formulation of attribution that shall be added.
import kotlin.math.min | ||
import kotlin.math.pow | ||
|
||
internal fun parseInstantCommon(string: String): Instant = parseIsoString(string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parseInstantCommon facade may try multiple formats before giving up such as #83 if that gets implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the need for such extensibility arises, we have little enough code to be able to add it quickly.
Instant.parse now supports: - Time zone offsets in the form of +-hh:mm - Time part allows the omission of colons (allows hhmmss) - Time part allows the omission of seconds Implementation notes: Input string is split by the time delimiter (T|t) into date and time parts. Date parsing is delegated to LocalDate.parse Time parsing employs the well known algorithm from Iso8601Utils.java originally implemented in Jackson. This commit is based on Moshi's version of that file.
val c = timePart[offset] | ||
if (c != 'Z' && c != 'z' && c != '+' && c != '-') { | ||
seconds = parseInt(timePart, offset, offset + 2).also { offset += 2 } | ||
if (seconds > 59 && seconds < 63) { // https://github.com/Kotlin/kotlinx-datetime/issues/5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is #5 applicable here, or should this be converted to idiomatic range check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is applicable. That said, this line is strange in any case: the existing parser for LocalDateTime
doesn't recognize leap seconds, so we have a disconnect between
Instant.parse("2020-06-14T01:01:61Z")
successfully parsing and
LocalDateTime.parse("2020-06-14T01:01:61")
failing.
Also, it should be noted Java Time's parser throws on seconds outside of [0; 60)
. Why should we accept leap seconds and why only in Instant
and not in LocalDateTime
?
Last forgotten IAE among code that throws DateTimeFormatException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi! Thanks for the PR. We are actually interested in making this change and wanted to do it before the library's next release.
We can't accept this PR as is, though, given that the proposed changes are inconsistent with how the other parsers behave (LocalDateTime
in particular). Moreover, it initially looks like this change could be implemented in a much more simple manner by parsing (EDIT: looks like the behavior of ZonedDateTime
OffsetDateTime
is more consistent with how the modern JDK versions handle the parsing, thanks @ilya-g for the suggestion!) on JVM and JS and slightly editing the existing parser on Native, and it's unclear why it would be worth doing anything other than that.
import kotlin.math.min | ||
import kotlin.math.pow | ||
|
||
internal fun parseInstantCommon(string: String): Instant = parseIsoString(string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the need for such extensibility arises, we have little enough code to be able to add it quickly.
val c = timePart[offset] | ||
if (c != 'Z' && c != 'z' && c != '+' && c != '-') { | ||
seconds = parseInt(timePart, offset, offset + 2).also { offset += 2 } | ||
if (seconds > 59 && seconds < 63) { // https://github.com/Kotlin/kotlinx-datetime/issues/5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is applicable. That said, this line is strange in any case: the existing parser for LocalDateTime
doesn't recognize leap seconds, so we have a disconnect between
Instant.parse("2020-06-14T01:01:61Z")
successfully parsing and
LocalDateTime.parse("2020-06-14T01:01:61")
failing.
Also, it should be noted Java Time's parser throws on seconds outside of [0; 60)
. Why should we accept leap seconds and why only in Instant
and not in LocalDateTime
?
indexOfNonDigit(timePart, offset + 1) // assume at least one digit | ||
val parseEndOffset = | ||
min(endOffset, offset + 9) // parse up to 9 digits | ||
val fraction = parseInt(timePart, offset, parseEndOffset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way this code deals with fractions with more than nine digits that are representable as nanoseconds is by truncating the extra digits towards zero—not even by rounding them. So,
Instant.parse("2020-06-14T01:01:59.1234567899Z")
// results in 2020-06-14T01:01:59.123456789Z
This is highly questionable, as, again, parsing LocalDateTime
throws in such cases, as does parsing Instant
in Java. This makes sense, given that silently losing the user input is highly undesirable. Maybe an argument could be made that rounding is a sensible solution here, but even then the reasoning would have to be applied consistently across the parsers we have.
timezone = TimeZone.UTC | ||
} else if (timezoneIndicator == '+' || timezoneIndicator == '-') { | ||
val timezoneOffset = timePart.substring(offset) | ||
// 18-Jun-2015, tatu: Minor simplification, skip offset of "+0000"/"+00:00" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments have little meaning without the commit history, so they probably shouldn't be included here.
if (e.isJodaDateTimeParseException()) throw DateTimeFormatException(e) | ||
throw e | ||
} | ||
actual fun parse(isoString: String): Instant = parseInstantCommon(isoString) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be implemented easily—and consistently with the other parsers—by parsing a ZonedDateTime
OffsetDateTime
and then converting that to Instant
.
The same for the Java implementation.
concreteCharParser('.') | ||
.chainSkipping(fractionParser(0, 9, 9)) // nanos | ||
)) | ||
.chainIgnoring(concreteCharParser('Z').or(concreteCharParser('z'))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the other parsers are implemented via ZonedDateTime
OffsetDateTime
, there is no need to have the parser for Instant
in the common code, and the only place that requires an explicit implementation is the native part. Then, it's probably easier to modify this parser at this line so that it accepts any ZoneOffset
and not just Z
(and propagate this change correspondingly).
Thank you for your review. Since this code is highly inconsistent with existing logic I'll close this PR down. As you're already planning on making the change yourselves and have it done by the next release (this month is the target according to Ilya Gorbunov on Slack I see no reason for a third party to be a part of that process. I only knew about the April release, but not that it would include offset time zone parsing. I hope this bootleg PR reinforces the fact that accepting time zone offsets is critically important. I'm looking forward to the next release! @dkhalanskyjb P.S. I suggest that #56 be put in the 0.2.0 milestone. |
Resolves #56
Instant.parse now supports:
+-hh:mm
and+-hhmm
Implementation notes:
Input string is split by the time delimiter (T|t) into date and time parts.
Date parsing is delegated to
LocalDate.parse
Time parsing employs the well known algorithm from
Iso8601Utils.java
originally implemented in Jackson. This commit is based on Moshi's
version of that file.
Motivation
Any datetime library worth its salt must be able to parse numeric time zone offsets. I'm afraid here's no way around it.
I initially implemented this for myself since I need the ability to parse numeric time zone offsets. In fact I think that most of the
OffsetDateTime
-type strings floating around the Internet come with the+-hh:mm
offset representation rather than theZ
designator. Migrating fromThreeTenBP
I was surprised that only theZ
designator was supported.This implementation may hold fort until a format builder is added (#58 #39)
Discussion points:
The leniency of the time parser comes free with the algorithm.
I vote that the leniency be kept in the spirit of the "Be permissive on input, strict on output" rule of thumb.