Skip to content

bpo-15873: add '.fromisoformat' for date, time and datetime #4841

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

deronnax
Copy link
Contributor

@deronnax deronnax commented Dec 13, 2017

adding method 'fromisoformat' to date, time and datetime classes

https://bugs.python.org/issue15873

@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

Thanks again to your contribution and we look forward to looking at it!

Copy link
Contributor

@SylvainDe SylvainDe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks mostly good to me. I've added a few comments which may or may not be relevant.

if (!PyArg_ParseTuple(args, "U:fromisoformat", &string))
return NULL;


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've used only one blank line in the other places (which is IMHO more beautiful).

Also, do you reckon it would make sense to have a single function like:


static PyObject *
strptime_fromisoformat(PyObject *cls, PyObject *args, const char *method_name)
{
    static PyObject *module = NULL;
    PyObject *string;

    if (!PyArg_ParseTuple(args, "U:fromisoformat", &string))
        return NULL;

    if (module == NULL) {
        module = PyImport_ImportModule("_strptime");
        if (module == NULL)
            return NULL;
    }

    return PyObject_CallMethod(module, method_name, "OO", cls, string);
}

static PyObject *
date_fromisoformat(PyObject *cls, PyObject *args)
{
    return strptime_fromisoformat(cls, args, "_parse_isodate");
}

static PyObject *
time_fromisoformat(PyObject *cls, PyObject *args)
{
    return strptime_fromisoformat(cls, args, "_parse_isotime");
}

static PyObject *
datetime_fromisoformat(PyObject *cls, PyObject *args)
{
    return strptime_fromisoformat(cls, args, "_parse_isodatetime");
}










@@ -732,6 +732,15 @@ def fromordinal(cls, n):
y, m, d = _ord2ymd(n)
return cls(y, m, d)

@classmethod
def fromisoformat(cls, date_string):
"""Constructs a date from an RFC 3339 string, a strict subset of ISO 8601.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convention is to use imperative form ("Construct"). This comment applies in other places.

r'(?P<microsecond>\.\d+)?(?P<tzinfo>Z|[+-]\d{2}:\d{2})?$',
ASCII|IGNORECASE)

_datetime_re = re_compile(_date_re.pattern[:-1] + r'[T ]' + _time_re.pattern,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, it be easier to define tmp variables to avoid dealing the indexing here:

date_pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
time_pattern = r'(?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})(?P<microsecond>\.\d+)?(?P<tzinfo>Z|[+-]\d{2}:\d{2})?'
datetime_pattern = date_pattern + r'[T ]' + time_pattern
date_re = re_compile(date_pattern + '$', ASCII)
time_re = re_compile(time_pattern + '$', ASCII | IGNORECASE)
datetime_re = re_compile(datetime_pattern + '$', ASCII | IGNORECASE)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll note that this won't match all the outputs of datetime.isoformat, if that's a goal of this version of datetime.fromisoformat, since the sep parameter does accept non-ASCII components.

That said, it seems like _date_re is requiring a full specification of YYYY-MM-DD, in which case you can just split the string as:

date_str = dt_str[0:10]
time_str = dt_str[11:]
dt_sep = dt_str[10:11]

It's then kinda trivial to enforce whatever restrictions you want on dt_sep, and continue to use the ASCII flag for date_re and time_re.


Raises ValueError in case of ill-formatted or invalid string.
"""
import _strptime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to have this at the top level ?


Return a date object corresponding to *date_string*, according to RFC 3339,
a stricter, simpler subset of ISO 8601, and such as is returned by
:func:`date.isoformat`. Microseconds are rounded to 6 digits.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn’t microseconds in a date string be illegal? Maybe clarify which parts of the RFC are relevant; perhaps the full-date format?

:exc:`ValueError` is raised if *date_string* is not a valid RFC 3339
date string.

.. `RFC 3339`: https://www.ietf.org/rfc/rfc3339.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this get marked up? I tend to prefer the IETF’s HTML versions (e.g. https://tools.ietf.org/html/rfc3339), and a lot of the documentation seems to link to those directly via :rfc:`3339` syntax.

a stricter, simpler subset of ISO 8601, and such as is returned by
:func:`time.isoformat`. Microseconds are rounded to 6 digits.
:exc:`ValueError` is raised if *time_string* is not a valid RFC 3339
time string..
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doubled full stop.


Return a datetime object corresponding to *datetime_string*, according to RFC 3339,
a stricter, simpler subset of ISO 8601, and such as is returned by
:func:`datetime.isoformat`. Microseconds are rounded to 6 digits.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it would be clearer to say something like “The time is rounded to a whole number of microseconds”?

Also, I suggest clarifying that the separator may be a space instead of T. This is suggested, but not required by the RFC profile, and means that the output of format(datetime) is supported.

kw = {k: int(v) for k, v in kw.items()}
if us:
us = round(float(us), 6)
kw['microsecond'] = int(us * 1e6)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both the time and datetime classes require microsecond to be strictly less than 1 million, and it looks like you don’t handle rolling over seconds, minutes, etc. Test case:

datetime.fromisoformat('2017-12-31 23:59:59.9999995') -> datetime(2018, 1, 1, 0, 0, 0)

For datetime, I might do the rolling over by adding a timedelta. For time, maybe it is not worth doing any rounding.

Also, float is approximate, e.g. for me float(".000_001_499_999_999_999_999_99") rounds up over 1.5 µs, which round would round up to 2 µs. I wonder if it is worth doing the rounding without float; then you could claim in the documentation that rounding is always half-to-even. Untested code:

_time_re = ... r'(?:(?P<microsecond>\.\d{1,6})(?P<us_frac>\d*))?' ...
...
us = int(float(us) * 1e6)
frac = kw.pop('us_frac').rstrip("0")
# Round halfway up to even rather than down to odd
us += frac > "5" or frac == "5" and us % 2

@abalkin
Copy link
Member

abalkin commented Mar 1, 2018

Closing since an alternative implementation has been merged. See GH-4699.

@abalkin abalkin closed this Mar 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants