bpo-15873: add '.fromisoformat' for date, time and datetime #4841

Closed
42 changes: 42 additions & 0 deletions Doc/library/datetime.rst
@@ -428,6 +428,19 @@ Other constructors, all class methods:
:exc:`ValueError` on :c:func:`localtime` failure.


.. classmethod:: date.fromisoformat(date_string)

Return a date object corresponding to *date_string*, according to RFC 3339,
a stricter, simpler subset of ISO 8601, and such as is returned by
:func:`date.isoformat`. Microseconds are rounded to 6 digits.

Comment (Member):
Shouldn’t microseconds in a date string be illegal? Maybe clarify which parts of the RFC are relevant; perhaps the full-date format?
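
To illustrate the `full-date` point: RFC 3339 defines `full-date` as a four-digit year, two-digit month and two-digit day joined by hyphens, with no time or fractional-second part. A minimal sketch of a date-only parser under that reading; `parse_full_date` is a hypothetical helper name, not something in this PR:

```python
import re
from datetime import date

# RFC 3339 full-date: date-fullyear "-" date-month "-" date-mday.
_FULL_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})", re.ASCII)


def parse_full_date(text):
    """Hypothetical helper: parse an RFC 3339 full-date string into a date."""
    match = _FULL_DATE.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid RFC 3339 full-date string: {text!r}")
    year, month, day = map(int, match.groups())
    # date() itself rejects out-of-range values such as month 13.
    return date(year, month, day)
```

With this shape, anything after the day (a time, a fraction, a zone) is rejected, so the remark about microseconds would not apply to the date documentation.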

:exc:`ValueError` is raised if *date_string* is not a valid RFC 3339
date string.

.. `RFC 3339`: https://www.ietf.org/rfc/rfc3339.txt

Comment (Member):
How does this get marked up? I tend to prefer the IETF’s HTML versions (e.g. https://tools.ietf.org/html/rfc3339), and a lot of the documentation seems to link to those directly via :rfc:`3339` syntax.


.. versionadded:: 3.7


.. classmethod:: date.fromordinal(ordinal)

Return the date corresponding to the proleptic Gregorian ordinal, where January
@@ -793,6 +806,19 @@ Other constructors, all class methods:
:exc:`ValueError` on :c:func:`gmtime` failure.


.. classmethod:: datetime.fromisoformat(datetime_string)

Return a datetime object corresponding to *datetime_string*, according to RFC 3339,
a stricter, simpler subset of ISO 8601, and such as is returned by
:func:`datetime.isoformat`. Microseconds are rounded to 6 digits.

Comment (Member):
Perhaps it would be clearer to say something like “The time is rounded to a whole number of microseconds”?

Also, I suggest clarifying that the separator may be a space instead of T. This is suggested, but not required by the RFC profile, and means that the output of format(datetime) is supported.
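
A small sketch of why the space separator matters. It uses `datetime.fromisoformat` as it exists in released Python 3.7+, whose accepted inputs may differ from the regex-based implementation in this PR:

```python
from datetime import datetime

dt = datetime(2015, 12, 31, 14, 27, 0)

# str() and format() with an empty spec use a space between date and time;
# isoformat() defaults to "T".
with_space = str(dt)
with_t = dt.isoformat()

# Both spellings parse back to the original value.
assert datetime.fromisoformat(with_space) == dt
assert datetime.fromisoformat(with_t) == dt
```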

:exc:`ValueError` is raised if *datetime_string* is not a valid RFC 3339
datetime string.

.. `RFC 3339`: https://www.ietf.org/rfc/rfc3339.txt

.. versionadded:: 3.7


.. classmethod:: datetime.fromordinal(ordinal)

Return the :class:`.datetime` corresponding to the proleptic Gregorian ordinal,
@@ -1394,6 +1420,22 @@ day, and subject to adjustment via a :class:`tzinfo` object.
If an argument outside those ranges is given, :exc:`ValueError` is raised. All
default to ``0`` except *tzinfo*, which defaults to :const:`None`.


Other constructor:

.. classmethod:: time.fromisoformat(string)

Return a time object corresponding to *time_string*, according to RFC 3339,
a stricter, simpler subset of ISO 8601, and such as is returned by
:func:`time.isoformat`. Microseconds are rounded to 6 digits.
:exc:`ValueError` is raised if *time_string* is not a valid RFC 3339
time string..

Comment (Member):
Doubled full stop.


.. `RFC 3339`: https://www.ietf.org/rfc/rfc3339.txt

.. versionadded:: 3.7


Class attributes:


51 changes: 50 additions & 1 deletion Lib/_strptime.py
@@ -14,7 +14,7 @@
import locale
import calendar
from re import compile as re_compile
from re import IGNORECASE
from re import IGNORECASE, ASCII
from re import escape as re_escape
from datetime import (date as datetime_date,
timedelta as datetime_timedelta,
@@ -27,6 +27,55 @@ def _getlang():
# Figure out what the current language is set to.
return locale.getlocale(locale.LC_TIME)


_date_re = re_compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})$',
                      ASCII)

_time_re = re_compile(r'(?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})'
                      r'(?P<microsecond>\.\d+)?(?P<tzinfo>Z|[+-]\d{2}:\d{2})?$',
                      ASCII|IGNORECASE)

_datetime_re = re_compile(_date_re.pattern[:-1] + r'[T ]' + _time_re.pattern,

Comment (Contributor):
Maybe it would be easier to define temporary variables, to avoid dealing with the indexing here:

date_pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
time_pattern = r'(?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})(?P<microsecond>\.\d+)?(?P<tzinfo>Z|[+-]\d{2}:\d{2})?'
datetime_pattern = date_pattern + r'[T ]' + time_pattern
date_re = re_compile(date_pattern + '$', ASCII)
time_re = re_compile(time_pattern + '$', ASCII | IGNORECASE)
datetime_re = re_compile(datetime_pattern + '$', ASCII | IGNORECASE)
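
The suggestion above, made self-contained so it can be run on its own. The upper-case constant names are an arbitrary choice for this sketch, and the patterns are copied from the PR unchanged:

```python
import re

# Anchor-free building blocks, composed by plain concatenation.
DATE_PATTERN = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
TIME_PATTERN = (r'(?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})'
                r'(?P<microsecond>\.\d+)?(?P<tzinfo>Z|[+-]\d{2}:\d{2})?')
DATETIME_PATTERN = DATE_PATTERN + r'[T ]' + TIME_PATTERN

# The end anchor is added only at compile time, so no slicing is needed.
date_re = re.compile(DATE_PATTERN + '$', re.ASCII)
time_re = re.compile(TIME_PATTERN + '$', re.ASCII | re.IGNORECASE)
datetime_re = re.compile(DATETIME_PATTERN + '$', re.ASCII | re.IGNORECASE)

# IGNORECASE lets the lower-case "t" separator and "z" zone through.
fields = datetime_re.match('2015-12-31t14:27:00z').groupdict()
```

One thing to keep in mind with either spelling: in Python, `$` also matches just before a trailing newline, so `\Z` or `fullmatch()` would be stricter.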

Comment (Member):
I'll note that this won't match all the outputs of datetime.isoformat, if that's a goal of this version of datetime.fromisoformat, since the sep parameter does accept non-ASCII components.

That said, it seems like _date_re is requiring a full specification of YYYY-MM-DD, in which case you can just split the string as:

date_str = dt_str[0:10]
time_str = dt_str[11:]
dt_sep = dt_str[10:11]

It's then kinda trivial to enforce whatever restrictions you want on dt_sep, and continue to use the ASCII flag for date_re and time_re.
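
A minimal sketch of that positional split. `split_isodatetime` and its `allowed_seps` parameter are hypothetical names used only for illustration:

```python
def split_isodatetime(dt_str, allowed_seps=("T", "t", " ")):
    """Hypothetical helper: split 'YYYY-MM-DD<sep>HH:MM:SS...' by position."""
    date_str = dt_str[0:10]
    dt_sep = dt_str[10:11]   # empty string when the input is too short
    time_str = dt_str[11:]
    # allowed_seps is a tuple, so an empty separator is rejected as well.
    if dt_sep not in allowed_seps:
        raise ValueError(f"invalid date/time separator: {dt_sep!r}")
    return date_str, time_str
```

The two halves can then go to the date-only and time-only patterns, and widening `allowed_seps` is the single place where the separator policy is decided.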

                          ASCII|IGNORECASE)


def _parse_isodate(cls, isostring):
    return _parse_isoformat(cls, isostring, _date_re)


def _parse_isotime(cls, isostring):
    return _parse_isoformat(cls, isostring, _time_re)


def _parse_isodatetime(cls, isostring):
    return _parse_isoformat(cls, isostring, _datetime_re)


def _parse_isoformat(cls, isostring, iso_re):
    match = iso_re.match(isostring)
    if not match:
        raise ValueError("invalid RFC 3339 %s string: %r"
                         % (cls.__name__, isostring))
    kw = match.groupdict()
    tzinfo = kw.pop('tzinfo', None)
    if tzinfo == 'Z' or tzinfo == 'z':
        tzinfo = datetime_timezone.utc
    elif tzinfo is not None:
        offset_hours, _, offset_mins = tzinfo[1:].partition(':')
        offset = datetime_timedelta(hours=int(offset_hours), minutes=int(offset_mins))
        if tzinfo[0] == '-':
            offset = -offset
        tzinfo = datetime_timezone(offset)
    us = kw.pop('microsecond', None)
    kw = {k: int(v) for k, v in kw.items()}
    if us:
        us = round(float(us), 6)
        kw['microsecond'] = int(us * 1e6)

Comment (Member):
Both the time and datetime classes require microsecond to be strictly less than 1 million, and it looks like you don’t handle rolling over seconds, minutes, etc. Test case:

datetime.fromisoformat('2017-12-31 23:59:59.9999995') -> datetime(2018, 1, 1, 0, 0, 0)

For datetime, I might do the rolling over by adding a timedelta. For time, maybe it is not worth doing any rounding.

Also, float is approximate, e.g. for me float(".000_001_499_999_999_999_999_99") rounds up over 1.5 µs, which round would round up to 2 µs. I wonder if it is worth doing the rounding without float; then you could claim in the documentation that rounding is always half-to-even. Untested code:

_time_re = ... r'(?:(?P<microsecond>\.\d{1,6})(?P<us_frac>\d*))?' ...
...
us = int(float(us) * 1e6)
frac = kw.pop('us_frac').rstrip("0")
# Round halfway up to even rather than down to odd
us += frac > "5" or frac == "5" and us % 2
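
A runnable sketch of both ideas in this comment: rounding the fraction half-to-even with integer and string operations only, and carrying a rounded-up value into the seconds by adding a `timedelta`. The function names are hypothetical, and the input is assumed to be the ASCII digits after the decimal point:

```python
from datetime import datetime, timedelta


def round_fraction_to_us(digits):
    """Round fractional-second digits to whole microseconds, half to even.

    No float is involved. The result can be 1_000_000, which the caller
    has to carry into the seconds.
    """
    head = digits[:6].ljust(6, "0")   # first six digits, right-padded
    tail = digits[6:].rstrip("0")     # digits beyond microsecond precision
    us = int(head)
    # tail > "5" means more than half; tail == "5" is an exact tie.
    if tail > "5" or (tail == "5" and us % 2):
        us += 1
    return us


def datetime_with_fraction(year, month, day, hour, minute, second, digits):
    """Build a datetime, letting timedelta handle any rollover."""
    base = datetime(year, month, day, hour, minute, second)
    return base + timedelta(microseconds=round_fraction_to_us(digits))


# The test case from the comment: the fraction rounds up to a full second.
rolled = datetime_with_fraction(2017, 12, 31, 23, 59, 59, "9999995")
```

A bare `time` has no date to carry into, which is one reason the comment suggests that rounding may not be worth doing there.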

    if tzinfo:
        kw['tzinfo'] = tzinfo
    return cls(**kw)


class LocaleTime(object):
"""Stores and handles locale-specific information related to time.

29 changes: 29 additions & 0 deletions Lib/datetime.py
@@ -732,6 +732,15 @@ def fromordinal(cls, n):
y, m, d = _ord2ymd(n)
return cls(y, m, d)

@classmethod
def fromisoformat(cls, date_string):
"""Constructs a date from an RFC 3339 string, a strict subset of ISO 8601.

Comment (Contributor):
Convention is to use imperative form ("Construct"). This comment applies in other places.


Raises ValueError in case of ill-formatted or invalid string.
"""
import _strptime

Comment (Contributor):
Any reason not to have this at the top level?

return _strptime._parse_isodate(cls, date_string)

# Conversions to string

def __repr__(self):
@@ -1075,6 +1084,16 @@ def __new__(cls, hour=0, minute=0, second=0, microsecond=0, tzinfo=None, *, fold
self._fold = fold
return self

@classmethod
def fromisoformat(cls, time_string):
"""Constructs a time from an RFC 3339 string, a strict subset of ISO 8601.
Microseconds are rounded to 6 digits.

Raises ValueError in case of ill-formatted or invalid string.
"""
import _strptime
return _strptime._parse_isotime(cls, time_string)

# Read-only field accessors
@property
def hour(self):
@@ -1472,6 +1491,16 @@ def utcfromtimestamp(cls, t):
"""Construct a naive UTC datetime from a POSIX timestamp."""
return cls._fromtimestamp(t, True, None)

@classmethod
def fromisoformat(cls, datetime_string):
"""Constructs a datetime from an RFC 3339 string, a strict subset of ISO 8601.
Microseconds are rounded to 6 digits.

Raises ValueError in case of ill-formatted or invalid string.
"""
import _strptime
return _strptime._parse_isodatetime(cls, datetime_string)

@classmethod
def now(cls, tz=None):
"Construct a datetime from time.time() and optional time zone info."
61 changes: 61 additions & 0 deletions Lib/test/datetimetester.py
@@ -1166,6 +1166,19 @@ def test_fromtimestamp(self):
self.assertEqual(d.month, month)
self.assertEqual(d.day, day)

def test_fromisoformat(self):
self.assertEqual(self.theclass.fromisoformat('2014-12-31'),
self.theclass(2014, 12, 31))
self.assertEqual(self.theclass.fromisoformat('4095-07-31'),
self.theclass(4095, 7, 31))

with self.assertRaises(ValueError):
self.theclass.fromisoformat('2014-12-011')
with self.assertRaises(ValueError):
self.theclass.fromisoformat('20141211')
with self.assertRaises(ValueError):
self.theclass.fromisoformat('043-12-01')

def test_insane_fromtimestamp(self):
# It's possible that some platform maps time_t to double,
# and that this test will fail there. This test should
@@ -1976,6 +1989,18 @@ def test_utcfromtimestamp(self):
got = self.theclass.utcfromtimestamp(ts)
self.verify_field_equality(expected, got)

def test_fromisoformat(self):
self.assertEqual(self.theclass.fromisoformat('2015-12-31T14:27:00'),
self.theclass(2015, 12, 31, 14, 27, 0))
self.assertEqual(self.theclass.fromisoformat('2015-12-31 14:27:00'),
self.theclass(2015, 12, 31, 14, 27, 0))
# lowercase 'T' date-time separator. Uncommon but tolerated (rfc 3339)
self.assertEqual(self.theclass.fromisoformat('2015-12-31t14:27:00'),
self.theclass(2015, 12, 31, 14, 27, 0))

with self.assertRaises(ValueError):
self.theclass.fromisoformat('2015-01-07X00:00:00')

# Run with US-style DST rules: DST begins 2 a.m. on second Sunday in
# March (M3.2.0) and ends 2 a.m. on first Sunday in November (M11.1.0).
@support.run_with_tz('EST+05EDT,M3.2.0,M11.1.0')
@@ -2517,6 +2542,42 @@ def test_isoformat(self):
self.assertEqual(t.isoformat(timespec='microseconds'), "12:34:56.000000")
self.assertEqual(t.isoformat(timespec='auto'), "12:34:56")

def test_fromisoformat(self):
# basic
self.assertEqual(self.theclass.fromisoformat('04:05:01.000123'),
self.theclass(4, 5, 1, 123))
self.assertEqual(self.theclass.fromisoformat('00:00:00'),
self.theclass(0, 0, 0))
# usec, rounding high
self.assertEqual(self.theclass.fromisoformat('10:20:30.40000059'),
self.theclass(10, 20, 30, 400001))
# usec, rounding low + long digits we don't care about
self.assertEqual(self.theclass.fromisoformat('10:20:30.400003434'),
self.theclass(10, 20, 30, 400003))
with self.assertRaises(ValueError):
self.theclass.fromisoformat('12:00AM')
with self.assertRaises(ValueError):
self.theclass.fromisoformat('120000')
with self.assertRaises(ValueError):
self.theclass.fromisoformat('1:00')
with self.assertRaises(ValueError):
self.theclass.fromisoformat('17:54:43.')

def tz(h, m):
return timezone(timedelta(hours=h, minutes=m))

self.assertEqual(self.theclass.fromisoformat('00:00:00Z'),
self.theclass(0, 0, 0, tzinfo=timezone.utc))
# lowercase UTC timezone. Uncommon but tolerated (rfc 3339)
self.assertEqual(self.theclass.fromisoformat('00:00:00z'),
self.theclass(0, 0, 0, tzinfo=timezone.utc))
self.assertEqual(self.theclass.fromisoformat('00:00:00-00:00'),
self.theclass(0, 0, 0, tzinfo=tz(0, 0)))
self.assertEqual(self.theclass.fromisoformat('08:30:00.004255+02:30'),
self.theclass(8, 30, 0, 4255, tz(2, 30)))
self.assertEqual(self.theclass.fromisoformat('08:30:00.004255-02:30'),
self.theclass(8, 30, 0, 4255, tz(-2, -30)))

def test_1653736(self):
# verify it doesn't accept extra keyword arguments
t = self.theclass(second=1)
76 changes: 76 additions & 0 deletions Modules/_datetimemodule.c
@@ -2607,6 +2607,25 @@ date_fromtimestamp(PyObject *cls, PyObject *args)
return result;
}

/* Return new date from given date string, using _strptime._parse_isodate(). */
static PyObject *
date_fromisoformat(PyObject *cls, PyObject *args)
{
static PyObject *module = NULL;
PyObject *string;

if (!PyArg_ParseTuple(args, "U:fromisoformat", &string))
return NULL;

if (module == NULL) {
module = PyImport_ImportModule("_strptime");
if (module == NULL)
return NULL;
}

return PyObject_CallMethod(module, "_parse_isodate", "OO", cls, string);
}

/* Return new date from proleptic Gregorian ordinal. Raises ValueError if
* the ordinal is out of range.
*/
@@ -2920,6 +2939,11 @@ static PyMethodDef date_methods[] = {
PyDoc_STR("timestamp -> local date from a POSIX timestamp (like "
"time.time()).")},

{"fromisoformat", (PyCFunction)date_fromisoformat,
METH_VARARGS | METH_CLASS,
PyDoc_STR("Construct a date from an RFC 3339 string, a strict subset of ISO 8601.\n"
"Raises ValueError in case of ill-formatted or invalid string.\n")},

{"fromordinal", (PyCFunction)date_fromordinal, METH_VARARGS |
METH_CLASS,
PyDoc_STR("int -> date corresponding to a proleptic Gregorian "
@@ -3711,6 +3735,26 @@ time_str(PyDateTime_Time *self)
return _PyObject_CallMethodId((PyObject *)self, &PyId_isoformat, NULL);
}

/* Return new time from time string, using _strptime._parse_isotime(). */
static PyObject *
time_fromisoformat(PyObject *cls, PyObject *args)
{
static PyObject *module = NULL;
PyObject *string;

if (!PyArg_ParseTuple(args, "U:fromisoformat", &string))
return NULL;


Comment (Contributor):
You've used only one blank line in the other places (which is IMHO more beautiful).

Also, do you reckon it would make sense to have a single function like:


static PyObject *
strptime_fromisoformat(PyObject *cls, PyObject *args, const char *method_name)
{
    static PyObject *module = NULL;
    PyObject *string;

    if (!PyArg_ParseTuple(args, "U:fromisoformat", &string))
        return NULL;

    if (module == NULL) {
        module = PyImport_ImportModule("_strptime");
        if (module == NULL)
            return NULL;
    }

    return PyObject_CallMethod(module, method_name, "OO", cls, string);
}

static PyObject *
date_fromisoformat(PyObject *cls, PyObject *args)
{
    return strptime_fromisoformat(cls, args, "_parse_isodate");
}

static PyObject *
time_fromisoformat(PyObject *cls, PyObject *args)
{
    return strptime_fromisoformat(cls, args, "_parse_isotime");
}

static PyObject *
datetime_fromisoformat(PyObject *cls, PyObject *args)
{
    return strptime_fromisoformat(cls, args, "_parse_isodatetime");
}

if (module == NULL) {
module = PyImport_ImportModule("_strptime");
if (module == NULL)
return NULL;
}

return PyObject_CallMethod(module, "_parse_isotime", "OO", cls, string);
}

static PyObject *
time_isoformat(PyDateTime_Time *self, PyObject *args, PyObject *kw)
{
@@ -4018,6 +4062,13 @@ time_reduce(PyDateTime_Time *self, PyObject *arg)

static PyMethodDef time_methods[] = {

{"fromisoformat", (PyCFunction)time_fromisoformat,
METH_VARARGS | METH_CLASS,
PyDoc_STR("Construct a time from an RFC 3339 string, a strict subset "
"of ISO 8601.\n"
"Microseconds are rounded to 6 digits.\n"
"Raises ValueError in case of ill-formatted or invalid string.\n")},

{"isoformat", (PyCFunction)time_isoformat, METH_VARARGS | METH_KEYWORDS,
PyDoc_STR("Return string in ISO 8601 format, [HH[:MM[:SS[.mmm[uuu]]]]]"
"[+HH:MM].\n\n"
@@ -4726,6 +4777,25 @@ datetime_str(PyDateTime_DateTime *self)
return _PyObject_CallMethodId((PyObject *)self, &PyId_isoformat, "s", " ");
}

/* Return new datetime from _strptime._parse_isodatetime(). */
static PyObject *
datetime_fromisoformat(PyObject *cls, PyObject *args)
{
static PyObject *module = NULL;
PyObject *string;

if (!PyArg_ParseTuple(args, "U:fromisoformat", &string))
return NULL;

if (module == NULL) {
module = PyImport_ImportModule("_strptime");
if (module == NULL)
return NULL;
}

return PyObject_CallMethod(module, "_parse_isodatetime", "OO", cls, string);
}

static PyObject *
datetime_isoformat(PyDateTime_DateTime *self, PyObject *args, PyObject *kw)
{
@@ -5506,6 +5576,12 @@ static PyMethodDef datetime_methods[] = {
METH_VARARGS | METH_KEYWORDS | METH_CLASS,
PyDoc_STR("timestamp[, tz] -> tz's local time from POSIX timestamp.")},

{"fromisoformat", (PyCFunction)datetime_fromisoformat,
METH_VARARGS | METH_CLASS,
PyDoc_STR("Construct a datetime from an RFC 3339 string, a strict subset of ISO 8601.\n"
"Microseconds are rounded to 6 digits.\n"
"Raises ValueError in case of ill-formatted or invalid string.\n")},

{"utcfromtimestamp", (PyCFunction)datetime_utcfromtimestamp,
METH_VARARGS | METH_CLASS,
PyDoc_STR("Construct a naive UTC datetime from a POSIX timestamp.")},