ENH: add timedelta modulus operator support (mm) #12120

tylerjereddy · 2018-10-09T03:35:30Z

Add support for modulus operator when both operands
are timedelta64 ~~with seconds units, and no other cases.~~

Related to #12092, though doesn't fully cover the modulus
scenarios requested there because I haven't added a branch
for modulus timedelta64 with a Python integer.

I think this approach can be summarized as intercepting the
array nb_remainder slot function before it dispatches to
the ufunc machinery.

eric-wieser · 2018-10-09T08:22:29Z

Shouldn't the ufunc machinery be able to handle this itself? How is the other arithmetic handled?

tylerjereddy · 2018-10-09T16:57:42Z

The ufunc machinery for timedelta arithmetic is here in umath/loops.c.src. I could try to refactor the work I've done here to fit into that machinery if that's strongly preferred.

shoyer · 2018-10-09T18:00:45Z

I think it would definitely be preferred to implement remainder as another ufunc loop. Hard coding it into array_remainder is pretty awkward, but more importantly it guarantees that x % y and np.remainder(x, y) are exactly equivalent. This is important because we support overriding % via __array_ufunc__, as described in NEP 13.
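To make the equivalence argument concrete, here is a toy model in plain Python. It is only a sketch: `ToyArray`, `toy_remainder` and `__toy_ufunc__` are hypothetical names invented for illustration and are not NumPy APIs. The point is that when the operator only forwards to the function, an override hook sees both spellings identically.

```python
class ToyArray:
    """Hypothetical container; stands in for an array type."""

    def __init__(self, values):
        self.values = list(values)

    def __mod__(self, other):
        # The operator does no work of its own; it forwards to the function.
        return toy_remainder(self, other)


def toy_remainder(x, y):
    # Give either operand a chance to override, loosely like __array_ufunc__.
    for operand in (x, y):
        hook = getattr(operand, "__toy_ufunc__", None)
        if hook is not None:
            return hook(toy_remainder, x, y)
    return ToyArray(a % b for a, b in zip(x.values, y.values))


class Recording(ToyArray):
    """Hypothetical subclass that records which function was dispatched."""

    calls = []

    def __toy_ufunc__(self, func, x, y):
        Recording.calls.append(func.__name__)
        # Fall back to the plain computation on the raw values.
        return ToyArray(a % b for a, b in zip(x.values, y.values))


a = Recording([10, 11, 12])
b = ToyArray([7, 7, 7])
via_operator = (a % b).values
via_function = toy_remainder(a, b).values
```

If `__mod__` instead special-cased some operand types before reaching the function, the hook would never be consulted for those types, and the two spellings could diverge.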

tylerjereddy · 2018-10-10T05:23:12Z

It seems that writing in a TIMEDELTA_mm_m_remainder() function in numpy/core/src/umath/loops.c.src along with the function prototype in numpy/core/src/umath/loops.h.src isn't sufficient for array_remainder to discover the ufunc loop, even though that seems to be all that was done for the other arithmetic operations on timedelta64. Adding a preprocessor def for TIMEDELTA_remainder doesn't help either.

I can disable a check in ufunc_loop_matches() in numpy/core/src/umath/ufunc_type_resolution.c to make things work better--this falls back to using remainder for built-in datetime.timedelta but seems fishy.

I'll keep trying!

eric-wieser · 2018-10-10T05:35:17Z

A nice followup would be divmod, which can probably reuse the same code.

eric-wieser · 2018-10-10T05:35:58Z

You probably need to tweak the type resolver function

ewmoore · 2018-10-10T15:31:06Z

Does it need to be added to numpy/core/code_generators/generate_umath.py?

tylerjereddy · 2018-10-10T17:32:59Z

> Does it need to be added to numpy/core/code_generators/generate_umath.py?

Apparently yes, and I'm likely going to have to write a TypeResolver function for remainder too. Looks like there's a fair bit of copy-pasting with PyUFunc_AdditionTypeResolver and PyUFunc_SubtractionTypeResolver for datetime handling, so likely similar to those.

As far as I can tell this isn't that different from what I've done in this PR -- the Python C API is still used for type checking and making decisions, the mess is just being hidden in the resolver machinery.

charris · 2018-10-10T17:41:20Z

Needs a release note.

tylerjereddy · 2018-10-10T20:29:02Z

Refactored to use ufunc machinery as requested -- it appears remainder now works for all combinations of timedelta64 units, with appropriate exceptions on Years and Months of course.

My decision to typecast as: mm -> m on modulus (preserve timedelta64 type & report units) may require some discussion.

Points for:

* @miccoli suggests that this matches their expectations
* some clarity is gained by preserving the units in the remainder, and subtraction for timedelta64 is also mm -> m

Points against:

* the timedelta64 division operation, which is (more) closely related to remainder, casts to double with signature mm -> d

It would also be good to decide what type we want from remainder when dividing by, e.g., an int64, since #12092 also mentioned this case as logically preserving the type. For conventional division, we preserve the timedelta64 type when dividing by int64 and float64, so that may be an easier decision to make.

@shoyer does this need a mailing list check first maybe?

Might be nice if we could confine this PR to mm remainder and then I can expand to md and mq (double and int64) type resolution in future PRs?

eric-wieser · 2018-10-10T20:38:46Z

As far as I'm concerned, mm->m is the only reasonable signature for modulus. If in doubt, look at the behavior of the builtin timedelta.

tylerjereddy · 2018-10-10T21:04:16Z

from built-in datetime.timedelta:

division matches what we currently have: mm->d
same goes for our other division operations: md->m and mq->m
remainder is indeed: mm->m
remainders with the other types mentioned above are:
TypeError: unsupported operand type(s) for %: 'datetime.timedelta' and 'float'
TypeError: unsupported operand type(s) for %: 'datetime.timedelta' and 'int'

So maybe supporting modulus with int and float is more controversial (was suggested as expected to work in the linked issue).
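For reference, the built-in behaviour listed above can be checked with the standard library alone. This is a small sketch; the signature letters in the comments are the NumPy shorthand used in this thread, and `remainder_type_error` is just a helper name chosen here.

```python
from datetime import timedelta

week = timedelta(weeks=1)
day = timedelta(days=1)

ratio = week / day                    # mm -> d: a plain float
halved_int = week / 2                 # mq -> m: still a timedelta
halved_float = week / 2.0             # md -> m: still a timedelta
leftover = week % timedelta(days=5)   # mm -> m: a timedelta


def remainder_type_error(divisor):
    """Return True if `week % divisor` raises TypeError."""
    try:
        week % divisor
    except TypeError:
        return True
    return False


int_rejected = remainder_type_error(7)
float_rejected = remainder_type_error(7.0)
```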

tylerjereddy · 2018-10-10T21:09:29Z

The codecov checks are green, but it didn't actually do anything -- missing a report or something.

miccoli · 2018-10-10T23:06:20Z

If I may clarify #12092: from

dividend = divisor × quotient + remainder

it follows that all three quantities (dividend, divisor × quotient, remainder) must be homogeneous and have the same time units.

Since for multiplication (divisor × quotient) we have mq -> m, qm -> m, md -> m, dm -> m, for the remainder function the signature should be

mm -> m
mq -> m
md -> m

i.e., a numeric divisor should be accepted but the result should still be timedelta64.
(For division the signature is mm -> d because the time units cancel out and the quotient is dimensionless.)
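A minimal sketch of that identity on integer counts. `Duration`, `rem_mm` and `rem_mq` are hypothetical stand-ins for timedelta64 and its remainder loops, and the sketch assumes both operands share a unit.

```python
from typing import NamedTuple


class Duration(NamedTuple):
    """Hypothetical stand-in for timedelta64: an integer count of one unit."""
    count: int
    unit: str


def rem_mm(a, b):
    """mm -> m: both operands are durations in the same unit."""
    if a.unit != b.unit:
        raise ValueError("this sketch does not convert units")
    return Duration(a.count % b.count, a.unit)


def rem_mq(a, n):
    """mq -> m: duration dividend, plain integer divisor (the proposed case)."""
    return Duration(a.count % n, a.unit)


ten_days = Duration(10, "D")
r_mm = rem_mm(ten_days, Duration(7, "D"))
r_mq = rem_mq(ten_days, 7)

# dividend = divisor * quotient + remainder holds for the counts in both cases.
q = ten_days.count // 7
identity_holds = 7 * q + r_mm.count == ten_days.count
```

In both cases the remainder keeps the unit of the dividend, which is the homogeneity requirement stated above.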

The datetime.timedelta arithmetic is different from timedelta64: in fact

>>> datetime.timedelta(days=10) / 7
datetime.timedelta(days=1, seconds=37028, microseconds=571429)
>>> np.timedelta64(10, 'D') / 7
numpy.timedelta64(1,'D')

Therefore I find it useful to define

>>> np.timedelta64(10, 'D') % 7 == np.timedelta64(3, 'D')

while the corresponding

datetime.timedelta(days=10)  % 7 == datetime.timedelta(microseconds=4)

or

datetime.timedelta(days=10)  % 7 == datetime.timedelta(days=3)

make no sense to me.

In other terms: in euclidean division the quotient should be an integer, and this makes sense for timedelta64. On the contrary, for datetime.timedelta it is hard to define a sensible integer quotient, and this makes the definition of the remainder problematic.

tylerjereddy · 2018-10-10T23:26:07Z

@miccoli Ok, so timedelta stuff can be confusing, but in short -- we're in agreement for the currently proposed mm->m for timedelta64 remainder slot?

One point for possible clarification from your analysis:

> must be homogeneous and have the same time units.

This PR currently proposes allowing:
np.timedelta64(1, 'us') % np.timedelta64(727, 'ns') -> np.timedelta64(273, 'ns')

That is type homogeneous, but the units are not--are you suggesting we don't want that?
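The mixed-unit case can be modelled in a few lines of plain Python. This is a sketch under the assumption that the result takes the finer of the two units, as the example above suggests; `NS_PER_UNIT` and `remainder_mixed` are illustrative names, not part of the patch.

```python
# Nanoseconds per unit for a few fixed-length units (a deliberately small table).
NS_PER_UNIT = {"ns": 1, "us": 1_000, "ms": 1_000_000, "s": 1_000_000_000}


def remainder_mixed(value1, unit1, value2, unit2):
    """Model of mm -> m with mixed units: convert to the finer unit, then take %."""
    finer = unit1 if NS_PER_UNIT[unit1] <= NS_PER_UNIT[unit2] else unit2
    scale1 = NS_PER_UNIT[unit1] // NS_PER_UNIT[finer]
    scale2 = NS_PER_UNIT[unit2] // NS_PER_UNIT[finer]
    return (value1 * scale1) % (value2 * scale2), finer


result = remainder_mixed(1, "us", 727, "ns")   # 1000 ns % 727 ns
same_unit = remainder_mixed(3, "s", 2, "s")
```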

shoyer · 2018-10-11T00:05:08Z

> The datetime.timedelta arithmetic is different from timedelta64: in fact
>
> >>> datetime.timedelta(days=10) / 7
> datetime.timedelta(days=1, seconds=37028, microseconds=571429)
> >>> np.timedelta64(10, 'D') / 7
> numpy.timedelta64(1,'D')

I think this is arguably a bug, especially on Python 3 -- you should need to write np.timedelta64(10, 'D') // 7 for that. I don't know if we have a good way to automatically pick the datatype for the result, but silent truncation seems bad.

I think there's a case for sticking with mm remainder here.

> This PR currently proposes allowing:
> np.timedelta64(1, 'us') % np.timedelta64(727, 'ns') -> np.timedelta64(273, 'ns')
>
> That is type homogeneous, but the units are not--are you suggesting we don't want that?

I think this is the correct behavior.

> @shoyer does this need a mailing list check first maybe?

Reproducing the behavior of datetime.timedelta in np.timedelta64 seems pretty uncontroversial to me. I don't think there's any cause for pinging the mailing list.

miccoli · 2018-10-11T07:24:31Z

> This PR currently proposes allowing:
> np.timedelta64(1, 'us') % np.timedelta64(727, 'ns') -> np.timedelta64(273, 'ns')

I agree that this is correct.
(I missed the fact that np.timedelta64(1, 'us') - np.timedelta64(727, 'ns') -> numpy.timedelta64(273,'ns') and wrongly assumed that sum and subtraction only work with same time units)

@shoyer

> I think this is arguably a bug, especially on Python 3 -- you should need to write np.timedelta64(10, 'D') // 7 for that. I don't know if we have a good way to automatically pick the datatype for the result, but silent truncation seems bad.

For reference:

>>> datetime.timedelta(days=10) / 7
datetime.timedelta(days=1, seconds=37028, microseconds=571429)
>>> datetime.timedelta(days=10) // 7
datetime.timedelta(days=1, seconds=37028, microseconds=571428)

thus datetime.timedelta(days=10) / 7 is rounded to the nearest µs while datetime.timedelta(days=10) // 7 is truncated. (Note however that the result is of the same type, while 10/7 and 10//7 have different types.)
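The rounding versus truncation difference can be reproduced with the standard library alone. A short sketch follows; it also shows the 4 µs that floor division leaves behind.

```python
from datetime import timedelta

ten_days = timedelta(days=10)

true_div = ten_days / 7     # rounded to the nearest microsecond
floor_div = ten_days // 7   # truncated to a whole microsecond

# What floor division leaves behind when multiplied back.
leftover = ten_days - floor_div * 7
```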

Therefore I would argue that np.timedelta64(10, 'D') / 7 -> np.timedelta64(1, 'D') is correct, while
np.timedelta64(11, 'D') / 7 -> np.timedelta64(1, 'D') is a minor bug. For my usage cases it is important that the datetime64 and timedelta64 time units (or resolution) do not change, so I would not see favourably the fact that, for example
np.timedelta64(10, 'D') / 7 -> np.timedelta64(123428571428571, 'ns')
Of course this is a debatable opinion.

In conclusion: I agree that mm -> m can be implemented, while the other cases need more discussion, in order to clarify which is the desired result with a timedelta64 dividend and a numeric (integer or floating point) divisor.

eric-wieser · 2018-10-11T13:45:00Z

> For my usage cases it is important that the datetime64 and timedelta64 time units (or resolution) do not change, so I would not see favourably the fact that, for example np.timedelta64(10, 'D') / 7 -> np.timedelta64(123428571428571, 'ns')

I would argue that this is exactly what // is for - if you want your variable to remain an integer rather than become a float, you use int // 2 not int / 2.

I would be in favor of deprecating timedelta64 / int

eric-wieser · 2018-10-11T13:46:49Z

numpy/core/src/umath/ufunc_type_resolution.c

+
+    return 0;
+
+type_reso_error: {


This indent is a little jarring - I'd put the brace on its own line

Better to fix as #12147, I think

eric-wieser · 2018-10-11T13:47:39Z

numpy/core/src/umath/ufunc_type_resolution.c

+                PyObject_Repr((PyObject *)PyArray_DESCR(operands[1])));
+        PyErr_SetObject(PyExc_TypeError, errmsg);
+        Py_DECREF(errmsg);
+        return -1;


This code is very exception-unsafe - you need to check for NULL from the result of PyObject_Repr, PyUString_ConcatAndDel, and PyUString_FromFormat

Postponed as part of #12147

eric-wieser · 2018-10-11T13:49:48Z

numpy/core/src/umath/ufunc_type_resolution.c

+                    type_tup, out_dtypes);
+    }
+    if (type_num1 == NPY_TIMEDELTA) {
+        if (type_num2 == NPY_TIMEDELTA) {


Why not just write this:

if (type_num1 == NPY_TIMEDELTA && type_num2 == NPY_TIMEDELTA) {
    // your code
}
else {
    return PyUFunc_DefaultTypeResolver(...)
}

That saves you from having to produce an error message for datetime, making all my above comments moot

It is all just copy-paste from other type resolvers. I was planning to leave room for implementing mq and md remainders, where there would be other switches to handle type_num2 on a case by case basis, so check type_num1 but multiple checks on type_num2.

eric-wieser · 2018-10-11T14:23:37Z

numpy/core/src/umath/loops.c.src

@@ -1591,6 +1591,34 @@ TIMEDELTA_mm_d_divide(char **args, npy_intp *dimensions, npy_intp *steps, void *
    }
 }

+NPY_NO_EXPORT void
+TIMEDELTA_mm_m_remainder(char **args, npy_intp *dimensions, npy_intp *steps, void *NPY_UNUSED(func))


This function looks correct, thanks

tylerjereddy · 2018-10-11T19:25:40Z

Updated with a small reference doc example and to reflect the error handling code changes merged in from Eric recently.

Also added a release note

tylerjereddy · 2018-10-11T22:09:58Z

numpy/core/src/umath/loops.c.src

+        const npy_timedelta in1 = *(npy_timedelta *)ip1;
+        const npy_timedelta in2 = *(npy_timedelta *)ip2;
+        if (in1 == NPY_DATETIME_NAT || in2 == NPY_DATETIME_NAT) {
+            *((npy_timedelta *)op1) = NPY_NAN;


maybe we should propagate NaT instead here to preserve the mm->m signature -- just noticing this as I try to round up the coverage % a little on the patch
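A rough Python model of what NaT propagation in the loop would look like. It is a sketch only: `timedelta_remainder_loop` is a hypothetical name, the inputs are raw int64 tick counts, and zero divisors are deliberately left out of scope.

```python
NAT = -2**63  # NumPy stores NaT as the minimum int64 value


def timedelta_remainder_loop(in1, in2):
    """Elementwise remainder over raw tick counts, propagating NaT."""
    out = []
    for a, b in zip(in1, in2):
        if a == NAT or b == NAT:
            out.append(NAT)      # stay within timedelta: NaT in, NaT out
        else:
            out.append(a % b)    # zero divisors are not handled in this sketch
    return out


result = timedelta_remainder_loop([10, NAT, 9], [7, 3, NAT])
```

Writing the NaT sentinel keeps every output element a valid timedelta value, which is what the mm->m signature promises.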

eric-wieser · 2018-10-12T04:52:26Z

Seems strange to me that np.timedelta64(1, 'us') // np.timedelta64(1, 'us') is an error right now - floor division seems to have an obvious interpretation in my mind.

Something for a later PR.

eric-wieser · 2018-10-12T04:54:46Z

doc/source/reference/arrays.datetime.rst

@@ -119,6 +119,9 @@ simple datetime calculations.
    >>> np.timedelta64(1,'W') / np.timedelta64(1,'D')
    7.0

+    >>> np.timedelta64(1, 'us') % np.timedelta64(727, 'ns')


I would have thought something like np.timedelta64(1,'W') % np.timedelta64(10,'D') would be a slightly clearer example, but not really important

eric-wieser · 2018-10-12T04:58:21Z

numpy/core/code_generators/generate_umath.py

          TD(intflt),
+          [TypeDescription('m', FullTypeDescr, 'mm', 'm'),
+          ],


Line wrapping here is a little weird, and doesn't match the other cases with only one TypeDescription

eric-wieser

Minor nits, looks otherwise good.

eric-wieser · 2018-10-12T05:00:26Z

numpy/core/tests/test_datetime.py

+        # similar behavior enforced by CPython timedelta
+        with assert_raises_regex(RuntimeWarning,
+                                 "divide by zero encountered in remainder"):
+            np.timedelta64(10, 's') % np.timedelta64(0, 's')


Should check the result (0) too

eric-wieser · 2018-10-14T16:52:29Z

numpy/core/tests/test_datetime.py

+
+    @pytest.mark.parametrize("val1, val2", [
+        # years and months can't be unambiguously
+        # divided for modulus operation except for Y % M


I mean, strictly M % Y, M % M, Y % Y are all fine too. There is nothing special about how Y and M behave - there are just rules prohibiting mixing units larger than W with units smaller than or equal to W. In isolation, all the units behave the same.
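The rule described here can be written as a small predicate. This is a sketch of the rule as stated in this comment, not of NumPy's actual casting code; `CALENDAR_UNITS`, `FIXED_UNITS` and `units_compatible` are illustrative names.

```python
CALENDAR_UNITS = {"Y", "M"}  # lengths vary, so no fixed conversion to days
FIXED_UNITS = {"W", "D", "h", "m", "s", "ms", "us", "ns"}


def units_compatible(unit1, unit2):
    """True if both units fall on the same side of the Y/M versus W-and-below split."""
    for unit in (unit1, unit2):
        if unit not in CALENDAR_UNITS | FIXED_UNITS:
            raise ValueError(f"unknown unit: {unit!r}")
    return (unit1 in CALENDAR_UNITS) == (unit2 in CALENDAR_UNITS)


allowed = [units_compatible(a, b) for a, b in [("Y", "M"), ("M", "Y"), ("M", "M"), ("W", "ns")]]
rejected = [units_compatible(a, b) for a, b in [("Y", "D"), ("s", "M")]]
```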

tylerjereddy · 2018-10-14T22:33:12Z

Cleaned up the test comment a bit & rebased / force pushed so we get a Windows test on appveyor for the time being.

eric-wieser · 2018-10-15T02:35:04Z

numpy/core/tests/test_datetime.py

+    def test_timedelta_modulus_div_by_zero(self):
+        # similar behavior enforced by CPython timedelta
+        with assert_raises_regex(RuntimeWarning,
+                                 "divide by zero encountered in remainder"):


Wait, why does this raise a warning? Shouldn't it warn a warning?

In CPython, ZeroDivisionError: integer division or modulo by zero is raised by timedelta(seconds=10) % timedelta(seconds=0)

For development all warnings except a few are raised as errors in pytest.ini, but in the absence of that file it should just be a warning. Is it actually raised by the code?

Eric is right -- in this feature branch it is just a warning when executed as plain code outside the test suite, so I should likely just check for a warning.

I assume we deviate from CPython timedelta because NumPy can gracefully handle division by 0 in scenarios where Python raises an exception.

Hopefully assert_warns works then -- I'd normally use the pytest equivalent, but there's no precedent for that in NumPy IIRC.

eric-wieser · 2018-10-15T02:36:05Z

numpy/core/tests/test_datetime.py

+        with assert_raises_regex(RuntimeWarning,
+                                 "divide by zero encountered in remainder"):
+            actual = np.timedelta64(10, 's') % np.timedelta64(0, 's')
+            assert_equal(actual, 0)


Coverage says this line is never hit, meaning the result in the array is never actually checked.

Yeah, I wasn't sure why you asked me to check that the result is zero in a previous review comment, but now I'm seeing that you thought / think we should deviate from standard Python on this one and have a warning instead of an exception that breaks control flow?

I think something weird is going on within pytest that's promoting the warning to an error. I can divide by zero just fine:

In [13]: np.float64(1) % np.float64(0)
C:\Program Files\Python 3.5\Scripts\ipython:1: RuntimeWarning: invalid value encountered in double_scalars
Out[13]: nan

In [16]: np.int64(1) % np.int64(0)
C:\Program Files\Python 3.5\Scripts\ipython:1: RuntimeWarning: divide by zero encountered in longlong_scalars
Out[16]: 0

I think you need to use assert_warns or something here, and then the warning will not escalate, and you can check the result too.
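The escalation can be reproduced with the standard `warnings` module alone. In this sketch `remainder_with_warning` is a hypothetical helper standing in for the ufunc call, and the "error" filter plays the role of the pytest.ini setting mentioned above.

```python
import warnings


def remainder_with_warning(a, b):
    """Hypothetical helper: warn on a zero divisor and return 0 instead of raising."""
    if b == 0:
        warnings.warn("divide by zero encountered in remainder", RuntimeWarning)
        return 0
    return a % b


# 1) With warnings promoted to errors, the call raises and no result is produced.
raised = False
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        remainder_with_warning(10, 0)
    except RuntimeWarning:
        raised = True

# 2) Recording the warning instead lets the call finish, so the result can be checked.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    actual = remainder_with_warning(10, 0)
```

In the first form any assertion placed after the call inside the block is unreachable, which matches the coverage report on the `assert_equal` line.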

tylerjereddy · 2018-10-15T17:31:54Z

Revised to switch a test from checking for exception to warning, as requested.

tylerjereddy added the 01 - Enhancement and component: numpy.datetime64 (and timedelta64) labels Oct 9, 2018

tylerjereddy force-pushed the remainder_timedelta64 branch from c472685 to b662819 Compare October 9, 2018 16:29

charris added the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Oct 10, 2018

tylerjereddy force-pushed the remainder_timedelta64 branch from b662819 to 9aaeafa Compare October 10, 2018 20:12

tylerjereddy changed the title ~~ENH: add timedelta seconds modulus~~ ENH: add timedelta modulus operator support (mm) Oct 10, 2018

eric-wieser reviewed Oct 11, 2018

tylerjereddy force-pushed the remainder_timedelta64 branch from 9aaeafa to 17237f5 Compare October 11, 2018 19:22

tylerjereddy force-pushed the remainder_timedelta64 branch from 17237f5 to 0a807fe Compare October 11, 2018 20:00

tylerjereddy commented Oct 11, 2018

tylerjereddy force-pushed the remainder_timedelta64 branch from 0a807fe to e07f7dc Compare October 11, 2018 22:49

eric-wieser reviewed Oct 12, 2018

eric-wieser approved these changes Oct 12, 2018

eric-wieser reviewed Oct 12, 2018

eric-wieser reviewed Oct 12, 2018

tylerjereddy force-pushed the remainder_timedelta64 branch from e07f7dc to 6461602 Compare October 12, 2018 18:01

eric-wieser reviewed Oct 14, 2018

eric-wieser reviewed Oct 14, 2018

tylerjereddy force-pushed the remainder_timedelta64 branch from 6461602 to abca780 Compare October 14, 2018 22:31

eric-wieser reviewed Oct 15, 2018

ENH: add timedelta modulus

c9a6b02

* added support for modulus operator with timedelta operands; type signature is mm->m

tylerjereddy force-pushed the remainder_timedelta64 branch from abca780 to c9a6b02 Compare October 15, 2018 17:30

tylerjereddy removed the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Oct 18, 2018

stefanv merged commit 7cb9edf into numpy:master Oct 30, 2018

miccoli mentioned this pull request Oct 31, 2018

remainder is not implemented for timedelta64 #12092

Closed

tylerjereddy mentioned this pull request Nov 2, 2018

ENH: add mm->q floordiv #12308

Merged

eric-wieser mentioned this pull request Jan 8, 2019

ENH: add mm->qm divmod #12683

Merged

charris mentioned this pull request Jan 16, 2019

ENH: add mm->q floordiv #12767

Merged

