Remove auto densification and unify operator code. #46

hameerabbasi · 2017-12-27T14:26:57Z

I've removed the auto-densification in __add__ and exp, along with the tests for it. I've also unified the code for all operators so that they only work when the result is not densified. They also work with scalars, as long as the scalar doesn't densify the result.

mrocklin · 2017-12-27T16:13:30Z

sparse/core.py

-        assert isinstance(other, COO)
+        if not isinstance(other, COO):
+            raise ValueError("Performing this operation would produce "
+                             "a dense result: %s" % str(func))


Hrm, this isn't always true though. For example, x > 5 or x + 0 (which comes up when using the sum function) are both common cases and should work efficiently. Is there some way that we can test against zero here?

It's already being checked in _perform_op. That one handles operations with scalars pretty nicely, and is general. I've also added a test for it (test_elemwise_scalar)

Though I can move it in here and refactor if you like.

Edit: Actually, I'll do that right now. Hold off on merging.
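The zero test asked about above can be sketched roughly as follows. This is only an illustration of the idea, not the code in _perform_op, and the helper name preserves_zero is hypothetical. The assumption is that an operation keeps a sparse array sparse only if applying it to the implicit zero fill value yields zero again.

```python
import numpy as np


def preserves_zero(func, *args, dtype=float, **kwargs):
    """Return True if func maps the implicit zero entries back to zero.

    Hypothetical helper for illustration; not part of the library.
    """
    # 0-d array standing in for a single unstored (zero) entry.
    zero = np.zeros((), dtype=dtype)
    result = func(zero, *args, **kwargs)
    # A nonzero (or NaN) result means every unstored entry would have to be
    # materialised, so the output would be dense.
    return not np.any(result)


print(preserves_zero(np.greater, 5))  # x > 5   -> True
print(preserves_zero(np.add, 0))      # x + 0   -> True
print(preserves_zero(np.add, 1))      # x + 1   -> False
print(preserves_zero(np.exp))         # exp(x)  -> False
```

Under this check, x > 5 and x + 0 stay sparse, while x + 1 and exp(x) would be rejected because they turn zeros into nonzeros.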

mrocklin · 2017-12-27T16:37:56Z

sparse/tests/test_core.py

+
+    xs = COO.from_numpy(x)
+
+    assert_eq(func(xs, y), func(x, y))


Can we also check here that the output is a COO object and that the number of non-zeros hasn't significantly increased?

It won't increase, but it can decrease. Let me add that check.
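A check along these lines might look like the sketch below. The helper name assert_stays_sparse and the FakeSparse stand-in are hypothetical and exist only so the example runs on its own; in the real test the objects would be COO instances, which expose an nnz attribute.

```python
def assert_stays_sparse(result, operand, sparse_type):
    """Check that an elementwise result is still sparse and no denser.

    Hypothetical test helper; sparse_type would be COO in the real tests.
    """
    assert isinstance(result, sparse_type), type(result)
    # A zero-preserving operation cannot create new stored entries, but it
    # may remove some (for example, x > 5 maps small stored values to False).
    assert result.nnz <= operand.nnz, (result.nnz, operand.nnz)


class FakeSparse:
    """Stand-in with an nnz attribute, used only to exercise the helper."""

    def __init__(self, nnz):
        self.nnz = nnz


# Passes: the result has fewer stored entries than the operand.
assert_stays_sparse(FakeSparse(3), FakeSparse(5), FakeSparse)
```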

mrocklin · 2017-12-27T16:39:12Z

sparse/core.py

@@ -1067,7 +1071,7 @@ def __abs__(self):

    def exp(self, out=None):
        assert out is None
-        return np.exp(self.maybe_densify())
+        return self.elemwise(np.exp)


Will this always err (given current code)?

(not a concern, just asking a question)

Yes. But given what we talked about in #10 it's probably best to keep it.

Edit: If you really need exp you can do np.exp(x.maybe_densify()) or np.exp(x.todense()).

mrocklin · 2017-12-27T16:40:13Z

sparse/core.py

@@ -641,11 +650,13 @@ def elemwise(self, func, *args, **kwargs):
                   sorted=self.sorted)

    def elemwise_binary(self, func, other, *args, **kwargs):
-        assert isinstance(other, COO)
+        if not isinstance(other, COO):
+            return self.elemwise(func, other, *args, **kwargs)


Thanks for this change. I think that this is nicer, especially given that elemwise_binary is public API. People will probably try to use it and your change here will, I suspect, include fewer surprises for them.

We might at some point reduce this down to a single public elemwise function that handles both cases and dispatches to _elemwise_unary and _elemwise_binary based on inputs. (not necessarily now though if you'd like to wrap this up quickly).

No, properly > quickly. It saves headache down the line.

The problem I see with that is, elemwise already exists and does what you are describing as _elemwise_unary. I have no problem dispatching between the two, but my question is:

Is it worthwhile changing the API? Changing what elemwise is and hiding elemwise_binary? Do we have someone for whom it'll break?

I was also considering supporting scipy.sparse.spmatrix. It's a small change, and well worth it, I expect.

It will also make implementing __array_ufunc__ much easier.

The new elemwise would, I think, cover the functionality of the old elemwise. It would now also work if the args weren't scalars, but were other things as well.

In general though this project is young enough and obscure enough that I think API breaks are fine. Anyone using this project is an early adopter.

I agree that supporting scipy.sparse.spmatrix objects would be valuable.

hameerabbasi · 2017-12-27T17:39:01Z

sparse/core.py


    def elemwise(self, func, *args, **kwargs):
+        """


I needed your opinion here. Should we keep it this way and assume other=args[0] or should we add an other=None argument for explicitly selecting between binary and unary functions?

The first is more compact but the second can help when, for example, the first input to your function itself is supposed to be COO or spmatrix.

Without thinking about this too much I'm inclined to leave this as def elemwise(self, func, *args, **kwargs). I would expect len(args) to determine between elemwise_unary and elemwise_binary as you've done below.

I'm not sure I understand the second case

Suppose you have an inherently unary function that accepts COO or spmatrix as its second positional argument, instead of it being an operand to a binary function. I agree, it's a corner use case, but it is more explicit.

In case you think this won't ever be needed (and I mostly agree), this should be good to merge, if you haven't got any other comments.

I wouldn't worry about this for now. Short term I might encourage users to use kwargs for this.

Fair enough. I overlooked that kwargs can map onto normal args in Python!
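To make the dispatch and the kwargs workaround concrete, here is a minimal sketch using a toy 1-D class. TinySparse and scale_by_total are hypothetical names and this is not the COO implementation; it only assumes that elemwise takes the binary path when the first positional argument is itself sparse, and the unary path otherwise.

```python
import operator


class TinySparse:
    """Toy 1-D sparse vector mapping index -> nonzero value. Illustration only."""

    def __init__(self, data, size):
        self.data = {i: v for i, v in data.items() if v != 0}
        self.size = size

    def elemwise(self, func, *args, **kwargs):
        # Binary only when the first positional argument is itself sparse.
        if args and isinstance(args[0], TinySparse):
            return self._elemwise_binary(func, *args, **kwargs)
        return self._elemwise_unary(func, *args, **kwargs)

    def _elemwise_unary(self, func, *args, **kwargs):
        # Refuse operations that would turn implicit zeros into nonzeros.
        if func(0, *args, **kwargs) != 0:
            raise ValueError("Performing this operation would produce "
                             "a dense result: %s" % str(func))
        new = {i: func(v, *args, **kwargs) for i, v in self.data.items()}
        return TinySparse(new, self.size)

    def _elemwise_binary(self, func, other, *args, **kwargs):
        if func(0, 0, *args, **kwargs) != 0:
            raise ValueError("Performing this operation would produce "
                             "a dense result: %s" % str(func))
        keys = self.data.keys() | other.data.keys()
        new = {i: func(self.data.get(i, 0), other.data.get(i, 0),
                       *args, **kwargs) for i in keys}
        return TinySparse(new, self.size)


def scale_by_total(value, weights=None):
    """Inherently unary function that happens to take a sparse argument."""
    return value * sum(weights.data.values())


a = TinySparse({0: 1, 3: 4}, size=5)
b = TinySparse({3: 6}, size=5)

doubled = a.elemwise(operator.mul, 2)            # unary path, scalar argument
summed = a.elemwise(operator.add, b)             # binary path, sparse operand
scaled = a.elemwise(scale_by_total, weights=b)   # unary path via keyword
```

Passing b as a keyword keeps it out of args, so it is not mistaken for the second operand of a binary function.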

mrocklin · 2017-12-27T18:21:16Z

sparse/core.py

+                return self._elemwise_binary(func, *args, **kwargs)
+            elif isinstance(other, scipy.sparse.spmatrix):
+                other = COO.from_scipy_sparse(other)
+                return self._elemwise_binary(func, other, *args[1:], **kwargs)


This branch could use a test

mrocklin · 2017-12-27T18:21:48Z

sparse/core.py

+            function. Otherwise, it will be treated as a unary function.
+        kwargs : dict, optional
+            The kwargs to pass to the function.
+        Returns


I think we need a blank line above this docstring section title for clean rendering with Sphinx.
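For reference, a numpydoc-style docstring with the blank line in place might look like the sketch below. The function apply_func is a hypothetical stand-in, not the elemwise method itself; only the kwargs entry is taken from the diff above.

```python
import inspect


def apply_func(func, *args, **kwargs):
    """
    Call a function with the given arguments (illustrative stand-in).

    Parameters
    ----------
    func : callable
        The function to apply.
    args : tuple, optional
        Extra positional arguments to pass to the function.
    kwargs : dict, optional
        The kwargs to pass to the function.

    Returns
    -------
    object
        Whatever ``func`` returns.
    """
    return func(*args, **kwargs)


# Confirm that the "Returns" title is preceded by a blank line.
lines = inspect.cleandoc(apply_func.__doc__).splitlines()
returns_at = lines.index("Returns")
print(lines[returns_at - 1] == "")
```

Without the blank line, the section title can be absorbed into the preceding parameter description when the page is rendered.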

mrocklin · 2017-12-27T18:46:43Z

This looks great to me. Thanks @hameerabbasi !

I've added you as a collaborator to this fork. You should now have the ability to merge PRs. In general I tend to use "Squash and Merge" when merging PRs unless particular care has been taken around the commit history and the individual commits have significant value.

Care to give this a try and merge your own PR here?

hameerabbasi · 2017-12-27T18:52:16Z

Thanks so much for adding me as a collaborator! Would you prefer I still worked on my own fork in different branches or can I also work with branches here?

mrocklin · 2017-12-27T18:53:46Z

Thanks so much for adding me as a collaborator!

Heh, you're doing more work on this than anyone else :)

Would you prefer I still worked on my own fork in different branches or can I also work with branches here?

I tend to prefer keeping development work on different branches when possible. I plan to continue working from my fork if/when we move this to a more public org.

hameerabbasi added 3 commits December 27, 2017 15:21

Get rid of auto densification and unify ops and elemwise code.

cbdb757

Add more operators.

092c714

Add tests for all operators.

a7e003b

mrocklin reviewed Dec 27, 2017

Move scalar logic to elemwise_binary.

7ebc8b5

mrocklin reviewed Dec 27, 2017

hameerabbasi added 2 commits December 27, 2017 18:19

Unify elemwise and elemwise_binary.

a38fdd4

Add computed function to test instead of re-computing it.

e8f3199

hameerabbasi commented Dec 27, 2017

mrocklin reviewed Dec 27, 2017

hameerabbasi added 3 commits December 27, 2017 19:29

Added newline to docstring for Sphinx.

7eef866

Added test for operation with scipy sparse matrix.

ea462e6

Fix spontaneous test failure.

9f63e53

hameerabbasi merged commit 736d7d6 into pydata:master Dec 27, 2017

hameerabbasi deleted the fix-auto-densification branch December 27, 2017 18:52
