-
-
Notifications
You must be signed in to change notification settings - Fork 132
Add densification manager. #88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I think this is ready for code review before I get to the docs. See the tests for example use. |
b07f168
to
698504b
Compare
I find it weird that the
Isn't it possible to make it a |
Also, the
|
Yes, both of these bugs are one and the same. I could, for example, just store a I have, however, provided a However, my question is: Do we actually want the children to be linked to the parent or not? Linked in the sense that once the parent's config changes, the child's does as well. Another thing I was considering was automatic densification (above a certain density and below a certain size). This would help performance a lot. |
Or I could store the children in the context manager and reduce the children's configs once the manager has exited. |
sparse/densification.py
Outdated
return | ||
|
||
if max_size is None: | ||
max_size = 10000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not directly use the default value of max_size=10000
in the function signature?
sparse/densification.py
Outdated
if min_density is None: | ||
min_density = 0.25 | ||
|
||
if not isinstance(densify, bool) and densify is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of allowing densify
to be of mixed type (None
and bool
), you could create a Maybe
class that encapsulates the behaviour
sparse/densification.py
Outdated
|
||
def __str__(self): | ||
if isinstance(self.densify, bool): | ||
return '<DensificationConfig: densify=%s>' % self.densify |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there should be one unified repr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel otherwise... Printing fluff that isn't relevant to the repr can be useless. This just prints the required info.
sparse/densification.py
Outdated
raise ValueError('Invalid DensificationConfig.') | ||
|
||
@staticmethod | ||
def combine(*configs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between combine
and from_many
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I combine
d them (pun intended).
Seriously, though. from_parents
builds the parents, and reduces if none of the parents are in a context manager mode.
_reduce_from_parents
gets the current rules without regards to what the parents are.
sparse/densification.py
Outdated
|
||
class DensificationConfig(object): | ||
def __init__(self, densify=None, max_size=None, min_density=None): | ||
if isinstance(densify, (Iterable, DensificationConfig)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like how you abuse the densify
parameter here. If you want a copy constructor, we should have one specifically for it.
sparse/coo.py
Outdated
return COO(self.coords[:, nonzero], data_func[nonzero], | ||
shape=self.shape, | ||
has_duplicates=self.has_duplicates, | ||
sorted=self.sorted) | ||
sorted=self.sorted, densification_config=densification_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💄 all other parameters got their own line.
raise ValueError("Performing this operation would produce " | ||
"a dense result: %s" % name) | ||
|
||
def check(self, name, *arrays): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Varargs like *arrays
can become a huge PITA, I think people should pass an iterable instead.
Also, name
is not necessarily needed for a check, so it could be made optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Name is needed, we use it for the exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm willing to change this, but it is a huge ease to simply do config.check(name, arr)
or config.check(name, arr1, arr2)
instead of putting everything into a list. If we need to pass a list, we can always do config.check(name, *arrays)
. Could you tell me why it's difficult to support?
sparse/coo.py
Outdated
self._densification_config = old_densification_config | ||
|
||
@property | ||
def densification_config(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like how COO
knows how to deal with all the different DensificationConfig
flavors. All of this should be entirely encapsulated in DensificationConfig
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Thanks for the great feedback, @nils-werner! |
I feel I should be writing tests for the |
I might be a bit late to this discussion but, are we sure that we want to do this? Automatic densification may not be a good idea long term. If possible it would be good to make this change easy to remove in the future without having to touch all of the codebase. When I see densification commands going into each of the small functions I become concerned. This also raises the bar for contributions from new developers. |
I'll work on that... I'm not entirely sure I need this either, at least for any of my use cases. The best I can do here is pass the parent Edit: Also, I'm actively working on this. |
Okay, I'm convinced that a densification manager is unwise, and that densification should always be explicit. I'll close this PR but let the branch stay in case someone else can build on my work. |
Fixes #10
A densification manager for
COO
.You can do, for example,
And it should work automagically.