Deprecate the (new) _array methods by allowing both parent(u) and parent[u] access? #2015

@hyanwong

In #1320 we bemoaned the introduction of the new Tree.parent_array[u], Tree.left_sib_array[u], etc. notation, the array-based (direct memory access) equivalents of Tree.parent(u), Tree.left_sib(u), etc. Nevertheless, it made it into tskit 0.3.6.

However, it seems we can use the __call__ dunder method to define what happens when we call class_instance(u). So we could have both Tree.parent(u) and Tree.parent[u] as valid access methods, by creating a numpy ndarray subclass and returning the underlying numpy array as a view of that subclass:

import numpy as np

class Ts_nparray(np.ndarray):
    # see https://numpy.org/doc/stable/user/basics.subclassing.html
    def __call__(self, index):
        return self[index]

a = np.arange(10, 20).view(Ts_nparray)
assert a(3) == a[3]
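To check that the view trick does not give up the direct memory access, here is a minimal sketch. The class name CallableView is hypothetical and used only for illustration. It relies on standard numpy behaviour: view() reuses the existing buffer, and slices of a subclass instance keep the subclass.

```python
import numpy as np


class CallableView(np.ndarray):
    """Hypothetical ndarray subclass: arr(i) behaves like arr[i]."""

    def __call__(self, index):
        return self[index]


base = np.arange(10, 20)
a = base.view(CallableView)

# view() reinterprets the same buffer, so no data is copied.
assert np.shares_memory(base, a)

# A write through the original array is visible through the view.
base[3] = 99
assert a(3) == 99 and a[3] == 99

# Slices keep the subclass, so they remain callable too.
tail = a[5:]
assert isinstance(tail, CallableView)
assert tail(0) == a(5)

# Scalar access returns a numpy scalar, not a Python int.
assert isinstance(a(3), np.integer)
```

The last assertion is where this differs from a method that returns a plain Python int, which matters if exact backwards compatibility of the return type is wanted.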

If this doesn't seem too hacky, we could thus have

class _ts_nparray(np.ndarray):
    # a simple wrapper class to allow the array to be accessed using arr(x) as well as arr[x]
    # see https://numpy.org/doc/stable/user/basics.subclassing.html
    def __call__(self, index):
        return self[index]  # could also convert this to a python int, for full backwards compatibility


class Tree:
    ...
    @property
    def parent(self):
        # get the numpy direct memory array here (placeholder name)
        return numpy_direct_memory_array.view(_ts_nparray)

    @property
    def parent_array(self):
        # Deprecated but could be maintained for backwards compatibility
        return self.parent  # parent is a property, so it is not called here

That would mean all the following are equivalent:

tree.parent[0]  # Standard numpy access, recommended for new code
tree.parent(0)  # Old tskit syntax, deprecated but permanently maintained for backwards compatibility
tree.parent_array[0]  # Recently introduced syntax (in 0.3.6), to be deprecated, could possibly be removed later
tree.parent_array(0)  # Comes along for the ride(!)

And we would strongly recommend that new code uses the first (standard numpy) method. Using __call__ to keep the old behaviour too seems a bit hacky, but would mean that old code would (hopefully) switch seamlessly to the new.
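A rough end-to-end sketch of the four access forms follows. It is a stand-in, not tskit code: MockTree, CallableArray and the _parent attribute are hypothetical names, and the parent array is just a small hand-written numpy array rather than the real direct-memory array. The __call__ here also converts scalar results to a Python int, which is one possible way to do the conversion mentioned in the code comment above.

```python
import numpy as np


class CallableArray(np.ndarray):
    """Hypothetical wrapper: arr(u) works as well as arr[u]."""

    def __call__(self, index):
        value = self[index]
        # Convert numpy scalars to plain Python values, mimicking a
        # method that returns a Python int; arrays pass through as-is.
        return value.item() if isinstance(value, np.generic) else value


class MockTree:
    """Stand-in for a tree; _parent plays the role of the direct-memory array."""

    def __init__(self, parents):
        self._parent = np.asarray(parents, dtype=np.int32)

    @property
    def parent(self):
        return self._parent.view(CallableArray)

    @property
    def parent_array(self):
        # Same object type as parent, so it is both indexable and callable.
        return self.parent


# Nodes 0 and 1 have parent 2; node 2 is a root (-1).
tree = MockTree([2, 2, -1])

results = [
    tree.parent[0],
    tree.parent(0),
    tree.parent_array[0],
    tree.parent_array(0),
]
assert all(r == 2 for r in results)

# Call syntax gives a Python int, index syntax gives a numpy scalar.
assert type(tree.parent(0)) is int
assert isinstance(tree.parent[0], np.integer)
```

Note that each property access in this sketch builds a new view object, so code in a tight loop would probably want to fetch tree.parent once and index into it repeatedly.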

Labels: Performance (this issue addresses performance, either runtime or memory), Python API (issue is about the Python API)
