Skip to content

Remove deregistered objects from the inactive overload cache #32

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

BetsyMcPhail
Copy link

@BetsyMcPhail BetsyMcPhail commented Sep 4, 2019

Fixes issue reported in RobotLocomotion/drake#11424


This change is Reviewable

@EricCousineau-TRI EricCousineau-TRI self-assigned this Sep 9, 2019
Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+@EricCousineau-TRI for review

Reviewable status: 0 of 2 files reviewed, all discussions resolved (waiting on @EricCousineau-TRI)

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r1.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @BetsyMcPhail)


include/pybind11/pybind11.h, line 2413 at r1 (raw file):

    if (!self)
        return function();
    handle type = self.get_type();

nit CI errors are happening due to this being unused. This error should disappear if you try the suggestion; if that doesn't work, then you'll want to remove this.


include/pybind11/pybind11.h, line 2414 at r1 (raw file):

        return function();
    handle type = self.get_type();
    auto key = std::make_pair(self.ptr(), name);

Rather than use the object pointer and require removal upon object destruction, can you try my suggestion from this comment?
RobotLocomotion/drake#11424 (comment)


include/pybind11/detail/class.h, line 228 at r1 (raw file):

        if (self == it->second && Py_TYPE(self) == Py_TYPE(it->second)) {
            registered_instances.erase(it);
            PyObject *to_remove = (PyObject *)it->second;

When you try my above suggestion, can you remove this code?

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @BetsyMcPhail)

a discussion (no related file):
I think this could be testable with the following procedure. Per the root-cause analysis in the Drake issue:

  • Have two functions that creates a type that has a pybind "vtable", and return the type
  • Call one function, and call it's overload.
  • Delete the type reference, garbage collect, ensure that it was deleted, then create the other type, and check.

a discussion (no related file):
I'm going to briefly try out my own suggestions; will post back to this discussion item.


@EricCousineau-TRI EricCousineau-TRI force-pushed the remove-old-objects-from-cache branch from 5661566 to 7915de6 Compare September 12, 2019 14:02
Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 4 unresolved discussions (waiting on @BetsyMcPhail and @EricCousineau-TRI)


include/pybind11/pybind11.h, line 2413 at r1 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit CI errors are happening due to this being unused. This error should disappear if you try the suggestion; if that doesn't work, then you'll want to remove this.

Done. (Fixed in separate push.)


include/pybind11/pybind11.h, line 2414 at r1 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Rather than use the object pointer and require removal upon object destruction, can you try my suggestion from this comment?
RobotLocomotion/drake#11424 (comment)

This does look like it'd be rather complicated, and would require non-trivial changes to the internals - or it would require on-purpose memory leaks to concatenate the full type name into the key.

However, the current workaround does seem a bit expensive, and I'm not sure it would get accepted in upstream as-is; perhaps it would get accepted with the internals modification, or simply concatenating.

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 4 unresolved discussions (waiting on @BetsyMcPhail and @EricCousineau-TRI)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

I'm going to briefly try out my own suggestions; will post back to this discussion item.

Tried making unittest, but encountered a different error; will post about that separately, possibly directly in upstream.

Here's the failure reproduced directly in upstream:
EricCousineau-TRI@9726086

I'll apply the patch here, and push it.


Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 4 unresolved discussions (waiting on @BetsyMcPhail and @EricCousineau-TRI)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Tried making unittest, but encountered a different error; will post about that separately, possibly directly in upstream.

Here's the failure reproduced directly in upstream:
EricCousineau-TRI@9726086

I'll apply the patch here, and push it.

(To clarify, I've created both a test which exercises this failure, as well as finding another error...)


@EricCousineau-TRI EricCousineau-TRI force-pushed the remove-old-objects-from-cache branch from 1144159 to 342cc67 Compare September 12, 2019 21:54
Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: (pending CI)

Reviewed 2 of 2 files at r2, 2 of 2 files at r3.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

I think this could be testable with the following procedure. Per the root-cause analysis in the Drake issue:

  • Have two functions that creates a type that has a pybind "vtable", and return the type
  • Call one function, and call it's overload.
  • Delete the type reference, garbage collect, ensure that it was deleted, then create the other type, and check.

OK Retracting. We can see what pybind11 authors might suggest if/when they have the bandwidth. Added TODOs.


a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

(To clarify, I've created both a test which exercises this failure, as well as finding another error...)

Done.



include/pybind11/pybind11.h, line 2414 at r1 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

This does look like it'd be rather complicated, and would require non-trivial changes to the internals - or it would require on-purpose memory leaks to concatenate the full type name into the key.

However, the current workaround does seem a bit expensive, and I'm not sure it would get accepted in upstream as-is; perhaps it would get accepted with the internals modification, or simply concatenating.

OK Retracting given that there's now a TODO.


include/pybind11/detail/class.h, line 228 at r1 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

When you try my above suggestion, can you remove this code?

OK Retracting.

@BetsyMcPhail BetsyMcPhail force-pushed the remove-old-objects-from-cache branch from 342cc67 to 351c186 Compare September 18, 2019 14:33
Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 1 unresolved discussion (waiting on @BetsyMcPhail and @EricCousineau-TRI)


tests/test_class.py, line 312 at r4 (raw file):

    del Child1
    pytest.gc_collect()
    assert wref() is None

Sorry for not fixing this earlier; I think this will still ultimately fail CI.

The right fix would be to possibly disable this test unless it's under CPython, Python 3. (I think pytest has some skipIf mechanism can you can combine with info from sys.)

Let me know if you'd like me to make that change.

Copy link
Author

@BetsyMcPhail BetsyMcPhail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 1 unresolved discussion (waiting on @EricCousineau-TRI)


tests/test_class.py, line 312 at r4 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Sorry for not fixing this earlier; I think this will still ultimately fail CI.

The right fix would be to possibly disable this test unless it's under CPython, Python 3. (I think pytest has some skipIf mechanism can you can combine with info from sys.)

Let me know if you'd like me to make that change.

No problem. I figured I'd make the quick fix first and then dig into any remaining CI failures. I can make the other change since I'm already working in this file. I'll let you know if I run into any issues.

@BetsyMcPhail BetsyMcPhail force-pushed the remove-old-objects-from-cache branch from 351c186 to fe82311 Compare September 18, 2019 15:36
Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 1 unresolved discussion (waiting on @BetsyMcPhail and @EricCousineau-TRI)


tests/test_class.py, line 312 at r4 (raw file):

Previously, BetsyMcPhail (Betsy McPhail) wrote…

No problem. I figured I'd make the quick fix first and then dig into any remaining CI failures. I can make the other change since I'm already working in this file. I'll let you know if I run into any issues.

Sounds good. From looking at the prior build: https://travis-ci.com/RobotLocomotion/pybind11/builds/128189983

Looks like this test should only run if it's on Linux, CPython, and not a debug build.
Also, it looks like c++11 / gcc4.8 also fails in Python 2, so perhaps constrain it only run for Python 3?

@BetsyMcPhail BetsyMcPhail force-pushed the remove-old-objects-from-cache branch from fe82311 to 356f862 Compare September 18, 2019 16:46
Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 2 unresolved discussions (waiting on @BetsyMcPhail and @EricCousineau-TRI)


tests/test_class.py, line 296 at r5 (raw file):

@pytest.unsupported_on_pypy
@pytest.mark.skipif(sys.platform != "linux", reason="only runs on Linux")
@pytest.mark.skipif(sys.version_info < (3, 0), reason="requires python3")

nit You may to also check if this is a debug build.
From looking, perhaps the check is if "d" in sys.abiflags

This may be valid only in Python 3 (3.2 per the PEP), so perhaps add it to the existing stuff:

skipif(sys.version_info < (3, 0) or "d" in sys.abiflags, reason="requires python3 in release mode")

PEP: https://www.python.org/dev/peps/pep-3149/

@BetsyMcPhail BetsyMcPhail force-pushed the remove-old-objects-from-cache branch 2 times, most recently from cd98716 to e6c8c5b Compare September 18, 2019 20:30
Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 1 unresolved discussion (waiting on @BetsyMcPhail and @EricCousineau-TRI)


tests/test_class.py, line 296 at r6 (raw file):

@pytest.unsupported_on_pypy

:( It looks like this is very platform dependent; this failure seems like it's hard to check in a way that's not CI-specific (e.g. checking for environment variables):
https://travis-ci.com/RobotLocomotion/pybind11/jobs/236907166 (fails, but is CPython, Python 3, non-debug, Linux)

Perhaps this should just be skipped?

pytest.mark.skip(reason="Generally reproducible in CPython, Python 3, non-debug, on Linux. However, it is hard to pin this down for CI.")

Copy link
Author

@BetsyMcPhail BetsyMcPhail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 1 unresolved discussion (waiting on @EricCousineau-TRI)


tests/test_class.py, line 296 at r6 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

:( It looks like this is very platform dependent; this failure seems like it's hard to check in a way that's not CI-specific (e.g. checking for environment variables):
https://travis-ci.com/RobotLocomotion/pybind11/jobs/236907166 (fails, but is CPython, Python 3, non-debug, Linux)

Perhaps this should just be skipped?

pytest.mark.skip(reason="Generally reproducible in CPython, Python 3, non-debug, on Linux. However, it is hard to pin this down for CI.")

Unfortunately, I think you're right. It's going to be hard to define a combination that works reliably.

@BetsyMcPhail BetsyMcPhail force-pushed the remove-old-objects-from-cache branch from e6c8c5b to fead9db Compare September 19, 2019 14:21
Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 2 unresolved discussions (waiting on @BetsyMcPhail and @EricCousineau-TRI)

a discussion (no related file):
I will file the Drake PR that tests using this commit.


@EricCousineau-TRI
Copy link
Collaborator

Almost there! One last lint error:

$ flake8
./tests/test_class.py:2:1: F401 'sys' imported but unused
import sys

@BetsyMcPhail BetsyMcPhail force-pushed the remove-old-objects-from-cache branch from fead9db to c07168a Compare September 19, 2019 16:20
Copy link
Author

@BetsyMcPhail BetsyMcPhail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 2 unresolved discussions (waiting on @BetsyMcPhail and @EricCousineau-TRI)


tests/test_class.py, line 296 at r6 (raw file):

Previously, BetsyMcPhail (Betsy McPhail) wrote…

Unfortunately, I think you're right. It's going to be hard to define a combination that works reliably.

Done.

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 1 unresolved discussion (waiting on @BetsyMcPhail and @EricCousineau-TRI)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

I will file the Drake PR that tests using this commit.

Doing this now...


Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 1 unresolved discussion (waiting on @BetsyMcPhail and @EricCousineau-TRI)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Doing this now...

Filed: RobotLocomotion/drake#12095
Waiting for CI to pass. If it's good, will merge this.


Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r8.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @BetsyMcPhail)


tests/test_class.py, line 295 at r8 (raw file):

@pytest.mark.skip(reason="Generally reproducible in CPython, Python 3, non-debug, on Linux.\

nit Super minor, but for line skips, can you instead split the string literal? e.g.

func(arg="My text "
         "with line continuation.")

@EricCousineau-TRI EricCousineau-TRI force-pushed the remove-old-objects-from-cache branch from c07168a to 579cd5a Compare September 24, 2019 20:30
Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, all discussions resolved (waiting on @EricCousineau-TRI)

a discussion (no related file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

Filed: RobotLocomotion/drake#12095
Waiting for CI to pass. If it's good, will merge this.

Done. (Passed.)



tests/test_class.py, line 295 at r8 (raw file):

Previously, EricCousineau-TRI (Eric Cousineau) wrote…

nit Super minor, but for line skips, can you instead split the string literal? e.g.

func(arg="My text "
         "with line continuation.")

Done.

Copy link
Collaborator

@EricCousineau-TRI EricCousineau-TRI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r9.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@EricCousineau-TRI EricCousineau-TRI merged commit 965af15 into RobotLocomotion:drake Sep 24, 2019
@BetsyMcPhail BetsyMcPhail deleted the remove-old-objects-from-cache branch September 25, 2019 20:59
EricCousineau-TRI pushed a commit that referenced this pull request Jan 14, 2020
…#35)

Reverts #32

This change cause segfaults on Mac CI: RobotLocomotion/drake#12105

This was PR'd against this pybind11 fork and merged, then incorporated into a Drake PR as a commit update.
The Drake PR was then reverted in #12105, but the commit was not reverted in pybind11 itself, hence this PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants