Skip to content

Inline values array into the object #115776

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Tracked by #654
markshannon opened this issue Feb 21, 2024 · 9 comments
Open
Tracked by #654

Inline values array into the object #115776

markshannon opened this issue Feb 21, 2024 · 9 comments
Assignees
Labels
3.13 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage type-feature A feature request or enhancement

Comments

@markshannon
Copy link
Member

markshannon commented Feb 21, 2024

Feature or enhancement

Currently when allocating a plain Python object, we allocate the object, and its values array.
This has a few downsides:

  1. We perform two allocations instead of one.
  2. An extra indirection is needed when accessing an attribute on the object
  3. If the __dict__ is materialized we can no longer use specialized lookup.

We could fix three with an extra pointer in the object header, but that would waste more space.

We should append the values array directly after the object header, which fixes the above issues.

It adds some complexity, as we need to track ownership of the values so that they are freed exactly once, but may enable some simplifications as well. Overall, it would seem to make little difference to complexity.

See faster-cpython/ideas#72 for more discussion.

Linked PRs

@encukou
Copy link
Member

encukou commented Apr 11, 2024

This change introduced a regression, see #117750

@encukou
Copy link
Member

encukou commented Aug 29, 2024

The new public API, Py_TPFLAGS_INLINE_VALUES, should be documented.

@cdce8p
Copy link
Contributor

cdce8p commented Jan 16, 2025

Posting it here in case it was missed. #123192 caused mypyc to break. We haven't been able to narrow it down to a clean reproducer just yet unfortunately. For more details see python/mypy#17973

@henryiii
Copy link
Contributor

henryiii commented May 9, 2025

This breaks pybind11, as well (assuming mypyc is still broken since that's still open). We used to be able to specify a class accepted dynamic attributes, something like this:

inline void enable_dynamic_attributes(PyHeapTypeObject *heap_type) {
    auto *type = &heap_type->ht_type;
    type->tp_flags |= Py_TPFLAGS_HAVE_GC;
    type->tp_flags |= Py_TPFLAGS_MANAGED_DICT;
    type->tp_traverse = pybind11_traverse;
    type->tp_clear = pybind11_clear;

    static PyGetSetDef getset[]
        = {{"__dict__", PyObject_GenericGetDict, PyObject_GenericSetDict, nullptr, nullptr},
           {nullptr, nullptr, nullptr, nullptr, nullptr}};
    type->tp_getset = getset;
}

But now those attributes no longer are present. Is there something we should be doing instead?

This shows up in two tests. One is a pickling test; the dynamic attributes don't come back when the object goes through pickling.

    @pytest.mark.xfail("env.PYPY")
    def test_roundtrip_with_dict():
        p = m.PickleableWithDict("test_value")
        p.extra = 15
        p.dynamic = "Attribute"

        data = pickle.dumps(p, pickle.HIGHEST_PROTOCOL)
        p2 = pickle.loads(data)
        assert p2.value == p.value
        assert p2.extra == p.extra
>       assert p2.dynamic == p.dynamic
E       AttributeError: 'pybind11_tests.pickling.PickleableWithDict' object has no attribute 'dynamic'

The second test indicates you can no longer replace the dict.

 @pytest.mark.xfail("env.PYPY")
    def test_dynamic_attributes():
        instance = m.DynamicClass()
        assert not hasattr(instance, "foo")
        assert "foo" not in dir(instance)

        # Dynamically add attribute
        instance.foo = 42
        assert hasattr(instance, "foo")
        assert instance.foo == 42
        assert "foo" in dir(instance)

        # __dict__ should be accessible and replaceable
        assert "foo" in instance.__dict__
        instance.__dict__ = {"bar": True}
>       assert not hasattr(instance, "foo")
E       AssertionError: assert not True
E        +  where True = hasattr(<pybind11_tests.methods_and_attributes.DynamicClass object at 0x112c5c4a0>, 'foo')

@cdce8p
Copy link
Contributor

cdce8p commented May 9, 2025

assuming mypyc is still broken since that's still open

Yes, it still is. Tracked in

Interestingly it's also a pickle test that's failing.

A workaround for mypyc would be to remove the Py_TPFLAGS_INLINE_VALUES, I just haven't looked into it yet if that's the right call. We remove other flags as well already.

 #if PY_MINOR_VERSION == 11
     // This is a hack. Python 3.11 doesn't include good public APIs to work with managed
     // dicts, which are the default for heap types. So we try to opt-out until Python 3.12.
     t->ht_type.tp_flags &= ~Py_TPFLAGS_MANAGED_DICT;
+#elif PY_MINOR_VERSION == 14
+    t->ht_type.tp_flags &= ~Py_TPFLAGS_INLINE_VALUES;

python/mypy#17973 (comment)

@henryiii
Copy link
Contributor

henryiii commented May 10, 2025

The problem occurs when you make the class, then try to assign a new dictionary to __dict__ (which is what pickle does).

I tried removing the Py_TPFLAGS_INLINE_VALUES, but it didn't do anything, and I'm not sure it should. The code here doesn't check to see if it's set, it only checks the size and then then forces it to be inline, setting it if it's not already set. Maybe it should check for Py_TPFLAGS_MANAGED_DICT besides just checking the size? This patch fixes it for me:

diff --git a/Objects/typeobject.c b/Objects/typeobject.c
index a7ab69fef4c..4025ada1242 100644
--- a/Objects/typeobject.c
+++ b/Objects/typeobject.c
@@ -8715,7 +8715,7 @@ type_ready_managed_dict(PyTypeObject *type)
             return -1;
         }
     }
-    if (type->tp_itemsize == 0) {
+    if (type->tp_itemsize == 0 && !(type->tp_flags & Py_TPFLAGS_MANAGED_DICT)) {
         type_add_flags(type, Py_TPFLAGS_INLINE_VALUES);
     }
     return 0;

Edit: this never fails to be hit, actually, so is not the right solution.

@henryiii
Copy link
Contributor

henryiii commented May 10, 2025

I have a MWE without pybind11 or mypyc:

src/main.c:
#define PY_SSIZE_T_CLEAN
#include <Python.h>

typedef struct {
    PyObject_VAR_HEAD
} ManagedDictObject;

int ManagedDict_traverse(PyObject *self, visitproc visit, void *arg) {
    PyObject_VisitManagedDict(self, visit, arg);
    Py_VISIT(Py_TYPE(self));
    return 0;
}

int ManagedDict_clear(PyObject *self) {
    PyObject_ClearManagedDict(self);
    return 0;
}

static PyGetSetDef ManagedDict_getset[] = {
    {"__dict__", PyObject_GenericGetDict, PyObject_GenericSetDict, NULL, NULL},
    {NULL, NULL, NULL, NULL, NULL},
};

static PyType_Slot ManagedDict_slots[] = {
    {Py_tp_new, (void *)PyType_GenericNew},
    {Py_tp_getset, (void *)ManagedDict_getset},
    {Py_tp_traverse, (void *)ManagedDict_traverse},
    {Py_tp_clear, (void *)ManagedDict_clear},
    {0}
};

static PyType_Spec ManagedDict_spec = {
    "manageddictbug.ManagedDict",
    sizeof(ManagedDictObject),
    0, // itemsize
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_MANAGED_DICT | Py_TPFLAGS_HEAPTYPE | Py_TPFLAGS_HAVE_GC,
    ManagedDict_slots
};

static PyModuleDef manageddictbugmodule = {
    PyModuleDef_HEAD_INIT,
    "manageddictbug",
    NULL,
    -1,
    NULL,
};

PyMODINIT_FUNC
PyInit_manageddictbug(void) {
    PyObject *m = PyModule_Create(&manageddictbugmodule);
    if (m == NULL)
        return NULL;

    PyObject *ManagedDictType = PyType_FromSpec(&ManagedDict_spec);
    if (ManagedDictType == NULL) {
        Py_DECREF(m);
        return NULL;
    }

    if (PyModule_AddObject(m, "ManagedDict", ManagedDictType) < 0) {
        Py_DECREF(ManagedDictType);
        Py_DECREF(m);
        return NULL;
    }

    return m;
}
pyproject.toml:
[build-system]
requires = ["scikit-build-core"]
build-backend = "scikit_build_core.build"

[project]
name = "example-broken"
version = "0.1.0"
CMakeLists.txt:
cmake_minimum_required(VERSION 3.15...4.0)
project(${SKBUILD_PROJECT_NAME} LANGUAGES C)

find_package(Python REQUIRED COMPONENTS Development.Module)

python_add_library(manageddictbug MODULE WITH_SOABI src/main.c)
install(TARGETS manageddictbug DESTINATION .
example.py:
import manageddictbug

obj = manageddictbug.ManagedDict()
obj.foo = 42
print(obj.foo)
print(obj.__dict__)

obj.__dict__ = {"bar": 3}
print(obj.__dict__)
print(obj.bar)
$ uv venv -p 3.13 -q && uv pip install . -q && .venv/bin/python example.py
42
{'foo': 42}
{'bar': 3}
3
$ uv venv -p ~/git/software/pybind11/.venv/bin/python3.14 -q && uv pip install . -q && .venv/bin/python example.py
42
{'foo': 42}
{'bar': 3}
Traceback (most recent call last):
  File "/Users/henryschreiner/git/scikit-build-proj/example_broken/example.py", line 13, in <module>
    print(obj.bar)
          ^^^^^^^
AttributeError: 'manageddictbug.ManagedDict' object has no attribute 'bar'

Edit: Fixed, I forgot PyObject_VAR_HEAD. This reproduces the bug: with Python 3.13, it works, 3.14b1, it fails.

I did test that if we change the pickle implementation to do __dict__.update instead, it works. So it's just __dict__ = {...} that's broken; __dict__ updates, but the attributes don't update.

There is custom code in PyObject_GenericSetDict that is supposed to be handling the inline case, but maybe it wasn't hit before and is faulty?

@henryiii
Copy link
Contributor

Opened a new issue.

@encukou
Copy link
Member

encukou commented May 12, 2025

The new public API, Py_TPFLAGS_INLINE_VALUES, still needs documentation.

@encukou encukou reopened this May 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

4 participants