Description
Include/cpython/tupleobject.h has
typedef struct {
PyObject_VAR_HEAD
/* ob_item contains space for 'ob_size' elements.
Items must normally not be NULL, except during construction when
the tuple is not yet visible outside the function that builds it. */
PyObject *ob_item[1];
} PyTupleObject;
CPython may allocate the object with trailing elements, then access it with something like
PyUnicode_InternInPlace(&_PyTuple_ITEMS(tuple)[i]);
where i > 0.
This out-of-bounds access is UB. (https://stackoverflow.com/questions/44745677/flexible-array-members-can-lead-to-undefined-behavior mentions that before C99 TC2 there was a non-normative example suggesting that a [1] array can be used this way. That was incorrect and was removed by TC2.)
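For comparison, here is a minimal sketch of the same layout written with a C99 flexible array member, where indexing past element 0 is well defined as long as the allocation covers it. The names demo_tuple, demo_tuple_new and demo_tuple_set are hypothetical and are not CPython APIs; the struct is a simplified stand-in for PyTupleObject, not its real layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for PyTupleObject; not CPython's actual layout. */
typedef struct {
    size_t ob_size;   /* number of trailing items */
    void *ob_item[];  /* C99 flexible array member: no declared bound */
} demo_tuple;

/* Allocate a demo_tuple with room for n trailing items, all set to NULL. */
static demo_tuple *demo_tuple_new(size_t n)
{
    demo_tuple *t = malloc(sizeof(demo_tuple) + n * sizeof(void *));
    if (t == NULL)
        return NULL;
    t->ob_size = n;
    for (size_t i = 0; i < n; i++)
        t->ob_item[i] = NULL;
    return t;
}

/* Store item at index i; any i < ob_size is within the allocated object. */
static void demo_tuple_set(demo_tuple *t, size_t i, void *item)
{
    assert(i < t->ob_size);
    t->ob_item[i] = item;
}
```

Whether this form is acceptable in CPython's public headers is a separate question, since flexible array members are not part of standard C++.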
The 2022-06-24 Clang -fstrict-flex-arrays commit (https://reviews.llvm.org/D126864 https://reviews.llvm.org/rG886715af962de2c92fac4bd37104450345711e4a) made -fsanitize=array-bounds stricter, so it catches such UB. Note: the Clang patch appears non-comprehensive. It misses many similar UB cases but catches the CPython UB.
Reproduce (with clang compiled from latest llvm-project):
% mkdir -p out/bounds && cd out/bounds
% ../../configure CC=/tmp/RelA/bin/clang CXX=/tmp/RelA/bin/clang++ CFLAGS=-fsanitize=bounds LDFLAGS=-fsanitize=bounds
% make -j 60
CC='/tmp/RelA/bin/clang' LDSHARED='/tmp/RelA/bin/clang -shared -fsanitize=bounds ' OPT='-DNDEBUG -g -fwrapv -O3 -Wall' ./python -E ../../setup.py build
../../Objects/codeobject.c:49:34: runtime error: index 2 out of bounds for type 'PyObject *[1]' (aka 'struct _object *[1]')
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../../Objects/codeobject.c:49:34 in
running build
running build_ext
See also https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html. GCC appears to be more permissive, at least for now: "Although using one-element arrays this way is discouraged, GCC handles accesses to trailing one-element array members analogously to zero-length arrays."
There are multiple suspicious places in CPython:
% rg 'ob_.*\[1\]'
Include/memoryobject.h
65: Py_ssize_t ob_array[1]; /* shape, strides, suboffsets */
Tools/gdb/libpython.py
876: digit ob_digit[1];
Include/cpython/tupleobject.h
10: PyObject *ob_item[1];
Include/cpython/longintrepr.h
81: digit ob_digit[1];
Include/cpython/bytesobject.h
8: char ob_sval[1];