bpo-40120: Fix unbounded struct char[] undefined behavior. #19232
Changes from all commits
809f263
d3329fe
a1759d9
630a26a
cd483f6
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
Fixed internal structure definitions for structs such as PyBytesObject and | ||
unicode's encoding_map to not rely on C undefined behavior for access to | ||
their trailing unbounded character array in favor of C99 approved flexible | ||
array member syntax. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -28,6 +28,9 @@ _Py_IDENTIFIER(__bytes__); | |
|
||
Using PyBytesObject_SIZE instead of sizeof(PyBytesObject) saves | ||
3 bytes per string allocation on a typical system. | ||
|
||
The + 1 accounts for the trailing \0 byte that we include as a safety | ||
measure for code that treats the underlying char * as a C string. | ||
*/ | ||
#define PyBytesObject_SIZE (offsetof(PyBytesObject, ob_sval) + 1) | ||
Wait, you made PyBytesObject 1 byte smaller, and you didn't have to subtract 1 somewhere? Does it mean that Python overallocates 1 byte since forever?

Thankfully not. Because this code used |
||
|
||
|
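The following sketch illustrates why a size built from `offsetof` does not change when a trailing array goes from `[1]` to `[]`, while `sizeof` can. The structs, field names, and macros here are hypothetical simplifications, not the real `PyBytesObject` layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header fields (assumed for illustration only). */
struct bytes_old {   /* pre-C99 idiom: one-element trailing array */
    long refcnt;
    long size;
    long hash;
    char sval[1];
};

struct bytes_new {   /* C99 flexible array member */
    long refcnt;
    long size;
    long hash;
    char sval[];
};

/* Header size plus one byte for the trailing '\0'; tail padding is not counted. */
#define BYTES_OLD_SIZE (offsetof(struct bytes_old, sval) + 1)
#define BYTES_NEW_SIZE (offsetof(struct bytes_new, sval) + 1)

/* Bytes to allocate for a payload of n characters. */
static size_t alloc_size_old(size_t n) { return BYTES_OLD_SIZE + n; }
static size_t alloc_size_new(size_t n) { return BYTES_NEW_SIZE + n; }

/* Sanity checks on the layout assumptions used above. */
static void check_layout(void)
{
    /* The trailing member starts at the same offset in both layouts. */
    assert(offsetof(struct bytes_old, sval) == offsetof(struct bytes_new, sval));
    /* sizeof of the old layout includes the array element and any tail padding. */
    assert(sizeof(struct bytes_old) >= BYTES_OLD_SIZE);
}
```

In this sketch `alloc_size_old(n)` and `alloc_size_new(n)` return the same value for every `n`, so code that sizes allocations this way needs no compensating `- 1` after the switch. Code that uses `sizeof` directly does need one, because `sizeof(struct bytes_old)` counts the single array element.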
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8208,14 +8208,14 @@ struct encoding_map { | |
PyObject_HEAD | ||
unsigned char level1[32]; | ||
int count2, count3; | ||
- unsigned char level23[1]; | |
+ unsigned char level23[]; | |
@gpshead: Can you please write a PR which only contains this change? Since it doesn't touch the public C API, C++ compatibility is not an issue (I don't think that Python can be built with a C++ compiler). We can start with this change and see how it goes? See also the discussion at python/peps#1349 |
||
}; | ||
|
||
static PyObject* | ||
encoding_map_size(PyObject *obj, PyObject* args) | ||
{ | ||
struct encoding_map *map = (struct encoding_map*)obj; | ||
- return PyLong_FromLong(sizeof(*map) - 1 + 16*map->count2 + | |
+ return PyLong_FromLong(sizeof(*map) + 16*map->count2 + | |
128*map->count3); | ||
} | ||
|
||
|
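A small model of the size computation above, assuming a hypothetical `struct toy_map` that omits `PyObject_HEAD` and helpers `toy_map_size` and `toy_map_new` that are not part of CPython. It shows that with a flexible array member the trailing storage is simply added to `sizeof`, with no `- 1` correction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of the layout; the real struct begins with PyObject_HEAD. */
struct toy_map {
    unsigned char level1[32];
    int count2, count3;
    unsigned char level23[];  /* trailing storage: 16*count2 + 128*count3 bytes */
};

/* Total bytes needed for a map with the given counts. */
static size_t toy_map_size(int count2, int count3)
{
    return sizeof(struct toy_map) + 16 * (size_t)count2 + 128 * (size_t)count3;
}

/* Allocate a map and zero its trailing storage (the fill values are arbitrary here). */
static struct toy_map *toy_map_new(int count2, int count3)
{
    struct toy_map *m = malloc(toy_map_size(count2, count3));
    if (m == NULL) {
        return NULL;
    }
    memset(m->level1, 0, sizeof m->level1);
    m->count2 = count2;
    m->count3 = count3;
    memset(m->level23, 0, 16 * (size_t)count2 + 128 * (size_t)count3);
    assert(toy_map_size(m->count2, m->count3) >= sizeof(struct toy_map));
    return m;
}
```

For example, `toy_map_new(2, 1)` reserves 160 trailing bytes, so `level23[0]` through `level23[159]` are valid to access. With the old `level23[1]` declaration the same total would have required subtracting the one element already counted by `sizeof`.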
@@ -8347,7 +8347,7 @@ PyUnicode_BuildEncodingMap(PyObject* string) | |
|
||
/* Create a three-level trie */ | ||
result = PyObject_MALLOC(sizeof(struct encoding_map) + | ||
- 16*count2 + 128*count3 - 1); | |
+ 16*count2 + 128*count3); | |
if (!result) | ||
return PyErr_NoMemory(); | ||
PyObject_Init(result, &EncodingMapType); | ||
|
I suggest to: