Support other types than dict for __builtins__ #58593
CPython expects __builtins__ to be a dict, but it would be useful to be able to use another type. For example, my pysandbox project (a sandbox to secure Python) requires a read-only mapping for __builtins__. PEP 416 was rejected, so there is no builtin frozendict type, but it looks like the dictproxy type will be exposed as a public type.

The attached patch uses PyDict_CheckExact() to check whether __builtins__ is a dict and adds a "slow path" for other types. The overhead on runtime performance should be very low (near zero): PyDict_CheckExact() just dereferences a pointer (to read the object type) and compares two pointers.

The patch depends on the issue bpo-14383 patch (identifier.patch) for the __build_class__ identifier. I can write a new patch without this dependency if needed. |
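The fast-path/slow-path split described above can be sketched in Python. This is only an illustration of the idea, not the C code from the patch: lookup_builtin is a hypothetical helper, `type(obj) is dict` stands in for PyDict_CheckExact(), and types.MappingProxyType is assumed to be available as the read-only mapping type.

```python
import builtins
from types import MappingProxyType


def lookup_builtin(builtins_obj, name):
    """Hypothetical helper mirroring the fast-path/slow-path split."""
    if type(builtins_obj) is dict:
        # Fast path: exact dict, the Python analogue of PyDict_CheckExact().
        try:
            return dict.__getitem__(builtins_obj, name)
        except KeyError:
            raise NameError(f"name {name!r} is not defined") from None
    # Slow path: any other object goes through the generic item protocol.
    try:
        return builtins_obj[name]
    except KeyError:
        raise NameError(f"name {name!r} is not defined") from None


plain = vars(builtins)                 # exact dict -> fast path
proxy = MappingProxyType(plain)        # not a dict -> slow path
print(lookup_builtin(plain, 'len') is len)
print(lookup_builtin(proxy, 'len') is len)
```

Both calls return the same object; only the route taken differs. A dict subclass would also take the slow path here, because the check is on the exact type.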
See the issue bpo-14386 which exposes dictproxy as a public type. |
Example combining patches of bpo-14385 and bpo-14386 to run code with read-only __builtins__:

ns={'__builtins__': __builtins__.__dict__}
exec(compile("__builtins__['superglobal']=1; print(superglobal)", "test", "exec"), ns)
ns={'__builtins__': dictproxy(__builtins__.__dict__)}
exec(compile("__builtins__['superglobal']=2; print(superglobal)", "test", "exec"), ns)
----------- end of test.py

Output:

$ ./python test.py
1
Traceback (most recent call last):
  File "x.py", line 4, in <module>
    exec(compile("__builtins__['superglobal']=1; print(superglobal)", "test", "exec"), ns)
  File "test", line 1, in <module>
TypeError: 'dictproxy' object does not support item assignment

Note: this protection is not enough to secure Python, but it is an important part of a Python sandbox. |
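For readers who want to try the same experiment on a released interpreter, here is a minimal sketch. It assumes the proxy type is reachable as types.MappingProxyType (the name dictproxy used above comes from the patch under discussion), and try_to_patch_builtins is a hypothetical helper written for this illustration. The writable case uses a private copy of the builtins dict so the demo does not modify the real builtins module.

```python
import builtins
from types import MappingProxyType


def try_to_patch_builtins(builtins_obj):
    """Run code that tries to add a name to __builtins__; report the outcome."""
    ns = {'__builtins__': builtins_obj}
    try:
        exec("__builtins__['superglobal'] = 1", ns)
    except TypeError:
        # Read-only mappings reject item assignment with TypeError.
        return 'blocked'
    return 'allowed'


writable = dict(vars(builtins))                 # private, mutable copy
read_only = MappingProxyType(vars(builtins))    # read-only view

print(try_to_patch_builtins(writable))
print(try_to_patch_builtins(read_only))

# Name lookup still works through the read-only view.
ns = {'__builtins__': read_only}
exec("n = len('abc')", ns)
print(ns['n'])
```

As noted in the comment above, this only prevents item assignment on __builtins__ itself; it is not a sandbox on its own.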
Oh, and by the way, I work around the lack of a read-only mapping in pysandbox by removing dict methods: dict.__init__(), dict.clear(), dict.update(), etc. This is a problem because these methods are useful in Python. |
With my patch, Python doesn't check the type of __builtins__, whereas ceval.c replaces any lookup error with a NameError. Example:

$ ./python
Python 3.3.0a1+ (default:f8d01c8baf6a+, Mar 26 2012, 01:44:48)
>>> code=compile("print('Hello World!')", "", "exec")
>>> exec(code,{'__builtins__': {'print': print}})
Hello World!
>>> exec(code,{'__builtins__': {}})
NameError: name 'print' is not defined
>>> exec(code,{'__builtins__': 1})
NameError: name 'print' is not defined

It should only replace the current exception with NameError if the current exception is a LookupError. Also, my patch on LOAD_GLOBAL is not correct: it still calls PyDict_GetItem. I'm waiting until bpo-14383 is done before writing a new patch. |
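The session above can be reproduced with a small helper that reports which exception, if any, a given __builtins__ object produces. This is a sketch: run_with_builtins is a hypothetical name, and only the two dict cases are shown because the result for a non-mapping such as 1 depends on the interpreter version and on how the patch is finalized.

```python
def run_with_builtins(source, builtins_obj):
    """Execute source with the given __builtins__.

    Returns the name of the exception type raised, or None on success.
    """
    try:
        exec(source, {'__builtins__': builtins_obj})
    except Exception as exc:
        return type(exc).__name__
    return None


source = "n = len('abc')"
print(run_with_builtins(source, {'len': len}))   # name is found
print(run_with_builtins(source, {}))             # name is missing
```

A missing key in a dict used as __builtins__ surfaces as NameError rather than KeyError, which is the replacement behaviour discussed in the comment above.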
New version:
Before my patch, a new dict was created for builtins if __builtins__ existed in globals but was not a dict. With my patch, __builtins__ is kept and its type is checked at runtime. If __builtins__ is not a mapping, an exception is raised on lookup in ceval.c. We may check the type of __builtins__ in PyFrame_New() using:

PyDict_Check(builtins) || (PyMapping_Check(builtins) && !PyList_Check(builtins) && !PyTuple_Check(builtins))

(PyDict_Check(builtins) is checked first for performance.) |
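A rough Python approximation of the proposed check is sketched below. It is not the C code: looks_like_builtins_mapping is a hypothetical name, and "the type defines __getitem__" is used here as a stand-in for PyMapping_Check(), which is an assumption about how closely the two correspond.

```python
from types import MappingProxyType


def looks_like_builtins_mapping(obj):
    """Approximate the proposed PyFrame_New() type check in Python."""
    if isinstance(obj, dict):
        # Checked first, as in the C expression, because it is the common case.
        return True
    # Stand-in for PyMapping_Check(): the type supports subscripting.
    has_subscript = hasattr(type(obj), '__getitem__')
    # Lists and tuples are subscriptable but are not mappings.
    return has_subscript and not isinstance(obj, (list, tuple))


for candidate in ({}, MappingProxyType({}), [], (), 1):
    print(type(candidate).__name__, looks_like_builtins_mapping(candidate))
```

This approximation also accepts other subscriptable types such as str, so it is a coarse filter and not a guarantee that the object behaves like a mapping.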
Oops, patch version 2 was not correct: I forgot a { ... } in ceval.c. The new patch fixes this issue but also leaves the LOAD_GLOBAL code unchanged: it keeps the goto and doesn't try to factor out the last 3 instructions. LOAD_GLOBAL is really performance-critical. With patch version 3, the overall overhead is +0.4% according to pybench. |
+ assert(!builtins);

Oops, the assert must be replaced with assert(builtins != NULL) -> fixed in patch version 4. |
This looks fine. |
New changeset e3ab8aa0216c by Victor Stinner in branch 'default': |
I note that the documentation still states a dictionary is required for globals. Should that not be updated as well? |
Apologies, I meant to link to the dev docs: |
The patch for this issue changed LOAD_GLOBAL to use PyObject_GetItem when globals() is a dict subclass, but LOAD_NAME, STORE_GLOBAL, and DELETE_GLOBAL weren't changed. (LOAD_NAME uses PyObject_GetItem for builtins now, but not for globals.)

This means that global lookup doesn't respect overridden __getitem__ inside a class statement (unless you explicitly declare the name global with a global statement, in which case LOAD_GLOBAL gets used instead of LOAD_NAME).

I don't have a strong opinion on whether STORE_GLOBAL or DELETE_GLOBAL should respect overridden __setitem__ or __delitem__, but the inconsistency between LOAD_GLOBAL and LOAD_NAME seems like a bug that should be fixed.

For reference, in the following code, the first 3 exec calls successfully print 5, and the last exec call fails, due to the LOAD_GLOBAL/LOAD_NAME inconsistency:

class Foo(dict):
    def __getitem__(self, index):
        return 5 if index == 'y' else super().__getitem__(index)

exec('print(y)', Foo())
exec('global y; print(y)', Foo())
exec('''
class UsesLOAD_NAME:
    global y
    print(y)''', Foo())
exec('''
class UsesLOAD_NAME:
    print(y)''', Foo()) |
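To see which opcode the compiler picks for each class body, the bytecode can be inspected with the dis module. The sketch below only looks at compiled code and does not execute it; opnames_loading is a hypothetical helper written for this illustration.

```python
import dis
import types


def opnames_loading(source, name):
    """Return the set of name-loading opcodes used for `name` in `source`,
    including nested code objects such as class bodies."""
    found = set()
    pending = [compile(source, '<demo>', 'exec')]
    while pending:
        code = pending.pop()
        for instr in dis.get_instructions(code):
            if instr.opname in ('LOAD_NAME', 'LOAD_GLOBAL') and instr.argval == name:
                found.add(instr.opname)
        # Class bodies are stored as nested code objects in co_consts.
        pending.extend(c for c in code.co_consts if isinstance(c, types.CodeType))
    return found


without_global = "class C:\n    print(y)"
with_global = "class C:\n    global y\n    print(y)"

print(opnames_loading(without_global, 'y'))
print(opnames_loading(with_global, 'y'))
```

The class body without a global statement loads y through LOAD_NAME, while the one with `global y` uses LOAD_GLOBAL, which is the difference the comment above points to.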
This issue is now closed. Please open a new issue. |