-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
bpo-45653: Freeze parts of the encodings package #30030
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ec2bf3f
to
e835a2f
Compare
/* Set encodings.__path__ for frozen encodings package | ||
* | ||
* The encodings package is frozen but most encodings modules are not. The | ||
* __path__ attribute of the encodings package must be reset so importlib is | ||
* able to find the pure Python modules. | ||
* Returns -1 on error | ||
*/ | ||
static int | ||
_set_encodings_path(PyObject *mod) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... This should not be necessary. It is already handled by FrozenImporter.find_spec()
(in Lib/importlib/_bootstrap.py). If path is not getting set then something went wrong and needs to be fixed.
Is it here to provide a fallback for the config->stdlib_dir == NULL
case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something is not working right when embedding Python.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😞
I'll try to take a look this week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, I spent some time looking at this today. I take back what I said: "This should not be necessary." The approach you took is probably good enough until we can find a better solution. I plan on troubleshooting the test_embed failures if you don't figure them out first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that sys._stdlib_dir
is set to None. This can happen in some embedding scenarios (for now). This is the reason why FrozenImporter.find_spec()
doesn't populate encodings.__path__
in the failing tests. It is why _set_encodings_path()
is failing.
The problem is that sys._stdlib_dir
is set to None. sys._stdlib_dir
is set from _Py_GetStdlibDir()
, which returns the value calculated by the getpath.c code during runtime init. In some embedding cases that code refuses to extrapolate the stdlib dir, so it ends up None.
FrozenImporter.find_spec()
uses sys._stdlib_dir
to figured out the encodings.__path__
entry to add. If the stdlib dir is unknown then it doesn't add any. This is also why _set_encodings_path()
isn't working.
We have several options:
- extrapolate
sys._stdlib_dir
some other way - fall back to using the non-frozen encodings module
- freeze all "encodings" submodules
(2) effectively accomplishes the same thing as (1), though it doesn't actually update sys._stdlib_dir
. Furthermore, we already know it works. (2) also has the benefit of being very simple, since we'd use the normal import machinery unchanged. (Note that (1) and (2) are not guaranteed to find the stdlib dir. However, with (2) that failure mode already exists, so embedders would already have to deal with it.) (3) would get what we want but would make the compiled binary bigger and would add a bunch of noise to make
output when building.
So I recommend (2).
(2) involves 2 things: drop _set_encodings_path()
here, and update FrozenImporter.find_spec()
.
diff --git a/Lib/importlib/_bootstrap.py b/Lib/importlib/_bootstrap.py
index afb95f4e1d..4200afae7c 100644
--- a/Lib/importlib/_bootstrap.py
+++ b/Lib/importlib/_bootstrap.py
@@ -941,6 +941,9 @@ def find_spec(cls, fullname, path=None, target=None):
origin=cls._ORIGIN,
is_package=ispkg)
filename, pkgdir = cls._resolve_filename(origname, fullname, ispkg)
+ if ispkg and not pkgdir:
+ # We can't resolve __path__, so Fall back to the path finder.
+ return None
spec.loader_state = type(sys.implementation)(
filename=filename,
origname=origname,
The question is, do we still need this version, or is Kumar's original #29788 sufficient? I'd like to have a single PR that we can actually merge. :-) |
Signed-off-by: Christian Heimes <[email protected]> Co-authored-by: Kumar Aditya <[email protected]>
0ce223d
to
eef175c
Compare
The PR is inspired by Kumar's PR GH-29788. I extended
_PyCodecRegistry_Init
to set theencodings.__path__
to[os.path.join(config->stdlib_dir, "encodings)]
.Signed-off-by: Christian Heimes [email protected]
Co-authored-by: Kumar Aditya [email protected]
https://bugs.python.org/issue45653