[FR]: Generate venv at build time #522
Comments
Blocker: #339 (comment)
I think I am blocked on this. I am trying to create a Python environment on an OCI image that has a few dependencies pre-installed. It doesn't seem possible to have the venv already created in the image. It also doesn't seem possible to symlink the venv in the https://github.com/vinnybod/bazel-examples/tree/python-oci/python-oci
Related to #522 and associated issues.

This PR introduces a new venv tool which, for lack of a better word, I'm calling the "shim".

While standard interpreters do support relative paths to the interpreter in `pyvenv.cfg` (for instance, `home = ./bin/python` is legal), we still need a `bin/python` which is portable. Existing relocatable virtual environment solutions (uv, conda) still ultimately create an absolute path reference both in `pyvenv.cfg` and, usually, in the `bin/python` symlink to a specific interpreter on the filesystem. uv does this statically, as is standard; Conda also does this statically, with an explicit update process as part of the conda unpack. We can't create relocatable symlinks, and we also can't dynamically correct static symlinks without continuing to have #339 as a problem.

What we can do is dynamically identify a Python interpreter and hoodwink it with regard to its own path, causing it to load a virtualenv which is itself entirely relocatable.

As with many other tools, the Python interpreter introspects its `argv[0]` to figure out its own nominal path and identify the home; see [2], [3] for the gory details. This means that when the interpreter is invoked via a symlink under normal execution, the "path" of the interpreter per `argv[0]` is the path of the symlink, with respect to which `pyvenv.cfg` can be identified and the virtualenv activated automatically. The Darwin platform provides the `_NSGetExecutablePath` libc call, which is an alternative mechanism for determining the "path" by which the interpreter is invoked.

This PR introduces a shim tool which can be emplaced as the target of `bin/python`. The shim consults the `pyvenv.cfg` of a conventional virtualenv to find the requested interpreter version, and attempts to delegate to an identified interpreter in such a way as to make the interpreter believe that its path is that of the shim tool.

On a conventional Unix this is as simple as lying about the value of `argv[0]`. On Darwin, the `PYTHONEXECUTABLE` environment variable must also be set to make Python disbelieve the value of `_NSGetExecutablePath`, which is harder to hoodwink.

Using this tool, a relocatable virtualenv can be structured as follows:

```
./pyvenv.cfg                           # conventional
./bin/python -> ./aspect_venv_shim     # customized "interpreter"
./bin/python3 -> ./python              # conventional
./bin/python3.{N} -> ./python          # conventional
./bin/aspect_venv_shim
./lib/python3.${N}/site-packages/...   # conventional; standard contents
```

We should be able to create one of these venvs by updating our `uv` dependency, setting the `relocatable=True` flag when creating virtualenvs, and attempting to specify the custom interpreter path of `./bin/aspect_venv_shim` so that the "python" symlink and its siblings will enter this pipeline.

Using this as our "interpreter" will allow us to pull venv creation forwards from runtime to a normal build action, and allow us to create conda-pack-like structures, which require no unpack post-processing, directly from that venv.

The downside of this approach is that, as with other `exec`-based Python launchers such as Pex or Bazel's `--run_under`, it is likely to interfere with debugging tools that want to instrument a Python process, as the bootloader is not an interpreter and cannot be analyzed as such.

### Changes are visible to end-users: yes

- Searched for relevant documentation and updated as needed: yes
- Breaking change (forces users to change their own code or config): no
- Suggested release notes appear below: no

### Test plan

- [x] Built for macOS; made a relocatable venv, customized the `pyvenv.cfg` to be `home = ./bin/python`, updated the `bin/python` link to be `bin/aspect_venv_shim`, confirmed that invoking the venv's python links did cause the venv to load.
- [x] Built for Linux; made a relocatable venv, customized the `pyvenv.cfg` to be `home = ./bin/python`, updated the `bin/python` link to be `bin/aspect_venv_shim`, confirmed that invoking the venv's python links did cause the venv to load.

### Notes

[1] https://discuss.python.org/t/interpreter-independent-isolated-virtual-environments/5378/53
[2] https://github.com/python/cpython/blob/main/Modules/getpath.py#L293
[3] https://github.com/python/cpython/blob/main/Modules/getpath.c#L774
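
To make the delegation step described above concrete, here is a rough sketch of the logic in Python. It is written in Python purely for illustration, since the shim itself is not an interpreter, and it is not the PR's implementation. The helper names `parse_pyvenv_cfg`, `plan_exec` and `run_shim` are hypothetical, and looking the interpreter up on `PATH` as `python3.N` is an assumption; the real shim may locate interpreters differently.

```python
import os
import shutil
import sys


def parse_pyvenv_cfg(text):
    """Parse the `key = value` lines of a pyvenv.cfg into a dict."""
    cfg = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            cfg[key.strip().lower()] = value.strip()
    return cfg


def plan_exec(shim_path, cfg, args, environ,
              platform=sys.platform, which=shutil.which):
    """Return (interpreter, argv, env) for delegating to a real interpreter.

    Assumption: the version comes from the `version` key (as written by the
    stdlib `venv` module), falling back to `version_info`, and the
    interpreter is found on PATH under the name `python<major>.<minor>`.
    """
    version = cfg.get("version") or cfg.get("version_info")
    if not version:
        raise RuntimeError("pyvenv.cfg does not name a Python version")
    major, minor = version.split(".")[:2]
    name = f"python{major}.{minor}"
    interpreter = which(name)
    if interpreter is None:
        raise RuntimeError(f"no {name} found on PATH")
    # Lie about argv[0]: the interpreter sees the shim's path, not its own,
    # so it looks for pyvenv.cfg relative to the venv.
    argv = [shim_path, *args]
    env = dict(environ)
    if platform == "darwin":
        # On Darwin, argv[0] alone is not enough; see the PR text above.
        env["PYTHONEXECUTABLE"] = shim_path
    return interpreter, argv, env


def run_shim():
    """Entry point sketch: replace this process with the real interpreter."""
    shim_path = os.path.abspath(sys.argv[0])  # keep the symlink path as-is
    venv_root = os.path.dirname(os.path.dirname(shim_path))
    with open(os.path.join(venv_root, "pyvenv.cfg"), encoding="utf-8") as f:
        cfg = parse_pyvenv_cfg(f.read())
    interpreter, argv, env = plan_exec(shim_path, cfg, sys.argv[1:], os.environ)
    os.execve(interpreter, argv, env)
```

Keeping the planning step separate from `os.execve` means the `argv[0]` and `PYTHONEXECUTABLE` handling can be checked without actually replacing the process.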
What is the current behavior?
This is a tracking issue for all things related to venv generation at build time. The biggest issue so far is that a py_binary packaged into a Docker image and run on Lambda simply does not work, since all the paths on Lambda are write-protected.
Current issues with venv
#433
#373
#351
#339
Describe the feature
Generate the venv at build time by making the venv relocatable.
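
As a rough illustration of what "relocatable" rules out, the sketch below scans a venv for the two absolute-path references discussed in this thread: the `home` key in `pyvenv.cfg` and the symlinks under `bin/`. The function name `absolute_references` is hypothetical, and the check is only a minimal sketch assuming a Unix-style venv layout; it does not cover every way a venv can pin a path.

```python
import os


def absolute_references(venv_dir):
    """List places where a venv pins an absolute filesystem path."""
    problems = []

    # 1. An absolute `home` entry in pyvenv.cfg.
    cfg_path = os.path.join(venv_dir, "pyvenv.cfg")
    with open(cfg_path, encoding="utf-8") as f:
        for line in f:
            key, sep, value = line.partition("=")
            if sep and key.strip() == "home" and os.path.isabs(value.strip()):
                problems.append(f"pyvenv.cfg: home = {value.strip()}")

    # 2. Symlinks under bin/ whose target is an absolute path.
    bin_dir = os.path.join(venv_dir, "bin")
    for name in sorted(os.listdir(bin_dir)):
        path = os.path.join(bin_dir, name)
        if os.path.islink(path):
            target = os.readlink(path)
            if os.path.isabs(target):
                problems.append(f"bin/{name} -> {target}")

    return problems
```

An empty result means neither of these two references would break if the venv directory were moved or copied into an image.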