[FR]: Generate venv at build time #522

Open
thesayyn opened this issue Jan 28, 2025 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments

@thesayyn
Member

What is the current behavior?

This is a tracking issue for all things related to venv generation at build time. The biggest issue so far is that a py_binary packaged into a Docker image and run on Lambda simply does not work, because all paths on Lambda are write-protected.

Current issues with venv

#433
#373
#351
#339

Describe the feature

Generate the venv at build time by making the venv relocatable.

@thesayyn thesayyn added the enhancement New feature or request label Jan 28, 2025
@thesayyn
Member Author

Blocker: #339 (comment)

@vinnybod
Contributor

vinnybod commented Feb 6, 2025

I think I am blocked on this.

I am trying to create a Python environment in an OCI image that has a few dependencies pre-installed. It doesn't seem possible to have the venv already created in the image. It also doesn't seem possible to symlink the venv into the `$PATH` and still have it contain the third-party dependencies.

https://github.com/vinnybod/bazel-examples/tree/python-oci/python-oci

@arrdem arrdem self-assigned this Mar 4, 2025
arrdem added a commit that referenced this issue Apr 28, 2025
Related to #522 and associated issues.

This PR introduces a new venv tool which, for lack of a better word, I'm
calling the "shim".

While standard interpreters do support relative paths to the
interpreter in `pyvenv.cfg` (for instance, `home = ./bin/python` is
legal), we still need a `bin/python` which is portable. Existing
relocatable virtual environment solutions (uv, conda) still ultimately
create an absolute path reference both in `pyvenv.cfg` and usually in
the `bin/python` symlink to a specific interpreter on the filesystem. uv
does this statically, as is standard; Conda also does this statically,
with an explicit update process as part of the conda unpack.

We can't create relocatable symlinks, and we also can't dynamically
correct static symlinks without continuing to have #339 as a problem.

What we can do is dynamically identify a Python interpreter and hoodwink
it with regards to its own path, causing it to load a virtualenv which
is itself entirely relocatable.

As with many other tools, the Python interpreter introspects its
`argv[0]` to figure out its own nominal path and identify the home, see
[2], [3] for the gory details. This means that when the interpreter is
invoked via a symlink under normal execution, the "path" of the
interpreter per `argv[0]` is the path of the symlink with respect to
which `pyvenv.cfg` can be identified and the virtualenv activated
automatically. The Darwin platform provides the `_NSGetExecutablePath`
libc call which is an alternative mechanism for determining the "path"
by which the interpreter is invoked.
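
As a rough illustration of the lookup described above, the sketch below
searches for `pyvenv.cfg` relative to the path an interpreter was invoked
by, without resolving symlinks. The function name `find_pyvenv_cfg` is
hypothetical, and the "same directory, then one level up" search order is
a simplification of what [2] does, not a copy of it.

```python
from pathlib import Path
from typing import Optional


def find_pyvenv_cfg(executable: str) -> Optional[Path]:
    """Look for pyvenv.cfg next to the executable, then one directory up.

    The path is deliberately NOT resolved: if `executable` is a symlink,
    the search is relative to the symlink's location, not its target.
    (Hypothetical helper; a simplification of CPython's getpath logic.)
    """
    exe_dir = Path(executable).absolute().parent
    for candidate in (exe_dir / "pyvenv.cfg", exe_dir.parent / "pyvenv.cfg"):
        if candidate.is_file():
            return candidate
    return None
```

Because the symlink is never followed, a `bin/python` link inside a venv
leads to that venv's `pyvenv.cfg` no matter where the real interpreter
lives.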

This PR introduces a shim tool which can be emplaced as the target of
`bin/python`, which will consult the `pyvenv.cfg` of a conventional
virtualenv to find the requested interpreter version, and will attempt
to delegate to an identified interpreter in such a way as to make the
interpreter believe that its path is that of the shim tool.
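
To make the "consult the `pyvenv.cfg`" step concrete, here is a minimal
sketch of reading the requested version out of the file. It is not the
shim's actual implementation: the helper names are hypothetical, and it
assumes the version is recorded under a `version` or `version_info` key,
which may vary between venv creators.

```python
from typing import Dict, Optional, Tuple


def parse_pyvenv_cfg(text: str) -> Dict[str, str]:
    """Parse `key = value` lines; keys are lowercased, whitespace trimmed."""
    config: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that have no "=" at all
            config[key.strip().lower()] = value.strip()
    return config


def requested_version(config: Dict[str, str]) -> Optional[Tuple[int, int]]:
    """Return (major, minor) from `version` or `version_info`, if present.

    Assumption: the value looks like "3.11.4" or "3.11.4.final.0".
    """
    raw = config.get("version") or config.get("version_info")
    if not raw:
        return None
    parts = raw.split(".")
    try:
        return int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        return None
```

Only the major and minor components are kept here, on the assumption
that this is enough to pick a compatible interpreter.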

On a conventional Unix this is as simple as lying about the value of
`argv[0]`. On Darwin, the `PYTHONEXECUTABLE` environment variable must
be set to make Python disbelieve the value of `_NSGetExecutablePath`,
which is harder to hoodwink.
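
The delegation step can be sketched in Python as follows. This is an
illustration of the technique only, not the shim's code: `build_exec_plan`
and `delegate` are hypothetical names, and the interpreter path is assumed
to have been discovered already. The only library call relied on is
`os.execve`, which replaces the current process.

```python
import os
import sys
from typing import Dict, List, Mapping, Tuple


def build_exec_plan(
    shim_path: str,
    interpreter: str,
    args: List[str],
    env: Mapping[str, str],
    platform: str = sys.platform,
) -> Tuple[str, List[str], Dict[str, str]]:
    """Compute (program, argv, env) for exec'ing the real interpreter.

    argv[0] is set to the shim's path rather than the interpreter's, so
    the interpreter derives its "own" location from the venv.
    """
    new_env = dict(env)
    if platform == "darwin":
        # On macOS, PYTHONEXECUTABLE overrides the path Python would
        # otherwise obtain from the C runtime.
        new_env["PYTHONEXECUTABLE"] = shim_path
    argv = [shim_path, *args]
    return interpreter, argv, new_env


def delegate(shim_path: str, interpreter: str, args: List[str]) -> None:
    """Replace this process with the interpreter (does not return)."""
    program, argv, env = build_exec_plan(shim_path, interpreter, args, os.environ)
    os.execve(program, argv, env)
```

Keeping the plan computation separate from the `exec` call is a design
choice made here so the interesting part can be checked without actually
replacing the process.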

Using this tool, a relocatable virtualenv can be structured as follows:

```
./pyvenv.cfg                          # conventional
./bin/python -> ./aspect_venv_shim    # customized "interpreter"
./bin/python3 -> ./python             # conventional
./bin/python3.{N} -> ./python         # conventional
./bin/aspect_venv_shim
./lib/python3.{N}/site-packages/...   # conventional; standard contents
```
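
For illustration, the layout above can be reproduced with purely relative
symlinks inside `bin/`, which is what lets the tree be moved without
fixing anything up. The sketch below is hypothetical throughout: the shim
is a placeholder file rather than the real binary, `make_layout` is not
part of any tool, and the `pyvenv.cfg` contents are a stand-in.

```python
import os
from pathlib import Path


def make_layout(root: Path, minor: int) -> None:
    """Create the directory skeleton shown above under `root` (POSIX only)."""
    bin_dir = root / "bin"
    bin_dir.mkdir(parents=True)
    (root / "lib" / f"python3.{minor}" / "site-packages").mkdir(parents=True)

    # Placeholder config; a real venv would carry more keys.
    (root / "pyvenv.cfg").write_text("home = ./bin/python\n")

    # Placeholder standing in for the real shim binary.
    shim = bin_dir / "aspect_venv_shim"
    shim.write_text("#!/bin/sh\necho placeholder shim\n")
    shim.chmod(0o755)

    # Link targets are relative to bin/, so they survive moving `root`.
    os.symlink("aspect_venv_shim", bin_dir / "python")
    os.symlink("python", bin_dir / "python3")
    os.symlink("python", bin_dir / f"python3.{minor}")
```

After moving `root` elsewhere, `bin/python3` still resolves to the
`aspect_venv_shim` in the new location, since no link stores an absolute
path.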

We should be able to create one of these venvs by updating our `uv`
dependency, setting the `relocatable=True` flag when creating
virtualenvs, and attempting to specify the custom interpreter path of
`./bin/aspect_venv_shim` so that the "python" symlink and its siblings
will enter this pipeline.

Using this as our "interpreter" will allow us to pull venv creation
forwards from runtime to a normal build action, and allow us to create
conda-pack like structures which require no unpack post-processing
directly from that venv.

The downside of this approach is that, as with other `exec`-based Python
launchers such as Pex or Bazel's `--run_under`, it is likely to interfere
with debugging tools that want to instrument a Python process, as the
bootloader is not an interpreter and cannot be analyzed as such.

### Changes are visible to end-users: yes

- Searched for relevant documentation and updated as needed: yes
- Breaking change (forces users to change their own code or config): no
- Suggested release notes appear below: no

### Test plan

- [x] Built for macOS; made a relocatable venv, customized the
`pyvenv.cfg` to be `home = ./bin/python`, updated the `bin/python` link
to be `bin/aspect_venv_shim`, confirmed that invoking the venv's python
links did cause the venv to load.
- [x] Built for Linux; made a relocatable venv, customized the
`pyvenv.cfg` to be `home = ./bin/python`, updated the `bin/python` link
to be `bin/aspect_venv_shim`, confirmed that invoking the venv's python
links did cause the venv to load.

### Notes
[1] https://discuss.python.org/t/interpreter-independent-isolated-virtual-environments/5378/53
[2] https://github.com/python/cpython/blob/main/Modules/getpath.py#L293
[3] https://github.com/python/cpython/blob/main/Modules/getpath.c#L774