arrdem (Collaborator) commented Aug 26, 2025

This patch reworks how the interpreter shim operates to eliminate the data dependency on the activate script having been evaluated, and to have the launcher resolve the parameters which previously came via the shell environment.

Because the behavior of activate isn't actually specified, we previously assumed that it was safe to just extend activate. It turns out this isn't true at all. Under normal operation, the interpreter initialization implemented in Modules/getpath.py provides special behavior when the interpreter's observed filename (argv[0]) is a symlink, as it is in a normal non-relocatable virtualenv. In this case the interpreter will search for a pyvenv.cfg relative to the "interpreter" and implicitly activate the venv as needed.

Because our ./bin/python is an actual binary rather than a symlink, this behavior is interrupted, which means we're on our own to make getpath and site do the needful.
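
For reference, the pyvenv.cfg lookup that a symlinked ./bin/python triggers can be sketched roughly as follows. This is a minimal illustration assuming the PEP 405 rule (look next to the executable, then one directory up); the names find_pyvenv_cfg and read_home are hypothetical, and this is not the actual code in Modules/getpath.py.

```python
from pathlib import Path
from typing import Optional


def find_pyvenv_cfg(executable: Path) -> Optional[Path]:
    """Return the pyvenv.cfg adjacent to, or one level above, the executable.

    The executable path is deliberately NOT resolved: a symlinked
    ./bin/python has to be located through the link, not its target.
    """
    exe_dir = executable.parent
    for candidate in (exe_dir / "pyvenv.cfg", exe_dir.parent / "pyvenv.cfg"):
        if candidate.is_file():
            return candidate
    return None


def read_home(cfg: Path) -> Optional[str]:
    """Extract the value of the home key from a pyvenv.cfg, if present."""
    for line in cfg.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "home":
            return value.strip()
    return None
```

When the launcher is a real binary rather than a link into the interpreter's install, the launcher has to supply the equivalent of the home value itself.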

This should be as simple as setting the home= key in the pyvenv.cfg, but there isn't a principled way, other than going through the runfiles library, to locate where the interpreter lands in the runfiles tree relative to the virtualenv. Ideally we'd set the home= key to a relative path and mostly move on. Instead, due in part to bzlmod munging, we have to do the repo mapping dance, which can't be done statically. So we need to do rlocation somewhere.

The solution this patch comes to is moving that rlocation logic out of the activate script and into the launcher itself so that the launcher can search for the needed $PYTHONHOME and desired interpreter without the user having previously loaded the required parameters into the env via activate.
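
To illustrate why the lookup can't be a static relative path, here is a simplified Python sketch of an rlocation-style lookup with repo mapping. It assumes a manifest of "rlocation-path real-path" lines and a repo mapping of "source repo,apparent name,canonical name" lines. The function names and the repository names in the usage note are hypothetical, and this is neither the shim's Rust code nor the real runfiles library.

```python
from typing import Dict, Optional, Tuple


def parse_manifest(text: str) -> Dict[str, str]:
    """Parse manifest lines of the form '<rlocation path> <real path>'."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        if not line:
            continue
        key, _, real = line.partition(" ")
        entries[key] = real
    return entries


def parse_repo_mapping(text: str) -> Dict[Tuple[str, str], str]:
    """Parse lines of the form '<source repo>,<apparent name>,<canonical name>'."""
    mapping: Dict[Tuple[str, str], str] = {}
    for line in text.splitlines():
        if not line:
            continue
        source, apparent, canonical = line.split(",", 2)
        mapping[(source, apparent)] = canonical
    return mapping


def rlocation(
    path: str,
    manifest: Dict[str, str],
    repo_mapping: Dict[Tuple[str, str], str],
    source_repo: str = "",
) -> Optional[str]:
    """Resolve 'apparent_repo/rest' to a real path, or None if unknown."""
    repo, _, rest = path.partition("/")
    # Translate the apparent repo name as seen from source_repo; fall back
    # to the name unchanged when no mapping entry exists.
    canonical = repo_mapping.get((source_repo, repo), repo)
    return manifest.get(f"{canonical}/{rest}")
```

With a mapping entry such as ",my_python,ext~my_python", a request for my_python/bin/python3 resolves through ext~my_python/bin/python3 in the manifest. The canonical name is only known once the mapping file is read at run time, which is why the launcher does this work.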

Fixes #594.
Fixes #631.

Changes are visible to end-users: yes

  • Searched for relevant documentation and updated as needed: yes
  • Breaking change (forces users to change their own code or config): yes
  • Suggested release notes appear below: yes

The py_venv_binary and friends no longer depend on their virtualenvs being activated via the shell. Their interpreters can safely be invoked directly, for example under an IDE or in a Spark environment.

The py_venv_binary now includes neither the user's site-packages nor the interpreter's site-packages directory unless the enable_system_site_packages or enable_user_site_packages attributes are explicitly set. This prevents accidental non-hermetic imports.

Test plan

  • Covered by existing test cases
  • Manual testing; please provide instructions so we can reproduce:
aspect build //py/tests/py-internal-venv:test && bazel-bin/py/tests/py-internal-venv/test.runfiles/aspect_rules_py/py/tests/py-internal-venv/.test/bin/python -c 'import sys; from pprint import pprint; pprint(sys.path)'
INFO: Analyzed target //py/tests/py-internal-venv:test (2 packages loaded, 922 targets configured).
INFO: From Compiling Rust bin shim_macos_aarch64_build (1 files):
warning: fields `cfg` and `version_info` are never read
  --> py/tools/venv_shim/src/main.rs:45:5
   |
43 | struct PyCfg {
   |        ----- fields in this struct
44 |     root: PathBuf,
45 |     cfg: PathBuf,
   |     ^^^
46 |     version_info: String,
   |     ^^^^^^^^^^^^
   |
   = note: `PyCfg` has a derived impl for the trait `Debug`, but this is intentionally ignored during dead code analysis
   = note: `#[warn(dead_code)]` on by default

warning: 1 warning emitted

INFO: Found 1 target...
Target //py/tests/py-internal-venv:test up-to-date:
  bazel-bin/py/tests/py-internal-venv/test
INFO: Elapsed time: 10.289s, Critical Path: 6.72s
INFO: 4 processes: 1 internal, 3 darwin-sandbox.
INFO: Build completed successfully, 4 total actions
"bazel-bin/py/tests/py-internal-venv/test.runfiles/MANIFEST"
['',
 '/private/var/tmp/_bazel_arrdem/93bfea6cdc1153cc29a75400cd38823a/external/python_toolchain_aarch64-apple-darwin/lib/python39.zip',
 '/private/var/tmp/_bazel_arrdem/93bfea6cdc1153cc29a75400cd38823a/external/python_toolchain_aarch64-apple-darwin/lib/python3.9',
 '/private/var/tmp/_bazel_arrdem/93bfea6cdc1153cc29a75400cd38823a/external/python_toolchain_aarch64-apple-darwin/lib/python3.9/lib-dynload',
 '/Users/arrdem/Documents/work/aspect/rules_py/bazel-bin/py/tests/py-internal-venv/test.runfiles/aspect_rules_py/py/tests/py-internal-venv/.test/lib/python3.9/site-packages',
 '/Users/arrdem/Documents/work/aspect/rules_py/bazel-bin/py/tests/py-internal-venv/test.runfiles/aspect_rules_py/py/tests/py-internal-venv',
 '/Users/arrdem/Documents/work/aspect/rules_py/bazel-bin/py/tests/py-internal-venv/test.runfiles/aspect_rules_py']
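
One way to check a sys.path like the one printed above for non-hermetic entries is sketched below. The helper name entries_outside and the idea of passing the runfiles and toolchain directories as allowed roots are assumptions for illustration, not part of this patch.

```python
import os
from typing import Iterable, List


def entries_outside(paths: Iterable[str], allowed_roots: Iterable[str]) -> List[str]:
    """Return the sys.path entries that do not live under any allowed root."""
    roots = [os.path.normpath(r) for r in allowed_roots]
    offenders: List[str] = []
    for entry in paths:
        if entry == "":  # '' stands for the current directory under `python -c`
            continue
        norm = os.path.normpath(entry)
        if not any(norm == r or norm.startswith(r + os.sep) for r in roots):
            offenders.append(entry)
    return offenders
```

Called with sys.path and the two roots, an empty result would indicate that neither a user site-packages directory nor anything else outside the toolchain and runfiles tree leaked in.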

TODO

  • Need to add automated tests covering entrypoint bypass
  • Need to add automated tests covering the usersite behavior
  • Need to add automated tests covering the system site behavior

This patch reworks how the interpreter shim operates to eliminate the
data dependency on the `activate` script having been evaluated, and to
have the launcher resolve the parameters which previously came via the
shell environment.

Because the behavior of `activate` isn't actually specified, we
previously assumed that it was safe to just extend `activate`. It turns
out this isn't true at all. Under normal operation, the interpreter
initialization implemented in `Modules/getpath.py` provides special
behavior when the interpreter's observed filename (`argv[0]`) is a
symlink, as the `./bin/python` link is in a normal non-relocatable
virtualenv. In that case the interpreter will search for a `pyvenv.cfg`
and implicitly activate the venv as needed.

Because our `./bin/python` is an actual binary tool rather than a
symlink, this behavior is interrupted, which means we're on our own to
make `getpath` and `site` do the needful.

This should be as simple as setting the `home=` key in the `pyvenv.cfg`,
but there isn't a principled way, other than going through the runfiles
library, to locate where the interpreter lands in the runfiles tree
relative to the virtualenv. Ideally we'd set the `home=` key to a
relative path and mostly move on. Instead, due in part to bzlmod
munging, we have to do the repo mapping dance, which can't be done
statically. So we need to do `rlocation` somewhere.

The solution this patch comes to is moving that `rlocation` logic out of
the `activate` script and into the launcher itself so that the launcher
can search for the needed `$PYTHONHOME` and desired interpreter without
the user having previously loaded the required parameters into the env
via `activate`.

Fixes #594 at the cost of making the shim binary unusable outside the
context of `rules_py`.

github-actions bot commented Aug 26, 2025

e2e/use_release folder: LCOV of commit f2efb1d during CI #1946

Summary coverage rate:
  lines......: 100.0% (2 of 2 lines)
  functions..: 100.0% (1 of 1 function)
  branches...: no data found

Files changed coverage rate: n/a

aspect-workflows bot commented Aug 26, 2025

Test

Buildkite build #392 is running...

tellett (Contributor) left a comment

LGTM, with 2 comments/questions.

myrrlyn (Contributor) left a comment

I saw you succeed against the test suite last night. Everything I've noted down is for Rust style rather than Aspect logic, so it is up to you whether you change anything or not.

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ arrdem
❌ szg-alex-payne

@arrdem arrdem enabled auto-merge (squash) September 3, 2025 02:56
@arrdem arrdem disabled auto-merge September 3, 2025 02:56
@arrdem arrdem merged commit c340913 into main Sep 3, 2025
14 of 16 checks passed
@arrdem arrdem deleted the arrdem/interpreter-sans-activate branch September 3, 2025 02:56

ctcjab commented Sep 9, 2025

@arrdem, just tested out 1.6.4-rc1, which includes this patch. Running /path/to/.app+app_test.venv/bin/python -V without activating first no longer gives "Error: × $VIRTUAL_ENV was unbound! A venv must be activated", but now it panics with:

thread 'main' panicked at py/tools/venv_shim/src/main.rs:199:40:
called `Result::unwrap()` on an `Err` value: RunfilesDirNotFound
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

This is just the most basic initial test I could think of, hopefully it's a valid one.

I also tried running python -V after activating, and got a similar crash:

❯ source .app+app_test.venv/bin/activate 

❯ which python
~/clones/shorty/.app+app_test.venv/bin/python

❯ python -VV
thread 'main' panicked at py/tools/venv_shim/src/main.rs:199:40:
called `Result::unwrap()` on an `Err` value: RunfilesDirIoError(Os { code: 2, kind: NotFound, message: "No such file or directory" })
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I also tried configuring VSCode to use a rules_py-generated virtualenv and it failed with:
[Image: Screenshot 2025-09-09 at 4 26 25 PM]

Am I doing something wrong?

Successfully merging this pull request may close these issues:

  • [Bug]: running venv python directly no longer possible
  • [Bug]: Cursor IDE fails to recognize venv