Bytecode compilation output depends on order of files compiled #129724

Open
@konstin

Bug report

Bug description:

This is a minimal reproduction of this downstream bug report: astral-sh/uv#10619

The output of compileall.compile_file depends on the order in which the files are compiled. This means compilation is non-deterministic if builds are distributed over a process pool.

This becomes a problem when building docker images, where you usually bytecode compile ahead of time for faster startup, and where the hash of the image depends on all files in the image, including the .pyc files.

Specifically, the output of

a = {"foo", 2, 3}

def f():
    b = {"foo", 2, 3}

is different if we previously compiled another file with

import foo

Reproducer script:

#!/bin/bash

set -e

script=$(cat << EOF
import compileall
import sys

for path in sys.argv[1:]:
    compileall.compile_file(path)
EOF
)

cat << EOF > a.py
import foo
EOF

cat << EOF > b.py
a = {"foo", 2, 3}

def f():
    b = {"foo", 2, 3}
EOF

# Both files
rm -rf __pycache__
python3.14 -c "$script" a.py b.py
sha256sum __pycache__/b.cpython-314.pyc

# For debugging
cp __pycache__/b.cpython-314.pyc b1.cpython-314.pyc

# Single file only
rm -rf __pycache__
python3.14 -c "$script" b.py
sha256sum __pycache__/b.cpython-314.pyc

# For debugging
cp __pycache__/b.cpython-314.pyc b2.cpython-314.pyc
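Since the issue is also reported on Windows, where bash and sha256sum may not be available, the same check could be sketched in Python. This is an illustrative port of the script above, not part of the original report: the helper names `pyc_digest` and `compare_orders` are hypothetical, it runs whichever interpreter executes it (`sys.executable`) rather than a fixed `python3.14`, and it passes `quiet=1` only to suppress the listing output.

```python
import hashlib
import importlib.util
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

# Same compile loop as the bash reproducer, run in a fresh interpreter each time.
COMPILE_SCRIPT = """
import compileall
import sys

for path in sys.argv[1:]:
    compileall.compile_file(path, quiet=1)
"""


def pyc_digest(workdir: Path, files: list[str], target: str = "b.py") -> str:
    """Compile `files` in order in a new process and hash the .pyc of `target`."""
    shutil.rmtree(workdir / "__pycache__", ignore_errors=True)
    subprocess.run(
        [sys.executable, "-c", COMPILE_SCRIPT, *files],
        cwd=workdir,
        check=True,
    )
    pyc = workdir / importlib.util.cache_from_source(target)
    return hashlib.sha256(pyc.read_bytes()).hexdigest()


def compare_orders() -> tuple[str, str]:
    """Return (digest with a.py compiled first, digest with b.py alone)."""
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        (workdir / "a.py").write_text("import foo\n")
        (workdir / "b.py").write_text(
            'a = {"foo", 2, 3}\n\ndef f():\n    b = {"foo", 2, 3}\n'
        )
        both = pyc_digest(workdir, ["a.py", "b.py"])
        single = pyc_digest(workdir, ["b.py"])
        return both, single


if __name__ == "__main__":
    both, single = compare_orders()
    print("a.py then b.py:", both)
    print("b.py only:     ", single)
    print("identical" if both == single else "DIFFERENT")
```

Whether the two digests differ depends on the interpreter version; the sketch only reports what it observes.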

This is caused by different refcounts in the marshalled files:

import marshal
import sys

# Load the code object from each .pyc saved by the reproducer script
with open("b1.cpython-314.pyc", "rb") as f:
    f.read(16)  # Skip header
    pyc1 = marshal.load(f)

with open("b2.cpython-314.pyc", "rb") as f:
    f.read(16)  # Skip header
    pyc2 = marshal.load(f)

# Compare the refcount of the first constant in each code object
print(sys.getrefcount(pyc1.co_consts[0]))
print(sys.getrefcount(pyc2.co_consts[0]))

This prints 2 and 3.
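To see where the two files diverge on disk, rather than only that their hashes differ, a small helper can compare the marshalled payloads byte by byte. This is a minimal sketch, not part of the original report: `first_difference` is a hypothetical name, and it assumes the same 16-byte header that the snippet above skips.

```python
from pathlib import Path

HEADER_SIZE = 16  # assumed .pyc header length, matching the f.read(16) above


def first_difference(path1, path2):
    """Return (offset, byte1, byte2) for the first differing payload byte.

    Offsets are relative to the end of the header. If one payload is a
    strict prefix of the other, the missing byte is reported as None.
    Returns None when both payloads are identical.
    """
    payload1 = Path(path1).read_bytes()[HEADER_SIZE:]
    payload2 = Path(path2).read_bytes()[HEADER_SIZE:]
    for offset, (x, y) in enumerate(zip(payload1, payload2)):
        if x != y:
            return offset, x, y
    if len(payload1) != len(payload2):
        offset = min(len(payload1), len(payload2))
        byte1 = payload1[offset] if offset < len(payload1) else None
        byte2 = payload2[offset] if offset < len(payload2) else None
        return offset, byte1, byte2
    return None
```

For example, `first_difference("b1.cpython-314.pyc", "b2.cpython-314.pyc")` would report the first diverging byte of the two saved files. If the refcount explanation holds, one would expect the two bytes to differ in marshal's reference flag bit (0x80), but that is an assumption to verify against the actual output.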

The original report is from 3.13; I've reproduced it with 3.14.0a4. It happens at least on Linux and Windows.

CPython versions tested on:

3.14

Operating systems tested on:

Linux

    Labels

    interpreter-core (Objects, Python, Grammar, and Parser dirs); type-feature (A feature request or enhancement)
