Add topic guide: Repeatable Installs #10100

Merged
merged 2 commits into from
Jul 13, 2021
6 changes: 3 additions & 3 deletions docs/html/cli/pip_install.rst
@@ -519,9 +519,9 @@ of having the wheel cache disabled is thus extra build time for sdists, and
this can be solved by making sure pre-built wheels are available from the index
server.

Hash-checking mode also works with :ref:`pip download` and :ref:`pip wheel`. A
:ref:`comparison of hash-checking mode with other repeatability strategies
<Repeatability>` is available in the User Guide.
Hash-checking mode also works with :ref:`pip download` and :ref:`pip wheel`.
See :doc:`../topics/repeatable-installs` for a comparison of hash-checking mode
with other repeatability strategies.

.. warning::

1 change: 1 addition & 0 deletions docs/html/topics/index.md
@@ -13,5 +13,6 @@ This section of the documentation is currently being fleshed out. See
authentication
caching
configuration
repeatable-installs
vcs-support
```
98 changes: 98 additions & 0 deletions docs/html/topics/repeatable-installs.md
@@ -0,0 +1,98 @@
# Repeatable Installs

pip can be used to achieve various levels of repeatable environments. This page
walks through increasingly strict definitions of what "repeatable" means.

## Pinning the package versions

Pinning package versions of your dependencies in the requirements file
protects you from bugs or incompatibilities in newly released versions:

```
SomePackage == 1.2.3
DependencyOfSomePackage == 4.5.6
```

```{note}
Pinning refers to using the `==` operator to require the package to be a
specific version.
```

A requirements file containing pinned package versions can be generated using
{ref}`pip freeze`. This includes not only the top-level packages, but also all
of their transitive dependencies. Performing the installation using
{ref}`--no-deps <install_--no-deps>` provides an extra dose of insurance
against installing anything not explicitly listed.

This strategy is easy to implement and works across OSes and architectures.
However, it trusts the locations you're fetching the packages from (like PyPI)
and the certificate authority chain. It also relies on those locations not
allowing packages to change without a version increase. (PyPI does protect
against this.)
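
As a rough illustration of what "fully pinned" means, the sketch below scans
the text of a requirements file and reports any line that does not use `==`.
The `unpinned_lines` helper and its regular expression are hypothetical and
deliberately simplified; pip's own requirements parser accepts many more forms
(extras, environment markers, URLs, other options).

```python
import re

# Simplified pattern (an assumption for illustration, not pip's parser):
# "name == version", optionally followed by --hash options and a comment.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*\s*==\s*[^\s=<>!~*,;]+"
    r"(\s+--hash=\S+)*\s*(#.*)?$"
)


def unpinned_lines(requirements_text):
    """Return the non-blank, non-comment lines that are not pinned with ==."""
    problems = []
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


example = """\
SomePackage == 1.2.3
DependencyOfSomePackage == 4.5.6
AnotherPackage >= 2.0
"""
print(unpinned_lines(example))
```

A check like this could run in CI to catch a range specifier that slips into a
file that is meant to be fully pinned.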

## Hash-checking

Beyond pinning version numbers, you can add hashes against which to verify
downloaded packages:

```none
FooProject == 1.2 --hash=sha256:2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```

This protects against a compromise of PyPI or the HTTPS certificate chain. It
also guards against a package changing without its version number changing (on
indexes that allow this). This approach is a good fit for automated server
deployments.
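
To make the format of such a line concrete, here is a minimal sketch that
computes the SHA-256 digest of a local archive with Python's `hashlib` and
formats it as a requirements-file entry. The `requirement_with_hash` helper
and the file name are hypothetical; in practice the `pip hash` command or a
tool such as pip-tools would normally produce these values.

```python
import hashlib
import os
import tempfile


def requirement_with_hash(name, version, archive_path):
    """Build a requirements line pinning `name` to `version` and to the
    SHA-256 digest of the file at `archive_path`."""
    digest = hashlib.sha256()
    with open(archive_path, "rb") as f:
        # Read in chunks so large archives need not fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return f"{name} == {version} --hash=sha256:{digest.hexdigest()}"


# Demo with a throwaway file standing in for a downloaded archive.
with tempfile.TemporaryDirectory() as tmp:
    fake_archive = os.path.join(tmp, "FooProject-1.2.tar.gz")
    with open(fake_archive, "wb") as f:
        f.write(b"hello")
    line = requirement_with_hash("FooProject", "1.2", fake_archive)
print(line)
```

Because the digest covers the exact bytes of the archive, any change to the
file, however small, produces a different hash and fails verification.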

Hash-checking mode is a labour-saving alternative to running a private index
server containing approved packages: it removes the need to upload packages,
maintain ACLs, and keep an audit trail (which a VCS gives you on the
requirements file for free). It can also substitute for a vendored library,
providing easier upgrades and less VCS noise. It does not, of course,
provide the availability benefits of a private index or a vendored library.

[pip-tools] is a package that builds upon pip, and provides a good workflow for
managing and generating requirements files.

[pip-tools]: https://github.com/jazzband/pip-tools#readme

## Using a wheelhouse (AKA Installation Bundles)

{ref}`pip wheel` can be used to generate and package all of a project's
dependencies, with all the compilation performed, into a single directory that
can be converted into a single archive. This archive then allows installation
when index servers are unavailable and avoids time-consuming recompilation.

````{admonition} Example
Creating the bundle, on a modern Unix system:

```
$ tempdir=$(mktemp -d /tmp/wheelhouse-XXXXX)
$ python -m pip wheel -r requirements.txt --wheel-dir=$tempdir
$ cwd=`pwd`
$ (cd "$tempdir"; tar -cjvf "$cwd/bundled.tar.bz2" *)
```

Installing from the bundle, on a modern Unix system:

```
$ tempdir=$(mktemp -d /tmp/wheelhouse-XXXXX)
$ (cd $tempdir; tar -xvf /path/to/bundled.tar.bz2)
$ python -m pip install --force-reinstall --no-index --no-deps $tempdir/*
```
````

Note that such a wheelhouse contains compiled packages, which are typically
OS and architecture-specific, so these archives are not necessarily portable
across machines.
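
The archive steps in the example above can also be sketched with Python's
standard `tarfile` module, which may be handy on systems without a Unix shell.
The helper names below are hypothetical, and the demo uses a placeholder file
where a real wheelhouse would be populated by `python -m pip wheel`.

```python
import os
import tarfile
import tempfile


def bundle_wheelhouse(wheel_dir, archive_path):
    """Pack every file in `wheel_dir` into a bzip2-compressed tar archive."""
    with tarfile.open(archive_path, "w:bz2") as tar:
        for name in sorted(os.listdir(wheel_dir)):
            # arcname drops the directory prefix, like `cd "$tempdir"; tar ... *`.
            tar.add(os.path.join(wheel_dir, name), arcname=name)


def unpack_wheelhouse(archive_path, target_dir):
    """Extract the bundle and return the paths of the unpacked files."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        tar.extractall(target_dir)
    return sorted(os.path.join(target_dir, n) for n in os.listdir(target_dir))


with tempfile.TemporaryDirectory() as tmp:
    wheel_dir = os.path.join(tmp, "wheelhouse")
    os.mkdir(wheel_dir)
    # Placeholder file; a real wheelhouse would be filled by `pip wheel`.
    with open(os.path.join(wheel_dir, "SomePackage-1.2.3-py3-none-any.whl"), "wb") as f:
        f.write(b"placeholder")
    archive = os.path.join(tmp, "bundled.tar.bz2")
    bundle_wheelhouse(wheel_dir, archive)
    out_dir = os.path.join(tmp, "unpacked")
    os.mkdir(out_dir)
    unpacked = [os.path.basename(p) for p in unpack_wheelhouse(archive, out_dir)]
print(unpacked)
```

The unpacked paths can then be passed to `python -m pip install --no-index
--no-deps`, as in the shell example.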

Hash-checking mode can also be used along with this method (since this uses a
requirements file as well), to ensure that future archives are built with
identical packages.

```{warning}
Beware of the `setup_requires` keyword arg in {file}`setup.py`. The (rare)
packages that use it will cause those dependencies to be downloaded by
setuptools directly, skipping pip's protections. If you need to use such a
package, see {ref}`Controlling setup_requires <controlling-setup-requires>`.
```
Comment on lines +93 to +98
Member


This reads a bit confusing to me at first since a wheelhouse contains only wheels and setup_requires shouldn’t affect it? Then I realise this is probably talking about the population phase (i.e. pip wheel can use hash-checking mode, but that does not protect you from requirements listed in setup_requires). Maybe this could be restructured to make it clearer.

Also, should this also be mentioned in the hash-checking section above?

Member Author


Hmm... I think I just borrowed this verbatim from the original text we had already.


83 changes: 2 additions & 81 deletions docs/html/user_guide.rst
@@ -122,7 +122,7 @@ installed by pip in any particular order.
In practice, there are 4 common uses of Requirements files:

1. Requirements files are used to hold the result from :ref:`pip freeze` for the
purpose of achieving :ref:`repeatable installations <Repeatability>`. In
purpose of achieving :doc:`topics/repeatable-installs`. In
this case, your requirement file contains a pinned version of everything that
was installed when ``pip freeze`` was run.

@@ -762,86 +762,7 @@ is the latest version:
Ensuring Repeatability
======================

pip can achieve various levels of repeatability:

Pinned Version Numbers
----------------------

Pinning the versions of your dependencies in the requirements file
protects you from bugs or incompatibilities in newly released versions::

SomePackage == 1.2.3
DependencyOfSomePackage == 4.5.6

Using :ref:`pip freeze` to generate the requirements file will ensure that not
only the top-level dependencies are included but their sub-dependencies as
well, and so on. Perform the installation using :ref:`--no-deps
<install_--no-deps>` for an extra dose of insurance against installing
anything not explicitly listed.

This strategy is easy to implement and works across OSes and architectures.
However, it trusts PyPI and the certificate authority chain. It
also relies on indices and find-links locations not allowing
packages to change without a version increase. (PyPI does protect
against this.)

Hash-checking Mode
------------------

Beyond pinning version numbers, you can add hashes against which to verify
downloaded packages::

FooProject == 1.2 --hash=sha256:2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

This protects against a compromise of PyPI or the HTTPS
certificate chain. It also guards against a package changing
without its version number changing (on indexes that allow this).
This approach is a good fit for automated server deployments.

Hash-checking mode is a labor-saving alternative to running a private index
server containing approved packages: it removes the need to upload packages,
maintain ACLs, and keep an audit trail (which a VCS gives you on the
requirements file for free). It can also substitute for a vendor library,
providing easier upgrades and less VCS noise. It does not, of course,
provide the availability benefits of a private index or a vendor library.

For more, see
:ref:`pip install\'s discussion of hash-checking mode <hash-checking mode>`.

.. _`Installation Bundle`:

Installation Bundles
--------------------

Using :ref:`pip wheel`, you can bundle up all of a project's dependencies, with
any compilation done, into a single archive. This allows installation when
index servers are unavailable and avoids time-consuming recompilation. Create
an archive like this::

$ tempdir=$(mktemp -d /tmp/wheelhouse-XXXXX)
$ python -m pip wheel -r requirements.txt --wheel-dir=$tempdir
$ cwd=`pwd`
$ (cd "$tempdir"; tar -cjvf "$cwd/bundled.tar.bz2" *)

You can then install from the archive like this::

$ tempdir=$(mktemp -d /tmp/wheelhouse-XXXXX)
$ (cd $tempdir; tar -xvf /path/to/bundled.tar.bz2)
$ python -m pip install --force-reinstall --ignore-installed --upgrade --no-index --no-deps $tempdir/*

Note that compiled packages are typically OS- and architecture-specific, so
these archives are not necessarily portable across machines.

Hash-checking mode can be used along with this method to ensure that future
archives are built with identical packages.

.. warning::

Finally, beware of the ``setup_requires`` keyword arg in :file:`setup.py`.
The (rare) packages that use it will cause those dependencies to be
downloaded by setuptools directly, skipping pip's protections. If you need
to use such a package, see :ref:`Controlling
setup_requires<controlling-setup-requires>`.
This is now covered in :doc:`topics/repeatable-installs`.

.. _`Fixing conflicting dependencies`:
