Commit a5f87a7

markrwilliams authored and ncoghlan committed
PEP 571: Updated version of the manylinux ABI (GH-565)
manylinux1 is getting old enough now to start making things difficult (specifically around network security), so it's time for a refresh to a slightly newer baseline.
1 parent 69eb650 commit a5f87a7

1 file changed

+343
-0
lines changed

pep-0571.rst

Lines changed: 343 additions & 0 deletions
@@ -0,0 +1,343 @@
PEP: 571
Title: The manylinux2 Platform Tag
Version: $Revision$
Last-Modified: $Date$
Author: Mark Williams <[email protected]>
BDFL-Delegate: Nick Coghlan <[email protected]>
Discussions-To: Distutils SIG <[email protected]>
Status: Active
Type: Informational
Content-Type: text/x-rst
Created:
Post-History:
Resolution:


Abstract
========

This PEP proposes the creation of a ``manylinux2`` platform tag to
succeed the ``manylinux1`` tag introduced by PEP 513 [1]_. It also
proposes that PyPI and ``pip`` both be updated to support uploading,
downloading, and installing ``manylinux2`` distributions on compatible
platforms.

Rationale
=========

True to its name, the ``manylinux1`` platform tag has made the
installation of binary extension modules a reality on many Linux
systems. Libraries like ``cryptography`` [2]_ and ``numpy`` [3]_ are
more accessible to Python developers now that their installation on
common architectures does not depend on fragile development
environments and build toolchains.

``manylinux1`` wheels achieve their portability by allowing the
extension modules they contain to link against only a small set of
system-level shared libraries that export versioned symbols old enough
to benefit from backwards-compatibility policies. Extension modules
in a ``manylinux1`` wheel that rely on ``glibc``, for example, must be
built against version 2.5 or earlier; they may then be run on systems
that provide a more recent ``glibc`` version that still exports the
required symbols at version 2.5.

PEP 513 drew its whitelisted shared libraries and their symbol
versions from CentOS 5.11, which was the oldest supported CentOS
release at the time of its writing. Unfortunately, CentOS 5.11
reached its end-of-life on March 31st, 2017 with a clear warning
against its continued use. [4]_ No further updates, such as security
patches, will be made available. This means that its packages will
remain at obsolete versions that hamper the efforts of Python software
packagers who use the ``manylinux1`` Docker image.

CentOS 6 is now the oldest supported CentOS release, and will receive
maintenance updates through November 30th, 2020. [5]_ We propose that
a new PEP 425-style [6]_ platform tag called ``manylinux2`` be derived
from CentOS 6 and that the ``manylinux`` toolchain, PyPI, and ``pip``
be updated to support it.


The ``manylinux2`` policy
=========================

The following criteria determine a ``linux`` wheel's eligibility for
the ``manylinux2`` tag:

1. The wheel may only contain binary executables and shared objects
   compiled for one of the two architectures supported by CentOS 6:
   x86_64 or i686. [5]_
2. The wheel's binary executables or shared objects may not link
   against externally-provided libraries except those in the following
   whitelist: ::

       libgcc_s.so.1
       libstdc++.so.6
       libm.so.6
       libdl.so.2
       librt.so.1
       libcrypt.so.1
       libc.so.6
       libnsl.so.1
       libutil.so.1
       libpthread.so.0
       libresolv.so.2
       libX11.so.6
       libXext.so.6
       libXrender.so.1
       libICE.so.6
       libSM.so.6
       libGL.so.1
       libgobject-2.0.so.0
       libgthread-2.0.so.0
       libglib-2.0.so.0

   This list is identical to the externally-provided libraries
   whitelisted for ``manylinux1``, minus ``libncursesw.so.5`` and
   ``libpanelw.so.5``. [7]_ ``libpythonX.Y`` remains ineligible for
   inclusion for the same reasons outlined in PEP 513.

   On Debian-based systems, these libraries are provided by the packages:

   ============  =======================================================
   Package       Libraries
   ============  =======================================================
   libc6         libdl.so.2, libresolv.so.2, librt.so.1, libc.so.6,
                 libpthread.so.0, libm.so.6, libutil.so.1, libcrypt.so.1,
                 libnsl.so.1
   libgcc1       libgcc_s.so.1
   libgl1        libGL.so.1
   libglib2.0-0  libgobject-2.0.so.0, libgthread-2.0.so.0, libglib-2.0.so.0
   libice6       libICE.so.6
   libsm6        libSM.so.6
   libstdc++6    libstdc++.so.6
   libx11-6      libX11.so.6
   libxext6      libXext.so.6
   libxrender1   libXrender.so.1
   ============  =======================================================

   On RPM-based systems, they are provided by these packages:

   ============  =======================================================
   Package       Libraries
   ============  =======================================================
   glib2         libglib-2.0.so.0, libgthread-2.0.so.0, libgobject-2.0.so.0
   glibc         libresolv.so.2, libutil.so.1, libnsl.so.1, librt.so.1,
                 libcrypt.so.1, libpthread.so.0, libdl.so.2, libm.so.6,
                 libc.so.6
   libICE        libICE.so.6
   libX11        libX11.so.6
   libXext       libXext.so.6
   libXrender    libXrender.so.1
   libgcc        libgcc_s.so.1
   libstdc++     libstdc++.so.6
   mesa          libGL.so.1
   ============  =======================================================

3. If the wheel contains binary executables or shared objects linked
   against any whitelisted libraries that also export versioned
   symbols, they may only depend on the following maximum versions::

       GLIBC_2.12
       CXXABI_1.3.3
       GLIBCXX_3.4.13
       GCC_4.3.0

   As an example, ``manylinux2`` wheels may include binary artifacts
   that require ``glibc`` symbols at version ``GLIBC_2.4``, because
   this is an earlier version than the maximum of ``GLIBC_2.12``.
4. If a wheel is built for any version of CPython 2 or CPython
   versions 3.0 up to and including 3.2, it *must* include a CPython
   ABI tag indicating its Unicode ABI. A ``manylinux2`` wheel built
   against Python 2.7, then, must include either the ``cp27mu`` tag
   indicating it was built against an interpreter with the UCS-4 ABI
   or the ``cp27m`` tag indicating an interpreter with the UCS-2
   ABI. [8]_ [9]_
5. A wheel *must not* require the ``PyFPE_jbuf`` symbol. This is
   achieved by building it against a Python compiled *without* the
   ``--with-fpectl`` ``configure`` flag.

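Criteria 2 and 3 lend themselves to a mechanical check. The following
is a minimal sketch of such a check, not the implementation used by
``auditwheel``. The function names are hypothetical, and the sketch
assumes that some other tool has already extracted the names of the
externally-provided libraries and the versioned-symbol strings (all with
numeric versions) from the wheel's ELF files.

```python
# Data copied from the policy above; helper names are illustrative only.
WHITELIST = frozenset({
    "libgcc_s.so.1", "libstdc++.so.6", "libm.so.6", "libdl.so.2",
    "librt.so.1", "libcrypt.so.1", "libc.so.6", "libnsl.so.1",
    "libutil.so.1", "libpthread.so.0", "libresolv.so.2", "libX11.so.6",
    "libXext.so.6", "libXrender.so.1", "libICE.so.6", "libSM.so.6",
    "libGL.so.1", "libgobject-2.0.so.0", "libgthread-2.0.so.0",
    "libglib-2.0.so.0",
})

MAX_VERSIONS = {
    "GLIBC": (2, 12),
    "CXXABI": (1, 3, 3),
    "GLIBCXX": (3, 4, 13),
    "GCC": (4, 3, 0),
}


def parse_symbol_version(tag):
    """Split a string such as 'GLIBC_2.4' into ('GLIBC', (2, 4))."""
    name, _, version = tag.rpartition("_")
    return name, tuple(int(part) for part in version.split("."))


def policy_violations(needed_libs, symbol_versions):
    """List the ways the given inputs break criteria 2 and 3."""
    problems = []
    for lib in needed_libs:
        if lib not in WHITELIST:
            problems.append("non-whitelisted library: " + lib)
    for tag in symbol_versions:
        name, version = parse_symbol_version(tag)
        limit = MAX_VERSIONS.get(name)
        # Tuples compare element by element, so (2, 4) <= (2, 12).
        if limit is not None and version > limit:
            problems.append("symbol version too new: " + tag)
    return problems
```

An empty result means the inputs satisfy both criteria; for instance,
``policy_violations(["libc.so.6"], ["GLIBC_2.4"])`` reports no problems.
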
Compilation of Compliant Wheels
===============================

As with ``manylinux1``, the ``auditwheel`` tool adds ``manylinux2``
platform tags to ``linux`` wheels built by ``pip wheel`` or
``bdist_wheel`` in a ``manylinux2`` Docker container.

Docker Images
-------------

``manylinux2`` Docker images based on CentOS 6 x86_64 and i686 are
provided for building binary ``linux`` wheels that can reliably be
converted to ``manylinux2`` wheels. [10]_ These images come with a
full compiler suite installed (``gcc``, ``g++``, and ``gfortran``
4.8.2) as well as the latest releases of Python and ``pip``.

Compatibility with kernels that lack ``vsyscall``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A Docker container assumes that its userland is compatible with its
host's kernel. Unfortunately, an increasingly common kernel
configuration breaks this assumption for x86_64 CentOS 6 Docker
images.

Versions 2.14 and earlier of ``glibc`` require that the kernel provide
an archaic system call optimization known as ``vsyscall`` on x86_64. [11]_
To effect the optimization, the kernel maps a read-only page of
frequently-called system calls -- most notably ``time(2)`` -- into
each process at a fixed memory location. ``glibc`` then invokes these
system calls by dereferencing a function pointer to the appropriate
offset into the ``vsyscall`` page and calling it. This avoids the
overhead associated with invoking the kernel that affects normal
system call invocation. ``vsyscall`` has long been deprecated in
favor of an equivalent mechanism known as vDSO, or "virtual dynamic
shared object", in which the kernel instead maps a relocatable virtual
shared object containing the optimized system calls into each
process. [12]_

The ``vsyscall`` page has serious security implications because it
does not participate in address space layout randomization (ASLR).
Its predictable location and contents make it a useful source of
gadgets used in return-oriented programming attacks. [13]_ At the same
time, its elimination breaks the x86_64 ABI, because ``glibc``
versions that depend on ``vsyscall`` suffer from segmentation faults
when attempting to dereference a system call pointer into a
non-existent page. As a compromise, Linux 3.1 implemented an
"emulated" ``vsyscall`` that reduced the executable code, and thus the
material for ROP gadgets, mapped into the process. [14]_
``vsyscall=emulate`` has been the default configuration in most
distributions' kernels for many years.

Unfortunately, ``vsyscall`` emulation still exposes predictable code
at a reliable memory location, and continues to be useful for
return-oriented programming. [15]_ Because most distributions have now
upgraded to ``glibc`` versions that do not depend on ``vsyscall``,
they are beginning to ship kernels that do not support ``vsyscall`` at
all. [16]_

CentOS 5.11 and 6 both include versions of ``glibc`` that depend on
the ``vsyscall`` page (2.5 and 2.12.2 respectively), so containers
based on either cannot run under kernels provided with many
distributions' upcoming releases. [17]_ Continuum Analytics faces a
related problem with its conda software suite, and as they point out,
this will pose a significant obstacle to using these tools in hosted
services. [18]_ If Travis CI, for example, begins running jobs under
a kernel that does not provide the ``vsyscall`` interface, Python
packagers will not be able to use our Docker images there to build
``manylinux`` wheels. [19]_

We have derived a patch from the ``glibc`` git repository that
backports the removal of all dependencies on ``vsyscall`` to the
version of ``glibc`` included with our ``manylinux2`` image. [20]_
Rebuilding ``glibc``, and thus building the ``manylinux2`` image itself,
still requires a host kernel that provides the ``vsyscall`` mechanism,
but the resulting image can be run both on hosts that provide it and
on those that do not. Because the ``vsyscall`` interface is an
optimization that is only applied to running processes, the
``manylinux2`` wheels built with this modified image should be
identical to those built on an unmodified CentOS 6 system. Also, the
``vsyscall`` problem applies only to x86_64; it is not part of the
i686 ABI.

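Whether a running Linux process has a ``vsyscall`` page can be observed
from its memory map. The following is a minimal sketch, assuming the
kernel labels the mapping ``[vsyscall]`` in ``/proc/self/maps``; the
function names are hypothetical, and this check is not part of the
``manylinux2`` tooling.

```python
def maps_has_vsyscall(maps_text):
    """Return True if any mapping in the given maps text is labeled [vsyscall]."""
    for line in maps_text.splitlines():
        fields = line.split()
        # The pathname or label, when present, is the last field on the line.
        if fields and fields[-1] == "[vsyscall]":
            return True
    return False


def host_has_vsyscall(maps_path="/proc/self/maps"):
    """Inspect this process's memory map; return None if it cannot be read."""
    try:
        with open(maps_path) as maps_file:
            return maps_has_vsyscall(maps_file.read())
    except OSError:
        # Not Linux, or /proc is unavailable.
        return None
```

A result of ``False`` from ``host_has_vsyscall()`` on an x86_64 host
would suggest that an unpatched CentOS 6 container is likely to crash
there.
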
Auditwheel
----------

The ``auditwheel`` tool has also been updated to produce
``manylinux2`` wheels. [21]_ Its behavior and purpose are otherwise
unchanged from PEP 513.


Platform Detection for Installers
=================================

Platforms may define a ``manylinux2_compatible`` boolean attribute on
the ``_manylinux`` module described in PEP 513. A platform is
considered incompatible with ``manylinux2`` if the attribute is
``False``.


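The check an installer might perform can be sketched as follows. This
is an illustration modeled on the ``is_manylinux1_compatible`` function
in PEP 513, not a normative algorithm. The function names are
hypothetical; treating a true attribute value as conclusive and falling
back to a ``glibc`` 2.12 version check when the attribute is absent are
both assumptions carried over from PEP 513's approach rather than
statements made in this PEP.

```python
def read_manylinux2_override():
    """Return _manylinux.manylinux2_compatible, or None if it is not defined."""
    try:
        import _manylinux
    except ImportError:
        return None
    return getattr(_manylinux, "manylinux2_compatible", None)


def is_manylinux2_compatible(platform, glibc_version, override=None):
    """Decide compatibility from the platform string and glibc version tuple."""
    # Only the two architectures supported by CentOS 6 are eligible.
    if platform not in ("linux_x86_64", "linux_i686"):
        return False
    # An explicit attribute on _manylinux takes precedence (assumption:
    # True is treated as conclusive, as in PEP 513).
    if override is not None:
        return bool(override)
    # Assumed fallback: the host glibc must be at least 2.12.
    return glibc_version is not None and tuple(glibc_version) >= (2, 12)
```

A caller would combine the two, for example
``is_manylinux2_compatible("linux_x86_64", (2, 17), read_manylinux2_override())``,
obtaining the ``glibc`` version by a method such as the ``ctypes``
approach described in PEP 513.
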
Backwards compatibility with ``manylinux1`` wheels
==================================================

As explained in PEP 513, the specified symbol versions for
``manylinux1`` whitelisted libraries constitute an *upper bound*. The
same is true for the symbol versions defined for ``manylinux2`` in
this PEP. As a result, ``manylinux1`` wheels are considered
``manylinux2`` wheels. A ``pip`` that recognizes the ``manylinux2``
platform tag will thus install ``manylinux1`` wheels for
``manylinux2`` platforms -- even when explicitly set -- when no
``manylinux2`` wheels are available. [22]_

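One way to picture this fallback is as an ordered list of platform tags
that an installer tries in turn. The sketch below is illustrative only
and is not ``pip``'s implementation; the function names and the boolean
inputs are hypothetical.

```python
def platform_tag_preference(arch, manylinux2_ok, manylinux1_ok):
    """Return platform tags from most to least preferred for a Linux host."""
    tags = []
    if manylinux2_ok:
        tags.append("manylinux2_" + arch)
    # manylinux1 symbol versions fall within the manylinux2 upper bound,
    # so a manylinux2-compatible host can also use manylinux1 wheels.
    if manylinux2_ok or manylinux1_ok:
        tags.append("manylinux1_" + arch)
    tags.append("linux_" + arch)
    return tags


def pick_wheel(available_platform_tags, preference):
    """Return the most preferred tag that is actually available, if any."""
    for tag in preference:
        if tag in available_platform_tags:
            return tag
    return None
```

With only a ``manylinux1_x86_64`` wheel published, a
``manylinux2``-compatible host would select it; once a
``manylinux2_x86_64`` wheel appears, that one is preferred.
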
PyPI Support
============

PyPI should permit wheels containing the ``manylinux2`` platform tag
to be uploaded in the same way that it permits ``manylinux1``. It
should not attempt to verify the compatibility of ``manylinux2``
wheels.


References
==========

.. [1] PEP 513 -- A Platform Tag for Portable Linux Built Distributions
   (https://www.python.org/dev/peps/pep-0513/)
.. [2] pyca/cryptography
   (https://cryptography.io/)
.. [3] numpy
   (https://numpy.org)
.. [4] CentOS 5.11 EOL announcement
   (https://lists.centos.org/pipermail/centos-announce/2017-April/022350.html)
.. [5] CentOS Product Specifications
   (https://web.archive.org/web/20180108090257/https://wiki.centos.org/About/Product)
.. [6] PEP 425 -- Compatibility Tags for Built Distributions
   (https://www.python.org/dev/peps/pep-0425/)
.. [7] ncurses 5 -> 6 transition means we probably need to drop some
   libraries from the manylinux whitelist
   (https://github.com/pypa/manylinux/issues/94)
.. [8] PEP 3149
   (https://www.python.org/dev/peps/pep-3149/)
.. [9] SOABI support for Python 2.X and PyPy
   (https://github.com/pypa/pip/pull/3075)
.. [10] manylinux2 Docker images
   (https://hub.docker.com/r/markrwilliams/manylinux2/)
.. [11] On vsyscalls and the vDSO
   (https://lwn.net/Articles/446528/)
.. [12] vdso(7)
   (http://man7.org/linux/man-pages/man7/vdso.7.html)
.. [13] Framing Signals -- A Return to Portable Shellcode
   (http://www.cs.vu.nl/~herbertb/papers/srop_sp14.pdf)
.. [14] ChangeLog-3.1
   (https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.1)
.. [15] Project Zero: Three bypasses and a fix for one of Flash's Vector.<*> mitigations
   (https://googleprojectzero.blogspot.com/2015/08/three-bypasses-and-fix-for-one-of.html)
.. [16] linux: activate CONFIG_LEGACY_VSYSCALL_NONE ?
   (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=852620)
.. [17] [Wheel-builders] Heads-up re: new kernel configurations breaking the manylinux docker image
   (https://mail.python.org/pipermail/wheel-builders/2016-December/000239.html)
.. [18] Due to glibc 2.12 limitation, static executables that use
   time(), cpuinfo() and maybe a few others cannot be run on systems
   that do not support or use ``vsyscall=emulate``
   (https://github.com/ContinuumIO/anaconda-issues/issues/8203)
.. [19] Travis CI
   (https://travis-ci.org/)
.. [20] remove-vsyscall.patch
   (https://github.com/markrwilliams/manylinux/commit/e9493d55471d153089df3aafca8cfbcb50fa8093#diff-3eda4130bdba562657f3ec7c1b3f5720)
.. [21] auditwheel manylinux2 branch
   (https://github.com/markrwilliams/auditwheel/tree/manylinux2)
.. [22] pip manylinux2 branch
   (https://github.com/markrwilliams/pip/commits/manylinux2)


Copyright
=========

This document has been placed into the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
