Python Enhancement Proposals

PEP 668 – Marking Python base environments as “externally managed”

Author:
Geoffrey Thomas <geofft at ldpreload >, Matthias Klose <doko at ubuntu >, Filipe Laíns <lains at riseup.net>, Donald Stufft <donald at stufft.io>, Tzu-ping Chung <uranusjr at gmail >, Stefano Rivera <stefanor at debian.org>, Elana Hashman <ehashman at debian.org>, Pradyun Gedam <pradyunsg at gmail >
PEP-Delegate:
Paul Moore <p.f.moore at gmail >
Discussions-To:
Discourse thread
Status:
Accepted
Type:
Standards Track
Topic:
Packaging
Created:
18-May-2021
Post-History:
28-May-2021
Resolution:
Discourse message

Attention

This PEP is a historical document. The up-to-date, canonical spec, Externally Managed Environments, is maintained on the PyPA specs page.

See the PyPA specification update process for how to propose changes.

Abstract

A long-standing practical problem for Python users has been conflicts between OS package managers and Python-specific package management tools like pip. These conflicts include both Python-level API incompatibilities and conflicts over file ownership.

Historically, Python-specific package management tools have defaulted to installing packages into an implicit global context. With the standardization and popularity of virtual environments, a better solution for most (but not all) use cases is to use Python-specific package management tools only within a virtual environment.

This PEP proposes a mechanism for a Python installation to communicate to tools like pip that its global package installation context is managed by some means external to Python, such as an OS package manager. It specifies that Python-specific package management tools should neither install nor remove packages into the interpreter’s global context, by default, and should instead guide the end user towards using a virtual environment.

It also standardizes an interpretation of the sysconfig schemes so that, if a Python-specific package manager is about to install a package in an interpreter-wide context, it can do so in a manner that avoids conflicting with the external package manager and reduces the risk of breaking software shipped by the external package manager.

Terminology

A few terms used in this PEP have multiple meanings in the contexts that it spans. For clarity, this PEP uses the following terms in specific ways:

distro
Short for “distribution,” a collection of various sorts of software, ideally designed to work properly together, including (in contexts relevant to this document) the Python interpreter itself, software written in Python, and software written in other languages. That is, this is the sense used in phrases such as “Linux distro” or “Berkeley Software Distribution.”

A distro can be an operating system (OS) of its own, such as Debian, Fedora, or FreeBSD. It can also be an overlay distribution that installs on top of an existing OS, such as Homebrew or MacPorts.

This document uses the short term “distro,” because the term “distribution” has another meaning in Python packaging contexts: a source or binary distribution package of a single piece of Python language software, that is, in the sense of setuptools.dist.Distribution or “sdist”. To avoid confusion, this document does not use the plain term “distribution” at all. In the Python packaging sense, it uses the full phrase “distribution package” or just “package” (see below).

The provider of a distro - the team or company that collects and publishes the software and makes any needed modifications - is its distributor.

package
A unit of software that can be installed and used within Python. That is, this refers to what Python-specific packaging tools tend to call a “distribution package” or simply a “distribution”; the colloquial abbreviation “package” is used in the sense of the Python Package Index.

This document does not use “package” in the sense of an importable name that contains Python modules, though in many cases, a distribution package consists of a single importable package of the same name.

This document generally does not use the term “package” to refer to units of installation by a distro’s package manager (such as .deb or .rpm files). When needed, it uses phrasing such as “a distro’s package.” (Again, in many cases, a Python package is shipped inside a distro’s package named something like python- plus the Python package name.)

Python-specific package manager
A tool for installing, upgrading, and/or removing Python packages in a manner that conforms to Python packaging standards (such as PEP 376 and PEP 427). The most popular Python-specific package manager is pip [1]; other examples include the old Easy Install command [2] as well as direct usage of a setup.py command.

(Conda [3] is a bit of a special case, as the conda command can install much more than just Python packages, making it more like a distro package manager in some senses. Since the conda command generally only operates on Conda-created environments, most of the concerns in this document do not apply to conda when acting as a Python-specific package manager.)

distro package manager
A tool for installing, upgrading, and/or removing a distro’s packages in an installed instance of that distro, which is capable of installing Python packages as well as non-Python packages, and therefore generally has its own database of installed software unrelated to PEP 376. Examples include apt, dpkg, dnf, rpm, pacman, and brew. The salient feature is that if a package was installed by a distro package manager, removing or upgrading it in a way that would satisfy a Python-specific package manager will generally leave a distro package manager in an inconsistent state.

This document also uses phrases like “external package manager” or “system’s package manager” to refer to a distro package manager in certain contexts.

shadow
To shadow an installed Python package is to cause some other package to be preferred for imports without removing any files from the shadowed package. This requires multiple entries on sys.path: if package A 2.0 installs module a.py in one sys.path entry, and package A 1.0 installs module a.py in a later sys.path entry, then import a returns the module from the former, and we say that A 2.0 shadows A 1.0.
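The shadowing behavior described above can be reproduced with two temporary directories. This is only an illustrative sketch: the directory names newer and older and the VERSION attribute are made up for the demonstration, and real packages would of course live in site-packages style directories rather than in a temporary folder.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_shadowing() -> str:
    """Put two copies of a.py on sys.path and report which one gets imported."""
    with tempfile.TemporaryDirectory() as root:
        newer = Path(root, "newer")  # plays the role of the earlier sys.path entry (A 2.0)
        older = Path(root, "older")  # plays the role of the later sys.path entry (A 1.0)
        newer.mkdir()
        older.mkdir()
        (newer / "a.py").write_text('VERSION = "2.0"\n')
        (older / "a.py").write_text('VERSION = "1.0"\n')

        saved_path = list(sys.path)
        sys.path[:0] = [str(newer), str(older)]  # newer comes first, older second
        try:
            importlib.invalidate_caches()
            sys.modules.pop("a", None)  # make sure no previously imported "a" is reused
            module = importlib.import_module("a")
            return module.VERSION
        finally:
            # Restore the interpreter state so the demo has no lasting effect.
            sys.path[:] = saved_path
            sys.modules.pop("a", None)


print(demo_shadowing())
```

Both files stay on disk for the duration of the import; only the search order decides which one wins, which is exactly why shadowing does not require deleting anything.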

Motivation

Thanks to Python’s immense popularity, software distros (by which we mean Linux and other OS distros as well as overlay distros like Homebrew and MacPorts) generally ship Python for two purposes: as a software package to be used in its own right by end users, and as a language dependency for other software in the distro.

For example, Fedora and Debian (and their downstream distros, as well as many others) ship a /usr/bin/python3 binary which provides the python3 command available to end users as well as the #!/usr/bin/python3 shebang for Python-language software included in the distro. Because there are no official binary releases of Python for Linux/UNIX, almost all Python end users on these OSes use the Python interpreter built and shipped with their distro.

The python3 executable available to the users of the distro and the python3 executable available as a dependency for other software in the distro are typically the same binary. This means that if an end user installs a Python package using a tool like pip outside the context of a virtual environment, that package is visible to Python-language software shipped by the distro. If the newly-installed package (or one of its dependencies) is a newer, backwards-incompatible version of a package that was installed through the distro, it may break software shipped by the distro.

This may pose a critical problem for the integrity of distros, which often have package-management tools that are themselves written in Python. For example, it’s possible to unintentionally break Fedora’s dnf command with a pip install command, making it hard to recover.

This applies both to system-wide installs (sudo pip install) as well as user home directory installs (pip install --user), since packages in either location show up on the sys.path of /usr/bin/python3.

There is a worse problem with system-wide installs: if you attempt to recover from this situation with sudo pip uninstall, you may end up removing packages that are shipped by the system’s package manager. In fact, this can even happen if you simply upgrade a package - pip will try to remove the old version of the package, as shipped by the OS. At this point it may not be possible to recover the system to a consistent state using just the software remaining on the system.

Over the past many years, a consensus has emerged that the best way to install Python libraries or applications (when not using a distro’s package) is to use a virtual environment. This approach was popularized by the PyPA virtualenv project, and a simple version of that approach is now available in the Python standard library as venv. Installing a Python package into a virtualenv prevents it from being visible to the unqualified /usr/bin/python3 interpreter and prevents breaking system software.

In some cases, however, it’s useful and intentional to install a Python package from outside of the distro that influences the behavior of distro-shipped commands. This is common in the case of software like Sphinx or Ansible which have a mechanism for writing Python-language extensions. A user may want to use their distro’s version of the base software (for reasons of paid support or security updates) but install a small extension from PyPI, and they’d want that extension to be importable by the software in their base system.

While this continues to carry the risk of installing a newer version of a dependency than the operating system expects or otherwise negatively affecting the behavior of an application, it does not need to carry the risk of removing files from the operating system. A tool like pip should be able to install packages in some directory on the default sys.path, if specifically requested, without deleting files owned by the system’s package manager.

Therefore, this PEP proposes two things.

First, it proposes a way for distributors of a Python interpreter to mark that interpreter as having its packages managed by means external to Python, such that Python-specific tools like pip should not change the installed packages in the interpreter’s global sys.path in any way (add, upgrade/downgrade, or remove) unless specifically overridden. It also provides a means for the distributor to indicate how to use a virtual environment as an alternative.

This is an opt-in mechanism: by default, the Python interpreter compiled from upstream sources will not be so marked, and so running pip install with a self-compiled interpreter, or with a distro that has not explicitly marked its interpreter, will work as it always has worked.

Second, it sets the rule that when installing packages to an interpreter’s global context (either to an unmarked interpreter, or if overriding the marking), Python-specific package managers should modify or delete files only within the directories of the sysconfig scheme in which they would create files. This permits a distributor of a Python interpreter to set up two directories, one for its own managed packages, and one for unmanaged packages installed by the end user, and ensure that installing unmanaged packages will not delete (or overwrite) files owned by the external package manager.

Rationale

As described in detail in the next section, the first behavior change involves creating a marker file named EXTERNALLY-MANAGED, whose presence indicates that non-virtual-environment package installations are managed by some means external to Python, such as a distro’s package manager. This file is specified to live in the stdlib directory in the default sysconfig scheme, which marks the interpreter / installation as a whole, not a particular location on sys.path. The reason for this is that, as identified above, there are two related problems that risk breaking an externally-managed Python: you can install an incompatible new version of a package system-wide (e.g., with sudo pip install), and you can install one in your user account alone, but in a location that is on the standard Python command’s sys.path (e.g., with pip install --user). If the marker file were in the system-wide site-packages directory, it would not clearly apply to the second case. The Alternatives section has further discussion of possible locations.

The second behavior change takes advantage of the existing sysconfig setup in distros that have already encountered this class of problem, and specifically addresses the problem of a Python-specific package manager deleting or overwriting files that are owned by an external package manager.

Use cases

The changed behavior in this PEP is intended to “do the right thing” for as many use cases as possible. In this section, we consider the changes specified by this PEP for several representative use cases / contexts. Specifically, we ask about the two behaviors that could be changed by this PEP:

  1. Will a Python-specific installer tool like pip install permit installations by default, after implementation of this PEP?
  2. If you do run such a tool, should it be willing to delete packages shipped by the external (non-Python-specific) package manager for that context, such as a distro package manager?

(For simplicity, this section discusses pip as the Python-specific installer tool, though the analysis should apply equally to any other Python-specific package management tool.)

This table summarizes the use cases discussed in detail below:

| Case | Description | pip install permitted | Deleting externally-installed packages permitted |
| --- | --- | --- | --- |
| 1 | Unpatched CPython | Currently yes; stays yes | Currently yes; stays yes |
| 2 | Distro /usr/bin/python3 | Currently yes; becomes no (assuming the distro adds a marker file) | Currently yes (except on Debian); becomes no |
| 3 | Distro Python in venv | Currently yes; stays yes | There are no externally-installed packages |
| 4 | Distro Python in venv with --system-site-packages | Currently yes; stays yes | Currently no; stays no |
| 5 | Distro Python in Docker | Currently yes; becomes no (assuming the distro adds a marker file) | Currently yes; becomes no |
| 6 | Conda environment | Currently yes; stays yes | Currently yes; stays yes |
| 7 | Dev-facing distro | Currently yes; becomes no (assuming they add a marker file) | Currently often yes; becomes no (assuming they configure sysconfig as needed) |
| 8 | Distro building packages | Currently yes; can stay yes | Currently yes; becomes no |
| 9 | PYTHONHOME copied from a distro Python stdlib | Currently yes; becomes no | Currently yes; becomes no |
| 10 | PYTHONHOME copied from upstream Python stdlib | Currently yes; stays yes | Currently yes; stays yes |

In more detail, the use cases above are:

  1. A standard unpatched CPython, without any special configuration of or patches to sysconfig and without a marker file. This PEP does not change its behavior.

    Such a CPython should (regardless of this PEP) not be installed in a way that overlaps any distro-installed Python on the same system. For instance, on an OS that ships Python in /usr/bin, you should not install a custom CPython built with ./configure --prefix=/usr, or it will overwrite some files from the distro and the distro will eventually overwrite some files from your installation. Instead, your installation should be in a separate directory (perhaps /usr/local, /opt, or your home directory).

    Therefore, we can assume that such a CPython has its own stdlib directory and its own sysconfig schemes that do not overlap any distro-installed Python. So any OS-installed packages are not visible or relevant here.

    If there is a concept of “externally-installed” packages in this case, it’s something outside the OS and generally managed by whoever built and installed this CPython. Because the installer chose not to add a marker file or modify sysconfig schemes, they’re choosing the current behavior, and pip install can remove any packages available in this CPython.

  2. A distro’s /usr/bin/python3, either when running pip install as root or pip install --user, following our Recommendations for distros.

    These recommendations include shipping a marker file in the stdlib directory, to prevent pip install by default, and placing distro-shipped packages in a location other than the default sysconfig scheme, so that pip as root does not write to that location.

    Many distros (including Debian, Fedora, and their derivatives) are already doing the latter.

    On Debian and derivatives, pip install does not currently delete distro-installed packages, because Debian carries a patch to pip to prevent this. So, for those distros, this PEP is not a behavior change; it simply standardizes that behavior in a way that is no longer Debian-specific and can be included into upstream pip.

    (We have seen user reports of externally-installed packages being deleted on Debian or a derivative. We suspect this is because the user has previously run sudo pip install --upgrade pip and therefore now has a version of /usr/bin/pip without the Debian patch; standardizing this behavior in upstream package installers would address this problem.)

  3. A distro Python when used inside a virtual environment (either from venv or virtualenv).

    Inside a virtual environment, all packages are owned by that environment. Even when pip, setuptools, etc. are installed into the environment, they are and should be managed by tools specific to that environment; they are not system-managed.

  4. A distro Python when used inside a virtual environment with --system-site-packages. This is like the previous case, but worth calling out explicitly, because anything on the global sys.path is visible.

    Currently, the answer to “Will pip delete externally-installed packages” is no, because pip has a special case for running in a virtual environment and attempting to delete packages outside it. After this PEP, the answer remains no, but the reasoning becomes more general: system site packages will be outside any of the sysconfig schemes used for package management in the environment.

  5. A distro Python when used in a single-application container image (e.g., a Docker container). In this use case, the risk of breaking system software is lower, since generally only a single application runs in the container, and the impact is lower, since you can rebuild the container and you don’t have to struggle to recover a running machine. There are also a large number of existing Dockerfiles with an unqualified RUN pip install ... statement, etc., and it would be good not to break those. So, builders of base container images may want to ensure that the marker file is not present, even if the underlying OS ships one by default.

    There is a small behavior change: currently, pip run as root will delete externally-installed packages, but after this PEP it will not. We don’t propose a way to override this. However, since the base image is generally minimal, there shouldn’t be much of a use case for simply uninstalling packages (especially without using the distro’s own tools). The common case is when pip wants to upgrade a package, which previously would have deleted the old version (except on Debian). After this change, the old version will still be on disk, but pip will still shadow externally-installed packages, and we believe this to be sufficient for this not to be a breaking change in practice - a Python import statement will still get you the newly-installed package.

    If it becomes necessary to have a way to do this, we suggest that the distro should document a way for the installer tool to access the sysconfig scheme used by the distro itself. See the Recommendations for distros section for more discussion.

    It is the view of the authors of this PEP that it’s still a good idea to use virtual environments with distro-installed Python interpreters, even in single-application container images. Even though they run a single application, that application may run commands from the OS that are implemented in Python, and if you’ve installed or upgraded the distro-shipped Python packages using Python-specific tools, those commands may break.

  6. Conda specifically supports the use of non-conda tools like pip to install software not available in the Conda repositories. In this context, Conda acts as the external package manager / distro and pip as the Python-specific one.

    In some sense, this is similar to the first case, since Conda provides its own installation of the Python interpreter.

    We don’t believe this PEP requires any changes to Conda, and versions of pip that have implemented the changes in this PEP will continue to behave as they currently do inside Conda environments. (That said, it may be worth considering whether to use separate sysconfig schemes for pip-installed and Conda-installed software, for the same reasons it’s a good idea for other distros.)

  7. By a “developer-facing distro,” we mean a specific type of distro where direct users of Python or other languages in the distro are expected or encouraged to make changes to the distro itself if they wish to add libraries. Common examples include private “monorepos” at software development companies, where a single repository builds both third-party and in-house software, and the direct users of the distro’s Python interpreter are generally software developers writing said in-house software. User-level package managers like Nixpkgs may also count, because they encourage users of Nix who are Python developers to package their software for Nix.

    In these cases, the distro may want to respond to an attempted pip install with guidance encouraging use of the distro’s own facilities for adding new packages, along with a link to documentation.

    If the distro supports/encourages creating a virtual environment from the distro’s Python interpreter, there may also be custom instructions for how to properly set up a virtual environment (as for example Nixpkgs does).

  8. When building distro Python packages for a distro Python (case 2), it may be useful to have pip install be usable as part of the distro’s package build process. (Consider, for instance, building a python-xyz RPM by using pip install . inside an sdist / source tarball for xyz.) The distro may also want to use a more targeted but still Python-specific installation tool such as installer.

    For this case, the build process will need to find some way to suppress the marker file to allow pip install to work, and will probably need to point the Python-specific tool at the distro’s sysconfig scheme instead of the shipped default. See the Recommendations for distros section for more discussion on how to implement this.

    As a result of this PEP, pip will no longer be able to remove packages already on the system. However, this behavior change is fine because a package build process should not (and generally cannot) include instructions to delete some other files on the system; it can only package up its own files.

  9. A distro Python used with PYTHONHOME to set up an alternative Python environment (as opposed to a virtual environment), where PYTHONHOME is set to some directory copied directly from the distro Python (e.g., cp -a /usr/lib/python3.x pyhome/lib).

    Assuming there are no modifications, then the behavior is just like the underlying distro Python (case 2). So there are behavior changes - you can no longer pip install by default, and if you override it, it will no longer delete externally-installed packages (i.e., Python packages that were copied from the OS and live in the OS-managed sys.path entry).

    This behavior change seems to be defensible, in that if your PYTHONHOME is a straight copy of the distro’s Python, it should behave like the distro’s Python.

  10. A distro Python (or any Python interpreter) used with a PYTHONHOME taken from a compatible unmodified upstream Python.

    Because the behavior changes in this PEP are keyed off of files in the standard library (the marker file in stdlib and the behavior of the sysconfig module), the behavior is just like an unmodified upstream CPython (case 1).

Specification

Marking an interpreter as using an external package manager

Before a Python-specific package installer (that is, a tool such as pip - not an external tool such as apt) installs a package into a certain Python context, it should make the following checks by default:

  1. Is it running outside of a virtual environment? It can determine this by whether sys.prefix == sys.base_prefix (but see Backwards Compatibility).
  2. Is there an EXTERNALLY-MANAGED file in the directory identified by sysconfig.get_path("stdlib", sysconfig.get_default_scheme())?

If both of these conditions are true, the installer should exit with an error message indicating that package installation into this Python interpreter’s directory is disabled outside of a virtual environment.

The installer should have a way for the user to override these rules, such as a command-line flag --break-system-packages. This option should not be enabled by default and should carry some connotation that its use is risky.

The EXTERNALLY-MANAGED file is an INI-style metadata file intended to be parsable by the standard library configparser module. If the file can be parsed by configparser.ConfigParser(interpolation=None) using the UTF-8 encoding, and it contains a section [externally-managed], then the installer should look for an error message specified in the file and output it as part of its error. If the first element of the tuple returned by locale.getlocale(locale.LC_MESSAGES), i.e., the language code, is not None, it should look for the error message as the value of a key named Error- followed by the language code. If that key does not exist, and if the language code contains underscore or hyphen, it should look for a key named Error- followed by the portion of the language code before the underscore or hyphen. If it cannot find either of those, or if the language code is None, it should look for a key simply named Error.

If the installer cannot find an error message in the file (either because the file cannot be parsed or because no suitable error key exists), then the installer should just use a pre-defined error message of its own, which should suggest that the user create a virtual environment to install packages.
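The lookup order for the error message can be sketched as follows. This is an illustrative reading of the rules above, not pip's actual code: find_error_message and current_language_code are hypothetical names, the function takes the file contents as text (a caller would read the file with encoding="utf-8"), and a return value of None means the installer should fall back to its own pre-defined message.

```python
import configparser
import locale
import re
from typing import Optional


def current_language_code() -> Optional[str]:
    """Language code of the message locale, or None if it cannot be determined."""
    category = getattr(locale, "LC_MESSAGES", None)  # not defined on every platform
    if category is None:
        return None
    try:
        return locale.getlocale(category)[0]
    except ValueError:
        return None


def find_error_message(text: str, language_code: Optional[str]) -> Optional[str]:
    """Return the best matching error message, or None if the file offers none."""
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read_string(text)
        section = parser["externally-managed"]
    except (configparser.Error, KeyError):
        return None  # unparsable file or missing section

    candidates = []
    if language_code is not None:
        candidates.append(f"Error-{language_code}")
        if re.search(r"[_-]", language_code):
            # Portion of the language code before the first underscore or hyphen.
            short_code = re.split(r"[_-]", language_code, maxsplit=1)[0]
            candidates.append(f"Error-{short_code}")
    candidates.append("Error")

    for key in candidates:
        message = section.get(key)
        if message is not None:
            return message
    return None
```

Note that configparser lower-cases option names by default, so the lookup in this sketch is case-insensitive with respect to key names.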

Software distributors who have a non-Python-specific package manager that manages libraries in the sys.path of their Python package should, in general, ship an EXTERNALLY-MANAGED file in their standard library directory. For instance, Debian may ship a file in /usr/lib/python3.9/EXTERNALLY-MANAGED consisting of something like

[externally-managed]
Error=To install Python packages system-wide, try apt install
 python3-xyz, where xyz is the package you are trying to
 install.

 If you wish to install a non-Debian-packaged Python package,
 create a virtual environment using python3 -m venv path/to/venv.
 Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
 sure you have python3-full installed.

 If you wish to install a non-Debian packaged Python application,
 it may be easiest to use pipx install xyz, which will manage a
 virtual environment for you. Make sure you have pipx installed.

 See /usr/share/doc/python3.9/README.venv for more information.

which provides useful and distro-relevant information to a user trying to install a package. Optionally, translations can be provided in the same file:

Error-de_DE=Wenn ist das Nunstück git und Slotermeyer?

 Ja! Beiherhund das Oder die Virtualenvironment gersput!

In certain contexts, such as single-application container images that aren’t updated after creation, a distributor may choose not to ship an EXTERNALLY-MANAGED file, so that users can install whatever they like (as they can today) without having to manually override this rule.

Writing to only the target sysconfig scheme

Usually, a Python package installer installs to directories in a scheme returned by the sysconfig standard library package. Ordinarily, this is the scheme returned by sysconfig.get_default_scheme(), but based on configuration (e.g. pip install --user), it may use a different scheme.

Whenever the installer is installing to a sysconfig scheme, this PEP specifies that the installer should never modify or delete files outside of that scheme. For instance, if it’s upgrading a package, and the package is already installed in a directory outside that scheme (perhaps in a directory from another scheme), it should leave the existing files alone.

If the installer does end up shadowing an existing installation during an upgrade, we recommend that it produces a warning at the end of its run.

If the installer is installing to a location outside of a sysconfig scheme (e.g., pip install --target), then this subsection does not apply.

Recommendations for distros

This section is non-normative. It provides best practices we believe distros should follow unless they have a specific reason otherwise.

Mark the installation as externally managed

Distros should create an EXTERNALLY-MANAGED file in their stdlib directory.

Guide users towards virtual environments

The file should contain a useful and distro-relevant error message indicating both how to install system-wide packages via the distro’s package manager and how to set up a virtual environment. If your distro is often used by users in a state where the python3 command is available (and especially where pip or get-pip is available) but python3 -m venv does not work, the message should indicate clearly how to make python3 -m venv work properly.

Consider packaging pipx, a tool for installing Python-language applications, and suggesting it in the error. pipx automatically creates a virtual environment for that application alone, which is a much better default for end users who want to install some Python-language software (which isn’t available in the distro) but are not themselves Python users. Packaging pipx in the distro avoids the irony of instructing users to pip install --user --break-system-packages pipx to avoid breaking system packages. Consider arranging things so your distro’s package / environment for Python for end users (e.g., python3 on Fedora or python3-full on Debian) depends on pipx.

Keep the marker file in container images

Distros that produce official images for single-application containers (e.g., Docker container images) should keep the EXTERNALLY-MANAGED file, preferably in a way that makes it not go away if a user of that image installs package updates inside their image (think RUN apt-get dist-upgrade).

Create separate distro and local directories

Distros should place two separate paths on the system interpreter’s sys.path,one for distro-installed packages and one for packages installed by the local system administrator, and configure sysconfig.get_default_scheme()to point at the latter path. This ensures that tools like pip will not modify distro-installed packages. The path for the local system administrator should come before the distro path onsys.pathso that local installs take preference over distro packages.

For example, Fedora and Debian (and their derivatives) both implement this split by using/usr/localfor locally-installed packages and /usrfor distro-installed packages. Fedora uses /usr/local/lib/ Python 3.x/site-packagesvs. /usr/lib/ Python 3.x/site-packages.(Debian uses /usr/local/lib/ Python 3/dist-packagesvs. /usr/lib/ Python 3/dist-packagesas an additional layer of separation from a locally-compiled Python interpreter: if you build and install upstream CPython in/usr/local/bin,it will look at /usr/local/lib/ Python 3/site-packages,and Debian wishes to make sure that packages installed via the locally-built interpreter don’t show up onsys.pathfor the distro interpreter.)

Note that the/usr/localvs./usrsplit is analogous to how thePATHenvironment variable typically includes /usr/local/bin:/usr/binand non-distro software installs to /usr/localby default. This split isrecommended by the Filesystem Hierarchy Standard.

There are two ways you could arrange for distro packages to be installed into the distro path rather than the local one. One is, if you are building and packaging Python libraries directly (e.g., your packaging helpers unpack a PEP 517-built wheel or call setup.py install), arrange for those tools to use a directory that is not in a sysconfig scheme but is still on sys.path.

The other is to arrange for the default sysconfig scheme to change when running inside a package build versus when running on an installed system. The sysconfig customization hooks from bpo-43976 should make this easy (once accepted and implemented): make your packaging tool set an environment variable or some other detectable configuration, and define a get_preferred_schemes function to return a different scheme when called from inside a package build. Then you can use pip install as part of your distro packaging.
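A minimal sketch of what such a function might look like follows. It is only an illustration: the environment variable DISTRO_PACKAGE_BUILD is a made-up name, posix_local is assumed to be the distro's scheme for local installs, and the way the function is wired into sysconfig depends on how the hooks from bpo-43976 end up being exposed.

```python
import os


def get_preferred_schemes():
    """Hypothetical hook: choose install schemes based on the build context."""
    # DISTRO_PACKAGE_BUILD is an assumed variable set by the packaging tool.
    if os.environ.get("DISTRO_PACKAGE_BUILD") == "1":
        prefix_scheme = "posix_distro"  # distro-owned directories
    else:
        prefix_scheme = "posix_local"   # local administrator's directories
    return {
        "prefix": prefix_scheme,
        "home": "posix_home",
        "user": "posix_user",
    }
```

Keying the decision on a single, explicit signal from the packaging tool keeps the installed system's behavior unchanged unless a build deliberately opts in.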

We propose adding a --scheme=... option to instruct pip to run against a specific scheme. (See Implementation Notes below for how pip currently determines schemes.) Once that’s available, for local testing and possibly for actual packaging, you would be able to run something like pip install --scheme=posix_distro to explicitly install a package into your distro’s location (bypassing get_preferred_schemes). One could also, if absolutely needed, use pip uninstall --scheme=posix_distro to use pip to remove packages from the system-managed directory, which addresses the (hopefully theoretical) regression in use case 5 in Rationale.

To install packages with pip, you would also need to either suppress the EXTERNALLY-MANAGED marker file to allow pip to run or to override it on the command line. You may want to use the same means for suppressing the marker file in build chroots as you do in container images.

The advantage of setting these up to be automatic (suppressing the marker file in your build environment and having get_preferred_schemes automatically return your distro’s scheme) is that an unadorned pip install will work inside a package build, which generally means that an unmodified upstream build script that happens to internally call pip install will do the right thing. You can, of course, just ensure that your packaging process always calls pip install --scheme=posix_distro --break-system-packages, which would work too.

The best approach here depends a lot on your distro’s conventions and mechanisms for packaging.

Similarly, the sysconfig paths that are not for importable Python code - that is, include, platinclude, scripts, and data - should also have two variants, one for use by distro-packaged software and one for use for locally-installed software, and the distro should be set up such that both are usable. For instance, a typical FHS-compliant distro will use /usr/local/include for the default scheme’s include and /usr/include for distro-packaged headers and place both on the compiler’s search path, and it will use /usr/local/bin for the default scheme’s scripts and /usr/bin for distro-packaged entry points and place both on $PATH.
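To see where these four paths currently point on a given interpreter, something like the following sketch can be used. The helper name non_import_paths is an assumption for this example; sysconfig.get_path itself is part of the standard library.

```python
import sysconfig

# The sysconfig path keys that do not hold importable Python code.
NON_IMPORT_KEYS = ("include", "platinclude", "scripts", "data")


def non_import_paths(scheme=None):
    """Return the non-importable install paths for a scheme.

    With scheme=None, the interpreter's default scheme is used.
    (Hypothetical helper for illustration.)
    """
    if scheme is None:
        return {key: sysconfig.get_path(key) for key in NON_IMPORT_KEYS}
    return {key: sysconfig.get_path(key, scheme) for key in NON_IMPORT_KEYS}


if __name__ == "__main__":
    for key, path in non_import_paths().items():
        print(f"{key:12} {path}")
```

On a distro that has implemented the split, the default scheme's entries would be expected to sit under /usr/local rather than /usr.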

Backwards Compatibility

All of these mechanisms are proposed for new distro releases and new versions of tools like pip only.

In particular, we strongly recommend that distros with a concept of major versions only add the marker file or change sysconfig schemes in a new major version; otherwise there is a risk that, on an existing system, software installed via a Python-specific package manager now becomes unmanageable (without an override option). For a rolling-release distro, if possible, only add the marker file or change sysconfig schemes in a new Python minor version.

One particular backwards-compatibility difficulty for package installation tools is likely to be managing environments created by old versions of virtualenv which have the latest version of the tool installed. A “virtual environment” now has a fairly precise definition: it uses the pyvenv.cfg mechanism, which causes sys.base_prefix != sys.prefix. It is possible, however, that a user may have an old virtual environment created by an older version of virtualenv; as of this writing, pip supports Python 3.6 onwards, which is in turn supported by virtualenv 15.1.0 onwards, so this scenario is possible. In older versions of virtualenv, the mechanism is instead to set a new attribute, sys.real_prefix, and it does not use the standard library support for virtual environments, so sys.base_prefix is the same as sys.prefix. So the logic for robustly detecting a virtual environment is something like:

    import sys

    def is_virtual_environment():
        return sys.base_prefix != sys.prefix or hasattr(sys, "real_prefix")

Security Implications

The purpose of this feature is not to implement a security boundary; it is to discourage well-intended changes from unexpectedly breaking a user’s environment. That is to say, the reason this PEP restricts pip install outside a virtual environment is not that it’s a security risk to be able to do so; it’s that “There should be one– and preferably only one –obvious way to do it,” and that way should be using a virtual environment. pip install outside a virtual environment is rather too obvious for what is almost always the wrong way to do it.

If there is a case where a user should not be able to sudo pip install or pip install --user and add files to sys.path for security reasons, that needs to be implemented either via access control rules on what files the user can write to or an explicitly secured sys.path for the program in question. Neither of the mechanisms in this PEP should be interpreted as a way to address such a scenario.

For those reasons, an attempted install with a marker file present is not a security incident, and there is no need to raise an auditing event for it. If the calling user legitimately has access to sudo pip install or pip install --user, they can accomplish the same installation entirely outside of Python; if they do not legitimately have such access, that’s a problem outside the scope of this PEP.

The marker file itself is located in the standard library directory, which is a trusted location (i.e., anyone who can write to the marker file used by a particular installer could, presumably, run arbitrary code inside the installer). Therefore, there is generally no need to filter out terminal escape sequences or other potentially-malicious content in the error message.

Alternatives

There are a number of similar proposals we considered that this PEP rejects or defers, largely to preserve the behavior in the case-by-case analysis in Rationale.

Marker file

Should the marker file be in sys.path, marking a particular directory as not to be written to by a Python-specific package manager? This would help with the second problem addressed by this PEP (not overwriting or deleting distro-owned files) but not the first (incompatible installs). A directory-specific marker in /usr/lib/python3.x/site-packages would not discourage installations into either /usr/local/lib/python3.x/site-packages or ~/.local/lib/python3.x/site-packages, both of which are on sys.path for /usr/bin/python3. In other words, the marker file should not be interpreted as marking a single directory as externally managed (even though it happens to be in a directory on sys.path); it marks the entire Python installation as externally managed.

Another variant of the above: should the marker file be in sys.path, where if it can be found in any directory in sys.path, it marks the installation as externally managed? An apparent advantage of this approach is that it automatically disables itself in virtual environments. Unfortunately, this has the wrong behavior with a --system-site-packages virtual environment, where the system-wide sys.path is visible but package installations are allowed. (It could work if the rule of exempting virtual environments is preserved, but that seems to have no advantage over the current scheme.)

Should the marker just be a new attribute of a sysconfig scheme? There is some conceptual cleanliness to this, except that it’s hard to override. We want to make it easy for container images, package build environments, etc. to suppress the marker file. A file that you can remove is easy; code in sysconfig is much harder to modify.

Should the file be in /etc? No, because again, it refers to a specific Python installation. A user who installs their own Python may well want to install packages within the global context of that interpreter.

Should the configuration setting be in pip.conf or distutils.cfg? Apart from the above objections about marking an installation, this mechanism isn’t specific to either of those tools. (It seems reasonable for pip to also implement a configuration flag for users to prevent themselves from performing accidental non-virtual-environment installs in any Python installation, but that is outside the scope of this PEP.)

Should the file be TOML? TOML is gaining popularity for packaging (see e.g. PEP 517) but does not yet have an implementation in the standard library. Strictly speaking, this isn’t a blocker - distros need only write the file, not read it, so they don’t need a TOML library (the file will probably be written by hand, regardless of format), and packaging tools likely have a TOML reader already. However, the INI format is currently used for various other forms of packaging metadata (e.g., pydistutils.cfg and setup.cfg), meets our needs, and is parsable by the standard library, and the pip maintainers expressed a preference to avoid using TOML for this yet.

Should the file be email.message-style? While this format is also used for packaging metadata (e.g. sdist and wheel metadata) and is also parsable by the standard library, it doesn’t handle multi-line entries quite as clearly, and that is our primary use case.

Should the marker file be executable Python code that evaluates whether installation should be allowed or not? Apart from the concerns above about having the file in sys.path, we have a concern that making it executable is committing to too powerful of an API and risks making behavior harder to understand. (Note that the get_default_scheme hook of bpo-43976 is in fact executable, but that code needs to be supplied when the interpreter builds; it isn’t intended to be supplied post-build.)

When overriding the marker, should a Python-specific package manager be disallowed from shadowing a package installed by the external package manager (i.e., installing modules of the same name)? This would minimize the risk of breaking system software, but it’s not clear it’s worth the additional user experience complexity. There are legitimate use cases for shadowing system packages, and an additional command-line option to permit it would be more confusing. Meanwhile, not passing that option wouldn’t eliminate the risk of breaking system software, which may be relying on a try: import xyz failing, finding a limited set of entry points, etc. Communicating this distinction seems difficult. We think it’s a good idea for Python-specific package managers to print a warning if they shadow a package, but we think it’s not worth disabling it by default.
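One way an installer might detect shadowing before printing such a warning is sketched below. This is not how pip or any other tool is known to implement it; the function names find_providers and shadow_warning are invented for this example, and only plain package directories and single-file modules are recognized.

```python
from pathlib import Path


def find_providers(module_name, search_dirs):
    """Return the directories in search_dirs that provide a top-level module.

    Only a package directory (name/) or a single-file module (name.py)
    counts; extension modules and other layouts are ignored in this sketch.
    """
    providers = []
    for directory in search_dirs:
        base = Path(directory)
        if (base / module_name).is_dir() or (base / f"{module_name}.py").is_file():
            providers.append(str(base))
    return providers


def shadow_warning(module_name, install_dir, distro_dir):
    """Return a warning string if a new install would shadow a distro copy."""
    if find_providers(module_name, [distro_dir]):
        return (f"warning: {module_name!r} in {install_dir} will shadow "
                f"the copy in {distro_dir}")
    return None
```

Returning the message instead of printing it leaves the decision about how (and whether) to surface the warning to the calling tool.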

Why not use the INSTALLER file from PEP 376 to determine who installed a package and whether it can be removed? First, it’s specific to a particular package (it’s in the package’s dist-info directory), so like some of the alternatives above, it doesn’t provide information on an entire environment and whether package installations are permissible. PEP 627 also updates PEP 376 to prevent programmatic use of INSTALLER, specifying that the file is “to be used for informational purposes only. […] Our goal is supporting interoperating tools, and basing any action on which tool happened to install a package runs counter to that goal.” Finally, as PEP 627 envisions, there are legitimate use cases for one tool knowing how to handle packages installed by another tool; for instance, conda can safely remove a package installed by pip into a Conda environment.

Why does the specification give no means for disabling package installations inside a virtual environment? We can’t see a particularly strong use case for it (at least not one related to the purposes of this PEP). If you need it, it’s simple enough to pip uninstall pip inside that environment, which should discourage at least unintentional changes to the environment (and this specification makes no provision to disable intentional changes, since after all the marker file can be easily removed).

System Python

Shouldn’t distro software just run with the distro site-packages directory alone on sys.path and ignore the local system administrator’s site-packages as well as the user-specific one? This is a worthwhile idea, and various versions of it have been circulating for a while under the name of “system Python” or “platform Python” (with a separate “user Python” for end users writing Python or installing Python software separate from the system). However, it’s a much more involved change. First, it would be a backwards-incompatible change. As mentioned in the Motivation section, there are valid use cases for running distro-installed Python applications like Sphinx or Ansible with locally-installed Python libraries available on their sys.path. A wholesale switch to ignoring local packages would break these use cases, and a distro would have to make a case-by-case analysis of whether an application ought to see locally-installed libraries or not.

Furthermore, Fedora attempted this change and reverted it, finding, ironically, that their implementation of the change broke their package manager. Given that experience, there are clearly details to be worked out before distros can reliably implement that approach, and a PEP recommending it would be premature.

This PEP is intended to be a complete and self-contained change that is independent of a distributor’s decision for or against “system Python” or similar proposals. It is not incompatible with a distro implementing “system Python” in the future, and even though both proposals address the same class of problems, there are still arguments in favor of implementing something like “system Python” even after implementing this PEP. At the same time, though, this PEP specifically tries to make a more targeted and minimal change, such that it can be implemented by distributors who don’t expect to adopt “system Python” (or don’t expect to implement it immediately). The changes in this PEP stand on their own merits and are not an intermediate step for some future proposal. This PEP reduces (but does not eliminate) the risk of breaking system software while minimizing (but not completely avoiding) breaking changes, which should therefore be much easier to implement than the full “system Python” idea, which comes with the downsides mentioned above.

We expect that the guidance in this PEP - that users should use virtual environments whenever possible and that distros should have separate sys.path directories for distro-managed and locally-managed modules - should make further experiments easier in the future. These may include distributing wholly separate “system” and “user” Python interpreters, running system software out of a distro-owned virtual environment or PYTHONHOME (but shipping a single interpreter), or modifying the entry points for certain software (such as the distro’s package manager) to use a sys.path that only sees distro-managed directories. Those ideas themselves, however, remain outside the scope of this PEP.

Implementation Notes

This section is non-normative and contains notes relevant to both the specification and potential implementations.

Currently, pip does not directly expose a way to choose a target sysconfig scheme, but it has three ways of looking up schemes when installing:

pip install
    Calls sysconfig.get_default_scheme(), which is usually (in upstream CPython and most current distros) the same as get_preferred_scheme('prefix').
pip install --prefix=/some/path
    Calls sysconfig.get_preferred_scheme('prefix').
pip install --user
    Calls sysconfig.get_preferred_scheme('user').

Finally, pip install --target=/some/path writes directly to /some/path without looking up any schemes.
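The three lookups can be reproduced directly with sysconfig to see what a given interpreter reports. The sketch below assumes Python 3.10 or later, where sysconfig.get_default_scheme() and sysconfig.get_preferred_scheme() are available; the function name scheme_lookups is made up for this example, and it only reports scheme names, not how pip goes on to use them.

```python
import sysconfig


def scheme_lookups():
    """Return the scheme name each pip invocation style would look up."""
    if not hasattr(sysconfig, "get_preferred_scheme"):
        return {}  # These functions were added in Python 3.10.
    result = {"pip install": sysconfig.get_default_scheme()}
    for flag, key in (("--prefix", "prefix"), ("--user", "user")):
        try:
            result[f"pip install {flag}"] = sysconfig.get_preferred_scheme(key)
        except ValueError:
            # The platform has no scheme of this kind (e.g. no user site).
            continue
    return result


if __name__ == "__main__":
    for command, scheme in scheme_lookups().items():
        print(f"{command:22} -> {scheme}")
```

On a distro that has split local and distro directories, the first entry is where a difference from upstream CPython would be expected to show up.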

Debian currently carries a patch to change the default install location inside a virtual environment, using a few heuristics (including checking for the VIRTUAL_ENV environment variable), largely so that the directory used in a virtual environment remains site-packages and not dist-packages. This does not particularly affect this proposal, because the implementation of that patch does not actually change the default sysconfig scheme, and notably does not change the result of sysconfig.get_path("stdlib").

Fedora currently carries a patch to change the default install location when not running inside rpmbuild, which they use to implement the two-system-wide-directories approach. This is conceptually the sort of hook envisioned by bpo-43976, except implemented as a code patch to distutils instead of as a changed sysconfig scheme.

The implementation of is_virtual_environment above, as well as the logic to load the EXTERNALLY-MANAGED file and find the error message from it, may as well get added to the standard library (sys and sysconfig, respectively), to centralize their implementations, but they don’t need to be added yet.
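Until such helpers exist, a tool might load the marker along the following lines. This is a sketch only: the function name read_marker_error, its parameters, and the fallback text are assumptions made here, while the section name [externally-managed] and the Error / Error-<language> keys are taken from the marker file format in this PEP's specification.

```python
import configparser
import sysconfig
from pathlib import Path

# Placeholder text used when the marker exists but gives no usable message.
GENERIC_ERROR = "This environment is externally managed."


def read_marker_error(stdlib_dir=None, language=None):
    """Return the marker file's error message, or None if no marker exists.

    stdlib_dir defaults to the running interpreter's stdlib directory.
    language is a language code such as "de_DE"; if given, a matching
    "Error-de_DE" key is preferred over the plain "Error" key.
    """
    if stdlib_dir is None:
        stdlib_dir = sysconfig.get_path("stdlib")
    marker = Path(stdlib_dir) / "EXTERNALLY-MANAGED"
    if not marker.is_file():
        return None
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read(marker, encoding="utf-8")
        section = parser["externally-managed"]
    except (configparser.Error, KeyError, UnicodeDecodeError):
        # The marker is present but unparsable: still report an error.
        return GENERIC_ERROR
    if language and f"Error-{language}" in section:
        return section[f"Error-{language}"]
    return section.get("Error", GENERIC_ERROR)
```

A caller would typically combine this with is_virtual_environment(): outside a virtual environment, a non-None result means the install should be refused and the returned text shown to the user.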

References

For additional background on these problems and previous attempts to solve them, see Debian bug 771794 “pip silently removes/updates system provided Python packages” from 2014, Fedora’s 2018 article Making sudo pip safe about pointing sudo pip at /usr/local (which acknowledges that the changes still do not make sudo pip completely safe), pip issues 5605 (“Disable upgrades to existing Python modules which were not installed via pip”) and 5722 (“pip should respect /usr/local”) from 2018, and the post-PyCon US 2019 discussion thread Playing nice with external package managers.


Source: https://github.com/python/peps/blob/main/peps/pep-0668.rst

Last modified: 2024-05-17 01:32:43 GMT