Mastering Python - Second Edition

By Rick van Hattem
About this book

Even if you find writing Python code easy, writing code that is efficient, maintainable, and reusable is not so straightforward. Many of Python’s capabilities are underutilized even by more experienced programmers. Mastering Python, Second Edition, is an authoritative guide to understanding advanced Python programming so you can write the highest quality code. This new edition has been extensively revised and updated with exercises, four new chapters and updates up to Python 3.10.

Revisit important basics, including Pythonic style and syntax and functional programming. Avoid common mistakes made by programmers of all experience levels. Make smart decisions about the best testing and debugging tools to use, optimize your code’s performance across multiple machines and Python versions, and deploy often-forgotten Python features to your advantage. Get fully up to speed with asyncio and stretch the language even further by accessing C functions with simple Python calls. Finally, turn your new-and-improved code into packages and share them with the wider Python community.

If you are a Python programmer wanting to improve your code quality and readability, this book will make you confident in writing high-quality scripts and taking on bigger challenges.

Publication date:
May 2022
Publisher
Packt
Pages
710
ISBN
9781800207721

 

Getting Started – One Environment per Project

In this chapter, you’ll learn about the different ways of setting up Python environments for your projects and how to use multiple Python versions on a single system outside of what your package manager offers.

After the environment is set up, we will continue with the installation of packages using both the Python Package Index (PyPI) and conda-forge, the package index that is coupled with Anaconda.

Lastly, we will look at several methods of keeping track of project dependencies.

To summarize, the following topics will be covered:

  • Creating environments using venv, pipenv, poetry, pyenv, and anaconda
  • Package installation through pip, poetry, pipenv, and conda
  • Managing dependencies using requirements.txt, poetry, and pipenv
 

Virtual environments

The Python ecosystem offers many methods of installing and managing packages. You can simply download and extract code to your project directory, use the package manager from your operating system, or use a tool such as pip to install a package. To make sure your packages don’t collide, it is recommended that you use a virtual environment. A virtual environment is a lightweight Python installation with its own package directories and a Python binary copied (or linked) from the binary used to create the environment.

Why virtual environments are a good idea

It might seem like a hassle to create a virtual environment for every Python project, but the advantages are well worth the effort. More importantly, there are several reasons why installing packages globally using pip is a really bad idea:

  • Installing packages globally usually requires elevated privileges (such as sudo, root, or administrator), which is a huge security risk. When executing pip install <package>, the setup.py of that package is executed as the user that executed the pip install command. That means that if the package contains malware, it now has superuser privileges to do whatever it wants. Don’t forget that anyone can upload a package to PyPI (pypi.org) without any vetting. As you will see later in this book, it only takes a couple of minutes for anyone to create and upload a package.
  • Depending on how you installed Python, it can mess with the existing packages that are installed by your package manager. On an Ubuntu Linux system, that means you could break pip or even apt itself because a pip install -U <package> installs and updates both the package and all of the dependencies.
  • It can break your other projects. Many projects try their best to remain backward compatible, but every pip install could pull in new/updated dependencies that could break compatibility with other packages and projects. The Django Web Framework, for example, changes enough between versions that many projects using Django will need several changes after an upgrade to the latest release. So, when you’re upgrading Django on your system to the latest version and have a project that was written for a previous version, your project will most likely be broken.
  • It pollutes your list of packages, making it hard to keep track of your project’s dependencies.

In addition to alleviating the issues above, there is a major advantage as well. You can specify the Python version (assuming you have it installed) when creating the virtual environment. This allows you to test and debug your projects in multiple Python versions easily while keeping the exact same package versions beyond that.

Using venv and virtualenv

You are probably already familiar with virtualenv, a library used to create a virtual environment for your Python installation. What you might not know is the venv command, which has been included with Python since version 3.3 and can be used as a drop-in replacement for virtualenv in most cases. To keep things simple, I recommend creating a directory where you keep all of your environments. Some people opt for an env, .venv, or venv directory within the project, but I advise against that for several reasons:

  • Your project files are important, so you probably want to back them up as often as possible. By keeping the bulky environment with all of the installed packages outside of your backups, your backups become faster and lighter.
  • Your project directory stays portable. You can even keep it on a remote drive or flash drive without having to worry that the virtual environment will only work on a single system.
  • It prevents you from accidentally adding the virtual environment files to your source control system.

If you do decide to keep your virtual environment inside your project directory, make sure that you add that directory to your .gitignore file (or similar) for your version control system. And if you want to keep your backups faster and lighter, exclude it from the backups. With correct dependency tracking, the virtual environment should be easy enough to rebuild.

Creating a venv

Creating a venv is a reasonably simple process, but it varies slightly according to the operating system being used.

The following examples use the venv module directly, but for ease I recommend using poetry instead, which is covered later in this chapter. That tool will automatically create a virtual environment for you when you first use it. Before you make the step up to poetry, however, it is important to understand how virtual environments work.

Since Python 3.6, the pyvenv command has been deprecated in favor of python -m venv.

In the case of Ubuntu, the python3-venv package has to be installed through apt because the Ubuntu developers have mutilated the default Python installation by not including ensurepip.

For Linux/Unix/OS X, using zsh or bash as a shell, it is:

$ python3 -m venv envs/your_env
$ source envs/your_env/bin/activate
(your_env) $

And for Windows cmd.exe (assuming python.exe is in your PATH), it is:

C:\Users\wolph>python.exe -m venv envs\your_env
C:\Users\wolph>envs\your_env\Scripts\activate.bat
(your_env) C:\Users\wolph>

PowerShell is also supported and can be used in a similar fashion:

PS C:\Users\wolph>python.exe -m venv envs\your_env
PS C:\Users\wolph> envs\your_env\Scripts\Activate.ps1
(your_env) PS C:\Users\wolph>

The first command creates the environment and the second activates the environment. After activating the environment, commands such as python and pip use the environment-specific versions, so pip install only installs within your virtual environment. A useful side effect of activating the environment is the prefix with the name of your environment, which is (your_env) in this case.
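If you want to confirm from within Python that an environment is active, a minimal sketch is to compare sys.prefix with sys.base_prefix. The function name in_virtual_environment is my own for illustration, and the check assumes a venv or a reasonably recent virtualenv (very old virtualenv releases used a different mechanism):

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment:", in_virtual_environment())
    print("sys.prefix:         ", sys.prefix)
    print("sys.base_prefix:    ", sys.base_prefix)
```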

Note that we are not using sudo or other methods of elevating privileges. Elevating privileges is both unnecessary and a potential security risk, as explained in the Why virtual environments are a good idea section.

Using virtualenv instead of venv is as simple as replacing the following command:

$ python3 -m venv envs/your_env

with this one:

$ virtualenv envs/your_env

An additional advantage of using virtualenv instead of venv, in that case, is that you can specify the Python interpreter:

$ virtualenv -p python3.8 envs/your_env

The venv module, by contrast, always uses the Python installation that runs it, so you select the version by invoking that specific interpreter:

$ python3.8 -m venv envs/your_env
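The same module can also be driven from Python code, which can be handy in setup scripts. The following is a small sketch rather than a recommended workflow; the helper name create_environment is hypothetical, and with_pip=False is chosen here only so the example does not depend on ensurepip being available:

```python
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir: Path) -> Path:
    """Create a virtual environment in env_dir and return the path to its config file."""
    # with_pip=False skips the ensurepip bootstrap; pass with_pip=True
    # if you want pip installed into the new environment.
    venv.create(env_dir, with_pip=False)
    return env_dir / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config = create_environment(Path(tmp) / "your_env")
        # pyvenv.cfg records which Python installation the environment is based on.
        print(config.read_text())
```

As with python3 -m venv, the environment is based on the interpreter that runs this code.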

Activating a venv/virtualenv

Every time you get back to your project after closing the shell, you need to reactivate the environment. The activation of a virtual environment consists of:

  • Modifying your PATH environment variable to use envs\your_env\Scripts or envs/your_env/bin for Windows or Linux/Unix, respectively
  • Modifying your prompt so that instead of $, you see (your_env) $, indicating that you are working in a virtual environment

In the case of poetry, you can use the poetry shell command to create a new shell with the activated environment.

While you can easily modify those manually, an easier method is to run the activate script that was generated when creating the virtual environment.

For Linux/Unix with zsh or bash as the shell, it is:

$ source envs/your_env/bin/activate
(your_env) $

For Windows using cmd.exe, it is:

C:\Users\wolph>envs\your_env\Scripts\activate.bat
(your_env) C:\Users\wolph>

For Windows using PowerShell, it is:

PS C:\Users\wolph> envs\your_env\Scripts\Activate.ps1
(your_env) PS C:\Users\wolph>

By default, the PowerShell permissions might be too restrictive to allow this. You can change this policy for the current PowerShell session by executing:

Set-ExecutionPolicy Unrestricted -Scope Process

If you wish to permanently change it for every PowerShell session for the current user, execute:

Set-ExecutionPolicy Unrestricted -Scope CurrentUser

Different shells, such as fish and csh, are also supported by using the activate.fish and activate.csh scripts, respectively.
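To make the effect of activation concrete, here is a rough Python sketch of what the activate scripts do to the environment variables. It is an illustration only: the function name activated_environ is made up, the prompt change is left out, and the real scripts store an absolute path in VIRTUAL_ENV:

```python
import os
from pathlib import Path
from typing import Dict, Mapping


def activated_environ(env_dir: str, environ: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of environ that mimics an activated virtual environment."""
    env_path = Path(env_dir)
    # Windows environments keep their executables in Scripts, others in bin.
    bin_dir = env_path / ("Scripts" if os.name == "nt" else "bin")

    result = dict(environ)
    result["VIRTUAL_ENV"] = str(env_path)
    # Prepending makes the environment's python and pip win the PATH lookup.
    result["PATH"] = os.pathsep.join(
        part for part in (str(bin_dir), environ.get("PATH", "")) if part
    )
    # The activate scripts also unset PYTHONHOME for the session.
    result.pop("PYTHONHOME", None)
    return result


if __name__ == "__main__":
    print(activated_environ("envs/your_env", os.environ)["PATH"])
```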

When not using an interactive shell (with a cron job, for example), you can still use the environment by using the Python interpreter in the bin or Scripts directory for Linux/Unix or Windows, respectively. Instead of running python script.py or /usr/bin/python script.py, you can use:

/home/wolph/envs/your_env/bin/python script.py

Note that commands installed through pip (and pip itself) can be run in a similar fashion:

/home/wolph/envs/your_env/bin/pip
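The same approach works when one Python program needs to start another inside a specific environment. The sketch below assumes the standard venv layout described above; environment_python and run_in_environment are hypothetical helper names:

```python
import os
import subprocess
from pathlib import Path


def environment_python(env_dir: Path) -> Path:
    """Return the interpreter path inside a virtual environment."""
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def run_in_environment(env_dir: Path, *args: str) -> subprocess.CompletedProcess:
    """Run the environment's interpreter with the given arguments, no activation needed."""
    return subprocess.run(
        [str(environment_python(env_dir)), *args],
        check=True,           # raise CalledProcessError on a non-zero exit code
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,
    )
```

For example, run_in_environment(Path("/home/wolph/envs/your_env"), "script.py") would run script.py with the packages installed in that environment.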

Installing packages

Installing packages within your virtual environment can be done using pip as normal:

$ pip3 install <package>

The great advantage comes when looking at the list of installed packages:

$ pip3 freeze

Because our environment is isolated from the system, we only see the packages and dependencies that we have explicitly installed.
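You can get a similar overview from within Python through the standard library's importlib.metadata module (available since Python 3.8). This is only an approximation of pip freeze: pip applies extra rules, such as hiding a few packaging tools by default and handling editable installs, which this sketch does not attempt. The function name freeze is my own:

```python
from importlib import metadata
from typing import List


def freeze() -> List[str]:
    """Return name==version lines for the distributions visible to this interpreter."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries whose metadata lacks a name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    print("\n".join(freeze()))
```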

Fully isolating the virtual environment from the system Python packages can be a downside in some cases. It takes up more disk space and the package might not be in sync with the C/C++ libraries on the system. The PostgreSQL database server, for example, is often used together with the psycopg2 package. While binaries are available for most platforms and building the package from the source is fairly easy, it can sometimes be more convenient to use the package that is bundled with your system. That way, you are certain that the package is compatible with both the installed Python and PostgreSQL versions.

To mix your virtual environment with system packages, you can use the --system-site-packages flag when creating the environment:

$ python3 -m venv --system-site-packages envs/your_env

When enabling this flag, the environment will have the system Python environment sys.path appended to your virtual environment’s sys.path, effectively providing the system packages as a fallback when an import from the virtual environment fails.

Explicitly installing or updating a package within your virtual environment will effectively hide the system package from within your virtual environment. Uninstalling the package from your virtual environment will make it reappear.
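Whether an existing environment was created with this flag is recorded in the pyvenv.cfg file in the root of the environment, under the include-system-site-packages key. A minimal sketch for reading that setting follows; the helper names are hypothetical and the parser only handles the simple key = value lines that venv writes:

```python
from pathlib import Path
from typing import Dict


def read_pyvenv_cfg(env_dir: Path) -> Dict[str, str]:
    """Parse the key = value lines of an environment's pyvenv.cfg file."""
    settings = {}
    for line in (env_dir / "pyvenv.cfg").read_text().splitlines():
        key, separator, value = line.partition("=")
        if separator:
            settings[key.strip().lower()] = value.strip()
    return settings


def uses_system_site_packages(env_dir: Path) -> bool:
    """Return True if the environment falls back to the system packages."""
    value = read_pyvenv_cfg(env_dir).get("include-system-site-packages", "false")
    return value.lower() == "true"
```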

As you might suspect, this also affects the results of pip freeze. Luckily, pip freeze can be told to only list the packages local to the virtual environment, which excludes the system packages:

$ pip3 freeze --local

Later in this chapter, we will discuss pipenv, which transparently handles the creation of the virtual environment for you.

Using pyenv

The pyenv library makes it really easy to quickly install and switch between multiple Python versions. A common issue with many Linux and Unix systems is that the package managers opt for stability over recency. In most cases, this is definitely an advantage, but if you are running a project that requires the latest and greatest Python version, or a really old version, it requires you to compile and install it manually. The pyenv package makes this process really easy for you but does still require the compiler to be installed.

A nice addition to pyenv for testing purposes is the tox library. This library allows you to run your tests on a whole list of Python versions simultaneously. The usage of tox is covered in Chapter 10, Testing and Logging – Preparing for Bugs.

To install pyenv, I recommend visiting the pyenv project page, since it depends highly on your operating system and operating system version. For Linux/Unix, you can use the regular pyenv installation manual or the pyenv-installer (https://github.com/pyenv/pyenv-installer) one-liner, if you deem it safe enough:

$ curl https://pyenv.run | bash

Make sure that you follow the instructions given by the installer. To ensure pyenv works properly, you will need to modify your .zshrc or .bashrc.

Windows does not support pyenv natively (outside of Windows Subsystem for Linux) but has a pyenv fork available: https://github.com/pyenv-win/pyenv-win#installation

After installing pyenv, you can view the list of supported Python versions using:

$ pyenv install --list

The list is rather long, but can be shortened with grep on Linux/Unix:

$ pyenv install --list | grep 3.10
  3.10.0
  3.10-dev
...

Once you’ve found the version you like, you can install it through the install command:

$ pyenv install 3.10-dev
Cloning https://github.com/python/cpython...
Installing Python-3.10-dev...
Installed Python-3.10-dev to /home/wolph/.pyenv/versions/3.10-dev

The pyenv install command takes an optional --debug parameter, which builds a debug version of Python that makes debugging C/C++ extensions possible using a debugger such as gdb.

Once the Python version has been built, you can activate it globally, but you can also use the pyenv-virtualenv plugin (https://github.com/pyenv/pyenv-virtualenv) to create a virtualenv for your newly created Python environment:

$ pyenv virtualenv 3.10-dev your_pyenv

As the preceding example shows, unlike the venv and virtualenv commands, pyenv virtualenv automatically creates the environment in the ~/.pyenv/versions/<version>/envs/ directory, so you cannot fully specify your own path. You can change the base path (~/.pyenv/) through the PYENV_ROOT environment variable, however. Activating the environment using the activate script in the environment directory is still possible, but more complicated than it needs to be since there’s an easy shortcut:

$ pyenv activate your_pyenv

Now that the environment is activated, you can run environment-specific commands, such as pip, and they will only modify your environment.

Using Anaconda

Anaconda is a distribution that supports both the Python and R programming languages. It is much more than simply a virtual environment manager, though; it’s a whole different Python distribution with its own virtual environment system and even a completely different package system. In addition to supporting PyPI, it also supports conda-forge, which features a very impressive number of packages focused on scientific computing.

For the end user, the most important difference is that packages are installed through the conda command instead of pip. This brings a much more advanced dependency check when installing packages. Whereas pip will simply install a package and all of its dependencies without regard for other installed packages, conda will look at all of the installed packages and make sure it won’t install a version that is not supported by the installed packages.

The conda package manager is not alone in smart dependency checking. The pipenv package manager (discussed later in this chapter) does something similar.

Getting started with Anaconda Navigator

Installing Anaconda is quite easy on all common platforms. For Windows, OS X, and Linux, you can go to the Anaconda site and download the (graphical) installer: https://www.anaconda.com/products/distribution#Downloads

Once it’s installed, the easiest way to continue is by launching Anaconda Navigator, which should look something like this:

Figure 1.1: Anaconda Navigator – Home

Creating an environment and installing packages is pretty straightforward as well:

  1. Click on the Environments button on the left.
  2. Click on the Create button below.
  3. Enter a name for the environment and select the Python version.
  4. Click on Create to create your environment and wait a bit until Anaconda is done:

    Figure 1.2: Anaconda Navigator – Creating an environment

Once Anaconda has finished creating your environment, you should see a list of installed packages. Installing packages can be done by changing the filter of the package list from Installed to All, marking the checkbox near the packages you want to install, and applying the changes.

While creating an environment, Anaconda Navigator shows you where the environment will be created.

Getting started with conda

While Anaconda Navigator is a really nice tool to use to get an overview, being able to run your code from the command line can be convenient too. With the conda command, that is luckily very easy.

First, you need to open the conda shell. You can do this from Anaconda Navigator if you wish, but you can also run it straightaway. On Windows, you can open Anaconda Prompt or Anaconda PowerShell Prompt from the start menu. On Linux and OS X, the most convenient method is to initialize the shell integration. For zsh, you can use:

$ conda init zsh

For other shells, the process is similar. Note that this process modifies your shell configuration to automatically activate the base environment every time you open a shell. This can be disabled with a simple configuration option:

$ conda config --set auto_activate_base false

If automatic activation is not enabled, you will need to run the activate command to get back into the conda base environment:

$ conda activate
(base) $

If, instead of the conda base environment, you wish to activate the environment you created earlier, you need to specify the name:

$ conda activate conda_env
(conda_env) $

If you have not created the environment yet, you can do so using the command line as well:

$ conda create --name conda_env
Collecting package metadata (current_repodata.json): done
Solving environment: done
...
Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
...

To list the available environments, you can use the conda info command:

$ conda info --envs
# conda environments
#
base                  *  /usr/local/anaconda3
conda_env                /usr/local/anaconda3/envs/conda_env

Installing conda packages

Now it’s time to install a package. For conda packages, you can simply use the conda install command. For example, to install the progressbar2 package that I maintain, use:

(conda_env) $ conda install progressbar2
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##
  environment location: /usr/local/anaconda3/envs/conda_env

  added / updated specs:
    - progressbar2
The following packages will be downloaded:
...
The following NEW packages will be INSTALLED:
...
Proceed ([y]/n)? y

Downloading and Extracting Packages
...

Now you can run Python and see that the package has been installed and is working properly:

(conda_env) $ python
Python 3.8.0 (default, Nov  6 2019, 15:49:01)
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import progressbar

>>> for _ in progressbar.progressbar(range(5)): pass
...
100% (5 of 5) |##############################| Elapsed Time: 0:00:00 Time:  0:00:00

Another way to verify whether the package has been installed is by running the conda list command, which lists the installed packages similarly to pip list:

(conda_env) $ conda list
# packages in environment at /usr/local/anaconda3/envs/conda_env:
#
# Name                    Version                   Build  Channel
...

Installing PyPI packages

With PyPI packages, we have two options within the Anaconda distribution. The most obvious is using pip, but this has the downside of partially circumventing the conda dependency checker. While conda install will take the packages installed through PyPI into consideration, the pip command might upgrade packages undesirably. This behavior can be improved by enabling the conda/pip interoperability setting, but this seriously impacts the performance of conda commands:

$ conda config --set pip_interop_enabled True

Depending on how important fixed versions or conda performance are to you, you can also opt for converting the package to a conda package:

(conda_env) $ conda skeleton pypi progressbar2
Warning, the following versions were found for progressbar2
...
Use --version to specify a different version.
...
## Package Plan ##
...
The following NEW packages will be INSTALLED:
...
INFO:conda_build.config:--dirty flag and --keep-old-work not specified. Removing build/test folder after successful build/test.

Now that we have a package, we can modify the files if needed, but using the automatically generated files works most of the time. All that is left now is to build and install the package:

(conda_env) $ conda build progressbar2
...
(conda_env) $ conda install --use-local progressbar2
Collecting package metadata (current_repodata.json): done
Solving environment: done
...

And now we are done! The package has been installed through conda instead of pip.

Sharing your environment

When collaborating with others, it is essential to have environments that are as similar as possible to avoid debugging local issues. With pip, we can simply create a requirements file by using pip freeze, but that will not include the conda packages. With conda, there’s actually an even better solution, which stores not only the dependencies and versions but also the installation channels, environment name, and environment location:

(conda_env) $ conda env export --file environment.yml
(conda_env) $ cat environment.yml
name: conda_env
channels:
  - defaults
dependencies:
...
prefix: /usr/local/anaconda3/envs/conda_env

Installing the packages from that environment file can be done while creating the environment:

$ conda env create --name conda_env --file environment.yml

Or they can be added to an existing environment:

(conda_env) $ conda env update --file environment.yml
Collecting package metadata (repodata.json): done
...
 

Managing dependencies

The simplest way of managing dependencies is storing them in a requirements.txt file. In its simplest form, this is a list of package names and nothing else. This file can be extended with version requirements and can even support environment-specific installations.

A fancier method of installing and managing your dependencies is by using a tool such as poetry or pipenv. Internally, these use the regular pip installation method, but they build a full dependency graph of all the packages. This makes sure that all package versions are compatible with each other and allows the parallel installation of non-dependent packages.

Using pip and a requirements.txt file

The requirements.txt format allows you to list all of the dependencies of your project as broadly or as specifically as you feel is necessary. You can easily create this file yourself, but you can also tell pip to generate it for you, or even to generate a new file based on a previous requirements.txt file so you can view the changes. I recommend using pip freeze to generate an initial file and cherry-picking the dependencies (versions) you want.

For example, assuming that we run pip freeze in our virtual environment from before:

(your_env) $ pip3 freeze
pkg-resources==0.0.0

If we store that output in a requirements.txt file, install a package, and look at the difference, we get this result:

(your_env) $ pip3 freeze > requirements.txt
(your_env) $ pip3 install progressbar2
Collecting progressbar2
...
Installing collected packages: six, python-utils, progressbar2
Successfully installed progressbar2-3.47.0 python-utils-2.3.0 six-1.13.0
(your_env) $ pip3 freeze -r requirements.txt 
pkg-resources==0.0.0
## The following requirements were added by pip freeze:
progressbar2==3.47.0
python-utils==2.3.0
six==1.13.0

As you can see, the pip freeze command automatically detected the addition of the six, progressbar2, and python-utils packages, and it immediately pinned those versions to the currently installed ones.

The lines in the requirements.txt file are understood by pip on the command line as well, so to install a specific version, you can run:

$ pip3 install 'progressbar2==3.47.0'

Version specifiers

Often, pinning a version as strictly as that is not desirable, however, so let’s change the requirements file to only contain what we actually care about:

# We want a progressbar that is at least version 3.47.0 since we've tested that.
# But newer versions are ok as well.
progressbar2>=3.47.0

If someone else wants to install all of the requirements in this file, they can simply tell pip to include that requirement:

(your_env) $ pip3 install -r requirements.txt 
Requirement already satisfied: progressbar2>=3.47.0 in your_env/lib/python3.9/site-packages (from -r requirements.txt (line 1))
Requirement already satisfied: python-utils>=2.3.0 in your_env/lib/python3.9/site-packages (from progressbar2>=3.47.0->-r requirements.txt (line 1))
Requirement already satisfied: six in your_env/lib/python3.9/site-packages (from progressbar2>=3.47.0->-r requirements.txt (line 1))

In this case, pip checks to see whether all packages are installed and will install or update them if needed.

-r requirements.txt works recursively, allowing you to include multiple requirements files.
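To illustrate how little structure a basic requirements file has, here is a simplified reader that strips comments and blank lines. It is a sketch, not a reimplementation of pip's parser: it does not follow -r includes, options, or line continuations, and the name read_requirements is hypothetical:

```python
from typing import List


def read_requirements(text: str) -> List[str]:
    """Return the requirement lines from requirements.txt content, without comments."""
    requirements = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # A line that starts with # is a comment.
        if line.startswith("#"):
            continue
        # Whitespace followed by # starts an inline comment.
        line = line.split(" #", 1)[0].strip()
        if line:
            requirements.append(line)
    return requirements


if __name__ == "__main__":
    example = """
    # We want a progressbar that is at least version 3.47.0 since we've tested that.
    # But newer versions are ok as well.
    progressbar2>=3.47.0
    """
    print(read_requirements(example))
```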

Now let’s assume we’ve encountered a bug in the latest version and we wish to skip it. We can assume that only this specific version is affected, so we will only blacklist that version:

# Progressbar 2 version 3.47.0 has a silly bug but anything from 3.46.0 onward still works with our code
progressbar2>=3.46,!=3.47.0

Lastly, we should talk about wildcards. One of the most common scenarios is needing a specific release series (3.47 in the examples below) but still wanting the latest security updates and bug fixes within it. There are a few ways to specify this:

# Basic wildcard:
progressbar2 ==3.47.*
# Compatible release:
progressbar2 ~=3.47.1
# Compatible release above is identical to:
progressbar2 >=3.47.1, ==3.47.*

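With the compatible release pattern (~=), you can select the newest version that is at least the specified version and still within the same release series. The series is the specifier with its last component dropped, so ~=3.47.1 stays within 3.47.*, whereas ~=3.47 would stay within 3.*.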

The version identification and dependency specification standard is described thoroughly in PEP 440:

https://peps.python.org/pep-0440/
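The compatible release rule can be sketched in a few lines of Python. This is a deliberately simplified illustration that only understands plain numeric versions such as 3.47.1; pre-releases, post-releases, epochs, and the other parts of PEP 440 are ignored, and is_compatible_release is a hypothetical name rather than an existing API:

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn a plain numeric version such as '3.47.1' into (3, 47, 1)."""
    return tuple(int(part) for part in version.split("."))


def _pad(parts: Tuple[int, ...], width: int) -> Tuple[int, ...]:
    """Pad with zeros so that 3.47 and 3.47.0 compare as equal."""
    return parts + (0,) * (width - len(parts))


def is_compatible_release(spec: str, candidate: str) -> bool:
    """Check whether candidate satisfies ~=spec for plain numeric versions."""
    spec_parts = parse_release(spec)
    if len(spec_parts) < 2:
        raise ValueError("~= needs at least two version components")
    candidate_parts = parse_release(candidate)
    width = max(len(spec_parts), len(candidate_parts))

    # Condition 1: the candidate is at least the specified version.
    at_least = _pad(candidate_parts, width) >= _pad(spec_parts, width)
    # Condition 2: everything except the last specified component matches.
    prefix = spec_parts[:-1]
    same_series = _pad(candidate_parts, width)[: len(prefix)] == prefix
    return at_least and same_series
```

With this sketch, is_compatible_release("3.47.1", "3.47.5") is True, while both "3.47.0" and "3.48.0" are rejected for the same specifier.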

Installing through source control repositories

Now let’s say that we’re really unlucky and there is no working release of the package yet, but it has been fixed in the develop branch of the Git repository. We can install that either through pip or through a requirements.txt file, like this:

(your_env) $ pip3 install --editable 'git+https://github.com/wolph/python-progressbar@develop#egg=progressbar2'
Obtaining progressbar2 from git+https://github.com/wolph/python-progressbar@develop#egg=progressbar2
  Updating your_env/src/progressbar2 clone (to develop)
Requirement already satisfied: python-utils>=2.3.0 in your_env/lib/python3.9/site-packages (from progressbar2)
Requirement already satisfied: six in your_env/lib/python3.9/site-packages (from progressbar2)
Installing collected packages: progressbar2
  Found existing installation: progressbar2 3.47.0
    Uninstalling progressbar2-3.47.0:
      Successfully uninstalled progressbar2-3.47.0
  Running setup.py develop for progressbar2
Successfully installed progressbar2

You may notice that pip not only installed the package but actually did a git clone to your_env/src/progressbar2. This is an optional step caused by the --editable (short option: -e) flag, which has the additional advantage that every time you re-run the command, the git clone will be updated. It also makes it rather easy to go to that directory, modify the code, and create a pull request with a fix.

In addition to Git, other source control systems such as Bazaar, Mercurial, and Subversion are also supported.

Additional dependencies using extras

Many packages offer optional dependencies for specific use cases. In the case of the progressbar2 library, I have added tests and docs extras to install the dependencies needed to run the tests or build the documentation for the package. Extras are specified within square brackets, separated by commas:

# Install the documentation and test extras in addition to the progressbar
progressbar2[docs,tests]
# A popular example is the installation of encryption libraries when using the requests library:
requests[security]

Conditional dependencies using environment markers

If your project needs to run on multiple systems, you will most likely encounter dependencies that are not required on all systems, such as libraries that are needed on some operating systems but not on others. An example of this is the portalocker package I maintain: on Linux/Unix systems, the locking mechanisms needed are supported out of the box. On Windows, however, they require the pywin32 package to work. The install_requires part of the package (which uses the same syntax as requirements.txt) contains this line:

pywin32!=226; platform_system == "Windows"

This specifies that on Windows, the pywin32 package is required, and version 226 was blacklisted due to a bug.

In addition to platform_system, there are several more markers, such as python_version and platform_machine (contains architecture x86_64, for example).

The full list of markers can be found in PEP 508, which replaced the earlier PEP 496 draft: https://peps.python.org/pep-0508/.

One other useful example of this is the dataclasses library. This library has been included with Python since version 3.7, so we only need to install the backport for older Python versions:

dataclasses; python_version < '3.7'
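To see which values a marker would be compared against on your own machine, you can compute them with the standard library. The sketch below covers only a handful of markers, using the definitions from the dependency specification (PEP 508) as I understand them; the function name marker_environment is made up for this example:

```python
import os
import platform
import sys
from typing import Dict


def marker_environment() -> Dict[str, str]:
    """Return the values of a few common environment markers for this interpreter."""
    return {
        "os_name": os.name,
        "sys_platform": sys.platform,
        "platform_system": platform.system(),
        "platform_machine": platform.machine(),
        # python_version only contains major.minor, for example '3.10'.
        "python_version": ".".join(platform.python_version_tuple()[:2]),
        "python_full_version": platform.python_version(),
    }


if __name__ == "__main__":
    for marker, value in marker_environment().items():
        print(f"{marker} = {value!r}")
```

Note that installers compare python_version as a version rather than as plain text, so '3.10' correctly counts as newer than '3.7'.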

Automatic project management using poetry

The poetry tool provides a really easy-to-use solution for creating, updating, and sharing your Python projects. It’s also very fast, which makes it a fantastic starting point for a project.

Creating a new poetry project

Starting a new project is very easy. It will automatically handle virtual environments, dependencies, and other project-related tasks for you. To start, we will use the poetry init wizard:

$ poetry init
This command will guide you through creating your pyproject.toml config.

Package name [t_00_poetry]:
Version [0.1.0]:
Description []:
Author [Rick van Hattem <Wolph@wol.ph>, n to skip]:
License []:
Compatible Python versions [^3.10]:

Would you like to define your main dependencies interactively? (yes/no) [yes] no
Would you like to define your development dependencies interact...? (yes/no) [yes] no
...
Do you confirm generation? (yes/no) [yes]

Following these few questions, it automatically creates a pyproject.toml file for us that contains all the data we entered and some automatically generated data. As you may have noticed, it automatically prefilled several values for us:

  • The project name. This is based on the current directory name.
  • The version. This is fixed to 0.1.0.
  • The author field. This looks at your git user information. This can be set using:
    $ git config --global user.name "Rick van Hattem"
    $ git config --global user.email "Wolph@wol.ph"
    
  • The Python version. This is based on the Python version you are running poetry with, but it can be customized using poetry init --python=...

Looking at the generated pyproject.toml, we can see the following:

[tool.poetry]
name = "t_00_poetry"
version = "0.1.0"
description = ""
authors = ["Rick van Hattem <Wolph@wol.ph>"]

[tool.poetry.dependencies]
python = "^3.10"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

Adding dependencies

Once we have the project up and running, we can now add dependencies:

$ poetry add progressbar2
Using version ^3.55.0 for progressbar2
...
Writing lock file
...
  • Installing progressbar2 (3.55.0)

This automatically installs the package, adds it to the pyproject.toml file, and adds the specific version to the poetry.lock file. After this command, the pyproject.toml file has a new line added to the tool.poetry.dependencies section:

[tool.poetry.dependencies]
python = "^3.10"
progressbar2 = "^3.55.0"

The poetry.lock file is a bit more specific. Whereas the progressbar2 dependency could have a wildcard version, the poetry.lock file stores the exact version, the file hashes, and all the dependencies that were installed:

[[package]]
name = "progressbar2"
version = "3.55.0"
... 
[package.dependencies]
python-utils = ">=2.3.0"
...
[package.extras]
docs = ["sphinx (>=1.7.4)"]
...
[metadata]
lock-version = "1.1"
python-versions = "^3.10"
content-hash = "c4235fba0428ce7877f5a94075e19731e5d45caa73ff2e0345e5dd269332bff0"

[metadata.files]
progressbar2 = [
    {file = "progressbar2-3.55.0-py2.py3-none-any.whl", hash = "sha256:..."},
    {file = "progressbar2-3.55.0.tar.gz", hash = "sha256:..."},
]
...

By having all this data, we can build or rebuild a virtual environment for a poetry-based project on another system exactly as it was created on the original system. To install, upgrade, and/or downgrade the packages exactly as specified in the poetry.lock file, we need a single command:

$ poetry install
Installing dependencies from lock file
...

This is very similar to how the npm and yarn commands work if you are familiar with those.

Upgrading dependencies

In the previous examples, we simply added a dependency without specifying an explicit version. Often this is a safe approach, as the default version requirement will allow for any version within that major version.

If the project uses normal Python versioning or semantic versioning (more about that in Chapter 18, Packaging - Creating Your Own Libraries or Applications), that should be perfect. At the very least, all of my projects (such as progressbar2) are generally both backward and largely forward compatible, so simply fixing the major version is enough. In this case, poetry defaulted to version ^3.55.0, which means that any version newer than or equal to 3.55.0, up to (but not including) 4.0.0, is valid.
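The caret rule can be sketched in a few lines of Python. The function names below are hypothetical, and the code is not taken from poetry itself. It only handles plain dotted numbers, and it assumes that the upper bound is found by incrementing the leftmost non-zero component, which for ^3.55.0 gives the 4.0.0 limit described above.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Convert a simple dotted version such as '3.55.0' to (3, 55, 0)."""
    return tuple(int(part) for part in version.split('.'))


def caret_bounds(version: str) -> tuple[tuple[int, ...], tuple[int, ...]]:
    """Return the inclusive lower and exclusive upper bound for ^version."""
    lower = parse_version(version)

    # Use the leftmost non-zero component, or the last one if all are zero
    index = len(lower) - 1
    for position, part in enumerate(lower):
        if part != 0:
            index = position
            break

    # Increment that component and reset everything to the right of it
    upper = (
        lower[:index]
        + (lower[index] + 1,)
        + (0,) * (len(lower) - index - 1)
    )
    return lower, upper


def caret_allows(requirement: str, candidate: str) -> bool:
    """Check whether candidate satisfies a requirement such as '^3.55.0'."""
    lower, upper = caret_bounds(requirement.lstrip('^'))
    return lower <= parse_version(candidate) < upper


print(caret_bounds('3.55.0'))
print(caret_allows('^3.55.0', '3.99.1'))
print(caret_allows('^3.55.0', '4.0.0'))
```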

Because of the poetry.lock file, however, a poetry install will install the exact versions recorded there instead of any newer versions. So how can we upgrade the dependencies? To demonstrate, we will start by installing an older version of the progressbar2 library:

$ poetry add 'progressbar2=3.1.0'

Now we will relax the version in the pyproject.toml file to ^3.1.0:

[tool.poetry.dependencies]
progressbar2 = "^3.1.0"

Once we have done this, a poetry install will still keep the 3.1.0 version, but we can make poetry update the dependencies for us:

$ poetry update
...
  • Updating progressbar2 (3.1.0 -> 3.55.0)

Now, poetry has nicely updated the dependencies in our project while still adhering to the requirements we set in the pyproject.toml file. If you set the version requirements of all packages to *, it will always update everything to the latest available versions that are compatible with each other.

Running commands

To run a single command using the poetry environment, you can use poetry run:

$ poetry run pip

For an entire development session, however, I would suggest using the shell command:

$ poetry shell

After this, you can run all Python commands as normal, but these will now be running from the activated virtual environment.

For cron jobs this is similar, but you will need to make sure that you change directories first:

0 3 * * *       cd /home/wolph/workspace/poetry_project/ && poetry run python script.py

This command runs every day at 03:00 (24-hour clock, so A.M.).

Note that cron might not be able to find the poetry command due to having a different environment. In that case, I would recommend using the absolute path to the poetry command, which can be found using which:

$ which poetry
/usr/local/bin/poetry
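If you generate your crontab entries from a script, the same lookup can be done from Python with shutil.which. The cron_command helper below is a hypothetical example that builds a command line starting with the absolute path of the executable. It raises an error when the executable cannot be found instead of guessing a location.

```python
import shlex
import shutil


def cron_command(executable: str, *arguments: str) -> str:
    """Build a command line that starts with the absolute executable path."""
    path = shutil.which(executable)
    if path is None:
        raise FileNotFoundError(f'{executable!r} was not found on the PATH')

    # shlex.join quotes arguments that contain spaces or special characters
    return shlex.join([path, *arguments])


try:
    print(cron_command('poetry', 'run', 'python', 'script.py'))
except FileNotFoundError as exception:
    # Expected on systems where poetry is not installed
    print(exception)
```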

Automatic dependency tracking using pipenv

For large projects, your dependencies can change often, which makes the manual manipulation of the requirements.txt file rather tedious. Additionally, having to create a virtual environment before you can install your packages is also a pretty repetitive task if you work on many projects. The pipenv tool aims to transparently solve these issues for you, while also making sure that all of your dependencies are compatible and updated. And as a final bonus, it combines the strict and loose dependency versions so you can make sure your production environment uses the exact same versions you tested.

Initial usage is simple; go to your project directory and install a package. Let’s give it a try:

$ pipenv install progressbar2
Creating a virtualenv for this project...
...
Using /usr/local/bin/python3 (3.10.4) to create virtualenv...
...
 Successfully created virtual environment!
...
Creating a Pipfile for this project...
Installing progressbar2...
Adding progressbar2 to Pipfile's [packages]...
 Installation Succeeded
Pipfile.lock not found, creating...
...
 Success!
Updated Pipfile.lock (996b11)!
Installing dependencies from Pipfile.lock (996b11)...
   0/0 — 00:00:0

That’s quite a bit of output even when abbreviated. But let’s look at what happened:

  • A virtual environment was created.
  • A Pipfile was created, which contains the dependency as you specified it. If you specify a specific version, that will be added to the Pipfile; otherwise, it will be a wildcard requirement, meaning that any version will be accepted as long as there are no conflicts with other packages.
  • A Pipfile.lock was created containing the exact list of packages and versions as installed. This allows an identical install on a different machine with the exact same versions.

The generated Pipfile contains the following:

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[dev-packages]

[packages]
progressbar2 = "*"

[requires]
python_version = "3.10"

And the Pipfile.lock is a bit larger, but immediately shows another advantage of this method:

{
    ...
    "default": {
        "progressbar2": {
            "hashes": [
                "sha256:14d3165a1781d053...",
                "sha256:2562ba3e554433f0..."
            ],
            "index": "pypi",
            "version": "==4.0.0"
        },
        "python-utils": {
            "hashes": [
                "sha256:4dace6420c5f50d6...",
                "sha256:93d9cdc8b8580669..."
            ],
            "markers": "python_version >= '3.7'",
            "version": "==3.1.0"
        },
        ...
    },
    "develop": {}
}

As you can see, in addition to the exact package versions, the Pipfile.lock contains the hashes of the packages as well. In this case, the package provides both a .tar.gz (source) and a .whl (wheel) file, which is why there are two hashes. Additionally, the Pipfile.lock contains all packages installed by pipenv, including all dependencies.

Using these hashes, you can be certain that during a deployment, you will receive the exact same file and not some corrupt or even malicious file.
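Conceptually, the check comes down to hashing the downloaded file and comparing the result with the values stored in the lock file. The following is a minimal sketch of that idea, not pipenv's actual implementation. The matches_lock_hashes name is hypothetical, and a small byte string stands in for a downloaded package file.

```python
import hashlib


def matches_lock_hashes(data: bytes, allowed_hashes: list[str]) -> bool:
    """Check whether the SHA-256 digest of data is one of the allowed hashes."""
    digest = 'sha256:' + hashlib.sha256(data).hexdigest()
    return digest in allowed_hashes


# Stand-in for the contents of a downloaded .whl or .tar.gz file
package_data = b'example package contents'
allowed = ['sha256:' + hashlib.sha256(package_data).hexdigest()]

print(matches_lock_hashes(package_data, allowed))
print(matches_lock_hashes(package_data + b' tampered', allowed))
```

The first call succeeds because the data is unchanged, while the second fails because even a small modification produces a completely different digest.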

Because the versions are completely fixed, you can also be certain that anyone deploying your project using the Pipfile.lock will get the exact same package versions. This is very useful when working together with other developers.
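Since the Pipfile.lock is plain JSON, the pinned versions are easy to extract with the standard library. The sketch below uses an abbreviated copy of the lock file shown earlier, with the hashes left out. The pinned_requirements function is a hypothetical helper, not a pipenv feature.

```python
import json

# Abbreviated excerpt of the Pipfile.lock shown earlier, without the hashes
LOCK_EXCERPT = '''
{
    "default": {
        "progressbar2": {"index": "pypi", "version": "==4.0.0"},
        "python-utils": {
            "markers": "python_version >= '3.7'",
            "version": "==3.1.0"
        }
    },
    "develop": {}
}
'''


def pinned_requirements(lock_text: str, section: str = 'default') -> list[str]:
    """Return 'name==version' strings for all packages in a lock section."""
    lock = json.loads(lock_text)
    return sorted(
        name + package['version']
        for name, package in lock[section].items()
    )


for requirement in pinned_requirements(LOCK_EXCERPT):
    print(requirement)
```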

To install all the necessary packages as specified in the Pipfile (even for the initial install), you can simply run:

$ pipenv install
Installing dependencies from Pipfile.lock (5c99e1)…
   3/3 — 00:00:00
To activate this project's virtualenv, run pipenv shell.
Alternatively, run a command inside the virtualenv with pipenv run.

Any time you run pipenv install package, the Pipfile will be automatically modified with your changes and checked for incompatible packages. The big downside is that pipenv can become terribly slow for large projects. I have encountered multiple projects where a no-op pipenv install would take several minutes due to the fetching and checking of the entire dependency graph. In most cases, it’s still worth it, however; the added functionality can save you a lot of headaches.

Don’t forget to run your regular Python commands with the pipenv run prefix or from pipenv shell.

Updating your packages

Because of the dependency graph, you can easily update your packages without having to worry about dependency conflicts. With one command, you’re done:

$ pipenv update

Should you still encounter issues with the versions because some packages haven’t been checked against each other, you can fix that by specifying the versions of the package you do or do not want:

$ pipenv install 'progressbar2!=3.47.0'
Installing progressbar2!=3.47.0…
Adding progressbar2 to Pipfile's [packages]…
 Installation Succeeded 
Pipfile.lock (c9327e) out of date, updating to (5c99e1)…
 Success! 
Updated Pipfile.lock (c9327e)!
Installing dependencies from Pipfile.lock (c9327e)…
   3/3 — 00:00:00

By running that command, the packages section of the Pipfile changes to:

[packages]
progressbar2 = "!=3.47.0"

Deploying to production

Getting the exact same versions on all of your production servers is absolutely essential to prevent hard-to-trace bugs. For this very purpose, you can tell pipenv to install everything as specified in the Pipfile.lock file while still checking to see whether Pipfile.lock is out of date. With one command, you have a fully functioning production virtual environment with all packages installed.

Let’s create a new directory and see if it all works out:

$ mkdir ../pipenv_production
$ cp Pipfile Pipfile.lock ../pipenv_production/
$ cd ../pipenv_production/
$ pipenv install --deploy
Creating a virtualenv for this project...
Pipfile: /home/wolph/workspace/pipenv_production/Pipfile
Using /usr/bin/python3 (3.10.4) to create virtualenv...
...
 Successfully created virtual environment!
...
Installing dependencies from Pipfile.lock (996b11)...
   2/2 — 00:00:01
$ pipenv shell
Launching subshell in virtual environment...
(pipenv_production) $ pip3 freeze
progressbar2==4.0.0
python-utils==3.1.0

All of the versions are exactly as expected and ready for use.

Running cron commands

To run your Python commands outside of the pipenv shell, you can use the pipenv run prefix. Instead of python, you would run pipenv run python. In normal usage, this is a lot less practical than activating the pipenv shell, but for non-interactive sessions, such as cron jobs, this is an essential feature. For example, a cron job that runs at 03:00 (24-hour clock, so A.M.) every day would look something like this:

0 3 * * *       cd /home/wolph/workspace/pipenv_project/ && pipenv run python script.py
 

Exercises

Many of the topics discussed in this chapter already gave full examples, leaving little room for exercises. There are additional resources to discover, however.

Reading the Python Enhancement Proposals (PEPs)

A good way to learn more about the topics discussed in this chapter (and all the following chapters) is to read the PEP pages. These proposals were written before the changes were accepted into the Python core. Note that not all of the PEPs on the Python site have been accepted, but they will remain on the Python site.

Combining pyenv and poetry or pipenv

Even though the chapter did not cover it, there is nothing stopping you from telling poetry or pipenv to use a pyenv-based Python interpreter. Give it a try!

Converting an existing project to a poetry project

Part of this exercise should be to either create a brand new pyproject.toml or to convert an existing requirements.txt file to a pyproject.toml.

 

Summary

In this chapter, you learned why virtual environments are useful and you discovered several implementations of them and their advantages. We explored how to create virtual environments and how to install multiple different Python versions. Finally, we covered how to manage the dependencies for your Python projects.

Since Python is an interpreted language, you can easily run code directly from the interpreter instead of through a Python file.

The default Python interpreter already features command history and, depending on your install, basic autocompletion.

But with alternative interpreters, we can have many more features, such as syntax highlighting and smart autocompletion that includes documentation.

The next chapter will show us several alternative interpreters and their advantages.

Join our community on Discord

Join our community’s Discord space for discussions with the author and other readers: https://discord.gg/QMzJenHuJf

About the Author
  • Rick van Hattem

    Rick van Hattem is an experienced programmer, entrepreneur, Stack Overflow veteran, and software/database architect with more than 20 years of programming experience, including 15 years with Python. He has extensive experience with high-performance architecture featuring large amounts of concurrent users and/or data. Rick has founded several start-ups and has consulted for many companies, including a few Y Combinator start-ups and several large businesses.
