Installing libraries and packages

A Python package is a set of Python modules, while a Python library is a group of Python functions intended to carry out specific tasks. There are over 137,000 Python libraries and over 235,000 Python packages. These libraries and packages can ease a developer’s work and avoid the need to reinvent the wheel, as the saying goes. From here on, the word package(s) refers to both packages and libraries. While Python does distinguish between the two, the words are frequently used interchangeably, and the process for installing them is generally the same.

There are several ways that Python packages can be installed and managed. ARC support staff can install packages that are made available to anyone who loads the appropriate module, while individuals can install packages for their own use. There are two main routes for users to install packages: directly into the user’s personal Python library, or into a virtual environment. With the first route, the pip command places all packages (for a given version of Python) in the same directory. However, this approach can result in conflicting package version requirements.

Sometimes one application needs a particular version of a package but a different application needs another version. Since the requirements conflict, installing either version will leave one application unable to run. This situation can be resolved by using virtual environments. A virtual environment is a semi-isolated Python environment that allows packages to be installed for use by a particular application or for a particular project.
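
As a rough illustration of what "semi-isolated" means in practice, the sketch below checks from inside Python whether the running interpreter belongs to a virtual environment. The function name in_virtual_environment is made up for this example. The check relies on sys.prefix and sys.base_prefix, which differ inside environments created by venv; it is not expected to detect conda environments, which are organized differently.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv (and recent virtualenv releases) point sys.prefix at the environment,
    # while sys.base_prefix keeps pointing at the Python it was created from.
    if sys.prefix != sys.base_prefix:
        return True
    # Older virtualenv releases set sys.real_prefix instead.
    return hasattr(sys, "real_prefix")


print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)
print("virtual environment active:", in_virtual_environment())
```

Running this once before and once after activating an environment should show the prefix changing to the environment’s folder.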

Common tools used for Python package installation and environment management include: 

  • pip – the Python package installer, which can be used on its own or within a virtual environment
  • venv – an environment manager within which pip is used to install packages
  • virtualenv – an environment manager within which pip is used to install packages
  • conda – an environment manager and package installer that does not rely on pip to install packages

You should pick one method and use it exclusively to avoid mixing types of installations.

Python packages are installed for the specific version of Python that is in use during installation. If you switch from a module for one version of Python to a module for another, where either the major or minor version changes, you will have to reinstall any packages to make them available in the library of the new version of Python. You need to install a Python package only once for each cluster on which you wish to use it and, separately, for each version of Python that you use. Please note that Python packages should be installed using the command line from a login node, not from within the Jupyter Notebook or JupyterLab apps.

A brief description of how to use each tool to install packages and manage virtual environments, along with a description of where you can expect to find installed packages, is provided below.

pip

To install a Python package into your personal library using pip, enter the following command, replacing <package_name> with the actual package name:

$ pip install --user <package_name>

The --user option will, by default, place packages in

$HOME/.local/lib/python?.?/site-packages

where ?.? stands for the major and minor version of the Python release in use. The package will then be available to you in this and future sessions.
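
To see how the ?.? placeholder is filled in, the following sketch builds the path described above for two Python versions and then prints the location the running interpreter actually uses. The helper expected_user_site and the user "alice" are hypothetical, and the path pattern is the Linux default quoted above; site.getusersitepackages() is the standard-library call that reports the real location.

```python
import site
import sys


def expected_user_site(home, major, minor):
    """Build the default Linux --user location for one Python version."""
    return f"{home}/.local/lib/python{major}.{minor}/site-packages"


# What the pattern above gives for a hypothetical user and two Python versions.
print(expected_user_site("/home/alice", 3, 8))
print(expected_user_site("/home/alice", 3, 9))

# What the running interpreter actually uses.
print("this interpreter:", site.getusersitepackages())
print("Python version:  ", f"{sys.version_info.major}.{sys.version_info.minor}")
```

Because the version number is part of the path, a package installed under one minor version is not visible to another.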

You can install a specific version of a package by giving the package name followed by == and the version number:

$ pip install --user tensorflow==2.3.2
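
After installing a pinned version, one way to confirm which version Python actually sees is to query the package metadata. This is a minimal sketch, assuming Python 3.8 or newer (where importlib.metadata is part of the standard library); the helper name installed_version is made up for this example.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# "tensorflow" mirrors the example above; any installed distribution name works.
for name in ("tensorflow", "pip"):
    print(name, "->", installed_version(name) or "not installed")
```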

venv

venv is the standard Python tool for creating virtual environments and has been part of Python since version 3.3. Starting with Python 3.4, it installs pip into every virtual environment it creates by default. Packages are installed into an active venv with the pip command described above, but without the --user option, which pip rejects inside a virtual environment by default.

Virtual environments are created as follows:

$ python -m venv /path/to/new/virtual/environment

Alternatively, you can change into the directory of the project you are working on and simply provide a name for the virtual environment in place of the full path:

$ cd /path/to/my/project
$ python -m venv myenv
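
The same step can also be carried out from Python through the standard-library venv module, which can help when environment creation needs to be scripted. The sketch below is only an illustration: create_environment is a hypothetical helper, and it passes with_pip=False to stay fast and offline, whereas the command above installs pip by default.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=False):
    """Create a virtual environment at env_dir and return its path."""
    # with_pip=False keeps the sketch fast and offline-friendly;
    # pass with_pip=True to match what "python -m venv" does by default.
    venv.create(env_dir, with_pip=with_pip)
    return Path(env_dir)


# Build a throwaway environment in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    env = create_environment(Path(tmp) / "myenv")
    print("created:", env.name)
    print("has pyvenv.cfg:", (env / "pyvenv.cfg").is_file())
```

The pyvenv.cfg file is what marks the folder as a virtual environment and records which Python it was created from.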

To activate a virtual environment, type:

$ source myenv/bin/activate

If you are not in your project directory, you must provide the full path to the virtual environment:

$ source /path/to/my/project/myenv/bin/activate

When you are in an active virtual environment, you will see the name of the environment in the prompt. The PATH environment variable is updated so that the virtual environment’s bin directory is at the beginning:

(myenv) $ which pip python
/path/to/my/project/myenv/bin/pip
/path/to/my/project/myenv/bin/python
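
The effect of activation on PATH can also be inspected from Python. In this sketch, first_path_entry is a hypothetical helper and the example PATH value is made up to resemble the situation above; shutil.which performs the same lookup that the shell’s which command does.

```python
import os
import shutil


def first_path_entry(path_value):
    """Return the first directory listed in a PATH-style string."""
    return path_value.split(os.pathsep)[0]


# A made-up PATH as it might look after activating a hypothetical environment.
example_path = os.pathsep.join(["/path/to/my/project/myenv/bin", "/usr/bin", "/bin"])
print(first_path_entry(example_path))

# For the current session: which python and pip would the shell run?
print("python:", shutil.which("python"))
print("pip:   ", shutil.which("pip"))
```

Since the shell searches PATH from left to right, whichever directory comes first determines which python and pip are used.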

At this point, you would use the pip command to install any needed packages. Packages that you install using pip while in a virtual environment will be placed in the myenv folder, isolated from the global Python installation, and only available to you from within the virtual environment.

You can deactivate a virtual environment by typing deactivate in your terminal.

virtualenv

virtualenv is a third-party alternative (and predecessor) to venv. It comes installed with the Developer Python modules but not with the Anaconda modules. If you would like to use virtualenv with one of the Anaconda modules, you will need to install it yourself.

Install virtualenv with pip:

$ pip install --user virtualenv

Create a virtual environment for a project:

$ cd project_folder
$ virtualenv <env_name>

As with venv, running virtualenv myenv creates a folder named myenv in the current directory that contains the Python executable files and a copy of pip, which you can use to install other packages. To begin using the virtual environment, activate it:

$ source myenv/bin/activate

The name of the current virtual environment will now appear in parentheses to the left of the prompt to let you know that it’s active.

When done working in the virtual environment, simply deactivate it:

(myenv) $ deactivate

conda

Conda is both a package installer, like pip, and an environment manager, like venv and virtualenv. While pip, venv, and virtualenv are for Python, conda is language agnostic and works with other languages as well as Python.

To create a virtual environment for Python with conda, enter the following:

$ conda create --name conda-env python

where conda-env can be replaced with whatever name you choose for your virtual environment. Also, -n can be used in place of --name.

If you do not give a version number, conda selects the Python version for you. To request a specific version of Python, give the version number when creating your virtual environment as follows:

$ conda create -n conda-env python=3.7

You can install additional packages when creating an environment by listing them after the environment name. You can also specify which versions of those packages to install.

$ conda create -n conda-env python=3.7 numpy=1.16.1 requests=2.19.1

It’s recommended to install all packages that you want to include in an environment at the same time in order to avoid dependency conflicts.

You can then activate your conda environment as follows:

$ conda activate conda-env
(conda-env) $

As with the other virtual environment tools, the name of the current virtual environment will now appear in parentheses to the left of the prompt to let you know that it’s active. When you are finished working in the environment, enter conda deactivate and your normal prompt will return.
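
Besides the prompt, an activated conda environment can be recognized from within Python through environment variables. The sketch below reads CONDA_DEFAULT_ENV and CONDA_PREFIX, which conda sets on activation; active_conda_env is a hypothetical helper written for this example.

```python
import os


def active_conda_env(environ=os.environ):
    """Return (name, prefix) of the active conda environment, or (None, None)."""
    return environ.get("CONDA_DEFAULT_ENV"), environ.get("CONDA_PREFIX")


name, prefix = active_conda_env()
print("conda environment:", name or "none active")
print("location:", prefix or "n/a")
```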

Virtual environments created with conda reside, by default, in the envs directory found in the following path: /home/$USER/.conda/envs
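
To check which environments exist in that location, the folder can simply be listed, since each environment is one subdirectory. This is a minimal sketch assuming the default location given above; list_conda_envs is a made-up helper, and conda’s own conda env list command remains the authoritative way to see every environment conda knows about.

```python
from pathlib import Path


def list_conda_envs(envs_dir):
    """Return the sorted names of the subdirectories of envs_dir (one per environment)."""
    envs_dir = Path(envs_dir)
    if not envs_dir.is_dir():
        # Nothing has been created there yet, or the location differs on this system.
        return []
    return sorted(p.name for p in envs_dir.iterdir() if p.is_dir())


default_dir = Path.home() / ".conda" / "envs"
print(default_dir, "->", list_conda_envs(default_dir))
```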