Mamba provides a powerful environment management system that includes both the conda and mamba package managers. While conda is widely known for managing environments and packages in the Python ecosystem, mamba offers a faster, more efficient alternative for solving package dependencies and performing installations. With mamba, users can benefit from significant performance improvements, especially when managing large environments, while still maintaining compatibility with conda's extensive ecosystem. Both tools can be used interchangeably within the Mamba framework, offering flexibility for users depending on their needs.
Mamba Package Manager and Conda Distribution
To use Mamba, you need to load the appropriate module for the Python distribution version you want to use. The following example shows how to load the Mamba module (the $ is the prompt; do not type it):
$ module load mamba
To check which versions are available, use the module command:
$ module avail mamba
Then, load the version you need. Please see our page on Lmod for more information on loading modules.
Installing Packages with Mamba
Mamba is an optimized package manager, a drop-in replacement for conda. To create a new environment and install packages:
$ mamba create -n myenv python=3.10
After creating an environment, activate it:
$ source activate myenv
You can then install additional packages:
$ mamba install numpy scipy
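While the environment is still active, you can quickly confirm that the new packages import correctly; a minimal sanity check (assuming numpy and scipy were installed as above) might look like this:
# check_env.py - minimal sanity check for the freshly installed packages
import numpy as np
import scipy

print("numpy", np.__version__)
print("scipy", scipy.__version__)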
To deactivate the environment:
$ source deactivate
Installing Packages with Conda
To create a new environment and install packages:
$ conda create -n myenv python=3.10
After creating an environment, activate it:
$ source activate myenv
You can then install additional packages:
$ conda install numpy scipy
To deactivate the environment:
$ source deactivate
Running Python Scripts
To run Python at a prompt, simply type:
$ python
You may also run Python scripts directly from the command line. For example, if you have a script my_script.py in the current directory:
$ python ./my_script.py
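Here my_script.py simply stands in for your own program; a minimal placeholder could look like:
# my_script.py - placeholder example; replace with your own code
import sys

def main():
    print("Running Python", sys.version.split()[0])

if __name__ == "__main__":
    main()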
As with MATLAB, you can run Python scripts on a login node for smaller, less computationally intensive tasks. Ensure your program does not exceed memory or CPU limits (as a guideline, no more than 4 GB of memory and a runtime of less than 5–10 minutes). For large, long-running computations, submit the job through the batch system.
Submitting Jobs with Mamba
When running Python scripts on the cluster, especially for larger jobs, it's recommended to submit them through SLURM or another job scheduling system. Suppose you have a Python program my_script.py. You can run it in batch mode by creating a SLURM script that looks like this:
#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --output=output.txt
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
#SBATCH --mem=4G
# Load the Mamba module and activate environment
module load mamba
source activate myenv
# Run the Python script
python my_script.py
Save the script (for example, as my_job.slurm) and submit it with the sbatch command:
$ sbatch my_job.slurm
Parallel Computing with Python and Mamba
Python supports parallel computing using several libraries, such as multiprocessing, Dask, and mpi4py. These libraries allow Python code to take advantage of multiple CPUs available on a node or across nodes in a cluster.
Using Multiprocessing
For simple parallelism using multiple cores on a single machine, Python's built-in multiprocessing library can be used. For example, in your script:
import multiprocessing as mp

def worker(task):
    # Replace this with your real computation
    return task * task

if __name__ == '__main__':
    tasks = range(100)                    # example workload
    pool = mp.Pool(mp.cpu_count())        # utilize all available CPUs
    results = pool.map(worker, tasks)
    pool.close()
    pool.join()
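Note that mp.cpu_count() reports every core on the node, which can be more than SLURM has actually allocated to your job. A common pattern (a sketch, assuming the job requests --cpus-per-task) is to size the pool from the SLURM environment instead:
import multiprocessing as mp
import os

def worker(task):
    return task * task                    # replace with your real computation

if __name__ == '__main__':
    # SLURM sets SLURM_CPUS_PER_TASK when --cpus-per-task is requested;
    # fall back to a single core if the variable is absent (e.g. on a login node)
    n_cpus = int(os.environ.get('SLURM_CPUS_PER_TASK', 1))
    with mp.Pool(n_cpus) as pool:
        results = pool.map(worker, range(100))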
Distributed Computing with Dask
For distributed computing across multiple nodes or larger datasets, Dask is a powerful framework. Dask enables parallelism for many Python libraries, such as numpy, pandas, and more.
Install Dask using Mamba:
$ mamba install dask distributed
You can submit Dask jobs via SLURM. Here’s a basic SLURM submission script for a Dask job:
#!/bin/bash
#SBATCH --job-name=dask_job
#SBATCH --output=dask_output.txt
#SBATCH --ntasks=4
#SBATCH --time=02:00:00
#SBATCH --mem=16G
module load mamba
source activate myenv
# Launch the Dask scheduler and workers in the background
dask-scheduler --scheduler-file scheduler.json &
sleep 5
dask-worker --scheduler-file scheduler.json --nthreads 4 &
dask-worker --scheduler-file scheduler.json --nthreads 4 &

# Keep the job alive so the scheduler and workers stay up for client connections
wait
You can then connect to this Dask cluster from your Python code using:
from dask.distributed import Client
client = Client(scheduler_file='scheduler.json')
# Submit tasks to the cluster
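As a sketch of what submitting work looks like once the client is connected (the square function here is only an illustration):
from dask.distributed import Client

client = Client(scheduler_file='scheduler.json')

def square(x):
    return x ** 2                         # placeholder computation

# distribute the work across the workers and gather the results
futures = client.map(square, range(100))
results = client.gather(futures)
print(sum(results))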
Using mpi4py for MPI
For more advanced distributed computing across multiple nodes, you can use mpi4py, a Python binding for MPI. Install it via Mamba:
$ mamba install mpi4py
In your Python script, use MPI as follows:
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
print(f'Hello from rank {rank} out of {size}')
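Beyond this hello-world, a brief sketch of an actual collective operation, summing a per-rank value on rank 0 with comm.reduce, could look like:
# mpi_sum.py - each rank contributes a partial value; rank 0 prints the total
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

partial = rank + 1                        # placeholder per-rank work
total = comm.reduce(partial, op=MPI.SUM, root=0)

if rank == 0:
    print(f'Sum over {size} ranks: {total}')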
A SLURM script for running an MPI Python program:
#!/bin/bash
#SBATCH --job-name=mpi_python
#SBATCH --ntasks=4
#SBATCH --time=01:00:00
#SBATCH --mem=8G
module load mamba
source activate myenv
# Run the MPI program
mpirun python mpi_script.py
Save the script (for example, as mpi_job.slurm) and submit it with:
$ sbatch mpi_job.slurm
Cleaning Up
After your parallel job has completed, ensure all resources are released. For instance, when using Dask, shut down your workers and scheduler at the end of your job: calling client.shutdown() from your Python script stops the scheduler and workers, while client.close() only disconnects your client.
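For the Dask setup above, for example, an orderly shutdown at the end of the client script might look like the sketch below; client.shutdown() stops the scheduler and workers, which also lets the wait in the job script return:
from dask.distributed import Client

client = Client(scheduler_file='scheduler.json')
try:
    pass                                  # ... submit and gather your work here ...
finally:
    client.shutdown()                     # stop the scheduler and workers started by the job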
Pre-installed packages
Name | Version | Build | Channel |
---|---|---|---|
_libgcc_mutex | 0.1 | conda_forge | conda-forge |
_openmp_mutex | 4.5 | 2_kmp_llvm | conda-forge |
anyio | 4.4.0 | pyhd8ed1ab_0 | conda-forge |
archspec | 0.2.3 | pyhd8ed1ab_0 | conda-forge |
argon2-cffi | 23.1.0 | pyhd8ed1ab_0 | conda-forge |
argon2-cffi-bindings | 21.2.0 | py312h98912ed_4 | conda-forge |
arrow | 1.3.0 | pyhd8ed1ab_0 | conda-forge |
asttokens | 2.4.1 | pyhd8ed1ab_0 | conda-forge |
async-lru | 2.0.4 | pyhd8ed1ab_0 | conda-forge |
attrs | 24.2.0 | pyh71513ae_0 | conda-forge |
babel | 2.14.0 | pyhd8ed1ab_0 | conda-forge |
beautifulsoup4 | 4.12.3 | pyha770c72_0 | conda-forge |
blas | 2.116 | mkl | conda-forge |
blas-devel | 3.9.0 | 16_linux64_mkl | conda-forge |
bleach | 6.1.0 | pyhd8ed1ab_0 | conda-forge |
boltons | 24.0.0 | pyhd8ed1ab_0 | conda-forge |
brotli-python | 1.1.0 | py312h30efb56_1 | conda-forge |
bzip2 | 1.0.8 | hd590300_5 | conda-forge |
c-ares | 1.28.1 | hd590300_0 | conda-forge |
ca-certificates | 2024.7.4 | hbcca054_0 | conda-forge |
cached-property | 1.5.2 | hd8ed1ab_1 | conda-forge |
cached_property | 1.5.2 | pyha770c72_1 | conda-forge |
certifi | 2024.7.4 | pyhd8ed1ab_0 | conda-forge |
cffi | 1.17.0 | py312h1671c18_0 | conda-forge |
charset-normalizer | 3.3.2 | pyhd8ed1ab_0 | conda-forge |
colorama | 0.4.6 | pyhd8ed1ab_0 | conda-forge |
comm | 0.2.2 | pyhd8ed1ab_0 | conda-forge |
conda | 24.7.1 | py312h7900ff3_0 | conda-forge |
conda-libmamba-solver | 24.1.0 | pyhd8ed1ab_0 | conda-forge |
conda-package-handling | 2.2.0 | pyh38be061_0 | conda-forge |
conda-package-streaming | 0.9.0 | pyhd8ed1ab_0 | conda-forge |
cuda-cudart | 11.8.89 | 0 | nvidia |
cuda-cupti | 11.8.87 | 0 | nvidia |
cuda-libraries | 11.8.0 | 0 | nvidia |
cuda-nvrtc | 11.8.89 | 0 | nvidia |
cuda-nvtx | 11.8.86 | 0 | nvidia |
cuda-runtime | 11.8.0 | 0 | nvidia |
cuda-version | 12.6 | 3 | nvidia |
debugpy | 1.8.5 | py312hca68cad_0 | conda-forge |
decorator | 5.1.1 | pyhd8ed1ab_0 | conda-forge |
defusedxml | 0.7.1 | pyhd8ed1ab_0 | conda-forge |
distro | 1.9.0 | pyhd8ed1ab_0 | conda-forge |
entrypoints | 0.4 | pyhd8ed1ab_0 | conda-forge |
et_xmlfile | 1.1.0 | pyhd8ed1ab_0 | conda-forge |
fastapi | 0.106.2 | pyhd8ed1ab_0 | conda-forge |
flask | 2.5.1 | pyhd8ed1ab_0 | conda-forge |
fonttools | 4.39.0 | pyhd8ed1ab_0 | conda-forge |
freetype | 2.13.2 | hd8ed1ab_1 | conda-forge |
frozenlist | 1.6.3 | pyhd8ed1ab_0 | conda-forge |
h5netcdf | 1.1.0 | pyhd8ed1ab_0 | conda-forge |
h5py | 3.8.0 | pyhd8ed1ab_0 | conda-forge |
hdf5 | 1.14.1 | hb85507c_0 | conda-forge |
httpcore | 0.17.0 | pyhd8ed1ab_0 | conda-forge |
httpx | 0.24.1 | pyhd8ed1ab_0 | conda-forge |
hyperlink | 21.0.0 | pyhd8ed1ab_0 | conda-forge |
idna | 3.4 | pyhd8ed1ab_0 | conda-forge |
importlib-metadata | 6.7.0 | pyhd8ed1ab_0 | conda-forge |
importlib_resources | 6.0.0 | pyhd8ed1ab_0 | conda-forge |
iniconfig | 2.0.0 | pyhd8ed1ab_0 | conda-forge |
intervaltree | 3.1.0 | pyhd8ed1ab_0 | conda-forge |
ipywidgets | 8.0.5 | pyhd8ed1ab_0 | conda-forge |
jupyter-client | 8.3.2 | pyhd8ed1ab_0 | conda-forge |
jupyter-server | 2.24.0 | pyhd8ed1ab_0 | conda-forge |
jupyter-server-terminals | 0.4.3 | pyhd8ed1ab_0 | conda-forge |
kiwisolver | 1.4.4 | pyhd8ed1ab_0 | conda-forge |
ldap3 | 2.9.1 | pyhd8ed1ab_0 | conda-forge |
lxml | 4.9.3 | pyhd8ed1ab_0 | conda-forge |
markdown-it-py | 2.2.0 | pyhd8ed1ab_0 | conda-forge |
markupsafe | 2.1.3 | pyhd8ed1ab_0 | conda-forge |
matplotlib | 3.8.0 | pyhd8ed1ab_0 | conda-forge |
matplotlib-inline | 0.1.6 | pyhd8ed1ab_0 | conda-forge |
mistune | 2.0.5 | pyhd8ed1ab_0 | conda-forge |
mkl | 2023.2 | 0 | conda-forge |
mkl-service | 2.4.0 | pyh5e1b64e_0 | conda-forge |
mkl_fft | 1.3.0 | pyha5e1b64e_0 | conda-forge |
mpi4py | 3.1.4 | pyhd8ed1ab_0 | conda-forge |
multidict | 6.0.4 | pyhd8ed1ab_0 | conda-forge |
networkx | 3.1 | pyhd8ed1ab_0 | conda-forge |
notebook | 7.0.0 | pyhd8ed1ab_0 | conda-forge |
numpy | 1.25.2 | py312hefef724_0 | conda-forge |
numpy-base | 1.25.2 | py312h11e9ef6_0 | conda-forge |
numpy-quaternion | 2023.6.27 | pyhd8ed1ab_0 | conda-forge |
oauthlib | 3.2.2 | pyhd8ed1ab_0 | conda-forge |
openpyxl | 3.1.1 | pyhd8ed1ab_0 | conda-forge |
pandas | 2.1.2 | pyhd8ed1ab_0 | conda-forge |
pandas-datareader | 0.10.0 | pyhd8ed1ab_0 | conda-forge |
pandas-stubs | 1.2.0.3 | pyhd8ed1ab_0 | conda-forge |
parso | 0.8.4 | pyhd8ed1ab_0 | conda-forge |
pathspec | 0.10.3 | pyhd8ed1ab_0 | conda-forge |
pdfminer | 20221105 | pyhd8ed1ab_0 | conda-forge |
pdfminer.six | 20221105 | pyhd8ed1ab_0 | conda-forge |
pluggy | 1.2.0 | pyhd8ed1ab_0 | conda-forge |
ply | 3.11 | pyhd8ed1ab_0 | conda-forge |
prometheus-client | 0.16.0 | pyhd8ed1ab_0 | conda-forge |
prompt-toolkit | 3.0.40 | pyhd8ed1ab_0 | conda-forge |
protobuf | 4.24.2 | pyhd8ed1ab_0 | conda-forge |
psutil | 5.9.5 | pyhd8ed1ab_0 | conda-forge |
pycparser | 2.21 | pyhd8ed1ab_0 | conda-forge |
pydantic | 1.10.8 | pyhd8ed1ab_0 | conda-forge |
pydub | 0.25.1 | pyhd8ed1ab_0 | conda-forge |
pyopenssl | 24.0.0 | pyhd8ed1ab_0 | conda-forge |
pyparsing | 3.0.9 | pyhd8ed1ab_0 | conda-forge |
pytest | 7.8.0 | pyhd8ed1ab_0 | conda-forge |
python | 3.12.0 | h0a3f5b1_0 | conda-forge |
python_abi | 3.12 | 1_cp312 | conda-forge |
pytz | 2024.1 | pyhd8ed1ab_0 | conda-forge |
pyyaml | 6.0 | pyhd8ed1ab_0 | conda-forge |
requests | 2.31.0 | pyhd8ed1ab_0 | conda-forge |
scikit-learn | 1.3.2 | pyhd8ed1ab_0 | conda-forge |
scipy | 1.11.3 | pyhd8ed1ab_0 | conda-forge |
setuptools | 67.7.2 | pyhd8ed1ab_0 | conda-forge |
six | 1.16.0 | pyhd8ed1ab_0 | conda-forge |
sphinx | 6.4.0 | pyhd8ed1ab_0 | conda-forge |
sphinxcontrib-htmlhelp | 3.0.0 | pyhd8ed1ab_0 | conda-forge |
sqlalchemy | 2.0.15 | pyhd8ed1ab_0 | conda-forge |
sqlite | 3.43.0 | hca2bb3b_0 | conda-forge |
tensorboard | 2.15.0 | pyhd8ed1ab_0 | conda-forge |
tensorflow | 2.15.0 | pyhd8ed1ab_0 | conda-forge |
tensorflow-cpu | 2.15.0 | pyhd8ed1ab_0 | conda-forge |
tensorboard-data-server | 0.8.1 | pyhd8ed1ab_0 | conda-forge |
tifffile | 2024.8.3 | pyhd8ed1ab_0 | conda-forge |
toml | 0.10.2 | pyhd8ed1ab_0 | conda-forge |
tornado | 6.3.2 | pyhd8ed1ab_0 | conda-forge |
typing-extensions | 4.8.0 | pyhd8ed1ab_0 | conda-forge |
urllib3 | 1.26.14 | pyhd8ed1ab_0 | conda-forge |
watchdog | 3.0.0 | pyhd8ed1ab_0 | conda-forge |
webencodings | 0.5.1 | pyhd8ed1ab_0 | conda-forge |
werkzeug | 2.5.0 | pyhd8ed1ab_0 | conda-forge |
wheel | 0.40.0 | pyhd8ed1ab_0 | conda-forge |
xarray | 2024.7.0 | pyhd8ed1ab_0 | conda-forge |
yaml | 0.2.5 | he6c8c70_0 | conda-forge |
zlib | 1.2.13 | h7c83c4a_0 | conda-forge |
To view the list of pre-installed packages for a particular version, execute the corresponding command from an HPC terminal:
$ module load mamba/py3.11 && conda list
$ module load mamba/py3.10 && conda list