LMOD software configuration manager - How the cluster delivers software

Several versions of a software package may need to be provided on the same system. Some software depends on other software; that is, it will only run if some other software is also available. Environment variables often need to be set to configure software properly.

Lmod is a software package that manages which other software packages are available and configures them properly. The Lmod package provides the module command, which is what you will use to access most software installed on the cluster.

Using Lmod

The module command provides an easy mechanism for changing your environment as you add or remove software packages. A module is a collection of environment variable settings that can be loaded or unloaded. When you first log in, a set of modules is loaded by default. To see which modules are currently loaded, you can use the command

$ module list

Loading a software package’s module will set up the environment variables needed for that package to run. Unloading a software package’s module removes the environment variables that were added for the package.

To see what modules are available, or to see which versions of a particular package (Matlab in the example) are available, you can use (respectively) the commands

$ module available
[ . . . deleted . . . ]
$ module av matlab
----------------- /usr/flux/software/rhel6/Modules/modulefiles -----------------
matlab/2010b          matlab/2012b          matlab/2014a
matlab/2012a(default) matlab/2013a

where available can be abbreviated to av. As you can see, there are several versions of Matlab available. If you use the command

$ module load matlab

the version labeled (default) will be loaded. If there is no default version set, then the one with the highest version number will be loaded. To load a version other than the default, you specify the version as it is displayed by the module av command; for example,

$ module load matlab/2014a
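
The "highest version number" rule that applies when no default is set can be illustrated with a short shell sketch. It uses sort -V (version sort, as provided by GNU coreutils) on the Matlab versions listed above. This is only a stand-in for illustration; Lmod's own ordering rules may differ in detail.

```shell
# Illustration only: pick the highest version from a list of module names.
# The names are copied from the "module av matlab" output above.
versions="matlab/2010b matlab/2012b matlab/2014a matlab/2012a matlab/2013a"

# Put one name per line, version-sort them, and keep the last (highest) one.
highest=$(printf '%s\n' $versions | sort -V | tail -n 1)
echo "$highest"
```

In the listing above a default is set (matlab/2012a), so that version is the one loaded by a plain module load matlab; the highest-version rule would only apply if no default were marked.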

To unload a module, you would use

$ module unload matlab

where you do not need to specify the version of the package’s module that you are unloading (rm is a synonym for unload).

To see what a module will set in your environment, say for the software package R, you would use

$ module show R
-------------------------------------------------------------------
/home/software/rhel6/Modules/modulefiles/R/2.15.1:

conflict	 R 
prepend-path	 PATH /home/software/rhel6/R/2.15.1-gcc/bin 
prepend-path	 LD_LIBRARY_PATH /home/software/rhel6/R/2.15.1-gcc/lib64/R/lib 
module-whatis	 R is a software environment for statistical computing. 
module-whatis	 Flux Documentation: https://arc-ts.umich.edu/software/R/
module-whatis	 Vendor Website: http://www.r-project.org/ 
module-whatis	 Manual: http://cran.r-project.org/manuals.html 
-------------------------------------------------------------------

The output shows you the location of the R module file, any conflicts (in this case, only other versions of R) and what will be added to the beginning of your PATH and LD_LIBRARY_PATH (loader library path) environment variables. It will also show references to online documentation for the package, if they are available.
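
To make the effect of prepend-path concrete, here is a minimal shell sketch. It works on scratch variables (DEMO_PATH and DEMO_LD_LIBRARY_PATH, names made up for this example) rather than your real environment, and the helper prepend_to is hypothetical, not something Lmod provides.

```shell
# Illustration only: mimic what "prepend-path" does, using scratch
# variables instead of the real PATH and LD_LIBRARY_PATH.
prepend_to() {
    # $1 = current value of the variable, $2 = directory to put in front
    if [ -z "$1" ]; then
        printf '%s\n' "$2"        # variable was empty: no colon needed
    else
        printf '%s\n' "$2:$1"     # otherwise the new directory goes first
    fi
}

DEMO_PATH="/usr/bin:/bin"
DEMO_PATH=$(prepend_to "$DEMO_PATH" "/home/software/rhel6/R/2.15.1-gcc/bin")
echo "$DEMO_PATH"

DEMO_LD_LIBRARY_PATH=""
DEMO_LD_LIBRARY_PATH=$(prepend_to "$DEMO_LD_LIBRARY_PATH" "/home/software/rhel6/R/2.15.1-gcc/lib64/R/lib")
echo "$DEMO_LD_LIBRARY_PATH"
```

Because the shell searches PATH from left to right, a directory added at the beginning takes precedence over any later directory that contains a program of the same name.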

Module groups

Currently, modules are provided by the administrative unit that installs and maintains the software, and each unit has a module repository. The default repository is that provided by the CAEN HPC unit. Other units are LSA, the Medical School, the School of Public Health, and the Institute for Social Research. To see or use the software provided by those units, you must first load the unit module, which adds the unit’s collection of modules to the list.

For example, the following transcript shows loading the lsa module, querying the available versions of the package fsl (note that they come from two locations), and then loading fsl.

$ module load lsa
$ module av fsl

------------------- /home/software/rhel6/Modules/modulefiles -------------------
fsl/5.0.2.2 fsl/5.0.6

----------------- /home/software/rhel6/lsa/Modules/modulefiles -----------------
fsl/4.1.9

$ module load fsl
$ module list
Currently Loaded Modulefiles:
  1) moab        3) modules     5) git/1.8.1   7) fsl/5.0.6
  2) torque      4) use.own     6) lsa

The most recent version, fsl/5.0.6, was loaded; it comes from the first listed module repository.

Loading software

When you know the name of the software module you wish to load, say, stata, then you load it with the following command (the $ is the prompt; do not type it).

$ module load stata

Loading software will add any needed configuration to your environment so the software can be used. Software comes in different versions, and versions are added to the module name with a slash, as in this example which loads version 15 of the Stata module.

$ module load stata/15

A specific version of a module is always loaded, and you can see which modules/versions are loaded with

$ module list
Currently Loaded Modules:
  1) stata/15

Finding software

To load software, you must know the name of the module. The following sections illustrate ways to find software for which modules exist. The illustrations proceed from the most general to the most specific.

Each software module contains the module name, which is often the name of the package, a description of the software package, and a list of categories into which the installer of the software thinks the software falls. The search commands search different combinations of these.

Searching for software by keyword

The most general way to search for software is a keyword search, which you might use when you don’t know the exact software name but do have some words that describe its general use. A keyword search looks for the given term in the name, description, and categories of all modules. The following example searches for statistics in any of the descriptive fields mentioned above. The output shows that three software modules contain statistics in their description, category, or name: R, sas, and stata.

$ module keyword statistics
----------------------------------------------------------------------------
The following modules match your search criteria: "statistics"
----------------------------------------------------------------------------
  R: R/3.3.0, R/3.3.2, R/3.3.3, R/3.4.1, R/3.4.2
    R environment for statistical computing and graphics
  sas: sas/9.4-TS1M4, sas/9.4
    SAS statistical system software.
  stata: stata/14, stata/15
    Statistical analysis software.
----------------------------------------------------------------------------
To learn more about a package execute:
   $ module spider Foo
where "Foo" is the name of a module.

To find detailed information about a particular package you
must specify the version if there is more than one version:
   $ module spider Foo/11.1
----------------------------------------------------------------------------

This search gives you the names of the matching modules, along with the command to use to find more information about them.

Searching by name

To search for all installed software by module name, regardless of whether it is currently available to load, you should use the spider subcommand. The following example shows the output from searching for fftw, which shows that version 3.3.7 is installed and that the module fftw-mpi is a ‘near miss’.

$ module spider fftw
----------------------------------------------------------------------------
  fftw:
----------------------------------------------------------------------------
    Description:
      Libraries for computation of discrete Fourier transform.

     Versions:
        fftw/3.3.7
     Other possible modules matches:
        fftw-mpi
----------------------------------------------------------------------------
  To find other possible module matches execute:
      $ module -r spider '.*fftw.*'
----------------------------------------------------------------------------
  For detailed information about a specific "fftw" module (including how
  to load the modules) use the module's full name.
  For example:
     $ module spider fftw/3.3.7
----------------------------------------------------------------------------

Note that with the -r option the search string is a regular expression, so .* (not a plain * character) is what matches zero or more characters.
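
The difference between the two wildcard styles can be seen in a short shell sketch. It uses grep -E as a stand-in for regular-expression matching and a shell case pattern for glob matching; the exact pattern rules that Lmod applies may differ in detail, so treat this only as an illustration of the .* versus * distinction.

```shell
# Illustration only: compare a regular expression with a shell-style glob,
# using module names that appear in the listings on this page.
names="fftw
fftw-mpi
hdf5"

# Regular expression: ".*" means "any character, zero or more times".
regex_hits=$(printf '%s\n' "$names" | grep -E '.*fftw.*')
echo "$regex_hits"

# Shell glob: a plain "*" plays that role in a case pattern.
glob_hits=""
for n in $names; do
    case "$n" in
        *fftw*) glob_hits="$glob_hits $n" ;;
    esac
done
echo "glob matches:$glob_hits"
```

Both searches select fftw and fftw-mpi and skip hdf5; only the way the wildcard is written changes.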

The last line of the spider output shows you the command to run to find out how you can load fftw. Running that command shows the following output.

$ module spider fftw/3.3.7
----------------------------------------------------------------------------
  fftw: fftw/3.3.7
----------------------------------------------------------------------------
    Description:
      Libraries for computation of discrete Fourier transform.

    You will need to load all module(s) on any one of the lines below before the "fftw/3.3.7" module is available to load.

      gcc/4.8.5
      intel/18.0.0

From that, you will see that you must load one of the modules gcc/4.8.5 or intel/18.0.0 first, after which the fftw software that matches the loaded compiler will then be available to load.

Available software

Software that is compiled and built here, as opposed to software that is purchased from a company, often depends on other software to run. For example, some software requires that the compiler module with which it was built be loaded for it to run properly. Software that requires a specific compiler is said to belong to a compiler family, and its module(s) will only be available to load once the compiler is loaded.

Here we show an abbreviated list of what is available with no compiler loaded.

$ module available
------------------ Core applications including compilers -------------------
   R/default (D)    gcc/4.8.5         matlab/R2017b        stata/14
   R/3.4.1          intel/18.0.0      sas/9.4-TS1M4 (D)    stata/15 (D)
   R/3.4.2          launcher/3.1.1    sas/9.4

Any module listed above is available to be loaded without loading anything first. Software compiled with, say, version 4.8.5 of the GCC compiler is not listed. Loading the gcc/4.8.5 module will make additional software available, as shown below.

$ module load gcc/4.8.5
$ module available
------------------- Applications compiled with GCC 4.8.5 -------------------
   fftw/3.3.7     impi/2018.0.128    ompi/1.10.7        szip/2.1.1
   hdf5/1.8.20    netcdf/4.6.0       ompi/3.0.0  (D)

------------------ Core applications including compilers -------------------
   R/default (D)    gcc/4.8.5      (L)    matlab/R2017b        stata/14
   R/3.4.1          intel/18.0.0          sas/9.4-TS1M4 (D)    stata/15 (D)
   R/3.4.2          launcher/3.1.1        sas/9.4

Prior to loading the gcc/4.8.5 module (or some other compiler for which fftw has been compiled), the fftw module was not considered available for loading.

You can name a prerequisite on the same load command as the package you want, listing the prerequisite first, as in

$ module load gcc/4.8.5 fftw

If you do not specify the version number for a module, as is the case for the fftw module in the last example, the default version (or the numerically largest) will be loaded.
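
In a script, it can help to load a prerequisite chain one module at a time and stop as soon as something fails. The following is a minimal sketch under the assumption that module load returns a non-zero exit status when a load fails, which may depend on the Lmod version installed; load_in_order is a made-up helper name and is not part of Lmod.

```shell
# Illustration only: load modules in the order given and stop at the
# first failure. Assumes "module load" reports failure through its
# exit status, which may vary between Lmod versions.
load_in_order() {
    for m in "$@"; do
        if ! module load "$m"; then
            echo "could not load $m" >&2
            return 1
        fi
    done
}
```

With that function defined, load_in_order gcc/4.8.5 fftw would load the compiler first and then fftw, and would not attempt fftw if the compiler could not be loaded.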

More ways to find modules

In the output from module av matlab, module suggests a couple of alternate ways to search for software. When you use module av, it will match the search string anywhere in the module name; for example,

$ module av gcc

------------------------ /sw/arcts/centos7/modulefiles -------------------------
   fftw/3.3.4/gcc/4.8.5                          hdf5-par/1.8.16/gcc/4.8.5
   fftw/3.3.4/gcc/4.9.3                   (D)    hdf5-par/1.8.16/gcc/4.9.3 (D)
   gcc/4.8.5                                     hdf5/1.8.16/gcc/4.8.5
   gcc/4.9.3                                     hdf5/1.8.16/gcc/4.9.3     (D)
   gcc/5.4.0                              (D)    openmpi/1.10.2/gcc/4.8.5
   gromacs/5.1.2/openmpi/1.10.2/gcc/4.9.3        openmpi/1.10.2/gcc/4.9.3
   gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0 (D)    openmpi/1.10.2/gcc/5.4.0  (D)

  Where:
   D:  Default Module

However, if you are looking for just gcc, that is more than you really want. So, you can use one of two commands. The first is

$ module spider gcc

----------------------------------------------------------------------------
  gcc:
----------------------------------------------------------------------------
    Description:
      GNU compiler suite

     Versions:
        gcc/4.8.5
        gcc/4.9.3
        gcc/5.4.0

     Other possible modules matches:
        fftw/3.3.4/gcc  gromacs/5.1.2/openmpi/1.10.2/gcc  hdf5-par/1.8.16/gcc  ...

----------------------------------------------------------------------------
  To find other possible module matches do:
      module -r spider '.*gcc.*'

----------------------------------------------------------------------------
  For detailed information about a specific "gcc" module (including how to load
the modules) use the module's full name.
  For example:

     $ module spider gcc/5.4.0
----------------------------------------------------------------------------

That is probably more like what you are looking for if you really are searching just for gcc. That also gives suggestions for alternate searching, but let us return to the first set of suggestions, and see what we get with keyword searching.

At the time of writing, if you were to use module av to look for Python, you would get this result.

$ module av python

------------------------ /sw/arcts/centos7/modulefiles -------------------------
   python-dev/3.5.1

However, we have Python distributions that are installed that do not have python as part of the module name. In this case, module spider will also not help. Instead, you can use

$ module keyword python

----------------------------------------------------------------------------
The following modules match your search criteria: "python"
----------------------------------------------------------------------------

  anaconda2: anaconda2/4.0.0
    Python 2 distribution.

  anaconda3: anaconda3/4.0.0
    Python 3 distribution.

  epd: epd/7.6-1
    Enthought Python Distribution

  python-dev: python-dev/3.5.1
    Python is a general purpose programming language

----------------------------------------------------------------------------
To learn more about a package enter:

   $ module spider Foo

where "Foo" is the name of a module

To find detailed information about a particular package you
must enter the version if there is more than one version:

   $ module spider Foo/11.1
----------------------------------------------------------------------------

That displays all the modules that have been tagged with the python keyword or where python appears in the module name.
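
The difference between a name search and a keyword search can be sketched with ordinary shell tools. The small catalog below is made up for this example from the names and descriptions shown in the listings on this page, and grep is only a stand-in: this is not how Lmod implements its searches.

```shell
# Illustration only: a tiny catalog of name|description pairs, used to
# contrast a name-only search with a keyword-style search.
catalog='anaconda2/4.0.0|Python 2 distribution.
anaconda3/4.0.0|Python 3 distribution.
epd/7.6-1|Enthought Python Distribution
python-dev/3.5.1|Python is a general purpose programming language
gcc/5.4.0|GNU compiler suite'

# Name-only search (in the spirit of "module av python"):
# keep the first field, then look for the term in it.
by_name=$(printf '%s\n' "$catalog" | cut -d'|' -f1 | grep -i 'python')
echo "$by_name"

# Keyword-style search: look for the term anywhere on the line,
# then print just the module names.
by_keyword=$(printf '%s\n' "$catalog" | grep -i 'python' | cut -d'|' -f1)
echo "$by_keyword"
```

The name-only search finds just python-dev/3.5.1, while the keyword-style search also finds the anaconda2, anaconda3, and epd entries because the term appears in their descriptions.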

More about software versions

Note that Lmod will indicate the default version in the output from module av, which will be loaded if you do not specify the version.

$ module av gromacs

------------------------ /sw/arcts/centos7/modulefiles -------------------------
   gromacs/5.1.2/openmpi/1.10.2/gcc/4.9.3
   gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0 (D)

  Where:
   D:  Default Module

When loading modules with complex names, for example, gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0, you can specify up to the second-from-last element to load the default version. That is,

$ module load gromacs/5.1.2/openmpi/1.10.2/gcc

will load gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0

To load a version other than the default, specify the version as it is displayed by the module av command; for example,

$ module load gromacs/5.1.2/openmpi/1.10.2/gcc/4.9.3

When unloading a module, only the base name need be given; for example, if you loaded either gromacs module,

$ module unload gromacs

Module prerequisites and named sets

Some modules rely on other modules. For example, the gromacs module has many dependencies, some of which conflict with the default modules. To load it, you might first clear all modules with module purge, then load the dependencies, then finally load gromacs.

$ module list
Currently Loaded Modules:
  1) intel/16.0.3   2) openmpi/1.10.2/intel/16.0.3   3) StdEnv

$ module purge
$ module load gcc/5.4.0 openmpi/1.10.2/gcc/5.4.0 boost/1.61.0 mkl/11.3.3
$ module load gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0
$ module list
Currently Loaded Modules:
  1) gcc/5.4.0                  4) mkl/11.3.3
  2) openmpi/1.10.2/gcc/5.4.0   5) gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0
  3) boost/1.61.0

That’s a lot to do each time. Lmod provides a way to store a set of modules and give it a name. So, once you have the above list of modules loaded, you can use

$ module save my_gromacs

to save the whole list under the name my_gromacs. We recommend that you make each set fully self-contained and that you use the full name/version for each module (to prevent problems if the default version of one of them changes). To get that set back later, use the combination

$ module purge
$ module restore my_gromacs
Restoring modules to user's my_gromacs

To see a list of the named sets you have (which are stored in ${HOME}/.lmod.d), use

$ module savelist
Named collection list:
  1) my_gromacs

and to see which modules are in a set, use

$ module describe my_gromacs
Collection "my_gromacs" contains: 
   1) gcc/5.4.0                   4) mkl/11.3.3
   2) openmpi/1.10.2/gcc/5.4.0    5) gromacs/5.1.2/openmpi/1.10.2/gcc/5.4.0
   3) boost/1.61.0
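
In a script you may want to check that a named set exists before trying to restore it. The sketch below assumes that each set is saved as a file named after the set in ${HOME}/.lmod.d, as described above; the exact file layout is an assumption and may vary between Lmod versions. The helper name has_named_set is made up for this example.

```shell
# Illustration only: check whether a named set has been saved.
# Assumes one file per set, named after the set, in the collection
# directory (${HOME}/.lmod.d unless another directory is given).
has_named_set() {
    # $1 = set name, $2 = optional directory to look in
    [ -f "${2:-${HOME}/.lmod.d}/$1" ]
}

if has_named_set my_gromacs; then
    echo "my_gromacs is saved"
else
    echo "my_gromacs is not saved yet"
fi
```

A script could then guard the restore step with has_named_set my_gromacs && module restore my_gromacs. The module savelist command remains the authoritative way to see which sets exist.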

How to get more information about the module and the software

We try to provide some helpful information about the modules. For example,

$ module help openmpi/1.10.2/gcc/5.4.0
------------- Module Specific Help for "openmpi/1.10.2/gcc/5.4.0" --------------

OpenMPI consists of a set of compiler 'wrappers' that include the appropriate
settings for compiling MPI programs on the cluster.  The most commonly used
of these are

    mpicc
    mpic++
    mpif90

Those are used in the same way as the regular compiler program, for example,

    $ mpicc -o hello hello.c

will produce an executable program file, hello, from C source code in hello.c.

In addition to adding the OpenMPI executables to your path, the following
environment variables set by the openmpi module.

    $MPI_HOME

For some generic information about the program you can use

$ module whatis openmpi/1.10.2/gcc/5.4.0
openmpi/1.10.2/gcc/5.4.0      : Name: openmpi
openmpi/1.10.2/gcc/5.4.0      : Description: OpenMPI implementation of the MPI protocol
openmpi/1.10.2/gcc/5.4.0      : License information: https://www.open-mpi.org/community/license.php
openmpi/1.10.2/gcc/5.4.0      : Category: Utility, Development, Core
openmpi/1.10.2/gcc/5.4.0      : Package documentation: https://www.open-mpi.org/doc/
openmpi/1.10.2/gcc/5.4.0      : ARC examples: /scratch/data/examples/openmpi/
openmpi/1.10.2/gcc/5.4.0      : Version: 1.10.2

and for information about what the module will set in the environment (in addition to the help text), you can use

$ module show openmpi/1.10.2/gcc/5.4.0
[ . . . .  Help text edited for space -- see above . . . . ]
whatis("Name: openmpi")
whatis("Description: OpenMPI implementation of the MPI protocol")
whatis("License information: https://www.open-mpi.org/community/license.php")
whatis("Category: Utility, Development, Core")
whatis("Package documentation: https://www.open-mpi.org/doc/")
whatis("ARC examples: /scratch/data/examples/openmpi/")
whatis("Version: 1.10.2")
prereq("gcc/5.4.0")
prepend_path("PATH","/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/bin")
prepend_path("MANPATH","/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/share/man")
prepend_path("LD_LIBRARY_PATH","/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/lib")
setenv("MPI_HOME","/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0")

where the lines to attend to are the prepend_path(), setenv(), and prereq() lines. There is also an append_path() function that you may see. The prereq() function sets the list of other modules that must be loaded before the one being displayed. The rest set or modify the environment variable listed as the first argument; for example,

prepend_path("PATH", "/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/bin")

adds /sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/bin to the beginning of the PATH environment variable.
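
To confirm that a module made the change you expected, you can test whether a directory is one of the entries in a colon-separated variable such as PATH. The following is a minimal sketch; in_path_list is a made-up helper name, and the demo_path value is a scratch string standing in for what PATH might look like after the openmpi module above is loaded.

```shell
# Illustration only: check that a directory is an entry in a
# colon-separated list such as PATH.
in_path_list() {
    # $1 = colon-separated list, $2 = directory to look for
    # Wrapping both in colons makes sure only whole entries match.
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Scratch value standing in for PATH after the module is loaded.
demo_path="/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/bin:/usr/bin:/bin"

if in_path_list "$demo_path" "/sw/arcts/centos7/openmpi/1.10.2-gcc-5.4.0/bin"; then
    echo "openmpi bin directory is on the path"
fi
```

After actually loading the module, the same check could be written as in_path_list "$PATH" "$MPI_HOME/bin", using the MPI_HOME variable that the module sets.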