Relion

RELION on Great Lakes

Introduction & Documentation

RELION (REgularised LIkelihood OptimisatioN — pronounce rely-on) is a program implementing an empirical Bayesian approach for cryo-EM refinement.

Connecting to Great Lakes (GUI)

Two supported ways to open the RELION GUI on Great Lakes:

Open On Demand (OOD) — preferred

  1. Open greatlakes.arc-ts.umich.edu in a browser.
  2. Go to Interactive Apps > Basic Desktop.
  3. Fill in the form: partition standard, 2 hours, 4 cores, 8 GB memory.
  4. Launch the session and open a terminal on the allocated desktop.
  5. Load the RELION module: module load relion-***/x.x.x
  6. Change to your project directory and run: relion

SSH with X-forwarding

  1. ssh -X greatlakes.arc-ts.umich.edu
  2. module load relion-***/x.x.x
  3. cd /path/to/your/relion/project/directory
  4. relion

Compute and Running tabs (overview)

Compute tab options control how RELION reads data and uses memory / I/O:

  • Use parallel disc I/O? — Yes: all MPI followers read images; No: only leader reads and distributes (--no_parallel_disc_io).
  • Number of pooled particles — affects memory usage.
  • Pre-read all particles into RAM? — adds --preread_images; can improve performance if dataset small.
  • Copy particles to scratch directory — e.g. /tmpssd/$SLURM_JOB_ID.
  • Combine iterations through disk? — usually set to No to avoid extra I/O.
  • Use GPU acceleration? — Yes for relion-gpu; select appropriate SBATCH flags to allocate GPUs.

Running tab options map to SBATCH arguments:

  • Number of MPI procs → --ntasks
  • Number of threads → --cpus-per-task
  • Submit to queue? → set to Yes
  • Queue name → --partition
  • Queue submit command → usually sbatch
  • Account → --account
  • Walltime → --time
  • Memory Per Thread → --mem-per-cpu
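
As a rough illustration of the mapping above, the sketch below prints the #SBATCH lines that a set of Running tab values would correspond to. The helper name render_sbatch_header, its argument order, and the account name example_account are assumptions made for this example; they are not part of RELION or the Great Lakes setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the #SBATCH lines that correspond to the
# Running tab fields. The argument order is an assumption of this sketch.
render_sbatch_header() {
    local ntasks=$1 threads=$2 partition=$3 account=$4 walltime=$5 mem_per_cpu=$6
    printf '#SBATCH --ntasks=%s\n' "$ntasks"           # Number of MPI procs
    printf '#SBATCH --cpus-per-task=%s\n' "$threads"   # Number of threads
    printf '#SBATCH --partition=%s\n' "$partition"     # Queue name
    printf '#SBATCH --account=%s\n' "$account"         # Account
    printf '#SBATCH --time=%s\n' "$walltime"           # Walltime
    printf '#SBATCH --mem-per-cpu=%s\n' "$mem_per_cpu" # Memory Per Thread
}

# "example_account" is a placeholder, not a real account.
render_sbatch_header 3 4 gpu example_account 8:00:00 10g
```

In practice RELION fills these values into the submission template itself; the sketch is only meant to make the field-to-flag correspondence concrete.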

Note on --preread_images

The amount of memory required per MPI proc is determined by:

GiB per MPI proc = (N × boxsize^2 × 4) / 2^30

where:

  • N = number of particles
  • boxsize = box size (in pixels) per particle
  • 4 = number of bytes per pixel (float32)
  • 2^30 = number of bytes in a gibibyte (GiB)

Example calculation:
For N = 100,000 particles and boxsize = 350:

GiB per MPI proc = (100,000 × 350^2 × 4) / 2^30 ≈ 45.63 GiB

Therefore, 100,000 particles with a box size of 350 pixels will need approximately 46 GiB of RAM per MPI proc.
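
The same estimate can be scripted before choosing a memory request. This is a minimal sketch of the formula above; the function name preread_gib is hypothetical, and awk is used only because bash has no floating-point arithmetic.

```shell
#!/usr/bin/env bash
# preread_gib N BOXSIZE
# GiB needed per MPI proc for --preread_images:
#   N * boxsize^2 * 4 bytes / 2^30 (2^30 = 1073741824 bytes per GiB).
# The function name is hypothetical.
preread_gib() {
    LC_ALL=C awk -v n="$1" -v box="$2" \
        'BEGIN { printf "%.2f\n", n * box * box * 4 / 1073741824 }'
}

preread_gib 100000 350   # the worked example from the text: 45.63
```

Because the requirement applies per MPI proc, the total across a job scales with the number of MPI procs.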

Basic Parameters to Get Started

The original document gives example configurations by job type. Below are short summaries for typical settings.

Class2D & Class3D on GPUs (GPU partition)

  • Compute: Use parallel disc I/O? Yes
  • Running: Number of MPI procs: 3; Number of threads: 4; Partition: gpu; Walltime: 8:00:00; Memory per thread: 10g
  • SBATCH extras: --gpus=v100:2, --nodes=1

SPGPU Partition

Similar to the GPU partition settings, but the examples use A40 GPUs and the spgpu partition.

2D Classification & 3D Classification (CPU)

  • Partition: standard
  • Number of MPI procs: 32; threads: 8; Walltime: 8:00:00; Memory per thread: 5g
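
Since memory is requested per CPU, the total footprint of a job is the number of MPI procs times the threads per proc times the memory per thread. The sketch below works this out for the GPU and CPU examples in this section; the function name job_footprint is hypothetical, and memory is given as a plain number in the same unit as the g suffix.

```shell
#!/usr/bin/env bash
# job_footprint NTASKS THREADS MEM_PER_CPU
# Total CPUs = MPI procs * threads; total memory = CPUs * memory per CPU.
# The function name is hypothetical.
job_footprint() {
    local ntasks=$1 threads=$2 mem_per_cpu=$3
    local cpus=$(( ntasks * threads ))
    printf 'cpus=%s mem=%sg\n' "$cpus" "$(( cpus * mem_per_cpu ))"
}

job_footprint 3 4 10   # GPU example:  cpus=12 mem=120g
job_footprint 32 8 5   # CPU example:  cpus=256 mem=1280g
```

The CPU example therefore spans more cores and memory than a single node is likely to offer, so it relies on the MPI procs being distributed across nodes.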

Running Modes

RELION supports both multi-threading and distributed MPI tasks. Some job types only run single-task; others support MPI and multi-threading. Examples:

  • Import: single task only.
  • CTF estimation (ctffind): multiple distributed tasks, single-threaded.
  • MotionCor2: distributed tasks with multi-threading.

When MPI procs > 1 you must run MPI-enabled executables, typically launched via srun --mpi=pmix_v4.
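
RELION's MPI-enabled programs carry an _mpi suffix (for example relion_refine_mpi alongside relion_refine). The sketch below shows that naming rule only; the helper relion_executable is hypothetical, and when jobs are submitted from the GUI, RELION chooses the executable itself.

```shell
#!/usr/bin/env bash
# relion_executable PROGRAM NPROCS
# Print the MPI variant of PROGRAM when more than one MPI proc is requested.
# The helper name is hypothetical.
relion_executable() {
    local program=$1 nprocs=$2
    if (( nprocs > 1 )); then
        printf '%s_mpi\n' "$program"
    else
        printf '%s\n' "$program"
    fi
}

# Show (without running) the launch line for a 3-proc refinement.
echo "srun --mpi=pmix_v4 $(relion_executable relion_refine 3)"
```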

Using GPUs

To enable GPUs: set Use GPU acceleration? to Yes and allocate GPUs in SBATCH directives, e.g. --gpus=v100:4. Leave "Which GPUs to use" blank in the GUI unless you need specific devices.

Match the partition to GPU availability (gpu, spgpu, gpu_mig40) and set threads per GPU to 4. Setting more threads than that can overburden the GPU and crash the job (see Overburdening GPUs under Known Problems).
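
The Class2D/Class3D example in this document pairs 3 MPI procs with --gpus=v100:2, which is consistent with one follower rank per GPU plus one leader rank. The sketch below derives the flags under that assumption; the helper gpu_job_flags is hypothetical, and other rank-to-GPU ratios are possible.

```shell
#!/usr/bin/env bash
# gpu_job_flags GPU_TYPE NUM_GPUS
# Assumption: one follower rank per GPU plus one leader rank, and 4 threads
# per rank. The helper name is hypothetical.
gpu_job_flags() {
    local gpu_type=$1 gpus=$2
    printf '%s\n' "--ntasks=$(( gpus + 1 )) --cpus-per-task=4 --gpus=${gpu_type}:${gpus}"
}

gpu_job_flags v100 2   # --ntasks=3 --cpus-per-task=4 --gpus=v100:2
```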

Motion Correction & CTF

RELION's CPU motion correction

CPU-only motion correction can be memory intensive (roughly 8 GB per CPU). The original document includes an equation to estimate the memory required from the image size and number of frames.

MotionCor2 (external GPU tool)

  • Set "Use RELION's own implementation?" to No and provide the correct MotionCor2 executable path (example: /sw/pkgs/lsi/motioncor2/1.4.7/bin/motioncor2).

CTFFIND-4

CTFFIND-4 is CPU-only; set the CTFFIND-4.1 executable field to the correct path (example: /sw/pkgs/lsi/ctffind/4.1.14/bin/ctffind).
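
Before entering an external tool path in the GUI, it can help to confirm that the file exists and is executable on the node you are using. The following is a minimal sketch using the example paths above; the function name check_executable is hypothetical, and the paths may differ for other installed versions.

```shell
#!/usr/bin/env bash
# check_executable PATH
# Succeeds if PATH is an executable regular file; otherwise reports the
# problem on stderr and returns 1. The function name is hypothetical.
check_executable() {
    local path=$1
    if [[ -f "$path" && -x "$path" ]]; then
        echo "ok: $path"
    else
        echo "missing or not executable: $path" >&2
        return 1
    fi
}

# Example paths from this document; "|| true" keeps the script going.
check_executable /sw/pkgs/lsi/motioncor2/1.4.7/bin/motioncor2 || true
check_executable /sw/pkgs/lsi/ctffind/4.1.14/bin/ctffind || true
```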

SBATCH submission template

#!/usr/bin/env bash
#SBATCH --job-name=XXXnameXXXrun
#SBATCH --ntasks=XXXmpinodesXXX
#SBATCH --partition=XXXqueueXXX
#SBATCH --cpus-per-task=XXXthreadsXXX
#SBATCH --error=XXXerrfileXXX
#SBATCH --output=XXXoutfileXXX
#SBATCH --open-mode=append
#SBATCH --account=XXXextra1XXX
#SBATCH --time=XXXextra2XXX
#SBATCH --mem-per-cpu=XXXextra3XXX
#SBATCH XXXextra4XXX
#SBATCH XXXextra5XXX
#SBATCH XXXextra6XXX
#SBATCH XXXextra7XXX
module --redirect list
cmd_exec=$(cat << EOF
XXXcommandXXX
EOF
)
# Source additional MPI utilities if available
mpi_utils="/sw/pkgs/lsi/relion-gpu/utils/add_extra_mpi_task.sh"
if [[ -f "$mpi_utils" ]]; then
    # shellcheck source=/dev/null
    source "$mpi_utils"
fi
# RELION will have two commands on separate lines in certain job types.
# This deals with that case.
lines=$(wc -l <<< "$cmd_exec")
counter=0
while IFS=$'\n' read -r cmd; do
    counter=$((counter + 1))
    if [ "$lines" -gt 1 ]; then
        echo ""
        echo "Command $counter:"
        echo "srun --mem-per-cpu=XXXextra3XXX --mpi=pmix_v4 ${cmd}"
        echo ""
        srun --mem-per-cpu=XXXextra3XXX --mpi=pmix_v4 ${cmd}
        echo ""
    else
        echo ""
        echo "Command:"
        echo "srun --mem-per-cpu=XXXextra3XXX --mpi=pmix_v4 ${cmd}"
        echo ""
        srun --mem-per-cpu=XXXextra3XXX --mpi=pmix_v4 ${cmd}
        echo ""
    fi
done <<< "$cmd_exec"

The file add_extra_mpi_task.sh is below:

#!/usr/bin/env bash
# Don't bother unless nodes have been allocated
if [[ -z $SLURM_JOB_NODELIST ]]; then
  if [[ -n $SLURM_HOSTFILE ]]; then
    unset SLURM_HOSTFILE
  fi
  return
fi
# Don't bother unless nodes have GPUs
if [[ -z $SLURM_JOB_GPUS ]]; then
  if [[ -n $SLURM_HOSTFILE ]]; then
    unset SLURM_HOSTFILE
  fi
  return
fi
# Don't bother unless multiple tasks have been allocated, and the number is odd
if [[ -z $SLURM_NTASKS_PER_NODE ]]; then
  if [[ -n $SLURM_HOSTFILE ]]; then
      unset SLURM_HOSTFILE
  fi
  return
# Check if SLURM_NTASKS_PER_NODE is less than 2
elif [[ ${SLURM_NTASKS_PER_NODE} -lt 2 ]]; then
  if [[ -n $SLURM_HOSTFILE ]]; then
    unset SLURM_HOSTFILE
  fi
  return
# Check if SLURM_NTASKS_PER_NODE is an even number
elif [[ $((SLURM_NTASKS_PER_NODE%2)) == 0 ]]; then
  if [[ -n $SLURM_HOSTFILE ]]; then
    unset SLURM_HOSTFILE
  fi
  return
fi
# Build the hostfile. Single node: one entry per task. Multiple nodes: one
# extra entry for the first node, then (tasks per node - 1) entries per node.
mapfile -t array < <(scontrol show hostname "$SLURM_JOB_NODELIST")
file=$(mktemp --suffix .SLURM_JOB_NODELIST)
if [[ ${#array[@]} -eq 1 ]]; then
  for (( j = 0; j < SLURM_NTASKS_PER_NODE; j++ )); do
    echo "${array[0]}" >> "$file"
  done
else
  echo "${array[0]}" > "$file"
  for (( i = 0; i < SLURM_JOB_NUM_NODES; i++ )); do
    for (( j = 0; j < SLURM_NTASKS_PER_NODE - 1; j++ )); do
      echo "${array[i]}" >> "$file"
    done
  done
fi
# All conditions met, set hostfile and distribution, unset ntasks per node
export SLURM_HOSTFILE=$file
export SLURM_DISTRIBUTION=arbitrary
unset SLURM_NTASKS_PER_NODE
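
To see what task layout the script above produces without submitting a job, the sketch below reproduces only its hostfile loop as a standalone function. The function name hostfile_layout and the node names nodeA and nodeB are hypothetical; the real script reads the node list and task count from Slurm environment variables.

```shell
#!/usr/bin/env bash
# hostfile_layout TASKS_PER_NODE NODE...
# Print one hostname per task, mirroring the loop in add_extra_mpi_task.sh:
#   single node:    TASKS_PER_NODE entries for that node
#   multiple nodes: one extra entry for the first node, then
#                   (TASKS_PER_NODE - 1) entries for every node
# The function name and node names are hypothetical.
hostfile_layout() {
    local tasks_per_node=$1
    shift
    local nodes=("$@")
    local i j
    if (( ${#nodes[@]} == 1 )); then
        for (( j = 0; j < tasks_per_node; j++ )); do
            echo "${nodes[0]}"
        done
    else
        echo "${nodes[0]}"
        for (( i = 0; i < ${#nodes[@]}; i++ )); do
            for (( j = 0; j < tasks_per_node - 1; j++ )); do
                echo "${nodes[i]}"
            done
        done
    fi
}

# Two nodes, 3 tasks per node requested: prints nodeA three times
# (one extra entry plus two tasks) and nodeB twice.
hostfile_layout 3 nodeA nodeB
```

With two nodes and an odd request of 3 tasks per node, the layout has 5 entries rather than 6: a single extra task on the first node and an even number of tasks on every node.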

Known Problems

Zombification

Sometimes an MPI rank exits and the leader waits indefinitely. Cancel the job and restart or continue from the last checkpoint.

Not enough GPU memory

Reduce the number of classes, the pool size, or the box size, or run on CPUs.

Overburdening GPUs

Assigning too many threads or tasks per GPU will cause CUDA allocation errors. Match tasks to available GPUs.

Empty or Corrupted Micrographs

If a micrograph has 0 bytes, RELION may fail. Check file sizes (ls -l *.mrc) or use relion_image_handler --stats --i to inspect images.
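
Scanning a long ls -l listing by eye is error-prone, so the check can be narrowed to empty files only. This is a minimal sketch using find; the function name find_empty_micrographs is hypothetical, and it assumes micrographs use the .mrc extension as in the example above.

```shell
#!/usr/bin/env bash
# find_empty_micrographs [DIR]
# List every zero-byte .mrc file under DIR (default: current directory).
# The function name is hypothetical.
find_empty_micrographs() {
    local dir=${1:-.}
    find "$dir" -type f -name '*.mrc' -size 0
}
```

Run it from the project directory, for example find_empty_micrographs . and re-transfer or exclude any files it lists before restarting the job.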