Hardware
Node Type | Standard | Large Memory | GPU | GPU MIG40 | SPGPU | VIZ |
---|---|---|---|---|---|---|
Partition Name | standard | largemem | gpu | gpu_mig40 | spgpu | viz |
Number of Nodes | 455 | 8 | 25 | 2 | 28 | 4 |
Processors | 2x 3.0 GHz Intel Xeon Gold 6154 | 2x 3.0 GHz Intel Xeon Gold 6154 | 2x 2.4 GHz Intel Xeon Gold 6148 | 2x 2.60 GHz Intel Xeon Platinum 8358P | 2x 2.9 GHz Intel Xeon Gold 6226R | 2x 2.4 GHz Intel Xeon Gold 6148 |
Cores per Node | 36 | 36 | 40 | 64 | 32 | 40 |
RAM | 187 GB (180 GB requestable) | 1.5 TB (1,503 GB requestable) | 187 GB (180 GB requestable) | 1007 GB (1000 GB requestable) | 376 GB (372 GB requestable) | 187 GB (180 GB requestable) |
Storage | 480 GB SSD + 4 TB HDD | 4 TB HDD | 4 TB HDD | 890 GB SSD + 3.5 TB SSD | 480 GB SSD + 14 TB NVMe SSD | 4 TB HDD |
GPU | N/A | N/A | 20 nodes: 2x NVIDIA Tesla V100 16GB; 4 nodes: 3x V100 16GB | 2 nodes: 4x NVIDIA A100 80GB, each divided into 2x 40GB MIG instances (16 total) | 28 nodes: 8x NVIDIA A40 48GB | 4 nodes: 1x NVIDIA Tesla P40 24GB |
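The partition names and requestable limits in the table map directly onto Slurm job requests. Below is a minimal batch-script sketch for the standard partition; the account name is a placeholder, and the core and memory requests simply reflect a full standard node from the table.

```bash
#!/bin/bash
# Minimal sketch of a standard-partition job; the account name is a placeholder.
#SBATCH --job-name=example_job
#SBATCH --account=example_account    # replace with your Slurm account
#SBATCH --partition=standard         # partition name from the table above
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=36         # a standard node has 36 cores
#SBATCH --mem=180g                   # up to 180 GB is requestable per standard node
#SBATCH --time=01:00:00

srun ./my_program                    # replace with your own executable
```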
GPUs
Great Lakes has 52 NVIDIA Tesla V100 GPUs connected to 24 nodes and 8 NVIDIA A100 80GB GPUs connected to 2 nodes. 224 NVIDIA A40 GPUs connected to 28 nodes are also available for single-precision work.
GPU Model | NVIDIA Tesla V100 | NVIDIA A40 | NVIDIA A100 |
---|---|---|---|
Number and Type of GPU | one Volta GPU | one Ampere GPU | one Ampere GPU |
Peak double precision floating point perf. | 7 TFLOPS | N/A | 9.7 TFLOPS (non-Tensor); 19.5 TFLOPS (Tensor) |
Peak single precision floating point perf. | 14 TFLOPS | 37.4 TFLOPS (non-Tensor); 74.8 TFLOPS (Tensor) | 19.5 TFLOPS (non-Tensor); 156 TFLOPS (Tensor) |
Memory bandwidth (ECC off) | 900 GB/s | 696 GB/s | 1,935 GB/s |
Memory size | 16 GB HBM2 | 48 GB GDDR6 | 80 GB HBM2e |
CUDA cores | 5120 | 10752 | 6912 |
RT cores | N/A | 84 | N/A |
Tensor cores | N/A | 336 | 432 |
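As a rough illustration of how these GPUs are requested, the sketch below asks Slurm for one GPU on the gpu partition using the generic-resource (--gres) syntax; the account name and resource sizes are placeholders, not recommendations.

```bash
#!/bin/bash
# Sketch of a single-GPU job; the account name and resource sizes are placeholders.
#SBATCH --job-name=gpu_example
#SBATCH --account=example_account    # replace with your Slurm account
#SBATCH --partition=gpu              # or gpu_mig40 / spgpu / viz, per the hardware table
#SBATCH --gres=gpu:1                 # request one GPU on the node
#SBATCH --cpus-per-task=4
#SBATCH --mem=16g
#SBATCH --time=02:00:00

srun ./my_gpu_program                # replace with your own GPU-enabled executable
```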
Networking
The compute nodes are all interconnected with InfiniBand HDR100 networking, capable of 100 Gb/s throughput. In addition to the InfiniBand network, the login and transfer nodes have 25 Gb/s Ethernet, and a gigabit Ethernet network connects the remaining nodes and is used for node management and NFS file system access.
Storage
The high-speed scratch file system provides 2 petabytes of storage with approximately 80 GB/s of throughput.
Scheduling & Billing
Job scheduling and billing on Great Lakes are managed entirely through the Slurm Workload Manager.
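Because everything is handled by Slurm, the standard Slurm client commands are the interface for submitting, monitoring, and accounting for jobs. The commands below are stock Slurm; the script name and job ID are placeholders.

```bash
sbatch my_job.sbatch                            # submit a batch script (placeholder filename)
squeue -u $USER                                 # list your pending and running jobs
sacct -j 123456 --format=JobID,Elapsed,State    # accounting record for a (placeholder) job ID
scancel 123456                                  # cancel that job
```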
Operating Software
The Great Lakes cluster runs Red Hat Enterprise Linux 8. We update the operating system as Red Hat releases new versions and as our library of third-party applications adds support. Because we must support several types of drivers (AFS and Lustre file system drivers, InfiniBand network drivers, and NVIDIA GPU drivers) and dozens of third-party applications, we are cautious about upgrading and can lag Red Hat releases by months.
Compilers, Parallel, & Scientific Libraries
Great Lakes supports the GNU Compiler Collection (GCC), the Intel compilers, and the PGI compilers for C and Fortran. The Great Lakes cluster’s parallel library is Open MPI. Great Lakes provides the Intel Math Kernel Library (MKL), a set of high-performance mathematical libraries. Other common scientific libraries, including HDF5, NetCDF, FFTW3, and Boost, are compiled from source.
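As a sketch of how these toolchains are typically combined, the commands below build and run a small MPI program with GCC and Open MPI. The module names are assumptions about the local module tree; mpicc and srun are standard Open MPI and Slurm tools.

```bash
# Sketch: build and run an MPI program with GCC and Open MPI.
# The module names are assumptions; run `module avail` to see what is actually installed.
module load gcc openmpi

mpicc -O2 -o hello_mpi hello_mpi.c    # mpicc wraps the loaded GCC compiler
srun -n 4 ./hello_mpi                 # run 4 MPI ranks inside a Slurm allocation
```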
Application Software
For detailed information, see the software page. (link TBD)