Thomas Bouvier

I collaborate with scientists to research, create, and optimize scalable, ML-accelerated pipelines that help answer broad scientific questions in X-ray science and materials discovery. I hold a PhD in High-Performance Computing (HPC) from Inria. In tandem with these efforts, I also work on low-level performance characterization of GPU platforms.

25 January 2025 00:00 · 6 min

Getting Started with Spack on Polaris

In this article, I will go through setting up Spack on the Polaris supercomputer.

Spack is a powerful package manager designed for HPC. Although it comes with a steep learning curve, Spack can pay off for workflows involving frequent builds with complex dependencies.

Polaris Supercomputer

The system has a theoretical peak performance of 34 petaflops (44 petaflops of Tensor Core FP64 performance), which ranks it #47 on the TOP500 list as of this writing. The system is built from 560 nodes, each with the following configuration:

Metric | Value
Number of AMD EPYC Milan CPUs | 1
Number of NVIDIA A100 GPUs | 4
Total HBM2 Memory | 160 GB
HBM2 Memory BW per GPU | 1.6 TB/s
Total DDR4 Memory | 512 GB
DDR4 Memory BW | 204.8 GB/s
Number of NVMe SSDs | 2
Total NVMe SSD Capacity | 3.2 TB
Number of Cassini-based Slingshot 11 NICs | 2
PCIe Gen4 BW | 64 GB/s
NVLink BW | 600 GB/s
Total GPU DP Tensor Core Flops | 78 TF
Polaris Single Node Configuration

The official usage guide for Polaris is available here.

What makes this machine special?

Polaris is a Cray (a subsidiary of Hewlett Packard Enterprise) machine, so it needs some Cray special sauce to behave more sanely. The Cray Programming Environment (PE) provides three compiler wrappers for building software:

cc – the C compiler wrapper
CC – the C++ compiler wrapper
ftn – the Fortran compiler wrapper

Each of these wrappers selects a specific vendor compiler based on the PrgEnv module loaded in the environment. Module files are an easy way to modify your environment in a controlled manner during a shell session. In general, they contain the information needed to run an application or use a library. The module command interprets and executes module files, which makes it a convenient way to access system-provided packages and compilers.
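For example, you can inspect your current environment before changing anything (the exact output depends on the system state and module versions):

module list            # show the modules currently loaded in your session
module avail PrgEnv    # list the programming environments available on the system
module show PrgEnv-gnu # display what loading PrgEnv-gnu would change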

The default PE on Polaris is currently NVHPC. The GNU compilers are available via another PE. The command module swap PrgEnv-nvhpc PrgEnv-gnu can be used to switch to the GNU PE (gcc, g++, gfortran). You should also run module load nvhpc-mixed to gain access to CUDA libraries that may be required for building executables.

module swap PrgEnv-nvhpc PrgEnv-gnu
module load nvhpc-mixed

Later in this tutorial, these commands will be handled automatically by Spack; please bear with me.

You can find more information on compiling and linking on Polaris in the official overview.

Loading Spack

If you are new to Spack, the basics are covered in the official tutorial: https://spack-tutorial.readthedocs.io/en/latest/tutorial_basics.html

First, log into Polaris using:

ssh <username>@polaris.alcf.anl.gov

Clone Spack using the following commands:

mkdir -p ~/git
git clone --depth=2 https://github.com/spack/spack ~/git/spack-polaris
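Optionally, you may prefer to pin a Spack release rather than tracking the default branch. A sketch of how to do so is below; the releases/v0.23 branch name is only an example, pick whichever release is current:

cd ~/git/spack-polaris
git fetch --depth=1 origin releases/v0.23   # shallow clones are single-branch, so fetch the release explicitly
git checkout -b v0.23 FETCH_HEAD            # create a local branch pointing at the fetched release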

Polaris shares a home filesystem with other machines, which may be completely different hardware-wise and use different module systems. The following use_polaris function loads a copy of Spack dedicated to Polaris, leaving other machines to their own Spack instances. We use Spack’s SPACK_USER_CONFIG_PATH variable to keep these configurations cleanly separated. The proxy-related lines are required on nodes without outbound network connectivity, such as the compute nodes. Put the following in your .bashrc file:

.bashrc
function use_polaris {
    # Only load the Polaris-specific Spack instance when actually on Polaris
    if hostname -f | grep polaris &>/dev/null; then
        echo "Loading Spack for Polaris"
        source ${HOME}/git/spack-polaris/share/spack/setup-env.sh

        # Compute nodes lack outbound connectivity; route traffic through the ALCF proxy
        export HTTP_PROXY="http://proxy.alcf.anl.gov:3128"
        export HTTPS_PROXY="http://proxy.alcf.anl.gov:3128"
        export http_proxy="http://proxy.alcf.anl.gov:3128"
        export https_proxy="http://proxy.alcf.anl.gov:3128"
        export ftp_proxy="http://proxy.alcf.anl.gov:3128"

        export clustername=polaris
    fi

    # Keep per-machine Spack configuration and caches separate
    export SPACK_USER_CONFIG_PATH="$HOME/.spack/$clustername"
    export SPACK_USER_CACHE_PATH="$SPACK_USER_CONFIG_PATH"
}

Once this is done, loading Spack with the relevant modules is as simple as running the use_polaris command.
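A quick sanity check that the right Spack instance is being picked up (output will vary with your Spack version):

use_polaris
spack --version   # confirms the Polaris-specific Spack clone is the one in use
spack arch        # should report something like linux-sles15-zen3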

Before proceeding, don’t forget to tweak your Spack configuration to your needs using spack config --scope defaults edit config. In particular, I like setting build_stage: /local/scratch/<username>/spack-stage to leverage local scratch SSD storage on compute nodes for building packages.
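As a sketch, the relevant part of the defaults-scope config.yaml would look like this (the scratch path is the one I use; adapt it to your username and system):

config.yaml
config:
  build_stage:
  - /local/scratch/<username>/spack-stage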

Tailoring Spack for Polaris

We now want Spack to use the compilers provided by the PE, as explained above. Create /home/<username>/.spack/polaris/compilers.yaml (this directory is the SPACK_USER_CONFIG_PATH set earlier) with the following content:

compilers.yaml
compilers:
  - compiler:
      spec: gcc@12.3
      paths:
        cc: cc
        cxx: CC
        f77: ftn
        fc: ftn
      flags: {}
      operating_system: sles15
      target: any
      modules:
      - PrgEnv-gnu
      - nvhpc-mixed
      - gcc-native/12.3
      - libfabric
      - cudatoolkit-standalone
      environment: {}
      extra_rpaths: []
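After saving this file (and running use_polaris), you can check that Spack now knows about the wrapped compiler:

spack compiler list   # gcc@12.3 should appear among the available compilers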

In addition, Spack allows you to customize how your software is built through the packages.yaml file. Using it, you can make Spack prefer particular implementations of virtual dependencies (e.g., using cray-mpich as the MPI implementation), or make it prefer particular compilers. You can also tell Spack to use external software installations already present on your system, so it does not rebuild them. Please read the docs about package settings for detailed instructions.

To reuse my package configuration tailored for Polaris, create /home/<username>/.spack/polaris/packages.yaml with the following content:

packages.yaml
packages:
  all:
    require:
    - "%gcc@12.3"
    - "target=zen3"
  mpi:
    require:
    - cray-mpich
  json-c:
    require:
    - "@0.13.0"
  pkgconfig:
    require:
    - pkg-config
  cray-mpich:
    buildable: false
    externals:
    - spec: cray-mpich@8.1.28
      modules:
      - cray-mpich/8.1.28
  mercury:
    buildable: true
    variants: ~boostsys ~checksum
  libfabric:
    buildable: false
    externals:
    - spec: libfabric@1.15.2.0
      modules:
      - libfabric/1.15.2.0
  autoconf:
    buildable: false
    externals:
    - spec: autoconf@2.69
      prefix: /usr
  automake:
    buildable: false
    externals:
    - spec: automake@1.15.1
      prefix: /usr
  gmake:
    buildable: false
    externals:
    - spec: gmake@4.2.1
      prefix: /usr
  cmake:
    buildable: false
    externals:
    - spec: cmake@3.27.7
      prefix: /soft/spack/gcc/0.6.1/install/linux-sles15-x86_64/gcc-12.3.0/cmake-3.27.7-a435jtzvweeos2es6enirbxdjdqhqgdp
  libtool:
    buildable: false
    externals:
    - spec: libtool@2.4.6
      prefix: /usr
  openssl:
    buildable: false
    externals:
    - spec: openssl@1.1.1d
      prefix: /usr
  m4:
    buildable: false
    externals:
    - spec: m4@1.4.18
      prefix: /usr
  zlib:
    buildable: false
    externals:
    - spec: zlib@1.2.11
      prefix: /usr
  pkg-config:
    buildable: false
    externals:
    - spec: pkg-config@0.29.2
      prefix: /usr
  git:
    buildable: false
    externals:
    - spec: git@2.35.3
      prefix: /usr
  cuda:
    buildable: false
    externals:
    - spec: cuda@12.3.2
      prefix: /soft/compilers/cudatoolkit/cuda-12.3.2
      modules:
      - cudatoolkit-standalone/12.3.2
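To verify that these settings are picked up, you can dump the merged package configuration and concretize a small spec against it (a sanity check, not a required step):

spack config get packages   # print the merged packages configuration Spack will use
spack spec zlib             # should resolve to the external zlib@1.2.11 from /usr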

A basic Spack environment

An environment is used to group a set of specs intended for some purpose, to be built, rebuilt, and deployed in a coherent fashion. Environments define aspects of the installation of the software, such as which specs to install, how they are configured, and how the resulting installations are exposed (for instance, through a view).

Please refer to the Spack environment documentation for further details.

A spack.yaml file fully describes a Spack environment, including system-provided packages and compilers. By default, it does so independently of any compilers.yaml or packages.yaml files installed in ~/.spack, which prevents interference with user-specific Spack configurations. To circumvent this behavior and reuse the compilers.yaml and packages.yaml defined earlier, one can use the include heading, which pulls external configuration files into the environment. Create a spack.yaml file in a directory of your liking, and copy the following content to install PyTorch as well as Hugging Face’s Nanotron:

spack.yaml
spack:
  include:
  - /home/<username>/.spack/polaris/compilers.yaml
  - /home/<username>/.spack/polaris/packages.yaml
  specs:
  - py-torch@2.5 ~gloo ~valgrind ~caffe2 ~kineto +cuda cuda_arch=80 +nccl +cudnn +magma +distributed
  - py-nanotron
  view: true

Use the following sequence of commands to build the entire environment with optimal performance on Polaris: