Now that you are connected to the head node, familiarize yourself with the cluster structure by running the following set of commands.
SLURM from SchedMD is one of the batch schedulers that you can use in AWS ParallelCluster. For an overview of the SLURM commands, see the SLURM Quick Start User Guide.
sinfo shows both the instances we currently have running and those that are not running (think of this as a queue limit). Initially we’ll see all the nodes in state idle~, which means no instances are running. When we submit a job we’ll see some instances go into state alloc, meaning they’re currently completely allocated, or mix, meaning some but not all cores are allocated. After the job completes the instance stays around for a few minutes (default cooldown is 10 mins) in state idle%. This can be confusing, so we’ve tried to simplify it in the table below:
|State|Description|
|idle~|Instance is not running but can launch when a job is submitted.|
|idle%|Instance is running and will shut down after ScaledownIdletime (default 10 mins).|
|mix|Instance is partially allocated.|
|alloc|Instance is completely allocated.|
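The state descriptions above can be attached directly to sinfo output. The following is a minimal sketch, not part of the workshop itself: describe_state is a hypothetical helper name, and the sinfo call is guarded so the snippet does nothing on a machine without SLURM.

```shell
# Hypothetical helper: map a sinfo state to the description used in this section.
describe_state() {
    case "$1" in
        'idle~') echo "not running; can launch when a job is submitted" ;;
        'idle%') echo "running; will shut down after ScaledownIdletime" ;;
        mix)     echo "partially allocated" ;;
        alloc)   echo "completely allocated" ;;
        *)       echo "state not covered in this section" ;;
    esac
}

# On the head node: print one line per node (-N), without a header (-h),
# showing the node name (%N) and its compact state (%t), then annotate it.
if command -v sinfo >/dev/null 2>&1; then
    sinfo -N -h -o "%N %t" | while read -r node state; do
        printf '%s\t%s\t%s\n' "$node" "$state" "$(describe_state "$state")"
    done
fi
```

Any state outside the four listed here falls through to the generic message rather than being guessed at.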
Environment Modules are a fairly standard tool in HPC, used to dynamically change your environment variables (such as PATH and LD_LIBRARY_PATH). The cluster comes with MPI modules, including intelmpi and openmpi, pre-installed. These MPI versions are compiled with support for the high-speed interconnect EFA.
module load intelmpi
mpirun -V
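To see what loading a module actually changes, you can compare PATH before and after the load. This is a rough sketch under stated assumptions: new_path_entries is a hypothetical helper, and the module call is guarded so the snippet is skipped where Environment Modules is not installed.

```shell
# Hypothetical helper: print the entries of the PATH string in $2
# that are not present in the PATH string in $1.
new_path_entries() {
    old=":$1:"
    printf '%s\n' "$2" | tr ':' '\n' | while IFS= read -r dir; do
        [ -n "$dir" ] || continue          # skip empty PATH entries
        case "$old" in
            *":$dir:"*) ;;                 # already present before the load
            *) printf '%s\n' "$dir" ;;     # added by the load
        esac
    done
}

# On the head node: record PATH, load the module, and list what was added.
if type module >/dev/null 2>&1; then
    path_before=$PATH
    module load intelmpi
    new_path_entries "$path_before" "$PATH"
fi
```

The same comparison works for other variables a module may touch, such as LD_LIBRARY_PATH.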
List the NFS volumes exported by the head node:
showmount -e localhost
Then check the shared filesystem:
df -h /shared
You’ll see a line like:
172.31.21.202@tcp:/zm5lzbmv 1.1T 1.2G 1.1T 1% /shared
This is a 1.2 TB filesystem (df reports about 1.1T of usable capacity), mounted at /shared, that’s 1% used.
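If you want those numbers in a script, the line can be split into its columns. The sketch below assumes the column order shown in the sample line above (source, size, used, available, use%, mount point); fs_summary is a hypothetical helper name, and the sample line is copied from this section rather than read from a live system.

```shell
# Sample line copied from the output shown above.
line='172.31.21.202@tcp:/zm5lzbmv 1.1T 1.2G 1.1T 1% /shared'

# Hypothetical helper: summarize one filesystem line.
# Assumed columns: source, size, used, available, use%, mount point.
fs_summary() {
    printf '%s\n' "$1" | awk '{ printf "%s: %s used of %s (%s)\n", $6, $3, $2, $5 }'
}

fs_summary "$line"
# prints: /shared: 1.2G used of 1.1T (1%)
```

On the cluster, the sample line could be replaced with the last line of the real output for /shared.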
In the next section we’ll install Spack on this shared filesystem!