b. Create your HPC Cluster

In this step, you create a cluster configuration that includes parameters for Amazon FSx for Lustre.

If you are not familiar with AWS ParallelCluster, we recommend that you first complete the AWS ParallelCluster lab before proceeding.

Create an Amazon S3 Bucket and Upload Files

First, create an Amazon S3 bucket and upload files to it. You can then retrieve these files using Amazon FSx for Lustre.

  1. Open a terminal in your AWS Cloud9 instance.
  2. Run the following commands to create a new Amazon S3 bucket. These commands also download two example files and store them in this bucket: a MatrixMarket file and a velocity model from the Society of Exploration Geophysicists.

    # generate a unique postfix
    BUCKET_POSTFIX=$(uuidgen --random | cut -d'-' -f1)
    echo "Your bucket name will be mybucket-${BUCKET_POSTFIX}"
    aws s3 mb s3://mybucket-${BUCKET_POSTFIX}
    
    # retrieve local copies
    wget ftp://math.nist.gov/pub/MatrixMarket2/misc/cylshell/s3dkq4m2.mtx.gz
    wget http://s3.amazonaws.com/open.source.geoscience/open_data/seg_eage_salt/SEG_C3NA_Velocity.sgy
    
    # upload to your bucket
    aws s3 cp s3dkq4m2.mtx.gz s3://mybucket-${BUCKET_POSTFIX}/s3dkq4m2.mtx.gz
    aws s3 cp SEG_C3NA_Velocity.sgy s3://mybucket-${BUCKET_POSTFIX}/SEG_C3NA_Velocity.sgy
    
    # delete local copies
    rm s3dkq4m2.mtx.gz
    rm SEG_C3NA_Velocity.sgy

Before continuing to the next step, check the content of your bucket, either in the AWS console or with the AWS CLI command aws s3 ls s3://mybucket-${BUCKET_POSTFIX}. Next, build your AWS ParallelCluster configuration.
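
The bucket check can also be scripted. The following is a minimal sketch, not part of the original lab: bucket_has_files is a hypothetical helper name, and it assumes that aws s3 ls prints one object per line with the object name as the last field.

```shell
# Succeed only if every expected object name appears in the bucket listing.
# $1: bucket name; remaining arguments: object names to look for
# (bucket_has_files is a hypothetical helper, not an AWS CLI command)
bucket_has_files() {
    bucket=$1
    shift
    listing=$(aws s3 ls "s3://${bucket}") || return 1
    for name in "$@"; do
        # assumption: the object name is the last whitespace-separated field
        if ! printf '%s\n' "$listing" | awk '{print $NF}' | grep -qxF "$name"; then
            echo "missing: $name" >&2
            return 1
        fi
    done
}

# Usage:
# bucket_has_files "mybucket-${BUCKET_POSTFIX}" s3dkq4m2.mtx.gz SEG_C3NA_Velocity.sgy
```

The function stops at the first missing file and reports its name, which makes a failed upload easy to spot.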

Create a Cluster Configuration File for Amazon FSx for Lustre

This section assumes that you are familiar with AWS ParallelCluster and the process of bootstrapping a cluster.

Next, generate a new key pair and a new default AWS ParallelCluster configuration.

The cluster configuration that you generate for Amazon FSx for Lustre includes the following settings:

  • A 3.6 TB Lustre file system that uses the Amazon S3 bucket created previously as its import and export path.
  • c4.xlarge instances for the head node and the compute nodes. You can change the instance type, but you may run into EC2 limits that prevent you from creating instances or from creating as many as you request.
  • A placement group to maximize the bandwidth between instances and reduce latency.
  • A cluster that starts with 0 compute nodes, with a minimum size of 0 and a maximum size of 8 instances. The cluster uses Auto Scaling groups that grow and shrink between these limits based on cluster utilization and job queue backlog.
  • A gp2 Amazon EBS volume that is attached to the head node and shared through NFS, so that the compute nodes can mount it on /shared. This is generally a good location for applications or scripts. Keep in mind that the /home directory is shared over NFS as well.
  • Slurm as the job scheduler. You can use other options, such as SGE.

For more details about the configuration options, see the AWS ParallelCluster User Guide, in particular its fsx parameters section.
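
The 3.6 TB Lustre file system in the list above corresponds to storage_capacity = 3600 in the configuration, because that parameter is expressed in GiB. If you want a different size, the sketch below checks a candidate value first. It assumes the SCRATCH_1 deployment type, for which the accepted values are, to the best of our knowledge, 1200, 2400, and any multiple of 3600; confirm this against the fsx parameters documentation. valid_scratch1_capacity is a hypothetical helper name.

```shell
# Return success if $1 is an accepted SCRATCH_1 storage capacity in GiB.
# Assumed rule (verify in the documentation): 1200, 2400, or a multiple of 3600.
valid_scratch1_capacity() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
    esac
    [ "$1" -eq 1200 ] || [ "$1" -eq 2400 ] ||
        { [ "$1" -gt 0 ] && [ $(( $1 % 3600 )) -eq 0 ]; }
}

# Usage:
# valid_scratch1_capacity 3600 && echo "3600 GiB is accepted"
```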

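If you are using a different terminal than the one above, the BUCKET_POSTFIX variable is not set in it. Set it again and make sure that the Amazon S3 bucket name is correct before generating the configuration.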
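
One way to guard against an empty BUCKET_POSTFIX is to check it before pasting the commands. This is a minimal sketch; require_value is a hypothetical helper name and is not part of the lab.

```shell
# Fail early when a required variable is unset or empty.
# $1: variable name (used only for the message), $2: its current value
require_value() {
    if [ -z "$2" ]; then
        echo "ERROR: $1 is empty; set it before generating the configuration" >&2
        return 1
    fi
    echo "$1 = $2"
}

# Usage:
# require_value BUCKET_POSTFIX "${BUCKET_POSTFIX:-}"
#
# If the variable is empty, list your buckets to find the name again:
# aws s3 ls | grep mybucket-
```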

Paste the following commands into your terminal:

# generate a new key pair; remove these lines if you want to use the previous one
aws ec2 create-key-pair --key-name lab-4-your-key --query KeyMaterial --output text > ~/.ssh/lab-4-key
chmod 600 ~/.ssh/lab-4-key

# create the cluster configuration
IFACE=$(curl --silent http://169.254.169.254/latest/meta-data/network/interfaces/macs/)
SUBNET_ID=$(curl --silent http://169.254.169.254/latest/meta-data/network/interfaces/macs/${IFACE}/subnet-id)
VPC_ID=$(curl --silent http://169.254.169.254/latest/meta-data/network/interfaces/macs/${IFACE}/vpc-id)
AZ=$(curl --silent http://169.254.169.254/latest/meta-data/placement/availability-zone)
REGION=${AZ::-1}


mkdir -p ~/.parallelcluster
cat > ~/.parallelcluster/config << EOF
[aws]
aws_region_name = ${REGION}

[global]
cluster_template = default
update_check = false
sanity_check = true

[cluster default]
key_name = lab-4-your-key
vpc_settings = public
ebs_settings = myebs
fsx_settings = myfsx
compute_instance_type = c4.xlarge
master_instance_type = c4.xlarge
cluster_type = ondemand
placement_group = DYNAMIC
placement = compute
max_queue_size = 8
initial_queue_size = 0
disable_hyperthreading = true
scheduler = slurm

[vpc public]
vpc_id = ${VPC_ID}
master_subnet_id = ${SUBNET_ID}

[ebs myebs]
shared_dir = /shared
volume_type = gp2
volume_size = 20

[fsx myfsx]
shared_dir = /lustre
storage_capacity = 3600
import_path = s3://mybucket-${BUCKET_POSTFIX}

[aliases]
ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS}
EOF
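
The REGION=${AZ::-1} line above derives the Region by removing the last character of the Availability Zone name. The negative length in that expansion is a bash feature that is unavailable in POSIX sh and in older bash releases. The sketch below shows an equivalent that works in any POSIX shell. region_from_az is a hypothetical helper name, and the approach assumes a standard zone name such as us-east-1a.

```shell
# Strip the trailing zone letter from an Availability Zone name.
# Assumes a standard name such as "us-east-1a"; other zone naming
# schemes (for example Local Zones) would need different handling.
region_from_az() {
    printf '%s\n' "${1%?}"
}

# Usage:
# REGION=$(region_from_az "$AZ")
```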

If you want to check the content of your configuration file, use the following command:

cat ~/.parallelcluster/config
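
Because the configuration is written through a here-document, any shell variable that was empty leaves a blank value in the file. The following sketch flags such lines. check_config is a hypothetical helper name, and the pattern only covers the cases relevant to this lab: a key with no value, or a bucket name that ends right after mybucket-.

```shell
# Print (with line numbers) config lines whose value is empty or whose
# bucket name lacks its postfix. Returns 0 when no such line is found.
# $1: path to the configuration file
check_config() {
    if grep -nE '=[[:space:]]*$|s3://mybucket-[[:space:]]*$' "$1"; then
        echo "Some values look unset; set the variables again and regenerate the file." >&2
        return 1
    fi
    return 0
}

# Usage:
# check_config ~/.parallelcluster/config
```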

Now, you are ready to create a cluster.

Generate a Cluster for Amazon FSx for Lustre

Create the cluster using the following command.

pcluster create my-fsx-cluster

This cluster creates additional resources for Amazon FSx for Lustre, so it takes a few minutes longer to create than the cluster in the previous AWS ParallelCluster workshop.
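
To follow the creation from a second terminal, you can poll the cluster status. The sketch below is not part of the lab. wait_for_cluster is a hypothetical helper name, and it assumes that pcluster status prints a line of the form "Status: CREATE_COMPLETE"; adjust the pattern if your version prints something different.

```shell
# Poll `pcluster status` until the cluster is ready, fails, or we give up.
# $1: cluster name; $2: seconds between polls (default 30);
# $3: maximum number of polls (default 60)
wait_for_cluster() {
    interval=${2:-30}
    attempts=${3:-60}
    while [ "$attempts" -gt 0 ]; do
        # assumption: the output contains a line such as "Status: CREATE_COMPLETE"
        status=$(pcluster status "$1" 2>/dev/null | awk '/^Status:/ {print $2; exit}')
        case "$status" in
            CREATE_COMPLETE)
                echo "cluster $1 is ready"
                return 0 ;;
            *FAILED*|*ROLLBACK*|DELETE_*)
                echo "cluster $1 ended in state: $status" >&2
                return 1 ;;
        esac
        attempts=$((attempts - 1))
        sleep "$interval"
    done
    echo "timed out waiting for cluster $1" >&2
    return 1
}

# Usage:
# wait_for_cluster my-fsx-cluster
```

The poll limit keeps the loop from running forever if the status line never appears, for example when the cluster name is mistyped.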

Connect to Your Cluster

Once the cluster is created, connect to it.

pcluster ssh my-fsx-cluster -i ~/.ssh/lab-4-key

Next, take a deeper look at the Lustre file system.