In this step, you create a cluster configuration that includes parameters for Amazon FSx for Lustre.
If you are not familiar with AWS ParallelCluster, we recommend that you first complete the AWS ParallelCluster lab before proceeding.
First, create an Amazon S3 bucket and upload two example files. Later, you retrieve these files through Amazon FSx for Lustre.
Run the following commands to create a new Amazon S3 bucket. These commands also download two example files and upload them to the bucket: a MatrixMarket matrix and a velocity model from the Society of Exploration Geophysicists.
# generate a unique postfix
BUCKET_POSTFIX=$(uuidgen --random | cut -d'-' -f1)
echo "Your bucket name will be mybucket-${BUCKET_POSTFIX}"
aws s3 mb s3://mybucket-${BUCKET_POSTFIX}
# retrieve local copies
wget ftp://math.nist.gov/pub/MatrixMarket2/misc/cylshell/s3dkq4m2.mtx.gz
wget http://s3.amazonaws.com/open.source.geoscience/open_data/seg_eage_salt/SEG_C3NA_Velocity.sgy
# upload to your bucket
aws s3 cp s3dkq4m2.mtx.gz s3://mybucket-${BUCKET_POSTFIX}/s3dkq4m2.mtx.gz
aws s3 cp SEG_C3NA_Velocity.sgy s3://mybucket-${BUCKET_POSTFIX}/SEG_C3NA_Velocity.sgy
# delete local copies
rm s3dkq4m2.mtx.gz
rm SEG_C3NA_Velocity.sgy
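To confirm that both uploads landed in the bucket, the output of aws s3 ls can be scanned for the expected file names. The helper below is a minimal sketch, not part of the workshop: check_bucket_listing is a hypothetical name, and it assumes the object name is the last whitespace-separated field on each listing line, which holds for the two file names used here.

```shell
# Hypothetical helper: read an `aws s3 ls` listing on stdin and report
# whether each file name given as an argument appears in it.
check_bucket_listing() {
  local listing name status=0
  listing=$(cat)   # capture the listing once so it can be scanned per file
  for name in "$@"; do
    # Assumption: the object name is the last field of a listing line.
    if printf '%s\n' "$listing" |
      awk -v f="$name" '$NF == f { found = 1 } END { exit !found }'; then
      echo "found:   ${name}"
    else
      echo "missing: ${name}"
      status=1
    fi
  done
  return "$status"
}

# Intended use (requires the AWS CLI and the bucket created above):
#   aws s3 ls "s3://mybucket-${BUCKET_POSTFIX}" |
#     check_bucket_listing s3dkq4m2.mtx.gz SEG_C3NA_Velocity.sgy
```

The function returns a non-zero status when any file is missing, so it can gate the next step in a script.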
Before continuing to the next step, check the content of your bucket in the AWS console or with the following AWS CLI command:
aws s3 ls s3://mybucket-${BUCKET_POSTFIX}
Now, build your AWS ParallelCluster configuration.
This section assumes that you are familiar with AWS ParallelCluster and the process of bootstrapping a cluster.
Generate a new key-pair and new default AWS ParallelCluster configuration.
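The cluster configuration that you generate for Amazon FSx for Lustre includes the following settings, all taken from the configuration file created below: a SCRATCH_2 Lustre file system with 1,200 GiB of storage, mounted at /lustre and linked to your Amazon S3 bucket through import_path; c5.18xlarge compute instances with hyper-threading disabled, scaling from 0 to 8 instances in a dynamic placement group; a c5.xlarge head node running the Slurm scheduler; and a 20 GiB gp2 Amazon EBS volume shared at /shared.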
For more details about the configuration options, see the configuration reference and the fsx parameters section of the AWS ParallelCluster User Guide.
If you are using a different terminal session than the one above, the BUCKET_POSTFIX variable is not set there, so make sure that the Amazon S3 bucket name in import_path is correct.
Paste the following commands into your terminal:
# generate a new key pair; remove these two lines if you want to use the previous one
aws ec2 create-key-pair --key-name lab-4-your-key --query KeyMaterial --output text > ~/.ssh/lab-4-key
chmod 600 ~/.ssh/lab-4-key
# create the cluster configuration
IFACE=$(curl --silent http://169.254.169.254/latest/meta-data/network/interfaces/macs/)
SUBNET_ID=$(curl --silent http://169.254.169.254/latest/meta-data/network/interfaces/macs/${IFACE}/subnet-id)
VPC_ID=$(curl --silent http://169.254.169.254/latest/meta-data/network/interfaces/macs/${IFACE}/vpc-id)
AZ=$(curl --silent http://169.254.169.254/latest/meta-data/placement/availability-zone)
REGION=${AZ::-1}
mkdir -p ~/.parallelcluster
cat > ~/.parallelcluster/config << EOF
[aws]
aws_region_name = ${REGION}
[global]
cluster_template = default
update_check = false
sanity_check = false
[cluster default]
key_name = lab-4-your-key
vpc_settings = public
base_os = alinux2
ebs_settings = myebs
fsx_settings = myfsx
compute_instance_type = c5.18xlarge
master_instance_type = c5.xlarge
cluster_type = ondemand
placement_group = DYNAMIC
placement = compute
max_queue_size = 8
initial_queue_size = 0
disable_hyperthreading = true
scheduler = slurm
[vpc public]
vpc_id = ${VPC_ID}
master_subnet_id = ${SUBNET_ID}
[ebs myebs]
shared_dir = /shared
volume_type = gp2
volume_size = 20
[fsx myfsx]
shared_dir = /lustre
storage_capacity = 1200
import_path = s3://mybucket-${BUCKET_POSTFIX}
deployment_type = SCRATCH_2
[aliases]
ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS}
EOF
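The commands above derive REGION by dropping the last character of the Availability Zone name (REGION=${AZ::-1}). The sketch below isolates that step on literal values. It assumes a standard Availability Zone name made of the Region followed by a single letter, such as us-east-1a; other zone types may not follow that pattern. The function name region_from_az is hypothetical, and the ${az%?} form is an equivalent expansion that also works in shells without negative-length substring support.

```shell
# Hypothetical helper: strip the trailing zone letter from an
# Availability Zone name to obtain the Region name.
region_from_az() {
  local az="$1"
  # "%?" removes exactly one trailing character, like ${az::-1} in newer bash.
  printf '%s\n' "${az%?}"
}

region_from_az "us-east-1a"   # prints us-east-1
region_from_az "eu-west-3c"   # prints eu-west-3
```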
If you want to check the content of your configuration file, use the following command:
cat ~/.parallelcluster/config
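Because the configuration file is written from shell variables, an unset variable (for example BUCKET_POSTFIX in a new terminal) silently produces an empty value. The following is a minimal sketch of a check for that case; check_pcluster_config is a hypothetical helper, not an AWS ParallelCluster command, and it only looks for keys with no value and for a bucket name that ends right after the mybucket- prefix.

```shell
# Hypothetical helper: flag config lines whose value expanded to nothing.
# Returns 0 if nothing suspicious is found, 1 if empty values are found,
# and 2 if the file cannot be read.
check_pcluster_config() {
  local config_file="${1:-$HOME/.parallelcluster/config}"
  local bad
  if [ ! -r "$config_file" ]; then
    echo "Cannot read ${config_file}" >&2
    return 2
  fi
  # A line ending in "=" has an empty value; "s3://mybucket-" with nothing
  # after the dash means BUCKET_POSTFIX was unset when the file was written.
  bad=$(grep -nE '=[[:space:]]*$|s3://mybucket-[[:space:]]*$' "$config_file")
  if [ -n "$bad" ]; then
    echo "Empty values in ${config_file}:"
    echo "$bad"
    return 1
  fi
  echo "No empty values found in ${config_file}"
}

# Intended use, after the configuration file has been written:
#   check_pcluster_config ~/.parallelcluster/config
```

If the check reports a line, set the missing variable again and rerun the commands that write the configuration file before creating the cluster.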
Now, you are ready to create a cluster.
Create the cluster using the following command.
pcluster create my-fsx-cluster
This cluster creates additional resources for Amazon FSx for Lustre, so it takes a few minutes longer to create than the cluster in the previous AWS ParallelCluster workshop.
Once the cluster is created, connect to it.
pcluster ssh my-fsx-cluster -i ~/.ssh/lab-4-key
Next, take a deeper look at the Lustre file system.