  • I ⁃ AWS HPC Overview
  • II ⁃ Getting Started in the Cloud
    • Prerequisites
    • a. Sign in to the Console
    • b. Create a Cloud9 Environment
    • c. Work with the AWS CLI
    • d. Create an S3 Bucket
    • e. Launch the EC2 Dashboard
    • f. Create an EC2 Instance
    • g. Create an IAM Role
    • Summary
  • III ⁃ Create an HPC Cluster
    • a. Install AWS ParallelCluster
    • b. Configure ParallelCluster
    • c. Create a Cluster Config
    • d. Build an HPC Cluster
    • e. Log in to Your Cluster
    • f. Submit your first HPC job
    • g. Update your cluster
    • h. Terminate Your Cluster
  • IV ⁃ Build a High-Performance File System
    • a. Install AWS ParallelCluster
    • b. Create your HPC Cluster
    • c. Examine the File System
    • d. About Lazy File Loading
    • e. Install IOR Benchmark Tool
    • f. Test IO Performance
    • g. View Metrics with CloudWatch
    • h. Summary and Cleanup
  • V ⁃ Simulations with AWS Batch
    • a. Lab Environment
    • b. Workshop Initial Setup
    • c. Build Your AMI with Packer
    • d. Create a Docker Repository
    • e. Set up Compute Environment
    • f. Set up a Job Queue
    • g. Set up a Job Definition
    • h. Describe Your Environment
    • i. Run a Single Job
    • j. Run an Array Job
    • k. Next Steps & Clean up
  • VI ⁃ Remote Visualization using NICE DCV
    • DCV using ParallelCluster
      • a. Create a cluster configured with NICE DCV
      • b. Build an HPC Cluster with NICE DCV
      • c. Connect to your NICE DCV Session
      • d. Terminate Your Cluster
    • DCV using web browser/native client
      • a. Deploy EC2 instance with NICE DCV
      • b. Connect to NICE DCV EC2 Instance
      • c. Connect to Remote Desktop Session
      • d. Terminate Your Instance
  • VII ⁃ Elastic Fabric Adapter (EFA)
    • a. EFA Basics
    • b. Create an HPC Cluster with EFA
    • c. Examine an EFA-enabled instance
    • d. Work With Intel MPI
    • e. Download, compile and run the OSU Benchmark
    • f. Delete Your EFA Cluster
  • VIII ⁃ Distributed Machine Learning
    • a. Upload training data to S3
    • b. Create a distributed ML cluster
    • c. Run single node data preprocessing with Slurm
    • d. Run PyTorch Data Parallel training on ParallelCluster
    • e. Delete Distributed ML Cluster
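
Part III revolves around a single configuration file that `pcluster create` reads. A minimal sketch of a ParallelCluster (v2-style INI) config — the key pair, VPC, and subnet IDs below are hypothetical placeholders:

```ini
# Hypothetical IDs and names; substitute your own region, key pair, VPC, and subnet.
[aws]
aws_region_name = us-east-1

[global]
cluster_template = default
update_check = true
sanity_check = true

[cluster default]
key_name = lab-key
base_os = alinux2
scheduler = slurm
master_instance_type = c5.xlarge
compute_instance_type = c5.xlarge
initial_queue_size = 0
max_queue_size = 8
vpc_settings = public

[vpc public]
vpc_id = vpc-0123456789abcdef0
master_subnet_id = subnet-0123456789abcdef0
```

With this file in place, `pcluster create my-cluster` builds the cluster, `pcluster ssh my-cluster -i ~/.ssh/lab-key.pem` logs in to the head node, and `pcluster delete my-cluster` tears everything down.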
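
Part IV attaches an FSx for Lustre file system linked to an S3 bucket. "Lazy loading" means file metadata is imported up front while file contents are fetched from S3 only on first read. A sketch of the relevant config sections, assuming a hypothetical bucket name and the minimum scratch capacity:

```ini
[cluster default]
fsx_settings = myfsx
# ...rest of the cluster section as before...

[fsx myfsx]
shared_dir = /lustre
storage_capacity = 1200
deployment_type = SCRATCH_2
import_path = s3://my-workshop-bucket
```

On the cluster, Lustre's `lfs hsm_state <file>` can then be used to check whether a given file's contents have actually been pulled from S3 yet.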
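
In Part V, jobs run as containers pulled from the Docker repository (Amazon ECR) and are described by a job definition. A hedged sketch of a job-definition JSON — account ID, image name, resource sizes, and command are placeholders — that could be registered with `aws batch register-job-definition --cli-input-json file://jobdef.json`:

```json
{
  "jobDefinitionName": "stress-test",
  "type": "container",
  "containerProperties": {
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/stress:latest",
    "vcpus": 2,
    "memory": 2048,
    "command": ["./run.sh"]
  },
  "retryStrategy": { "attempts": 2 }
}
```

A single job is then submitted with `aws batch submit-job --job-name test --job-queue my-queue --job-definition stress-test`, and an array job by adding `--array-properties size=10`; each child job reads its own index from the `AWS_BATCH_JOB_ARRAY_INDEX` environment variable.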
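
For the ParallelCluster route in Part VI, NICE DCV is enabled on the head node through the cluster config. A minimal sketch (v2-style INI; the `[dcv]` section name is arbitrary, and the open `access_from` range should be narrowed in practice):

```ini
[cluster default]
dcv_settings = dcv
# ...rest of the cluster section as before...

[dcv dcv]
enable = master
port = 8443
access_from = 0.0.0.0/0
```

`pcluster dcv connect my-cluster -k ~/.ssh/lab-key.pem` then opens the remote desktop session in a browser.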
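
Part VII's cluster differs mainly in enabling EFA on an EFA-capable compute instance type inside a placement group. A sketch of the relevant keys (v2-style INI):

```ini
[cluster default]
compute_instance_type = c5n.18xlarge   # an EFA-capable instance type
enable_efa = compute
placement_group = DYNAMIC
# ...rest of the cluster section as before...
```

On a compute node, `fi_info -p efa` should then list the `efa` libfabric provider, and the OSU latency and bandwidth benchmarks can be compiled against Intel MPI to compare EFA with plain TCP transport.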
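
Part VIII runs PyTorch DistributedDataParallel training across nodes under Slurm. A hedged sketch of a multi-node sbatch script — the training script name, GPU count, and launcher flags are assumptions (`torch.distributed.launch` was the usual launcher for PyTorch of this era):

```bash
#!/bin/bash
#SBATCH --job-name=ddp-train
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --exclusive

# The first node in the allocation acts as the rendezvous point.
export MASTER_ADDR=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)
export MASTER_PORT=29500
export NCCL_DEBUG=INFO   # surface NCCL/EFA issues in the job log

# One launcher per node; train.py (hypothetical) wraps its model in
# DistributedDataParallel and uses the NCCL backend for inter-GPU comms.
srun python -m torch.distributed.launch \
    --nproc_per_node=8 \
    --nnodes="$SLURM_NNODES" \
    --node_rank="$SLURM_NODEID" \
    --master_addr="$MASTER_ADDR" \
    --master_port="$MASTER_PORT" \
    train.py
```

Submitted with `sbatch train.sbatch`, this starts one launcher per node, each spawning eight worker processes that join the same process group.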

© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Tags

  • AWS
  • aws cli
  • aws console
  • basics
  • batch
  • Benchmark
  • cleanup
  • cloud9
  • cluster
  • compile
  • Conda
  • configuration
  • create
  • dashboard
  • data
  • data parallel
  • DCV
  • delete
  • ec2
  • EFA
  • fi_info
  • FSx
  • HPC
  • HSM
  • iam
  • initialize
  • install
  • intel
  • IntelMPI
  • IOR
  • Lazy Load
  • metrics
  • ML
  • module
  • mpi
  • multi gpu
  • multi node
  • nccl
  • Native Client
  • NICE
  • NICE DCV
  • optional
  • OSU
  • packer
  • ParallelCluster
  • Performance
  • preprocessing
  • Prerequisite
  • Remote Desktop
  • s3
  • sbatch
  • slurm
  • srun
  • summary
  • training
  • tutorial
  • Visualization
  • Web Browser