  • I ⁃ AWS HPC Overview
  • II ⁃ ParallelCluster UI
    • a. Deploy ParallelCluster UI
    • b. Connect to ParallelCluster UI
    • c. Summary
  • III ⁃ ParallelCluster using CLI (optional)
    • Prerequisites
    • a. Sign in to the Console
    • b. Create a Cloud9 Environment
    • c. Work with the AWS CLI
    • d. Create a Key Pair
    • e. Install AWS ParallelCluster
    • f. (Optional) Create config with 'pcluster configure'
    • g. Create a Cluster Config
    • h. Build an HPC Cluster
    • i. Log in to Your Cluster
    • j. Submit your first HPC job
    • Summary
  • IV ⁃ Create an HPC Cluster
    • a. Create a Cluster
    • b. Connect to the Cluster
    • c. Get to know your Cluster
    • f. Submit your first HPC job
    • g. Update your cluster
    • h. Terminate Your Cluster
  • V ⁃ Build a High-Performance File System
    • a. Create HPC Cluster
    • b. Create FSx Lustre
    • c. Create S3 Bucket
    • d. Link S3 to FSx Lustre
    • e. Examine the File System
    • f. About Lazy File Loading
    • g. Install IOR Benchmark Tool
    • h. View Metrics with CloudWatch
    • i. Test IO Performance
    • j. Summary and Cleanup
  • VI ⁃ Remote Visualization using NICE DCV
    • DCV Connect in ParallelCluster
      • a. Connect to your NICE DCV Session
      • a. Create a cluster configured with NICE DCV
    • DCV Queue in ParallelCluster
      • a. Create Security Group
      • b. Modify Cluster Configuration
      • c. Create DCV Session
      • d. No-Ingress DCV Session
    • DCV using web browser/native client
      • a. Deploy EC2 instance with NICE DCV
      • b. Connect to NICE DCV EC2 Instance
      • c. Connect to Remote Desktop Session
      • d. Terminate Your Instance
  • VII ⁃ Elastic Fabric Adapter (EFA)
    • a. EFA Basics
    • b. Create an HPC Cluster with EFA
    • c. Examine an EFA enabled instance
    • d. Work With Intel MPI
    • e. Download, compile and run the OSU Benchmark
    • f. Delete Your EFA Cluster
  • VIII ⁃ Cost controls
    • a. Configure Slurm Accounting and Prerequisites
    • b. Create Cost Controls
    • c. Test Cost Controls
    • d. View Metrics in CloudWatch
    • e. Summary
  • IX ⁃ Distributed Machine Learning
    • a. Upload training data to S3
    • b. Create a distributed ML cluster
    • c. Run single node data preprocessing with Slurm
    • d. Run PyTorch Data Parallel training on ParallelCluster
    • e. Delete Distributed ML Cluster

More

  • Github
  • AWS - HPC
  • External Workshops
  • Conference Workshops
  • Authors
  • Feedback / Questions?
  • Tags

Privacy | Site Terms | © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Tags

  • create
  • ParallelCluster
  • tutorial
  • cleanup
  • cluster
  • Conda
  • configuration
  • data
  • data parallel
  • EFA
  • FSx
  • Machine Learning
  • ML
  • multi gpu
  • multi node
  • nccl
  • preprocessing
  • s3
  • sbatch
  • Slurm
  • srun
  • training
  • basics
  • delete
  • Benchmark
  • compile
  • EC2
  • MPI
  • OSU
  • intel
  • IntelMPI
  • module
  • fi_info
  • Elastic Fabric Adapter
  • HPC
  • Network
  • aws console
  • DCV
  • HSM
  • initialize
  • install
  • IOR
  • Lazy Load
  • metrics
  • Native Client
  • NICE
  • NICE DCV
  • Performances
  • Prerequisite
  • Remote Desktop
  • summary
  • Visualization
  • Web Browser
  • Batch
  • Introduction
  • Optional
  • Overview
  • aws cli
  • cloud9
  • key-pair
  • parallelcluster-ui