  • I ⁃ AWS HPC Overview
  • II ⁃ Getting Started in the Cloud
    • Prerequisites
    • a. Sign in to the Console
    • b. Create a Cloud9 Environment
    • c. Work with the AWS CLI
    • d. Create an S3 Bucket
    • e. Launch the EC2 Dashboard
    • f. Create an EC2 Instance
    • g. Create an IAM Role
    • Summary
  • III ⁃ Create an HPC Cluster
    • a. Install AWS ParallelCluster
    • b. Configure ParallelCluster
    • c. Create a Cluster Config
    • d. Build an HPC Cluster
    • e. Log in to Your Cluster
    • f. Submit your first HPC job
    • g. Update your cluster
    • h. Terminate Your Cluster
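The chapter III steps map onto a single ParallelCluster config file plus a handful of `pcluster` commands. A minimal sketch, assuming the v2-era INI format (the region, key pair, and VPC/subnet IDs are placeholders to fill in from your own account):

```ini
# Minimal AWS ParallelCluster (v2) config sketch -- all IDs are placeholders
[aws]
aws_region_name = us-east-1

[global]
cluster_template = default
update_check = true
sanity_check = true

[cluster default]
key_name = your-ec2-keypair
base_os = alinux2
scheduler = slurm
master_instance_type = c5.xlarge
compute_instance_type = c5.xlarge
initial_queue_size = 0
max_queue_size = 8
vpc_settings = default

[vpc default]
vpc_id = vpc-xxxxxxxx
master_subnet_id = subnet-xxxxxxxx
```

With this saved as `~/.parallelcluster/config`, `pcluster create my-cluster` builds the cluster, `pcluster ssh my-cluster` logs in, and `pcluster delete my-cluster` tears it down.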
  • IV ⁃ Build a High-Performance File System
    • a. Install AWS ParallelCluster
    • b. Create your HPC Cluster
    • c. Examine the File System
    • About Lazy File Loading
    • d. Install IOR Benchmark Tool
    • e. Test IO Performance
    • f. View Metrics with CloudWatch
    • g. Summary and Cleanup
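Chapter IV layers an Amazon FSx for Lustre file system onto the cluster. In the v2 INI format this is an `[fsx]` section referenced from the cluster section; a sketch with placeholder values (the S3 import path is what enables the lazy file loading examined above):

```ini
[cluster default]
# ...existing cluster settings...
fsx_settings = lustre

[fsx lustre]
shared_dir = /lustre
storage_capacity = 1200
deployment_type = SCRATCH_2
# Files in the linked bucket appear in /lustre immediately, but their
# contents are fetched from S3 only on first access (lazy loading)
import_path = s3://your-workshop-bucket
```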
  • V ⁃ Simulations with AWS Batch
    • a. Lab Environment
    • b. Workshop Initial Setup
    • c. Build Your AMI with Packer
    • d. Create a Docker Repository
    • e. Set up Compute Environment
    • f. Set up a Job Queue
    • g. Set up a Job Definition
    • h. Describe Your Environment
    • i. Run a Single Job
    • j. Run an Array Job
    • k. Next Steps & Clean up
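The Batch steps above culminate in registering a job definition and submitting jobs against it. A hedged sketch of the JSON you might pass to `aws batch register-job-definition` (the ECR image URI, names, and command are placeholders):

```json
{
  "jobDefinitionName": "workshop-sim",
  "type": "container",
  "containerProperties": {
    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/workshop-repo:latest",
    "vcpus": 2,
    "memory": 2048,
    "command": ["./run-simulation.sh"]
  },
  "retryStrategy": { "attempts": 1 }
}
```

A single job then runs via `aws batch submit-job --job-name test --job-queue <queue> --job-definition workshop-sim`; an array job adds `--array-properties size=N`.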
  • VI ⁃ Remote Visualization using NICE DCV
    • DCV using ParallelCluster
      • a. Create a cluster configured with NICE DCV
      • b. Build an HPC Cluster with NICE DCV
      • c. Connect to your NICE DCV Session
      • d. Terminate Your Cluster
    • DCV using web browser/native client
      • a. Deploy EC2 instance with NICE DCV
      • b. Connect to NICE DCV EC2 Instance
      • c. Connect to Remote Desktop Session
      • d. Terminate Your Instance
  • VII ⁃ Elastic Fabric Adapter (EFA)
    • a. EFA Basics
    • b. Create an HPC Cluster with EFA
    • c. Examine an EFA enabled instance
    • d. Work With Intel MPI
    • e. Download, compile and run the OSU Benchmark
    • f. Delete Your EFA Cluster
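In the v2 INI format, enabling EFA on the compute fleet comes down to an EFA-capable instance type, a cluster placement group, and the `enable_efa` flag. A sketch:

```ini
[cluster efa-demo]
# EFA requires an EFA-capable instance type and a cluster placement group
compute_instance_type = c5n.18xlarge
placement_group = DYNAMIC
enable_efa = compute
```

Once on the cluster, `fi_info -p efa` confirms that the libfabric EFA provider is visible to MPI.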
  • VIII ⁃ Distributed Machine Learning
    • a. Upload training data to S3
    • b. Create a distributed ML cluster
    • c. Run single node data preprocessing with Slurm
    • d. Run PyTorch Data Parallel training on ParallelCluster
    • e. Delete Distributed ML Cluster
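The preprocessing and data-parallel training steps are both driven through Slurm. A sketch of a multi-node PyTorch DDP submission script, assuming a hypothetical `train.py` and 8 GPUs per node (node counts and the rendezvous port are illustrative):

```shell
#!/bin/bash
#SBATCH --job-name=ddp-train
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --exclusive

# First node in the allocation serves as the rendezvous host
MASTER=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)

# One launcher per node; torchrun spawns one worker per GPU (8 assumed here)
srun python -m torch.distributed.run \
    --nnodes="$SLURM_NNODES" --nproc_per_node=8 \
    --rdzv_backend=c10d --rdzv_endpoint="${MASTER}:29500" \
    train.py
```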

© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.

metrics

  • f. View Metrics with CloudWatch