HPC Performance Analysis

CU Boulder: CSCI 4830-014/7000-018 (Spring 2015)

Time: MW 6:30-7:45pm in ECCR 118

Instructor: Jed Brown, jed@jedbrown.org, ECOT 626

Class mailing list

Please use the hpc-perf-analysis mailing list for general discussion about the course. Notifications will be sent there.

Git repositories and assignments

In-class exercises and homework will be organized through the CUBoulder-HPCPerfAnalysis organization on GitHub. Each major topic will have its own repository. Ask questions using the issue tracker or inline comments, and submit changes using pull requests. You are encouraged to share work with others in the class, but please merge the original author's commit so that it is properly attributed.

New repositories will be created as we move to new topics.

Abstract and scope

Each High Performance Computing (HPC) architecture is more challenging than the last to utilize effectively. This is due to changing memory/CPU balance, higher performance variability, more complicated software stacks, larger scale, the necessity of optimally scaling algorithms, and changing science/engineering objectives. This special topics course will outline the current state of HPC architecture, design tradeoffs in the roadmap, and modeling of performance-limiting factors and whole-program performance. We will use Janus for hands-on experiments with representative applications, analyze profiles from larger machines of various architectures, and discuss examples from the research literature.

Architectural roadmaps and modeling

  • vectorization
  • instruction-level parallelism and hardware threads
  • memory hierarchy
  • coprocessors
  • modern networks
  • input/output and file systems
  • system software
  • variability and reliability

Designing performance experiments

  • application requirements and figures of merit
  • instrumenting software
  • how and when to scale up
  • performance tools
  • diagnosing bottlenecks and scalability problems
  • presentation of results
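
As a rough illustration of the "instrumenting software" and "presentation of results" topics above, the following minimal Python sketch times a kernel over several repetitions and reports summary statistics instead of a single measurement. It is not course material: the function names (`time_kernel`, `summarize`, `triad`), the repeat count, and the stand-in kernel are all assumptions chosen for illustration.

```python
import statistics
import time


def time_kernel(kernel, repeats=5):
    """Run kernel() `repeats` times and return the wall-clock seconds of each run."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        kernel()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Reduce raw timings to min, median, and max.

    A wide gap between min and max hints at run-to-run variability.
    """
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def triad(n=100_000):
    """Hypothetical stand-in kernel: a[i] = b[i] + 2.0 * c[i], in pure Python."""
    b = [1.0] * n
    c = [2.0] * n
    return [x + 2.0 * y for x, y in zip(b, c)]


if __name__ == "__main__":
    stats = summarize(time_kernel(triad))
    print(f"min {stats['min']:.4f}s  median {stats['median']:.4f}s  max {stats['max']:.4f}s")
```

Reporting the spread alongside a central value is one simple way to keep variability visible when presenting results; a real experiment would also record the machine, compiler, and problem size.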

Application case studies

  • explicit PDE solvers (seismic wave propagation, turbulence)
  • implicit PDE solvers and multigrid methods (geodynamics, structural mechanics)
  • irregular graph algorithms (network analysis, genomics, game trees)
  • dense linear algebra and tensors (quantum chemistry)
  • fast methods for N-body problems (molecular dynamics, cosmology)
  • cross-cutting: data assimilation, uncertainty quantification

Recommended References



Grading

50% in-class exercises and homework, 30% project, 20% oral exam