**Description:**

Based on a course developed by the author, *Introduction to High Performance Scientific Computing* introduces methods for adding parallelism to numerical methods for solving differential equations. It contains exercises and programming projects that facilitate learning, as well as examples and discussions based on the C programming language, with additional comments for those already familiar with C++.

The text provides an overview of concepts and algorithmic techniques for modern scientific computing and is divided into six self-contained parts that can be assembled in any order to create an introductory course using available computer hardware. Part I introduces the C programming language for those not already familiar with programming in a compiled language. Part II describes parallelism on shared-memory architectures using OpenMP. Part III details parallelism on computer clusters using MPI to coordinate a computation. Part IV demonstrates the use of graphics processing units (GPUs) to solve problems using the CUDA language for NVIDIA graphics cards. Part V addresses programming on GPUs for non-NVIDIA graphics cards using the OpenCL framework. Finally, Part VI contains a brief discussion of numerical methods and applications, giving the reader an opportunity to test the methods on typical computing problems.

*Introduction to High Performance Scientific Computing* is intended for advanced undergraduate or beginning graduate students who have limited exposure to programming or parallel programming concepts. Extensive knowledge of numerical methods is not assumed. The material can be adapted to the available computational hardware, from OpenMP on simple laptops or desktops, to MPI on computer clusters, to CUDA and OpenCL for computers containing NVIDIA or other graphics cards. Experienced programmers unfamiliar with parallel programming will benefit from comparing the various methods to determine the type of parallel programming best suited for their application.

**Contents:**

**Preface**

**Chapter 1: Tools of the Trade** •
Integrated Development Environments • Text Editors • Compilers • Makefiles •
Debugging • Profilers • Version Control • Exercise

__Part I: Elementary C
Programming__

**Chapter 2: Structure of a C Program** • Comments • The Preamble • The Main Program • Exercise
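
To illustrate the three pieces this chapter names (a generic example, not an excerpt from the book), here is a complete C program with a comment, a preamble, and a main program:

```c
/* A comment: ignored by the compiler. */
#include <stdio.h>   /* the preamble: headers and declarations */

int main(void)       /* the main program: execution starts here */
{
    printf("Hello, World!\n");
    return 0;
}
```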

**Chapter 3: Data Types and Structures** •
Basic Data Types • Mathematical Operations and Assigning Values • Boolean
Algebra • Mathematical Functions • Arrays and Pointers • Structures • Exercises

**Chapter 4: Input and Output** •
Terminal I/O • File I/O • Exercises

**Chapter 5: Flow Control** • for Loops • while and do-while Loops • if–then–else • switch–case Statements • Exercises

**Chapter 6: Functions** • Declarations and
Definitions • Function Arguments • Measuring Performance • Exercises

**Chapter 7: Using Libraries** • BLAS
and LAPACK • FFTW • Random Number Generation • Exercises

**Chapter 8: Projects for Serial Programming** •
Random Processes • Finite Difference Methods • Elliptic Equations and
Successive Overrelaxation • Pseudospectral Methods

__Part II: Parallel Computing
Using OpenMP__

**Chapter 9: Intro to OpenMP** • Creating Multiple Threads: #pragma omp parallel • The num_threads Clause • The shared Clause • The private Clause • The reduction Clause • Exercise
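
As a flavor of what this chapter covers (an illustrative sketch, not an excerpt from the book; all names are the editor's), a parallel region combining the num_threads and reduction clauses:

```c
#include <omp.h>
#include <stdio.h>

int main(void)
{
    int sum = 0;

    /* Fork a team of 4 threads; each adds its thread ID to its own
       copy of sum, and the reduction clause combines the copies. */
    #pragma omp parallel num_threads(4) reduction(+:sum)
    {
        sum += omp_get_thread_num();
    }

    printf("sum of thread IDs = %d\n", sum);  /* 0+1+2+3 = 6 */
    return 0;
}
```

Compiled with an OpenMP-aware flag such as gcc -fopenmp.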

**Chapter 10: Subdividing for Loops** • Subdividing Loops: #pragma omp for • The private Clause • The reduction Clause • The ordered Clause • The schedule Clause • The nowait Clause • Efficiency Measures • Exercises
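
A sketch of the chapter's central construct (illustrative, not from the book): the iterations of a loop are subdivided among the threads of an enclosing parallel region, here with the schedule and reduction clauses:

```c
#include <stdio.h>

int main(void)
{
    double sum = 0.0;

    #pragma omp parallel
    {
        /* Iterations 1..1000000 are split among the threads in
           fixed-size (static) chunks; partial sums are combined. */
        #pragma omp for schedule(static) reduction(+:sum)
        for (int i = 1; i <= 1000000; i++)
            sum += 1.0 / (double)i;
    }

    printf("harmonic sum = %f\n", sum);
    return 0;
}
```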

**Chapter 11: Serial Tasks Inside Parallel Regions** •
Serial Subregions: #pragma omp single • The copyprivate Clause • Exercise

**Chapter 12: Distinct Tasks in Parallel** •
Multiple Parallel Distinct Tasks: #pragma omp sections • The private Clause •
The reduction Clause • The nowait Clause • Exercise

**Chapter 13: Critical and Atomic Code** •
Atomic Statements • Critical Statements • Exercise
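
A sketch contrasting the two constructs (illustrative, not from the book): atomic protects a single memory update cheaply, while critical serializes an arbitrary block of code:

```c
#include <omp.h>
#include <stdio.h>

int main(void)
{
    int count = 0;
    double maxval = -1.0;

    #pragma omp parallel num_threads(8)
    {
        double x = 0.1 * omp_get_thread_num();

        /* atomic: one protected update of a single memory location */
        #pragma omp atomic
        count++;

        /* critical: only one thread at a time runs this whole block */
        #pragma omp critical
        {
            if (x > maxval) maxval = x;
        }
    }

    printf("count = %d, max = %f\n", count, maxval);
    return 0;
}
```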

**Chapter 14: OpenMP Libraries** •
FFTW • Random Numbers • Exercises

**Chapter 15: Projects for OpenMP Programming** •
Random Processes • Finite Difference Methods • Elliptic Equations and
Successive Overrelaxation • Pseudospectral Methods

__Part III: Distributed Programming and MPI__

**Chapter 16: Preliminaries** • Parallel “Hello World!” • Compiling and Running MPI Code • Submitting Jobs • Exercise
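
The parallel “Hello World!” in its generic form (a sketch, not necessarily the book's version):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's ID     */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* number of processes   */

    printf("Hello World! from rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}
```

Typically compiled with mpicc and launched with, e.g., mpirun -np 4 ./a.out or through a cluster's job scheduler.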

**Chapter 17: Passing Messages** •
Blocking Send and Receive • Nonblocking Send and Receive • Combined Send and
Receive • Gather/Scatter Communications • Broadcast and Reduction • Error
Handling • Exercises
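
A minimal sketch of blocking point-to-point communication (illustrative, not from the book; run with at least two processes):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        double msg = 3.14159;
        /* Blocking send: returns once the buffer can be reused. */
        MPI_Send(&msg, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        double msg;
        /* Blocking receive: waits until a matching message arrives. */
        MPI_Recv(&msg, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %f\n", msg);
    }

    MPI_Finalize();
    return 0;
}
```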

**Chapter 18: Groups and Communicators** •
Subgroups and Communicators • Communicators Using Split • Grid Communicators •
Exercises

**Chapter 19: Measuring Efficiency and Checkpointing** • Efficiency Measures • Checkpointing

**Chapter 20: MPI Libraries** • ScaLAPACK • FFTW • Exercises

**Chapter 21: Projects for Distributed Programming** •
Random Processes • Finite Difference Methods • Elliptic Equations and SOR •
Pseudospectral Methods

__Part IV: GPU Programming and
CUDA__

**Chapter 22: Intro to CUDA** •
First CUDA Program • Selecting the Correct GPU • CUDA Development Tools •
Exercises

**Chapter 23: Parallel CUDA Using Blocks** • Running Kernels in Parallel • Organization of Blocks • Threads • Error Handling • Exercises
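
A sketch of the block/thread hierarchy (illustrative, not from the book; unified memory is used for brevity):

```c
#include <stdio.h>

/* Each thread computes one element; blockIdx and threadIdx locate
   the thread within the grid of blocks launched below. */
__global__ void add(int n, const float *a, const float *b, float *c)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main(void)
{
    const int n = 1 << 20;
    float *a, *b, *c;
    cudaMallocManaged(&a, n * sizeof(float));
    cudaMallocManaged(&b, n * sizeof(float));
    cudaMallocManaged(&c, n * sizeof(float));
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    /* Launch enough 256-thread blocks to cover all n elements. */
    add<<<(n + 255) / 256, 256>>>(n, a, b, c);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);  /* expect 3.000000 */
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```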

**Chapter 24: GPU Memory** • Shared Memory •
Constant Memory • Texture Memory • Warp Shuffles • Exercises

**Chapter 25: Streams** • Stream Creation and
Destruction • Asynchronous Memory Copies • Single Stream Example • Multiple
Streams • General Strategies • Measuring Performance • Exercises

**Chapter 26: CUDA Libraries** •
MAGMA • cuRAND • cuFFT • cuSPARSE • Exercises

**Chapter 27: Projects for CUDA Programming** •
Random Processes • Finite Difference Methods • Elliptic Equations and SOR •
Pseudospectral Methods

__Part V: GPU Programming and
OpenCL__

**Chapter 28: Intro to OpenCL** •
First OpenCL Program • Setting Up the Context • Exercise

**Chapter 29: Parallel OpenCL Using Work-Groups** •
Parallel OpenCL Example • Compiling Kernel Functions • Error Handling •
Organization of Work-Groups • Work-Items • Exercises
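
A compact sketch of an OpenCL launch (illustrative, not from the book; error checking omitted). The kernel is compiled from source at run time, and the global range of work-items is organized into work-groups of 64:

```c
#include <stdio.h>
#include <CL/cl.h>

/* Kernel source: each work-item computes one element of c. */
const char *src =
    "__kernel void add(__global const float *a,"
    "                  __global const float *b,"
    "                  __global float *c) {"
    "    size_t i = get_global_id(0);"
    "    c[i] = a[i] + b[i];"
    "}";

int main(void)
{
    enum { N = 1024 };
    float a[N], b[N], c[N];
    for (int i = 0; i < N; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    cl_platform_id plat; cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_DEFAULT, 1, &dev, NULL);
    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);

    /* Build the kernel from source at run time. */
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "add", NULL);

    cl_mem da = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                               sizeof a, a, NULL);
    cl_mem db = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                               sizeof b, b, NULL);
    cl_mem dc = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, sizeof c, NULL, NULL);
    clSetKernelArg(k, 0, sizeof da, &da);
    clSetKernelArg(k, 1, sizeof db, &db);
    clSetKernelArg(k, 2, sizeof dc, &dc);

    /* N work-items in total, divided into work-groups of 64. */
    size_t global = N, local = 64;
    clEnqueueNDRangeKernel(q, k, 1, NULL, &global, &local, 0, NULL, NULL);
    clEnqueueReadBuffer(q, dc, CL_TRUE, 0, sizeof c, c, 0, NULL, NULL);

    printf("c[0] = %f\n", c[0]);  /* expect 3.000000 */
    return 0;
}
```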

**Chapter 30: GPU Memory** • Local Memory •
Constant Memory • Exercises

**Chapter 31: Command Queues** •
Events • Measuring Performance • Concurrency • Using Multiple Devices •
Exercises

**Chapter 32: OpenCL Libraries** • Random • clMAGMA • clFFT • Exercises

**Chapter 33: Projects for OpenCL Programming** •
Random Processes • Finite Difference Methods • Elliptic Equations and SOR •
Pseudospectral Methods

__Part VI: Applications__

**Chapter 34: Stochastic Differential Equations** •
Mathematical Description • Numerical Methods • Problems to Solve
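
For an SDE of the form dX_t = f(X_t) dt + g(X_t) dW_t, the simplest widely used scheme is Euler–Maruyama (a standard method, shown for orientation; whether the chapter uses this exact scheme is an assumption):

```latex
X_{n+1} = X_n + f(X_n)\,\Delta t + g(X_n)\,\Delta W_n,
\qquad \Delta W_n \sim \mathcal{N}(0, \Delta t)
```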

**Chapter 35: Finite Difference Methods** •
Approximating Spatial Derivatives • Finite Difference Grid • Approximating
Temporal Derivatives • Problems to Solve
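
The workhorse approximations on a uniform grid with spacing h are the second-order centered differences (standard formulas, shown for orientation):

```latex
u'(x_i) \approx \frac{u_{i+1} - u_{i-1}}{2h},
\qquad
u''(x_i) \approx \frac{u_{i+1} - 2u_i + u_{i-1}}{h^2}
```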

**Chapter 36: Iterative Solution of Elliptic Equations** •
Problems to Solve
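
For the five-point discretization of the Poisson equation ∇²u = f on a grid with spacing h, one sweep of successive overrelaxation (the SOR method named in the project chapters) updates each interior value as (standard formula, with relaxation parameter ω):

```latex
u_{i,j} \leftarrow (1-\omega)\,u_{i,j}
+ \frac{\omega}{4}\left(u_{i+1,j} + u_{i-1,j} + u_{i,j+1} + u_{i,j-1} - h^2 f_{i,j}\right)
```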

**Chapter 37: Pseudospectral Methods** •
Fourier Transform • Spectral Differentiation • Pseudospectral Method • Higher
Dimensions • Problems to Solve
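
The identity behind spectral differentiation (standard, shown for orientation): expanding u in a Fourier series turns differentiation into multiplication by ik in frequency space, so a pseudospectral derivative is an FFT, a multiplication by ik, and an inverse FFT:

```latex
u(x) = \sum_k \hat{u}_k\, e^{ikx}
\quad\Longrightarrow\quad
u'(x) = \sum_k (ik)\,\hat{u}_k\, e^{ikx}
```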

**Bibliography**

**Index**

**About the Author:**

**David L. Chopp** is a professor in the
Northwestern University Engineering Sciences and Applied Mathematics
Department, where he has been teaching since 1996 and has been chair since
2013. He was named a Charles Deering McCormick Professor of Teaching Excellence
in 2008. Chopp has developed multiple courses and is the author of nearly 50
refereed publications, including fundamental contributions to the development
of the popular level set method for computing moving interfaces. His research
interests include numerical methods and mathematical modeling in applications
such as microbiology, materials science, fracture mechanics, and neurobiology.

**Target Audience:**

The book can be used for courses on parallel scientific computing, high performance computing, and numerical methods for parallel computing.