Book Download: Parallel Scientiﬁc Computing in C ++ and MPI

Contents
1 Scientiﬁc Computing and Simulation Science 2
1.1 What is Simulation? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 A Seamless Approach Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 The Concept of Programming Language . . . . . . . . . . . . . . . . . . . . 6
1.4 Why C++ and What is MPI? . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 What About OpenMP? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.6 Algorithms and Top Ten List . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2 Basic Concepts and Tools 12
2.1 Introduction to C++ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1.1 Two Basic Concepts in C++ . . . . . . . . . . . . . . . . . . . . . . . 14
2.1.2 Learning the Syntax and Other Basic Commands . . . . . . . . . . . 21
2.1.3 Learning to Print . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.1.4 Learning to Read . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.1.5 How to Program in Style . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.2 Mathematical and Computational Concepts . . . . . . . . . . . . . . . . . . 41
2.2.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.2.2 Binary Numbers and Round-oﬀ . . . . . . . . . . . . . . . . . . . . . 41
2.2.3 Condition Number . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.2.4 Vector and Matrix Norms . . . . . . . . . . . . . . . . . . . . . . . . 44
2.2.5 Eigenvalues and Eigenvectors . . . . . . . . . . . . . . . . . . . . . . 46
2.2.6 Memory Management . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.2.7 Basic Linear Algebra - BLAS . . . . . . . . . . . . . . . . . . . . . . 52
2.2.8 Exploiting the Structure of Sparse Matrices . . . . . . . . . . . . . . 61
2.2.9 Gram-Schmidt Vector Orthogonalization . . . . . . . . . . . . . . . . 62
2.3 Parallel Computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
2.3.1 From Supercomputing to Soupercomputing . . . . . . . . . . . . . . . 70
2.3.2 Mathematical Parallelism and Recursive-Doubling . . . . . . . . . . . 76
2.3.3 Amdahl’s Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.3.4 MPI - Message Passing Interface . . . . . . . . . . . . . . . . . . . . 80
2.4 Homework Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3 Approximation 94
3.1 Polynomial Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.1.1 Vandermonde and Newton Interpolation . . . . . . . . . . . . . . . . 95
3.1.2 Arrays in C++ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
3.1.3 Lagrangian Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . 114
3.1.4 The Runge Phenomenon . . . . . . . . . . . . . . . . . . . . . . . . . 117
3.1.5 Chebyshev Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . 120
3.1.6 Hermite Interpolation and Splines . . . . . . . . . . . . . . . . . . . . 126
3.1.7 Least-Squares Approximation . . . . . . . . . . . . . . . . . . . . . . 131
3.1.8 Introduction to Classes . . . . . . . . . . . . . . . . . . . . . . . . . . 142
3.1.9 Multi-Dimensional Interpolations . . . . . . . . . . . . . . . . . . . . 153
3.1.10 Simple Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
3.1.11 Curvilinear Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
3.2 Fourier Series Representation . . . . . . . . . . . . . . . . . . . . . . . . . . 163
3.2.1 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
3.2.2 Periodic Extension of Functions . . . . . . . . . . . . . . . . . . . . . 166
3.2.3 Diﬀerentiation and the Lanczos Filter . . . . . . . . . . . . . . . . . . 168
3.2.4 Trigonometric Interpolation . . . . . . . . . . . . . . . . . . . . . . . 171
3.2.5 Noisy Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
3.2.6 Matrix Representation . . . . . . . . . . . . . . . . . . . . . . . . . . 174
3.2.7 The Fast Fourier Transform (FFT) . . . . . . . . . . . . . . . . . . . 176
3.2.8 The Fastest Fourier Transform in the West - FFTW . . . . . . . . . . 178
3.3 Wavelet Series Representation . . . . . . . . . . . . . . . . . . . . . . . . . . 181
3.3.1 Basic Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
3.3.2 Dilation Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
3.3.3 Discrete Wavelet Transform: Mallat’s Algorithm . . . . . . . . . . . . 188
3.3.4 Some Orthonormal Wavelets . . . . . . . . . . . . . . . . . . . . . . . 190
3.4 Back to Parallel Computing: Send and Receive . . . . . . . . . . . . . . . . 197
3.5 Homework Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
3.5.1 Homework Problems for Section 3.1 . . . . . . . . . . . . . . . . . . . 201
3.5.2 Homework Problems for Section 3.2 . . . . . . . . . . . . . . . . . . . 205
3.5.3 Homework Problems for Section 3.3 . . . . . . . . . . . . . . . . . . . 206
4 Roots and Integrals 207
4.1 Root Finding Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
4.1.1 Polynomial Equations . . . . . . . . . . . . . . . . . . . . . . . . . . 210
4.1.2 Fixed Point Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
4.1.3 Newton-Raphson Method . . . . . . . . . . . . . . . . . . . . . . . . 217
4.1.4 Passing Functions to Functions in C++ . . . . . . . . . . . . . . . . . 221
4.1.5 Secant Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
4.1.6 Systems of Nonlinear Equations . . . . . . . . . . . . . . . . . . . . . 227
4.1.7 Solution via Minimization:
Steepest Descent and Conjugate Gradients . . . . . . . . . . . . . . . 230
4.2 Numerical Integration Methods . . . . . . . . . . . . . . . . . . . . . . . . . 240
4.2.1 Simple Integration Algorithms . . . . . . . . . . . . . . . . . . . . . . 240
4.2.2 Advanced Quadrature Rules . . . . . . . . . . . . . . . . . . . . . . . 248
4.2.3 Multi-Dimensional Integration . . . . . . . . . . . . . . . . . . . . . . 265
4.3 Back to Parallel Computing: Reduction . . . . . . . . . . . . . . . . . . . . . 268
4.4 Homework Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
4.4.1 Homework Problems for Section 4.1 . . . . . . . . . . . . . . . . . . . 275
4.4.2 Homework Problems for Section 4.2 . . . . . . . . . . . . . . . . . . . 279
5 Explicit Discretizations 281
5.1 Explicit Space Discretizations . . . . . . . . . . . . . . . . . . . . . . . . . . 282
5.1.1 Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
5.1.2 Uniform Grids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
5.1.3 MPI Parallel Implementation of Finite Diﬀerences . . . . . . . . . . . 296
5.1.4 Multi-Dimensional Arrays in C++ . . . . . . . . . . . . . . . . . . . 304
5.1.5 Non-Uniform Grids . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
5.1.6 One-Dimensional Boundary Value Problem . . . . . . . . . . . . . . . 314
5.1.7 Multi-Dimensional Discretizations . . . . . . . . . . . . . . . . . . . . 316
5.2 Explicit Time Discretizations . . . . . . . . . . . . . . . . . . . . . . . . . . 323
5.2.1 Multi-Step Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
5.2.2 Convergence: Consistency and Stability . . . . . . . . . . . . . . . . . 326
5.2.3 Stability and Characteristic Polynomials . . . . . . . . . . . . . . . . 328
5.2.4 Runge-Kutta Methods . . . . . . . . . . . . . . . . . . . . . . . . . . 334
5.2.5 Stability of Runge-Kutta Methods . . . . . . . . . . . . . . . . . . . . 338
5.3 Homework Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
6 Implicit Discretizations 345
6.1 Implicit Space Discretizations . . . . . . . . . . . . . . . . . . . . . . . . . . 346
6.1.1 Diﬀerence Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
6.1.2 Method of Undetermined Coeﬃcients . . . . . . . . . . . . . . . . . . 349
6.1.3 One-Dimensional Boundary Value Problem . . . . . . . . . . . . . . . 357
6.1.4 Thomas Algorithm for Tridiagonal Systems . . . . . . . . . . . . . . . 359
6.1.5 Parallel Algorithm for Tridiagonal Systems . . . . . . . . . . . . . . . 367
6.2 Implicit Time Discretizations . . . . . . . . . . . . . . . . . . . . . . . . . . 378
6.2.1 Fundamental Theorems for Multi-Step Methods . . . . . . . . . . . . 381
6.2.2 Stability of Stiﬀ ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . 381
6.2.3 Second-Order Initial Value Problems . . . . . . . . . . . . . . . . . . 384
6.2.4 How to March in Time . . . . . . . . . . . . . . . . . . . . . . . . . . 386
6.3 Homework Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
7 Relaxation: Discretization
and Solvers 390
7.1 Discrete Models of Unsteady Diﬀusion . . . . . . . . . . . . . . . . . . . . . 391
7.1.1 Temporal and Spatial Discretization . . . . . . . . . . . . . . . . . . 392
7.1.2 Accuracy of Diﬀerence Equation . . . . . . . . . . . . . . . . . . . . . 393
7.1.3 Stability of Diﬀerence Equation . . . . . . . . . . . . . . . . . . . . . 394
7.1.4 Spectrum of the Diﬀusion Operator . . . . . . . . . . . . . . . . . . . 403
7.1.5 Multi-Dimensional Time-Space Stencils . . . . . . . . . . . . . . . . . 409
7.2 Iterative Solvers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
7.2.1 Jacobi Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
7.2.2 Parallel Jacobi Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 422
7.2.3 Gauss-Seidel Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 431
7.2.4 Parallel (Black-Red) Gauss-Seidel Algorithm . . . . . . . . . . . . . . 433
7.2.5 Successive Acceleration Techniques - SOR . . . . . . . . . . . . . . . 436
7.2.6 Symmetric Successive Acceleration Techniques - SSOR . . . . . . . . 438
7.2.7 SSOR with Chebyshev Acceleration . . . . . . . . . . . . . . . . . . . 439
7.2.8 Convergence Analysis of Iterative Solvers . . . . . . . . . . . . . . . . 441
7.2.9 Relaxed Jacobi and Gauss-Seidel . . . . . . . . . . . . . . . . . . . . 445
7.2.10 The Multigrid Method . . . . . . . . . . . . . . . . . . . . . . . . . . 449
7.3 Homework Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
8 Propagation: Numerical Diﬀusion and Dispersion 466
8.1 Advection Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
8.1.1 Dispersion and Diﬀusion . . . . . . . . . . . . . . . . . . . . . . . . . 467
8.1.2 Other Advection Equations . . . . . . . . . . . . . . . . . . . . . . . 469
8.1.3 First-Order Discrete Schemes . . . . . . . . . . . . . . . . . . . . . . 470
8.1.4 High-Order Discrete Schemes . . . . . . . . . . . . . . . . . . . . . . 482
8.1.5 Eﬀects of Boundary Conditions . . . . . . . . . . . . . . . . . . . . . 493
8.2 Advection-Diﬀusion Equation . . . . . . . . . . . . . . . . . . . . . . . . . . 497
8.2.1 Discrete Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
8.2.2 Eﬀects of Boundary Conditions . . . . . . . . . . . . . . . . . . . . . 505
8.3 MPI: Non-Blocking Communications . . . . . . . . . . . . . . . . . . . . . . 509
8.4 Homework Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
9 Fast Linear Solvers 517
9.1 Gaussian Elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518
9.1.1 LU Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520
9.1.2 To Pivot or Not to Pivot? . . . . . . . . . . . . . . . . . . . . . . . . 524
9.1.3 Parallel LU Decomposition . . . . . . . . . . . . . . . . . . . . . . . . 530
9.1.4 Parallel Back Substitution . . . . . . . . . . . . . . . . . . . . . . . . 534
9.1.5 Gaussian Elimination and Sparse Systems . . . . . . . . . . . . . . . 546
9.1.6 Parallel Cyclic Reduction for Tridiagonal Systems . . . . . . . . . . . 547
9.2 Cholesky Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559
9.3 QR Factorization and Householder Transformation . . . . . . . . . . . . . . 560
9.3.1 Hessenberg and Tridiagonal Reduction . . . . . . . . . . . . . . . . . 568
9.4 Preconditioned Conjugate Gradient Method - PCGM . . . . . . . . . . . . . 572
9.4.1 Convergence Rate of CGM . . . . . . . . . . . . . . . . . . . . . . . . 572
9.4.2 Preconditioners . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 573
9.4.3 Toeplitz Matrices and Circulant Preconditioners . . . . . . . . . . . . 577
9.4.4 Parallel PCGM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 578
9.5 Non-Symmetric Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 585
9.5.1 The Arnoldi Iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . 586
9.5.2 GMRES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 590
9.5.3 GMRES(k) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 594
9.5.4 Preconditioning GMRES . . . . . . . . . . . . . . . . . . . . . . . . . 597
9.5.5 Parallel GMRES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 597
9.6 What Solver to Choose? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 598
9.7 Available Software for Fast Solvers . . . . . . . . . . . . . . . . . . . . . . . 601
9.8 Homework Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 602
10 Fast Eigensolvers 608
10.1 Local Eigensolvers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 609
10.1.1 Basic Power Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 609
10.1.2 Inverse Shifted Power Method . . . . . . . . . . . . . . . . . . . . . . 612
10.2 Householder Deﬂation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 616
10.3 Global Eigensolvers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623
10.3.1 The QR Eigensolver . . . . . . . . . . . . . . . . . . . . . . . . . . . 623
10.3.2 The Hessenberg QR Eigensolver . . . . . . . . . . . . . . . . . . . . . 625
10.3.3 Shifted QR Eigensolver . . . . . . . . . . . . . . . . . . . . . . . . . . 625
10.3.4 The Symmetric QR Eigensolver: Wilkinson Shift . . . . . . . . . . . 627
10.3.5 Parallel QR Eigensolver: Divide-and-Conquer . . . . . . . . . . . . . 627
10.3.6 The Lanczos Eigensolver . . . . . . . . . . . . . . . . . . . . . . . . . 635
10.4 Generalized Eigenproblems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 638
10.4.1 The QZ Eigensolver . . . . . . . . . . . . . . . . . . . . . . . . . . . . 639
10.4.2 Singular Eigenproblems . . . . . . . . . . . . . . . . . . . . . . . . . 639
10.4.3 Polynomial Eigenproblems . . . . . . . . . . . . . . . . . . . . . . . . 640
10.5 Arnoldi Method: Non-Symmetric Eigenproblems . . . . . . . . . . . . . . . . 640
10.6 Available Software for Eigensolvers . . . . . . . . . . . . . . . . . . . . . . . 641
10.7 Homework Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643
A A. C++ Basics 646
A.1 Compilation Guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 646
A.2 C++ Basic Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
A.3 C++ Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
A.3.1 Input/Output Library – iostream.h . . . . . . . . . . . . . . . . . . . 647
A.3.2 Input/Output Manipulation Library – iomanip.h . . . . . . . . . . . 648
A.3.3 Mathematics Library – math.h . . . . . . . . . . . . . . . . . . . . . 648
A.4 Operator Precedence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
A.5 C++ and BLAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
B B. MPI Basics 651
B.1 Compilation Guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 651
B.2 MPI Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 652
B.2.1 Predeﬁned Variable Types in MPI . . . . . . . . . . . . . . . . . . . . 652
B.2.2 Predeﬁned Reduction Operators in MPI . . . . . . . . . . . . . . . . 653
B.2.3 MPI Function Declarations . . . . . . . . . . . . . . . . . . . . . . . . 653
B.2.4 MPI Constants and Deﬁnitions . . . . . . . . . . . . . . . . . . . . . 672

Another Parallel Programming Books
Download

Book Download

Wednesday, July 27, 2011

Parallel Scientiﬁc Computing in C ++ and MPI

No comments:

Post a Comment

Link Exchange