COURSE TITLE: ECE 358 Introduction to Parallel Computing
CATALOG DESCRIPTION: Introduction to parallel computing for scientists and engineers. Shared memory parallel architectures and programming, distributed memory, message-passing data-parallel architectures, and programming.
REQUIRED TEXTS: A. Grama, A. Gupta, G. Karypis, and Vipin Kumar, Introduction to Parallel Computing, Addison Wesley, 2nd edition, 2003; Class Notes.
REFERENCE TEXTS:
1. I. Foster, Designing and Building Parallel Programs, Addison Wesley 1995.
2. B. Bauer, Practical Parallel Programming, Academic Press 1992.
3. W. Gropp et al, Using MPI: Portable Parallel Programming with the Message Passing Interface, MIT Press 1994.
4. C. Koelbel et al, The High-Performance Fortran Handbook, MIT Press 1994.
COURSE COORDINATOR: Gokhan Memik
COURSE GOALS: To provide an introduction to the field of parallel computing. The goals are to provide an overview of the three basic types of parallel computing: shared memory, distributed memory message-passing, and data parallel computing, with hands-on experience with real parallel programming on actual parallel machines.
PREREQUISITES: ECE 361 and (ECE 230 or ECE 231 or CS 211)
PREREQUISITES BY TOPIC:
DETAILED COURSE TOPICS:
Week 1: Introduction to parallel computing: motivation for parallel computing, options of parallel computing, economics of parallel computing, basic concepts of parallel algorithms. Introduction to parallel programming: data and task parallelism, coarse and fine grain parallelism, performance of parallel programs, load balancing and scheduling, analysis of simple parallel programs.
Week 2: Overview of shared memory parallel architectures: memory organization, interconnect organization, cache coherence, case studies of machines such as SGI Challenge, IBM J-30, HP/Convex Exemplar. Introduction to shared memory parallel programming: shared memory model, process creation and destruction, mutual exclusion, locks, barriers.
Week 3: Explicit shared memory programming: loop scheduling, static and dynamic, loop parallelization strategies. Shared memory parallel programming: use of PTHREADS libraries, case studies of explicit parallel programming, computation of PI, matrix multiplication, solution of partial differential equations.
Week 4: Implicit shared memory parallel programming: use of compiler directives for parallel programming, DOALL and DOACROSS and PRAGMA directives for loop level parallelism, parallel programming examples using directives.
Week 5: Distributed memory multicomputer architectures: overview of distributed memory parallel machines, message passing schemes, store and forward versus wormhole routing, interconnection networks, case studies of parallel machines such as Intel Paragon, IBM SP-2, Thinking Machine CM-5. Global Communication operations in distributed memory machines: one-to-all broadcast, reduction, shift, scatter, gather operations, analysis of performance of above operations on various parallel architectures.
Week 6: Introduction to message-passing programming: basics of message passing, global and local addresses, single-program multiple data (SPMD programs) introduction to Message Passing Interface (MPI). Intermediate concepts in message passing programming: global and local addresses, loop scheduling for parallel loops. Advanced message-passing concepts: topologies, and decompositions, case studies of example applications, matrix multiplication, solution of partial differential equations.
Week 7: Introduction to SIMD parallel architectures: Single-instruction multiple data stream architectures, control and data units, interconnection networks, case studies of machines such as Thinking Machines CM-2, CM-5 and Masspar MP-2. Introduction to data parallel programming: Fortran-90, array sections, array operations, array intrinsic operations.
Week 8: Introduction to High Performance Fortran (HPF): FORALL directives, INDEPENDENT directives, simple parallel programs. High Performance Fortran data distribution and alignment directives, simple parallel programming examples, matrix multiplication, solution of partial differential equations.
Week 9: Methodology for Parallel Algorithm Design: concurrency, locality, scalability, modularity; partitioning, agglomeration, communication, mapping, performance analysis of parallel algorithms. Parallel Matrix Algorithms: matrix representations, parallel dense matrix operations, matrix-vector, matrix-matrix multiplication, solutions of linear system of equations.
Week 10: Parallel Sparse Matrix Solvers: sparse matrix representations, parallel iterative methods, parallel direct methods. Parallel Search Algorithms: optimization methods, parallel best first search, parallel depth-first search, speedup anomalies.
COMPUTER USAGE: Students get hands-on parallel programming experience on 3 parallel machines at the Center for Parallel and Distributed Computing, including a 16 processor IBM SP-2 distributed memory machine, an 8 processor IBM J-40 shared memory machine, and an 8-processor SGI Origin 2000 distributed shared memory multiprocessor.
HOMEWORK ASSIGNMENTS:
Homework 1: Design problems dealing with shared memory parallel programming, examples of program transformations to parallelize loops, use of explicit and implicit parallel programs using both the SGI directives and PTHREADS.
Homework 2: Design problems dealing with distributed memory message-passing parallel programming, use of MPI, analysis of communication patterns.
Homework 3: Design programs related to data parallel programming, use of High Performance Fortran, data layouts and alignments.
Homework 4: Design of parallel algorithms for various problems including matrix operations on dense and sparse matrices, analysis of parallel algorithms.
LABORATORY PROJECTS:
Lab 1: Development of a parallel program for solving a set of linear system of equations using Gaussian Elimination using both explicit parallel programs with PTHREADS and implicit parallel programs using SGI directions, experiments on SGI Origin 2000 and IBM J-40 shared memory multiprocessors.
Lab 2: Development of a parallel program for solving a set of linear system of equations using Gaussian Elimination using message-passing parallel programming using C or Fortran with MPI message passing, and experiments on IBM SP-2 distributed memory and SGI Origin 2000 multiprocessor for portability.
Lab 3: Development of a parallel program for solving a set of linear system of equations using Gaussian Elimination using data parallel programming with High Performance Fortran and experiments on the IBM SP-2 distributed memory multiprocessor.
GRADES:
Four homeworks - 20 %
Three labs - 20 %
Midterm exam - 30%
Final exam - 30%
COURSE OBJECTIVES: When a student completes this course, s/he should be able to:
ABET CONTENT CATEGORY: 100% Engineering (Design component).