INTRODUCTION TO PARALLEL COMPUTING

COURSE NUMBER: ECE 358

INSTRUCTOR:  Prithviraj Banerjee

CATALOG DESCRIPTION:

Introduction to parallel computing for scientists and engineers. Shared memory parallel architectures and programming: concepts including a shared address space, locks, events, barriers, loop scheduling, compiler directives such as DOALL, and portable parallel libraries such as PTHREADS. Distributed memory message-passing parallel architectures and programming: concepts including message sends and receives, global communication primitives, single-program multiple-data (SPMD) programs, and portable message-passing programming using MPI. Data parallel architectures and programming: concepts such as array sections and array operations, data distribution and alignment, and languages such as High Performance Fortran (HPF). Parallel algorithms for engineering applications.

PRE-REQUISITES

ECE 361 (Computer Architecture) and ECE 230 (Programming for Computer Engineers) or equivalent.

REQUIRED TEXTS:

Class notes (copies of lecture transparencies) to be handed out to students.

RECOMMENDED TEXTS:

1. V. Kumar et al., Introduction to Parallel Computing, Benjamin/Cummings, 1994.

2. I. Foster, Designing and Building Parallel Programs, Addison-Wesley, 1994.

3. B. Bauer, Practical Parallel Programming, Academic Press, 1992. Log in to "origin.cpdc.ece.nwu.edu" and type "insight" for an online version.

4. W. Gropp et al., Using MPI: Portable Parallel Programming with the Message-Passing Interface, MIT Press, 1994.

5. C. Koelbel et al., The High Performance Fortran Handbook, MIT Press, 1994.

LABORATORY:

Northwestern University Center for Parallel and Distributed Computing Lab; accounts on various supercomputer centers to be arranged.

CLASS NEWSGROUP:

nwu.school.meas.class.ece-358

Fall 2002 Class

Lectures are Tuesdays-Thursdays 12:30-2PM, TECH LG66.
Instructor: Prith Banerjee, Office Hours Tuesdays 4-5PM, L352

GRADES:

Four homeworks and programming assignments (10% each) - 40% total

Two exams (30% each) - 60% total

LECTURE SCHEDULE:

For the current offering of the course (Fall 2002 quarter), see the Fall 2002 Lecture Schedule.

DETAILED COURSE OUTLINE:

The following is an approximate breakdown of the topics covered in this course (assuming each lecture lasts 1.5 hours).

Lecture 1: Introduction to parallel computing: motivation for parallel computing, options for parallel computing, economics of parallel computing, basic concepts of parallel algorithms.

Lecture 2: Introduction to parallel programming: data and task parallelism, coarse- and fine-grain parallelism, performance of parallel programs, load balancing and scheduling, analysis of simple parallel programs.
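
One result from this lecture, Amdahl's law, already explains much of what follows: with serial fraction f, the speedup on p processors is at most 1/(f + (1-f)/p). A minimal sketch (the 5% serial fraction is an arbitrary choice for illustration):

    #include <stdio.h>

    /* Amdahl's law: upper bound on speedup when a fraction f of the
       work is inherently serial. */
    double amdahl(double f, int p) {
        return 1.0 / (f + (1.0 - f) / p);
    }

    int main(void) {
        /* even 5% serial work caps the achievable speedup quickly */
        for (int p = 1; p <= 64; p *= 2)
            printf("p = %2d  speedup <= %5.2f\n", p, amdahl(0.05, p));
        return 0;
    }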

Lecture 3: Overview of shared memory parallel architectures: memory organization, interconnect organization, cache coherence, case studies of machines such as SGI Challenge, IBM J-30, HP/Convex Exemplar.

Lecture 4: Introduction to shared memory parallel programming: shared memory model, process creation and destruction, mutual exclusion, locks, barriers.
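
A minimal sketch of mutual exclusion with a lock, using the standard POSIX PTHREADS calls (the thread and iteration counts are arbitrary):

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4

    static long counter = 0;                 /* shared state */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg) {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);       /* enter critical section */
            counter++;
            pthread_mutex_unlock(&lock);     /* leave critical section */
        }
        return NULL;
    }

    int main(void) {
        pthread_t t[NTHREADS];
        for (int i = 0; i < NTHREADS; i++)
            pthread_create(&t[i], NULL, worker, NULL);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(t[i], NULL);
        printf("counter = %ld\n", counter);  /* always NTHREADS * 100000 */
        return 0;
    }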

Lecture 5: Explicit shared memory programming: static and dynamic loop scheduling, loop parallelization strategies.
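
As a taste of static scheduling, one common convention hands thread id (of p) a contiguous block of a loop over 0..n-1; the helper name below is hypothetical, not part of any library:

    /* Static block scheduling: thread `id` of `p` gets iterations
       [*lo, *hi) of a loop over 0..n-1, with the remainder spread
       one extra iteration each over the first n % p threads. */
    void block_bounds(int id, int p, int n, int *lo, int *hi) {
        int base = n / p, rem = n % p;
        *lo = id * base + (id < rem ? id : rem);
        *hi = *lo + base + (id < rem ? 1 : 0);
    }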

Lecture 6: Shared memory parallel programming: use of the PTHREADS library, case studies of explicit parallel programming, computation of pi, matrix multiplication, solution of partial differential equations.
Download PTHREAD DOCUMENTATION
Download Example PTHREAD program hello.c
Download Makefile for compiling PTHREADS Programs
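
In the same spirit as the hello.c example above, a minimal sketch of the pi computation with PTHREADS (interval and thread counts are arbitrary; it integrates 4/(1+x^2) over [0,1]):

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4
    #define N 1000000                  /* number of integration intervals */

    static double pi = 0.0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *partial(void *arg) {
        long id = (long)arg;
        double h = 1.0 / N, sum = 0.0;
        /* cyclic assignment of intervals to threads */
        for (long i = id; i < N; i += NTHREADS) {
            double x = h * (i + 0.5);
            sum += 4.0 / (1.0 + x * x);
        }
        pthread_mutex_lock(&lock);     /* one locked update per thread */
        pi += h * sum;
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    int main(void) {
        pthread_t t[NTHREADS];
        for (long i = 0; i < NTHREADS; i++)
            pthread_create(&t[i], NULL, partial, (void *)i);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(t[i], NULL);
        printf("pi ~= %.10f\n", pi);
        return 0;
    }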

Lecture 7: Implicit shared memory parallel programming: use of compiler directives for parallel programming, DOALL, DOACROSS, and PRAGMA directives for loop-level parallelism, parallel programming examples using directives.
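
The exact directives are vendor-specific; as one widely available stand-in, an OpenMP pragma expresses the same DOALL-style loop-level parallelism (the arrays here are made up for illustration):

    #include <stdio.h>

    #define N 1000000

    int main(void) {
        static double a[N], b[N];
        /* DOALL-style directive: the iterations are independent, so
           the compiler is free to run them on several threads */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            a[i] = 2.0 * b[i] + 1.0;
        printf("a[0] = %f\n", a[0]);
        return 0;
    }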

Lecture 8: Distributed memory multicomputer architectures: overview of distributed memory parallel machines, message passing schemes, store-and-forward versus wormhole routing, interconnection networks, case studies of parallel machines such as the Intel Paragon, IBM SP-2, and Thinking Machines CM-5.

Lecture 9: Global communication operations in distributed memory machines: one-to-all broadcast, reduction, shift, scatter, and gather operations, with analysis of their performance on various parallel architectures.
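
A minimal sketch of two of these operations with the standard MPI calls (the broadcast value is arbitrary):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int rank, data = 0, sum = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) data = 42;
        /* one-to-all broadcast from rank 0 */
        MPI_Bcast(&data, 1, MPI_INT, 0, MPI_COMM_WORLD);

        /* all-to-one reduction: the sum of all ranks arrives at rank 0 */
        MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("broadcast %d, sum of ranks %d\n", data, sum);

        MPI_Finalize();
        return 0;
    }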

Lecture 10: Introduction to message-passing programming: basics of message passing, global and local addresses, single-program multiple-data (SPMD) programs, introduction to the Message Passing Interface (MPI).
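
A minimal SPMD sketch with one point-to-point send and receive (tag and message value are arbitrary; run with at least two processes):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int rank, msg;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* every process runs this same program (SPMD); behavior
           branches on the process rank */
        if (rank == 0) {
            msg = 99;
            MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 received %d from rank 0\n", msg);
        }

        MPI_Finalize();
        return 0;
    }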

Lecture 11: Intermediate concepts in message-passing programming: global and local addresses, loop scheduling for parallel loops.
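
As an illustration of global versus local addressing, the sketch below runs a global loop of n iterations cyclically over nprocs processes; each process translates the global index into an index into its own local slice (the function and array names are hypothetical):

    /* Global iteration space 0..n-1, distributed cyclically:
       global index i lives on process i % nprocs at local
       index i / nprocs. */
    void local_update(double *local_a, int n, int rank, int nprocs) {
        for (int i = rank; i < n; i += nprocs) {
            int li = i / nprocs;      /* global -> local translation */
            local_a[li] = 2.0 * i;    /* work on the local element */
        }
    }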

Lecture 12: Advanced message-passing concepts: topologies and decompositions, case studies of example applications, matrix multiplication, solution of partial differential equations.
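
A minimal sketch of a 2-D process topology using MPI's Cartesian routines (the grid shape is chosen by MPI_Dims_create from the process count):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int dims[2] = {0, 0}, periods[2] = {0, 0}, coords[2];
        int rank, nprocs;
        MPI_Comm grid;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* let MPI factor nprocs into a 2-D grid, then build it */
        MPI_Dims_create(nprocs, 2, dims);
        MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 0, &grid);

        MPI_Comm_rank(grid, &rank);
        MPI_Cart_coords(grid, rank, 2, coords);
        printf("rank %d sits at grid position (%d,%d)\n",
               rank, coords[0], coords[1]);

        MPI_Finalize();
        return 0;
    }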

Lecture 13: Introduction to SIMD parallel architectures: single-instruction multiple-data stream architectures, control and data units, interconnection networks, case studies of machines such as the Thinking Machines CM-2 and CM-5 and the MasPar MP-2.

Lecture 14: Introduction to data parallel programming: Fortran-90, array sections, array operations, array intrinsic operations.

Lecture 15: Introduction to High Performance Fortran (HPF): FORALL directives, INDEPENDENT directives, simple parallel programs.

Lecture 16: High Performance Fortran: data distribution and alignment directives, simple parallel programming examples, matrix multiplication, solution of partial differential equations.

Lecture 17: Methodology for Parallel Algorithm Design: concurrency, locality, scalability, modularity; partitioning, agglomeration, communication, mapping; performance analysis of parallel algorithms.

Lecture 18: Parallel Matrix Algorithms: matrix representations, parallel dense matrix operations, matrix-vector and matrix-matrix multiplication, solution of linear systems of equations.
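
A minimal sketch of one kernel from this lecture, matrix-vector multiplication with a 1-D row-block distribution (it assumes n divides evenly by the process count, purely to keep the example short):

    #include <mpi.h>

    /* y = A*x with A distributed by row blocks: each process holds
       rows_per_proc full rows of A and the matching slices of x, y.
       Assumes n == rows_per_proc * nprocs. */
    void matvec(double *A_local, double *x_local, double *y_local,
                double *x_full, int n, int rows_per_proc) {
        /* gather every process's slice so each has all of x */
        MPI_Allgather(x_local, rows_per_proc, MPI_DOUBLE,
                      x_full, rows_per_proc, MPI_DOUBLE,
                      MPI_COMM_WORLD);

        for (int i = 0; i < rows_per_proc; i++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++)
                sum += A_local[i * n + j] * x_full[j];
            y_local[i] = sum;
        }
    }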

Lecture 19: Parallel Sparse Matrix Solvers: sparse matrix representations, parallel iterative methods, parallel direct methods.
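
One standard representation here is compressed sparse row (CSR); the sketch below is the serial CSR matrix-vector product that sits inside the parallel iterative methods (rows are independent, so the outer loop is the natural one to parallelize):

    /* CSR matrix-vector product y = A*x: row_ptr[i]..row_ptr[i+1]-1
       index the nonzeros of row i; col_idx[] and val[] hold their
       column numbers and values. */
    void csr_matvec(int n, const int *row_ptr, const int *col_idx,
                    const double *val, const double *x, double *y) {
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++)
                sum += val[k] * x[col_idx[k]];
            y[i] = sum;
        }
    }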