Winter Quarter 1997, Seminar Series


The speakers and topics of seminars that were held during winter are available here.
  • Jan. 20: John Holm (jgholm@crhc.uiuc.edu)
  • Jan. 27: Cancelled
  • Feb. 3: Todd Plantenga (tdplant@ca.sandia.gov)
  • Feb. 10: Junho Shim (shimjh@ece.nwu.edu)
  • Feb. 17: Prof. P. Scheuermann (peters@ece.nwu.edu)
  • Feb. 24: Pankaj Surana (surana@ece.nwu.edu)
  • Mar. 3 : Vijay Kumar (vijay@ece.nwu.edu)
  • Mar. 10 : Sanjay Goil (sgoil@ece.nwu.edu)

PERFORMANCE EVALUATION OF C++ MESSAGE-DRIVEN VLSI CAD APPLICATIONS

Abstract

One model of multithreading gaining popularity on multiprocessor systems is the message-driven model of computation. The advantage of programming in such a model is that the dependencies are explicit, messages are asynchronous, and the resulting code is highly multithreaded. The message-driven model presented in this study is the actor model of concurrent computation as implemented in the ProperCAD II C++ library developed at the University of Illinois.

In this talk, I will present an empirical performance study of large parallel VLSI CAD applications run both shared memory and message-passing parallel architectures. VLSI CAD applications provide a rich set of complex irregular applications to study. I will present generalized analysis of the parallelism structure, the communication characteristics, grain sizes of computations, and detailed measurements of system time, idle time, and user time in these applications.

In addition to providing a useful paradigm to program complex parallel applications, I believe that the message driven model integrates well with more advanced performance feedback and visualization tools. Further feedback related to multithreading will be presented in the form of dynamically weighted actor-method call graphs, method specific statistics, and program visualization.


OPTIMIZATION AND PARALLEL COMPUTING

Abstract

Optimization algorithms can be applied to a wide range of applications. This talk will focus on the design of optimal systems that are modeled by scientific computing software. Finding an optimal design is a computationally intensive task that often demands parallel processing capability.

The talk will briefly survey the computational issues presented by optimization algorithms, and the options afforded by parallel computing. The speaker will discuss an application in molecular design that is being adapted for distributed memory machines being built for Sandia National Laboratories.


Caching documents on the World Wide Web Abstract

The World Wide Web(WWW) has become a predominant client/server architecture. Its simplicity and incremental design lead not only to its quick widely usage, but also to significant performance downgrade such as high response time perceived by Web clients. One efficient way to reduce the high communication delay which mainly causes high response time is through caching documents; copying at the clients and moving documents from the server effectively closer to the clients.

In this talk, it will be presented that how the caching works and why it is so important in the WWW. Several previous different cache algorithms, mainly focused on the document replacement strategy in the cache, will be discussed. And a new way of caching replacement, as well as a new metric to evaluate the caching performance, will be presented, which take consideration of network delay. Finally the performance of the new algorithm will be compared to the previous ones over the WWW trace data which was gotten at Northwestern ACNS.


An Intelligent File Manager for Parallel Disk Systems

Abstract

Parallel disk systems such as disk arrays provide opportunities for high performance I/O by supporting efficiently inter-request and inter-request parallelism. However, to make effective use of commercially available architectures it is necessary to develop intelligent software tools that allow automatic tuning of the disk array to various workloads. We present the components of an intelligent file manager that performs striping on an individual or global basis, as desired by the application, and in addition it achieves load balancing by judicious file allocation and dynamic redistribution of data. We shall discuss in detail our "disk cooling" procedure for dynamic redistribution of data and show that this heuristic achieves excellent load balance in the presence of evolving access patterns. In addition, we shall discuss how our dynamic data migration model can be adopted to deal with predictable periodic access patterns when it maybe more beneficial to postpone a reorganization for a later point in time when where are fewer regular transactions.


Applying Parallel Computation Algorithms in the Design of Serial Algorithms

Abstract

The seminar will present a framework for improving serial algorithms proposed by Nimrod Megiddo. The abstract of his paper is as follows: "The goal of this presentation is to point out that analysis of parallelism in computational problems have practical implications even when multiprocessor machines are not available. This is true because, in many cases, a good parallel algorithm for one problem may turn out to be useful for designing an efficient serial algorithm for another problem."


Improved Access to Optical Bandwidth in Trees

Abstract

We present improved bounds for efficient bandwidth allocation in a WDM optical network whose topology is that of a directed tree of fibre-optic links. The problem of bandwidth allocation is modelled as a colouring problem, where each path in a set of communication paths must be assigned a colour (representing a wavelength) in such a way that no two paths using the same link in the same direction are assigned the same colour.

Letting $L$ be the largest number of paths using any directed link, we show that for an arbitrary set of paths, $7L/4$ colours are sufficient to route all paths. This improves upon the previous best-known upper bound of $15L/8$.

As a corollary, we also give an algorithm for trees of rings that uses $7L/2$ wavelengths.

We also establish a new lower bound of $5L/4$.


Concatenated Parallelism: A Technique for Efficient Parallel Divide and Conquer

Abstract

A number of problems have efficient algorithms that are based on the divide and conquer paradigm. Such problems can be solved in parallel by mapping the corresponding divide and conquer tree to the parallel computer under consideration. Two basic strategies are used in such parallelizations: Task Parallelism, in which different subproblems are assigned to different groups of processors and Data Parallelism, in which the tasks are solved one after the other using all the processors. Task parallelism involves significant data movement and data parallelism causes problems due to load imbalance.

We propose a new strategy, which we call Concatenated Parallelism, for efficient parallel solution of problems resulting in divide and conquer trees. Our strategy is useful when the communication time due to data movement in distributing the subproblems using Task parallelism is significant when compared to the time required to divide the subproblems. This happens to be the case for a number of important applications including quicksort, quickhull and the construction of quadtrees, octrees and multidimensional binary search trees. We consider balanced and unbalanced deterministic divide and conquer trees as well as randomized divide and conquer trees. We provide both theoretical and experimental analysis of Concatenated Parallelism for coarse-grained distributed memory parallel machines along with comparisons with Task and Data Parallelism.

(Joint work with Prof. Sanjay Ranka, University of Florida and Prof. Srinivas Aluru, New Mexico State University)