Winter Quarter 1997, Seminar Series
The speakers and topics of seminars that were held during winter are
available here.
- Jan. 20: John Holm (jgholm@crhc.uiuc.edu)
- Jan. 27: Cancelled
- Feb. 3: Todd Plantenga (tdplant@ca.sandia.gov)
- Feb. 10: Junho Shim (shimjh@ece.nwu.edu)
- Feb. 17: Prof. P. Scheuermann (peters@ece.nwu.edu)
- Feb. 24: Pankaj Surana (surana@ece.nwu.edu)
- Mar. 3 : Vijay Kumar (vijay@ece.nwu.edu)
- Mar. 10 : Sanjay Goil (sgoil@ece.nwu.edu)
|
PERFORMANCE EVALUATION OF C++ MESSAGE-DRIVEN VLSI CAD APPLICATIONS
Abstract
One model of multithreading gaining popularity on multiprocessor
systems is the message-driven model of computation. The advantage of
programming in such a model is that the dependencies are explicit,
messages are asynchronous, and the resulting code is highly
multithreaded. The message-driven model presented in this study is the
actor model of concurrent computation as implemented in the ProperCAD
II C++ library developed at the University of Illinois.
In this talk, I will present an empirical performance study of large
parallel VLSI CAD applications run both shared memory and
message-passing parallel architectures. VLSI CAD applications provide
a rich set of complex irregular applications to study. I will present
generalized analysis of the parallelism structure, the communication
characteristics, grain sizes of computations, and detailed measurements
of system time, idle time, and user time in these applications.
In addition to providing a useful paradigm to program complex parallel
applications, I believe that the message driven model integrates well
with more advanced performance feedback and visualization tools.
Further feedback related to multithreading will be presented in the
form of dynamically weighted actor-method call graphs, method specific
statistics, and program visualization.
OPTIMIZATION AND PARALLEL COMPUTING
Abstract
Optimization algorithms can be applied to a wide range of
applications. This talk will focus on the design of optimal
systems that are modeled by scientific computing software.
Finding an optimal design is a computationally intensive task
that often demands parallel processing capability.
The talk will briefly survey the computational issues presented
by optimization algorithms, and the options afforded by
parallel computing. The speaker will discuss an application
in molecular design that is being adapted for distributed
memory machines being built for Sandia National Laboratories.
Caching documents on the World Wide Web
Abstract
The World Wide Web(WWW) has become a predominant client/server
architecture. Its simplicity and incremental design lead not
only to its quick widely usage, but also to significant
performance downgrade such as high response time perceived
by Web clients. One efficient way to reduce the high
communication delay which mainly causes high response time
is through caching documents; copying at the clients and moving
documents from the server effectively closer to the clients.
In this talk, it will be presented that how the caching works
and why it is so important in the WWW. Several previous
different cache algorithms, mainly focused on the document
replacement strategy in the cache, will be discussed.
And a new way of caching replacement, as well as
a new metric to evaluate the caching performance,
will be presented, which take consideration of network delay.
Finally the performance of the new algorithm will
be compared to the previous ones over the WWW trace data
which was gotten at Northwestern ACNS.
An Intelligent File Manager for Parallel Disk Systems
Abstract
Parallel disk systems such as disk arrays provide opportunities for high
performance I/O by supporting efficiently inter-request and inter-request
parallelism. However, to make effective use of commercially available
architectures it is necessary to develop intelligent software tools that allow
automatic tuning of the disk array to various workloads. We present
the components of an intelligent file manager that performs striping
on an individual or global basis, as desired by the application, and in
addition it achieves load balancing by judicious file allocation and dynamic
redistribution of data. We shall discuss in detail our "disk cooling"
procedure for dynamic redistribution of data and show that this heuristic
achieves excellent load balance in the presence of evolving access patterns.
In addition, we shall discuss how our dynamic data migration model can be
adopted to deal with predictable periodic access patterns when it maybe more
beneficial to postpone a reorganization for a later point in time when where
are fewer regular transactions.
Applying Parallel Computation Algorithms in the Design of Serial Algorithms
Abstract
The seminar will present a framework for improving serial algorithms
proposed by Nimrod Megiddo. The abstract of his paper is as follows:
"The goal of this presentation is to point out that analysis of
parallelism in computational problems have practical implications even
when multiprocessor machines are not available. This is true because,
in many cases, a good parallel algorithm for one problem may turn out
to be useful for designing an efficient serial algorithm for another
problem."
Improved Access to Optical Bandwidth in Trees
Abstract
We present improved bounds for efficient bandwidth allocation in a WDM
optical network whose topology is that of a directed tree of fibre-optic
links. The problem of bandwidth allocation is modelled as a colouring
problem, where each path in a set of communication paths must be assigned
a colour (representing a wavelength) in such a way that no two paths using
the same link in the same direction are assigned the same colour.
Letting $L$ be the largest number of paths using any directed link,
we show that for an arbitrary set of paths, $7L/4$ colours
are sufficient to route all paths. This improves upon the previous
best-known upper bound of $15L/8$.
As a corollary, we also give an algorithm for trees of rings that uses
$7L/2$ wavelengths.
We also establish a new lower bound of $5L/4$.
Concatenated Parallelism: A Technique for Efficient Parallel Divide
and Conquer
Abstract
A number of problems have efficient algorithms that are based on the
divide and conquer paradigm. Such problems can be solved in parallel by
mapping the corresponding divide and conquer tree to the parallel computer
under consideration. Two basic strategies are used in such
parallelizations: Task Parallelism, in which different subproblems
are assigned to different groups of processors and Data Parallelism,
in which the tasks are solved one after the other using all the
processors. Task parallelism involves significant data movement and data
parallelism causes problems due to load imbalance.
We propose a new strategy, which we call Concatenated Parallelism, for
efficient parallel solution of problems resulting in divide and conquer
trees. Our strategy is useful when the communication time due to data
movement in distributing the subproblems using Task parallelism is
significant when compared to the time required to divide the subproblems.
This happens to be the case for a number of important applications
including quicksort, quickhull and the construction of quadtrees, octrees
and multidimensional binary search trees. We consider balanced and
unbalanced deterministic divide and conquer trees as well as randomized
divide and conquer trees. We provide both theoretical and experimental
analysis of Concatenated Parallelism for coarse-grained distributed memory
parallel machines along with comparisons with Task and Data Parallelism.
(Joint work with Prof. Sanjay Ranka, University of Florida and
Prof. Srinivas Aluru, New Mexico State University)