Parallel and Distributed Computing

Externally Funded Research Projects

MATCH: A MATLAB Compilation Environment for Adaptive Computing Systems

Investigators: P. Banerjee, A. Choudhary, S. Hauck

Sponsor: Defense Advanced Research Projects Agency (DARPA), 4/98 – 3/01

Adaptive computing systems constitute a new class of computing and communication technology composed of configurable hardware capable of system-level adaptation. Such systems are often built out of combinations of microprocessor-based embedded systems, specialized digital signal processors (DSP’s), and field-programmable gate arrays (FPGA’s). The objective of the MATCH (MATLAB Compiler for Heterogeneous adaptive computing systems) project is to make it easier for users to develop efficient codes for adaptive computing systems. As part of this project, we are developing a compiler that accepts a user’s application written in the high-level language MATLAB and generates efficient low-level code that runs on commercial off-the-shelf FPGA’s, embedded processors, and DSP’s. Our specific aims include:

• development of a hardware testbed consisting of commercial FPGA’s, embedded processors,
and DSP’s;

• development of a basic compiler for mapping a given MATLAB application onto this
heterogeneous target;

• investigation of automated parallelization and mapping techniques;

• design and support of compiler directives;

• development of library functions and applications of interest to DOD;

• development of faster algorithms for compilation.

PANTHER: A High-Performance Distributed Computing Infrastructure

Investigators: P. Banerjee, A. Choudhary, S. Hauck, D. T. Lee, M. Sarrafzadeh, P. Scheuermann,
and V. Taylor

Sponsor: National Science Foundation (NSF), 9/97 – 8/02

Specific aims of the PANTHER project include:

• exploration of using high-speed networking and computing to investigate file-system and data-management issues for high-performance distributed computing;

• investigation of the parallel-programming support of networks of high-speed workstations and
personal computers as an alternative to stand-alone parallel computers;

• investigation of high-performance CAD of electronic systems in a heterogeneous environment;

• development of a Web-based CAD computing center that takes advantage of high-speed
networking;

• exploration of new instructional techniques that take advantage of high bandwidth and high
speed.

This project provides for the acquisition of 50 high-performance Hewlett-Packard C-180 UNIX workstations, 20 medium-performance B-132 UNIX workstations, 3 J-280 UNIX fileservers, an 8-processor Silicon Graphics Origin 2000 distributed shared-memory multiprocessor, and four Cisco Systems LS1010 ATM switches. All of the machines are connected via an OC-3 ATM network operating at 155 Mbps.

PARADIGM: Efficient Compilation Issues in Distributed Memory Multicomputers

Principal Investigator: P. Banerjee

Sponsor: NSF, 9/96 – 9/99

In this research, we are developing efficient compilation and run-time techniques for optimized code generation for distributed-memory multicomputers and distributed shared-memory multiprocessors. We are producing the next-generation PARADIGM compiler to handle general block, cyclic, and block-cyclic data distributions in regular applications. We are also developing techniques that use global dataflow analysis to eliminate redundant array accesses. Finally, we are investigating techniques for handling regular and irregular codes in a uniform framework based on a run-time library.
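The three data distributions named above can be illustrated with a minimal sketch. It is not taken from the PARADIGM compiler; the function names and the one-dimensional array model are assumptions made only for illustration. Each function maps a global array index to the processor that would own that element.

```python
import math


def block_owner(i, n, p):
    """Block distribution: n elements are split into p contiguous chunks."""
    block = math.ceil(n / p)  # chunk size (the last chunk may be shorter)
    return i // block


def cyclic_owner(i, p):
    """Cyclic distribution: elements are dealt out one at a time, round-robin."""
    return i % p


def block_cyclic_owner(i, p, b):
    """Block-cyclic distribution: blocks of b elements are dealt out round-robin."""
    return (i // b) % p


# Hypothetical example: 8 elements on 2 processors.
n, p = 8, 2
print([block_owner(i, n, p) for i in range(n)])         # [0, 0, 0, 0, 1, 1, 1, 1]
print([cyclic_owner(i, p) for i in range(n)])           # [0, 1, 0, 1, 0, 1, 0, 1]
print([block_cyclic_owner(i, p, 2) for i in range(n)])  # [0, 0, 1, 1, 0, 0, 1, 1]
```

In this model, block-cyclic with a block size of 1 reduces to the cyclic case, and a block size of ceil(n/p) reduces to the block case, which is one reason a compiler can treat block-cyclic as the general form.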

Filtration Combustion for Microgravity Applications: (1) Smoldering; (2) Combustion
Synthesis of Advanced Materials

Investigators: A. Bayliss, B. J. Matkowsky, and V. Volpert (Applied Mathematics)

Sponsor: National Aeronautics and Space Administration (NASA), 6/1/94 – 11/30/98

In this grant, we study filtration combustion with an emphasis on combustion synthesis and smoldering. The objectives are to understand these processes, describe observed modes of combustion, and predict new modes. The investigations employ both analytical and numerical methods. We consider a variety of physical effects including cellular flames in filtration combustion, and the dynamics of solid-fuel combustion when melting occurs.

Investigation of Nuclear Burning on Neutron Stars

Investigators: A. Bayliss and R. Taam (Physics)

Sponsor: NASA, 2/15/98 – 2/14/99

We investigate the role of nuclear burning on accreting neutron stars in the X-ray burst phenomenon. Particular attention is focused on the study of instabilities of the burning front and its evolution to the previously unexplored pulsing regime. The ignition, nuclear evolution, and propagation of symmetric and asymmetric conductive burning fronts on the surface of a neutron star in multi-dimensional circumstances are investigated. Studies also include a multi-dimensional analysis of a nuclear burning front in which energy is transported by convection. We provide detailed treatments of the nuclear physics, thermodynamics, and gas dynamics using hydrodynamic, nuclear reaction network, and flame-propagation computer codes based on adaptive spectral and finite-difference methods that we previously developed. The results are used to determine the propagation speed and variability of the conductive and convective burning fronts, thereby providing fundamental understanding of nuclear burning on the surfaces of accreting neutron stars. The resulting influence of multi-dimensional effects on the X-ray light curve offers additional insights for interpreting the bursting behavior and variability seen in observed sources. Armed with these new theoretical models and the prospect of new observations with the Rossi X-Ray Timing Explorer, the potential for new insights into the X-ray burst phenomenon is high.

Non-Reflecting Boundary Conditions Based on Far Field Expansions

Principal Investigator: A. Bayliss

Sponsor: NSF, 7/96 – 6/99

In this project we develop, implement, and assess outer grid boundary conditions for wave-propagation problems that are based on far-field expansions of the solution. The methodology is to obtain a far-field expansion for the solution, for example, in terms of the distance from a given source location, and then derive outer boundary conditions as differential operators that annihilate progressively more terms in the expansion. Techniques of interest use interior information to improve the performance of the outer boundary conditions. This permits dealing with situations where the source location is not known in advance, where sources move, or where waves from different sources impinge on different portions of the boundary at the same time. We also develop techniques to increase the order of the outer boundary condition without increasing the order of the spatial derivatives in the boundary operator. This is done by incorporating inhomogeneous terms in the boundary condition derived from far-field expansions of the solution.
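As an illustration of the annihilation idea, consider the classical construction for the three-dimensional wave equation with unit wave speed. This setting is an assumption chosen for the example; the project description does not fix the equation or the coordinates. Here r is the distance from the source location, f_j are the expansion coefficients, and B_m is the boundary operator of order m.

```latex
% Far-field expansion of an outgoing solution
u(t, r, \theta, \phi) \sim \sum_{j=1}^{\infty} \frac{f_j(t - r, \theta, \phi)}{r^{j}}

% Since (\partial_t + \partial_r) f_j(t - r, \theta, \phi) = 0, each term satisfies
\left( \frac{\partial}{\partial t} + \frac{\partial}{\partial r} \right) \frac{f_j}{r^{k}} = -\,k\,\frac{f_j}{r^{k+1}}

% Boundary operators defined recursively
B_1 = \frac{\partial}{\partial t} + \frac{\partial}{\partial r} + \frac{1}{r},
\qquad
B_m = \left( \frac{\partial}{\partial t} + \frac{\partial}{\partial r} + \frac{2m - 1}{r} \right) B_{m-1}

% B_m annihilates the first m terms of the expansion, so
B_m u = O\!\left( r^{-(2m+1)} \right)
```

Applying the l-th factor to the j-th term multiplies it by (l - j)/r, so the term with j = l vanishes at that step. Each additional factor therefore removes one more term of the expansion, at the cost of one more order of differentiation in the boundary operator.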

Nonlinear Dynamics in Combustion

Principal Investigator: A. Bayliss

Sponsor: NSF, 7/94 – 6/98

In this project we formulate and apply adaptive pseudospectral methods to numerically model the development of highly nonlinear space-time patterns in combustion. Pseudospectral methods are more accurate than commonly used finite-difference and finite-element techniques. Mesh adaptation provides increased spatial resolution of the narrow combustion reaction zones. We consider applications involving both gaseous and solid-fuel combustion. In gaseous combustion, we study flames stabilized on a rotating burner, and the effect of buoyancy on flames. In solid-fuel combustion, we study various modes that can occur in the self-propagating high-temperature synthesis process. We numerically simulate recently observed combustion modes involving counterpropagating hot spots. In addition, we analyze other modes, not yet observed, involving repeated reversals of the spinning direction of the hot spots.

Mathematical Sciences Computing Research Environments

Investigators: A. Bayliss and several faculty in Applied Mathematics

Sponsor: NSF, 8/96 – 7/98

This grant enabled acquisition of several Hewlett-Packard HP J2240 workstations to perform numerical analyses. Specific problems of interest include filtration combustion with application to materials synthesis, and pattern formation and nonlinear dynamics in gaseous combustion.

Compiler and Run-Time Optimization Techniques for Parallel Programming

Principal Investigator: A. Choudhary

Sponsor: NSF Young Investigator Award, 5/93 – 7/98

This project develops compiler and run-time optimization techniques for scalable parallel programming. In particular, the objectives are to develop compiler techniques for parallel I/O and locality optimizations, and run-time techniques for data-redistribution and memory-hierarchy optimizations. Further, this research involves the development of fundamental models and compilation techniques for out-of-core computations.

System Software for Input-Output in Parallel Computers

Principal Investigator: A. Choudhary

Sponsor: NSF, 5/96 – 4/99

Large-scale parallelism and efficient supporting software can mitigate the performance difference between disk systems and CPU’s / interconnects to achieve an overall balanced, scalable operation. However, the use of parallelism in I/O systems introduces complexities for the system software including run-time systems and file systems. This project addresses the system-software problem, specifically run-time support and interface issues, to perform efficient I/O on parallel computers. Project goals include the design and development of a sophisticated run-time system (PASSION) that (1) incorporates the concepts of collective I/O, data reuse and prefetching, and other optimizations; and (2) implements high-level interfaces to facilitate I/O as easily as accessing any data structures from within a program. This approach should diminish I/O bottlenecks and permit the efficient performance of parallel I/O from application programs.

Modeling and Evaluation of I/O Architecture in Servers

Principal Investigator: A. Choudhary

Sponsor: Intel Corporation, 9/96 – 1/99

This project is developing an object-oriented simulation environment to study various architectural configurations for servers. We have developed detailed server-workload models from typical application domains such as OLTP, video-on-demand, and decision support. These models are used to study the performance, reliability, and availability of server features. While the primary focus is to study different approaches for the server’s I/O subsystem configuration, the modeling environment is general enough to study other server components including the processing nodes and the communications network.

Run-Time Libraries for Large-Scale Parallel I/O

Principal Investigator: A. Choudhary

Sponsor: Sandia National Laboratories, 2/97 – 1/98

This project is developing run-time support for high-performance I/O. This research is motivated by the requirements of a large number of science and engineering applications, including teraflops applications, where data are required to be reorganized into a canonical form for further processing or restarts. The run-time library provides two designs, namely "collective I/O" and "pipelined collective I/O." In the first scheme, all processors participate in the I/O simultaneously. This makes scheduling of I/O requests simple but creates the possibility of contention at the I/O nodes. In the second approach, processors are organized into several groups. Only one group at a time performs I/O, while the other groups carry out the communication needed to rearrange data. This entire process is pipelined to reduce I/O node contention dynamically. Software caching, chunking, and on-line compression mechanisms are included in both models.
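The difference between the two designs can be sketched as a toy schedule. This is not the project's run-time library: the function names, the strided grouping of ranks, and the round-robin rotation of the active group are all assumptions made for illustration.

```python
def make_groups(num_procs, num_groups):
    """Split ranks 0..num_procs-1 into num_groups groups (strided assignment, assumed)."""
    return [list(range(g, num_procs, num_groups)) for g in range(num_groups)]


def collective_schedule(num_procs, num_steps):
    """Collective I/O: every processor performs I/O at every step."""
    return [{rank: "io" for rank in range(num_procs)} for _ in range(num_steps)]


def pipelined_schedule(num_procs, num_groups, num_steps):
    """Pipelined collective I/O: one group performs I/O per step; the rest rearrange data."""
    groups = make_groups(num_procs, num_groups)
    schedule = []
    for step in range(num_steps):
        active = step % num_groups  # round-robin rotation (an assumption)
        plan = {}
        for g, ranks in enumerate(groups):
            for rank in ranks:
                plan[rank] = "io" if g == active else "rearrange"
        schedule.append(plan)
    return schedule


def max_concurrent_io(schedule):
    """Largest number of processors hitting the I/O nodes in any single step."""
    return max(sum(1 for a in plan.values() if a == "io") for plan in schedule)


print(max_concurrent_io(collective_schedule(8, 4)))    # 8
print(max_concurrent_io(pipelined_schedule(8, 4, 4)))  # 2
```

With 8 processors in 4 groups, the pipelined schedule never has more than 2 processors issuing I/O at once, which is the contention reduction the second design aims for. The price is a more involved schedule than the simple all-at-once collective scheme.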

Interoperable Data Files for High-Performance Computing

Principal Investigator: A. Choudhary

Sponsor: NSF / University of Wyoming, 7/97 – 6/99

This project is developing data-management techniques that enhance both interoperability and high-performance parallel access to scientific data. Specific objectives include the development of software support for: (1) "portable" data files that provide user-defined abstractions; (2) automated filtering and conversion techniques capable of extracting the "meaning" of a data file and presenting it in a form compatible with a host application; (3) improved access techniques that permit parallel access to the files while preserving the basic abstractions; (4) run-time techniques that incorporate collective I/O, data reuse, and prefetching strategies; and (5) legacy files.

Design, Development, Benchmarking and Evaluation of Parallel Applications

Principal Investigator: A. Choudhary

Sponsor: U.S. Air Force Systems Command Rome Laboratory, 12/96 – 5/99

This project has the following design and implementation goals: (1) I/O, data-distribution, and task-scheduling techniques for individual components as well as integrated systems in applications such as space-time adaptive processing, fusion, and target detection; (2) I/O techniques for embedded high-performance system applications; (3) data-distribution and redistribution strategies for inter-task and intra-task data communication in real-time pipelined parallel applications; (4) task assignment and scheduling techniques to satisfy latency and throughput requirements; and (5) portable parallel software having well-defined interfaces.

Scalable I/O Initiative

Principal Investigator: A. Choudhary

Sponsor: DARPA / California Institute of Technology, 7/96 – 9/99

The I/O performance of massively parallel processors has not kept pace with their processing and communications capabilities. Poor I/O can severely degrade overall throughput. The need for high-performance I/O is so significant that almost all current parallel computers provide hardware and software support for parallel I/O. This project attacks the I/O problem from a language, compiler, and run-time-support point of view. PASSION software support is targeted for I/O-intensive, loosely synchronous problems. The PASSION Runtime Library provides routines to efficiently perform the I/O required in parallel programs. A further goal is for the PASSION compiler to translate out-of-core programs written in a data-parallel language to node programs with calls to the PASSION library.

High-Performance Data-Management, Access, and Storage Techniques for Tera-Scale
Scientific Applications

Investigators: A. Choudhary, P. Banerjee, and V. Taylor

Sponsor: Department of Energy (DOE) ASCI Level-2 Grant

Emerging large-scale scientific experiments and simulations require the storage, management, efficient access, and analysis of hundreds of gigabytes to hundreds of terabytes of data. Current data-management and analysis techniques do not satisfy these needs in terms of performance, scalability, ease of use, and interfaces. This project is developing a scalable, high-performance, data-management system (SHPDM) to provide support for data management, query capability, and high-performance accesses to large datasets stored in hierarchical storage systems such as HPSS. SHPDM will provide the flexibility of conventional databases for indexing, searches, management of objects, and creating and keeping histories and trails of data accesses. It will also provide high-performance access methods and optimizations (pooled striping, prefetching, caching, collective I/O) for accessing large-scale data-objects found in scientific computations.

Job Task Analysis of Aviation Maintenance Technicians

Principal Investigator: G. Krulee

Sponsor: Federal Aviation Administration (FAA), 1/98 – 12/98

Numerical Optimization

Principal Investigator: J. Nocedal

Sponsor: NSF, 9/96 – 8/98

Interior-point methods have revolutionized linear programming. This project studies how the underlying ideas can be used to design new robust and efficient algorithms for large-scale nonlinear optimization. Part of the study is devoted to the development of the NITRO software which combines new ideas with proven techniques such as trust-region methods and sequential quadratic programming. The investigations include theory, algorithm design, and software development.

Challenges in CISE: Metacomputing Environments for Optimization

Principal Investigator: J. Nocedal

Sponsor: NSF, 9/97 – 8/00

The metacomputing paradigm has emerged as an economical way to harness the power of large, distributed collections of computers, making use of compute cycles that would otherwise be wasted. The goal of this project is to use metacomputing platforms to enable the solution of very large optimization problems that arise in science, engineering, and economics. By combining distributed hardware, network hardware / software infrastructure, and optimization algorithms and modeling tools, we aim to produce environments powerful enough to tackle optimization problems of unprecedented size and complexity.

Interior Point Methods for Power Generation Schedules

Principal Investigator: J. Nocedal

Sponsor: Electricite de France, 12/97 – 3/98

The goal of this project is to design and implement both algorithmic and computational improvements to increase the performance of the nonlinear interior-point algorithm used in the power-generation scheduling code created by Electricite de France. Efforts concentrate on finding efficient data structures, improving the performance of the linear-algebra routines, and testing state-of-the-art heuristics for primal-dual interior-point methods.

Interactive Environment for Optimization

Principal Investigator: J. Nocedal

Sponsor: Sandia National Laboratories, 6/98 – 9/98

Our long-term goal is to create a highly interactive mode of communication for nonlinear programming using NEOS Internet-based facilities. The objective of this project is to allow a user to perform all the simulations on a local machine and delegate to NEOS only the task of computing a new iterate. This new environment will be especially useful for scientists who have very large and confidential simulation codes. Our principal collaborators are Sandia National Laboratories for protein-folding applications and NASA Langley for shape-design optimization.

Large Scale Optimization and its Application to Weather Forecasting

Principal Investigator: J. Nocedal

Sponsor: DOE, 8/95 – 7/98

Modern weather-forecasting techniques make use of the variational assimilation of data. This is an optimization-based method for estimating the unknown conditions of the atmosphere. The number of variables in these simulations is on the order of millions. In this project, we develop new algorithms that can solve problems of this type within the time limits required by operational forecasting. The new optimization techniques are also applicable to climate simulations.
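For orientation, variational assimilation is commonly posed as the minimization of a cost function of the following standard form. This formulation is an assumption drawn from the general literature, not from the project description. Here x_0 is the unknown initial state of the atmosphere, x_b a prior (background) estimate, y_i the observations at time i, H_i the observation operator, M the forecast model, and B and R_i the background and observation error covariances.

```latex
J(x_0) = \tfrac{1}{2}\,(x_0 - x_b)^{T} B^{-1} (x_0 - x_b)
       + \tfrac{1}{2} \sum_{i=0}^{n} \bigl( H_i(x_i) - y_i \bigr)^{T} R_i^{-1} \bigl( H_i(x_i) - y_i \bigr),
\qquad
x_i = M_{0 \to i}(x_0)
```

Because x_0 has on the order of millions of components, each evaluation of J and its gradient requires a run of the forecast model, which is why the efficiency of the optimization algorithm determines whether the analysis fits within operational time limits.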

Climate Modeling on Parallel Machines

Principal Investigator: J. Nocedal

Sponsor: Argonne National Laboratory, 4/97 – 9/98

The focus of this project has changed from climate modeling to mixed-integer nonlinear programming using highly efficient branch-and-bound methods. This project combines these techniques with recent developments in nonlinear optimization and emerging computer architectures to tackle very large problems. The new algorithms should have many applications in industry from scheduling to portfolio management.

Scalable and Adaptive File Management in Networks of Workstations

Principal Investigator: P. Scheuermann

Sponsor: NASA, 9/96 – 8/99

This project is developing a prototype system to provide a scalable architecture for replication on the Web. The new client / server architecture allows for user-transparent geographical replication of Web services. This system is legacy-friendly, i.e., it requires no changes to the existing Web infrastructure. The clients are downloaded as applets to commercially available browsers, while the Web servers are extended with servlets. Both server and client machines take active roles: server machines determine the resources to be replicated / migrated and the best locations for these resources, while client machines choose the best server to which to submit their requests.

Putting Log Data to Work: Mass Storage Information Systems

Principal Investigator: P. Scheuermann

Sponsor: NASA, 1/98 – 1/99

This study is part of the Mass Storage System Performance Analysis project at NASA Goddard. The goal is to capture useful information about the activity of the mass data storage and delivery system from system logs. Currently, these logs are large, ill-structured, and inaccessible for querying. Therefore, the overall performance of the mass-storage system, as well as the statistics about user access patterns, cannot be easily determined. We are investigating the use of data-warehousing and data-mining techniques to obtain easy access to summaries of the log data, as well as to search for interesting patterns of user activity in the storage system.

Prophecy: A Hierarchical Tool for Modeling and Analyzing Parallel Scientific Applications

Principal Investigator: V. Taylor

Sponsor: NASA, 1998 – 2000

Currently, there exists a large gap between the theoretical peak performance and the actual achieved performance of advanced parallel computers when running large scientific simulations. This gap stems in part from an incomplete understanding of how the parallel-system features impact the performance of the applications. Detailed performance models aid in understanding the relationship between the parallel system and the application. This project studies the use of several performance models, including nonlinear models, to aid in understanding and predicting application performance.

Young Investigator Award

Principal Investigator: V. Taylor

Sponsor: NSF, 8/93 – 7/99

This grant supports the PI’s research in three key areas related to the performance of parallel scientific applications: (1) development of the mesh-partitioning tool, "PART," for distributed systems; (2) investigation of methods to improve the performance of parallel shortest-path applications; and (3) development of a tool for setting up performance models. The third area is still in an initial stage.

Symmetric Multiprocessors for Research on Parallel Architectures, Compilers, Applications,
and On-Demand Network Computing Research

Investigators: V. Taylor, J. Fortes, and R. Eigenmann (Purdue)

Sponsor: NSF, 12/96 – 11/98

The focus of this project is on advancing the state-of-the-art in high-performance computing on four fronts: (1) design of high-performance computers capable of 100 trillion operations per second (100 TOPS); (2) advanced tools for analysis and compilation of programs for execution on machines with multiple processors; (3) characterization of applications that require computation rates of 100–1000 TOPS; and (4) design of a network-based software infrastructure that allows the use of linked computers and linked programs (i.e., metacomputers and metaprograms) through conventional network browsers. The main ideas currently under investigation are being validated through extensive computational experiments and computer simulations. The equipment acquired with the funds provided by this grant meets the computational speed, memory, storage, and communication demands of the experimental component of this project, which would otherwise be impossible or take unacceptably long to perform. This work is a collaboration between researchers at Northwestern University and Purdue University.

A Systems Characterization of Large Computational Applications

Investigators: V. Taylor, J. Fortes, R. Eigenmann, and G. Fox (Purdue)

Sponsor: NSF, 1997 – 98

The characterization of large computational applications from a systems perspective is critical to the design of future high-performance computer systems. There is a growing acceptance of the fact that systems research needs to be driven by realistic applications. This project completes initial work aimed at: (1) developing an infrastructure for systems characterization of applications; and (2) conducting initial characterizations of three scientific applications.

Characterization of EMG Signal for Signature Application

Principal Investigator: C. H. Wu

Sponsor: Motorola, Inc., 1/98 – 7/98

A recorded EMG signal for a limb movement consists of a train of pulses. To use such a signal for user identification, we need to extract unique features from it. It is known that the number, N, of pulses; the time width, W, between pulses; and the amplitude, H, of the pulses are different for different limb movements under various loading conditions. This allows (N,W,H) to be used for identification. However, without the information about the muscle-reflex mechanisms, the EMG signal is too noisy and lacks the information needed to distinguish different users. In this project, we are developing and testing an emulated muscle-reflex model that permits processing the EMG signal for robust user identification.
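The (N, W, H) features can be illustrated with a minimal sketch on a sampled signal. This is not the emulated muscle-reflex model under development. The function name, the simple threshold-crossing pulse detector, the reading of W as the mean onset-to-onset interval, and the reading of H as the mean peak amplitude are all assumptions made for illustration.

```python
def pulse_features(signal, threshold, dt=1.0):
    """Return (N, W, H) for a sampled pulse train.

    N: number of pulses (runs of samples at or above threshold)
    W: mean time between successive pulse onsets (0.0 if fewer than 2 pulses)
    H: mean peak amplitude of the pulses (0.0 if there are no pulses)
    """
    onsets = []  # sample index where each pulse starts
    peaks = []   # largest sample seen within each pulse
    in_pulse = False
    for i, x in enumerate(signal):
        if x >= threshold:
            if not in_pulse:
                in_pulse = True
                onsets.append(i)
                peaks.append(x)
            else:
                peaks[-1] = max(peaks[-1], x)
        else:
            in_pulse = False

    n = len(onsets)
    gaps = [(b - a) * dt for a, b in zip(onsets, onsets[1:])]
    w = sum(gaps) / len(gaps) if gaps else 0.0
    h = sum(peaks) / n if n else 0.0
    return n, w, h


# Hypothetical samples: three pulses with peaks 4, 5, and 3.
samples = [0, 0, 3, 4, 0, 0, 0, 5, 0, 0, 3, 2, 0]
print(pulse_features(samples, threshold=1))  # (3, 4.0, 4.0)
```

A fixed threshold of this kind is sensitive to noise, which is consistent with the point made above that the raw signal alone is too noisy for reliable identification.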

Book Sections and Chapters¹

J. Nocedal,* "Large-Scale Unconstrained Optimization," in The State of the Art in Numerical Analysis,
A. Watson and I. Duff, eds., Oxford Press, 1997.

J. Nocedal* and Y. Yuan, "Combining Trust Region and Line Search Techniques," in Advances in
Nonlinear Programming, Y. Yuan, ed., Dordrecht, Netherlands: Kluwer, 1998.

R. Byrd, G. Liu, and J. Nocedal,* "On the Local Behavior of an Interior Point Method for Nonlinear
Programming," in Numerical Analysis, D. F. Griffiths and D. J. Higham, eds., New York:
Addison-Wesley, 1997.

Journal Papers¹

G. Hasteer and P. Banerjee,* "A parallel algorithm for state assignment of finite state machines,"
IEEE Trans. Computers, vol. 47, pp. 242–246, Feb. 1998.

S. Ramaswamy, S. Sapatnekar, and P. Banerjee,* "A framework for exploiting data and functional
parallelism on distributed memory multicomputers," IEEE Trans. Parallel and Distributed
Systems, vol. 8, pp. 1098–1116, Nov. 1997.

_______

¹ All citations are listed in alphabetical order of the Group faculty member(s), each denoted by a *.

A. Bayliss,* D. A. Schult, and B. J. Matkowsky, "Traveling waves in natural counterflow filtration
combustion and their stability," SIAM J. Applied Mathematics, vol. 58, pp. 806–852, 1998.

T. Belytschko, A. Bayliss,* C. Brinson, S. Carr, W. Kath, S. Krishnaswamy, B. Moran, J. Nocedal,*
and M. Peshkin, "Mechanics in the ‘Engineering-First’ curriculum at Northwestern University,"
International J. Engineering Education, vol. 13, pp. 457–472, 1997.

S. Adve, D. Burger, R. Eigenmann, A. Rawsthorne, M. Smith, C. Gebotys, M. Kandemir, D. Lilja,
A. Choudhary,* J. Fang, and P. Yew, "Changing interaction of compiler and architecture,"
IEEE Computer, vol. 30, pp. 51–58, Dec. 1997.

I. Foster, D. Kohr, Jr., R. Krishnaiyer, and A. Choudhary,* "A library-based approach to task
parallelism in a data-parallel language," J. Parallel and Distributed Computing, vol. 45,
pp. 148–158, Sept. 1997.

S. Goil and A. Choudhary,* "High performance OLAP and data mining on parallel computers," J. Data
Mining and Knowledge Discovery (Special Issue on Scalable High-Performance Computing for
KDD), vol. 1, pp. 391–417, 1997.

M. Kandemir, A. Choudhary,* J. Ramanujam, and M. Kandaswamy, "Locality optimization algorithms
for compilation of out-of-core codes," J. Information Science and Engineering, vol. 14,
pp. 107–138, March 1998.

M. Kandemir, J. Ramanujam, R. Bordawekar, and A. Choudhary,* "Compilation techniques for
out-of-core parallel computations," Parallel Computing, vol. 24, pp. 597–628, June 1998.

W.-C. Lin,* W.-S. Shih and C.-T. Chen, "Morphologic field morphing: Contour model-guided image
interpolation," Int’l. J. Imaging Systems and Technology, vol. 8, pp. 480–490, 1997.

R. Byrd and J. Nocedal,* "Active set and interior point methods for nonlinear optimization,"
Documenta Mathematica, Vol. III, Journal der Deutschen Mathematiker-Vereinigung,
pp. 667–676, 1998.

P. Scheuermann,* G. Weikum, and P. Zabback, "Data partitioning and load balancing in parallel disk
systems," The VLDB Journal, vol. 7, pp. 48–66, Feb. 1998.

P. Scheuermann,* W.-S. Li, and C. Clifton, "Multidatabase query processing with uncertainty in global
keys and attribute values," J. American Society for Information Science, vol. 49, no. 3,
pp. 283–301, 1998.

B. M. Maggs and E. J. Schwabe,* "Real-time emulations of bounded-degree networks," Information
Processing Letters, vol. 66, pp. 269–276, June 1998.

V. E. Taylor,* R. Stevens, and K. Arnold, "Parallel molecular dynamics: Implications for massively
parallel machines," J. Parallel and Distributed Computing, vol. 45, pp. 166–175, Sept. 1997.

C. H. Wu,* K. S. Hwang, and S. L. Chang, "Analysis and implementation of a neuromuscular-like
control for robotic compliance," IEEE Trans. Control Systems Technology, vol. 5, pp. 586–597,
Nov. 1997.

Symposium Papers¹

D. Chakrabarti, A. Lain, and P. Banerjee,* "Evaluation of compiler and runtime library approaches for
supporting parallel regular applications," Proc. Int’l. Parallel Processing Symp. (IPPS-98),
Orlando, FL, April 1998.

J. Chandy and P. Banerjee,* "A parallel circuit-partitioned algorithm for timing-driven standard cell
placement," Proc. Int’l. Conf. Computer Design (ICCD-97), Austin, TX, Oct. 1997.

G. Hasteer, A. Mathur, and P. Banerjee,* "A framework for equivalence checking of multi-phase
FSM’s," Proc. Int’l. High-Level Design Validation and Test Workshop, Oakland, CA, Nov. 1997.

G. Hasteer, A. Mathur, and P. Banerjee,* "An implicit algorithm for finding steady states and its
application to FSM verification," Proc. Design Automation Conf. (DAC-98), San Francisco, CA,
June 1998.

M. Kandemir, P. Banerjee,* A. Choudhary,* J. Ramanujam, and N. Shenoy, "A generalized
framework for global communication optimization," Proc. Int’l. Parallel Processing Symp.
(IPPS/SPDP’98), Orlando, FL, March–April 1998.

M. Kandemir, J. Ramanujam, P. Banerjee,* and A. Choudhary,* "Optimizing spatial locality in loop
nests using linear algebra," Proc. 7th Int’l. Workshop on Compilers for Parallel Computers,
Linkoping, Sweden, June 1998.

M. Kandemir, N. Shenoy, P. Banerjee,* J. Ramanujam, and A. Choudhary,* "Minimizing data and
synchronization costs in one-way communication," Proc. 1998 Int’l. Conf. Parallel Processing
(ICPP’98), Minneapolis, MN, Aug. 1998.

V. Kim and P. Banerjee,* "Parallel algorithms for power estimation," Proc. Design Automation Conf.
(DAC-98), San Francisco, CA, June 1998.

V. Krishnaswamy and P. Banerjee,* "Parallel-compiled event-driven VHDL simulation," Proc. Int’l.
Conf. Supercomputing (ICS-98), Melbourne, Australia, July 1998.

P. Prabhakaran and P. Banerjee,* "Simultaneous scheduling, binding and floorplanning in high-level
synthesis," Proc. 11th Int’l. Conf. VLSI Design (VLSI Design’98), Chennai, India, Jan. 1998.

P. Prabhakaran and P. Banerjee,* "Parallel algorithms for scheduling, binding, and floorplanning in
high-level synthesis," Proc. Int’l. Conf. Circuits and Systems (ISCAS-98), Monterey, CA,
May 1998.

S. Roy and P. Banerjee,* "Resynthesis of sequential circuits for low power," Proc. Int’l. Conf. Circuits
and Systems (ISCAS-98), Monterey, CA, May 1998.

S. Roy, A. Harm, and P. Banerjee,* "PowerShake: A low-power driven clustering and factoring
methodology for Boolean expressions," Proc. Design, Automation and Test in Europe Conf.
(DATE’98), Paris, France, Feb. 1998.

S. Roy, M. Sarrafzadeh, and P. Banerjee,* "Partitioning sequential circuits for low power," Proc. 11th
Int’l. Conf. VLSI Design (VLSI Design’98), Chennai, India, Jan. 1998.

M. Wang, M. Sarrafzadeh, and P. Banerjee,* "Placement with incomplete data," Proc. Design
Automation Conf. (DAC-98), San Francisco, CA, June 1998.

Z. Xing and P. Banerjee,* "A parallel algorithm for zero-skew clock-tree routing," Proc. Int’l. Symp.
Physical Design (ISPD’98), Monterey, CA, April 1998.

Z. Xing and P. Banerjee,* "A parallel algorithm for timing-driven global routing for standard cells,"
Proc. Int’l. Conf. Parallel Processing (ICPP’98), Minneapolis, MN, Aug. 1998.

A. Choudhary,* M. Kandemir, N. Shenoy, P. Banerjee,* and J. Ramanujam, "Minimizing data and
synchronization costs in one-way communication," Proc. 1998 Int’l. Conf. Parallel Processing
(ICPP’98), Minneapolis, MN, Aug. 1998.

A. Choudhary,* W. Liao, D. Weiner, P. Varshney, R. Linderman, and M. Linderman, "Design,
implementation, and evaluation of parallel pipelined STAP on parallel computers," Proc. Int’l.
Parallel Processing Symp. (IPPS/SPDP’98), Orlando, FL, March–April 1998.

J. Carretero, J. No, P. Chen, and A. Choudhary,* "COMPASSION: A parallel I/O runtime system
including chunking and compression for irregular applications," Proc. IRREGULAR’98, Berkeley,
CA, Aug. 1998.

J. Carretero, J. No, S. Park, P. Chen, and A. Choudhary,* "COMPASSION: A parallel I/O runtime
system including chunking and compression for irregular applications," Proc. HPCN’98,
Amsterdam, Netherlands, April 1998.

D. Chakrabarti, N. Shenoy, A. Choudhary,* and P. Banerjee,* "An efficient uniform run-time scheme
for mixed regular-irregular applications," Proc. Int’l. Conf. Supercomputing (ICS-98), Melbourne,
Australia, July 1998.

S. Chaudhry and A. Choudhary,* "Time-dependent priority scheduling for guaranteed QOS systems,"
Proc. 6th Int’l. Conf. Computer Communications and Networks, Las Vegas, NV, Sept. 1997.

S. Goil and A. Choudhary,* "Parallel data cube construction for high-performance on-line analytical
processing," Proc. 4th Int’l. Conf. High Performance Computing (HiPC’97), Bangalore, India,
Dec. 1997.

S. Goil and A. Choudhary,* "High-performance data mining using data cubes on parallel computers,"
Proc. 12th Int’l. Parallel Processing Symp. & 9th Symp. Parallel and Distributed Processing
(IPPS/SPDP’98), Orlando, FL, March–April 1998.

S. Goil and A. Choudhary,* "On scalable parallel computation of the multidimensional data cube,"
Proc. Int’l. Conf. Parallel and Distributed Processing Techniques and Applications (PDPTA’98),
Las Vegas, NV, July 1998.

M. Kandaswamy, M. Kandemir, A. Choudhary,* and D. Bernholdt, "Optimization and evaluation of
Hartree-Fock applications’ I/O with PASSION," Proc. Supercomputing 1997 (SC’97), Nov. 1997.

M. Kandaswamy, M. Kandemir, A. Choudhary,* and D. Bernholdt, "Performance implications of
architectural and software techniques on I/O-intensive applications," Proc. 1998 Int’l. Conf.
Parallel Processing (ICPP), Minneapolis, MN, Aug. 1998.

M. Kandemir, A. Choudhary,* and J. Ramanujam, "Improving locality in out-of-core computations
using data layout transformation," Proc. 4th Workshop on Languages, Compilers, and Run-Time
Systems for Scalable Computers (LCR), Pittsburgh, PA, May 1998.

M. Kandemir, A. Choudhary,* J. Ramanujam, and M. Kandaswamy, "A unified compiler algorithm for
optimizing locality, parallelism and communication in out-of-core computations," Proc. 5th Workshop
on I/O in Parallel and Distributed Systems (IOPADS’97), San Jose, CA, Nov. 1997.

M. Kandemir, A. Choudhary,* J. Ramanujam, N. Shenoy, and P. Banerjee,* "A generalized
framework for global communication optimization," Proc. Int’l. Parallel Processing Symp.
(IPPS-98), Orlando, FL, April 1998.

M. Kandemir, A. Choudhary,* N. Shenoy, P. Banerjee,* and J. Ramanujam, "A hyperplane-based
approach for optimizing spatial locality in loop nests," Proc. 1998 ACM Int’l. Conf.
Supercomputing (ICS), Melbourne, Australia, July 1998.

M. Kandemir, A. Choudhary,* N. Shenoy, J. Ramanujam, and P. Banerjee,* "A hyperplane-based
approach for optimizing spatial locality in loop nests," Proc. Int’l. Conf. Supercomputing (ICS-98),
Melbourne, Australia, July 1998.

M. Kandemir, M. Kandaswamy, and A. Choudhary,* "Global I/O optimizations for out-of-core
computations," Proc. High-Performance Computing Conf. (HiPC’97), Bangalore, India, Dec. 1997.

M. Kandemir, J. Ramanujam, and A. Choudhary,* "Compiler algorithms for optimizing locality and
parallelism on shared and distributed memory machines," Proc. Int’l. Conf. Parallel Architectures
and Compilation Techniques (PACT’97), San Francisco, CA, Nov. 1997.

M. Kandemir, J. Ramanujam, A. Choudhary,* and P. Banerjee,* "An iteration space transformation
algorithm based on an explicit data layout representation for optimizing locality," Proc. Workshop
on Languages and Compilers for Parallel Computing (LCPC-98), Chapel Hill, NC, Aug. 1998.

J. No and A. Choudhary,* "Runtime library for parallel I/O for irregular applications," Proc. PARCO’97,
Bonn, Germany, Sept. 1997.

J. No and A. Choudhary,* "Techniques to provide run-time support for solving irregular problems,"
Proc. ICPADS’97, Seoul, Korea, Dec. 1997.

J. No, S. Park, J. Carretero, P. Chen, and A. Choudhary,* "Design and implementation of a parallel I/O
runtime system for irregular applications," Proc. 12th Int’l. Parallel Processing Symp. & 9th Symp.
Parallel and Distributed Processing (IPPS/SPDP’98), Orlando, FL, March–April 1998.

C. Srinilta and A. Choudhary,* "Performance enhancement using intraserver caching in a continuous
media server," Proc. 8th Int’l. Workshop on Research Issues in Data Engineering:
Continuous-Media Databases and Applications (RIDE’98), Orlando, FL, Feb. 1998.

W.-C. Lin,* W.-S. V. Shih, and S.-Y. Rhee, "Morphological field warping: A volumetric deformation
method for medical image registration," Proc. 14th Int’l. Conf. Pattern Recognition, Brisbane,
Australia, Aug. 16–20, 1998, pp. 1680–1682.

P. Scheuermann,* "Issues in data mining and warehousing," RCI-NASA Applications of Data
Warehousing and Mining Special Executive Conference, Santa Barbara, CA, April 1998.

P. Scheuermann,* J. Shim, and R. Vingralek, "A unified algorithm for cache replacement and
consistency in Web proxy servers," Proc. Workshop on the Web and Databases, Valencia, CA,
April 1998.

M. Sayal, Y. Breitbart, P. Scheuermann,* and R. Vingralek, "Selection algorithms for replicated Web
servers," Proc. Workshop on Internet Server Performance, Madison, WI, June 1998.

L. Singh, B. Chen, R. Haight, P. Scheuermann,* and K. Aoki, "A robust systems architecture for mining
semi-structured data," Proc. 4th Int’l. Conf. on Knowledge Discovery and Data Mining, New York,
Aug. 1998.

J. Chen and V. E. Taylor,* "Mesh partitioning for distributed systems," Proc. High-Performance
Distributed Computing Conf., Aug. 1998.

Z. B. Miled, J. A. B. Fortes, R. Eigenmann, and V. Taylor,* "A simulation-based cost-efficiency study
of hierarchical heterogeneous machines for compiler and hand-parallelized applications,"
1997 IEEE Symp. Parallel and Distributed Processing, Oct. 1997.

Z. B. Miled, J. A. B. Fortes, R. Eigenmann, and V. Taylor,* "On the implementation of broadcast,
scatter, and gather in a heterogeneous architecture," Hawaii Int’l. Conf. System Sciences,
Jan. 1998.

W. Smith, I. Foster, and V. Taylor,* "Predicting application run times using historical information,"
4th Workshop on Job-Scheduling Strategies for Parallel Processing (in conjunction with IPPS’98),
April 1998.

S. L. Chang and C. H. Wu,* "Design of an active suspension system based on a biological model,"
Proc. 1997 American Control Conf., Albuquerque, NM, 1997.

Invited Talks and Seminars

P. Banerjee,* keynote speaker, Parallel and Distributed Computing Symp., Sept. 1997.

A. Bayliss,* "Adaptive pseudospectral methods and combustion instabilities," Astrophysics Dept.,
University of Chicago, April 12, 1998.

A. Bayliss,* "New modes of burning in solid-fuel combustion," Engineering Sciences and Applied
Mathematics Dept., Northwestern University, April 24, 1998.

A. Choudhary,* "High-performance data mining," Technical Strategies to Beat Your Competition by the
Year 2000, New York, NY, Oct. 1997.

A. Choudhary,* "Recent results in high-performance I/O," Sandia National Laboratory, Albuquerque,
NM, Aug. 19, 1998.

J. Nocedal,* "The network-enabled optimization system," Electricite de France, Nov. 1997.

J. Nocedal,* "Interior-point methods in optimization," NASA/ICASE, Norfolk, VA, Feb. 5–8, 1998.

J. Nocedal,* "Active-set and interior methods in nonlinear optimization," INFORMS, Montreal,
April 25–28, 1998.

J. Nocedal,* "Metacomputing environments in optimization," CERFACS and Meteo, France, March
1998.

J. Nocedal,* "Metacomputing environments in optimization," ITAM Mexico, March 1998.

J. Nocedal,* "A survey of interior-point methods for nonlinear programming," ERICE 98, Italy,
June 1998.

J. Nocedal,* "Nonlinear optimization: The interplay between mathematical characterizations and
algorithms," Int’l. Congress of Mathematicians, Berlin, Germany, Aug. 1998.

V. Taylor,* "PART: A mesh-partitioning tool for efficient use of distributed systems," NASA Ames,
Sept. 1997.

V. Taylor,* "Using parallel processing for transportation applications," NSF Workshop on Enabling
Technologies for Transportation, Sept. 1997.

Symposium Sessions Organized / Chaired

P. Banerjee,* session chair, "Compilers," at High-Performance Computing Symposium (HiPC’97),
Bangalore, India, Dec. 1997.

P. Banerjee,* session chair, "Compilers-II," at Int’l. Parallel Processing Symp., Orlando, FL, April 1998.

A. Choudhary,* conference chair, IOPADS and SC’97 Confs., San Jose, CA, Nov. 1997.

L. Henschen,* program chair, Int’l. Conf. Advanced Science and Technology, Naperville, IL, April 1998.

P. Scheuermann,* session chair, 18th Int’l. Conf. Distributed Systems, Amsterdam, Netherlands,
May 1998.

Ph.D. Dissertations

G. Hasteer, Equivalence Checking in a Modular Checking Framework, Computer Science Dept.,
University of Illinois at Urbana-Champaign, Dec. 1997. (P. Banerjee*)

S. Roy, Low-Power Driven Sequential Algorithms for Combinational and Sequential Circuits, ECE Dept.,
University of Illinois at Urbana-Champaign, Aug. 1998. (P. Banerjee*)

T.-K. Ma, The Reactive Rayleigh-Benard Problem with Throughflow, Engineering Sciences and
Applied Mathematics Dept., Dec. 1997. (A. Bayliss* and B. J. Matkowsky)

M. Kandaswamy, Design and Evaluation of Optimizations in I/O Intensive Applications, ECE Dept.,
Syracuse University, June 1998. (A. Choudhary*)

R. Krishnaiyer, Support for Communication and I/O in Heterogeneous Systems, June 1998.
(A. Choudhary*)

S.-H. Oh, A Study of I/O Architecture for Massively Parallel Computing Environment, ECE Dept.,
Syracuse University, June 1998. (A. Choudhary*)

M. E. Corey, Computer-Aided Analysis of Inner Ear Problems, June 1998. (L. Henschen*)

C.-C. Lin, Novel Approaches to Extracting Facial Features, Dec. 1997. (W.-C. Lin*)

V. Kumar, Bandwidth-Allocation Algorithms for All-Optical Networks, June 1998. (E. Schwabe*)

S.-L. Chang, Design of an Active Suspension with a Controllable Damper, Dept. of Mechanical
Engineering, May 1998. (C.-H. Wu*)

J. Gyorfi, A Hierarchical Model for Coordinated Planning and Control of Multiple Machines in
Surface-Mount Manufacturing, May 1998. (C.-H. Wu*)