
Hierarchical Processor and Memory Architecture: A Petaflop Point Design Study

The objective of this project was to investigate the feasibility of a new architecture concept that can yield very high performance, in the range of 100 Tflops, at a reasonable cost. The proposed Hierarchical Processor And Memory (HPAM) architecture consists of a heterogeneous collection of processors organized as a multilevel hierarchy. Each level of the hierarchy has a different implementation and is intended to execute efficiently the portions of the application suited to it. The higher levels efficiently execute the serial and low-parallelism sections of the application. The lower levels have a large number of slower processors and efficiently execute the portions of the code with a high degree of parallelism. This work entailed developing the initial infrastructure of a simulator (HPAM-Sim) and conducting in-depth studies of complete industrial applications as well as benchmark applications.

The feasibility study addressed architectural design, system software requirements, the programming environment, and application requirements. The study demonstrated the advantages of matching the number of processors to the parallelism profile of an application. Further, the study demonstrated that applications generally exhibit data and instruction locality within a given degree of parallelism.

The following significant results were obtained in this project during the past year.

Accomplishment 1

Conducted an analysis of an industrial application, the finite element analysis of an aircraft brake system, to demonstrate the advantages of matching the processor allocation to the dynamic parallelism profile of the application. The theoretical analysis indicated that a system that allows the processor allocation to change during execution, such as HPAM, is more effective, by one to two orders of magnitude, than a conventional system that has a fixed processor allocation throughout execution.
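
The idea of matching the allocation to the parallelism profile can be illustrated with a minimal sketch. This is not the analysis performed in the study. It assumes a toy model in which an application is a sequence of phases, each with an amount of work and a maximum usable number of processors, and it takes "effectiveness" to mean processor utilization (useful work divided by processor-time held). The names Phase, fixed_allocation, matched_allocation, and utilization, and all of the numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    work: float        # operations in this phase (arbitrary units, hypothetical)
    parallelism: int   # maximum number of processors the phase can use


def fixed_allocation(profile, processors):
    """Return (elapsed, consumed) when all processors are held for the whole run."""
    elapsed = sum(p.work / min(processors, p.parallelism) for p in profile)
    return elapsed, processors * elapsed


def matched_allocation(profile, processors):
    """Return (elapsed, consumed) when each phase holds only the processors it can use."""
    elapsed = 0.0
    consumed = 0.0
    for p in profile:
        used = min(processors, p.parallelism)
        phase_time = p.work / used
        elapsed += phase_time
        consumed += used * phase_time
    return elapsed, consumed


def utilization(profile, consumed):
    """Useful work divided by the processor-time that was held."""
    return sum(p.work for p in profile) / consumed


# Made-up profile: a serial phase, a highly parallel phase, another serial phase.
profile = [Phase(100, 1), Phase(1000, 100), Phase(100, 1)]

fixed_elapsed, fixed_consumed = fixed_allocation(profile, processors=100)
matched_elapsed, matched_consumed = matched_allocation(profile, processors=100)

print("fixed utilization:  ", utilization(profile, fixed_consumed))
print("matched utilization:", utilization(profile, matched_consumed))
```

In this toy model both schemes finish at the same time, but the fixed allocation leaves most processors idle during the serial phases. The sketch assumes unit-speed processors throughout, so it does not capture the additional benefit HPAM draws from running low-parallelism sections on a different level of the hierarchy.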

Accomplishment 2

Developed a simulator for HPAM, called HPAM-Sim. The simulator was used to demonstrate the cost-efficiency advantage of HPAM over conventional, homogeneous machines. The results indicated that, for a fixed budget in the mid to high range, a multilevel HPAM machine can in most cases achieve gains of 10% to 30% over the optimal one-level homogeneous machine.
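
The kind of fixed-budget comparison described here can be sketched with a toy cost model. This is not HPAM-Sim and does not reproduce its results. It assumes that each level is a pool of identical processors, that each phase of the application runs entirely on whichever level finishes it soonest, and that moving between levels costs nothing. The names Level, buy, phase_time, and run_time, and every cost, speed, and workload figure, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    count: int     # number of processors at this level (must be at least 1)
    speed: float   # operations per unit time per processor


def buy(budget, unit_cost, speed):
    """Spend a budget on as many identical processors as it covers."""
    return Level(count=int(budget // unit_cost), speed=speed)


def phase_time(work, parallelism, level):
    """Time for one phase on one level, limited by the phase's parallelism."""
    usable = min(level.count, parallelism)
    return work / (usable * level.speed)


def run_time(profile, levels):
    """Total time when each (work, parallelism) phase runs on its fastest level."""
    return sum(min(phase_time(w, d, lv) for lv in levels) for w, d in profile)


# Made-up workload: one serial phase and one highly parallel phase.
profile = [(100, 1), (10000, 1000)]
budget = 1000

# One-level homogeneous machines: all fast processors, or all slow processors.
all_fast = run_time(profile, [buy(budget, unit_cost=100, speed=10.0)])
all_slow = run_time(profile, [buy(budget, unit_cost=1, speed=1.0)])

# Two-level machine: part of the same budget on each kind of processor.
two_level = run_time(profile, [
    buy(200, unit_cost=100, speed=10.0),
    buy(800, unit_cost=1, speed=1.0),
])

print("one-level fast:", all_fast)
print("one-level slow:", all_slow)
print("two-level:     ", two_level)
```

The size of the gap in this sketch is an artifact of the invented numbers and of ignoring communication between levels; it should not be read as an estimate of the gains reported above.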

Accomplishment 3

Conducted extensive cache simulations demonstrating a less obvious characteristic of applications: parallelism has temporal locality for both instructions and data. Our empirical studies reveal that, because of this behavior, little communication is required between HPAM levels when they execute different parts of an application.





CPDC Webmasters
Wed Dec 10 16:19:42 CST 1997