Traditional performance optimization techniques have focused on finding
the most time-consuming kernel in a program and attempting
to optimize it. One example of such an optimization is
restructuring an algorithm to increase data reuse (e.g., blocking), thereby
reducing cache misses. It is well known that the performance
increase achieved by optimizing a given kernel in isolation
generally does not reflect the performance increase that occurs when the
optimized kernel is included in the larger application. This
disparity between kernel and full-application performance
arises from the interaction, or coupling, of kernels,
whose effect on application performance
is not well understood. This problem is exacerbated in
multidisciplinary applications that are themselves composed of several
distinct applications. This work attempts to address this problem by
developing a methodology for measuring
this coupling and describing how the measurements can be used to obtain an
efficient application. The long-term goal is to automate this
methodology in a tool called Prophesy.
Funding: The work is funded by a grant from NASA Ames.
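
As an illustration of the blocking optimization mentioned above, the following is a minimal sketch of a blocked (tiled) matrix multiplication next to its naive counterpart. It is not taken from this work: the function names, the block size of 32, and the choice of matrix multiplication as the kernel are all assumptions made for illustration. The sketch only shows how the loop nest is restructured; the cache benefit itself would normally be measured in a compiled language on matrices that exceed the cache.

```python
def matmul_naive(a, b, n):
    """Reference triple loop: c = a * b for n x n lists of lists."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_blocked(a, b, n, block=32):
    """Blocked (tiled) version of the same computation.

    The three outer loops walk over block x block tiles; the three inner
    loops work within one tile of a, b, and c, so each loaded tile is
    reused many times before the loops move on. `block` is a hypothetical
    tuning parameter, typically chosen so the tiles fit in cache.
    """
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, n, block):
            for jj in range(0, n, block):
                # min(...) handles a partial tile when block does not divide n
                for i in range(ii, min(ii + block, n)):
                    for k in range(kk, min(kk + block, n)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + block, n)):
                            c[i][j] += aik * b[k][j]
    return c
```

Both functions compute the same product; only the order in which elements are visited differs. Timing such a kernel in isolation and again inside a full application is the kind of comparison in which the disparity described above would show up.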