Testbed (Task 1): We have completed the development of a hardware testbed for an adaptive computing system. The testbed consists of: (1) a Wildchild board from Annapolis Micro Systems containing 9 Xilinx XC4010 FPGAs and 2 MB of memory; (2) two Motorola MVME-2604 embedded boards, each containing a 200 MHz PowerPC 604 microprocessor and 64 MB of RAM; (3) one Transtech TDMB 428 DSP board with four Texas Instruments 60 MHz TMS320C40 digital signal processors; (4) a Force 5V board with a 100 MHz microSPARC-II CPU and 64 MB of RAM. These boards are mounted in a VME bus chassis. We have integrated the software components needed for the boards to operate together.
Basic Compiler (Task 2): We have started the development of the MATLAB compiler. We have completed the MATLAB version 5.0 parser. The compiler builds the abstract syntax tree (AST) representation of a given MATLAB program, and from it the control and data flow graph on which various compiler optimizations can be performed. We have implemented a basic MATLAB compiler using the library-based approach. For each MATLAB function or expression for which a corresponding library function implementation exists on one of the targets (DSP, embedded, FPGA, or RTExpress), the appropriate library function is called automatically using a C function call from the Force board. We have developed C program interfaces for the individual library functions on each target (the DSP, embedded, and FPGA platforms, and the RTExpress library from ISI).
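The library-based approach can be illustrated with a minimal Python sketch. It is not the MATCH compiler's code: the table contents, the C routine names (fpga_fft, dsp_matmul, and so on), the platform preference order, and the fallback to the host are all assumptions made for illustration.

```python
# Hypothetical table: MATLAB function name -> {platform: C library routine}.
# None of these routine names come from the MATCH libraries.
LIBRARY = {
    "fft": {"FPGA": "fpga_fft", "DSP": "dsp_fft", "embedded": "emb_fft"},
    "filter": {"FPGA": "fpga_filter"},
    "mtimes": {"DSP": "dsp_matmul", "embedded": "emb_matmul"},
}


def select_library_call(func, preferred=("FPGA", "DSP", "embedded")):
    """Return (platform, routine) for the first preferred platform that
    has an implementation of func, or None if no library entry exists."""
    impls = LIBRARY.get(func)
    if impls is None:
        return None
    for platform in preferred:
        if platform in impls:
            return platform, impls[platform]
    return None


def lower_calls(calls):
    """Replace each MATLAB call name with a (name, platform, routine) triple.
    Calls without a library implementation are left on the host (assumed)."""
    lowered = []
    for func in calls:
        choice = select_library_call(func)
        if choice is None:
            lowered.append((func, "host", None))
        else:
            lowered.append((func, choice[0], choice[1]))
    return lowered


if __name__ == "__main__":
    for entry in lower_calls(["fft", "mtimes", "sin"]):
        print(entry)
```

A fixed preference order stands in here for the mapping decision; in the project that decision is the subject of Task 3.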
Automatic Mapping (Task 3): We have started the development of automatic algorithms for partitioning and mapping MATLAB programs onto the heterogeneous target. We are developing algorithms for pipelining, partitioning, allocation of resources, and scheduling of operations on the various platforms to perform time-constrained resource optimizations. We have developed a tool called SYMPHANY for automated program partitioning and pipelining. Given a high-level sequential specification of the real-time computation with associated timing constraints (latency and throughput), the tool automatically arrives at a cost-effective solution to the system design problem using embedded processors, DSP processors, and FPGAs. Our algorithm is based on a mixed integer linear programming formulation and uses an off-the-shelf LP solver called "lp_solve". We have applied the tool to the data flow graphs of three synthetic benchmarks and to the graphs for the STAP application and an MPEG decoder. For each benchmark, we studied the solutions for various combinations of throughput and latency constraints. In each case the SYMPHANY tool produced the expected number of pipeline stages. In most cases its solutions cost about 10-20% less (in dollars) than a hand-optimized solution.
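The kind of problem SYMPHANY solves can be sketched on a toy instance. The Python below is not the SYMPHANY formulation: it replaces the mixed integer linear program and lp_solve with exhaustive enumeration, assumes a simple chain of tasks, and uses a simplified cost and timing model (one unit of each resource type used, latency as the sum of task times, pipeline period as the heaviest per-resource load). All task names, times, and dollar costs are invented.

```python
from itertools import product


def cheapest_mapping(times, unit_cost, max_latency, max_period):
    """Enumerate every task-to-resource assignment and return the cheapest
    feasible one as (cost, mapping), or None if no assignment is feasible.

    times[task][resource] is the execution time of a task on a resource.
    unit_cost[resource] is the cost of including that resource at all.
    """
    tasks = list(times)
    resources = list(unit_cost)
    best = None
    for assignment in product(resources, repeat=len(tasks)):
        # Latency of the chain: tasks run one after another.
        latency = sum(times[t][r] for t, r in zip(tasks, assignment))
        # Pipeline period: the busiest resource limits the throughput.
        load = {}
        for t, r in zip(tasks, assignment):
            load[r] = load.get(r, 0) + times[t][r]
        period = max(load.values())
        if latency > max_latency or period > max_period:
            continue
        cost = sum(unit_cost[r] for r in load)
        if best is None or cost < best[0]:
            best = (cost, dict(zip(tasks, assignment)))
    return best


# Invented example data (times in arbitrary units, costs in dollars).
TIMES = {
    "fft": {"FPGA": 2, "DSP": 5, "CPU": 9},
    "filter": {"FPGA": 3, "DSP": 4, "CPU": 8},
    "detect": {"FPGA": 6, "DSP": 3, "CPU": 4},
}
UNIT_COST = {"FPGA": 500, "DSP": 300, "CPU": 100}

if __name__ == "__main__":
    print(cheapest_mapping(TIMES, UNIT_COST, max_latency=12, max_period=6))
```

Enumeration is only practical for a handful of tasks; an integer programming formulation, as used by the tool, is what makes larger data flow graphs tractable.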
Compiler Directives (Task 4): We have developed an initial set of directives to specify type, shape, size, precision, data distribution and alignment, task mapping, and resource and timing constraints. The compiler recognizes many of these directives.
Applications (Task 5): We have started to look at various adaptive applications in order to identify which libraries to implement. The initial applications include the STAP application and the MPEG decoder. We have also begun developing implementations of NASA Hyperspectral algorithms on the MATCH prototype, in collaboration with NASA’s MODIS effort. We intend to develop implementations of Level-3 data product algorithms, and use these as benchmarks for the MATCH compiler.
Libraries (Task 6): We have started the development of various MATLAB libraries on the different platforms. Each function is developed as a parameterized function whose parameters are the size of the data, the number of processors or FPGAs used, and the precision of the data (8 bit, 16 bit, 32 bit), for fixed point and floating point representations, on three platforms. The platforms are the Annapolis Wildchild FPGA board, the Transtech DSP board, and the Motorola embedded processor board.
We have completed the development of the following library functions on the Wildchild FPGA board (using RTL VHDL and commercial tools, namely Synplicity synthesis and the Xilinx XACT place and route tools): (1) real matrix addition; (2) real matrix multiplication; (3) IIR and FIR filtering; (4) one- and two-dimensional FFT.
We have also developed the following library functions on the Transtech DSP board and the Motorola embedded board (using C plus MPI and the native C compilers for the Transtech and the Motorola boards): (1) real and complex matrix addition; (2) real and complex matrix multiplication; (3) one- and two-dimensional FFT. Each of these libraries has been developed with a variety of data distributions, such as blocked, cyclic, and block-cyclic distributions. We have characterized the performance of each of these library functions on the various platforms for various data sizes and precisions. In each case we have developed C program interfaces to our MATCH compiler so that the programs can be controlled from the host controller (the Force board).
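The three data distributions named above differ only in how an element index is mapped to the processor that owns it. The Python sketch below shows the standard one-dimensional definitions; the function names and the block parameter are illustrative and are not taken from the MATCH libraries.

```python
def owner(i, n, p, scheme, block=1):
    """Return the processor (0..p-1) that owns element i of an n-element
    array distributed over p processors under the given scheme."""
    if scheme == "blocked":
        size = -(-n // p)  # ceiling of n / p: one contiguous chunk each
        return i // size
    if scheme == "cyclic":
        return i % p  # elements dealt out round-robin
    if scheme == "block-cyclic":
        return (i // block) % p  # blocks of `block` elements dealt round-robin
    raise ValueError("unknown scheme: " + scheme)


def layout(n, p, scheme, block=1):
    """Owner of every element, as a list indexed by element number."""
    return [owner(i, n, p, scheme, block) for i in range(n)]


if __name__ == "__main__":
    print("blocked     ", layout(8, 4, "blocked"))
    print("cyclic      ", layout(8, 4, "cyclic"))
    print("block-cyclic", layout(8, 2, "block-cyclic", block=2))
```

Blocked layouts keep neighbouring elements on one processor, while cyclic layouts spread work evenly when the cost per element varies; block-cyclic sits between the two, with the block size as the tuning knob.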