Challenges in Global Electronic Commerce

Yelena Yesha
Director, NASA-CESDIS
Goddard Space Flight Center

To conduct commerce on the information superhighway, one has to satisfy three key requirements: ensure that it is an open marketplace for information and services, provide electronic analogs of current financial instruments, and protect intellectual property. Consequently, electronic commerce environments should provide the following essential services: searching and discovery (crucial for finding service providers), data interchange and conversion, authentication and security, and electronic payments. We focus on the information integration and information services necessary to support searching and discovery for electronic commerce. We discuss certain core technical issues, such as mechanisms for abstracting information, resolving heterogeneity, handling legacy systems, matchmaking in heterogeneous environments, and helping users manage large information spaces.

Performance Models and Scalability Analysis of Wavefront Algorithms on Large-Scale SMP Clusters

Olaf M. Lubeck
Los Alamos National Laboratory

We present a model for the parallel performance and scalability of algorithms that consist of concurrent, two-dimensional wavefronts implemented in a message-passing environment. The model, based on a LogGP machine parametrization, combines the separate contributions of computation and communication wavefronts. The specifics of the model for two parallel architectures of interest (MPPs and clusters of SMPs) will be discussed. We use data from a deterministic particle transport application taken from the ASCI workload; however, the model is general to any wavefront algorithm implemented on a 2-D processor domain. We also use the validated model to estimate performance and scalability of a hypothetical 100-Teraflop computer system expected to be in existence within the next decade as part of the ASCI program.
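The idea of combining computation and communication contributions along a pipelined wavefront can be illustrated with a rough sketch. The code below is not the model presented in the talk: it is a simplified pipeline estimate in which the function names, the two-neighbor communication pattern, and the assumption that computation and communication do not overlap are all chosen for illustration. The message cost uses the standard LogGP expression for a single k-byte message.

```python
def loggp_message_time(k_bytes, L, o, G):
    """Time for one k-byte message under LogGP: send overhead,
    (k - 1) per-byte gaps, network latency, receive overhead."""
    return o + (k_bytes - 1) * G + L + o


def wavefront_sweep_time(px, py, n_blocks, t_block, t_msg):
    """Rough time for one sweep over a px-by-py processor grid.

    Assumptions (illustrative only):
      - the wavefront starts in one corner, so the farthest processor
        waits (px - 1) + (py - 1) steps before it receives any work;
      - every processor then handles n_blocks work units;
      - each step costs one block of computation plus two messages
        (to the two downstream neighbors), with no overlap.
    """
    fill_steps = (px - 1) + (py - 1)
    step_time = t_block + 2 * t_msg
    return (fill_steps + n_blocks) * step_time


if __name__ == "__main__":
    # Hypothetical parameter values, in arbitrary time units.
    t_msg = loggp_message_time(k_bytes=1024, L=5.0, o=1.0, G=0.01)
    print(wavefront_sweep_time(px=8, py=8, n_blocks=100,
                               t_block=50.0, t_msg=t_msg))
```

Even in this crude form, the sketch shows why scalability degrades as the processor grid grows: the pipeline fill term increases with px + py while the useful work per processor stays fixed.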

Using Prediction to Accelerate Coherence Protocols

Xiaohui Shen
ECE Dept., Northwestern University

In this talk, I will review a paper on using prediction to accelerate the cache coherence protocol of distributed shared memory (DSM) machines. Shared-memory machines are usually built either as centralized shared-memory multiprocessors or as DSM machines. DSM is preferred, especially for large-scale systems, because of the bus bandwidth limitations of a centralized shared-memory multiprocessor. Caches are usually added to each node to reduce memory bandwidth requirements and memory access latency, and the system maintains a cache coherence protocol to avoid data consistency problems. However, DSM suffers from a remote memory access problem under such a protocol: remote memory accesses inherently take ten to a hundred times longer than local memory accesses. Many approaches have been proposed in the literature to ameliorate this problem. I will discuss one of the recent optimization methods: using prediction to accelerate the cache coherence protocol. Based on three sharing patterns, the authors propose three instruction-based prediction optimizations: a migratory sharing optimization, a wide sharing optimization, and a producer-consumer optimization based on speculative execution.

Register File Architectures

Antonio Gonzalez
Departament d'Arquitectura de Computadors
Universitat Politecnica de Catalunya
Barcelona, SPAIN

Register file access time depends on both the number of registers and the number of ports. High ILP rates require large register files with a high bandwidth, and thus, the register file access time is likely to be critical in future processor architectures. This talk presents several architectural alternatives to tackle this problem. The first approach consists of reducing the register pressure by delaying the allocation of physical registers until a late stage in the pipeline. This is achieved by the introduction of virtual-physical registers. The second approach relies on multi-bank register file architectures, such that each bank has different capabilities in terms of number of registers and ports. Values are dynamically allocated to the most convenient bank based on their expected requirements. The third approach tackles the access time problem by using a register file cache.

High-Availability in Scalable Distributed Data Structures

Witold Litwin
U. Paris 9 Dauphine, France

Multicomputers, and specifically large clusters and networks of workstations, have become the most popular hardware platform. Scalable Distributed Data Structures (SDDSs) are new data structures designed specifically for multicomputers. They allow for very large data collections, spanning dozens or even thousands of storage servers. One problem that needs to be addressed for a large SDDS is how to ensure access to all the data despite the likely unavailability of some storage sites. Traditional high-availability schemes, e.g., variants of RAID schemes, do not carry over efficiently: either the scalability of the SDDS or the parallel access to its data suffers. New schemes have been proposed in recent years. They allow for high-availability SDDSs that tolerate the failure of k >= 1 sites, where k may be predefined or may scale with the file. We will give an overview of some of these schemes with respect to their principles and their relative advantages and drawbacks.
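To make the k = 1 case concrete, here is a minimal sketch of single-failure recovery using XOR parity over a group of equal-length records, one per storage site. This is only the classical parity idea, not any of the SDDS schemes surveyed in the talk, which the abstract does not describe. The function names and the fixed-length record layout are assumptions for illustration.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity(records):
    """Parity record for a group: XOR of all data records."""
    return reduce(xor_bytes, records)


def recover(surviving, parity_record):
    """Rebuild the single missing record from the survivors and the parity."""
    return reduce(xor_bytes, surviving, parity_record)


if __name__ == "__main__":
    # Hypothetical group of three sites, each holding one 4-byte record.
    group = [b"abcd", b"efgh", b"ijkl"]
    p = parity(group)
    # Suppose the second site is unavailable.
    print(recover([group[0], group[2]], p))
```

Tolerating k > 1 simultaneous failures needs more than one parity record per group, which is where the design space discussed in the talk opens up.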

Part-Mating Strategies using a Generic Assembly and Disassembly Workcell Model

Swee Mok, Northwestern University

A product that is difficult to assemble or disassemble raises manufacturing cost. To study this problem, a Virtual Assembly and Disassembly (VIRAD) system that can be integrated into a CAD/CAM system is proposed, so that designers can evaluate products for assembly and disassembly efficiency during the design process. The VIRAD system uses a model of a Generic Assembly and Disassembly (GENAD) workcell to generate merging trees for simulating assembly and disassembly processes. GENAD subsequently uses a Structured Assembly Coding System (SACS) developed for encoding the mating operations of two objects in a three-dimensional task space. In a GENAD workcell, every object is either a part for making a product or a handler that facilitates the workcell operations. Specialized handlers are robots, fixtures, end-effectors, and human operators. Each workcell operation is associated with a cost to facilitate the calculation of product assembly and disassembly costs. The estimated cost is used to represent a product's manufacturability, which is important feedback for the designer. This paper will present details of the GENAD workcell model and its inner representation. Simulation results for two products with up to 18 parts will also be presented.
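The cost roll-up over a merging tree can be sketched as follows. This is a hypothetical illustration, not the GENAD representation: the `MergeNode` class, its fields, and the rule that total cost is the plain sum of per-operation costs are all assumptions, since the abstract does not specify how costs are combined.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MergeNode:
    """A node in a (hypothetical) merging tree.

    A leaf is a single part; an inner node is the subassembly produced
    by mating its children, at a cost of op_cost.
    """
    name: str
    op_cost: float = 0.0  # cost of the mating operation (0 for a leaf part)
    children: List["MergeNode"] = field(default_factory=list)


def total_cost(node):
    """Sum the operation costs over the whole tree (assumed additive)."""
    return node.op_cost + sum(total_cost(child) for child in node.children)


if __name__ == "__main__":
    # Hypothetical three-part product: (base + shaft) mated first, then the cover.
    base, shaft, cover = MergeNode("base"), MergeNode("shaft"), MergeNode("cover")
    sub = MergeNode("base+shaft", op_cost=2.0, children=[base, shaft])
    product = MergeNode("product", op_cost=1.5, children=[sub, cover])
    print(total_cost(product))
```

Comparing this total across alternative merging trees for the same product is one simple way such a cost could serve as manufacturability feedback.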