Publication:
Exploiting Data-Flow for Fault-Tolerance in a Wide-Area Parallel System

Publisher

University of Virginia, Department of Computer Science

Abstract

Wide-area parallel processing systems will soon be available to researchers to solve a range of problems. In these systems, it is certain that host failures and other faults will be a common occurrence. Unfortunately, most parallel processing systems have not been designed with fault-tolerance in mind. Mentat is a high-performance object-oriented parallel processing system that is based on an extension of the data-flow model. The functional nature of data-flow enables both parallelism and fault-tolerance. In this paper, we exploit the data-flow underpinning of Mentat to provide easy-to-use and transparent fault-tolerance. We present results on both a small-scale network and a wide-area heterogeneous environment that consists of three sites: the National Center for Supercomputing Applications, the University of Virginia, and the NASA Langley Research Center.

Note: Abstract extracted from PDF file via OCR.

Description

Original submission date: 2013-10-11T13:58:11Z

Citation

NguyenTuong, Anh, Andrew Grimshaw, and Mark Hyett. "Exploiting Data-Flow for Fault-Tolerance in a Wide-Area Parallel System." University of Virginia Dept. of Computer Science Tech Report (1996).
