A Fault-Tolerant System Architecture for Navy Applications
by W. T. Comfort
This paper describes the architecture of a computer system, being designed and built for the U.S. Navy, that is expected to be the standard Navy shipboard computer for the next twenty years or so. The system must meet a requirement for very high reliability, which is addressed by a multiprocessor configuration that can recover dynamically from hardware faults and support on-line repair of failed hardware elements. Accomplishing this requires various types of redundant hardware elements and special system architecture features, as well as intelligent fault-recovery software. It also requires that the application programs be designed to participate fully in the recovery and reconfiguration process. The paper presents the overall system architecture and discusses a number of significant new features designed to support fault-tolerant operation, including a duplex control bus, a computer interconnection system, a technique for remote diagnostics, a single-button maintenance procedure, and special fault-handling software.