远强 邱 / Computer School, National University of Defense Technology
Outside the atmosphere, high-energy particle irradiation causes Single Event Effects (SEEs), which lead to hardware failures and seriously degrade the reliability and longevity of the systems operating there, so effective measures are needed to harden them against such faults. Compared with traditional hardware-implemented fault tolerance, software-implemented tolerance of Single Event Effects has advantages in cost and flexibility. Current work mostly focuses on error detection, while error recovery is not well studied, and the available recovery methods usually incur considerable time and space overhead. Building on an error detection algorithm, this paper presents PROMER, an effective and fine-grained error recovery algorithm. It targets the Storeless Basic Block and uses live-variable analysis, which both ensures the effectiveness of recovery and reduces its overhead. Fault injection experiments show that, on average, 96 percent of the detected errors can be recovered, while introducing only 11.4 percent performance overhead and 5.67 percent space overhead.
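As a rough illustration of the live-variable analysis mentioned above, the following Python sketch computes which registers are live on entry to a basic block. It is not the PROMER algorithm itself. It assumes that a "storeless" block is one containing no memory store instruction, and that the live-in registers are the values worth preserving so the block can be re-executed after a detected error. The `Instr` type, the opcode names, and the functions `is_storeless` and `live_in` are hypothetical names introduced only for this example.

```python
from typing import Iterable, List, NamedTuple, Optional, Set, Tuple


class Instr(NamedTuple):
    """A hypothetical three-address instruction (illustrative only)."""
    op: str                   # opcode, e.g. "load", "add", "store"
    dest: Optional[str]       # register written, or None if nothing is written
    srcs: Tuple[str, ...]     # registers read


def is_storeless(block: Iterable[Instr]) -> bool:
    """Assumed definition: a block is storeless if it never writes memory."""
    return all(ins.op != "store" for ins in block)


def live_in(block: List[Instr], live_out: Set[str]) -> Set[str]:
    """Standard backward liveness scan over a single basic block.

    Walking from the last instruction to the first, a register stops being
    live at its definition and becomes live at each use.
    """
    live = set(live_out)
    for ins in reversed(block):
        if ins.dest is not None:
            live.discard(ins.dest)   # definition kills the old value
        live.update(ins.srcs)        # uses make the register live
    return live


# Example block: r1 = load [r0]; r2 = r1 + r3; r1 = r2 * r2
block = [
    Instr("load", "r1", ("r0",)),
    Instr("add", "r2", ("r1", "r3")),
    Instr("mul", "r1", ("r2", "r2")),
]

if is_storeless(block):
    # Only r0 and r3 are read before being overwritten, so under the
    # assumptions above they are the only values that must survive
    # for the block to be safely re-executed.
    print(sorted(live_in(block, {"r1"})))
```

Under these assumptions, re-executing a storeless block is side-effect free with respect to memory, so preserving only the live-in registers (rather than the full register file) is one plausible way a recovery scheme could keep its overhead low.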