Introduction
- Checkpointing and Recovery (C&R) are classical techniques for restarting a system after a failure.
- The use of C&R has surged recently in real-time systems, swap-space management, database recovery, distributed deadlock resolution, workflow integrity, and idle-cycle-based systems (Condor, Batrun).
- Many groups (CMU, Rice University, UT-Austin, UIUC, UMass, UB) are working on new designs and applications of C&R.