An Object-oriented Testbed for Evaluation of Checkpointing and Recovery Systems (OTEC)

B. Ramamurthy

CS and ECE Department

Overview

Introduction

Dissertation Overview

This Talk

Error Model

Error Model

Concurrent error detection

Checkpoints

Detection points

Publications: List 1

Comprehensive Error Detection & Recovery Protocol (CReP)

Illustrating CReP

Communication Scenario 1

Error Occurrence

Error Detection Points

Message Validation

Rollback Recovery - Traditional

Rollback Recovery - CReP

Modeling the Activities

Message Validation

Data Structures

Global State Matrix Sp

Activities and Global State Matrix at Tp

Global State Matrix and Rollback

Updating Global State Matrix

Recovery Algorithms

Recovery Algorithms (contd.)

Analysis

Contributions of CReP

Publications: List 2

Evaluation of CReP

OTEC - Objectives

Requirements

OTEC Infrastructure

Architecture of OTEC

Generic Architecture

OTEC Architecture (Current)

Domain-dependent

Application Instance - CReP

Comparison: CReP & ORR

Mix and Match

Sample Experiments

Publications: List 3

Design Principles

Design Principles (contd.)

Operational details

Co-routine structure

Sample Application Structure

Major Contributions

Future Work

Ongoing Work