Danny Dolev, H. Raymond Strong
IEEE Symposium on Reliability in Distributed Software and Database Systems (SRDS 1982), pages 53–60
IEEE Computer Society
July 1982
Two-Phase Commit and other distributed commit protocols provide a method to commit changes while preserving consistency in a distributed database. These protocols can cope with various failures occurring in the system. When a failure occurs, however, they do not guarantee that protocol processing terminates within a given time: sometimes the protocol must wait for a failed processor to be returned to operation. A straightforward use of timeouts in a distributed system is fraught with unexpected peril and does not provide an easy solution to the problem. Byzantine Agreement is combined with Two-Phase Commit, using observations of Lamport, to provide a method that copes with failure within a given time bound. An extra benefit of this combination of ideas is that it handles undetected and transient faults as well as the more usual system-down or processor-down faults handled by other distributed commit protocols.
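
The peril of a straightforward timeout can be illustrated with a small sketch. This is not the protocol from the paper; it is a simplified Python model, assuming two participants that have both voted yes and a coordinator that fails after its commit decision reaches only one of them. The names `State`, `Participant`, `on_timeout_naive`, `on_timeout_blocking`, and `run` are hypothetical and chosen only for this illustration.

```python
from enum import Enum


class State(Enum):
    PREPARED = "prepared"    # voted yes, awaiting the coordinator's decision
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    """Hypothetical participant that has already voted yes."""

    def __init__(self, name):
        self.name = name
        self.state = State.PREPARED

    def receive_decision(self, decision):
        # A decision from the coordinator is applied only while still prepared.
        if self.state is State.PREPARED:
            self.state = decision

    def on_timeout_naive(self):
        # Naive rule (assumption for illustration): abort when no decision
        # arrives before the timeout expires.
        if self.state is State.PREPARED:
            self.state = State.ABORTED

    def on_timeout_blocking(self):
        # Blocking rule: a prepared participant does not decide on its own,
        # so it stays prepared until the coordinator can be reached again.
        pass


def run(timeout_rule):
    """Coordinator decides commit, reaches A, then fails before reaching B."""
    a, b = Participant("A"), Participant("B")
    a.receive_decision(State.COMMITTED)
    timeout_rule(b)  # B never hears the decision and its timer expires
    return a.state, b.state


if __name__ == "__main__":
    print(run(Participant.on_timeout_naive))
    print(run(Participant.on_timeout_blocking))
```

Under the naive rule the two participants end in different final states, which violates consistency. Under the blocking rule consistency is preserved, but the second participant cannot terminate until the failed coordinator returns, which is the unbounded wait the abstract describes.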