Non-determinism in Byzantine Fault-Tolerant Replication 2


Projects in this category follow a common pattern:

  1. Students study the corresponding paper and try to understand the consensus algorithm's functioning;
  2. They create a proof of concept implementation it using a technology of choice;
  3. They test the consensus algorithm's robustness agains crash and byzantine faults.

There is an implicit step hidden into the procedure above: consensus algorithms are usually aimed at implementing some sort of SMR system (e.g. a replicated DB). Even if the goal of a project in this category is to study & implement a consensus algorithm, students may need to actually use it in order to test it. To do so they will have to create a simple SMR system, like, e.g. a replicated Key-Value store supporting at least two operations: put and get.

Paper abstract:

Service replication distributes an application over many processes for tolerating faults, attacks, and misbehavior among a subset of the processes. The established state-machine replication paradigm inherently requires the application to be deterministic. This paper distinguishes three models for dealing with non-determinism in replicated services, where some processes are subject to faults and arbitrary behavior (so-called Byzantine faults): first, a modular approach that does not require any changes to the potentially non-deterministic application (and neither access to its internal data); second, a master-slave approach, in which ties are broken by a leader and the other processes validate the choices of the leader; and finally, a treatment of applications that use cryptography and secret keys. Cryptographic operations and secrets must be treated specially because they require strong randomness to satisfy their goals.
The paper also introduces two new protocols. The first uses the modular approach for filtering out non-deterministic operations in an application. It ensures that all correct processes produce the same outputs and that their internal states do not diverge. The second protocol implements cryptographically secure randomness generation with a verifiable random function and is appropriate for certain security models. All protocols are described in a generic way and do not assume a particular implementation of the underlying consensus primitive.

Students must work on the second protocol: Mastercrypt.