This short-article focusses on one of the most popular consensus problems in distributed computing known as “Byzantine’s General Problem” and when a distributed system is said to be Byzantine Fault Tolerant.
Various Byzantine Fault Tolerant algorithms are being used in Permissioned Blockchain Networks e.g Hyperledger Sawtooth is using Practical Byzantine Fault Tolerant(PBFT) to achieve consensus. So, if you want to understand how the consensus is actually achieved in such systems. This article is surely for you.
So let’s begin this journey.
First of let’s focus on the term ‘Byzantine Fault Tolerant’ and when a distributed system is said to be Byzantine Fault Tolerant?
The answer lies within the different types of failures that can occur in a distributed system-
1.Crash-Fail: In this type of failure, the component stop working without any warning. So you need to restart the node or replace it. We can say it is a ‘Fail-Stop’ failure.
2.Omission Failure: In this, component transmits a message but that message is not received by other nodes or we can say it is omitted.
3.Byzantine Failures: It is a ‘no-stop failure’.It occurs when there is a malicious or traitor node in the network which sends conflicting messages or block the messages by not sending them to the other nodes in the network, which may lead to faulty results.
Now I think it is self-explanatory that a distributed system is said to be Byzantine Fault Tolerant if it can cope-up with the Byzantine Failures.
The applications of BFT can be found in various domain like Blockchain and even in Boeing 777 and 787 flight controls.
Let’s move on to a specific problem which forms a base for understanding BFT-
The Byzantine General’s Problem :
Situation: Suppose there are several generals and they have to attack army camp C and they are surrounding the army camp such that they can’t communicate with each other directly. The only way communication can happen in between them is through a carrier and he needs to pass the enemy camp for transferring every message.
So they need proper protocols to reach a final decision whether to “attack ” on C, the next morning or “retreat”.
If they all agree to attack, and they do attack then they will surely win or if they agree to “retreat” then they can fight on another day. But if one of the general attack and other decides to retreat then they are surely gonna lose.
Problems:
- Malicious Generals create variation in the decision to the others
- Message Carriers may not reach
- Reach a single solution, considering the downsides of a few Generals
Keeping in mind the situation, let’s discuss this problem with three generals.
Three Generals Problem:
Suppose, there are one commander and two lieutenants surrounding the army camp C and they have to collectively reach a decision to ‘attack’ or ‘retreat’.
If neither of the generals is faulty then all will work good and they will surely reach a decision.
Let’s see the case if one of the generals starts behaving maliciously: