The Two Generals Problem and the TCP Three-way Handshake

2022.10.12

The Two Generals Problem and the TCP Three-way Handshake


This paper documents the two generals problem: For unreliable channels, a 100% reliable consensus cannot be reached after numerous confirmations. The TCP three-way handshake is to confirm the serial number of the packet in both directions, adding a timeout and retry, which is an engineering solution to the two generals problem.

The Two Generals Problem, also known as the Two Generals Paradox and the Two Armies Problem, is a classic computer thought experiment.

First, to avoid confusion, we need to realize that the two generals problem is related to the Byzantine generals problem, but they are not the same thing. The Byzantine Generals Problem is a more general version of the Two Generals Problem, which is usually hesitant to discuss widely in distributed systems fault tolerance and blockchain.

1. The two generals problem

Two armies, stationed on two hills and the same group of enemies in the valley, can only win if the two generals attack at the same time. The only way for the two generals to agree on a time is to send messengers through the valley, which is in the enemy-occupied area. If the messenger is captured, the information will be lost.

Phenomenon 1: General A first sends a messenger to General B to convey "We will attack together at 10:00 pm", but General A does not know whether the messenger can pass through the enemy-occupied area. Because he is worried that he will be the only attacker, General A may hesitate to attack as planned. ; At this time, General B can send a messenger to confirm receipt, and B's messenger may also be captured. General B will hesitate because he is worried that A has not received the confirmation signal, and General B will hesitate; New couriers may also be captured. So alternate confirmations are endless.

Phenomenon 2: General A sent a messenger, but no reply was received after a long time. General A did not know whether his own messenger was captured or the confirmation messenger of General B was captured.

We realize that even if both parties keep acknowledging that they have received the other's last message, there is no guarantee that the other party has reached a consensus with us.

The problem of the two generals is unsolvable. The current tcp handshakes three times and waves four times are all engineering solutions (this will be discussed later).

2. Brainstorming on the two generals

Many people have tried to solve/mitigate the double general problem, and have come up with some practical practices that can be implemented.

Here we still assume the uncertainty of the channel, the messenger will only be captured, but not betrayed and tampered with.

2.1 Shooting a bird with a shotgun

If General A sends 100 couriers (numbered 1 to 100) at a time, expect General B to receive a message from one courier at worst.

Based on the number of messengers received, General B evaluates the reliability of this channel, and also dispatches an appropriate number of confirmation messengers based on probability.

eg: General A sends 100 messengers, General B receives information from 10 messengers, General B can basically confirm that the reliability of this channel is 10%, and General B should send at least 10 messengers (according to the probability, one messenger will reach the other side. ).

2.2 Intermittent retries

The posture of shooting birds with shotguns is too messy, but at least it can help General B to improve his confidence and reach a consensus.

There is also a strategy that costs less messengers (and can improve the general’s confidence). Assuming that it takes 20 minutes to cross the valley to the other side and return, General A can send a messenger to the other side at intervals of 20 minutes until he receives the first messenger confirmation from General B on the other side. dispatch again).

The above two strategies are a trade-off of speed and cost, and which one to use depends on which one is more suitable for the problem we have.

3. Why is the tcp three-way handshake [1] an engineering solution to the two generals problem?

picture

There is a question on Zhihu: Why is TCP three-way handshake instead of two or four? [2] There are three answer angles.

①Why is TCP three-way handshake instead of two or four? - Answers of Punk Snowball Rabbit- Know [3]

②(Why is TCP three-way handshake instead of two or four times? - Che Xiaopang's answer- Zhihu [4]

③Why is TCP three-way handshake instead of two or four? - wuxinliulei's answer - know [5]

I hope you all read it carefully.

First we need to know:

The three-way handshake is to synchronize (syn) the serial number (seq=m) in both directions. To synchronize the serial number once, it takes two packets to go back and forth, and there are four packets in both directions. The 2nd and 3rd packets sent from one side can be merged together so the last three packets.

But according to the double general problem, who said that two packets can be guaranteed to be synchronized successfully.

In order to alleviate the double general problem, the tcp3 handshake adds a timeout retry mechanism. (Note: Retry is only on the initiator of information synchronization)

The first packet: the SYN sent by A to B was lost halfway and did not reach B

A will periodically timeout and retransmit until it receives an acknowledgment from B.

The second packet, that is, the SYN+ACK sent to A was lost in the middle and did not reach A

B will periodically timeout and retransmit until it receives confirmation from A

The third packet: that is, A sent to ACK and lost in the middle, and did not reach B

After A sends the ACK, it unilaterally believes that tcp is in the Established state, while B obviously believes that tcp is in the Active state.

a. Assuming that both parties have no data to send at this time, B will retransmit periodically over time until it receives the confirmation from A. After receiving the confirmation, B's TCP connection is also in the Established state, and packets can be sent in both directions.

b. Assuming that A has data to send at this time, and B receives A's Data + ACK, it will naturally switch to the established state and accept A's Data.

c. Assuming that B has data to send, but the data cannot be sent, the SYN + ACK will be retransmitted periodically over time, and the data cannot be sent until the confirmation from A is received.

  • https://finematics.com/two-generals-problem/• https://www.bilibili.com/read/cv16604716

Summarize

This paper documents the two generals problem: For unreliable channels, a 100% reliable consensus cannot be reached after numerous confirmations.

The TCP three-way handshake is to confirm the serial number of the packet in both directions, adding a timeout and retry, which is an engineering solution to the two generals problem.

    Citation link

    [1] tcp three-way handshake: https://blog.csdn.net/weixin_35942339/article/details/112733885

    [2] Why is TCP three-way handshake instead of two or four? : https://www.zhihu.com/question/24853633/answer/573627478

    [3] Why is TCP three-way handshake instead of two or four? - Answer of Punk Snowball Rabbit- Know: https://www.zhihu.com/question/24853633/answer/200721662

    [4] (Why is TCP three-way handshake instead of two or four times? - Che Xiaopang's answer- Zhihu: https://www.zhihu.com/question/24853633/answer/115173386

    [5] Why is TCP three-way handshake instead of two or four? - wuxinliulei's answer- Zhihu: https://www.zhihu.com/question/24853633/answer/63668444