How does Netty manage TCP streaming? A complete solution to the problem of sticky packets and unpacking, and the best practices of codecs

2025.04.12

Netty questions about network I/O and data transmission often come up in interviews, for example:

What is TCP sticking and unpacking? Why doesn't UDP have this problem?

What causes sticking and unpacking?

How does Netty solve TCP sticking and unpacking?

1. Detailed explanation of TCP sticking and unpacking

1. Problem reproduction
Before formally explaining the problem, let's take a look at an example to see how the TCP sticking and unpacking problem occurs. The following two code snippets are the server settings and the business processor, which will continuously output the data sent by the client after establishing a connection with the client:

The core code of the server-side business processor is also very simple in logic. After receiving the message, it prints out directly:

Next, let's look at the client-side business processor and configuration class. The business processor is very simple: after the connection is established, it sends 1,000 messages in a row, each with the content hello Netty Server!:

The configuration class is also a fixed template:

After starting the server and the client, we can see the following output, in which many hello Netty Server! messages are glued together into sticky packets.
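
The effect can also be reproduced without a network. The following is a minimal plain-JDK sketch, not the article's Netty code: StreamBoundaryDemo and the 25-byte read size are assumptions chosen for illustration, and an in-memory stream stands in for the TCP connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Plain-JDK simulation of a byte stream; no Netty and no real socket involved. */
final class StreamBoundaryDemo {

    /** Writes the same message `count` times back to back, as a byte stream would carry it. */
    static byte[] writeAll(String message, int count) {
        byte[] bytes = message.getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        for (int i = 0; i < count; i++) {
            stream.write(bytes, 0, bytes.length);
        }
        return stream.toByteArray();
    }

    /** Reads the stream in chunks of at most `readSize` bytes (a hypothetical read size). */
    static List<String> readInChunks(byte[] stream, int readSize) {
        ByteArrayInputStream in = new ByteArrayInputStream(stream);
        byte[] buffer = new byte[readSize];
        List<String> reads = new ArrayList<>();
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) > 0) {
            reads.add(new String(buffer, 0, n, StandardCharsets.US_ASCII));
        }
        return reads;
    }

    public static void main(String[] args) {
        byte[] stream = writeAll("hello Netty Server!", 3);
        for (String read : readInChunks(stream, 25)) {
            System.out.println("[" + read + "]");
        }
    }
}
```

With three 19-byte messages and 25-byte reads, the first read returns one complete message plus the beginning of the next (sticking), and the last read returns only the tail Server! (a half packet). The stream itself carries no marker of where a message ends.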

2. Cause Analysis
In TCP programming, the server and the client exchange messages in an agreed, fixed format. This format is usually called a protocol; common application-layer examples are HTTP and FTP.

The sticky packets in the example above appear because our server and client never agreed on a protocol. TCP is connection-oriented and stream-oriented, so for various reasons one complete message may be split into several smaller segments, or several messages may be merged, during transmission. Without a protocol the receiver cannot tell where one message ends and the next begins, which results in sticky packets and unpacking:

There are three reasons why TCP packets are split:

socket buffer and sliding window
Nagle algorithm
MSS

Let's start with the combined effect of the socket buffer and the sliding window. TCP is a full-duplex, stream-oriented protocol, and it has to keep sending and receiving reliable while matching the sender's pace to what the receiver can handle. For this, TCP uses a sliding window mechanism: the sender may only send data that fits within the current window, and the receiver only accepts data that falls within its window. Only after the receiver has received and processed data and acknowledged it with an ACK does the window slide forward, allowing the sender to send more.

Since TCP is stream-oriented, the data sent and received by both sides is also staged in socket buffers, and these buffers have no way of knowing whether a given run of bytes belongs to the same message. The socket buffer is divided into a send buffer (SO_SNDBUF) and a receive buffer (SO_RCVBUF). Everything the application sends is first stored in the send buffer and then handed to the kernel protocol stack for transmission; on the receiving side, the operating system kernel copies data from the receive buffer into the application's buffer.

So the socket buffer and the sliding window mechanism together cause the following two abnormal situations:

(1) The sender stops sending data when the data reaches the limit of the sliding window. After the receiver's socket buffer receives the data, it directly transmits it to the application layer. Because the packet is incomplete, from the receiver's perspective, packet unpacking occurs.

(2) The sender sends multiple packets into the receiver's buffer. Because the receiving side does not read them from the socket buffer in time, the packet boundaries are no longer known when reading actually starts, and everything is passed upward at once, resulting in packet sticking.

Next, the Nagle algorithm. Every segment that is sent carries a 20-byte TCP header and a 20-byte IP header in addition to the data, and the sender must wait for the receiver's ACK. This can lead to the following waste of network resources:

40 bytes of header information are assembled for 1 byte of useful information!

To make better use of network bandwidth, TCP stacks implement the Nagle algorithm. While previously sent data is still waiting for an ACK, the algorithm holds back small segments (smaller than the MSS) and accumulates them; once the ACK arrives, or enough data has accumulated to fill a full segment, the accumulated data is sent as one segment. This makes better use of bandwidth and avoids the congestion caused by a flood of tiny packets.

Obviously, if multiple small packets are sent together, the receiver may also encounter problems with packet sticking or unpacking because it cannot confirm the boundaries of the packets:

The last is MSS, short for Maximum Segment Size, which is the maximum amount of data a single TCP segment can carry. If a message is larger than the MSS, it is split into several smaller segments for transmission. Because the message then arrives piece by piece, sticking and unpacking are again likely.
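
The splitting arithmetic can be sketched as follows. This is an illustration only: MssSplitDemo is a hypothetical helper, and the MSS of 1460 bytes is an assumption (a common value on Ethernet, where a 1500-byte MTU minus 40 bytes of headers leaves 1460); the real value is negotiated per connection.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative arithmetic only; the real splitting is done by the kernel's TCP stack. */
final class MssSplitDemo {

    /** Returns the payload size of each segment when payloadLength bytes are cut at mss. */
    static List<Integer> segmentSizes(int payloadLength, int mss) {
        if (mss <= 0) {
            throw new IllegalArgumentException("mss must be positive");
        }
        List<Integer> sizes = new ArrayList<>();
        for (int sent = 0; sent < payloadLength; sent += mss) {
            sizes.add(Math.min(mss, payloadLength - sent));
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Assumed MSS of 1460 bytes; a 4000-byte message needs three segments.
        System.out.println(segmentSizes(4000, 1460));
        // A 19-byte message fits in one segment.
        System.out.println(segmentSizes(19, 1460));
    }
}
```

A receiver that reads after the first segment of the 4000-byte message has arrived sees only part of the message, which is exactly the unpacking case described above.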

To verify this, we can capture and analyze the packets with Wireshark, filtering on the service port with the following command:

Inspecting the data sent each time shows that neither the size nor the content is missing anything, and the kernel buffer space is sufficient. The cause is therefore clear: because TCP is a stream-oriented protocol, the receiver gets too much or too little data each time it reads from the kernel buffer, which shows up as sticking or unpacking.

2. Solutions to half packets and sticky packets
1. Introduction to several solutions
In fact, the root cause of the problems above is that TCP is a stream-oriented protocol, so the receiver cannot cut the byte stream back into the original messages on its own. Taking the message above, hello Netty Server!, as an example, we have the following ways to split the stream:

If every message ends with "!", we check whether the received stream contains "!" while splitting. Only when it does are the bytes up to that point assembled into one packet and passed on.

The message above is 19 bytes long, so we can also require every message to be exactly 19 bytes. Once 19 bytes have been received, they are assembled into one packet.

Customize a protocol and require the sender to assemble packets according to it. For example, each packet contains two fields, length and data. length records the length of the data; for the message above its value is 19, and data holds the content.
2. DelimiterBasedFrameDecoder
First, the delimiter-based approach. Every message ends with an exclamation mark, so we can split the stream by looking for that special character.

The code is as follows. We use DelimiterBasedFrameDecoder to split on a special delimiter. Its parameters, in order, are:

Maximum length of data packet.
Whether to remove delimiters during decoding.
Delimiter.

After starting, you can see that the problem can also be solved:
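
To make the splitting logic concrete, here is a minimal plain-JDK sketch of delimiter-based framing. It is not Netty's implementation: DelimiterFramer is a hypothetical class whose three constructor arguments mirror the three parameters listed above, and it supports only a single-byte delimiter. The comment at the top shows what the corresponding Netty construction would typically look like, with 1024 as an assumed maximum frame length.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Netty equivalent (assumed maximum length of 1024):
//   new DelimiterBasedFrameDecoder(1024, true, Unpooled.copiedBuffer("!".getBytes()))
final class DelimiterFramer {
    private final int maxFrameLength;
    private final boolean stripDelimiter;
    private final byte delimiter;
    // Bytes received so far that do not yet form a complete frame.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    DelimiterFramer(int maxFrameLength, boolean stripDelimiter, byte delimiter) {
        this.maxFrameLength = maxFrameLength;
        this.stripDelimiter = stripDelimiter;
        this.delimiter = delimiter;
    }

    /** Feeds one chunk as read from the socket; returns every frame completed by it. */
    List<String> feed(byte[] chunk) {
        List<String> frames = new ArrayList<>();
        for (byte b : chunk) {
            if (b == delimiter) {
                if (!stripDelimiter) {
                    pending.write(b);
                }
                frames.add(new String(pending.toByteArray(), StandardCharsets.US_ASCII));
                pending.reset();
            } else {
                pending.write(b);
                if (pending.size() > maxFrameLength) {
                    // Simplification: this sketch just fails when a frame grows too long.
                    throw new IllegalStateException("frame exceeds " + maxFrameLength + " bytes");
                }
            }
        }
        return frames;
    }

    public static void main(String[] args) {
        DelimiterFramer framer = new DelimiterFramer(1024, true, (byte) '!');
        // One full message plus half of the next: only the full one is emitted.
        System.out.println(framer.feed("hello Netty Server!hello ".getBytes(StandardCharsets.US_ASCII)));
        // The rest of the second message arrives: it is completed and emitted.
        System.out.println(framer.feed("Netty Server!".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

The key point is the pending buffer: bytes after the last delimiter are kept until a later read completes the message, so both sticky packets and half packets resolve into whole frames.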

3. Decoder based on data length FixedLengthFrameDecoder
Similarly, we can also split the data packet based on the data length:

As shown above, every message we send is 19 bytes long, so this solution configures a fixed-length decoder in the server's pipeline that cuts the stream every 19 bytes, so that each packet is read and parsed correctly. We add a FixedLengthFrameDecoder to the pipeline and set its frame length to 19.
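
The fixed-length rule can be sketched in plain JDK code as follows. FixedLengthFramer is a hypothetical class written for illustration, not Netty's decoder; the comment shows the Netty construction the text refers to.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Netty equivalent: new FixedLengthFrameDecoder(19)
final class FixedLengthFramer {
    private final int frameLength;
    // Bytes received so far that do not yet fill a whole frame.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    FixedLengthFramer(int frameLength) {
        if (frameLength <= 0) {
            throw new IllegalArgumentException("frameLength must be positive");
        }
        this.frameLength = frameLength;
    }

    /** Feeds one chunk as read from the socket; returns every complete frame. */
    List<String> feed(byte[] chunk) {
        pending.write(chunk, 0, chunk.length);
        byte[] all = pending.toByteArray();
        List<String> frames = new ArrayList<>();
        int offset = 0;
        while (all.length - offset >= frameLength) {
            frames.add(new String(all, offset, frameLength, StandardCharsets.US_ASCII));
            offset += frameLength;
        }
        // Keep the incomplete tail for the next read.
        pending.reset();
        pending.write(all, offset, all.length - offset);
        return frames;
    }

    public static void main(String[] args) {
        FixedLengthFramer framer = new FixedLengthFramer(19);
        System.out.println(framer.feed("hello Netty Server!hello ".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(framer.feed("Netty Server!".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

This approach only works while every message really is exactly 19 bytes; a single message of a different length shifts every frame after it.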

4. Decoder based on protocol length field LengthFieldBasedFrameDecoder
The last approach, and the one I recommend, is a custom protocol. In practice neither a fixed message length nor a reserved delimiter can always be guaranteed, so we agree with the client to put the data length in a header at the start of each packet, for example a 4-byte length field.

So the client code for writing data after establishing a connection is changed to:
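
As a rough illustration of that packet layout, the following plain-JDK sketch builds one packet, assuming a 4-byte big-endian length followed by the UTF-8 bytes of the message. LengthPrefixedEncoder is a hypothetical name; in a Netty handler the same two steps would typically be ByteBuf.writeInt followed by ByteBuf.writeBytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Builds one packet: a 4-byte length field followed by the data. */
final class LengthPrefixedEncoder {

    static byte[] encode(String message) {
        byte[] data = message.getBytes(StandardCharsets.UTF_8);
        // ByteBuffer is big-endian by default, matching the assumed wire format.
        ByteBuffer packet = ByteBuffer.allocate(4 + data.length);
        packet.putInt(data.length); // length field: number of data bytes only
        packet.put(data);           // data field
        return packet.array();
    }

    public static void main(String[] args) {
        byte[] packet = encode("hello Netty Server!");
        // 4 bytes of length + 19 bytes of data
        System.out.println(packet.length);
    }
}
```

Note that the length field here counts only the data, not itself. That choice is what later allows lengthAdjustment to stay at 0.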

The final packet structure is shown in the figure below:
The server processor uses LengthFieldBasedFrameDecoder instead, and the construction method is as follows:
The corresponding parameter meaning is:

maxFrameLength: The maximum length of the packet. Here we set it to Integer.MAX_VALUE, which means no limit.
lengthFieldOffset: This value represents the offset of the field describing the packet length. For example, in our packet, it is 0, which means reading the length from the initial position.
lengthFieldLength: The number of bytes in the field describing the packet length. For example, in our packet, it is 4 bytes.
lengthAdjustment: The compensation value added to the value of the length field. This parameter is the more interesting one, so let's illustrate it with an example. Suppose a packet consists of head + length + data, the length field records 12 bytes (the whole packet: head + length + data), and the data we actually want is only 10 bytes. In that case lengthAdjustment can be set to -2 (10 - 12), so that only the 10 bytes of data are counted after the length field.

In our own packet the length field already records exactly the length of the data, so lengthAdjustment is simply set to 0 and no adjustment is needed.

initialBytesToStrip: How many bytes at the start of the packet to skip when reading. For our packet it is 4, which means the 4-byte length field is skipped and only the data is kept. Accordingly, we use the following constructor call:
So we have a decoder with the following structure. After stress testing again, the data is parsed and processed correctly:
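
With the values described above, the Netty construction would be new LengthFieldBasedFrameDecoder(Integer.MAX_VALUE, 0, 4, 0, 4). What that configuration does to the byte stream can be sketched in plain JDK code. LengthPrefixedDecoder below is a hypothetical class that hard-codes this one configuration (offset 0, 4-byte length, no adjustment, strip 4 bytes); it is an illustration of the idea, not Netty's implementation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Mirrors: new LengthFieldBasedFrameDecoder(Integer.MAX_VALUE, 0, 4, 0, 4)
final class LengthPrefixedDecoder {
    private static final int LENGTH_FIELD_BYTES = 4;
    // Unconsumed bytes left over from earlier reads (position..limit).
    private ByteBuffer pending = ByteBuffer.allocate(0);

    /** Feeds one chunk as read from the socket; returns the data of every complete packet. */
    List<String> feed(byte[] chunk) {
        ByteBuffer merged = ByteBuffer.allocate(pending.remaining() + chunk.length);
        merged.put(pending).put(chunk);
        merged.flip();

        List<String> frames = new ArrayList<>();
        while (merged.remaining() >= LENGTH_FIELD_BYTES) {
            int dataLength = merged.getInt(merged.position()); // peek without consuming
            if (dataLength < 0) {
                throw new IllegalStateException("negative length: " + dataLength);
            }
            if (merged.remaining() - LENGTH_FIELD_BYTES < dataLength) {
                break; // half packet: wait for more bytes
            }
            merged.position(merged.position() + LENGTH_FIELD_BYTES); // strip the length field
            byte[] data = new byte[dataLength];
            merged.get(data);
            frames.add(new String(data, StandardCharsets.UTF_8));
        }
        pending = merged; // its position marks the first unconsumed byte
        return frames;
    }

    public static void main(String[] args) {
        byte[] data = "hello Netty Server!".getBytes(StandardCharsets.UTF_8);
        ByteBuffer packet = ByteBuffer.allocate(4 + data.length);
        packet.putInt(data.length).put(data);
        LengthPrefixedDecoder decoder = new LengthPrefixedDecoder();
        System.out.println(decoder.feed(packet.array()));
    }
}
```

Because the decoder always knows how many bytes the current packet needs, it neither depends on a reserved character in the data nor on every message having the same size.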

5. More about Netty's built-in decoder
The Netty authors also document more use cases in the Javadoc comments of LengthFieldBasedFrameDecoder. Let's look at Example 1. The length field is 2 bytes at offset 0, and we want to read the entire packet, so the parameters are set as follows:

lengthFieldOffset is 0, which means the length field sits at the very start of the packet, with no offset.

lengthFieldLength is 2, which means the packet length is obtained by reading 2 bytes.

lengthAdjustment is 0, which means the length field already records the length of the data that follows it, so no adjustment is needed.

initialBytesToStrip is 0, which means nothing is skipped: the decoded frame starts at the very beginning of the packet and contains the length field plus the data it describes.

Now Example 2. The packet is the same as above, but the decoded frame should not contain the length field, so the parameters are set as follows:
• lengthFieldOffset is 0, which means the length field sits at the very start of the packet, with no offset.
• lengthFieldLength is 2, which means the packet length is obtained by reading 2 bytes.
• lengthAdjustment is 0, which means the length field already records the length of the data that follows it, so no adjustment is needed.
• initialBytesToStrip is 2, which means the first 2 bytes of the packet, that is, the length field, are skipped.

Now Example 3. The length field is still 2 bytes, but its value includes the length field itself: the value is 14, that is, 2 bytes for the length field plus 12 bytes for the HELLO, WORLD string that follows. To get the complete packet, the parameters are set as follows:

lengthFieldOffset is 0, which means the length field sits at the very start of the packet, with no offset.

lengthFieldLength is 2, which means the packet length is obtained by reading 2 bytes.

lengthAdjustment is -2, because the length field records the length of the entire packet, so the 2 bytes of the length field itself must be subtracted.

initialBytesToStrip is 0, which means nothing is skipped: the decoded frame starts at the very beginning of the packet and contains the length field plus the data.

In Example 4 the length field comes after a header, so the header must be skipped to reach it. The decoded frame should still contain all parts of the packet, so the parameters are set as follows:

lengthFieldOffset is 2, which means the 2-byte header is skipped to reach the length field.

lengthFieldLength is 3, which means the packet length is obtained by reading 3 bytes.

lengthAdjustment is 0, which means the length field already records the length of the data that follows it, so no adjustment is needed.

initialBytesToStrip is 0, which means nothing is skipped: the decoded frame starts at the very beginning of the packet and contains the header, the length field and the data.

Example 5 is a special case. The length field comes first and records only the length of the actual content; it does not include the header that sits between the length field and the content. To get all parts of the packet, the parameters are set as follows:

lengthFieldOffset is 0, which means the length field sits at the very start of the packet, with no offset.

lengthFieldLength is 3, which means the packet length is obtained by reading 3 bytes.

lengthAdjustment is 2, because the length field records only the length of the Actual Content, while a 2-byte header also follows the length field. The recorded length therefore needs +2.

initialBytesToStrip is 0, which means nothing is skipped: the decoded frame starts at the very beginning of the packet and contains the length field, the header and the content.

In Example 6 the length field comes after HDR1, and the decoded frame should contain only HDR2 and the Actual Content. The parameters are set as follows:

lengthFieldOffset is 1, which means HDR1 is skipped to reach the length field.

lengthFieldLength is 2, which means the packet length is obtained by reading 2 bytes.

lengthAdjustment is 1, because the length field records only the length of the Actual Content, while the 1-byte HDR2 also follows the length field. The recorded length therefore needs +1.

initialBytesToStrip is 3, which means HDR1 and the length field are skipped and reading starts at HDR2.

In Example 7 the length field records the length of the entire packet. To obtain only HDR2 and the Actual Content, the parameters are set as follows:

lengthFieldOffset is 1, which means HDR1 is skipped to reach the length field.
lengthFieldLength is 2, which means the packet length is obtained by reading 2 bytes.
lengthAdjustment is -3, which subtracts the lengths of HDR1 and the length field, since both are already included in the recorded length.
initialBytesToStrip is 3, which means HDR1 and the length field are skipped and reading starts at HDR2.
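
All seven examples follow the same arithmetic: the full frame is lengthFieldOffset + lengthFieldLength + the value read from the length field + lengthAdjustment, and the decoded frame is that total minus initialBytesToStrip. The sketch below checks this against the seven parameter sets. LengthFieldMath is a hypothetical helper, and the length values (12, 14, 16) assume the 12-byte HELLO, WORLD content used in the Javadoc examples.

```java
/** Arithmetic behind the seven examples; a sketch, not Netty's decoder. */
final class LengthFieldMath {

    /** Total bytes of one frame on the wire, before anything is stripped. */
    static int frameLength(int lengthFieldOffset, int lengthFieldLength,
                           int lengthAdjustment, int lengthFieldValue) {
        return lengthFieldOffset + lengthFieldLength + lengthFieldValue + lengthAdjustment;
    }

    /** Bytes handed to the next handler after initialBytesToStrip are skipped. */
    static int decodedLength(int lengthFieldOffset, int lengthFieldLength,
                             int lengthAdjustment, int initialBytesToStrip,
                             int lengthFieldValue) {
        return frameLength(lengthFieldOffset, lengthFieldLength, lengthAdjustment, lengthFieldValue)
                - initialBytesToStrip;
    }

    public static void main(String[] args) {
        // Columns: offset, length, adjustment, strip, value stored in the length field.
        int[][] examples = {
            {0, 2, 0, 0, 12},   // Example 1
            {0, 2, 0, 2, 12},   // Example 2
            {0, 2, -2, 0, 14},  // Example 3: value includes the length field itself
            {2, 3, 0, 0, 12},   // Example 4
            {0, 3, 2, 0, 12},   // Example 5
            {1, 2, 1, 3, 12},   // Example 6
            {1, 2, -3, 3, 16},  // Example 7: value is the whole packet
        };
        for (int i = 0; i < examples.length; i++) {
            int[] e = examples[i];
            System.out.println("Example " + (i + 1)
                    + ": frame=" + frameLength(e[0], e[1], e[2], e[4])
                    + " decoded=" + decodedLength(e[0], e[1], e[2], e[3], e[4]));
        }
    }
}
```

A negative lengthAdjustment always means the length field counts bytes that lie at or before the end of the length field; a positive one means extra bytes follow the length field that it does not count.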

3. Summary
The above is the author's source code analysis and practice of how Netty solves the half-packet and sticky-packet problems. I hope it is helpful to you.
