ByteDance interview: What is the difference between an HTTP long connection and a TCP long connection?

2022.12.04

HTTP's Keep-Alive, also called an HTTP long connection, is implemented by the application. It lets the same TCP connection carry multiple HTTP requests and responses, which avoids the TCP connection setup and teardown overhead that HTTP short connections incur.

Hello everyone, I am Xiaolin.

A reader sent me a private message a while ago. During his ByteDance interview, he was asked these two questions:

The first question: How is the NULL value stored in MySQL?

The second question: What is the difference between HTTP long connection and TCP long connection?

The first question mainly tests whether you know how a MySQL record is stored. I wrote an article explaining it a few days ago. If you haven't read it yet, see: "ByteDance interview: How is the NULL value stored in MySQL?"

The second question is actually asking what is the difference between HTTP Keep-Alive and TCP Keepalive?

This is a good question, and many people get confused by it: the two names look so similar that it is easy to mistake them for the same thing.

If you have carefully read the illustrated networking series on my website, you should already know the answer, because I have covered it before.

However, many readers have probably forgotten it since then, so this time let's review it together.

In fact, these are two completely different things, implemented at different layers:

  • HTTP's Keep-Alive is implemented by the application layer (user space) and is called an HTTP long connection;
  • TCP's Keepalive is implemented by the TCP layer (kernel space) and is called the TCP keep-alive mechanism.

Let's look at each in turn.

HTTP's Keep-Alive

The HTTP protocol adopts the "request-response" mode, that is, the client initiates a request, and the server returns a response.

Figure: request-response

Since HTTP runs on top of the TCP transport protocol, the client and server must establish a TCP connection before they can communicate over HTTP. The client then sends an HTTP request, and the server returns a response after receiving it. At that point the "request-response" exchange is complete and the TCP connection is released.

Figure: an HTTP request

If every request has to go through the whole process (establish TCP connection -> request resource -> respond with resource -> release connection), this is called an HTTP short connection, as shown in the figure below:

Figure: HTTP short connection

This is wasteful: each connection can be used to request a resource only once.

Could the TCP connection be kept open after the first HTTP request, so that subsequent HTTP requests reuse it?

Yes, and that is exactly what HTTP's Keep-Alive does. The same TCP connection can send and receive multiple HTTP requests and responses, which avoids the overhead of repeatedly establishing and releasing connections. This approach is called an HTTP long connection.

Figure: HTTP long connection

The defining feature of an HTTP long connection is that the TCP connection stays open as long as neither end explicitly asks to close it.
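To make the difference between short and long connections concrete, here is a minimal sketch using only Python's standard library. It starts a throwaway local server and counts how many TCP connections the server accepts when the client reuses one connection versus opening a new one per request. The names CountingHandler and count_connections are made up for this illustration and are not part of any library.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep connections open
    connections = 0                # TCP connections accepted so far

    def setup(self):
        super().setup()
        CountingHandler.connections += 1  # setup() runs once per TCP connection

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def count_connections(reuse, paths=("/a", "/b", "/c")):
    """Request every path and return how many TCP connections the server saw."""
    CountingHandler.connections = 0
    server = HTTPServer(("127.0.0.1", 0), CountingHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        if reuse:
            # HTTP long connection: one TCP connection carries every request.
            conn = http.client.HTTPConnection("127.0.0.1", port)
            for path in paths:
                conn.request("GET", path)
                conn.getresponse().read()  # drain the body before the next request
            conn.close()
        else:
            # HTTP short connection: connect, request, respond, close, repeat.
            for path in paths:
                conn = http.client.HTTPConnection("127.0.0.1", port)
                conn.request("GET", path, headers={"Connection": "close"})
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections
```

With three requests, count_connections(reuse=True) should report a single TCP connection, while count_connections(reuse=False) should report three.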

How can I use the Keep-Alive function of HTTP?

It is disabled by default in HTTP 1.0. If the browser wants to enable Keep-Alive, it must add this header to the request:

Connection: Keep-Alive

When the server receives the request and responds, it adds the same header to the response:

Connection: Keep-Alive

This way the connection is not torn down after the response but stays open. When the client sends another request, it uses the same connection. This continues until either the client or the server asks to disconnect.

Starting from HTTP 1.1, Keep-Alive is enabled by default. To disable it, add this header to the HTTP request:

Connection: close

Most browsers now use HTTP/1.1 by default, so Keep-Alive is turned on by default. Once the client and server reach an agreement, the long connection is established.
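The negotiation rules above can be summed up in a few lines. The following is only a sketch of the decision, assuming the version string is either "HTTP/1.0" or "HTTP/1.1"; the function name is_persistent is hypothetical, and real servers handle more cases than this.

```python
def is_persistent(http_version, headers):
    """Return True if the connection should stay open after this message."""
    # Header names are case-insensitive, so normalise before looking one up.
    connection = next(
        (value for name, value in headers.items() if name.lower() == "connection"),
        "",
    )
    tokens = {token.strip().lower() for token in connection.split(",")}
    if "close" in tokens:
        return False                   # an explicit close always wins
    if http_version == "HTTP/1.0":
        return "keep-alive" in tokens  # HTTP 1.0: off unless requested
    return True                        # HTTP 1.1: on unless "close" is sent
```

For example, is_persistent("HTTP/1.0", {}) is False, while is_persistent("HTTP/1.1", {}) is True.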

HTTP long connections not only reduce the overhead of TCP connection resources, but also make HTTP pipelining possible.

HTTP pipelining means that the client can send several requests in a row without waiting for the server's response to each one, which can reduce the overall response time.

For example, suppose a client needs two resources. Previously, it would send request A on the TCP connection, wait for the server's response, and only then send request B. With HTTP pipelining, the client can send request A and request B back to back.

Figure: the HTTP pipelining mechanism (shown on the right)

But the server still responds in order: it answers request A first, and only then answers request B.

Moreover, the client cannot send the next batch of requests until the server has responded to the first batch. In other words, if one response is blocked on the server, everything queued behind it is stuck as well. This is the "head-of-line blocking" problem.
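Head-of-line blocking is easy to see with a little arithmetic. The sketch below is a simplified model, not a real HTTP implementation: it assumes the server works on all pipelined requests concurrently and that ready_times[i] is the moment response i is ready. Because responses must go out in request order, each one is delivered no earlier than the one before it. The function name in_order_delivery is made up for this example.

```python
from itertools import accumulate


def in_order_delivery(ready_times):
    """Return when each pipelined response can actually be delivered.

    A response leaves only after every earlier response has left, so its
    delivery time is the running maximum of the ready times.
    """
    return list(accumulate(ready_times, max))


# Request A is slow (3.0 s); B and C are ready almost immediately,
# yet both have to wait behind A.
print(in_order_delivery([3.0, 0.1, 0.2]))  # [3.0, 3.0, 3.0]

# If the slow request is second, only the responses behind it are delayed.
print(in_order_delivery([0.1, 3.0, 0.2]))  # [0.1, 3.0, 3.0]
```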

Some readers may ask: with an HTTP long connection, if the client finishes one HTTP request and never sends another, isn't the idle TCP connection a waste of resources?

That's right. To avoid wasting resources, web server software generally provides a parameter such as keepalive_timeout (the name nginx uses) to set the timeout for HTTP long connections.

For example, if the timeout is set to 60 seconds, the web server starts a timer. If the client does not send a new request within 60 seconds of completing the last HTTP request, the timer expires and triggers a callback function that releases the connection.

Figure: HTTP long connection timeout
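The idle-timeout rule can be modelled in a few lines. This is only a sketch of the bookkeeping, not how any particular web server implements it (real servers typically hook a timer and callback into their event loop). The class name IdleTracker and its methods are hypothetical; the clock is injectable so the logic can be checked without waiting 60 seconds.

```python
import time


class IdleTracker:
    """Tracks how long a kept-alive connection has been idle."""

    def __init__(self, keepalive_timeout, clock=time.monotonic):
        self.keepalive_timeout = keepalive_timeout  # seconds, e.g. 60
        self._clock = clock
        self._last_done = clock()  # when the last request finished

    def request_finished(self):
        # Every completed request restarts the idle countdown.
        self._last_done = self._clock()

    def expired(self):
        # True once the connection has been idle for the full timeout,
        # which is the point where a server would release it.
        return self._clock() - self._last_done >= self.keepalive_timeout
```

A server loop would call request_finished() after each response and close the connection once expired() returns True.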

TCP's Keepalive

TCP's Keepalive is the TCP keep-alive mechanism. I explained how it works in a previous article, so I will reuse that explanation here.

If the TCP connection at both ends has no data interaction and the condition for triggering the TCP keep-alive mechanism is met, the TCP protocol stack in the kernel will send a detection message.

  • If the peer program is working normally, it responds to the TCP keep-alive probe, so the TCP keep-alive timer is reset and waits for the next keep-alive interval to elapse.
  • If the peer host has crashed, or the peer is unreachable for some other reason, the TCP keep-alive probe gets no response. After several consecutive unanswered probes, once the keep-alive probe limit is reached, TCP reports that the connection is dead.

Therefore, when there is no data exchange between the two sides, the TCP keep-alive mechanism can determine whether the peer's TCP connection is still alive by sending probe packets. This work is done in the kernel.

Figure: TCP keep-alive mechanism
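How long does it take to notice a dead peer? Roughly the idle time before the first probe plus the time spent on unanswered probes. The sketch below works this out; the default values are not from the text above but are the usual Linux defaults for net.ipv4.tcp_keepalive_time, net.ipv4.tcp_keepalive_intvl and net.ipv4.tcp_keepalive_probes, and the function name is made up for this example.

```python
def dead_peer_detection_seconds(idle=7200, interval=75, probes=9):
    """Approximate time until TCP keep-alive declares a silent peer dead.

    idle:     seconds without data before the first probe is sent
    interval: seconds between consecutive unanswered probes
    probes:   unanswered probes allowed before giving up
    """
    return idle + interval * probes


total = dead_peer_detection_seconds()
print(total)                    # 7875
print(divmod(total, 3600)[0])   # 2 (so a little over two hours)
```

With these defaults, detection takes a little over two hours, which is why applications that need faster detection usually tune the values or add their own heartbeat.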

Note that if the application wants to use the TCP keep-alive mechanism, it needs to set the SO_KEEPALIVE option through the socket interface to take effect. If it is not set, the TCP keep-alive mechanism cannot be used.
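As a rough illustration of that last point, this is how an application might turn the option on through Python's socket module. SO_KEEPALIVE is the switch itself; the three per-socket tuning options (TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT) are not available on every platform, so the sketch sets them only where the socket module exposes them. The helper name enable_tcp_keepalive and the timing values are assumptions chosen for the example.

```python
import socket


def enable_tcp_keepalive(sock, idle=60, interval=10, count=5):
    """Enable kernel keep-alive probes on a TCP socket."""
    # Without this option the kernel never sends keep-alive probes.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Optional per-socket tuning; skipped where the platform lacks the option.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_tcp_keepalive(sock)
# A non-zero value means keep-alive is now enabled on this socket.
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)  # True
sock.close()
```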

Summary

HTTP's Keep-Alive, also called an HTTP long connection, is implemented by the application. It lets the same TCP connection carry multiple HTTP requests and responses, which avoids the TCP connection setup and teardown overhead that HTTP short connections incur.

TCP's Keepalive, also called the TCP keep-alive mechanism, is implemented by the kernel. When the client and the server have not exchanged data for a certain period of time, the kernel sends probe packets to check whether the other side is still online and the connection is still valid, and then decides whether to close the connection.