How to really "engage" the body of the HTTP protocol


So far we have covered the characteristics of HTTP and the start line, focusing on request methods and status codes. These two are important because they are often used together with header fields. Along the way I kept saying that related content would be covered later, and this is where "later" begins: from this chapter until HTTP/2, I will take you through the core header fields of HTTP/1 and practice them with Node. In fact, most of HTTP's capabilities are extended through header fields.

So in this chapter, let's learn about the header fields related to the body.

Let's recall first: what do we know about the body so far? In the 0.9 era there was only a response body and no request body. Only in 1.0 did both the request and the response get a body. In 1.1, some body-related header fields were added. Let's look at what those header fields are and how they are negotiated.

1. MIME

I have briefly mentioned this before, so you should have some impression of it: MIME plays a very important role in how HTTP handles the body, and we need a deeper understanding. As I said before, when we need to pass some data, two questions come up: how do I get the data to the other party, and how does the other party interpret it correctly? We won't go into detail about the transfer, which is left to TCP. The interpretation is done by the two parties, that is, the client and the server.

The client sends a picture to the server. How does the server know it is a picture? Or the other way around, how does the client know? In theory there is no way to know, unless I tell you "this is a picture". Does that feel a bit too simple? To put it bluntly, it is negotiation. And even if the server receives the message and knows it is a picture, it can still refuse to treat it as a picture and simply return an error, and there is nothing you can do about it.

But the point of a "standard" is that we follow it. So although a server may not follow the rules, we will learn according to the rules.

Let's continue, hahaha. What is MIME? The full name is Multipurpose Internet Mail Extensions. It was originally used in email systems to let emails carry multiple types of data. More detailed introductions are easy to find if you are interested. When HTTP needed the same thing, it found that MIME was good enough to use directly, which saved it from inventing another scheme, so HTTP took part of MIME and used it.

MIME divides data into a number of top-level types, and the format looks like this: type/subtype. Nine of the types are as follows:

  • Text: text in a standardized representation; it can use multiple character sets and multiple formats;
  • Multipart: used to connect multiple parts of the message body to form a message, these parts can be different types of data;
  • Application: used to transmit application data or binary data;
  • Message: used to package an E-mail message;
  • Image: used to transmit static image data;
  • Audio: used to transmit audio or sound data;
  • Video: used to transmit dynamic image data, which can be a video data format edited together with audio;
  • Font: used to transfer font files;
  • Model: used to transfer 3D model files.

Some of these types should look familiar, such as Text, Multipart, Application, Image, and Video. We have all come into contact with them in actual work. Next, let's look at some of the subtypes:

  • text/plain (plain text)
  • text/html (HTML file)
  • application/xhtml+xml (XHTML file)
  • image/gif (GIF image)
  • image/jpeg (JPEG image)
  • image/png (PNG image)
  • audio/mpeg (MP3 audio)
  • audio/aac (AAC audio)
  • video/mpeg (MPEG video)
  • video/mp4 (MPEG-4 video)
  • application/octet-stream (arbitrary binary data)
  • application/json (JSON file)
  • application/pdf (PDF file)
  • application/msword (Microsoft Word files)
  • application/vnd.openxmlformats-officedocument.wordprocessingml.document (Microsoft Word 2007 document)
  • application/vnd.wap.xhtml+xml (XHTML Mobile Profile, WAP 2.0)
  • application/xhtml+xml (wap2.0+)
  • message/rfc822 (RFC 822 form)
  • multipart/alternative (HTML form and plain text form of HTML mail, the same content is expressed in different forms)
  • application/x-www-form-urlencoded (form sent using the HTTP POST method)
  • multipart/form-data (same as above, but mainly used when the form is sent with file upload)

I have listed many of the common data types and their subtypes. You don't need to understand all of them in depth; recognizing the names is enough. The ones I bolded are the most common data types in our daily work.

2. Data type

In HTTP, the client tells the server what type of data it wants to receive through the Accept field, and the server uses the Content-Type field to tell the client what data was actually sent. Note that the Accept-* and Content-* fields form a family, and they are the core of this chapter. The family contains quite a few header fields, and we will go through them one by one.

Let's start with the header fields that indicate the data type. The Accept field lists the MIME types that the client can understand. Multiple types can be listed, separated by ",", so that the server has more options, for example:

Accept: application/json,text/html,application/xml

This tells the server that I can parse JSON, HTML, and XML, and that it may send me data in any of these types.

Correspondingly, the server will use the Content-Type header field to inform the client of the real type of entity data:

Content-Type: application/json

The browser reads Content-Type, learns that the body is JSON, parses it accordingly, and that's it.

Simple, right?

Then... I still want to emphasize that a server may receive the data types the client wants and still not honor them, and that is entirely allowed. That is why in the early RFC 1945 (HTTP/1.0), Accept was only described among the additional features in the appendix. It did not officially become part of the standard until 1.1.
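The Accept / Content-Type pairing above can be sketched in a few lines of Node-flavored JavaScript. This is only an illustration: the respond function and the renderers table are hypothetical names, weights are ignored, and the fallback to JSON is my own assumption about what a lenient server might do.

```javascript
// Hypothetical renderers: the same data, serialized as two different MIME types.
const renderers = new Map([
  ['application/json', (data) => JSON.stringify(data)],
  ['text/html', (data) => `<p>${data.message}</p>`],
]);

// Pick the first type in Accept that we can produce and label the body with it.
function respond(acceptHeader, data) {
  // "a/b, c/d;q=0.9" -> ["a/b", "c/d"]; parameters are dropped in this sketch.
  const wanted = (acceptHeader || '')
    .split(',')
    .map((item) => item.split(';')[0].trim().toLowerCase());

  // Assumption: when nothing matches, answer with JSON anyway.
  // As noted above, a server is free not to honor Accept.
  const type = wanted.find((t) => renderers.has(t)) || 'application/json';

  return {
    headers: { 'Content-Type': type },
    body: renderers.get(type)(data),
  };
}

const reply = respond('text/html,application/json', { message: 'hello' });
console.log(reply.headers['Content-Type']); // text/html
console.log(reply.body);                    // <p>hello</p>
```

The client then only has to read Content-Type to know how to parse the body, whichever type the server ended up choosing.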

3. Data compression

Usually, to save bandwidth, we compress data before transmitting it, and HTTP is no different. There are many ways to compress data, but far fewer than there are MIME types; only three are in common use:

gzip: the familiar GNU zip compression format, and the most popular compression format on the Internet;

deflate: the zlib (deflate) compression format, second only to gzip in popularity;

br: A new compression algorithm optimized specifically for HTTP (Brotli).

The client uses the Accept-Encoding field to list the compression formats it supports, again separated by ",", and the server puts the format it actually used in the Content-Encoding field.

Accept-Encoding: gzip, deflate, br

Content-Encoding:

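In actual use, both fields can be omitted. When the client omits Accept-Encoding, servers in practice treat it as not asking for compression (strictly speaking, the specification reads the omission as "no preference"). When the server omits Content-Encoding, it is telling the client that the data is not compressed.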
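As a rough sketch of the server side of this negotiation, the function below picks a value for Content-Encoding from the client's Accept-Encoding. The names SERVER_CODINGS and chooseEncoding are made up for illustration, the server's preference order is an assumption, and weights on the codings are ignored.

```javascript
// Codings this hypothetical server can produce, in its own order of preference.
const SERVER_CODINGS = ['br', 'gzip', 'deflate'];

// Returns the value for Content-Encoding, or null to send the body uncompressed.
function chooseEncoding(acceptEncoding) {
  // Header omitted: this sketch sends the body uncompressed.
  if (!acceptEncoding) return null;

  // "gzip, deflate, br" -> Set { "gzip", "deflate", "br" }; parameters are dropped.
  const offered = new Set(
    acceptEncoding.split(',').map((c) => c.split(';')[0].trim().toLowerCase())
  );

  const match = SERVER_CODINGS.find((c) => offered.has(c));
  return match === undefined ? null : match;
}

console.log(chooseEncoding('gzip, deflate, br')); // br
console.log(chooseEncoding('gzip'));              // gzip
console.log(chooseEncoding(undefined));           // null
```

When the result is null, the server simply leaves Content-Encoding out of the response.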

4. Language type

With the data type and the compression type, the machine can identify what the transmitted data is and how to decompress it. But there are many countries and regions in the world, they use different languages, and even people in the same country or region may use different languages. So how does the browser display text in a language that the reader can understand? In other words, how do I encode the data correctly for each situation? This is really a question of internationalization.

I guess that by now you already know the answer: negotiation and header fields. Hahaha, it feels a bit boring. There is no suspense at all.

The request header field is Accept-Language, and the entity header field in the response is Content-Language. Note that the Accept-* fields are request header fields, while the Content-* fields are entity header fields, not response header fields.

Examples are as follows:

Accept-Language: zh-CN, zh, en

Content-Language: zh-CN

It's simple, but we are not done yet. What are the values of these header fields? Well... they are called language types, meaning the natural languages used by humans, such as English, Chinese, and French. These natural languages also have regional variants, so, like data types, language types take the form type-subtype. The difference is the separator: data types use "/" between the parent and the child, and language types use "-".

For example, en means English, en-US means American English, and en-GB means British English. There are many more language types; you can look them up yourself, so I won't list them here.

At this point the server knows which language to use, but underneath, a computer only deals in 0 and 1. How are those 0s and 1s turned into text in the right language? This is where character sets come in. The early days of computing were chaotic: each country and region defined its own system and invented its own encoding for its own characters, such as ASCII for English and GBK for Chinese. As a result, the same bytes could be displayed as different text under different encodings.

So Unicode and UTF-8 appeared later, accommodating all the world's languages in one scheme.

In an HTTP request, the client can state the character sets it accepts through Accept-Charset. There is no dedicated counterpart in the response, though. Instead, the character set is given as "charset=xxx" after the data type in the Content-Type field. Pay special attention to this point.

Accept-Charset: gbk, utf-8

Content-Type: text/html; charset=utf-8

However, modern browsers support many character sets, so the Accept-Charset request header is usually not sent. The server usually does not return Content-Language either, because the language can generally be inferred from the character set. So in practice there is only Accept-Language in the request header and only Content-Type in the response header.
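Since the character set travels inside Content-Type, the receiver has to pull it out before it can turn bytes into text. Here is a minimal sketch of that step; parseContentType is a hypothetical helper, and it assumes parameter values never contain ";" (true for charset names).

```javascript
// Split "text/html; charset=utf-8" into the media type and its charset parameter.
function parseContentType(value) {
  const [type, ...params] = value.split(';').map((s) => s.trim());
  const result = { type: type.toLowerCase(), charset: null };

  for (const param of params) {
    const eq = param.indexOf('=');
    if (eq === -1) continue; // not a name=value pair, skip it
    const name = param.slice(0, eq).trim().toLowerCase();
    const val = param.slice(eq + 1).trim().replace(/^"|"$/g, ''); // strip optional quotes
    if (name === 'charset') result.charset = val.toLowerCase();
  }
  return result;
}

const parsed = parseContentType('text/html; charset=utf-8');
console.log(parsed.type);    // text/html
console.log(parsed.charset); // utf-8

// The same three bytes only become the intended character under the right charset.
const bytes = Uint8Array.from([0xe4, 0xbd, 0xa0]);
console.log(new TextDecoder(parsed.charset).decode(bytes)); // 你
```

TextDecoder is a global in current Node versions and in browsers, so no import is needed for the decoding step.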

5. Quality value

In English this is called the quality value, or quality factor, and it really just means weight. In HTTP it is written as the parameter q, in the form "q=value", where the value ranges from 0 to 1 inclusive and may have up to three decimal places. The parameter is attached to the value it describes with ";".

One thing to watch out for: in most programming languages, such as JavaScript, the semicolon ";" is a stronger separator than the comma ",", but in HTTP it is the opposite. The comma separates the items in the list, and the semicolon only attaches a parameter to the item in front of it. Let's look at an example:

Accept: text/html,application/xml;q=0.9,*/*;q=0.8

This means that the browser would most like the server to send an HTML file (when q is not written, the weight defaults to 1), then an XML file with a weight of 0.9, and finally any other type with a weight of 0.8. After receiving the request, the server will give priority to returning HTML based on this.
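To make the comma-then-semicolon reading concrete, here is a small sketch that parses an Accept-style header into items with weights and sorts them by preference. parseAccept is a hypothetical name; dropping items with q=0 reflects that a weight of 0 means "not acceptable".

```javascript
// Parse "text/html,application/xml;q=0.9,*/*;q=0.8" into [{ value, q }, ...],
// ordered from most to least preferred.
function parseAccept(header) {
  return header
    .split(',') // the comma separates the items
    .map((item, index) => {
      // the semicolon attaches parameters to the item in front of it
      const [value, ...params] = item.split(';').map((s) => s.trim());
      let q = 1; // default weight when q is not written
      for (const param of params) {
        const [name, val] = param.split('=').map((s) => s.trim());
        if (name.toLowerCase() === 'q') q = Number(val);
      }
      return { value, q, index };
    })
    // drop empty items, q=0 ("not acceptable") and unparsable weights
    .filter((entry) => entry.value !== '' && entry.q > 0)
    // higher weight first; equal weights keep their original order
    .sort((a, b) => b.q - a.q || a.index - b.index)
    .map(({ value, q }) => ({ value, q }));
}

console.log(parseAccept('text/html,application/xml;q=0.9,*/*;q=0.8'));
// [ { value: 'text/html', q: 1 },
//   { value: 'application/xml', q: 0.9 },
//   { value: '*/*', q: 0.8 } ]
```

The same function works for Accept-Encoding, Accept-Language, and Accept-Charset, since they all use the same list-with-weights shape.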

6. Vary

This one is a bit unusual, so let's look at it. Vary says which request header fields the server consulted when producing the response. The negotiation between the client and the server that decides what the response looks like is actually not transparent: you don't know how it went, and the server may not care about your preferences at all.

But a friendly server will add a Vary field to the response header, recording the request header fields it consulted during content negotiation, to give you a little information.

Vary: Accept-Encoding,User-Agent,Accept

The above example indicates that the server returns a response message after referring to the three fields Accept-Encoding, User-Agent, and Accept.

The Vary field can be thought of as a special "version mark" on the response. Whenever one of the request headers listed in Vary, such as Accept, changes, the response may change with it. In other words, the same URI can have several different "versions". This is mainly used by proxy servers in the middle of the transmission link to implement caching, which we will come back to when we talk about "HTTP caching".
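One way to picture the "version mark" is as part of a cache key. The sketch below is my own simplification of what a caching proxy might do, not a description of any particular proxy: cacheKey is a hypothetical function, and requestHeaders is assumed to use lower-case names, which is how Node's http module exposes them.

```javascript
// Build a cache key from the URL plus the request headers that Vary names.
function cacheKey(url, varyHeader, requestHeaders) {
  const names = varyHeader
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter(Boolean)
    .sort(); // order in Vary should not change the key

  const parts = names.map((name) => `${name}=${requestHeaders[name] ?? ''}`);
  return [url, ...parts].join('|');
}

const vary = 'Accept-Encoding, Accept';
const gzipKey = cacheKey('/index', vary, { 'accept-encoding': 'gzip', accept: 'text/html' });
const brKey = cacheKey('/index', vary, { 'accept-encoding': 'br', accept: 'text/html' });

console.log(gzipKey);           // /index|accept=text/html|accept-encoding=gzip
console.log(gzipKey === brKey); // false: same URI, two cached "versions"
```

A request header that is not listed in Vary, such as a cookie here, does not affect the key, so those requests share one cached response.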

7. Chunked transfer

In the first six sections we talked about how data is negotiated in HTTP so that both the client and the server know how to process it. If the data is large, we can also negotiate a compression method and compress it before sending. Everything seems fine, but what if the file I want to transfer is really huge? A video, for example, is hundreds of megabytes when small and several gigabytes when large, and video compresses very poorly. So how do you transmit it?

Well... the title is the answer. If we cannot shrink a large body as a whole, we can only split it into small pieces. The server sends these pieces to the browser one by one, and after the browser receives them it reassembles them according to certain rules.

In HTTP this idea is called chunked transfer encoding, and it is indicated by "Transfer-Encoding: chunked" in the response. It means that the body of the response is not sent all at once, but is divided into many chunks that are sent one after another.

One thing to note: the length of a response body is either known or unknown; it cannot be both at once. What does that mean? It means that Transfer-Encoding: chunked and Content-Length are mutually exclusive and must not appear in the response header together.
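For a feel of what "divided into chunks" looks like on the wire, here is a sketch of the encoding side. The format comes from the HTTP/1.1 specification rather than from the text above: each chunk is its size in hexadecimal, a CRLF, the data, and another CRLF, and a zero-size chunk marks the end. encodeChunked is a hypothetical helper; in real Node code the http module does this for you when you call res.write() without setting Content-Length.

```javascript
// Encode a body as HTTP/1.1 chunks of at most chunkSize bytes each.
function encodeChunked(body, chunkSize) {
  if (!(chunkSize > 0)) throw new RangeError('chunkSize must be positive');

  const data = Buffer.from(body);
  const pieces = [];

  for (let offset = 0; offset < data.length; offset += chunkSize) {
    const chunk = data.subarray(offset, offset + chunkSize);
    pieces.push(
      Buffer.from(chunk.length.toString(16) + '\r\n'), // chunk size in hex
      chunk,                                            // chunk data
      Buffer.from('\r\n')
    );
  }

  pieces.push(Buffer.from('0\r\n\r\n')); // zero-size chunk ends the body
  return Buffer.concat(pieces);
}

// JSON.stringify makes the CRLFs visible in the output.
console.log(JSON.stringify(encodeChunked('hello world', 4).toString()));
// "4\r\nhell\r\n4\r\no wo\r\n3\r\nrld\r\n0\r\n\r\n"
```

Because every chunk announces its own size, the sender never needs to know the total length in advance, which is exactly why Content-Length has no place next to it.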

8. Range request

With chunked transfer we can send a huge amount of data piece by piece, which solves the problem of large files getting stuck in transit. Let's stay with video as the example. You are happily watching a TV series on Tencent Video or iQiyi when an in-video advertisement pops up, or you want to skip the opening and ending credits, so you drag the progress bar to skip that part. Implementing this requires range requests.

In other words, we want to fetch a particular piece of a large file, and chunked transfer cannot do that. Chunked transfer only splits the data into chunks and sends them in order from the beginning; it gives us no way to ask for a specific range of the data.

To solve such a problem, you need to use a range request, which allows the client to use a dedicated field in the request header to indicate that only a part of the entire file is obtained.

The range request is not a necessary function of the web server, it can be implemented or not, so the server must use the field "Accept-Ranges: bytes" in the response header to clearly inform the client that I support the range request. If it is not supported, you can use "Accept-Ranges: none" to inform the client, or simply do not send the Accept-Ranges field.

The client sends the range in the "Range" request header, in the format "bytes=x-y", where x and y are byte offsets. Offsets count from 0, so for example 0-9 refers to the first 10 bytes, and so on.

When the server receives the Range field, it will do four things:

First, the server checks whether the range you sent is valid. If it is not, the server directly sends you a 416, telling you that the requested range cannot be satisfied.

Second, if the range is valid, the server reads that fragment of the file and returns the status code 206, Partial Content, indicating that only part of the original data is being returned.

Third, along with the partial data, the server adds a Content-Range response header to tell the client the actual offsets of the fragment and the total size of the resource. The format is "bytes x-y/length".

The last thing is to send the data.
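The four steps can be sketched as a single function. This is a simplified illustration, not a complete implementation: handleRange is a hypothetical name, only the "bytes=x-y" and "bytes=x-" forms are handled, and anything it cannot parse is answered with 416 (a real server might instead ignore the header and send the whole file).

```javascript
// file is a Buffer holding the whole resource; returns { status, headers, body }.
function handleRange(rangeHeader, file) {
  const total = file.length;
  const unsatisfiable = {
    status: 416,
    headers: { 'Content-Range': `bytes */${total}` }, // tells the client the real size
    body: Buffer.alloc(0),
  };

  // Step 1: check the range.
  const match = /^bytes=(\d+)-(\d*)$/.exec(rangeHeader.trim());
  if (!match) return unsatisfiable;

  const start = Number(match[1]);
  // "bytes=10-" runs to the last byte; an end past the file is clamped.
  const end = match[2] === '' ? total - 1 : Math.min(Number(match[2]), total - 1);
  if (start >= total || start > end) return unsatisfiable;

  // Steps 2 to 4: 206, Content-Range, and the fragment itself.
  return {
    status: 206,
    headers: {
      'Content-Range': `bytes ${start}-${end}/${total}`,
      'Content-Length': String(end - start + 1),
    },
    body: file.subarray(start, end + 1), // end is inclusive in HTTP, exclusive in subarray
  };
}

const sample = Buffer.from('0123456789abcdef');
const partial = handleRange('bytes=0-9', sample);
console.log(partial.status);                   // 206
console.log(partial.headers['Content-Range']); // bytes 0-9/16
console.log(partial.body.toString());          // 0123456789
```

Note the off-by-one trap: both ends of an HTTP range are inclusive, so "0-9" is 10 bytes.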

Range requests are not only what makes dragging a video's progress bar work; multi-segment downloads and resumable downloads are also built on them. We will talk about this in the next chapter.

9. Multi-segment data

Building on range requests, we can also request more than one range at a time, that is, several pieces of data in one request. This needs a special MIME type, multipart/byteranges, which says that the body is made up of multiple byte ranges, plus a boundary=xxx parameter that gives the separator between the segments, like this:

Content-Type: multipart/byteranges; boundary=00000000001

This boundary=00000000001 is the separator: each segment starts with --00000000001, and the last segment is followed by --00000000001--.

Hmm... theory alone is bound to feel a bit vague. We will see clearly what this looks like when we practice it in the next article.
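As a rough preview of the shape of such a body, the sketch below assembles one from a list of ranges. buildByteranges is a hypothetical helper, and the file is treated as text only so the result is readable. Per the HTTP specification, each segment carries its own Content-Type and Content-Range headers, followed by a blank line and the data.

```javascript
// ranges is a list of [start, end] pairs with inclusive byte offsets.
function buildByteranges(file, ranges, boundary, contentType) {
  const lines = [];
  for (const [start, end] of ranges) {
    lines.push(`--${boundary}`); // every segment starts with --boundary
    lines.push(`Content-Type: ${contentType}`);
    lines.push(`Content-Range: bytes ${start}-${end}/${file.length}`);
    lines.push(''); // blank line between segment headers and data
    lines.push(file.subarray(start, end + 1).toString());
  }
  lines.push(`--${boundary}--`); // closing mark after the last segment
  return lines.join('\r\n') + '\r\n';
}

const doc = Buffer.from('0123456789abcdef');
const multipartBody = buildByteranges(doc, [[0, 3], [10, 15]], '00000000001', 'text/plain');
console.log(multipartBody);
```

The response that carries this body would use status 206 and the Content-Type: multipart/byteranges; boundary=00000000001 header shown above.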

Summary

In this article we mainly talked about two things: files, and transferring files. For files, we covered data types, language types, compression, and so on. For transferring files there are two main parts, chunked transfer and range requests. That's all there is to it, and it's all theory.

In addition, I want to stress one thing about language types and internationalization in section 4. Internationalization in HTTP is about the language of the data you transmit. It is not the same thing as the internationalization plugins we use in front-end single-page applications.