How the ETag value in the HTTP response header is generated

2023.03.08


ETag generation needs to satisfy several conditions, at least loosely:

  1. When the file changes, the ETag value must change.
  2. It should be as cheap as possible to calculate and not particularly CPU-intensive. Generating it with a digest algorithm (MD5, SHA-1, SHA-256) needs careful consideration, because hashing the whole file is a CPU-intensive operation.
  3. It must scale horizontally: in a distributed deployment, the ETag values generated on different server nodes must be consistent. This rules out the inode, which differs from node to node.
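
For contrast, here is a minimal sketch of the digest approach that condition 2 warns about. The helper name `digest_etag` and the choice of SHA-256 are assumptions for illustration only; this is not what nginx does. It meets conditions 1 and 3 because the value depends only on the bytes, but it has to read and hash every byte of the content.

```python
import hashlib


def digest_etag(chunks):
    """Hypothetical helper: quoted SHA-256 hex digest of the content.

    `chunks` is any iterable of bytes objects, e.g. blocks read from a file.
    """
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(chunk)  # cost grows with the total number of bytes hashed
    return '"%s"' % h.hexdigest()
```

The same bytes give the same value on every node, however they are chunked, which is exactly the property the inode lacks. The price is the full read of the file.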

These conditions are theoretical, so how are they handled in practice?

Let's take a look at how nginx does it.

Generation of ETag in nginx

I read the nginx source code and translated it into pseudocode. The ETag is spliced together from last_modified and content_length:

etag = hex(header.last_modified) + "-" + hex(header.content_length)

The corresponding source code is in ngx_http_core_module.c:

etag->value.len = ngx_sprintf(etag->value.data, "\"%xT-%xO\"",
                                  r->headers_out.last_modified_time,
                                  r->headers_out.content_length_n)
                      - etag->value.data;

Summary: the ETag in nginx consists of the Last-Modified and Content-Length of the response header, each written in hexadecimal and joined by a hyphen.
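
The format string above can be mirrored in a few lines of Python. This is only a sketch of the same formatting, not nginx code; the function name `nginx_style_etag` is made up for illustration, and the sample values are a Unix timestamp in seconds and a length in bytes.

```python
def nginx_style_etag(last_modified_time, content_length):
    """Lowercase hex mtime, a hyphen, lowercase hex length, all inside quotes."""
    return '"%x-%x"' % (last_modified_time, content_length)


# 1556014701 is 2019-04-23T10:18:21Z as a Unix timestamp; 612 is a byte count.
print(nginx_style_etag(1556014701, 612))  # "5cbee66d-264"
```

Both inputs are metadata the server already has, so no file content needs to be read to build the value.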

To test this, I picked an nginx service in my k8s cluster:

$ curl --head 10.97.109.49
HTTP/1.1 200 OK
Server: nginx/1.16.0
Date: Tue, 10 Dec 2019 06:45:24 GMT
Content-Type: text/html
Content-Length: 612
Last-Modified: Tue, 23 Apr 2019 10:18:21 GMT
Connection: keep-alive
ETag: "5cbee66d-264"
Accept-Ranges: bytes

Working backwards, Last-Modified and Content-Length can be recovered from the ETag. Calculating them in JS gives results consistent with the response headers:

> new Date(parseInt('5cbee66d', 16) * 1000).toJSON()
"2019-04-23T10:18:21.000Z"
> parseInt('264', 16)
612
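
The same check can be scripted so that the decoded values are directly comparable with the header strings. The sketch below is a Python equivalent under the assumption that the ETag has the nginx `"<hex>-<hex>"` shape; `decode_nginx_etag` is a hypothetical helper name.

```python
from email.utils import formatdate


def decode_nginx_etag(etag):
    """Split an nginx-style ETag into (Last-Modified string, Content-Length)."""
    if etag.startswith("W/"):  # drop a weak-validator prefix if present
        etag = etag[2:]
    mtime_hex, size_hex = etag.strip('"').split("-")
    # formatdate(..., usegmt=True) renders the timestamp as an HTTP date.
    last_modified = formatdate(int(mtime_hex, 16), usegmt=True)
    return last_modified, int(size_hex, 16)


print(decode_nginx_etag('"5cbee66d-264"'))
# ('Tue, 23 Apr 2019 10:18:21 GMT', 612)
```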

ETag algorithm in nginx and its shortcomings

Cache negotiation decides whether the server can answer a request with 304 Not Modified. There are two ways to negotiate the cache:

  • Last-Modified/If-Modified-Since
  • ETag/If-None-Match
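
To make the two mechanisms concrete, here is a generic sketch of how a server might pick between 200 and 304. It is not nginx's code: the function name `negotiate` and the lower-case header dict are assumptions of this example. It follows the HTTP rule that If-None-Match takes precedence over If-Modified-Since when both are present.

```python
from email.utils import parsedate_to_datetime


def _strip_weak(tag):
    # Weak comparison: "W/" prefixes are ignored on both sides.
    return tag[2:] if tag.startswith("W/") else tag


def negotiate(request_headers, etag, last_modified):
    """Return 304 if the client's validator still matches, else 200."""
    if_none_match = request_headers.get("if-none-match")
    if if_none_match is not None:
        # Naive split: assumes no commas inside the ETag values themselves.
        candidates = [t.strip() for t in if_none_match.split(",")]
        if "*" in candidates:
            return 304
        wanted = _strip_weak(etag)
        return 304 if any(_strip_weak(t) == wanted for t in candidates) else 200

    if_modified_since = request_headers.get("if-modified-since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
            current = parsedate_to_datetime(last_modified)
            return 304 if current <= since else 200
        except (TypeError, ValueError):
            return 200  # unparseable or incomparable dates: ignore the condition

    return 200
```

A request that carries a stale ETag gets 200 even if its If-Modified-Since date would have matched, because the date is not consulted once If-None-Match is present.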

Since the ETag in nginx is composed of Last-Modified and Content-Length, it can be regarded as an enhanced version of Last-Modified. So where is the enhancement?

Last-Modified has one-second resolution, so it can only detect changes at the granularity of a second. The ETag in nginx adds the file size as an extra condition, so a change within the same second is still detected as long as it alters the length.

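Then the next question: if the ETag value in the HTTP response header changes, does it mean that the file content must have changed?

Answer: no. The ETag includes the modification time, so rewriting a file with identical content still produces a new ETag.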

Conversely, using nginx to calculate 304 has a limitation: if a file is modified again within the same second and its size remains unchanged, the ETag stays the same and the client is told the resource has not changed. The probability of this happening is extremely low, so a less than perfect but efficient algorithm can be tolerated under normal circumstances.
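
Both directions of this mismatch can be reproduced on a local file. The sketch below is an illustration under assumptions: `nginx_style_etag` is a hypothetical helper that imitates the nginx format from `os.stat`, and the timestamp is an arbitrary whole second forced onto the file with `os.utime`.

```python
import os
import tempfile


def nginx_style_etag(path):
    """Imitate the nginx format: hex mtime in seconds, '-', hex size, quoted."""
    st = os.stat(path)
    return '"%x-%x"' % (int(st.st_mtime), st.st_size)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.html")
        fixed_mtime = 1556014701  # arbitrary whole-second timestamp

        with open(path, "wb") as f:
            f.write(b"hello")
        os.utime(path, (fixed_mtime, fixed_mtime))
        before = nginx_style_etag(path)

        # Same length, different bytes, same second: the ETag does not change.
        with open(path, "wb") as f:
            f.write(b"world")
        os.utime(path, (fixed_mtime, fixed_mtime))
        same_second = nginx_style_etag(path)

        # Same bytes, later mtime: the ETag changes although the content did not.
        os.utime(path, (fixed_mtime + 1, fixed_mtime + 1))
        touched = nginx_style_etag(path)

        return before, same_second, touched
```

In `demo()`, `before` and `same_second` are equal even though the content differs, while `touched` differs from `same_second` even though the content is identical.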

This article comes from Front-end Restaurant. If you want to reprint it, please contact Front-end Restaurant ReTech on Today's Headlines.