Ten Questions About HTTP(S) and RPC - Network Knowledge Part 3

2022.08.04

The previous part of this network series covered the transport layer, but did not analyze TCP flow control, error control, and congestion control in depth; I will write a separate article on those later. Today we will look at the HTTP(S) protocol and RPC.

picture

Why learn the HTTP(S) protocol, and why learn RPC?

The HTTP(S) protocol is the most widely used and most common protocol on the Internet. Nearly every web page we open and every website we visit is delivered over HTTP(S). Learning how HTTP(S) interactions work is a vital help in understanding how web pages are transmitted.

RPC stands for Remote Procedure Call. Most back-end microservice architectures in the industry today are built on the idea of RPC. RPC mainly solves the problem of calling between services in a distributed system: it makes a remote call as convenient as a local call, so that the caller does not have to be aware of the remote-call logic. For back-end programmers, understanding what RPC is is a prerequisite for understanding how a microservice architecture is implemented.

What is the HTTP(S) protocol and what is RPC?

  • HTTP is short for HyperText Transfer Protocol, a protocol for transferring hypertext from a World Wide Web (WWW) server to a local browser.
  • HTTP is a communication protocol that transfers data (HTML files, image files, query results, etc.) on top of TCP/IP.
  • HTTP is a stateless protocol belonging to the application layer.
  • HTTP works on a client-server architecture. Acting as an HTTP client, the browser sends its requests to the HTTP server (the web server) through a URL. The web server sends a response back to the client according to the request it received.
  • HTTPS (full name: HyperText Transfer Protocol over Secure Socket Layer) is a secure HTTP channel. It adds transmission encryption and identity authentication on top of HTTP to protect the data in transit. The security of HTTPS rests on SSL, so the details of the encryption are handled by SSL.
  • HTTPS uses a different default port from HTTP (443 instead of 80) and adds an encryption and authentication layer.
  • SSL (Secure Socket Layer): developed by Netscape in 1994, the SSL protocol sits between TCP/IP and the various application-layer protocols and provides security support for data communication.
  • TLS (Transport Layer Security): the successor to SSL. The early versions (SSL 1.0, SSL 2.0, and SSL 3.0) were developed by Netscape. In 1999 the IETF standardized the protocol and renamed it TLS, starting from protocol version 3.1 (TLS 1.0). So far there are four versions: TLS 1.0, TLS 1.1, TLS 1.2, and TLS 1.3.

The following figure represents a simple diagram of an HTTP request:

picture

The following figure represents a simple illustration of an HTTPS request:

picture

  • RPC (Remote Procedure Call) allows a program to call a remote service as if it were a local one. RPC can be divided into two parts: the call interface exposed to the user and the underlying network protocol.

The following is a simple diagram of the RPC protocol:

picture

What are the characteristics of the HTTP(S) protocol, and what are the characteristics of RPC?

HTTP

  • Simple: when a client requests a service from a server, it only needs to send the request method and the path. Commonly used request methods are GET, HEAD, and POST. Each method specifies a different kind of interaction between the client and the server.
  • Flexible and extensible: HTTP allows data objects of any type to be transmitted. The type being transferred is marked by Content-Type.
  • Stateless: HTTP is a stateless protocol, meaning the protocol itself keeps no memory of earlier requests; each request is handled independently.
  • Supports both B/S (Browser/Server) and C/S (Client/Server) modes.

HTTPS

  • More secure: HTTPS keeps transmitted information confidential and protects user data. It also protects the server to a certain extent, because malicious attacks and data forgery become much more expensive.
  • Slower page loading: HTTPS needs extra handshake round trips before any data is sent, which can increase page loading time by nearly 50%.

RPC

  • Simple to call: remote calls behave like local calls.
  • Data is passed through serialization and deserialization.
  • The receiving side uses the passed data to locate the interface method and its parameters, typically through reflection.
  • Supports concurrent requests handled by multiple threads.
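
The serialization and reflection points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Calculator` service, the JSON wire format, and the helper names are hypothetical and not taken from any particular RPC framework.

```python
import json


class Calculator:
    """Hypothetical service whose methods are exposed for remote calls."""

    def add(self, a, b):
        return a + b


def serialize_call(method, *args):
    # Caller side: turn the call into bytes that could cross the network.
    return json.dumps({"method": method, "args": list(args)}).encode("utf-8")


def handle_call(service, payload):
    # Callee side: deserialize, then locate the method by name (reflection).
    request = json.loads(payload.decode("utf-8"))
    method = getattr(service, request["method"])
    result = method(*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def deserialize_result(payload):
    # Caller side again: turn the response bytes back into a value.
    return json.loads(payload.decode("utf-8"))["result"]


request_bytes = serialize_call("add", 2, 3)
response_bytes = handle_call(Calculator(), request_bytes)
print(deserialize_result(response_bytes))
```

Real frameworks usually replace JSON with a compact binary format and generate the lookup code ahead of time, but the division of work is the same.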

What is the HTTP(S) protocol message like? What is the RPC protocol message like?

The HTTP request message and the HTTPS request message are basically the same. An HTTP/2 request differs in the request header part; compare the example pictures below for details. In order, an HTTP request is divided into three parts:

  • Request line (General): request method, request URL, HTTP protocol version.
  • Request header: passes data as key-value pairs; see the request header fields, general header fields, and entity header fields below for details.
  • Request body (Payload): empty if the method is GET; if the method is POST, it usually holds the data being submitted.
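
To make the three parts concrete, here is a minimal Python sketch that splits a raw HTTP/1.1 request into request line, headers, and body. The sample request (host `example.com`, path `/login`, form fields) is made up for illustration, and the parser ignores edge cases such as repeated headers.

```python
def parse_request(raw: bytes):
    # The header section and the body are separated by one empty line (CRLF CRLF).
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Part 1: request line = method, request target, protocol version.
    method, target, version = lines[0].split(" ", 2)

    # Part 2: headers as key-value pairs (names are case-insensitive).
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # Part 3: body (empty for a typical GET).
    return (method, target, version), headers, body


raw_request = (
    b"POST /login HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 17\r\n"
    b"\r\n"
    b"user=alice&pw=123"
)

request_line, headers, body = parse_request(raw_request)
print(request_line)
print(headers["host"], headers["content-length"])
print(body)
```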

The specific message format is as follows:

picture

Let's look at an example. The following picture shows a request to Baidu's domain name:

picture

The picture below shows a request to my own domain name, zengzhihai.com. My website uses the HTTP/2 protocol, so there are some differences in the request headers, such as the :authority pseudo-header; the other differences are small.

picture

The response messages of HTTP and HTTPS are also basically the same and are divided into three parts.

  • Response line (General): status code, HTTP protocol version
  • Response Headers: Pass data in the form of key-value pairs, see the response header field, general header field, and entity header field for details.
  • Response body (Response): It contains the content of the response. It can contain HTML code, pictures, etc. The body consists of bytes of data that are transported immediately following the header in the HTTP message.
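
The same three-part layout can be sketched for a response. The helper names below are hypothetical; the sketch builds a small HTTP/1.1 response and splits it back apart, showing that the body is just the bytes that follow the blank line after the headers.

```python
def build_response(status: int, reason: str, body: bytes, content_type: str) -> bytes:
    # Response line, then headers, then an empty line, then the body bytes.
    status_line = f"HTTP/1.1 {status} {reason}\r\n"
    headers = (
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
    )
    return (status_line + headers + "\r\n").encode("ascii") + body


def split_response(raw: bytes):
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = status_line.split(" ", 2)
    return version, int(status), reason, header_lines, body


raw_response = build_response(200, "OK", b"<h1>hi</h1>", "text/html")
version, status, reason, header_lines, body = split_response(raw_response)
print(version, status, reason)
print(header_lines)
print(body)
```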

The specific message format is as follows:

picture

For example, the response example of visiting zengzhihai.com is as follows:

picture

HTTP common header fields

Common (general) header fields are headers that are used by both request messages and response messages.

Cache request directives:

picture

Cache response directives:

picture

Request header fields

Request header fields are used in request messages sent from the client to the server. They supply additional information about the request, information about the client, and the client's preferences for the response content.

Request headers tell the server about the client's request. Typical request headers are:

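Header field | Description
Content-Length | The length of the request message body.
Host | The host name being requested. In the HTTP/1.1 specification, Host is the only header field that must be included in every request.
Accept | Tells the server which media types the user agent can handle and their relative priority. Several media types can be specified at once in type/subtype form.
Accept-Charset | Tells the server which character sets the user agent supports and their relative priority. Several character sets can be specified at once. As with Accept, a q weight value expresses relative priority.
Accept-Encoding | Tells the server which content encodings the user agent supports and their order of priority. Several content encodings can be specified at once.
Accept-Language | Tells the server which natural languages (such as Chinese or English) the user agent can handle and their relative priority.
Authorization | Tells the server the user agent's authentication information (credential value).
Referer | Tells the server the URI of the resource from which the request originated. Clients usually send the Referer header, but it may be omitted when the URI is typed directly into the browser's address bar or for security reasons.
User-Agent | Conveys information such as the name of the browser or user agent that created the request.
Connection | Lets the client and server specify options related to the request/response connection, for example Keep-Alive to keep the connection open. HTTP/2 does not have this option.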
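
The q weight used by Accept and the related Accept-* headers can be illustrated with a small parser. This is a simplified sketch: the function name is hypothetical, and it only handles the common `type/subtype;q=value` form, treating a missing q as 1.0.

```python
def parse_accept(header_value: str):
    entries = []
    for part in header_value.split(","):
        media_type, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # default weight when no q parameter is present
        for param in params:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal weight keep their original order.
    return sorted(entries, key=lambda item: item[1], reverse=True)


preferences = parse_accept("text/plain;q=0.5, text/html, application/json;q=0.9")
for media_type, q in preferences:
    print(media_type, q)
```

A server can walk this list from the top and answer with the first media type it is able to produce.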

Response header fields

Response header fields are used in response messages returned by the server to the client. They supply additional information about the response, information about the server, and additional requirements for the client. Typical response headers are:

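Header field | Description
Location | Directs the recipient of the response to a resource at a location different from the request URI.
Server | Tells the client about the HTTP server application installed on the server. It may include not only the software name but also the version number and options enabled at installation.
Transfer-Encoding | Tells the browser the transfer format of the data.
Age | Tells the client how long ago the origin server created the response. The value is in seconds.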

Entity header fields

Entity header fields are headers that describe the entity part carried in request and response messages. They supply entity-related information such as when the content was last updated. Typical entity header fields are:

Header field | Description
Allow | Tells the client all the HTTP methods supported by the resource identified by the Request-URI.
Content-Encoding | Tells the client which content encoding the server applied to the entity body.
Content-Length | The size of the entity body, in bytes.
Content-Language | Tells the client the natural language (such as Chinese or English) used by the entity body.
Content-Type | The media type of the object in the entity body. As with Accept, the value is given in type/subtype form.
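
As a rough illustration of how these entity headers describe the body, the sketch below gzip-compresses a small HTML fragment, fills in the matching headers, and then uses only those headers to recover the text. The function names and the sample content are assumptions made for this example.

```python
import gzip


def build_entity(text: str):
    # Sender side: encode and compress the body, then describe it in headers.
    compressed = gzip.compress(text.encode("utf-8"))
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Encoding": "gzip",
        "Content-Length": str(len(compressed)),  # size of the bytes actually sent
        "Content-Language": "en",
    }
    return headers, compressed


def read_entity(headers, body: bytes) -> str:
    # Receiver side: rely on the headers to interpret the raw bytes.
    if len(body) != int(headers["Content-Length"]):
        raise ValueError("body length does not match Content-Length")
    if headers.get("Content-Encoding") == "gzip":
        body = gzip.decompress(body)
    charset = headers["Content-Type"].split("charset=")[-1]
    return body.decode(charset)


entity_headers, entity_body = build_entity("<p>hello</p>")
print(entity_headers)
print(read_entity(entity_headers, entity_body))
```

Note that Content-Length counts the compressed bytes on the wire, not the size of the original text.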

RPC is a protocol for remote procedure calls: a program uses it to request a service from a program on another computer without needing to understand the underlying network technology.

What is a complete HTTPS request transmission process, and what is a complete RPC transmission process?

HTTPS is essentially HTTP with certificate verification added, so here I only share the request transmission process of HTTPS.

A complete HTTPS process has 13 steps:

  1. The user requests a domain name through a browser or other client.
  2. A DNS server resolves the domain name and returns the IP address.
  3. The client sends its request to the server at that IP address.
  4. The server returns its certificate (which includes the public key).
  5. The client or browser verifies that the certificate is valid.
  6. The client or browser generates a random symmetric key A.
  7. The client or browser encrypts symmetric key A with the public key.
  8. The client or browser sends the encrypted symmetric key A.
  9. The server decrypts symmetric key A with its private key.
  10. The server encrypts the data with the decrypted symmetric key A.
  11. The server sends the encrypted data.
  12. The client decrypts the data with symmetric key A and reads it.
  13. From then on, all content is transmitted encrypted with the symmetric key.
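
Steps 6 to 12 can be sketched as a toy program. Everything in it is a deliberately insecure stand-in chosen so the example fits in a few lines: the key pair is the classic textbook RSA example with tiny primes, and the "symmetric cipher" is a SHA-256-based XOR keystream. Real HTTPS uses TLS with proper algorithms; this only shows how the public key protects the symmetric key, which then protects the data.

```python
import hashlib
import secrets

# TOY RSA key pair (textbook values, far too small for real use).
P, Q = 61, 53
N = P * Q   # 3233, part of the public key
E = 17      # public exponent, part of the public key
D = 2753    # private exponent: (E * D) % ((P - 1) * (Q - 1)) == 1


def keystream(key: int, length: int) -> bytes:
    # Stretch the shared key into `length` pseudo-random bytes (toy construction).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(f"{key}:{counter}".encode()).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: int, data: bytes) -> bytes:
    # XOR with the keystream; applying it twice with the same key restores the data.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def demo_exchange(message: bytes) -> bytes:
    # Steps 6-8: the client picks symmetric key A and encrypts it with the public key.
    key_a = secrets.randbelow(N - 2) + 2
    encrypted_key = pow(key_a, E, N)
    # Step 9: the server recovers key A with its private key.
    server_key = pow(encrypted_key, D, N)
    # Steps 10-11: the server encrypts the data with key A and sends it.
    ciphertext = xor_cipher(server_key, message)
    # Step 12: the client decrypts with its own copy of key A.
    return xor_cipher(key_a, ciphertext)


print(demo_exchange(b"hello over https"))
```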

The specific schematic diagram is as follows:

picture

Why is data transmission using symmetric encryption?

  • Asymmetric encryption is very slow at encrypting and decrypting. In HTTP scenarios there is usually a lot of interaction between the two ends, so the cost of using asymmetric encryption for all of it would be unacceptable.
  • In the HTTPS scenario, only the server saves the private key, and a pair of public and private keys can only implement one-way encryption and decryption. Therefore, the content transmission encryption in HTTPS adopts symmetric encryption instead of asymmetric encryption.

Why do you need a CA certification authority to issue certificates?

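  • HTTP is considered insecure because the transmission can easily be captured by eavesdroppers or intercepted by a forged server; HTTPS mainly solves this network-transmission security problem. For that to work, the client must be sure that the public key it receives really belongs to the server it intended to reach. A certificate signed by a trusted CA binds the public key to the domain name, so a forged server cannot simply substitute its own key.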
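
On the client side, certificate checking is usually delegated to a TLS library. The sketch below uses Python's standard `ssl` module: the default client context requires a certificate that chains to a trusted CA and checks that it matches the host name. The function `open_verified_connection` is a hypothetical helper that is defined but not called here, since calling it needs network access.

```python
import socket
import ssl

# Default client-side settings: trusted system CAs are loaded,
# a valid certificate is required, and the host name must match.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)


def open_verified_connection(host: str, port: int = 443) -> ssl.SSLSocket:
    # The handshake raises ssl.SSLCertVerificationError if the certificate is
    # not signed by a trusted CA or does not match `host`.
    raw_socket = socket.create_connection((host, port))
    return context.wrap_socket(raw_socket, server_hostname=host)
```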

As mentioned above, RPC is a protocol for remote calls, and different frameworks may implement it differently. At present, the main RPC frameworks for Java and Go in the industry include gRPC, Thrift, and Dubbo. Here I mainly share how RPC is implemented in Go's gRPC framework.

gRPC was released by Google in 2015, aimed largely at mobile application development. It is designed on the HTTP/2 protocol standard, uses ProtoBuf (Protocol Buffers) as its serialization protocol, and supports many programming languages.

The RPC call process in gRPC has the following main steps:

  1. The client application encapsulates the request and encodes the message
  2. The request is passed to the client stub
  3. The client RPCRuntime packs it into a communication packet
  4. The request is sent over the network
  5. The server RPCRuntime receives the communication packet
  6. The packet is passed to the server-side (provider) stub
  7. The server stub decapsulates the request, decodes the message, and hands it to the server application
  8. The server application encapsulates the response and encodes the result message
  9. The result is passed to the server stub
  10. The server RPCRuntime packs it into a communication packet
  11. The result is sent over the network
  12. The client RPCRuntime receives the communication packet
  13. The packet is passed to the client stub
  14. The client stub decapsulates the result and decodes the message, and the client application receives the response
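
The round trip above can be imitated in a single process. This is not gRPC itself: the sketch swaps ProtoBuf for JSON, replaces the network with a plain function call, and uses hypothetical names (`GreeterService`, `ClientStub`, `server_handle`). The one detail borrowed from gRPC is the message framing, a 1-byte compression flag followed by a 4-byte big-endian length.

```python
import json
import struct


def frame(message: bytes) -> bytes:
    # gRPC-style prefix: compression flag (0 = uncompressed) + message length.
    return struct.pack(">BI", 0, len(message)) + message


def unframe(data: bytes) -> bytes:
    _flag, length = struct.unpack(">BI", data[:5])
    return data[5:5 + length]


class GreeterService:
    """Hypothetical server application."""

    def say_hello(self, name):
        return f"Hello, {name}"


def server_handle(service, wire_bytes: bytes) -> bytes:
    # Server side: unpack the packet, decode the request, call the application.
    request = json.loads(unframe(wire_bytes))
    result = getattr(service, request["method"])(*request["args"])
    # Encode the result and pack it for the trip back.
    return frame(json.dumps({"result": result}).encode("utf-8"))


class ClientStub:
    """Makes a remote call look like a local method call."""

    def __init__(self, transport):
        self._transport = transport  # stands in for the network

    def __getattr__(self, method_name):
        def remote_call(*args):
            payload = json.dumps({"method": method_name, "args": args}).encode("utf-8")
            reply = self._transport(frame(payload))      # send request, wait for reply
            return json.loads(unframe(reply))["result"]  # decode the result
        return remote_call


service = GreeterService()
stub = ClientStub(lambda wire: server_handle(service, wire))
print(stub.say_hello("gopher"))
```

The caller only ever touches `stub.say_hello(...)`; encoding, framing, and transport stay hidden behind the stub, which is the point of the stub and runtime layers in the list above.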

The gRPC call flow chart is as follows:

picture