Interview Blitz: Why HTTPS? What Is It For?
Most people are no strangers to HTTPS, because most of the websites we use every day are served over it, such as the following:
So the question is, why are they using HTTPS? What's so great about HTTPS?
1. HTTP
Before talking about HTTPS, we need to understand HTTP, because HTTP is the foundation of HTTPS communication. HTTP (HyperText Transfer Protocol) is used to transmit data between a client and a server. It is simple and convenient to use, but it has the following three fatal problems:
Communication is in clear text, so the content can be eavesdropped on.
The identity of the other party is not verified, so that party may be an impostor.
The integrity of a message cannot be proven, so it may have been tampered with in transit.
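To make the first problem concrete, here is a minimal sketch of what an HTTP/1.1 request looks like on the wire. The host `example.com`, the `/login` path, and the form fields are made up for illustration, and the form values are not URL-encoded to keep the example short. The only point is that anyone who can observe the bytes can read them.

```python
# Build the raw bytes of a hypothetical HTTP/1.1 login request.
# Nothing here is encrypted: these are the bytes that cross the network.

def build_login_request(host: str, username: str, password: str) -> bytes:
    body = f"username={username}&password={password}".encode("ascii")
    head = (
        "POST /login HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body


def eavesdrop(wire_bytes: bytes) -> str:
    # An observer on the network path needs no key: decoding is enough.
    return wire_bytes.decode("ascii")


request = build_login_request("example.com", "alice", "hunter2")
print(eavesdrop(request))
```

The printed text contains the password verbatim, which is exactly the eavesdropping risk described above.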
In view of these problems, most systems today use HTTPS instead of HTTP.
2. HTTPS
First of all, HTTPS is not a new protocol. It is HTTP with an encryption layer, SSL (Secure Sockets Layer) or TLS (Transport Layer Security), added underneath it. HTTPS = HTTP + encryption + authentication + integrity protection.
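One way to see that HTTPS is "HTTP plus a security layer" is Python's standard library, where the HTTPS client class is a subclass of the HTTP one and mainly changes the transport. This is a small sketch, with `example.com` as a placeholder host. Creating the objects does not open a connection.

```python
import http.client

# Both objects speak the same HTTP; only the transport underneath differs.
plain = http.client.HTTPConnection("example.com")    # HTTP directly over TCP
secure = http.client.HTTPSConnection("example.com")  # HTTP over TLS over TCP

print(plain.port, secure.port)
print(issubclass(http.client.HTTPSConnection, http.client.HTTPConnection))
```

This prints `80 443` and `True`: the default ports differ, but the request and response API is inherited unchanged.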
SSL and TLS
SSL (Secure Sockets Layer) was first developed by the browser vendor Netscape, which produced SSL 3.0 and the versions before it, and then handed SSL over to the IETF (Internet Engineering Task Force). The IETF developed TLS 1.0 on the basis of SSL 3.0, so TLS can be considered the "new version" of SSL.
2.1 Solving the trust problem
The first thing HTTPS has to solve is the problem of trust, that is, identity verification. If the trust problem is not solved, a server can be impersonated, which is the "man-in-the-middle attack". Under normal circumstances, the client and the server interact directly. In a man-in-the-middle attack, a "bad guy" (the man in the middle) inserts himself between the client and the server in order to steal and tamper with the content of their communication, as shown in the following figure:
HTTPS solves the trust problem with digital certificates. When a server is set up, its operator first applies for a digital certificate from a trusted third party recognized by everyone (a certificate authority, or CA). When a client then accesses the server, the server first presents this certificate to prove that it is the genuine server rather than a "man in the middle". The browser is responsible for checking the validity of the certificate. If there is a problem with it, the client stops the communication immediately. If there is no problem, the subsequent steps are carried out, as shown in the following figure:
With the digital certificate, the real identity of the server can be verified, which solves the problems of impersonation and "man-in-the-middle attack".
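The check that a browser performs can be approximated with Python's `ssl` module. This is a minimal sketch, not how any particular browser is implemented. The helper names `make_client_context` and `fetch_verified_certificate` are made up for this example, and the second one needs network access when called.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    # The default context trusts the system's CA certificates, requires the
    # server to present a certificate, and checks it against the host name.
    return ssl.create_default_context()


def fetch_verified_certificate(host: str, port: int = 443) -> dict:
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        # The TLS handshake runs inside wrap_socket. An untrusted, expired,
        # or mismatched certificate raises ssl.SSLCertVerificationError,
        # so no application data is ever sent to an unverified server.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.getpeercert()


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling `fetch_verified_certificate("example.com")` (a placeholder host) returns the validated certificate as a dictionary with fields such as `subject` and `notAfter`. If validation fails, the exception stops the communication, just as the text describes.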
2.2 Solving the problems of plaintext transmission and integrity
Although we have solved the trust problem above, the two parties still communicate in plain text, so the content can still be eavesdropped on in transit. What should we do? The answer is to use encryption to keep the information from being exposed.
Types of encryption
There are two main types of encryption: symmetric encryption and asymmetric encryption.
- In symmetric encryption, there is a single shared secret key, which is used both to encrypt and to decrypt information. It is fast, but because the key is shared, once the key is intercepted the encryption is worthless.
- In asymmetric encryption, there is a pair of keys: a public key and a private key. The public key can encrypt information but cannot decrypt it; only the private key can decrypt it. The server keeps the private key to itself and sends only the public key to the client, so even someone who obtains the public key cannot decrypt the encrypted information. This makes the method more secure, but asymmetric encryption is relatively slow.
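The difference between the two kinds of keys can be shown with toy stand-ins. Python's standard library ships neither AES nor RSA, so this sketch assumes two deliberately insecure substitutes: an XOR stream derived from SHA-256 for the symmetric case, and textbook RSA with tiny primes for the asymmetric case. All names here are made up for illustration, and none of this should be used to protect real data.

```python
import hashlib

# --- Toy symmetric cipher: the SAME key encrypts and decrypts. ---
def keystream(key: bytes, length: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def symmetric_crypt(key: bytes, data: bytes) -> bytes:
    # XOR with the keystream; applying it twice restores the input.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


# --- Toy asymmetric cipher: textbook RSA with tiny primes. ---
P, Q = 61, 53
N = P * Q                          # public modulus
E = 17                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def rsa_encrypt(m: int) -> int:
    return pow(m, E, N)            # uses only the public values (E, N)


def rsa_decrypt(c: int) -> int:
    return pow(c, D, N)            # requires the private exponent D


shared_key = b"shared secret"
ciphertext = symmetric_crypt(shared_key, b"hello")
print(symmetric_crypt(shared_key, ciphertext))  # the same key decrypts

encrypted = rsa_encrypt(65)
print(rsa_decrypt(encrypted))                   # only D recovers the value
```

In the symmetric half, whoever learns `shared_key` can read everything. In the asymmetric half, `E` and `N` can be published freely, because decryption needs `D`.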
So should HTTPS use symmetric or asymmetric encryption? Symmetric encryption is fast, but the key can be intercepted; asymmetric encryption is secure, but slow. Only children have to choose, adults want it all, so HTTPS uses both asymmetric and symmetric encryption. The HTTPS execution process is as follows:
- The client uses HTTPS to access the server.
- The server returns its digital certificate, which carries the public key of the server's asymmetric key pair (the private key is kept by the server itself).
- The client verifies whether the digital certificate is valid. If it is invalid, the client terminates the access. If it is valid, the client:
  - generates a shared secret key for symmetric encryption;
  - encrypts the data with the shared key (symmetric encryption);
  - encrypts the shared key itself with the server's public key (asymmetric encryption);
  - sends the encrypted key and the encrypted data to the server.
- The server uses its private key to decrypt the client's shared secret key, and then uses the shared key to decrypt the data itself.
- From then on, the client and the server exchange content encrypted with the shared secret key.
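In this way, HTTPS gets both security and efficiency: the slow asymmetric step is used only once to deliver the key, and the fast symmetric cipher carries all the traffic. (The steps above are the classic RSA-style key exchange. TLS 1.3 agrees on the shared key with an ephemeral Diffie-Hellman exchange instead, but the division of labor between asymmetric and symmetric cryptography is the same.)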
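The steps above can be sketched end to end. This is a simplified model, not the real TLS handshake. It assumes toy primitives (textbook RSA over two Mersenne primes and a SHA-256-based XOR stream standing in for a real cipher), skips certificate validation, and uses hypothetical `Server` and `Client` classes invented for this example.

```python
import hashlib
import secrets

# Toy building blocks, NOT secure. They only make the key flow visible.
P, Q = 2**127 - 1, 2**89 - 1  # two known Mersenne primes


def keystream(key: bytes, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def symmetric_crypt(key: bytes, data: bytes) -> bytes:
    # XOR stream: the same call encrypts and decrypts. Reusing one keystream
    # for every message, as done here, would be unsafe in a real protocol.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


class Server:
    def __init__(self) -> None:
        self.n, self.e = P * Q, 65537
        self._d = pow(self.e, -1, (P - 1) * (Q - 1))  # private key stays here
        self.session_key = b""

    def public_key(self) -> tuple:
        # In real HTTPS the public key arrives inside the certificate.
        return self.n, self.e

    def receive(self, encrypted_key: int, ciphertext: bytes) -> bytes:
        # Recover the shared key with the private key, then decrypt the data.
        key_int = pow(encrypted_key, self._d, self.n)
        self.session_key = key_int.to_bytes(16, "big")
        return symmetric_crypt(self.session_key, ciphertext)


class Client:
    def __init__(self, server_public_key: tuple) -> None:
        self.n, self.e = server_public_key
        self.session_key = secrets.token_bytes(16)  # the shared secret key

    def first_message(self, data: bytes) -> tuple:
        key_int = int.from_bytes(self.session_key, "big")
        encrypted_key = pow(key_int, self.e, self.n)  # asymmetric step
        return encrypted_key, symmetric_crypt(self.session_key, data)


server = Server()
client = Client(server.public_key())  # certificate check omitted in this toy
encrypted_key, ciphertext = client.first_message(b"GET /account HTTP/1.1")
print(server.receive(encrypted_key, ciphertext))

# Both sides now hold the same key and use only the fast symmetric cipher.
reply = symmetric_crypt(server.session_key, b"HTTP/1.1 200 OK")
print(symmetric_crypt(client.session_key, reply))
```

The asymmetric operation runs once, on 16 bytes of key material. Everything after that goes through `symmetric_crypt`, which is the trade-off the text describes.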
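Encryption by itself does not prove that a message is unmodified, so TLS also attaches an integrity check (a message authentication code, or an authenticated cipher mode) to every record it sends. If data is missing, added, or altered in transit, the check fails on the receiving side and an error is reported, so the integrity of the data is guaranteed as well.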
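Tamper detection of this kind is usually built on a keyed checksum. The sketch below uses HMAC-SHA256 from Python's standard library to show the idea. The record layout (message followed by a 32-byte tag) and the function names `protect` and `verify` are assumptions made for this example, not the actual TLS record format.

```python
import hashlib
import hmac


def protect(key: bytes, message: bytes) -> bytes:
    # Append a 32-byte HMAC-SHA256 tag computed over the message.
    return message + hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, record: bytes) -> bytes:
    message, tag = record[:-32], record[-32:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed: record was modified")
    return message


key = b"session key shared by both sides"
record = protect(key, b"transfer 100 to bob")
print(verify(key, record))

# An attacker without the key changes the amount in transit.
tampered = record.replace(b"100", b"900")
try:
    verify(key, tampered)
except ValueError as err:
    print(err)
```

The untouched record verifies and its message is returned. The modified one is rejected, because the attacker cannot compute a matching tag without the shared key.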
Summary
Plain HTTP suffers from plaintext communication and man-in-the-middle attacks, and HTTPS solves these problems effectively. HTTPS uses digital certificates to defeat man-in-the-middle attacks, and encryption to solve the problems of plaintext communication and data integrity.