Advanced Chapter: The new dark cloud of HTTPS is coming

2024.05.10

Over many chapters we have walked through the core pieces of the four-layer TCP/IP model. As long as a host is connected to the Internet and has a TCP/IP protocol stack installed, an application on it can send messages smoothly to an application anywhere else in the world, and the client browser can render that colorful, beautiful world for us.

We seemed to be standing under a clear, beautiful sky, comfortable and proud, enjoying the sunshine to our heart's content. Then a big dark cloud suddenly rolled in: security.

Security is a huge topic in its own right; this series focuses on the security problems of the HTTP protocol itself.

2. Weaknesses of HTTP

The HTTP protocol was not designed with security issues in mind. It was originally designed to transmit and share data.

With the popularity of web applications, a small number of hackers have taken advantage of the weaknesses of HTTP to carry out attacks, ranging from obtaining your preferences or browsing history to stealing bank accounts and selling your personal information.

This series focuses mainly on man-in-the-middle attacks. The man in the middle is an invisible hand sitting between the client and the server. With plain HTTP, neither side is aware that the middleman exists, so neither has any way to defend against him.

So what are the weaknesses of HTTP?

(1) The first weakness: the data is not encrypted, so all information travels naked across the Internet.

We know that a request passes through a large number of intermediate nodes on its way from the browser to the final server. If the data is not encrypted, any one of those nodes can capture it and, with readily available tools, read exactly what it says.
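To make this concrete, here is a minimal Python sketch of what any intermediate node could read when a login form is submitted over plain HTTP. The host name, cookie, and form fields are made up for illustration, and the request is built in memory rather than captured from a real network.

```python
from urllib.parse import parse_qs

# A plain-HTTP login request as it would be written to the TCP connection.
# Host, cookie, and credentials are hypothetical example values.
body = b"username=alice&password=hunter2"
request = b"\r\n".join([
    b"POST /login HTTP/1.1",
    b"Host: shop.example.com",
    b"Cookie: session=abc123",
    b"Content-Type: application/x-www-form-urlencoded",
    b"Content-Length: " + str(len(body)).encode("ascii"),
    b"",  # blank line separates headers from body
    body,
])


def sniff(packet: bytes) -> dict:
    """Parse captured bytes the way an eavesdropping node could."""
    head, _, payload = packet.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    headers = dict(line.split(": ", 1) for line in lines[1:])
    form = {k: v[0] for k, v in parse_qs(payload.decode("ascii")).items()}
    return {"request_line": lines[0], "headers": headers, "form": form}


seen = sniff(request)
print(seen["request_line"])
print(seen["headers"]["Cookie"])
print(seen["form"]["password"])
```

No key or special privilege is needed here: the request line, the session cookie, and the password are all ordinary text to anyone holding the bytes.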

Some readers may ask: Can I use various encryption algorithms to encrypt and decrypt content at the code level? That is, the transmitted data itself is encrypted at the application layer.

This does not fundamentally solve the problem. Even if you encrypt the data being transmitted, you cannot encrypt the HTTP headers this way, and the header information also falls within the scope of what needs protecting.

The approach is also not universal. For example, if the server encrypts the HTML before sending it to the browser, the browser knows nothing about your server's scheme and will not decrypt the data on its own before rendering the page.

(2) The second weakness: identity cannot be verified, so the peer chatting with you may be a dog.

HTTP accepts any message that complies with the HTTP specification. It never confirms the identity of either party, and it defines no standard way of doing so.

As a result, the website you actually reach may well not be the real, official one. You may have been redirected to a scammer's website by a hijack on your own computer or at a DNS server along the way.

I believe readers will feel a chill down their spines at this point. If you cannot confirm who is on the other end when you pay, do you still dare to pay? You would not even dare to enter your password, because your password and personal information may well be stolen too.

It’s really abominable and impossible to guard against!

(3) The third weakness: data can be tampered with, so "I love you" becomes "I hate you".

In a man-in-the-middle attack, the middleman who intercepts your message can not only read your private data at will, but can also alter the message before forwarding it.

And since plain HTTP gives the client and server no mechanism for checking whether data has been tampered with, the server unconditionally trusts everything it receives.

Talk about adding insult to injury. This broken Internet is simply unusable!

3. A preliminary study on cryptography

The man I love is a great hero, and one day he will come to marry me on the colorful clouds.

That hero's name is HTTPS. Before you can understand HTTPS you must master some basic cryptography, because HTTPS is essentially a combination of cryptographic algorithms.

The dark cloud is so big because the culprit behind it is cryptography, and cryptography rests on mathematics; going deep into it is a whole subject of its own. The cloud is so dark because security is no trivial matter: once something goes wrong, it goes badly wrong. That is why cryptography is both so profound and so important.

The currently popular cryptographic algorithms are public, have stood up to long periods of scrutiny, and were created by mathematicians with deep expertise. Developers can get started using them easily, even without understanding the principles behind them.

When using them, we must have this attitude:

  • There is no cryptographic algorithm in the world that can solve all security problems; each specific problem needs its own analysis.
  • There is no absolutely secure cryptographic algorithm in the world. For users of an algorithm, it is enough to make sure that the algorithm in use is still considered secure today.

So our most important task is to understand what problems specific cryptographic algorithms solve and how to use them correctly in the right scenarios.

4. Goals of cryptography

We mentioned three weaknesses of HTTP above, and cryptography is introduced to solve the problems. Let’s take a look at the goals of cryptography.

(1) Confidentiality

To solve the problem of information traveling naked, the core task of cryptography is of course to encrypt data. As long as outsiders do not know the key, the ciphertext is meaningless to them even if they obtain it.
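As a rough illustration of that idea, the sketch below uses a one-time pad (XOR with a random key as long as the message), chosen only because it fits in a few lines of standard-library Python. It is not the cipher HTTPS uses, and the message is a made-up example.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (one-time pad)."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must be as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"password=hunter2"  # hypothetical sensitive message

# The key is random, secret, and must never be reused for another message.
key = secrets.token_bytes(len(plaintext))

ciphertext = xor_bytes(plaintext, key)   # what an eavesdropper would see
recovered = xor_bytes(ciphertext, key)   # what the key holder recovers

print(ciphertext.hex())
print(recovered)
```

Without the key, the ciphertext is just random-looking bytes; with it, applying the same XOR again recovers the original message.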

(2) Integrity

To solve the problem of data tampering: even when the data is encrypted, a middleman can still alter it, so the receiver needs some mechanism to detect the change and be sure that the data received is exactly what the sender sent.

In cryptography, the Message Authentication Code (MAC) algorithm is mainly used to ensure integrity.
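A minimal sketch of the idea, using HMAC-SHA256 from Python's standard `hmac` module as one example of a MAC algorithm. The shared key and the messages are hypothetical, and how the two sides agree on the key is outside the scope of this snippet.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


shared_key = b"demo-shared-secret"  # hypothetical key known to both ends

message = b"I love you"
tag = make_tag(shared_key, message)  # sender transmits message + tag

tampered = b"I hate you"             # middleman alters the message in transit

print(verify(shared_key, message, tag))   # True
print(verify(shared_key, tampered, tag))  # False
```

The middleman can change the message, but without the shared key he cannot produce a tag that matches the changed message, so the receiver notices the tampering.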

(3) Identity verification and non-repudiation

Both communicating parties must ensure that the other end is the one they want to communicate with. In cryptography, digital signature technology is generally used to confirm identity.

In addition, digital signatures prevent repudiation. For example, A borrows money from B and writes an IOU. When B asks for repayment, A denies having written the IOU and claims that someone impersonating him wrote it. A digital signature that only A could have produced rules out that excuse.
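The IOU scenario can be sketched with a toy "textbook RSA" signature in plain Python. Everything here is an illustrative assumption: the key is built from two small well-known primes, there is no padding scheme, and the IOU text is invented. Real systems use vetted libraries and much larger keys, so treat this only as a picture of the sign/verify relationship.

```python
import hashlib

# Toy key pair for A. P and Q are known Mersenne primes, far too small
# (and far too public) for real use; they just keep the example short.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                                  # public modulus
E = 65537                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent (Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can compute this."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone with the public values (N, E) can check the signature."""
    return pow(signature, E, N) == digest_as_int(message)


iou = b"A owes B 100"
signature = sign(iou)                      # A signs the IOU

print(verify(iou, signature))              # B checks A's signature
print(verify(b"A owes B 1", signature))    # an altered IOU does not verify
```

Because producing a valid signature requires the private exponent, which only A holds, a signature that verifies against A's public key is evidence that A signed this exact text.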

Next, let’s peel back the details and slowly explore how HTTPS solves the above problems.