Let’s talk about distribution and network principles together
How does concurrency appear?
As we all know, China has a huge population of more than one billion, and today's Internet and mobile Internet are growing just as fast. With so many people, congestion appears whenever everyone uses the same app at once. It is like a supermarket: more people inevitably means more crowding and less stock. If eggs are on sale and you are slow, you will not get any. Have you ever seen uncles and aunts rushing to buy eggs in the morning? On top of that, checkout is slower than usual, and there may even be purchase limits. High concurrency on the Internet works on the same principle.
By contrast, some countries have a small population, perhaps tens of millions, which cannot compare with ours. Congestion is milder there, but only because there are fewer people. More people also means more resources. Accordingly, big data is developing very fast here: with more people, user profiling and user behavior analysis are easier to do. A country with only tens of millions of people may have just a few million active users, so there is not much data to work with.
What is the difference between more people and fewer people? Consumption. A country with a large population has more resources to develop, and with the right approach you can leverage them to drive consumption. As an e-commerce platform, if you do a good job in operations, marketing, and promotion, you attract more users and capture a larger market. As your share grows, consumption grows. If you promote an e-commerce business in a country with a small population, the market is only so big and consumption cannot rise much. The two simply cannot be compared, right, Lao Tie? So, what examples come to mind?
- A celebrity gets married and Weibo goes down
- During the Spring Festival travel rush, 12306 goes down, or you simply cannot get a ticket
These are all inevitable when there are this many people.
In addition, there is the so-called fan economy. Nowadays everyone is their own media outlet (self-media). As long as you have heavy traffic and many fans, you can make money, just like Li Jiaqi or Luo Pang. Livestream shopping is one way to promote products: it stimulates the desire to buy and sends traffic to the e-commerce platform.
At this point, some viewers in the livestream room will move on to the e-commerce platform to shop. Roughly what share of traffic is that? As a rule of thumb, the 80/20 or 70/30 rule applies: about 20% of users will come in, or even fewer. Assuming 10 million viewers, the influx is then about 2 million users. Can your platform handle it? Is concurrency support sufficient? Is there load balancing? Can high concurrency be handled, and is high availability in place? All of this must be considered, and all of it relates to the network. So, to support that many users and make good money for the company, the technology must keep pace with the business, and the concurrency, high availability, and stability of the overall system must be ensured. By the weakest-link (short plank) effect, if one weak component causes a failure, many users are lost and the impact on the platform and the company is huge. Unless you are 12306, which users have no choice but to use, that traffic will drain away to other e-commerce platforms through other channels.
Therefore, technology is used to support the market and retain users. Concurrency scenarios are everywhere, and system architecture has evolved along the lines of distributed systems to cope with them. For example:
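The traffic estimate above is simple arithmetic, and a tiny sketch makes it easy to try other ratios. The function name `estimate_influx` and the use of whole-number percentages are assumptions made for illustration only.

```python
def estimate_influx(audience: int, percent: int) -> int:
    """Return how many viewers are expected to jump to the shop.

    `percent` is a whole-number percentage (20 means 20%), which keeps
    the arithmetic in integers and avoids floating-point rounding.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return audience * percent // 100


audience = 10_000_000                           # viewers in the livestream room
influx_20 = estimate_influx(audience, 20)       # the 80/20 case
influx_30 = estimate_influx(audience, 30)       # the 70/30 case
print(influx_20, influx_30)
```

Even the lower figure is a burst that a single server is unlikely to absorb, which is why the questions about load balancing and high availability come next.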
Monolith --> Distributed --> High-availability cluster --> High concurrency --> Microservices --> Containerization.
Distributed architecture evolution
Load balancing
Common strategies: round robin, weighted round robin, ip_hash (hashing on the client IP), url_hash, and least connections.
Load balancing is divided into layer 7 (application layer) and layer 4 (transport layer) load balancing.
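To make the strategies listed above concrete, here is a minimal sketch of four of them. It is not how nginx or LVS implement them internally; the class names, server names, and the choice of MD5 for hashing are assumptions made for illustration.

```python
import hashlib
import itertools


class RoundRobin:
    """Hand out servers one after another, then start over."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class WeightedRoundRobin:
    """Naive weighting: a server with weight 2 appears twice per cycle."""

    def __init__(self, weights):
        expanded = [s for s, w in weights.items() for _ in range(w)]
        self._cycle = itertools.cycle(expanded)

    def pick(self):
        return next(self._cycle)


class IpHash:
    """The same client IP always maps to the same server."""

    def __init__(self, servers):
        self._servers = list(servers)

    def pick(self, client_ip):
        digest = hashlib.md5(client_ip.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self._servers)
        return self._servers[index]


class LeastConnections:
    """Send each new request to the server with the fewest open connections."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)  # ties: first listed
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


servers = ["a", "b", "c"]  # hypothetical backend names
rr = RoundRobin(servers)
print([rr.pick() for _ in range(4)])
```

The url_hash strategy would look the same as `IpHash` with the request URL as the key instead of the client IP.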
OSI network model principle
Everything above is built on the network. Only with a network can we interact, and network communication is closely tied to our daily lives. So when it comes to network communication, we have to talk about the seven-layer model, which is the foundation of networking. Many friends get a little confused when it comes to nginx and clusters; those topics are really network-oriented, so let's go through it together.
People talking face to face communicate through speech; people on a call communicate over the telephone line. If two people communicate from either side of a computer (or phone) screen, user A interacts with a computer, and that computer then interacts with user B's computer, which is how the two of them communicate. Whether it is human-computer or computer-computer interaction, there is always a communication process.
Communication involves computer networks and the Internet. I remember this being required material at university. It brings us to the OSI seven-layer network model.
picture
What is OSI? Think of it as a specification, a reference model. Communication and data exchange between computers follow a layered standard of this kind so that data sent from one end reaches the other end, where another user can see it, which is how interaction happens. Because the model is layered, the purpose of each layer is clear: each layer decides only what belongs to it, just like MVC, where each part does its own job and is decoupled from the others. Developers at each layer can stay focused as well, since different developers maintain different layers without being coupled together. Layered decoupling is everywhere and is worth keeping in mind during development. Let’s take a brief look first:
picture
- Application layer (layer 7): As shown in the picture above, two users communicate through QQ or WeChat on a computer or smartphone. The application software sits at the topmost layer, the application layer. This layer defines protocols such as HTTP, data formats, and so on. Any software application is based on the application layer; it is the most direct medium between people and the network, for example QQ, WeChat, a browser, idea, or eclipse.
- Presentation layer (layer 6): data representation, string encoding, encryption
- Session layer (layer 5): establishing and maintaining sessions, that is, setting up and managing communication between applications
- Transport layer (layer 4): how a connection is established, how data is transmitted, and whether transmission succeeded or failed
- Network layer (layer 3): data routing, that is, how to find the node that should handle the data and how to get the data there
- Data link layer (layer 2): which protocol frames the data and how it is sent out over the link
- Physical layer (layer 1): the physical transmission media and equipment, such as wifi, 2g/3g/4g, and network cables
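One way to picture the layering above is encapsulation: on the way down each layer wraps the data in its own header, and on the way up the receiver removes the headers in reverse order. The toy sketch below uses bracketed text labels as stand-in headers; the `send` and `receive` helpers are hypothetical, and real headers are binary structures, not strings.

```python
# Layers that add a header, from top to bottom. The physical layer is
# left out because it only carries the resulting bits.
LAYERS = [
    "application",
    "presentation",
    "session",
    "transport",
    "network",
    "data link",
]


def send(payload: str) -> str:
    """Wrap the payload layer by layer, top of the stack first."""
    for layer in LAYERS:
        payload = f"[{layer}]{payload}"
    return payload


def receive(frame: str) -> str:
    """Strip the headers in the opposite order, outermost first."""
    for layer in reversed(LAYERS):
        prefix = f"[{layer}]"
        if not frame.startswith(prefix):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(prefix):]
    return frame


wire = send("hello")
print(wire)
print(receive(wire))
```

Each layer only looks at its own header, which is the decoupling that the MVC comparison is pointing at.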
Seven-layer model (protocol) classification and merging
The seven-layer model is really a definition and division of protocols. Each layer performs different functions and corresponds to different protocols. The layers can also be merged, as shown below, so the same stack can be described as a 4-layer, 5-layer, or 7-layer model.
picture
Each layer has a different function, as follows:
- Physical layer: the physical transmission media and equipment, such as wifi, 2g/3g/4g, and network cables. When you chat with a friend on WeChat, your computer is plugged into a network cable and your friend's phone is connected to wifi. Both belong to the physical layer. The same goes for a phone call, which needs a phone line.
- So transmission between computers passes through the physical layer. In what form? As binary data, like 1010001001.
- Data link layer: When the computer receives a long stream of binary digits such as 1010001, the stream has to be parsed, so it is divided into groups, for example 8 bits each. Once the bits are grouped, the data can be processed. So who does the grouping (8 bits, 16 bits, 64 bits, and so on)? The link layer. The physical layer cannot, because it only moves bits.
- The standard protocol at the link layer is the Ethernet protocol. It is a standard, a specification. In the early days, different companies grouped binary data in their own ways. It worked, but it was messy, so Ethernet was adopted as the common standard.
- Computers communicate by sending data packets (frames), which consist of two parts: head and data.
- The head contains the sender, the receiver, and the data type (source mac address, destination mac address)
- The data part contains the actual content of the packet
- Is it similar to a REST request? :-)
- Head and data are sent together, and there is a length limit (the data part of a standard Ethernet frame is at most 1500 bytes). Anything longer is cut up and sent in pieces, which is called fragmentation. It has been covered many times before and is easy to look up on Baidu.
- So a data packet is much like a phone call: the caller is the sender, the callee is the receiver, and the conversation is the data.
- By the way, a computer's address is like a mobile phone number. It is called the mac address and belongs to your network card.
- About the mac address: the Ethernet protocol mentioned above also stipulates that you must have a mac address to communicate on the network, and the mac address lives in the network card, so you can only get online with a network card. Each network card has its own mac address. (Knowledge point: we covered this with Keepalived and LVS in the architect course, where a new virtual IP, or VIP, is created through the network card in the virtual machine.) It is the same as a SIM card: to make a phone call you have to buy a SIM card, each SIM card has a unique phone number, and only through that number can you call and be called.
- So the link layer mainly defines how data is formatted for transmission. One end sends it out, and the other end accepts the incoming packets and parses them.
- Network layer: This layer defines the IP protocol and introduces the concept of a gateway. What do you think of when you hear "gateway"? nginx? zuul? gateway? All of them run on some computer node, and that node has an IP. When data leaves a computer it goes through a gateway, and when it is received it goes through a gateway too, so the gateway acts as an intermediary. In layman's terms, when you make a call you go through the operator: everything between dialing and answering is handled by the operator. You can think of the operator as a gateway (which also does data routing). It sits at a certain address, namely an IP address. Each area has its own operators, and different operators manage different locations. You can think of an area (location) as a local area network and each mobile phone as a mac address. Devices on the same LAN share the same network segment (IP prefix), while their mac addresses, like phone numbers, all differ. Take a look at the pictures below to understand (bonus material).
picture
picture
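Going back to the data link layer bullets above, the "head plus data" structure can be sketched by unpacking a frame by hand. This assumes the common Ethernet II layout (6-byte destination mac, 6-byte source mac, 2-byte type field, then the data); the helper names and the sample frame are made up for illustration.

```python
import struct

HEADER_LEN = 14  # 6 bytes destination + 6 bytes source + 2 bytes type


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes in the usual aa:bb:cc:dd:ee:ff form."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_ethernet_frame(frame: bytes) -> dict:
    """Split a frame into its head fields and its data part."""
    if len(frame) < HEADER_LEN:
        raise ValueError("frame is shorter than an Ethernet header")
    # "!" means network (big-endian) byte order.
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:HEADER_LEN])
    return {
        "dst_mac": format_mac(dst),
        "src_mac": format_mac(src),
        "ethertype": ethertype,          # 0x0800 means the data is IPv4
        "data": frame[HEADER_LEN:],
    }


# A hand-built sample: broadcast destination, made-up source, IPv4 type.
sample = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"hello"
parsed = parse_ethernet_frame(sample)
print(parsed)
```

The receiver compares `dst_mac` with its own network card's address to decide whether the frame is meant for it, just as a phone only rings for its own number.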
- Transport layer: Establishes communication between ports. What does that mean? With the IP and mac address we can find the target computer node. Now suppose we are chatting through WeChat or QQ. The data reaches the other party's computer, but how does the other party's QQ or WeChat receive it? In other words, how is the data handed to WeChat or QQ so it can be shown to the user? This is where ports come in. Each application has a port. If an application is opened several times, each instance must use a different port number, just as with tomcat. Each port is associated with the network card. When computers interact they include the port, such as 8080, so the right application on the other side receives the data and displays it.
- Take a phone call as an example. Often dialing the main number is not enough because nobody picks up; you need to add the extension at the end. 8080 is an extension that reaches HR, and 8088 reaches the boss. That is what a port means.
- The protocols of the transport layer are TCP and UDP. LVS works at the transport layer and provides layer 4 load balancing.
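The extension-number analogy above can be tried out on one machine: two "applications" listen on different ports of the same IP, and each receives only what is addressed to its port. This is a minimal sketch using UDP on the loopback address; the names `hr_app` and `boss_app` are invented, and the operating system picks free ports instead of the 8080 and 8088 used in the analogy.

```python
import socket

HOST = "127.0.0.1"  # same machine, same IP for both applications


def open_app() -> socket.socket:
    """Open a UDP socket on a free port chosen by the operating system."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((HOST, 0))      # port 0 asks the OS for any free port
    sock.settimeout(2.0)      # do not wait forever if nothing arrives
    return sock


hr_app = open_app()
boss_app = open_app()
hr_port = hr_app.getsockname()[1]
boss_port = boss_app.getsockname()[1]

# The sender uses the same IP both times; only the port differs.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"resume", (HOST, hr_port))
sender.sendto(b"report", (HOST, boss_port))

hr_msg, _ = hr_app.recvfrom(1024)
boss_msg, _ = boss_app.recvfrom(1024)
print(hr_port, hr_msg, boss_port, boss_msg)

for sock in (sender, hr_app, boss_app):
    sock.close()
```

UDP keeps the sketch short because it needs no connection setup; with TCP the port plays the same role, but a connection is established first.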
- Application layer: The application layer, presentation layer, and session layer can be grouped together as the application layer. Every application a user touches is based on the application layer, such as QQ, WeChat, browsers, idea, and eclipse. This is the most direct interaction between computers and people. Each application can have its own data format and data structure, so the application layer specifies the application's data format. For example, QQ, WeChat, a mail client, and a browser each use different protocols and different data formats when transmitting data. Those differences in representation are standardized at the presentation layer. The session layer is built on top of the transport layer and matches what we usually call a session: it maintains communication between two endpoints, that is, it establishes and manages communication between applications. If my computer restarts, reopening the software means re-establishing the connection, that is, re-establishing the session.