If ping gets through, will TCP be able to connect?

Usually, when we want to know whether the network between our machine and a destination machine is working, we run the ping command.

For a network in good condition, the reported packet loss rate is 0%, which is what we call "pingable". If the packet loss rate is 100%, the ping is not getting through.

[Figure: ping succeeds]

[Figure: ping fails]

So the question is: if I can ping a machine, will data sent to it over TCP also get through?

Or, to put it another way, do ping and TCP take the same network path?

The first reaction is probably "not necessarily": after the ping, a router somewhere along the way may go down (lose power, for example), and a later TCP connection will then take a different path.

That's right. But suppose nothing changes along the way?

I'll go straight to the answer first.

Not necessarily, the network path taken may still be different.

Let's talk about why today.

I previously wrote an article, "The network is disconnected, can I still ping 127.0.0.1?", which covered the difference between ping packets and TCP packets.

[Figure: the difference between how ping and TCP send messages]

We know that the network is layered, and each layer has a corresponding protocol.

[Figure: how the message changes at each layer of the five-layer network model]

The layers are like building blocks: each upper-layer protocol is built on top of the protocols below it.

Whether it is ping (which uses the ICMP protocol) or TCP, the data is ultimately carried in IP packets at the network layer, and at the physical layer it is just a string of binary 0s and 1s sent out through the network card.

If the network environment has not changed and the destination is the same, then it stands to reason that the network paths they take should be the same. Under what circumstances will they be different?

Let's start with the topic of routing.

Network path

We tend to imagine that when two machines transfer data, a connection is established between the local machine and the destination machine, like a pipe, with data flowing from one end to the other. This pipe is really an abstraction that makes things easier to understand.

In fact, after we send the data packet from the local network card, it will go through various routers (or switches) before reaching the destination machine.

There are a great many of these routers, and they are interconnected. Once connected, they look like one big net, so the name "network" is a very vivid one.

[Figure: a network of routers]

Routers can generally do what switches do, so we only discuss routers here.

So now the question is: after a router receives data, how does it know which path to take and which router to forward it to?

What determines the path?

In such a large network as above, any router may take any path and send data to another router.

However, the distance and bandwidth between routers may differ.

Therefore, we need to know which path between two points is the optimal one.

So the problem becomes a graph problem: each edge has a cost (or weight), and we need to calculate the shortest distance between any two points.

[Figure: routers and Dijkstra]

At this point, memories must be flooding back.

I know this one. It is the Dijkstra algorithm I practiced back in college. It shows up often in the OJ written tests of the "Chrysanthemum Factory". Now I finally understand why their written tests seem to have more graph questions than those of other big companies: the "Chrysanthemum Factory" is in the communications business and is an old hand at routers.

Generation of routing table

The OSPF protocol (Open Shortest Path First) is built on top of the Dijkstra algorithm.

With OSPF, the router obtains the shortest distance between itself and other points in the network graph, so it knows which optimal path the data packet should take to reach a certain point.
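The shortest-path calculation can be sketched in a few lines. This is only an illustration of Dijkstra's algorithm on a tiny graph, not an OSPF implementation; the router names and link costs in TOPOLOGY are made up for the example.

```python
import heapq

def shortest_costs(graph, source):
    """Return the lowest total cost from source to every reachable node."""
    costs = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > costs.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost < costs.get(neighbor, float("inf")):
                costs[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return costs

# Hypothetical topology: names and costs are assumptions for illustration.
TOPOLOGY = {
    "R1": {"R2": 10, "R3": 1},
    "R2": {"R1": 10, "R3": 2, "R4": 1},
    "R3": {"R1": 1, "R2": 2, "R4": 10},
    "R4": {"R2": 1, "R3": 10},
}

print(shortest_costs(TOPOLOGY, "R1"))
```

In this made-up graph the cheapest way from R1 to R4 goes through R3 and R2 (cost 4), not over the direct-looking but expensive links.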

Put this information into a table and you get what we usually call the routing table.

The routing table records which port to use to reach which IP range, and the cost (metric) of taking that path.

You can view it with the route command.

[Figure: routing table]

The routing table determines the packet path

When a data packet is sent, the destination IP address is added at the network layer.

The router matches this IP against its routing table.

The routing table then tells the router which port each packet should be forwarded out of.

For example:

[Figure: forwarding data through the routing table]

Suppose A wants to send a message to D, that is, 192.168.0.105/24 wants to send a message to 192.168.1.11/24.

A first sends the message to the router.

The router sees the destination IP 192.168.1.11, matches it against the routing table, and finds that 192.168.1.0/24 is on port e2. The message is therefore sent out of port e2 and (possibly through a switch) finally reaches the destination machine.

Of course, if nothing in the routing table matches, the default gateway is used: the message is sent out of port e1 to IP 192.0.2.1. This router's table does not know where to go, but other routers may.

Routing table matching rules

In the above example, only one item in the routing table is matched, so it can only be it.

However, all roads lead to Rome. In fact, there must be many paths to the destination.

What if there are many entries in the routing table that are matched?

If multiple routing entries can reach the destination, the one with the longer matching prefix is preferred. For example, if the destination is 192.168.1.11 and both 192.168.1.0/24 and 192.168.0.0/16 in the routing table match, the former obviously matches more bits, so the packet is forwarded out of the port that belongs to 192.168.1.0/24.
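Longest-prefix matching can be sketched with Python's standard ipaddress module. The table below is an assumption modelled on the example in this article: port e3 for 192.168.0.0/16 is invented, and the default gateway is written as the 0.0.0.0/0 entry, which matches everything with a prefix length of zero.

```python
import ipaddress

# Hypothetical routing table; port e3 is an assumption for illustration.
ROUTES = [
    ("192.168.1.0/24", "e2"),
    ("192.168.0.0/16", "e3"),
    ("0.0.0.0/0", "e1"),  # default route
]

def lookup(destination, routes=ROUTES):
    """Return (prefix, port) of the matching route with the longest prefix."""
    address = ipaddress.ip_address(destination)
    matches = []
    for prefix, port in routes:
        network = ipaddress.ip_network(prefix)
        if address in network:
            matches.append((network.prefixlen, prefix, port))
    if not matches:
        return None
    _, prefix, port = max(matches, key=lambda match: match[0])
    return prefix, port

print(lookup("192.168.1.11"))  # both /24 and /16 match, /24 wins
print(lookup("8.8.8.8"))       # only the default route matches
```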

But what if both entries have the same matching length?

Then it depends on which protocol produced the routing entry, and the one with the higher priority is chosen. The higher the priority, the smaller the so-called administrative distance (AD). For example, manually configured static routes are preferred over entries learned dynamically through OSPF.

If that is still the same, the metric is compared, which is really the cost of the path. The lower the cost, the more likely the route is to be chosen.
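The three comparisons can be written as a single ordering. The sketch below assumes the candidates have already been found to match the destination. The Route class, the port names, and the AD values (1 for static, 110 for OSPF, which follow a common vendor default) are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    prefix_len: int      # longer is better
    admin_distance: int  # smaller is better
    metric: int          # smaller is better
    port: str

def best_routes(candidates):
    """Keep the routes that win on prefix length, then AD, then metric."""
    if not candidates:
        return []

    def rank(route):
        return (-route.prefix_len, route.admin_distance, route.metric)

    best = min(rank(route) for route in candidates)
    return [route for route in candidates if rank(route) == best]

# Hypothetical candidates: a static route and an OSPF route to the same prefix.
static_route = Route(prefix_len=24, admin_distance=1, metric=0, port="e2")
ospf_route = Route(prefix_len=24, admin_distance=110, metric=20, port="e3")
print(best_routes([static_route, ospf_route]))  # only the static route remains
```

The function returns a list because nothing guarantees a single winner: if two routes tie on all three values, both are returned.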

There are many routes that the router can choose, but logically, the best one is only "one", so so far, we can think that for the same destination, the paths taken by ping and TCP are the same.

But.

What if even the path cost is the same? That is, what if there are several optimal paths?

Use them all.

This is what is called equal-cost multipath (ECMP).

We can use traceroute to see if there is an equal-cost multipath on the link.

[Figure: traceroute output]

Some of the hops in the middle list several IPs, which means several next hops can be chosen at that hop, indicating that this path supports ECMP.

What is the use of ECMP

With equal-cost multipath, we can increase the link bandwidth.

For example:

[Figure: without ECMP, only one path can be selected]

Suppose there are two paths from A to B, each with 1 gigabit of bandwidth, but with different costs. Packets will always choose the lower-cost path, and only if that path fails will they take the other one. Either way, only one path is in use at any given time. The other one is going to waste. Is there any way to make use of it?

Yes. If the costs of the two paths are set to be equal, they become equal-cost routes, and once the routers in between enable the ECMP feature, both links can be used at the same time. The bandwidth goes from the original 1 gigabit to 2 gigabits, and traffic can be spread across the two paths.

[Figure: with ECMP, two links can be used simultaneously]

But this brings up another problem: it makes packet reordering worse.

Originally only one network path was used: data was sent out in order and, barring accidents, arrived in order.

Now the two packets take two paths, and the first packet may arrive later. This is out of order.

Then the problem comes again.

What's wrong with out-of-order?

TCP, the most commonly used protocol, is a reliable network protocol. Reliability here means not only that the data reaches the destination, but also that it arrives in the same order in which the sender sent it.

The implementation is simple: TCP gives each packet (segment) a sequence number. When data arrives at the receiver and the numbers show that a packet is out of order, it is put into the out-of-order queue to be sorted. If earlier packets have not yet arrived, later packets that arrived first must wait in the out-of-order queue until the gap is filled, and only then are they handed to the upper layer.

For example, the sender sends three packets numbered 1, 2, and 3. Suppose that at the transport layer 2 and 3 arrive first and 1 has not arrived yet. The application layer cannot get packets 2 and 3 at this point. It has to wait for 1 to arrive, and then it can get all three at once. These three packets may together represent one complete message, so if 1 is missing the message is incomplete and is meaningless to the application layer.

This phenomenon, where later data cannot be delivered to the application layer in time because earlier data is missing, is what we often call TCP head-of-line blocking.

[Figure: the out-of-order queue waiting for missing packets to arrive]
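The waiting behavior of the out-of-order queue can be sketched as a small buffer that only releases data once every earlier packet has arrived. This is a simplified model, not how a real TCP stack is written: real TCP sequence numbers count bytes rather than whole packets, and the ReorderBuffer class is a name invented for this example.

```python
class ReorderBuffer:
    """Deliver numbered packets to the application strictly in order."""

    def __init__(self, first_expected=1):
        self.expected = first_expected
        self.pending = {}  # out-of-order packets waiting for earlier ones

    def receive(self, number, payload):
        """Accept one packet and return the payloads that can now be delivered."""
        if number < self.expected or number in self.pending:
            return []  # duplicate of something already seen
        self.pending[number] = payload
        delivered = []
        while self.expected in self.pending:
            delivered.append(self.pending.pop(self.expected))
            self.expected += 1
        return delivered

buffer = ReorderBuffer()
print(buffer.receive(2, "b"))  # nothing delivered, 1 is still missing
print(buffer.receive(3, "c"))  # still nothing
print(buffer.receive(1, "a"))  # the gap is filled, all three are delivered
```

Packets 2 and 3 occupy memory in pending the whole time they wait, which is the cost discussed next.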

When out-of-order occurs, 2 and 3 need to stay in the out-of-order queue, and the out-of-order queue actually uses the memory of the receive buffer, and the receive buffer has a size limit. The size of the receive buffer can be seen by the following command.

# View the receive buffer size
$ sysctl net.ipv4.tcp_rmem
net.ipv4.tcp_rmem = 4096(min)    87380(default)  6291456(max)
# The buffer is adjusted dynamically between min and max

The more reordering there is, the more receive buffer memory is occupied, the smaller the corresponding receive window becomes, and the less data can be received normally. Network throughput drops, in other words performance gets worse.

Therefore, we should do our best to make all packets of the same TCP connection take the same path, so as to avoid reordering as far as possible.

Path selection strategy of ECMP

At first, ECMP was turned on to improve performance, but now it has increased out-of-order and reduced TCP transmission performance.

How can this be tolerated.

To solve this problem, we need to have a reasonable path selection strategy. In order to avoid out-of-order data packets in the same connection, we need to ensure that the data packets in the same connection all take the same path.

This is easy to do. We can locate a unique connection through the information of the quintuple of the connection (the sender's IP and port, the receiver's IP and port, and the communication protocol).

[Figure: the five-tuple]

Then generate a hash key for the quintuple information, and let the data of the same hash key take the same path, and the problem is perfectly solved.

[Figure: five-tuples mapped to hash keys]

[Figure: ECMP path selection based on the five-tuple]
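Hash-based path selection can be sketched as follows. Real routers use their own, vendor-specific hash functions; SHA-256 is only a stand-in here, and the pick_path function and the link names are assumptions made for the example.

```python
import hashlib

def pick_path(src_ip, src_port, dst_ip, dst_port, protocol, paths):
    """Choose one of the equal-cost paths from a hash of the five-tuple."""
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(paths)
    return paths[index]

PATHS = ["link-1", "link-2"]  # hypothetical equal-cost next hops

# Every packet of one connection carries the same five-tuple,
# so every packet of that connection gets the same answer.
first = pick_path("192.168.0.105", 50000, "192.168.1.11", 80, "tcp", PATHS)
second = pick_path("192.168.0.105", 50000, "192.168.1.11", 80, "tcp", PATHS)
print(first == second)
```

Because the choice depends only on the five-tuple, the router needs no per-connection state to keep a connection on one path.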

Do TCP and Ping take the same network path?

Now we return to the question at the beginning of the article.

For the same sender and receiver, do TCP and Ping take the same network path?

Not necessarily, because one element of the five-tuple is the communication protocol. Ping uses the ICMP protocol, not TCP, and ping has no ports. Different five-tuples produce different hash keys, so the paths ECMP selects may differ too.

[Figure: five-tuple difference between TCP and ping]

With the same TCP protocol, do packets take the same network path?

Or: with the same sender and receiver and the same TCP protocol, do different TCP connections take the same network path?

Like the previous question, this is really a five-tuple question. Both connections use TCP, and for the same sender and receiver the two IPs and the receiver's port are the same, but the sender's port can differ from one connection to the next, so the path chosen by ECMP may differ as well.

[Figure: five-tuple differences between different TCP connections]
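To see how connections that differ only in the sender's port can spread over several paths, the sketch below groups a range of client ports by the path index their five-tuple hashes to. CRC32 stands in for whatever hash a real router uses, and the function name, port range, and path count are assumptions for illustration.

```python
import zlib
from collections import defaultdict

def group_ports_by_path(src_ip, dst_ip, dst_port, src_ports, path_count):
    """Group client source ports by the path index their five-tuple hashes to."""
    groups = defaultdict(list)
    for src_port in src_ports:
        key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|tcp".encode()
        groups[zlib.crc32(key) % path_count].append(src_port)
    return dict(groups)

# Same two machines, same destination port, sixteen different client ports.
groups = group_ports_by_path(
    "192.168.0.105", "192.168.1.11", 80, range(50000, 50016), path_count=2
)
for path_index, ports in sorted(groups.items()):
    print(path_index, ports)
```

Each client port always lands in the same group, but two ports next to each other may land in different ones.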

But here comes the problem.

What use is knowing this? I do business development, and I don't have permission to configure network routing.

Use this knowledge point to troubleshoot problems

For business development, this is definitely not a useless knowledge point.

Suppose one day you find that you can ping the destination machine, but TCP connections to it occasionally fail, even though the machines at both ends are fairly idle and have no performance bottleneck. It is enough to make you despair.

You can think about whether ECMP is used in the network, and there is a problem with one of the links.

[Figure: ping succeeds but some TCP connections fail]

The troubleshooting method is also very simple.

You know the IP of the local machine and the IP and port number of the destination machine, and you also know that you are using a TCP connection.

As long as you log the error details when a connection fails, you will also know the sender's port number.

That gives you the complete five-tuple.

The next step is to initiate the TCP request again with the sender's port fixed to that number: the same five-tuple takes the same path. If there really is a problem with that link, the failure should reproduce.
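If you can change the code, pinning the sender's port takes one bind call before connecting. The sketch below is a minimal probe under that assumption; connect_from_port is a helper name invented here, it handles IPv4 only, and it treats any socket error or timeout as a failed attempt.

```python
import socket

def connect_from_port(host, port, source_port, timeout=3.0):
    """Try one TCP connection from a fixed local port; return True on success."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.settimeout(timeout)
        try:
            sock.bind(("", source_port))  # pin the client side of the five-tuple
            sock.connect((host, port))
            return True
        except OSError:
            # Covers refused connections, timeouts, and a busy source port.
            return False
```

Calling connect_from_port(host, 80, 6666) several times and comparing the results with another source port such as 6667 shows whether failures follow one particular five-tuple.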

If you don't want to change your own code, you can use the nc command to specify the client port to see if the TCP connection can be established normally.

nc -p 6666 baidu.com 80

-p 6666 specifies that the client port making the request is 6666. It is followed by the domain name to connect to and port 80.

[Figure: TCP connection successfully established with nc]

Assuming that using the quintuple of port 6666 to connect always fails, but using 6667 or other ports is successful, you can take this information to find a colleague in charge of the network.

Summary

The router can generate a routing table through the OSPF protocol, use the IP address in the data packet to match the routing table, and select the optimal path for forwarding.

When no entry in the routing table matches, the default gateway is used. When several entries match, the matching length is compared first. If that is the same, the administrative distance is compared, and after that the path cost. If even the path cost is the same, the routes are equal-cost paths, and if the router has ECMP enabled they can all be used for transmission at the same time.

ECMP can improve the link bandwidth, and at the same time use the quintuple as the hash key for path selection, which ensures that the data packets of the same connection take the same path and reduces the disorder.

You can use the traceroute command to check whether ECMP is available on the link.

On a network link with ECMP enabled, TCP and ping may take different paths. Even with TCP alone, different connections may take different paths. So if some connections work and others fail and you are at your wits' end, consider whether ECMP is involved.

Of course, when you encounter a problem, you must doubt yourself, and believe that most of the time it really has nothing to do with ECMP.