The case shared in this issue concerns a wired/wireless network problem.
Background
The customer manufactures automated production lines for aluminum processing. Its network is built on industrial-grade products from a vendor referred to here as brand P (switches, transceivers, APs, CPEs, etc.). A workshop in Anhui is currently having network problems on site: the packet loss rate between the host computer and the PLCs is high, which causes business downtime. The entire industrial automation system is configured with Siemens products.
The project is relatively large, with hundreds of network devices. The simplified on-site topology is as follows:
The planning configuration is as follows:
Flat, plug-and-play network with a single segment: 172.20.240.0/24
All routers and switches are managed devices
Problem phenomenon
On the wired side, when the host computer, PCs, and other devices ping the PLCs and other industrial terminals, packet loss exceeds 3%, and business communication frequently raises alarms.
Troubleshooting ideas
For this packet loss problem, we generally consider two situations:
Packet loss on an intermediate link: This is the most common case. In this topology, for example, a switch would be dropping packets.
The terminal device fails to receive or answer packets: This is relatively rare. In this topology, for example, the PLC would be handling incoming packets abnormally and not responding, so pings from the host computer are lost and business communication is disrupted.
To tell the two apart, we work through the problem step by step, starting with a comparative test.
Troubleshooting and analysis
Step 1: Check whether a PC on the same switch also loses packets
Plug the monitoring PC directly into the access switch that the PLC is connected to, and connect PC2 to the same switch as the control group
The monitoring PC pings PLC (172.20.240.71) and PC2 (172.20.240.248)
Test conclusion: Pings from the monitoring PC to the target PLC lose more than 3% of packets, while pings to PC2 lose none. This suggests that the real problem is the PLC receiving or returning packets abnormally, not forwarding on the intermediate switch link.
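The comparison logic used in this step can be sketched in a few lines of Python. This is only an illustration, not a tool used on site: the function names, the 1% threshold, and the sample ping counts are assumptions chosen for the example.

```python
def loss_percent(sent: int, received: int) -> float:
    """Return packet loss as a percentage of the packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - received) / sent


def locate_fault(target_loss: float, control_loss: float, threshold: float = 1.0) -> str:
    """Compare loss to the target with loss to a control host on the same switch.

    threshold is a hypothetical cut-off (in percent) for noticeable loss.
    """
    if target_loss < threshold:
        return "no significant loss"
    if control_loss < threshold:
        # Same switch, same path: only the target misbehaves.
        return "suspect the end device"
    # Both hosts lose packets, so the shared path is the likelier culprit.
    return "suspect the switch or link"


# Hypothetical counts: 1000 pings each to the PLC and to PC2.
plc_loss = loss_percent(1000, 965)
pc2_loss = loss_percent(1000, 1000)
print(locate_fault(plc_loss, pc2_loss))
```

With loss toward the PLC but none toward the control PC on the same switch, the sketch points at the end device, which matches the conclusion above.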
So why does this happen?
From talking with the on-site personnel, we learned that the problem only became this noticeable recently, after many PLCs and I/O modules were added to the workshop. We can reasonably infer that the network traffic has changed and that the PLC is receiving "something" that causes this behavior, so we captured packets.
Step 2: Capture and analyze the network traffic
We enabled "port monitoring" (port mirroring) on the access switch ports connected to the PLC and other devices:
We found that a large number of PN-PTCP packets were flooding the network:
This is a layer 2 frame whose destination MAC is the multicast address 01:80:c2:00:00:0e, which means the network is being flooded with multicast traffic. PN-PTCP is the PROFINET Precision Transparent Clock Protocol (not "Profinet TCP"), part of the PROFINET protocol suite used by Siemens. The multicast PN-PTCP frames we saw are the "time synchronization" messages used in Siemens configurations, and essentially every Siemens PLC sends them.
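As a rough illustration of how such frames can be picked out of a capture, the sketch below counts PROFINET frames per destination MAC from raw Ethernet frames. It assumes the frames are already available as byte strings (for example, exported from a capture file); the helper names and the synthetic test frames are made up for the example. It relies on PROFINET frames being carried with EtherType 0x8892.

```python
import struct
from collections import Counter

PROFINET_ETHERTYPE = 0x8892      # EtherType used by PROFINET frames
VLAN_ETHERTYPE = 0x8100          # 802.1Q VLAN tag
PTCP_DST = "01:80:c2:00:00:0e"   # destination MAC seen in the capture


def parse_header(frame: bytes):
    """Return (destination MAC, EtherType) of a raw Ethernet frame."""
    if len(frame) < 14:
        raise ValueError("frame too short")
    dst = ":".join(f"{b:02x}" for b in frame[0:6])
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype == VLAN_ETHERTYPE and len(frame) >= 18:
        # Skip the 4-byte VLAN tag to reach the real EtherType.
        (ethertype,) = struct.unpack("!H", frame[16:18])
    return dst, ethertype


def count_by_destination(frames):
    """Count PROFINET frames per destination MAC."""
    counts = Counter()
    for frame in frames:
        dst, ethertype = parse_header(frame)
        if ethertype == PROFINET_ETHERTYPE:
            counts[dst] += 1
    return counts


def make_frame(dst: str, ethertype: int) -> bytes:
    """Build a synthetic frame for the demo (made-up source MAC, zero payload)."""
    src = bytes.fromhex("020000000001")
    return bytes.fromhex(dst.replace(":", "")) + src + struct.pack("!H", ethertype) + bytes(46)


# Three synthetic PROFINET frames to the PTCP address plus one ARP broadcast.
frames = [make_frame(PTCP_DST, PROFINET_ETHERTYPE)] * 3
frames.append(make_frame("ff:ff:ff:ff:ff:ff", 0x0806))
print(count_by_destination(frames))
```

Dividing such counts by the capture duration gives a packets-per-second figure per destination.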
Step 3: Traffic Analysis
By a rough count, each Siemens PLC sends about 300 PN-PTCP messages per second. With more than a dozen Siemens PLCs in the network, the combined multicast traffic adds up to about 5,000 packets per second.
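The arithmetic behind this estimate is simple enough to check directly. In the sketch below, the 300 packets per second and the 5,000 total come from the measurements above. The exact PLC count is not stated, so the device count of 12 is only an illustrative value for "a dozen".

```python
import math


def aggregate_rate(per_device_pps: int, device_count: int) -> int:
    """Total multicast rate if every device sends at the same rate."""
    return per_device_pps * device_count


def devices_for_rate(total_pps: int, per_device_pps: int) -> int:
    """Smallest number of devices whose combined rate reaches total_pps."""
    return math.ceil(total_pps / per_device_pps)


print(aggregate_rate(300, 12))       # 3600: exactly a dozen PLCs
print(devices_for_rate(5000, 300))   # 17: PLC count implied by ~5,000 pps
```

So a total of about 5,000 packets per second corresponds to roughly 17 senders at 300 packets per second each, consistent with "more than a dozen".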
We therefore have reason to believe that the switch forwards these messages unconditionally, and that once the PLCs receive each other's messages, their own packet receiving and sending becomes abnormal. How do we solve it? By controlling the forwarding: apply "multicast suppression".
Solution
As shown above, a Siemens PLC that receives a large number of PN-PTCP messages starts receiving and sending packets abnormally. The fix is to apply "multicast suppression" on the upstream switch. For convenience, we enabled it on all ports.
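The actual setting is made through the switch's own management interface, which is vendor-specific and not reproduced here. To illustrate the idea only, the following toy model forwards at most a fixed number of multicast frames per port per second and drops the rest. The class name, the port label, and the 260 packets-per-second limit are assumptions for the example; the article only reports the rate observed after suppression, not the configured threshold.

```python
from collections import defaultdict


class MulticastSuppressor:
    """Toy model: forward at most limit_pps multicast frames per port per second."""

    def __init__(self, limit_pps: int):
        self.limit_pps = limit_pps
        self._counts = defaultdict(int)  # (port, second) -> frames forwarded

    def allow(self, port: str, timestamp: float, is_multicast: bool) -> bool:
        if not is_multicast:
            return True  # unicast is never throttled in this model
        key = (port, int(timestamp))
        if self._counts[key] >= self.limit_pps:
            return False  # over the limit for this one-second window: drop
        self._counts[key] += 1
        return True


def is_multicast_mac(mac: str) -> bool:
    """A MAC is a group address when the lowest bit of its first octet is 1."""
    return int(mac.split(":")[0], 16) & 1 == 1


# Offer 5000 multicast frames to one port within a single second.
suppressor = MulticastSuppressor(limit_pps=260)
offered = 5000
forwarded = sum(
    suppressor.allow("port1", i / offered, is_multicast_mac("01:80:c2:00:00:0e"))
    for i in range(offered)
)
print(forwarded)  # 260
```

A real switch does this in hardware, but the effect is the same: the PLC sees a bounded multicast rate regardless of how many devices are sending.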
After completing the configuration, the effect is as follows:
PN-PTCP flooding is suppressed to 260 packets per second. Pinging the PLC no longer shows any packet loss, and the PLC communicates normally with the host computer.