Real-world case: after a building's network was renovated, many terminals could not obtain IP addresses

The case shared in this issue involves a wired network problem.

Background of the problem
The client is an integrator specializing in low-voltage (extra-low-voltage) systems projects that recently undertook a network upgrade project for a campus. The main goal was to improve network reliability and keep services running without interruption. The routing and switching equipment comes from a vendor referred to here as W. The specific improvements were:


The original single egress router was replaced with a dual-router hot-standby pair, and BFD was deployed;
At the core, a switch stack replaced the original aggregation switch as the VLAN gateway and DHCP server;
The core and aggregation switches were connected through link-aggregation (Eth-Trunk) interfaces.
The basic topology is as follows:

Problem description
However, after one building was renovated, some users could not obtain IP addresses automatically during peak hours, or obtained them only with difficulty. Below are the packet capture and network adapter information from a computer that failed to obtain an IP address:

Let's take a look at how to troubleshoot this problem.

Troubleshooting and analysis
Step 1: Confirm whether the core switch is receiving and sending DHCP packets normally

First, check whether the corresponding interface on the core switch received the client's DHCP request and answered with an offer. Packet analysis shows that the core switch responds to every DHCP request (DHCP DISCOVER) that reaches it. However, the requests from the computers that failed to obtain an IP address never reached the core switch, which means the packets were probably lost on a downstream device.
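The comparison in this step can be sketched as a set difference between the MAC addresses that sent a DHCP DISCOVER on the client side and those whose DISCOVER showed up in the core switch capture. This is only an illustration: the function name and the MAC lists are hypothetical, and it assumes the addresses have already been exported from the two captures.

```python
def missing_at_core(client_discovers, core_discovers):
    """Return client MACs whose DHCP DISCOVER never appeared in the core capture."""
    seen_at_core = {mac.lower() for mac in core_discovers}
    return sorted({mac.lower() for mac in client_discovers} - seen_at_core)

# Hypothetical MAC lists, as if exported from the two captures.
client_side = ["AA:BB:CC:00:00:01", "AA:BB:CC:00:00:02", "AA:BB:CC:00:00:03"]
core_side = ["aa:bb:cc:00:00:01", "aa:bb:cc:00:00:03"]

lost = missing_at_core(client_side, core_side)
print(lost)  # ['aa:bb:cc:00:00:02']
```

Any MAC address in the result sent a request that was dropped somewhere between the client and the core, which points the investigation at the downstream devices.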
Step 2: Check the aggregation switch ports for packet loss

Checking the ports of the aggregation switch reveals a large number of discarded packets. Analysis of the packet headers shows that the discarded packets are all broadcast packets, such as DHCP DISCOVER and OFFER.

Step 3: Configuration check

Checking the port configuration reveals a broadcast suppression setting:
This setting limits the port to forwarding at most 100 broadcast packets per second. Deleting the setting or raising the value resolves the problem.
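To see why a 100 packets-per-second cap only hurts at peak times, here is a minimal simulation of a per-second broadcast limit. It is a simplified model, not the switch's actual algorithm: real hardware may meter traffic at a finer granularity, and the arrival figures are hypothetical.

```python
def forward_broadcasts(arrivals_per_second, limit_pps):
    """Simulate a per-second packet cap: return (forwarded, dropped) totals."""
    forwarded = dropped = 0
    for arrivals in arrivals_per_second:
        passed = min(arrivals, limit_pps)  # anything above the cap is discarded
        forwarded += passed
        dropped += arrivals - passed
    return forwarded, dropped

# Hypothetical broadcast arrivals per second on one port (ARP, DHCP, and so on).
quiet = [40, 60, 80]
peak = [150, 300, 220]

print(forward_broadcasts(quiet, 100))  # (180, 0)
print(forward_broadcasts(peak, 100))   # (300, 370)
```

In quiet periods nothing is dropped, so the problem stays hidden. At peak, more than half of the broadcasts in this example are discarded, and DHCP DISCOVER packets are dropped along with everything else.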

Solution
(1) Root cause

The root cause lies in how the DHCP protocol works, as shown in the figure below:
In the first stage, the client broadcasts a DHCP DISCOVER packet to find available DHCP servers. With a large number of clients, broadcast suppression configured on a port can therefore prevent users from obtaining IP addresses automatically.
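As a rough illustration of how much broadcast traffic a new lease generates, the sketch below lists the four messages of an initial DHCP exchange as described in RFC 2131 and counts the ones sent as broadcasts. DISCOVER and the initial REQUEST are always broadcast by the client; OFFER and ACK are broadcast only when the client sets the broadcast flag, which is modeled here as a simple boolean. The client count is hypothetical.

```python
# (message, sender, broadcast mode) for an initial lease, per RFC 2131.
# "maybe" marks replies that are broadcast only if the client asked for it.
DORA = [
    ("DISCOVER", "client", "always"),
    ("OFFER", "server", "maybe"),
    ("REQUEST", "client", "always"),
    ("ACK", "server", "maybe"),
]

def broadcasts_per_lease(server_replies_broadcast):
    """Count broadcast frames exchanged for a single new lease."""
    count = 0
    for _msg, _sender, mode in DORA:
        if mode == "always" or (mode == "maybe" and server_replies_broadcast):
            count += 1
    return count

clients = 200  # hypothetical number of clients requesting a lease at once
print(clients * broadcasts_per_lease(False))  # 400
print(clients * broadcasts_per_lease(True))   # 800
```

Even in the best case, a few hundred clients coming online together produce far more than 100 broadcasts, before counting ARP and other broadcast traffic on the same port.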

(2) Suggestions and Summary

Difficulty obtaining an address through DHCP may be related to broadcast suppression configured on a port.


When deploying services, set the port's broadcast suppression threshold sensibly. Observe the actual traffic and adjust the value so that it never drops the broadcasts that normal services depend on.
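One possible way to turn observation into a number is to take the highest broadcast rate seen during normal operation and add a safety margin. The sketch below is an assumption for illustration, not a vendor recommendation: the function name, the 1.5 headroom factor, and the sample values are all hypothetical.

```python
import math

def suggest_limit(samples_pps, headroom=1.5):
    """Suggest a suppression threshold: observed peak times a safety factor."""
    if not samples_pps:
        raise ValueError("need at least one sample")
    return math.ceil(max(samples_pps) * headroom)

observed = [35, 80, 240, 310, 120]  # hypothetical per-second broadcast counts
print(suggest_limit(observed))  # 465
```

The samples should cover peak hours, since a value derived only from quiet periods would reproduce the problem described in this case.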