IPv6 upgrade causes major communication failure

At the beginning of the new year, another major communication failure occurred.

On February 1, 2022, Japanese mobile operator NTT DoCoMo suffered a major nationwide communication failure that left some 4G and 5G users unable to access the Internet or make VoLTE voice calls for up to five hours. On February 7, the president of NTT held a press conference to apologize for the incident and announced the cause of the failure.

Fault description
1. Time of failure:
February 1, 2022, 7:30 a.m. to 12:13 p.m.

2. Effects of the failure:
Some mobile users could not use the mobile Internet service or the voice calling service (VoLTE)

3. Cause of the failure:
When IPv6 single-stack mode was introduced, server load rose sharply, and signaling that controls communication with the network was sent to terminals.

4. Number of users affected:
About 18,000

5. Scope of impact:
Nationwide

6. Solutions:

To restore service quickly after the failed IPv6 single-stack upgrade, DoCoMo temporarily fell back to "IPv4/IPv6 dual-stack mode". The operator stated that it will resume the IPv6 single-stack upgrade after increasing server capacity and optimizing the design.

Why introduce IPv6 single-stack mode?
As is well known, an IPv4 address is 32 bits long, which gives about 4.3 billion (2^32) addresses. With a world population approaching 8 billion, that is roughly one address for every two people. With the spread of PCs and smartphones, nearly all of these addresses have already been allocated, and IPv4 has reached its limit. An IPv6 address is 128 bits long, giving about 3.4 x 10^38 (2^128) addresses, enough to "let every grain of sand on the earth have an IP address".
Today, with the growth of IoT devices and the spread of 5G, more and more devices are connected to the Internet, and demand for IP addresses keeps rising. For this reason, operators, ISPs, cloud providers and others are actively promoting IPv6 upgrades to address IPv4 address exhaustion.
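
The address-space sizes follow directly from the address lengths. The short Python sketch below works them out and cross-checks them against the standard `ipaddress` module. The world population figure is an assumption added for illustration (roughly the 2022 value); the per-person ratio shifts with whatever figure is used.

```python
import ipaddress

# Size of each address space, derived from the address length in bits.
ipv4_total = 2 ** 32
ipv6_total = 2 ** 128

# Cross-check against the standard library's view of the full address ranges.
assert ipaddress.ip_network("0.0.0.0/0").num_addresses == ipv4_total
assert ipaddress.ip_network("::/0").num_addresses == ipv6_total

# Assumption for illustration only: roughly the 2022 world population.
WORLD_POPULATION = 7_900_000_000

addresses_per_person = ipv4_total / WORLD_POPULATION

print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:.3e}")
print(f"IPv4 addresses per person: {addresses_per_person:.2f}")
```

Even before accounting for reserved and private ranges, IPv4 cannot give every person one address, let alone every device.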

During the transition from IPv4 to IPv6, networks generally adopt IPv4/IPv6 dual-stack mode, which assigns each terminal both an IPv4 address and an IPv6 address. If the peer is an IPv4 server, the terminal communicates over IPv4; if the peer is an IPv6 server, it communicates over IPv6.
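
The dual-stack rule can be sketched in a few lines of Python: the terminal holds one address of each family and uses the one that matches the peer. The function name `choose_source` and all addresses (taken from the reserved documentation ranges) are hypothetical and only illustrate the idea; real terminals apply a more detailed address selection policy.

```python
import ipaddress

# Hypothetical addresses assigned to one dual-stack terminal (documentation ranges).
TERMINAL_V4 = "192.0.2.10"
TERMINAL_V6 = "2001:db8::10"


def choose_source(peer: str) -> str:
    """Return the terminal address whose family matches the peer's address."""
    peer_addr = ipaddress.ip_address(peer)
    return TERMINAL_V6 if peer_addr.version == 6 else TERMINAL_V4


print(choose_source("198.51.100.7"))   # IPv4 peer: the IPv4 address is used
print(choose_source("2001:db8:1::7"))  # IPv6 peer: the IPv6 address is used
```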

However, dual-stack mode requires every device in the mobile network to support both the IPv4 and IPv6 protocol stacks. This makes upgrades slow and maintenance expensive, and it does not fundamentally solve the IPv4 address shortage, because every terminal still needs an IPv4 address.

Against this background, and in order to maximize IPv6 utilization, DoCoMo planned to introduce IPv6 single-stack mode in its mobile network in the spring of 2022.
IPv6 single stack, also called pure IPv6 or IPv6-only, means that terminals are allocated only IPv6 addresses for mobile data communication.

However, the IPv6 and IPv4 protocols are incompatible. As shown in the figure above, after the upgrade to an IPv6 single-stack mobile network, the network allocates only IPv6 addresses to the terminal. If the target server the terminal wants to reach is still IPv4-only, the traffic must pass through a server or switching device in DoCoMo's mobile network that performs address translation and converts the IPv6 address into an IPv4 address, so that the terminal can reach a target node that has only an IPv4 address.

For this purpose, DoCoMo adopts two IP address translation methods: DNS64/NAT64 and 464XLAT. The former lets native IPv6 users connect to IPv4 servers, and the latter allows IPv4-only applications on the terminal to communicate over an IPv6 single-stack network.
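
The address mapping behind DNS64/NAT64 can be sketched as follows. DNS64 synthesizes an IPv6 address by embedding the server's IPv4 address in the low 32 bits of a /96 translation prefix, and the NAT64 gateway later recovers the IPv4 destination from it. The sketch assumes the well-known prefix 64:ff9b::/96 from RFC 6052; the source does not state which prefix DoCoMo uses, and the function names are hypothetical.

```python
import ipaddress

# Assumption: the RFC 6052 well-known prefix; an operator may use its own prefix.
NAT64_PREFIX = ipaddress.ip_network("64:ff9b::/96")


def synthesize_aaaa(ipv4: str) -> ipaddress.IPv6Address:
    """DNS64 step: embed an IPv4 address in the low 32 bits of the prefix."""
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(NAT64_PREFIX.network_address) | int(v4))


def extract_ipv4(ipv6: str) -> ipaddress.IPv4Address:
    """NAT64 step: recover the IPv4 destination from a synthesized address."""
    v6 = ipaddress.IPv6Address(ipv6)
    if v6 not in NAT64_PREFIX:
        raise ValueError(f"{v6} is not inside {NAT64_PREFIX}")
    return ipaddress.IPv4Address(int(v6) & 0xFFFFFFFF)


synthesized = synthesize_aaaa("192.0.2.33")
print(synthesized)                     # 64:ff9b::c000:221
print(extract_ipv4(str(synthesized)))  # 192.0.2.33
```

464XLAT reuses the same kind of mapping: a translator on the terminal side turns the application's IPv4 packets into IPv6 packets, and the network-side NAT64 translates them back to IPv4.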

According to DoCoMo's official notification, this communication failure occurred precisely because the "IPv6 single-stack" mode was introduced into the mobile network: server load increased, and signaling that controls communication with the network was sent to terminals. As a result, some mobile users could not connect to the ISP service (sp-mode) or to IMS (VoLTE), so neither data communication nor voice calls were possible. The operator then took the emergency measure of falling back to "IPv4/IPv6 dual-stack mode", and service returned to normal after about five hours.