3 good practices for network observability

2021.06.28
Network observability is an important way to develop network intelligence, but many network teams do not practice it.

Observing the network may significantly increase the success rate of enterprise network operations (NetOps). Enterprise teams can take several steps to achieve network observability. Doing so will enable network administrators to better understand their networks and ensure that they provide adequate services to their end users.





What is network observability?
When a team monitors the network, it tracks network performance, and monitoring lets administrators detect problems when they occur. Teams can resolve network problems through management and monitoring alone, but network observability provides a more thorough assessment. When a team observes the network, its goal is to understand how problems occur, how to correct them, and how to improve the network to prevent future errors.

Göran Edin, chief technology officer of software engineering consulting firm Data Ductus, said in a recent webinar that network observability can also be defined as solving the problem of reconstructing the state variables of the end-user experience, in the shortest possible time, by comparing measurements to state variables.

Edin's definition adapts Rudolf Kalman's definition of observability for control systems. Edin lists the following principles that companies can use to make their network services observable:

Measure the end user experience;
Use telemetry methods to collect data;
Provide service assurance so that customers receive high-quality services.
1. Focus on the end user experience
Research shows that measuring the end-user experience pays off. According to Enterprise Management Associates' 2020 study of network management megatrends, one-third of IT problems are reported by end users before the NetOps team detects them. Among the companies surveyed, those that measure and monitor the end-user experience are more successful in their operations.

Although these statistics emphasize the importance of network monitoring, observing the end user experience can provide more valuable information on how to improve the network. Edin said that monitoring the network only allows the team to collect information about the network, which is "not enough."

Network experts should observe the network to gain insights and create data-driven systems to make decisions that are most suitable for network development. As more and more applications migrate to the cloud, or evolve into complex distributed systems, companies investing in observability systems based on end-user experience can simplify NetOps management. Edin said that ideally, the system should be able to predict potential problems, simulate scenarios, and recommend network improvements.
2. Use telemetry for NetOps
Network professionals need to collect enough data to build systems that make their network services observable. To do so, they must use the most relevant telemetry methods. There are many telemetry methods, but the most relevant types for network monitoring are configuration data, synthetic data, and device telemetry.

Configuration data is the data network administrators select to represent operational intent. Discovering operational intent is a step toward intent-based networking, which helps network professionals understand how their network behaves. Edin said that, in his experience, it is difficult for network professionals to monitor end-user services without knowing the operational intent.
Synthetic data comes from tests that use synthetic traffic to simulate the end-user experience. Edin said this is the closest teams can get to the actual end-user experience. Imitating user interaction lets administrators evaluate how users interact with the network.
Device telemetry consists of the metrics administrators use to check network status. According to Edin, this form of telemetry is especially valuable when combined with synthetic data, because together they can help determine the root cause of a problem.
Although these methods are useful for collecting data, they are mainly used to monitor the network. They become more relevant when teams want to provide service assurance, because the data can be used to determine whether the network and its services are working properly.
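As a minimal sketch of how the three telemetry types could work together, the Python example below compares synthetic probe results against an operational intent taken from configuration, then consults device telemetry to suggest where a problem may sit. Every name here (`ServiceIntent`, `evaluate_service`), the metric fields, and the thresholds are hypothetical choices made for illustration. None of them come from Edin's talk or from any specific product.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class ServiceIntent:
    """Operational intent for one service (hypothetical fields)."""
    name: str
    max_latency_ms: float   # latency the service is expected to stay under
    devices: list           # names of the devices on the service path


def evaluate_service(intent, probe_latencies_ms, device_metrics):
    """Compare synthetic probe results with intent, then look for suspect devices.

    probe_latencies_ms: latencies from synthetic test traffic; None marks a lost probe.
    device_metrics: {device: {"cpu_pct": float, "if_errors": int}} from device telemetry.
    """
    if not probe_latencies_ms:
        raise ValueError("at least one probe result is required")

    answered = [x for x in probe_latencies_ms if x is not None]
    lost = len(probe_latencies_ms) - len(answered)
    loss_pct = 100.0 * lost / len(probe_latencies_ms)
    typical = median(answered) if answered else None

    # The service is degraded if probes were lost or the median latency breaks intent.
    degraded = typical is None or typical > intent.max_latency_ms or loss_pct > 0

    suspects = []
    if degraded:
        for dev in intent.devices:
            m = device_metrics.get(dev, {})
            # Assumed thresholds: CPU above 90 percent or any interface errors.
            if m.get("cpu_pct", 0) > 90 or m.get("if_errors", 0) > 0:
                suspects.append(dev)

    return {"service": intent.name, "degraded": degraded,
            "median_ms": typical, "loss_pct": loss_pct, "suspects": suspects}


intent = ServiceIntent("branch-vpn", max_latency_ms=50.0, devices=["edge-1", "core-1"])
report = evaluate_service(
    intent,
    probe_latencies_ms=[20.0, 80.0, None, 95.0],
    device_metrics={"edge-1": {"cpu_pct": 35.0, "if_errors": 0},
                    "core-1": {"cpu_pct": 97.0, "if_errors": 0}},
)
print(report)
```

In this made-up run the synthetic probes show that users would see a degraded service, and the device metrics point to `core-1` as the place to start looking. The probes alone reveal the symptom, and the device data narrows down the cause.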

To collect high-quality data for network observability, the network team must ensure that the data is relevant, consistent, accessible, and well-defined. With high-quality data, the team can identify which services in the network are working well, which need improvement, and how to deploy any modifications.

3. Provide service assurance
Edin said that network observability is part of the service assurance process. He added that when teams build an observability platform or system on the telemetry methods used to monitor the network, they should also prepare a "data preprocessing layer" that can "clean up" the collected data. This cleaning step ensures that the data used in the observability platform is of high quality.
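The article does not describe what such a preprocessing layer does internally, so the following Python sketch is only one plausible reading. It drops incomplete records, converts latency values to one unit, normalizes timestamps to UTC, and removes duplicates. The record layout, the field names, and the rule that timestamps without a time zone are treated as UTC are all assumptions made for this example.

```python
from datetime import datetime, timezone

# Hypothetical record layout for a latency-style measurement.
REQUIRED_FIELDS = ("device", "metric", "value", "unit", "timestamp")
# Conversion factors to milliseconds (assumed set of accepted units).
TO_MS = {"ms": 1.0, "s": 1000.0, "us": 0.001}


def clean_telemetry(records):
    """Return well-formed, de-duplicated records with values in ms and UTC timestamps."""
    cleaned = {}
    for rec in records:
        # Drop records with missing fields or an unknown unit.
        if any(rec.get(f) is None for f in REQUIRED_FIELDS):
            continue
        if rec["unit"] not in TO_MS:
            continue
        try:
            value_ms = float(rec["value"]) * TO_MS[rec["unit"]]
            ts = datetime.fromisoformat(rec["timestamp"])
        except (TypeError, ValueError):
            continue  # non-numeric value or unparsable timestamp

        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive timestamps are UTC
        ts = ts.astimezone(timezone.utc)

        # Keep only the first record seen for each (device, metric, timestamp).
        key = (rec["device"], rec["metric"], ts)
        cleaned.setdefault(key, {"device": rec["device"], "metric": rec["metric"],
                                 "value_ms": value_ms, "timestamp": ts})

    return sorted(cleaned.values(),
                  key=lambda r: (r["timestamp"], r["device"], r["metric"]))
```

Keeping the cleaning rules in one small function makes them easy to review and test, which matters because everything downstream in an observability platform inherits whatever this layer lets through.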

Network teams with software capabilities can create their own data preprocessing layer or other service assurance systems. They also have the opportunity to use 5G to virtualize the infrastructure and run test agents to confirm whether the network's high-performance services are running. Nonetheless, the observability platform must ultimately generate relevant data that helps the team understand its network and provide service assurance to customers.
Edin said that service assurance should also be part of the entire service life cycle.

He said that doing so not only eliminates the risk of introducing errors through manual processing but also shortens delivery time from weeks or months to a few days, adding that speeding up the process also reduces labor costs.

Integrating observability and DevOps
The network team can also follow the steps Edin outlined to incorporate service assurance into the DevOps process. First, the team should measure the end-user experience. It can then identify the questions about its network that need answers. How easily those questions can be answered also helps indicate how observable the network is.

Network professionals should use the best telemetry methods to gain insight into their network services and build their systems. Edin recommends that teams start with configuration data to determine operational intent.

He advised teams to make sure they have a source of truth that shows and tells them what services they have.

Next, he suggested that the team use device telemetry and synthetic telemetry to build a coherent picture of the end-user experience and to check whether system resources are performing as expected. If needed, the team can add other telemetry methods.
Finally, service assurance should be integrated into network automation. The entire process should be executed, reviewed, and repeated as many times as needed.
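One way to picture service assurance inside an automated workflow is a loop that applies a change, verifies it with a test, and rolls back and retries if the test fails. The Python sketch below is an illustration under stated assumptions, not a description of Edin's process. The function name `apply_with_assurance` and its three callback parameters are hypothetical, and a real pipeline would replace the callbacks with calls into its own automation and test tooling.

```python
def apply_with_assurance(apply_change, run_assurance_test, roll_back, max_attempts=3):
    """Apply a change, verify it, and roll back when verification fails.

    apply_change: callable that deploys the change.
    run_assurance_test: callable returning True when the service behaves as intended.
    roll_back: callable that restores the previous state.
    Returns the attempt number on which the test passed.
    """
    for attempt in range(1, max_attempts + 1):
        apply_change()                 # execute
        if run_assurance_test():       # review
            return attempt
        roll_back()                    # undo before repeating
    raise RuntimeError(f"assurance test failed after {max_attempts} attempts")


# Usage with stand-in callbacks that only record what happened.
log = []
outcomes = iter([False, True])  # the first test fails, the second passes
attempts = apply_with_assurance(
    apply_change=lambda: log.append("apply"),
    run_assurance_test=lambda: next(outcomes),
    roll_back=lambda: log.append("rollback"),
)
print(attempts, log)
```

Passing the three steps in as callables keeps the loop independent of any particular automation tool, so the same execute, review, and repeat pattern can wrap different kinds of changes.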

As NetOps becomes more automated and new services are developed, the team may change the behavior of its network and, with it, the end-user experience. Edin said that service assurance, along with the other steps in the service life cycle, can reduce this risk through network observability.


【Editor in charge: Zhao Ningning TEL: (010) 68476606】