Five-Minute Technology Talk | Semantic Communication Technology Supports Safe Village Construction

2023.07.26

The Central Committee of the Communist Party of China and the State Council have issued successive No. 1 Central Documents to comprehensively promote rural revitalization. Building safe villages is an important part of implementing the rural revitalization strategy: it promotes the modernization, informatization, and networking of rural governance and improves villagers' sense of security and happiness.

Part 01

Semantic Communication Technology

The rapid spread of Safe Village services and high-definition cameras has brought a "sense of security" to users' lives, but it also brings challenges: massive numbers of access terminals, continuously rising bit rates, and increasingly complex scenarios. In the traditional framework, the encoding optimization path of trading computational complexity for compression ratio yields ever smaller bit-rate reductions and is showing signs of a bottleneck. At the same time, the capacity of communication channels is approaching its limit, making it difficult to meet the transmission, storage, and analysis needs of rapidly growing volumes of video data.

The human brain achieves ultra-high image and video compression performance. The mechanism is that the visual cortex performs functions such as edge detection, shape recognition, and motion recognition, while the inferior temporal lobe recognizes complex objects and faces; in other words, the brain extracts structured semantic information. Traditional image and video communication uses pixels as the representation unit, which does not exploit the structural characteristics inherent in natural images, such as symmetry, repetition, and correlation, and therefore struggles to improve representation efficiency substantially.

Exploring video semantic representation models that draw on the human brain's visual perception and cognitive mechanisms, with the help of artificial intelligence, can improve representation efficiency to a certain extent. Semantic communication draws on the mechanism behind the human brain's ultra-high image and video compression performance, breaks through the existing theoretical framework, and integrates the brain's visual perception and cognitive mechanisms into the communication process to achieve efficient semantic representation and clear, fluent video at extremely low bit rates.

This work studies semantics-based multimedia communication technology, aiming to achieve high-quality, low-bandwidth, and low-storage multimedia semantic communication in network-constrained scenarios and to promote the verification and application of the resulting technology in safe villages. The technical indicators and application scale have reached a leading level at home and abroad. Unlike traditional video compression, which uses pixels as units, semantic communication extracts semantic information from images to achieve efficient compression. Even with limited resources at the encoding end, it achieves efficient and accurate semantic representation and accurate image reconstruction at the receiving end.

- Semantic communication codec technology

Semantic communication coding and decoding technology establishes a shared prior knowledge base for the scene and task, and links semantic extraction of the target at the encoding end with target generation at the decoding end. The encoding end detects the target in each video frame based on the prior knowledge, extracts its semantics, converts them into a sketch image, and encodes and transmits that image. The decoding end generates the target from the knowledge base and the sketch image, then fuses it with the background image to reconstruct the video. Through the compact feature representation and efficient feature retrieval of joint video semantic coding, fast retrieval over massive video collections is achieved for security and other business scenarios.
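
The encode, transmit, and decode flow described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the system's actual codec: the shared knowledge base is reduced to a known static background plus one gray level per target class, the "detector" is simple background differencing, and the names `BACKGROUND`, `KNOWLEDGE_BASE`, `SemanticPacket`, `detect_target`, `encode`, and `decode` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical shared prior knowledge base: both ends know the static
# background and a typical gray level for each target class.
BACKGROUND = [[10] * 8 for _ in range(6)]
KNOWLEDGE_BASE = {"person": 200}


@dataclass
class SemanticPacket:
    label: str    # target class, used to look up the knowledge base
    sketch: list  # sorted (row, col) outline pixels of the target


def detect_target(frame, threshold=50):
    """Toy detector: pixels that differ strongly from the known background."""
    return {
        (r, c)
        for r, row in enumerate(frame)
        for c, value in enumerate(row)
        if abs(value - BACKGROUND[r][c]) > threshold
    }


def encode(frame, label="person"):
    """Encoding end: detect the target and keep only its outline as a sketch."""
    mask = detect_target(frame)
    outline = [
        (r, c)
        for (r, c) in mask
        if any(n not in mask for n in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)))
    ]
    return SemanticPacket(label, sorted(outline))


def decode(packet):
    """Decoding end: regenerate the target from the sketch and the knowledge
    base, then fuse it with the background. Assumes a row-convex target."""
    frame = [row[:] for row in BACKGROUND]
    spans = {}
    for r, c in packet.sketch:
        lo, hi = spans.get(r, (c, c))
        spans[r] = (min(lo, c), max(hi, c))
    for r, (lo, hi) in spans.items():
        for c in range(lo, hi + 1):
            frame[r][c] = KNOWLEDGE_BASE[packet.label]
    return frame


# Usage: a 3x4 bright rectangle standing in for a detected person.
frame = [row[:] for row in BACKGROUND]
for r in range(1, 4):
    for c in range(2, 6):
        frame[r][c] = 190

packet = encode(frame)
restored = decode(packet)
```

Only the class label and the outline coordinates cross the channel; the appearance of the target comes from the knowledge base at the receiver, so the restored frame matches the original in shape and position rather than pixel for pixel.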

Feature retrieval over massive video collections places high demands on performance. To keep video retrieval fast and accurate, a joint optimization scheme of video coding and compact feature representation is proposed to obtain more compact feature descriptors, and a tree-shaped index structure is constructed with reinforcement learning to improve retrieval efficiency while preserving accuracy.
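
To make the idea of a tree-shaped index over compact descriptors concrete, here is a minimal sketch. It deliberately swaps in a fixed rule (branch on the leading bits of a binary descriptor) in place of the reinforcement-learning-built structure the text describes, so it shows the shape of the data structure only. `TreeIndex`, `Node`, `hamming`, and the 8-bit descriptors are assumptions made for illustration.

```python
def hamming(a, b):
    """Number of differing bits between two integer-coded descriptors."""
    return bin(a ^ b).count("1")


class Node:
    def __init__(self):
        self.children = {}  # bit value (0 or 1) -> child Node
        self.items = []     # (video_id, code) pairs, filled only at leaves


class TreeIndex:
    """Binary tree over fixed-length binary descriptors.

    Level i branches on the i-th most significant bit. This fixed split rule
    is a stand-in for the learned structure described in the text.
    """

    def __init__(self, n_bits, depth):
        if not 0 < depth <= n_bits:
            raise ValueError("depth must be between 1 and n_bits")
        self.n_bits, self.depth = n_bits, depth
        self.root = Node()

    def _path(self, code):
        return [(code >> (self.n_bits - 1 - i)) & 1 for i in range(self.depth)]

    def add(self, video_id, code):
        node = self.root
        for bit in self._path(code):
            node = node.children.setdefault(bit, Node())
        node.items.append((video_id, code))

    def query(self, code, k=1):
        """Rank only the descriptors stored in the leaf the query falls into."""
        node = self.root
        for bit in self._path(code):
            node = node.children.get(bit)
            if node is None:
                return []
        return sorted(node.items, key=lambda item: hamming(item[1], code))[:k]


index = TreeIndex(n_bits=8, depth=3)
index.add("clip_a", 0b10110010)
index.add("clip_b", 0b10111111)
index.add("clip_c", 0b01000000)
nearest = index.query(0b10110011, k=1)
```

A query compares against one leaf instead of the whole collection, which is where the speed-up comes from. The cost is that a close descriptor whose leading bits differ is never examined; balancing that loss of accuracy against speed is presumably what a learned split policy is meant to handle.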

- Key technology of video semantic communication QoE measurement

Current quality of experience (QoE) research optimizes the experience of multimedia content by studying how objective video factors such as resolution, freeze time, frame rate, and bit rate affect users' subjective experience. However, these studies focus on the objective characteristics of video and cannot effectively reflect the impact of semantic information on user experience. A QoE evaluation method based on semantic factors is therefore proposed, and an evaluation-feedback mechanism oriented to semantic communication is established.

For QoE evaluation of general scenes in the semantic communication system, the average key point distance, the key point missing rate, and the average Euclidean distance are used as semantic influencing factors. They are combined with traditional QoS factors (start-up time, buffer ratio, and average media bit rate) and objective factors such as video resolution, frame rate, and bit rate.
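
As a rough illustration of how two of these semantic factors could be computed, the sketch below compares key points detected in the original frame with those in the reconstructed frame. The article does not give formulas, so the definitions here are assumptions: the average key point distance is taken as the mean Euclidean distance over key points present in both frames, and the missing rate as the fraction of reference key points absent from the reconstruction. The scoring function `qoe_score`, its multiplicative form, and `distance_scale` are purely hypothetical.

```python
import math


def semantic_factors(reference, reconstructed):
    """Return (average key point distance, key point missing rate).

    Both arguments map key point names to (x, y) pixel coordinates; a name
    absent from `reconstructed` counts as a missing key point.
    """
    if not reference:
        raise ValueError("reference must contain at least one key point")
    matched = [name for name in reference if name in reconstructed]
    missing_rate = 1 - len(matched) / len(reference)
    if not matched:
        return math.inf, missing_rate
    total = sum(math.dist(reference[n], reconstructed[n]) for n in matched)
    return total / len(matched), missing_rate


def qoe_score(avg_distance, missing_rate, buffer_ratio, distance_scale=10.0):
    """Illustrative 0-5 score; the multiplicative form and distance_scale
    are assumptions, not the article's model."""
    semantic = (1 - missing_rate) * math.exp(-avg_distance / distance_scale)
    return 5 * semantic * (1 - buffer_ratio)


reference = {"nose": (0, 0), "left_eye": (10, 10),
             "right_eye": (20, 10), "chin": (15, 30)}
reconstructed = {"nose": (3, 4), "left_eye": (10, 10), "right_eye": (20, 11)}

avg_distance, missing_rate = semantic_factors(reference, reconstructed)
score = qoe_score(avg_distance, missing_rate, buffer_ratio=0.1)
```

In this example three of four key points survive, with offsets of 5, 0, and 1 pixels, so the average distance is 2 pixels and the missing rate is 25%. The buffer ratio stands in for the traditional QoS factors that the semantic terms are combined with.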

Once the QoE of the semantic communication video has been calculated, this indicator is fed back to adjust and optimize the entire semantic communication system. The semantic QoE indicators and the feedback adjustment mechanism are designed around the characteristics and process of semantic communication. Semantic factors are added to subjective QoE prediction so that the prediction model's output is close to real user ratings, while the objective QoE indicators are designed at three levels: pixel, location, and timing. Feedback adjustment is driven by the QoE results calculated in the cloud and on the client. Key point offset, frame rate drops, contour distortion, or timing instability indicate that the video reconstruction quality is low at that moment. The system then enables contour constraints, adjusts the transmission bit rate, increases the number of key points, and adjusts the codec model to optimize the system and meet user needs.
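
The feedback loop above can be pictured as a small rule table that maps each observed degradation to one of the listed adjustments. The sketch below is a hedged illustration: the text names the symptoms and the actions but does not say which action answers which symptom, so the pairing, the thresholds in `LIMITS`, the `CodecConfig` fields, and the choice to lower the bit rate on a frame rate drop are all assumptions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CodecConfig:
    num_keypoints: int = 17
    bitrate_kbps: int = 64
    contour_constraint: bool = False
    model_retune: bool = False


# Hypothetical thresholds; real values would come from QoE calibration.
LIMITS = {"keypoint_offset": 5.0, "contour_error": 0.2,
          "min_frame_rate": 20.0, "timing_jitter": 0.1}


def adjust(config, report):
    """Map degradations found in a QoE report to codec adjustments."""
    if report["keypoint_offset"] > LIMITS["keypoint_offset"]:
        config = replace(config, num_keypoints=config.num_keypoints * 2)
    if report["contour_error"] > LIMITS["contour_error"]:
        config = replace(config, contour_constraint=True)
    if report["frame_rate"] < LIMITS["min_frame_rate"]:
        # Assumes the drop is caused by congestion, so the bit rate is lowered.
        config = replace(config, bitrate_kbps=max(16, config.bitrate_kbps // 2))
    if report["timing_jitter"] > LIMITS["timing_jitter"]:
        config = replace(config, model_retune=True)
    return config


# Usage: key points drift and contours distort, but frame rate and timing hold.
report = {"keypoint_offset": 8.0, "contour_error": 0.35,
          "frame_rate": 25.0, "timing_jitter": 0.02}
new_config = adjust(CodecConfig(), report)
```

Keeping the configuration immutable and returning a new one on each pass makes every adjustment easy to log and compare against the QoE measured in the next round.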

Part 02

Conclusion

Compared with the mainstream H.265 codec, video transmission based on semantic communication reduces the average bit rate by more than 80% at equivalent subjective quality, and reduces computational and storage overheads by more than 50%. To promote the application of multimedia semantic communication technology in safe villages, a digital village demonstration application platform was built in Fumin Village, Nantong City, Jiangsu Province. The platform verified the application of multimedia semantic communication in four safe-village scenarios, as well as the feedback and evaluation effect of semantic communication QoE. By using scene detection to exploit the strong semantic consistency of static scenes, the technology is expected to save more than 60% of cloud storage and bandwidth in safe-village scenarios, or about 750 million yuan per year.
