AI in telecom research: classifying network outages
AI will transform the future of telecom networks. In this series, we will look further into our use of AI methods in telecom research, and how the idea of Self Driving Networks can be realised.
Artificial Intelligence is essential in future telecom networks for many reasons. It can be used to increase the efficiency and automation of routine tasks, enable predictive maintenance, enhance the quality of service, and meet the need for self-driving networks. As the industry progresses to 6G, AI is expected to play a much more dominant role.
At CRNA, we use AI to analyze the vast amount of telemetry data available in telecom networks. In the future, we can envisage networks repairing themselves based on analysis of data and aggregated experience of how to react to a specific incident, so-called "self-driving networks". This is important if the network carries essential traffic and must be repaired quickly, even at night when expert network engineers may have limited availability, or when something unforeseen happens. To realise this vision, a first step is to classify what actually happened in the network, which is always the best place to start before remedies are suggested.
Analyzing network outages
Network outages, both short and long, occur occasionally across all providers and are much more common than one might expect. In the latest study on this subject from CRNA, two years of outages from a small global network for high-quality services have been analysed using AI. The AI has been trained to learn from the manual Root Cause Analysis done by the company’s Network Operation Centre (NOC).
Customer service in the company used telemetry visualization tools that required a high level of experience to interpret. The NOC can suffer from large numbers of false positive alerts and false negatives, or from an overwhelming amount of log messages and alerts during outage events. As a result, customer service may be unable to respond to a customer case because they don’t have enough information and the NOC is busy troubleshooting.
Reducing the effort to identify and repair outages
With this AI-based system, we can speed up troubleshooting, quickly create tickets with the providers, and improve customer satisfaction by providing fast and accurate feedback to all support cases.
In the data that was collected, more than 700,000 alerts were related to various outage issues. Of these, the vast majority were automatically resolved, leaving 2855 outages that had an impact on customers. Problems in Layer 2 (the data link layer of the network) were classified with 99% accuracy, and the other problems with 40%–100% accuracy.
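Per-class figures like these can be obtained by comparing the model's predicted root cause with the label assigned by the NOC, one class at a time. The following is a minimal sketch in Python, not the evaluation code from the paper: the function name, the class names and the sample labels are all hypothetical, and "accuracy" is taken here to mean the fraction of outages of each class that were labelled correctly.

```python
from collections import defaultdict


def per_class_accuracy(true_labels, predicted_labels):
    """Return {class: fraction of that class's outages predicted correctly}."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, predicted in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == predicted:
            correct[truth] += 1
    return {cls: correct[cls] / total[cls] for cls in total}


# Illustrative data only: these labels are made up, not taken from the study.
noc_labels = ["layer2", "layer2", "layer2", "routing", "routing"]
model_labels = ["layer2", "layer2", "layer2", "routing", "layer2"]

accuracy = per_class_accuracy(noc_labels, model_labels)
for cls, value in accuracy.items():
    print(f"{cls}: {value:.0%}")
```

Reporting the score per class rather than as a single overall number makes it visible when a model does well on a frequent class but poorly on a rare one.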
The machine learning model has been trained on historical outage data with known root causes. The model can then quickly predict the cause and severity of a large number of alerts and initiate troubleshooting processes, if required. This approach significantly reduces the effort spent on non-issues and shortens the Time to Repair (TTR) for identified problems. In addition, the generation of the required Reason for Outage (RFO) report has been automated.
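To illustrate the general idea of learning from historical outages with known root causes, here is a minimal sketch in Python. It is an assumption-laden toy, not the model used in the paper: it uses a small categorical naive Bayes classifier chosen for brevity, and the class name, the feature names (`layer`, `alert_type`) and the root-cause labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesAlertClassifier:
    """Categorical naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, alerts, root_causes):
        self.class_counts = Counter(root_causes)
        # feature_counts[cause][feature_name][value] -> number of occurrences
        self.feature_counts = defaultdict(lambda: defaultdict(Counter))
        self.feature_values = defaultdict(set)
        for alert, cause in zip(alerts, root_causes):
            for name, value in alert.items():
                self.feature_counts[cause][name][value] += 1
                self.feature_values[name].add(value)
        return self

    def predict(self, alert):
        total = sum(self.class_counts.values())
        best_cause, best_score = None, -math.inf
        for cause, n in self.class_counts.items():
            score = math.log(n / total)  # log prior of this root cause
            for name, value in alert.items():
                count = self.feature_counts[cause][name][value]
                n_values = len(self.feature_values[name]) + 1  # +1 for unseen values
                score += math.log((count + 1) / (n + n_values))
            if score > best_score:
                best_cause, best_score = cause, score
        return best_cause


# Hypothetical training data standing in for outages labelled by the NOC.
history = [
    {"layer": "L2", "alert_type": "link_down"},
    {"layer": "L2", "alert_type": "link_flap"},
    {"layer": "L3", "alert_type": "bgp_session_down"},
    {"layer": "L3", "alert_type": "high_latency"},
]
causes = ["provider_circuit_fault", "provider_circuit_fault",
          "routing_issue", "routing_issue"]

model = NaiveBayesAlertClassifier().fit(history, causes)
print(model.predict({"layer": "L2", "alert_type": "link_down"}))
```

In a real deployment the features would come from the telemetry and alert streams, and the predicted cause would feed the ticketing and troubleshooting steps described above.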
In our paper, we show how automatic classification of errors in a Wide Area Network (WAN) achieves very high confidence by learning from human classification. This mechanism is a key part of realising the vision of self-driving networks in the future.
In the next blog post on AI in telecom research, we’ll look further into our research on Self Driving Cellular Networks and how we believe the idea can be realised.
For more details, read our paper: Jan Marius Evang, Azza H. Ahmed, Ahmed Elmokashfi, and Haakon Bryhni. 2022. Crosslayer network outage classification using machine learning. In Proceedings of the Workshop on Applied Networking Research (ANRW '22). Association for Computing Machinery, New York, NY, USA, Article 2, 1–7. https://doi.org/10.1145/3547115.3547193
(This blog post was written by PhD student Jan Marius Evang, Research Professor Haakon Bryhni, and Maria Normann).