WO2025010581A1 - Attention-based model for wireless digital twin - Google Patents
- Publication number: WO2025010581A1 (application PCT/CN2023/106518)
- Authority: WO (WIPO PCT)
- Prior art keywords: performance data, attention, network, state, action
- Legal status: Pending (assumed; Google has not performed a legal analysis)
Classifications
- H04W24/02—Arrangements for optimising operational condition
- G06N20/00—Machine learning
- G06N3/00—Computing arrangements based on biological models
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- H04L41/147—Network analysis or design for predicting network behaviour
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
- H04L41/16—Arrangements for maintenance, administration or management of data switching networks using machine learning or artificial intelligence
- H04L43/022—Capturing of monitoring data by sampling
- H04L43/04—Processing captured monitoring data, e.g. for logfile generation
- H04W24/08—Testing, supervising or monitoring using real traffic
Detailed description
- Each piece of performance data in a set of performance data may represent an hourly pattern of the network. For example, each piece may be denoted s_i^j, where i indexes the hour and j indexes the day.
- In the following, historical data of 7 days is considered. It should be understood that this is just one example, and the present application is not limited to a particular number of days.
- The state can be considered a vector, and the sizes of a daily state and an hourly state are different. For example, for the daily state of one day, the shape of the state vector is 24*1, where 24 represents the 24 hours in one day; for an hourly state, the shape of the state vector can be, for example, 12*1, considering a sampling interval of 5 minutes (12*5 minutes is one hour).
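As an illustration of this data layout (a minimal sketch; the array names and the use of numpy are assumptions, not part of the disclosure), the multi-scale data could be organized as follows, with the action dimension of 38 taken from the experiment described later:

```python
import numpy as np

DAYS = 7                # length of the state-action history
HOURS_PER_DAY = 24      # daily state: one KPI value per hour -> shape 24*1
SAMPLES_PER_HOUR = 12   # hourly state: 5-minute sampling -> shape 12*1
ACTION_DIM = 38         # illustrative, matching the experiment below

# Daily branch: state-action pairs {(s_1, a_1), ..., (s_7, a_7)}.
daily_states = np.zeros((DAYS, HOURS_PER_DAY))   # s_j, one row per day
daily_actions = np.zeros((DAYS, ACTION_DIM))     # a_j, one row per day

# Hourly branch: s_i^j, the fine-grained state in hour i of day j.
hourly_states = np.zeros((DAYS, HOURS_PER_DAY, SAMPLES_PER_HOUR))

# The latest action a_8, for which the next daily state s_8 is predicted.
next_action = np.zeros(ACTION_DIM)
```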
- FIG. 4 shows an overall model architecture according to an embodiment of this disclosure.
- In particular, a model with an attention mechanism is designed, and two different branches are used to capture patterns at different time scales (e.g., daily and hourly).
- the predicting step 302 as shown in FIG. 3 may comprise applying at least one machine learning model to the plurality of sets of performance data (i.e., daily and hourly patterns of the network) .
- a conventional transformer encoder (branch 1, shown in the upper part of FIG. 4) is utilized to represent the hourly patterns of the KPIs.
- A novel AttenInAtten block (branch 2, shown in the lower part of FIG. 4) is used to learn the behavior of daily patterns.
- The FCN (Fully Connected Network) in FIG. 4 represents the fully connected layer, which implements an affine transform from input to output, i.e., y = Wx + b.
- An example of the conventional transformer encoder is shown in FIG. 5; its details are not discussed in this application.
- The Attention mechanism is a popular artificial neural network block, which contains three types of components: query (Q), key (K), and value (V).
- the Attention mechanism is the base block of the Transformer model.
- This disclosure further proposes a novel module, “AttenInAtten”, which may have two attention layers.
- the at least one machine learning model comprises a first attention-based model and/or a second attention-based model, each comprising the following input components: query, key, and value.
- In particular, this disclosure proposes to utilize the attention mechanism by using the latest action in the state-action pairs as the query, in order to obtain action sensitivity.
- The similarity between the query and the keys will be computed, and then a weighted summation of the values will be calculated based on this similarity.
- the output of the attention mechanism will be heavily influenced by the action (query) .
- Such a high level of sensitivity toward the parameters can be very useful for network optimization.
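The computation just described matches standard scaled dot-product attention; a minimal numpy sketch (illustrative, not code from the patent) is:

```python
import numpy as np

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V: query-key similarities become the
    weights of a weighted summation of the values."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ V

# Because the latest action (after projection) is the query, changing the
# action changes the weights, and hence the output: action sensitivity.
```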
- FIG. 6 shows a detailed schematic of the AttenInAtten block according to an embodiment of this disclosure.
- The idea of this block is that solving the sub-forecasting tasks may be helpful for solving the final forecasting task.
- Each sub-forecasting task may be represented by a function f_{L+1}(sa_L) → s_{L+1}, where L is a positive integer and sa_L denotes the state-action pairs up to day L.
- For example, if it is known how to predict s_2 from {(s_1, a_1), a_2} (i.e., f_2(s_1, a_1, a_2) → s_2), the model can take advantage of information from the previous sub-forecasting tasks to predict s_8 from {(s_1, a_1), (s_2, a_2), ..., (s_7, a_7), a_8}.
- To this end, this disclosure proposes to use the attention mechanism to represent the sub-forecasting task; this constitutes the first attention layer.
- In each sub-forecasting task, the query is a linear projection of the state predicted in the previous sub-forecasting task, the key is a linear projection of the state-action pairs that are used to predict the next state, and the value is another linear projection of those state-action pairs.
- The outputs (for example, A_2 to A_7 in FIG. 6) of the first attention layer are the representations of the sub-forecasting tasks. These outputs become the inputs for the second attention layer.
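A sketch of one sub-forecasting task represented with attention, following the description above (all shapes, initializations, and the helper's structure are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM, EMBED_DIM = 24, 38, 64  # dims from the experiment below

# Three distinct learned projections (random here; trained in practice).
W_query = rng.normal(size=(STATE_DIM, EMBED_DIM), scale=0.02)
W_key = rng.normal(size=(STATE_DIM + ACTION_DIM, EMBED_DIM), scale=0.02)
W_value = rng.normal(size=(STATE_DIM + ACTION_DIM, EMBED_DIM), scale=0.02)

def sub_task_output(sa_history, prev_predicted_state):
    """First attention layer: produces A_{L+1} for one sub-forecasting task.

    sa_history:           (L, STATE_DIM + ACTION_DIM) state-action pairs
    prev_predicted_state: (STATE_DIM,) state predicted by the previous sub-task
    """
    q = prev_predicted_state @ W_query   # query: projection of predicted state
    K = sa_history @ W_key               # key: one projection of the pairs
    V = sa_history @ W_value             # value: a different projection
    scores = K @ q / np.sqrt(EMBED_DIM)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V                         # A_{L+1}, input to the second layer
```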
- The method 300 shown in FIG. 3 further comprises applying the second attention-based model to at least a first set of performance data (e.g., A_1) in the plurality of sets of performance data to predict a second set of performance data (e.g., A_2).
- The method 300 may further comprise applying the second attention-based model to at least the first set of performance data and the second set of performance data to predict a third set of performance data (e.g., A_3).
- the step of applying the second attention-based model to at least the first set of performance data and the second set of performance data to predict the third set of performance data comprises obtaining an action of the third set of performance data; and predicting a state of the third set of performance data by applying the second attention-based model to the at least the first set of performance data and the second set of performance data, using a linear projection of a predicted state of the second set of performance data as the query.
- the step of predicting the state of the third set of performance data, by applying the second attention-based model to the at least the first set of performance data and the second set of performance data further comprises using a first linear projection of the state-action pairs of the first set of performance data and the second set of performance data as the key and using a second linear projection of the state-action pairs of the first set of performance data and the second set of performance data as the value.
- In the second attention layer, another attention mechanism is adopted: the similarity between the last action (a_{L+1} in FIG. 6, used as the query) and all outputs of the first attention layer (used as keys and values) is computed, and these similarities are then used as the weights to obtain a linear combination of all sub-forecasting task representations.
- The output of the second attention layer (A_8 in FIG. 6) is the feature extracted from all sub-forecasting tasks, which is further used to predict the state of the next day. That is, the final forecasting task may be represented as w_2 A_2 + w_3 A_3 + ... + w_7 A_7 → A_8. It must be noted that the last action (a_{L+1} in FIG. 6) is used as the query for the prediction model, so that the action sensitivity will be substantial.
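Continuing the sketch (same illustrative assumptions), the second attention layer would weight the sub-task outputs A_2..A_7 by their similarity to the projected last action and return their linear combination A_8:

```python
import numpy as np

rng = np.random.default_rng(1)
ACTION_DIM, EMBED_DIM = 38, 64
W_query2 = rng.normal(size=(ACTION_DIM, EMBED_DIM), scale=0.02)

def final_feature(sub_task_outputs, last_action):
    """Second attention layer: A_8 = w_2 A_2 + w_3 A_3 + ... + w_7 A_7.

    sub_task_outputs: (6, EMBED_DIM) stacked A_2..A_7 from the first layer
    last_action:      (ACTION_DIM,)  a_8, the action to evaluate
    """
    q = last_action @ W_query2                        # action as the query
    scores = sub_task_outputs @ q / np.sqrt(EMBED_DIM)
    w = np.exp(scores - scores.max())
    w /= w.sum()                                      # weights w_2..w_7
    return w @ sub_task_outputs                       # A_8, fed to a final FCN
```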
- According to an embodiment of the disclosure, the at least one piece of future performance data comprises a state-action pair, and the method 300 comprises: obtaining an action of the at least one piece of future performance data; and predicting a state of the at least one piece of future performance data by applying the first attention-based model to the plurality of sets of performance data, using a linear projection of the action of the at least one piece of future performance data as the query.
- The state-action pairs from the historical data and the latest action of the network may be used to forecast the future state of the network, e.g., the value of s_8.
- the output of the attention mechanism will be heavily influenced by the action (query) .
- Actions of the network are controlled by, for example, network maintainers, so it is known to the network whether an action has been taken or is to be taken. The models designed according to this disclosure further exhibit a high level of sensitivity toward the parameters, which can be useful for network optimization.
- the step of predicting the state of the at least one piece of future performance data, by applying the first attention-based model to the plurality of sets of performance data further comprises: using a first linear projection of the state-action pairs of each set of performance data as the key, and using a second linear projection of the state-action pairs of each set of performance data as the value.
- The method 300 as shown in FIG. 3 further comprises a step of applying the second attention-based model to the second set of performance data and the third set of performance data to identify a correlation (e.g., a weight) between the second set of performance data and the third set of performance data.
- the method 300 as shown in FIG. 3 further comprises predicting the at least one piece of future performance data by applying the first attention-based model to the plurality of sets of performance data based on the correlation.
- The KPIs discussed in this disclosure include traffic and throughput of the network; however, the present disclosure can also be applied to other KPIs of the network.
- FIG. 7 depicts an application scenario according to an embodiment of this disclosure.
- The upper part depicts a wireless network system, including a Node B and a Mobile Network Automatic Engine (MAE), i.e., a wireless network management node.
- The lower part depicts the wireless network digital twin 700 according to an embodiment of this disclosure, which may include a fundamental model library and an optimization library, as well as applications such as traffic prediction models, throughput prediction models, and wireless parameter optimization.
- The proposed attention-based modeling algorithm, which serves as a key capability of the fundamental model library in the wireless network digital twin 700, is provided inside the digital twin.
- the wireless network provides data for the wireless network digital twin 700, and the wireless network digital twin 700 provides modeling and optimization capabilities for the wireless network.
- the attention-based model can be considered a part of the wireless digital twin 700 (the product will be a software package) .
- wireless digital twins contain a base model/optimization part and an application part, as shown in FIG. 7.
- The attention-based method will be one of the core capabilities of the base model module; it is used not only for modeling applications but also by the optimization library/applications.
- In a first example, the observed data is a time series with a time scale of one hour, which means that the model gets one throughput measurement each hour. The hourly action data is also obtained.
- The throughput depends on the previous throughput and the action that has been taken or is to be taken; both are therefore used when predicting the throughput.
- the output of the model is the prediction of throughput of the next day, i.e., the eighth day.
- the data of 12 days may be used, and the model can be tested on data from the next 3 days.
- The model used is the one shown in FIG. 4, which includes two different branches: 1) the conventional transformer encoder, and 2) the AttenInAtten model.
- the dimension of throughput/traffic is 24, and the dimension of action is 38.
- the input sequence length is 7
- the embedding_dim of the attention encoder is 64
- the number of heads of multi-head attention in the attention encoder is 6.
- APE represents the Absolute Percentage Error, and APE@0.25 denotes the ratio of predictions whose APE is less than 0.25.
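As a concrete reading of these metrics (a sketch; the patent does not spell out the exact aggregation), APE and APE@0.25 could be computed as:

```python
import numpy as np

def ape(y_pred, y_true):
    """Absolute Percentage Error per prediction: |pred - true| / |true|."""
    return np.abs(y_pred - y_true) / np.abs(y_true)

def ape_at(y_pred, y_true, threshold=0.25):
    """Ratio of predictions whose APE is below the threshold, e.g. APE@0.25."""
    return float(np.mean(ape(y_pred, y_true) < threshold))
```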
- FIG. 8 shows the forecasting result of throughput and the autocorrelation function of the prediction error. It can be seen from FIG. 8 that most of the prediction curves match the ground truth quite well.
- The prediction residual diagnostics show that the residuals are uncorrelated, which means that no more information is left in the residuals that could be used for forecasting. This is further evidence of the good modeling capability of the proposed model.
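A whiteness check of this kind could be performed as follows (a plain numpy sketch; the exact diagnostic used in the patent is not specified):

```python
import numpy as np

def residual_autocorrelation(residuals, max_lag=24):
    """Sample autocorrelation of the forecast residuals at lags 1..max_lag."""
    r = residuals - residuals.mean()
    var = r @ r
    return np.array([(r[:-k] @ r[k:]) / var for k in range(1, max_lag + 1)])

def residuals_look_white(residuals, max_lag=24):
    """True if all autocorrelations stay inside the approximate 95% band
    +/- 1.96/sqrt(N), i.e. no forecastable information is left."""
    band = 1.96 / np.sqrt(len(residuals))
    acf = residual_autocorrelation(residuals, max_lag)
    return bool(np.all(np.abs(acf) < band))
```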
- In a second example, the observed data is again a time series with a time scale of one hour, which means that the model gets one traffic measurement each hour.
- the hourly action data is also obtained.
- The traffic likewise depends on the previous traffic and the action that has been taken or is to be taken; the previous traffic and the action are used for the prediction.
- the output of the model is the traffic prediction of the next day.
- the data of 12 days may be used, and the model can be tested on data from the next 3 days.
- The model is the same as the one shown in FIG. 4.
- FIG. 9 shows the forecasting result of traffic and the autocorrelation function of the prediction error. It can be seen from FIG. 9 that most of the prediction curves match the ground truth quite well. Some traffic curves in the prediction horizon show very different behavior compared with the historical period, and the forecasting result is able to follow the ground truth, which shows that the proposed attention-based model has good parameter sensitivity. When using the model for downstream optimization tasks, parameter sensitivity is a critical feature. Furthermore, the prediction residual diagnostics show that the residuals are uncorrelated, which means that no more information is left in the residuals that could be used for forecasting. This is further evidence of the good modeling capability of the model.
- In summary, this disclosure proposes to build the wireless digital twin using an attention mechanism and a multi-time-scale modeling method. This enables modeling the long-term trend and the short-term fluctuations at the same time. Further, in the wireless digital twin scenario, this disclosure proposes to build a model that also takes the parameter sensitivity into account. By considering the parameter sensitivity when building the model, it becomes more suitable for downstream tasks, like optimizing the parameters of the network.
- any method according to embodiments of the disclosure may be implemented in a computer program, having code means, which when run by processing means causes the processing means to execute the steps of the method.
- the computer program is included in a computer-readable medium of a computer program product.
- the computer-readable medium may comprise essentially any memory, such as a ROM (Read-Only Memory) , a PROM (Programmable Read-Only Memory) , an EPROM (Erasable PROM) , a Flash memory, an EEPROM (Electrically Erasable PROM) , or a hard disk drive.
- Moreover, embodiments of the proposed wireless digital twin 700 and the corresponding computer program product comprise the necessary communication capabilities in the form of, e.g., functions, means, units, elements, etc., for performing the solution.
- Examples of such means, units, elements, and functions are: processors, memory, buffers, control logic, encoders, decoders, rate matchers, de-rate matchers, mapping units, multipliers, decision units, selecting units, switches, interleavers, de-interleavers, modulators, demodulators, inputs, outputs, antennas, amplifiers, receiver units, transmitter units, DSPs, trellis-coded modulation (TCM) encoders, TCM decoders, power supply units, power feeders, communication interfaces, communication protocols, etc., which are suitably arranged together for performing the solution.
- The processor(s) of the wireless digital twin 700 and the corresponding computer program product may comprise, e.g., one or more instances of a Central Processing Unit (CPU), a processing unit, a processing circuit, a processor, an Application Specific Integrated Circuit (ASIC), a microprocessor, or other processing logic that may interpret and execute instructions.
- the expression “processor” may thus represent a processing circuitry comprising a plurality of processing circuits, such as, e.g., any, some or all of the ones mentioned above.
- the processing circuitry may further perform data processing functions for inputting, outputting, and processing of data comprising data buffering and device control functions, such as call processing control, user interface control, or the like.
Abstract
The present disclosure relates to performance prediction in a network. The disclosure proposes a computer-implemented method for performance prediction in a wireless communication network, the method comprising: obtaining historical performance data of the network, wherein the historical performance data comprises a plurality of sets of performance data associated with at least one network performance indicator, wherein each set of performance data is collected during a time period with a first sampling granularity, and wherein each piece of performance data included in the set of performance data is collected within the time period with a second sampling granularity; and predicting at least one piece of future performance data based on the plurality of sets of performance data. This disclosure further proposes a wireless digital twin of a wireless network system.
Description
The present disclosure relates to wireless networks, particularly to network optimization and testing in a self-driving L4-L5 communication network. In order to improve the prediction of the performance of communication networks and thus optimize network designs and configurations, the disclosure proposes a computer-implemented method and a wireless digital twin of a wireless network system.
Wireless network digital twins can be used for network optimization and testing in a self-driving L4-L5 communication network. FIG. 1 illustrates an autonomous driving network-level schematic. A wireless network digital twin simulates different scenarios and configurations and evaluates their impact on network performance and key performance indicators (KPIs) by creating a virtual replica of the physical network.
The use of wireless network digital twins allows for proactive maintenance of communication networks, detecting potential faults and performance degradations before they occur, and minimizing downtime and service disruptions. This is especially important for safety-critical applications like banking and self-driving cars, where a brief interruption of communication can have dire consequences.
In real-world scenarios, wireless network digital twins can predict the performance of communication networks for self-driving vehicles through simulations and models. To have a highly accurate model, there are two challenges: 1) models need to have high parameter (e.g., action) sensitivity, and 2) data from the real network contains a high level of noise.
To build the digital twin of a wireless network system, there are two existing solutions. The first solution proposes a method for throughput prediction for cellular networks. FIG. 2 shows a process of throughput prediction for cellular networks. In this work, multiple machine learning models are trained (e.g., Random Forest (RF), Support Vector Machine (SVM), Multilayer Perceptron (MLP)), and then all the trained models are evaluated based on some criteria (e.g., Absolute Percentage Error (APE) and R-squared). One model will be selected, and the history length and prediction horizon will be decided based on the evaluation result. However, this solution only considers short-term prediction (at most it can predict 12 s ahead), and it fails to take the parameter sensitivity into account.
The second solution proposes a method for network performance forecasting. This method will evaluate how a KPI changes with traffic (e.g., a user may be interested in the impact of rising traffic on the dropped call rate) and then build a regression model to model the relation between one KPI and the others. However, the second solution is limited in that it is based on an assumption and an estimation of traffic growth rate for forecasting in the future, and also no parameter sensitivity is taken into account.
Therefore, an improved solution for building the digital twin for a wireless communication network is desired.
In view of the above-mentioned limitations, the present disclosure aims to build an advanced wireless digital twin for network optimization and testing. In particular, an objective is to have a highly accurate model for the wireless digital twin. Another objective is to enable long-term network performance prediction. Another objective is to build the model with a high level of sensitivity toward the parameters.
These and other objectives are achieved by the solutions of this disclosure as provided in the independent claims. Advantageous implementations are further defined in the dependent claims.
A first aspect of the disclosure provides a computer-implemented method for performance prediction in a wireless communication network. The method comprises: obtaining historical performance data of the network, wherein the historical performance data comprises a plurality of sets of performance data associated with at least one network performance indicator, wherein each set of performance data is collected during
a time period with a first sampling granularity, and wherein each piece of performance data included in the set of performance data is collected within the time period with a second sampling granularity; and predicting at least one piece of future performance data based on the plurality of sets of performance data.
This disclosure proposes an advanced solution for building the wireless digital twin for a wireless communication network. The disclosure enables a long-term prediction, particularly the prediction of long-term KPIs, such as traffic and throughput of the wireless network. In particular, this disclosure proposes to capture patterns of the network in different time scales, i.e., the performance data collected in different sampling granularity. The prediction is performed using data captured from different time scales.
For instance, it can be considered that each set of performance data, which is collected with the first sampling granularity, represents a daily pattern of the network. In this example, the time period is considered a day. The behavior of daily patterns of the network may be learned and modeled. Additionally, hourly patterns of the KPIs are also collected. Each piece of performance data included in the set of performance data, which is collected with the second sampling granularity, then represents an hourly pattern of the network.
In an implementation form of the first aspect, the step of predicting comprises applying at least one machine learning model to the plurality of sets of performance data.
Optionally, the prediction is performed based on machine learning model (s) .
In an implementation form of the first aspect, the at least one machine learning model comprises a first attention-based model and/or a second attention-based model, each comprising the following input components: query, key, and value.
Notably, a popular artificial neural network block, which is based on the Attention mechanism, contains three types of components: query, key, and value. With the Attention mechanism, the similarity between queries and keys will be computed and used as the weight to generate a weighted sum of values. The Attention mechanism is the base block of the Transformer model. This disclosure further proposes a novel module to model the collected data patterns, which may be referred to as “AttenInAtten”. Such an AttenInAtten (Attention in Attention) neural network architecture may have two attention layers. Possibly, each attention layer may be based on one of the first attention-based model and the second attention-based model.
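For reference, the attention computation described above is commonly written as follows (the textbook scaled dot-product formulation, not an equation reproduced from the patent):

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
```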
In an implementation form of the first aspect, each set of performance data comprises state-action pairs indicating the pieces of performance data, and each state is associated with the at least one network performance indicator, and each action is associated with at least one network parameter.
For example, the collected data may be in a state-action pair format, i.e., {(s1, a1), (s2, a2), …, (s7, a7)}, where si is the ith state, and ai is the ith action.
In an implementation form of the first aspect, the at least one piece of future performance data comprises a state-action pair, and the method further comprises: obtaining an action of the at least one piece of future performance data; and predicting a state of the at least one piece of future performance data by applying the first attention-based model to the plurality of sets of performance data, using a linear projection of the action of the at least one piece of future performance data as the query.
In order to be useful for network optimization, the models designed according to this disclosure further exhibit a high level of sensitivity toward the parameters. It may be understood that the state-action pairs from the historical data and the latest action of the network, e.g., {(s1, a1), (s2, a2), …, (s7, a7), a8}, may be used to forecast the future state of the network, e.g., the value of s8. Due to the use of the action as the input “query”, the output of the attention mechanism will be heavily influenced by the action (query). Notably, actions of the network are controlled by, for example, network maintainers. It is considered known to the network whether an action has been taken or is to be taken.
In an implementation form of the first aspect, the step of predicting the state of the at least one piece of future performance data, by applying the first attention-based model to the plurality of sets of performance data, further comprises: using a first linear projection of the state-action pairs of each set of performance data as the key, and using a second linear projection of the state-action pairs of each set of performance data as the value.
Notably, linear projections of the state-action pairs are used as input to the model to predict the next state of the network. Possibly, the linear functions used for generating the linear projections for the input “key” and the input “value” are different.
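Under this implementation form, one consistent reading (the symbols are my own; SA denotes the matrix whose rows are the historical state-action pairs) is:

```latex
q = W_Q\, a_8, \qquad K = SA\, W_K, \qquad V = SA\, W_V, \qquad
w = \mathrm{softmax}\!\left(\frac{K q}{\sqrt{d_k}}\right), \qquad
\hat{s}_8 = \mathrm{FCN}\!\left(w^{\top} V\right)
```

Here W_K and W_V being different matrices reflects that the linear functions generating the key and the value may differ.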
In an implementation form of the first aspect, the method further comprises applying the second attention-based model to at least a first set of performance data in the plurality of sets of performance data to predict a second set of performance data.
Optionally, an embodiment of this disclosure solves the long-term prediction, i.e., the final forecasting task, by first solving several short-term forecasting tasks, which may be named sub-forecasting tasks. For example, with the information about the network patterns on the first day, the network patterns on the second day can be forecasted or predicted. Each sub-forecasting task may be represented using the attention mechanism. This may be considered the first attention layer in the AttenInAtten architecture.
In an implementation form of the first aspect, the method further comprises applying the second attention-based model to at least the first set of performance data and the second set of performance data to predict a third set of performance data.
Following the previous example, once the network patterns on the second day are known (after the sub-forecasting task) , the network patterns on the third day can be forecasted or predicted using the information from the two previous days.
In an implementation form of the first aspect, the step of applying the second attention-based model to at least the first set of performance data and the second set of performance data to predict the third set of performance data comprises: obtaining an action of the third set of performance data; and predicting a state of the third set of performance data by applying the second attention-based model to the at least the first set of performance data and the second set of performance data, using a linear projection of a predicted state of the second set of performance data as the query.
If how to predict s2 from {(s1, a1), a2} is known, s3 can be further forecasted from {(s1, a1), (s2, a2), a3}, and so on. Specifically, the query that is to be input to the second
attention-based model for predicting the state of the third set of performance data is the linear projection of the state to be predicted in the previous sub-forecasting task, i.e., the predicted state of the second set of performance data.
In an implementation form of the first aspect, the step of predicting the state of the third set of performance data, by applying the second attention-based model to the at least the first set of performance data and the second set of performance data, further comprises using a first linear projection of the state-action pairs of the first set of performance data and the second set of performance data as the key, and using a second linear projection of the state-action pairs of the first set of performance data and the second set of performance data as the value.
Similar to the first attention-based model, linear projections of the state-action pairs are also used as input to the second attention-based model to predict states for the sub-forecasting task. Possibly, the linear functions used for generating the linear projections for the input “key” and the input “value” are different.
In an implementation form of the first aspect, the method further comprises applying the second attention-based model to the second set of performance data and the third set of performance data to identify a correlation between the second set of performance data and the third set of performance data.
Possibly, the sub-forecasting tasks may be similar or share some knowledge about how to forecast future states. Such similarity may be determined and used for forecasting the final forecasting task.
In an implementation form of the first aspect, the method further comprises predicting the at least one piece of future performance data by applying the first attention-based model to the plurality of sets of performance data based on the correlation.
This disclosure further proposes to consider the similarity of the sub-forecasting tasks when predicting the final forecasting task. In particular, the output from the sub-forecasting tasks, i.e., the first attention layer, will be input to the first attention-based model, which can be considered as the second attention layer in the AttenInAtten
architecture. It may be understood that the second attention layer receives the outputs from the first attention layer and sums them up with corresponding weights.
In an implementation form of the first aspect, the at least one performance indicator comprises traffic and/or throughput.
For instance, the KPI discussed in this disclosure is traffic or throughput of the network. However, the present disclosure can also be applied to other KPIs of the network.
A second aspect of the disclosure provides a wireless digital twin of a wireless network system, comprising a model configured to implement the method according to the first aspect or any implementation forms of the first aspect.
Accordingly, an embodiment of this disclosure further proposes a wireless digital twin, which is built for predicting the performance of a wireless communication network.
Implementation forms of the wireless digital twin of the second aspect may correspond to the implementation forms of the computer-implemented method of the first aspect described above. The wireless digital twin of the second aspect and its implementation forms achieve the same advantages and effects as described above for the computer-implemented method of the first aspect and its implementation forms.
A third aspect of the disclosure provides a computer program product comprising a program code for carrying out, when implemented on a processor, the method according to the first aspect or any implementation form of the first aspect.
Implementation forms of the computer program product of the third aspect may correspond to the implementation forms of the computer-implemented method of the first aspect described above. The computer program product of the third aspect and its implementation forms achieve the same advantages and effects as described above for the computer-implemented method of the first aspect and its implementation forms.
It has to be noted that all devices, elements, units, and means described in the present application could be implemented in software or hardware elements or any kind of
combination thereof. All steps which are performed by the various entities described in the present application as well as the functionalities described to be performed by the various entities are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by external entities is not reflected in the description of a specific detailed element of that entity that performs that specific step or functionality, it should be clear for a skilled person that these methods and functionalities can be implemented in respective software or hardware elements or any kind of combination thereof.
The above-described aspects and implementation forms of the present disclosure will be explained in the following description of specific embodiments in relation to the enclosed drawings, in which:
FIG. 1 shows an autonomous driving network level schematic;
FIG. 2 shows an exemplary process for throughput prediction for cellular networks;
FIG. 3 shows a computer-implemented method according to an embodiment of the disclosure;
FIG. 4 shows an overall model architecture according to an embodiment of the disclosure;
FIG. 5 shows an exemplary transformer encoder;
FIG. 6 shows a schematic of the AttenInAtten block according to an embodiment of the disclosure;
FIG. 7 shows a wireless network and wireless digital twin diagram according to an embodiment of the disclosure;
FIG. 8 shows a forecasting result according to an embodiment of the disclosure; and
FIG. 9 shows a forecasting result according to an embodiment of the disclosure.
Illustrative embodiments of a computer-implemented method, a wireless digital twin, and a corresponding computer program product for performance prediction in a wireless communication network are described with reference to the figures. Although this description provides a detailed example of possible implementations, it should be noted that the details are intended to be exemplary and in no way limit the scope of the application.
Moreover, an embodiment/example may refer to other embodiments/examples. For example, any description including but not limited to terminology, element, process, explanation and/or technical advantage mentioned in one embodiment/example is applicative to the other embodiments/examples.
In real-world scenarios, wireless network digital twins can predict the performance of communication networks for self-driving vehicles through simulations and models. By optimizing network design, configuration, and maintenance, network operators can ensure reliable and secure communication between vehicles and infrastructure.
The ability to predict KPIs (e.g., traffic and throughput) is key to achieving a wireless digital twin. A comprehensive and realistic model of the physical network is needed to accurately predict KPIs. Base stations, antennas, user equipment, and the propagation environment should be included in this model, as well as their interactions and dependencies.
Typically, various data sources and modeling techniques can be used to build such a model, including network measurement data, traffic models, and machine learning techniques. In particular, network operators can collect data from various sources such as
network probes, drive tests, and user feedback, to characterize network performance and behavior under different conditions. Based on user behavior, application requirements, and network topology, traffic models can predict the amount and type of traffic in the network. With machine learning techniques, complex relationships and dependencies between network components and variables can be analyzed and modeled, and accurate predictions can be made based on historical data.
Wireless digital twins can simulate the physical network using these data sources and modeling techniques. This will enable accurate prediction of key performance indicators such as traffic and throughput. As a result, network operators can optimize network design, configuration, and maintenance. Furthermore, it can ensure reliable and efficient communication across a wide range of applications.
FIG. 3 shows a computer-implemented method 300 for performance prediction in a wireless communication network, according to an embodiment of the disclosure. The method 300 comprises a step 301 of obtaining historical performance data of the network. In particular, the historical performance data comprises a plurality of sets of performance data associated with at least one network performance indicator. Each set of performance data is collected during a time period with a first sampling granularity. Each piece of performance data included in the set of performance data is collected within the time period with a second sampling granularity. The method 300 further comprises a step 302 of predicting at least one piece of future performance data based on the plurality of sets of performance data.
This disclosure proposes an advanced solution for building the wireless digital twin of a wireless communication network. Embodiments of the disclosure enable long-term prediction, particularly the prediction of long-term KPIs, such as the traffic and throughput of the wireless network. A main idea of this disclosure is to predict future data using data captured at different time scales.
For instance, it can be considered that each set of performance data, which is collected with the first sampling granularity, represents a daily pattern of the network. In this example, the time period is considered a day. The behavior of daily patterns of the network may be learned and modeled. Additionally, hourly patterns of the KPIs are also
collected. Each piece of performance data included in the set of performance data, which is collected with the second sampling granularity, represents an hourly pattern of the network.
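Purely for illustration, the two sampling granularities described above might be organized as a nested array. The following sketch uses assumed names and sizes (a 7-day history of hourly pieces) and is not the claimed data format:

```python
import numpy as np

# Hypothetical layout of historical performance data at two time scales;
# the array name daily_sets and the 7-day history are assumptions.
num_days, hours_per_day = 7, 24
rng = np.random.default_rng(0)

# daily_sets[j, i]: the KPI piece for hour i of day j (second granularity);
# daily_sets[j]:    the whole set for day j (first granularity).
daily_sets = rng.random((num_days, hours_per_day))
print(daily_sets.shape, daily_sets[0].shape)  # (7, 24) (24,)
```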
Optionally, according to an embodiment of the disclosure, each set of performance data comprises state-action pairs indicating the pieces of performance data, and each state is associated with the at least one network performance indicator, and each action is associated with at least one network parameter.
For example, the data in a state-action pair format may be represented as (si, ai), where si is the ith state and ai is the ith action. For instance, the plurality of sets of performance data may be represented as {(s1, a1), (s2, a2), …, (si, ai)}. If each set of performance data represents a daily pattern of the network, the state-action pair (s1, a1) represents the data of the first day.
In one example, each piece of performance data of a set of performance data (i.e., daily data) may represent an hourly pattern of the network. Thus, each piece of performance data may be represented as s_i^j, where i represents the hour and j represents the day. In this example, historical data of 7 days is considered. It should be understood that this is just one example and the present application does not limit itself to a particular number of days.
It may be understood that the state can be considered a vector. The sizes of a daily state and an hourly state are different. One example is: for the daily state of one day, the shape of the state vector is 24*1, where 24 represents the 24 hours in one day. For the hourly state of one hour, the shape of the state vector can be, for example, 12*1, if the sampling interval is 5 minutes (12*5 minutes is one hour).
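A minimal sketch of the two state shapes mentioned above (variable names are illustrative only):

```python
import numpy as np

daily_state = np.zeros((24, 1))   # 24 hourly values of one day
hourly_state = np.zeros((12, 1))  # 12 samples at a 5-minute interval
print(daily_state.shape, hourly_state.shape)  # (24, 1) (12, 1)
```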
FIG. 4 shows an overall model architecture according to an embodiment of this disclosure. To solve the long-term prediction problem, a model with an attention mechanism is designed, and two different branches are used to capture patterns at different time scales (e.g., daily and hourly).
Optionally, according to an embodiment of the disclosure, the predicting step 302 as
shown in FIG. 3 may comprise applying at least one machine learning model to the plurality of sets of performance data (i.e., daily and hourly patterns of the network) .
In this particular example, a conventional transformer encoder (branch 1, shown in the upper part of FIG. 4) is utilized to represent the hourly patterns of the KPIs. A novel AttenInAtten block (branch 2, shown in the lower part of FIG. 4) is used to learn the behavior of daily patterns.
FCN (Fully Connected Network) shown in FIG. 4 represents the fully connected layer, which implements an affine transform from input to output, i.e., y = Wx + b, where W is a weight matrix and b is a bias vector.
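For illustration, the affine transform of the FCN could be sketched as follows; the 64-dimensional output is an assumption of this sketch, not the layer size of FIG. 4:

```python
import numpy as np

def fcn(x, W, b):
    """Fully connected layer: affine transform y = W x + b."""
    return W @ x + b

rng = np.random.default_rng(0)
x = rng.random((24, 1))    # e.g. a daily state vector
W = rng.random((64, 24))   # weight matrix (output dimension 64 assumed)
b = rng.random((64, 1))    # bias vector
print(fcn(x, W, b).shape)  # (64, 1)
```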
An example of the conventional transformer encoder is shown in FIG. 5. Details are not discussed in this application.
As previously discussed, the Attention mechanism is a popular artificial neural network block, which contains three types of components: query (Q), key (K), and value (V). With the Attention mechanism, the similarity between queries and keys is computed and used as the weight to generate a weighted sum of values. The Attention mechanism is the base block of the Transformer model. This disclosure further proposes a novel module, “AttenInAtten”, which may have two attention layers.
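A minimal sketch of the scaled dot-product attention computation described here; this is the standard formulation, and the scaling by the square root of the key dimension follows the usual Transformer convention rather than anything specific to this disclosure:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Query-key similarity weights a sum of values."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # similarity between queries and keys
    weights = softmax(scores, axis=-1)  # normalized similarity weights
    return weights @ V, weights
```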
Optionally, according to an embodiment of the disclosure, the at least one machine learning model comprises a first attention-based model and/or a second attention-based model, each comprising the following input components: query, key, and value.
To strengthen the impact of actions, this disclosure proposes to utilize the attention mechanism by using the latest action in the state-action pairs as the query, for action sensitivity. Generally, in the attention mechanism, the similarity between the query and the key is computed, and a weighted summation of the values is then calculated based on this similarity. Because the action is used as the query, the output of the attention mechanism will be heavily influenced by the action. Such a high level of sensitivity toward the parameters can be very useful for network optimization.
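Using the attention sketch above, action sensitivity would amount to feeding a projection of the latest action in as the query. In the following hypothetical usage, all projections are random stand-ins for learned linear projections:

```python
import numpy as np  # reuses attention() from the sketch above

rng = np.random.default_rng(1)
d = 64                        # embedding size (assumed)
q_action = rng.random((1, d)) # query: projection of the latest action
k_pairs = rng.random((7, d))  # keys: projections of (s1,a1)..(s7,a7)
v_pairs = rng.random((7, d))  # values: projections of (s1,a1)..(s7,a7)

out, w = attention(q_action, k_pairs, v_pairs)
print(out.shape, w.shape)     # (1, 64) (1, 7)
```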
FIG. 6 shows a detailed schematic of the AttenInAtten block according to an embodiment of this disclosure. It is based on the observation that the long-term prediction problem contains a number of short-term forecasting tasks (which can be called sub-forecasting tasks). The idea of this block is that solving the sub-forecasting tasks may help solve the final forecasting task. A sub-forecasting task may be represented by the function fL+1((s1, a1), …, (sL, aL), aL+1) → sL+1, where L is a positive integer. For example, it is known how to predict s2 from {(s1, a1), a2} (i.e., f2(s1, a1, a2) → s2), how to forecast s3 from {(s1, a1), (s2, a2), a3} (i.e., f3(s1, a1, s2, a2, a3) → s3), and so on. If the eighth day’s data (i.e., s8) needs to be predicted, the model can take advantage of information from the previous sub-forecasting tasks to predict s8 from {(s1, a1), (s2, a2), …, (s7, a7), a8}.
Notably, this disclosure proposes to use the attention mechanism to represent the sub-forecasting task, which constitutes the first attention layer. Specifically, the value is a projection of the state-action pairs that are used to predict the next state, the query is a linear projection of the state to be predicted in the sub-forecasting task, and the key is another linear projection of the state-action pairs that are used to predict the next state. The outputs (for example, A2 ~ A7 in FIG. 6) of the first attention layer are the representations of the sub-forecasting tasks. These outputs become the inputs of the second attention layer.
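The first attention layer might be sketched as follows, reusing the attention function from the earlier sketch; the loop bounds assume the 7-day example of FIG. 6, and the projection inputs are placeholders for learned projections:

```python
import numpy as np  # reuses attention() from the sketch above

def first_attention_layer(sa_k, sa_v, state_q):
    """Compute one representation A_{L+1} per sub-forecasting task.

    sa_k, sa_v: key/value projections of (s1,a1)..(s7,a7), shape (7, d)
    state_q:    query projections of the states s2..s7, shape (6, d)
    """
    outputs = []
    for L in range(1, state_q.shape[0] + 1):
        q = state_q[L - 1:L]                     # state sL+1 to predict
        a, _ = attention(q, sa_k[:L], sa_v[:L])  # pairs (s1,a1)..(sL,aL)
        outputs.append(a)
    return np.concatenate(outputs, axis=0)       # A2..A7, shape (6, d)
```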
Therefore, according to an embodiment of the disclosure, the method 300 shown in FIG. 3 further comprises applying the second attention-based model to at least a first set of performance data (e.g., A1) in the plurality of sets of performance data to predict a second set of performance data (e.g., A2) .
Optionally, the method 300 may further comprise applying the second attention-based model to at least the first set of performance data and the second set of performance data to predict a third set of performance data (e.g., A3) .
Optionally, according to an embodiment of the disclosure, the step of applying the second attention-based model to at least the first set of performance data and the second set of performance data to predict the third set of performance data comprises obtaining an action of the third set of performance data; and predicting a state of the third set of
performance data by applying the second attention-based model to the at least the first set of performance data and the second set of performance data, using a linear projection of a predicted state of the second set of performance data as the query.
Optionally, according to an embodiment of the disclosure, the step of predicting the state of the third set of performance data, by applying the second attention-based model to the at least the first set of performance data and the second set of performance data, further comprises using a first linear projection of the state-action pairs of the first set of performance data and the second set of performance data as the key and using a second linear projection of the state-action pairs of the first set of performance data and the second set of performance data as the value.
In the second attention layer, another attention mechanism is adopted. The similarity between the last action (aL+1 in FIG. 6, used as the query) and all outputs (used as keys and values) of the first attention layer is computed, and these similarities are used as weights to obtain a linear combination of all sub-forecasting tasks. The output of the second attention layer (A8 in FIG. 6) is the feature extracted from all sub-forecasting tasks, which is further used to predict the state of the next day. That is, the final forecasting task may be represented as w2·A2 + w3·A3 + … + w7·A7 → A8. It must be noted that the last action (aL+1 in FIG. 6) is used as the query of the prediction model, so that action sensitivity will be substantial.
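Both layers together can then be sketched as follows, with the last action a8 querying the sub-task representations A2 ~ A7; this again reuses the functions sketched above, and all projections are random stand-ins:

```python
import numpy as np  # reuses attention() and first_attention_layer()

rng = np.random.default_rng(2)
d = 64
sa_k = rng.random((7, d))     # projected state-action pairs (keys)
sa_v = rng.random((7, d))     # projected state-action pairs (values)
state_q = rng.random((6, d))  # projected states s2..s7 (queries)

A = first_attention_layer(sa_k, sa_v, state_q)  # A2..A7

q_a8 = rng.random((1, d))      # query: projection of the last action a8
A8, w = attention(q_a8, A, A)  # weighted sum w2*A2 + ... + w7*A7 -> A8
print(A8.shape, w.ravel())     # (1, 64) and the six weights
```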
Optionally, according to an embodiment of the disclosure, the at least one piece of future performance data comprises a state-action pair, and the method 300 comprises obtaining an action of the at least one piece of future performance data; and predicting a state of the at least one piece of future performance data by applying the first attention-based model to the plurality of sets of performance data, using a linear projection of the action of the at least one piece of future performance data as the query.
That is, the state-action pairs from the historical data and the latest action of the network, e.g., {(s1, a1), (s2, a2), …, (s7, a7), a8}, may be used to forecast the future state of the network, e.g., the value of s8. Due to the use of the action as the input query, the output of the attention mechanism will be heavily influenced by the action (query). Notably, actions of the network are controlled by, for example, network maintainers. It is considered known to the network whether an action has been taken or is to be taken. The models designed according to this disclosure thus exhibit a high level of sensitivity toward the parameters, which can be useful for network optimization.
Optionally, according to an embodiment of the disclosure, the step of predicting the state of the at least one piece of future performance data, by applying the first attention-based model to the plurality of sets of performance data, further comprises: using a first linear projection of the state-action pairs of each set of performance data as the key, and using a second linear projection of the state-action pairs of each set of performance data as the value.
Optionally, according to an embodiment of the disclosure, the method 300 as shown in FIG. 3 further comprises a step of applying the second attention-based model to the second set of performance data and the third set of performance data to identify a correlation (e.g., the weight) between the second set of performance data and the third set of performance data.
Optionally, according to an embodiment of the disclosure, the method 300 as shown in FIG. 3 further comprises predicting the at least one piece of future performance data by applying the first attention-based model to the plurality of sets of performance data based on the correlation.
It may be worth further mentioning that the KPIs discussed in this disclosure include traffic and throughput of the network. However, the present disclosure can also be applied to other KPIs of the network.
FIG. 7 depicts an application scenario according to an embodiment of this disclosure. The upper part depicts a wireless network system, including a Node B and a Mobile Network Automatic Engine (MAE) (a wireless network management node). The lower part depicts the wireless network digital twin 700 according to an embodiment of this disclosure, which may include a fundamental model library, an optimization library, as well as applications such as traffic prediction models, throughput prediction models, and wireless parameter optimization. The proposed attention-based modeling algorithm, which serves as a key capability of the fundamental model library in the wireless network digital twin 700, is provided inside the digital twin. The wireless network provides data for the wireless network digital twin 700, and the wireless network digital twin 700 provides modeling and optimization capabilities for the wireless network.
The attention-based model can be considered a part of the wireless digital twin 700 (the product may be a software package). Basically, a wireless digital twin contains a base model/optimization part and an application part, as shown in FIG. 7. The attention-based methods are one of the core capabilities of the base model module, which is used not only for modeling applications but also for the optimization library/applications.
In the following, a particular embodiment of how to use the proposed attention-based model to predict the throughput of wireless communication networks is discussed. In this example, the observed data is a time series, and the time scale is an hour, which means that the model obtains the throughput data each hour. Besides the throughput, the hourly action data is also obtained. The throughput will depend on the previous throughput and the action that has been taken or is to be taken. When predicting the throughput, the previous throughput and the action will be used. The formula of the forecasting model may be represented as throughput_{t+1, …, t+N} = f(throughput_{t−l, …, t}, action_{t−l, …, t}; w), where w is the parameter of the model that needs to be estimated.
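As an illustrative sketch only, the interface of this forecasting model might look as follows; the function name, shapes, and the naive placeholder body are assumptions, not the claimed model:

```python
import numpy as np

def forecast_throughput(throughput_hist, action_hist, w):
    """Sketch of throughput_{t+1..t+N} = f(throughput_{t-l..t}, action_{t-l..t}; w).

    throughput_hist: shape (l+1,), hourly throughput up to time t
    action_hist:     shape (l+1, action_dim), hourly actions up to time t
    w:               trained model parameters (unused placeholder here)
    """
    N = 24  # predict the next day, hour by hour
    # Placeholder body: a real implementation would evaluate the trained
    # two-branch model of FIG. 4; here the last value is simply repeated.
    return np.repeat(throughput_hist[-1], N)
```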
If one week of data is used as input to the model, the output of the model is the prediction of the throughput of the next day, i.e., the eighth day. In one example, to train the model, the data of 12 days may be used, and the model can be tested on the data of the next 3 days. The model used is the one shown in FIG. 4, which includes two different branches: 1) the conventional transformer encoder and 2) the AttenInAtten model. In this case, the dimension of throughput/traffic is 24, and the dimension of action is 38. The input sequence length is 7, the embedding_dim of the attention encoder is 64, and the number of heads of the multi-head attention in the attention encoder is 6.
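The hyperparameters reported above could be collected in a configuration such as the following (a plain restatement of the stated numbers; the key names are assumptions):

```python
# Hyperparameters of this embodiment, gathered into a config dict.
config = {
    "throughput_dim": 24,        # 24 hourly values per day
    "action_dim": 38,
    "input_sequence_length": 7,  # one week of daily sets as input
    "embedding_dim": 64,         # attention encoder embedding size
    "num_heads": 6,              # heads of the multi-head attention
    "train_days": 12,
    "test_days": 3,
}
```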
The result of the proposed attention-based model is shown in Table 1 as follows:
Table 1
Notably, APE represents the Absolute Percentage Error, and APE@0.25 means the ratio of predictions whose APE is less than 0.25. To further analyze the proposed method, some visualization results are shown in FIG. 8, which shows the forecasting result of the throughput and the autocorrelation function of the prediction error. It can be seen from FIG. 8 that most of the prediction curves match the ground truth quite well. The prediction residual diagnostics show that the residuals are uncorrelated, which means that no more information is left in the residuals that could be used for forecasting. This is further evidence of the good modeling capability of the proposed model.
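For reference, the APE metric and the APE@0.25 ratio described here can be sketched as follows (the ground truth is assumed nonzero; the sample values are made up for the worked example):

```python
import numpy as np

def ape(y_true, y_pred):
    """Absolute Percentage Error per prediction."""
    return np.abs(y_pred - y_true) / np.abs(y_true)

def ape_at(y_true, y_pred, threshold=0.25):
    """APE@threshold: ratio of predictions whose APE is below the threshold."""
    return float(np.mean(ape(y_true, y_pred) < threshold))

y_true = np.array([100.0, 120.0, 80.0, 95.0])
y_pred = np.array([110.0, 118.0, 60.0, 97.0])
print(ape_at(y_true, y_pred))  # 0.75: three of four predictions below 0.25
```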
In another particular embodiment, how to use the proposed attention-based model to forecast the traffic of wireless communication networks is discussed. Similarly, in this example, the observed data is in time series format, and the time scale is an hour, which means that the model obtains one traffic measurement value each hour. Besides the traffic, the hourly action data is also obtained. As with the throughput, the traffic will also depend on the previous traffic and the action that has been taken or is to be taken. When predicting the traffic, the previous traffic and the action will be used. The formula of the forecasting model may be represented as traffic_{t+1, …, t+N} = f(traffic_{t−l, …, t}, action_{t−l, …, t}; w), where w is the parameter of the model that needs to be estimated.
If one week of data is used as input to the model, the output of the model is the traffic prediction of the next day. In one example, to train the model, the data of 12 days may be used, and the model can be tested on the data of the next 3 days. The model is the same as in FIG. 4.
The result of forecasting traffic with the proposed attention-based model is shown in Table 2 as follows:
Table 2
To further analyze the proposed method, some visualization results are shown in FIG. 9, which shows the forecasting result of the traffic and the autocorrelation function of the prediction error. It can be seen from FIG. 9 that most of the prediction curves match the ground truth quite well. Some traffic curves in the prediction horizon show very different behavior compared with the historical period, and the forecasting result is still able to follow the ground truth, which shows that the proposed attention-based model has good parameter sensitivity. When the model is used for downstream optimization tasks, parameter sensitivity is a critical feature. Furthermore, the prediction residual diagnostics show that the residuals are uncorrelated, which means that no more information is left in the residuals that could be used for forecasting. This is further evidence of the good modeling capability of the model.
To summarize, this disclosure proposes to build the wireless digital twin using an attention mechanism and a multi-time-scale modeling method. This enables modeling the long-term trend and the short-term fluctuations at the same time. Further, in the wireless digital twin scenario, this disclosure proposes to build a model that also takes parameter sensitivity into account. By considering parameter sensitivity when building the model, the model becomes more suitable for downstream tasks, such as optimizing the parameters of the network.
The present disclosure has been described in conjunction with various embodiments as examples as well as implementations. However, other variations can be understood and effected by those skilled in the art in practicing the claimed embodiments of the disclosure, from a study of the drawings, this disclosure, and the independent claims. In the claims as well as in the description, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.
Furthermore, any method according to embodiments of the disclosure may be implemented in a computer program, having code means which, when run by processing means, causes the processing means to execute the steps of the method. The computer program is included in a computer-readable medium of a computer program product. The computer-readable medium may comprise essentially any memory, such as a ROM (Read-Only Memory), a PROM (Programmable Read-Only Memory), an EPROM (Erasable PROM), a Flash memory, an EEPROM (Electrically Erasable PROM), or a hard disk drive.
Moreover, it is realized by the skilled person that embodiments of the proposed wireless digital twin 700 and the corresponding computer program product comprise the necessary communication capabilities in the form of, e.g., functions, means, units, elements, etc., for performing the solution. Examples of other such means, units, elements, and functions are: processors, memory, buffers, control logic, encoders, decoders, rate matchers, de-rate matchers, mapping units, multipliers, decision units, selecting units, switches, interleavers, de-interleavers, modulators, demodulators, inputs, outputs, antennas, amplifiers, receiver units, transmitter units, DSPs, trellis-coded modulation (TCM) encoders, TCM decoders, power supply units, power feeders, communication interfaces, communication protocols, etc., which are suitably arranged together for performing the solution.
Especially, the processor(s) of the wireless digital twin 700 and the corresponding computer program product may comprise, e.g., one or more instances of a Central Processing Unit (CPU), a processing unit, a processing circuit, a processor, an Application Specific Integrated Circuit (ASIC), a microprocessor, or other processing logic that may interpret and execute instructions. The expression “processor” may thus represent processing circuitry comprising a plurality of processing circuits, such as, e.g., any, some, or all of the ones mentioned above. The processing circuitry may further perform data processing functions for inputting, outputting, and processing of data, comprising data buffering and device control functions, such as call processing control, user interface control, or the like.
Claims (15)
- A computer-implemented method (300) for performance prediction in a wireless communication network, the method (300) comprising:
obtaining (301) historical performance data of the network, wherein the historical performance data comprises a plurality of sets of performance data associated with at least one network performance indicator, wherein each set of performance data is collected during a time period with a first sampling granularity, and wherein each piece of performance data included in the set of performance data is collected within the time period with a second sampling granularity; and
predicting (302) at least one piece of future performance data based on the plurality of sets of performance data.
- The method (300) according to claim 1, wherein the predicting (302) comprises applying at least one machine learning model to the plurality of sets of performance data.
- The method (300) according to claim 2, wherein the at least one machine learning model comprises a first attention-based model and/or a second attention-based model, each comprising the following input components: query, key and value.
- The method (300) according to one of the claims 1 to 3, wherein each set of performance data comprises state-action pairs indicating the pieces of performance data, and each state is associated with the at least one network performance indicator, and each action is associated with at least one network parameter.
- The method (300) according to claims 3 and 4, wherein the at least one piece of future performance data comprises a state-action pair, and the method (300) comprises:
obtaining an action of the at least one piece of future performance data; and
predicting a state of the at least one piece of future performance data by applying the first attention-based model to the plurality of sets of performance data, using a linear projection of the action of the at least one piece of future performance data as the query.
- The method (300) according to claim 5, wherein predicting the state of the at least one piece of future performance data, by applying the first attention-based model to the plurality of sets of performance data, further comprises:
using a first linear projection of the state-action pairs of each set of performance data as the key, and using a second linear projection of the state-action pairs of each set of performance data as the value.
- The method (300) according to claim 3 or one of the claims 4 to 6 when depending on claim 3, wherein the method (300) comprises:
applying the second attention-based model to at least a first set of performance data in the plurality of sets of performance data to predict a second set of performance data.
- The method (300) according to claim 7, wherein the method (300) comprises:
applying the second attention-based model to at least the first set of performance data and the second set of performance data to predict a third set of performance data.
- The method (300) according to claim 8, wherein applying the second attention-based model to at least the first set of performance data and the second set of performance data to predict the third set of performance data comprises:
obtaining an action of the third set of performance data; and
predicting a state of the third set of performance data by applying the second attention-based model to the at least the first set of performance data and the second set of performance data, using a linear projection of a predicted state of the second set of performance data as the query.
- The method (300) according to claim 9, wherein predicting the state of the third set of performance data, by applying the second attention-based model to the at least the first set of performance data and the second set of performance data, further comprises:
using a first linear projection of the state-action pairs of the first set of performance data and the second set of performance data as the key, and using a second linear projection of the state-action pairs of the first set of performance data and the second set of performance data as the value.
- The method (300) according to one of the claims 8 to 10, wherein the method (300) comprises:
applying the second attention-based model to the second set of performance data and the third set of performance data to identify a correlation between the second set of performance data and the third set of performance data.
- The method (300) according to claim 11, wherein the method (300) comprises:
predicting the at least one piece of future performance data by applying the first attention-based model to the plurality of sets of performance data based on the correlation.
- The method (300) according to any one of the claims 1 to 12, wherein the at least one performance indicator comprises traffic and/or throughput.
- A wireless digital twin (700) of a wireless network system, comprising a model configured to implement the method (300) according to any one of the claims 1 to 13.
- A computer program product comprising a program code for carrying out, when implemented on a processor, the method (300) according to any one of the claims 1 to 13.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/106518 WO2025010581A1 (en) | 2023-07-10 | 2023-07-10 | Attention-based model for wireless digital twin |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025010581A1 true WO2025010581A1 (en) | 2025-01-16 |
Family
ID=94214610
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2023/106518 Pending WO2025010581A1 (en) | 2023-07-10 | 2023-07-10 | Attention-based model for wireless digital twin |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025010581A1 (en) |
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20200076520A1 (en) * | 2018-08-31 | 2020-03-05 | At&T Intellectual Property I, L.P. | System and Method for Throughput Prediction for Cellular Networks |
| WO2022074015A1 (en) * | 2020-10-06 | 2022-04-14 | Telefonaktiebolaget Lm Ericsson (Publ) | Conditional generative model recommendation for radio network |
| US20220210682A1 (en) * | 2020-12-30 | 2022-06-30 | Samsung Electronics Co., Ltd. | SYSTEM AND METHOD FOR ARTIFICIAL INTELLIGENCE (AI) DRIVEN VOICE OVER LONG-TERM EVOLUTION (VoLTE) ANALYTICS |
| CN115134816A (en) * | 2021-03-18 | 2022-09-30 | 中国电信股份有限公司 | Base station flow prediction method based on space-time convolution and multiple time scales |
Non-Patent Citations (2)
| Title |
|---|
| FUTUREWEI: "Functional Framework for RAN Intelligence to support different learning problems", 3GPP DRAFT; R3-211615, 3RD GENERATION PARTNERSHIP PROJECT (3GPP), MOBILE COMPETENCE CENTRE ; 650, ROUTE DES LUCIOLES ; F-06921 SOPHIA-ANTIPOLIS CEDEX ; FRANCE, vol. RAN WG3, no. Online; 20210517 - 20210528, 6 May 2021 (2021-05-06), Mobile Competence Centre ; 650, route des Lucioles ; F-06921 Sophia-Antipolis Cedex ; France , XP052001361 * |
| ZTE, LENOVO, MOTOLORA MOBILITY, CHINA UNICOM: "AI based Energy Saving", 3GPP DRAFT; R3-206720, 3RD GENERATION PARTNERSHIP PROJECT (3GPP), MOBILE COMPETENCE CENTRE ; 650, ROUTE DES LUCIOLES ; F-06921 SOPHIA-ANTIPOLIS CEDEX ; FRANCE, vol. RAN WG3, no. e-meeting ;20201102 - 20201112, 23 October 2020 (2020-10-23), Mobile Competence Centre ; 650, route des Lucioles ; F-06921 Sophia-Antipolis Cedex ; France, XP052399738 * |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23944604 Country of ref document: EP Kind code of ref document: A1 |