WO2024245570A1

WO2024245570A1 - Machine learning model performance

Info

Publication number: WO2024245570A1
Application number: PCT/EP2023/064820
Authority: WO
Inventors: Muhammad Majid BUTT; Fahad SYED MUHAMMAD; Afef Feki; Miltiadis FILIPPOU
Original assignee: Nokia Technologies Oy
Current assignee: Nokia Technologies Oy
Priority date: 2023-06-02
Filing date: 2023-06-02
Publication date: 2024-12-05
Anticipated expiration: 2025-12-02

Abstract

There is provided an apparatus being caused at least to: obtain performance data of inference of at least one model from a plurality of user devices, wherein each of the user devices use at least one local model and/or a global model, and in the case the performance data is of the at least one local model, transmitting a request to obtain performance data of the global model; store the obtained performance data of the at least one global model with time information for observing at least one temporal effect; cluster the stored performance data based on information on the local model used for observing at least one spatial effect; obtain an evaluation of performance of the global model cluster-wise using the stored performance data and the time information.

Description

MACHINE LEARNING MODEL PERFORMANCE

TECHNICAL FIELD

Various example embodiments relate generally to communications.

BACKGROUND

3GPP standardization efforts are going-on with regard to the use of Artificial Intelligence (Al) and/or machine learning (ML). The Network Data Analytics Function (NWDAF) was introduced in 3GPP Rel-15 providing network slice analysis capabilities. In Rel-16, It was expanded to provide data collection and exposure. In Rel-17, UE carrying out data collection was introduced. Self-Organizing Network (SON) and Minimization of Drive Tests (MDT) have been defining data collection procedures releases starting from Rel-16. The work for utilizing AI/ML in new applications continue.

BRIEF DESCRIPTION

According to some aspects, there is provided the subject matter of the independent claims. Some further aspects are defined in the dependent claims. The embodiments that do not fall under the scope of the claims are to be interpreted as examples useful for understanding the disclosure.

LIST OF THE DRAWINGS

In the following, the invention will be described in greater detail with reference to the embodiments and the accompanying drawings, in which

Figure 1 presents a network to which one or more embodiments are applicable;

Figures 2a and 2b depict an example method according to an embodiment;

Figure 3 shows a signalling flow diagram, according to an embodiment;

Figure 4 illustrates an example apparatus according to some embodiments.

DESCRIPTION OF EMBODIMENTS

The following embodiments are exemplary. Although the specification may refer to “an”, “one”, or “some” embodiment(s) in several locations of the text, this does not necessarily mean that each reference is made to the same embodi- ment(s), or that a particular feature only applies to a single embodiment. Single features of different embodiments may also be combined to provide other embodiments. For the purposes of the present disclosure, the phrases “at least one of A or B”, “at least one of A and B”, “A and/or B” means (A), (B), or (A and B). For the purposes of the present disclosure, the phrases “A or B” and “A and/or B” means (A), (B), or (A and B). For the purposes of the present disclosure, the phrase “A, B, and/or C” means (A), (B), (C), (A and B), (A and C), (B and C), or (A, B, and C).

It shall be understood that although the terms “first” and “second” etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and similarly, a second element could be termed a first element, without departing from the scope of example embodiments.

Embodiments described maybe implemented in a radio system, such as one comprising at least one of the following radio access technologies (RATs): Worldwide Interoperability for Micro-wave Access (WiMAX), Global System for Mobile communications (GSM, 2G), GSM EDGE radio access Network (GERAN), General Packet Radio Service (GRPS), Universal Mobile Telecommunication System (UMTS, 3G) based on basic wideband-code division multiple access (W-CDMA), high-speed packet access (HSPA), Long Term Evolution (LTE), LTE-Advanced, and enhanced LTE (eLTE). Term ‘eLTE’ here denotes the LTE evolution that connects to a 5G core. LTE is also known as evolved UMTS terrestrial radio access (EUTRA) or as evolved UMTS terrestrial radio access network (EUTRAN). A term “resource” may refer to radio resources, such as a physical resource block (PRB), a radio frame, a subframe, a time slot, a subband, a frequency region, a sub-carrier, a beam, etc. The term “transmission” and/or “reception” may refer to wirelessly transmitting and/or receiving via a wireless propagation channel on radio resources

The embodiments are not, however, restricted to the systems/RATs given as an example but a person skilled in the art may apply the solution to other communication systems/networks provided with necessary properties. Some examples of a suitable communication networks include a 5G network and/or a 6G network. The 3GPP solution to 5G is referred to as New Radio (NR). 6G is envisaged to be a further development of 5G. NR has been envisaged to use multiple-input- multiple-output (M1M0) multi-antenna transmission techniques, more base stations or nodes than the current network deployments of LTE (a so-called small cell concept), including macro sites operating in co-operation with smaller local area access nodes and perhaps also employing a variety of radio technologies for better coverage and enhanced data rates. 5G will likely be comprised of more than one radio access technology / radio access network (RAT /RAN), each optimized for certain use cases and/or spectrum. 5G mobile communications may have a wider range of use cases and related applications including video streaming, augmented reality, different ways of data sharing and various forms of machine type applications, including vehicular safety, different sensors and real-time control. 5G is expected to have multiple radio interfaces, namely below 6GHz, cmWave and mmWave, and being integrable with existing legacy radio access technologies, such as the LTE.

The current architecture in LTE networks is distributed in the radio and centralized in the core network. The low latency applications and services in 5G may require bringing the content close to the radio which leads to local break out and multi-access edge computing (MEC). 5G enables analytics and knowledge generation to occur at the source of the data. This approach requires leveraging resources that may not be continuously connected to a network such as laptops, smartphones, tablets and sensors. MEC provides a distributed computing environment for application and service hosting. It also has the ability to store and process content in close proximity to cellular subscribers for faster response time. Edge computing covers a wide range of technologies such as wireless sensor networks, mobile data acquisition, mobile signature analysis, cooperative distributed peer- to-peer ad hoc networking and processing also classifiable as local cloud/fog computing and grid/mesh computing, dew computing, mobile edge computing, cloudlet, distributed data storage and retrieval, autonomic self-healing networks, remote cloud services, augmented and virtual reality, data caching, Internet of Things (massive connectivity and/or latency critical), critical communications (autonomous vehicles, traffic safety, real-time analytics, time-critical control, healthcare applications). Edge cloud may be brought into RAN by utilizing network function virtualization (NVF) and software defined networking (SDN). Using edge cloud may mean access node operations to be carried out, at least partly, in a server, host or node operationally coupled to a remote radio head or base station comprising radio parts. Network slicing allows multiple virtual networks to be created on top of a common shared physical infrastructure. The virtual networks are then customised to meet the specific needs of applications, services, devices, customers or operators.

In radio communications, node operations may in be carried out, at least partly, in a central/centralized unit, CU, (e.g. server, host or node) operationally coupled to distributed unit, DU, (e.g. a radio head/node). It is also possible that node operations will be distributed among a plurality of servers, nodes or hosts. It should also be understood that the distribution of work between core network operations and base station operations may vary depending on implementation. Thus, 5G networks architecture maybe based on a so-called CU-DU split. One gNB- CU controls several gNB-DUs. The term ‘gNB’ may correspond in 5G to the eNB in LTE. The gNBs (one or more) may communicate with one or more UEs. The gNB- CU (central node) may control a plurality of spatially separated gNB-DUs, acting at least as transmit/receive (Tx/Rx) nodes. In some embodiments, however, the gNB- DUs (also called DU) may comprise e.g. a radio link control (RLC), medium access control (MAC) layer and a physical (PHY) layer, whereas the gNB-CU (also called a CU) may comprise the layers above RLC layer, such as a packet data convergence protocol (PDCP) layer, a radio resource control (RRC) and an internet protocol (IP) layers. Other functional splits are possible too. It is considered that skilled person is familiar with the OS1 model and the functionalities within each layer.

In an embodiment, the server or CU may generate a virtual network through which the server communicates with the radio node. In general, virtual networking may involve a process of combining hardware and software network resources and network functionality into a single, software-based administrative entity, a virtual network. Such virtual network may provide flexible distribution of operations between the server and the radio head/node. In practice, any digital signal processing task may be performed in either the CU or the DU and the boundary where the responsibility is shifted between the CU and the DU may be selected according to implementation.

Some other possible technology advancements to be used are Software- Defined Networking (SDN), Big Data, and all-lP, to mention only a few non-limiting examples. For example, network slicing may be a form of virtual network architecture using the same principles behind software defined networking (SDN) and network functions virtualisation (NFV) in fixed networks. SDN and NFV may deliver greater network flexibility by allowing traditional network architectures to be partitioned into virtual elements that can be linked (also through software). Network slicing allows multiple virtual networks to be created on top of a common shared physical infrastructure. The virtual networks are then customised to meet the specific needs of applications, services, devices, customers or operators.

The plurality of gNBs (access points/nodes), each comprising the CU and one or more DUs, may be connected to each other via the Xn interface over which the gNBs may negotiate. The gNBs may also be connected over next generation (NG) interfaces to a 5G core network (5GC), which may be a 5G equivalent for the core network of LTE. Such 5G CU-DU split architecture may be implemented using cloud/server so that the CU having higher layers locates in the cloud and the DU is closer to or comprises actual radio and antenna unit. There are similar plans ongoing for LTE/LTE-A/eLTE as well. When both eLTE and 5G will use similar architecture in a same cloud hardware (HW), the next step may be to combine software (SW) so that one common SW controls both radio access networks/technol- ogies (RAN /RAT). This may allow then new ways to control radio resources of both RANs. Furthermore, it may be possible to have configurations where the full protocol stack is controlled by the same HW and handled by the same radio unit as the CU.

It should also be understood that the distribution of labour between core network operations and base station operations may differ from that of the LTE or even be non-existent. Some other technology advancements probably to be used are Big Data and all-lP, which may change the way networks are being constructed and managed. 5G (or new radio, NR) networks are being designed to support multiple hierarchies, where MEC servers can be placed between the core and the base station or nodeB (gNB). It should be appreciated that MEC can be applied in 4G networks as well.

5G may also utilize satellite communication to enhance or complement the coverage of 5G service, for example by providing backhauling. Possible use cases are providing service continuity for machine-to-machine (M2M) or Internet of Things (loT) devices or for passengers on board of vehicles, or ensuring service availability for critical communications, and future railway/maritime/aeronautical communications. Satellite communication may utilize geostationary earth orbit (GEO) satellite systems, but also low earth orbit (LEO) satellite systems, in particular mega-constellations (systems in which hundreds of (nano)satellites are deployed). Each satellite in the mega-constellation may cover several satellite-enabled network entities that create on-ground cells. The on-ground cells may be created through an on-ground relay node or by a gNB located on-ground or in a satellite.

The embodiments may be also applicable to narrow-band (NB) Inter- net-of-things (loT) systems which may enable a wide range of devices and services to be connected using cellular telecommunications bands. NB-loT is a narrowband radio technology designed for the Internet of Things (loT) and is one of technologies standardized by the 3rd Generation Partnership Project (3GPP). Other 3GPP loT technologies also suitable to implement the embodiments include machine type communication (MTC) and eMTC (enhanced Machine-Type Communication). NB-loT focuses specifically on low cost, long battery life, and enabling a large number of connected devices. The NB-loT technology is deployed “in-band” in spectrum allocated to Long Term Evolution (LTE) - using resource blocks within a normal LTE carrier, or in the unused resource blocks within an LTE carrier’s guard-band - or “standalone” for deployments in dedicated spectrum.

The embodiments may be also applicable to device-to-device (D2D), machine-to-machine, peer-to-peer (P2P) communications. The embodiments may be also applicable to vehicle-to-vehicle (V2V), vehicle-to-infrastructure (V21), in- frastructure-to-vehicle (12V), or in general to V2X or X2V communications.

Figure 1 illustrates an example of a communication system to which embodiments of the invention may be applied. The system may comprise a control node 110 providing one or more cells, such as cell 100, and a control node 112 providing one or more other cells, such as cell 102. Each cell may be, e.g., a macro cell, a micro cell, femto, or a pico cell, for example. In another point of view, the cell may define a coverage area or a service area of the corresponding access node. The control node 110, 112 may be an evolved Node B (eNB) as in the LTE and LTE-A, ng-eNB as in eLTE, gNB of 5G, or any other apparatus capable of controlling radio communication and managing radio resources within a cell. The control node 110, 112 may be called a base station, network node, or an access node.

The system may be a cellular communication system composed of a radio access network of access nodes, each controlling a respective cell or cells. The access node 110 may provide user equipment (UE) 120 (one or more UEs) with wireless access to other networks such as the Internet. The wireless access may comprise downlink (DL) communication from the control node to the UE 120 and uplink (UL) communication from the UE 120 to the control node.

Additionally, although not shown, one or more local area access nodes may be arranged such that a cell provided by the local area access node at least partially overlaps the cell of the access node 110 and/or 112. The local area access node may provide wireless access within a sub-cell. Examples of the sub-cell may include a micro, pico and/or femto cell. Typically, the sub-cell provides a hot spot within a macro cell. The operation of the local area access node may be controlled by an access node under whose control area the sub-cell is provided. In general, the control node for the small cell may be likewise called a base station, network node, or an access node.

There may be a plurality of UEs 120, 122 in the system. Each of them may be served by the same or by different control nodes 110, 112. The UEs 120, 122 may communicate with each other, in case D2D communication interface is established between them.

The term “terminal device” or “UE” refers to any end device that may be capable of wireless communication. By way of example rather than limitation, a terminal device may also be referred to as a communication device, user equipment (UE), a Subscriber Station CSS}, a Portable Subscriber Station, a Mobile Station (MS), or an Access Terminal (AT). The terminal device may include, but not limited to, a mobile phone, a cellular phone, a smart phone, voice over IP (VoIP) phones, wireless local loop phones, a tablet, a wearable terminal device, a personal digital assistant (PDA), portable computers, desktop computer, image capture terminal devices such as digital cameras, gaming terminal devices, music storage and playback appliances, vehicle-mounted wireless terminal devices, wireless endpoints, mobile stations, laptop-embedded equipment (LEE), laptop-mounted equipment (LME), USB dongles, smart devices, wireless customer-premises equipment (CPE), an Internet of Things (loT) device, a watch or other wearable, a head-mounted display (HMD), a vehicle, a drone, a medical device and applications (e.g., remote surgery), an industrial device and applications (e.g., a robot and/or other wireless devices operating in an industrial and/or an automated processing chain contexts), a consumer electronics device, a device operating on commercial and/or industrial wireless networks, and the like. In the following description, the terms “terminal device”, “communication device”, “terminal”, “user equipment” and “UE” may be used interchangeably.

In the case of multiple access nodes in the communication network, the access nodes may be connected to each other with an interface. LTE specifications call such an interface as X2 interface. For IEEE 802.11 network (i.e. wireless local area network, WLAN, WiFi), a similar interface may be provided between access points. An interface between an LTE access point and a 5G access point, or between two 5G access points may be called Xn. Other communication methods between the access nodes may also be possible. The access nodes 110 and 112 may be further connected via another interface to a core network 116 of the cellular communication system. The LTE specifications specify the core network as an evolved packet core (EPC), and the core network may comprise a mobility management entity (MME) and a gateway node. The MME may handle mobility of terminal devices in a tracking area encompassing a plurality of cells and handle signalling connections between the terminal devices and the core network. The gateway node may handle data routing in the core network and to/from the terminal devices. The 5G specifications specify the core network as a 5G core (5GC), and there the core network may comprise e.g. an access and mobility management function (AMF) and a user plane function/gateway (UPF), to mention only a few. The AMF may handle termination of non-access stratum (NAS) signalling, NAS ciphering & integrity protection, registration management, connection management, mobility management, access authentication and authorization, security context management. The UPF node may support packet routing & forwarding, packet inspection and QoS handling, for example.

Artificial intelligence (Al) and machine learning (ML) algorithms are used in telecom networks for bringing improved resource use and service experience to users. Al can improve network security and network efficiency as well as enhance service quality. The Network Data Analytics Function (NWDAF) was introduced in 3GPP Rel-15 providing network slice analysis capabilities. In Rel-16, It was expanded to provide data collection and exposure. In Rel-17, UE carrying out data collection was introduced. Self-Organizing Network (SON) and Minimization of Drive Tests (MDT) have been defining data collection procedures releases starting from Rel-16. The work for utilizing Al/ML in new applications continue.

One Al/ML technique is federated learning. Federated learning (FL also known as collaborative learning) is a machine learning technique that trains an algorithm via multiple independent sessions, each using its own dataset. This approach enables multiple training actors to create a common machine learning model using multiple local datasets contained in local nodes without explicitly exchanging data samples, thus addressing issues such as data privacy, data security, data access rights and access to heterogeneous data. The general principle consists in training local models on local data samples and exchanging parameters (e.g. the weights and biases of a deep neural network) between these local nodes or be transmitted to a central node and not just exchanged among the local nodes to generate the global model to generate a global model shared by all nodes.

Federated learning can be centralized or decentralized. In the decentralised federated learning, the nodes are able to coordinate themselves to obtain the global model. In the centralized learning, a central unit is responsible of the control of the learning phase.

In the federated learning, a global model can be seen to be an FL model trained based on the contribution of local models from a larger group of local actors, such as user devices or terminal devices (UE, loT device, mobile phone, etc.) and can be generalized to more unknown situations. A local model can be seen to be an FL model trained based on local model contributions from a subset of local actors (UEs, for example) as compared to larger group of UEs in global FL model. The model is valid for a smaller group of UEs and cannot be generalized to too many new inputs.

After training a local machine learning (ML) model, each individual learner transfers its local model parameters, e.g., weight and bias values of a neural network, to an aggregating unit, e.g., centralized or peer aggregation entity. The aggregating unit utilizes the local model parameters of all or selected involved learners/ participants to update a global model, iteration is usually applied in the training. As a result, each local learner benefits from the datasets of the other local learners only through the global model, shared by the aggregator, without explicitly accessing high volume of privacy-sensitive data available at each of the other local learners. Federated learning may comprise the following steps:

• Initialization: A machine learning model (e.g., linear regression or neural network) is chosen to be trained on local nodes and initialized at central server side.

• Client selection: all or a fraction of local nodes is selected to start training on local data. The selected nodes acquire the current statistical model by the central server, while the others wait for the next federated round.

• Reporting and Aggregation: each selected node sends its locally updated version of the model to the central server for aggregation. The central server aggregates the received models (e.g., by means of averaging corresponding model parameters) and sends back the model updates to the nodes. This process is repeated iteratively.

• Termination: once a pre-defined termination criterion is met (e.g., a maximum number of iterations is reached) the central server aggregates the updates and finalizes the global model.

In communications network, especially in air interface applications, it is important to find the right trade-off between carrying out high complexity FL model retraining only when needed to avoid unnecessary radio link congestion and not delaying such retraining too much to avoid having too many local FL models.

Embodiments provide timing of triggering a global FL model retraining or updating taking into consideration temporal and spatial effects by using spatiotemporal clustering based on both spatial and temporal aspects.

Figures 2a and 2b depict an example method. The method may be implemented by a software code. The method may be carried out by an aggregator or Network Data Analytics Function (NWDAF) entity (server or any device comprising adequate capabilities) available for or in an access node (e.g. gNB, central unit and/or distributed unit) or a user device, or any suitable apparatus depending on the implementation of federated learning control functions, and availability of a side link, for example.

In block 200, performance data of inference of at least one model is obtained from a plurality of user devices, wherein each of the user devices uses at least one local model and/or a global model, and in the case the performance data is of the local model, transmitting a request to obtain performance data of one or more of the at least one global model.

It should be appreciated that, in the case the performance (measured using a key performance indicator (KPI) such as upstream/downstream throughput, handover failure rate, latency, delay) using the global FL model is below tar- get/threshold, the user device (UE) may discontinue using the global FL model for inference and start using its local FL model, or not using a machine learning algorithm, depending on the policy.

User devices report the performance data of the model they use, be it a local or global model as requested based on a policy.

The Al/ML control function entity may request user devices reporting performance data for a local model to turn on a test inference using the global FL model available for these user devices with the same inference input data in the local and global model for improved or balanced performance evaluation or comparison between difference models. For this test inference, timing information for the test may be indicated in the request. The timing may be informed in the form of a time window. This way performance of the global model may be monitored by user devices not using the global model for communication services. This forced inference may be carried out for obtaining more performance data for evaluating the performance of the global model for taking into account temporal and spatial effects by using spatiotemporal clustering based on both spatial and temporal aspects. For example, it is possible to have a large number of user devices, in a single local FL group, having a low global FL performance. However, it is possible that a global FL model does not need training yet, since the global model may perform well for other user groups that is to say, there may be spatial bias in the performance data. Therefore, the data is collected for analysis from various user groups in experiencing different conditions. As an example of consideration: if several groups use local FL models or test inference shows low performance, retraining the global model may be triggered.

User devices report the test performance data to the control entity. Regular or normal performance data may be reported as well.

In block 202, the obtained performance data of the at least one global model is stored with time information for observing at least one temporal effect.

The temporal effect may be continuity and/or repetitiveness of the at least one temporal effect.

Time information may be a time stamp indicating the time when the performance data is received, or, if indicated in the report, when a user device carried out the inference of the model. Timing information is used to evaluate variation of results as a function of time.

It is possible that local input data distribution based on radio measurements carried out by a user device is temporarily affected in a manner radio conditions are deteriorated (e.g. rain, hailstones, temporal obstacle on the line of sight), or interference level is high only for a short time. Thus, the decision to trigger training of a global model should take the duration of the low performance into account. Once data from the user devices is available as a function of time, a time correlation test may be performed to see if test results are constantly poor over time. One example embodiment:

Cl: (Mean Performance results for Nth window)- (Mean Performance results for (N-l)th window) < a (time correlation)

C2: Mean performance results at time t <p

If both Cl and C2 are true, then global FL model does not work for this UE. Evaluating Cl and C2 for all UEs evaluated can provide binary decision for each UE and collecting all such binary decisions can provide information if local FL is required for this local group or not.

Setting Value of a and p in the example may be carried out as follows: a is a system parameter with values ranging 0-1 while left hand side of Cl is normalized to keep this KPI value within this limit. Its value dictates time correlation condition. If a is small (0-0.1), then even small variability of results over time means that system is not stable.

P is a factor measuring quality of FL model inference through KPI performance evaluation. For instance, % of successful handover decisions could be bounded by p. If % of successful handovers is below for global FL model, then local FL model is still preferred option for the UE (or set of UEs).

In block 204, the stored performance data is clustered based on information on the local model used for observing at least one spatial effect.

The controller keeps track on the models use as a part of the user devices’ performance reporting. For this purpose, the report may comprise indication of the local model, or weights etc. on the local model used.

In block 206, an evaluation of performance of the global model is obtained cluster-wise using the stored performance data and the time information.

The performance evaluation is carried out for all clusters or at least part of the clusters selected based on history data and/or location or selected randomly.

The performance evaluation may be based on at least one threshold set for the performance and/or comparison among a plurality of the clusters for ruling out the effect of insignificant performance differences. The threshold may be set for one or more selected KPIs case by case.

The evaluation is carried out for discovering groups with non-temporal low performance. The temporal effect may be continuity and/or repetitiveness of the at least one temporal effect. It is possible to utilize other information, such as weather report or existence of sport or other events in the area in evaluating whether the effect is long-lasting or not.

It is possible that a particular group of user devices using a local FL model use the local model due to low performance of a global model but the global model is performing well for most of other user devices. In this case, triggering a model retraining may not the best option. Instead, those users using a local model may continue to do so for a time being. That is to say, the global FL model is facing low performance due to a spatial effect. The user devices coming to the area, where the global model does not perform well, may be instructed to use the local model instead of the global model. The information may be broadcast in the area or informed by using a radio resource control signalling associated with the handover procedure.

It is an option to further analyse the cause of the low performance by selecting from different or even each group of user devices using the same local FL model, a same number of users. Additionally, it is possible to mark the selected user devices with time stamps with regard to the selection to prevent selecting the same user devices several times.

Additionally, it is possible to set a timer for triggering a new test inference among the one or more clusters, where the model is not performing well, to limit the time the instructions to use a local model are valid.

In block 208, in the case the performance as evaluated is not adequate in a certain number of clusters taking into consideration the at least one temporal effect, a retraining request of the global model is transmitted.

One case example is that there is a large number of clusters facing low performance with the global model and it is not temporal, for example, no cause that can be taken as temporal, such as a rain shower, is found, or after a timer for repeating the test inference runs out, and the new test inference shows continuation of the low performance, a global model retraining is triggered.

The request may be transmitted without a delay, or the retraining may be timed to an appropriate time, for example outside a high load time on-going or predicted based on a normal traffic pattern.

Additionally, as an option or in addition to the transmitting the retraining request, it is possible to obtain an evaluation of performance of the at least one local model in the certain number of clusters, and transmit, to the certain number of clusters, a request to use the at least one local model, in the case the performance as evaluated is adequate for the at least one local model. As put forward above, in the case the local model performs well, the user devices in those groups may be instructed to use the local model until global model retraining is completed.

If the evaluation prompts for global FL model retraining, user devices and/or the network are instructed accordingly. If evaluation shows that there is no need for FL model retraining at this time, it may further be evaluated whether there is need for using local FL models in some areas. If both global and local FL models are not performing well, user devices may be requested to start using local default procedure without Al/ML.

Figure 3 shows a simplified signalling chart to further illustrate the examples with regard to Figure 2. It should be appreciated that this example is not limiting for number of user devices or the location of the control of the model performance evaluation etc.

In Figure 3, there are one user device UE 120 using a local model and one user device UE 122 using a global model. The apparatus taking care of the control of the retraining is in this example a gNB 110. However, the apparatus may be or be comprised in other units as well, such as an aggregator or Network Data Analytics Function (NWDAF) entity (server or any device comprising adequate capabilities) available for an access node (e.g. gNB) or a user device, or any suitable apparatus depending on the implementation of federated learning control functions, and availability of a side link, for example.

The gNB has information on which model each user device is using, or it may request it. Therefore a gNB transmits a request for performance data (KPI) for a user device using a global model. This is an option that may be used for obtaining data in addition to a data the user devices report anyway. The gNB has information on what model each user device is using. Based on the information, the gNB transmits a request for a test drive of the global model to user devices using a local model. The request for the test drive may comprise timing information for the test. The user device using the global model transmits the requested performance data of the global model inference as a feedback to the gNB. The user device using the local model transmits the requested performance data of the global model inference in a test drive as a feedback to the gNB. It may also transmit the performance data of the local model inference that may be requested or be normal reporting.

The gNB carries out storing the obtained performance data of the at least one global model with time information for observing at least one temporal effect, clustering the stored performance data based on information on the local model used for observing at least one spatial effect, and obtaining an evaluation of performance of the global model cluster-wise using the stored performance data and the time information. In the case the performance as evaluated is not adequate in a certain number of clusters taking into consideration the at least one temporal effect, it transmits a global model retraining request, or as an option or in addition to the transmitting the retraining request, it may obtain an evaluation of performance of the at least one local model in the certain number of clusters, and transmit, to the certain number of clusters, a request to use the at least one local model, in the case the performance as evaluated is adequate for the at least one local model. Thus, the signalling based on the evaluation maybe or comprise a retraining request, or request to use the local model that performs adequately. In the case no well-performing local model is provided, a request not to use Al/ML may be transmitted.

An embodiment, as shown in Figure 4, provides an apparatus 10 comprising a control circuitry (CTRL) 12, such as at least one processor, and at least one memory 14 storing instructions that, when executed by the at least one processor, cause the apparatus at least to carry out any one of the above-described processes as shown in Figures 2 and 3. In an example, the at least one memory and the computer program code (software), are configured, with the at least one processor, to cause the apparatus to carry out any one of the above-described processes. The memory may be implemented using any suitable data storage technology, such as semiconductor-based memory devices, flash memory, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory. The memory may comprise a database for storing data.

As used in this application, the term ‘circuitry’ refers to all of the following: (a) hardware-only circuit implementations, such as implementations in only analog and/or digital circuitry, and (b) combinations of circuits and soft-ware (and/or firmware), such as (as applicable): (i) a combination of processor(s) or (ii) portions of processor(s)/software including digital signal processor(s), software, and memory(ies) that work together to cause an apparatus to perform various functions, and (c) circuits, such as a microprocessor(s) or a portion of a micropro- cessor(s), that require software or firmware for operation, even if the software or firmware is not physically present. This definition of ‘circuitry’ applies to all uses of this term in this application. As a further example, as used in this application, the term ‘circuitry’ would also cover an implementation of merely a processor (or multiple processors) or a portion of a processor and its (or their) accompanying software and/or firmware. The term ‘circuitry’ would also cover, for example and if applicable to the particular element, a baseband integrated circuit or applications processor integrated circuit for a mobile phone or a similar integrated circuit in a server, a cellular network device, or another network device.

In an embodiment, the apparatus 10 may comprise the terminal device of a communication system, e.g. a user terminal (UT), a computer (PC), a laptop, a tabloid computer, a cellular phone, a mobile phone, a communicator, a smart phone, a palm computer, a mobile transportation apparatus (such as a car), a household appliance, or any other communication apparatus, commonly called as UE in the description. Alternatively, the apparatus is comprised in such a terminal device. Further, the apparatus may be or comprise a module (to be attached to the UE) providing connectivity, such as a plug-in unit, an “USB dongle”, or any other kind of unit. The unit may be installed either inside the UE or attached to the UE with a connector or even wirelessly. The apparatus may also comprise a user interface 18 comprising, for example, at least one keypad, a microphone, a touch display, a display, a speaker, etc. The user interface may be used to control the apparatus by the user.

In an embodiment, the apparatus 10 may be or be comprised in a network node, such as in gNB/gNB-CU/gNB-DU of 5G. In an embodiment, the apparatus is or is comprised in the network node 110.

The apparatus may further comprise a radio interface (TRX) 16 comprising hardware and/or software for realizing communication connectivity according to one or more communication protocols. The TRX may provide the apparatus with communication capabilities to access the radio access network, for example.

In an embodiment, a CU-DU (central unit - distributed unit) architecture is implemented. In such case the apparatus 50 may be comprised in a central unit (e.g. a control unit, an edge cloud server, a server) operatively coupled (e.g. via a wireless or wired network) to a distributed unit (e.g. a remote radio head/node). That is, the central unit (e.g. an edge cloud server) and the radio node may be standalone apparatuses communicating with each other via a radio path or via a wired connection. Alternatively, they may be in a same entity communicating via a wired connection, etc. The edge cloud or edge cloud server may serve a plurality of radio nodes or a radio access networks. In an embodiment, at least some of the described processes may be performed by the central unit. In another embodiment, the apparatus may be instead comprised in the distributed unit, and at least some of the described processes may be performed by the distributed unit. In an embodiment, the execution of at least some of the functionalities of the apparatus 50 may be shared between two physically separate devices (DU and CU) forming one operational entity. Therefore, the apparatus may be seen to depict the operational entity comprising one or more physically separate devices for executing at least some of the described processes. In an embodiment, the apparatus controls the execution of the processes, regardless of the location of the apparatus and regardless of where the processes/functions are carried out.

In an embodiment, an apparatus carrying out at least some of the embodiments described comprises at least one processor and at least one memory including a computer program code, wherein the at least one memory and the computer program code are configured, with the at least one processor, to cause the apparatus to carry out the functionalities according to any one of the embodiments described. According to an aspect, when the at least one processor executes the computer program code, the computer program code causes the apparatus to carry out the functionalities according to any one of the embodiments described. According to another embodiment, the apparatus carrying out at least some of the embodiments comprises the at least one processor and at least one memory including a computer program code, wherein the at least one processor and the computer program code perform at least some of the functionalities according to any one of the embodiments described. Accordingly, the apparatus may comprise means (12, 14, 16) for obtaining performance data of inference of at least one model from a plurality of user devices, wherein each of the user devices use at least one local model and/or a global model, and in the case the performance data is of the at least one local model, transmitting a request to obtain performance data of the global model, means (14) for storing the obtained performance data of the at least one global model with time information for observing at least one temporal effect, means (12, 14) for clustering the stored performance data based on information on the local model used for observing at least one spatial effect, means (12, 14) for obtaining an evaluation of performance of the global model cluster-wise using the stored performance data and the time information, means (12, 14, 16) for, in the case the performance as evaluated is not adequate in a certain number of clusters taking into consideration the at least one temporal effect, transmitting a global model retraining request (an/or the other options described above).

As used herein the term “means” is to be construed in singular form, i.e. referring to a single element, or in plural form, i.e. referring to a combination of single elements. Therefore, terminology “means for [performing A, B, C]”, is to be interpreted to cover an apparatus in which there is only one means for performing A, B and C, or where there are separate means for performing A, B and C, or partially or fully overlapping means for performing A, B, C. Further, terminology “means for performing A, means for performing B, means for performing C” is to be interpreted to cover an apparatus in which there is only one means for performing A, B and C, or where there are separate means for performing A, B and C, or partially or fully overlapping means for performing A, B, C.

According to yet another embodiment, the apparatus carrying out at least some of the embodiments comprises a circuitry including at least one processor and at least one memory including computer program code. When activated, the circuitry causes the apparatus to perform the at least some of the functionalities according to any one of the embodiments described.

In an embodiment, at least some of the processes described may be carried out by an apparatus comprising corresponding means for carrying out at least some of the described processes. Some example means for carrying out the processes may include at least one of the following: detector, processor (including dual-core and multiple-core processors), digital signal processor, controller, receiver, transmitter, encoder, decoder, memory, RAM, ROM, software, firmware, display, user interface, display circuitry, user interface circuitry, user interface software, display software, circuit, antenna, antenna circuitry, and circuitry.

A term non-transitory, as used herein, is a limitation of the medium itself (i.e. tangible, not a signal) as opposed to a limitation on data storage persistency (e.g. RAM vs. ROM).

The techniques and methods described herein may be implemented by various means. For example, these techniques may be implemented in hardware (one or more devices), firmware (one or more devices), software (one or more modules), or combinations thereof. For a hardware implementation, the apparatuses) of embodiments may be implemented within one or more applicationspecific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, other electronic units designed to perform the functions described herein, or a combination thereof. For firmware or software, the implementation can be carried out through modules of at least one chip set (e.g. procedures, functions, and so on) that perform the functions described herein. The software codes may be stored in a memory unit and executed by processors. The memory unit may be implemented within the processor or externally to the processor. In the latter case, it can be communicatively coupled to the processor via various means, as is known in the art. Additionally, the components of the systems described herein may be rearranged and/or complemented by additional components in order to facilitate the achievements of the various aspects, etc., described with regard thereto, and they are not limited to the precise configurations set forth in the given figures, as will be appreciated by one skilled in the art.

Embodiments as described may also be carried out in the form of a computer process defined by a computer program or portions thereof. Embodiments of the methods described may be carried out by executing at least one portion of a computer program comprising corresponding instructions. The computer program may be in source code form, object code form, or in some intermediate form, and it may be stored in some sort of carrier, which may be any entity or device capable of carrying the program. For example, the computer program may be stored on a computer program distribution medium readable by a computer or a processor. The computer program medium may be, for example but not limited to, a record medium, computer memory, read-only memory, electrical carrier signal, telecommunications signal, and software distribution package, for example. The computer program medium may be a non-transitory medium. Coding of software for carrying out the embodiments as shown and described is well within the scope of a person of ordinary skill in the art.

Even though the invention has been described above with reference to an example according to the accompanying drawings, it is clear that the invention is not restricted thereto but can be modified in several ways within the scope of the appended claims. Therefore, all words and expressions should be interpreted broadly and they are intended to illustrate, not to restrict, the embodiment. It will be obvious to a person skilled in the art that, as technology advances, the inventive concept can be implemented in various ways. Further, it is clear to a person skilled in the art that the described embodiments may, but are not required to, be com- bined with other embodiments in various ways.

Claims

1. An apparatus, comprising: at least one processor and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus at least to: obtain performance data of inference of at least one model from a plurality of user devices, wherein each of the user devices use at least one of a local model and a global model, and in the case the performance data is of the at least one local model, transmitting a request to obtain performance data of the global model; store the obtained performance data of the at least one global model with time information for observing at least one temporal effect; cluster the stored performance data based on information on the local model used for observing at least one spatial effect; obtain an evaluation of performance of the global model cluster-wise using the stored performance data and the time information, and in the case the performance as evaluated is not adequate in a certain number of clusters taking into consideration the at least one temporal effect, transmit a global model retraining request.

2. The apparatus of claim 1, wherein the request further comprises timing information for the inference of the at least one global model.

3. The apparatus of claim 1, wherein the time information is a time stamp, and the at least one temporal effect is continuity and/or repetitiveness of the at least one temporal effect.

4. The apparatus of claim 1, wherein the performance evaluation is based on at least one threshold set for the performance and/or comparison among a plurality of the clusters.

5. The apparatus of claim 1, wherein the performance evaluation is carried out for all clusters or at least part of the clusters selected based on history data and/or location or selected randomly.

6. The apparatus of claim 1, wherein, in the case the performance as evaluated is not adequate in the certain number of clusters taking into consideration the at least one temporal effect, as an option or in addition to the transmitting the retraining request, the apparatus is further caused to: obtain an evaluation of performance of the at least one local model in the certain number of clusters, and transmit, to the certain number of clusters, a request to use the at least one local model, in the case the performance as evaluated is adequate for the at least one local model.

7. The apparatus of claim 1, wherein the at least one local model and/or the global model training method is based on federated learning principles.

8. The apparatus of claim 1, herein the apparatus is a network node, a distributed unit or a central unit of the network node, or any device providing services in the radio network where the performance data is obtained.

9. A method, comprising: obtaining performance data of inference of at least one model from a plurality of user devices, wherein each of the user devices use at least one local model and/or a global model, and in the case the performance data is of the at least one local model, transmitting a request to obtain performance data of the global model; storing the obtained performance data of the at least one global model with time information for observing at least one temporal effect; clustering the stored performance data based on information on the local model used for observing at least one spatial effect; obtaining an evaluation of performance of the global model cluster-wise using the stored performance data and the time information; in the case the performance as evaluated is not adequate in a certain number of clusters taking into consideration the at least one temporal effect, transmitting a global model retraining request.

10. The apparatus of claim 9, wherein the request further comprises timing information for the inference of the at least one global model.

11. The apparatus of claim 9 or 10, wherein the time information is a time stamp, and the at least one temporal effect is continuity and/or repetitiveness of the at least one temporal effect.

12. The apparatus of any preceding claim 9-11, wherein the performance evaluation is based on at least one threshold set for the performance and/or comparison among a plurality of the clusters.

13. The apparatus of any preceding claim 9-12, wherein the performance evaluation is carried out for all clusters or at least part of the clusters selected based on history data and/or location or selected randomly.

14. The apparatus of any preceding claim 9-13, wherein, in the case the performance as evaluated is not adequate in the certain number of clusters taking into consideration the at least one temporal effect, as an option or in addition to the transmitting the retraining request, the apparatus is further caused to: obtain an evaluation of performance of the at least one local model in the certain number of clusters, and transmit, to the certain number of clusters, a request to use the at least one local model, in the case the performance as evaluated is adequate for the at least one local model.

15. The apparatus of any preceding claim 9- 14, wherein the at least one local model and/or the global model training method is based on federated learning principles.

16. The apparatus of any preceding claim 9-15, herein the apparatus is a network node, a distributed unit or a central unit of the network node, or any device providing services in the radio network where the performance data is obtained.

17. An apparatus, comprising means for carrying out the method according to any of claims 9-16.

18. A computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out the method of any of claims 9-16.

19. A distributed computing system, comprising means for carrying out the method according to any of claims 9-16.

20. A computer system, comprising: one or more processors; at least one data storage, and one or more computer program instructions to be executed by the one or more processors in association with the at least one data storage cause the computer system at least to: obtain performance data of inference of at least one model from a plurality of user devices, wherein each of the user devices use at least one of a local model and a global model, and in the case the performance data is of the at least one local model, transmitting a request to obtain performance data of the global model; store the obtained performance data of the at least one global model with time information for observing at least one temporal effect; cluster the stored performance data based on information on the local model used for observing at least one spatial effect; obtain an evaluation of performance of the global model cluster-wise using the stored performance data and the time information, and in the case the performance as evaluated is not adequate in a certain number of clusters taking into consideration the at least one temporal effect, transmit a global model retraining request.