WO2024110160A1 - ML model transfer and update between UE and network
- Publication number
- WO2024110160A1 (PCT/EP2023/080460)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- machine learning
- learning model
- model
- user equipment
- performance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- H04L41/16 — Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks, using machine learning or artificial intelligence
- G06N20/00 — Machine learning
- H04L41/0816 — Configuration setting characterised by the conditions triggering a change of settings, the condition being an adaptation, e.g. in response to network events
- H04W24/02 — Arrangements for optimising operational condition (wireless communication networks)
- G06F11/3409 — Recording or statistical evaluation of computer activity, e.g. of down time or of input/output operation, for performance assessment
- G06N3/045 — Combinations of networks (neural network architectures)
- H04L43/065 — Generation of reports related to network devices (monitoring or testing of data switching networks)
- H04L43/08 — Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
Definitions
- Federated learning refers to a machine learning (ML) technique in which a central server may pool the learning occurring across machine learning models at client nodes, without the central server having access to the local training data at the client nodes, so the privacy of the local data is maintained.
- ML models may be transferred frequently between the central server and the client nodes.
- a method that includes transmitting, by a user equipment and to an access node, a request for a machine learning model, wherein the request comprises information on at least one model adaptation constraint for training of the machine learning model or an inference of the machine learning model; receiving, by the user equipment and from the access node, the machine learning model that is adapted in accordance with the at least one model adaptation constraint, and at least one instruction for monitoring performance of the machine learning model and/or for monitoring at least one user equipment performance indicator; applying, by the user equipment, the machine learning model to the training of the machine learning model or the inference of the machine learning model; monitoring, by the user equipment, the machine learning model and/or the at least one user equipment performance indicator according to the at least one instruction; and transmitting, by the user equipment and to the access node, information on at least one of the performance or failure information indicating the user equipment's failure to apply the machine learning model.
- the user equipment may receive from the access node the machine learning model that is not adapted in accordance with the at least one model adaptation constraint, and the at least one instruction for monitoring performance of the machine learning model and/or for monitoring the at least one user equipment performance indicator.
- the user equipment may adapt the machine learning model before continuing with the applying, the monitoring, and the transmitting of information on at least one of the performance or the failure information indicating the user equipment's failure to at least one of apply or adapt the machine learning model.
- the user equipment may continue to use the machine learning model.
- the user equipment may switch to at least one of a non-machine learning mode for performing a task of the machine learning model and a prior version of the machine learning model for performing the task.
- the machine learning model may be adapted in accordance with the at least one model adaptation constraint by at least compressing the machine learning model by at least one of a weight pruning, a structural pruning, a weight quantization, or a machine learning model architecture change.
- the at least one model adaptation constraint may include at least one of a constraint related to the machine learning model, a constraint related to a user equipment resource constraint, a battery life of the user equipment, or a latency requirement for the inference of the machine learning model.
- the at least one instruction for monitoring performance of the machine learning model comprises one or more metrics for evaluating the performance of the machine learning model, and/or wherein the at least one user equipment performance indicator comprises one or more key performance indicators.
- a method that includes receiving, by an access node and from a user equipment, a request for a machine learning model, wherein the request comprises information on at least one model adaptation constraint for training of the machine learning model or an inference of the machine learning model; and in response to the access node being able to adapt the machine learning model, adapting, by the access node, the machine learning model using the at least one model adaptation constraint, determining, by the access node, at least one instruction for monitoring performance of the machine learning model and/or for monitoring at least one user equipment performance indicator, transmitting, by the access node and to the user equipment, the machine learning model and the at least one instruction, and receiving, by the access node from the user equipment, information on monitoring carried out based on the at least one instruction, or failure information indicating the user equipment's failure to apply the machine learning model.
- the access node may transmit the machine learning model in an un-adapted form to the user equipment, in response to the access node not being able to adapt the machine learning model using the at least one model adaptation constraint.
- the method may comprise further adapting, by the access node, the machine learning model using the information from the user equipment.
- the machine learning model that is adapted in accordance with the at least one model adaptation constraint may be adapted by at least compressing the machine learning model using at least one of a weight pruning, a structural pruning, a weight quantization, or a machine learning model architecture change.
- the at least one instruction for monitoring performance of the machine learning model may comprise one or more metrics for evaluating the performance of the machine learning model, and/or wherein the at least one user equipment performance indicator comprises one or more key performance indicators.
- the access node may comprise or be comprised in at least one of a radio access network node, a gNB type base station, or a server.
- FIG. 1 shows an example of a federated learning process among a plurality of user equipment, in accordance with some embodiments.
- FIGs. 2A, 2B, and 2C depict examples of processes for transferring a machine learning model, in accordance with some example embodiments.
- FIG. 3A depicts an example of various constraints, in accordance with some embodiments.
- FIG. 3B depicts an example process at a user equipment for ML model adaptation, in accordance with some embodiments.
- FIG. 3C depicts an example process at an access node for ML model adaptation, in accordance with some embodiments.
- FIG. 4 depicts an example of a ML model, in accordance with some example embodiments.
- FIG. 5 depicts an example of a network node, in accordance with some example embodiments.
- FIG. 6 depicts an example of an apparatus, in accordance with some example embodiments.
- an ML model may be trained such that during an inference phase the ML model may be used in 5G to perform a task, such as using the ML model for channel state information (CSI) compression, prediction of “best” beams for beam selection in time and spatial domain, mobility handling of the UE, link level performance, and for other applications or functions in the cellular system.
- the ML model may need to be transferred over an air interface between a network node, such as a base station (e.g., next generation evolved Node B, gNB) and a user equipment (UE).
- UEs may vary in their constraints to handle the processing associated with a given ML model.
- the network node can assess the constraints (also referred to herein as model adaptation constraints) at the UE for handling a given ML model.
- one or more constraints may enable the network node to adapt an ML model for a given UE (or group of UEs having the same or similar constraints for handling ML models).
- the adaptation may include compressing the ML model by for example pruning the ML model and/or adapting the quantization used for the parameters of the ML model.
- FIG. 1 shows an example of a federated learning process among a plurality of user equipment, in accordance with some embodiments.
- Although FIG. 1 depicts federated learning, this is merely an example, as the ML models transferred over the air interface may use other types of learning, such as unsupervised learning, supervised learning, reinforcement learning, semi-supervised learning, self-supervised learning, and/or the like.
- ML models may be transferred over the UE- gNB air interface at least one time as demonstrated in the example below.
- the gNB 106 may provide to, for example, UEs 102A-C an initial ML model.
- Each of the UEs 102A-C uses its own local data 104A-C to train the initial ML model.
- the UE 102A may train the initial ML model using its local data 104A without accessing the local data 104B-C;
- UE 102B may train the initial ML model using its local data 104B;
- UE 102C trains the initial ML model using its local data 104C.
- the UEs 102A-C each send a partial ML model 108A-C towards the gNB 106.
- the sending of the partial ML model may comprise sending the parameters (e.g., weights, activations, and/or other configuration information) for the partial ML model (but the local data 104A-C is not sent as noted).
- the gNB 106 may then combine (e.g., aggregate) the partial ML models 108A-C to form a “global” ML model 110 (also referred to as an “aggregate” ML model).
- the global ML model may be transferred at 112A-C to at least the UEs 102A-C.
- the UEs 102A-C may perform additional training of the global ML model 110 and return it to the gNB and/or central server 109. For example, a predetermined quantity of training iterations (or a predetermined error threshold) may be used to determine whether additional training of the global ML model is needed at each of the UEs 102A-C using their corresponding local data 104A-C.
- the global (or aggregate) ML model 110 provided at 112A-C may be used for an inference phase to perform a task. If the number of training iterations is two, for example, the global (or aggregate) ML model 110 provided at 112A-C may be trained again using local data by each of the UEs 102A-C and returned to the gNB for aggregation, thus forming a second version of the global ML model. This second version of the global (or aggregate) ML model 110 provided at 112A-C to the UEs may be used for an inference phase to perform a task. Although this example refers to training iterations, a predetermined error threshold (e.g., perform additional training until an error of the task performed by the global ML model is below a threshold error) may be used as well.
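- The aggregation rule itself is not detailed here; as an illustrative sketch (an assumption, not the patented procedure), a simple federated-averaging combination of the partial ML models 108A-C could look like the following:

```python
import numpy as np

def fed_avg(partial_models, weights=None):
    """Combine partial ML models (here, flat parameter vectors) into a
    global model by (weighted) averaging, as in federated averaging."""
    if weights is None:
        weights = [1.0 / len(partial_models)] * len(partial_models)
    return sum(w * p for w, p in zip(weights, partial_models))

# Partial models 108A-C from UEs 102A-C (illustrative parameter vectors).
partials = [np.random.randn(10) for _ in range(3)]
global_model = fed_avg(partials)  # the "global" ML model 110
```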
- the UEs 102A-C may use the global ML model 110 provided at 112A-C for an inference phase, during which the global ML model is used to perform a task, such as a ML task (e.g., the CSI compression, beam selection, or other task).
- the global ML model 110 may be provided to additional UEs 102D-N with the global ML model 110 for use during an inference phase to enable those additional UEs to perform the task.
- the central server 109 may be comprised as an edge server located at the radio access network associated with the gNB 106. Alternatively, or additionally, the central server 109 may be comprised in the core network, such as the 5G core network. Alternatively, or additionally, the central server 109 may be comprised as a cloud service.
- the process may include an initialization phase during which an ML model is selected to be trained at the local nodes, such as UEs 102A-C.
- the ML model may be selected for CSI compression, while another ML model may be selected to perform another task such as beam selection and/or the like.
- the process may also include a training client selection phase, during which one or more of the local nodes are selected to perform the federated learning of the ML model.
- the central server 109 may select a subset of the UEs 102A-N for federated learning, so in the example of FIG. 1 the selected nodes are UEs 102A-C.
- Each of the selected UEs 102A-C may perform the local training of the initial (or global) ML model using a corresponding set of local data 104A-C.
- the selected UEs 102A-C may each send their locally trained ML model 108A-C to the gNB 106 (and/or central server 109), where ML model aggregation is performed.
- a pre-defined termination criterion may be satisfied (e.g., a predefined criterion, such as a maximum number of iterations during training, performance threshold, error threshold, and/or the like).
- the central server 109 may send the global ML model 110 to the UEs for additional training using local data, and these UEs respond with partial ML models.
- the central server 109 may publish the global ML model 110 to at least the UEs 102A-C as well as other UEs to enable the UEs to use the global ML model for inference (e.g., to perform one or more tasks such as the CSI compression, beam selection, and/or the like).
- ML model transfer over the air interface between the UEs 102A-C and base station 106 may, as noted, be realized with other types of ML models as well as other types of ML training schemes.
- the ML model transfer to the UE may involve partial training at the UE (e.g., where the UEs perform a part of the training as in the case of federated learning) or full training at the network side, such as at the gNB 106 (or server 109).
- the trained ML model may be transferred over the air interface to the UEs 102A-C to perform ML related tasks using the trained ML model. This transfer of the ML model raises some issues.
- a first issue relates to how an ML model may be adapted to a given UE’s constraints, such as availability of CPU resources, availability of a specialized machine learning processor such as an AI chip, availability of memory, availability of a hardware accelerator, current state of the battery, current mode of the UE (e.g., power save mode), UE mobility state changes, and/or other model adaptation constraints at the UE.
- the network may need to be aware of a UE’s constraints (e.g., capabilities) when deciding what (or even whether) to transfer an ML model and/or an “adapted” ML model.
- Another issue with respect to the ML model use at the UE relates to how often, how much, and/or to what extent the ML model should be updated via additional training.
- An ML model may be transferred, as noted, via the air interface between a network node (e.g., a radio access network node, a gNB, etc.) and one or more UEs.
- the ML model may be trained in a variety of ways (e.g., using federated learning, supervised learning, unsupervised learning, and/or the like).
- the ML model’s output may provide an inference as in the case of the “best” beam selection noted above or perform some other task, such as the CSI compression, classification, and/or the like.
- a given UE may not have the capabilities, as noted, to handle a given ML model, which may operate in a power-hungry manner, so the inference of the ML model may need additional processing resources for ML model execution at the UE, when compared to a UE that does not execute an ML model.
- the ML model’s inference phase may use too much of a UE’s available resources and/or take too much time for a specific inference.
- a signaling mechanism, such as a message exchange, between the radio access network (e.g., a gNB or other type of base station or access point) and the UE, such that the signaling supports ML model transfer (and/or ML model update) while taking into account ML model adaptation constraints of the UE.
- gNB-UE signaling that defines the ML model adaptation constraints at a UE (or group of UEs), with selected constraints affecting the ML model training and/or inference.
- the UE’s model adaptation constraints (or constraints, for short) with respect to machine learning may be processed at, for example, the gNB in order to decide whether to adapt the ML model before providing the ML model to the UE.
- the signaling may indicate the scope (e.g., how much) of the adaptation of the ML model before providing the “adapted” ML model to the UE.
- the UE’s constraints with respect to machine learning are indicated using categories, such as ML categories indicative of the constraints at the UE for handling an ML model.
- a new function at the gNB, such that the new function adapts (e.g., prepares, modifies, updates, and/or the like) the ML model in response to the UE’s model adaptation constraints (which may, for example, be in the form of the UE’s ML category) signaled to the gNB by the UE.
- the UE may indicate its constraints with respect to executing an ML model at the UE during an initial ML model deployment to the UE as well as at other times, such as when conditions at the UE change due to for example changes in battery level, available memory, available processor resources, UE mobility state changes, and/or the like.
- the network may cluster one or more UEs using the constraints of the UEs, and may prepare an ML model (or an update to an ML model) in response to these constraints.
- a group of UEs may be clustered by the network, such that the UEs in the cluster have the same or similar ML constraints (e.g., as signaled to the network in a UE capability exchange with the network during an initial network attachment or at other times such as UE service request, etc.).
- the network may adapt an ML model based on the ML constraints and transfer the “adapted” ML model to the entire cluster of UEs. As noted, the transfer may include sending the parameters of the adapted ML model.
- the network may send the adapted (also referred to as prepared) ML model to one or more UEs, and this ML model may be a partial ML model (in which case the partial ML model may be further trained or tuned by the UE as noted in the federated learning example above) and/or a “complete” ML model ready for use to perform inferences at the UE (so the “complete” ML model does not require additional training or tuning by the UE).
- the network may indicate, via for example assistance information, to the UE the selected option with respect to whether a partial or complete ML model is provided to the UE.
- the network may transfer a partial ML model to the UE.
- the UE may tune, adjust, and/or complete the partial ML model based on the UE’s constraints.
- the network may provide a complete ML model (e.g., complete in the sense that the UE does not need to tune or adjust the ML model) tailored to the UE’s constraints.
- the network may indicate which of these two options is the selected option to inform the UE.
- the network, such as the gNB or other network node, may not be able to adapt an ML model to meet the model adaptation constraints of a UE.
- the UE’s constraints may be so limited that the network may not be able to limit the scope of the ML model while providing a viable ML model which can be used by the UE to accurately perform the corresponding inference task of the ML model.
- the network may send to the UE(s) an indication that the ML model cannot be provided to the UE.
- the network may send to the UE(s) the original trained ML model (e.g., which has not been prepared or adapted to accommodate UE ML constraints) with additional information that assists the UE with adaptation (e.g., different compression/pruning options, such as a bit width restriction of 16 or 32 bits, or pruning options such as a suggestion to drop a given number of layers) while ML model accuracy can still be maintained after the adaptation.
- the UE may adapt the ML model to its own constraints and inform the gNB of the adaptations (and/or of the performance of the UE or ML model or a failure to apply or adapt the ML model).
- the ML model exchanged between a UE and a network node, such as the gNB, may, as noted, be provided by sending ML model parameters (e.g., weights of a neural network and/or the like). The provided ML model parameters may also be sent with additional metadata, such as other configuration information for the ML model or ML model operation (e.g., operations related to training, data collection, pre-processing of data, and/or post-processing of data).
- FIG. 2A depicts an example of a signaling process, in accordance with some embodiments.
- FIG. 2A depicts the UE 102A and a network node, such as gNB 106.
- the ML model is prepared and transferred (e.g., transmitted, sent, granted access, etc.) by the gNB based on a request from the UE.
- the ML model may be prepared based on the UE’s constraints (e.g., UE’s capabilities with respect to ML).
- the UE 102A may have a requirement for an ML model, in accordance with some embodiments. The requirement may be for a specific task, such as an ML model trained to perform CSI compression, optimum beam selection, or another task.
- the UE 102A may send towards the network (e.g., a base station, such as gNB 106 or other type of base station, access point, or network node) a request 202.
- the request 202 may indicate to the network that an ML model is requested for transfer to the UE.
- the UE may initiate the request 202 indicating a specific task or a specific ML model (e.g., an identifier that identifies a specific type of ML model, such as a ML model for a specific ML task, such as CSI compression or other task).
- the ML model requested at 202 may be a “partial” ML model that may need additional training by the UE or a so-called “complete” ML model that may not need additional training by the UE.
- the identifier may map to (or otherwise identify) an ML model and/or a task (which is mapped to an ML model).
- a first indicator may map to an ML model trained for a CSI compression task, while a second indicator may map to an ML model trained for a beam selection task.
- the identifier and/or mapping to an ML model may be pre-defined (e.g., in a 3GPP standard or other type of standard).
- the identifier may be a 128-bit identifier (although the identifier may take other lengths as well).
- the most significant 32 bits of the 128 bits may indicate a vendor ID (or a location of the ML model)
- the next 32 bits may include a scenario (e.g., application, task/use case, such as CSI compression, and/or the like)
- the next 64 bits may serve as a data set ID that identifies the data used (or to be used) to train the ML model.
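- As a minimal sketch of the identifier layout described above (the field packing below is an assumption for illustration, not a standardized format):

```python
# Hypothetical 128-bit ML model identifier:
# [vendor_id: 32 bits][scenario: 32 bits][dataset_id: 64 bits]

def pack_model_id(vendor_id: int, scenario: int, dataset_id: int) -> int:
    """Pack the three fields into a single 128-bit integer."""
    assert vendor_id < (1 << 32) and scenario < (1 << 32) and dataset_id < (1 << 64)
    return (vendor_id << 96) | (scenario << 64) | dataset_id

def unpack_model_id(model_id: int):
    """Recover (vendor_id, scenario, dataset_id) from the identifier."""
    return ((model_id >> 96) & 0xFFFFFFFF,
            (model_id >> 64) & 0xFFFFFFFF,
            model_id & 0xFFFFFFFFFFFFFFFF)

# Example: vendor 0x0001, scenario 0x0002 (e.g., CSI compression), data set 42.
mid = pack_model_id(0x0001, 0x0002, 42)
assert unpack_model_id(mid) == (0x0001, 0x0002, 42)
```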
- the ML model transfer request 202 may include one or more UE model adaptation constraints (labeled “SetOfConstraints”) related to the UE’s execution of the ML model.
- the model adaptation constraints (or “constraints,” for short) may be associated with version information, such as a counter, time stamp, or the like, to indicate to the UE and the network different versions of the constraints as at least some of the constraints may vary over time.
- the one or more UE model adaptation constraints may correspond to one or more of the following constraints at the UE: a physical restriction at the UE with respect to hardware resources and/or software processing of the ML model; a maximum ML model size (e.g., ML model size in kilobytes, megabytes, and/or the like); a processing constraint dictated by a UE’s specific capabilities, such as available memory at a UE; a maximum allowed processing power available at the UE for ML modeling (e.g., in floating point operations (FLOPS), teraFLOPS (TFLOPS), and/or the like); a bit width limitation for the ML model parameters (e.g., weights and bias that define or configure the ML model) in terms of, for example, 16 bits, 32 bits, 64 bits, or the like; or an availability of a hardware accelerator or generic processor capabilities at the UE (e.g., the presence or absence of a GPU, AI chip, single core processor, multi-core processor, and/or the like).
- the model adaptation constraints at the UE may be structured into a two-dimensional table, such as Table 1 below.
- In Table 1, the UE model adaptation constraints are categorized into sets (Set 1, Set 2, and so forth), each with a corresponding set of constraint values, for example:

| Constraint set | Maximum model size | Quantization | GPU cores available for the ML model | Memory available for the ML model (MB) |
|---|---|---|---|---|
| Set 1 | 16 | 16 | 4 | 4 |
| Set 2 | 32 | 86 | 2 | 8 |

- The selection of Set 1 defines the constraints listed in its row; if Set 2 is indicated, its row of constraints applies instead.
- a given use case may be mapped to one or more of the sets. For example, if an ML model is for a CSI compression task, then Set 1 may be specified, but if an ML model is for beam selection, Set 2 may be selected.
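- As a sketch, the constraint sets of Table 1 and their use-case mapping could be represented as a simple lookup on the network side (values mirror the example above; the names are illustrative only):

```python
# Illustrative encoding of the Table 1 constraint sets.
CONSTRAINT_SETS = {
    "Set 1": {"max_model_size": 16, "quantization": 16, "gpu_cores": 4, "memory_mb": 4},
    "Set 2": {"max_model_size": 32, "quantization": 86, "gpu_cores": 2, "memory_mb": 8},
}

# Hypothetical mapping of use cases to constraint sets, per the example above.
USE_CASE_TO_SET = {"csi_compression": "Set 1", "beam_selection": "Set 2"}

def constraints_for(use_case):
    return CONSTRAINT_SETS[USE_CASE_TO_SET[use_case]]

print(constraints_for("csi_compression"))  # the Set 1 constraints
```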
- the model adaptation constraints may be dynamic (and as such may vary over time), in which case the UE may signal via request 202 an indication of the updated set of UE constraints for the ML model execution at the UE. For example, if the state of the UE changes due to, for example, a change in processing resources (or another change, such as the UE battery power level dropping below a threshold amount, or the UE switching its operation from a single connection to dual connectivity), the constraints at the UE change, and this change may trigger the UE to send a request 202 with updated UE model adaptation constraints showing, for example, the decrease in available resources for the ML model.
- the dynamic signaling (which is due to for example the change in UE state or resources) provided by the UE at 202 may be transmitted via, for example, an uplink message including the updated information.
- This message may include the updated constraints and/or a pointer to the constraints being updated (e.g., an earlier message with prior constraints), such that the pointer enables the network to update the UE constraints at the pointer location or flag the constraints as no longer valid due to the update.
- the UE may signal at 202 the model adaptation constraints in the form of an ML specific UE category (e.g., an ML category 1 UE).
- the UE may inform the network of the UE’s constraints with respect to ML model by using an additional entry in a first dimension of a table (as noted above at Table 1) and this entry may indicate that the UE needs the ML model within a given time (e.g., this may be the case due to an impending mobility event wherein the UE needs an ML model to perform measurement prediction of a set of cells or beams).
- the network may take this into account as a latency (e.g., time) sensitive request for adaptation of the ML model.
- the UE may need to predict the latency of transmission of an ultra-reliable low latency communications (URLLC) packet in the uplink (UL) and/or downlink (DL), for which an ML model is required within a given amount of time. If the UE does not receive an updated ML model within this period of time, the UE may initiate a fallback to a non-ML model approach or may continue to use a currently active ML model.
- the network node may prepare, based at least in part on the model adaptation constraints received at 202, the ML model, in accordance with some embodiments.
- the gNB may prepare the ML model, which may be a “partial” ML model or a “complete” ML model.
- the preparation may include adapting the ML model by, for example, compressing an ML model. This compressing may take the form of changing the quantization of the ML model parameters (e.g., weights, biases, and/or other configuration and operation information for the ML model).
- the gNB may adapt the ML model by reducing the quantity of bits in the parameters to 16 bits (which may have the effect of compressing the size or footprint of the ML model when sent to the UE).
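- A minimal numpy sketch of this kind of parameter quantization (reducing 32-bit floats to 16 bits, or more aggressively to 8-bit integers with a per-tensor scale); this illustrates the idea only and is not the patented procedure:

```python
import numpy as np

def quantize_fp16(weights):
    """Reduce 32-bit float parameters to 16 bits, halving the transfer size."""
    return weights.astype(np.float16)

def quantize_int8(weights):
    """Uniform 8-bit quantization: int8 values plus a per-tensor scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale  # dequantize with q.astype(np.float32) * scale

w = np.random.randn(1000).astype(np.float32)
w16 = quantize_fp16(w)
q8, s = quantize_int8(w)
print(w.nbytes, w16.nbytes, q8.nbytes)  # 4000, 2000, 1000 bytes
```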
- the network node such as the gNB 106 or other type of network node, may adapt the ML model by, for example, pruning the ML model. Pruning provides data compression by reducing the size of the ML model (e.g., reducing a number of nodes, layers, and/or weights).
- one or more nodes of the neural network may be removed, for example, and/or one or more layers of the neural network may be removed (e.g., based on the UE’s ML constraint indicating a maximum number of layers a UE can handle).
- the pruning may also reduce the number of processing operations (e.g., in terms of FLOPs or TFLOPs) to execute the pruned ML model.
- a larger ML model may be made smaller by removing nodes, layers, weights, connections, and/or reducing quantization to form a smaller ML model that is then transferred to the UE.
- the pruning and/or quantization may reduce the size of the ML model, such that the ML model can be more efficiently transferred over the air (or radio) interface between the UE and gNB and reduce the memory or storage footprint of the ML model.
- the smaller, pruned ML model may execute more rapidly and thus provide an output (e.g., as an inference) more quickly with less latency, when compared to a larger, unpruned ML model.
- the pruned ML model may be, as noted, less robust (in terms of accuracy) when compared to the larger, unpruned ML model, but the amount of pruning and/or quantization may be controlled to minimize (or manage) any loss in accuracy such that the pruned ML model still provides a viable and useful alternative to the larger, unpruned ML model.
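- The following numpy sketch illustrates magnitude-based weight pruning of the kind described above (zeroing the lowest-magnitude weights to make the model sparser); an actual implementation might instead use a framework utility, and the sparsity target here is arbitrary:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

layer = np.random.randn(64, 64).astype(np.float32)
pruned = magnitude_prune(layer, sparsity=0.5)
print(f"sparsity achieved: {np.mean(pruned == 0.0):.2f}")
```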
- some UE types may (based on their UE model adaptation constraints) need the network to adapt an ML model before transfer due to physical computing resource limitations. But the need to adapt the ML model may also take into account more temporal variations at the UE, such as the noted state (or condition) changes at the UE due to an energy saving mode, loss of processing resources, battery level, and/or the like.
- the number of layers may be defined by constraints that affect the rate at which inference may be performed (e.g., an inference latency).
- the network (which may have more processing resources than the UE) may, for a 10-layer ML model, have the same or similar inference latency as the UE (which may have fewer processing resources than the network) executing a pruned 5-layer ML model.
- the network may store the ML model with greater fidelity (e.g., as a larger ML model with higher quantization, higher quantity of nodes, layers, weights, and/or the like), when compared to the ML model that is adapted and transferred to the UE.
- the network may store a “complete” ML model (which, e.g., is trained and ready for inference) having a certain number of layers and quantization, but the compressed (e.g., pruned and the like) ML model transferred to the UE may have fewer nodes, layers, weights, and/or the like, and/or have parameters quantized down from 32 bits to 24 (or 16) bits, for example, to enable the pruned ML model to fit into a smaller memory footprint at the UE.
- the network may receive a plurality of sets of model adaptation constraints and form a group of UEs that require a given ML model.
- the UE group may have the same (or similar) set of model adaptation constraints and the UE group may have a requirement for the same (or similar) ML model for a required use case or ML task.
- the network may perform the adaptation of the ML model for a given ML model and provide the given ML model to the entire UE group.
- the network may combine UE requests from a plurality of UEs having the same (or similar) set of constraints and perform a single preparation of the ML model for the entire group (or cluster).
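- A sketch of how the network side might group pending UE requests by a (hashable) constraint set so that a single adaptation serves a whole cluster; the request format below is an assumption for illustration:

```python
from collections import defaultdict

# Hypothetical pending requests: (ue_id, requested_model, constraint_set).
requests = [
    ("ue1", "csi_compression", ("size32", "quant8")),
    ("ue2", "csi_compression", ("size32", "quant8")),
    ("ue3", "beam_selection",  ("size16", "quant16")),
]

clusters = defaultdict(list)
for ue_id, model_id, constraints in requests:
    clusters[(model_id, constraints)].append(ue_id)

for (model_id, constraints), ues in clusters.items():
    # One adaptation of model_id under `constraints`, one transfer to all `ues`.
    print(f"adapt {model_id} for {constraints} -> send to {ues}")
```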
- the adaptation may take into account whether the requested ML model is for a time sensitive or a critical service, such as URLLC, in which case a larger amount of UE processing resources may be allocated at the UE, when compared to non-time sensitive services.
- the UE may, at 202, indicate to the network whether the ML model is for a time sensitive or critical service, such as URLLC.
- a given ML model may map to a task that is time sensitive or critical, so in this example the request for the ML model implicitly indicates to the network that the model is for a time sensitive task.
- some UEs may be more tolerant of errors in the ML model, when compared to other UEs.
- a UE may, at 202, provide to the network an indication regarding the amount of error (in the ML model) that can be tolerated (or allowed) by the UE.
- the network may take this into account during preparation at 203A. For example, a UE may indicate that it prefers a very low ML model error rate, in which case the network may perform less pruning when compared to an ML model provided to a UE that indicates that it tolerates ML errors.
- An initial number of layers may be set by simulation or by training the ML model for the UE types provided, so as to allow setting the number of layers to match a given inference latency.
- this information may be applicable to a particular class of UEs supporting high reliability, e.g., URLLC-type traffic. This information may then be signaled to the UE either when requested, or grouped for UEs with similar requirements.
- the network, such as the gNB 106 or other type of network node, may wait to prepare an ML model at 203A until it has a plurality of requests (from a plurality of UEs) that can be clustered into a group having the same (or similar) constraints and/or the same (or similar) requirement for a given type of ML model.
- the network may form multiple clusters of the same (or similar) UEs (with respect to constraints and need for a given type of ML model).
- the network may prepare (e.g., by pruning or quantization reduction) an ML model for each cluster and respond with a single ML model for a given one of the clusters.
- the formation of groups of UEs may use a profile.
- a set of model adaptation constraints may be collected over time and stored at for example a server, such as server 109, an edge server, and/or other type of server.
- the profile refers to a collection of sets of model adaptation constraints that are gathered over time for a group of UE(s) and/or classified according to UE categories, types, and/or tasks being performed by the UE. If a UE’s profile matches a stored profile in the server, the profile may be quickly accessed to provide the set of constraints (and thus define the amount of adaptation to be performed for the ML model).
- a group of UEs may be of the same type, such as IoT sensors, vehicle-to-everything (V2X) mobile devices, and/or the like.
- the database can adapt the ML model for the constraints mapped to the IoT device profile.
- the network node 106 may initiate a transfer of an ML model (which as noted may be a partial or a complete ML model), in accordance with some embodiments.
- the ML model (which is prepared at 203A) may be transferred over the air interface from the gNB 106 to the UE 102A.
- the ML model transfer may include a transfer of the parameters of the ML model; such parameters may include, for example, weights of the connections (and/or other parameters or metadata for the configuration and/or execution of the ML model).
- the ML model transfer 204 A may also include information (also referred to as “assistance information”), which may indicate for example how to monitor the performance of the ML model and/or whether (and/or when) to report the performance to the network.
- the assistance information provided to the UE at 204A may include at least one instruction for monitoring performance of the machine learning model and/or for monitoring at least one user equipment performance indicator.
- the network may initiate a ML transfer with a ML model (e.g., version 1 of the ML model) and assistance information (e.g., version 1).
- This assistance information may instruct the UE to record and report back to the network the UE’s consumption of hardware and/or software resources (e.g., reporting back to the network a percentage available processor resources being used relative to processor resources available at the UE, a percentage of a buffer consumed relative to available buffer capacity, and/or the impact of other UE constraints such as radio measurement performance, overheating, inference latency, and the like when executing the ML model).
- Table 2 below depicts an example of the parameters recorded by the UE and reported back to the network.
- the parameters may be used by the network to accept the ML model (e.g., V1) for use at the UE, reject the ML model (e.g., if it degrades the operation of the UE below a threshold level), and/or further adapt the ML model to form a new version of the ML model, which can be transferred as V2 to the UE.
- the data set identifier (ID) may provide a pointer or a reference to a collection of labelled data samples (or data sets) that are used for model training and validation purposes.
- the UE 102A may apply the ML model provided at 204A to the training of the ML model or to the inference of the ML model, in accordance with some example embodiments. Moreover, the UE may also monitor the effects of the ML model at the UE and/or monitor at least one user equipment performance indicator. This monitoring may be in response to the assistance information (and in particular the instruction(s)) provided at 204A (which may indicate to the UE whether (and what) to monitor for performance, what resource(s) to monitor at the UE, and/or whether (and/or when) to report back the observations obtained via the monitoring).
- the monitoring may include observing the change in processor resources, memory resources, impact to other functionality (e.g., radio measurement performance, overheating, inference latency, etc.), and/or the like, and reporting the observations to the network, such as the gNB 106 or other type of network node.
- the assistance information may instruct the UE to report back if, for example, processor resources (or, e.g., memory and/or the like) used while executing the ML model at the UE exceed a threshold amount; if the threshold is exceeded, the UE reports back to the network.
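- A sketch of such threshold-triggered reporting on the UE side (the instruction and metric names below are assumptions for illustration, not 3GPP-defined fields):

```python
def check_and_report(observations, instructions):
    """Return the subset of observations that breach their thresholds."""
    report = {}
    for metric, threshold in instructions.items():
        value = observations.get(metric)
        if value is not None and value > threshold:
            report[metric] = value
    return report

# Thresholds from the assistance information, and the UE's observed values.
instructions = {"cpu_used_pct": 80.0, "buffer_used_pct": 90.0, "inference_latency_ms": 5.0}
observations = {"cpu_used_pct": 86.5, "buffer_used_pct": 40.0, "inference_latency_ms": 7.2}
print(check_and_report(observations, instructions))
# -> {'cpu_used_pct': 86.5, 'inference_latency_ms': 7.2}
```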
- the UE may report to the network any observation performed by the UE (which may as noted be indicated by the assistance information provided at 204A).
- the UE may transmit to the access node (which in this example is a gNB) information on the performance of the ML model (and/or a UE performance indicator).
- the UE may continue to use the ML model (if, e.g., the UE so chooses).
- the UE 102A may record observations related to the consumption of resources at the UE during ML model execution and report the observations at 206A. These observations may (as noted above with respect to Table 2) be used by the UE (or network) to accept an ML model for use at a given UE. If, for example, the observations indicate the performance of the ML model decreases below a threshold performance (e.g., with respect to resources at the UE as noted above), the network may not “accept” the use of the prepared ML model at the UE (as well as at other UEs with the same or similar constraints).
- the observed information may be formatted into a two-dimensional form, where the first dimension comprises the information described earlier and the second dimension is the data set identifier used for the acceptance process for the model in the model transfer feedback message.
- the network node 106 may use the feedback received at 206A to adapt (e.g., tune) how the ML model is prepared for the UE. If, for example, the feedback indicates the processor resources used exceed a threshold amount of processor resources, the network may do additional compressing, such as pruning (or decreasing quantization) of the ML model, and provide an updated ML model at 208. Likewise, if the feedback indicates the processor resources used are below the threshold amount of processor resources, the network may undo some of the pruning (e.g., add nodes, layers, or increase quantization) of the ML model and provide an updated ML model (e.g., version 2 of the ML model) at 208.
- the model transfer at 208 may include only the changed parameters of the ML model.
- the model transfer at 208 (e.g., ML model version 2 (v2)) may alternatively include the entire ML model (e.g., all of the parameters of the ML model).
- FIG. 2B depicts another example of a process for signaling, in accordance with some embodiments.
- FIG. 2B is similar in some respects to FIG. 2A, so the process of FIG. 2B includes 201, 202, 203A, and 204A noted above.
- the UE 102A cannot apply (or chooses not to apply) the ML model due to, for example, model adaptation constraints at the UE.
- the constraints may include a local model adaptation constraint, such as the UE choosing to fall back to a non-ML mode, or another type of constraint (e.g., a change in the mode of the UE, such as a power saving mode, and/or the like).
- the UE may respond at 206B to the network node 106 with an indication of a failure (labeled “model application failure”).
- the message at 206B may also include a cause of the failure and/or an activation delay.
- the cause may indicate a reason why the UE chose not to apply the ML model, while the activation delay refers to a maximum time before which the model is required to be applied (e.g., due to the underlying use case requirement).
- the UE may not use the ML model due to an inference latency requirement (e.g., the latency for providing the ML model’s inference exceeds a threshold latency), so the ML model cannot be applied or used.
- the UE may not be able to use the ML model due to a run-time execution issue, in which case the ML model cannot be executed at the UE.
- the UE may, in response to not being able to use the ML model, switch to a non-ML mode of operation for a task or continue to use a current ML model (e.g., an ML model being used prior to the model transfer at 204A).
- FIG. 2C depicts another example of a process that provides signaling, in accordance with some embodiments.
- FIG. 2C is similar in some respects to FIG. 2A, so the process of FIG. 2C includes 201 and 202, but at 203C the network node 106 cannot prepare an ML model based on the UE model adaptation constraints provided to the network at 202.
- the network node 106 may receive the set of constraints 202 and note that, for a given ML model, the ML model should not be executed at the UE, as the model adaptation constraints indicate the UE cannot handle the execution of the ML model (even with adaptation) and/or the amount of requested/needed compression (e.g., pruning, quantization, and/or the like) would yield an ML model below a threshold level of accuracy.
- the UE may be responsible for the adaptation of the ML model, so the ML model is fully transferred to the UE to allow adaptation by the UE.
- the network node 106 may at 204C indicate to the UE that the network node 106 cannot transfer an adapted ML model and/or may instead transfer a “full” ML model (where “full” refers to an ML model that has not been adapted at 203C by the network).
- the model transfer may also include assistance information as noted above, but the assistance information may additionally indicate that the ML model is a “full,” unpruned ML model.
- the UE may choose to adapt (e.g., compress by, for example, pruning, or otherwise adapt) the ML model received at 204C based on the UE’s model adaptation constraints.
- the UE may respond to the network with feedback in the form of observations, as noted above with respect to 206A, or an indication of failure, as noted above with respect to 206B.
- the UE may continue in its current state (e.g., using an ML model or non-ML model for a task) or switch to the ML model provided and adapted at 204C/205C, based on the feedback.
- the network node 106 may be unable to prepare an ML model with the UE’s constraints, in which case the network node does not perform pruning/quantization.
- the network node may transfer the full ML model to the UE and leave the pruning of the model to the UE implementation.
- the network node may still provide assistance information about the ML model that assists the UE in any model pruning and/or quantization process, and/or other metadata to assist the UE in training or inference.
- the network node may guide (e.g., via assistance information) the UE to use a given configuration (e.g., 16 bit quantization, pruning of 2 layers, and direct connection of inner layers).
- the network node may guide (e.g., via assistance information) the UE to use 24 bit quantization and remove 3 layers.
- the network node may guide (e.g., via assistance information) the UE to consider removing 2 layers.
- the UE may (given this assistance information) decide, after it performs the ML model adaptation (e.g., pruning, quantization, layer removal, and/or the like), whether the resulting ML model can be executed at the UE.
- the UE (which received the full ML model at 204C) may use a side link communication (e.g., Proximity Services, ProSe), upon indication from the network, to exchange the full ML model with other UEs having the same or similar constraints (or the same or a similar category) so that those UEs can also perform the ML model adaptation.
- the UE may also exchange a pruned/quantized ML model if there is a constraint in the amount of data that can be exchanged with the neighboring (side link) UE(s).
- the UE 102A may provide to the network node 106 the UE’s model adaptation constraints with respect to ML model execution as category information (“ue-MLcategory”) that indicates the UE’s capabilities for ML model execution.
- Table 3 depicts an example of the UE category information.
- the network node may receive the UE category information and treat the category information as indicative of the set of constraints for the UE, such that the category information allows the network to prepare the ML model based on the UE’s unique constraints.
- the UE category information may be pre-defined, such as in a standard or 3GPP standard to enable standardization in the system.
- the UE categories for machine learning may be defined based on a variety of factors.
- categories may define that the UE may have one or more of the following constraints:
- a memory size of a given size (e.g., in megabytes);
- an amount of supported quantization for the ML model parameters (e.g., weights, biases, and/or other configuration information for the ML model);
- a maximum number of training parameters for the ML model, which may correspond to the total number of parameters to be estimated during the training of an ML model (e.g., in a neural network of 2 hidden layers and 50 hidden nodes per layer, the maximum number of training parameters may correspond to a total of 261,000 trainable parameters);
- data handling capacity, including memory, which determines what length of data batches the UE is able to handle to perform model training (e.g., data handling capacity can also refer to limitations in the number of training iterations the UE is able to do);
- an inference speed test (e.g., how long it takes a UE to perform an inference using a given ML model), which may be a cumulative distribution function (CDF) of the inference times for a group of data samples.
- a training speed test (how long it takes to train an ML model given a particular dataset).
- the network may determine that a given UE may be able to use a particular ML model (e.g., for a given task or use case) only when the UE has provided UE category information with respect to ML.
- each of the ML models (which the network node 106 is able to transfer to a UE) may have a minimum ue-MLcategory mapped to the ML model.
- the network node sends an ML model to the UE when the signaled ue-MLcategory meets or exceeds the ML model’s minimum ue-MLcategory.
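- A minimal sketch of this gating rule (the per-model minimum categories below are invented for illustration):

```python
# Hypothetical minimum ue-MLcategory mapped to each transferable ML model.
MODEL_MIN_CATEGORY = {"csi_compression_v1": 3, "beam_selection_v1": 2}

def can_transfer(model_id, ue_ml_category):
    """Send the ML model only if the signaled category meets the minimum."""
    return ue_ml_category >= MODEL_MIN_CATEGORY[model_id]

assert can_transfer("csi_compression_v1", ue_ml_category=3)
assert not can_transfer("csi_compression_v1", ue_ml_category=2)
```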
- the network node may cluster, as noted, the UEs based at least in part on the UE category information (ue-MLcategory) with respect to ML.
- the network can prepare an ML model based on the indication of the UE category information with respect to ML (“ue-MLcategory”).
- the gNB may include or have access to a trained “complete” ML model (which uses 1,200,000 parameters, each quantized as a 32-bit integer, with the whole ML model occupying 65 megabytes (MB)).
- the UE may indicate, at 202 for example or at other times, a ue-MLcategory of 3.
- the network may prepare the ML model as follows.
- Prune the ML model to reduce the number of trainable parameters to 500,000 (e.g., using an iterative process or based on the outcome of earlier pruning procedures for similar ML models), such that the accuracy of the pruned ML model may be checked and, if the accuracy meets a threshold accuracy, further adaptation of the ML model may occur.
- the further adaptation may quantize the ML model to match the UE’s ue-MLcategory of 3, which in this example corresponds to 8-bit integers per parameter (so the 32-bit integers are truncated to 8 bits); again, the accuracy of the pruned and quantized model is checked and, if the accuracy meets a threshold accuracy, further adaptation may be performed.
- the further adaptation may include checking the memory size of the pruned and quantized ML model against the allowed size indicated by ue-MLcategory 3 (which in this example is 32 megabytes (MB)); if the pruned and quantized model size is smaller than the maximum allowed value of 32 MB, the adaptation may be considered complete, and the ML model is ready to be transferred to the UE at 204A, for example.
- otherwise, the adaptation may be repeated with a change (e.g., by varying the available model hyperparameters for the training procedure) until a suitable model format is achieved.
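- The preparation steps above amount to an iterative adapt-and-check loop. The sketch below uses the ue-MLcategory 3 numbers from the example (500,000 parameters, 8-bit quantization, 32 MB limit) with stubbed pruning, quantization, and accuracy checks; the accuracy threshold of 0.90 is an assumption:

```python
from dataclasses import dataclass

@dataclass
class Model:
    num_params: int
    bits: int
    accuracy: float

def prune(m, target_params):
    # Placeholder: pruning cuts parameters and, here, costs a little accuracy.
    return Model(target_params, m.bits, m.accuracy - 0.02)

def quantize(m, bits):
    # Placeholder: quantization truncates the parameter bit width.
    return Model(m.num_params, bits, m.accuracy - 0.01)

def adapt_for_category3(m, max_rounds=5):
    """Prune -> check accuracy -> quantize -> check accuracy -> check size."""
    for _ in range(max_rounds):
        m = prune(m, 500_000)                      # step 1: 500,000 parameters
        if m.accuracy < 0.90:
            continue                               # retry, e.g., with new hyperparameters
        m = quantize(m, 8)                         # step 2: 8-bit integers (category 3)
        if m.accuracy < 0.90:
            continue
        size_mb = m.num_params * m.bits / 8 / 1e6  # step 3: memory size check
        if size_mb <= 32:                          # category 3 allows up to 32 MB
            return m                               # ready for transfer at 204A
    return None                                    # cannot adapt; see FIG. 2C path

full = Model(num_params=1_200_000, bits=32, accuracy=0.95)
print(adapt_for_category3(full))  # 500,000 params at 8 bits, about 0.5 MB
```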
- the process for preparing the ML model may also vary based on the type of ML model (e.g., whether the ML model is a convolutional neural network (CNN), deep neural network (DNN), long short-term memory network (LSTM), and/or other type of ML model), so the ML-enabled function description might also need to include some information about the general ML model type used or acceptable by the UE.
- a partial ML model transfer may occur from the gNB to the UE.
- a first part of the UE ML model may be considered static (and should not be changed by the UE) while a second part may be changed by the UE; so, in an ML model transfer to the UE, the UE may change the second part of the ML model.
- the model transfer may indicate which portions may be changed (e.g., changed during training) by the UE.
- the set of model adaptation constraints may be abstractly shared without exposing the UE’s underlying architecture to the network (e.g., abstract enough to cover a few levels in a real-time model transfer).
- One such example implementation of a set of constraints is a UE profile as noted above.
- some of the UE model adaption constraints may be more static, such as absolute constraints, while other constraints may be more dynamic, such as evolving constraints, and still other constraints may be subjective.
- the model adaption constraints may include an absolute set of criteria, such as an absolute memory size, hardware type, and/or the like.
- the criteria may also include a list of evolving criteria, such as semi-static criteria, examples of which include an ongoing traffic level at the UE, ongoing battery level, available memory, processing power, and/or the like.
- the criteria may also include a set of subjective criteria that may be guided by the end user; examples of such criteria may include an eco-mode (or battery save mode) configured at the UE, a maximum performance mode, and/or the like.
- with the set of constraints or the ue-MLcategory, UE vendors may be allowed to indicate the UE ML capability without having to expose the workings of the architecture of the UE (which may be considered proprietary or private). In other words, some of the constraints may vary over time (and are thus more dynamic), such as the battery level of the UE or a mode the user puts the UE in, such as a power saving mode. An example UE profile is sketched below.
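As one illustration only, the constraint types above could be grouped into a UE profile structure like the following; the field names and structure are assumptions for illustration, not a format defined by this disclosure.

```python
# Hypothetical UE profile grouping the absolute, evolving, and subjective
# model adaptation constraints described above.
from dataclasses import dataclass

@dataclass
class UEModelAdaptationProfile:
    # absolute (static) constraints
    max_memory_mb: int
    hardware_type: str
    # evolving (semi-static) constraints, refreshed over time
    traffic_level: float = 0.0
    battery_level_pct: float = 100.0
    available_memory_mb: int = 0
    # subjective, end-user guided constraints
    eco_mode: bool = False
    max_performance_mode: bool = False

profile = UEModelAdaptationProfile(max_memory_mb=32, hardware_type="npu")
```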
- machine learning model adaptation may further include adaptation of the input data pre-processing and/or output data post-processing.
- the battery life of an edge device, such as the UE, may limit its computing capacity.
- some applications, services, and/or use cases may have strict latency requirements (e.g., time requirements with respect to how long it takes the ML model to perform an inference or how long it takes the ML model to converge when training).
- the ML model may be optimized for real-time inference (or updates for ML model training), which may be carried out by ML model compression techniques, such as pruning, quantization, and/or the like.
- Pruning may be used as an ML model compression technique in machine learning to reduce the size of the ML model by removing elements of the ML model that are non-critical and/or redundant from an ML standpoint.
- the ML model may be optimized for real-time inferences for resource-constrained devices, such as UEs.
- the ML model pruning may also be used with other ML model compression techniques such as quantization, low-rank matrix factorization, and/or the like, to further reduce the size of the ML model.
- the original, unpruned ML model and the pruned ML model may have the same (or similar) architecture in some respects, with the pruned model being sparser (e.g., with low-magnitude weights set to zero).
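A minimal sketch of such magnitude-based weight pruning is shown below, assuming NumPy and a single flat weight matrix; real pruning pipelines typically operate per layer and fine-tune the model afterwards.

```python
# Minimal magnitude-based weight pruning sketch (illustration only).
import numpy as np

def prune_by_magnitude(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the fraction `sparsity` of weights with the lowest magnitude."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0      # same architecture, sparser weights
    return pruned

w = np.random.randn(256, 128).astype(np.float32)
w_pruned = prune_by_magnitude(w, sparsity=0.6)
```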
- ML model monitoring may be performed during, for example, an operational stage of the machine learning lifecycle: changes in machine learning model performance are monitored, such as model output performance, input data drift, and/or concept drift, to ensure that the model maintains an acceptable level of performance.
- the monitoring may be carried out by evaluating the performance on real-world data.
- the monitored performance indicators may be system performance indicators or intermediary performance indicators.
- in ML model compression, the same training data may be used for both the original ML model and the compressed (e.g., adapted) ML model.
- ML model monitoring may be based on evaluation metrics and related conditions, such as threshold values. The choice of the evaluation metrics (e.g., confusion matrix, accuracy, precision, recall, and F1 score) may depend on a given machine learning task and the ML model being used.
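The sketch below illustrates one way such threshold-conditioned monitoring could look; the choice of the F1 metric and the report structure are assumptions for illustration.

```python
# Hypothetical monitoring check: evaluate the deployed model on real-world
# samples and flag it when an evaluation metric drops below a threshold.
from sklearn.metrics import f1_score

def monitor_model(model, inputs, labels, f1_threshold=0.8):
    predictions = model.predict(inputs)
    score = f1_score(labels, predictions, average="macro")
    return {"metric": "f1", "value": score,
            "degraded": score < f1_threshold}   # condition for reporting
```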
- FIG. 3B depicts an example process at a user equipment for ML model adaption, in accordance with some embodiments.
- the user equipment may transmit, by a user equipment and to an access node, a request for a machine learning model, wherein the request includes information on at least one model adaptation constraint for training of the machine learning model or an inference of the machine learning model, in accordance with some embodiments.
- the UE 102 may transmit towards an access node (e.g., a radio access node, gNB base station, server, such as an edge server) a request for a ML model as noted in the examples of FIGs. 2A-C at 202.
- the request may include at least one model adaptation constraint for the ML model while training or inferring.
- the model adaptation constraint may include one or more constraints.
- the model adaptation constraint may be in the form of a UE category (such as the categories noted above with respect to Table 3) that maps to one or more constraints with respect to the UE’s training of a machine learning model and/or the UE performing inferences using the machine learning model.
- the model adaptation constraints may include a constraint related to the machine learning model (e.g., a worst case inference latency requirement for the ML model, an average inference latency, a maximum number of predicted time steps in case of predicting a future series of events, a time window duration corresponding to the number of predicted time steps, and/or the like), a constraint related to a user equipment resource constraint (e.g., available processor resources, memory resources, and the like), a battery life of the user equipment, as well as other types of constraints that impact the model adaptation.
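For illustration, a request carrying such constraints might be serialized as below; the field names and the JSON encoding are assumptions, as the disclosure does not fix a message format.

```python
# Hypothetical encoding of the model request (cf. FIGs. 2A-C at 202).
import json

request = {
    "message": "ml_model_request",
    "ue_ml_category": 3,                         # category mapping to constraints
    "model_adaptation_constraints": {
        "worst_case_inference_latency_ms": 10,
        "average_inference_latency_ms": 5,
        "max_predicted_time_steps": 4,
        "prediction_window_ms": 40,
        "available_memory_mb": 32,
        "battery_level_pct": 75,
    },
}
payload = json.dumps(request)                    # sent towards the access node
```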
- the user equipment may receive from the access node the machine learning model (that is adapted in accordance with the at least one model adaptation constraint) and may receive at least one instruction for monitoring performance of the machine learning model and/or for monitoring at least one user equipment performance indicator, in accordance with some embodiments.
- the UE 102A for example may receive (e.g., as shown at 204A) from the access node a ML model which has been adapted using the at least one model adaptation constraint.
- the ML model may be adapted to reduce its size, which may reduce the time (or latency) for the ML model to perform an inference.
- the performance indicator may include key performance indicators (KPIs) for different network segments, layers, mechanisms, aspects, services, and/or activities.
- the UE performance indicator may indicate performance in relation to network transport, front-haul, a radio link quality, a data plane efficiency, and/or control plane operations (e.g., handover execution time, user attachment time, and/or the like).
- Additional examples of UE performance indicators include latency, throughput for a network and/or a network slice, UE throughput, UE power consumption, and/or the like.
- the user equipment may apply the machine learning model to the training of the machine learning model or the inference of the machine learning model, in accordance with some embodiments.
- the UE may apply the ML model at the UE by using the ML model for inference or to train the ML model.
- the user equipment may monitor the machine learning model and/or the at least one user equipment performance indicator according to the at least one instruction, in accordance with some embodiments.
- the UE may receive one or more instructions at 204A.
- the instructions may inform the UE regarding observing the performance of the ML model and/or the UE performance (e.g., the user equipment performance indicator(s)) while using the ML model for training or inference.
- the instruction may include one or more metrics for evaluating the performance of the machine learning model and/or the UE (e.g., latency of an inference, impact to the UE’s ability to perform other tasks, such as radio and/or channel quality measurements (e.g., for measuring frequencies the UE is currently operating on as well as the frequencies within the same radio access technology or outside current radio access technology and/or the like)).
- the instruction may include a KPI or other condition, such as a performance degradation of the user equipment when using the ML model for training or inference.
- the condition may include a threshold value (e.g., a percentage usage of a processor, memory, and/or other resource), and if the threshold value is exceeded, the UE reports the observation to the access node.
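A sketch of that UE-side check might look as follows; the resource-sampling callable and the report format are assumed for illustration.

```python
# Hypothetical UE-side check: report to the access node when resource usage
# while running the ML model exceeds the configured threshold.
def check_and_report(sample_usage, send_report, cpu_threshold_pct=80.0):
    usage = sample_usage()            # e.g., {"cpu_pct": 91.0, "mem_pct": 55.0}
    if usage["cpu_pct"] > cpu_threshold_pct:
        send_report({"event": "threshold_exceeded",
                     "indicator": "cpu_pct",
                     "value": usage["cpu_pct"]})
```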
- the user equipment may transmit to the access node information on at least one of the performance or failure information indicating the user equipment’s failure to apply the machine learning model, in accordance with some embodiments.
- the UE 102A may transmit as feedback to the access node one or more observations made based on the instruction to monitor performance.
- the UE 102A may transmit to the access node an indication of a failure to apply the ML model at the UE (see, e.g., 206B).
- the UE may choose to not use the ML model for training or inference.
- the UE may indicate to the access node a failure.
- in some cases, the ML model received at 304 is not adapted by the access node, in accordance with some embodiments.
- the access node (which in this example is gNB 106) may not be able to adapt the ML model given the UE’s model adaptation constraint(s).
- the UE may receive (from the access node) a machine learning model that has not been adapted (e.g., un-adapted ML model) to allow the UE to adapt the ML model.
- the un-adapted ML model may be received with instructions for monitoring performance of the machine learning model and/or for monitoring the UE performance indicator so the UE can report back to the access node.
- the access node may provide assistance information to the UE, such as information indicating the ML model is not adapted.
- the UE may adapt the ML model and then continue with the applying (306), the monitoring (308), and/or the transmitting (310).
- in response to the performance indicating there is no failure to apply the machine learning model at the user equipment, the UE may continue to use the machine learning model as noted in the example of 210. Alternatively, or additionally, in response to the performance indicating there is a failure to apply the machine learning model at the user equipment, the UE may switch, as a fallback, to a non-machine learning mode for performing the task of the machine learning model and/or switch to a prior version of the machine learning model for performing the task.
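The continue-or-fall-back behavior can be sketched as below; the mode names and the failure flag are illustrative assumptions.

```python
# Hypothetical fallback decision after monitoring (cf. 210 and the failure case).
def select_operating_mode(apply_failed: bool, prior_model=None):
    if not apply_failed:
        return ("ml", None)               # keep using the current ML model
    if prior_model is not None:
        return ("ml_prior", prior_model)  # fall back to a prior model version
    return ("non_ml", None)               # fall back to a non-ML algorithm
```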
- the machine learning model may be adapted in accordance with the at least one model adaptation constraint by at least compressing the machine learning model.
- This compressing may reduce the size of the ML model.
- the compressing may take the form of pruning weights from the ML model (e.g., weight pruning), structural pruning (e.g., removing layers, nodes, and the like), quantization changes (e.g., weight quantization from 32 to 16 bits), and/or a machine learning model architecture change (e.g., choosing another type of ML model, such as a convolutional neural network (CNN) instead of a multi-layer perceptron (MLP)).
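As an illustration of the quantization option, the sketch below maps 32-bit float weights to 8-bit integers with a uniform scale; the symmetric scheme is an assumption, one of several common choices.

```python
# Minimal symmetric uniform quantization sketch (illustration only):
# 32-bit float weights -> 8-bit integers plus one scale factor.
import numpy as np

def quantize_int8(weights: np.ndarray):
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale                      # dequantize approximately with q * scale

w = np.random.randn(64, 64).astype(np.float32)
q, scale = quantize_int8(w)
```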
- FIG. 3C depicts an example process at an access node for ML model adaption, in accordance with some embodiments.
- the access node may receive from a user equipment a request for a machine learning model, wherein the request includes information on at least one model adaptation constraint for training of the machine learning model or an inference of the machine learning model, in accordance with some embodiments.
- the access node (which in this example is a gNB) may receive from UE 102A a request for an ML model.
- the request may include information on at least one model adaptation constraint for training of the machine learning model or inference of the machine learning model.
- the model adaptation constraint may include one or more values for constraints and/or a UE category (such as the categories noted above with respect to Table 3) that maps to one or more constraints with respect to the UE’s training of a machine learning model and/or the UE performing inferences using the machine learning model.
- the model adaption constraints may include a constraint related to the machine learning model (e.g., a worst case inference latency requirement for the ML model, an average inference latency, a maximum number of predicted time steps in case of predicting a future series of events, a time window duration corresponding to the number of predicted time steps, and/or the like), a constraint related to a user equipment resource constraint (e.g., available processor resources, memory resources, and the like), a battery life of the user equipment, as well as other types of constraints.
- the access node may comprise or be comprised in a radio access node, a gNB base station, a server, and/or an edge server (e.g., a server coupled to a gNB or located with the gNB).
- the access node may adapt the machine learning model using the at least one model adaptation constraint, in accordance with some embodiments.
- the gNB may adapt the ML model while taking into account the UE’s model adaptation constraint.
- the ML model adaption may include compressing, based on the model adaptation constraint(s), the machine learning model.
- the ML model may be compressed by pruning weights from the ML model (e.g., weight pruning), structural pruning (e.g., removing layers, such as hidden layers, or removing nodes), quantization changes (e.g., weight quantization from 32 to 16 bits), and/or a machine learning model architecture change.
- the access node may determine at least one instruction for monitoring performance of the machine learning model and/or for monitoring at least one user equipment performance indicator, in accordance with some embodiments.
- the instruction may instruct the UE regarding monitoring the ML model and/or the UE performance during the training or inference.
- the instruction may include one or more metrics for evaluating the performance of the machine learning model (and/or for monitoring UE performance indicator(s), such as the KPIs).
- the instruction may include a KPI or other condition, such as a performance degradation of the user equipment when using the ML model for training or inference.
- the condition may include a threshold value (e.g., a percentage usage of a processor, memory, and/or other resource), and if the threshold value is exceeded, the UE reports the observation to the access node.
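One possible shape for such an instruction, combining evaluation metrics with reporting conditions, is sketched below; all field names are assumptions for illustration.

```python
# Hypothetical monitoring instruction transmitted with the ML model:
# which metrics to evaluate and the conditions that trigger a report.
monitoring_instruction = {
    "model_metrics": ["accuracy", "inference_latency_ms"],
    "ue_kpis": ["cpu_pct", "memory_pct", "ue_throughput_mbps"],
    "report_conditions": [
        {"indicator": "cpu_pct", "op": ">", "threshold": 80.0},
        {"indicator": "accuracy", "op": "<", "threshold": 0.85},
    ],
}
```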
- the access node may transmit to the user equipment, the machine learning model and the at least one instruction, in accordance with some embodiments.
- the gNB may transmit to the UE the ML model and instructions (e.g., assistance information and/or other types of information).
- the access node may receive from the user equipment information on monitoring carried out based on the at least one instruction, or failure information indicating the user equipment failure to apply the machine learning model, in accordance with some embodiments.
- the gNB may receive from the UE feedback, which may include observations on the monitored performance of the ML model and/or UE and/or an indication of a failure to apply the ML model by the UE.
- the access node may not be able to adapt the ML model using the at least one model adaptation constraint provided at 320. When this is the case, the access node may (as noted in the example of 204C) provide an ML model to the UE to allow the UE to attempt to adapt the ML model.
- in response to receiving the information on monitoring carried out based on the at least one instruction, or the failure information indicating the user equipment’s failure to apply the machine learning model, the access node may further adapt the machine learning model using the information from the user equipment.
- the access node (which in this example is gNB 106) may, based on the feedback from the monitoring and/or the failure to apply the ML model, adapt the ML model.
- the adapted ML model may be transmitted again to the UE (either when requested or without a request as an update for example).
- the ML model adaption may include compressing, based on the model adaptation constraint(s), the machine learning model by, for example, pruning weights from the ML model (e.g., weight pruning), a structural pruning (e.g., removing layers, such as hidden layers, or removing nodes), quantization changes (e.g., weight quantization from 32 to 16 bits), and/or machine learning model architecture change.
- the at least one instruction for monitoring performance of the machine learning model may include one or more metrics for evaluating the performance of the machine learning model.
- the at least one user equipment performance indicator may include one or more key performance indicators.
- the instruction(s) may include one or more metrics for evaluating the performance of the machine learning model (and/or for monitoring UE performance indicator(s), such as the KPIs).
- the instruction may include a KPI or other condition, such as a performance degradation of the user equipment when using the ML model for training or inference.
- FIG. 4 shows an example of a ML model 110, in accordance with some example embodiments.
- the ML model may include one or more blocks.
- the first neural network (NN) Block 1 402 may receive data 410 as inputs from the UE 102A. This data may represent, for example, measurements related to CSI compression, beam measurements, and/or the like.
- the ML model may include so-called “internal” NN blocks 404A, B, and L.
- each internal NN block (2, ..., L-1) may have n_h neurons, such that NN Block L 404L generates the output 410 using, for example, M output neurons corresponding to the outputs 410 (e.g., in an inference phase, the outputs correspond to the task being performed, such as beam selection, CSI compression values, and/or the like, while in training the outputs may correspond to “labeled” data used to train the ML model).
- each NN block may be configured with fully connected layers (FNN), activation function layers, and batch normalization layers.
- the weights W_l and biases b_l form the trainable parameters of the l-th NN block.
- the output of NN Block L may be produced using a decision function (e.g., a SoftMax function).
- the ML model 110 may be trained with a stochastic gradient descent (SGD) algorithm that seeks a minimum of the loss function by iteratively stepping along the gradient with respect to the ML model weights W, although other training techniques may be used as well.
- the ML model may be trained in other ways as noted above using for example federated learning, unsupervised learning, and/or the like.
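The block structure and the gradient-based training step described above can be sketched as follows, assuming NumPy, ReLU activations in the internal blocks, and a SoftMax decision function at the output; the shapes and activations are illustrative assumptions.

```python
# Illustrative forward pass through L fully connected NN blocks and an
# SGD-style parameter update (shapes and activations are assumptions).
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward(x, params):
    """params: list of (W_l, b_l) per block; ReLU inside, SoftMax at block L."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(0.0, h @ W + b)   # internal blocks with n_h neurons each
    W_L, b_L = params[-1]
    return softmax(h @ W_L + b_L)        # M output neurons

def sgd_step(params, grads, lr=0.01):
    """One SGD update: step against the gradient of the loss."""
    return [(W - lr * gW, b - lr * gb) for (W, b), (gW, gb) in zip(params, grads)]
```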
- FIG. 5 depicts a block diagram of a network node 500, in accordance with some example embodiments.
- the network node 500 may comprise or be comprised in one or more network side nodes or functions, such as the network node 106 (e.g., gNB, eNB, DU, TRPs, coupled server 109, centralized server, edge server, and/or the like).
- the network node 500 may include a network interface 502, a processor 520, and a memory 504, in accordance with some example embodiments.
- the network interface 502 may include wired and/or wireless transceivers to enable access to other nodes, including base stations, other network nodes, the Internet, other networks, and/or other nodes.
- the memory 504 may comprise volatile and/or non-volatile memory including program code, which when executed by at least one processor 520 provides, among other things, the processes disclosed herein with respect to the gNB (or access node), for example.
- FIG. 6 illustrates a block diagram of an apparatus 10, in accordance with some example embodiments.
- the apparatus 10 may comprise or be comprised in a user equipment, such as user equipment 102A-N.
- the various embodiments of the user equipment 204 can include cellular telephones such as smart phones, tablets, personal digital assistants (PDAs) having wireless communication capabilities, portable computers having wireless communication capabilities, image capture devices such as digital cameras having wireless communication capabilities, gaming devices having wireless communication capabilities, music storage and playback appliances having wireless communication capabilities, Internet appliances permitting wireless Internet access and browsing, vehicles such as autos and/or trucks, aerial vehicles such as manned or unmanned aerial vehicles, as well as portable units or terminals that incorporate combinations of such functions.
- the user equipment may comprise or be comprised in an IoT device, an Industrial IoT (IIoT) device, and/or the like.
- as an IoT or IIoT device, the UE may be configured to operate with fewer resources (in terms of, for example, power, processing speed, memory, and the like) when compared to a smartphone, for example.
- the apparatus 10 may include at least one antenna 12 in communication with a transmitter 14 and a receiver 16. Alternatively, transmit and receive antennas may be separate.
- the apparatus 10 may also include a processor 20 configured to provide signals to and receive signals from the transmitter and receiver, respectively, and to control the functioning of the apparatus.
- Processor 20 may be configured to control the functioning of the transmitter and receiver by effecting control signalling via electrical leads to the transmitter and receiver.
- processor 20 may be configured to control other elements of apparatus 10 by effecting control signalling via electrical leads connecting processor 20 to the other elements, such as a display or a memory.
- the processor 20 may, for example, be embodied in a variety of ways including circuitry, at least one processing core, one or more microprocessors with accompanying digital signal processor(s), one or more processor(s) without an accompanying digital signal processor, one or more coprocessors, one or more multi-core processors, one or more controllers, processing circuitry, one or more computers, various other processing elements including integrated circuits (for example, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), and/or the like), or some combination thereof. Accordingly, although illustrated in FIG. 6 as a single processor, in some example embodiments the processor 20 may comprise a plurality of processors or processing cores.
- the apparatus 10 may be capable of operating with one or more air interface standards, communication protocols, modulation types, access types, and/or the like.
- Signals sent and received by the processor 20 may include signalling information in accordance with an air interface standard of an applicable cellular system, and/or any number of different wireline or wireless networking techniques, comprising but not limited to Wi-Fi, wireless local access network (WLAN) techniques, such as Institute of Electrical and Electronics Engineers (IEEE) 802.11, 802.16, 802.3, ADSL, DOCSIS, and/or the like.
- these signals may include speech data, user generated data, user requested data, and/or the like.
- the apparatus 10 and/or a cellular modem therein may be capable of operating in accordance with various first generation (1G) communication protocols, second generation (2G or 2.5G) communication protocols, third-generation (3G) communication protocols, fourth-generation (4G) communication protocols, fifth-generation (5G) communication protocols, sixth-generation (6G) communication protocols, and/or Internet Protocol Multimedia Subsystem (IMS) communication protocols (for example, session initiation protocol (SIP)), and/or the like.
- the apparatus 10 may be capable of operating in accordance with 2G wireless communication protocols IS-136, Time Division Multiple Access (TDMA), Global System for Mobile communications (GSM), IS-95, Code Division Multiple Access (CDMA), and/or the like.
- the apparatus 10 may be capable of operating in accordance with 2.5G wireless communication protocols General Packet Radio Service (GPRS), Enhanced Data GSM Environment (EDGE), and/or the like. Further, for example, the apparatus 10 may be capable of operating in accordance with 3G wireless communication protocols, such as Universal Mobile Telecommunications System (UMTS), Code Division Multiple Access 2000 (CDMA2000), Wideband Code Division Multiple Access (WCDMA), Time Division- Synchronous Code Division Multiple Access (TD-SCDMA), and/or the like. The apparatus 10 may be additionally capable of operating in accordance with 3.9G wireless communication protocols, such as Long Term Evolution (LTE), Evolved Universal Terrestrial Radio Access Network (E-UTRAN), and/or the like. Additionally, for example, the apparatus 10 may be capable of operating in accordance with 4G wireless communication protocols, such as LTE Advanced, 5G, and/or the like as well as similar wireless communication protocols that may be subsequently developed.
- the processor 20 may include circuitry for implementing audio/video and logic functions of apparatus 10.
- the processor 20 may comprise a digital signal processor device, a microprocessor device, an analog-to-digital converter, a digital-to-analog converter, and/or the like. Control and signal processing functions of the apparatus 10 may be allocated between these devices according to their respective capabilities.
- the processor 20 may additionally comprise an internal voice coder (VC) 20a, an internal data modem (DM) 20b, and/or the like.
- the processor 20 may include functionality to operate one or more software programs, which may be stored in memory. In general, processor 20 and stored software instructions may be configured to cause apparatus 10 to perform actions.
- processor 20 may be capable of operating a connectivity program, such as a web browser.
- the connectivity program may allow the apparatus 10 to transmit and receive web content, such as location-based content, according to a protocol, such as wireless application protocol, WAP, hypertext transfer protocol, HTTP, and/or the like.
- Apparatus 10 may also comprise a user interface including, for example, an earphone or speaker 24, a ringer 22, a microphone 26, a display 28, a user input interface, and/or the like, which may be operationally coupled to the processor 20.
- the display 28 may, as noted above, include a touch sensitive display, where a user may touch and/or gesture to make selections, enter values, and/or the like.
- the processor 20 may also include user interface circuitry configured to control at least some functions of one or more elements of the user interface, such as the speaker 24, the ringer 22, the microphone 26, the display 28, and/or the like.
- the processor 20 and/or user interface circuitry comprising the processor 20 may be configured to control one or more functions of one or more elements of the user interface through computer program instructions, for example, software and/or firmware, stored on a memory accessible to the processor 20, for example, volatile memory 40, non-volatile memory 42, and/or the like.
- the apparatus 10 may include a battery for powering various circuits related to the mobile terminal, for example, a circuit to provide mechanical vibration as a detectable output.
- the user input interface may comprise devices allowing the apparatus 10 to receive data, such as a keypad 30 (which can be a virtual keyboard presented on display 28 or an externally coupled keyboard) and/or other input devices.
- apparatus 10 may also include one or more mechanisms for sharing and/or obtaining data.
- the apparatus 10 may include a short-range radio frequency (RF) transceiver and/or interrogator 64, so data may be shared with and/or obtained from electronic devices in accordance with RF techniques.
- the apparatus 10 may include other short-range transceivers, such as an infrared (IR) transceiver 66, a Bluetooth™ (BT) transceiver 68 operating using Bluetooth™ wireless technology, a wireless universal serial bus (USB) transceiver 70, a Bluetooth™ Low Energy transceiver, a ZigBee transceiver, an ANT transceiver, a cellular device-to-device transceiver, a wireless local area link transceiver, and/or any other short-range radio technology.
- Apparatus 10 and, in particular, the short-range transceiver may be capable of transmitting data to and/or receiving data from electronic devices within the proximity of the apparatus, such as within 10 meters, for example.
- the apparatus 10 including the Wi-Fi or wireless local area networking modem may also be capable of transmitting and/or receiving data from electronic devices according to various wireless networking techniques, including 6LoWpan, Wi-Fi, Wi-Fi low power, WLAN techniques such as IEEE 802.11 techniques, IEEE 802.15 techniques, IEEE 802.16 techniques, and/or the like.
- the apparatus 10 may comprise memory, such as a subscriber identity module (SIM) 38, a removable user identity module (R-UIM), an eUICC, a UICC, a U-SIM, and/or the like, which may store information elements related to a mobile subscriber.
- the apparatus 10 may include volatile memory 40 and/or non-volatile memory 42.
- volatile memory 40 may include Random Access Memory (RAM) including dynamic and/or static RAM, on-chip or off-chip cache memory, and/or the like.
- Non-volatile memory 42 which may be embedded and/or removable, may include, for example, read-only memory, flash memory, magnetic storage devices, for example, hard disks, floppy disk drives, magnetic tape, optical disc drives and/or media, non-volatile random access memory (NVRAM), and/or the like. Like volatile memory 40, non-volatile memory 42 may include a cache area for temporary storage of data. At least part of the volatile and/or non-volatile memory may be embedded in processor 20. The memories may store one or more software programs, instructions, pieces of information, data, and/or the like which may be used by the apparatus for performing operations disclosed herein.
- the memories may comprise an identifier, such as an international mobile equipment identification (IMEI) code, capable of uniquely identifying apparatus 10.
- Some of the embodiments disclosed herein may be implemented in software, hardware, application logic, or a combination of software, hardware, and application logic.
- the software, application logic, and/or hardware may reside on memory 40, the control apparatus 20, or electronic components, for example.
- the application logic, software or an instruction set is maintained on any one of various conventional computer-readable media.
- a “computer-readable storage medium” may be any non-transitory media that can contain, store, communicate, propagate or transport the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer or data processor circuitry;
- computer-readable medium may comprise a non-transitory computer-readable storage medium that may be any media that can contain or store the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer.
- a technical effect of one or more of the example embodiments disclosed herein may include enhanced use of ML models at a UE as the ML model can be adapted to the specific constraints at a given UE.
- the base stations and user equipment (or one or more components therein) and/or the processes described herein can be implemented using one or more of the following: a processor executing program code, an application-specific integrated circuit (ASIC), a digital signal processor (DSP), an embedded processor, a field programmable gate array (FPGA), and/or combinations thereof.
- ASIC application-specific integrated circuit
- DSP digital signal processor
- FPGA field programmable gate array
- These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
- These computer programs (also known as programs, software, software applications, applications, components, program code, or code) may include machine instructions for a programmable processor.
- computer-readable medium refers to any computer program product, machine-readable medium, computer-readable storage medium, apparatus and/or device (for example, magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions.
- systems are also described herein that may include a processor and a memory coupled to the processor.
- the memory may include one or more programs that cause the processor to perform one or more of the operations described herein.
Ref document number: 23800802 Country of ref document: EP Kind code of ref document: A1 |