WO2025066789A1 - Data processing method and communication apparatus - Google Patents
Data processing method and communication apparatus
- Publication number
- WO2025066789A1, PCT/CN2024/116032
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- message
- information
- identifier
- delay
- quantization resolution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/098—Distributed learning, e.g. federated learning
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/16—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
Definitions
- the present application relates to the field of communications, and in particular to a data processing method and a communication device.
- Federated learning is a distributed machine learning paradigm that enables multiple devices to conduct joint training and build shared machine learning models without sharing data resources. Specifically, in the process of federated learning, the distributed nodes participating in federated learning can conduct joint training by sharing artificial intelligence (AI)-assisted data.
- the AI-assisted data can usually be compressed in a quantized manner.
- the AI-assisted data is represented by multiple bits, and the number of these bits can determine the quantization level (or quantization resolution), thereby affecting the performance of the gradient quantization algorithm.
- the fewer bits used by the quantization algorithm, the lower the quantization resolution, the greater the quantization error introduced during the AI-assisted data upload process, and the longer the model convergence time; the more bits used by the quantization algorithm, the higher the quantization resolution, and the longer the communication time for transmitting AI-assisted data.
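The trade-off described above can be illustrated with a minimal sketch. The application does not specify a quantization algorithm, so the uniform min-max quantizer below, the function names, and the synthetic Gaussian "gradient" are all assumptions chosen only for illustration.

```python
import random


def quantize(values, bits):
    """Uniformly quantize floats onto 2**bits levels spanning [min, max].

    Returns the de-quantized values so the error can be measured directly.
    Hypothetical helper: the source does not define the quantizer.
    """
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    if hi == lo or levels == 0:
        return [lo] * len(values)
    step = (hi - lo) / levels
    return [lo + round((v - lo) / step) * step for v in values]


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
gradient = [random.gauss(0.0, 1.0) for _ in range(1000)]  # synthetic data

for bits in (2, 4, 8, 16):
    error = mean_squared_error(gradient, quantize(gradient, bits))
    payload_bits = bits * len(gradient)
    print(f"{bits:2d} bits per value: payload={payload_bits} bits, mse={error:.3e}")
```

In this sketch the payload grows linearly with the number of bits per value while the quantization error shrinks, which is the tension a dynamically chosen quantization resolution is meant to balance.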
- the embodiments of the present application provide a data processing method and a communication device, and the second device can quantize the AI collaborative data through dynamically changing quantization resolution, which is conducive to improving the training efficiency of collaborative learning.
- the present application provides a data processing method.
- the method includes: the second device sends a first message to the first device, and the first message includes one or more of communication condition information of the second device, computing resource information of the second device, or model training information of the second device; the second device receives a second message from the first device, and the second message includes indication information, and the indication information indicates a quantization resolution, and the quantization resolution is related to one or more of the communication condition information, computing resource information, or model training information; further, the second device quantizes the first artificial intelligence AI collaboration data based on the quantization resolution to obtain second AI collaboration data; and sends the second AI collaboration data to the first device.
- the quantization resolution of the second device is dynamically adjusted in combination with the communication conditions, computing resources or model training information of the current second device, which is beneficial to improving the compatibility of the quantization resolution with the current second device.
- the training efficiency of collaborative learning is improved.
- the communication condition information includes one or more of communication bandwidth information, transmission delay, reference signal received power RSRP, reference signal received quality RSRQ, signal to interference plus noise ratio SINR, bit error rate or throughput;
- the computing resource information includes one or more of the size of computing resources, the size of computing power or computing delay;
- the model training information includes one or more of the test set loss value, the training set loss value, the gradient norm size, the gradient size, and the model size.
- the second device receives a third message from the first device, the third message being used to request assistance in updating the quantization resolution; the aforementioned first message is a response message corresponding to the third message.
- the third message includes one or more of the following information: a measurement identifier, an identifier of the second device, an identifier of the first device, an identifier of the model, indication information indicating the reason for updating the quantization resolution, an identifier of a training round corresponding to the model, an identifier of information requested for feedback, or configuration information for transmitting the first message; wherein the information requested for feedback includes one or more of communication condition information, computing resource information, or model training information; and the first message is generated based on the identifier of the information requested for feedback.
- the indication information indicating the reason for updating the quantization resolution includes the time difference between the first delay and the average delay, or the ratio between the first delay and the average delay; wherein the first delay is the delay for which the first device waits for the second device to feed back the AI collaboration data, and the average delay is the average of the delays for which the first device waits for multiple distributed nodes to feed back the AI collaboration data, the multiple distributed nodes including the second device.
- the configuration information for transmitting the first message includes one or more of a transmission period for transmitting the first message, a trigger condition for transmitting the first message, or a number of times of transmitting the first message.
- based on the configuration information, the second device can automatically send the first message to the first device to assist in updating the quantization resolution, which helps save communication transmission resources.
- the first message further includes one or more of the following information: a measurement identifier, a second device identifier, a first device identifier, an identifier of a model, and an identifier of a training round corresponding to the model.
- the present application provides a data processing method.
- the method includes: the first device receives a first message from a second device, the first message including one or more of communication condition information of the second device, computing resource information of the second device, or model training information of the second device; further, based on one or more of the communication condition information, the computing resource information, or the model training information, a quantization resolution is determined; and a second message is sent to the second device, the second message including indication information, the indication information indicating the quantization resolution, and the quantization resolution is related to one or more of the communication condition information, the computing resource information, or the model training information; the first device receives second artificial intelligence AI collaboration data from the second device, and the second AI collaboration data is obtained by quantizing the first AI collaboration data based on the quantization resolution.
- the communication condition information includes one or more of communication bandwidth information, transmission delay, reference signal received power RSRP, reference signal received quality RSRQ, signal to interference plus noise ratio SINR, bit error rate or throughput;
- the computing resource information includes one or more of the size of computing resources, the size of computing power or computing delay;
- the model training information includes one or more of the test set loss value, the training set loss value, the gradient norm size, the gradient size, and the model size.
- the first device sends a third message to the second device, where the third message is used to request assistance in updating the quantization resolution; and the aforementioned first message is a response message corresponding to the third message.
- the third message includes one or more of the following information: a measurement identifier, an identifier of the second device, an identifier of the first device, a model identifier, indication information indicating the reason for updating the quantization resolution, an identifier of a training round corresponding to the model, an identifier of information requested for feedback, and/or configuration information for transmitting the first message; wherein the information requested for feedback includes one or more of communication condition information, computing resource information, or model training information; and the first message is generated based on the identifier of the information requested for feedback.
- the indication information indicating the reason for updating the quantization resolution includes the time difference between the first delay and the average delay, or the ratio between the first delay and the average delay; wherein the first delay is the delay for which the first device waits for the second device to feed back the AI collaboration data, and the average delay is the average of the delays for which the first device waits for multiple distributed nodes to feed back the AI collaboration data, the multiple distributed nodes including the second device.
- the configuration information for transmitting the first message includes one or more of a transmission period for transmitting the first message, a triggering condition for transmitting the first message, or a number of times of transmitting the first message.
- the first message further includes one or more of the following information: a measurement identifier, a second device identifier, a first device identifier, an identifier of a model, and an identifier of a training round corresponding to the model.
- the present application provides a communication device, which may be a second device, or a device in the second device, or a device that can be used in combination with the second device.
- the communication device may also be a chip system.
- the communication device may execute the method described in the first aspect.
- the functions of the communication device may be implemented by hardware, or by hardware executing corresponding software.
- the hardware or software includes one or more units or modules corresponding to the above functions.
- the unit or module may be software and/or hardware.
- the operations and beneficial effects performed by the communication device may refer to the method and beneficial effects described in the first aspect above.
- the present application provides a communication device, which may be a first device, or a device in the first device, or a device that can be used in combination with the first device.
- the communication device may also be a chip system.
- the communication device may execute the method described in the second aspect.
- the functions of the communication device may be implemented by hardware, or by hardware executing corresponding software.
- the hardware or software includes one or more units or modules corresponding to the above functions.
- the unit or module may be software and/or hardware.
- the operations and beneficial effects performed by the communication device may refer to the method and beneficial effects described in the second aspect above.
- the present application provides a communication device, comprising a processor and an interface circuit, wherein the interface circuit is used to receive signals from other communication devices outside the communication device and transmit them to the processor or send signals from the processor to other communication devices outside the communication device, and the processor is used to implement the method described in the first aspect through a logic circuit or by executing code instructions, or the processor is used to implement the method described in the second aspect through a logic circuit or by executing code instructions.
- the present application provides a computer-readable storage medium in which a computer program or instructions are stored; when the computer program or instructions are executed by a communication device, the method described in the first aspect is implemented, or the method described in the second aspect is implemented.
- the present application provides a computer program product comprising instructions, which, when a communication device reads and executes the instructions, causes the communication device to execute the method as described in the first aspect, or causes the communication device to execute the method as described in the second aspect.
- the present application provides a communication system, comprising a communication device for executing the method described in the first aspect above, and a communication device for executing the method executed by the network device described in the second aspect above.
- FIG1 is a schematic diagram of a communication system provided in an embodiment of the present application.
- FIG2 is a schematic diagram of an application framework of AI in an NR system provided in an embodiment of the present application.
- FIG3 is a schematic diagram of a process flow of federated learning provided in an embodiment of the present application.
- FIG4a is a schematic diagram of a flow chart of a data processing method provided in an embodiment of the present application.
- FIG4b is a schematic diagram of a collaborative learning process provided by an embodiment of the present application.
- FIG4c is a schematic diagram of another collaborative learning process provided by an embodiment of the present application.
- FIG5 is a schematic diagram of the structure of a communication device provided in an embodiment of the present application.
- FIG6 is a schematic diagram of the structure of another communication device provided in an embodiment of the present application.
- FIG. 1 is a schematic diagram of the architecture of a communication system 1000 used in an embodiment of the present application.
- the communication system includes a radio access network (RAN) 100 and a core network 200.
- the communication system 1000 may also include the Internet 300.
- the RAN 100 includes at least one RAN node (such as 110a and 110b in FIG. 1 , collectively referred to as 110), and may also include at least one terminal (such as 120a-120j in FIG. 1 , collectively referred to as 120).
- the RAN 100 may also include other RAN nodes, for example, a wireless relay device and/or a wireless backhaul device (not shown in FIG. 1 ).
- the terminal 120 is connected to the RAN node 110 wirelessly, and the RAN node 110 is connected to the core network 200 wirelessly or by wire.
- the core network device in the core network 200 and the RAN node 110 in the RAN 100 may be independent and different physical devices, or may be the same physical device that integrates the logical functions of the core network device and the logical functions of the RAN node. Terminals and RAN nodes may be connected to each other via wired or wireless means. It should be noted that the RAN node 110 may also be referred to as a network device 110 in the following text.
- RAN100 may be an evolved universal terrestrial radio access (E-UTRA) system, a new radio (NR) system, and a future radio access system defined in the 3rd generation partnership project (3GPP). RAN100 may also include two or more of the above different radio access systems. RAN100 may also be an open RAN (O-RAN).
- RAN nodes, also known as radio access network equipment, RAN entities or access nodes, are used to help terminals access the communication system wirelessly.
- RAN nodes can be base stations, evolved NodeBs (eNodeBs), transmission reception points (TRPs), next generation NodeBs (gNBs) in the fifth generation (5G) mobile communication system, next generation NodeBs in the sixth generation (6G) mobile communication system, or base stations in future mobile communication systems.
- RAN nodes can be macro base stations (such as 110a in FIG. 1 ), micro base stations or indoor stations (such as 110b in FIG. 1 ), or relay nodes or donor nodes.
- the cooperation of multiple RAN nodes can help the terminal achieve wireless access, and different RAN nodes respectively implement part of the functions of the base station.
- the RAN node can be a centralized unit (CU), a distributed unit (DU) or a radio unit (RU).
- the CU here completes the functions of the radio resource control protocol and the packet data convergence protocol (PDCP) of the base station, and can also complete the function of the service data adaptation protocol (SDAP);
- the DU completes the functions of the radio link control layer and the medium access control (MAC) layer of the base station, and can also complete the functions of part or all of the physical layer.
- RU can be used to implement the transceiver function of the radio frequency signal.
- CU and DU can be two independent RAN nodes, or they can be integrated in the same RAN node, such as integrated in the baseband unit (BBU).
- the RU may be included in a radio frequency device, such as a remote radio unit (RRU) or an active antenna unit (AAU).
- the CU may be further divided into two types of RAN nodes: CU-control plane and CU-user plane.
- RAN nodes may have different names.
- CU may be called an open CU (O-CU);
- DU may be called an open DU (O-DU);
- RU may be called an open RU (O-RU).
- the RAN node in the embodiments of the present application may be implemented by a software module, a hardware module, or a combination of a software module and a hardware module.
- the RAN node may be a server loaded with a corresponding software module.
- the embodiments of the present application do not limit the specific technology and specific device form adopted by the RAN node. For ease of description, the following description takes a base station as an example of a RAN node.
- the quantization resolution will affect the training time of the federated learning model to a certain extent.
- the distributed nodes of federated learning will quantize AI collaborative data based on a fixed (or understood as predetermined) quantization resolution.
- the resources that the distributed nodes can use for federated learning may change.
- the quantization resolution will not match the current distributed nodes, which may increase the model training time of federated learning and reduce the training efficiency of federated learning.
- the present application provides a data processing method and a communication device.
- the data processing method and the communication device provided by the embodiment of the present application are described in detail below in conjunction with the accompanying drawings.
- the collaborative learning mentioned in the present application can be understood as an AI model training method in which multiple devices collaborate (or are understood as collaborative or joint) to perform training.
- the collaborative learning can be distributed learning, federated learning, edge learning, or segmented learning.
- the first device and the second device are devices participating in collaborative learning.
- the first device can be a centralized node corresponding to collaborative learning
- the second device can be a distributed node in collaborative learning (or understood as a device that performs model training).
- the first device can be any one of the devices in RAN 100 (such as a RAN node or terminal, etc.), the functional network element in CN 200, or the server corresponding to the Internet 300 in the communication system of Figure 1
- the second device can also be any one of the devices in RAN 100, the functional network element in CN 200, or the server corresponding to the Internet 300 in the communication system of Figure 1; this application does not make specific limitations on this.
- the second device sends a first message to the first device, wherein the first message includes one or more of communication condition information of the second device, computing resource information of the second device, or model training information of the second device.
- the second device is a device that uploads AI collaborative data during the collaborative learning process
- the first device is a device that determines the quantization resolution of the second device during the collaborative learning process.
- the first device can be a service node (server) of the federated learning
- the second device can be one of the multiple client nodes (client) of the federated learning.
- the first device receives a first message from the second device, and the first message is used to assist the first device in updating the quantization resolution of the second device.
- the second device can indicate the communication environment in which the second device is located through the communication condition information in the first message, indicate the computing resources that the second device can use for collaborative learning through the computing resource information in the first message, and indicate the relevant information of the second device during the model training process (such as the size of the AI collaborative data or data related to the quantization resolution) through the model training information in the first message.
- this application does not specifically limit the way in which the second device obtains communication condition information, computing resource information and model training information.
- the communication condition information includes one or more of communication bandwidth information, transmission delay, reference signal received power (RSRP), reference signal received quality (RSRQ), signal to interference plus noise ratio (SINR), bit error rate or throughput.
- the computing resource information includes one or more of the size of computing resources, the size of computing power or computing delay.
- the model training information includes one or more of the test set loss value, the training set loss value, the gradient norm size, the gradient size, and the model size.
- in addition to one or more of the communication condition information of the second device, the computing resource information of the second device, or the model training information of the second device, the first message sent by the second device may also include one or more of the following information: a measurement identifier, used to identify the reporting of the first message and to associate it with the subsequent second message; a second device identifier, used to identify the device that sends the first message, that is, the device whose quantization resolution is to be updated; a first device identifier, used to identify the device that receives the first message, that is, the device that determines the quantization resolution; a model identifier, used to identify the AI model corresponding to the first message; and an identifier of the training round corresponding to the model, used to indicate the training round for which the updated quantization resolution is used when the second device feeds back AI collaboration data.
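As a rough illustration of the first message, the sketch below models its optional fields as a Python dataclass and fills in only the categories of information that were requested for feedback. The application does not define an encoding, so the class name, field names, metric keys, and the `build_first_message` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Categories of information that may be requested for feedback (from the text).
FEEDBACK_FIELDS = ("communication_condition", "computing_resource", "model_training")


@dataclass
class FirstMessage:
    """Hypothetical container; every field is optional ("one or more of")."""
    measurement_id: Optional[int] = None
    second_device_id: Optional[str] = None
    first_device_id: Optional[str] = None
    model_id: Optional[str] = None
    training_round: Optional[int] = None
    communication_condition: Optional[dict] = None
    computing_resource: Optional[dict] = None
    model_training: Optional[dict] = None


def build_first_message(requested, local_info, **identifiers):
    """Include only the requested categories that the device can report."""
    payload = {
        name: local_info[name]
        for name in requested
        if name in FEEDBACK_FIELDS and name in local_info
    }
    return FirstMessage(**identifiers, **payload)


# Illustrative values only; metric names and units are assumptions.
local_info = {
    "communication_condition": {"sinr_db": 12.5, "throughput_bps": 5e6},
    "computing_resource": {"computing_delay_s": 0.8},
    "model_training": {"gradient_norm": 0.03},
}
msg = build_first_message(
    ["communication_condition", "model_training"],
    local_info,
    measurement_id=1,
    second_device_id="node-2",
    first_device_id="server-1",
    model_id="model-A",
    training_round=3,
)
print(msg)
```

Keeping the measurement identifier in the message is what lets the receiver associate the later second message with this report.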
- the first device determines a quantization resolution according to one or more of the communication condition information, the computing resource information, or the model training information.
- the first device determines the quantization resolution according to the information included in the first message.
- the first device determines the quantization resolution according to one or more of the communication condition information, the computing resource information, or the model training information with the goal of minimizing the training time of the model.
- the first device waits for multiple distributed nodes of collaborative learning (including the second device) to feed back AI collaborative data before executing subsequent steps (such as aggregating the AI collaborative data). It can be seen that the training duration of the model is affected by the delay with which the distributed nodes of collaborative learning feed back AI collaborative data.
- the first device can determine a larger quantization resolution (i.e., a larger number of bits corresponding to the quantization resolution) for distributed nodes with large computing power, small delay, or large gradient norm to improve the accuracy of the AI collaborative data fed back by the distributed nodes, and determine a smaller quantization resolution (i.e., a smaller number of bits corresponding to the quantization resolution) for distributed nodes with small computing power, large delay, or small gradient norm to reduce the communication delay between the first device and the distributed node, thereby reducing the training delay.
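One way to read this rule is as a per-node budget: choose the largest bit width whose estimated computing delay plus upload time still fits within a round deadline. The sketch below follows that reading; the cost model, the candidate bit widths, the deadline, and the function name are assumptions and are not taken from the application.

```python
def choose_quantization_bits(num_params, throughput_bps, compute_delay_s,
                             deadline_s, candidates=(4, 8, 16)):
    """Pick the largest candidate bit width that meets the round deadline.

    Estimated round time = computing delay + (bits * num_params) / throughput.
    Falls back to the smallest candidate when no bit width fits.
    """
    feasible = [
        bits for bits in sorted(candidates)
        if compute_delay_s + bits * num_params / throughput_bps <= deadline_s
    ]
    return feasible[-1] if feasible else min(candidates)


# Illustrative nodes: same model size and deadline, different link throughput.
fast_node = choose_quantization_bits(1_000_000, 100e6, 0.5, 2.0)
medium_node = choose_quantization_bits(1_000_000, 10e6, 0.5, 2.0)
slow_node = choose_quantization_bits(1_000_000, 2e6, 0.5, 2.0)
print(fast_node, medium_node, slow_node)
```

Under this model a node with a fast link receives a larger quantization resolution and a node with a slow link a smaller one, so that no single node dominates the time the first device spends waiting. Other inputs named in the text, such as the gradient norm, could be folded into the same decision, but how to weight them is left open here.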
- the first device sends a second message to the second device, where the second message includes indication information indicating a quantization resolution.
- the indication information may indicate the quantization resolution by directly indicating the quantization resolution (i.e., the number of bits used for quantization), for example, the second message includes 4 bits or 16 bits, etc.; or it may indicate the quantization index corresponding to the quantization resolution, for example, if the quantization index included in the second message is 0, it means that the quantization is based on 4 bits, and if the quantization index included in the second message is 1, it means that the quantization is based on 16 bits.
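The two forms of indication can be sketched as follows. The index table reuses the example mapping from the text (index 0 for 4 bits, index 1 for 16 bits); the dictionary layout and the key names `bits` and `quant_index` are hypothetical.

```python
# Example mapping from the text: quantization index -> number of bits.
QUANT_INDEX_TABLE = {0: 4, 1: 16}


def resolve_quantization_bits(indication):
    """Return the bit width carried directly or referenced by an index."""
    if "bits" in indication:
        return indication["bits"]  # direct indication
    return QUANT_INDEX_TABLE[indication["quant_index"]]  # indexed indication


print(resolve_quantization_bits({"bits": 16}))
print(resolve_quantization_bits({"quant_index": 0}))
```

An index is more compact than an explicit bit count, but it only works if both devices hold the same table.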
- the second message may also include one or more of the following information: a measurement identifier, corresponding to the measurement identifier contained in the first message, used to identify the issuance of the second message; a second device identifier, used to identify the device receiving the second message, that is, the device applying the quantization resolution; a first device identifier, used to identify the device sending the second message, that is, the device determining the quantization resolution; a model identifier, used to identify the AI model corresponding to the quantization resolution; and an identifier of the training round corresponding to the model, used to indicate the training round for which the updated quantization resolution is used when the second device feeds back AI collaboration data.
- the second device quantizes the first AI collaboration data based on the quantization resolution to obtain second AI collaboration data.
- the second device trains the model to obtain the first AI collaboration data (i.e., the AI collaboration data before quantization). Further, the second device quantizes the first AI collaboration data based on the quantization resolution to obtain the second AI collaboration data (i.e., the AI collaboration data after quantization).
- the second device sends second AI collaboration data to the first device.
- the second device sends a fourth message to the first device, and the fourth message includes the second AI collaboration data.
- the fourth message may also include one or more of the following information: a second device identifier, used to identify the device that sends the fourth message; a first device identifier, used to identify the device that receives the fourth message; a model identifier, used to identify the AI model corresponding to the second AI collaboration data; and an identifier of the training round corresponding to the model, used to indicate which training round the second AI collaboration data belongs to.
- the quantization resolution of the second device can be dynamically adjusted in combination with the communication conditions, computing resources or model training information of the current second device, which is beneficial to improving the compatibility of the quantization resolution with the current second device.
- By quantizing and transmitting AI collaborative data at a quantization resolution adapted to the current second device, the training efficiency of collaborative learning can be improved.
- the collaborative learning process is described in detail in the following two cases according to the triggering method of triggering the second device to send the first message to the first device.
- the following description takes as an example the case in which the devices participating in the collaborative learning include the first device, the second device and the third device.
- the devices participating in the collaborative learning may also include other devices (not shown in the figure), and this application does not make specific limitations on this.
- Case 1: the second device is passively triggered to send the first message to the first device.
- the second device receives a third message from another device (any device other than the second device, including the first device), and the third message is used to trigger the second device to send the aforementioned first message to the first device.
- the first message may be
- the following text takes the case where the third message is sent by the first device to the second device as an illustrative example, which should not be regarded as a specific limitation on the device that sends the third message.
- FIG. 4b is a schematic diagram of a collaborative learning process provided by the present application.
- the collaborative learning process includes the following steps S411 to S419:
- the first device sends a third message to the second device, where the third message is used to request the second device to assist in updating a quantization resolution.
- when the first device determines that the quantization resolution of the second device needs to be updated, the first device sends a third message to the second device, and the third message is used to request the second device to assist in updating the quantization resolution.
- the specific name of the third message is not specifically limited in this application.
- the third message can be a request message for assisting in updating the quantization resolution.
- delay 1 is the maximum value of delay 1 to delay 4, and the absolute value of the difference between the average delay of delay 1 to delay 4 and delay 1 is greater than the first threshold; the first threshold is a preset value greater than 0, and its specific value can be adjusted according to the specific application scenario.
- in order to reduce the time that device 1 spends waiting for all clients to feed back AI collaboration data, device 1 can shorten the delay of waiting for device 2 to feed back AI collaboration data by adjusting the quantization resolution of device 2. This can be regarded as device 1 determining that the quantization resolution of device 2 needs to be updated (that is, device 2 is determined to be the second device). Further, device 1 sends a third message to device 2, and the third message is used to request device 2 to assist device 1 in updating the quantization resolution, that is, to request device 2 to send the first message to device 1.
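The selection rule in this example (the slowest device is chosen when its delay deviates from the average by more than the first threshold) can be sketched as follows. The function name, the dictionary layout and the device identifiers are hypothetical and are used only for illustration.

```python
def select_device_to_update(delays, first_threshold):
    """Return the id of the device whose quantization resolution should be
    updated, or None if no device qualifies.

    delays: dict mapping a device id to the delay (in seconds) that the
    first device waited for that device's AI collaboration data.
    """
    if not delays:
        return None
    slowest = max(delays, key=delays.get)            # device with the maximum delay
    average = sum(delays.values()) / len(delays)     # average delay over all devices
    if abs(average - delays[slowest]) > first_threshold:
        return slowest
    return None
```

With delays of 9 s, 3 s, 3 s and 3 s and a first threshold of 2 s, the sketch selects the device with the 9 s delay, because the average is 4.5 s and the deviation of 4.5 s exceeds the threshold.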
- Measurement identifier (also referred to as measurement ID): used to indicate the request for assistance in updating the quantization resolution.
- the identifier of the second device: used to indicate the device whose quantization resolution is to be updated, or the device that receives the third message, or the device that sends the first message.
- the identifier of the first device: used to indicate the device that determines the quantization resolution, or the device that sends the third message, or the device that receives the first message.
- Model identifier: used to indicate the AI model (which can also be understood as an AI collaboration model) whose quantization resolution the second device needs to update.
- the model identifier can be the identifier of the AI model, or information associated with the AI model (such as the business identifier corresponding to the AI model), etc.
- the third message includes field A, and the value of field A is used to indicate the reason for this update of the quantization resolution.
- the third message indicates that the reason for updating the quantization resolution is that the latency of the second device (i.e., the latency of the first device waiting for the second device to feed back the AI collaboration data) is large; in other words, the quantization resolution needs to be updated to improve the efficiency with which the second device feeds back the AI collaboration data.
- the third message indicates that the reason for updating the quantization resolution is that the error of the AI collaboration data fed back by the second device is large; in other words, the quantization resolution needs to be updated to improve the accuracy of the AI collaboration data fed back by the second device.
- the indication information indicating the reason for updating the quantization resolution includes the time difference between the first delay and the average delay, or the ratio between the first delay and the average delay.
- the first delay is the delay of the first device waiting for the second device to feedback the AI collaboration data
- the average delay is the average value of the delay of the first device waiting for multiple distributed nodes to feedback the AI collaboration data
- the multiple distributed nodes include the second device. It can be understood that when the reason for updating the quantization resolution is that the delay of the second device is large (or it is understood that the quantization resolution needs to be updated to improve the efficiency of the second device to feedback the AI collaboration data), the indication information included in the third message can be the time difference between the first delay and the average delay, or the ratio between the first delay and the average delay.
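The two forms of indication information described above (time difference and ratio) can be computed as in the following sketch. The function name and the returned dictionary keys are assumptions made for illustration; the application does not define an encoding.

```python
def delay_indication(first_delay, node_delays):
    """Compute the time difference and the ratio between the first delay
    and the average delay.

    first_delay: delay of waiting for the second device (seconds).
    node_delays: delays of all distributed nodes, including the second device.
    """
    if not node_delays:
        raise ValueError("node_delays must not be empty")
    average = sum(node_delays) / len(node_delays)
    if average <= 0:
        raise ValueError("average delay must be positive")
    return {
        "time_difference": first_delay - average,
        "ratio": first_delay / average,
    }
```

Either value alone tells the second device how far it lags behind the other distributed nodes; the ratio is independent of the time unit, while the difference keeps the absolute scale.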
- the identifier of the training round corresponding to the model: used to indicate the training round for which the first device, based on the first message, updates the quantization resolution that the second device uses when feeding back AI collaboration data.
- for example, if the third message includes the identifier of the Nth training round of the model, the first device updates, based on the first message, the quantization resolution used when the second device feeds back the AI collaboration data of the Nth training round; that is, when the second device subsequently feeds back the AI collaboration data of the Nth training round, it can quantize using the quantization resolution updated based on the first message.
- the training round may also be referred to as a round; the same applies throughout this text.
- the identifier of the information requested for feedback: used to indicate the information that the first device expects (or requests) the second device to feed back.
- the information requested for feedback includes one or more of communication condition information, computing resource information, or model training information.
- the second device may subsequently generate a first message based on the identifier of the information requested for feedback, that is, the first message is generated based on the identifier of the information requested for feedback.
- the information included in the first message may be part or all of the information requested for feedback in the third message.
- the third message sent by the first device may include the information requested for feedback.
- the identifier of the information fed back includes: the identifier of the communication condition information (such as the identifier of the transmission delay or the identifier of the RSRP, etc.), the identifier of the computing resource information and the identifier of the model training information (such as the identifier of the test set loss value or the identifier of the gradient norm size).
- according to the third message, the second device may feed back only part of the requested information (that is, one or two of the communication condition information, computing resource information or model training information), or it may feed back all of it (that is, the communication condition information, computing resource information and model training information).
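The relationship between the identifiers of the information requested for feedback in the third message and the content of the first message can be sketched as follows. The class, field and key names are hypothetical; the sketch only shows that the second device may report the subset of requested items it is able to provide and echo the measurement identifier.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ThirdMessage:
    """Hypothetical container for a few of the fields of the third message."""
    measurement_id: int
    requested_info_ids: List[str] = field(default_factory=list)
    model_id: Optional[str] = None
    training_round_id: Optional[int] = None


def build_first_message(request: ThirdMessage,
                        local_measurements: Dict[str, float]) -> dict:
    """Build the first message from the requested identifiers.

    Only requested items that are locally available are reported, so the
    report may cover part or all of the requested information.
    """
    report = {
        info_id: local_measurements[info_id]
        for info_id in request.requested_info_ids
        if info_id in local_measurements
    }
    return {
        "measurement_id": request.measurement_id,      # echoes the request
        "model_id": request.model_id,
        "training_round_id": request.training_round_id,
        "report": report,
    }
```

For instance, if the request asks for a transmission delay, an RSRP value and a test set loss value but the second device has no RSRP measurement, the resulting first message carries only the other two items.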
- the configuration information for transmitting the first message includes one or more of a transmission period for transmitting the first message, a trigger condition for transmitting the first message, or a number of times of transmitting the first message.
- the transmission period of transmitting the first message may be determined based on a time interval value/index or based on a training round. Taking the transmission period determined based on a training round as an example, when the configuration information indicates that the transmission period of the first message is 3 training rounds, the configuration information indicates that the second device transmits the first message once every 3 training rounds.
- the triggering condition for transmitting the first message may include one or more of the following conditions:
- 1. a triggering condition for triggering instant transmission of the first message (i.e., the second device transmits the first message once if the triggering condition is met); for example, the triggering condition is that the gradient norm of the second device is greater than or equal to the gradient norm threshold, the communication bandwidth is less than or equal to the bandwidth threshold, the channel quality (e.g., RSRP, RSRQ, SINR, bit error rate, throughput, etc.) is less than or equal to the channel quality threshold, the computing power is less than or equal to the computing power threshold, etc.
- 2. a triggering condition for triggering periodic transmission of the first message (i.e., if the triggering condition is met, the second device transmits the first message periodically according to the transmission period of the first message); for example, the triggering condition is that the gradient norm of the second device is greater than or equal to the gradient norm threshold, the communication bandwidth is less than or equal to the bandwidth threshold, the channel quality is less than or equal to the channel quality threshold, the computing power is less than or equal to the computing power threshold, etc.
- 3. a triggering condition for ending the periodic transmission of the first message (that is, while the second device is periodically transmitting the first message, if the second device meets the triggering condition, it stops periodically transmitting the first message); for example, the triggering condition is that the gradient norm of the second device is less than the gradient norm threshold, the communication bandwidth is greater than the bandwidth threshold, the channel quality is greater than the channel quality threshold, the computing power is greater than the computing power threshold, etc.
- the number of times the first message is transmitted is used to indicate the total number of times the second device needs to transmit the first message after receiving the configuration information for transmitting the first message.
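How the transmission period, the end condition and the number of transmissions might interact can be sketched as follows. The sketch assumes that the period is counted in training rounds and, for brevity, that the only end condition is the gradient norm falling below a threshold; `ReportConfig` and `rounds_to_report` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReportConfig:
    """Hypothetical configuration information for transmitting the first message."""
    period_rounds: int         # transmit once every `period_rounds` training rounds
    max_reports: int           # total number of transmissions allowed
    stop_gradient_norm: float  # end periodic transmission below this gradient norm


def rounds_to_report(config: ReportConfig, gradient_norms: List[float]) -> List[int]:
    """Return the training rounds (1-based) in which the first message is sent.

    gradient_norms[i] is the gradient norm observed in training round i + 1.
    """
    reported: List[int] = []
    for round_index, norm in enumerate(gradient_norms, start=1):
        if len(reported) >= config.max_reports:
            break                      # configured number of transmissions reached
        if norm < config.stop_gradient_norm:
            break                      # end condition for periodic transmission met
        if round_index % config.period_rounds == 0:
            reported.append(round_index)
    return reported
```

With a period of 3 training rounds and at most 2 transmissions, the first message is sent in rounds 3 and 6 as long as the gradient norm stays at or above the stop threshold.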
- S412 The second device sends the first message to the first device according to the third message.
- the first message is understood as a response message of the third message.
- the measurement identifier in the first message corresponds to (or is the same as) the measurement identifier included in the third message, or the identifier of the model in the first message is the same as the identifier of the model included in the third message, or the identifier of the training round corresponding to the model in the first message is the same as the identifier of the training round corresponding to the model included in the third message.
- the present application does not specifically limit the specific name of the first message.
- the first message may be a response message for assisting in updating the quantization resolution.
- for details of the first message, please refer to the specific description of the first message in S401 above, which will not be repeated here.
- the first device determines a quantization resolution according to one or more of the communication condition information, the computing resource information, or the model training information.
- S414 The first device sends a second message to the second device, where the second message includes indication information indicating a quantization resolution.
- the second device quantizes the first AI collaboration data based on the quantization resolution to obtain second AI collaboration data.
- the second device sends second AI collaboration data to the first device.
- the specific implementation process of S413 to S416 can refer to the description of the specific implementation process of S402 to S405 above, which will not be repeated here.
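As an illustration of how the first device might derive a quantization resolution from the reported information, the following sketch picks the largest bit width whose estimated upload time fits a delay budget. This heuristic is an assumption made for illustration only and is not the determination rule of the application; the function name, parameters and candidate bit widths are hypothetical.

```python
def choose_bits(num_values, bandwidth_bps, delay_budget_s,
                candidates=(2, 4, 8, 16, 32)):
    """Pick a bit width per value from the reported communication bandwidth.

    Assumption: upload time is approximated as num_values * bits / bandwidth.
    Falls back to the smallest candidate if none fits the delay budget.
    """
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth_bps must be positive")
    feasible = [
        bits for bits in candidates
        if num_values * bits / bandwidth_bps <= delay_budget_s
    ]
    return max(feasible) if feasible else min(candidates)
```

A rule of this kind trades accuracy for latency: a device with low bandwidth is assigned a coarser resolution so that its feedback arrives within the budget, while a well-connected device keeps a finer resolution.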
- the third device sends third AI collaboration data to the first device.
- the third device trains the model to obtain AI collaboration data of the third device; further, the third device quantizes the AI collaboration data of the third device based on its corresponding quantization resolution to obtain third AI collaboration data, and sends the third AI collaboration data to the first device.
- the specific content of the AI collaboration data of the third device can refer to the specific description of the aforementioned AI collaboration data.
- the AI collaboration data of the third device includes one or more of the following data: gradient data, AI model, AI sub-model, AI model output, AI model intermediate value, etc.
- the quantization resolution of the third device can also refer to the data processing method provided in the aforementioned FIG. 4a for updating the quantization resolution, and this application does not specifically limit this.
- the first device aggregates the second AI collaboration data and the third AI collaboration data to obtain aggregated AI collaboration data.
- after the first device receives the AI collaboration data fed back from all (or part) of the distributed nodes (such as the second device and the third device) participating in the collaborative learning, it aggregates the AI collaboration data to obtain aggregated AI collaboration data.
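The aggregation step can be sketched as an element-wise average of the (dequantized) AI collaboration data from each distributed node. Unweighted averaging is an assumption here, chosen because it is the simplest federated-averaging style rule; the application does not fix the aggregation function, and `aggregate` is a hypothetical name.

```python
def aggregate(updates):
    """Element-wise unweighted mean of equally sized lists of floats.

    updates: one list per distributed node, e.g. the data of the second
    device and of the third device after dequantization.
    """
    if not updates:
        raise ValueError("at least one update is required")
    length = len(updates[0])
    if any(len(update) != length for update in updates):
        raise ValueError("all updates must have the same length")
    return [sum(values) / len(updates) for values in zip(*updates)]
```

For two nodes reporting `[1.0, 2.0]` and `[3.0, 6.0]`, the aggregated data in this sketch is `[2.0, 4.0]`.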
- the first device sends the aggregated AI collaboration data to the second device and the third device respectively.
- after the first device obtains the aggregated AI collaboration data, it sends the aggregated AI collaboration data to the distributed nodes participating in the collaborative learning, so that these nodes (including the second device and the third device) can each use the aggregated AI collaboration data for subsequent model training.
- when the first device sends the aggregated AI collaboration data to the distributed nodes participating in collaborative learning, it can also indicate one or more of the following: the identifier of the distributed node (for example, the identifier of the second device or the identifier of the third device), used to indicate the device receiving the aggregated AI collaboration data; the first device identifier, used to identify the device sending the aggregated AI collaboration data; the identifier of the model, used to indicate the AI model corresponding to the aggregated AI collaboration data; the identifier of the training round corresponding to the model, used to indicate which training round of the model the aggregated AI collaboration data corresponds to.
- the aggregated AI collaboration data sent by the first device to the distributed nodes participating in collaborative learning can be unquantized data (that is, the aggregated data obtained by aggregating the second AI collaboration data and the third AI collaboration data), or quantized data (that is, the aggregated data obtained by aggregating the second AI collaboration data and the third AI collaboration data and then quantizing the result at an aggregated quantization resolution).
- the aggregated quantization resolution can be understood as the quantization resolution of the quantized aggregated AI collaborative data, and the aggregated quantization resolution can also be determined based on one or more of the communication condition information of the first device, the communication condition information of the second device, or the computing resource information of the second device, and this application does not make specific restrictions.
- Case 2: the second device actively triggers the sending of the first message to the first device.
- FIG. 4c is a schematic diagram of another collaborative learning process provided by the present application.
- the collaborative learning process includes the following steps S421 to S428:
- the second device sends a first message to the first device, where the first message is used to request the first device to update a quantization resolution of the second device.
- when the second device determines that its quantization resolution needs to be updated, the second device actively sends a first message to the first device.
- the first message can be a request message for updating the quantization resolution.
- for details of the first message, please refer to the specific description of the first message in S401 above, which will not be repeated here.
- device 1 and device 2 are devices that perform federated learning tasks.
- device 1 is the server that performs the federated learning task
- device 2 is the client that performs the federated learning task.
- device 2 feeds back AI collaboration data to device 1.
- the gradient norm of the AI model is greater than or equal to the gradient norm threshold
- the communication bandwidth of the second device is less than or equal to the bandwidth threshold
- the channel quality of the second device is less than or equal to the channel quality threshold
- the computing power of the second device is less than or equal to the computing power threshold
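The conditions listed above can be combined into a simple check that the second device might run before actively sending the first message. The sketch assumes that meeting any one condition is sufficient, which the listed conditions do not state explicitly; the function name and the threshold keys are hypothetical.

```python
def should_request_update(gradient_norm, bandwidth, channel_quality,
                          computing_power, thresholds):
    """Return True if the second device should actively send the first message.

    thresholds: dict with the keys "gradient_norm", "bandwidth",
    "channel_quality" and "computing_power".
    Assumption: any single condition being met triggers the request.
    """
    return (
        gradient_norm >= thresholds["gradient_norm"]
        or bandwidth <= thresholds["bandwidth"]
        or channel_quality <= thresholds["channel_quality"]
        or computing_power <= thresholds["computing_power"]
    )
```

A stricter variant could require several conditions to hold at once by replacing `or` with `and`; which combination applies would depend on the deployment.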
- the first device determines a quantization resolution according to one or more of the communication condition information, the computing resource information, or the model training information.
- the first device sends a second message to the second device, where the second message includes indication information indicating a quantization resolution.
- the second device quantizes the first AI collaboration data based on the quantization resolution to obtain second AI collaboration data.
- the second device sends second AI collaboration data to the first device.
- the third device sends third AI collaboration data to the first device.
- the first device aggregates the second AI collaboration data and the third AI collaboration data to obtain aggregated AI collaboration data.
- the first device sends the aggregated AI collaboration data to the second device and the third device respectively.
- the specific implementation process of S422 to S428 can refer to the description of the specific implementation process of S413 to S419 mentioned above, which will not be repeated here.
- the device includes a hardware structure and/or software module corresponding to the execution of each function. Those skilled in the art will readily appreciate that, in combination with the units and method steps of each example described in the embodiments disclosed in this application, the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is executed by hardware or by computer software driving the hardware depends on the specific application scenario and design constraints of the technical solution.
- Figures 5 and 6 are schematic diagrams of the structures of possible communication devices provided by embodiments of the present application. These communication devices can be used to implement the functions of the devices in the above method embodiments, and thus can also achieve the beneficial effects possessed by the above method embodiments.
- the communication device can be the first device in Figures 4a to 4c, or a module (such as a chip) applied to the first device, or the communication device can be the second device in Figures 4a to 4c, or a module (such as a chip) applied to the second device.
- the communication device 500 includes a processing unit 510 and a transceiver unit 520.
- the communication device 500 is used to implement the function of the first device or the function of the second device in the method embodiments shown in Figs. 4a to 4c above.
- the transceiver unit 520 is used to send a first message to the first device, where the first message includes one or more of communication condition information of the second device, computing resource information of the second device, or model training information of the second device; the transceiver unit 520 is also used to receive a second message from the first device, where the second message includes indication information, the indication information indicates a quantization resolution, and the quantization resolution is related to one or more of the communication condition information, computing resource information, or model training information; the processing unit 510 is used to quantize the first artificial intelligence (AI) collaboration data based on the quantization resolution to obtain second AI collaboration data; the transceiver unit 520 is also used to send the second AI collaboration data to the first device.
- the communication condition information includes one or more of communication bandwidth information, transmission delay, reference signal received power (RSRP), reference signal received quality (RSRQ), signal to interference plus noise ratio (SINR), bit error rate or throughput;
- the computing resource information includes one or more of the size of computing resources, the size of computing power or computing delay;
- the model training information includes one or more of the test set loss value, the training set loss value, the gradient norm size, the gradient size, and the model size.
- the indication information indicating the reason for updating the quantization resolution includes the time difference between the first delay and the average delay, or the ratio between the first delay and the average delay; wherein the first delay is the delay of the first device waiting for the second device to feedback the AI collaboration data, and the average delay is the average value of the delay of the first device waiting for multiple distributed nodes to feedback the AI collaboration data, and the multiple distributed nodes include the second device.
- the configuration information for transmitting the first message includes one or more of a transmission period for transmitting the first message, a triggering condition for transmitting the first message, or a number of times of transmitting the first message.
- For a more detailed description of the transceiver unit 520 and the processing unit 510, reference may be made to the related description of the second device in the method embodiment shown in FIG. 4a to FIG. 4c.
- the transceiver unit 520 is used to receive a first message from the second device, and the first message includes one or more of the communication condition information of the second device, the computing resource information of the second device, or the model training information of the second device; the processing unit 510 is used to determine the quantization resolution based on one or more of the communication condition information, the computing resource information, or the model training information; the transceiver unit 520 is also used to send a second message to the second device, and the second message includes indication information, and the indication information indicates the quantization resolution, and the quantization resolution is related to one or more of the communication condition information, the computing resource information, or the model training information; the transceiver unit 520 is also used to receive second artificial intelligence AI collaboration data from the second device, and the second AI collaboration data is obtained by quantizing the first AI collaboration data based on the quantization resolution.
- the communication condition information includes one or more of communication bandwidth information, transmission delay, reference signal received power (RSRP), reference signal received quality (RSRQ), signal to interference plus noise ratio (SINR), bit error rate or throughput;
- the computing resource information includes one or more of the size of computing resources, the size of computing power or computing delay;
- the model training information includes one or more of the test set loss value, the training set loss value, the gradient norm size, the gradient size, and the model size.
- the transceiver unit 520 is further used to send a third message to the second device, where the third message is used to request assistance in updating the quantization resolution; and the first message is a response message corresponding to the third message.
- the third message includes one or more of the following information: a measurement identifier, an identifier of the second device, an identifier of the first device, a model identifier, indication information indicating the reason for updating the quantization resolution, an identifier of a training round corresponding to the model, an identifier of information requested for feedback, and/or configuration information for transmitting the first message; wherein the information requested for feedback includes one or more of communication condition information, computing resource information, or model training information; and the first message is generated based on the identifier of the information requested for feedback.
- the indication information indicating the reason for updating the quantization resolution includes the time difference between the first delay and the average delay, or the ratio between the first delay and the average delay; wherein the first delay is the delay of the first device waiting for the second device to feedback the AI collaboration data, and the average delay is the average value of the delay of the first device waiting for multiple distributed nodes to feedback the AI collaboration data, and the multiple distributed nodes include the second device.
- the first message further includes one or more of the following information: a measurement identifier, a second device identifier, a first device identifier, an identifier of the model, and an identifier of a training round corresponding to the model.
- For a more detailed description of the transceiver unit 520 and the processing unit 510, reference may be made to the related description of the first device in the method embodiment shown in FIG. 4a to FIG. 4c.
- the chip implements the function of the second device in the above-mentioned method embodiment.
- the chip receives information from the first device, which can be understood as the information is first received by other modules in the second device (such as a radio frequency module or an antenna), and then sent to the chip by these modules.
- the chip sends information to the first device, which can be understood as the information is first sent to other modules in the second device (such as a radio frequency module or an antenna), and then sent to the first device by these modules.
- Entities A and B can be RAN nodes or terminals, or modules inside the RAN nodes or terminals.
- the sending and receiving of information can be information interaction between a RAN node and a terminal, for example, information interaction between a base station and a terminal; the sending and receiving of information can also be information interaction between two RAN nodes, for example, information interaction between a CU and a DU; the sending and receiving of information can also be information interaction between different modules inside a device, for example, information interaction between a terminal chip and other modules of the terminal, or information interaction between a base station chip and other modules in the base station.
- processors in the embodiments of the present application may be a central processing unit (CPU), or other general-purpose processors, digital signal processors (DSP), application specific integrated circuits (ASIC), field programmable gate arrays (FPGA) or other programmable logic devices, transistor logic devices, hardware components or any combination thereof.
- the general-purpose processor may be a microprocessor or any conventional processor.
- the method steps in the embodiments of the present application can be implemented in hardware or in software instructions that can be executed by a processor.
- the software instructions can be composed of corresponding software modules, and the software modules can be stored in random access memory, flash memory, read-only memory, programmable read-only memory, erasable programmable read-only memory, electrically erasable programmable read-only memory, register, hard disk, mobile hard disk, CD-ROM or any other form of storage medium known in the art.
- An exemplary storage medium is coupled to the processor so that the processor can read information from the storage medium and write information to the storage medium.
- the storage medium can also be a component of the processor.
- the processor and the storage medium can be located in an ASIC.
- the ASIC can be located in a base station or a terminal.
- the processor and the storage medium can also be present in a base station or a terminal as discrete components.
- the computer program product includes one or more computer programs or instructions.
- the computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, a user device or other programmable device.
- the computer program or instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
- the computer program or instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired or wireless means.
- the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center that integrates one or more available media.
- the available medium may be a magnetic medium, such as a floppy disk, a hard disk, or a magnetic tape; it may also be an optical medium, such as a digital video disc; it may also be a semiconductor medium, such as a solid state drive.
- the computer readable storage medium may be a volatile or non-volatile storage medium, or may include both volatile and non-volatile storage media.
- “at least one” means one or more, and “more than one” means two or more.
- “And/or” describes the association relationship of associated objects, indicating that three relationships may exist.
- A and/or B can mean: A exists alone, A and B both exist, or B exists alone, where A and B can be singular or plural.
- the character “/” generally indicates that the previous and next associated objects are in an “or” relationship; in the formula of the present application, the character “/” indicates that the previous and next associated objects are in a “division” relationship.
- “Including at least one of A, B and C” can mean: including A; including B; including C; including A and B; including A and C; including B and C; including A, B and C.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Databases & Information Systems (AREA)
- Mobile Radio Communication Systems (AREA)
Abstract
Description
This application claims priority to Chinese Patent Application No. 202311246336.1, filed with the China National Intellectual Property Administration on September 25, 2023 and entitled "Data Processing Method and Communication Apparatus", which is incorporated herein by reference in its entirety.
The present application relates to the field of communications, and in particular to a data processing method and a communication apparatus.
Federated learning is a distributed machine learning paradigm that enables multiple devices to train jointly and build a shared machine learning model without sharing their data resources. Specifically, during federated learning, the participating distributed nodes train jointly by sharing artificial intelligence (AI) assistance data.
To reduce the communication overhead incurred when the distributed nodes upload and download AI assistance data during federated learning, the AI assistance data is usually compressed by quantization. That is, the AI assistance data is represented by a number of bits, and this number determines the quantization level (also called the quantization resolution), which in turn affects the performance of the gradient quantization algorithm. The fewer bits the quantization algorithm uses, the lower the quantization resolution, the larger the quantization error introduced when the AI assistance data is uploaded, and the longer the model takes to converge. The more bits the quantization algorithm uses, the higher the quantization resolution, and the longer it takes to transmit the AI assistance data.
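The trade-off described above can be illustrated with a minimal sketch, assuming a simple uniform quantizer over the value range of a gradient vector. The present application does not prescribe a particular quantization algorithm, so the function names below (quantize, dequantize, mean_abs_error, payload_bits) are illustrative assumptions only.

```python
def quantize(values, bits):
    """Map each float to an integer code in [0, 2**bits - 1] (uniform quantization)."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1  # number of steps between lo and hi
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Reconstruct approximate floats from the integer codes."""
    return [lo + c * step for c in codes]


def mean_abs_error(values, bits):
    """Average quantization error introduced at the given resolution."""
    codes, lo, step = quantize(values, bits)
    restored = dequantize(codes, lo, step)
    return sum(abs(a - b) for a, b in zip(values, restored)) / len(values)


def payload_bits(num_values, bits):
    """Bits needed to carry the codes (ignoring the small lo/step header)."""
    return num_values * bits
```

With such a quantizer, lowering the resolution from 8 bits to 2 bits cuts the payload to a quarter while increasing the reconstruction error, which is the tension the quantization resolution has to balance.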
Which quantization resolution a distributed node should use to quantize the AI assistance data during federated learning remains to be studied.
Summary of the invention
Embodiments of the present application provide a data processing method and a communication apparatus. A second device can quantize AI collaboration data at a dynamically changing quantization resolution, which helps improve the training efficiency of collaborative learning.
In a first aspect, the present application provides a data processing method. Taking execution by a second device as an example, the method includes: the second device sends a first message to a first device, where the first message includes one or more of communication condition information of the second device, computing resource information of the second device, or model training information of the second device; the second device receives a second message from the first device, where the second message includes indication information that indicates a quantization resolution, and the quantization resolution is related to one or more of the communication condition information, the computing resource information, or the model training information; the second device then quantizes first artificial intelligence (AI) collaboration data based on the quantization resolution to obtain second AI collaboration data, and sends the second AI collaboration data to the first device.
With the method of the first aspect, the quantization resolution of the second device is adjusted dynamically according to the current communication conditions, computing resources, or model training information of the second device. This improves how well the quantization resolution fits the current state of the second device, and quantizing and transmitting the AI collaboration data at a resolution adapted to that state helps improve the training efficiency of collaborative learning.
In a possible implementation, the communication condition information includes one or more of communication bandwidth information, transmission delay, reference signal received power (RSRP), reference signal received quality (RSRQ), signal to interference plus noise ratio (SINR), bit error rate, or throughput; the computing resource information includes one or more of the size of the computing resources, the computing capability, or the computing delay; and the model training information includes one or more of a test set loss value, a training set loss value, a gradient norm, a gradient size, or a model size.
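As a minimal sketch of how the three categories of information listed above might be grouped into a first message, the following uses Python dataclasses. All class names, field names, and units are hypothetical; the present application lists the kinds of information but does not define an encoding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommunicationConditions:
    # Hypothetical field names and units for the listed quantities.
    bandwidth_hz: Optional[float] = None
    transmission_delay_s: Optional[float] = None
    rsrp_dbm: Optional[float] = None
    rsrq_db: Optional[float] = None
    sinr_db: Optional[float] = None
    bit_error_rate: Optional[float] = None
    throughput_bps: Optional[float] = None


@dataclass
class ComputingResources:
    resource_size: Optional[float] = None
    computing_capability: Optional[float] = None
    computing_delay_s: Optional[float] = None


@dataclass
class ModelTrainingInfo:
    test_loss: Optional[float] = None
    train_loss: Optional[float] = None
    gradient_norm: Optional[float] = None
    gradient_size: Optional[int] = None
    model_size: Optional[int] = None


@dataclass
class FirstMessage:
    communication: Optional[CommunicationConditions] = None
    computing: Optional[ComputingResources] = None
    training: Optional[ModelTrainingInfo] = None

    def __post_init__(self):
        # The first message carries one or more of the three categories.
        if (self.communication is None and self.computing is None
                and self.training is None):
            raise ValueError("first message must carry at least one category")
```

Making every field optional mirrors the "one or more of" wording: a second device reports only what it has measured or what was requested.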
In a possible implementation, the second device receives a third message from the first device, where the third message requests assistance in updating the quantization resolution, and the first message is a response to the third message. This implementation makes the triggering of the first message more flexible.
In a possible implementation, the third message includes one or more of the following: a measurement identifier, an identifier of the second device, an identifier of the first device, an identifier of a model, indication information indicating the reason for updating the quantization resolution, an identifier of a training round corresponding to the model, an identifier of the information requested for feedback, or configuration information for transmitting the first message. The information requested for feedback includes one or more of the communication condition information, the computing resource information, or the model training information, and the first message is generated according to the identifier of the information requested for feedback.
In a possible implementation, the indication information indicating the reason for updating the quantization resolution includes the time difference between a first delay and an average delay, or the ratio of the first delay to the average delay. The first delay is the time the first device waits for the second device to feed back AI collaboration data, and the average delay is the average of the times the first device waits for multiple distributed nodes, including the second device, to feed back AI collaboration data. This implementation helps shorten the delay with which the second device feeds back AI collaboration data, and thereby helps improve the training efficiency of collaborative learning.
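The two forms of the update reason described above amount to simple arithmetic. The sketch below assumes the first device keeps a per-node record of waiting times in seconds; the function name update_reason and the dictionary keys are illustrative assumptions.

```python
def update_reason(delays_s, node_id):
    """Return the delay difference and delay ratio for one distributed node.

    delays_s maps each node identifier to the time the first device waited
    for that node's AI collaboration data; node_id must be one of its keys,
    so the average includes the node being evaluated.
    """
    first_delay = delays_s[node_id]
    average_delay = sum(delays_s.values()) / len(delays_s)
    return {
        "difference_s": first_delay - average_delay,
        "ratio": first_delay / average_delay,
    }
```

For example, with waiting times of 2 s, 4 s, and 6 s for three nodes, the slowest node has a difference of 2 s and a ratio of 1.5 relative to the 4 s average, which marks it as a candidate for a lower quantization resolution.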
In a possible implementation, the configuration information for transmitting the first message includes one or more of a transmission period of the first message, a trigger condition for transmitting the first message, or a number of times the first message is to be transmitted. With this implementation, the second device can automatically send the first message to the first device according to the configuration information to assist in updating the quantization resolution, which helps save communication transmission resources.
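A minimal sketch of how a second device might act on such configuration information follows. The class ReportConfig, the function should_send, and the way the three elements are combined (a report count cap, then trigger, then period) are assumptions made for illustration; the present application does not specify how the elements interact.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ReportConfig:
    period_s: Optional[float] = None                   # transmission period
    trigger: Optional[Callable[[dict], bool]] = None   # trigger condition
    max_reports: Optional[int] = None                  # number of transmissions


def should_send(config, now_s, last_sent_s, sent_count, measurements):
    """Decide whether the second device sends a first message now."""
    # Stop once the configured number of transmissions is reached.
    if config.max_reports is not None and sent_count >= config.max_reports:
        return False
    # An event trigger fires regardless of the period.
    if config.trigger is not None and config.trigger(measurements):
        return True
    # Otherwise fall back to periodic reporting, if configured.
    if config.period_s is not None:
        return last_sent_s is None or now_s - last_sent_s >= config.period_s
    return False
```

Because the configuration is delivered once, the second device can keep reporting without a separate request for every update, which is where the saving in transmission resources would come from.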
In a possible implementation, the first message further includes one or more of the following: a measurement identifier, an identifier of the second device, an identifier of the first device, an identifier of the model, or an identifier of a training round corresponding to the model.
In a second aspect, the present application provides a data processing method. Taking execution by a first device as an example, the method includes: the first device receives a first message from a second device, where the first message includes one or more of communication condition information of the second device, computing resource information of the second device, or model training information of the second device; the first device determines a quantization resolution based on one or more of the communication condition information, the computing resource information, or the model training information, and sends a second message to the second device, where the second message includes indication information that indicates the quantization resolution, and the quantization resolution is related to one or more of the communication condition information, the computing resource information, or the model training information; and the first device receives second artificial intelligence (AI) collaboration data from the second device, where the second AI collaboration data is obtained by quantizing first AI collaboration data based on the quantization resolution.
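The summary above does not state how the first device maps the reported information to a quantization resolution. As one hypothetical policy, the sketch below picks the highest candidate bit width whose estimated upload time fits a time budget, using a reported throughput and model size. The function name choose_bits, the candidate set, and the budget parameter are all assumptions, not the method of the present application.

```python
def choose_bits(num_params, throughput_bps, upload_budget_s,
                candidates=(2, 4, 8, 16, 32)):
    """Pick the highest bit width whose upload time fits the budget.

    The upload time is estimated as num_params * bits / throughput_bps.
    If no candidate fits, the lowest candidate is returned as a fallback.
    """
    feasible = [
        bits for bits in candidates
        if num_params * bits / throughput_bps <= upload_budget_s
    ]
    return max(feasible) if feasible else min(candidates)
```

Under this policy a node with a good link keeps a high resolution (low quantization error), while a node with a poor link is assigned fewer bits so that it does not delay the training round.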
For the beneficial effects of the method of the second aspect, refer to the method and beneficial effects described in the first aspect.
In a possible implementation, the communication condition information includes one or more of communication bandwidth information, transmission delay, reference signal received power (RSRP), reference signal received quality (RSRQ), signal to interference plus noise ratio (SINR), bit error rate, or throughput; the computing resource information includes one or more of the size of the computing resources, the computing capability, or the computing delay; and the model training information includes one or more of a test set loss value, a training set loss value, a gradient norm, a gradient size, or a model size.
In a possible implementation, the first device sends a third message to the second device, where the third message requests assistance in updating the quantization resolution, and the first message is a response to the third message.
In a possible implementation, the third message includes one or more of the following: a measurement identifier, an identifier of the second device, an identifier of the first device, an identifier of a model, indication information indicating the reason for updating the quantization resolution, an identifier of a training round corresponding to the model, an identifier of the information requested for feedback, and/or configuration information for transmitting the first message. The information requested for feedback includes one or more of the communication condition information, the computing resource information, or the model training information, and the first message is generated according to the identifier of the information requested for feedback.
In a possible implementation, the indication information indicating the reason for updating the quantization resolution includes the time difference between a first delay and an average delay, or the ratio of the first delay to the average delay. The first delay is the time the first device waits for the second device to feed back AI collaboration data, and the average delay is the average of the times the first device waits for multiple distributed nodes, including the second device, to feed back AI collaboration data.
In a possible implementation, the configuration information for transmitting the first message includes one or more of a transmission period of the first message, a trigger condition for transmitting the first message, or a number of times the first message is to be transmitted.
In a possible implementation, the first message further includes one or more of the following: a measurement identifier, an identifier of the second device, an identifier of the first device, an identifier of the model, or an identifier of a training round corresponding to the model.
In a third aspect, the present application provides a communication apparatus. The communication apparatus may be a second device, an apparatus in the second device, or an apparatus that can be used together with the second device. The communication apparatus may also be a chip system. The communication apparatus can perform the method of the first aspect. The functions of the communication apparatus may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more units or modules corresponding to the above functions. A unit or module may be software and/or hardware. For the operations performed by the communication apparatus and their beneficial effects, refer to the method and beneficial effects described in the first aspect.
In a fourth aspect, the present application provides a communication apparatus. The communication apparatus may be a first device, an apparatus in the first device, or an apparatus that can be used together with the first device. The communication apparatus may also be a chip system. The communication apparatus can perform the method of the second aspect. The functions of the communication apparatus may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more units or modules corresponding to the above functions. A unit or module may be software and/or hardware. For the operations performed by the communication apparatus and their beneficial effects, refer to the method and beneficial effects described in the second aspect.
In a fifth aspect, the present application provides a communication apparatus including a processor and an interface circuit. The interface circuit is configured to receive signals from communication apparatuses other than the communication apparatus and pass them to the processor, or to send signals from the processor to communication apparatuses other than the communication apparatus. The processor is configured to implement, through a logic circuit or by executing code instructions, the method of the first aspect or the method of the second aspect.
In a sixth aspect, the present application provides a computer-readable storage medium storing a computer program or instructions. When the computer program or instructions are executed by a communication apparatus, the method of the first aspect or the method of the second aspect is implemented.
In a seventh aspect, the present application provides a computer program product including instructions. When a communication apparatus reads and executes the instructions, the communication apparatus is caused to perform the method of the first aspect or the method of the second aspect.
In an eighth aspect, the present application provides a communication system including a communication apparatus configured to perform the method of the first aspect, and a communication apparatus configured to perform the method performed by the network device as described in the second aspect.
FIG. 1 is a schematic diagram of a communication system provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of an application framework of AI in an NR system provided in an embodiment of the present application;
FIG. 3 is a schematic flowchart of federated learning provided in an embodiment of the present application;
FIG. 4a is a schematic flowchart of a data processing method provided in an embodiment of the present application;
FIG. 4b is a schematic flowchart of collaborative learning provided in an embodiment of the present application;
FIG. 4c is a schematic flowchart of another collaborative learning process provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a communication apparatus provided in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of another communication apparatus provided in an embodiment of the present application.
To facilitate understanding of the embodiments of the present application, the system architecture involved in the embodiments is introduced first.
FIG. 1 is a schematic architecture diagram of a communication system 1000 to which embodiments of the present application apply. As shown in FIG. 1, the communication system includes a radio access network (RAN) 100 and a core network 200. Optionally, the communication system 1000 may also include the Internet 300. The RAN 100 includes at least one RAN node (such as 110a and 110b in FIG. 1, collectively referred to as 110) and may also include at least one terminal (such as 120a-120j in FIG. 1, collectively referred to as 120). The RAN 100 may also include other RAN nodes, for example, a wireless relay device and/or a wireless backhaul device (not shown in FIG. 1). The terminal 120 is connected to the RAN node 110 wirelessly, and the RAN node 110 is connected to the core network 200 in a wireless or wired manner. A core network device in the core network 200 and the RAN node 110 in the RAN 100 may be separate physical devices, or may be a single physical device that integrates the logical functions of the core network device and the logical functions of the RAN node. Terminals may be connected to each other, and RAN nodes may be connected to each other, in a wired or wireless manner. Note that the RAN node 110 may also be referred to as the network device 110 below.
The RAN 100 may be an evolved universal terrestrial radio access (E-UTRA) system or a new radio (NR) system defined by the 3rd Generation Partnership Project (3GPP), or a future radio access system. The RAN 100 may also include two or more of these different radio access systems. The RAN 100 may also be an open RAN (O-RAN).
A RAN node, also called a radio access network device, a RAN entity, or an access node, helps a terminal access the communication system wirelessly. In one application scenario, the RAN node may be a base station, an evolved NodeB (eNodeB), a transmission reception point (TRP), a next generation NodeB (gNB) in a fifth generation (5G) mobile communication system, a next generation base station in a sixth generation (6G) mobile communication system, or a base station in a future mobile communication system. The RAN node may be a macro base station (such as 110a in FIG. 1), a micro base station or an indoor station (such as 110b in FIG. 1), or a relay node or a donor node.
In another application scenario, multiple RAN nodes cooperate to help the terminal achieve wireless access, with different RAN nodes each implementing part of the functions of a base station. For example, the RAN node may be a central unit (CU), a distributed unit (DU), or a radio unit (RU). The CU performs the functions of the radio resource control protocol and the packet data convergence protocol (PDCP) of the base station, and may also perform the functions of the service data adaptation protocol (SDAP). The DU performs the functions of the radio link control layer and the medium access control (MAC) layer of the base station, and may also perform some or all of the functions of the physical layer. For a detailed description of these protocol layers, refer to the relevant 3GPP technical specifications. The RU can be used to transmit and receive radio frequency signals. The CU and the DU may be two separate RAN nodes, or may be integrated in the same RAN node, for example in a baseband unit (BBU). The RU may be included in a radio frequency device, for example in a remote radio unit (RRU) or an active antenna unit (AAU). The CU may be further divided into two types of RAN nodes: the CU control plane and the CU user plane.
RAN nodes may have different names in different systems. For example, in an O-RAN system, the CU may be called an open CU (O-CU), the DU may be called an open DU (O-DU), and the RU may be called an open RU (O-RU). The RAN node in the embodiments of the present application may be implemented by a software module, a hardware module, or a combination of the two. For example, the RAN node may be a server loaded with the corresponding software module. The embodiments of the present application do not limit the specific technology or device form used by the RAN node. For ease of description, a base station is used below as an example of a RAN node.
A terminal is a device with wireless transceiver functions that can send signals to a base station or receive signals from a base station. A terminal may also be called a terminal device, user equipment (UE), a mobile station, a mobile terminal, and so on. Terminals can be used in a wide range of scenarios, for example, device-to-device (D2D) communication, vehicle-to-everything (V2X) communication, machine-type communication (MTC), Internet of things (IoT), virtual reality, augmented reality, industrial control, autonomous driving, telemedicine, smart grid, smart furniture, smart office, smart wearables, smart transportation, and smart city. The terminal may be a mobile phone, a tablet computer, a computer with wireless transceiver functions, a wearable device, a vehicle, an airplane, a ship, a robot, a robotic arm, a smart home device, and so on. The embodiments of the present application do not limit the specific technology or device form used by the terminal.
Base stations and terminals may be fixed or mobile. They may be deployed on land, including indoors or outdoors, handheld or vehicle-mounted; on the water surface; or on airplanes, balloons, and artificial satellites. The embodiments of the present application do not limit the application scenarios of base stations and terminals.
The roles of base station and terminal can be relative. For example, the helicopter or drone 120i in FIG. 1 can be configured as a mobile base station. For a terminal 120j that accesses the radio access network 100 through 120i, the terminal 120i is a base station; for the base station 110a, however, 120i is a terminal, that is, 110a and 120i communicate through a wireless air interface protocol. Alternatively, 110a and 120i may communicate through an interface protocol between base stations, in which case 120i is also a base station relative to 110a. Base stations and terminals can therefore both be referred to as communication apparatuses: 110a and 110b in FIG. 1 can be referred to as communication apparatuses with base station functions, and 120a-120j in FIG. 1 can be referred to as communication apparatuses with terminal functions.
Communication between a base station and a terminal, between base stations, and between terminals may use licensed spectrum, unlicensed spectrum, or both at the same time. It may use spectrum below 6 gigahertz (GHz), spectrum above 6 GHz, or both at the same time. The embodiments of the present application do not limit the spectrum resources used for wireless communication.
In the embodiments of the present application, the functions of the base station may be performed by a module (such as a chip) in the base station, or by a control subsystem that includes the base station functions. Such a control subsystem may be a control center in the application scenarios mentioned above, such as smart grid, industrial control, smart transportation, and smart city. The functions of the terminal may be performed by a module (such as a chip or a modem) in the terminal, or by an apparatus that includes the terminal functions.
In this application, the base station sends downlink signals or downlink information to the terminal, and the downlink information is carried on a downlink channel; the terminal sends uplink signals or uplink information to the base station, and the uplink information is carried on an uplink channel. To communicate with the base station, the terminal needs to establish a wireless connection with a cell controlled by the base station. The cell with which the terminal has established a wireless connection is called the serving cell of the terminal. While communicating with the serving cell, the terminal is also subject to interference from signals of neighboring cells.
To facilitate understanding of the embodiments of this application, some of the terms used in the embodiments are explained below. This part is provided only for ease of understanding and shall not be regarded as a disclosure of, or a specific limitation on, the technical solution of this application.
1. AI
AI is a technology that performs complex computation by simulating the human brain. AI can be applied to an NR system, specifically to improve network performance and user experience by intelligently collecting and analyzing data. Refer to FIG. 2, which is a diagram of an application framework of AI in an NR system.
In FIG. 2, the data collection module serves as the database for AI model training and for data analysis and inference. It collects and stores data input from, for example, a gNB, a gNB-CU, a gNB-DU, a UE, or other network entities. The model training module analyzes the training data provided by the data collection module to obtain an optimal AI model. The model inference module, based on the AI model obtained by the model training module, analyzes the inference data provided by the data collection module and outputs a prediction result to the actor entity (the result may be a reasonable prediction of network operation, or guidance for the network to apply a corresponding adjustment policy). Further, the actor entity schedules and manages network entities based on the prediction result output by the model inference module; that is, the actor entity schedules multiple network entities, including itself, to perform the operations corresponding to the prediction result. After the network entities in the communication network perform those operations, the resulting behavior of the communication network (which can be understood as the data generated while the network entities operate) is collected and stored by the data collection module.
2. Federated learning
Federated learning means that multiple devices (referred to as distributed nodes participating in federated learning) jointly train an AI model and build a shared AI model without sharing data resources (which can be understood as the training data of the model). Depending on how the training data is distributed across the distributed nodes in the data feature space and in the sample identity (ID) space, federated learning can be divided into three types: horizontal federated learning (HFL), vertical federated learning (VFL), and federated transfer learning (FTL).
Regardless of the type of federated learning, the distributed nodes participating in it can jointly train the AI model by sharing AI collaboration data instead of sharing data resources. The AI collaboration data includes one or more of the following: gradient data, an AI model, an AI sub-model, an AI model output, an AI model intermediate value, and the like.
For example, as shown in FIG. 3, device 1 and device 2 are distributed nodes participating in federated learning. During federated learning, device 1 performs model training to obtain AI collaboration data #1, and device 2 performs model training to obtain AI collaboration data #2. Device 1 then sends AI collaboration data #1 to the centralized node (server) of the federated learning, and device 2 sends AI collaboration data #2 to the centralized node. The centralized node aggregates AI collaboration data #1 and AI collaboration data #2 to obtain AI collaboration data #12. The centralized node then sends AI collaboration data #12 to device 1 and device 2, so that device 1 and device 2 perform model training according to AI collaboration data #12.
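The aggregation step of FIG. 3 can be illustrated with a minimal sketch. The application does not state which aggregation rule the centralized node uses; the element-wise mean below is an assumption chosen for illustration, and the function name `aggregate` and the sample vectors are hypothetical.

```python
from typing import List


def aggregate(updates: List[List[float]]) -> List[float]:
    """Element-wise mean of equally sized update vectors.

    Assumption: the centralized node averages the AI collaboration data.
    The source only says that the data is "aggregated".
    """
    if not updates:
        raise ValueError("need at least one update")
    length = len(updates[0])
    if any(len(u) != length for u in updates):
        raise ValueError("all updates must have the same length")
    return [sum(column) / len(updates) for column in zip(*updates)]


# AI collaboration data #1 and #2, here modeled as small gradient vectors.
data_1 = [1.0, -2.0, 4.0]
data_2 = [3.0, 0.0, 2.0]

# AI collaboration data #12, which the centralized node sends back to both devices.
data_12 = aggregate([data_1, data_2])
print(data_12)
```

A weighted mean (for example, weighted by the number of local samples) would fit the same interface; the unweighted form is used only to keep the sketch short.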
The volume of AI collaboration data is usually large. To reduce the communication overhead incurred when the distributed nodes upload and download AI collaboration data during federated learning, the AI collaboration data can be quantized and compressed before transmission. Quantizing and compressing AI collaboration data can be understood as representing the AI collaboration data to be transmitted (uploaded or downloaded) with a number of bits; the number of bits used to represent the AI collaboration data can be understood as the quantization level of that data, also called the quantization resolution.
It can be understood that the fewer bits are used to represent the AI collaboration data, the lower the quantization resolution of that data and the larger the error introduced by transmission (the quantization error), which may increase the time the model needs to converge over the iterations of federated learning. Conversely, the more bits are used to represent the AI collaboration data, the higher the quantization resolution, the larger the amount of data transmitted, and the longer the transmission takes, which may increase the model training time in federated learning. The quantization resolution therefore affects, to a certain extent, the training duration of the federated learning model.
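The trade-off described above can be made concrete with a small sketch. The application does not specify a quantizer; the uniform min-max quantizer below is an assumption, the names `quantize_roundtrip` and `max_error` are hypothetical, and the payload count ignores the side information (minimum and maximum) that a real receiver would also need.

```python
from typing import List, Tuple


def quantize_roundtrip(values: List[float], bits: int) -> Tuple[List[float], int]:
    """Uniformly quantize each value to `bits` bits, then reconstruct it.

    Returns the reconstructed values and the payload size in bits.
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1              # number of quantization steps
    if hi == lo:                          # constant input: nothing to resolve
        return list(values), len(values) * bits
    step = (hi - lo) / levels
    codes = [round((v - lo) / step) for v in values]   # integers in [0, levels]
    restored = [lo + c * step for c in codes]
    return restored, len(values) * bits


def max_error(original: List[float], restored: List[float]) -> float:
    """Largest absolute reconstruction error (the quantization error)."""
    return max(abs(a - b) for a, b in zip(original, restored))


gradient = [-1.0, -0.37, 0.12, 0.5, 1.0]   # illustrative AI collaboration data
for bits in (4, 8, 16):
    restored, payload_bits = quantize_roundtrip(gradient, bits)
    print(bits, payload_bits, max_error(gradient, restored))
```

Running the loop shows the payload growing linearly with the number of bits while the worst-case error shrinks, which is the tension the quantization resolution has to balance.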
Typically, a distributed node in federated learning quantizes AI collaboration data at a fixed (that is, predetermined) quantization resolution. During federated learning, however, the resources that the distributed node can use for federated learning (including but not limited to communication resources and computing resources) may change. If the AI collaboration data is always quantized at a fixed quantization resolution, the quantization resolution may no longer match the current state of the distributed node, which may lengthen the model training time and reduce the training efficiency of federated learning.
To improve the training efficiency of collaborative learning, this application provides a data processing method and a communication apparatus, which are described in detail below with reference to the accompanying drawings. Collaborative learning as used in this application can be understood as an AI model training approach in which multiple devices cooperate (jointly or collaboratively) in training; for example, collaborative learning may be distributed learning, federated learning, edge learning, or split learning.
FIG. 4a is a schematic flowchart of a data processing method provided in an embodiment of this application. As shown in FIG. 4a, the data processing method includes steps S401 to S405. The method in FIG. 4a is described with a first device and a second device as the executing entities. It can be understood that the executing entities may also be a module (for example, a chip) in the first device and a module (for example, a chip) in the second device.
The first device and the second device are devices participating in collaborative learning. For example, the first device may be the centralized node of the collaborative learning, and the second device may be a distributed node in the collaborative learning (that is, a device that performs model training). It should be noted that the first device may be any one of a device in the RAN 100 (for example, a RAN node or a terminal), a functional network element in the CN 200, or a server corresponding to the Internet 300 in the communication system of FIG. 1, and the second device may likewise be any one of a device in the RAN 100, a functional network element in the CN 200, or a server corresponding to the Internet 300 in the communication system of FIG. 1. This application does not specifically limit this. Specifically:
S401. The second device sends a first message to the first device. The first message includes one or more of communication condition information of the second device, computing resource information of the second device, or model training information of the second device.
It can be understood that the second device is a device that uploads AI collaboration data during collaborative learning, and the first device is a device that determines the quantization resolution of the second device during collaborative learning. Taking federated learning as an example of collaborative learning, the first device may be the server node of the federated learning, and the second device may be one of the multiple client nodes of the federated learning.
That is, before the first device updates (or determines) the quantization resolution of the second device, the first device receives the first message from the second device, and the first message assists the first device in updating the quantization resolution of the second device. Specifically, the second device may use the communication condition information in the first message to indicate the communication environment in which it is located, use the computing resource information in the first message to indicate the computing resources it can use for collaborative learning, and use the model training information in the first message to indicate information related to its model training (for example, the size of the AI collaboration data or data related to the quantization resolution). It should be noted that this application does not specifically limit how the second device obtains the communication condition information, the computing resource information, and the model training information.
In a possible implementation, the communication condition information includes one or more of communication bandwidth information, transmission delay, reference signal received power (RSRP), reference signal received quality (RSRQ), signal to interference plus noise ratio (SINR), bit error rate, or throughput. The computing resource information includes one or more of the amount of computing resources, the amount of computing capability, or the computing delay. The model training information includes one or more of a test set loss value, a training set loss value, a gradient norm, a gradient size, or a model size.
In a possible implementation, in addition to one or more of the communication condition information, the computing resource information, or the model training information of the second device, the first message sent by the second device may further include one or more of the following: a measurement identifier, which identifies this report of the first message and associates it with the subsequent second message; a second device identifier, which identifies the device that sends the first message, that is, the device whose quantization resolution is to be updated; a first device identifier, which identifies the device that receives the first message, that is, the device that determines the quantization resolution; a model identifier, which identifies the AI model to which the first message corresponds; and an identifier of a training round of the model, which indicates for which training round's AI collaboration data feedback the quantization resolution is to be updated by means of the first message.
S402. The first device determines a quantization resolution according to one or more of the communication condition information, the computing resource information, or the model training information.
That is, after receiving the first message, the first device determines the quantization resolution according to the information included in the first message. In a possible implementation, the first device determines the quantization resolution according to one or more of the communication condition information, the computing resource information, or the model training information, with the goal of minimizing the training duration of the model.
It can be understood that, in each training round of the collaborative learning model, the first device performs subsequent steps (for example, aggregating the AI collaboration data) only after the multiple distributed nodes of the collaborative learning (including the second device) have fed back their AI collaboration data. The training duration of the model is therefore affected by the delay with which the distributed nodes feed back AI collaboration data. To minimize the training duration, the first device may determine a higher quantization resolution (more bits) for a distributed node with high computing power, low delay, or a large gradient norm, so as to improve the accuracy of the AI collaboration data fed back by that node, and determine a lower quantization resolution (fewer bits) for a distributed node with low computing power, high delay, or a small gradient norm, so as to reduce the communication delay between the first device and that node and thereby reduce the training delay.
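The selection rule in S402 could be sketched as follows. The application states only the direction of the adjustment (more bits for fast nodes or large gradient norms, fewer bits otherwise); the scoring scheme, the candidate bit widths, the comparison against mean values, and the names `NodeReport` and `choose_bits` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeReport:
    """Subset of the first message used by this sketch (hypothetical fields)."""
    transmission_delay_s: float
    compute_delay_s: float
    gradient_norm: float


# Assumed candidate quantization resolutions, from coarse to fine.
CANDIDATE_BITS = (4, 8, 16)


def choose_bits(report: NodeReport, mean_delay_s: float, mean_grad_norm: float) -> int:
    """Pick a quantization resolution from the node's reported conditions.

    Heuristic (assumption): one point for being no slower than the average
    node, one point for a gradient norm at or above the average.
    """
    total_delay = report.transmission_delay_s + report.compute_delay_s
    score = 0
    if total_delay <= mean_delay_s:
        score += 1                      # fast node: can afford more bits
    if report.gradient_norm >= mean_grad_norm:
        score += 1                      # informative update: worth more bits
    return CANDIDATE_BITS[score]


fast_node = NodeReport(transmission_delay_s=0.1, compute_delay_s=0.2, gradient_norm=5.0)
slow_node = NodeReport(transmission_delay_s=2.0, compute_delay_s=1.0, gradient_norm=0.1)
print(choose_bits(fast_node, mean_delay_s=1.0, mean_grad_norm=1.0))
print(choose_bits(slow_node, mean_delay_s=1.0, mean_grad_norm=1.0))
```

An implementation that truly minimizes the training duration would solve an optimization problem over all nodes; the threshold rule here only mirrors the qualitative behavior described in the text.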
S403. The first device sends a second message to the second device, where the second message includes indication information indicating the quantization resolution.
That is, after determining the quantization resolution of the second device, the first device indicates the quantization resolution to the second device through the second message. The indication information may indicate the quantization resolution directly (that is, the number of bits used for quantization), for example, the second message includes 4 bit or 16 bit. Alternatively, it may indicate a quantization index corresponding to the quantization resolution; for example, a quantization index of 0 in the second message indicates quantization with 4 bits, and a quantization index of 1 indicates quantization with 16 bits.
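The two ways of indicating the quantization resolution can be sketched as below. The index table reuses the example values from the text (index 0 for 4 bits, index 1 for 16 bits); the dictionary-based message layout and the key names `bits` and `quant_index` are hypothetical.

```python
from typing import Dict

# Example mapping taken from the text: index 0 -> 4 bits, index 1 -> 16 bits.
QUANT_INDEX_TO_BITS: Dict[int, int] = {0: 4, 1: 16}


def resolve_bits(second_message: Dict[str, int]) -> int:
    """Return the number of quantization bits indicated by a second message.

    The message either carries the bit count directly ("bits") or an index
    into a table known to both devices ("quant_index").
    """
    if "bits" in second_message:
        return second_message["bits"]
    if "quant_index" in second_message:
        return QUANT_INDEX_TO_BITS[second_message["quant_index"]]
    raise ValueError("second message carries no quantization indication")


print(resolve_bits({"bits": 16}))          # direct indication
print(resolve_bits({"quant_index": 0}))    # indexed indication
```

An index is cheaper to signal than a bit count but requires both devices to agree on the table beforehand.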
In a possible implementation, in addition to the indication information indicating the quantization resolution, the second message may further include one or more of the following: a measurement identifier, which corresponds to the measurement identifier in the first message and identifies this delivery of the second message; a second device identifier, which identifies the device that receives the second message, that is, the device that applies the quantization resolution; a first device identifier, which identifies the device that sends the second message, that is, the device that determines the quantization resolution; a model identifier, which identifies the AI model to which the quantization resolution corresponds; and an identifier of a training round of the model, which indicates for which training round's AI collaboration data feedback the quantization resolution is updated.
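The optional identifiers let the second device associate a second message with the first message it sent earlier. The following sketch shows one way to model this; the class and field names are hypothetical, and the application does not define a concrete encoding.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FirstMessage:
    measurement_id: int          # identifies this report
    second_device_id: str        # sender, whose resolution is to be updated
    first_device_id: str         # receiver, which determines the resolution
    model_id: str
    training_round: int
    info: Dict[str, float] = field(default_factory=dict)


@dataclass
class SecondMessage:
    measurement_id: int          # echoes the measurement identifier of the report
    second_device_id: str
    first_device_id: str
    model_id: str
    training_round: int
    quantization_bits: int


def answers(report: FirstMessage, reply: SecondMessage) -> bool:
    """True if the second message corresponds to the given first message."""
    return (
        report.measurement_id == reply.measurement_id
        and report.second_device_id == reply.second_device_id
        and report.first_device_id == reply.first_device_id
        and report.model_id == reply.model_id
        and report.training_round == reply.training_round
    )


report = FirstMessage(7, "client-2", "server-1", "model-A", 12, {"transmission_delay": 0.8})
reply = SecondMessage(7, "client-2", "server-1", "model-A", 12, quantization_bits=4)
print(answers(report, reply))
```

Because every identifier is optional in the application, a real implementation would compare only the fields that are actually present.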
S404. The second device quantizes first AI collaboration data based on the quantization resolution to obtain second AI collaboration data.
In the training round to which the quantization resolution corresponds, the second device trains the model to obtain the first AI collaboration data (the AI collaboration data before quantization). The second device then quantizes the first AI collaboration data based on the quantization resolution to obtain the second AI collaboration data (the AI collaboration data after quantization).
S405. The second device sends the second AI collaboration data to the first device.
It can be understood that, after obtaining the second AI collaboration data, the second device sends a fourth message to the first device, where the fourth message includes the second AI collaboration data. In a possible implementation, in addition to the second AI collaboration data, the fourth message may further include one or more of the following: a second device identifier, which identifies the device that sends the fourth message; a first device identifier, which identifies the device that receives the fourth message; a model identifier, which identifies the AI model to which the second AI collaboration data corresponds; and an identifier of a training round of the model, which indicates the training round to which the second AI collaboration data belongs.
In summary, with the method shown in FIG. 4a, the quantization resolution of the second device can be adjusted dynamically according to the current communication conditions, computing resources, or model training information of the second device, which improves how well the quantization resolution matches the current state of the second device. Quantizing and transmitting the AI collaboration data at a quantization resolution adapted to the current state of the second device helps improve the training efficiency of collaborative learning.
The collaborative learning process is described in detail below in two cases, distinguished by how the second device is triggered to send the first message to the first device. It should be noted that both cases merely take a first device, a second device, and a third device as example participants in the collaborative learning; other devices (not shown in the figures) may also participate, and this application does not specifically limit this.
Case 1: The second device is passively triggered to send the first message to the first device.
That is, in case 1, the second device receives a third message from another device (any device other than the second device, including the first device), and the third message triggers the second device to send the first message to the first device. In this case, the first message can be understood as the response message corresponding to the third message. For ease of understanding, the following description uses the example in which the third message is sent by the first device to the second device; this shall not be regarded as a specific limitation on the device that sends the third message.
Refer to FIG. 4b, which is a schematic flowchart of collaborative learning provided in this application. As shown in FIG. 4b, the collaborative learning includes steps S411 to S419. Specifically:
S411. The first device sends a third message to the second device, where the third message requests the second device to assist in updating the quantization resolution.
It can be understood that, when the first device determines that the quantization resolution of the second device needs to be updated, the first device sends the third message to the second device, and the third message requests the second device to assist in updating the quantization resolution. It should be noted that this application does not limit the specific name of the third message; for example, the third message may be a request message for assistance in updating the quantization resolution.
For example, device 1 to device 5 are devices that execute a federated learning task. Device 1 is the server of the federated learning task, and device 2 to device 5 are the clients that execute the task. While executing the federated learning task, each of device 2 to device 5 feeds back AI collaboration data to device 1. In a certain round (also called a training round or epoch) of federated learning, the delay with which device 1 waits for device 2 to feed back AI collaboration data #2 is delay 1, the delay with which device 1 waits for device 3 to feed back AI collaboration data #3 is delay 2, the delay with which device 1 waits for device 4 to feed back AI collaboration data #4 is delay 3, and the delay with which device 1 waits for device 5 to feed back AI collaboration data #5 is delay 4. Delay 1 is the maximum of delay 1 to delay 4, and the absolute value of the difference between the average of delay 1 to delay 4 and delay 1 is greater than a first threshold. The first threshold is a preset value greater than 0, and its specific value can be adjusted according to the application scenario. In this case, to reduce the time device 1 spends waiting for all clients to feed back AI collaboration data, device 1 can reduce the delay of waiting for device 2 by adjusting the quantization resolution of device 2. In other words, device 1 determines that the quantization resolution of device 2 needs to be updated (device 2 is determined to be the second device). Device 1 then sends the third message to device 2; the third message requests device 2 to assist device 1 in updating the quantization resolution, that is, it requests device 2 to send the first message to device 1.
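The trigger condition in this example can be written out as a short sketch. The function name `find_node_to_update`, the device labels, and the numeric delays and threshold are hypothetical; the rule itself (the slowest node is selected when its delay differs from the average by more than the first threshold) follows the example above.

```python
from typing import Dict, Optional


def find_node_to_update(delays: Dict[str, float], first_threshold: float) -> Optional[str]:
    """Return the client whose quantization resolution should be updated, if any.

    `delays` maps each client to the time the server waited for its
    AI collaboration data in the current training round.
    """
    if not delays or first_threshold <= 0:
        raise ValueError("need at least one delay and a positive threshold")
    slowest = max(delays, key=lambda name: delays[name])
    average = sum(delays.values()) / len(delays)
    if abs(average - delays[slowest]) > first_threshold:
        return slowest          # this node becomes the "second device"
    return None                 # delays are balanced; no update needed


# Delay 1 to delay 4 from the example, with illustrative values in seconds.
round_delays = {"device2": 9.0, "device3": 3.0, "device4": 2.0, "device5": 2.0}
print(find_node_to_update(round_delays, first_threshold=2.0))
```

With these values the average delay is 4.0 s, so device 2 exceeds the threshold and would receive the third message.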
In a possible implementation, the third message includes one or more of the following:
① A measurement identifier (measurement ID), which identifies this request for assistance in updating the quantization resolution.
② An identifier of the second device, which indicates the device whose quantization resolution is to be updated, or the device that receives the third message, or the device that sends the first message.
③ An identifier of the first device, which indicates the device that determines the quantization resolution, or the device that sends the third message, or the device that receives the first message.
④ A model identifier, which indicates the AI model (or AI collaboration model) for which the second device needs its quantization resolution updated. The model identifier may be the identifier of the AI model itself, or information associated with the AI model (for example, a service identifier corresponding to the AI model).
⑤ Indication information indicating the reason for updating the quantization resolution. For example, the third message includes a field A whose value indicates the reason for this update. When the value of field A is 1, the third message indicates that the reason for the update is that the delay of the second device (that is, the delay with which the first device waits for the second device to feed back AI collaboration data) is large; in other words, the quantization resolution is updated to improve the efficiency with which the second device feeds back AI collaboration data. When the value of field A is 0, the third message indicates that the reason for the update is that the error of the AI collaboration data fed back by the second device is large; in other words, the quantization resolution is updated to improve the accuracy of the AI collaboration data fed back by the second device.
In a possible implementation, the indication information indicating the reason for updating the quantization resolution includes the time difference between a first delay and an average delay, or the ratio of the first delay to the average delay. The first delay is the delay with which the first device waits for the second device to feed back AI collaboration data, and the average delay is the average of the delays with which the first device waits for multiple distributed nodes, including the second device, to feed back AI collaboration data. That is, when the reason for updating the quantization resolution is that the delay of the second device is large (the quantization resolution is updated to improve the efficiency with which the second device feeds back AI collaboration data), the indication information in the third message may be the time difference between the first delay and the average delay, or the ratio of the first delay to the average delay.
⑥ An identifier of a training round of the model, which indicates for which training round's AI collaboration data feedback the first device updates, based on the first message, the quantization resolution used by the second device. For example, if the third message includes the identifier of the Nth training round of the model, the first device updates, based on the first message, the quantization resolution that the second device uses when feeding back the AI collaboration data of the Nth training round; that is, when the second device subsequently feeds back the AI collaboration data of the Nth training round, it may quantize the data at the quantization resolution updated based on the first message. It should be noted that, throughout this application, a training round may also be referred to simply as a round.
⑦ The identifier of the information requested for feedback indicates the information that the first device expects (or, equivalently, requests) the second device to feed back. The information requested for feedback includes one or more of communication condition information, computing resource information, or model training information. The second device may subsequently generate the first message based on this identifier; that is, the first message is generated according to the identifier of the information requested for feedback.
It should be noted that the information included in the first message may be part or all of the information requested for feedback by the third message. For example, if the first device requests the second device to feed back communication condition information, computing resource information, and model training information, the identifiers of the information requested for feedback in the third message sent by the first device include: an identifier of the communication condition information (for example, an identifier of the transmission delay or an identifier of the RSRP), an identifier of the computing resource information, and an identifier of the model training information (for example, an identifier of the test set loss value or an identifier of the gradient norm). In this case, according to the third message, the second device may feed back only part of the requested information (that is, one or two of the communication condition information, the computing resource information, or the model training information) or all of it (that is, the communication condition information, the computing resource information, and the model training information).
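The partial-or-full feedback behavior described above can be illustrated with a minimal sketch. The identifier strings, the function name `build_first_message`, and the measurement values are hypothetical and serve only to show that the first message may contain a subset of the requested items.

```python
def build_first_message(requested_ids, available):
    """Return the requested items that the second device is able to report.

    requested_ids: identifiers carried in the third message.
    available: measurements currently held by the second device.
    """
    return {key: available[key] for key in requested_ids if key in available}


# Hypothetical identifiers requested by the first device.
requested = ["transmission_delay", "rsrp", "compute_capability", "test_loss"]
# The second device has no computing resource figure to report in this example.
measurements = {"transmission_delay": 0.02, "rsrp": -95.0, "test_loss": 0.41}

first_message = build_first_message(requested, measurements)
print(sorted(first_message))
```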
⑧ Configuration information for transmitting the first message, which indicates how the second device is to transmit the first message. In a possible implementation, the configuration information includes one or more of a transmission period of the first message, a trigger condition for transmitting the first message, or a number of times the first message is to be transmitted, where:
The transmission period of the first message may be determined based on a time interval value or index, or based on training rounds. Taking a period based on training rounds as an example, when the configuration information indicates that the transmission period of the first message is 3 training rounds, the second device transmits the first message once every 3 training rounds.
The trigger condition for transmitting the first message may include one or more of the following: 1. A trigger condition for immediate transmission of the first message (that is, the second device transmits the first message once when it meets the condition); for example, the gradient norm of the second device is greater than or equal to a gradient norm threshold, the communication bandwidth is less than or equal to a bandwidth threshold, the channel quality (for example, RSRP, RSRQ, SINR, bit error rate, or throughput) is less than or equal to a channel quality threshold, or the computing power is less than or equal to a computing power threshold. 2. A trigger condition for periodic transmission of the first message (that is, when the second device meets the condition, it transmits the first message periodically according to the transmission period of the first message); for example, the gradient norm of the second device is greater than or equal to the gradient norm threshold, the communication bandwidth is less than or equal to the bandwidth threshold, the channel quality is less than or equal to the channel quality threshold, or the computing power is less than or equal to the computing power threshold. 3. A trigger condition for ending periodic transmission of the first message (that is, if the second device meets the condition while it is periodically transmitting the first message, it stops the periodic transmission); for example, the gradient norm of the second device is less than the gradient norm threshold, the communication bandwidth is greater than the bandwidth threshold, the channel quality is greater than the channel quality threshold, or the computing power is greater than the computing power threshold.
The number of times the first message is transmitted indicates the total number of times the second device needs to transmit the first message after receiving the configuration information for transmitting the first message.
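One way the second device might combine the transmission period, a trigger, and the total number of transmissions is sketched below. This is a simplified illustration under stated assumptions: the class and field names are hypothetical, the period is counted in training rounds, only the immediate-transmission trigger is modeled (as a boolean supplied by the caller), and the conditions that start or end periodic transmission are omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FirstMessageConfig:
    period_rounds: Optional[int] = None      # transmit once every N training rounds
    max_transmissions: Optional[int] = None  # total number of transmissions allowed


class FirstMessageScheduler:
    def __init__(self, config):
        self.config = config
        self.sent = 0

    def should_send(self, round_index, triggered=False):
        """Decide whether to transmit the first message in this training round."""
        cfg = self.config
        if cfg.max_transmissions is not None and self.sent >= cfg.max_transmissions:
            return False  # the configured number of transmissions is used up
        periodic = cfg.period_rounds is not None and round_index % cfg.period_rounds == 0
        if periodic or triggered:
            self.sent += 1
            return True
        return False


scheduler = FirstMessageScheduler(FirstMessageConfig(period_rounds=3, max_transmissions=3))
# Assume an immediate trigger (for example, a large gradient norm) fires in round 2.
sent_rounds = [r for r in range(1, 10) if scheduler.should_send(r, triggered=(r == 2))]
print(sent_rounds)  # [2, 3, 6]
```

In this run, round 9 would match the period, but the configured total of three transmissions has already been reached, so nothing is sent.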
S412. The second device sends the first message to the first device according to the third message.
The first message can be understood as a response message to the third message. For example, the measurement identifier in the first message corresponds to (or is the same as) the measurement identifier in the third message, or the model identifier in the first message is the same as the model identifier in the third message, or the identifier of the training round corresponding to the model in the first message is the same as that in the third message.
It should be noted that this application does not limit the specific name of the first message. For example, the first message may be a response message for assisting in updating the quantization resolution. For a detailed description of the first message, refer to the description of the first message in S401 above; details are not repeated here.
S413. The first device determines the quantization resolution according to one or more of the communication condition information, the computing resource information, or the model training information.
S414. The first device sends a second message to the second device, where the second message includes indication information indicating the quantization resolution.
S415. The second device quantizes first AI collaboration data based on the quantization resolution to obtain second AI collaboration data.
S416. The second device sends the second AI collaboration data to the first device.
For the specific implementation of S413 to S416, refer to the description of S402 to S405 above; details are not repeated here.
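Steps S413 to S415 can be illustrated with the following minimal sketch. It rests on several assumptions that are not stated in this passage: the quantization resolution is expressed as a number of bits, the first device selects it from the reported bandwidth alone using a hypothetical 10 Mbit/s threshold, and the second device applies uniform quantization. Other selection policies and quantizers are equally possible.

```python
def choose_quantization_bits(bandwidth_mbps):
    """Hypothetical policy for S413: fewer bits when the reported bandwidth is low."""
    return 8 if bandwidth_mbps >= 10.0 else 4


def quantize(values, bits):
    """S415 sketch: uniform quantization of values onto 2**bits levels."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Recover approximate values at the receiver (the first device)."""
    return [lo + c * step for c in codes]


first_data = [-1.0, -0.5, 0.0, 0.5, 1.0]       # stand-in for first AI collaboration data
bits = choose_quantization_bits(2.0)            # low bandwidth, so 4 bits in this sketch
codes, lo, step = quantize(first_data, bits)    # codes stand in for second AI collaboration data
restored = dequantize(codes, lo, step)
print(bits, codes)
```

With a uniform quantizer of this kind, each restored value differs from the original by at most half a quantization step, so a coarser resolution trades accuracy for a smaller payload.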
S417. The third device sends third AI collaboration data to the first device.
That is, the third device trains the model to obtain its AI collaboration data; the third device then quantizes that data based on its corresponding quantization resolution to obtain the third AI collaboration data, and sends the third AI collaboration data to the first device.
It should be noted that, for the specific content of the AI collaboration data of the third device, refer to the foregoing description of AI collaboration data. For example, the AI collaboration data of the third device includes one or more of the following: gradient data, an AI model, an AI sub-model, an AI model output, or an AI model intermediate value. The quantization resolution of the third device may also be updated using the data processing method provided in FIG. 4a above, which is not specifically limited in this application.
S418. The first device aggregates the second AI collaboration data and the third AI collaboration data to obtain aggregated AI collaboration data.
That is, after receiving the AI collaboration data fed back by all (or some) of the distributed nodes participating in the collaborative learning (for example, the second device and the third device), the first device aggregates this data to obtain the aggregated AI collaboration data.
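This passage does not fix a particular aggregation rule. As one possible illustration, the sketch below assumes an element-wise (optionally weighted) average of the data received from the distributed nodes, in the style commonly used in federated learning; the function name `aggregate` and the sample values are hypothetical.

```python
def aggregate(updates, weights=None):
    """Element-wise weighted average of equally sized update vectors.

    updates: one list of floats per distributed node (already dequantized).
    weights: optional per-node weights; equal weighting is assumed if omitted.
    """
    if weights is None:
        weights = [1.0] * len(updates)
    total = sum(weights)
    length = len(updates[0])
    return [
        sum(w * u[i] for w, u in zip(weights, updates)) / total
        for i in range(length)
    ]


second_data = [1.0, 2.0]  # stand-in for data from the second device
third_data = [3.0, 4.0]   # stand-in for data from the third device
aggregated = aggregate([second_data, third_data])
print(aggregated)  # [2.0, 3.0]
```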
S419. The first device sends the aggregated AI collaboration data to the second device and the third device respectively.
That is, after obtaining the aggregated AI collaboration data, the first device sends it to the distributed nodes participating in the collaborative learning, so that these nodes (including the second device and the third device) each perform model training based on the aggregated AI collaboration data.
It should be noted that, when sending the aggregated AI collaboration data to the distributed nodes participating in the collaborative learning, the first device may also indicate: an identifier of the distributed node (for example, the identifier of the second device or of the third device), which indicates the device receiving the aggregated AI collaboration data; the identifier of the first device, which identifies the device sending the aggregated AI collaboration data; the model identifier, which indicates the AI model to which the aggregated AI collaboration data corresponds; and the identifier of the training round corresponding to the model, which indicates which training round of the model the aggregated AI collaboration data corresponds to. It should also be noted that the aggregated AI collaboration data sent by the first device to the distributed nodes participating in the collaborative learning may be unquantized data (that is, the aggregated data obtained by aggregating the second AI collaboration data and the third AI collaboration data) or quantized data (that is, data obtained by further quantizing that aggregated data at an aggregation quantization resolution). This is not specifically limited in this application. The aggregation quantization resolution is the quantization resolution used to quantize the aggregated AI collaboration data; it may likewise be determined based on one or more of the communication condition information of the first device, the communication condition information of the second device, or the computing resource information of the second device, which is not specifically limited in this application.
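The identifiers that may accompany the aggregated AI collaboration data can be pictured as fields of a message. The sketch below is only an illustration: the class name, field names, and identifier values are hypothetical, and this application does not define a message encoding.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AggregatedDataMessage:
    receiver_id: str   # identifier of the distributed node receiving the data
    sender_id: str     # identifier of the first device
    model_id: str      # AI model to which the aggregated data corresponds
    round_id: int      # training round to which the aggregated data corresponds
    payload: List[float]
    aggregation_bits: Optional[int] = None  # None means the payload is unquantized


def build_messages(receivers, sender_id, model_id, round_id, payload, aggregation_bits=None):
    """Create one message per participating distributed node."""
    return [
        AggregatedDataMessage(r, sender_id, model_id, round_id, list(payload), aggregation_bits)
        for r in receivers
    ]


messages = build_messages(["device2", "device3"], "device1", "model-A", 7, [2.0, 3.0])
print([m.receiver_id for m in messages])
```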
Case 2: The second device is triggered to send the first message to the first device by active triggering.
Refer to FIG. 4c, which is a schematic flowchart of another collaborative learning process provided by this application. As shown in FIG. 4c, the collaborative learning includes the following steps S421 to S428:
S421. The second device sends a first message to the first device, where the first message is used to request the first device to update the quantization resolution of the second device.
That is, when the second device determines that it needs an updated quantization resolution, it actively sends the first message to the first device based on that need. It should be noted that this application does not limit the specific name of the first message. For example, the first message may be a request message for updating the quantization resolution. For a detailed description of the first message, refer to the description of the first message in S401 above; details are not repeated here.
For example, device 1 and device 2 are devices performing a federated learning task, where device 1 is the server of the task (the first device) and device 2 is the client of the task (the second device). While performing the federated learning task, device 2 feeds back AI collaboration data to device 1. During a given training round of the federated learning, if device 2 detects that one or more of the following conditions are met, it can be considered to need an updated quantization resolution, and device 2 actively sends the first message to device 1: the gradient norm of the AI model is greater than or equal to the gradient norm threshold, the communication bandwidth of device 2 is less than or equal to the bandwidth threshold, the channel quality of device 2 is less than or equal to the channel quality threshold, or the computing power of device 2 is less than or equal to the computing power threshold.
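The check performed by device 2 in this example can be sketched as follows. The dictionary keys, units, and threshold values are assumptions chosen for illustration; the comparison directions follow the conditions listed above.

```python
def needs_resolution_update(status, thresholds):
    """Return True if any of the listed conditions is met by device 2."""
    return (
        status["grad_norm"] >= thresholds["grad_norm"]
        or status["bandwidth"] <= thresholds["bandwidth"]
        or status["channel_quality"] <= thresholds["channel_quality"]
        or status["compute"] <= thresholds["compute"]
    )


# Hypothetical thresholds and measurements for one training round.
thresholds = {"grad_norm": 5.0, "bandwidth": 10.0, "channel_quality": -100.0, "compute": 1.0}
healthy = {"grad_norm": 1.2, "bandwidth": 50.0, "channel_quality": -80.0, "compute": 4.0}
congested = dict(healthy, bandwidth=8.0)  # bandwidth drops below the threshold

print(needs_resolution_update(healthy, thresholds))
print(needs_resolution_update(congested, thresholds))
```

When the check returns True, device 2 would send the first message without waiting for a third message from device 1.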
S422. The first device determines the quantization resolution according to one or more of the communication condition information, the computing resource information, or the model training information.
S423. The first device sends a second message to the second device, where the second message includes indication information indicating the quantization resolution.
S424. The second device quantizes first AI collaboration data based on the quantization resolution to obtain second AI collaboration data.
S425. The second device sends the second AI collaboration data to the first device.
S426. The third device sends third AI collaboration data to the first device.
S427. The first device aggregates the second AI collaboration data and the third AI collaboration data to obtain aggregated AI collaboration data.
S428. The first device sends the aggregated AI collaboration data to the second device and the third device respectively.
For the specific implementation of S422 to S428, refer to the description of S413 to S419 above; details are not repeated here.
It can be understood that, to implement the functions in the foregoing embodiments, a device (including the first device or the second device described above) includes hardware structures and/or software modules corresponding to each function. A person skilled in the art should readily appreciate that, in combination with the units and method steps of the examples described in the embodiments disclosed in this application, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application scenario and design constraints of the technical solution.
FIG. 5 and FIG. 6 are schematic structural diagrams of possible communication apparatuses provided by embodiments of this application. These communication apparatuses can be used to implement the functions of the devices in the foregoing method embodiments, and can therefore also achieve the beneficial effects of those method embodiments. In the embodiments of this application, the communication apparatus may be the first device in FIG. 4a to FIG. 4c or a module (such as a chip) applied to the first device, or the communication apparatus may be the second device in FIG. 4a to FIG. 4c or a module (such as a chip) applied to the second device.
As shown in FIG. 5, the communication apparatus 500 includes a processing unit 510 and a transceiver unit 520. The communication apparatus 500 is used to implement the function of the first device or the function of the second device in the method embodiments shown in FIG. 4a to FIG. 4c above.
When the communication apparatus 500 is used to implement the function of the second device in the method embodiments shown in FIG. 4a to FIG. 4c: the transceiver unit 520 is configured to send a first message to the first device, where the first message includes one or more of communication condition information of the second device, computing resource information of the second device, or model training information of the second device; the transceiver unit 520 is further configured to receive a second message from the first device, where the second message includes indication information indicating a quantization resolution, and the quantization resolution is related to one or more of the communication condition information, the computing resource information, or the model training information; the processing unit 510 is configured to quantize first artificial intelligence (AI) collaboration data based on the quantization resolution to obtain second AI collaboration data; and the transceiver unit 520 is further configured to send the second AI collaboration data to the first device.
In a possible implementation, the communication condition information includes one or more of communication bandwidth information, transmission delay, reference signal received power (RSRP), reference signal received quality (RSRQ), signal to interference plus noise ratio (SINR), bit error rate, or throughput; the computing resource information includes one or more of the size of computing resources, the size of computing capability, or computing delay; and the model training information includes one or more of a test set loss value, a training set loss value, a gradient norm, a gradient magnitude, or a model size.
In a possible implementation, the transceiver unit 520 is further configured to receive a third message from the first device, where the third message is used to request assistance in updating the quantization resolution; the first message is a response message corresponding to the third message.
In a possible implementation, the third message includes one or more of the following: a measurement identifier, an identifier of the second device, an identifier of the first device, an identifier of the model, indication information indicating the reason for updating the quantization resolution, an identifier of the training round corresponding to the model, an identifier of the information requested for feedback, or configuration information for transmitting the first message; where the information requested for feedback includes one or more of communication condition information, computing resource information, or model training information, and the first message is generated according to the identifier of the information requested for feedback.
In a possible implementation, the indication information indicating the reason for updating the quantization resolution includes the time difference between a first delay and an average delay, or the ratio of the first delay to the average delay; where the first delay is the time the first device waits for the second device to feed back AI collaboration data, and the average delay is the mean of the times the first device waits for multiple distributed nodes, including the second device, to feed back AI collaboration data.
In a possible implementation, the configuration information for transmitting the first message includes one or more of a transmission period of the first message, a trigger condition for transmitting the first message, or a number of times the first message is to be transmitted.
In a possible implementation, the first message further includes one or more of the following: a measurement identifier, an identifier of the second device, an identifier of the first device, an identifier of the model, or an identifier of the training round corresponding to the model.
For a more detailed description of the transceiver unit 520 and the processing unit 510, refer to the related description of the second device in the method embodiments shown in FIG. 4a to FIG. 4c.
As shown in FIG. 5, the communication apparatus 500 includes a processing unit 510 and a transceiver unit 520. The communication apparatus 500 is used to implement the function of the first device in the method embodiments shown in FIG. 4a to FIG. 4c above.
When the communication apparatus 500 is used to implement the function of the first device in the method embodiments shown in FIG. 4a to FIG. 4c: the transceiver unit 520 is configured to receive a first message from the second device, where the first message includes one or more of communication condition information of the second device, computing resource information of the second device, or model training information of the second device; the processing unit 510 is configured to determine a quantization resolution based on one or more of the communication condition information, the computing resource information, or the model training information; the transceiver unit 520 is further configured to send a second message to the second device, where the second message includes indication information indicating the quantization resolution, and the quantization resolution is related to one or more of the communication condition information, the computing resource information, or the model training information; and the transceiver unit 520 is further configured to receive second artificial intelligence (AI) collaboration data from the second device, where the second AI collaboration data is obtained by quantizing first AI collaboration data based on the quantization resolution.
In a possible implementation, the communication condition information includes one or more of communication bandwidth information, transmission delay, reference signal received power (RSRP), reference signal received quality (RSRQ), signal to interference plus noise ratio (SINR), bit error rate, or throughput; the computing resource information includes one or more of the size of computing resources, the size of computing capability, or computing delay; and the model training information includes one or more of a test set loss value, a training set loss value, a gradient norm, a gradient magnitude, or a model size.
In a possible implementation, the transceiver unit 520 is further configured to send a third message to the second device, where the third message is used to request assistance in updating the quantization resolution; the first message is a response message corresponding to the third message.
In a possible implementation, the third message includes one or more of the following: a measurement identifier, an identifier of the second device, an identifier of the first device, a model identifier, indication information indicating the reason for updating the quantization resolution, an identifier of the training round corresponding to the model, an identifier of the information requested for feedback, and/or configuration information for transmitting the first message; where the information requested for feedback includes one or more of communication condition information, computing resource information, or model training information, and the first message is generated according to the identifier of the information requested for feedback.
In a possible implementation, the indication information indicating the reason for updating the quantization resolution includes the time difference between a first delay and an average delay, or the ratio of the first delay to the average delay; where the first delay is the time the first device waits for the second device to feed back AI collaboration data, and the average delay is the mean of the times the first device waits for multiple distributed nodes, including the second device, to feed back AI collaboration data.
In a possible implementation, the configuration information for transmitting the first message includes one or more of a transmission period of the first message, a trigger condition for transmitting the first message, or a number of times the first message is to be transmitted.
In a possible implementation, the first message further includes one or more of the following: a measurement identifier, an identifier of the second device, an identifier of the first device, an identifier of the model, or an identifier of the training round corresponding to the model.
For a more detailed description of the transceiver unit 520 and the processing unit 510, refer to the related description of the first device in the method embodiments shown in FIG. 4a to FIG. 4c.
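The division of work between the processing unit 510 and the transceiver unit 520 on the first-device side can be pictured with the following sketch. It is an illustration only: the class names, the in-memory outbox standing in for a radio interface, the dictionary message format, and the bandwidth-based selection policy are all assumptions and are not specified by this application.

```python
class TransceiverUnit:
    """In-memory stand-in for the transceiver unit 520."""

    def __init__(self):
        self.outbox = []

    def send(self, message):
        self.outbox.append(message)


class ProcessingUnit:
    """Stand-in for the processing unit 510 of the first device."""

    def determine_resolution(self, first_message):
        # Hypothetical policy: use 4 bits when the reported bandwidth is low.
        return 4 if first_message.get("bandwidth_mbps", 0.0) < 10.0 else 8


class FirstDeviceApparatus:
    def __init__(self):
        self.transceiver = TransceiverUnit()
        self.processing = ProcessingUnit()

    def on_first_message(self, first_message):
        """Handle a received first message and reply with a second message."""
        bits = self.processing.determine_resolution(first_message)
        second_message = {"quantization_bits": bits}
        self.transceiver.send(second_message)
        return second_message


apparatus = FirstDeviceApparatus()
reply = apparatus.on_first_message({"bandwidth_mbps": 2.0})
print(reply)
```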
As shown in FIG. 6, the communication apparatus 600 includes a processor 610 and an interface circuit 620, which are coupled to each other. It can be understood that the interface circuit 620 may be a transceiver or an input/output interface. Optionally, the communication apparatus 600 may further include a memory 630, configured to store instructions executed by the processor 610, input data required by the processor 610 to run the instructions, or data generated after the processor 610 runs the instructions.
When the communication apparatus 600 is used to implement the methods shown in FIG. 4a to FIG. 4c, the processor 610 is configured to implement the function of the processing unit 510, and the interface circuit 620 is configured to implement the function of the transceiver unit 520.
当上述通信装置为应用于第一设备的芯片时,该芯片实现上述方法实施例中第一设备的功能。该芯片接收来自第二设备的信息,可以理解为该信息是先由第一设备中的其它模块(如射频模块或天线)接收到的,然后再由这些模块发送给芯片。该芯片向第二设备发送信息,可以理解为该信息是先发送给第一设备中的其它模块(如射频模块或天线),然后再由这些模块向第二设备发送。When the above-mentioned communication device is a chip applied to the first device, the chip implements the functions of the first device in the above-mentioned method embodiment. The chip receives information from the second device, which can be understood as the information is first received by other modules in the first device (such as a radio frequency module or an antenna), and then sent to the chip by these modules. The chip sends information to the second device, which can be understood as the information is first sent to other modules in the first device (such as a radio frequency module or an antenna), and then sent to the second device by these modules.
When the communication device is a chip applied to the second device, the chip implements the functions of the second device in the foregoing method embodiments. That the chip receives information from the first device can be understood as follows: the information is first received by other modules in the second device (such as a radio frequency module or an antenna) and is then sent by these modules to the chip. That the chip sends information to the first device can be understood as follows: the information is first sent to other modules in the second device (such as a radio frequency module or an antenna) and is then sent by these modules to the first device.
In this application, that entity A sends information to entity B may mean that A sends the information directly to B, or that A sends the information to B indirectly through other entities. Similarly, that entity B receives information from entity A may mean that B directly receives the information sent by A, or that B indirectly receives the information sent by A through other entities. Entities A and B here may be RAN nodes or terminals, or modules inside a RAN node or a terminal. The sending and receiving of information may be information exchange between a RAN node and a terminal, for example, between a base station and a terminal. It may also be information exchange between two RAN nodes, for example, between a CU and a DU. It may also be information exchange between different modules inside one apparatus, for example, between a terminal chip and other modules of the terminal, or between a base station chip and other modules in the base station.
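The distinction between direct and indirect sending can be sketched as follows. This is only an illustration under stated assumptions: the `Entity` class, the `send` function, and its `via` parameter are hypothetical names, and the relays are assumed to forward the information unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A RAN node, a terminal, or a module inside one (hypothetical model)."""
    name: str
    inbox: list = field(default_factory=list)


def send(sender, receiver, info, via=()):
    """Deliver `info` from `sender` to `receiver`, optionally through relays.

    Returns the list of entity names the information passed through.
    """
    path = [sender.name]
    for relay in via:
        # Each relay forwards the information unchanged (assumption).
        path.append(relay.name)
    # The receiver sees the original sender, whether or not relays were used.
    receiver.inbox.append((sender.name, info))
    path.append(receiver.name)
    return path


# Direct sending: base station to terminal.
base_station = Entity("base station")
terminal = Entity("terminal")
direct_path = send(base_station, terminal, "config")

# Indirect sending: the terminal chip receives through the radio frequency module.
rf_module = Entity("radio frequency module")
chip = Entity("terminal chip")
indirect_path = send(base_station, chip, "config", via=[rf_module])
```

In both cases the receiving entity records the same sender and the same information; only the path differs, which is the sense in which sending "directly" and "indirectly" are treated alike in the paragraph above.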
It can be understood that the processor in the embodiments of this application may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The general-purpose processor may be a microprocessor or any conventional processor.
The method steps in the embodiments of this application may be implemented in hardware, or in software instructions executable by a processor. The software instructions may consist of corresponding software modules, and the software modules may be stored in a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an erasable programmable read-only memory, an electrically erasable programmable read-only memory, a register, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor, so that the processor can read information from the storage medium and write information to the storage medium. The storage medium may also be a component of the processor. The processor and the storage medium may be located in an ASIC. In addition, the ASIC may be located in a base station or a terminal. The processor and the storage medium may also exist in a base station or a terminal as discrete components.
The foregoing embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer programs or instructions are loaded and executed on a computer, the processes or functions described in the embodiments of this application are performed in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment, or another programmable apparatus. The computer programs or instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another. For example, the computer programs or instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless manner. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium, for example, a floppy disk, a hard disk, or a magnetic tape; it may be an optical medium, for example, a digital video disc; or it may be a semiconductor medium, for example, a solid-state drive. The computer-readable storage medium may be a volatile or non-volatile storage medium, or may include both volatile and non-volatile types of storage media.
In the embodiments of this application, unless otherwise specified and in the absence of a logical conflict, the terms and/or descriptions in different embodiments are consistent and may be cross-referenced, and the technical features in different embodiments may be combined according to their inherent logical relationships to form new embodiments.
In this application, "at least one" means one or more, and "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate that only A exists, that both A and B exist, or that only B exists, where A and B may each be singular or plural. In the text of this application, the character "/" generally indicates an "or" relationship between the associated objects before and after it; in the formulas of this application, the character "/" indicates a division relationship between the objects before and after it. "Including at least one of A, B, and C" may indicate: including A; including B; including C; including A and B; including A and C; including B and C; or including A, B, and C.
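The enumeration above corresponds to listing every non-empty combination of the named items. The following sketch, in which the function name `at_least_one` is hypothetical, reproduces the seven cases for A, B, and C and the three cases for "A and/or B" using the standard-library `itertools.combinations`.

```python
from itertools import combinations


def at_least_one(items):
    """Return every non-empty combination of `items`, smallest combinations first."""
    items = list(items)
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


abc_cases = at_least_one(["A", "B", "C"])
# [('A',), ('B',), ('C',), ('A', 'B'), ('A', 'C'), ('B', 'C'), ('A', 'B', 'C')]

and_or_cases = at_least_one(["A", "B"])
# [('A',), ('B',), ('A', 'B')]
```

In general, "at least one of" n items covers 2^n - 1 cases, which gives 7 for three items and 3 for two.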
It can be understood that the various numerals used in the embodiments of this application are merely for ease of distinction in the description and are not intended to limit the scope of the embodiments of this application. The sequence numbers of the foregoing processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic.
Claims (18)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311246336.1A CN119697040A (en) | 2023-09-25 | 2023-09-25 | A data processing method and a communication device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025066789A1 (en) | 2025-04-03 |
Family
ID=95041550
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2024/116032 Pending WO2025066789A1 (en) | 2023-09-25 | 2024-08-30 | Data processing method and communication apparatus |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN119697040A (en) |
| WO (1) | WO2025066789A1 (en) |
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111401552A (en) * | 2020-03-11 | 2020-07-10 | 浙江大学 | A Federated Learning Method and System Based on Adjusting Batch Size and Gradient Compression Ratio |
| CN115280335A (en) * | 2020-03-24 | 2022-11-01 | Oppo广东移动通信有限公司 | Machine learning model training method, electronic device and storage medium |
| US20220245527A1 (en) * | 2021-02-01 | 2022-08-04 | Qualcomm Incorporated | Techniques for adaptive quantization level selection in federated learning |
| CN115174397A (en) * | 2022-07-28 | 2022-10-11 | 河海大学 | Federated edge learning training method and system combining gradient quantization and bandwidth allocation |
Also Published As
| Publication number | Publication date |
|---|---|
| CN119697040A (en) | 2025-03-25 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US11838787B2 (en) | Functional architecture and interface for non-real-time ran intelligent controller | |
| CN113873538A (en) | Model data transmission method and communication device | |
| EP4580230A1 (en) | Communication method and apparatus | |
| WO2024169522A1 (en) | Communication method and apparatus | |
| WO2024051789A1 (en) | Beam management method | |
| WO2025066789A1 (en) | Data processing method and communication apparatus | |
| WO2024026846A1 (en) | Artificial intelligence model processing method and related device | |
| WO2025060349A1 (en) | Methods, devices, and computer readable medium for artificial intelligence (ai) service | |
| US20250148370A1 (en) | Method and apparatus for intelligent operating of communication system | |
| EP4590017A1 (en) | Communication method and apparatus | |
| EP4657815A1 (en) | Communication method and communication apparatus | |
| WO2025209121A1 (en) | Data format indication method and related product | |
| US20250373510A1 (en) | Communication method, communication apparatus, and communication system | |
| CN119922528A (en) | A communication method and device | |
| WO2025124135A1 (en) | Communication method, and apparatus | |
| TW202533603A (en) | Communication method, communication apparatus, and communication system | |
| WO2025228310A1 (en) | Communication method and related apparatus | |
| WO2025124143A1 (en) | Model training method and communication apparatus | |
| EP4657907A1 (en) | Communication method, communication apparatus, and communication system | |
| WO2025140003A1 (en) | Communication method and communication apparatus | |
| WO2025039276A1 (en) | Method for model transmission, and communication device | |
| CN119856416A (en) | Method and apparatus for wireless communication | |
| WO2024255785A1 (en) | Communication method and communication apparatus | |
| CN118193001A (en) | Model updating method, device, equipment and storage medium | |
| JP2025529936A (en) | Communication method and communication device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24870335; Country of ref document: EP; Kind code of ref document: A1 |