WO2025246847A1 - Communication method and related apparatus
- Publication number
- WO2025246847A1 (PCT/CN2025/093589)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- model
- message
- network device
- terminal device
- network
- Legal status
- Pending
Description
This application claims priority to Chinese Patent Application No. 202410704253.0, filed with the China National Intellectual Property Administration on May 31, 2024 and entitled "Communication Method and Related Apparatus", the entire contents of which are incorporated herein by reference.
This application relates to the field of communication technology, and in particular to a communication method and related apparatus.
With the development of artificial intelligence (AI) technology, AI models are gradually evolving from traditional small-scale neural network models to large-scale, Transformer-based neural network models. At the same time, terminal applications based on large models are constantly emerging, such as question answering systems, text-to-image and text-to-video generation, and various other natural language processing and multimodal tasks.
Large models, with their massive numbers of parameters and complex computational structures, place higher demands on the computing and storage capabilities of devices than traditional models do. Large models are typically stored and deployed on application servers. When a user needs to use a terminal application based on such a model, the user terminal uploads data to the application server over the network, and the application server performs the computation and returns the result to the user. In this process, each time the user uses the terminal application and a model inference task is executed, data and results are transmitted between the user terminal, the network, and the application server, resulting in high transmission latency.
Embodiments of this application provide a communication method and related apparatus that can reduce the transmission latency incurred when a terminal obtains model inference results.
According to a first aspect, embodiments of this application provide a communication method applied to a terminal device. It can be understood that the method may be performed by a communication apparatus, which may be the terminal device itself, or a chip (system) or circuit used in the terminal device; this application does not limit this. The method includes:
sending a first message to a network device, where the first message is used to request the network device to deploy a first model, and the first message includes identification information of the first model and resource information of the terminal device;
receiving a second message from the network device, where the second message includes first indication information or second indication information, the first indication information indicates that the network device independently performs an inference task of the first model, and the second indication information indicates that the terminal device and the network device jointly perform the inference task and that the terminal device and the network device respectively deploy a first sub-model and a second sub-model of the first model.
In this application, the first model may include an artificial intelligence model or a machine learning model. The identification information of the first model can be understood as information that indicates and identifies the first model, and may specifically include at least one of the following: identifier information of the first model, information indicating how the first model can be obtained, and so on. The resource information of the terminal device may include storage resource information and computing resource information of the terminal device, for example, the available memory of the terminal device chip and the computing capability of the terminal device chip.
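For concreteness, the following is a minimal sketch of how the first message and the terminal resource information described above might be represented. All field names (for example, model_id and available_memory_mb) are illustrative assumptions and are not defined by this application.

```python
# Illustrative sketch only; field names are assumptions, not defined by this application.
from dataclasses import dataclass
from typing import Optional


@dataclass
class TerminalResourceInfo:
    available_memory_mb: int          # available memory of the terminal device chip
    compute_capability_tops: float    # computing capability of the terminal device chip


@dataclass
class FirstMessage:
    """Request asking the network device to deploy the first model."""
    model_id: str                       # identification information of the first model
    model_source_url: Optional[str]     # optional hint on how the first model can be obtained
    terminal_resources: TerminalResourceInfo
```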
Based on information such as the parameter scale and model structure of the first model, together with resource information of the terminal device such as its storage resources and computing resources, the network device can determine whether the first model needs to be split. If splitting is required, the network device determines the first sub-model and the second sub-model obtained from the split, and generates the first indication information or the second indication information accordingly.
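A hedged sketch of this decision logic is shown below. The memory-budget threshold and the layer-wise split heuristic are assumptions made for illustration only; the application does not prescribe a specific splitting algorithm.

```python
# Illustrative decision sketch; the safety margin and layer-wise heuristic are assumptions.
def decide_deployment(layer_sizes_mb: list[float], terminal_mem_mb: float,
                      safety_margin: float = 0.8):
    """Return ("network_only", None) or ("joint", split_index)."""
    budget = terminal_mem_mb * safety_margin
    used, split_index = 0.0, 0
    # Greedily assign leading layers to the terminal device (the first sub-model).
    for size in layer_sizes_mb:
        if used + size > budget:
            break
        used += size
        split_index += 1
    if split_index == 0:
        return "network_only", None   # corresponds to the first indication information
    return "joint", split_index       # corresponds to the second indication information
```

For example, decide_deployment([400, 400, 400, 400], terminal_mem_mb=1000) would place the first two layers on the terminal device and leave the remaining layers on the network device.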
In this application, the terminal device receives the first indication information or the second indication information from the network device. Because this indication information is derived by the network device from the first model and the resource information of the terminal device, and it can indicate that the inference task is performed independently by the network device or jointly by the terminal device and the network device, the inference result obtained by the terminal device based on this indication information does not come from the service device where the first model resides. The data required for inference also no longer needs to be transmitted to the service device, so transmission latency is reduced in the process in which the terminal device obtains inference results using the first model; when multiple inferences are performed, the reduction in transmission latency is substantial. This communication method supports distributed deployment of large models, allowing a large model to be transferred to the network device or deployed across the network device and the terminal device, and supports multiple inference modes, such as independent inference by the network device and joint inference in which the network device and the terminal device each run their own sub-model, thereby expanding the model scale that the terminal can support.
In one possible implementation, before the sending a first message to a network device, the method further includes:
determining, based on the first model and the resource information of the terminal device, whether the terminal device can independently perform the inference task;
and the sending a first message to a network device includes:
sending the first message to the network device if the terminal device cannot independently perform the inference task.
In this implementation, the terminal device determines, based on information such as the parameter scale of the first model together with its own storage resource information and computing resource information, whether it can independently perform the inference task of the first model. If it determines that it cannot, the terminal device may request the network device to deploy the first model and determine whether part of the model should be split off and deployed on the terminal device. In this way, the inference task can be performed independently by the network side or collaboratively by the network side and the terminal side, the inference result no longer needs to be obtained from the service device, and transmission latency is reduced.
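A minimal sketch of this capability check follows, under the assumption that the check simply compares the model's memory footprint and compute requirement against the terminal device's available resources; the application does not fix the exact criterion, and send_first_message is a hypothetical transport helper passed in by the caller.

```python
# Illustrative capability check; the comparison criterion is an assumption.
def can_run_locally(model_size_mb: float, required_tops: float,
                    terminal_mem_mb: float, terminal_tops: float) -> bool:
    """True if the terminal device can independently perform the inference task."""
    return model_size_mb <= terminal_mem_mb and required_tops <= terminal_tops


def maybe_request_deployment(model, terminal, send_first_message):
    """Request deployment on the network device only when local inference is infeasible."""
    if not can_run_locally(model.size_mb, model.required_tops,
                           terminal.available_memory_mb, terminal.compute_capability_tops):
        send_first_message(model.model_id, terminal)  # hypothetical helper supplied by the caller
```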
In one possible implementation, before the determining, based on the first model and the resource information of the terminal device, whether the terminal device can independently perform the inference task, the method further includes:
receiving a third message, where the third message includes the first model and the identification information of the first model.
In a scenario in which the transfer and deployment of the first model is triggered by the terminal device, the third message may be delivered by the service device where the first model resides. After receiving the third message, the terminal device can determine, based on the first model and its own resource information, whether it can independently complete the inference task. In this way, when the terminal device's own resources are insufficient, the first model can be partially or completely deployed to the network device, reducing the transmission latency for obtaining subsequent inference results. When the first model is deployed across the network device and the terminal device, the resources of the terminal device can also be utilized, improving its resource utilization and further reducing transmission latency.
In one possible implementation, before the sending a first message to a network device, the method further includes:
receiving a fourth message, where the fourth message includes identification information of a first network element in the network device that performs the inference task and the identification information of the first model;
and the sending a first message to a network device includes:
sending the first message to the network device in response to the fourth message, where the network device is the first network element.
The first network element may be an MIF in the RAN domain, an NWDAF in the core network, an EMS, an NMS, or the like. The identification information of the first network element may include at least one of the following: an IP address of the first network element; an identifier of the subnet where the first network element is located together with an identifier of the first network element that is unique within that subnet; ID information that uniquely identifies the first network element across the entire network; and so on.
In this implementation, after receiving the fourth message, the terminal device can send its own resource information to the network device, so that the network device generates the first indication information or the second indication information based on the resource information of the terminal device and the first model, making clear whether the inference task of the first model is subsequently performed by the network device alone or by the network device and the terminal device together. The first model is thus deployed wholly or partially in the network device, and the terminal device no longer needs to obtain inference results from the service device, which reduces transmission latency and improves service efficiency.
In one possible implementation, the method further includes:
sending a fifth message to the service device where the first model is located, where the fifth message includes the first indication information or the second indication information.
In this implementation, after receiving the second message from the network device, the terminal device sends the fifth message to the service device. This feeds back to the service device that the first model has been transferred to and deployed in the network device, and indicates the entity that will subsequently perform the inference task and the first network element that actually performs it. The service device thereby learns the relevant information and no longer needs to perform the inference task of the first model, so the inference result does not need to be obtained from the service device, reducing transmission latency.
In one possible implementation, the second message further includes identification information of the first network element in the network device that performs the inference task.
In one possible implementation, the method further includes:
sending a sixth message to the network device if the second message includes the first indication information, where the sixth message is used to request the network device to independently perform the inference task, and the sixth message includes first data that needs to be input to perform the inference task;
receiving a first inference result, where the first inference result is obtained based on the first data.
In this implementation, the terminal device sends the first data to the network device, so that the network device performs inference based on the first data and the first model, obtains the first inference result, and feeds it back to the terminal device. In this process the network device completes the inference task on its own, without the data being sent to the service device or the inference result being obtained from the service device, which reduces the transmission latency of the entire inference process.
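On the terminal side, the exchange described above could look roughly like the following; the message fields and the send/receive interface are placeholders assumed for illustration.

```python
# Illustrative terminal-side flow for network-only inference; message fields are assumptions.
def run_network_only_inference(network, first_data):
    """Send the sixth message carrying the input data and wait for the first inference result."""
    sixth_message = {"type": "inference_request", "mode": "network_only", "input": first_data}
    network.send(sixth_message)        # the network device runs the full first model
    first_result = network.receive()   # first inference result computed from first_data
    return first_result
```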
In one possible implementation, the method further includes:
sending a seventh message to the network device if the second message includes the second indication information, where the seventh message is used to request that the terminal device and the network device jointly perform the inference task, and the seventh message includes second data; the second data is obtained by the terminal device by performing inference on first data based on the first sub-model, or the second data is obtained by preprocessing the first data, where the first data is data that needs to be input to perform the inference task.
In this implementation, the terminal device sends the second data to the network device, and the terminal device and the network device jointly perform the inference task to obtain the inference result. There is no need to send data to the service device or to obtain the inference result from the service device, which reduces the transmission latency of the entire inference process.
In one possible implementation, the method further includes:
performing inference on the second data based on the first sub-model to obtain third data, in a case in which the second indication information indicates that the inference result obtained by the network device based on the second sub-model is an intermediate-layer feature;
receiving fourth data from the network device, where the fourth data is obtained by the network device by performing inference on the second data based on the second sub-model;
obtaining a second inference result based on the third data and the fourth data.
In this implementation, the terminal device and the network device determine, based on the second indication information, the specific procedure for jointly performing the inference task, and perform partial inference using the first sub-model and the second sub-model respectively to obtain the final inference result. This eliminates the need to send data to the service device and to obtain the inference result from the service device, reducing the transmission latency of the entire inference process. Because the terminal device participates in inference, its resources can be utilized, improving resource utilization. Moreover, the terminal device and the network device each perform their part of the inference independently, which shortens the time required for inference and improves inference efficiency.
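The following is a minimal sketch of this joint, parallel inference from the terminal device's point of view. The fusion step (a simple concatenation here) is an assumption, since the application does not specify how the third data and the fourth data are combined into the second inference result.

```python
# Illustrative joint inference; the fusion of the partial results is an assumption.
import numpy as np


def joint_inference(first_sub_model, network, second_data):
    """Terminal and network each infer on second_data; the partial results are then fused."""
    network.send({"type": "joint_inference", "input": second_data})  # seventh message
    third_data = first_sub_model(second_data)   # terminal-side partial inference
    fourth_data = network.receive()             # intermediate-layer feature from the second sub-model
    # Simple concatenation is only one possible way to obtain the second inference result.
    return np.concatenate([third_data, fourth_data], axis=-1)
```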
In one possible implementation, the first model includes an artificial intelligence model or a machine learning model.
In the embodiments of this application, distributed deployment of large models can be realized, and multiple inference modes such as network-side inference and network-side/terminal-side collaborative inference are supported, which can reduce transmission latency.
According to a second aspect, embodiments of this application provide a communication method applied to a network device. It can be understood that the method may be performed by a communication apparatus, which may be the network device itself, or a chip (system) or circuit used in the network device; this application does not limit this. The method includes:
receiving a first message from a terminal device, where the first message is used to request the network device to deploy a first model, and the first message includes identification information of the first model and resource information of the terminal device;
generating a second message based on the first model and the resource information of the terminal device, where the second message includes first indication information or second indication information, the first indication information indicates that the network device independently performs an inference task of the first model, and the second indication information indicates that the terminal device and the network device jointly perform the inference task and that the terminal device and the network device respectively deploy a first sub-model and a second sub-model of the first model;
sending the second message to the terminal device.
In this application, the network device generates the first indication information or the second indication information based on the first model and the resource information of the terminal device, so that the first model is deployed in the network device or across the network device and the terminal device. Accordingly, the inference task is performed independently by the network device or jointly by the terminal device and the network device, and neither the data required for inference nor the obtained inference results need to be transmitted to the service device where the first model resides. Transmission latency is thereby reduced in the process in which the terminal device obtains inference results using the first model, and when multiple inferences are performed, the reduction is substantial.
In one possible implementation, before the receiving a first message from a terminal device, the method further includes:
sending a fourth message to the terminal device, where the fourth message is used to obtain the resource information of the terminal device, and the fourth message includes identification information of a first network element in the network device that performs the inference task and the identification information of the first model.
In this implementation, the network device sends the fourth message to the terminal device, which prompts the terminal device to send its own resource information to the network device. The network device then generates the first indication information or the second indication information based on the resource information of the terminal device and the first model, making clear whether the inference task of the first model is subsequently performed by the network device alone or by the network device and the terminal device together. The first model is thus deployed wholly or partially in the network device, so the terminal device no longer needs to obtain inference results from the service device, which reduces transmission latency and improves service efficiency.
In one possible implementation, before the receiving a first message from a terminal device, the method further includes: sending an eighth message to the service device where the first model is located, where the eighth message includes identification information of the first network element in the network device that performs the inference task.
In this implementation, the first network element is determined in advance by the network device. By feeding back the identification information of the first network element to the service device, the network device enables the service device to learn which network element in the network device handles the deployment of the first model and its inference. The service device therefore does not need to perform the inference task of the first model subsequently, which reduces inference-related transmission latency.
In one possible implementation, the method further includes: receiving a third message, where the third message includes the first model and the identification information of the first model.
In this implementation, after receiving the third message, the network device can store or invoke the first model, obtain the resource information of the terminal device, and generate the first indication information or the second indication information, so that the first model is partially or completely deployed in the network device and the transmission latency for obtaining subsequent inference results is reduced. When the first model is deployed across the network device and the terminal device, the resources of the terminal device can also be utilized, improving the resource utilization of the terminal device and further reducing transmission latency.
In one possible implementation, the first message further includes the first model.
In one possible implementation, the method further includes:
determining, based on the identification information of the first model, whether the first model is already stored in the network device;
storing the first model if the first model is not stored in the network device;
invoking the first model if the first model is already stored in the network device;
determining a first network element in the network device that performs the inference task of the first model.
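The store-or-invoke logic in the implementation above might be sketched as follows; the cache interface and the way the first network element is chosen (largest free memory) are assumptions for illustration.

```python
# Illustrative store-or-invoke logic; the cache interface and element selection are assumptions.
def prepare_model(model_cache: dict, model_id: str, fetch_model, candidate_elements: list):
    """Return the (possibly newly stored) first model and the network element chosen to run it."""
    if model_id not in model_cache:
        model_cache[model_id] = fetch_model(model_id)   # store the first model
    first_model = model_cache[model_id]                 # invoke the already stored first model
    # Choose the first network element (e.g., an NWDAF or MIF instance) with the most free memory.
    first_network_element = max(candidate_elements, key=lambda e: e.free_memory_mb)
    return first_model, first_network_element
```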
In one possible implementation, the method further includes:
receiving a sixth message from the terminal device if the second message includes the first indication information, where the sixth message is used to request the network device to independently perform the inference task, and the sixth message includes first data that needs to be input to perform the inference task;
independently performing the inference task based on the first data to obtain a first inference result;
sending the first inference result to the terminal device.
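On the network side, handling the sixth message might look like the following sketch; the message format mirrors the earlier illustrative examples and is likewise an assumption.

```python
# Illustrative network-side handler for independent inference; message fields are assumptions.
def handle_sixth_message(first_model, message, reply):
    """Run the full first model on the received first data and return the first inference result."""
    first_data = message["input"]
    first_result = first_model(first_data)   # the network device performs the inference task alone
    reply({"type": "inference_result", "result": first_result})
```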
In one possible implementation, the method further includes:
receiving a seventh message from the terminal device if the second message includes the second indication information, where the seventh message is used to request that the terminal device and the network device jointly perform the inference task, and the seventh message includes second data; the second data is obtained by the terminal device by performing inference on first data based on the first sub-model, or the second data is obtained by the terminal device by preprocessing the first data, where the first data is data that needs to be input to perform the inference task.
In one possible implementation, the method further includes:
performing inference on the second data based on the second sub-model to obtain fourth data;
sending the fourth data to the terminal device.
In this embodiment, the network device determines, based on the second indication information, the specific procedure for jointly performing the inference task, and the network device and the terminal device perform partial inference using the second sub-model and the first sub-model respectively to obtain the final inference result. This eliminates the need to send data to the service device and to obtain the inference result from the service device, reducing the transmission latency of the entire inference process. Because the terminal device participates in inference, its resources can be utilized, improving resource utilization. Moreover, the terminal device and the network device each perform their part of the inference independently, which shortens the time required for inference and improves inference efficiency.
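A corresponding sketch of the network-side part of joint inference is given below; as before, the message structure is assumed for illustration only.

```python
# Illustrative network-side handler for joint inference; message fields are assumptions.
def handle_seventh_message(second_sub_model, message, reply):
    """Infer on the received second data with the second sub-model and return the fourth data."""
    second_data = message["input"]
    fourth_data = second_sub_model(second_data)   # e.g., an intermediate-layer feature
    reply({"type": "partial_result", "result": fourth_data})
```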
According to a third aspect, embodiments of this application provide a communication apparatus, which includes units configured to perform the method according to any implementation of the first aspect.
In one possible design, the apparatus includes:
a communication unit, configured to send a first message to a network device, where the first message is used to request the network device to deploy a first model, and the first message includes identification information of the first model and resource information of the terminal device;
where the communication unit is further configured to receive a second message from the network device, the second message includes first indication information or second indication information, the first indication information indicates that the network device independently performs an inference task of the first model, and the second indication information indicates that the terminal device and the network device jointly perform the inference task and that the terminal device and the network device respectively deploy a first sub-model and a second sub-model of the first model.
In one possible implementation, the apparatus further includes:
a processing unit, configured to generate the first message.
For the steps performed by the processing unit and the communication unit described in the third aspect and any possible implementation thereof, reference may be made to the first aspect and the corresponding implementations.
For the technical effects brought by the third aspect and any possible implementation thereof, reference may be made to the description of the technical effects of the first aspect and the corresponding implementations.
According to a fourth aspect, embodiments of this application provide a communication apparatus, which includes units configured to perform the method according to any implementation of the second aspect.
In one possible design, the apparatus includes:
a communication unit, configured to receive a first message from a terminal device, where the first message is used to request the network device to deploy a first model, and the first message includes identification information of the first model and resource information of the terminal device;
where the communication unit is further configured to send a second message to the terminal device;
a processing unit, configured to generate the second message based on the first model and the resource information of the terminal device, where the second message includes first indication information or second indication information, the first indication information indicates that the network device independently performs an inference task of the first model, and the second indication information indicates that the terminal device and the network device jointly perform the inference task and that the terminal device and the network device respectively deploy a first sub-model and a second sub-model of the first model.
For the steps performed by the processing unit and the communication unit described in the fourth aspect and any possible implementation thereof, reference may be made to the second aspect and the corresponding implementations.
For the technical effects brought by the fourth aspect and any possible implementation thereof, reference may be made to the description of the technical effects of the second aspect and the corresponding implementations.
Optionally, in the communication apparatus according to any one of the third aspect to the fourth aspect and any possible implementation thereof:
In one implementation, the communication apparatus is a communication device. When the communication apparatus is a communication device, the communication unit may be a transceiver or an input/output interface, and the processing unit may be at least one processor. Optionally, the transceiver may be a transceiver circuit. Optionally, the input/output interface may be an input/output circuit.
In another implementation, the communication apparatus is a chip (system) or circuit used in a communication device. In this case, the communication unit may be a communication interface (input/output interface), an interface circuit, an output circuit, an input circuit, a pin, a related circuit, or the like on the chip (system) or circuit, and the processing unit may be at least one processor, processing circuit, logic circuit, or the like.
According to a fifth aspect, embodiments of this application provide a communication apparatus that includes a processor. The processor is coupled to a memory and may be configured to execute instructions in the memory to implement the method according to any one of the first aspect to the second aspect and any possible implementation thereof. Optionally, the communication apparatus further includes the memory. Optionally, the communication apparatus further includes a communication interface, and the processor is coupled to the communication interface.
According to a sixth aspect, embodiments of this application provide a communication apparatus including a logic circuit and a communication interface. The communication interface is configured to receive or send information; the logic circuit is configured to receive or send information through the communication interface, causing the communication apparatus to perform the method according to any one of the first aspect to the second aspect and any possible implementation thereof.
According to a seventh aspect, embodiments of this application provide a computer-readable storage medium for storing a computer program (which may also be referred to as code or instructions); when the computer program is run on a computer, the method according to any one of the first aspect to the second aspect and any possible implementation thereof is implemented.
According to an eighth aspect, embodiments of this application provide a computer program product including a computer program (which may also be referred to as code or instructions); when the computer program is run, a computer is caused to perform the method according to any one of the first aspect to the second aspect and any possible implementation thereof.
According to a ninth aspect, embodiments of this application provide a chip including a processor configured to execute instructions; when the processor executes the instructions, the chip is caused to perform the method according to any one of the first aspect to the second aspect and any possible implementation thereof. Optionally, the chip further includes a communication interface configured to receive or send signals.
According to a tenth aspect, embodiments of this application provide a communication system including at least one communication apparatus according to the third aspect, the fourth aspect, the fifth aspect, or the sixth aspect, or at least one chip according to the ninth aspect.
According to an eleventh aspect, embodiments of this application provide a communication system including a terminal device and a network device, where the terminal device is configured to perform the method according to the first aspect and any possible implementation thereof, and the network device is configured to perform the method according to the second aspect and any possible implementation thereof.
Furthermore, in the process of performing the method according to any one of the first aspect to the second aspect and any possible implementation thereof, the operations of sending and/or receiving information in the method can be understood as a process of the processor outputting information and/or a process of the processor receiving input information. When outputting information, the processor may output the information to a transceiver (or a communication interface or a sending module) so that the transceiver transmits it; after being output by the processor, the information may also need to undergo other processing before reaching the transceiver. Similarly, when the processor receives input information, the transceiver (or a communication interface or a sending module) receives the information and inputs it to the processor; after the transceiver receives the information, it may need to undergo other processing before being input to the processor.
Based on this principle, for example, the sending of information mentioned in the foregoing method can be understood as the processor outputting information, and the receiving of information can be understood as the processor receiving input information.
Optionally, unless otherwise specified, or unless they contradict their actual function or internal logic in the relevant description, operations such as transmitting, sending, and receiving involving the processor can more generally be understood as operations such as output, reception, and input by the processor.
Optionally, in the process of performing the method according to any one of the first aspect to the second aspect and any possible implementation thereof, the processor may be a processor specifically designed to perform these methods, or a processor that performs these methods by executing computer instructions stored in a memory, for example, a general-purpose processor. The memory may be a non-transitory memory, such as a read-only memory (ROM), which may be integrated with the processor on the same chip or disposed on different chips; the embodiments of this application do not limit the type of memory or the arrangement of the memory and the processor.
In one possible implementation, the at least one memory is located outside the apparatus.
In another possible implementation, the at least one memory is located inside the apparatus.
In yet another possible implementation, part of the at least one memory is located inside the apparatus and another part is located outside the apparatus.
In this application, the processor and the memory may also be integrated into a single component, that is, the processor and the memory may be integrated together.
In the embodiments of this application, the network device can generate the first indication information or the second indication information based on the first model and the resource information of the terminal device, so that the first model is deployed in the network device or across the network device and the terminal device. Accordingly, the inference task is performed independently by the network device or jointly by the terminal device and the network device, and neither the data required for inference nor the obtained inference results need to be transmitted to the service device where the first model resides, thereby reducing transmission latency in the process in which the terminal device obtains inference results using the first model.
To describe the technical solutions in the embodiments of this application more clearly, the accompanying drawings used in the embodiments are briefly introduced below. Evidently, the drawings described below are only some embodiments of this application, and those of ordinary skill in the art may derive other drawings from them without creative effort.
Figure 1 is a schematic diagram of a communication system according to an embodiment of this application;
Figure 2 is a schematic flowchart of a communication method according to an embodiment of this application;
Figure 3 is a schematic flowchart of another communication method according to an embodiment of this application;
Figure 4 is a schematic flowchart of yet another communication method according to an embodiment of this application;
Figure 5 is a schematic flowchart of yet another communication method according to an embodiment of this application;
Figure 6a is a schematic flowchart in which a terminal device obtains an inference result after the first model is transferred and deployed according to an embodiment of this application;
Figure 6b is another schematic flowchart in which a terminal device obtains an inference result after the first model is transferred and deployed according to an embodiment of this application;
Figure 6c is yet another schematic flowchart in which a terminal device obtains an inference result after the first model is transferred and deployed according to an embodiment of this application;
Figure 7 is a schematic structural diagram of a communication apparatus according to an embodiment of this application;
Figure 8 is a schematic structural diagram of another communication apparatus according to an embodiment of this application;
Figure 9 is a schematic structural diagram of a chip according to an embodiment of this application.
To make the objectives, technical solutions, and advantages of this application clearer, the embodiments of this application are described below with reference to the accompanying drawings.
The terms "first", "second", and the like in the specification, claims, and drawings of this application are used to distinguish different objects, not to describe a specific order. Furthermore, the terms "include" and "have", and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but may optionally include steps or units that are not listed, or may optionally include other steps or units inherent to such a process, method, product, or device.
The term "embodiment" herein means that a specific feature, structure, or characteristic described in connection with an embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor does it refer to an independent or alternative embodiment mutually exclusive with other embodiments. Those skilled in the art understand, both explicitly and implicitly, that unless otherwise specified or logically conflicting, the terms and/or descriptions in the various embodiments of this application are consistent and may be cross-referenced, and technical features in different embodiments may be combined to form new embodiments based on their inherent logical relationships.
It should be understood that in this application, "at least one (item)" means one or more, "a plurality of" means two or more, and "at least two (items)" means two, three, or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate that only A exists, only B exists, or both A and B exist, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one (item) of the following" or a similar expression refers to any combination of these items, including any combination of a single item or plural items. For example, at least one (item) of a, b, or c may indicate a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where each of a, b, and c may be singular or plural.
It should be noted that in this application, "indicate" may include direct indication, indirect indication, explicit indication, and implicit indication. When it is described that certain indication information is used to indicate A, it can be understood that the indication information carries A, directly indicates A, or indirectly indicates A.
In this application, the information indicated by indication information is referred to as the to-be-indicated information. In specific implementations, there are many ways to indicate the to-be-indicated information, including but not limited to directly indicating it, for example, carrying the to-be-indicated information itself or an index of it. The to-be-indicated information may also be indicated indirectly by indicating other information, where there is an association between the other information and the to-be-indicated information. It is also possible to indicate only a part of the to-be-indicated information, while the other parts are known or agreed upon in advance. For example, specific information may also be indicated by means of a pre-agreed (for example, protocol-defined) order of arrangement of pieces of information, thereby reducing indication overhead to some extent. The to-be-indicated information may be sent as a whole, or divided into multiple pieces of sub-information that are sent separately, and the sending periods and/or sending occasions of these pieces of sub-information may be the same or different; the specific sending method is not limited in this application. The sending periods and/or sending occasions of these pieces of sub-information may be predefined, for example, according to a protocol, or may be configured by the transmitting device by sending configuration information to the receiving device.
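As a small illustration of indicating information through a pre-agreed index rather than carrying the information itself (one of the options described above), consider the following sketch; the table contents are assumptions chosen only to connect with the deployment example in this application.

```python
# Illustrative indication by index into a pre-agreed table; the table contents are assumptions.
PREAGREED_CONFIGS = {
    0: "network_only",            # e.g., the first indication information
    1: "joint_split_at_layer_8",  # e.g., one form of the second indication information
    2: "joint_split_at_layer_16",
}


def decode_indication(index: int) -> str:
    """Recover the indicated configuration from a compact index carried in a message."""
    return PREAGREED_CONFIGS[index]
```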
It should be noted that in this application, "send" can be understood as "output" and "receive" can be understood as "input". In "sending information to A", "to A" merely indicates the direction of the information transmission, with A as the destination, and does not require that the information be sent to A directly over an air interface. "Sending information to A" includes sending information directly to A as well as sending information to A indirectly through a transmitter, so "sending information to A" can also be understood as "outputting information destined for A". Similarly, "receiving information from A" indicates that the source of the information is A, and includes receiving information directly from A as well as receiving information from A indirectly through a receiver, so "receiving information from A" can also be understood as "inputting information from A".
Embodiments of this application provide a communication method and related apparatus that can reduce the transmission latency incurred when a terminal device obtains model inference results.
The following describes a communication system provided in an embodiment of this application. As shown in Figure 1, the communication system may include a terminal device, a network device, and a service device, where the network device can connect the terminal device and the service device.
A terminal device may also be referred to as user equipment (UE), a terminal, or the like. A terminal device is a device with wireless transceiver capabilities and can be deployed on land (indoors or outdoors, handheld, wearable, or vehicle-mounted), on water (for example, on ships), or in the air (for example, on airplanes, balloons, or satellites). The terminal device may be a mobile phone, a tablet (Pad), a computer with wireless transceiver capabilities, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self driving, a wireless terminal in remote medical, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, or the like. It can be understood that the terminal device may also be a terminal device in a future 6G network, a terminal device in a future evolved PLMN, or the like.
The network device can collect data from the terminal device and the service device and perform analysis and prediction. The network device may include a management network element, for example, a network data analytics function (NWDAF) in the core network, an element management system (EMS), a network management system (NMS), a mobile intelligent function (MIF) in the access network, or another unit with big-data analytics or artificial intelligence processing capabilities. The NWDAF provides various intelligent computing functions such as AI training and inference. The EMS manages one or more network elements of a certain category and may also be called a domain management system or a single-domain management system. The NMS is responsible for the operation, administration, and maintenance of the network and may also be called a cross-domain management system. MIF generally refers to a network function in a wireless network that is responsible for intelligent computing. In addition, the network device may also include an access network device.
The access network device may be a next generation node B (gNB), a next generation evolved nodeB (ng-eNB), an access network device in future 6G communication, or the like. The access network device may be any device with wireless transceiver capabilities, including but not limited to the base station (BS) shown above. The base station may also be a base station in a future communication system such as a sixth-generation communication system. Optionally, the access network device may be an access node, a wireless relay node, or a wireless backhaul node in a wireless fidelity (WiFi) system. Optionally, the access network device may be a wireless controller in a cloud radio access network (CRAN) scenario. Optionally, the access network device may be a wearable device, an in-vehicle device, or the like. Optionally, the access network device may also be a small cell, a transmission reception point (TRP) (which may also be called a transmission point), or the like. It can be understood that the access network device may also be a base station in a future evolved public land mobile network (PLMN), or the like.
In some deployments, a base station (for example, a gNB) may consist of a centralized unit (CU) and a distributed unit (DU); that is, the functions of the base station in the access network are split, with some functions deployed in a CU and the remaining functions in a DU. Multiple DUs may share one CU, which saves cost and facilitates network expansion. In other deployments, the CU may be further divided into a CU-control plane (CP), a CU-user plane (UP), and the like. In still other deployments, the base station may also be a radio unit (RU). In yet other deployments, the base station may adopt an open radio access network (ORAN) architecture, and so on; this application does not limit the specific type of base station. For example, when the base station adopts the ORAN architecture, the base station shown in the embodiments of this application may be an access network device in ORAN, a module in an access network device, or the like. In an ORAN system, the CU may also be called an open (O)-CU, the DU may also be called an O-DU, the CU-DU may also be called an O-CU-DU, the CU-UP may also be called an O-CU-UP, and the RU may also be called an O-RU.
A service device refers to a third-party network application server other than the network device and the terminal device.
应理解,图1示例性地示出了一个表示管理网元的核心网设备、一个基站、一个终端设备和一个服务设备,以及各通信设备之间的通信链路。可选地,该通信系统可以包括多个基站,并且每个基站的覆盖范围内可以包括其它数量的终端设备,例如更多或更少的终端设备等,本申请对此不做限定。It should be understood that Figure 1 exemplarily illustrates a core network device representing a management network element, a base station, a terminal device, and a service device, as well as the communication links between the various communication devices. Optionally, the communication system may include multiple base stations, and the coverage area of each base station may include other numbers of terminal devices, such as more or fewer terminal devices, etc., which is not limited in this application.
上述各个通信设备,如图1中的网络设备、终端设备、服务设备,可以配置多个天线。该多个天线可以包括至少一个用于发送信号的发射天线和至少一个用于接收信号的接收天线等,本申请实施例对于各个通信设备的具体结构不作限定。可选地,该通信系统还可以包括网络控制器、移动管理实体等其他网络实体,本申请实施例不限于此。The aforementioned communication devices, such as the network device, terminal device, and service device in Figure 1, can be configured with multiple antennas. These multiple antennas may include at least one transmitting antenna for transmitting signals and at least one receiving antenna for receiving signals, etc. This application embodiment does not limit the specific structure of each communication device. Optionally, the communication system may also include other network entities such as a network controller and a mobility management entity; this application embodiment is not limited to these.
可理解,图1所示的通信系统示意图仅为示例,对于其他形式的通信系统示意图可以参考相关标准或协议等,这里不再一一详述。It is understood that the communication system diagram shown in Figure 1 is only an example. For other forms of communication system diagrams, please refer to the relevant standards or protocols, etc., which will not be described in detail here.
下文示出的各个实施例可以适用于图1所示的通信系统,也可以适用于其他形式的通信系统,对此,下文不再赘述。The various embodiments shown below can be applied to the communication system shown in Figure 1, or to other forms of communication systems, which will not be described further below.
本申请提供了一种通信方法,应用于通信技术领域。为了更清楚地描述本申请的方案,下面先介绍一些与机器学习模型(或人工智能模型)相关的知识。This application provides a communication method applicable to the field of communication technology. To more clearly describe the solution of this application, some knowledge related to machine learning models (or artificial intelligence models) will be introduced below.
本申请中提到的模型即机器学习模型或人工智能模型,可以认为是实现计算机自动“学习”的算法。模型包括训练和推理两个核心阶段。The model mentioned in this application, namely the machine learning model or artificial intelligence model, can be considered as an algorithm that enables computers to "learn" automatically. The model includes two core stages: training and inference.
在训练阶段,通过大量数据和算法,模型学会识别和生成规律。大模型通过深度学习技术,通过多层神经网络,对接收输入的海量数据进行学习和优化,并通过学习调整模型的参数,使其能够对输入数据进行准确的预测。During the training phase, the model learns to recognize and generate patterns through a large amount of data and algorithms. Large models, through deep learning techniques and multi-layered neural networks, learn and optimize from the massive amounts of input data, and adjust the model's parameters to enable it to make accurate predictions on the input data.
推理阶段则建立在训练完成的基础上,训练好的模型被用于新的、未见过的数据进行预测或分类。大模型在推理阶段可以处理各种类型的输入,并输出相应的预测结果。推理可以在生产环境中进行,例如在实际应用中对图像、语音或文本进行分类,也可以用于其他任务,如语言生成、翻译等。The inference phase builds upon the completed training, where the trained model is used to predict or classify new, unseen data. Large models can handle various types of input and output corresponding predictions during the inference phase. Inference can be performed in production environments, such as classifying images, speech, or text in real-world applications, or for other tasks like language generation and translation.
使用模型解决实际问题的过程中,包括模型部署和模型推理两部分。模型部署指把训练好的模型在特定环境中运行的过程。在模型部署好后,输入数据利用模型进行推理,可以获得推理结果,应用在实际场景中。Using models to solve real-world problems involves two main parts: model deployment and model inference. Model deployment refers to running the trained model in a specific environment. After deployment, the input data is used to perform inference using the model, yielding inference results that can then be applied to the real-world scenario.
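As a concrete illustration of the deployment-then-inference split described above, the following minimal Python sketch instantiates an already-trained toy model in a serving environment and then runs inference on new input. The class and function names and the toy weighted-sum model are illustrative assumptions and are not defined in this application.

```python
# Minimal sketch of "deploy then infer": the model is trained elsewhere,
# deployed into a runtime environment, and then used only for inference.
# All names here are illustrative and not defined by this application.

class DeployedModel:
    def __init__(self, weights):
        self.weights = weights  # parameters produced by the training phase

    def infer(self, features):
        # toy inference: weighted sum of the input features
        return sum(w * x for w, x in zip(self.weights, features))

def deploy(trained_weights):
    # "deployment" = instantiating the trained model in the serving environment
    return DeployedModel(trained_weights)

model = deploy([0.2, 0.5, 0.3])        # model deployment
result = model.infer([1.0, 2.0, 3.0])  # model inference on new input
print(result)
```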
请参阅图2,图2为本申请实施例提供的一种通信方法的流程示意图。该通信方法应用于通信技术领域。可理解,该通信方法可以由通信装置执行,该通信装置可以是网络设备、终端设备或服务设备,也可以是用于这些设备中的芯片(系统)或电路,本申请对此不作限定。该通信方法包括但不限于如下步骤:Please refer to Figure 2, which is a flowchart illustrating a communication method provided in an embodiment of this application. This communication method is applied in the field of communication technology. It is understood that this communication method can be executed by a communication device, which can be a network device, terminal device, or service device, or a chip (system) or circuit used in these devices; this application does not limit its scope. The communication method includes, but is not limited to, the following steps:
S201:终端设备向网络设备发送第一消息,相应的,网络设备接收该第一消息。S201: The terminal device sends a first message to the network device, and the network device receives the first message accordingly.
S202:网络设备基于第一模型和终端设备的资源信息,生成第二消息。S202: The network device generates a second message based on the first model and the resource information of the terminal device.
S203:网络设备向终端设备发送第二消息,相应的,终端设备接收该第二消息。S203: The network device sends a second message to the terminal device, and the terminal device receives the second message accordingly.
其中,本申请实施例中的第一消息用于请求网络设备部署第一模型,第一消息包括第一模型的识别信息和终端设备的资源信息。第二消息包括第一指示信息或第二指示信息,第一指示信息指示由网络设备独立执行第一模型的推理任务,第二指示信息指示由终端设备和网络设备联合执行第一模型的推理任务,以及指示终端设备与网络设备分别部署第一模型中的第一子模型和第二子模型。In this embodiment, the first message is used to request the network device to deploy a first model. The first message includes the identification information of the first model and the resource information of the terminal device. The second message includes either a first instruction or a second instruction. The first instruction indicates that the network device independently performs the inference task of the first model, and the second instruction indicates that the terminal device and the network device jointly perform the inference task of the first model, and instructs the terminal device and the network device to deploy the first sub-model and the second sub-model of the first model, respectively.
其中,第一模型可以包括人工智能模型或机器学习模型。The first model may include an artificial intelligence model or a machine learning model.
该第一模型的识别信息可理解为指示该第一模型和识别该第一模型的信息,具体可以包括以下至少一项:第一模型的标识信息、指示第一模型的可获取方式的信息等。第一模型的可获取方式可以包括该第一模型的地址信息等。示例性的,当第一模型的识别信息包括第一模型的标识信息时,若该第一模型对应为APP应用中的某个服务,那么该第一模型的标识信息可包括该APP的应用标识和该服务的服务类型。本申请实施例对此不做限制。The identification information of the first model can be understood as information that indicates and identifies the first model, and may specifically include at least one of the following: the identifier information of the first model, information indicating the manner in which the first model can be obtained, etc. The manner in which the first model can be obtained may include the address information of the first model, etc. For example, when the identification information of the first model includes the identifier information of the first model, and the first model corresponds to a particular service in an APP, the identifier information of the first model may include the application identifier of the APP and the service type of that service. The embodiments of this application do not limit this.
终端设备的资源信息可以包括终端设备的存储资源信息和计算资源信息,具体可以包括终端设备芯片的可用内存、终端设备芯片的计算能力等。示例性的,该终端设备的资源信息可以包括终端设备芯片的可用内存、芯片功率和计算频率等信息。The resource information of a terminal device may include its storage and computing resources, specifically including the available memory and computing power of its chip. For example, the resource information may include the available memory, power consumption, and computing frequency of the terminal device's chip.
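To make the message contents above easier to follow, the following sketch models the first message and the second message as simple Python data structures. All field names (for example app_id, available_memory_mb, service_type="text-generation") are illustrative assumptions; this application does not define a concrete message encoding.

```python
# Illustrative data structures for the first and second messages described above.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelIdentification:               # "identification information" of the first model
    app_id: Optional[str] = None         # e.g. application identifier of the APP
    service_type: Optional[str] = None   # e.g. the specific service within the APP
    model_address: Optional[str] = None  # how/where the model can be obtained

@dataclass
class TerminalResourceInfo:              # storage and computing resource information
    available_memory_mb: int
    chip_power_w: float
    compute_frequency_ghz: float

@dataclass
class FirstMessage:                      # request to deploy the first model
    model_ident: ModelIdentification
    resource_info: TerminalResourceInfo

@dataclass
class SecondMessage:
    indication: str                                 # "first" = network-only, "second" = joint inference
    terminal_sub_model_id: Optional[str] = None     # first sub-model, only for joint inference
    network_sub_model_id: Optional[str] = None      # second sub-model, only for joint inference
    serving_network_element: Optional[str] = None   # optional identification of the first network element

msg = FirstMessage(ModelIdentification(app_id="app-1", service_type="text-generation"),
                   TerminalResourceInfo(available_memory_mb=4096, chip_power_w=5.0,
                                        compute_frequency_ghz=2.4))
print(msg)
```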
第一模型中的第一子模型可以包括一个或多个模型,第一模型中的第二子模型可以包括一个或多个模型。第一子模型和第二子模型不同。其中,第一子模型和第二子模型可以分别为第一模组中的浅层模型和深层模型。第一模型中的浅层模型相比深层模型更靠近第一模型的输入层。或者,第一子模型和第二子模型可分别为第一模型中可分别部署在不同地方且相互独立的两部分子模型。The first sub-model in the first model may include one or more models, and the second sub-model in the first model may include one or more models. The first sub-model and the second sub-model are different. Specifically, the first sub-model and the second sub-model may respectively be a shallow-layer model and a deep-layer model of the first model, where the shallow-layer model of the first model is closer to the input layer of the first model than the deep-layer model is. Alternatively, the first sub-model and the second sub-model may be two mutually independent parts of the first model that can be deployed in different places.
可以理解的是,网络设备可基于第一模型的参数规模、模型结构等信息以及终端设备的存储资源、计算资源等资源信息,确定是否需要对第一模型进行拆分,若需要进行拆分则根据拆分难度确定分别要部署在终端设备和网络设备中的部分模型,生成第一指示信息或第二指示信息。该第一指示信息和第二指示信息可称为协同推理指示。Understandably, network devices can determine whether the first model needs to be split based on information such as the parameter scale and model structure of the first model, as well as resource information such as the storage and computing resources of the terminal devices. If splitting is necessary, the network devices determine the parts of the model to be deployed on the terminal devices and network devices respectively, based on the difficulty of splitting, and generate first or second instruction information. This first and second instruction information can be referred to as collaborative reasoning instructions.
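The following Python sketch illustrates one possible form of the decision described above, in which the network device compares the model's per-layer memory footprint with the terminal's reported resources and either generates the first indication information (network-only inference) or the second indication information (joint inference with a shallow/deep split). The memory-based heuristic, the overhead factor, and all names are assumptions made for illustration only.

```python
# Illustrative split decision on the network side: keep shallow layers on the
# terminal while its memory budget allows, otherwise let the network run the
# whole first model. Thresholds and the greedy heuristic are assumptions.

def decide_deployment(model_layer_sizes_mb, terminal_available_mb, overhead=1.2):
    if terminal_available_mb <= 0:
        return {"indication": "first: network executes inference independently"}
    split_index, used = 0, 0.0
    for size in model_layer_sizes_mb:
        if used + size * overhead > terminal_available_mb:
            break
        used += size * overhead
        split_index += 1
    if split_index == 0 or split_index == len(model_layer_sizes_mb):
        # nothing fits on the terminal, or no useful split point exists,
        # so fall back to network-only execution (first indication information)
        return {"indication": "first: network executes inference independently"}
    return {
        "indication": "second: joint inference",                                # second indication information
        "terminal_sub_model_layers": list(range(split_index)),                  # first sub-model (shallow layers)
        "network_sub_model_layers": list(range(split_index,
                                               len(model_layer_sizes_mb))),     # second sub-model (deep layers)
    }

print(decide_deployment([300, 800, 800, 1200], terminal_available_mb=1500))
```

With the example numbers above, the first two layers fit within the terminal's reported memory, so the sketch returns the second indication information together with the layer ranges of the two sub-models.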
当第二指示信息指示终端设备部署第一子模型时,终端设备的存储资源和计算资源等足够支持在终端设备中对第一子模型进行推理。When the second instruction information instructs the terminal device to deploy the first sub-model, the terminal device's storage and computing resources are sufficient to support inference of the first sub-model within the terminal device.
可选地,第二指示信息还可以进一步指示终端设备和网络设备联合执行推理任务时的具体规则,例如数据的输入、对基于第一子模型得到的特征的处理方式、对基于第二子模型得到的特征的处理方式、获得最终推理结果的方式等,本申请实施例对此不作限制。Optionally, the second instruction information may further instruct the specific rules for the joint execution of the inference task by the terminal device and the network device, such as the input of data, the processing method of the features obtained based on the first sub-model, the processing method of the features obtained based on the second sub-model, and the method of obtaining the final inference result. This application embodiment does not limit this.
可选地,终端设备接收第二消息之后,若第二消息中包括第二指示信息,终端设备可部署第一模型中的第一子模型,后续在需要执行第一模型的推理任务时,可基于该第一子模型和第二指示信息进行部分推理,与网络设备联合执行第一模型的推理任务,获得推理结果。Optionally, after receiving the second message, if the second message includes second instruction information, the terminal device can deploy the first sub-model in the first model. Subsequently, when it is necessary to execute the inference task of the first model, it can perform partial inference based on the first sub-model and the second instruction information, and jointly execute the inference task of the first model with the network device to obtain the inference result.
可以理解的是,当第一模型部署在服务设备中,终端设备获得第一模型的推理结果之前,需要将数据经网络设备发送给服务设备,服务设备执行推理任务,获得推理结果并将推理结果经网络设备传输给终端设备,该过程需要经过多个节点转发信息,传输时延高。Understandably, when the first model is deployed in the service device, before the terminal device obtains the inference result of the first model, it needs to send the data to the service device via the network device. The service device performs the inference task, obtains the inference result, and transmits the inference result to the terminal device via the network device. This process requires information to be forwarded through multiple nodes, resulting in high transmission latency.
而本申请实施例中,网络设备可以根据第一模型和终端设备的资源信息,生成第一指示信息或第二指示信息,使得第一模型部署在网络设备或部署在网络设备和终端设备中,相应的,由网络设备独立执行推理任务或由终端设备和网络设备联合执行推理任务,推理所需的数据和获得的推理结果无需再传输给第一模型所在的服务设备,从而在终端设备使用第一模型获得推理结果的过程中降低了传输时延。当进行多次推理时,能大幅度降低传输时延。In this embodiment, the network device can generate first or second instruction information based on the resource information of the first model and the terminal device. This allows the first model to be deployed on the network device or on both the network device and the terminal device. Consequently, the inference task can be executed independently by the network device or jointly by the terminal device and the network device. The data required for inference and the obtained inference results no longer need to be transmitted to the service device where the first model resides, thereby reducing transmission latency during the process of the terminal device using the first model to obtain the inference results. When multiple inference operations are performed, the transmission latency can be significantly reduced.
在一种可能的实施例中,本申请实施例在执行上述步骤S201之前,还可以执行以下步骤:In one possible embodiment, before performing step S201 described above, the present application embodiment may further perform the following steps:
S204:终端设备通过网络设备与第一模型所在的服务设备建立连接。S204: The terminal device establishes a connection with the service device where the first model is located through the network device.
S205:终端设备进行注册,服务设备基于第一模型对应的应用对终端设备进行鉴权,终端设备授权用户数据。S205: The terminal device performs registration, the service device authenticates the terminal device based on the application corresponding to the first model, and the terminal device authorizes its user data.
其中,第一模型存储并部署在上述服务设备中,该服务设备可以为应用服务器。服务设备可用于实现某些特定应用或服务的业务逻辑生成和管理功能等。经过训练的模型整合到为解决业务问题而开发的应用中,可以部署在服务设备中。该服务设备可通过执行第一模型的推理任务,实现为终端提供应用服务。可选地,第一模型可对应服务设备中的一个应用,或者,第一模型可对应服务设备中的一个应用中的某一个具体的服务。The first model is stored and deployed in the aforementioned service device, which can be an application server. The service device can be used to implement business logic generation and management functions for specific applications or services. The trained model, integrated into an application developed to solve business problems, can be deployed in the service device. The service device can provide application services to terminals by executing the inference tasks of the first model. Optionally, the first model can correspond to an application within the service device, or it can correspond to a specific service within an application within the service device.
可选地,当服务设备中与第一模型对应的应用可提供个性化服务时,第一模型为利用终端设备授权的用户数据对基础模型进行微调后的模型。否则,第一模型为该服务设备提供的基础模型。Optionally, when the application corresponding to the first model in the service device can provide personalized services, the first model is a model fine-tuned from the basic model using user data authorized by the terminal device. Otherwise, the first model is the basic model provided by the service device.
可以理解的是,终端设备进行注册后,服务设备可以将准备好的第一模型下发给网络设备或终端设备,方便后续将第一模型部署在网络设备中或部署在网络设备和终端设备中,从而能够降低终端设备获得第一模型的推理结果过程的传输时延。Understandably, after the terminal device registers, the service device can send the prepared first model to the network device or the terminal device, which facilitates the subsequent deployment of the first model in the network device or in both the network device and the terminal device, thereby reducing the transmission latency of the terminal device in obtaining the inference results of the first model.
本申请实施例中,服务设备将第一模型下发,终端设备向网络设备发送第一消息,使得网络设备可基于第一消息生成第一指示信息或第二指示信息,并将该指示信息发送给终端设备。当该指示信息指示由网络设备独立执行推理任务,可实现将第一模型部署在网络设备中,获得推理结果时无需经过服务设备,降低了传输时延。当该指示信息指示由终端设备和网络设备联合执行推理任务,终端设备可部署第一模型中的第一子模型,并在需要执行推理任务时基于该指示信息和第一子模型进行部分推理,与网络设备联合获得推理结果,实现将第一模型部署在终端设备和网络设备中。该过程中既可以降低传输时延,又能利用终端设备的资源,提高了资源利用率。In this embodiment, the service device distributes the first model, and the terminal device sends a first message to the network device. This allows the network device to generate either a first instruction or a second instruction based on the first message and send the instruction to the terminal device. When the instruction indicates that the network device should independently execute the inference task, the first model can be deployed within the network device, and the inference result can be obtained without going through the service device, reducing transmission latency. When the instruction indicates that the terminal device and the network device should jointly execute the inference task, the terminal device can deploy a first sub-model from the first model and perform partial inference based on the instruction and the first sub-model when the inference task needs to be executed, jointly obtaining the inference result with the network device, thus deploying the first model in both the terminal device and the network device. This process reduces transmission latency and utilizes the resources of the terminal device, improving resource utilization.
请参阅图3,图3为本申请实施例提供的另一种通信方法的流程示意图。可以理解的是,本申请实施例中的步骤可以视为上述图2中的实施例的合理变形或补充;或者,可以理解的是,本申请实施例中的通信方法也可以视为能单独执行的实施例,本申请对此不作限制。在该通信方法中,由终端设备触发第一模型的转移部署。Please refer to Figure 3, which is a flowchart illustrating another communication method provided in an embodiment of this application. It is understood that the steps in this embodiment can be considered reasonable variations or supplements to the embodiment in Figure 2 above; or, it is understood that the communication method in this embodiment can also be considered an embodiment that can be executed independently, and this application does not impose any limitations on it. In this communication method, the transfer deployment of the first model is triggered by the terminal device.
该通信方法包括但不限于如下步骤:This communication method includes, but is not limited to, the following steps:
S301:终端设备向网络设备发送第一消息,相应的,网络设备接收该第一消息。S301: The terminal device sends a first message to the network device, and the network device receives the first message accordingly.
S302:网络设备基于第一模型和终端设备的资源信息,生成第二消息。S302: The network device generates a second message based on the first model and the resource information of the terminal device.
S303:网络设备向终端设备发送第二消息,相应的,终端设备接收该第二消息。S303: The network device sends a second message to the terminal device, and the terminal device receives the second message accordingly.
上述步骤S301至S303与上述图2所示的实施例中的步骤S201至S203一致,此处不再赘述。The steps S301 to S303 described above are the same as steps S201 to S203 in the embodiment shown in Figure 2 above, and will not be repeated here.
在一种可能的实施例中,本申请实施例在执行上述步骤S301之后,还可以执行以下步骤:In one possible embodiment, after performing step S301 as described above, the present application embodiment may further perform the following steps:
S304:网络设备基于第一模型的识别信息确定在网络设备中是否已存储第一模型;在网络设备中未存储第一模型的情况下,存储第一模型;在网络设备中已存储第一模型的情况下,调用第一模型;S304: The network device determines whether the first model has been stored in the network device based on the identification information of the first model; if the first model has not been stored in the network device, the first model is stored; if the first model has been stored in the network device, the first model is invoked.
网络设备确定网络设备中执行第一模型的推理任务的第一网元。The network device determines the first network element in the network device that performs the inference task of the first model.
可以理解的是,上述第一网元可以为RAN域的MIF、核心网中的NWDAF等,也可以是EMS或NMS等,该第一网元的资源足够执行第一网元的推理任务。It is understandable that the aforementioned first network element can be an MIF in the RAN domain, an NWDAF in the core network, or an EMS or NMS, etc., and the resources of the first network element are sufficient to execute the inference task of the first model.
可以理解的是,上述步骤S303中的第二消息还可以包括上述第一网元的标识信息。It is understood that the second message in step S303 above may also include the identification information of the first network element.
网络设备可在其存储单元中确定是否已存储第一模型,存储或调用该第一模型,并分配第一网元作为服务网元,以便于后续利用第一网元接收终端设备的推理请求,执行第一模型的推理任务。The network device can determine whether the first model has been stored in its storage unit, store or call the first model, and allocate the first network element as a service network element so that it can subsequently use the first network element to receive inference requests from terminal devices and execute the inference task of the first model.
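As an illustration of this store-or-call step and of selecting the first network element, the following sketch keeps a simple model registry keyed by the model's identification information and picks the first candidate network element with enough free memory. The registry, the candidate list, and the resource check are illustrative assumptions rather than a defined procedure.

```python
# Sketch of S304: look up the first model by its identification, store it if
# absent, and select a network element (e.g. MIF/NWDAF/EMS/NMS) with enough
# free resources to serve the inference task.

class ModelRegistry:
    def __init__(self):
        self._models = {}                           # model identification -> model object

    def get_or_store(self, model_id, fetch_model):
        if model_id not in self._models:            # first model not yet stored on the network side
            self._models[model_id] = fetch_model()  # store the first model
        return self._models[model_id]               # otherwise simply call/retrieve it

def select_first_network_element(candidates, required_mb):
    # candidates: list of (element_id, free_memory_mb) pairs
    for element_id, free_mb in candidates:
        if free_mb >= required_mb:
            return element_id                       # the "first network element" serving the model
    return None

registry = ModelRegistry()
model = registry.get_or_store("app-1/service-A", fetch_model=lambda: object())
serving_ne = select_first_network_element([("MIF-1", 2_000), ("NWDAF-2", 8_000)], required_mb=4_000)
print(serving_ne)
```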
在一种可能的实施例中,本申请实施例在执行上述步骤S301之前,还可以执行以下步骤:In one possible embodiment, before performing step S301 described above, the present application embodiment may further perform the following steps:
S305:终端设备基于第一模型和终端设备的资源信息确定终端设备能否独立执行推理任务。S305: The terminal device determines whether it can independently perform the inference task based on the first model and the terminal device's resource information.
可以理解的是,上述步骤S301具体可以是,在终端设备不能独立执行第一模型的推理任务的情况下,终端设备向网络设备发送第一消息。相应的,网络设备接收该第一消息。It is understandable that step S301 above can specifically involve the terminal device sending a first message to the network device when the terminal device cannot independently execute the inference task of the first model. Correspondingly, the network device receives the first message.
其中,终端设备基于第一模型的参数规模等信息以及该终端设备的存储资源信息和计算资源信息确定能否独立执行该第一模型的推理任务。例如,终端设备基于第一模型的参数规模,判断执行推理任务所需的显存需求是否大于终端设备的可用内存,若大于,则确定终端设备不能独立执行上述推理任务。又例如,终端设备可基于芯片的计算频率和功率,预估执行推理任务并获得推理结果所需的能耗是否超过终端设备的最大负荷,若超过,则确定终端设备不能独立执行上述推理任务。又例如,终端设备预估执行推理任务时,终端设备的芯片计算能力能否满足服务时延要求,若不能满足,则确定终端设备不能独立执行上述推理任务。可理解的,终端设备确定能否独立执行推理任务的具体判断过程可根据实际需求进行确定,上述判断过程仅作为部分示例,不构成对本申请实施例的限定。In this process, the terminal device determines whether it can independently execute the inference task of the first model based on information such as the parameter scale of the first model, as well as the storage and computing resource information of the terminal device. For example, based on the parameter scale of the first model, the terminal device determines whether the video memory requirement for executing the inference task is greater than the available memory of the terminal device. If it is greater, the terminal device determines that it cannot independently execute the inference task. As another example, the terminal device can estimate whether the energy consumption required to execute the inference task and obtain the inference result exceeds the maximum load of the terminal device based on the chip's computing frequency and power. If it does, the terminal device determines that it cannot independently execute the inference task. As yet another example, the terminal device estimates whether the chip's computing power can meet the service latency requirements when executing the inference task. If it cannot, the terminal device determines that it cannot independently execute the inference task. It is understood that the specific judgment process for determining whether the terminal device can independently execute the inference task can be determined according to actual needs. The above judgment process is only a partial example and does not constitute a limitation on the embodiments of this application.
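The following sketch illustrates the kind of local feasibility check described above, combining a memory check, a latency estimate, and an energy estimate. The formulas, thresholds, and example numbers are simplifying assumptions for illustration; the concrete judgment criteria are left to implementation.

```python
# Sketch of S305: the terminal checks whether it can run the inference task
# alone, using (i) a memory check, (ii) a latency check and (iii) an energy
# check. All formulas and numbers below are simplifying assumptions.

def can_run_locally(param_count, bytes_per_param, available_memory_bytes,
                    chip_power_w, chip_freq_ghz, flops_per_request,
                    energy_budget_j, latency_budget_s, flops_per_cycle=8):
    # (i) memory: model parameters must fit into the terminal's available memory
    if param_count * bytes_per_param > available_memory_bytes:
        return False
    # (ii) latency: rough time estimate from the chip's compute throughput
    throughput_flops = chip_freq_ghz * 1e9 * flops_per_cycle
    est_time_s = flops_per_request / throughput_flops
    if est_time_s > latency_budget_s:
        return False
    # (iii) energy: power * time must stay within the terminal's budget
    if chip_power_w * est_time_s > energy_budget_j:
        return False
    return True

print(can_run_locally(param_count=7e9, bytes_per_param=2,
                      available_memory_bytes=8 * 2**30,
                      chip_power_w=5.0, chip_freq_ghz=2.0,
                      flops_per_request=14e9,
                      energy_budget_j=2.0, latency_budget_s=0.05))
```

In this example the parameters of a 7-billion-parameter model do not fit into 8 GiB of available memory, so the check returns False and the terminal would proceed to send the first message as in S301.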
在一种可能的实施例中,本申请实施例在执行上述步骤S305之前,还可以执行以下步骤:In one possible embodiment, before performing step S305 as described above, the following steps may also be performed:
S306:服务设备发送第三消息,相应的,终端设备接收第三消息。S306: The service device sends a third message, and the terminal device receives the third message accordingly.
其中,第三消息包括第一模型和第一模型的识别信息。The third message includes the first model and its identification information.
第一模型的识别信息的相关说明可参见上文描述,在此不再赘述。The relevant description of the recognition information of the first model can be found in the above description, and will not be repeated here.
可以理解的是,上述步骤S301中的第一消息还可以包括第一模型。It is understandable that the first message in step S301 above may also include the first model.
可以理解的是,本申请实施例中由终端设备触发第一模型的转移部署。因此,服务设备可在终端注册之后,通过发送第三消息,将第一模型和第一模型的识别信息下发至终端设备中。终端设备接收第三消息后,可执行上述步骤S305,基于第一模型和终端设备的资源信息确定终端设备能否独立执行第一模型的推理任务,以便于后续确定如何获得第一模型的推理结果。It is understood that in this embodiment, the transfer and deployment of the first model is triggered by the terminal device. Therefore, after the terminal registers, the service device can send the first model and its identification information to the terminal device by sending a third message. After receiving the third message, the terminal device can execute the above step S305 to determine whether the terminal device can independently execute the inference task of the first model based on the resource information of the first model and the terminal device, so as to determine how to obtain the inference result of the first model.
通过本申请实施例,服务设备主动将第一模型及识别信息下发至终端设备,终端设备可基于第一模型和自身资源信息确定能否独立完成推理任务,从而可以在终端设备自身资源不足的情况下,使第一模型能部分或完全部署至网络设备中,降低后续获得推理结果的传输时延。当第一模型部署在网络设备和终端设备中时,还可以利用终端设备的资源,提高终端设备的资源利用率,进一步降低传输时延。In this embodiment, the service device proactively sends the first model and identification information to the terminal device. The terminal device can determine whether it can independently complete the inference task based on the first model and its own resource information. This allows the first model to be partially or fully deployed in the network device even when the terminal device's own resources are insufficient, reducing the transmission latency for obtaining the subsequent inference results. When the first model is deployed in both the network device and the terminal device, the terminal device's resources can be utilized, improving resource utilization and further reducing transmission latency.
在一种可能的实施例中,本申请实施例在执行上述步骤S303之后,还可以执行以下步骤:In one possible embodiment, after performing step S303 as described above, the present application embodiment may further perform the following steps:
S307:终端设备向第一模型所在的服务设备发送第五消息,相应的,服务设备接收第五消息。S307: The terminal device sends a fifth message to the service device where the first model is located, and the service device receives the fifth message accordingly.
其中,第五消息包括上述第一指示信息或所述第二指示信息。The fifth message includes either the first instruction information or the second instruction information mentioned above.
可以理解的是,本申请实施例中,当上述第二信息包括第一指示信息时,该第五信息包括第一指示信息。当上述第二信息包括第二指示信息时,该第五信息包括第二指示信息。It is understood that, in the embodiments of this application, when the above second message includes the first indication information, the fifth message includes the first indication information; when the above second message includes the second indication information, the fifth message includes the second indication information.
其中,第五信息还可以包括上述第一网元的标识信息。该标识信息可以包括以下至少一项:第一网元的IP地址;第一网元所在的子网标识和第一网元对应的子网内唯一标识;在全网内能够唯一指示该第一网元的ID信息等。本申请对此不作限制。The fifth message may further include the identification information of the aforementioned first network element. This identification information may include at least one of the following: the IP address of the first network element; the identifier of the subnet where the first network element is located together with the unique identifier of the first network element within that subnet; ID information that can uniquely indicate the first network element across the entire network, etc. This application does not impose any limitation on this.
通过本申请实施例,终端设备接收来自网络设备的第二信息后,向服务设备发送第五消息,可以向服务设备反馈第一模型已转移部署在网络设备中,并反馈后续执行推理任务的主体和实际执行推理任务的第一网元,以便于服务设备获知相关信息,不必在后续再执行第一模型的推理任务,从而可不必从服务设备获得推理结果,降低传输时延。Through the embodiments of this application, after receiving the second message from the network device, the terminal device sends the fifth message to the service device. This feeds back to the service device that the first model has been transferred and deployed in the network device, as well as the entity that will subsequently execute the inference task and the first network element that actually executes it. The service device thus obtains the relevant information and no longer needs to execute the inference task of the first model, so the inference result does not need to be obtained from the service device, reducing transmission latency.
请参阅图4,图4为本申请实施例提供的又一种通信方法的流程示意图。可以理解的是,本申请实施例中的步骤可以视为上述图2中的实施例的合理变形或补充;或者,可以理解的是,本申请实施例中的通信方法也可以视为能单独执行的实施例,本申请对此不作限制。在该通信方法中,由网络设备触发第一模型的转移部署。Please refer to Figure 4, which is a flowchart illustrating another communication method provided in an embodiment of this application. It is understood that the steps in this embodiment can be considered reasonable variations or supplements to the embodiment in Figure 2 above; or, it is understood that the communication method in this embodiment can also be considered an embodiment that can be executed independently, and this application does not limit this. In this communication method, the transfer deployment of the first model is triggered by a network device.
该通信方法包括但不限于如下步骤:This communication method includes, but is not limited to, the following steps:
S401:终端设备向网络设备发送第一消息,相应的,网络设备接收该第一消息。S401: The terminal device sends a first message to the network device, and the network device receives the first message accordingly.
S402:网络设备基于第一模型和终端设备的资源信息,生成第二消息。S402: The network device generates a second message based on the first model and the resource information of the terminal device.
S403:网络设备向终端设备发送第二消息,相应的,终端设备接收该第二消息。S403: The network device sends a second message to the terminal device, and the terminal device receives the second message accordingly.
上述步骤S401至S403与上述图2所示的实施例中的步骤S201至S203一致,此处不再赘述。The steps S401 to S403 described above are the same as steps S201 to S203 in the embodiment shown in Figure 2 above, and will not be repeated here.
在一种可能的实施例中,本申请实施例在执行上述步骤S401之前,还可以执行以下步骤:In one possible embodiment, before performing step S401 described above, the present application embodiment may further perform the following steps:
S404:网络设备向终端设备发送第四消息,相应的,终端设备接收第四消息。S404: The network device sends a fourth message to the terminal device, and the terminal device receives the fourth message accordingly.
可以理解的是,上述步骤S401具体可以是,响应于第四消息,终端设备向网络设备发送第一消息。It is understandable that step S401 above can specifically be, in response to the fourth message, the terminal device sends the first message to the network device.
其中,第四消息用于获取终端设备的资源信息,第四消息包括网络设备中执行推理任务的第一网元的标识信息和第一模型的识别信息。第一网元的标识信息和第一模型的识别信息的相关说明可参见上文描述,在此不再赘述。The fourth message is used to obtain resource information of the terminal device. This fourth message includes the identification information of the first network element performing the inference task and the identification information of the first model within the network device. For details regarding the identification information of the first network element and the identification information of the first model, please refer to the description above; they will not be repeated here.
终端设备向网络设备发送第一消息时,该网络设备可以是该第一网元。When a terminal device sends a first message to a network device, the network device can be the first network element.
本申请实施例中,网络设备向终端设备发送第四消息,可使得终端设备将自身资源信息发送给网络设备,网络设备基于终端设备的资源信息和第一模型生成第一指示信息或第二指示信息,明确后续由网络设备还是网络设备和终端设备执行第一模型的推理任务,实现将第一模型全部或部分部署在网络设备中,使得终端设备不必再从服务设备获得推理结果,可降低传输时延,提高服务效率。In this embodiment, the network device sends a fourth message to the terminal device, which enables the terminal device to send its own resource information to the network device. Based on the terminal device's resource information and the first model, the network device generates a first instruction message or a second instruction message to clarify whether the network device or the network device and the terminal device will subsequently perform the inference task of the first model. This allows the first model to be deployed entirely or partially in the network device, so that the terminal device no longer needs to obtain the inference result from the service device, thereby reducing transmission latency and improving service efficiency.
在一种可能的实施例中,本申请实施例在执行上述步骤S404之前,还可以执行以下步骤:In one possible embodiment, before performing step S404 described above, the present application embodiment may further perform the following steps:
S405:网络设备基于第一模型的识别信息确定在网络设备中是否已存储所述第一模型;在网络设备中未存储第一模型的情况下,存储第一模型;在网络设备中已存储第一模型的情况下,调用第一模型;S405: The network device determines whether the first model has been stored in the network device based on the identification information of the first model; if the first model is not stored in the network device, the first model is stored; if the first model is stored in the network device, the first model is invoked.
网络设备确定网络设备中执行第一模型的推理任务的第一网元。The network device determines the first network element in the network device that performs the inference task of the first model.
该步骤与上述图3所示实施例中的步骤S304一致,此处不再赘述。This step is the same as step S304 in the embodiment shown in Figure 3 above, and will not be repeated here.
在一种可能的实施例中,本申请实施例在执行上述步骤S405之前,还可以执行以下步骤:In one possible embodiment, before performing step S405 as described above, the following steps may also be performed:
S406:服务设备向网络设备发送第三消息,相应的,网络设备接收第三消息。S406: The service device sends a third message to the network device, and the network device receives the third message accordingly.
其中,第三消息包括第一模型和第一模型的识别信息。The third message includes the first model and its identification information.
可以理解的是,本申请实施例中由网络设备触发第一模型的转移部署。因此,服务设备可将在终端注册之后,通过发送第三消息,将第一模型和第一模型的识别信息下发至网络设备中。网络设备接收第三消息后,可执行上述步骤S405和S404,将第一模型存储在网络设备中并从终端设备获取终端设备的资源信息,以便于后续生成第一指示信息或第二指示信息,实现将第一模型完全或部分部署在网络设备中。It is understood that in this embodiment, the transfer and deployment of the first model is triggered by the network device. Therefore, after the terminal registers, the service device can send a third message to the network device, distributing the first model and its identification information. After receiving the third message, the network device can execute the above steps S405 and S404, storing the first model in the network device and obtaining the terminal device's resource information to facilitate the subsequent generation of first or second indication information, thereby deploying the first model completely or partially in the network device.
通过本申请实施例,服务设备主动将第一模型及识别信息发送至网络设备,网络设备可存储或调用第一模型,获取终端设备的资源信息,生成第一指示信息或第二指示信息,实现将第一模型部分或完全部署至网络设备中,降低后续获得推理结果的传输时延。当第一模型部署在网络设备和终端设备中时,还可以利用终端设备的资源,提高终端设备的资源利用率,进一步降低传输时延。In this embodiment, the service device actively sends the first model and identification information to the network device. The network device can store or retrieve the first model, obtain resource information from the terminal device, and generate first or second indication information. This allows the first model to be partially or completely deployed in the network device, reducing the transmission latency for obtaining subsequent inference results. When the first model is deployed in both the network device and the terminal device, the resources of the terminal device can also be utilized, improving the resource utilization rate of the terminal device and further reducing transmission latency.
可选地,本申请实施例在执行上述步骤S403之后,还可以执行以下步骤:Optionally, after performing step S403 above, the embodiments of this application may also perform the following steps:
S407:终端设备向第一模型所在的服务设备发送第五消息,相应的,服务设备接收第五消息。S407: The terminal device sends a fifth message to the service device where the first model is located, and the service device receives the fifth message accordingly.
其中,第五消息包括上述第一指示信息或所述第二指示信息。The fifth message includes either the first instruction information or the second instruction information mentioned above.
可以理解的是,本申请实施例中,当上述第二信息包括第一指示信息时,该第五信息包括第一指示信息。当上述第二信息包括第二指示信息时,该第五信息包括第二指示信息。It is understood that, in the embodiments of this application, when the above second message includes the first indication information, the fifth message includes the first indication information; when the above second message includes the second indication information, the fifth message includes the second indication information.
第五信息还可以包括第一网元的标识信息。该标识信息的相关说明参见上文描述。The fifth message may also include the identification information of the first network element. See the description above for details regarding this identification information.
通过本申请实施例,终端设备接收来自网络设备的第二信息后,向服务设备发送第五消息,可以向服务设备反馈第一模型已转移部署在网络设备中,并反馈后续执行推理任务的主体和实际执行推理任务的第一网元,以便于服务设备获知相关信息,后续不必再执行第一模型的推理任务,从而保证不经过服务设备也可获得推理结果,降低传输时延。Through the embodiments of this application, after receiving the second message from the network device, the terminal device sends the fifth message to the service device. This feeds back to the service device that the first model has been transferred and deployed in the network device, as well as the entity that will subsequently execute the inference task and the first network element that actually executes it. The service device thus obtains the relevant information and no longer needs to execute the inference task of the first model, ensuring that the inference result can be obtained without going through the service device and reducing transmission latency.
请参阅图5,图5为本申请实施例提供的又一种通信方法的流程示意图。可以理解的是,本申请实施例中的步骤可以视为上述图2中的实施例的合理变形或补充;或者,可以理解的是,本申请实施例中的通信方法也可以视为能单独执行的实施例,本申请对此不作限制。在该通信方法中,由服务设备触发第一模型的转移部署。Please refer to Figure 5, which is a flowchart illustrating another communication method provided in an embodiment of this application. It is understood that the steps in the embodiments of this application can be considered reasonable variations or supplements to the embodiments in Figure 2 above; or, it is understood that the communication method in the embodiments of this application can also be considered an embodiment that can be executed independently, and this application does not limit it in this regard. In this communication method, the transfer deployment of the first model is triggered by the service device.
该通信方法包括但不限于如下步骤:This communication method includes, but is not limited to, the following steps:
S501:终端设备向网络设备发送第一消息,相应的,网络设备接收该第一消息。S501: The terminal device sends a first message to the network device, and the network device receives the first message accordingly.
S502:网络设备基于第一模型和终端设备的资源信息,生成第二消息。S502: The network device generates a second message based on the first model and the resource information of the terminal device.
S503:网络设备向终端设备发送第二消息,相应的,终端设备接收该第二消息。S503: The network device sends a second message to the terminal device, and the terminal device receives the second message accordingly.
上述步骤S501至S503与上述图2所示的实施例中的步骤S201至S203一致,此处不再赘述。The steps S501 to S503 described above are the same as steps S201 to S203 in the embodiment shown in Figure 2 above, and will not be repeated here.
在一种可能的实施例中,本申请实施例在执行上述步骤S501之前,还可以执行以下步骤:In one possible embodiment, before performing step S501 described above, the following steps may also be performed:
S504:服务设备向终端设备发送第四消息,相应的,终端设备接收第四消息。S504: The service device sends a fourth message to the terminal device, and the terminal device receives the fourth message accordingly.
可以理解的是,上述步骤S501具体可以是,响应于第四消息,终端设备向网络设备发送第一消息。It is understandable that step S501 above can specifically be, in response to the fourth message, the terminal device sends the first message to the network device.
其中,第四消息用于获取终端设备的资源信息,第四消息包括网络设备中执行推理任务的第一网元的标识信息和第一模型的识别信息。第一网元的标识信息和第一模型的识别信息的相关说明可参见上文描述,在此不再赘述。The fourth message is used to obtain resource information of the terminal device. This fourth message includes the identification information of the first network element performing the inference task and the identification information of the first model within the network device. For details regarding the identification information of the first network element and the identification information of the first model, please refer to the description above; they will not be repeated here.
终端设备向网络设备发送第一消息时,该网络设备可以是上述第一网元。When a terminal device sends a first message to a network device, the network device can be the aforementioned first network element.
本申请实施例中,在由服务设备触发第一模型的转移部署的场景下,服务设备可主动向终端设备发送上述第四消息,使得终端设备能将其资源信息发送给网络设备,以便于网络设备生成第一指示信息或第二指示信息。进而可明确后续由网络设备还是由网络设备和终端设备执行第一模型的推理任务,实现将第一模型全部或部分部署在网络设备中,使得终端设备不必再从服务设备获得推理结果,可降低传输时延,提高服务效率。In this embodiment, in a scenario where the transfer deployment of the first model is triggered by a service device, the service device can proactively send the aforementioned fourth message to the terminal device, enabling the terminal device to send its resource information to the network device. This allows the network device to generate first or second instruction information. This clarifies whether the subsequent inference task of the first model will be performed by the network device or by both the network device and the terminal device, allowing the first model to be deployed entirely or partially on the network device. This eliminates the need for the terminal device to obtain inference results from the service device, reducing transmission latency and improving service efficiency.
在一种可能的实施例中,本申请实施例在执行上述步骤S504之前,还可以执行以下步骤:In one possible embodiment, before performing step S504 as described above, the following steps may also be performed:
S505:网络设备向第一模型所在的服务设备发送第八消息,相应的,服务设备接收第八消息。S505: The network device sends the eighth message to the service device where the first model is located, and the service device receives the eighth message accordingly.
其中,第八消息包括网络设备中执行推理任务的第一网元的标识信息。The eighth message includes the identification information of the first network element in the network device that performs the inference task.
本申请实施例中,该第一网元是由网络设备事先确定的,网络设备向服务设备反馈该第一网元的标识信息,可使得服务设备能获知网络设备中处理第一模型的部署和推理事项的网元,以便于后续服务设备能不必再执行第一模型的推理任务,降低推理传输时延。In this embodiment, the first network element is determined in advance by the network device. The network device feeds back the identification information of the first network element to the service device, so that the service device can know the network element in the network device that handles the deployment and inference of the first model. This allows the service device to avoid having to perform the inference task of the first model again, thereby reducing the inference transmission latency.
在一种可能的实施例中,本申请实施例在执行上述步骤S505之前,还可以执行以下步骤:In one possible embodiment, before performing step S505 as described above, the following steps may also be performed:
S506:网络设备基于第一模型的识别信息确定在网络设备中是否已存储第一模型;在网络设备中未存储第一模型的情况下,存储第一模型;在网络设备中已存储第一模型的情况下,调用第一模型;S506: The network device determines whether the first model has been stored in the network device based on the identification information of the first model; if the first model has not been stored in the network device, the first model is stored; if the first model has been stored in the network device, the first model is invoked.
网络设备确定网络设备中执行第一模型的推理任务的第一网元。The network device determines the first network element in the network device that performs the inference task of the first model.
该步骤与上述图3所示实施例中的步骤S304一致,此处不再赘述。This step is the same as step S304 in the embodiment shown in Figure 3 above, and will not be repeated here.
在一种可能的实施例中,本申请实施例在执行上述步骤S506之前,还可以执行以下步骤:In one possible embodiment, before performing step S506 described above, the following steps may also be performed:
S507:服务设备向网络设备发送第三消息,相应的,网络设备接收第三消息。S507: The service device sends a third message to the network device, and the network device receives the third message accordingly.
其中,第三消息包括第一模型和第一模型的识别信息。The third message includes the first model and its identification information.
第一模型的识别信息的相关说明参见上文描述,在此不再赘述。The relevant descriptions of the recognition information for the first model are as described above and will not be repeated here.
可以理解的是,本申请实施例中由服务设备触发部署第一模型至网络设备中。因此,服务设备可将在网络注册之后,通过发送第三消息,将第一模型和第一模型的识别信息下发至网络设备中。网络设备接收第三消息后,可执行上述步骤S506,对第一模型进行存储或调用,并分配负责处理第一模型的推理任务的第一网元,以便后续可以将该第一网元的识别信息发给终端设备,从终端设备中获得资源信息,生成第一指示信息或第二指示信息。It is understood that in this embodiment, the deployment of the first model to the network device is triggered by the service device. Therefore, after registering with the network, the service device can send the first model and its identification information to the network device by sending a third message. After receiving the third message, the network device can execute the above step S506 to store or retrieve the first model and allocate a first network element responsible for processing the inference task of the first model, so that the identification information of the first network element can be sent to the terminal device, resource information can be obtained from the terminal device, and first or second indication information can be generated.
通过本申请实施例,服务设备主动将第一模型及其识别信息发送至网络设备,网络设备可存储或调用第一模型,并分配第一网元作为服务网元,生成第一指示信息或第二指示信息,以便能实现将第一模型部分或完全部署至网络设备中,降低后续获得推理结果过程的传输时延。当第一模型部署在终端设备和网络设备中时,还可以利用网络设备的资源,提高网络设备的资源利用率,进一步降低传输时延。In this embodiment, the service device actively sends the first model and its identification information to the network device. The network device can store or retrieve the first model, allocate a first network element as a service network element, and generate first or second indication information. This enables the first model to be partially or completely deployed in the network device, reducing the transmission latency in the subsequent process of obtaining inference results. When the first model is deployed in both the terminal device and the network device, the resources of the network device can also be utilized, improving the resource utilization rate of the network device and further reducing transmission latency.
可选地,本申请实施例在执行上述步骤S503之后,还可以执行以下步骤:Optionally, after performing step S503 above, the embodiments of this application may further perform the following steps:
S508:终端设备向第一模型所在的服务设备发送第五消息,相应的,服务设备接收第五消息。S508: The terminal device sends a fifth message to the service device where the first model is located, and the service device receives the fifth message accordingly.
其中,第五消息包括上述第一指示信息或所述第二指示信息。The fifth message includes either the first instruction information or the second instruction information mentioned above.
可以理解的是,本申请实施例中,当上述第二信息包括第一指示信息时,该第五信息包括第一指示信息。当上述第二信息包括第二指示信息时,该第五信息包括第二指示信息。It is understood that, in the embodiments of this application, when the above second message includes the first indication information, the fifth message includes the first indication information; when the above second message includes the second indication information, the fifth message includes the second indication information.
通过本申请实施例,终端设备接收来自网络设备的第二信息后,向服务设备发送第五消息,可以向服务设备反馈第一模型已转移部署在网络设备中,并反馈后续执行推理任务的主体,以便于服务设备获知相关信息,后续不必再执行第一模型的推理任务,从而保证不经过服务设备也可获得推理结果,降低传输时延。Through the embodiments of this application, after receiving the second message from the network device, the terminal device sends the fifth message to the service device. This feeds back to the service device that the first model has been transferred and deployed in the network device, as well as the entity that will subsequently execute the inference task. The service device thus obtains the relevant information and no longer needs to execute the inference task of the first model, ensuring that the inference result can be obtained without going through the service device and reducing transmission latency.
本申请实施例还提供的一种通信方法。可以理解的是,本申请实施例中的步骤可以在上述图2至图5任一图示的实施例之后执行,或者,本申请实施例中的通信方法也可以视为能单独执行的实施例,本申请对此不作限制。该通信方法示出了第一模型转移部署后终端设备获得第一模型的推理结果的过程。This application also provides a communication method. It is understood that the steps in this application's embodiments can be executed after any of the embodiments illustrated in Figures 2 to 5, or the communication method in this application's embodiments can also be considered as an embodiment that can be executed independently; this application does not impose any limitations on this. This communication method illustrates the process by which a terminal device obtains the inference results of the first model after the first model is transferred and deployed.
该通信方法包括但不限于如下步骤:This communication method includes, but is not limited to, the following steps:
S601:在第二消息包括第一指示信息的情况下,终端设备向网络设备发送第六消息,相应的,网络设备接收第六消息。S601: If the second message includes the first instruction information, the terminal device sends a sixth message to the network device, and the network device receives the sixth message accordingly.
S602:网络设备基于第一数据独立执行第一模型的推理任务,获得第一推理结果。S602: The network device independently executes the inference task of the first model based on the first data and obtains the first inference result.
S603:网络设备向终端设备发送第一推理结果,相应的,终端设备接收第一推理结果。S603: The network device sends the first inference result to the terminal device, and the terminal device receives the first inference result accordingly.
其中,第六消息用于请求网络设备独立执行推理任务,第六消息包括执行推理任务需要输入的第一数据。第一推理结果是基于第一数据得到的。该通信方法中提到的网络设备,可以为上述第一网元。The sixth message requests the network device to independently execute the inference task. This sixth message includes the first data required to execute the inference task. The first inference result is obtained based on the first data. The network device mentioned in this communication method can be the aforementioned first network element.
可以理解的是,由于第一指示信息指示由网络设备独立执行第一模型的推理任务,因此终端设备可将第一数据发送给网络设备,网络设备基于第一数据和第一模型进行推理,获得第一推理结果。该过程由网络设备独立完成推理任务,无需将数据发给服务设备和从服务设备获得推理结果,降低了整个推理过程的传输时延。Understandably, since the first instruction information instructs the network device to independently execute the inference task of the first model, the terminal device can send the first data to the network device. The network device then performs inference based on the first data and the first model to obtain the first inference result. This process is completed independently by the network device, without needing to send data to the service device or obtain the inference result from the service device, thus reducing the transmission latency of the entire inference process.
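The following sketch illustrates the network-only flow of steps S601 to S603: the terminal device packages the first data into the sixth message, and the first network element runs the complete first model and returns the first inference result. The message format and the stand-in model are illustrative assumptions, not a defined API.

```python
# Sketch of the S601-S603 exchange when the second message carried the first
# indication information: inference is performed entirely on the network side.

def network_element_handle_sixth_message(first_model, sixth_message):
    first_data = sixth_message["input_data"]        # first data carried in the sixth message
    first_result = first_model(first_data)          # whole first model runs on the network side
    return {"inference_result": first_result}       # first inference result sent back (S603)

toy_model = lambda xs: [x * 2 for x in xs]          # stand-in for the deployed first model
reply = network_element_handle_sixth_message(toy_model, {"input_data": [1, 2, 3]})
print(reply)
```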
在一种可能的实施例中,本申请实施例还可以执行以下步骤:In one possible embodiment, the present application embodiment may further perform the following steps:
S604:在第二消息包括第二指示信息的情况下,终端设备向网络设备发送第七消息,相应的,网络设备接收第七消息。S604: If the second message includes the second instruction information, the terminal device sends a seventh message to the network device, and the network device receives the seventh message accordingly.
其中,第七消息用于请求终端设备与网络设备联合执行推理任务,第七消息包括第二数据;第二数据是终端设备基于第一子模型对第一数据进行推理得到的,或者,第二数据是对第一数据进行预处理得到的,第一数据是执行推理任务需要输入的数据。The seventh message is used to request the terminal device and the network device to jointly perform the inference task. The seventh message includes the second data. The second data is obtained by the terminal device inferring the first data based on the first sub-model, or the second data is obtained by preprocessing the first data. The first data is the data required to perform the inference task.
可以理解的是,上述步骤S601~S604可以在步骤S604之前执行,也可以在步骤S604之后执行,或者,该通信方法中多次执行步骤S601~S604,或者,该通信方法中多次执行步骤S604,等等,本申请对此不作限制。It is understood that the above steps S601 to S603 may be executed before step S604 or after step S604; alternatively, steps S601 to S603 may be executed multiple times in this communication method, or step S604 may be executed multiple times in this communication method, and so on. This application does not limit this.
在一种可能的实施例中,本申请实施例在执行上述步骤S604之后,还可以执行以下步骤:In one possible embodiment, after performing step S604 as described above, the present application embodiment may further perform the following steps:
S605:在第二指示信息指示网络设备基于第二子模型得到的推理结果为中间层特征的情况下,终端设备基于第一子模型对第二数据进行推理,得到第三数据。S605: When the second indication information indicates that the inference result obtained by the network device based on the second sub-model is an intermediate-layer feature, the terminal device performs inference on the second data based on the first sub-model to obtain third data.
S606:网络设备基于所述第二子模型对所述第二数据进行推理得到第四数据。S606: The network device infers the fourth data from the second data based on the second sub-model.
S607:网络设备向所述终端设备发送所述第四数据。相应的,终端设备接收来自网络设备的第四数据。S607: The network device sends the fourth data to the terminal device. Correspondingly, the terminal device receives the fourth data from the network device.
S608:终端设备基于第三数据和第四数据得到第二推理结果。S608: The terminal device obtains the second inference result based on the third and fourth data.
其中,第四数据是网络设备基于第二子模型对第二数据进行推理得到的。The fourth data is obtained by the network device through reasoning on the second data based on the second sub-model.
可以理解的是,第二指示信息指示网络设备基于第二子模型得到的推理结果为中间层特征,那么网络设备基于第二子模型得到的推理结果不是最终推理结果,需要进一步被终端设备处理,此时说明第一模型的部署方式为并联部署方式。该并联部署方式是指,第一模型中的第一子模型和第二子模型分别部署在终端设备和网络设备中,且在终端设备和网络设备各自进行部分推理后终端设备获得最终推理结果。It is understandable that if the second indication information indicates that the inference result obtained by the network device based on the second sub-model is an intermediate layer feature, then the inference result obtained by the network device based on the second sub-model is not the final inference result and needs to be further processed by the terminal device. This indicates that the deployment method of the first model is a parallel deployment method. This parallel deployment method means that the first sub-model and the second sub-model in the first model are deployed in the terminal device and the network device respectively, and the terminal device obtains the final inference result after the terminal device and the network device each perform part of the inference.
此时,第二数据是对第一数据进行预处理得到的。第三数据和第四数据属于中间层特征。At this point, the second data is obtained by preprocessing the first data. The third and fourth data belong to the intermediate layer features.
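The parallel deployment flow of steps S604 to S608 can be sketched as follows: the terminal preprocesses the first data into the second data, both sides run their own sub-model on the second data to produce intermediate-layer features (the third and fourth data), and the terminal fuses the two feature sets into the second inference result. The preprocessing, the toy sub-models, and the additive fusion rule are illustrative assumptions.

```python
# Sketch of the parallel deployment (S604-S608).

def preprocess(first_data):                  # second data = preprocessed first data
    return [x / 10.0 for x in first_data]

def terminal_submodel(second_data):          # first sub-model on the terminal -> third data
    return [x + 1.0 for x in second_data]

def network_submodel(second_data):           # second sub-model on the network side -> fourth data
    return [x * 3.0 for x in second_data]

def fuse(third_data, fourth_data):           # terminal combines the two intermediate-layer features
    return [a + b for a, b in zip(third_data, fourth_data)]

first_data = [10, 20, 30]
second_data = preprocess(first_data)         # carried in the seventh message (S604)
third_data = terminal_submodel(second_data)  # computed locally on the terminal (S605)
fourth_data = network_submodel(second_data)  # computed on the network side and returned (S606/S607)
second_inference_result = fuse(third_data, fourth_data)   # final result on the terminal (S608)
print(second_inference_result)
```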
本申请实施例中,终端设备和网络设备基于第二指示信息,确定联合执行推理任务的具体流程,终端设备和网络设备分别利用第一子模型和第二子模型进行部分推理,获得最终推理结果。这样无需将数据发给服务设备和从服务设备获得推理结果,降低了整个推理过程的传输时延。由于终端设备参与推理,可利用终端设备的资源,提高了资源利用率。终端设备和网络设备各自进行独立推理,可降低推理所需时间,提高推理效率。In this embodiment, the terminal device and the network device determine the specific process for jointly executing the inference task based on the second instruction information. The terminal device and the network device respectively use the first sub-model and the second sub-model to perform partial inference and obtain the final inference result. This eliminates the need to send data to the service device and obtain the inference result from the service device, reducing the transmission latency of the entire inference process. Since the terminal device participates in the inference, its resources can be utilized, improving resource utilization. The independent inference performed by the terminal device and the network device reduces the inference time required and improves inference efficiency.
可选地,本申请实施例在执行上述步骤S604之后,还可以执行以下步骤:Optionally, after performing step S604 above, the embodiments of this application may further perform the following steps:
S609:在第二指示信息指示网络设备基于第二子模型得到的推理结果为最终推理结果的情况下,终端设备向网络设备发送第二数据,相应的,网络设备接收第二数据。S609: When the second instruction information indicates that the inference result obtained by the network device based on the second sub-model is the final inference result, the terminal device sends the second data to the network device, and the network device receives the second data accordingly.
S610:网络设备基于第二子模型对第二数据进行推理,获得第三推理结果。S610: The network device performs inference on the second data based on the second sub-model to obtain a third inference result.
S611:网络设备向终端设备发送第三推理结果,相应的,终端设备接收第三推理结果。S611: The network device sends the third inference result to the terminal device, and the terminal device receives the third inference result accordingly.
可以理解的是,第二指示信息指示网络设备基于第二子模型得到的推理结果为最终推理结果,说明第一模型的部署方式为串联部署方式。该串联部署方式是指,第一模型中的第一子模型和第二子模型分别部署在终端设备和网络设备中,且在终端设备进行部分推理后将推理结果发送给网络设备,网络设备再进行部分推理后获得最终推理结果。It is understandable that the second instruction information indicates that the inference result obtained by the network device based on the second sub-model is the final inference result, indicating that the deployment method of the first model is a serial deployment method. This serial deployment method means that the first sub-model and the second sub-model in the first model are deployed in the terminal device and the network device respectively, and the terminal device performs partial inference and sends the inference result to the network device, and the network device performs partial inference to obtain the final inference result.
此时,第二数据是对第一数据进行推理得到的。第二数据属于中间层特征。At this point, the second data is obtained by reasoning from the first data. The second data belongs to the intermediate layer features.
可选的,该第二数据还可以是先对第一数据进行预处理,再基于第一子模型进行推理得到的。Optionally, the second data can also be obtained by preprocessing the first data and then inferring based on the first sub-model.
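The serial deployment flow of steps S609 to S611 can be sketched as follows: the terminal runs the first sub-model to obtain the second data (intermediate-layer features), sends it to the first network element, and the network side runs the second sub-model to produce the final, third inference result. The toy sub-models below are illustrative assumptions.

```python
# Sketch of the serial deployment (S609-S611).

def terminal_shallow_part(first_data):        # first sub-model on the terminal
    return [x * 0.5 for x in first_data]      # second data (intermediate-layer features)

def network_deep_part(second_data):           # second sub-model on the network side
    return sum(second_data)                   # third (final) inference result

first_data = [4, 8, 12]
second_data = terminal_shallow_part(first_data)          # sent to the first network element (S609)
third_inference_result = network_deep_part(second_data)  # computed on the network side (S610)
print(third_inference_result)                             # returned to the terminal (S611)
```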
本申请实施例中,终端设备和网络设备基于第二指示信息,确定联合执行推理任务的具体流程,终端设备进行部分推理后,将推理结果发送给网络设备进行部分推理,获得最终推理结果。这样无需将数据发给服务设备和从服务设备获得推理结果,降低了整个推理过程的传输时延。由于终端设备参与推理,可利用终端设备的资源,提高了资源利用率。In this embodiment, the terminal device and the network device determine the specific process for jointly executing the inference task based on the second instruction information. After performing partial inference, the terminal device sends the inference result to the network device for further partial inference to obtain the final inference result. This eliminates the need to send data to the service device and obtain the inference result from the service device, reducing the transmission latency of the entire inference process. Since the terminal device participates in inference, its resources can be utilized, improving resource utilization.
参见图6a、图6b和图6c,图6a示出了第一模型转移部署后终端设备获得推理结果的一流程示意图,图6b示出了第一模型转移部署后终端设备获得推理结果的另一流程示意图,图6c示出了第一模型转移部署后终端设备获得推理结果的又一流程示意图。图6a、图6b、图6c分别对应上述网络设备独立执行第一模型的推理任务、终端设备和网络设备联合执行第一模型的推理任务且第一模型的部署方式为并联部署方式、终端设备和网络设备联合执行第一模型的推理任务且第一模型的部署方式为串联部署方式的推理过程。Referring to Figures 6a, 6b, and 6c, Figure 6a shows a flowchart illustrating the process by which the terminal device obtains the inference result after the first model is transferred and deployed; Figure 6b shows another flowchart illustrating the process by which the terminal device obtains the inference result after the first model is transferred and deployed; and Figure 6c shows yet another flowchart illustrating the process by which the terminal device obtains the inference result after the first model is transferred and deployed. Figures 6a, 6b, and 6c correspond to the inference processes of the network device independently executing the inference task of the first model, the terminal device and network device jointly executing the inference task of the first model in a parallel deployment mode, and the terminal device and network device jointly executing the inference task of the first model in a serial deployment mode, respectively.
上述详细阐述了本申请实施例的方法,下面提供用于实现本申请实施例中任一种方法的装置,例如,提供一种装置包括用以实现以上任一种方法中设备所执行的各步骤的单元(或手段)。The methods of the embodiments of this application have been described in detail above. The following provides an apparatus for implementing any one of the methods in the embodiments of this application. For example, an apparatus is provided that includes a unit (or means) for implementing the steps performed by the device in any of the above methods.
请参阅图7,图7为本申请实施例提供的一种通信装置的结构示意图。Please refer to Figure 7, which is a schematic diagram of the structure of a communication device provided in an embodiment of this application.
如图7所示,该通信装置70可以包括通信单元701以及处理单元702。通信单元701以及处理单元702可以是软件,也可以是硬件,或者是软件和硬件结合。As shown in Figure 7, the communication device 70 may include a communication unit 701 and a processing unit 702. The communication unit 701 and the processing unit 702 may be software, hardware, or a combination of software and hardware.
其中,通信单元701可以实现发送功能和/或接收功能,通信单元701也可以描述为收发单元。通信单元701还可以是集成了获取单元和发送单元的单元,其中,获取单元用于实现接收功能,发送单元用于实现发送功能。可选的,通信单元701可以用于接收其他装置发送的信息,还可以用于向其他装置发送信息。The communication unit 701 can implement sending and/or receiving functions, and can also be described as a transceiver unit. The communication unit 701 can also be a unit integrating an acquisition unit and a sending unit, wherein the acquisition unit is used to implement the receiving function, and the sending unit is used to implement the sending function. Optionally, the communication unit 701 can be used to receive information sent by other devices, and can also be used to send information to other devices.
在一种可能的设计中,该通信装置70可对应于上述图2至图6c所示的方法实施例中的终端设备,如该通信装置70可以是终端设备,也可以是终端设备中的芯片。该通信装置70可以包括用于执行上述图2至图6c所示的方法实施例中由终端设备所执行的操作的单元,并且,该通信装置70中的各单元分别为了实现上述图2至图6c所示的方法实施例中由终端设备所执行的操作。其中,各个单元的描述如下:In one possible design, the communication device 70 may correspond to the terminal device in the method embodiments shown in Figures 2 to 6c. For example, the communication device 70 may be a terminal device or a chip within the terminal device. The communication device 70 may include units for performing the operations performed by the terminal device in the method embodiments shown in Figures 2 to 6c, and each unit in the communication device 70 is for implementing the operations performed by the terminal device in the method embodiments shown in Figures 2 to 6c. The descriptions of each unit are as follows:
通信单元701,用于向网络设备发送第一消息,所述第一消息用于请求所述网络设备部署第一模型,所述第一消息包括所述第一模型的识别信息和所述终端设备的资源信息;The communication unit 701 is used to send a first message to the network device. The first message is used to request the network device to deploy a first model. The first message includes the identification information of the first model and the resource information of the terminal device.
所述通信单元701，还用于接收来自所述网络设备的第二消息，所述第二消息包括第一指示信息或第二指示信息，所述第一指示信息指示由所述网络设备独立执行所述第一模型的推理任务，所述第二指示信息指示由所述终端设备和所述网络设备联合执行所述推理任务，以及指示所述终端设备与所述网络设备分别部署所述第一模型中的第一子模型和第二子模型。The communication unit 701 is further configured to receive a second message from the network device, where the second message includes first indication information or second indication information, the first indication information indicates that the network device independently executes the inference task of the first model, and the second indication information indicates that the terminal device and the network device jointly execute the inference task and that the terminal device and the network device respectively deploy the first sub-model and the second sub-model of the first model (an illustrative sketch of possible formats of these two messages is given below, after this design).
在一种可能的实施方式中,该装置还包括:In one possible implementation, the device further includes:
处理单元702,用于生成所述第一消息。Processing unit 702 is used to generate the first message.
关于本设计所述的通信单元701和处理单元702，其执行的步骤可参考对应于上述图2至图6c所示的方法实施例中的终端设备对应的实施方式。For the steps performed by the communication unit 701 and the processing unit 702 in this design, refer to the implementations corresponding to the terminal device in the method embodiments shown in Figures 2 to 6c above.
关于本设计所述的通信单元701和处理单元702所执行的实施方式所带来的技术效果，可参考对应于上述图2至图6c所示的方法实施例的技术效果的介绍。For the technical effects brought by the implementations performed by the communication unit 701 and the processing unit 702 in this design, refer to the description of the technical effects of the method embodiments shown in Figures 2 to 6c above.
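To make the first message and the second message handled by the communication unit 701 more concrete, the following is a minimal sketch of possible payload structures covering the fields described above (model identification information, terminal resource information, and the two kinds of indication information). The field names, types, and enum values are illustrative assumptions only; this application does not prescribe a particular encoding.

```python
# Minimal sketch of possible payloads for the first and second messages.
# All field names, types, and enum values are illustrative assumptions.
from dataclasses import dataclass
from enum import Enum
from typing import Optional

@dataclass
class ResourceInfo:                  # resource information of the terminal device
    compute_tops: float              # available computing capability
    free_memory_mb: int              # available storage capability
    battery_percent: int             # remaining energy budget

@dataclass
class FirstMessage:                  # requests deployment of the first model
    model_id: str                    # identification information of the first model
    resources: ResourceInfo

class Indication(Enum):
    FIRST = 1    # first indication information: network device infers alone
    SECOND = 2   # second indication information: joint execution of the task

@dataclass
class SecondMessage:
    indication: Indication
    # Present only in the joint case: which sub-model each side deploys.
    first_sub_model_id: Optional[str] = None     # deployed on the terminal device
    second_sub_model_id: Optional[str] = None    # deployed on the network device
```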
在另一种可能的设计中,该通信装置70可对应于上述图2至图6c所示的方法实施例中的网络设备,如该通信装置70可以是网络设备,也可以是网络设备中的芯片。该通信装置70可以包括用于执行上述图2至图6c所示的方法实施例中由网络设备所执行的操作的单元,并且,该通信装置70中的各单元分别为了实现上述图2至图6c所示的方法实施例中由网络设备所执行的操作。其中,各个单元的描述如下:In another possible design, the communication device 70 may correspond to the network device in the method embodiments shown in Figures 2 to 6c. For example, the communication device 70 may be a network device or a chip within a network device. The communication device 70 may include units for performing the operations performed by the network device in the method embodiments shown in Figures 2 to 6c, and each unit in the communication device 70 is for implementing the operations performed by the network device in the method embodiments shown in Figures 2 to 6c. The descriptions of each unit are as follows:
通信单元701,用于接收来自终端设备的第一消息,所述第一消息用于请求所述网络设备部署第一模型,所述第一消息包括所述第一模型的识别信息和所述终端设备的资源信息;The communication unit 701 is configured to receive a first message from a terminal device, the first message being a request for the network device to deploy a first model, the first message including identification information of the first model and resource information of the terminal device;
所述通信单元701,还用于向所述终端设备发送所述第二消息;The communication unit 701 is also used to send the second message to the terminal device;
处理单元702，用于基于所述第一模型和所述终端设备的资源信息，生成第二消息，所述第二消息包括所述第一指示信息或所述第二指示信息，所述第一指示信息指示由所述网络设备独立执行所述第一模型的推理任务，所述第二指示信息指示由所述终端设备和所述网络设备联合执行所述推理任务，以及指示所述终端设备和所述网络设备分别部署所述第一模型中的第一子模型和第二子模型。The processing unit 702 is configured to generate a second message based on the first model and the resource information of the terminal device, where the second message includes the first indication information or the second indication information, the first indication information indicates that the network device independently executes the inference task of the first model, and the second indication information indicates that the terminal device and the network device jointly execute the inference task and that the terminal device and the network device respectively deploy the first sub-model and the second sub-model of the first model.
关于本设计所述的通信单元701和处理单元702，其执行的步骤可参考对应于上述图2至图6c所示的方法实施例中的网络设备对应的实施方式。For the steps performed by the communication unit 701 and the processing unit 702 in this design, refer to the implementations corresponding to the network device in the method embodiments shown in Figures 2 to 6c above.
关于本设计所述的通信单元701和处理单元702所执行的实施方式所带来的技术效果，可参考对应于上述图2至图6c所示的方法实施例的技术效果的介绍。For the technical effects brought by the implementations performed by the communication unit 701 and the processing unit 702 in this design, refer to the description of the technical effects of the method embodiments shown in Figures 2 to 6c above.
根据本申请实施例，图7所示的装置中的各个单元可以分别或全部合并为一个或若干个另外的单元来构成，或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成，这可以实现同样的操作，而不影响本申请的实施例的技术效果的实现。上述单元是基于逻辑功能划分的，在实际应用中，一个单元的功能也可以由多个单元来实现，或者多个单元的功能由一个单元实现。在本申请的其它实施例中，基于电子设备也可以包括其它单元，在实际应用中，这些功能也可以由其它单元协助实现，并且可以由多个单元协作实现。According to the embodiments of this application, the units in the apparatus shown in Figure 7 may be separately or all combined into one or several other units, or one or more of the units may be further split into multiple functionally smaller units. This can achieve the same operations without affecting the technical effects of the embodiments of this application. The above units are divided based on logical functions; in practical applications, the function of one unit may also be implemented by multiple units, or the functions of multiple units may be implemented by one unit. In other embodiments of this application, the electronic device may also include other units; in practical applications, these functions may also be implemented with the assistance of other units, or cooperatively by multiple units.
需要说明的是,各个单元的实现还可以对应参照上述图2至图6c所示的方法实施例的相应描述。It should be noted that the implementation of each unit can also refer to the corresponding description of the method embodiments shown in Figures 2 to 6c above.
在图7所描述的通信装置70中，网络设备可基于第一模型和终端设备的资源信息，生成第一指示信息或第二指示信息，实现将第一模型完全或部分部署在网络设备中，由网络设备独立执行第一模型的推理任务或由终端设备和网络设备联合执行第一模型的推理任务，无需从服务设备获得第一模型的推理结果，可降低传输时延。In the communication device 70 described in Figure 7, the network device can generate the first indication information or the second indication information based on the first model and the resource information of the terminal device, so that the first model is fully or partially deployed on the network device and the inference task of the first model is executed either by the network device independently or by the terminal device and the network device jointly. The inference result of the first model does not need to be obtained from a service device, which can reduce the transmission latency.
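One way to picture how the network device might choose between the first indication information and the second indication information, based on the first model and the resource information of the terminal device, is sketched below. The threshold rule and the notion of a split ratio are hypothetical assumptions made for illustration; this application does not define a specific decision criterion.

```python
# Hypothetical sketch of the decision behind the second message; the rule,
# the thresholds, and the split ratio are assumptions, not the claimed method.
from typing import Optional, Tuple

def generate_second_message(model_memory_mb: int,
                            model_compute_tops: float,
                            terminal_free_memory_mb: int,
                            terminal_compute_tops: float) -> Tuple[str, Optional[float]]:
    """Return ("first_indication", None) when the network device should execute
    the whole inference task itself, or ("second_indication", split_ratio) when
    the task is executed jointly, where split_ratio is the illustrative share of
    the model placed on the terminal device as the first sub-model."""
    # If the terminal cannot host a useful share of the first model, the
    # network device executes the inference task independently.
    if terminal_free_memory_mb < 0.1 * model_memory_mb:
        return "first_indication", None
    # Otherwise place as large a first sub-model on the terminal as its
    # memory and compute allow, keeping the remainder as the second sub-model.
    memory_share = terminal_free_memory_mb / model_memory_mb
    compute_share = terminal_compute_tops / model_compute_tops
    split_ratio = min(memory_share, compute_share, 0.5)
    return "second_indication", split_ratio

print(generate_second_message(8000, 20.0, 1000, 4.0))  # ('second_indication', 0.125)
```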
请参阅图8,图8为本申请实施例提供的一种通信装置的结构示意图。Please refer to Figure 8, which is a schematic diagram of the structure of a communication device provided in an embodiment of this application.
应理解,图8示出的通信装置80仅是示例,本申请实施例的通信装置还可包括其他部件,或者包括与图8中的各个部件的功能相似的部件,或者并非要包括图8中所有部件。It should be understood that the communication device 80 shown in FIG8 is only an example. The communication device in the embodiments of this application may also include other components, or include components with functions similar to the various components in FIG8, or may not include all the components in FIG8.
通信装置80包括通信接口801和至少一个处理器802。The communication device 80 includes a communication interface 801 and at least one processor 802.
该通信装置80可以对应终端设备、网络设备中的任一网元或设备。通信接口801用于收发信号,至少一个处理器802执行程序指令,使得通信装置80实现上述方法实施例中由对应设备所执行的方法的相应流程。The communication device 80 can correspond to any network element or device among terminal devices and network devices. The communication interface 801 is used for sending and receiving signals, and at least one processor 802 executes program instructions, causing the communication device 80 to implement the corresponding process of the method executed by the corresponding device in the above method embodiments.
在一种可能的设计中,该通信装置80可对应于上述图2至图6c所示的方法实施例中的终端设备,如该通信装置80可以是终端设备,也可以是终端设备中的芯片。该通信装置80可以包括用于执行上述方法实施例中由终端设备所执行的操作的部件,并且,该通信装置80中的各部件分别为了实现上述方法实施例中由终端设备所执行的操作。具体可以如下所示:In one possible design, the communication device 80 may correspond to the terminal device in the method embodiments shown in Figures 2 to 6c. For example, the communication device 80 may be a terminal device or a chip within the terminal device. The communication device 80 may include components for performing the operations performed by the terminal device in the above method embodiments, and each component in the communication device 80 is specifically designed to implement the operations performed by the terminal device in the above method embodiments. Specifically, it may be as follows:
向网络设备发送第一消息,所述第一消息用于请求所述网络设备部署第一模型,所述第一消息包括所述第一模型的识别信息和所述终端设备的资源信息;Send a first message to the network device, the first message being used to request the network device to deploy a first model, the first message including the identification information of the first model and the resource information of the terminal device;
接收来自所述网络设备的第二消息，所述第二消息包括第一指示信息或第二指示信息，所述第一指示信息指示由所述网络设备独立执行所述第一模型的推理任务，所述第二指示信息指示由所述终端设备和所述网络设备联合执行所述推理任务，以及指示所述终端设备与所述网络设备分别部署所述第一模型中的第一子模型和第二子模型。Receive a second message from the network device, where the second message includes first indication information or second indication information, the first indication information indicates that the network device independently executes the inference task of the first model, and the second indication information indicates that the terminal device and the network device jointly execute the inference task and that the terminal device and the network device respectively deploy the first sub-model and the second sub-model of the first model.
在另一种可能的设计中,该通信装置80可对应于上述图2至图6c所示的方法实施例中的网络设备,如该通信装置80可以是网络设备,也可以是网络设备中的芯片。该通信装置80可以包括用于执行上述方法实施例中由网络设备所执行的操作的部件,并且,该通信装置80中的各部件分别为了实现上述方法实施例中由网络设备所执行的操作。具体可以如下所示:In another possible design, the communication device 80 may correspond to the network device in the method embodiments shown in Figures 2 to 6c. For example, the communication device 80 may be a network device or a chip within a network device. The communication device 80 may include components for performing the operations performed by the network device in the above method embodiments, and each component in the communication device 80 is specifically designed to implement the operations performed by the network device in the above method embodiments. Specifically, it may be as follows:
接收来自终端设备的第一消息,所述第一消息用于请求所述网络设备部署第一模型,所述第一消息包括所述第一模型的识别信息和所述终端设备的资源信息;Receive a first message from a terminal device, the first message being used to request the network device to deploy a first model, the first message including identification information of the first model and resource information of the terminal device;
基于所述第一模型和所述终端设备的资源信息，生成第二消息，所述第二消息包括所述第一指示信息或所述第二指示信息，所述第一指示信息指示由所述网络设备独立执行所述第一模型的推理任务，所述第二指示信息指示由所述终端设备和所述网络设备联合执行所述推理任务，以及指示所述终端设备和所述网络设备分别部署所述第一模型中的第一子模型和第二子模型；Generate a second message based on the first model and the resource information of the terminal device, where the second message includes the first indication information or the second indication information, the first indication information indicates that the network device independently executes the inference task of the first model, and the second indication information indicates that the terminal device and the network device jointly execute the inference task and that the terminal device and the network device respectively deploy the first sub-model and the second sub-model of the first model;
向所述终端设备发送所述第二消息。Send the second message to the terminal device.
在图8所描述的通信装置80中，网络设备可基于第一模型和终端设备的资源信息，生成第一指示信息或第二指示信息，实现将第一模型完全或部分部署在网络设备中，由网络设备独立执行第一模型的推理任务或由终端设备和网络设备联合执行第一模型的推理任务，无需从服务设备获得第一模型的推理结果，可降低传输时延。In the communication device 80 described in Figure 8, the network device can generate the first indication information or the second indication information based on the first model and the resource information of the terminal device, so that the first model is fully or partially deployed on the network device and the inference task of the first model is executed either by the network device independently or by the terminal device and the network device jointly. The inference result of the first model does not need to be obtained from a service device, which can reduce the transmission latency.
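Seen from the terminal device side, the exchange summarized above can be pictured with the following sketch: it sends the first message, branches on the indication carried in the second message, and then either uploads the raw data (network-only inference) or uploads the intermediate features produced by the first sub-model (joint, serial execution). The transport functions and the sub-model runner below are placeholders invented for this illustration, not real APIs of any library or of this application.

```python
# Illustrative terminal-side flow; send_uplink(), recv_second_message(),
# recv_inference_result(), and run_first_sub_model() are placeholders only.
def send_uplink(message: dict) -> None:
    print("uplink:", message)

def recv_second_message() -> dict:
    return {"indication": "second_indication"}      # example downlink content

def recv_inference_result() -> dict:
    return {"result": "final inference result"}     # example downlink content

def run_first_sub_model(first_data: str) -> dict:
    return {"features": f"intermediate-layer features of {first_data}"}

def terminal_inference(first_data: str) -> dict:
    send_uplink({"type": "first_message",
                 "model_id": "first_model",
                 "resources": {"free_memory_mb": 1000}})
    second_message = recv_second_message()
    if second_message["indication"] == "first_indication":
        # Network device executes the whole inference task of the first model.
        send_uplink({"type": "data", "payload": first_data})
    else:
        # Joint (serial) execution: run the first sub-model locally and send
        # the intermediate features (the second data) to the network device.
        second_data = run_first_sub_model(first_data)
        send_uplink({"type": "data", "payload": second_data})
    return recv_inference_result()

print(terminal_inference("first_data"))
```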
对于通信装置可以是芯片或芯片系统的情况,可参阅图9所示的芯片的结构示意图。For cases where the communication device can be a chip or a chip system, please refer to the schematic diagram of the chip structure shown in Figure 9.
如图9所示,芯片90包括处理器901和接口902。其中,处理器901的数量可以是一个或多个,接口902的数量可以是多个。需要说明的是,处理器901、接口902各自对应的功能既可以通过硬件设计实现,也可以通过软件设计来实现,还可以通过软硬件结合的方式来实现,这里不作限制。As shown in Figure 9, chip 90 includes processor 901 and interface 902. There can be one or more processors 901, and multiple interfaces 902. It should be noted that the functions of processor 901 and interface 902 can be implemented through hardware design, software design, or a combination of both; no restrictions are placed here.
可选的,芯片90还可以包括存储器903,存储器903用于存储必要的程序指令和数据。Optionally, chip 90 may also include memory 903, which is used to store necessary program instructions and data.
本申请中，处理器901可用于从存储器903中调用本申请的一个或多个实施例提供的通信方法在终端设备、网络设备中一个或多个设备或网元的实现程序，并执行该程序包含的指令。接口902可用于输出处理器901的执行结果。本申请中，接口902可具体用于输出处理器901的各个消息或信息。In this application, the processor 901 may be configured to invoke, from the memory 903, a program that implements the communication method provided in one or more embodiments of this application on one or more devices or network elements among the terminal device and the network device, and to execute the instructions contained in that program. The interface 902 may be used to output an execution result of the processor 901. In this application, the interface 902 may specifically be used to output messages or information produced by the processor 901.
关于本申请的一个或多个实施例提供的通信方法可参考前述图2至图6c所示各个实施例,这里不再赘述。The communication methods provided by one or more embodiments of this application can be referred to the various embodiments shown in Figures 2 to 6c above, and will not be repeated here.
本申请实施例中的处理器可以是中央处理单元(Central Processing Unit,CPU),该处理器还可以是其他通用处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。The processor in this embodiment can be a Central Processing Unit (CPU), but it can also be other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. The general-purpose processor can be a microprocessor or any conventional processor.
本申请实施例中的存储器用于提供存储空间,存储空间中可以存储操作系统和计算机程序等数据。存储器包括但不限于是随机存储记忆体(random access memory,RAM)、只读存储器(read-only memory,ROM)、可擦除可编程只读存储器(erasable programmable read only memory,EPROM)、或便携式只读存储器(compact disc read-only memory,CD-ROM)。The memory in this application embodiment is used to provide storage space, in which data such as the operating system and computer programs can be stored. The memory includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), or compact disc read-only memory (CD-ROM).
根据本申请实施例提供的方法,本申请实施例还提供一种计算机可读存储介质,上述计算机可读存储介质中存储有计算机程序,当上述计算机程序在一个或多个处理器上运行时,可以实现上述图2至图6c所示的方法。According to the method provided in the embodiments of this application, the embodiments of this application also provide a computer-readable storage medium storing a computer program. When the computer program is run on one or more processors, it can implement the method shown in Figures 2 to 6c.
根据本申请实施例提供的方法,本申请实施例还提供一种计算机程序产品,上述计算机程序产品包括计算机程序,当上述计算机程序在处理器上运行时,可以实现上述图2至图6c所示的方法。According to the method provided in the embodiments of this application, the embodiments of this application also provide a computer program product, which includes a computer program. When the computer program runs on a processor, it can implement the methods shown in Figures 2 to 6c.
本申请实施例还提供了一种系统,该系统包括至少一个如上述通信装置70或通信装置80或芯片90,用于执行上述图2至图6c任一实施例中相应设备执行的步骤。This application also provides a system comprising at least one communication device 70, communication device 80, or chip 90 as described above, for performing the steps performed by the corresponding device in any of the embodiments of FIG2 to FIG6c.
本申请实施例还提供了一种系统,该系统包括终端设备和网络设备,该终端设备用于执行上述图2至图6c任一实施例中终端设备执行的步骤,该网络设备用于执行上述图2至图6c任一实施例中网络设备执行的步骤。This application also provides a system comprising a terminal device and a network device. The terminal device is used to execute the steps performed by the terminal device in any of the embodiments of Figures 2 to 6c, and the network device is used to execute the steps performed by the network device in any of the embodiments of Figures 2 to 6c.
本申请实施例还提供了一种处理装置,包括处理器和接口;所述处理器用于执行上述任一方法实施例中的方法。This application also provides a processing apparatus, including a processor and an interface; the processor is used to execute the method in any of the above method embodiments.
应理解,上述处理装置可以是一个芯片。例如,该处理装置可以是现场可编程门阵列(field programmable gate array,FPGA),可以是通用处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件,还可以是系统芯片(system on chip,SoC),还可以是中央处理器(central processor unit,CPU),还可以是网络处理器(network processor,NP),还可以是数字信号处理电路(digital signal processor,DSP),还可以是微控制器(micro controller unit,MCU),还可以是可编程控制器(programmable logic device,PLD)或其他集成芯片。可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤。It should be understood that the aforementioned processing device can be a chip. For example, the processing device can be a field-programmable gate array (FPGA), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), an off-the-shelf FPGA, or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, a system-on-chip (SoC), a central processing unit (CPU), a network processor (NP), a digital signal processing circuit (DSP), a microcontroller unit (MCU), a programmable logic device (PLD), or other integrated chips. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor or any conventional processor, etc. The steps of the method disclosed in the embodiments of this application can be directly manifested as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software modules can reside in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other mature storage media in the art. This storage medium is located in memory, and the processor reads information from the memory and, in conjunction with its hardware, completes the steps of the above method.
可以理解,本申请实施例中的存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(dynamic RAM,DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(doubledata rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。应注意,本文描述的系统和方法的存储器旨在包括但不限于这些和任意其它适合类型的存储器。It is understood that the memory in the embodiments of this application can be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory can be random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous linked dynamic random access memory (SLDRAM), and direct rambus RAM (DR RAM). It should be noted that the memory used in the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一个计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(digital subscriber line,DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质(例如,软盘、硬盘、磁带)、光介质(例如,高密度数字视频光盘(digital video disc,DVD))、或者半导体介质(例如,固态硬盘(solid state disc,SSD))等。In the above embodiments, implementation can be achieved entirely or partially through software, hardware, firmware, or any combination thereof. When implemented using software, it can be implemented entirely or partially as a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., high-density digital video discs (DVDs)), or semiconductor media (e.g., solid-state drives (SSDs)).
上述各个装置实施例中的单元和方法实施例中的电子设备完全对应,由相应的模块或单元执行相应的步骤,例如通信单元(收发器)执行方法实施例中接收或发送的步骤,除发送、接收外的其它步骤可以由处理单元(处理器)执行。具体单元的功能可以参考相应的方法实施例。其中,处理器可以为一个或多个。The units in the above-described device embodiments and the electronic devices in the method embodiments completely correspond to each other, with corresponding modules or units performing corresponding steps. For example, the communication unit (transceiver) performs the receiving or sending steps in the method embodiments, while other steps besides sending and receiving can be performed by the processing unit (processor). The functions of specific units can be found in the corresponding method embodiments. There can be one or more processors.
可以理解的,本申请实施例中,电子设备可以执行本申请实施例中的部分或全部步骤,这些步骤或操作仅是示例,本申请实施例还可以执行其它操作或者各种操作的变形。此外,各个步骤可以按照本申请实施例呈现的不同的顺序来执行,并且有可能并非要执行本申请实施例中的全部操作。It is understood that in the embodiments of this application, the electronic device may perform some or all of the steps in the embodiments of this application. These steps or operations are merely examples, and the embodiments of this application may also perform other operations or variations thereof. Furthermore, the steps may be performed in different orders as presented in the embodiments of this application, and it is not necessarily necessary to perform all the operations in the embodiments of this application.
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。Those skilled in the art will understand that, for the sake of convenience and brevity, the specific working processes of the systems, devices, and units described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。In addition, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器ROM、随机存取存储器RAM、磁碟或者光盘等各种可以存储程序代码的介质。If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the contributing part, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any changes or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application.