
CN117521778B - A cost-optimization approach for split federated learning - Google Patents


Info

Publication number
CN117521778B
Authority
CN
China
Prior art keywords
client
model
server
energy consumption
data
Prior art date
Legal status
Active
Application number
CN202311398490.0A
Other languages
Chinese (zh)
Other versions
CN117521778A (en)
Inventor
黄旭民
杨锐彬
吴茂强
钟伟锋
谢胜利
Current Assignee
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN202311398490.0A
Publication of CN117521778A
Application granted
Publication of CN117521778B


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/098 Distributed learning, e.g. federated learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols


Abstract

The invention discloses a cost optimization method for split federated learning, comprising the following steps. S1: construct a split federated learning system model comprising a server and a plurality of clients, each client having a local dataset. S2: calculate the computation and communication latency and energy consumption of all clients. S3: using the clients' computation and communication latency and energy consumption, with the model split layer and the bandwidth as decision variables, analyze the objective function and related constraints and establish the cost optimization problem of the split federated learning system. S4: solve the optimization problem to obtain the model splitting and bandwidth allocation strategy. S5: based on the obtained model splitting and bandwidth allocation strategy, organize all clients to train the given complete model in cooperation with the server and obtain the trained complete model. The invention computes the optimal model splitting and bandwidth allocation strategy so that the weighted sum of the total time and total energy consumption of all clients in the split federated learning process is minimized, reducing the time and energy cost.

Description

A cost-optimization approach for split federated learning

Technical Field

The present invention relates to the field of artificial intelligence, and more specifically, to a cost optimization method and system for split federated learning.

Background Art

Federated Learning (FL) and Split Learning (SL) are distributed model training techniques in deep learning that are premised on not sharing user data. In federated learning, the server first initializes a global model. In each round of global training, the server sends the global model to all clients; each client trains the model locally on its own data and then uploads the updated model parameters to the server, which aggregates them to update the global model. This process iterates until the global model accuracy converges or the number of global training rounds reaches a preset value. In split learning between the server and a client, the complete model is first split at a specified position (the split layer) into a front sub-model and a rear sub-model; the front sub-model is assigned to each client and the rear sub-model is deployed on the server. In each round of training, the client performs forward propagation up to the split layer on its local data, obtains the smashed data and uploads it to the server, which completes the forward propagation of the remaining layers; the server then performs back propagation down to the split layer, obtains the gradient of the smashed data and returns it to the client, which completes the back propagation of the remaining layers. These forward and backward passes iterate until the client has processed all of its local data, after which the updated client sub-model parameters are sent to the next client participating in training, and the procedure repeats until the model converges. The advantage of federated learning is that it allows multiple clients to train the model in parallel while protecting data privacy, whereas split learning allows each client to train only a sub-model. To combine the advantages of both, the existing literature [Thapa, Chandra, et al. "Splitfed: When federated learning meets split learning." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 36. No. 8. 2022.] proposed Split Federated Learning (SFL), in which all clients complete model training with the server via split learning, and the server uses federated learning to update a unified client sub-model.

However, when energy-constrained IoT devices participate in split federated learning, each client still has to continuously compute, send and receive data during training, even though most of the computational load of model training is offloaded to the server. This incurs substantial time and communication overhead and places considerable pressure on IoT devices. Therefore, to ensure the practicality of split federated learning, cost optimization covering both time and energy consumption for all participating clients is essential.

Summary of the Invention

To overcome the above defect of the prior art, namely the high cost of model training in terms of time and energy consumption, the present invention provides a cost optimization method for split federated learning.

The primary purpose of the present invention is to solve the above technical problem. The technical solution of the present invention is as follows:

A first aspect of the present invention provides a cost optimization method for split federated learning, comprising the following steps:

S1: Construct a split federated learning system model comprising one server and several clients, each client owning its own local dataset.

S2: Calculate the computation and communication latency and energy consumption of all clients.

S3: Using the clients' computation and communication latency and energy consumption, take the model split layer and the bandwidth as decision variables, analyze the objective function and related constraints, and formulate the cost optimization problem of the split federated learning system.

S4: Solve the optimization problem to obtain the model splitting and bandwidth allocation strategy.

S5: In the split federated learning system, organize all clients to train the given complete model in cooperation with the server, based on the obtained model splitting and bandwidth allocation strategy, and obtain the trained complete model.

Further, the calculation in step S2 of the computation and communication latency and energy consumption of all clients first computes each client's local computation energy consumption and computation latency, and then computes the clients' energy cost and communication latency in the split federated learning system.

Further, the specific process of calculating a client's local computation energy consumption and computation latency is as follows:

The local computation load of client i for training the client sub-model on a unit of data is:

where γ(·) and ξ(·) denote the numbers of operations, in FLOPs, required for the forward and backward computation of a single sample through the client sub-model and the server sub-model, respectively, and C denotes the number of layers contained in the client sub-model; the computation load that requires server cooperation is defined analogously.

The local computation energy consumption and computation latency of client i are, respectively:

where f_i is the computation frequency of client i in cycle/s and k_i is the computation intensity of client i in FLOPs/cycle, so the computation speed of the client is defined as φ_i = f_i × k_i in FLOPS; ε_i is the client's computation energy coefficient in J/FLOPs; D_i is the local dataset owned by client i; and ψ is the computation speed at which the server cooperates with client i.
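
For illustration, the computation cost terms just defined can be expressed as a short Python sketch. It assumes the straightforward composition suggested by the definitions above: the per-sample client load is the sum of forward and backward FLOPs of the first C layers, local energy is the client-side FLOPs scaled by ε_i, and latency is FLOPs divided by the respective computation speeds φ_i and ψ. The function and variable names, and the exact composition of the formulas (whose images are not reproduced here), are assumptions made for this example rather than the patent's literal equations.

```python
def client_compute_cost(gamma, xi, C, num_samples, phi_i, eps_i, psi):
    """Per-round local computation cost of client i (illustrative sketch).

    gamma[l], xi[l]: forward / backward FLOPs of layer l for one sample
    C:               number of layers kept on the client (split layer)
    num_samples:     |D_i|, the size of the client's local dataset
    phi_i:           client computation speed in FLOPS (phi_i = f_i * k_i)
    eps_i:           client computation energy coefficient in J/FLOPs
    psi:             server computation speed (FLOPS) serving this client
    """
    # Assumed split of the per-sample load: layers 1..C on the client,
    # the remaining layers on the server.
    load_client = sum(gamma[:C]) + sum(xi[:C])    # FLOPs computed on the client
    load_server = sum(gamma[C:]) + sum(xi[C:])    # FLOPs computed on the server

    e_comp = eps_i * num_samples * load_client                        # local energy (J)
    t_comp = num_samples * (load_client / phi_i + load_server / psi)  # latency (s)
    return e_comp, t_comp
```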

Further, the client energy cost and communication latency of the split federated learning system are calculated as follows:

When transmitting the smashed data and the updated client sub-model parameters, the uplink data transmission rate r_i^up of client i is:

where P_i^up is the transmission power of client i when uploading data, b_i is the bandwidth allocated by the server to client i, σ² is the Gaussian channel noise, and the channel gain between the server and client i is determined by α_0, the channel gain at distance d = 1 m, and d_i, the Euclidean distance between the server and the client.

The downlink data transmission rate r_i^down at which the server transmits the smashed-data gradient information to client i is:

where B_D is the fixed bandwidth the server uses to broadcast to each user, and P_B is the average transmission power. The upload latency and download latency of client i are, respectively:

where the smashed data of a single sample has size D(C), the corresponding gradient has size G(C), β is the size of a data label, and the remaining term is the size of the client sub-model parameters.

The communication latency T_i^comm is:

The communication energy consumption of client i is:

where P_i^down is the receiving power of the client and P_i^up is the average transmission power of client i.

The total time overhead T_i^total and energy overhead of client i are, respectively:

T_i^total = T_i^comm + T_i^comp

Over the specified M global model training rounds, the client energy cost E and the computation-and-communication delay T of the split federated learning system are, respectively:
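
The communication terms can be sketched in the same style. The sketch assumes Shannon-type rates r = b · log2(1 + P·g/σ²) with a channel gain g = α_0 · d_i^(-2) (the path-loss exponent of 2 is an assumption), that the uplink of one round carries the smashed data and labels of all samples plus one client sub-model upload, and that the downlink carries the smashed-data gradients; these choices follow the quantities defined above but are not the patent's literal formulas.

```python
import math

def client_comm_cost(b_i, p_up, p_down, alpha0, d_i, sigma2,
                     B_D, P_B, num_samples, D_C, G_C, beta, S_client):
    """Per-round communication cost of client i (illustrative sketch).

    b_i, B_D:  uplink bandwidth of client i and server broadcast bandwidth (Hz)
    D_C, G_C:  smashed-data / gradient size per sample at split layer C (bit)
    beta:      size of one data label (bit); S_client: client sub-model size (bit)
    """
    g_i = alpha0 * d_i ** (-2)                         # assumed path-loss model
    r_up = b_i * math.log2(1 + p_up * g_i / sigma2)    # uplink rate (bit/s)
    r_down = B_D * math.log2(1 + P_B * g_i / sigma2)   # broadcast (downlink) rate

    t_up = (num_samples * (D_C + beta) + S_client) / r_up   # upload latency (s)
    t_down = num_samples * G_C / r_down                     # download latency (s)
    t_comm = t_up + t_down

    e_comm = p_up * t_up + p_down * t_down             # communication energy (J)
    return e_comm, t_comm
```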

Further, in step S3, the established system cost optimization problem is as follows:

subject to:

C1: C_min ≤ C ≤ C_max

C2: b_min ≤ b_i ≤ b_max

C3: Σ_{i=1}^{I} b_i ≤ B_total

where B_total is the uplink bandwidth available to all clients; C1 specifies the range of the model split layer and thereby constrains the number of layers in the client sub-model; C2 specifies the preset range of bandwidth values that can be allocated to each client; C3 ensures that the total bandwidth allocated to the clients does not exceed B_total; and a weighting coefficient represents the weight between the total energy consumption and the total time delay.
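
Since the objective function itself is not reproduced above, the sketch below assumes the weighted-sum form described in the abstract, with a single weight λ in [0, 1] balancing total energy against total delay, the total delay taken as M times the slowest client's per-round time (the max-over-clients form appears in the claims) and the total energy as M times the sum of the clients' per-round energies; the second helper checks constraints C1 to C3 for a candidate decision. The λ parametrization is an assumption for this example.

```python
def system_cost(per_client_costs, lam, M):
    """Weighted system cost over M rounds (illustrative sketch).

    per_client_costs: list of (e_total_i, t_total_i) per-round costs, where
                      e_total_i = e_comp_i + e_comm_i and
                      t_total_i = t_comp_i + t_comm_i.
    lam:              assumed weight between total energy and total delay.
    """
    E = M * sum(e for e, _ in per_client_costs)   # total client energy cost
    T = M * max(t for _, t in per_client_costs)   # total delay (slowest client)
    return lam * E + (1 - lam) * T

def feasible(C, b, C_min, C_max, b_min, b_max, B_total):
    """Check constraints C1-C3 for a candidate split layer C and bandwidths b."""
    return (C_min <= C <= C_max                        # C1: split-layer range
            and all(b_min <= bi <= b_max for bi in b)  # C2: per-client bandwidth range
            and sum(b) <= B_total)                     # C3: total bandwidth budget
```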

Further, in step S4, solving the optimization problem specifically includes: discretizing the bandwidth variable with a discretization interval Δ, so that b_i takes values from the discrete set {b_min + k·Δ, k = 0, 1, 2, ...} within [b_min, b_max], where b_i is the bandwidth allocated by the server to client i, and b_min and b_max denote the minimum and maximum bandwidth values that can be allocated to a client, respectively.

Further, the optimal solution of the resulting integer programming problem is obtained using the SciPy library in Python or an existing optimization solver.

Further, in step S5, all clients are organized, based on the obtained model splitting and bandwidth allocation strategy, to train the given complete model in cooperation with the server; the specific process is as follows:

The given complete model is split into a global client sub-model and a global server sub-model. In each round of global model training, the global client sub-model and the global server sub-model are both trained, which involves two aspects. First, each client receives the global client sub-model issued by the server, trains this sub-model jointly with the server, and then uploads the updated client sub-model parameters to the server, which aggregates them to update the global client sub-model. In addition, the server internally assigns each client an associated server sub-model, which is updated while the server cooperates with that client to train its client sub-model; finally, the global server sub-model is updated by aggregating all of the associated server sub-models.

Further, the specific process of step S5 is as follows:

S51: The server selects a model splitting strategy to determine the number of layers C contained in the client sub-model, splits the given complete model, and then issues the client sub-model W_client to each client.

S52: Each client receives W_client and updates this sub-model based on its local data in cooperation with the server; the server is responsible for updating the server sub-model.

S53: Any client i owns a local dataset D_i; with a training batch size of H, the local dataset is divided into batches accordingly.

S54: Any client i uses the current batch of data to perform the forward propagation of the client sub-model and sends the smashed data, together with the data labels of the current batch, to the server, where the smashed data is the output of the split layer.

S55: The server uses the received smashed data to perform the forward propagation of the server sub-model associated with client i, computes the loss function from the received data labels, performs back propagation on the associated server sub-model to obtain the gradient of the smashed data, and updates the associated server sub-model based on that gradient; these steps are executed in parallel inside the server for all clients.

S56: The server sends the gradient information of all smashed data to the corresponding clients, respectively.

S57: Each client receives the gradient information of its smashed data, performs back propagation on its client sub-model, and updates it.

S58: Repeat S54 to S57 until all local data has been used.

S59: Each client uploads its updated local client sub-model parameters to the server; the server aggregates them by weighted averaging to update the global client sub-model and then re-issues the global client sub-model to all clients for a new round of local training; the server also aggregates all associated server sub-models by weighted averaging to update the global server sub-model.

A second aspect of the present invention provides a cost optimization system for split federated learning. The system comprises a memory and a processor, the memory stores a program of the cost optimization method for split federated learning, and the program, when executed by the processor, implements the following steps:

S1: Construct a split federated learning system model comprising one server and several clients, each client owning its own local dataset.

S2: Calculate the computation and communication latency and energy consumption of all clients.

S3: Using the clients' computation and communication latency and energy consumption, take the model split layer and the bandwidth as decision variables, analyze the objective function and related constraints, and formulate the cost optimization problem of the split federated learning system.

S4: Solve the optimization problem to obtain the model splitting and bandwidth allocation strategy.

S5: In the split federated learning system, organize all clients to train the given complete model in cooperation with the server, based on the obtained model splitting and bandwidth allocation strategy, and obtain the trained complete model.

Compared with the prior art, the technical solution of the present invention has the following beneficial effects:

The present invention first constructs a split federated learning system model comprising one server and several clients, each client owning its own local dataset; it then calculates the computation and communication latency and energy consumption of all clients; using these quantities, with the model split layer and the bandwidth as decision variables, it analyzes the objective function and related constraints and formulates the cost optimization problem of the split federated learning system; it then solves the optimization problem to obtain the optimal model splitting and bandwidth allocation strategy; finally, in the split federated learning system, all clients are organized to train the given complete model in cooperation with the server based on the obtained strategy, yielding the trained complete model. In this way the clients' data privacy is protected while the clients complete the model training task in a cost-optimized manner.

Brief Description of the Drawings

FIG. 1 is a flow chart of a cost optimization method for split federated learning provided by an embodiment of the present invention.

FIG. 2 is a diagram of the split federated learning system model provided by an embodiment of the present invention.

FIG. 3 is a workflow diagram of the split federated learning system provided by an embodiment of the present invention.

FIG. 4 is a flow chart of the model splitting and bandwidth allocation strategy calculation provided by an embodiment of the present invention.

FIG. 5 is a schematic diagram of splitting the ResNet-18 model provided by an embodiment of the present invention.

FIG. 6 shows simulation results provided by an embodiment of the present invention.

Detailed Description

In order to make the above objectives, features and advantages of the present invention clearer, the present invention is further described in detail below in conjunction with the accompanying drawings and specific embodiments. It should be noted that, provided there is no conflict, the embodiments of the present application and the features in the embodiments may be combined with each other.

Many specific details are set forth in the following description to facilitate a full understanding of the present invention; however, the present invention may also be implemented in ways different from those described herein, and therefore the protection scope of the present invention is not limited to the specific embodiments disclosed below.

Example 1

As shown in FIG. 1, the first aspect of the present invention provides a cost optimization method for split federated learning, comprising the following steps:

S1: Construct the split federated learning system model, which comprises one server and I clients, each client owning its own local dataset; the split federated learning system model is shown in FIG. 2.

More specifically, the split federated learning system model, as shown in FIG. 2, comprises one server and I clients. The server performs the server sub-model training, aggregates and updates the client sub-model shared by all clients, and allocates different communication bandwidths to different clients. Each client owns its own local dataset for training its own client sub-model and communicates data with the server over a wireless channel.

S2: Calculate the computation and communication latency and energy consumption of all clients.

More specifically, the client's local computation energy consumption and computation latency are calculated as follows:

The local computation load of client i for training the client sub-model on a unit of data is determined by γ(·) and ξ(·), the numbers of operations, in FLOPs, required for the forward and backward computation of a single sample of the network model, respectively, where C denotes the number of layers contained in the client sub-model; the computation load that requires server cooperation is defined analogously. The computation frequency of client i is f_i in cycle/s and k_i is the computation intensity of client i in FLOPs/cycle, so the computation speed of the client is defined as φ_i = f_i × k_i in FLOPS; the client's computation energy coefficient is ε_i in J/FLOPs; D_i is the local dataset owned by client i; and the computation speed at which the server cooperates with client i is ψ. The client's local computation energy consumption and computation latency T_i^comp are then, respectively:

The client energy cost and communication latency of the split federated learning system are calculated as follows:

P_i^up is the transmission power of client i when uploading data, b_i is the bandwidth allocated by the server to client i, σ² is the Gaussian channel noise, and the channel gain between the server and client i is determined by α_0, the channel gain at distance d = 1 m, and d_i, the Euclidean distance between the server and the client. When transmitting the smashed data and the updated client sub-model parameters, the uplink data transmission rate r_i^up of client i is:

Similarly, the downlink data transmission rate r_i^down at which the server transmits the smashed-data gradient information to client i is:

where B_D is the fixed bandwidth the server uses to broadcast to each user, and P_B is the average transmission power.

For a single sample, the smashed data has size D(C) and its gradient has size G(C); β is the size of a data label and the client sub-model parameters have a corresponding size. The upload latency and download latency of client i are then, respectively:

The communication latency T_i^comm is:

P_i^down is the receiving power of the client and P_i^up is the average transmission power of client i, so the communication energy consumption of client i is:

The total time overhead and energy overhead of client i are, respectively:

T_i^total = T_i^comm + T_i^comp

Over the specified M global model training rounds, the client energy cost E and the computation-and-communication delay T of the split federated learning system are, respectively:

S3: Using the clients' computation and communication latency and energy consumption, take the model split layer and the bandwidth as decision variables, analyze the objective function and related constraints, and formulate the cost optimization problem of the split federated learning system.

More specifically, in step S3, the established system cost optimization problem is as follows:

subject to:

C1: C_min ≤ C ≤ C_max

C2: b_min ≤ b_i ≤ b_max

C3: Σ_{i=1}^{I} b_i ≤ B_total

where B_total is the uplink bandwidth available to all clients; C1 specifies the range of the model split layer and thereby constrains the number of layers in the client sub-model; C2 specifies the preset range of bandwidth values that can be allocated to each client; C3 ensures that the total bandwidth allocated to the clients does not exceed B_total; and a weighting coefficient represents the weight between the total energy consumption and the total time delay.

S4: Solve the optimization problem to obtain the model splitting and bandwidth allocation strategy.

More specifically, to simplify the problem solving in practical applications, the bandwidth variable is discretized with a discretization interval Δ, so that b_i takes values from the discrete set {b_min + k·Δ, k = 0, 1, 2, ...} within [b_min, b_max]. The original problem is thereby transformed into a conventional integer programming problem, and the optimal solution of this integer programming problem is then obtained using the SciPy library in Python or an existing optimization solver (such as Lingo).
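
As one concrete way to realize this step, the sketch below enumerates the candidate split layers and, for each one, optimizes the bandwidth vector with scipy.optimize.minimize (SLSQP) on the continuous relaxation before rounding the result onto the Δ-grid. The enumerate-relax-round strategy and the cost_fn interface are illustrative assumptions; any integer-programming solver over the discretized bandwidths would serve the same purpose.

```python
import numpy as np
from scipy.optimize import minimize

def solve_split_and_bandwidth(cost_fn, split_layers, num_clients,
                              b_min, b_max, B_total, delta):
    """Search for the best (split layer, bandwidth allocation) pair (sketch).

    cost_fn(C, b): weighted system cost for split layer C and bandwidth vector b,
                   e.g. assembled from the cost helpers sketched earlier.
    """
    best = (None, None, float("inf"))
    for C in split_layers:
        x0 = np.clip(np.full(num_clients, B_total / num_clients), b_min, b_max)
        res = minimize(lambda b: cost_fn(C, b), x0, method="SLSQP",
                       bounds=[(b_min, b_max)] * num_clients,
                       constraints=[{"type": "ineq",
                                     "fun": lambda b: B_total - np.sum(b)}])
        # Round the relaxed solution onto the discrete grid b_min + k * delta.
        b = np.clip(b_min + np.round((res.x - b_min) / delta) * delta, b_min, b_max)
        if np.sum(b) <= B_total and cost_fn(C, b) < best[2]:
            best = (C, b, cost_fn(C, b))
    return best   # (optimal split layer C*, bandwidth allocation b*, cost)
```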

S5: In the split federated learning system, organize all clients to train the given complete model in cooperation with the server, based on the obtained model splitting and bandwidth allocation strategy, and obtain the trained complete model.

More specifically, the given complete model is split into a global client sub-model and a global server sub-model. In each round of global model training, the global client sub-model and the global server sub-model are both trained, which involves two aspects. First, each client receives the global client sub-model issued by the server, trains this sub-model jointly with the server, and then uploads the updated client sub-model parameters to the server, which aggregates them to update the global client sub-model. In addition, the server internally assigns each client an associated server sub-model, which is updated while the server cooperates with that client to train its client sub-model; finally, the global server sub-model is updated by aggregating all of the associated server sub-models.

More specifically, FIG. 3 shows the workflow of the split federated learning system; the specific process of step S5 is as follows (a sketch of one such training round in code is given after this list):

S51: The server selects a model splitting strategy to determine the number of layers C contained in the client sub-model, splits the given complete model, and then issues the client sub-model W_client to each client.

S52: Each client receives W_client and updates this sub-model based on its local data in cooperation with the server; the server is responsible for updating the server sub-model.

S53: Any client i owns a local dataset D_i; with a training batch size of H, the local dataset is divided into batches accordingly.

S54: Any client i uses the current batch of data to perform the forward propagation of the client sub-model and sends the smashed data, together with the data labels of the current batch, to the server, where the smashed data is the output of the split layer.

S55: The server uses the received smashed data to perform the forward propagation of the server sub-model associated with client i, computes the loss function from the received data labels, performs back propagation on the associated server sub-model to obtain the gradient of the smashed data, and updates the associated server sub-model based on that gradient; these steps are executed in parallel inside the server for all clients.

S56: The server sends the gradient information of all smashed data to the corresponding clients, respectively.

S57: Each client receives the gradient information of its smashed data, performs back propagation on its client sub-model, and updates it.

S58: Repeat S54 to S57 until all local data has been used.

S59: Each client uploads its updated local client sub-model parameters to the server; the server aggregates them by weighted averaging to update the global client sub-model and then re-issues the global client sub-model to all clients for a new round of local training; the server also aggregates all associated server sub-models by weighted averaging to update the global server sub-model.
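
The sketch below illustrates one global round of steps S51 to S59 in PyTorch-style Python. It makes several assumptions that the text does not prescribe: the sub-models are plain torch.nn.Module objects with purely floating-point state, SGD is used on both sides, the per-client server sub-models are processed sequentially rather than in parallel, and the weighted averaging uses local dataset sizes as weights. It is a minimal illustration under those assumptions, not the patent's reference implementation.

```python
import copy
import torch

def sfl_global_round(global_client, global_server, client_models, server_models,
                     datasets, lr=0.01, batch_size=32):
    """One global round of split federated learning (illustrative sketch).

    datasets: list of (x, y) tensor pairs, one local dataset per client.
    """
    loss_fn = torch.nn.CrossEntropyLoss()
    sizes = [len(x) for x, _ in datasets]
    for i, (x_all, y_all) in enumerate(datasets):
        client_models[i].load_state_dict(global_client.state_dict())  # S51/S52
        server_models[i].load_state_dict(global_server.state_dict())
        opt_c = torch.optim.SGD(client_models[i].parameters(), lr=lr)
        opt_s = torch.optim.SGD(server_models[i].parameters(), lr=lr)
        for s in range(0, len(x_all), batch_size):                     # S53: batches
            x, y = x_all[s:s + batch_size], y_all[s:s + batch_size]
            smashed = client_models[i](x)                    # S54: forward to split layer
            received = smashed.detach().requires_grad_(True) # "transmit" smashed data
            loss = loss_fn(server_models[i](received), y)    # S55: server forward + loss
            opt_s.zero_grad(); loss.backward(); opt_s.step() # server backprop and update
            opt_c.zero_grad()
            smashed.backward(received.grad)                  # S56/S57: returned gradient,
            opt_c.step()                                     # client backprop and update
    # S59: weighted-average aggregation of both global sub-models (FedAvg-style).
    for global_m, local_ms in ((global_client, client_models),
                               (global_server, server_models)):
        state = copy.deepcopy(global_m.state_dict())
        for key in state:
            state[key] = sum(m.state_dict()[key] * n
                             for m, n in zip(local_ms, sizes)) / sum(sizes)
        global_m.load_state_dict(state)
    return global_client, global_server
```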

A second aspect of the present invention provides a cost optimization system for split federated learning. The system comprises a memory and a processor, the memory stores a program of the cost optimization method for split federated learning, and the program, when executed by the processor, implements the following steps:

S1: Construct a split federated learning system model comprising one server and several clients, each client owning its own local dataset.

S2: Calculate the computation and communication latency and energy consumption of all clients.

S3: Using the clients' computation and communication latency and energy consumption, take the model split layer and the bandwidth as decision variables, analyze the objective function and related constraints, and formulate the cost optimization problem of the split federated learning system.

S4: Solve the optimization problem to obtain the model splitting and bandwidth allocation strategy.

S5: In the split federated learning system, organize all clients to train the given complete model in cooperation with the server, based on the obtained model splitting and bandwidth allocation strategy, and obtain the trained complete model.

Example 2

FIG. 4 shows the flow chart for implementing the cost optimization of split federated learning. The effect of the method is demonstrated below through a concrete simulation. A split federated learning system with 5 clients is considered; the server is located at the center and the clients are uniformly distributed 100 to 200 meters around it. The training task is the classic CIFAR-10 classification task, and the number of global iterations is M = 100. The CIFAR-10 dataset consists of 60,000 32×32 color images in 10 categories, with 6,000 images per category; 50,000 training images and 10,000 test images are used. The model is the convolutional neural network ResNet-18, whose number of layers is defined as 10; the candidate split positions are shown in FIG. 5. To account for heterogeneity, the clients' local data sizes are uniformly distributed over [8000, 10000] images, the computation speeds φ_i are uniformly distributed over [0.01, 0.03] TFLOPS, the client energy coefficients ε_i are uniformly distributed over [0.5, 0.7] J/TFLOPs, and the transmission powers P_i^up are uniformly distributed over [0.1, 0.2] W; for simplicity, P_i^up = P_i^down. The computation speed ψ with which the server cooperates with each client is 2 TFLOPS, the total system bandwidth B_total is 8 MHz, the channel gain α_0 is set to -50 dB, the server transmission power is 1 W, and the bandwidth B_D used for broadcasting is 15 MHz. A fixed weighting coefficient is used in the objective function.

The bandwidth discretization interval is Δ = 0.1 MHz, the minimum bandwidth b_min allocated to a user is 0.5 MHz, and the maximum bandwidth is b_max = 3.5 MHz, so that b_i ∈ {0.5 + k·Δ | k = 0, 1, 2, ..., 30} MHz. The simulation results are shown in FIG. 6.
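
For these simulation values, the discrete candidate set for each client's bandwidth can be generated directly; the snippet below only reproduces the grid stated above (values in Hz).

```python
delta, b_min, b_max = 0.1e6, 0.5e6, 3.5e6   # 0.1 MHz grid over [0.5, 3.5] MHz
bandwidth_grid = [b_min + k * delta
                  for k in range(int(round((b_max - b_min) / delta)) + 1)]
assert len(bandwidth_grid) == 31            # k = 0, 1, ..., 30
```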

When C = 10, the scheme reduces to the original federated learning scheme. The results show that the proposed method attains its optimal solution at split layer C = 6, with the bandwidth allocation (1.3, 2.6, 1.8, 1.1, 1.2) MHz, and its objective value is better than that of the original federated learning scheme.

Obviously, the above embodiments of the present invention are merely examples given to clearly illustrate the present invention and are not intended to limit its embodiments. Those of ordinary skill in the art can make other changes or modifications in different forms on the basis of the above description; it is neither necessary nor possible to exhaustively list all embodiments here. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall be included in the protection scope of the claims of the present invention.

Claims (8)

1. A cost optimization method for split federated learning, comprising the steps of:
S1: constructing a split federated learning system model, wherein the model comprises a server and a plurality of clients, and each client is provided with a local dataset;
S2: calculating the computation latency, computation energy consumption, communication latency and communication energy consumption of all clients;
S3: using the computation latency, computation energy consumption, communication latency and communication energy consumption of the clients, taking the model split layer and the bandwidth as decision variables, and establishing an optimization objective for minimizing the cost of the split federated learning system according to an objective function and constraint conditions;
S4: solving the optimization problem to obtain a model splitting and bandwidth allocation strategy;
S5: in the split federated learning system, organizing all clients to train a given complete model in cooperation with the server based on the obtained model splitting and bandwidth allocation strategy, to obtain a trained complete model;
wherein calculating the computation latency, computation energy consumption, communication latency and communication energy consumption of all clients in step S2 comprises first calculating each client's local computation energy consumption and computation latency, and then calculating the clients' communication energy consumption and communication latency in the split federated learning system;
the specific process of calculating a client's local computation energy consumption and computation latency is as follows:
the local computation load of client i for training the client sub-model on a unit of data is:
wherein γ_client(C) denotes the number of operations, in FLOPs, required for the forward computation of the model on a single client-side sample at split point C, ξ_client(C) denotes the number of operations required for the backward computation of the model on a single client-side sample at split point C, and C denotes the number of layers contained in the client sub-model; the computation load requiring server cooperation is defined analogously;
the client's local computation energy consumption and computation latency T_i^comp are, respectively:
wherein the computation frequency of client i is f_i in cycle/s, k_i is the computation intensity of client i in FLOPs/cycle, and the computation speed of the client is defined as φ_i = f_i × k_i in FLOPS; the client's computation energy coefficient is ε_i in J/FLOPs; D_i is the local dataset owned by client i; and the computation speed at which the server cooperates with client i is ψ.
2. The cost optimization method for split federated learning according to claim 1, wherein the communication energy consumption and communication latency of a client of the split federated learning system are calculated through the following process:
when transmitting the smashed data and the updated client sub-model parameters, the uplink data transmission rate r_i^up of client i is:
wherein P_i^up is the transmission power of client i when uploading data, b_i is the bandwidth allocated by the server to client i, σ² is the Gaussian channel noise, and the channel gain between the server and the client is determined by α_0, the channel gain at distance d = 1 m, and d_i, the Euclidean distance between the server and the client;
the downlink data transmission rate r_i^down at which the server transmits the smashed-data gradient information to client i is:
wherein B_D is the fixed bandwidth used by the server for broadcasting to each user, and P_B is the average transmission power;
the upload latency and download latency of client i are, respectively:
wherein the smashed data of a single sample has size D(C), the corresponding gradient has size G(C), β is the size of a data label, and the remaining term is the size of the client sub-model parameters;
the communication latency T_i^comm is:
the communication energy consumption of the client is:
wherein P_i^down is the receiving power of the client and P_i^up is the average transmission power of client i;
the time overhead T_i^total and energy consumption overhead of client i are, respectively:
T_i^total = T_i^comm + T_i^comp
in the specified M global model training rounds, the total time overhead T and the total energy overhead E of all clients of the split federated learning system are, respectively:
T = M · max(T_1^total, T_2^total, ..., T_I^total)
3. The cost optimization method for split federated learning according to claim 2, wherein in step S3 the optimization objective for minimizing the cost of the split federated learning system is established according to the objective function and constraint conditions as follows:
the objective function is:
the constraint conditions are:
subject to:
C1: C_min ≤ C ≤ C_max
C2: b_min ≤ b_i ≤ b_max
C3: Σ_{i=1}^{I} b_i ≤ B_total
wherein B_total is the uplink bandwidth available to all clients; C1 specifies the range of the model split layer and constrains the number of layers of the client sub-model; C2 constrains the range of bandwidth values that can be allocated to each client; C3 constrains the sum of the bandwidth resources allocated to the clients; a weighting coefficient represents the weight between the total energy consumption overhead and the total latency overhead; b_i is the bandwidth allocated by the server to client i; and C is the number of layers contained in the client sub-model.
4. The cost optimization method for split federated learning according to claim 1, wherein in step S4, solving the optimization problem specifically comprises: discretizing the bandwidth variable with a discretization interval Δ, so that b_i takes values from the discrete set {b_min + k·Δ}, wherein b_i is the bandwidth allocated by the server to client i, b_min and b_max respectively denote the minimum and maximum bandwidth values that can be allocated to a client, and k indexes the k-th discretization step.
5. The method of claim 4, wherein the optimal solution of the integer programming problem is obtained using the SciPy library in Python or an existing mathematical programming solver.
6. The cost optimization method for split federated learning according to claim 1, wherein in step S5, all clients are organized to train a given complete model in cooperation with the server based on the obtained model splitting and bandwidth allocation policy, and the specific process is as follows:
splitting the given complete model into a global client sub-model and a global server sub-model, and training both sub-models in each round of global model training, which comprises the following two aspects: first, each client receives the global client sub-model issued by the server, trains this sub-model jointly with the server, and then uploads the updated client sub-model parameters to the server to aggregate and update the global client sub-model; in addition, the server internally assigns each client an associated server sub-model, which is updated as the server cooperates with that client to train its client sub-model, and finally the global server sub-model is updated by aggregating all of the associated server sub-models.
7. The cost optimization method for split federated learning according to claim 6, wherein the specific process of step S5 is as follows:
S51: the server selects a model splitting strategy to determine the number of layers C contained in the client sub-model, splits the given complete model, and then issues the global client sub-model W_client to each client;
S52: each client receives W_client and updates this sub-model based on local data in cooperation with the server, and the server is responsible for updating the server sub-model;
S53: any client i owns a local dataset D_i with a training batch size of H, and the local dataset is divided into batches accordingly;
S54: any client i uses the current batch of data to perform forward propagation of the client sub-model, and sends the smashed data and the data labels of the current batch to the server, wherein the smashed data is the output of the split layer;
S55: the server uses the received smashed data to perform forward propagation of the server sub-model associated with client i, computes the loss function from the received data labels, performs back propagation on the associated server sub-model to obtain the gradient of the smashed data, and updates the associated server sub-model based on the smashed-data gradient; the above steps are performed in parallel inside the server for all clients;
S56: the server respectively sends the gradient information of all smashed data to the corresponding clients;
S57: each client receives the gradient information of its smashed data, performs back propagation on its client sub-model and updates it;
S58: repeating S54 to S57 until all local data is exhausted;
S59: each client uploads the updated local client sub-model parameters to the server, the server aggregates and updates the global client sub-model by weighted averaging, and then re-issues the global client sub-model to all clients for a new round of local client sub-model training; the server aggregates all associated server sub-models by weighted averaging to update the global server sub-model.
8. A cost optimization system for split federated learning, the system comprising a memory and a processor, wherein the memory stores a program of a cost optimization method for split federated learning, and the program, when executed by the processor, implements the following steps:
S1: constructing a split federated learning system model, wherein the model comprises a server and a plurality of clients, and each client is provided with a local dataset;
S2: calculating the computation latency, computation energy consumption, communication latency and communication energy consumption of all clients;
S3: using the computation latency, computation energy consumption, communication latency and communication energy consumption of the clients, taking the model split layer and the bandwidth as decision variables, and establishing an optimization objective for minimizing the cost of the split federated learning system according to an objective function and constraint conditions;
S4: solving the optimization problem to obtain a model splitting and bandwidth allocation strategy;
S5: in the split federated learning system, organizing all clients to train a given complete model in cooperation with the server based on the obtained model splitting and bandwidth allocation strategy, to obtain a trained complete model;
wherein calculating the computation latency, computation energy consumption, communication latency and communication energy consumption of all clients in step S2 comprises first calculating each client's local computation energy consumption and computation latency, and then calculating the clients' communication energy consumption and communication latency in the split federated learning system;
the specific process of calculating a client's local computation energy consumption and computation latency is as follows:
the local computation load of client i for training the client sub-model on a unit of data is:
wherein γ_client(C) denotes the number of operations, in FLOPs, required for the forward computation of the model on a single client-side sample at split point C, ξ_client(C) denotes the number of operations required for the backward computation of the model on a single client-side sample at split point C, and C denotes the number of layers contained in the client sub-model; the computation load requiring server cooperation is defined analogously;
the client's local computation energy consumption and computation latency T_i^comp are, respectively:
wherein the computation frequency of client i is f_i in cycle/s, k_i is the computation intensity of client i in FLOPs/cycle, and the computation speed of the client is defined as φ_i = f_i × k_i in FLOPS; the client's computation energy coefficient is ε_i in J/FLOPs; D_i is the local dataset owned by client i; and the computation speed at which the server cooperates with client i is ψ.
CN202311398490.0A 2023-10-25 2023-10-25 A cost-optimization approach for split federated learning Active CN117521778B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311398490.0A CN117521778B (en) 2023-10-25 2023-10-25 A cost-optimization approach for split federated learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311398490.0A CN117521778B (en) 2023-10-25 2023-10-25 A cost-optimization approach for split federated learning

Publications (2)

Publication Number Publication Date
CN117521778A CN117521778A (en) 2024-02-06
CN117521778B true CN117521778B (en) 2024-11-05

Family

ID=89757567

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311398490.0A Active CN117521778B (en) 2023-10-25 2023-10-25 A cost-optimization approach for split federated learning

Country Status (1)

Country Link
CN (1) CN117521778B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2641040A (en) * 2024-05-13 2025-11-19 Nokia Technologies Oy Methods, apparatus and computer programs
CN119031415B (en) * 2024-09-06 2025-09-12 南京邮电大学 A 6G computing network adaptive splitting federated learning method for distributed AI training business
CN120196451B (en) * 2025-05-26 2025-08-12 湖南科技大学 A batch-based parallel split federated learning method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115907038A (en) * 2022-09-09 2023-04-04 南开大学 A Multivariate Control Decision-Making Method Based on Federated Split Learning Framework

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116150612A (en) * 2021-11-15 2023-05-23 华为技术有限公司 Method and communication device for model training
CN115130548B (en) * 2022-05-24 2025-07-18 腾讯科技(深圳)有限公司 Data processing method and device, storage medium and electronic equipment
CN116418687A (en) * 2023-03-03 2023-07-11 哈尔滨工业大学(深圳) Parameter optimization and resource allocation method for low-energy wireless federated learning system
CN116418589A (en) * 2023-04-19 2023-07-11 湖南大学 Abnormal traffic detection method for IoT heterogeneous devices based on federated split learning

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115907038A (en) * 2022-09-09 2023-04-04 南开大学 A Multivariate Control Decision-Making Method Based on Federated Split Learning Framework

Also Published As

Publication number Publication date
CN117521778A (en) 2024-02-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant