
CN118301006A - Data center network traffic model training method supporting rapid scenario adaptation - Google Patents

Data center network traffic model training method supporting rapid scenario adaptation

Info

Publication number
CN118301006A
Authority
CN
China
Prior art keywords
model
network traffic
data
updated
adaptation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410467067.XA
Other languages
Chinese (zh)
Inventor
李丹
汪锡峥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN202410467067.XA
Publication of CN118301006A
Legal status: Pending (current)

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/14 - Network analysis or design
    • H04L 41/145 - Network analysis or design involving simulating, designing, planning or modelling of a network
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L 41/16 - Arrangements for maintenance, administration or management of data switching networks using machine learning or artificial intelligence
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 43/00 - Arrangements for monitoring or testing data switching networks
    • H04L 43/08 - Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L 43/0876 - Network utilisation, e.g. volume of load or congestion level

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract


The present invention relates to the field of digital information transmission technology, and in particular to a data center network traffic model training method supporting rapid scenario adaptation, wherein the method comprises: obtaining network traffic data in the current network traffic scenario; converting the network traffic data into input data and output data based on the input format standard and output format standard of the target training layer of the network traffic large model to be updated, wherein the network traffic large model to be updated is composed of multi-layer Transformer neural network models; based on a preset LoRA model, training the low-rank parameter matrix of the target training layer using the input data and output data, and updating the network traffic large model to be updated using the trained low-rank parameter matrix, so as to quickly adapt to new network traffic scenarios through the updated network traffic large model. Thus, overall retraining of the model is avoided, the retraining time is reduced while the fidelity of the generated traffic is ensured, and rapid adaptation to new scenarios is achieved.

Description

Data center network traffic model training method supporting rapid scenario adaptation
Technical Field
The invention relates to the technical field of digital information transmission, and in particular to a data center network traffic model training method supporting rapid scenario adaptation.
Background
With the rapid development of technologies such as cloud computing, big data analysis, and artificial intelligence, data centers play a vital role in modern society: they not only provide flexible computing and storage resources for enterprises, but also support the development of innovative applications and services, and their scale and complexity continue to grow to meet ever-increasing data demands and technical challenges.
Traffic is an important carrier for data center network research, and numerous data center network traffic generation models have been proposed in academia and industry. Typically, a model must be retrained or fine-tuned to make it better fit the user's real scenario.
However, bandwidth in data center networks is far higher than in wide area networks and traffic changes quickly, which places higher demands on the speed of model retraining or fine-tuning; otherwise, even a model trained on real-time network traffic will struggle to fit the real traffic, because the traffic pattern has already shifted.
Network traffic is often used to support various network monitoring tasks, and open-source datasets can generally be grouped into several categories according to the characteristics of the traffic itself and its role in downstream applications: wide area network traffic and data center traffic with application information, Internet of Things device traffic, traffic containing network attacks, and so on. Such datasets are typically used to drive traffic-data-driven design of network applications and network management methods, to train and evaluate machine learning models in the network domain, and to provide traffic trace inputs for network simulators.
The demand for open-source network traffic is therefore strong, and users generally want as much traffic data as possible; yet the owners of traffic data are mostly reluctant to disclose it, for business and other reasons. Even when a public open-source dataset exists, it is usually not timely and often anonymized, while collecting fine-grained traffic itself implies huge overhead and imposes significant computational and storage costs on network devices. Current traffic generation methods therefore learn the characteristics of traffic from an open-source dataset and help generate a large amount of data with the same characteristics from a small amount of open-source data.
Existing network traffic generation methods produce synthetic traffic according to the characteristics of input traffic; among them, methods represented by deep learning models achieve the best generation performance, but this superior performance depends on complex models with huge parameter counts, ultra-large-scale training data, and the like. In recent years, large models with billions or even tens of billions of parameters have achieved excellent performance in various fields, and large models represented by diffusion models have also performed well in network traffic generation.
However, it is undeniable that a large model trained on a very large dataset conforms to generic traffic characteristics, while real traffic may differ greatly across scenarios and may also change over time. In that case, retraining the model to fit the new scenario or characteristics is the simplest and most general approach; this is easy for models with small parameter counts, but retraining a large model undoubtedly incurs huge overhead.
Furthermore, existing traffic generation methods can only generate packet sequences, while many downstream applications require additional information beyond the traffic itself. For example, when generating a training set for a network-domain deep learning model, labels on services or protocols are often required; downstream applications for upper-layer application identification require application-type labels; and downstream security-detection applications require labels marking traffic as normal or as some kind of malicious traffic.
In summary, as models become more complex, existing traffic generation methods find it increasingly difficult to avoid the training overhead of retraining when adapting to new scenarios, and they generate only the network traffic itself without considering the additional content required by various downstream applications.
Disclosure of Invention
The invention provides a data center network traffic model training method supporting rapid scenario adaptation, aiming to solve the problems that existing traffic generation methods incur high training cost when adapting to a new scenario through model retraining, and that they only generate network traffic without considering the additional content required by various downstream applications.
An embodiment of a first aspect of the present invention provides a data center network traffic model training method supporting rapid scenario adaptation, including the following steps: acquiring network traffic data in a current network traffic scenario; converting the network traffic data into input data and output data based on an input format standard and an output format standard of a target training layer of a network traffic large model to be updated, wherein the network traffic large model to be updated is composed of multi-layer Transformer neural network models; and, based on a preset LoRA (Low-Rank Adaptation) model, training a low-rank parameter matrix of the target training layer using the input data and the output data, and updating the network traffic large model to be updated using the trained low-rank parameter matrix, so as to quickly adapt to a new network traffic scenario through the updated network traffic large model.
Optionally, after updating the network traffic large model to be updated by using the trained low-rank parameter matrix, the method further comprises: acquiring a downstream task requirement of a downstream model of the current network traffic scene and a network message sequence output by the updated network traffic large model; and constructing a downstream task adaptation model based on the downstream task demands and the network message sequence, and connecting the updated network traffic large model and the downstream model based on the downstream task adaptation model so as to enable the updated network traffic large model to generate traffic data adapting to the downstream task demands.
Optionally, the input data of the downstream task adaptation model is the network message sequence, and the output data of the downstream task adaptation model is determined by the downstream task demand.
Optionally, the target training layer is the last layer of the network traffic large model to be updated.
Optionally, the converting the network traffic data into the input data and the output data based on the input format standard and the output format standard of the target training layer of the network traffic large model to be updated includes: acquiring the current task of the network traffic large model to be updated; and converting the network traffic data into input data and output data according to the current task, based on the input format standard and the output format standard.
An embodiment of a second aspect of the present invention provides a data center network traffic model training device supporting rapid scenario adaptation, including: an acquisition module, configured to acquire network traffic data in a current network traffic scenario; a conversion module, configured to convert the network traffic data into input data and output data based on the input format standard and output format standard of a target training layer of the network traffic large model to be updated, wherein the network traffic large model to be updated is composed of multi-layer Transformer neural network models; and a training module, configured to train a low-rank parameter matrix of the target training layer using the input data and the output data based on a preset LoRA model, and to update the network traffic large model to be updated using the trained low-rank parameter matrix, so as to quickly adapt to a new network traffic scenario through the updated network traffic large model.
Optionally, after updating the network traffic large model to be updated with the trained low rank parameter matrix, the training module is further configured to: acquiring a downstream task requirement of a downstream model of the current network traffic scene and a network message sequence output by the updated network traffic large model; and constructing a downstream task adaptation model based on the downstream task demands and the network message sequence, and connecting the updated network traffic large model and the downstream model based on the downstream task adaptation model so as to enable the updated network traffic large model to generate traffic data adapting to the downstream task demands.
Optionally, the input data of the downstream task adaptation model is the network message sequence, and the output data of the downstream task adaptation model is determined by the downstream task demand.
Optionally, the target training layer is the last layer of the network traffic large model to be updated.
Optionally, the conversion module is further configured to: acquire the current task of the network traffic large model to be updated; and convert the network traffic data into input data and output data according to the current task, based on the input format standard and the output format standard.
An embodiment of a third aspect of the present invention provides an electronic device, including: the system comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor executes the program to realize the training method of the data center network traffic model supporting rapid scene adaptation according to the embodiment.
An embodiment of a fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the data center network traffic model training method supporting rapid scenario adaptation as described in the above embodiment.
In the above embodiment, network traffic data in a current network traffic scenario is acquired, the network traffic data is converted into input data and output data based on an input format standard and an output format standard of a target training layer of a network traffic large model to be updated, a low-rank parameter matrix of the target training layer is trained using the input data and the output data based on a preset LoRA model, and the network traffic large model to be updated is updated using the trained low-rank parameter matrix, so that the updated network traffic large model quickly adapts to a new network traffic scenario. This solves the problems that existing traffic generation methods incur large training overhead when adapting to a new scenario through model retraining and that they only generate network traffic without considering the additional content required by various downstream applications: retraining the whole model is avoided, the retraining time is reduced while the fidelity of the generated traffic is preserved, rapid adaptation to new scenarios is achieved, and the network traffic large model can generate traffic data precisely adapted to downstream tasks, improving the performance of the generated traffic on those tasks.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flowchart of a data center network traffic model training method supporting rapid scenario adaptation according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the LoRA training method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a downstream scene adaptation method according to an embodiment of the invention;
FIG. 4 is a schematic diagram of a data center network traffic model training device supporting rapid adaptation of scenarios in accordance with an embodiment of the present invention;
Fig. 5 is a schematic diagram of an electronic device structure according to an embodiment of the present invention.
Reference numerals illustrate:
10 - data center network traffic model training device supporting rapid scenario adaptation; 100 - acquisition module; 200 - conversion module; 300 - training module.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
The following describes the data center network traffic model training method supporting rapid scenario adaptation according to an embodiment of the present invention with reference to the accompanying drawings. Aiming at the problems mentioned in the background, namely that existing traffic generation methods incur large training overhead when adapting to a new scenario through model retraining, and that they only generate network traffic without considering the additional content required by various downstream applications, the invention provides a data center network traffic model training method supporting rapid scenario adaptation. This avoids retraining the whole model, reduces the retraining time while preserving the fidelity of the generated traffic, achieves rapid adaptation to new scenarios, and enables the network traffic large model to generate traffic data precisely adapted to downstream tasks, improving the performance of the generated traffic on those tasks.
Specifically, fig. 1 is a flowchart of the data center network traffic model training method supporting rapid scenario adaptation according to an embodiment of the present invention.
As shown in fig. 1, the data center network traffic model training method supporting rapid scenario adaptation includes the following steps:
in step S101, network traffic data in a current network traffic scenario is acquired.
It should be understood that the current network traffic scenario refers to a new network scenario. For example, a large model for network traffic generation may have been trained on traffic collected from one network; when traffic is then collected from another network and the model is expected to generate traffic for it, that is, the scenario changes to a new network traffic scenario, the model, whose training set came from the previous network, performs poorly in the new scenario. The problem to be solved by the embodiment of the present invention is therefore how to retrain the network traffic large model quickly when the scenario changes.
In step S102, the network traffic data is converted into input data and output data based on the input format standard and the output format standard of the target training layer of the network traffic large model to be updated, wherein the network traffic large model to be updated is composed of multi-layer Transformer neural network models.
Wherein, in some embodiments, the target training layer is the last layer of the network traffic large model to be updated.
Optionally, in some embodiments, converting the network traffic data into the input data and the output data based on the input format standard and the output format standard of the target training layer of the network traffic large model to be updated includes: acquiring a current task of the network traffic large model to be updated; and converting the network traffic data into input data and output data according to the current task, based on the input format standard and the output format standard.
It should be understood that, in the network traffic large model to be updated, which is composed of multi-layer Transformer neural networks, the last Transformer layer has the greatest influence on the traffic generation result. Therefore, in the embodiment of the present invention, after obtaining the network traffic data in the current network traffic scenario, the whole model is not retrained; instead, based on the input format standard and output format standard of the target training layer of the network traffic large model to be updated, the network traffic data is preprocessed into input data and output data conforming to the last layer (i.e., the target training layer), and only this last layer is trained. As shown in fig. 2, this trade-off strategy achieves an effect similar to retraining the whole model while greatly saving hardware resources and training time.
Specifically, in fig. 2, the horizontal tree-like structure denotes the network traffic large model, where GTT refers to a Transformer-based module. Since large models in different fields may differ slightly in structure, GTT is treated here as a Transformer. L1 to LN represent the N layers of the network traffic large model to be updated, where during training the output of each layer serves as the input of the next. The LoRA model is the training method provided by the embodiment of the present invention: only the LN layer (i.e., the target training layer) is extracted; since the output of the LN layer is the output of the whole model, the output needs no additional processing, while the input is determined according to the current task of the whole model, and the target training layer is then retrained on these input and output data. Specifically, its input data is statistical information about the traffic, for example, how many packets and bytes arrive every 10 nanoseconds, and the output format standard of its output data is how many packets and bytes arrive every 1 nanosecond.
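To make this format concrete, the following is a minimal sketch, not part of the patent, of how a raw packet trace could be binned into the coarse 10-nanosecond input series and the fine 1-nanosecond output series described above; the function name, array layout, and granularities are illustrative assumptions.

```python
import numpy as np

def to_layer_io(timestamps_ns, sizes, coarse_ns=10, fine_ns=1):
    """Bin a packet trace into (packet count, byte count) series.

    timestamps_ns: arrival times of packets in nanoseconds.
    sizes: packet sizes in bytes, same length as timestamps_ns.
    Returns (input_data, output_data), each of shape [num_bins, 2]:
    a coarse series for the target layer's input and a fine series
    matching its output format standard.
    """
    timestamps_ns = np.asarray(timestamps_ns)
    sizes = np.asarray(sizes, dtype=float)
    horizon = int(timestamps_ns.max()) + 1

    def bin_series(window_ns):
        n_bins = -(-horizon // window_ns)              # ceiling division
        idx = (timestamps_ns // window_ns).astype(int)
        counts = np.bincount(idx, minlength=n_bins)
        nbytes = np.bincount(idx, weights=sizes, minlength=n_bins)
        return np.stack([counts, nbytes], axis=1)

    return bin_series(coarse_ns), bin_series(fine_ns)
```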
In step S103, based on the preset LoRA model, the low-rank parameter matrix of the target training layer is trained by using the input data and the output data, and the network traffic large model to be updated is updated by using the trained low-rank parameter matrix, so that the updated network traffic large model is quickly adapted to the new network traffic scene.
It should be understood that the embodiment of the present invention uses the preset LoRA model to enable the network traffic large model to be updated to quickly adapt to a new scenario. This is inspired by LoRA fine-tuning of large language models: the core of LoRA is to train low-rank parameter matrices of the model with a small amount of data from the new scenario during the retraining stage; after training is completed, the retrained parameters are injected into the original model to replace the corresponding parameters. The embodiment of the present invention trains the low-rank parameter matrices of the network traffic large model using the same method.
Specifically, the embodiment of the present invention trains the low-rank parameter matrix of the target training layer of the network traffic large model to be updated with the input data and output data based on the preset LoRA model, replaces the pre-training low-rank parameter matrix with the trained one, and updates the network traffic large model to be updated accordingly, so that the updated network traffic large model quickly adapts to the new network traffic scenario, achieving a fine-tuning effect on the network traffic large model at lower cost.
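As a concrete illustration of this mechanism, here is a minimal PyTorch sketch, under the assumption that the target training layer contains linear projections: a frozen linear layer is augmented with trainable low-rank matrices A and B implementing the update W' = W + (alpha / r) * B @ A, and a merge step injects the trained matrices back into the original weights. The class name, rank, and scaling factor are illustrative choices, not the patent's implementation.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: W' = W + (alpha / r) * B @ A."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                    # original weights stay fixed
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

    def merge(self) -> nn.Linear:
        """Inject the trained low-rank matrices back into the original weight."""
        with torch.no_grad():
            self.base.weight += self.scale * (self.B @ self.A)
        return self.base
```

Only A and B receive gradients during retraining, which is what keeps the cost far below that of retraining the whole model.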
Optionally, in some embodiments, after updating the network traffic large model to be updated with the trained low rank parameter matrix, further comprises: acquiring a downstream task demand of a downstream model of a current network flow scene and a network message sequence output by an updated network flow large model; and constructing a downstream task adaptation model based on the downstream task demands and the network message sequence, and connecting the updated network traffic large model and the downstream model based on the downstream task adaptation model so as to enable the updated network traffic large model to generate traffic data adapting to the downstream task demands.
In some embodiments, the input data of the downstream task adaptation model is a network message sequence, and the output data of the downstream task adaptation model is determined by the downstream task requirements.
To cope with various downstream applications in a new network traffic scenario, the embodiment of the present invention further proposes an adaptation model that satisfies the traffic characteristics required by downstream applications, using as its base model the Transformer, which is widely adopted in industry and achieves state-of-the-art (SOTA) results in general domains. The input of the downstream task adaptation model is the output of the network traffic large model (i.e., a network message sequence), and its output is determined according to the requirements of the downstream task; for example, when a downstream application needs traffic with a certain numeric label, the output of the downstream task adaptation model is a message sequence carrying that label, as shown in fig. 3.
In addition, so that the generated traffic achieves better results on downstream applications, the downstream task adaptation model is not trained in isolation as a single model; instead, it is connected with the network traffic large model and the downstream application for end-to-end training, so that the updated network traffic large model generates traffic data adapted to the downstream task requirements. For example, the network traffic large model outputs coarse-grained packets containing only the sequence number, timestamp, and length of each message, but when the downstream task needs packet-header information and even some labels (a packet trace), the downstream task adaptation model can generate the downstream task label fields, for example using a Transformer to generate packet headers and labels. With this method, the quality of the generated traffic is no longer judged solely by evaluation metrics on traffic distributions.
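A minimal sketch of such a downstream task adaptation model follows, assuming three coarse-grained features per packet (sequence number, timestamp, length) and a small label vocabulary; all names and dimensions are illustrative assumptions rather than the patent's design.

```python
import torch
import torch.nn as nn

class DownstreamAdapter(nn.Module):
    """Map a generated packet sequence (seq no., timestamp, length) to per-packet label logits."""
    def __init__(self, d_model: int = 64, n_heads: int = 4, n_layers: int = 2, n_labels: int = 8):
        super().__init__()
        self.embed = nn.Linear(3, d_model)   # 3 coarse-grained features per packet
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_labels)

    def forward(self, packets: torch.Tensor):           # packets: [batch, seq_len, 3]
        return self.head(self.encoder(self.embed(packets)))

# End-to-end wiring (schematic): traffic large model -> adapter -> downstream loss,
# so gradients from the downstream task flow back through the adapter.
# logits = adapter(traffic_model_output)                # traffic_model_output is hypothetical
# loss = nn.functional.cross_entropy(logits.flatten(0, 1), labels.flatten())
```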
In the embodiment of the present invention, for the downstream task adaptation model, a user can directly use the pre-trained generator, or fine-tune the model with their own traffic data before use. Usage method 1, direct use: use the traffic super-resolution model to generate a packet sequence; the user converts the packet constraints into ACL (Access Control List) rules and feeds them into the input of the traffic converter to obtain the traffic. Usage method 2, fine-tune and then use: the user collects ACL rules and the corresponding traffic in their own network environment, calls the interface provided by the embodiment of the present invention, feeds the ACL rules and the traffic into the rule encoder and the traffic encoder respectively, trains a similarity matrix better matching the user's scenario (the fine-tuning process), and afterwards supplies ACL rules to generate traffic.
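The similarity-matrix fine-tuning of usage method 2 could plausibly be realized as a contrastive matcher between rule embeddings and traffic embeddings; the sketch below is an assumption about that design, using stand-in linear encoders where the patent's rule encoder and traffic encoder would sit, and is not the patent's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RuleTrafficMatcher(nn.Module):
    """Contrastive matcher: similarity matrix between ACL-rule and traffic embeddings."""
    def __init__(self, rule_dim: int, traffic_dim: int, d: int = 64):
        super().__init__()
        self.rule_enc = nn.Linear(rule_dim, d)        # stand-in for the rule encoder
        self.traffic_enc = nn.Linear(traffic_dim, d)  # stand-in for the traffic encoder

    def forward(self, rules, traffic):
        r = F.normalize(self.rule_enc(rules), dim=-1)
        t = F.normalize(self.traffic_enc(traffic), dim=-1)
        return r @ t.T                                # [batch, batch] similarity matrix

# Fine-tuning step: matched rule/traffic pairs lie on the diagonal.
# matcher = RuleTrafficMatcher(rule_dim=16, traffic_dim=32)
# sim = matcher(rules, traffic)
# loss = F.cross_entropy(sim, torch.arange(sim.size(0)))
```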
According to the data center network traffic model training method supporting rapid scenario adaptation provided by the embodiment of the present invention, network traffic data in a current network traffic scenario is obtained, the network traffic data is converted into input data and output data based on the input format standard and output format standard of the target training layer of the network traffic large model to be updated, the low-rank parameter matrix of the target training layer is trained using the input data and output data based on the preset LoRA model, and the network traffic large model to be updated is updated with the trained low-rank parameter matrix, so that the updated network traffic large model rapidly adapts to the new network traffic scenario. This solves the problems that existing traffic generation methods incur large training cost when adapting to a new scenario through model retraining and that they only generate network traffic without considering the additional content required by various downstream applications: retraining the whole model is avoided, the retraining time is reduced while the fidelity of the generated traffic is preserved, rapid adaptation to new scenarios is achieved, the fields required by downstream tasks are generated through a general Transformer, and the network traffic large model is connected with the downstream task model so that end-to-end training improves the performance of the generated traffic on downstream tasks.
The data center network traffic model training device supporting rapid scenario adaptation according to the embodiment of the present invention is described below with reference to the accompanying drawings.
Fig. 4 is a block diagram of a data center network traffic model training device supporting rapid adaptation of scenarios in accordance with an embodiment of the present invention.
As shown in fig. 4, the data center network traffic model training device 10 supporting rapid adaptation of a scenario includes: an acquisition module 100, a conversion module 200, and a training module 300.
The acquisition module 100 is configured to acquire network traffic data in a current network traffic scenario; the conversion module 200 is configured to convert the network traffic data into input data and output data based on the input format standard and the output format standard of the target training layer of the network traffic large model to be updated, wherein the network traffic large model to be updated is composed of multi-layer Transformer neural network models; the training module 300 is configured to train a low-rank parameter matrix of the target training layer using the input data and the output data based on a preset LoRA model, and to update the network traffic large model to be updated using the trained low-rank parameter matrix, so as to quickly adapt to a new network traffic scenario through the updated network traffic large model.
Optionally, in some embodiments, after updating the network traffic big model to be updated with the trained low rank parameter matrix, the training module 300 is further configured to: acquiring a downstream task demand of a downstream model of a current network flow scene and a network message sequence output by an updated network flow large model; and constructing a downstream task adaptation model based on the downstream task demands and the network message sequence, and connecting the updated network traffic large model and the downstream model based on the downstream task adaptation model so as to enable the updated network traffic large model to generate traffic data adapting to the downstream task demands.
Optionally, in some embodiments, the input data of the downstream task adaptation model is a network message sequence, and the output data of the downstream task adaptation model is determined by the downstream task requirements.
Optionally, in some embodiments, the target training layer is the last layer of the network traffic large model to be updated.
Optionally, in some embodiments, the conversion module 200 is further configured to: acquiring a current task of a network flow large model to be updated; based on the input format standard and the output format standard, the network traffic data is converted into input data and output data according to the current task.
It should be noted that, the explanation of the foregoing embodiment of the method for training a data center network traffic model supporting rapid adaptation of a scene is also applicable to the device for training a data center network traffic model supporting rapid adaptation of a scene in this embodiment, which is not described herein again.
According to the data center network traffic model training device supporting rapid scenario adaptation provided by the embodiment of the present invention, network traffic data in a current network traffic scenario is acquired, the network traffic data is converted into input data and output data based on the input format standard and output format standard of the target training layer of the network traffic large model to be updated, the low-rank parameter matrix of the target training layer is trained using the input data and output data based on the preset LoRA model, and the network traffic large model to be updated is updated with the trained low-rank parameter matrix, so that the updated network traffic large model rapidly adapts to the new network traffic scenario. This solves the problems that existing traffic generation methods incur large training overhead when adapting to a new scenario through model retraining and that they only generate network traffic without considering the additional content required by various downstream applications: retraining the whole model is avoided, the retraining time is reduced while the fidelity of the generated traffic is preserved, rapid adaptation to new scenarios is achieved, and the network traffic large model can generate traffic data precisely adapted to downstream tasks, improving the performance of the generated traffic on those tasks.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device may include:
Memory 501, processor 502, and a computer program stored on memory 501 and executable on processor 502.
The processor 502 implements the data center network traffic model training method supporting rapid adaptation of the scenario provided in the above embodiment when executing the program.
Further, the electronic device further includes:
A communication interface 503 for communication between the memory 501 and the processor 502.
Memory 501 for storing a computer program executable on processor 502.
The memory 501 may include high-speed RAM memory and may also include non-volatile memory, such as at least one magnetic disk memory.
If the memory 501, the processor 502, and the communication interface 503 are implemented independently, they may be connected to one another via a bus and communicate with one another. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, among others. Buses may be divided into address buses, data buses, control buses, and so on. For ease of illustration, only one thick line is shown in fig. 5, but this does not mean there is only one bus or one type of bus.
Alternatively, in a specific implementation, if the memory 501, the processor 502, and the communication interface 503 are integrated on a chip, the memory 501, the processor 502, and the communication interface 503 may perform communication with each other through internal interfaces.
The processor 502 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the invention.
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, which when being executed by a processor, realizes the data center network traffic model training method supporting rapid scene adaptation.
In the description of the present specification, reference to the terms "one embodiment," "some embodiments," "example," "specific example," or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, without contradiction, those skilled in the art may combine different embodiments or examples described in this specification and the features thereof.
Furthermore, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present invention includes additional implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art of the embodiments of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, for example, an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch and execute instructions from the instruction execution system, apparatus, or device. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be electronically captured, for instance by optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like. While embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are illustrative and not to be construed as limiting the invention, and that those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the invention.

Claims (10)

1. A data center network traffic model training method supporting rapid scenario adaptation, characterized by comprising the following steps: acquiring network traffic data in a current network traffic scenario; converting the network traffic data into input data and output data based on an input format standard and an output format standard of a target training layer of a network traffic large model to be updated, wherein the network traffic large model to be updated is composed of multi-layer Transformer neural network models; and, based on a preset LoRA model, training a low-rank parameter matrix of the target training layer using the input data and the output data, and updating the network traffic large model to be updated using the trained low-rank parameter matrix, so as to quickly adapt to a new network traffic scenario through the updated network traffic large model.
2. The data center network traffic model training method supporting rapid scenario adaptation according to claim 1, characterized in that, after updating the network traffic large model to be updated using the trained low-rank parameter matrix, the method further comprises: acquiring a downstream task requirement of a downstream model of the current network traffic scenario and a network message sequence output by the updated network traffic large model; and constructing a downstream task adaptation model based on the downstream task requirement and the network message sequence, and connecting the updated network traffic large model and the downstream model based on the downstream task adaptation model, so that the updated network traffic large model generates traffic data adapted to the downstream task requirement.
3. The data center network traffic model training method supporting rapid scenario adaptation according to claim 2, characterized in that the input data of the downstream task adaptation model is the network message sequence, and the output data of the downstream task adaptation model is determined by the downstream task requirement.
4. The data center network traffic model training method supporting rapid scenario adaptation according to claim 1, characterized in that the target training layer is the last layer of the network traffic large model to be updated.
5. The data center network traffic model training method supporting rapid scenario adaptation according to claim 1, characterized in that converting the network traffic data into input data and output data based on the input format standard and output format standard of the target training layer of the network traffic large model to be updated comprises: acquiring a current task of the network traffic large model to be updated; and converting the network traffic data into input data and output data according to the current task, based on the input format standard and the output format standard.
6. A data center network traffic model training device supporting rapid scenario adaptation, characterized by comprising: an acquisition module, configured to acquire network traffic data in a current network traffic scenario; a conversion module, configured to convert the network traffic data into input data and output data based on an input format standard and an output format standard of a target training layer of a network traffic large model to be updated, wherein the network traffic large model to be updated is composed of multi-layer Transformer neural network models; and a training module, configured to train a low-rank parameter matrix of the target training layer using the input data and the output data based on a preset LoRA model, and to update the network traffic large model to be updated using the trained low-rank parameter matrix, so as to quickly adapt to a new network traffic scenario through the updated network traffic large model.
7. The data center network traffic model training device supporting rapid scenario adaptation according to claim 6, characterized in that, after updating the network traffic large model to be updated using the trained low-rank parameter matrix, the training module is further configured to: acquire a downstream task requirement of a downstream model of the current network traffic scenario and a network message sequence output by the updated network traffic large model; and construct a downstream task adaptation model based on the downstream task requirement and the network message sequence, and connect the updated network traffic large model and the downstream model based on the downstream task adaptation model, so that the updated network traffic large model generates traffic data adapted to the downstream task requirement.
8. The data center network traffic model training device supporting rapid scenario adaptation according to claim 7, characterized in that the input data of the downstream task adaptation model is the network message sequence, and the output data of the downstream task adaptation model is determined by the downstream task requirement.
9. An electronic device, comprising a memory, a processor, and a computer program stored in the memory, characterized in that the processor executes the computer program to implement the data center network traffic model training method supporting rapid scenario adaptation according to any one of claims 1-5.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, the data center network traffic model training method supporting rapid scenario adaptation according to any one of claims 1-5 is implemented.
CN202410467067.XA 2024-04-18 2024-04-18 Data center network traffic model training method supporting rapid scenario adaptation Pending CN118301006A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410467067.XA CN118301006A (en) 2024-04-18 2024-04-18 Data center network traffic model training method supporting rapid scenario adaptation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410467067.XA CN118301006A (en) 2024-04-18 2024-04-18 Data center network traffic model training method supporting rapid scenario adaptation

Publications (1)

Publication Number Publication Date
CN118301006A true CN118301006A (en) 2024-07-05

Family

ID=91682812

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410467067.XA Pending CN118301006A (en) 2024-04-18 2024-04-18 Data center network traffic model training method supporting rapid scenario adaptation

Country Status (1)

Country Link
CN (1) CN118301006A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111130839A (en) * 2019-11-04 2020-05-08 清华大学 A traffic demand matrix forecasting method and system
US20220383126A1 (en) * 2021-05-19 2022-12-01 Microsoft Technology Licensing, Llc Low-Rank Adaptation of Neural Network Models
CN116502087A (en) * 2023-04-27 2023-07-28 西安电子科技大学 Defending model for graph-oriented attack resistance and construction method thereof
CN116757248A (en) * 2023-06-26 2023-09-15 厦门大学 A parameter-efficient large-scale pre-training model migration method
CN117253079A (en) * 2023-09-18 2023-12-19 北京百度网讯科技有限公司 Model training method, device, equipment and storage medium
CN117436480A (en) * 2023-10-31 2024-01-23 东北大学 A large model and recommendation method under the Mindspore framework
CN117494575A (en) * 2023-11-20 2024-02-02 深圳贝尔信息科技有限公司 State sensing method and device based on artificial intelligent transducer and storage medium
CN117611913A (en) * 2023-12-05 2024-02-27 电子科技大学 Continuous learning method for image classification pre-training model based on low-rank adaptive combination
CN117574961A (en) * 2024-01-15 2024-02-20 成都信息工程大学 A parameter-efficient method and device for injecting adapters into pre-trained models

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XIANGXIN KONG ET AL: "Performance Evaluation of Software-Defined Networking with Real-life ISP Traffic", IEEE, 6 March 2014 (2014-03-06) *
DAI ZHIKANG; WU QIUXIN; CHENG XIMING: "A ResNet-Based Network Traffic Identification Method", Journal of Beijing Information Science and Technology University (Natural Science Edition), no. 01, 15 February 2020 (2020-02-15) *
WANG SHUAI; LI DAN: "Research Progress on Network Performance Optimization of Distributed Machine Learning Systems", Chinese Journal of Computers, 15 July 2022 (2022-07-15) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118694670A (en) * 2024-08-26 2024-09-24 上海天旦网络科技发展有限公司 Simulation network data packet generation method and system based on large language generation model

Similar Documents

Publication Publication Date Title
CN110554958B (en) Graph database testing method, system, device and storage medium
CN111507993B (en) Image segmentation method, device and storage medium based on generation countermeasure network
US7882485B2 (en) Method for modeling components of an information processing application using semantic graph transformations
CN112036563B (en) Insights from deep learning models using provenance data
CN108923983B (en) Method and device for predicting opportunistic network link and readable storage medium
CN108172213A (en) Jiaochuan audio recognition method, device, equipment and computer readable medium
CN118555196B (en) Service intention-oriented configuration automation issuing method and system
CN113811897B (en) Inference method, device, computer equipment and storage medium for neural network model
CN118301006A (en) Data center network traffic model training method supporting rapid scenario adaptation
CN118250288B (en) AIoT middle platform data processing method, device and system based on scenario componentization
CN117541883B (en) Image generation model training, image generation method, system and electronic equipment
CN118587899A (en) Road congestion prediction model construction method, device, equipment, medium and product
CN111626338B (en) Cloud environment matching method, device, equipment and medium based on fusion classification model
CN117351273A (en) Partial discharge fault diagnosis method of power equipment based on causal knowledge guidance
CN119338011B (en) Multi-modal data processing method and device based on large language model and reinforcement learning
CN119578558A (en) A dialogue model training method, device, equipment and medium
CN119202030A (en) A multi-source heterogeneous data governance method and system
CN115982236B (en) Big data optimization method and server applied to AI
CN118297107A (en) Processing method, system, device, medium and program product for large language model
CN114218719B (en) Cross-region emission model migration method, system, storage medium and device
CN116668351A (en) Service quality prediction method, device, computer equipment and storage medium
CN115526299A (en) A Neural Network Model Mixed Precision Solution Search Method and System
CN118070850B (en) Data center network traffic generation method, device, medium and computer program
CN114528450A (en) Classification system and method for growth hypergraph data
CN116842958B (en) Time series knowledge graph completion method and entity prediction method and device based on it

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20240705