Disclosure of Invention
The invention provides a data center network traffic model training method supporting rapid scene adaptation. It aims to solve two problems of existing traffic generation methods: the high training cost incurred when adapting to a new scenario by retraining the model, and the fact that existing methods only generate network traffic without considering the additional content required by various downstream applications.
An embodiment of a first aspect of the present invention provides a data center network traffic model training method supporting rapid scene adaptation, including the following steps: acquiring network traffic data in a current network traffic scenario; converting the network traffic data into input data and output data based on an input format standard and an output format standard of a target training layer of a network traffic large model to be updated, wherein the network traffic large model to be updated is composed of multiple layers of the Transformer neural network model; and, based on a preset LoRA (Low-Rank Adaptation) model, training a low-rank parameter matrix of the target training layer using the input data and the output data, and updating the network traffic large model to be updated with the trained low-rank parameter matrix, so that the updated network traffic large model can quickly adapt to a new network traffic scenario.
Optionally, after updating the network traffic large model to be updated by using the trained low-rank parameter matrix, the method further comprises: acquiring a downstream task requirement of a downstream model of the current network traffic scene and a network message sequence output by the updated network traffic large model; and constructing a downstream task adaptation model based on the downstream task demands and the network message sequence, and connecting the updated network traffic large model and the downstream model based on the downstream task adaptation model so as to enable the updated network traffic large model to generate traffic data adapting to the downstream task demands.
Optionally, the input data of the downstream task adaptation model is the network message sequence, and the output data of the downstream task adaptation model is determined by the downstream task demand.
Optionally, the target training layer is the last layer of the network traffic large model to be updated.
Optionally, the converting the network traffic data into the input data and the output data based on the input format standard and the output format standard of the target training layer of the network traffic large model to be updated includes: acquiring a current task of the network traffic large model to be updated; and converting the network traffic data into input data and output data according to the current task, based on the input format standard and the output format standard.
An embodiment of a second aspect of the present invention provides a data center network traffic model training device supporting rapid scene adaptation, including: an acquisition module, configured to acquire network traffic data in a current network traffic scenario; a conversion module, configured to convert the network traffic data into input data and output data based on an input format standard and an output format standard of a target training layer of a network traffic large model to be updated, wherein the network traffic large model to be updated is composed of multiple layers of the Transformer neural network model; and a training module, configured to train a low-rank parameter matrix of the target training layer using the input data and the output data based on a preset LoRA model, and to update the network traffic large model to be updated with the trained low-rank parameter matrix, so as to quickly adapt to a new network traffic scenario with the updated network traffic large model.
Optionally, after updating the network traffic large model to be updated with the trained low rank parameter matrix, the training module is further configured to: acquiring a downstream task requirement of a downstream model of the current network traffic scene and a network message sequence output by the updated network traffic large model; and constructing a downstream task adaptation model based on the downstream task demands and the network message sequence, and connecting the updated network traffic large model and the downstream model based on the downstream task adaptation model so as to enable the updated network traffic large model to generate traffic data adapting to the downstream task demands.
Optionally, the input data of the downstream task adaptation model is the network message sequence, and the output data of the downstream task adaptation model is determined by the downstream task demand.
Optionally, the target training layer is the last layer of the network traffic large model to be updated.
Optionally, the conversion module is further configured to: acquire a current task of the network traffic large model to be updated; and convert the network traffic data into input data and output data according to the current task, based on the input format standard and the output format standard.
An embodiment of a third aspect of the present invention provides an electronic device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the data center network traffic model training method supporting rapid scene adaptation according to the above embodiment.
An embodiment of a fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the data center network traffic model training method supporting rapid scene adaptation as described in the above embodiment.
In the above embodiment, network traffic data in a current network traffic scenario is acquired; the network traffic data is converted into input data and output data based on an input format standard and an output format standard of a target training layer of a network traffic large model to be updated; a low-rank parameter matrix of the target training layer is trained using the input data and the output data based on a preset LoRA model; and the network traffic large model to be updated is updated with the trained low-rank parameter matrix, so that the updated network traffic large model quickly adapts to a new network traffic scenario. This solves the problem that existing traffic generation methods incur a large training cost when adapting to a new scenario by retraining the whole model, as well as the problem that existing methods only generate network traffic without considering the additional content required by various downstream applications. Retraining of the whole model is avoided, the retraining time is shortened while the fidelity of the generated traffic is preserved, rapid adaptation to new scenarios is achieved, and the network traffic large model can generate traffic data accurately adapted to a downstream task, improving the performance of the generated traffic on that task.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
The following describes the data center network traffic model training method supporting rapid scene adaptation according to an embodiment of the present invention with reference to the accompanying drawings. Aiming at the problems mentioned in the background art, namely that existing traffic generation methods incur a large training cost when adapting to a new scenario by retraining the whole model, and that existing methods only generate network traffic without considering the additional content required by various downstream applications, the invention provides a data center network traffic model training method supporting rapid scene adaptation. This avoids retraining the whole model, shortens the retraining time while preserving the fidelity of the generated traffic, achieves rapid adaptation to new scenarios, and enables the network traffic large model to generate traffic data accurately adapted to a downstream task, improving the performance of the generated traffic on that task.
Specifically, fig. 1 is a schematic flow chart of a data center network traffic model training method supporting rapid scene adaptation according to an embodiment of the present invention.
As shown in fig. 1, the data center network traffic model training method supporting rapid scene adaptation includes the following steps:
in step S101, network traffic data in a current network traffic scenario is acquired.
It should be understood that the current network traffic scenario refers to a new network scenario. For example, a large model performing a network traffic generation task may have been trained on traffic collected from one network. When traffic is then collected from another network and the model is expected to generate traffic for it, that is, when the network traffic scenario changes, a model whose training set was collected in the previous network performs poorly in the new scenario. The problem to be solved by the embodiment of the present invention is therefore how to quickly retrain the network traffic large model when the network traffic scenario changes.
In step S102, the network traffic data is converted into input data and output data based on the input format standard and the output format standard of the target training layer of the network traffic large model to be updated, wherein the network traffic large model to be updated is composed of a multi-layer neural network model Transformer.
Wherein, in some embodiments, the target training layer is the last layer of the network traffic large model to be updated.
Optionally, in some embodiments, converting the network traffic data into the input data and the output data based on the input format standard and the output format standard of the target training layer of the network traffic large model to be updated includes: acquiring a current task of the network traffic large model to be updated; and converting the network traffic data into input data and output data according to the current task, based on the input format standard and the output format standard.
It should be understood that, in the network traffic large model to be updated, which is composed of multiple Transformer layers, the last Transformer layer has the greatest influence on the traffic generation result. Therefore, in the embodiment of the invention, after obtaining the network traffic data in the current network traffic scenario, the whole model is not retrained. Instead, based on the input format standard and the output format standard of the target training layer of the network traffic large model to be updated, the network traffic data is converted by preprocessing into input data and output data conforming to the last layer (i.e. the target training layer), and only that last layer is trained, as shown in fig. 2. Through this trade-off strategy, an effect similar to retraining the whole model is achieved while greatly saving hardware resources and training time.
Specifically, in fig. 2, the structure resembling a tree laid on its side refers to the network traffic large model. GTT refers to a module implemented based on a Transformer; since different industry large models may differ slightly in structure, GTT is treated here simply as a Transformer. L1 to LN represent the N layers of the network traffic large model to be updated, where the output of each layer serves as the input of the next layer during training. The LoRA model is the training method provided by the embodiment of the invention: only the LN layer (i.e. the target training layer) is extracted. Since the output of the LN layer is the output of the whole model, the output needs no additional processing, while the input is determined according to the current task of the whole model; the target training layer is then retrained on these input and output data. In particular, its input data is statistical information of the traffic, for example how many packets and bytes there are in every 10 nanoseconds, and the output format standard of its output data is how many packets and bytes there are in every 1 nanosecond.
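The format conversion described above can be sketched as follows. This is only an illustration: the function names and the record format are assumptions, while the 10 ns input windows and 1 ns output windows follow the example in the preceding paragraph.

```python
def aggregate_windows(packets, window_ns):
    """Aggregate (timestamp_ns, length_bytes) records into per-window
    (packet_count, byte_count) statistics."""
    if not packets:
        return []
    horizon = max(t for t, _ in packets) // window_ns + 1
    stats = [[0, 0] for _ in range(horizon)]
    for t, length in packets:
        w = t // window_ns
        stats[w][0] += 1          # packet count in this window
        stats[w][1] += length     # byte count in this window
    return [tuple(s) for s in stats]

def to_training_pairs(packets):
    """Build (input, output) data for the target training layer:
    coarse 10 ns statistics as input, fine 1 ns statistics as output."""
    return aggregate_windows(packets, 10), aggregate_windows(packets, 1)

coarse, fine = to_training_pairs([(0, 100), (3, 200), (12, 50)])
```

Here `coarse` is `[(2, 300), (1, 50)]`: the first two packets fall into the first 10 ns window and the third into the second.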
In step S103, based on the preset LoRA model, the low-rank parameter matrix of the target training layer is trained by using the input data and the output data, and the network traffic large model to be updated is updated by using the trained low-rank parameter matrix, so that the updated network traffic large model is quickly adapted to the new network traffic scene.
It should be understood that the embodiment of the invention uses the preset LoRA model to enable the network traffic large model to be updated to quickly adapt to a new scenario. This is inspired by LoRA fine-tuning of large language models: the core of LoRA is to train a low-rank parameter matrix of the model with a small amount of data from the new scenario in the retraining stage; after training is completed, the retrained parameters are injected into the original model and replace the corresponding parameters. The embodiment of the invention trains the low-rank parameter matrix of the network traffic large model using the same method.
Specifically, the embodiment of the invention trains the low-rank parameter matrix of the target training layer of the network traffic large model to be updated using the input data and the output data, based on the preset LoRA model; the low-rank parameter matrix before training is then replaced by the trained one, and the network traffic large model to be updated is updated accordingly, so that it quickly adapts to the new network traffic scenario. This achieves a fine-tuning effect on the network traffic large model at a lower cost.
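The low-rank training and injection step can be sketched in NumPy as follows. This is a minimal single-layer illustration of the general LoRA idea, not the actual implementation: the frozen weight `W` of the target layer stays untouched, only the low-rank factors `B @ A` are trained on synthetic "new scenario" data, and the trained update is merged back. The dimensions, rank, learning rate, and synthetic data are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                      # layer width, LoRA rank (r << d)
W = rng.normal(size=(d, d))      # frozen pretrained weight of the target layer
A = rng.normal(size=(r, d)) * 0.1
B = np.zeros((d, r))             # standard LoRA init: B = 0, so the update starts at 0

# Synthetic "new scenario": the ideal layer weight has shifted by a
# small rank-1 perturbation of the original W.
W_target = W + np.outer(rng.normal(size=d), rng.normal(size=d)) * 0.1
X = rng.normal(size=(256, d))    # small amount of new-scenario input data
Y = X @ W_target.T               # corresponding output data

lr = 0.05
for _ in range(1000):
    delta = B @ A
    err = X @ (W + delta).T - Y          # W is frozen; only A and B get gradients
    grad_delta = err.T @ X / len(X)      # gradient of the MSE loss w.r.t. delta
    B -= lr * grad_delta @ A.T
    A -= lr * B.T @ grad_delta

W_updated = W + B @ A            # inject the trained low-rank update into the model
```

Only `2 * d * r` parameters are trained instead of `d * d`, which is the source of the cost saving claimed above.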
Optionally, in some embodiments, after updating the network traffic large model to be updated with the trained low-rank parameter matrix, the method further comprises: acquiring a downstream task requirement of a downstream model of the current network traffic scenario and a network message sequence output by the updated network traffic large model; and constructing a downstream task adaptation model based on the downstream task requirement and the network message sequence, and connecting the updated network traffic large model and the downstream model through the downstream task adaptation model, so that the updated network traffic large model generates traffic data adapted to the downstream task requirement.
In some embodiments, the input data of the downstream task adaptation model is a network message sequence, and the output data of the downstream task adaptation model is determined by the downstream task requirements.
In order to cope with various downstream applications in a new network traffic scenario, an embodiment of the present invention further proposes an adaptation model that satisfies the traffic characteristics of the downstream application. It uses the Transformer, which is widely known in the industry and achieves state-of-the-art (SOTA) results in the general field, as the base model. The input of the downstream task adaptation model is the output of the network traffic large model (i.e. the network message sequence), and its output is determined according to the requirement of the downstream task. For example, when the downstream application needs traffic with a certain numerical label, the output of the downstream task adaptation model is a message sequence carrying that label; see fig. 3 in particular.
In addition, in order for the generated traffic to perform better on downstream applications, the downstream task adaptation model is not trained directly as a standalone model; instead, it is connected with the network traffic large model and the downstream application to achieve end-to-end training, so that the updated network traffic large model generates traffic data adapted to the downstream task requirements. For example, the network traffic large model outputs coarse-grained packets that include only the sequence number, timestamp, and length of each message; when the downstream task needs packet header information and even some labels (a packet trace), the downstream task adaptation model can generate the downstream task's label fields, for example using a Transformer to generate packet headers and labels. With this method, the quality of the generated traffic is no longer judged only by evaluation indices on certain distribution properties of the traffic.
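The role of the adaptation model between the two stages can be sketched as follows. This is a plain-Python stand-in, not the Transformer-based adapter itself: all record formats, names, and the labeling rule are assumptions, illustrating only how the coarse-grained packet sequence from the large model is extended with the fields a downstream task requires.

```python
from dataclasses import dataclass

@dataclass
class CoarsePacket:          # output format of the traffic large model
    seq: int
    timestamp_ns: int
    length: int

@dataclass
class LabeledPacket:         # format required by the downstream task
    seq: int
    timestamp_ns: int
    length: int
    label: str

def adaptation_model(coarse_seq, label_fn):
    """Stand-in for the Transformer-based adapter: maps the large
    model's packet sequence to the downstream task's record format."""
    return [LabeledPacket(p.seq, p.timestamp_ns, p.length, label_fn(p))
            for p in coarse_seq]

# Hypothetical downstream requirement: tag packets by flow size class.
coarse = [CoarsePacket(0, 0, 1500), CoarsePacket(1, 10, 64)]
labeled = adaptation_model(
    coarse, lambda p: "elephant" if p.length >= 1000 else "mouse")
```

In the actual scheme this mapping is learned end to end together with the large model and the downstream application, rather than given by a fixed rule as here.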
In the embodiment of the invention, for the downstream task adaptation model, a user can either directly use a pre-trained generator or fine-tune the model with their own traffic data. Usage method 1, direct use: using the traffic super-resolution model to generate a data packet sequence, the user converts the constraint conditions of the data packets into ACL (Access Control List) rules and inputs the ACL rules into the entry of the traffic converter to obtain the traffic. Usage method 2, fine-tune and then use: the user collects ACL rules and corresponding traffic in their network environment, calls the interface provided by the embodiment of the invention, feeds the ACL rules and the traffic into the rule encoder and the traffic encoder respectively, and trains a similarity matrix better matching the user's scenario (the fine-tuning process); thereafter, given ACL rules, the model generates the corresponding traffic.
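Usage method 2 above can be sketched in heavily simplified form. The toy encoder and the Hebbian-style update are illustrative assumptions, not the actual interface: ACL rules and traffic samples are embedded as vectors, and a similarity matrix between the two embedding spaces is fine-tuned on the user's own (ACL rule, traffic) pairs so that matching pairs score higher.

```python
import numpy as np

def embed(text, dim=4):
    """Toy stand-in for the rule/traffic encoders: hash bytes into a
    fixed-size, normalized, non-negative vector."""
    v = np.zeros(dim)
    for i, b in enumerate(text.encode()):
        v[i % dim] += b
    return v / np.linalg.norm(v)

W_sim = np.eye(4)                # similarity matrix to be fine-tuned

def similarity(rule, traffic):
    return float(embed(rule) @ W_sim @ embed(traffic))

# User-collected (ACL rule, traffic description) pairs from their network.
pairs = [("permit tcp 10.0.0.1 10.0.0.2", "web flow, short packets"),
         ("permit udp 10.0.0.3 10.0.0.4", "video flow, large packets")]

before = similarity(*pairs[0])
for _ in range(20):              # simple fine-tuning: reinforce matching pairs
    for rule, traffic in pairs:
        W_sim += 0.05 * np.outer(embed(rule), embed(traffic))
after = similarity(*pairs[0])    # matching pair now scores higher than before
```

A real rule encoder and traffic encoder would be learned models, and the similarity matrix would be trained with a contrastive objective rather than this additive update.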
According to the data center network traffic model training method supporting rapid scene adaptation provided by the embodiment of the invention, network traffic data in a current network traffic scenario is acquired; the network traffic data is converted into input data and output data based on the input format standard and the output format standard of the target training layer of the network traffic large model to be updated; the low-rank parameter matrix of the target training layer is trained using the input data and the output data based on the preset LoRA model; and the network traffic large model to be updated is updated with the trained low-rank parameter matrix, so that rapid adaptation to a new network traffic scenario is achieved with the updated network traffic large model. This solves the problem that existing traffic generation methods incur a large training cost when adapting to a new scenario by retraining the whole model, as well as the problem that existing methods only generate network traffic without considering the additional content required by various downstream applications. Retraining of the whole model is avoided, and the retraining time is shortened while the fidelity of the generated traffic is preserved, achieving rapid adaptation to new scenarios. Further, by generating the fields required by a downstream task with a general Transformer, connecting the network traffic large model with the downstream task model, and training end to end, the performance of the generated traffic on the downstream task is improved.
The data center network flow model training device supporting rapid scene adaptation according to the embodiment of the invention is described with reference to the accompanying drawings.
Fig. 4 is a block diagram of a data center network traffic model training device supporting rapid adaptation of scenarios in accordance with an embodiment of the present invention.
As shown in fig. 4, the data center network traffic model training device 10 supporting rapid adaptation of a scenario includes: an acquisition module 100, a conversion module 200, and a training module 300.
The acquiring module 100 is configured to acquire network traffic data in a current network traffic scenario; the conversion module 200 is configured to convert the network traffic data into input data and output data based on an input format standard and an output format standard of a target training layer of the network traffic large model to be updated, where the network traffic large model to be updated is composed of multiple layers of the Transformer neural network model; the training module 300 is configured to train a low-rank parameter matrix of the target training layer using the input data and the output data based on a preset LoRA model, and to update the network traffic large model to be updated with the trained low-rank parameter matrix, so as to quickly adapt to a new network traffic scenario with the updated network traffic large model.
Optionally, in some embodiments, after updating the network traffic big model to be updated with the trained low rank parameter matrix, the training module 300 is further configured to: acquiring a downstream task demand of a downstream model of a current network flow scene and a network message sequence output by an updated network flow large model; and constructing a downstream task adaptation model based on the downstream task demands and the network message sequence, and connecting the updated network traffic large model and the downstream model based on the downstream task adaptation model so as to enable the updated network traffic large model to generate traffic data adapting to the downstream task demands.
Optionally, in some embodiments, the input data of the downstream task adaptation model is a network message sequence, and the output data of the downstream task adaptation model is determined by the downstream task requirements.
Optionally, in some embodiments, the target training layer is the last layer of the network traffic large model to be updated.
Optionally, in some embodiments, the conversion module 200 is further configured to: acquire a current task of the network traffic large model to be updated; and convert the network traffic data into input data and output data according to the current task, based on the input format standard and the output format standard.
It should be noted that, the explanation of the foregoing embodiment of the method for training a data center network traffic model supporting rapid adaptation of a scene is also applicable to the device for training a data center network traffic model supporting rapid adaptation of a scene in this embodiment, which is not described herein again.
According to the data center network traffic model training device supporting rapid scene adaptation provided by the embodiment of the invention, network traffic data in a current network traffic scenario is acquired; the network traffic data is converted into input data and output data based on the input format standard and the output format standard of the target training layer of the network traffic large model to be updated; the low-rank parameter matrix of the target training layer is trained using the input data and the output data based on the preset LoRA model; and the network traffic large model to be updated is updated with the trained low-rank parameter matrix, so that rapid adaptation to a new network traffic scenario is achieved with the updated network traffic large model. This solves the problem that existing traffic generation methods incur a large training cost when adapting to a new scenario by retraining the whole model, as well as the problem that existing methods only generate network traffic without considering the additional content required by various downstream applications. Retraining of the whole model is avoided, the retraining time is shortened while the fidelity of the generated traffic is preserved, rapid adaptation to new scenarios is achieved, and the network traffic large model can generate traffic data accurately adapted to the downstream task, improving the performance of the generated traffic on the downstream task.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device may include:
Memory 501, processor 502, and a computer program stored on memory 501 and executable on processor 502.
The processor 502 implements the data center network traffic model training method supporting rapid adaptation of the scenario provided in the above embodiment when executing the program.
Further, the electronic device further includes:
A communication interface 503 for communication between the memory 501 and the processor 502.
Memory 501 for storing a computer program executable on processor 502.
The memory 501 may include high-speed RAM memory, and may also include non-volatile memory, such as at least one disk memory.
If the memory 501, the processor 502, and the communication interface 503 are implemented independently, the communication interface 503, the memory 501, and the processor 502 may be connected to each other via a bus and communicate with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, among others. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in fig. 5, but this does not mean there is only one bus or one type of bus.
Alternatively, in a specific implementation, if the memory 501, the processor 502, and the communication interface 503 are integrated on a chip, the memory 501, the processor 502, and the communication interface 503 may perform communication with each other through internal interfaces.
The processor 502 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the invention.
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored, which when being executed by a processor, realizes the data center network traffic model training method supporting rapid scene adaptation.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or N embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined and combined by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "N" means at least two, for example, two, three, etc., unless specifically defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or N executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiment of the present invention includes further implementations in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present invention.
Logic and/or steps represented in the flowcharts or otherwise described herein, for example an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) having one or N wires, a portable computer cartridge (magnetic device), a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber device, and a portable Compact Disc Read-Only Memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, as the program may be electronically captured, for instance via optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the N steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If, as in another embodiment, they are implemented in hardware, any one or a combination of the following techniques well known in the art may be used: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, Programmable Gate Arrays (PGAs), Field Programmable Gate Arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like. While embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the invention, and that changes, modifications, substitutions, and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the invention.