
WO2023222185A1 - Evaluating a target domain machine learning model for deployment - Google Patents

Evaluating a target domain machine learning model for deployment

Info

Publication number
WO2023222185A1
WO2023222185A1 (PCT/EP2022/063206)
Authority
WO
WIPO (PCT)
Prior art keywords
model
domain
target domain
target
trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/EP2022/063206
Other languages
English (en)
Inventor
Hannes LARSSON
Farnaz MORADI
Andreas Johnsson
Jalil TAGHIA
Xiaoyu LAN
Masoumeh EBRAHIMI
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Telefonaktiebolaget LM Ericsson AB
Original Assignee
Telefonaktiebolaget LM Ericsson AB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget LM Ericsson AB filed Critical Telefonaktiebolaget LM Ericsson AB
Priority to PCT/EP2022/063206 priority Critical patent/WO2023222185A1/fr
Publication of WO2023222185A1 publication Critical patent/WO2023222185A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current


Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06N 3/094: Adversarial learning
    • G06N 3/096: Transfer learning

Definitions

  • the present disclosure relates generally to evaluating a target domain machine learning (ML) model for deployment, and related methods and apparatuses.
  • Transfer learning (TL) has been widely used in computer vision (CV) and natural language processing (NLP). TL has also been shown to be beneficial in other domains, such as telecommunications.
  • the transfer learning problem is defined as follows. Given a source domain DS, a learning task TS, a target domain DT, and a learning task TT, transfer learning aims to help improve the learning of the target predictive function fT(·) in DT using the knowledge in DS and TS, where DS ≠ DT and/or TS ≠ TT.
  • pre-trained ML models are transferred and fine-tuned (e.g., re-trained and/or updating one or more model weights/parameters) in a target domain to try to achieve better performance, faster training, lower computational cost, and/or consequently lower energy consumption in a target task.
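  • As a minimal sketch of such fine-tuning (Python/PyTorch; the architecture, frozen layers, and synthetic target data are assumptions for illustration, not taken from the disclosure), the transferred weights can be partially frozen and only the task head re-trained on the small labeled target set:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

source_model = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),   # feature extractor (weights transferred from the source domain)
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),               # task head (re-trained in the target domain)
)
# source_model.load_state_dict(torch.load("source_model.pt"))  # transferred weights would be loaded here

# Freeze the transferred feature extractor; fine-tune only the head on target data.
for p in source_model[:4].parameters():
    p.requires_grad = False

target_x, target_y = torch.randn(32, 16), torch.randn(32, 1)   # small labeled target set (synthetic stand-in)
loader = DataLoader(TensorDataset(target_x, target_y), batch_size=8)

optimizer = torch.optim.Adam((p for p in source_model.parameters() if p.requires_grad), lr=1e-3)
loss_fn = nn.MSELoss()
for x, y in loader:
    optimizer.zero_grad()
    loss_fn(source_model(x), y).backward()
    optimizer.step()
```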
  • as the number of available pretrained models (also known as source models) has grown, the selection of the most beneficial source model has become a challenge.
  • Fine-tuning and evaluating all existing source models is inefficient and/or computationally unacceptable.
  • the evaluation of the performance of the transferred model can be challenging if only limited data samples are available in the target domain.
  • when a ML model is trained on a source domain but used for prediction in a target domain, particularly where the underlying distributions of the input features differ between the source and target domains, the ML model may tend to perform worse in the target domain than in the source domain.
  • An example of such an ML model may be an image ML model trained in a source domain corresponding to images from a winter landscape and used for the same predictions but in a desert landscape. If there are no labeled samples from the target domain (e.g., the desert landscape from the example), there is no way to update the source model using traditional TL by fine-tuning weights.
  • Evaluation of performance of a supervised predictive model in a target domain may be lacking.
  • the field of domain adaptation may try to address potential problems when a ML model is trained on a source domain but used for prediction in a target domain where an underlying distribution of the input features between the source and target domains are different.
  • UDA: unsupervised domain adaptation.
  • training a domain adversarial neural network may be described for domain adaptation. See e.g., Ganin, Yaroslav, et al. "Domain-adversarial training of neural networks", The journal of machine learning research 17.1: 2096-2030 (2016) (referred to herein as "Ganin”).
  • a domain invariant feature representation is described that may be learned so that a classifier that works on the source domain may also work on the target domain. Training is discussed that may be done by training a model using both source samples (e.g., labeled samples) and target samples (e.g. unlabeled samples) at the same time. Simultaneous training of the domain invariant feature representation and the actual ML task may be performed with a domain classifier that may try to distinguish which domain a feature representation comes from, in addition to the task of solving a main ML task.
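  • A minimal sketch of the gradient-reversal mechanism behind such domain-adversarial training is given below (Python/PyTorch; the module sizes and names are illustrative assumptions, not Ganin's exact architecture). The reversal is the identity in the forward pass and flips the gradient sign in the backward pass, so the feature extractor is pushed to make source and target features indistinguishable while the domain classifier tries to separate them.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -lambda in the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

feature_extractor = nn.Sequential(nn.Linear(16, 32), nn.ReLU())
label_predictor   = nn.Linear(32, 3)   # main ML task (e.g., a 3-class classifier)
domain_classifier = nn.Linear(32, 2)   # source vs. target

def dann_forward(x, lambd=1.0):
    feats = feature_extractor(x)
    y_task = label_predictor(feats)                                 # supervised with labeled source samples
    y_domain = domain_classifier(GradReverse.apply(feats, lambd))   # sees both source and target samples
    return y_task, y_domain
```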
  • a generalized framework for UDA may include three design choices including a generative or discriminative base model, tie or untie weights in feature extraction, and what adversarial training objective to use. See e.g., E. Tzeng, J. Hoffman, K. Saenko and T. Darrell, "Adversarial Discriminative Domain Adaptation," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2962-2971, doi: 10.1109/CVPR.2017.316 (2017) (referred to herein as "Tzeng”). Based on this framework, such an approach may train a domain adversarial neural network by leveraging an existing discriminative source model.
  • the model may be pre-trained on the source and adapted using an adversarial domain classifier with weights that are untied in a feature extractor.
  • domain adaptation is described where a target domain and source domain are mapped through different but similar neural networks. See e.g., Rozantsev, Artem, Mathieu Salzmann, and Pascal Fua, "Beyond sharing weights for deep domain adaptation", IEEE transactions on pattern analysis and machine intelligence 41.4: 801-814 (2016) (referred to herein as "Rozantsev”).
  • some layers in both the target and source network may share weights, but some other layers may have different but similar weights. The objective may both minimize the difference between the learned feature representations and penalize weights that are too different between the source and target feature extractors.
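  • A sketch of such a weight-similarity penalty is given below (an illustrative assumption, not Rozantsev's exact regularizer): corresponding unshared parameters of the source and target feature extractors are penalized for drifting too far apart, and the penalty is added to the other loss terms.

```python
import torch
import torch.nn as nn

def weight_similarity_penalty(source_extractor: nn.Module, target_extractor: nn.Module,
                              strength: float = 1e-2) -> torch.Tensor:
    """Squared distance between corresponding (unshared) parameters of the two extractors."""
    penalty = torch.tensor(0.0)
    for w_src, w_tgt in zip(source_extractor.parameters(), target_extractor.parameters()):
        penalty = penalty + torch.sum((w_src - w_tgt) ** 2)
    return strength * penalty

# total_loss = task_loss + feature_discrepancy_loss + weight_similarity_penalty(src_fx, tgt_fx)
```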
  • UDA may be performed where, instead of using data from a single source, multiple sources with different data distributions may be used in order to try to achieve the domain adaptation. See e.g., Zhao S, Li B, Xu P, Keutzer K, "Multi-source domain adaptation in the deep learning era: A systematic survey", arXiv preprint arXiv:2002.12169 (Feb. 26, 2020) (referred to herein as "Zhao").
  • Model parameters (e.g., weights of a neural network ML model) may carry knowledge that can be shared and transferred among different domains.
  • weights of ML models that are pre-trained on data from one domain may be transferred to improve the ML model performance and speed up ML model training in different domains and/or for different tasks.
  • agents may share their ML model weights with a server.
  • the server may then aggregate them and send the aggregation to the agents to try to improve ML model performance while preserving data privacy.
  • One approach discusses predicting trends in the quality of state-of-the-art neural networks without access to training or testing data, by only investigating ML model weights. See e.g., Martin, Charles H. and Peng, Tongsu and Mahoney, Michael W., "Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data", Nature Communications (2021) (referred to herein as "Martin").
  • Another approach describes dissecting the weight space of neural networks, showing that a small subset of consecutive weights may reveal information about the training setup of a network and its hyperparameters. See e.g., Gabriel Eilertsen, Daniel Jonsson, Timo Ropinski, Jonas Unger, Anders Ynnerman, "Classifying the classifier: dissecting the weight space of neural networks", arXiv:2002.05688 (2020) (referred to herein as "Eilertsen").
  • features extracted from model weights may be used to predict a final performance of a ML model being trained to terminate underperforming runs during hyperparameter search. See e.g., Yasunori Yamada, Tetsuro Morimura, "Weight Features for Predicting Future Model Performance of Deep Neural Networks", in Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16) (2016) (referred to herein as "Yamada”).
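  • One possible featurization of model weights for such approaches is sketched below (Python; the particular per-layer statistics are assumptions chosen for illustration): each parameter tensor contributes a few summary statistics, and their concatenation forms the input features of a model evaluator.

```python
import numpy as np
import torch.nn as nn

def weight_statistics(model: nn.Module) -> np.ndarray:
    """Per-parameter-tensor summary statistics usable as input features for a model evaluator."""
    feats = []
    for p in model.parameters():
        w = p.detach().cpu().numpy().ravel()
        feats.extend([w.mean(), w.std(), np.abs(w).max(), np.linalg.norm(w)])
    return np.asarray(feats, dtype=np.float32)

# Example: a small task model yields a fixed-length weight-feature vector.
task_model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
print(weight_statistics(task_model).shape)   # 4 parameter tensors x 4 statistics = (16,)
```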
  • some approaches discussed above lack customization of a ML model for a task at hand, including lack of a ML model performance evaluator to evaluate a ML model(s) (e.g., for source selection, early stopping, hyperparameter tuning, etc.) when a validation and/or test set is not available (e.g., due to a lack of data).
  • such approaches may rely on validation and/or test performance to generate data needed to train a potential ML model evaluator, which may in turn lead to worse decisions and evaluations of ML models. See e.g., Zhao; Martin; Eilertsen.
  • Existing solutions may lack evaluation of performance of a supervised predictive model in a target domain including, e.g. where due to a lack of data, a reliable validation dataset is lacking. In other words, in cases where a reliable validation dataset is lacking in the target domain, a reliable ML model performance evaluator may not be constructed.
  • a model performance evaluator may be needed for training a supervised predictive model, among other tasks. Without a model performance evaluator, a trained predictive model may suffer either from underfitting or overfitting.
  • aspects of the present disclosure may overcome one or more problems with existing solutions which lack evaluation of performance of a supervised predictive model in a target domain. Aspects of the present disclosure may provide a method for evaluating a target domain ML model for deployment.
  • One of the applications of embodiments of the present disclosure may be, for example, telecommunication systems.
  • Aspects of the present disclosure may provide advantages with respect to validating a ML model(s) more precisely without a validation or test set, which may lead to a better ML model(s) with fewer available samples. This may lead to better decisions in ML model selection (e.g., compared to existing techniques), while still being able to train the ML model(s) better since all samples may be used for actual training.
  • One aspect of the present disclosure may provide a method performed by a node for evaluating a target domain ML model for deployment.
  • the method includes accessing data comprising (i) a plurality of first model parameters of a target domain ML model trained on a training data of the target domain, (ii) a plurality of second model parameters of a source domain ML model, and (iii) a plurality of performance metrics of the source domain ML model.
  • the method further includes calculating a first performance metric for the target domain ML model using the domain adaptive model evaluator for the trained target domain ML model.
  • the method further includes reporting the calculated first performance metric for a selection of a trained target domain ML model for deployment in the target domain.
  • Another aspect of the present disclosure may provide a node configured to evaluate a target domain ML model for deployment.
  • the node includes processing circuitry, and at least one memory coupled with the processing circuitry.
  • the memory stores program code that is executed by the processing circuitry to perform operations.
  • the operations include to access data comprising (i) a plurality of first model parameters of a target domain ML model trained on a training data of the target domain, (ii) a plurality of second model parameters of a source domain ML model, and (iii) a plurality of performance metrics of the source domain ML model.
  • the operations further include to calculate a first performance metric for the target domain ML model using the domain adaptive model evaluator for the trained target domain ML model.
  • the operations further include to report the calculated first performance metric for a selection of a trained target domain ML model for deployment in the target domain.
  • Still another aspect of the present disclosure may provide a node configured to evaluate a target domain ML model for deployment that is adapted to perform operations comprising to access data comprising (i) a plurality of first model parameters of a target domain ML model trained on a training data of the target domain, (ii) a plurality of second model parameters of a source domain ML model, and (iii) a plurality of performance metrics of the source domain ML model.
  • the operations further include to calculate a first performance metric for the target domain ML model using the domain adaptive model evaluator for the trained target domain ML model.
  • the operations further include to report the calculated first performance metric for a selection of a trained target domain ML model for deployment in the target domain.
  • Yet another aspect of the present disclosure may provide a computer program product including a non-transitory storage medium including program code to be executed by processing circuitry of a node. Execution of the program code causes the node to perform operations comprising to access data comprising (i) a plurality of first model parameters of a target domain ML model trained on a training data of the target domain, (ii) a plurality of second model parameters of a source domain ML model, and (iii) a plurality of performance metrics of the source domain ML model.
  • the operations further include to calculate a first performance metric for the target domain ML model using the domain adaptive model evaluator for the trained target domain ML model.
  • the operations further include to report the calculated first performance metric for a selection of a trained target domain ML model for deployment in the target domain.
  • Still another aspect of the present disclosure may provide a computer program including program code to be executed by processing circuitry of a node.
  • the program code causes the node to perform operations comprising to access data comprising (i) a plurality of first model parameters of a target domain ML model trained on a training data of the target domain, (ii) a plurality of second model parameters of a source domain ML model, and (iii) a plurality of performance metrics of the source domain ML model.
  • the operations further include to calculate a first performance metric for the target domain ML model using the domain adaptive model evaluator for the trained target domain ML model.
  • the operations further include to report the calculated first performance metric for a selection of a trained target domain ML model for deployment in the target domain.
  • Figure 1 is a schematic diagram of operations of a method in accordance with some embodiments of the present disclosure.
  • Figure 2 is a block diagram illustrating components and operations for a method in accordance with some embodiments of the present disclosure.
  • Figure 3 is a flowchart illustrating operations of a node for offline training of a domain adaptive model evaluator in accordance with some embodiments of the present disclosure.
  • Figure 4 is a flowchart illustrating operations of a node for online training of a domain adaptive model evaluator in accordance with some embodiments of the present disclosure.
  • Figure 5 is a sequence diagram illustrating operations of a method of the present disclosure that includes early stopping.
  • Figure 6 is a sequence diagram illustrating operations of an example embodiment of a method of the present disclosure including a hyperparameter grid search and online training of a domain adaptive model evaluator.
  • Figure 7 is a flowchart illustrating operations of a node according to some embodiments of the present disclosure.
  • Figure 8 is a block diagram of a node in accordance with some embodiments of the present disclosure.
  • a ML model for a supervised task may be trained using a dataset with features and labels.
  • Such a ML model may be evaluated using a validation set by not using all labeled samples in the training set. However, if there are only a few labeled samples available, validation of the ML model may use methods other than a validation set. See e.g., Zhao; Martin; Eilertsen.
  • The term "model evaluator" refers to a ML model for evaluating the performance of another ML model based on, e.g., neural network model weights. The evaluation can be in the form of predicting ML model performance, ranking ML models, etc.
  • a method may evaluate performance of a supervised predictive ML model in a target domain, where a reliable validation dataset may be lacking (e.g., due to lack of data).
  • a source model performance evaluator is provided for training the supervised predictive model.
  • the source model performance evaluator may be trained on source domain but applied in a target domain.
  • a domain adaptive model evaluator is adapted for a difference between the two domains. Without a domain adaptive model evaluator, the trained predictive model may suffer from either underfitting or overfitting.
  • the source model performance evaluator is constructed in a source domain where there is sufficient data for construction of a reliable validation dataset.
  • the resulting source model performance evaluator is adapted to a supervised learning task in a target domain.
  • the terms "source model performance evaluator” and “source model evaluator” are interchangeable and refer to a model evaluator to be adapted to a target task model as discussed herein.
  • the phrase a "task model” refers to the predictive model for which the model evaluator is evaluating the performance.
  • the task model can exist in a source domain (e.g., a source task model) or a target domain (e.g., a target task model).
  • Unsupervised domain adaptation (UDA) techniques may be used for adapting the source model evaluator to the target domain.
  • the adapted model evaluator can serve as the domain adaptive model evaluator in the target domain and can be used for, among other things, design of a stopping criterion for the training of the supervised predictive model.
  • Figure 1 is a schematic diagram of operations of a method in accordance with some embodiments of the present disclosure.
  • a domain adaptive model evaluator 101 from a different but related source is used and the domain adaptive model evaluator 101 is adapted (e.g., customized) to a supervised learning target domain task model 103 being evaluated.
  • the customization of the domain adaptive model evaluator 101 may be done using a UDA method.
  • the method may allow improvement of the precision of the domain adaptive model evaluator 101 for the target task without having access to validation or test sets for the target task model.
  • the domain adaptive model evaluator 101 is a ML model in charge of evaluating another ML model (e.g., target domain ML model 103, which is also referred to as a task model).
  • the evaluation may be based on weight data (e.g., weight statistics) extracted from the ML model being evaluated (e.g., target domain ML model 103, which may be a neural network).
  • weight data refers to raw weights extracted from a target domain ML model (e.g., a neural network), or some derived weight statistics thereof.
  • the domain adaptive model evaluator 101 takes as its inputs samples of one or more of: a source model evaluator (which is a ML model in itself); source weight data; source targets corresponding to the weight data; and weight data of the target domain predictive ML model 103.
  • the input samples may come from a source buffer 105 and a target buffer 107.
  • the domain adaptive model evaluator 101 generates an adapted version of the source model evaluator which can be used for the performance evaluation of the target domain predictive ML model 103.
  • An example embodiment of inputs to the domain adaptive model evaluator 101 includes, without limitation, raw weights from the target domain ML model 103. For training, raw weights of the source task model may also be used as inputs, e.g., as described in the UDA literature.
  • Outputs of the domain adaptive model evaluator 101 include a metric such as ML model performance (e.g., model accuracy), as in the sketch below.
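  • A minimal sketch of such an evaluator is given below (Python/PyTorch; the input size, layer sizes, and the regression target are illustrative assumptions): a small regressor maps a weight-data vector to a predicted performance value.

```python
import torch
import torch.nn as nn

evaluator = nn.Sequential(          # maps weight data (or weight statistics) to predicted performance
    nn.Linear(19, 32), nn.ReLU(),   # 19-dimensional weight vector, as in the Iris experiment below
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, 1),               # predicted performance metric (e.g., accuracy or NMAE)
)
weight_data = torch.randn(1, 19)    # weight data extracted from a task model (stand-in)
predicted_performance = evaluator(weight_data)
```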
  • a task model/target domain ML model is a predictive ML model in the target domain to be/being evaluated.
  • the task model is trained using labeled data.
  • The terms "labeled" or "labels" refer to what is predicted by the task model, e.g., in both regression and classification cases.
  • performance of the task model is evaluated using the domain adaptive model evaluator 101 instead of using a validation set (e.g., due to a lack of data).
  • a source buffer 105 comprises a database with samples from a source training dataset.
  • the samples may contain both features (e.g., weight statistics) and labels (e.g., model performance or similar).
  • some samples from the source may be available for domain adaptation training.
  • samples from both the source and the target may be used at the same time to learn domain invariant mappings for the source and target.
  • a target buffer 107 comprises a database with samples from the target domain and may contain only weight data (e.g., weight statistics). Samples from the target buffer 107 may be used as input to the domain adaptive model evaluator 101 to adapt a domain adaptation component as discussed further herein.
  • samples from the source training dataset (e.g., from source buffer 105) and from the target domain (e.g., from target buffer 107) are used to train the domain adaptive model evaluator 101.
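  • A minimal sketch of the two buffers as simple in-memory stores is given below (a database could back them instead; the field names are assumptions): source samples carry weight data together with a performance label, while target samples carry weight data only.

```python
from dataclasses import dataclass, field
from typing import List

import numpy as np

@dataclass
class SourceBuffer:
    """Samples from the source training dataset: weight data plus performance labels."""
    weight_data: List[np.ndarray] = field(default_factory=list)
    performance: List[float] = field(default_factory=list)

    def add(self, weights: np.ndarray, perf: float) -> None:
        self.weight_data.append(weights)
        self.performance.append(perf)

@dataclass
class TargetBuffer:
    """Samples from the target domain: weight data only (no performance labels available)."""
    weight_data: List[np.ndarray] = field(default_factory=list)

    def add(self, weights: np.ndarray) -> None:
        self.weight_data.append(weights)
```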
  • FIG. 2 is a block diagram illustrating components and operations for a method in accordance with some embodiments of the present disclosure.
  • Source model performance evaluator 211 is constructed in a source domain 200 where there is sufficient data 201, 203, 205 for construction of a reliable validation dataset.
  • Data 201, 203, 205 includes a training dataset 201, a validation dataset 203, and a predictive ML model 205.
  • the resulting source model performance evaluator 211 is adapted to a supervised learning task in a target domain 207.
  • Target domain 207 includes a training dataset 209 and a target task model 103.
  • UDA 213 is used for adapting the source model performance evaluator 211 to the target domain 207.
  • UDA 213 may use any UDA technique, as discussed further herein, which may allow improvement of the precision of the domain adaptive model evaluator 101 for the target task without having access to validation or test sets for the target task model 103.
  • the domain adaptive model evaluator 101 takes as its inputs one or more samples of the data 201, 203, 205 and the source model performance evaluator 211 from the source domain 200 and weight data of the target domain predictive ML model 103.
  • the input samples may come from a source buffer 105 and a target buffer 107.
  • the domain adaptive model evaluator 101 generates an adapted version of the source model evaluator 211 which can be used for the performance evaluation of the target domain predictive ML model 103.
  • the adapted source model performance evaluator 211 serves as the domain adaptive model evaluator 101 in the target domain 207 and can be used for, among other things, design of a stopping criterion for the training of the supervised predictive model 205.
  • FIG. 3 is a flowchart illustrating operations of a node for offline training of a domain adaptive model evaluator 101 in accordance with some embodiments of the present disclosure.
  • the node can be node 800 of Figure 8 that is configured to evaluate a target domain ML model for deployment.
  • ML model evaluation decisions are not needed until ML models are fully trained, which allows the domain adaptive model evaluator 101 to be trained in an offline fashion and may simplify training.
  • data 201/203/205 is received from a source domain 200 (e.g., source task model weights, corresponding source task model performances, optionally a source model evaluator). Task model weights and corresponding model performances may be put in a source buffer.
  • the domain adaptive model evaluator 101 is initialized.
  • a target task model 103 is trained (operation 305) and, during training, weight data 209 (e.g., weight statistics) are collected.
  • the weight data 209 is sent to a target buffer (e.g., target buffer 107).
  • the node determines whether all target domain ML models that will be evaluated are done training. If no, operations return to operation 305 for the next task model. If yes, the method proceeds to operation 311. In operation 311, the domain adaptive model evaluator 101 is trained, e.g., by adapting a ML model (e.g., source model performance evaluator 211).
  • Candidate ML models (e.g., target domain ML model 103) are evaluated in operation 313, and a decision/selection is made on which candidate ML model to use based on the evaluation results output from the domain adaptive model evaluator 101.
  • FIG. 4 is a flowchart illustrating operations of a node for online training of a domain adaptive model evaluator in accordance with some embodiments of the present disclosure.
  • the node can be node 800 of Figure 8 that is configured to evaluate a target domain ML model for deployment.
  • Online training may be suitable for, e.g., early stopping.
  • Design parameters for online training include a number (n) of iterations/epochs between updates of the domain adaptive model evaluator 101.
  • the domain adaptive model evaluator 101 is initialized. In one embodiment, the initialization is conducted by copying neural-network weights from the weights of a selected source model evaluator 211.
  • initialization may begin with operation 403 where an existing domain adaptive model evaluator 101 from previous iterations of the same task model is used as the domain adaptive model evaluator 101.
  • the task model in the target domain 103 is trained for a number of iterations/epochs.
  • Weight data e.g., weight statistics
  • the collected weight data may be sent to a target buffer (if used), or directly as inputs to the domain adaptive model evaluator 101 for training.
  • the domain adaptive model evaluator 101 is updated (e.g., a domain adaptation component of the domain adaptive model evaluator is updated, as discussed further herein) using available data (target and source). This may be done using a method for UDA.
  • the node determines whether the task model 103 has finished training. If no, the method returns to operation 405. If yes, the method proceeds to operation 411.
  • the determination may be performed in various ways. For example, by early stopping or training for a pre-determined number of epochs. When the determination is performed by training for a pre-determined number of epochs, the task model 103 performs the determination. When the determination is performed by early stopping, the determination is performed by sending weights of the task model 103 to the domain adaptive model evaluator 101. If the stopping criterion is met, then training is terminated and the method proceeds to operation 411. Otherwise, training continues by repeating operations 405 to 409.
  • the domain adaptive model evaluator can be used for the purpose of early stopping. If so, a decision can be made in operation 409 by the domain adaptive model evaluator 101 in choosing between models from the same training pass. In operation 411, the node determines whether more ML models 103 should be trained for the target task. If yes, the method proceeds to operation 403. If no, the training of the domain adaptive model evaluator 101 and task model 103 ends.
  • a decision can be made at different steps of the method. In the case of early stopping, a decision can be made at operation 409. Otherwise, the decision can be made at operation 413.
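  • A minimal sketch of such an evaluator-driven early-stopping loop is given below (Python; the three callables are hypothetical stand-ins for the task-model training round, the weight-data extraction, and the domain adaptive model evaluator's score prediction): training stops once the predicted score has not improved for a fixed number of checks.

```python
from typing import Callable, List

def train_with_dame_early_stopping(
    predict_score: Callable[[List[float]], float],  # DAME: weight data -> predicted performance score
    get_weight_data: Callable[[], List[float]],     # extracts weight data from the task model
    train_one_round: Callable[[], None],            # trains the task model for n iterations/epochs
    patience: int = 3,
    max_rounds: int = 100,
) -> float:
    best, stale = float("-inf"), 0
    for _ in range(max_rounds):
        train_one_round()
        score = predict_score(get_weight_data())
        if score > best:
            best, stale = score, 0
        else:
            stale += 1
        if stale >= patience:        # stopping criterion met: no improvement for `patience` checks
            break
    return best

# Example with stub callables (assumed behavior, for illustration only):
scores = iter([0.5, 0.6, 0.62, 0.61, 0.60, 0.59])
print(train_with_dame_early_stopping(
    predict_score=lambda w: next(scores),
    get_weight_data=lambda: [],
    train_one_round=lambda: None,
))   # prints 0.62, the best predicted score before stopping
```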
  • the method for evaluating a target domain ML model 103 and making a decision on model selection comprises the following operations: initializing the domain adaptive model evaluator 101.
  • the domain adaptive model evaluator 101 may be based on a neural-network architecture.
  • the operations further include receiving training data from the source 200 and target 207 domains, respectively; and partially training the target domain ML model 103 with an existing training dataset in the target domain 207.
  • the operations further include training (that is, updating) the domain adaptive model evaluator 101 using weight data (e.g., weight statistics) obtained from the partially trained target domain ML model 103 and source domain weight data (e.g., weight statistics) and models 211.
  • the operations further include calculating an evaluation score using the domain adaptive model evaluator 101 for the target domain ML model 103.
  • the operations further include generating a decision on ML model selection for target domain ML models 103. It is noted that decisions can be made in both operations 409 and 413 at the same time; that is, the domain adaptive model evaluator 101 may be used for early stopping and then used again for a hyperparameter search.
  • Training of the domain adaptive model evaluator 101 for UDA can be performed in various ways.
  • UDA 213 can (i) leverage an existing source model 211 and adapt the existing source model 211, such as in adversarial discriminative domain adaptation (ADDA) (e.g., an unsupervised domain adaptation framework where target domain data is unlabeled, and ADDA may reduce a difference(s) between source and target domain distributions); (ii) train a model from scratch using samples from both source and target domain; or (iii) utilize multiple sources; etc.
  • in a domain adversarial neural network (DANN) approach, a model may be initialized randomly and trained jointly with both source and target samples.
  • FIG. 5 is a sequence diagram illustrating operations of a method of the present disclosure that includes early stopping.
  • Target domain ML model 103 is trained, including collection of data samples, in operations 501-511.
  • Target domain ML model 103 signals a request (operation 501) for model evaluation to domain adaptive model evaluator (DAME) 101.
  • DAME 101 signals request 503 to source model evaluator 211.
  • In operation 505, source model evaluator 211 sends samples (e.g., weight statistics and labels) to source buffer 105.
  • In operation 507, target domain ML model 103 sends samples (e.g., weight statistics) to target buffer 107.
  • In operation 509, target buffer 107 sends the samples from operation 507 to DAME 101; and in operation 511, source buffer 105 sends the samples from operation 505 to DAME 101.
  • DAME 101 is trained using the received samples.
  • In operation 513, DAME 101 informs target domain ML model 103 that DAME 101 is updated/trained. Responsive to the information, target domain ML model 103, in operation 515, sends weight statistics for evaluation to DAME 101. Responsive to receiving the weight statistics for evaluation, in operation 517, DAME 101 sends a result of weight evaluation to target domain ML model 103. Target domain ML model 103 collects samples and is trained with the result of the weight evaluation.
  • In operation 519, target domain ML model 103 sends samples (e.g., weight statistics) to target buffer 107. Responsive to receiving the samples, in operation 521, target buffer 107 sends samples from the target buffer 107 to DAME 101. In operation 523, source buffer 105 sends samples from source buffer 105 to DAME 101. DAME 101 is trained with the received samples.
  • In operation 525, DAME 101 informs target domain ML model 103 that DAME 101 is updated/trained. Responsive to operation 525, in operation 527, target domain ML model 103 sends weight statistics to DAME 101 for evaluation. Responsive to operation 527, in operation 529, DAME 101 sends a result of weight evaluation to target domain ML model 103. Training of target domain ML model 103 is stopped based on DAME 101 feedback. In operation 531, target domain ML model 103 informs DAME 101 that training of the target ML model 103 is finished, but more target domain ML models 103 may be evaluated. Responsive to this information, DAME 101 is kept and waits for the next model evaluation request.
  • an example task includes a source and a target in similar domains, and the domain adaptive model evaluator 101 is used for hyperparameter tuning using an exhaustive search (e.g., a grid search).
  • An online method is used for the domain adaptation.
  • samples from source and target domains 200, 207 are used to construct the domain adaptive model evaluator 101 jointly.
  • a source model evaluator 211 also can be an input.
  • the example embodiment includes the following:
  • a target task comprising key performance indicator (KPI) prediction for a service running on a cloud infrastructure (e.g., a Video-on-Demand (VoD) framerate). Due to a limited number of samples, the model is evaluated without splitting a small data set into train, validation, and test sets. The domain adaptive model evaluator 101 is used to find a good configuration of hyperparameters.
  • a source task comprising KPI prediction for another service running on the cloud infrastructure (e.g., distributed Key-Value-Store (KVS) average read time), which includes a large data set with access to validation and test sets in addition to a large training set.
  • a source task model 211 is trained for a source task of predicting average read time for KVS. During training, neural-network weights are collected from the source task model 211, together with corresponding model performance. These weights and the model performances are stored (e.g., in a source buffer 105).
  • the weights and model performances from the source task model 211 are sent to a source buffer 105.
  • during training of the task model 103 (e.g., predicting VoD framerate), the weights are occasionally saved and sent to a target buffer 107.
  • training of the domain adaptive model evaluator 101 begins. Training methodology can be performed as described. For example, training discussed in Ganin may be expanded.
  • the domain classifier, coupled with a gradient reversal layer, may be used to update the feature representations so that they are indistinguishable between the source and target domains, while at the same time the domain classifier maximizes its own ability to distinguish between the domains, all in a single optimization step.
  • Inputs include target task model 103 weights and source task model weights, and outputs include source task model performances and domain classes.
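  • A minimal sketch of one such optimization step for the domain adaptive model evaluator 101 is shown below (Python/PyTorch). Synthetic tensors stand in for the source buffer 105 and target buffer 107, and the gradient-reversal helper, layer sizes, and loss choices are illustrative assumptions rather than the exact training recipe: the performance head is supervised only with source performances, while the domain head sees both domains through the gradient reversal layer.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):  # same helper as in the earlier DANN sketch
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

features  = nn.Sequential(nn.Linear(19, 32), nn.ReLU())  # 19 task-model weights, as in the experiment
perf_head = nn.Linear(32, 1)                             # predicts source task-model performance
dom_head  = nn.Linear(32, 2)                             # classifies the domain of the weight data

params = [*features.parameters(), *perf_head.parameters(), *dom_head.parameters()]
optimizer = torch.optim.Adam(params, lr=1e-3)

src_weights = torch.randn(64, 19)   # source task-model weight data (from source buffer 105)
src_perf    = torch.rand(64, 1)     # corresponding source task-model performances
tgt_weights = torch.randn(64, 19)   # target task-model weight data (from target buffer 107), unlabeled

optimizer.zero_grad()
src_feat, tgt_feat = features(src_weights), features(tgt_weights)
performance_loss = nn.functional.mse_loss(perf_head(src_feat), src_perf)      # supervised on source only
domain_input = GradReverse.apply(torch.cat([src_feat, tgt_feat]), 1.0)
domain_labels = torch.cat([torch.zeros(64, dtype=torch.long), torch.ones(64, dtype=torch.long)])
domain_loss = nn.functional.cross_entropy(dom_head(domain_input), domain_labels)
(performance_loss + domain_loss).backward()   # reversal pushes features toward domain invariance
optimizer.step()
```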
  • the domain adaptive model evaluator 101 can continue to be trained, which may improve the domain adaptive model evaluator 101.
  • Table 1 shows different features used for different ML models of this example embodiment.
  • Table 1 also includes a source model evaluator 211 (that is, a model evaluator trained on the source that uses a validation set).
  • the flow chart of Figure 3 corresponds to operations of the method of this example embodiment.
  • Figure 6 is a sequence diagram illustrating operations of the example embodiment discussed above for a hyperparameter grid search and online training of the domain adaptive model evaluator.
  • the method of Figure 6 includes performance of the operations 501-531 of Figure 5.
  • The portions of Figure 5 that are included in Figure 6 (that is, operations 501-531 of Figure 5) are not repeated for Figure 6, but rather are incorporated by reference in their entirety with respect to Figure 6.
  • further, operation 601 is performed, in which operations 505-531 are performed for each new hyperparameter configuration and all final models are stored. Further operations 603-607 are now discussed herein with respect to Figure 6.
  • In operation 603, target domain ML model 103 sends weight statistics of final models for each hyperparameter configuration to DAME 101. Responsive to operation 603, in operation 605, DAME 101 sends scores of each candidate target domain ML model to target domain ML model 103. In operation 607, target domain ML model 103 selects the candidate target domain ML model having the highest score for deployment, as in the sketch below.
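  • A minimal sketch of this selection step is given below (the scoring function is a stub standing in for the trained DAME 101, and the configurations and weight statistics are made-up values): each candidate's weight data is scored and the highest-scoring configuration is kept for deployment.

```python
from typing import Dict, List

def dame_score(weight_stats: List[float]) -> float:
    """Stub for the trained domain adaptive model evaluator: returns a predicted performance score."""
    return sum(weight_stats) / len(weight_stats)

candidates: Dict[str, List[float]] = {        # hyperparameter configuration -> final-model weight statistics
    "lr=1e-2,hidden=16": [0.12, 0.40, 0.33],
    "lr=1e-3,hidden=32": [0.25, 0.44, 0.38],
    "lr=1e-4,hidden=64": [0.05, 0.20, 0.31],
}
best_config = max(candidates, key=lambda cfg: dame_score(candidates[cfg]))
print(best_config)   # configuration whose final model is selected for deployment
```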
  • The Iris flower dataset is a well-known small dataset, which leads to small models that are well suited for a conceptual evaluation of the model weights. While the experiment was performed with an Iris dataset, the method of the present disclosure is not so limited and can include, without limitation, performance with other methods from the domain adaptation domain, a change of hyperparameters, etc.
  • the evaluation results showed performance of a model evaluator 211 trained on a source task but not the actual target problem, and the results of the domain adaptive model evaluator 101 adapted to the new task using an ADDA-method, as described herein.
  • Both the source and target task used the Iris flower data set, where a type of flower is classified from four features. There were three possible classes.
  • the source data set was used with a task model trained with the input features normalized.
  • the target data set was used with a task model 103 trained with the Iris flower dataset features un-normalized. This led to the source and target task models needing different weights to solve the same tasks, which in turn led to a domain shift between the source and target model.
  • the weight data collected from the task models in order to train the domain evaluators comprised all parameters from the task models (19 in total in the experiment) and the corresponding normalized mean absolute error (NMAE) on a test set.
  • the corresponding NMAE from the target set was only used for evaluating the models, as it was not used during training. Early stopping was not done on the domain-adapted models, in order to model the lack of access to labeled data. Evaluation of domain adaptation models, however, also may be done with methods for stopping, such as evaluating source model performance in the case of DANN training (see e.g., Ganin), or evaluating KL-divergence for the representations learned from source and target in ADDA, cycle loss, etc.
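  • For reference, one common definition of NMAE is sketched below (the exact normalization used in the experiment is not stated here; this sketch normalizes the mean absolute error by the mean absolute target value).

```python
import numpy as np

def nmae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Normalized mean absolute error: mean(|y - y_hat|) / mean(|y|)."""
    return float(np.mean(np.abs(y_true - y_pred)) / np.mean(np.abs(y_true)))

print(nmae(np.array([1.0, 2.0, 3.0]), np.array([1.1, 1.8, 3.3])))   # 0.1
```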
  • FIG. 7 is a flowchart illustrating operations of a node according to some embodiments of the present disclosure.
  • the node can be node 800 of Figure 8 that is configured to evaluate a target domain ML model (e.g., target domain ML model 103) for deployment.
  • the method includes accessing (705) data comprising (i) a plurality of first model parameters of a target domain ML model trained on a training data of the target domain, (ii) a plurality of second model parameters of a source domain ML model, and (iii) a plurality of performance metrics of the source domain ML model.
  • the method further includes training (707) a domain adaptive model evaluator (e.g., domain adaptive model evaluator 101) with the accessed data; and calculating (709) a first performance metric for the target domain ML model using the trained domain adaptive model evaluator.
  • the method further includes reporting (713) the calculated first performance metric for a selection of a trained target domain ML model for deployment in the target domain.
  • the domain adaptive model evaluator comprises a ML model trained on the source domain that is adapted to the target domain.
  • the ML model may be a neural network.
  • the plurality of first model parameters and the plurality of second model parameters may comprise a plurality of weight and/or performance data of the target domain ML model and a plurality of weight and/or performance data of the source domain ML model, respectively.
  • the method further includes training (707) a domain adaptive model evaluator with the accessed data.
  • the method further includes initializing (701) the domain adaptive model evaluator; receiving (703) the training data from the target domain; repeating (711) the accessing, the training, and the calculating for a number of iterations or epochs for at least one additional target domain ML model; and reporting (715) the respective calculated first performance metrics from the repeating (711) for a selection of one of the target domain ML models from the iterations or epochs for deployment in the target domain.
  • the training data of the target domain comprises a plurality of weights or functions of the plurality of weights from the target domain ML model being evaluated.
  • the first performance metric may comprise a score.
  • the selection may comprise selecting a trained target domain ML model for deployment in the target domain based on the score.
  • the selected trained target domain ML model comprises at least one of a trained target domain ML model having a highest score, and a plurality of trained target domain ML models that respectively have a defined score.
  • the initializing (701) may include one of (i) copying weights of a source domain ML evaluator, (ii) using the domain adaptive model evaluator, wherein the domain adaptive model evaluator exists from a prior iteration of a prior source domain ML model or a prior target domain ML model, and (iii) initializing randomly.
  • the accessing (705) data may be performed after a defined number of the iterations or epochs.
  • the accessing (705) data may be (i) received from a target memory, or (ii) received at the domain adaptive model evaluator from the target domain ML model being evaluated.
  • the training (707) may include an unsupervised domain adaptation (UDA) procedure.
  • the method further includes receiving (717) information from the target domain ML model being evaluated.
  • the information may include that the target domain ML model is finished training based on (i) satisfying a stopping criterion, or (ii) feedback from the domain adaptive model evaluator based on completion of a defined number of iterations or epochs.
  • the stopping criterion may include a second performance metric that identifies whether a performance of the target domain ML model satisfies the second performance metric based on weights from the target domain ML model in an iteration or between iterations.
  • the selection may include selection of the trained target domain ML model from a same training pass based on the first performance metric.
  • the selection may include the selection of the trained target domain ML model from the defined number of iterations or epochs based on the first performance metric.
  • Various operations from the flow chart of Figure 7 may be optional with respect to some embodiments of nodes and related methods. For example, operations 701, 703, 707, 711, 715 and/or 717 may be optional.
  • FIG. 8 is a block diagram of an overview of a node in accordance with some embodiments of the present disclosure.
  • Node 800 may be a server or other computing device that performs operations discussed herein with respect to a node.
  • node 800 may be a node performing model management in an open radio access network (O-RAN) architecture, including where a limited amount of data has been collected for a target domain.
  • the node includes domain adaptive model evaluator 101.
  • node 800 and/or an element(s)/function(s) thereof may be embodied as a virtual node/nodes and/or a virtual machine/machines.
  • Input data to node 800 may be provided by an input source such as source buffer 105 and/or target buffer 107 that is connected over a network interface 807 (e.g., over Ethernet or Infiniband).
  • An output evaluation may also be provided over network interface 807.
  • Figure 8 further includes processor 803 (also referred to as processing circuitry) and memory 805 (also referred to as a memory circuit).
  • the memory 805 stores computer readable program code that, when executed by the processor 803, causes the processor 803 to perform operations according to embodiments disclosed herein.
  • processor 803 may be defined to include memory so that a separate memory is not required.
  • processing circuitry 803 may control network interface 807 to transmit communications/data through network interface 807 to one or more task models (task models 103) or nodes/devices including/communicating with one or more task models, and/or to receive communications through network interface 807 from one or more buffers, ML models, or other nodes and/or devices.
  • processing circuitry 803 may provide instructions so that when the instructions are executed by processor 803, processor 803 performs respective operations with respect to domain adaptive model evaluator 101 (e.g., operations discussed herein with respect to example embodiments relating to nodes).
  • nodes may comprise multiple different physical components that make up a single illustrated component, and functionality may be partitioned between separate components.
  • a communication interface may be configured to include any of the components described herein, and/or the functionality of the components may be partitioned between the processing circuitry and the communication interface.
  • non-computationally intensive functions of any of such components may be implemented in software or firmware and computationally intensive functions may be implemented in hardware.
  • some or all of the functionality may be provided by processing circuitry executing instructions stored in memory, which in certain embodiments may be a computer program product in the form of a non-transitory computer-readable storage medium.
  • some or all of the functionality may be provided by the processing circuitry without executing instructions stored on a separate or discrete device-readable storage medium, such as in a hard-wired manner.
  • the processing circuitry can be configured to perform the described functionality. The benefits provided by such functionality are not limited to the processing circuitry alone or to other components of the computing device, but are enjoyed by the node as a whole, and/or by end users generally.
  • the terms "comprise", "comprising", "comprises", "include", "including", "includes", "have", "has", "having", or variants thereof are open-ended, and include one or more stated features, integers, elements, steps, components or functions but do not preclude the presence or addition of one or more other features, integers, elements, steps, components, functions or groups thereof.
  • the common abbreviation “e.g.”, which derives from the Latin phrase “exempli gratia” may be used to introduce or specify a general example or examples of a previously mentioned item, and is not intended to be limiting of such item.
  • the common abbreviation “i.e.”, which derives from the Latin phrase “id est,” may be used to specify a particular item from a more general recitation.
  • Example embodiments are described herein with reference to block diagrams and/or flowchart illustrations of computer-implemented methods, apparatus (systems and/or devices) and/or computer program products. It is understood that a block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions that are performed by one or more computer circuits.
  • These computer program instructions may be provided to a processor circuit of a general purpose computer circuit, special purpose computer circuit, and/or other programmable data processing circuit to produce a machine, such that the instructions, which execute via the processor of the computer and/or other programmable data processing apparatus, transform and control transistors, values stored in memory locations, and other hardware components within such circuitry to implement the functions/acts specified in the block diagrams and/or flowchart block or blocks, and thereby create means (functionality) and/or structure for implementing the functions/acts specified in the block diagrams and/or flowchart block(s).
  • These computer program instructions may also be stored in a tangible computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the functions/acts specified in the block diagrams and/or flowchart block or blocks. Accordingly, embodiments of the present disclosure may be embodied in hardware and/or in software (including firmware, resident software, micro-code, etc.) that runs on a processor such as a digital signal processor, which may collectively be referred to as "circuitry," "a module” or variants thereof.

Abstract

A computer-implemented method is performed by a node for evaluating a target domain machine learning (ML) model for deployment. The method includes accessing data comprising (i) a plurality of first model parameters of a target domain ML model trained on training data of the target domain, (ii) a plurality of second model parameters of a source domain ML model, and (iii) a plurality of performance metrics of the source domain ML model. The method further includes calculating a first performance metric for the target domain ML model; and reporting the calculated first performance metric for a selection of a trained target domain ML model for deployment in the target domain.
PCT/EP2022/063206 2022-05-16 2022-05-16 Evaluating a target domain machine learning model for deployment Ceased WO2023222185A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/063206 WO2023222185A1 (fr) 2022-05-16 2022-05-16 Evaluating a target domain machine learning model for deployment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/063206 WO2023222185A1 (fr) 2022-05-16 2022-05-16 Evaluating a target domain machine learning model for deployment

Publications (1)

Publication Number Publication Date
WO2023222185A1 true WO2023222185A1 (fr) 2023-11-23

Family

ID=82021036

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2022/063206 Ceased WO2023222185A1 (fr) Evaluating a target domain machine learning model for deployment

Country Status (1)

Country Link
WO (1) WO2023222185A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118132414A (zh) * 2023-12-28 2024-06-04 江淮前沿技术协同创新中心 Robot intelligence evaluation method and system
CN119988915A (zh) * 2025-04-17 2025-05-13 之江实验室 Evaluation method and public evaluation platform for a visual language model
WO2025107640A1 (fr) * 2023-11-24 2025-05-30 华为技术有限公司 AI model reuse method and apparatus

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
Alawieh, Mohamed Baker, et al.: "ADAPT: An Adaptive Machine Learning Framework with Application to Lithography Hotspot Detection", 2021 ACM/IEEE 3rd Workshop on Machine Learning for CAD (MLCAD), IEEE, 30 August 2021, pages 1-6, XP033970276, DOI: 10.1109/MLCAD52597.2021.9531210 *
E. Tzeng, J. Hoffman, K. Saenko and T. Darrell: "Adversarial Discriminative Domain Adaptation", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pages 2962-2971
Gabriel Eilertsen, Daniel Jonsson, Timo Ropinski, Jonas Unger, Anders Ynnerman: "Classifying the classifier: dissecting the weight space of neural networks", arXiv:2002.05688, 2020
Ganin, Yaroslav, et al.: "Domain-adversarial training of neural networks", The Journal of Machine Learning Research, vol. 17, no. 1, 2016, pages 2096-2030
Larsson, Hannes, et al.: "Source Selection in Transfer Learning for Improved Service Performance Predictions", 2021 IFIP Networking Conference (IFIP Networking), IFIP, 21 June 2021, pages 1-9, XP033939035, DOI: 10.23919/IFIPNETWORKING52078.2021.9472818 *
Martin, Charles H., Peng, Tongsu, and Mahoney, Michael W.: "Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data", Nature Communications, 2021
Rozantsev, Artem, Mathieu Salzmann, and Pascal Fua: "Beyond sharing weights for deep domain adaptation", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 4, 2018, pages 801-814, XP011712935, DOI: 10.1109/TPAMI.2018.2814042
Thomas Unterthiner, Daniel Keysers, Sylvain Gelly, Olivier Bousquet, Ilya Tolstikhin: "Predicting Neural Network Accuracy from Weights", arXiv:2002.11448, 2020
Yasunori Yamada, Tetsuro Morimura: "Weight Features for Predicting Future Model Performance of Deep Neural Networks", Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16), 2016
Zhao S., Li B., Xu P., Keutzer K.: "Multi-source domain adaptation in the deep learning era: A systematic survey", arXiv preprint arXiv:2002.12169, 26 February 2020

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22730086

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22730086

Country of ref document: EP

Kind code of ref document: A1