EP4487257A1

EP4487257A1 - Method and system for improving the robustness of a compressed machine learning model

Info

Publication number: EP4487257A1
Application number: EP22712251.2A
Authority: EP
Inventors: Peter Schlicht; Fabian HÜGER; Jan Zawadzki
Original assignee: Cariad SE
Current assignee: Cariad SE
Priority date: 2022-03-02
Filing date: 2022-03-02
Publication date: 2025-01-08
Also published as: WO2023165670A1

Abstract

The present invention discloses a method and a system for improving the robustness of a compressed machine learning model. The computer-implemented method for improving the robustness of a compressed machine learning model comprises the following steps: (i) training a machine learning model, (ii) obtaining metadata for said machine learning model (iii) obtaining from a database unlabeled data, (iv) perturbing and or augmenting the unlabeled data, (v) re-training the machine learning model using the perturbed and/or augmented data (vi) and compressing the machine learning model by optimizing the machine learning model architecture and/or quantizing the machine learning model metadata

Description

Method and system for improving the robustness of a compressed machine learning model

When bringing artificial intelligence models into safety-critical applications like healthcare or driver assistance, robustness of the models with respect to data perturbations is compulsory. Robustness is the propriety of a model to react to missing data and/or corrupted data (e.g. adversarial attacks, strong weather influence on sensor data etc.). In addition when applications run on mobile devices or in near real time, compute, storage and communication resources are scarce as those products have less resources available relative to non-portable devices. In those cases, models have to be compressed in order to meet the required latency and hardware constraints while still maintaining robustness respect to adversarial attacks.

One possibility for compressing a models is pruning. Pruning is the process of removing weight connections in a machine learning model in order to increase inference speed and decrease model storage size. Usually the weight connections that are removed, are not used in the model (as it happens in over parametrized or redundant models) so there is no loss of information by removing them.

Another possibility for compressing a model is knowledge distillation. Knowledge distillation is a model compression technique whereby a small network (called “student”) is taught by a larger trained neural network (called “teacher”). Basically, the smaller network is trained to behave like the large neural network. This enables the deployment of the student models on small devices such as mobile phones or other edge devices.

A third possibility for compressing a model is quantization. Quantization reduces the precision of a trained model by converting a high-precision format to a low-precision format (e.g. 32-bit to 4 bit parameter representations), without sacrificing model accuracy which is retained to almost the same level either prior to or post quantization.

Regarding robustness, a method for increasing robustness is to train the machine learning model using perturbed data to simulate malicious attacks and/or missing data. This can be done by adversarial training or by perturbing a given data set using augmentation based method.

Data augmentation allows to artificially modify existing data sets by applying small modifications to them. As example, during data augmentation existing data sets can be randomly rotated, translated, cropped or the colors can be modified and/or altered. Additionally random objects could be added to the data.

Adversarial training is a method for improving the robustness of a machine learning model and consists of including adversarial examples in the training set or in the loss function of the machine learning model. An adversarial example is a corrupted version of a valid input, where the corruption is done by adding a perturbation of a small magnitude to the input in such a way that the model output is changed drastically.

The process of increasing the robustness of a machine learning model is also known as model hardening.

Compression and hardening are usually supervised post-processing steps which are performed independently from each other, see for example “Robustness of Compressed Deep Neural Networks with Adversarial Training” by Yunchun et al. “Supervised” means that a large annotated (labeled) datasets is used for fine-tuning. “Labeling data” means that typically, human agents have to manually annotate the data by using annotation computer programs, where the images to be annotated have been uploaded. During this process a label is then assigned to identified object. A label describes the class to which an object belongs. Examples of classes are vehicles, bicycles, buildings, pedestrian, traffic signs, traffic lights and so on. Because the process of manually annotating data is quite expensive and cumbersome, supervised processes are usually quite time consuming and expensive.

This might cause serious down times for autonomous fleets as the correction of potentially safety-critical weaknesses in Perception models will need a considerable time for data acquisition, labeling and model adaptation.

In “Data free adversarial distillation” an algorithm for unsupervised, robustness- focused DNN compression by means of knowledge distillation is presented. The method uses a very specific compression method (knowledge distillation). The algorithm is applied to semantic segmentation and is data-free. This has the disadvantage that the model cannot be tailored for specific optimization and hence it lacks accuracy. It is an object of the present invention to provide a robust compressed machine learning model, which uses unlabeled data and enables the user to define requirements and model parameters in order to tailor the machine learning model on the user’s needs.

The object is achieved by a method and a system according to the independent claims. Embodiments of the invention are discussed in the dependent claims and the following description

Description

In particular, a computer-implemented method for improving the robustness of a compressed machine learning model is provided, wherein the machine learning model provides a function for the automated driving of a motor vehicle and/or for a driver assistance of the motor vehicle and/or for detecting the surroundings and/or perceiving the surroundings of the motor vehicle. The computer-implemented method for improving the robustness of a compressed machine learning model comprises the following steps:

(i) training a machine learning model,

(ii) obtaining metadata for said machine learning model

(iii) obtaining from a database unlabeled data,

(iv) perturbing and or augmenting the unlabeled data,

(v) re-training the machine learning model using the perturbed and/or augmented data

(vi) and compressing the machine learning model by optimizing the machine learning model architecture and/or quantizing the machine learning model metadata

In particular the machine learning model could be trained for perception tasks. Perception tasks gather all the necessary information about the surrounding environment of the moving vehicle. Examples of perception tasks are object classification, semantic segmentation and depth estimation. In a preferred embodiment the machine learning model is trained to perform an object classification. Such model classifies objects in an image with respect to one or more classes. Examples of classes are vehicles, pedestrian, traffic signs and so one. In yet another preferred embodiment the machine learning model is trained to perform semantic segmentation. Semantic segmentation refers to the process of linking each pixel in an image to a class. In yet another preferred embodiment the machine learning model is trained to perform a depth estimation. Depth estimation refers to the process of linking each pixel in an image to a depth . The trained machine learning model is a neural network, preferably a deep neural network. The machine learning model could be trained either in a supervised or unsupervised manner. The trained machine learning model comprises, in particular, a structure description and parameters (e.g. filter parameters, weightings, activation functions, etc.) of the neural network. This data are stored as metadata.

Preferably, the metadata contains information about the machine learning model such for example, the model architecture, model parameters and/or model performance. In addition metadata could contain information about the requirements for the machine learning model and for the data specification. Example of requirements could be latency and/or accuracy.

In a preferred embodiment requirements could be defined by a user: for example a user could define which latency and/or robustness and/or performance the machine learning model should achieve. The requirements could depend on the task that the machine learning model should perform. For example, some perception tasks such as object classification could require a higher robustness.

Requirements could also contain information about data specification, such for example the operational design domain or possible classes of data perturbation of unlabeled data sets.

In an embodiment of the present invention, unlabeled data sets are stored in a database. The database could be stored in an external server, preferably a cloudbased server. A data set comprises, in particular, data, preferably (acquired) sensor data which are not be labeled or annotated. The data can in particular be onedimensional or multidimensional, in particular two-dimensional. For example, the data may be images of a camera or a LiDAR sensor or a radar sensor or a stereo-camera. In principle, however, any desired sensor data or a combination thereof can be used. The data can be both frame based or sequential. Data where points in the data set are dependent on other points in the dataset are said to be sequential data. An example of sequential data can be sensor data where each point represents an observation at a certain point in time. On the contrary, frame based data are data which are independent on the other points in the dataset.

In one embodiment of the present invention the unlabeled data sets are perturbed by using data augmentation or data augmentation methods. Data augmentation or data augmentation methods allow to artificially modify existing data set. Preferably a multiplicity of modifications can be provided. By way of example modification could comprise: adding noise, changing a contrast, changing brightness, changing colors, rotating, translating or cropping a particular object in a frame. In the automotive field context data augmentation could comprise changing a weather condition (e.g., adding snow or rain in a camera image captured in the summer) or simulate a dangerous situation like a car’s accident. A data augmentation or a data augmentation method is configured or defined in particular as a function of physical sensor properties (disturbances, etc.) and/or possible physical and/or technical disturbances of the sensor system and/or possible adverse examples.

In a preferred embodiment of the present invention the data augmentation to be applied to the data set is defined by a user. The user preferably indicates how data of the data set are to be changed or modified.

In yet another embodiment of the present invention adversarial examples could be used as input to the trained machine learning model. Generally adversarial examples are adversarial perturbations of a given training input which are only slightly different from the given training input, which have the property that when the model predictions differ drastically on the clean and the perturbed data. In particular, in some cases, the difference between each entry of the adversarial perturbation and the corresponding entry of the training input is small enough that the difference would be discarded by a human. Examples of adversarial learning can be found in Goodfellow et al. “Explaining and Harnessing Adversarial Examples” (https://arxiv.org/abs/1412.6572v3), Preferably, the adversarial examples are adversarial perturbations of the unlabeled data set.

In yet another preferred embodiment of the present invention adversarial examples and data augmentation can be used simultaneously in order to increase the robustness of the machine learning model. Preferably both adversarial examples and data augmentation are used on the unlabeled stored data set.

By re-training the machine learning model using data augmentation and/or adversarial examples, the robustness of the machine learning model is improved.

During the re-training or before the re-training, the machine learning model is compressed. Preferably compression is done by quantization and/or pruning and/or optimization of the machine learning model architecture. Quantization is a known process which reduces the precision of a machine learning model by converting a high-precision format to a low-precision format, without loss of accuracy or performance. Preferably quantization reduces the number of bits required to represent every weight. In other words, it decreases the number of parameters by exploiting redundancy. The process of pruning consists in deleting weights and/or neurons that have minor influence on the performance of the machine learning model. Additionally or alternatively, optimization of the machine learning architecture can be reached by changing or regrouping or squeeze the layers of the machine learning model. In a preferred embodiment of the present invention, the users could specify which parameters should be used for the compression of the machine learning model.

In a preferred embodiment of the present invention the step of re-training and of compression of the machine learning model could be done simultaneously.

The present invention further discloses a system for improving the robustness of a compressed machine learning model. The system comprises a data storage unit for storing a trained machine learning model and the relative metadata, a database containing unlabeled data, a perturbation engine configured to receive the unlabeled data set and to generate a perturbed and/or augmented training data set based on said unlabeled data; a training engine configured to train, using the perturbed and/or augmented training data set, the stored machine learning model; and an architecture compression engine configured to compress the machine learning model by optimizing the machine learning model architecture and/or quantizing the machine learning model metadata. The system comprises at least one recipient device data storage for storing the compressed and hardened machine learning model.

Preferably the system is located in a cloud -server.

In at least some embodiments, the one or more data storage units may comprise at least one element of the group of a computer readable storage medium, such as Random Access Memory (RAM), Programmable Read Only Memory (PROM), Erasable Programmable Read Only Memory (EPROM), an Electronically Erasable Programmable Read Only Memory (EEPROM), or a network storage.

In a preferred embodiment of the present invention, the system further comprises a user interface. The user interface could be a form-based user interface, a graphical user interface, a menu-driven user interface, a touch user interface or a voice user interface.

The user could enter via the user interface the desired requirements for the compression and/or the re-training of the models. For example the users could define which performance the machine learning model should retain. Additionally or alternatively the user could define the unlabeled data set to be used. For example, if the user wants to have a compressed and hardened machine learning model for image based object detections, he could require to use unlabeled data from a camera. Additionally or alternatively the user could require a given compression of the model, the compression being dependent on the type of devices on which the compressed and hardened machine learning model should run.

Additionally or alternatively the user could require a given accuracy for the compressed and hardened machine learning model. The required accuracy can then determine which compression technique should be used. Preferably, more than one compression technique is used.

Additionally or alternatively the user could define a robustness for the compressed and hardened machine learning model. The defined robustness can then determine which perturbation technique should be used. Preferably, more than one perturbation technique is used. For example data augmentation could be used together with adversarial samples perturbation.

After compression and re-training, the hardened and compressed machine learning model is stored in at least one recipient device data. Preferably the at least one recipient device data is stored in a cloud-server.

In one embodiment of the present invention the recipient device data is connected to an external device. Preferably the external device is a mobile device, for example a computer, a smartphone or a tablet. The mobile device could download and install the compressed and hardened machine learning model. The machine learning model can then run on the mobile device.

In a preferred embodiment of the present invention the recipient device data is connected is connected to a motor vehicle. Preferably, the motor vehicle communicates with the recipient device data via a Car2X connection. The motor vehicle could download and install the compressed and hardened machine learning model. The compressed and hardened machine learning model could then run in real time in the motor vehicle and perform the perception tasks for which it was trained.

Special embodiments of the present invention are described below.

Fig. 1 describes a first embodiment of the present invention

Fig. 2 describes a second embodiment of the present invention

Fig. 3 describes a third embodiment of the present invention

Fig. 4 describes a system for improving the robustness of a compressed machine learning model according to the present invention Fig, 1 describes a first embodiment 100 of the present invention. The trained machine learning model 101 is preferably a deep neural network (DNN) and it is trained to perform a perception task model. A perception task can be an objection detection task, an object classification task, a semantic segmentation task and so on. Unlabeled data 102 are stored in a database. The unlabeled data 102 are then perturbed by perturbation techniques like data augmentation and/or adversarial samples training. In step 104, the perturbed data 103 are used to re-train the machine learning model 101. The re-trained machine learning model is then compressed in step 105. The compression step uses as input requirements 106, such inputs being given by a user.

The result is a compressed and hardened machine learning model 107.

Fig. 2 describes a third embodiment of the present invention. The trained machine learning model 201 is preferably a deep neural network (DNN) and it is trained to perform a perception task model. A perception task can be an objection detection task, an object classification task, a semantic segmentation task and so on. The compression step 206 and the re-training step 204 are performed simultaneously.

The compression step uses as input requirement 206 from a user while the re-trained step 204 is performed by using the perturbed data 203. The perturbed data are obtaining by perturbing unlabeled data sets 202, stored in a database. The result is a compressed and hardened machine learning model 207.

Note that in the two embodiments above, the user could also define other requirements, like which accuracy the model should achieve and/or which data should be used for re-trained and/or which types of adversarial attacks should be performed.

Fig. 3 describes a system according to a first embodiment of the present invention. The system comprises a user interface 301 . The user interface can be a form-based user interface, a graphical user interface, a menu-driven user interface, a touch user interface or a voice user interface. The user interface 301 is connected to the database 302 and/or to the architecture compression engine 304 and/or to perturbation engine 303. Through the user interface 301 the user can for example require to use a particular data set from the database 302. In another example, additionally or alternatively the user could require through the user interface 301 a given accuracy or latency of the machine learning model. In another example, additionally or alternatively the user could require through the user interface 301 a given perturbation of the data, said request being communicated direct to the perturbation engine 303. In another example, additionally or alternatively the user could require through the user interface 301 a given compression of the machine learning model, said request being communicated direct to the architecture compression engine 304.

The machine learning model is stored in the storing device 305. A training machine learning engine 306 is connected to the perturbation engine 303 and to the storing device 305. The perturbation engine 303 perturbs the data from database 302 by augmentation and/or by adversarial samples. The training machine learning engine 306 gets as input the machine learning model stored in the storing device 305 and the perturbed data from perturbation engine 303. The training machine learning engine 306 gives as output a machine learning model, that is robust against adversarial attacks. This machine learning model is used as input for the architecture compression engine 304. The architecture compression engine 304 delivers as output a compressed machine learning model to the recipient device data 307. The recipient device data 307 stores the hardened and compressed machine learning device. The recipient device data 307 is connected to an external device 308. The external device could be a mobile device, like a smartphone, a tablet, a laptop. In another preferred embodiment the external device 308 is a motor vehicle. Preferably the connection between motor vehicle and recipient device data 307 is performed through a car2X technology.

Preferably, the devices 302, 303, 304, 305, 306 and 307 are located in a cloud 300.

Fig. 4 describes a system according to a second embodiment of the present invention. The system comprises a user interface 401 . The user interface can be a form-based user interface, a graphical user interface, a menu-driven user interface, a touch user interface or a voice user interface. The user interface 401 is connected to the database 402 and/or to the architecture compression engine 404 and/or to perturbation engine 403. Through the user interface 401 the user can for example require to use a particular data set from the database 402. In another example, additionally or alternatively the user could require through the user interface 401 a given accuracy or latency of the machine learning model. In another example, additionally or alternatively the user could require through the user interface 401 a given perturbation of the data, said request being communicated direct to the perturbation engine 403. In another example, additionally or alternatively the user could require through the user interface 401 a given compression of the machine learning model, said request being communicated direct to the architecture compression engine 404. The perturbation engine 403 perturbs the data from database 402 by augmentation and/or by adversarial samples The machine learning model is stored in the storing device 405. A training machine learning engine 406 is connected to the perturbation engine 403, to the storing device 405 and to the architecture compression engine 404. The training machine learning engine 406 gets as input the machine learning model stored in the storing device 405 and the perturbed data from perturbation engine 403. Simultaneously, the architecture compression engine 406 processes the input of the user interfaces 406 and exchanges data with the training machine learning engine 406. In this embodiment the compression and the re-training steps take place simultaneously. The output of those two steps (404 and 406) is a compressed machine learning model, that is robust against adversarial attacks. The robust and compressed machine learning algorithm is stored into the recipient device data 407. The recipient device data 407 is connected to an external device 408. The external device could be a mobile device, like a smartphone, a tablet, a laptop. In another preferred embodiment the external device 408 is a motor vehicle. Preferably the connection between motor vehicle and recipient device data 407 is performed through a car2X technology.

Preferably, the devices 402, 403, 404, 405, 406 and 407 are located in a cloud 400.

Various features, aspects, and embodiments have been described herein. The features, aspects, and embodiments are susceptible to combination with one another as well as to variation and modification, as will be understood by those having skill in the art.

Claims

1. A computer-implemented method for improving the robustness of a compressed machine learning model the method comprising:

(i) training a machine learning model,

(ii) obtaining metadata for said machine learning model

(iii) obtaining from a database unlabeled data,

(iv) perturbing and or augmenting the unlabeled data,

(vi) compressing the machine learning model by optimizing the machine learning model architecture and/or quantizing the machine learning model metadata

2. The method according to claim 1 wherein the metadata contain information about the architecture and/or the parameters and/or the performance and/or the performance and/or the robustness and/or the data domains

3. The method according to one of the previous claims, wherein the perturbed training data set is obtained by applying data augmentation on the unlabeled data and/or by perturbing the unlabeled data by adversarial training

4. The method according to one of the previous claims, wherein the unlabeled data are taken by one or more sensors, like camera, LiDAR, stereocamera or Radar.

5. The method according to one of the previous claims, wherein the step (v) and (vi) are performed simultaneously

6. The method according to one of the previous claims, wherein the machine learning model is a deep neural network trained to perform perception tasks

7. A system for improving the robustness of a compressed machine learning model, the system comprising: a data storage unit for storing a trained machine learning model and the relative metadata. a database containing unlabeled data, a perturbation engine configured to receive as input the unlabeled data set and to generate a perturbed training data set and/or an augmented data set based on said unlabeled data; a training engine configured to re-train, using the perturbed and/or augmented training data set, the stored machine learning model ; and an architecture compression engine configured to compress the machine learning model or the re-trained machine learning model by optimizing the machine learning model architecture and/or quantizing the machine learning model metadata, at least one recipient device data storage configured to receive the robust compressed machine learning model The system according to claim 7, wherein the system is stored in a cloudserver. The system according to claims 7 or 8, wherein the unlabeled data are stored in an external server. The system of claims 7 to 9 further comprising a user interface for defining the requirements for the machine learning model and/or for defining the data set of the unlabeled data and/or for defining the perturbation and/or the augmentation of the unlabeled data The system according to claims 7 to 10, wherein at least one recipient device data storage is further connected to at least one external device. The system according to claim 11 , wherein the at least one external device is a mobile device or a motor vehicle .