WO2022264387A1 - Learning Device, Learning Method, and Learning Program - Google Patents
Learning Device, Learning Method, and Learning Program
- Publication number
- WO2022264387A1 (application PCT/JP2021/023123)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- model
- learning
- loss
- data
- parameters
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G06N3/094 — Adversarial learning
- G06N20/00 — Machine learning
- G06N3/09 — Supervised learning
Definitions
- The present invention relates to a learning device, a learning method, and a learning program for a model.
- An Adversarial Example causes a classifier to misjudge by adding noise to the data to be classified.
- Adversarial Training trains a model (classifier) using Adversarial Examples.
- A model trained by Adversarial Training, however, suffers from low generalization performance. This is because the loss landscape (the shape of the loss function) with respect to the weights of a model trained by Adversarial Training is sharp. To flatten this loss landscape, there is a known technique that adds noise (a perturbation) to the weights in the direction that maximizes the model's loss.
- That technique, however, degrades prediction performance on noise-free data. It is therefore an object of the present invention to train a model that can predict accurately even on noise-free data while remaining robust to Adversarial Examples.
- To solve the above problems, the present invention provides a data acquisition unit that acquires training data for a model for predicting the label of input data including Adversarial Examples, and a learning unit that trains the model using the training data including the Adversarial Examples and a loss function that flattens the loss landscape with respect to the parameters by adding to the parameters noise that maximizes the KL divergence of the model's loss value between the case where noise is added to the parameters and the case where it is not.
- FIG. 1 is a diagram showing a configuration example of a learning device.
- FIG. 2 shows the equations explaining why the eigenvector h corresponding to the maximum eigenvalue λ of the Fisher information matrix G yields the maximizing v in Equation (10).
- FIG. 3 is a flow chart showing an example of a processing procedure of the learning device.
- FIG. 4 is a flow chart showing an example of a processing procedure of the learning device.
- FIG. 5 is a diagram for explaining an application example of the learning device.
- FIG. 6 is a diagram showing experimental results for the model learned by the learning device.
- FIG. 7 is a diagram showing a configuration example of a computer that executes the learning program.
- The learning device of this embodiment uses data including Adversarial Examples (data to which noise has been added) to train a model that predicts the label of input data.
- As the loss function used for training the model, the learning device considers the KL divergence of the model's loss value between the case where noise is added to the model's parameters and the case where it is not.
- It uses a loss function that flattens the loss landscape with respect to the parameters by adding to the parameters the noise that maximizes this KL divergence.
- As a result, the learning device can train a model that predicts labels accurately even for noise-free data while remaining robust to Adversarial Examples.
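As a rough illustration of the quantity being maximized (not the patent's exact formulation), the KL divergence between the model's predictive distributions with and without a weight perturbation can be computed as follows; the toy linear model, the perturbation size, and all variable names are assumptions made for this sketch:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    # Row-wise KL(p || q) between probability vectors
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))      # toy linear classifier weights (hypothetical)
x = rng.normal(size=(5, 4))      # a small batch of inputs

v = rng.normal(size=W.shape)
v *= 0.1 / np.linalg.norm(v)     # candidate weight perturbation with a fixed small norm

p_clean = softmax(x @ W)         # predictions without parameter noise
p_noisy = softmax(x @ (W + v))   # predictions with parameter noise

# The perturbation direction would be chosen to maximize this quantity
kl = kl_divergence(p_clean, p_noisy).mean()
```

In the actual method the direction v is not random but chosen to maximize this divergence, which is what flattens the loss landscape.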
- the learning device 10 includes an input unit 11, an output unit 12, a communication control unit 13, a storage unit 14, and a control unit 15, for example.
- the input unit 11 is an interface that receives input of various data.
- the input unit 11 receives input of data used for learning processing and prediction processing, which will be described later.
- the output unit 12 is an interface that outputs various data.
- the output unit 12 outputs the label of data predicted by the control unit 15.
- the communication control unit 13 is implemented by a NIC (Network Interface Card) or the like, and controls communication between an external device such as a server and the control unit 15 via a network.
- the communication control unit 13 controls communication between the control unit 15 and a management device or the like that manages learning target data.
- The storage unit 14 is realized by a semiconductor memory device such as RAM (Random Access Memory) or flash memory, or by a storage device such as a hard disk or optical disk, and stores the parameters of the model learned by the learning process described later.
- The control unit 15 is implemented using, for example, a CPU (Central Processing Unit), and executes a processing program stored in the storage unit 14. The control unit 15 thereby functions as an acquisition unit 15a, a learning unit 15b, and a prediction unit 15c, as illustrated in FIG. 1.
- the acquisition unit 15a acquires data used for learning processing and prediction processing, which will be described later, via the input unit 11 or the communication control unit 13.
- the learning unit 15b uses data including Adversarial Examples as learning data to learn a model that predicts the label of the input data.
- As the loss function used for training the model, the learning unit 15b uses a loss function that flattens the loss landscape with respect to the parameters by adding to the parameters noise that maximizes the KL divergence of the model's loss value between the case where noise is added to the parameters and the case where it is not.
- The model to be trained represents the probability distribution of the label y of data x, and is expressed by Equation (1) using the parameter θ.
- f in Equation (1) is a vector representing the label output by the model.
- The learning unit 15b trains the model by determining the parameter θ so that the value of the loss function expressed by Equation (2) becomes small.
- Equation (2) represents the true probability.
- the learning unit 15b learns the model so that the label can be predicted correctly even for the Adversarial Example (see formula (3)) in which noise ⁇ is superimposed on the data x. That is, the learning unit 15b performs adversarial training shown in Equation (4).
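The inner maximization of adversarial training is commonly solved with projected gradient descent (PGD), which is also the attack used in the experiments below. A minimal numpy sketch for a linear softmax classifier, whose input gradient has a closed form, might look like the following; the model, step sizes, and helper names are assumptions for illustration, not the patent's implementation:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ce_loss(W, x, y_onehot):
    # Cross-entropy loss of a linear softmax classifier
    p = softmax(x @ W)
    return -np.mean(np.sum(y_onehot * np.log(p + 1e-12), axis=-1))

def grad_x(W, x, y_onehot):
    # Closed-form gradient of the cross-entropy w.r.t. the input for a linear model
    p = softmax(x @ W)
    return (p - y_onehot) @ W.T / x.shape[0]

def pgd_attack(W, x, y_onehot, eps=8 / 255, eps_iter=0.01, n_iter=7):
    x_adv = x + np.random.uniform(-eps, eps, size=x.shape)  # random start in the eps-ball
    for _ in range(n_iter):
        g = grad_x(W, x_adv, y_onehot)
        x_adv = x_adv + eps_iter * np.sign(g)     # ascent step on the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay in the valid data range
    return x_adv

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
x = rng.uniform(0, 1, size=(8, 4))
y = np.eye(3)[rng.integers(0, 3, size=8)]
x_adv = pgd_attack(W, x, y)
```

The outer minimization of Equation (4) then updates the model parameters on these perturbed inputs.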
- The noise (perturbation) added to the weights is defined on the scale of w for each filter, as shown in Equation (7) below.
- k is the index of the filter.
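For example, rescaling a raw perturbation so that, filter by filter, its norm is proportional to the norm of the corresponding weights could be sketched as follows; the tensor shapes, the budget gamma, and the variable names are assumptions made for this illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=(8, 3, 3, 3))   # conv weights: (filter k, in_channels, kH, kW)
v = rng.normal(size=w.shape)        # raw perturbation direction
gamma = 0.01                        # relative perturbation budget (assumed)

# Rescale per filter k so that ||v_k|| = gamma * ||w_k||,
# i.e. the noise lives on the scale of w for each filter
for k in range(w.shape[0]):
    v[k] *= gamma * np.linalg.norm(w[k]) / (np.linalg.norm(v[k]) + 1e-12)
```

Scaling per filter, rather than globally, keeps the perturbation meaningful for layers whose weight magnitudes differ widely.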
- This loss function is represented by the following Equation (10). Note that the function of w appearing in Equation (10) corresponds to the one shown in Equation (5).
- The learning unit 15b trains the model that predicts the label of input data using training data including Adversarial Examples and the loss function above. That is, the learning unit 15b uses the training data to find the model parameter θ that minimizes the loss calculated by the loss function.
- The prediction unit 15c uses the trained model to predict the label of input data. For example, the prediction unit 15c calculates the probability of each label for newly acquired data by applying the learned parameter θ to Equation (1) above, and outputs the label with the highest probability. As a result, the learning device 10 can output the correct label even when the input data is an Adversarial Example.
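A minimal sketch of this prediction step, with a toy linear model standing in for Equation (1) (the model and variable names are assumptions for illustration):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predict_label(W, x_new):
    # Probability of each label for the new data, then the most probable label
    p = softmax(x_new @ W)
    return int(np.argmax(p, axis=-1)[0]), p

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))       # learned parameters (hypothetical stand-in for θ)
x_new = rng.normal(size=(1, 4))   # newly acquired data
label, probs = predict_label(W, x_new)
```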
- the acquisition unit 15a acquires learning data including Adversarial Examples (S1).
- the learning unit 15b learns a model representing the probability distribution of the label of the input data using the learning data and the loss function (S2).
- This loss function flattens the loss landscape with respect to the parameters by adding to the model's parameters noise that maximizes the KL divergence of the model's loss value between the case where noise is added to the parameters and the case where it is not.
- the learning unit 15b stores the model parameters learned in S2 in the storage unit 14.
- the acquisition unit 15a acquires label prediction target data (S11).
- The prediction unit 15c uses the model trained by the learning unit 15b to predict the label of the data acquired in S11 (S12). For example, the prediction unit 15c calculates p(x') for the data x' acquired in S11 by applying the learned parameter θ to Equation (1) above, and outputs the label with the highest probability. As a result, the learning device 10 can output the correct label even if the data x' is an Adversarial Example.
- the learning device 10 described above may be applied to data anomaly detection.
- An application example in this case will be described with reference to FIG.
- a case where the detection device 20 is equipped with the function of the prediction unit 15c will be described as an example.
- the learning device 10 uses teacher data (learning data) acquired from the data acquisition device and the loss function described above to perform model learning (adversarial training). After that, when the detection device 20 acquires new data x' from the data acquisition device, it calculates p(x') of the data x' using the trained model. Then, the detection device 20 outputs a report as to whether or not the data x' is abnormal data based on the label with the highest probability.
- FIG. 6 shows the result of an evaluation experiment of label prediction accuracy using the model learned by the learning device 10 of the present embodiment.
- robust acc and natural acc were evaluated for the model trained by the learning device 10 of this embodiment.
- robust acc indicates the classification accuracy (label prediction accuracy) on data with Adversarial Example noise superimposed.
- natural acc indicates the classification accuracy on data without noise. Both robust acc and natural acc take values from 0 to 100.
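In other words, both metrics are plain classification accuracies on the 0-100 scale, differing only in the inputs they are computed on; a sketch with made-up labels:

```python
import numpy as np

def accuracy_pct(pred, true):
    # Fraction of correctly predicted labels, on the 0-100 scale used here
    return 100.0 * float(np.mean(pred == true))

true_labels = np.array([0, 1, 2, 1, 0])
clean_preds = np.array([0, 1, 2, 1, 1])   # predictions on noise-free data
adv_preds   = np.array([0, 1, 0, 1, 1])   # predictions on Adversarial Examples

natural_acc = accuracy_pct(clean_preds, true_labels)  # 80.0
robust_acc  = accuracy_pct(adv_preds, true_labels)    # 60.0
```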
- For comparison, a model trained by AT (standard Adversarial Training) and a model trained by AWP (Adversarial Weight Perturbation) were also evaluated. The experimental conditions are as follows.
- the model learned by the learning device 10 has higher values for both robust acc and natural acc than the model learned by AT. Also, the model trained by the learning device 10 of the present embodiment has a slightly lower value for robust acc than the model trained by AWP, but a significantly higher value for natural acc.
- the model trained by the learning device 10 is a model that can accurately predict even data without noise while ensuring robustness against adversarial examples.
- Each component of each device shown in the figures is functionally conceptual and need not be physically configured as illustrated.
- The specific form of distribution and integration of the devices is not limited to that illustrated; all or part of them can be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions.
- all or any part of each processing function performed by each device can be implemented by a CPU and a program executed by the CPU, or implemented as hardware based on wired logic.
- the learning device 10 described above can be implemented by installing a program on a desired computer as package software or online software.
- the information processing device can function as the learning device 10 by causing the information processing device to execute the above program.
- the information processing apparatus referred to here includes a desktop or notebook personal computer.
- information processing devices include mobile communication terminals such as smartphones, mobile phones and PHS (Personal Handyphone Systems), and terminals such as PDAs (Personal Digital Assistants).
- the learning device 10 can also be implemented as a server device that uses a terminal device used by a user as a client and provides the client with services related to the above processing.
- the server device may be implemented as a web server, or may be implemented as a cloud that provides services related to the above processing by outsourcing.
- FIG. 7 is a diagram showing an example of a computer that executes a learning program.
- the computer 1000 has a memory 1010 and a CPU 1020, for example.
- The computer 1000 also has a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. These units are connected by a bus 1080.
- the memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM (Random Access Memory) 1012.
- the ROM 1011 stores a boot program such as a BIOS (Basic Input Output System).
- The hard disk drive interface 1030 is connected to a hard disk drive 1090.
- The disk drive interface 1040 is connected to a disk drive 1100.
- A removable storage medium such as a magnetic disk or optical disk is inserted into the disk drive 1100.
- The serial port interface 1050 is connected to, for example, a mouse 1110 and a keyboard 1120.
- The video adapter 1060 is connected to, for example, a display 1130.
- The hard disk drive 1090 stores, for example, an OS 1091, application programs 1092, program modules 1093, and program data 1094. That is, the program that defines each process executed by the learning device 10 is implemented as a program module 1093 in which computer-executable code is described. The program modules 1093 are stored, for example, on the hard disk drive 1090.
- In other words, the hard disk drive 1090 stores program modules 1093 for executing processing equivalent to the functional configuration of the learning device 10.
- the hard disk drive 1090 may be replaced by an SSD (Solid State Drive).
- the data used in the processes of the above-described embodiments are stored as program data 1094 in the memory 1010 or the hard disk drive 1090, for example. Then, the CPU 1020 reads out the program modules 1093 and program data 1094 stored in the memory 1010 and the hard disk drive 1090 to the RAM 1012 as necessary and executes them.
- the program modules 1093 and program data 1094 are not limited to being stored in the hard disk drive 1090, but may be stored in a removable storage medium, for example, and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, the program modules 1093 and program data 1094 may be stored in another computer connected via a network (LAN (Local Area Network), WAN (Wide Area Network), etc.). Program modules 1093 and program data 1094 may then be read by CPU 1020 through network interface 1070 from other computers.
- LAN Local Area Network
- WAN Wide Area Network
Abstract
Description
The learning device of this embodiment uses data including Adversarial Examples (data to which noise has been added) to train a model that predicts the label of input data. Here, as the loss function used for training the model, the learning device uses a loss function that flattens the loss landscape with respect to the parameters by adding to the parameters noise that maximizes the KL divergence of the model's loss value between the case where noise is added to the parameters and the case where it is not.
A configuration example of the learning device 10 will be described with reference to FIG. 1. The learning device 10 includes, for example, an input unit 11, an output unit 12, a communication control unit 13, a storage unit 14, and a control unit 15.
Next, an example of the learning processing procedure of the learning device 10 will be described with reference to FIG. 3. The processing shown in FIG. 3 starts, for example, when an operation input instructing the start of the learning processing is received.
Next, an example of label prediction processing for input data by the learning device 10 will be described with reference to FIG. 4. The processing shown in FIG. 4 starts, for example, when an operation input instructing the start of the prediction processing is received.
The learning device 10 described above may be applied to data anomaly detection. An application example for this case will be described with reference to FIG. 5. Here, a case where the function of the prediction unit 15c described above is provided in a detection device 20 is described as an example.
Next, FIG. 6 shows the results of an experiment evaluating label prediction accuracy with the model trained by the learning device 10 of this embodiment. In this experiment, robust acc and natural acc were evaluated for the model trained by the learning device 10 of this embodiment.
Deep learning model: Resnet18
Adversarial Example: PGD
PGD parameters: eps=8/255, train_iter=7, eval_iter=20, eps_iter=0.01, rand_init=True, clip_min=0.0, clip_max=1.0
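These settings can be bundled into a single configuration object; the class below is a hypothetical illustration whose field names simply mirror the parameters listed above:

```python
from dataclasses import dataclass

@dataclass
class PGDConfig:
    # Hyperparameters as listed in the experimental conditions above
    eps: float = 8 / 255       # maximum perturbation size
    train_iter: int = 7        # PGD iterations during training
    eval_iter: int = 20        # PGD iterations during evaluation
    eps_iter: float = 0.01     # step size per iteration
    rand_init: bool = True     # random start within the eps-ball
    clip_min: float = 0.0      # lower bound of the valid input range
    clip_max: float = 1.0      # upper bound of the valid input range

cfg = PGDConfig()
```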
Each component of each device shown in the figures is functionally conceptual and need not be physically configured as illustrated. That is, the specific form of distribution and integration of the devices is not limited to that illustrated; all or part of them can be functionally or physically distributed or integrated in arbitrary units according to various loads and usage conditions. Furthermore, all or any part of each processing function performed by each device can be implemented by a CPU and a program executed by that CPU, or as hardware based on wired logic.
The learning device 10 described above can be implemented by installing a program on a desired computer as package software or online software. For example, by causing an information processing device to execute the above program, the information processing device can function as the learning device 10. The information processing device referred to here includes desktop and notebook personal computers. Its scope also includes mobile communication terminals such as smartphones, mobile phones, and PHS (Personal Handyphone System) devices, as well as terminals such as PDAs (Personal Digital Assistants).
11 Input unit
12 Output unit
13 Communication control unit
14 Storage unit
15 Control unit
15a Acquisition unit
15b Learning unit
15c Prediction unit
20 Detection device
Claims (5)
- A learning device comprising: a data acquisition unit that acquires training data for a model for predicting the label of input data including Adversarial Examples; and a learning unit that trains the model using the training data including the Adversarial Examples and a loss function that flattens the loss landscape with respect to the parameters by adding to the parameters noise that maximizes the KL divergence of the loss value in the model between the case where noise is added to the model's parameters and the case where it is not.
- The learning device according to claim 1, wherein the learning unit uses the training data to find the parameters of the model that minimize the loss calculated by the loss function.
- The learning device according to claim 1, further comprising a prediction unit that predicts the label of input data using the trained model.
- A learning method executed by a learning device, comprising: a step of acquiring training data for a model for predicting the label of input data including Adversarial Examples; and a step of training the model using the training data including the Adversarial Examples and a loss function that flattens the loss landscape with respect to the parameters by adding to the parameters noise that maximizes the KL divergence of the loss value in the model between the case where noise is added to the model's parameters and the case where it is not.
- A learning program for causing a computer to execute: a step of acquiring training data for a model for predicting the label of input data including Adversarial Examples; and a step of training the model using the training data including the Adversarial Examples and a loss function that flattens the loss landscape with respect to the parameters by adding to the parameters noise that maximizes the KL divergence of the loss value in the model between the case where noise is added to the model's parameters and the case where it is not.
Priority Applications (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2021/023123 WO2022264387A1 (ja) | 2021-06-17 | 2021-06-17 | Learning device, learning method, and learning program |
| AU2021451244A AU2021451244B2 (en) | 2021-06-17 | 2021-06-17 | Training device, training method, and training program |
| US18/567,779 US20240152822A1 (en) | 2021-06-17 | 2021-06-17 | Training device, training method, and training program |
| CN202180099182.0A CN117546183A (zh) | 2021-06-17 | 2021-06-17 | Training device, training method, and training program |
| EP21946062.3A EP4336419A4 (en) | 2021-06-17 | 2021-06-17 | Training device, training method, and training program |
| JP2023528902A JP7529159B2 (ja) | 2021-06-17 | 2021-06-17 | Learning device, learning method, and learning program |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2021/023123 WO2022264387A1 (ja) | 2021-06-17 | 2021-06-17 | Learning device, learning method, and learning program |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022264387A1 true WO2022264387A1 (ja) | 2022-12-22 |
Family
ID=84526966
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2021/023123 Ceased WO2022264387A1 (ja) | 2021-06-17 | 2021-06-17 | 学習装置、学習方法、および、学習プログラム |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US20240152822A1 (ja) |
| EP (1) | EP4336419A4 (ja) |
| JP (1) | JP7529159B2 (ja) |
| CN (1) | CN117546183A (ja) |
| AU (1) | AU2021451244B2 (ja) |
| WO (1) | WO2022264387A1 (ja) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113313233A (zh) * | 2021-05-17 | 2021-08-27 | 成都时识科技有限公司 | Neural network configuration parameter training and deployment method and device for addressing device mismatch |
| CN120122642A (zh) * | 2025-02-20 | 2025-06-10 | 酷睿程（北京）科技有限公司 | Control method, training method, electronic device, chip, vehicle, and medium |
-
2021
- 2021-06-17 AU AU2021451244A patent/AU2021451244B2/en active Active
- 2021-06-17 CN CN202180099182.0A patent/CN117546183A/zh active Pending
- 2021-06-17 WO PCT/JP2021/023123 patent/WO2022264387A1/ja not_active Ceased
- 2021-06-17 US US18/567,779 patent/US20240152822A1/en active Pending
- 2021-06-17 EP EP21946062.3A patent/EP4336419A4/en active Pending
- 2021-06-17 JP JP2023528902A patent/JP7529159B2/ja active Active
Non-Patent Citations (5)
| Title |
|---|
| DIEDERIK P. KINGMA; MAX WELLING: "Auto-Encoding Variational Bayes", retrieved 4 June 2021 from https://arxiv.org/pdf/1312.6114.pdf |
| DONGXIAN WU; SHU-TAO XIA; YISEN WANG: "Adversarial Weight Perturbation Helps Robust Generalization", arXiv (Cornell University Library), 13 October 2020, XP081784604 * |
| DONGXIAN WU; SHU-TAO XIA; YISEN WANG: "Adversarial Weight Perturbation Helps Robust Generalization", retrieved 4 June 2021 from https://arxiv.org/pdf/2004.05884 |
| See also references of EP4336419A4 |
| TAKERU MIYATO; MASANORI KOYAMA; KEN NAKAE; SHIN ISHII: "Distributional smoothing with virtual adversarial training", arXiv:1507.00677v4, 25 September 2015, XP055350332, retrieved from https://arxiv.org/abs/1507.00677v4 [retrieved on 2017-02-28] * |
Also Published As
| Publication number | Publication date |
|---|---|
| JP7529159B2 (ja) | 2024-08-06 |
| EP4336419A4 (en) | 2025-03-12 |
| AU2021451244A1 (en) | 2023-12-07 |
| CN117546183A (zh) | 2024-02-09 |
| US20240152822A1 (en) | 2024-05-09 |
| AU2021451244B2 (en) | 2024-09-26 |
| JPWO2022264387A1 (ja) | 2022-12-22 |
| EP4336419A1 (en) | 2024-03-13 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11829880B2 (en) | Generating trained neural networks with increased robustness against adversarial attacks | |
| US11741398B2 (en) | Multi-layered machine learning system to support ensemble learning | |
| US10733503B2 (en) | Technologies for shifted neural networks | |
| US20150356464A1 (en) | Generating data from imbalanced training data sets | |
| US20220114479A1 (en) | Systems and methods for automatic mixed-precision quantization search | |
| CN106068520A (zh) | 个性化的机器学习模型 | |
| WO2020173270A1 (zh) | 用于分析数据的方法、设备和计算机存储介质 | |
| WO2022156434A1 (zh) | 用于生成文本的方法和装置 | |
| JP7529159B2 (ja) | 学習装置、学習方法、および、学習プログラム | |
| WO2020090413A1 (ja) | 分類装置、分類方法および分類プログラム | |
| US11941867B2 (en) | Neural network training using the soft nearest neighbor loss | |
| JP7276483B2 (ja) | 学習装置、分類装置、学習方法及び学習プログラム | |
| KR102765759B1 (ko) | 딥 뉴럴 네트워크를 양자화하는 방법 및 장치 | |
| US11227231B2 (en) | Computational efficiency in symbolic sequence analytics using random sequence embeddings | |
| US20240095522A1 (en) | Neural network generation device, neural network computing device, edge device, neural network control method, and software generation program | |
| JP7655398B2 (ja) | 学習装置、学習方法、および、学習プログラム | |
| Zheng | Boosting based conditional quantile estimation for regression and binary classification | |
| CN113361678A (zh) | 神经网络模型的训练方法和装置 | |
| CN114968719B (zh) | 线程运行状态分类方法、装置、计算机设备及存储介质 | |
| Valizadegan et al. | Learning to trade off between exploration and exploitation in multiclass bandit prediction | |
| WO2023195120A1 (ja) | 学習装置、学習方法、および、学習プログラム | |
| JP7331938B2 (ja) | 学習装置、推定装置、学習方法及び学習プログラム | |
| JP7416255B2 (ja) | 学習装置、学習方法および学習プログラム | |
| JP7409487B2 (ja) | 学習装置、学習方法および学習プログラム | |
| US20250322256A1 (en) | Reduced precision neural federated learning |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21946062; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023528902; Country of ref document: JP |
| | WWE | Wipo information: entry into national phase | Ref document number: 2021451244; Country of ref document: AU; Ref document number: AU2021451244; Country of ref document: AU |
| | WWE | Wipo information: entry into national phase | Ref document number: 2021946062; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2021451244; Country of ref document: AU; Date of ref document: 20210617; Kind code of ref document: A |
| | WWE | Wipo information: entry into national phase | Ref document number: 18567779; Country of ref document: US |
| | WWE | Wipo information: entry into national phase | Ref document number: 202180099182.0; Country of ref document: CN |
| | ENP | Entry into the national phase | Ref document number: 2021946062; Country of ref document: EP; Effective date: 20231206 |
| | NENP | Non-entry into the national phase | Ref country code: DE |