
WO2024128377A1 - Method and system for data compression using latent codes and decoders - Google Patents

Method and system for data compression using latent codes and decoders

Info

Publication number
WO2024128377A1
Authority
WO
WIPO (PCT)
Prior art keywords
decoder
latent code
latent
composite image
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/KR2022/020896
Other languages
English (en)
Korean (ko)
Inventor
황성주
이해범
이동복
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Korea Advanced Institute of Science and Technology KAIST
Original Assignee
Korea Advanced Institute of Science and Technology KAIST
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Korea Advanced Institute of Science and Technology KAIST filed Critical Korea Advanced Institute of Science and Technology KAIST
Publication of WO2024128377A1 publication Critical patent/WO2024128377A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0475Generative networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0495Quantised networks; Sparse networks; Compressed networks
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction

Definitions

  • the present invention relates to a data compression method and system using a latent code and a decoder. More specifically, it relates to a method and system that, instead of directly compressing a data set in the input space, generate the data set from a set of learnable codes defined in a small latent space and a set of small decoders that each map the codes differently back to the original input space.
  • Deep learning has been successful in solving numerous machine learning problems in recent years, thanks to advances in parallel processing and huge amounts of real-world data collected from a variety of sources.
  • it is often required to repeatedly perform learning processes such as hyperparameter optimization, neural architecture search, and continual learning.
  • maintaining and processing all the images in a huge data set requires enormous memory and computational costs, so each data set must be compressed into a small set of representative images.
  • the present invention was created to solve this problem.
  • the data generation process is decomposed into a set of learnable latent codes and a set of decoders to reduce the number of parameters to a limited number.
  • the purpose is to provide a method to significantly increase the number of synthetic examples.
  • the purpose is to provide a method that effectively allows knowledge sharing between synthetic images through a shared latent space and shared decoders applicable to all latent codes, in contrast to the conventional method that ignores the data generation process.
  • the data compression method using a latent code and a decoder includes the steps of: (a) initializing a latent code with a resolution lower than that of the original image; (b) initializing a decoder that takes the latent code as input and outputs a composite image; (c) inputting the latent code into the decoder; (d) outputting a composite image from the decoder; (e) calculating the difference (hereinafter, the 'loss') between the output composite image and the original image; (f) if the loss is within a preset range, determining the composite image output in step (d) as the final compressed data and terminating the data compression process, and if the loss is outside the preset range, proceeding to step (g); (g) adjusting the latent code and the decoder in a direction that reduces the difference between the output composite image and the original image; and (h) returning to step (c) and performing step (c) and the subsequent steps.
  • in step (d), the resolution of the composite image output from the decoder may be the same as the resolution of the original image.
  • the step (e) may include: (e1) inputting the composite image output in step (d) into an artificial neural network model that outputs a feature vector; (e2) outputting a feature vector of the composite image from the artificial neural network model; and (e3) calculating a loss between the composite image feature vector and the original image feature vector using a loss function.
  • the step (g) may include: (g1) calculating the direction of gradient descent of the loss function at the point where the loss is calculated; (g2) calculating the next latent code and decoder along that descent direction; and (g3) updating the latent code and the decoder with the values calculated in step (g2).
  • M latent codes may each be randomly initialized in step (a), and D decoders may each be randomly initialized in step (b); in that case, M × D composite images are output in step (d), and the M × D composite images are input to the artificial neural network model in step (e1).
  • the loss function may be expressed as Equation 1 below:

    L = Σ_{c=1..C} ‖ (1/N) Σ_{i=1..N} g(x_{c,i}) − (1/(D·M)) Σ_{d=1..D} Σ_{m=1..M} g(f(z_{c,m}; θ_d)) ‖²    (Equation 1)

    where C is the number of classes, N is the number of original images, M is the number of latent codes, D is the number of decoders, x_{c,i} is the i-th original image of class c, z_{c,m} is the m-th latent code of class c, f(·; θ_d) is the d-th decoder, and g(·) is the artificial neural network model that outputs the feature vector of a synthetic image.
  • rather than taking a fixed configuration, the artificial neural network model g() can be randomly initialized, re-initialized, and used.
  • the latent code in step (a) and the decoder in step (b) may each be initialized to random values.
  • a system for compressing data using a latent code and a decoder includes at least one processor and at least one memory storing computer-executable instructions, wherein the instructions stored in the at least one memory, when executed by the at least one processor, cause the system to perform: (a) initializing a latent code with a resolution lower than that of the original image; (b) initializing a decoder that takes the latent code as input and outputs a composite image; (c) inputting the latent code into the decoder; (d) outputting a composite image from the decoder; (e) calculating the difference (hereinafter, the 'loss') between the output composite image and the original image; (f) if the loss is within a preset range, determining the composite image output in step (d) as the final compressed data and terminating the data compression process, and if the loss is outside the preset range, proceeding to step (g); (g) adjusting the latent code and the decoder in a direction that reduces the difference between the output composite image and the original image; and (h) returning to step (c) so that step (c) and the subsequent steps are performed.
  • a computer program for compressing data using a latent code and a decoder is stored in a non-transitory storage medium and, when executed by a processor, causes the following steps to be performed: (a) initializing a latent code with a resolution lower than that of the original image; (b) initializing a decoder that takes the latent code as input and outputs a composite image; (c) inputting the latent code into the decoder; (d) outputting a composite image from the decoder; (e) calculating the difference (hereinafter, the 'loss') between the output composite image and the original image; (f) if the loss is within a preset range, determining the composite image output in step (d) as the final compressed data and terminating the data compression process, and if the loss is outside the preset range, proceeding to step (g); (g) adjusting the latent code and the decoder in a direction that reduces the difference between the output composite image and the original image; and (h) returning to step (c) so that step (c) and the subsequent steps are executed.
  • according to the present invention, the data generation process is decomposed into a set of learnable latent codes and a set of decoders, which has the effect of greatly increasing the number of synthetic image examples that can be generated with a limited number of parameters.
  • the method of the present invention, in contrast to conventional methods that ignore the data generation process, effectively allows knowledge sharing between synthetic images through a shared latent space and shared decoders applicable to all latent codes, and the results show that it is much more effective in terms of the ratio between compression rate and the quality of the generated synthetic images.
  • Figure 1 is a schematic diagram showing a method of performing data compression directly in the input space.
  • Figure 2 is a schematic diagram showing a data compression method using a latent code and decoder of the present invention.
  • Figure 3 is a flowchart of a data compression method using a latent code and decoder of the present invention.
  • Figure 4 is a diagram showing class classification accuracy (%) in the ConvNet-3 artificial neural network architecture.
  • Figure 5 is a diagram showing a comparison of class classification accuracy in various artificial neural network architectures.
  • Figure 6 is a diagram showing performance evaluation according to various changes in a given training cost.
  • Figure 7 is a diagram showing an example of a composite image of a decoder output through a latent code input.
  • Figure 8 is a diagram showing a comparison of class classification accuracy in various artificial neural network architectures.
  • Figure 9 is a diagram showing an example of a composite image of a decoder output through a latent code input.
  • Figure 10 is a diagram showing test accuracy for several data compression methods.
  • Figure 11 is a diagram showing the test accuracy of a continuous learning experiment under several data compression methods.
  • Figures 12A to 12E are diagrams showing bias and variance analysis.
  • Figure 1 is a schematic diagram showing a method of directly performing data compression in the input space.
  • Figure 2 is a schematic diagram showing a data compression method using the latent code 110 and decoder 120 of the present invention.
  • in the conventional method, images of a given resolution, for example 3 × 32 × 32 (RGB channels × height × width), were compressed directly in that image format (3 × 32 × 32).
  • the data compression method of the present invention shown in Figure 2 starts from the assumption that the data points are in a compressed latent space whose dimensions are much smaller than the actual input space.
  • the low dimensionality has the advantage of significantly increasing the number of data points.
  • as the wall-clock time graph of Figure 6 shows, it is perfectly reasonable for the present invention to generate more synthetic images as long as evaluation performance is compared under equal time. It is also possible to preserve the quality of the composite images to the extent that this assumption holds, and this has generally proven to be true.
  • the only space overhead is an additional parameter for the small decoder 120 that maps the latent code 110 to the input space.
  • This decoder 120 can be shared by all data points and classes, and the number of latent codes 110 can be flexibly adjusted so that the total number of parameters is similar even if the decoder 120 is relatively large.
  • the above decomposition allows us to separate the compressed representation of the image data from the way it is decoded back into images.
  • This decoding process can be thought of as adding detail to the latent code 110 so that the decoded image has sufficient information that was in the original input space.
  • the decoder 120 may supplement background details or colors to complete the image. This insight means that the number of decoders 120 can be as large as the number of distinct styles present in the actual data set. For example, given the same compressed representation of an airplane, the background could be the ocean, sky, forest, or sunset. With multiple decoders 120, one can expect each decoder 120 to describe a unique style of the data set.
  • the knowledge of how to add styles to the latent code 110 can be shared across all images and classes, so that the number of parameters does not increase too much, similar to the Cartesian product between a pair of sets.
  • in this way, the number of composite images 130 can be efficiently increased without a corresponding increase in the number of parameters.
  • if the image resolution to be compressed is 3 × 32 × 32 and 10 images per class are allowed, the total space available during compression is C × 10 × 3 × 32 × 32. The existing methodology compressed data using the image format (3 × 32 × 32) at the given resolution.
  • the present invention instead uses a very small artificial neural network decoder 120 that takes the latent code 110 (of size 12 × 4 × 4) as input.
  • each latent code 110 represents one image, and each latent code 110 is 1/16 the size (12 × 4 × 4 = 192 values) of the original image form (3 × 32 × 32 = 3,072 values), so 16 times more images can be stored in the same space.
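The storage arithmetic above can be checked directly. This is a minimal sketch using the shapes given in the text (12 × 4 × 4 latent codes, 3 × 32 × 32 images); the small decoder's own parameters are ignored in this simple count:

```python
# Storage arithmetic for the example in the text: each latent code holds
# 12 x 4 x 4 values, each original image holds 3 x 32 x 32 values.
original_size = 3 * 32 * 32   # 3,072 values per image
latent_size = 12 * 4 * 4      # 192 values per latent code

ratio = original_size // latent_size
print(ratio)                  # 16: sixteen latent codes fit in one image's space
```

In practice the shared decoder adds a small parameter overhead, so the effective gain stays close to, but slightly below, this figure.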
  • because this artificial neural network decoder 120 restores each latent code to a form close to the original image, the restored images can be used in various deep learning models.
  • Figure 3 is a flowchart of a data compression method using the latent code 110 and decoder 120 of the present invention.
  • the present invention is a learning-based data set compression method in which the compressed representation is factored into latent codes and decoder parameters, learned by back-propagating through the generated synthetic images with an objective such as gradient matching or distribution matching.
  • the present invention proposes to use distribution matching, which is computationally more efficient than gradient matching.
  • first, a latent code 110 (e.g., 12 × 4 × 4) for the actual original images (hereinafter 'original images') (e.g., 3 × 32 × 32) and the decoder 120 are each set to random initial values (S301).
  • when there are m latent codes 110 and n decoders 120, each may be initialized to a different random value, and the following processes may then be performed for all of them simultaneously.
  • the latent code 110 initialized in this way is input to the initialized decoder 120, and synthetic examples 130 of the same form as the original image (for example, 3 × 32 × 32) are output (S302).
  • with m latent codes and n decoders, m × n composite images are output (in the case above, each composite image is 3 × 32 × 32).
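The m × n pairing described above can be sketched as follows; the linear maps standing in for the decoders 120 and the small flattened sizes are assumptions chosen for brevity, not the actual decoder architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

M, N = 3, 2                    # m latent codes, n decoders
LATENT, IMAGE = 8, 32          # illustrative flattened sizes

codes = rng.normal(size=(M, LATENT))
decoders = [rng.normal(size=(LATENT, IMAGE)) for _ in range(N)]

# Every (latent code, decoder) pair yields one composite image,
# so m x n images are produced from only m codes plus n decoders.
images = np.stack([z @ W for z in codes for W in decoders])
print(images.shape)            # (6, 32): m * n composite images
```

This is the Cartesian-product effect mentioned in the text: the number of images grows multiplicatively while the parameter count grows only additively.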
  • a process (S303 to S309) is performed to reduce the difference between the composite image 130 and the original image.
  • the composite image 130 output from the decoder 120 is input into the ‘feature vector artificial neural network model 140’ (S303).
  • the feature vector artificial neural network model 140 outputs a vector representing the features of the synthetic image (hereinafter referred to as a 'synthetic image feature vector') (S304).
  • the 'loss function value calculation and completion determination unit 150' performs a function of reducing the difference between the output composite image feature vector and the feature vector of the original image.
  • the loss function value calculation and completion determination unit 150 calculates the difference value using a loss function representing the difference (loss) between the synthetic image feature vector and the original image feature vector (S305).
  • the loss function is expressed as Equation 1 below:

    L = Σ_{c=1..C} ‖ (1/N) Σ_{i=1..N} g(x_{c,i}) − (1/(D·M)) Σ_{d=1..D} Σ_{m=1..M} g(f(z_{c,m}; θ_d)) ‖²    (Equation 1)

    where each term is explained in the following paragraphs.
  • g(x) is a feature extractor that uses x as input.
  • in Equation 1, g() is sampled by random initialization and re-initialization rather than being a pre-trained neural network; this is computationally efficient and works well in practice. The intuition is that a well-compressed synthetic data set should produce an embedding average similar to that of the real data set regardless of such random initialization. If g() were fixed to a specific configuration, matching the averages of the two embeddings would become too easy, admitting trivial solutions.
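A minimal sketch of this embedding-mean matching with a freshly re-initialized g(): the single random projection followed by a ReLU is an assumed stand-in for g(), and plain per-class arrays stand in for the real and synthetic image sets:

```python
import numpy as np

def random_extractor(in_dim, feat_dim, rng):
    """A freshly sampled g(): a random projection followed by ReLU.
    Sampled anew for each class, mirroring the re-initialization idea."""
    P = rng.normal(0.0, 1.0 / np.sqrt(in_dim), size=(in_dim, feat_dim))
    return lambda x: np.maximum(x @ P, 0.0)

def distribution_matching_loss(real, synth, feat_dim, rng):
    """Sum over classes of the squared distance between the mean embedding
    of the real images and the mean embedding of the synthetic images."""
    total = 0.0
    for c in real:                      # real/synth: class -> (n, in_dim) arrays
        g = random_extractor(real[c].shape[1], feat_dim, rng)
        total += np.sum((g(real[c]).mean(axis=0)
                         - g(synth[c]).mean(axis=0)) ** 2)
    return total

rng = np.random.default_rng(1)
real = {0: rng.normal(size=(10, 16)), 1: rng.normal(size=(10, 16))}
loss_same = distribution_matching_loss(real, real, feat_dim=8, rng=rng)
print(loss_same)                       # 0.0: identical sets match exactly
```

A synthetic set equal to the real set gives zero loss for any sampled g(), which is exactly the invariance the re-initialization argument relies on.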
  • if the loss function value calculated by Equation 1 is within the preset range (S306), the composite image, which is the decoder output image, is finally determined as the compressed data set for the original images (S307), and the data set determined in this way can be used as input to various artificial neural networks.
  • if the loss function value calculated by Equation 1 is not within the preset range (S306), the gradient indicating the direction of change of the loss function is found (S308), the next latent code z and decoder parameters θ are obtained in the direction in which the loss function decreases (S309), and with these the process returns to step S302 to obtain the output composite image of the decoder 120 with the corresponding latent code 110 as input, repeating steps S303 to S309. This is repeated until the loss function value calculated by Equation 1 falls within the preset range.
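The loop S301 to S309 can be sketched end to end. A single linear decoder, a plain pixel-space mean-squared error in place of Equation 1, and tiny flattened shapes are simplifying assumptions here; what the sketch preserves is the control flow: decode, measure the loss, stop when it is small enough, otherwise gradient-step both the latent codes and the decoder:

```python
import numpy as np

rng = np.random.default_rng(2)

M, LATENT, IMAGE = 4, 8, 32            # tiny illustrative shapes
X = rng.normal(size=(M, IMAGE))        # stand-in "original images"
Z = rng.normal(0.0, 0.3, size=(M, LATENT))      # S301: random latent codes
W = rng.normal(0.0, 0.3, size=(LATENT, IMAGE))  # S301: random linear decoder

def compress(X, Z, W, lr=1.0, tol=1e-4, max_iters=2000):
    for _ in range(max_iters):
        S = Z @ W                      # S302: decode codes into images
        diff = S - X
        loss = np.mean(diff ** 2)      # S305: loss vs. the originals
        if loss < tol:                 # S306/S307: within range, stop
            break
        # S308/S309: gradient step on BOTH the codes and the decoder
        gZ = 2.0 * diff @ W.T / diff.size
        gW = 2.0 * Z.T @ diff / diff.size
        Z -= lr * gZ
        W -= lr * gW
    return Z @ W, loss

initial_loss = np.mean((Z @ W - X) ** 2)
S, final_loss = compress(X, Z.copy(), W.copy())
assert S.shape == X.shape              # output resolution equals the original
assert final_loss < initial_loss      # the loop reduced the loss
```

The patent's method replaces the pixel-space error with the embedding-mean objective of Equation 1 and uses nonlinear decoders, but the alternating update of codes and decoder parameters follows this same pattern.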
  • Non-patent Document 1 calculates the first term (i.e., the average of g(x) over the N real images), and the second term (i.e., the average of g(f(z; θ)) over the D × M synthetic images) is further approximated.
  • ConvNet-3, ResNet-10, and DenseNet-121 are, in the present invention, names of artificial neural network (deep learning) model architectures to which the compressed synthetic images are applied as input data sets, and Conv3, RN10, and DN121 are their respective abbreviations.
  • among the names of the various data compression methods shown in the figures, KFS denotes the data compression method of the present invention.
  • Figure 4 is a diagram showing class classification accuracy (%) in the ConvNet-3 artificial neural network architecture.
  • Figure 4 compares, for each data set and as the number of images (or the number of parameters) per class is varied, the performance of the data compression method of the present invention (hereinafter 'KFS') with that of the data compression methods of the prior art (hereinafter the 'existing methods').
  • Figure 5 is a diagram showing a comparison of class classification accuracy in various artificial neural network architectures.
  • Figure 5 shows the performance of the existing methods and KFS when the artificial neural network architecture used at evaluation time (ResNet-10 and DenseNet-121) differs from the architecture used for compression (ConvNet-3).
  • KFS is more robust to changes in the artificial neural network architecture.
  • for example, with "Images/Class" = 1 on the SVHN data set, the performance degradation of KFS was only 7.2% and 1.9% on ResNet-10 and DenseNet-121, respectively, whereas it was 28.5% and 28.2% for IDC.
  • This robustness means that synthetic images learned from KFS are more natural because they are less specifically tailored to a particular neural network architecture.
  • Figure 6 is a diagram showing performance evaluation according to various changes in a given training cost.
  • Figure 6 demonstrates the excellent performance of KFS, the present invention: KFS provides better performance than the existing methods over the entire range of wall-clock time.
  • the results show that KFS can not only generate a much wider variety of synthetic images, but also produce high-quality synthetic images that are comparable to existing methods.
  • while the existing methods often suffer from overfitting (e.g., on the SVHN data set), KFS has no such problem, giving users more freedom in choosing how long the model should be trained for better performance.
  • it is completely valid and reasonable to generate more synthetic images because our method provides better performance in all ranges of wall clock time for evaluation.
  • Figure 7 is a diagram showing an example of a composite image of a decoder output through a latent code input.
  • Figure 7 shows a visualization of the composite images generated by KFS, the present invention.
  • Figure 8 is a diagram showing a comparison of class classification accuracy in various artificial neural network architectures.
  • Figure 9 is a diagram showing an example of a composite image of a decoder output through latent code input.
  • Figure 9 shows that our method can effectively capture the format of the data set across both different latent codes and decoders.
  • Figure 9 shows the experimental results on a high-resolution data set; as in Figure 7, the top two figures show various data-distribution modes obtained by changing the learned latent code with the decoder fixed, and the bottom two figures show various data-distribution modes obtained by changing the decoder with the latent code fixed.
  • Figure 10 is a diagram showing test accuracy for various data compression methods.
  • KFS aims to model the synthetic data generation process and is therefore closely related to deep generative models. We therefore compare against VQVAE, which is similar to KFS in its discrete representation of latent codes, and BigGAN, one of the latest large-scale deep generative models; the number of decoder parameters is controlled by changing the channel size.
  • the synthetic data set is constructed by performing class-conditional ancestral sampling from the prior and the decoder.
  • Figure 10 shows that our invention, KFS, significantly outperforms these generative models when the number of parameters is limited, and performs similarly to BigGAN when the number of parameters is larger.
  • although BigGAN is not specifically tuned for data set compression, it is interesting to note that BigGAN's performance far exceeds that of most data set compression methods, such as distribution matching, in Figure 10.
  • the main advantage of existing data set compression methods is that the number of samples in a synthetic data set is relatively much smaller than that of a generative model. However, more research is needed to determine whether more benefits can be achieved by leveraging generative models for dataset compression.
  • Figure 11 is a diagram showing the test accuracy of a continuous learning experiment in several data compression methods.
  • rehearsal-based continual learning methods construct a small subset of representative images from what the learner has seen so far and train on that subset together with the currently available images to prevent catastrophic forgetting.
  • data set compression approaches can be used to improve performance by improving the quality and representativeness of subsets.
  • in the experiment, the CIFAR-100 data set is provided sequentially in 5 stages of 20 classes each.
  • at each stage, the data set is compressed using only the currently available classes.
  • training is performed only with the compressed data set accumulated so far, not with actual images.
  • Existing methods include Herding, DSA, DM, and IDC.
  • Figure 11 shows the effectiveness of the compression method of the present invention, KFS, in this continual learning scenario.
  • Figures 12A to 12E are diagrams showing bias and variance analysis.
  • the bias effect is checked by controlling the number of decoder samples used for training. For example, instead of using all decoders, we subsample only a few decoders to compute the gradient at each iteration, which can lead to biased gradients.
  • Figure 12a shows that by subsampling fewer decoders, the bias actually increases and test accuracy deteriorates significantly. This is because the bias prevents the decoder from diversifying (see Equation 2).
  • Figure 12b shows that the composite images generated from a decoder trained with fewer images are indeed homogeneous across decoders.
  • the present invention proposes a new method to solve the dataset compression problem in a systematic and efficient manner.
  • the data compression method of the present invention makes full use of the regularities of the given data set, for example by assuming a generation process for the synthetic images based on the decomposition between the latent code and the decoder.
  • the number of synthetic images can be increased with essentially the same number of parameters by interchangeably combining latent codes and decoders.


Abstract

The present invention relates to a data compression method and system using latent codes and decoders and, more particularly, to a method and system that, instead of directly compressing a data set in the input space, generate the data set from a set of learnable codes defined in a small latent space and sets of small decoders that each map the codes differently back to the original input space. According to the present invention, in order to overcome the parameterization inefficiency of the input space, the data generation process is decomposed into a set of learnable latent codes and sets of decoders, so that the number of synthetic images (synthetic examples) is considerably increased with a limited number of parameters. In contrast to conventional methods that ignore the data generation process, the method of the present invention effectively enables knowledge sharing between synthetic images by means of a shared latent space and shared decoders applicable to all latent codes, and yields highly effective results in terms of the ratio between the compression rate and the quality of the synthetic images generated by the method.
PCT/KR2022/020896 2022-12-16 2022-12-20 Method and system for data compression using latent codes and decoders Ceased WO2024128377A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2022-0176642 2022-12-16
KR20220176642 2022-12-16

Publications (1)

Publication Number Publication Date
WO2024128377A1 true WO2024128377A1 (fr) 2024-06-20

Family

ID=91485022

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/020896 Ceased WO2024128377A1 (fr) 2022-12-16 2022-12-20 Method and system for data compression using latent codes and decoders

Country Status (2)

Country Link
KR (1) KR20240096393A (fr)
WO (1) WO2024128377A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20220041943A (ko) * 2020-04-30 2022-04-01 주식회사 스트라드비젼 연속 학습 서버를 이용하여 클라이언트의 이미지를 분류하는 클래시파이어를 연속 학습하는 방법 및 이를 이용한 연속 학습 서버
US20220180527A1 (en) * 2020-12-03 2022-06-09 Tasty Tech Ltd. System and method for image synthesis of dental anatomy transformation
US20220286682A1 (en) * 2020-10-23 2022-09-08 Deep Render Ltd Image encoding and decoding, video encoding and decoding: methods, systems and training methods


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LEE HAE BEOM; LEE DONG BOK; HWANG SUNG JU: "Dataset Condensation with Latent Space Knowledge Factorization and Sharing", ARXIV (CORNELL UNIVERSITY), CORNELL UNIVERSITY LIBRARY, ARXIV.ORG, 21 August 2022 (2022-08-21), XP093181502, Retrieved from the Internet <URL:https://arxiv.org/pdf/2208.10494> DOI: 10.48550/arxiv.2208.10494 *
LEE JOOYOUNG; JEONG SEYOON; KIM MUNCHURL: "Selective compression learning of latent representations for variable-rate image compression", ARXIV (CORNELL UNIVERSITY), CORNELL UNIVERSITY LIBRARY, ARXIV.ORG, 8 November 2022 (2022-11-08), XP093181495, DOI: 10.48550/arxiv.2211.04104 *
ZHAO BO; BILEN HAKAN: "Dataset Condensation with Distribution Matching", 2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), IEEE, 2 January 2023 (2023-01-02), pages 6503 - 6512, XP034290884, DOI: 10.1109/WACV56688.2023.00645 *

Also Published As

Publication number Publication date
KR20240096393A (ko) 2024-06-26


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22968637

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22968637

Country of ref document: EP

Kind code of ref document: A1