WO2023084174A1 - Method, device and computer program product for configuring a distributed computing system

Info

Publication number: WO2023084174A1
Application number: PCT/FR2022/052030
Authority: WO
Inventors: Felix RENARD; Nicolas Vuillerme; Noel DE PALMA
Original assignee: Centre National de la Recherche Scientifique CNRS; Institut Polytechnique de Grenoble; Universite Grenoble Alpes
Current assignee: Centre National de la Recherche Scientifique CNRS; Institut Polytechnique de Grenoble; Universite Grenoble Alpes
Priority date: 2021-11-09
Filing date: 2022-10-26
Publication date: 2023-05-19
Anticipated expiration: 2024-05-09
Also published as: FR3129229A1; FR3129229B1

Abstract

The invention, in one embodiment, relates to a method for determining configurations of a distributed system of computing units for implementing a deep-learning algorithm, said determination implementing a recommendation based on a knowledge model, said method being implemented by a device comprising a processor that causes said device to implement the method, the method comprising: determining a deep-learning algorithm intended to be implemented in the distributed system; obtaining (S901) a knowledge domain including, for said algorithm, at least one observed configuration and a value of at least one performance indicator for said at least one observed configuration; determining (S902) a parametric model of the performance indicator or indicators on the basis of the knowledge domain; determining (S903) at least one configuration different from said at least one observed configuration and estimating the value of the performance indicator or indicators associated with said different configuration on the basis of said model in order to complete the knowledge domain; selecting (S904) at least one configuration among the configurations of the completed knowledge domain that satisfies at least one constraint relating to the at least one performance indicator; and generating a signal representative of said at least one selected configuration. Other embodiments relate to an implementation device and to a computer program product.

Description

METHOD, DEVICE AND COMPUTER PROGRAM PRODUCT FOR CONFIGURING A DISTRIBUTED COMPUTING SYSTEM

Technical field of the invention

The present invention relates to a method, a device and a computer program product for configuring a distributed computing system. More particularly, the invention concerns determining one or more configurations for the implementation of an algorithm by the distributed system to be configured, the determination being based on a knowledge-enriched model. The algorithm is, for example, a deep learning algorithm.

Technical background

Computer recommender systems can be classified into four broad categories:

- recommenders based on collaborative filtering;

- recommenders based on content similarity;

- recommenders based on a knowledge model;

- hybrid recommenders combining the characteristics of several recommenders from the previous categories.

Deep learning algorithms consume large amounts of time, computing resources and energy. A specific recommendation on the distributed system configuration, tailored to the needs and/or constraints at hand, makes it possible to limit this consumption.

Deep learning algorithms are based on stochastic optimization.

These recommenders are relevant if they can offer a reasonably exhaustive set of possible choices, so as to best meet the constraints and needs. However, owing to the stochastic optimization mentioned above, and to the high cost of training on data resulting from real experiments and observations, obtaining the full set of possible configurations is often not feasible. Document [1] describes radiotherapy planning using a knowledge-based recommender system.

Summary of the invention

One embodiment relates to a method for determining configurations of a distributed system of computing units for implementing a deep learning algorithm, said determination implementing a recommendation based on a knowledge model, said method being implemented by a device comprising a processor which causes said device to implement the method, the method comprising: determining a deep learning algorithm intended to be implemented in the distributed system; obtaining a knowledge domain comprising, for said algorithm, at least one observed configuration and a value of at least one performance indicator for said at least one observed configuration; determining a parametric model of the performance indicator or indicators on the basis of the knowledge domain; selecting at least one configuration, from among the configurations of the completed knowledge domain, that satisfies at least one constraint relating to the at least one performance indicator; and generating a signal representative of said at least one selected configuration.

Thus, experimental data need only be obtained for a limited number of configurations: the knowledge domain can be completed by applying the parametric model. It is within this completed knowledge domain, whether completed exhaustively or not, that the configurations can then be filtered against one or more constraints, for example constraints set by a user or predetermined, in order to select, from among the configurations satisfying these constraints, the one or more best-suited configurations.

The generated signal can be displayed, and the information it contains can be stored or used as a command to one or more servers of computing units.

According to a particular embodiment, the selection is performed by a decision algorithm on the basis of a weighting of at least one performance indicator, the method further comprising displaying the selected configuration or configurations by means of said signal, and receiving a command identifying a single configuration, among the selected configurations, to be applied to the distributed system.

According to a particular embodiment, the selection is performed by a decision algorithm which selects a single configuration.

According to a particular embodiment, the method further comprises reserving computing units according to the single configuration.

According to a particular embodiment, the method further comprises estimating the variability of said at least one performance indicator for a different configuration.

According to a particular embodiment, said at least one performance indicator of a configuration comprises one or more of: the training time required by the system for the configuration considered; the relative computational precision; the power consumed by the system for the configuration considered.

According to a particular embodiment, the parametric models applied, on the one hand, to the inverse of the training time and, on the other hand, to the computational precision are affine functions of the number of computing units of the distributed system used in the configuration considered, and determining the parametric model comprises determining the parameters of the affine function associated with each performance indicator.

According to a particular embodiment, the power consumed is an affine function of the square of the number of computing units of the distributed system to be used in the configuration envisaged.
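For illustration, the three parametric models of the two preceding embodiments can be written out as follows. The notation is introduced here and is not taken from the source: n denotes the number of computing units, T the training time, p the computational precision, W the power consumed, and the a and b coefficients are the parameters to be determined from the knowledge domain.

```latex
\frac{1}{T(n)} = a_T\, n + b_T, \qquad
p(n) = a_p\, n + b_p, \qquad
W(n) = a_W\, n^{2} + b_W
```

Under these assumptions, each indicator requires only two parameters, so in principle two observed configurations per indicator are enough to determine a model, with additional observations allowing a least-squares fit.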

According to a particular embodiment, the decision algorithm ranks the filtered configurations according to a score that is a function of the performance indicator or indicators and of their weighting coefficients.
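The ranking described above might be sketched as follows. The source does not specify the form of the score, so the weighted sum of normalised indicators used here is an assumption, as are the `Config` class, the `score` and `rank` function names, the weight keys and the sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    n_units: int        # number of computing units
    train_hours: float  # training time (lower is better)
    precision: float    # relative precision in [0, 1] (higher is better)
    power_kw: float     # power consumed (lower is better)


def score(cfg, weights, ref):
    """Hypothetical score: weighted sum of normalised indicators, higher is better."""
    return (
        weights["precision"] * cfg.precision
        - weights["time"] * cfg.train_hours / ref.train_hours
        - weights["power"] * cfg.power_kw / ref.power_kw
    )


def rank(configs, weights):
    # Normalise time and power by the largest value among the candidates.
    ref = Config(
        0,
        max(c.train_hours for c in configs),
        1.0,
        max(c.power_kw for c in configs),
    )
    return sorted(configs, key=lambda c: score(c, weights, ref), reverse=True)


# Illustrative (made-up) filtered configurations.
candidates = [
    Config(2, 6.0, 0.96, 0.6),
    Config(4, 3.0, 0.95, 1.5),
    Config(8, 1.5, 0.93, 4.0),
]
time_first = rank(candidates, {"precision": 0.0, "time": 1.0, "power": 0.0})
power_first = rank(candidates, {"precision": 0.0, "time": 0.0, "power": 1.0})
```

With all the weight placed on training time, the configuration with the most units comes first; with all the weight on power, the order is reversed. Intermediate weightings trade the indicators off against one another.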

According to a particular embodiment, the method further comprises obtaining at least one constraint on an intrinsic characteristic of the hardware configuration, said constraint being taken into account when filtering the configurations.

According to a particular embodiment, said deep learning algorithm implements a convolutional neural network.

According to a particular embodiment, the above method is applied in the context of image segmentation.

An exemplary embodiment relates to a device comprising a processor and a memory, said memory comprising instructions which, when executed by the processor, cause the device to implement the steps of one of the methods described.

Another exemplary embodiment relates to a computer program product comprising instructions which, when the program is executed by a device comprising a processor, cause the device to implement the steps of one of the methods described.

Brief description of the figures

Other characteristics and advantages of the invention will become apparent on reading the detailed description that follows, for the understanding of which reference is made to the appended drawings, in which:

- Figure 1 is a diagram of a neural network that can be used to implement a deep learning algorithm according to the state of the art;

- Figure 2 is a block diagram of a system in which a device according to one embodiment is implemented;

- Figure 3 is a flowchart of a recommender based on a knowledge model according to an exemplary embodiment;

- Figure 4 is a graph representing the speed-up of the training time as a function of the number of computing units;

- Figure 5 is an example of a graphical interface allowing a user to interact with the recommender according to an exemplary embodiment;

- Figure 6 is an enlarged view of a first part of the user interface of Figure 5, offering a functionality for choosing constraints and the weighting coefficients of these constraints;

- Figure 7 is an enlarged view of a second part of the user interface of Figure 5, offering a functionality for choosing the display parameters of the recommender's configurations;

- Figure 8 is an enlarged view of a third part of the user interface of Figure 5, corresponding to a visualization of different configurations;

- Figure 9 is a flowchart of a method according to a non-limiting exemplary embodiment.

Detailed description of the invention

The following description uses medical imaging as an example application. However, the invention is not limited to this specific application and can be implemented in fields other than image analysis.

Figure 1 is an example of a neural network architecture 10 that can be used to implement a deep learning algorithm. The neural network of Figure 1 is a very simple network, known per se, intended to illustrate the basics of how such a network operates within a deep learning algorithm - far more complex networks can be used and are moreover known to the person skilled in the art. The network of Figure 1 comprises an input layer 11, one or more so-called hidden layers 12 and an output layer 13. The layers comprise neurons 14 and each layer is connected to the next layer - through these connections, each layer takes into account the output of the previous layer. The connections are weighted, and the weights define the influence of one neuron on another, the latter receiving the sum of the weighted outputs of the preceding neurons. This sum is processed by a so-called activation or propagation function, which generates the output of the neuron. The input layer receives external data, for example images from medical imaging. The output layer provides the desired result, for example the recognition of an object in the image or a segmented image. The network can, for example, be trained by gradient backpropagation in a supervised setting, on the basis of known input data generating a known result, the difference between the actual output generated by the network and the expected result making it possible to adapt the parameters of the network in a manner known per se, in particular by adjusting the weights to minimize the errors. These weights will have been initialized randomly.

Figure 2 is a block diagram of a system comprising a device 200 for recommending configurations according to a non-limiting embodiment. The device 200 comprises a processor 201, a random access memory 202, a mass memory 203 (for example a read-only memory, a hard disk, a static memory, etc.), a communication interface 205 and an interface 206 for connecting peripherals. The various elements of the device 200 are interconnected by an internal communication bus 204. The mass memory 203 stores, in particular, software code 213 which implements the methods described when it is executed by the processor 201. The device is connected to one or more user interface devices 208, such as a keyboard and/or a mouse. A display device 207 is also connected to the device 200. Although the figure shows a display device external to the device 200, it can also be integrated into the device 200. The device 200 communicates with a server 209 through a communication network 210, for example the Internet. The server 209 controls one or more banks 211 of computing units 212. According to an exemplary embodiment, these computing units are graphics processing units (GPUs), but other types of processing units can also be used. The device 200 can reserve computing capacity of the banks 211 from the server 209 for the purpose of executing a deep learning algorithm. This reservation can, for example, be made on the basis of a configuration of the distributed computing system, this configuration being defined, for example, by a number of computing units to be reserved. Although the server 209 is shown in Figure 2 separately from the banks 211 of computing units, these two entities can be co-located. All communications can be wired or wireless.
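The weighted-sum-and-activation computation described for the network of Figure 1 can be sketched as follows. This is a minimal illustration, not the network used in the invention: the sigmoid activation, the bias terms, the layer sizes and the names `sigmoid` and `forward` are all assumptions introduced here.

```python
import numpy as np


def sigmoid(z):
    # One possible activation function (an assumption; the source names none).
    return 1.0 / (1.0 + np.exp(-z))


def forward(x, layers):
    """Propagate x through a list of (weights, bias) pairs, one pair per layer."""
    activation = x
    for weights, bias in layers:
        # Each neuron receives the weighted sum of the previous layer's outputs,
        # which the activation function turns into the neuron's own output.
        activation = sigmoid(weights @ activation + bias)
    return activation


rng = np.random.default_rng(0)  # the weights are initialised randomly
layers = [
    (rng.normal(size=(4, 3)), np.zeros(4)),  # input layer (3) -> hidden layer (4)
    (rng.normal(size=(2, 4)), np.zeros(2)),  # hidden layer (4) -> output layer (2)
]
output = forward(np.array([0.5, -1.0, 2.0]), layers)
```

Training by backpropagation would then adjust the entries of each weight matrix to reduce the difference between `output` and the expected result; that step is omitted here.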

The aim of the recommender device 200 is to propose a configuration of the distributed computing system according to specifications given by a user. According to the present exemplary embodiment, these specifications, which represent the objectives expected by the user, are based on one or more characteristics of the configurations of the distributed computing system and/or on one or more performance criteria defined by a user. The performance criterion or criteria may comprise one or more of the training (or learning) time of the algorithm, the precision (or loss of precision) and the electrical power consumed. An example of a user-defined combination of criteria could be a training time of less than three hours and a precision of 95%. Not all criteria need to be defined. These criteria or their combinations are also called 'constraints', and the recommender device is then considered to be a constrained recommender in the sense that the criteria specified by the user constrain certain parts of the data domain. As such, constrained knowledge-based recommenders are also known as 'knowledge-based configurations'. The device must then determine a suitable configuration, in the form of a number of computing units to be reserved, and, where appropriate, implement this configuration. According to one embodiment, the recommender device 200 determines a suitable configuration, which can then be implemented by the user via a device other than the device 200. According to the present exemplary embodiment, the device 200 implements a recommender based on a constrained knowledge model.
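As a minimal sketch of how such constraints could restrict the data domain, the example below filters a small knowledge domain with the combination quoted above (training time of at most three hours, precision of at least 95%). The `satisfies` function, the dictionary layout, the inclusive bounds and the indicator values are assumptions made for illustration only.

```python
def satisfies(indicators, constraints):
    """True if every constrained indicator lies within its (min, max) bounds."""
    for name, (low, high) in constraints.items():
        value = indicators[name]
        if low is not None and value < low:
            return False
        if high is not None and value > high:
            return False
    return True


# Hypothetical knowledge domain: number of computing units -> indicators.
domain = {
    1: {"train_hours": 8.0, "precision": 0.97},
    2: {"train_hours": 4.1, "precision": 0.96},
    4: {"train_hours": 2.2, "precision": 0.95},
    8: {"train_hours": 1.3, "precision": 0.93},
}

# Only the constrained indicators are listed; not all criteria need be defined.
constraints = {"train_hours": (None, 3.0), "precision": (0.95, None)}
selected = [n for n, ind in domain.items() if satisfies(ind, constraints)]
```

With these made-up values, only the four-unit configuration satisfies both constraints; relaxing either bound would widen the set handed to the decision step.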

Figure 3 is a flowchart of a recommendation method according to an exemplary embodiment.

The recommendation method receives as input at least the following data:

- A knowledge domain 301

This domain comprises configurations of the distributed system (expressed, as indicated, for example as a number of computing units) together with the performance associated with these configurations (in terms of training time of the algorithm implemented, relative precision, power required or energy consumption), determined experimentally for a given task, for example a medical image segmentation task.

- The specifications 302 given by the user

According to one embodiment, these specifications comprise minimum and/or maximum limits or specific values of the performance criteria. According to a variant embodiment, these criteria can also be predetermined, or be derived automatically from data available elsewhere, such as costs to be respected. The costs to be respected are, for example, established on the basis of cost functions according to performance criteria as above, or according to ecological and/or economic criteria.

According to a variant embodiment, the recommendation method also receives as input decision criteria 303 used to order or rank the configurations that meet the user's specifications. In the example of Figure 3, these criteria are determined by an expert, but they can also be provided by the user or by another person.

The flowchart of FIG. 3 comprises two main steps, steps A and B, each divided into sub-steps Ax and By:

A - The first of these steps, step A, concerns the enrichment of the initial knowledge domain 301. Step A comprises three sub-steps:

A1 - A first sub-step (step A1) of parameterization of the knowledge domain. This step determines a parametric model of the performance criteria of the deep-learning algorithm under consideration as a function of the number of computing units.

A2 - A second sub-step (step A2) of estimating untested configurations, which implements the parametric model and provides a set of configurations 305 to step B described below. This sub-step adds to the knowledge domain configurations that have not been tested (i.e. that are not part of the configurations of the initial knowledge domain) but are potentially possible. The knowledge domain thus supplemented is called the enriched knowledge domain.

According to a first non-limiting example, if all the experiments were carried out with an even number of computing units, then this step adds the configurations for an odd number of computing units.

According to another non-limiting example, a bank of sixteen computing units is available. To measure variability, four measurements are carried out per group of computing units. Measuring all the configurations would require 4*16 = 64 simulations. A limited number of configurations is tested instead, for example the configurations with 1, 2, 4, 8 and 16 computing units, i.e. 4*5 = 20 configurations to be tested experimentally, which is less than a third of the total number of potential experiments. The reduction in the number of experiments is all the more significant as it may be necessary to conduct them for different types of computing units. A type of computing unit can be defined by a computing power and/or the manufacturer of the computing unit and/or the energy cost of the computing unit. The configurations added to enrich the knowledge domain comprise all the untested configurations.

According to a third, more general, non-limiting example, untested configurations are added which correspond to regular increments of computing units between the minimum number of computing units of the tested configurations and the maximum number of computing units of the tested configurations.

According to a fourth, more general, non-limiting example, the untested configurations added correspond to every integer number of computing units between the minimum number of computing units of the tested configurations and the maximum number of computing units of the tested configurations.

According to a fifth, more general, non-limiting exemplary embodiment, the number of computing units taken into account to determine the added untested configurations may be greater than the number of computing units of the tested configurations.

In all cases, only configurations corresponding to numbers of computing units that do not appear in the tested configurations are added.
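The enumeration of configurations to add can be sketched as follows. This is a minimal illustration, not the method of the description itself: the function name `untested_configurations` and its `max_units` parameter are hypothetical, and a configuration is reduced here to its number of computing units.

```python
def untested_configurations(tested_units, max_units=None):
    """Return the unit counts to add to the knowledge domain.

    tested_units: unit counts already measured experimentally.
    max_units: optional upper bound; when given, it may exceed the largest
               tested count (fifth example). Otherwise the range stops at
               the largest tested count (fourth example).
    """
    tested = set(tested_units)
    lowest = min(tested)
    highest = max(tested) if max_units is None else max_units
    # Only counts absent from the tested configurations are added.
    return [n for n in range(lowest, highest + 1) if n not in tested]


print(untested_configurations([1, 2, 4, 8, 16]))
# prints [3, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15]
```

With the sixteen-unit bank discussed above, the five tested counts leave eleven counts whose performance is estimated from the parametric model rather than measured.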

Thanks to the parametric model, it is possible to estimate the training time and the precision for the numbers of computing units of these added configurations.

A3 - A third sub-step (step A3) of estimating the variability of the performance criteria. Since data from multiple experiments are available, it is possible to estimate a spread of values of the performance criteria for each of the configurations, even those that have not been tested. Indeed, each parameter of the linear model is estimated with a confidence interval.

The data output by this step are also provided as input to step B.

B - The second step, step B, concerns the selection of the recommended configuration(s). Step B comprises two sub-steps:

B1 - A first sub-step (step B1) of filtering the configurations of the enriched knowledge domain according to the user's specifications, taking into account the variability of the performance criteria. According to a variant embodiment, the variability is predefined by a confidence interval.
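One possible reading of this filtering step is sketched below. It is an illustration under stated assumptions: the `Estimate` record, the `filter_configurations` function and the factor `k` (number of standard deviations used as a margin) are hypothetical, and the specifications are reduced to a maximum training time and a minimum precision.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    units: int             # number of computing units
    time_mean: float       # estimated training time
    time_std: float        # variability of the training time
    precision_mean: float  # estimated precision
    precision_std: float   # variability of the precision


def filter_configurations(estimates, max_time, min_precision, k=2.0):
    """Keep configurations that satisfy the specifications even at the
    pessimistic end of the interval mean +/- k * std (assumed margin)."""
    kept = []
    for e in estimates:
        worst_time = e.time_mean + k * e.time_std
        worst_precision = e.precision_mean - k * e.precision_std
        if worst_time <= max_time and worst_precision >= min_precision:
            kept.append(e)
    return kept
```

A stricter or looser filter is obtained by changing `k`, or by testing the mean alone when variability is to be ignored.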

B2 - A second sub-step (step B2) which orders the configurations of the subset of configurations on the basis of one or more decision criteria 303 and determines the recommended configuration(s). These decision criteria make it possible, where appropriate, to favor certain performance criteria over others. According to a non-limiting exemplary embodiment, these decision criteria include weightings associated with the performance criteria.

According to a non-limiting exemplary embodiment, the ordering algorithm is chosen from a family of algorithms known as 'Multi Criteria Decision Making' ('MCDM'). According to a particular embodiment, the Simple Additive Weighting ('SAW') algorithm is implemented. The SAW algorithm is described, for example, in [4].
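A common textbook form of Simple Additive Weighting is sketched below for illustration. It is not taken from reference [4]: the function name `saw_rank`, the data layout and the example values are assumptions, and the normalization used (value/max for criteria to maximize, min/value for criteria to minimize) is one usual choice among several. All criterion values are assumed strictly positive.

```python
def saw_rank(configs, weights, maximize):
    """Rank configurations by Simple Additive Weighting.

    configs:  {name: {criterion: value}}
    weights:  {criterion: weight}, expressing the decision criteria
    maximize: {criterion: True if larger is better, False otherwise}
    Returns a list of (name, score) pairs, best score first.
    """
    scores = {}
    for name, values in configs.items():
        score = 0.0
        for criterion, weight in weights.items():
            column = [c[criterion] for c in configs.values()]
            if maximize[criterion]:
                normalized = values[criterion] / max(column)
            else:
                normalized = min(column) / values[criterion]
            score += weight * normalized
        scores[name] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values for two candidate configurations.
ranking = saw_rank(
    {"4 units": {"time": 10.0, "precision": 0.90},
     "8 units": {"time": 5.0, "precision": 0.88}},
    weights={"time": 0.5, "precision": 0.5},
    maximize={"time": False, "precision": True},
)
print(ranking[0][0])  # prints 8 units
```

Raising the weight of one criterion relative to the others is how the decision criteria favor certain performance criteria.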

The recommendation process of FIG. 3 outputs a recommended configuration 304 that meets the specifications given by the user. According to a variant embodiment, if several configurations meet the specifications, the most relevant of them is provided.

The recommendation can then be implemented. For example, the device 200 can communicate the recommended configuration to the server for the purpose of reserving resources. A step of validation by the user can optionally be provided, as well as an optional step in which the user supplies additional information (for example the chosen date, the source of the training data that the computing units will need to access, etc.).

According to a variant embodiment, the recommendation process provides a list of several recommended configurations ranked according to the ordering algorithm, the user then being able to choose the configuration that suits them.

In the following, a particular exemplary embodiment in the field of image segmentation is described in detail. In this context, the following are used as examples of deep-learning algorithms: an algorithm implementing a fully convolutional network ('Fully Convolutional Network' or 'FCN'); and an algorithm called 'U-Net', also implementing a convolutional neural network. The two algorithms used are known per se and are described in [2] and [3] respectively. They are particularly suitable for image segmentation. It should be noted, however, that these algorithms are given only by way of non-limiting example to illustrate the exemplary embodiments. Other deep-learning algorithms can be used.

Both algorithms are convolutional neural networks. The FCN network is a succession of convolutional filters that decreases the size of the successive images at the input of these filters. The U-Net network has a first compression phase like the FCN network, followed by a so-called 'expansion' phase in which the size of the images at the output of the convolutional filters increases, which gives the network a U shape and hence its name. Thanks to its two parts (compression/expansion), the U-Net network has good local and global properties for segmentation. The benefit of successive convolution filters is good invariance with respect to rotations and translations in the image. The U-Net network has seven layers of convolutional filters, while the FCN has eight.

In the context of this example, the two algorithms above are applied to the segmentation of medical images. For the purposes of the example, training is carried out on the basis of two datasets:

- A first dataset concerns the segmentation of brain tumors. This first dataset is labeled 'BRATS' (see reference [5]) in the rest of the description. It comprises 750 magnetic resonance images (MRI) from 19 different medical institutions: a first group of 484 images is used for training and a second group of 266 images is used to test the model resulting from the training. Each image is accompanied by a manual reference annotation locating the tumor. This dataset can be obtained at the links indicated in reference [5].

- A second dataset concerns the segmentation of a cardiac chamber, the left atrium. This dataset is labeled 'Atrium' (see reference [6]) in the rest of the description. It comprises thirty three-dimensional cardiac MRI images, twenty of which are used for training and ten for the test phase. This dataset can be obtained at the link indicated in reference [6]. In both datasets, the annotations consist of colored masks superimposed on the images and identifying the segmented part.

According to step A1, a parametric model of the performance criteria is established. According to the present exemplary embodiment, this model comprises three equations, one for each performance criterion.

The first equation corresponds to the relationship between the inverse of the training time and the number of computing units. It is proposed to model this relationship as a linear equation, the linearity being demonstrable statistically:

Equation (1): (1 / Training time) = A * number of computing units + B, where A and B are two coefficients to be estimated.
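Equation (1) can be fitted and then inverted to predict a training time for an untested number of computing units. The sketch below is illustrative only: the helper names `fit_line` and `predict_training_time` are hypothetical, and the measurements are synthetic values generated from A = 0.01 and B = 0.002, not experimental data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_training_time(a, b, units):
    """Invert Equation (1): training time = 1 / (A * units + B)."""
    return 1.0 / (a * units + b)


# Synthetic training times for the tested configurations.
units = [1, 2, 4, 8, 16]
times = [1.0 / (0.01 * n + 0.002) for n in units]

# The regression is done on the inverse of the training time.
a, b = fit_line(units, [1.0 / t for t in times])
print(predict_training_time(a, b, 3))  # estimate for an untested count
```

The regression is performed on 1/time rather than on the time itself, since it is the inverse that is modeled as linear in the number of computing units.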

FIG. 4 is a graph showing the acceleration of training as a function of the number of computing units for each of the two algorithms and the two datasets mentioned above. The acceleration on the ordinate is given by the training time for one computing unit divided by the training time for N computing units. In the example of FIG. 4, the number of iterations or 'epochs' of the optimization algorithm is fixed beforehand and constant.
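The acceleration plotted in FIG. 4 can be related to Equation (1). The short derivation below assumes that Equation (1) holds exactly; the symbols T(N), for the training time with N computing units, and S(N), for the acceleration, are introduced here for illustration.

```latex
\frac{1}{T(N)} = A \cdot N + B
\quad\Longrightarrow\quad
S(N) = \frac{T(1)}{T(N)} = \frac{A \cdot N + B}{A + B}
```

Under this assumption the acceleration is itself linear in N, and when B is negligible compared with A it reduces to S(N) ≈ N, i.e. close to ideal linear scaling.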

The second equation models the precision as a function of the number of computing units:

Equation (2): Precision = C * number of computing units + D, where C and D are two coefficients to be estimated.

Regarding the second equation, the inventors assumed that the loss of precision increases as the number of computing units increases. The parameters of learning algorithms are obtained after complex optimizations based on stochastic (i.e. random) algorithms. A first consequence is that the optima obtained are multiple and exhibit plateaus. Moreover, they are highly dependent on the datasets used. When the optimization is parallelized, the dataset is divided according to the number of computing units. In the present non-limiting example, where the training data are sets of images, each computing unit handles, for example, several images of a set. Within such a plateau, each result is all the less precise as the dataset is more finely divided. Since the results of the computing units are merged into the model, the merged model will not be any more precise.

Moreover, each cost function f(W,D) depends on the weights W to be estimated as well as on the data D. When the dataset is divided across the computing units, the cost function implemented by each computing unit is therefore different, and so is its local optimum. With reference to FIG. 1, the weights W are attached to the links between the layers 14, while the data D are represented by block 11.

The sum of the results of the optimizations on each computing unit is therefore, a priori, an approximation that becomes coarser as the number of computing units increases.

The third equation models the power consumed:

Equation (3): Power = PUE * Training time * (Pcpu + Pram + Number of GPUs * Pgpu) / 1000, where 'PUE' is the Power Usage Effectiveness ('Puissance Utilisée Effective', taken equal to 1.58), 'Pcpu' is the power related to the use of the server's processor, 'Pram' is the power related to the use of the server's memory, and 'Pgpu' is the power related to each computing unit of the server. Each of these powers depends on the hardware used.
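Equation (3) translates directly into a small function. The sketch below makes assumptions that the text does not state: the training time is taken in hours and the component powers in watts, so that the division by 1000 yields kilowatt-hours. The function name and the numerical values in the example are hypothetical and are not those of Table 1.

```python
def consumption(training_time_h, n_gpus, p_cpu_w, p_ram_w, p_gpu_w, pue=1.58):
    """Evaluate Equation (3).

    Assumed units: time in hours, powers in watts, result in kWh.
    pue defaults to the value 1.58 given in the description.
    """
    return pue * training_time_h * (p_cpu_w + p_ram_w + n_gpus * p_gpu_w) / 1000


# Hypothetical hardware: 100 W processor, 20 W memory, 250 W per GPU.
print(consumption(training_time_h=10, n_gpus=4,
                  p_cpu_w=100, p_ram_w=20, p_gpu_w=250))
```

Because the training time decreases while the GPU term grows with the number of computing units, the result is not monotonic in general and has to be evaluated for each configuration.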

Equation (3) comes from the article referenced in [7].

The values of the various powers can be obtained from the manufacturers of the various components. By way of example, power values for specific equipment are given in Table 1:

[Table 1]

Once the coefficients of the parametric model have been estimated and its power parameters determined, the average performance (training time, precision and power) is estimated for a number of computing units ranging from 0 to N. For each estimate, the standard deviations are specified.

In the context of this example, the coefficients of the first two equations are estimated using a least-squares regression. Other methods can of course be used to carry out this estimation. In what follows, the validity of the hypothesis of linear equations for equations (1) and (2) is checked against the two different datasets chosen, by testing whether the coefficients obtained (at least A and C) are non-zero.
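A least-squares fit with standard errors, and the test that a coefficient is non-zero, can be sketched as follows. This is an illustration using the standard formulas of simple linear regression, not the inventors' procedure: the function names are hypothetical, the data are synthetic, and the critical value `t_crit` is supplied by the caller from a Student t table (3.182 for a two-sided 95 % interval with 3 degrees of freedom).

```python
import math


def ols_with_stats(xs, ys):
    """Least-squares fit y = slope * x + intercept with standard errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - 2)
    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": math.sqrt(s2 / sxx),
        "se_intercept": math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx)),
    }


def is_nonzero(coefficient, standard_error, t_crit):
    """True when zero lies outside coefficient +/- t_crit * standard_error."""
    return abs(coefficient) > t_crit * standard_error


# Synthetic inverse training times for 1, 2, 4, 8 and 16 computing units.
fit = ols_with_stats([1, 2, 4, 8, 16], [0.012, 0.022, 0.041, 0.083, 0.161])
print(is_nonzero(fit["slope"], fit["se_slope"], t_crit=3.182))
```

The same interval, coefficient ± t_crit * standard error, is the confidence interval from which the variability of an estimated configuration can be derived.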

For equation (1):

[Table 2]

[Table 3]

It can be seen that the model is valid for both algorithms and both datasets. Note that for the U-Net model, the coefficient B can be taken as equal to zero (zero lies within the confidence interval). There is therefore indeed a linear relationship between the inverse of the training time and the number of computing units.

For equation (2):

[Table 4]

[Table 5]

For both algorithms and both datasets, the coefficients C and D are statistically different from 0.

Consequently, the precision of each algorithm for each dataset can be modeled by a linear equation.

It can be observed that for equation (1), the coefficients A and B are very close for a given algorithm. For equation (2), the coefficients D differ, but the coefficients C, namely the slopes, are similar for the algorithms and the two datasets. The coefficient C can be seen as the relative loss of precision as a function of the number of computing units. The coefficient D can be seen as the precision intrinsic to each dataset, which is higher or lower depending on the complexity of the dataset.

In what follows, a validation by statistical test is described for the hypothesis that the performance of a specific algorithm on one dataset can be transposed to another dataset. The principles of such a statistical validation are described in [8]. To introduce vocabulary used later: considering a linear model of the type Y = aX + b, where Y is the precision or the training time and X is the number of computing units, the variable 'a' represents the dynamics as a function of the number of computing units and the variable 'b' represents the bias.

For equation (1):

The following equation is considered:

Equation (4): (1 / Training time) = K1 + K2 * number of computing units + K3 * Z + K4 * number of computing units * Z, where (K1, K2, K3, K4) are variables to be estimated and Z is the variable indicating to which dataset the training-time value belongs (coded 0 for one dataset and 1 for the other). More precisely, if Z = 0, equation (1) is recovered for one dataset. If Z = 1:

Equation (5): (1 / Training time) = (K1+K3) + (K2+K4) * number of computing units

If K3 and K4 are statistically equal to 0, then each algorithm can be considered to have the same properties on both datasets.
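The point estimates of K1 to K4 can be illustrated without a multivariate solver, because a model with a dataset indicator and its interaction with the number of computing units gives the same estimates as two separate straight-line fits, one per dataset. The sketch below relies on that equivalence; the function names and the synthetic data are assumptions, and the significance test on K3 and K4 (which requires standard errors from the joint regression) is not shown.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interaction_coefficients(units0, y0, units1, y1):
    """Estimate (K1, K2, K3, K4) of Equation (4).

    (units0, y0): observations of the dataset coded Z = 0.
    (units1, y1): observations of the dataset coded Z = 1.
    """
    slope0, intercept0 = fit_line(units0, y0)
    slope1, intercept1 = fit_line(units1, y1)
    return {
        "K1": intercept0,               # bias for Z = 0
        "K2": slope0,                   # dynamics for Z = 0
        "K3": intercept1 - intercept0,  # change of bias between datasets
        "K4": slope1 - slope0,          # change of dynamics between datasets
    }


# Synthetic inverse training times for two datasets.
units = [1, 2, 4, 8, 16]
k = interaction_coefficients(
    units, [0.010 * n + 0.002 for n in units],
    units, [0.012 * n + 0.003 for n in units],
)
print(k)
```

Read this way, K3 measures how the bias differs between the two datasets and K4 how the dynamics differ, which is why testing both against zero checks whether one fitted line can serve for both datasets.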

[Table 6]

[Table 7]

It is observed that K3 and K4 are different from 0 for both algorithms. The consequence is that the training time can be predicted for a given algorithm (U-Net or FCN) for different datasets.

Pour l’équation (2) : For equation (2):

On considère l’équation suivante : We consider the following equation:

Equation (6) : Précision = Kl + K2 * nombre d’unités de calcul + K3 * Z + K4 * nombre d’unités de calcul * Z où (Kl , K2, K3, K4) sont des variables à estimer et Z est la variable permettant de considérer à quel jeu de données la valeur du temps d’entraînement appartient (l’un des couples (Kl, K3) et (K2, K4) étant coté à 0, l’autre à 1). Plus précisément, si Z =0, on retrouve l’équation (2) pour un couple. Si Z=l, on a: Equation (6): Precision = Kl + K2 * number of calculation units + K3 * Z + K4 * number of calculation units * Z where (Kl , K2, K3, K4) are variables to be estimated and Z is the variable making it possible to consider to which data set the value of the training time belongs (one of the pairs (K1, K3) and (K2, K4) being quoted at 0, the other at 1). More precisely, if Z =0, we find equation (2) for a couple. If Z=l, we have:

Equation (7): Precision = (K1 + K3) + (K2 + K4) * number of computing units

If K3 and K4 are statistically equal to 0, each algorithm can be considered to have the same properties on both datasets.
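The text does not state how "statistically equal to 0" is assessed. One common choice, assumed here purely for illustration, is a Student t-test on each estimated coefficient of the linear regression:

```latex
t_j = \frac{\hat{K}_j}{\operatorname{se}(\hat{K}_j)}, \qquad j \in \{3, 4\}
```

Under this assumption, with n observations and four estimated coefficients, each t_j is compared against a Student distribution with n - 4 degrees of freedom, and K_j is treated as equal to 0 when |t_j| stays below the critical value for the chosen significance level.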

[Table 8]

[Table 9]

We observe in particular that the dynamics of the loss of precision is similar between the two datasets for the UNET algorithm (because K4 is statistically equal to 0), but that this model does not hold between the two datasets for the FCN algorithm (because K4 differs from 0). For both algorithms, the bias differs between the two datasets.

Note that the K4 value obtained for the FCN algorithm is very small (-0.1242). For 18 computing units, this gives an additional relative loss of precision of 18 × (-0.1242) ≈ -2.24, that is, about 2.2 in magnitude. This value can nevertheless be considered negligible.

It should be noted that beyond a certain number N of computing units, the time saved by sharing the computation is offset by the additional communication time between devices. Moreover, the loss of precision can be all the greater as the quantity of data seen by each computing unit is small.

Furthermore, the robustness of the models across datasets of different sizes and numbers of images can be explained by the data augmentation phase that is very commonly applied in deep learning analyses. Even if the datasets are heterogeneous, they are normalized in number and size for a given algorithm. Finally, the hyperparameters (the parameters intrinsic to each algorithm, such as the convergence speed of the optimization algorithm, the number of images analyzed at each iteration, etc.) are fixed for a specific algorithm.

The parametric model of a deep learning algorithm can therefore be transposed to different types of data.

Figure 5 shows a non-limiting example of a graphical interface 500 that can be implemented to interact with the recommender device 200. This graphical interface can be accessible remotely from the device, for example through a web browser on a computer. It can be used both to acquire the various data and the user's choices and to present the recommendation(s) obtained, and thus makes it possible to manage the filtering of the configurations as well as the decision algorithm.

The interface has four parts:

1 - A first part makes it possible to specify the constraints imposed on the performance criteria (corresponding to the data 302 of Figure 3) and to adjust the weights associated with each criterion (corresponding to the data 303 of Figure 3).

In the example of Figure 5, this part is positioned in the left column (area 501). Figure 6 illustrates this part in detail.

2 - A second part is used to select the parameters governing the visualization of the recommended configurations. This visualization allows the user, in particular, to refine the configuration of the recommendation device.

In the example of Figure 5, this part is positioned in the right column (area 502). Figure 7 illustrates this part in detail.

3 - A third part makes it possible to visualize one or more performance criteria together with the associated variability, for several configurations. The configurations taken into account in this part can cover a wide spectrum and are given for information only. Figure 8 illustrates this part in detail.

In the example of Figure 5, this part is positioned in the lower central part (area 503).

4 - A fourth part displays the result of the recommendation, namely one configuration or several configurations presented in ranked order. In the example of Figure 5, this part is positioned in the lower central part (area 504).

The example illustrated by Figures 5 to 8 can, as before, be placed in the context of medical imaging. The servers and their computing units are available for a limited time, for example only at night because they are used for another application during the day. A maximum training time of 12 hours is set. The loss of precision is left unconstrained. These two choices are made in the left column (501) of the interface. Apart from the time constraint, neither precision nor time is to be favored over the other: an identical weight of 5 out of 10 is assigned to each of these two criteria.

The recommended configuration is then seven computing units, for a training time of 9h48 with a variability of 1h05, which is well below the 12-hour maximum.

Area 504 can then present the result as follows, in the form of several ranked configurations:

[Table 10]

Note that all three configurations offer a training time of less than 12 hours.
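The filtering and ranking described in this first example can be sketched as follows. This is a minimal illustration under stated assumptions: the `Config` class, the `rank_configs` function, the min-max normalization and the candidate values are all hypothetical and do not reproduce Table 10. Only the 12-hour limit and the equal weights of 5 come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Config:
    units: int             # number of computing units
    train_hours: float     # estimated training time, in hours
    precision_loss: float  # estimated relative loss of precision


def rank_configs(configs, max_hours, w_time, w_precision):
    """Drop configurations exceeding max_hours, then sort by weighted cost."""
    feasible = [c for c in configs if c.train_hours <= max_hours]
    if not feasible:
        return []

    def normalized(value, values):
        lo, hi = min(values), max(values)
        return 0.0 if hi == lo else (value - lo) / (hi - lo)

    times = [c.train_hours for c in feasible]
    losses = [c.precision_loss for c in feasible]

    def cost(c):
        # Assumed cost: weighted sum of min-max normalized criteria (lower is better).
        return (w_time * normalized(c.train_hours, times)
                + w_precision * normalized(c.precision_loss, losses))

    return sorted(feasible, key=cost)


# Hypothetical candidate configurations.
candidates = [
    Config(5, 13.5, 1.5),
    Config(6, 11.3, 1.9),
    Config(7, 9.8, 2.2),
    Config(8, 8.6, 3.4),
    Config(9, 7.7, 4.9),
]

ranked = rank_configs(candidates, max_hours=12.0, w_time=5, w_precision=5)
for c in ranked:
    print(c.units, c.train_hours, c.precision_loss)
```

With equal weights, neither criterion dominates, so a configuration that is moderately good on both time and precision tends to rank ahead of one that is extreme on a single criterion.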

Figure 8 shows area 503 of the interface for this example: the precision is given for 19 values of the number of computing units, with an indication of the range of variability around the mean precision.

In a second example, no constraint is imposed on the time, but a constraint is imposed on the precision of the model. In the context of CE marking, a precision greater than 75% Dice score is required. Precision is therefore favored over the other criteria: a weight of 10 is assigned to precision, while a weight of only 5 is assigned to time. The recommended configuration is then four computing units, for a precision of 77.23% ± 0.57, which is well above the value required for CE marking. The first configuration choice can then be presented in the interface as follows:

[Table 11]

Figure 9 is a flowchart of a non-limiting example embodiment.

In a step S901, a knowledge domain comprising at least one observed configuration and a value of at least one performance indicator is obtained.

In a step S902, a parametric model is determined on the basis of the knowledge model. In a step S903, with a view to completing the knowledge domain, at least one configuration different from the configuration(s) of the knowledge domain is determined, together with the value of the at least one performance criterion on the basis of the parametric model. In a step S904, at least one configuration is selected from among the configurations of the completed knowledge domain that meets at least one constraint relating to the at least one performance indicator.

This selection can be made by a decision algorithm. In one particular mode, the decision algorithm chooses a single configuration that gives good or optimal performance. For example, the decision algorithm can apply a cost function taking into account one or more performance indicators and, optionally, weighting coefficients for the indicator(s). In another particular mode, a single configuration is selected by a user, who can be assisted by a decision algorithm that ranks one or more configurations on the basis of a function indicative of the performance of each configuration.
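Steps S901 to S904 can be sketched end to end for a single performance indicator, the training time. This is a minimal sketch under stated assumptions: the inverse of the training time is modeled as an affine function of the number of computing units, the observed values are hypothetical, and the final decision rule (the fewest computing units that satisfy the constraint) is only one possible choice among others. The names `fit_affine`, `complete_domain` and `select` are illustrative.

```python
def fit_affine(xs, ys):
    """Least-squares fit of y = k1 + k2 * x; returns (k1, k2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    k2 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - k2 * mean_x, k2


def complete_domain(observed, candidates):
    """S902 and S903: fit 1 / time = k1 + k2 * units, then estimate the
    training time of every candidate configuration not yet observed."""
    units = sorted(observed)
    k1, k2 = fit_affine(units, [1.0 / observed[u] for u in units])
    completed = dict(observed)
    for n in candidates:
        rate = k1 + k2 * n
        if n not in completed and rate > 0:
            completed[n] = 1.0 / rate
    return completed


def select(completed, max_hours):
    """S904: keep configurations meeting the time constraint and return the
    one using the fewest computing units (assumed decision rule)."""
    feasible = [n for n, hours in completed.items() if hours <= max_hours]
    return min(feasible) if feasible else None


# S901: hypothetical knowledge domain, number of units -> training time in hours.
observed = {2: 30.0, 4: 15.0, 8: 7.5}

completed = complete_domain(observed, range(1, 11))
selected = select(completed, max_hours=11.0)

# Signal representative of the selected configuration.
print(f"recommended configuration: {selected} computing units")
```

Observed configurations keep their measured values in the completed domain; only the missing configurations are filled in from the model.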

Références References

[1] Appenzoller, L.M., Michalski, J.M., Thorstad, W.L., Mutic, S. and Moore, K.L. (2012), Predicting dose-volume histograms for organs-at-risk in IMRT planning. Med. Phys., 39: 7446-7461. https://doi.Org/10.1118/l.4761864 [1] Appenzoller, L.M., Michalski, J.M., Thorstad, W.L., Mutic, S. and Moore, K.L. (2012), Predicting dose-volume histograms for organs-at-risk in IMRT planning. Med. Phys., 39: 7446-7461. https://doi.org/10.1118/l.4761864

[2] J. Long, E. Shelhamer and T. Darrell, "Fully convolutional networks for semantic segmentation," 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 3431-3440, doi: 10.1109/CVPR.2015.7298965. [2] J. Long, E. Shelhamer and T. Darrell, "Fully convolutional networks for semantic segmentation," 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 3431-3440, doi: 10.1109/CVPR.2015.7298965.

[3] Olaf Ronneberger, Philipp Fischer, Thomas Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation” Medical Image Computing and Computer- Assisted Intervention (MICCAI), Springer, LNCS, Vol.9351 : 234--241, 2015. https://arxiv.org/abs/1505.04597 [3] Olaf Ronneberger, Philipp Fischer, Thomas Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation” Medical Image Computing and Computer-Assisted Intervention (MICCAI), Springer, LNCS, Vol.9351: 234--241, 2015 https://arxiv.org/abs/1505.04597

[4] Nurmalini, N., and Robbi Rahim. "Study Approach of Simple Additive Weighting For Decision Support System." Int. J. Sci. Res. Sci. Technol 3.3 (2017): 541-544. [4] Nurmalini, N., and Robbi Rahim. "Study Approach of Simple Additive Weighting For Decision Support System." Int. J.Sci. Res. Science. Technol 3.3 (2017): 541-544.

[5] Menze B, Jakab A, Bauer S, Kalpathy-Cramer J, Farahani K, Kirby J, et al. The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS). IEEE Transactions on Medical Imaging 2014 Oct;34(10): 1993-2024. https://hal.inria.fr/hal- 00935640. [5] Menze B, Jakab A, Bauer S, Kalpathy-Cramer J, Farahani K, Kirby J, et al. The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS). IEEE Transactions on Medical Imaging 2014 Oct;34(10): 1993-2024. https://hal.inria.fr/hal-00935640.

Liens d’accès pour les données d’images IRM : http://medicaldecathlon.com/ et https://decathlon.grand-challenge.org/ Access links for MRI image data: http://medicaldecathlon.com/ and https://decathlon.grand-challenge.org/

[6] Tobon-Gomez C, Geers AJ, Peters J, Weese J, Pinto K, Karim R, et al. Benchmark for algorithms segmenting the left atrium from 3D CT and MRI datasets. IEEE transactions on medical imaging 2015;34(7): 1460-1473. [6] Tobon-Gomez C, Geers AJ, Peters J, Weese J, Pinto K, Karim R, et al. Benchmark for algorithms segmenting the left atrium from 3D CT and MRI datasets. IEEE transactions on medical imaging 2015;34(7): 1460-1473.

Lien d’accès pour les données d’images IRM : https://www.cardiacatlas.org/challenges/left- atrium-segmentation-challenge/ Access link for MRI image data: https://www.cardiacatlas.org/challenges/left-atrium-segmentation-challenge/

[7] Strubell, E., et al. “Energy and policy considerations for deep learning in NLP”. arXiv preprint arXiv: 1906.02243. (2019)). [7] Strubell, E., et al. “Energy and policy considerations for deep learning in NLP”. arXiv preprint arXiv: 1906.02243. (2019)).

[8] Andrade 2014, J.M., and M. G. Estévez- Pérez, “Statistical comparison of the slopes of two regression lines: A tutorial.” Analytica chimica acta 838 (2014): 1 - 12[9] Ge, Yaorong, and Q. Jackie Wu. "Knowledge-based planning for intensity-modulated radiation therapy: A review of data-driven approaches." Medical physics 46.6 (2019): 2760-2775. [8] Andrade 2014, J.M., and M. G. Estévez-Pérez, “Statistical comparison of the slopes of two regression lines: A tutorial.” Analytica chimica acta 838 (2014): 1 - 12[9] Ge, Yaorong, and Q. Jackie Wu. "Knowledge-based planning for intensity-modulated radiation therapy: A review of data-driven approaches." Medical physics 46.6 (2019): 2760-2775.

Claims

1. Method for determining configurations of a distributed system of computing units for the implementation of a deep learning algorithm, said determination implementing a recommendation based on a knowledge model, said method being implemented by a device comprising a processor which causes said device to implement the method, the method comprising: determining a deep learning algorithm to be implemented in the distributed system; obtaining (S901) a knowledge domain comprising, for said algorithm, at least one observed configuration and a value of at least one performance indicator for said at least one observed configuration; determining (S902) a parametric model of the performance indicator(s) based on the knowledge domain; determining (S903) at least one configuration different from said at least one observed configuration and estimating the value of the performance indicator(s) associated with said different configuration, on the basis of said model, in order to complete the knowledge domain; selecting (S904) at least one configuration from among the configurations of the completed knowledge domain that meets at least one constraint relating to the at least one performance indicator; and generating a signal representative of said at least one selected configuration.

2. Method according to claim 1, in which the selection is carried out by a decision algorithm according to a weighting of at least one performance indicator, the method further comprising the display of the configuration(s) selected by implementing said signal and receiving a command to identify a unique configuration among the selected configurations which will be applied to the distributed system.

3. Method according to claim 1, in which the selection is carried out by a decision algorithm which selects a single configuration.

4. Method according to one of claims 2 or 3, further comprising the reservation of computing units according to the unique configuration.

5. Method according to one of the preceding claims, further comprising estimating the variability of said at least one performance indicator for a different configuration.

6. Method according to one of the preceding claims, in which said at least one performance indicator of a configuration comprises one or more of: the required training time of the system for the configuration considered; the relative precision of calculation; the power consumed by the system for the configuration considered.

7. Method according to claim 6, in which the parametric models applied on the one hand to the inverse of the training time and on the other hand to the calculation precision are affine functions of a number of computing units of the distributed system implemented in the configuration considered, and the determination of the parametric model includes the determination of the parameters of the affine function associated with each performance indicator.

8. Method according to claim 6, in which the power consumed is an affine function of the square of the number of computing units of the distributed system to be used in the configuration envisaged.

9. Method according to one of the preceding claims, in which the decision algorithm classifies the filtered configurations according to a score depending on the performance indicator(s) and weighting coefficients of the latter.

10. Method according to one of the preceding claims, further comprising obtaining at least one constraint on an intrinsic characteristic of hardware configuration, said constraint being taken into account within the framework of the filtering of the configurations.

11. Method according to one of the preceding claims, in which the deep learning algorithm implements a convolutional neural network.

12. Method according to one of the preceding claims, the method being applied in the context of image segmentation.

13. Device (200) comprising a processor (201) and a memory (203), said memory comprising instructions which, when they are executed by the processor, lead the device to implement the steps of the method according to one of claims 1 to 12.

14. Computer program product (213) comprising instructions which, when the program is executed by a device (200) comprising a processor (201), cause the device to implement the steps of the method according to one of claims 1 to 12.