WO2025196213A1

WO2025196213A1 - Method and system for characterising microorganisms contained in a complex sample

Info

Publication number: WO2025196213A1
Application number: PCT/EP2025/057685
Authority: WO
Inventors: Benoît COURBON; Nicolas Faure; Nicolas SAPAY
Original assignee: Biomerieux SA; Centre National de la Recherche Scientifique CNRS; Bioaster; Universite Jean Monnet
Current assignee: Biomerieux SA; Centre National de la Recherche Scientifique CNRS; Bioaster; Universite Jean Monnet
Priority date: 2024-03-22
Filing date: 2025-03-20
Publication date: 2025-09-25
Anticipated expiration: 2026-09-22

Abstract

The disclosed method for classifying microorganisms contained in a sample comprises: preparing a slide of the sample; acquiring at least one digital image of the slide; and implementing, with a computer, a model for predicting the class of the microorganisms depending on the acquired image. According to the invention, said image is subdivided into sub-images and each sub-image is subdivided into patches, and: A. for each patch, a microorganism feature extractor is applied, the feature extractor forming a convolutional part of a first convolutional neural network trained on patches annotated individually with at least one class; B. for each sub-image, a second neural network connected to the extractor is applied, the second neural network comprising an upstream pooling layer and one or more downstream layers comprising a layer for predicting at least one class, and being trained on globally annotated training sub-images; and C. for the acquired image: C.a. calculating a feature vector for the sub-images; C.b. applying a model for predicting at least one class for the microorganisms.

Description

METHOD AND SYSTEM FOR CHARACTERIZING MICROORGANISMS CONTAINED IN A COMPLEX SAMPLE

FIELD OF THE INVENTION

The present invention relates to the field of microbiological analysis, in particular the automatic characterization of microorganisms contained in a complex sample such as blood. Without limitation, the invention applies to characterizing the Gram type, morphology and aggregation of bacteria, yeasts and fungi infecting a patient or an animal, in particular patients suspected of sepsis.

STATE OF THE ART

Microbiological analysis faces many challenges when it departs from conventional techniques based on culture on or in a nutrient medium, for example on Petri dishes. The detection of sepsis triggered by a bacterial infection of the blood is a well-known example. The presence of pathogenic bacteria in the blood can lead to the death of a patient, so detecting them as early as possible is vital. Unfortunately, at an early stage, the bacteria are not only present in tiny quantities but are also embedded in a complex matrix comprising white blood cells, red blood cells, platelets, and many other elements that are significantly larger and more numerous, which to date makes their detection difficult, if not impossible. This is why a blood culture, namely a step of growing the bacteria present in the blood after adding a nutrient medium chosen to accelerate their development as much as possible, is started for any patient suspected of sepsis. The goal of this growth step is to multiply the bacteria as quickly as possible so that they become detectable. Unfortunately, blood culture is still too slow a process given the severity of the infection.

Sample complexity also makes microbiological analysis difficult in applications that were once routine in laboratories. This is particularly the case for Gram characterization of bacteria, whether the sample is complex or not. Gram characterization requires considerable human expertise, since it relies on the microscopic analysis of Gram slides by a qualified technician. However, this expertise is being lost in a large number, perhaps even the majority, of analysis laboratories, which degrades quality or even leads de facto to the analysis no longer being performed. Yet this analysis, carried out very early once a sample is received, allows the clinician to make very rapid, and sometimes decisive, decisions on antibiotic therapy.

The spread of machine learning techniques, and in particular deep learning based on neural networks, is a source of hope. These techniques have been successfully applied to complex images, for example for the detection of lung tumors. A similar advance is therefore expected in microbiology: the characterization of microorganisms present in a sample from an image of that sample. However, this hope comes up against a resource barrier so significant that an organization, however large, has the greatest difficulty developing an automatic characterization solution based on artificial intelligence. Because of the size of the bacteria, their number at an early stage and the complexity of the sample, a large surface must be imaged at very high resolution. In practice, this is achieved by combining several raw images of typically a few million pixels each, which are not necessarily overlapping or even contiguous, into a single composite image, simply called the "image" hereinafter. The resulting image typically totals several tens of millions of pixels to several billion pixels. Even if convolutional neural network techniques that reduce the dimensionality of the problem were used, the final dimensionality would remain so large that training conventional architectures would require a very large number of annotated images, probably millions. In addition to this astronomical quantity, annotation cannot be carried out by non-specialists, as is the case, for example, for models available on Google Zoo or models trained through crowdsourced ("uberized") annotation on dedicated platforms. Yet, as mentioned above in the case of Gram characterization, experts are lacking.

To date, therefore, the development of automatic characterization tools based on artificial intelligence in the field of microbial analysis, particularly in the field of In Vitro Diagnostics ("IVD"), remains difficult.

The aim of the present invention is to provide a method for the automatic characterization of microorganisms present in a sample, based on the analysis by artificial intelligence of very high resolution images, this analysis being implemented by a deep learning architecture trained on a reduced training set and enabling a high-performance diagnosis, in particular a diagnosis that can qualify as an In Vitro Diagnostic.

To this end, the invention relates to a method for classifying microorganisms contained in a sample, among several classes of microorganisms, the method comprising:

A. preparing a slide, in particular a microscope slide, comprising spreading the sample on said slide;

B. acquiring at least one digital image of the slide with micrometric or sub-micrometric resolution;

C. the computer-implemented application of a model for predicting the class of the microorganisms based on the acquired image.

According to the invention, said image is subdivided into a plurality of sub-images and each sub-image is subdivided into patches, and the application of the prediction model comprises:

D. for each patch, applying a microorganism feature extractor, said extractor comprising a convolutional part of a first convolutional neural network, said first network being trained on a set of training patches comprising microorganisms, said microorganisms being individually annotated with at least one class;

E. for each sub-image, applying a second neural network, connected to receive the features extracted from the patches making up said sub-image, said second network comprising an upstream pooling layer and one or more downstream layers comprising a layer predicting at least one class, in the form of a score, said second network being trained on training sub-images comprising microorganisms, each training sub-image being globally annotated with at least one class; and

F. for the acquired image:

F. a. calculating a feature vector for the sub-images;

F. b. applying a model for predicting at least one class for the microorganisms present in the sample based on the feature vector, said prediction model being trained on feature vectors calculated from training images.
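The two-level subdivision used by steps D to F can be sketched as follows. This is only an illustration under stated assumptions: the tiles are taken to be square and non-overlapping, incomplete tiles at the right and bottom edges are dropped, and the function names `tile` and `subdivide` as well as the 1024-pixel sub-image side are hypothetical choices that do not come from the text.

```python
from typing import Dict, Iterator, List, Tuple

# A box is (x0, y0, x1, y1) in pixel coordinates, upper bounds excluded.
Box = Tuple[int, int, int, int]


def tile(width: int, height: int, side: int,
         x_off: int = 0, y_off: int = 0) -> Iterator[Box]:
    """Yield non-overlapping side x side boxes covering a width x height area.

    Assumption: incomplete tiles at the right and bottom edges are dropped.
    """
    for y in range(0, height - side + 1, side):
        for x in range(0, width - side + 1, side):
            yield (x_off + x, y_off + y, x_off + x + side, y_off + y + side)


def subdivide(width: int, height: int,
              sub_side: int = 1024, patch_side: int = 256) -> Dict[Box, List[Box]]:
    """Map each sub-image box to the list of patch boxes it contains."""
    layout: Dict[Box, List[Box]] = {}
    for sx0, sy0, sx1, sy1 in tile(width, height, sub_side):
        # Patches are expressed in the coordinates of the full image.
        layout[(sx0, sy0, sx1, sy1)] = list(
            tile(sx1 - sx0, sy1 - sy0, patch_side, sx0, sy0)
        )
    return layout


# Hypothetical 4096 x 2048 image: 8 sub-images of 16 patches each.
layout = subdivide(4096, 2048)
```

Each patch box would then be cropped and passed to the feature extractor of step D, and the features of all patches sharing a sub-image key would be passed together to the second network of step E.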

In other words, the invention proposes an artificial intelligence architecture with several specific stages, characterized by increasingly weak annotation from one stage to the next. This architecture is trained with only a few hundred high-resolution sub-images subdivided into patches of reduced size (for example 256 pixels by 256 pixels or 224 pixels by 224 pixels), of which a tiny fraction (for example less than 5%) are strongly annotated, while achieving an overall prediction accuracy of more than 90%.

According to one embodiment, the feature vector is a distribution of the scores calculated for the sub-images.

According to one embodiment, each training image is subdivided into sub-images and each of said sub-images is subdivided into patches, and less than 50% of the patches of the training images are annotated, said annotated patches forming the training patches of the first convolutional network. In particular, less than 10% of the patches of the training images are annotated.

According to one embodiment, the training of the first and second convolutional networks and of the prediction model is configured to obtain a macro prediction specificity greater than or equal to 90%.

According to one embodiment, the prediction model of step F is a "random forest" model.

According to one embodiment, the upstream pooling layer is associated with an attention layer configured to apply a weight to the output of each extractor.

According to one embodiment, the second network is a MIL-CNN network trained on batches of instances, the batches of instances consisting of the training patches.
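The attention-weighted pooling embodiment can be illustrated with a small sketch. The text only states that a weight is applied to the output of each extractor; the linear scoring followed by a softmax used here is an assumption, and the names `attention_pool` and `w` are hypothetical. In a real network `w` would be learned during training of the second network.

```python
import math
from typing import List, Sequence, Tuple


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features: Sequence[Sequence[float]],
                   w: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Pool per-patch feature vectors into one sub-image vector.

    Assumption: the raw attention score of a patch is the dot product of its
    feature vector with a parameter vector w; the weights are the softmax of
    these scores, so they are positive and sum to 1.
    """
    raw = [sum(f_d * w_d for f_d, w_d in zip(f, w)) for f in features]
    weights = softmax(raw)
    dim = len(features[0])
    pooled = [sum(a * f[d] for a, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights


# Two hypothetical 2-dimensional patch feature vectors.
patch_features = [[1.0, 0.0], [0.0, 1.0]]
pooled_uniform, weights_uniform = attention_pool(patch_features, [0.0, 0.0])
pooled_focused, weights_focused = attention_pool(patch_features, [10.0, 0.0])
```

With a zero parameter vector the pooling reduces to a plain average of the patches; with a parameter vector that favors the first feature, nearly all the weight goes to the first patch. The pooled vector would then feed the downstream prediction layers.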

According to one embodiment, the feature vector of step F. a comprises, for each class:

- the maximum score among the sub-images; and/or

- the X-th percentile of the scores among the sub-images, with X greater than or equal to 90, preferably equal to 95; and/or

- the Y-th percentile of the scores among the sub-images, with Y less than or equal to 10, preferably equal to 5; and/or

- the median score among the sub-images.

According to one embodiment, the feature vector further comprises the number of sub-images in the image.
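The feature vector described in this embodiment can be sketched as below. This is a minimal illustration, assuming that all four statistics are used together, that percentiles are computed by linear interpolation between closest ranks (the text does not specify a convention), and that the statistics are concatenated class by class with the sub-image count last. The function names are hypothetical.

```python
from typing import List, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (q in [0, 100]) with linear interpolation."""
    s = sorted(values)
    if len(s) == 1:
        return s[0]
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)                      # floor, since pos >= 0
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def slide_feature_vector(scores: Sequence[Sequence[float]], n_classes: int,
                         x: float = 95.0, y: float = 5.0) -> List[float]:
    """Summarize per-sub-image class scores into one vector for the image.

    scores[i][c] is the score of class c for sub-image i. For each class the
    vector holds: maximum, x-th percentile, y-th percentile, median. The last
    entry is the number of sub-images.
    """
    vec: List[float] = []
    for c in range(n_classes):
        col = [row[c] for row in scores]
        vec += [max(col), percentile(col, x), percentile(col, y),
                percentile(col, 50.0)]
    vec.append(float(len(scores)))
    return vec


# Hypothetical scores for 3 sub-images and 2 classes.
example_scores = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
example_vector = slide_feature_vector(example_scores, n_classes=2)
```

The resulting fixed-length vector does not depend on the number of sub-images, which is what allows a conventional classifier such as the random forest mentioned above to be applied at the image level.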

According to one embodiment, the first convolutional network is pre-trained on images that do not contain microorganisms and then trained on annotated training patches. In particular, the pre-trained first convolutional network is a VGG16 or ResNet network.

According to one embodiment, the microorganisms comprise bacteria and the classes comprise at least Gram-positive and Gram-negative, and preparing the slide comprises preparing a Gram slide. In particular, the classes of microorganisms further comprise morphotype classes. Notably, the classes of microorganisms comprise a "neither bacteria nor yeast" class, a "Gram-negative bacilli" class, a "Gram-positive bacilli" class, a "Gram-positive coryneform bacilli" class, a "Gram-negative cocci" class, a "Gram-positive cocci in chains" class, a "Gram-positive cocci in clusters" class and a "yeasts" class.

According to one embodiment, the sample comprises blood; in particular, the sample is a positive blood culture.

The invention also relates to a method for training a model for predicting a class of microorganisms among several classes of microorganisms from a digital image of a slide on which a sample likely to comprise microorganisms is spread, said training method comprising:

A. building a training database comprising digital slide images annotated with one or more classes of microorganisms, the slide images being divided into a plurality of sub-images annotated with one or more classes of microorganisms and the training sub-images being subdivided into patches, at least some of the patches being annotated with one or more classes of microorganisms;

B. training a first convolutional neural network on the annotated patches;

C. training a second neural network from the annotated database of sub-images, said second network comprising at least a patch feature extractor, a stage for pooling the patch features, and a stage for predicting the class or classes of microorganisms present in the sub-image;

D. building a database of distributions of the classes present in the training slides based on the classes predicted by the second neural network applied to the annotated sub-images;

E. training a model for predicting at least one class of microorganisms present in a slide based on spatial distributions of the classes, wherein the predictor comprises the convolutional part of the first neural network, downstream of which the second neural network is connected, downstream of which the prediction model is connected.

In a variant of step C, the training method comprises training a second neural network comprising an upstream pooling layer and one or more downstream layers for predicting the microorganism class, the function of the second network being to predict the class or classes of a sub-image from feature vectors produced by a patch feature extractor, the second network being trained from the annotated sub-images of the database and the feature vectors of the patches of those annotated sub-images.

In particular, the first network is trained on the classes of microorganisms to which an "ambiguous" class is added, an annotated patch also being annotated with this class when there is uncertainty about the objects present in the patch. In particular, training the second network comprises training a MIL-CNN network, the trained MIL portion of the MIL-CNN network constituting the trained second neural network.
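One possible way to represent patch annotations that carry both a microorganism class and the additional "ambiguous" class is a multi-hot label vector, sketched below. The multi-hot encoding itself is an assumption: the text only states that a patch is annotated with at least one class and may also be annotated as ambiguous. The class names follow the embodiment listed earlier in the description, and `encode_patch_labels` is a hypothetical helper.

```python
from typing import Iterable, List

# Microorganism classes named in the description, plus the added class.
CLASSES: List[str] = [
    "neither bacteria nor yeast",
    "Gram-negative bacilli",
    "Gram-positive bacilli",
    "Gram-positive coryneform bacilli",
    "Gram-negative cocci",
    "Gram-positive cocci in chains",
    "Gram-positive cocci in clusters",
    "yeasts",
    "ambiguous",
]


def encode_patch_labels(labels: Iterable[str]) -> List[int]:
    """Return a multi-hot vector with one entry per class in CLASSES."""
    label_set = set(labels)
    unknown = label_set - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown class labels: {sorted(unknown)}")
    return [1 if name in label_set else 0 for name in CLASSES]


# A patch annotated as Gram-negative bacilli, with some uncertainty.
target = encode_patch_labels(["Gram-negative bacilli", "ambiguous"])
```

Such a target vector is compatible with a per-class sigmoid output trained with a binary cross-entropy loss, which is a common choice for multi-label problems; the text does not state which loss is used.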

The invention also relates to a method for predicting a class of microorganisms contained in a sample, among several classes of microorganisms, the method comprising:

C. the computer-implemented application of a model for predicting the class of the microorganisms in a sample spread on a slide based on an acquired image of said slide.

According to the invention, said image is subdivided into sub-images and each sub-image is subdivided into patches, and the application of the prediction model comprises:

D. for each patch, applying a microorganism feature extractor, said extractor comprising a convolutional part of a first convolutional neural network, said first network being trained on a set of training patches comprising microorganisms, said microorganisms being individually annotated with at least one class;

E. for each sub-image, applying a second neural network, connected to receive the features extracted from the patches making up said sub-image, said second network comprising an upstream pooling layer and one or more downstream layers predicting at least one class, in the form of a score, said second network being trained on training sub-images comprising microorganisms, each training sub-image being globally annotated; and

F. for the acquired image:

F. a. calculating a feature vector based on the scores calculated for the sub-images;

F. b. applying a model for predicting at least one class for the microorganisms present in the sample based on the feature vector, said prediction model being trained on feature vectors calculated from training images.

L’invention a également pour objet un système de prédiction de prédiction d’une classe de microorganismes contenus dans un échantillon, parmi plusieurs classes de microorganismes, le système comprenant une unité informatique configurée pour mettre en œuvre un modèle de prédiction de la classe des microorganismes dans un échantillon étalé sur une lame en fonction d’une image acquise de ladite lame. Selon l’invention, ladite image est subdivisée en sous- images et chaque sous-image est subdivisée en patchs, et l’application du modèle de prédiction comporte : The invention also relates to a system for predicting a class of microorganisms contained in a sample, among several classes of microorganisms, the system comprising a computer unit configured to implement a model for predicting the class of microorganisms in a sample spread on a slide as a function of an image acquired from said slide. According to the invention, said image is subdivided into sub- images and each sub-image is subdivided into patches, and the application of the prediction model involves:

F. for the acquired image: a. calculating a feature vector as a function of the scores calculated for the sub-images; b. applying a model for predicting at least one class for the microorganisms present in the sample as a function of the feature vector, said prediction model being trained on feature vectors calculated from training images.

The invention also relates to computer program products comprising a computer memory storing computer-readable instructions for implementing steps D to F above or steps A to E above.

The invention also relates to a method for predicting a class of objects contained in a digital image, among several classes of objects, in which the digital image is subdivided into a plurality of sub-images and each sub-image is subdivided into patches, the method comprising:

D. for each patch, applying an object feature extractor, said extractor comprising a convolutional part of a first convolutional neural network, said first network being trained on a set of training patches comprising objects, said objects being individually annotated with at least one class;

E. for each sub-image, applying a second neural network connected to receive the features extracted from the patches constituting said sub-image, said second network comprising an upstream pooling layer and one or more downstream layers predicting at least one class in the form of a score, said second network being trained on training sub-images comprising objects, each training sub-image being globally annotated; and

F. for the acquired image: a. calculating a feature vector as a function of the scores calculated for the sub-images; b. applying a model for predicting at least one class for the objects present in the sample as a function of the feature vector, said prediction model being trained on feature vectors calculated from training images.
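By way of illustration only, the data flow of steps D to F can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the patch extractor, the weights `W_SUB`, the mean pooling, the mean/max summary of the scores and the final arg-max rule are all hypothetical stand-ins for the trained convolutional extractor, the trained second network and the trained image-level model.

```python
import numpy as np

rng = np.random.default_rng(0)
N_CLASSES = 8      # eight slide classes in the described embodiment
FEATURE_DIM = 16   # hypothetical size of a patch feature vector
# Hypothetical stand-in for the trained weights of the second network.
W_SUB = rng.normal(size=(FEATURE_DIM, N_CLASSES))


def extract_patch_features(patch: np.ndarray) -> np.ndarray:
    """Step D stand-in: a trained convolutional extractor would go here."""
    stats = np.array([patch.mean(), patch.std()])
    return np.resize(stats, FEATURE_DIM)  # repeat the two statistics


def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()


def score_sub_image(patches: list) -> np.ndarray:
    """Step E: pool the patch features, then predict per-class scores."""
    feats = np.stack([extract_patch_features(p) for p in patches])
    pooled = feats.mean(axis=0)  # upstream pooling layer (mean pooling assumed)
    return softmax(pooled @ W_SUB)


def image_feature_vector(sub_image_scores: list) -> np.ndarray:
    """Step F.a: summarise the distribution of the sub-image scores."""
    s = np.stack(sub_image_scores)
    return np.concatenate([s.mean(axis=0), s.max(axis=0)])


def predict_image_class(feature_vector: np.ndarray) -> int:
    """Step F.b stand-in: a trained image-level model would go here."""
    return int(np.argmax(feature_vector[:N_CLASSES]))


# Toy image: 3 sub-images of 4 random RGB patches each.
image = [[rng.random((8, 8, 3)) for _ in range(4)] for _ in range(3)]
scores = [score_sub_image(patches) for patches in image]
predicted = predict_image_class(image_feature_vector(scores))
```

The point of the sketch is the nesting: patch features feed the sub-image scores, and only the scores (not the pixels) reach the image-level model.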

BRIEF DESCRIPTION OF THE FIGURES

The invention will be better understood on reading the following description, given solely by way of example and with reference to the appended drawings, in which:

- Figure 1 illustrates the classes of microorganisms present on a Gram slide that are predicted by the invention;

- Figure 2 is a flowchart of a microbiological analysis laboratory workflow implementing the invention;

- Figure 3 is a schematic view of a system for acquiring images of Gram slides;

- Figure 4 illustrates the different scales of a Gram slide image used by the invention;

- Figure 5 is a flowchart detailing the operation of the three-stage prediction model according to the invention, based on the different scales of Figure 4;

- Figures 6A-C are schematic views of patch feature extractors comprising the convolutional part of a convolutional neural network, in this example a network with a VGG16 architecture;

- Figure 7 is a schematic view of a predictive model at the sub-image level, formed in this example by the MIL part of a MIL-CNN network with an attention mechanism;

- Figure 8 is a schematic view of the predictive model at the image level, based on a Random Forest type prediction stage analyzing a distribution of descriptors provided by the prediction stage at the sub-image level;

- Figure 9A is a schematic view illustrating the display of presumptive microorganism zones in the image of the Gram slide, or in one or more of the sub-images constituting it, as a function of the weights calculated by the attention mechanism of the predictive model at the sub-image level, and the display of the predictions of the predictive model operating at the patch level;

- Figure 9B illustrates two sub-images analyzed by the convolutional neural network from which the feature extractor for the patches constituting the sub-images is derived, and the predictions of said neural network displayed in the form of "heat maps";

- Figure 10 is a flowchart illustrating the training of the three-stage predictive model according to the invention;

- Figures 11A and 11B illustrate the performance of the three-stage predictive model according to the invention in classifying Gram slide images acquired by RGB imaging;

- Figures 12A and 12B illustrate the performance of the three-stage predictive model according to the invention in classifying Gram slide images acquired by holographic imaging; and

- Figures 13 to 15 illustrate different computer architectures for implementing the three-stage predictive model according to the invention and the workflow of a microbiology laboratory using it.

DETAILED DESCRIPTION OF THE INVENTION

A. METHOD AND SYSTEM FOR CHARACTERIZING THE GRAM OF BACTERIA ON A GRAM SLIDE

An embodiment of the invention will now be described, namely a microbiology laboratory workflow for the IVD characterization, in particular of the Gram, of bacteria in a patient suspected of septicemia, the workflow being based on the analysis of the RGB image of a Gram slide produced from a positive blood culture.

In particular (Figure 1), this embodiment comprises the automatic flat classification of a Gram slide image 10 into eight mutually exclusive classes, namely a slide comprising neither bacteria nor yeast, a slide comprising Gram-negative bacilli, a slide comprising Gram-positive bacilli, a slide comprising coryneform Gram-positive bacilli, a slide comprising Gram-negative cocci, a slide comprising Gram-positive cocci in chains, a slide comprising Gram-positive cocci in clusters, and a slide comprising yeasts.

Referring to Figures 2 and 3, this workflow 20 begins with the production of a sample 22 from a blood sample taken from the patient, here a positive blood culture, for example carried out using a bottle of BACT/ALERT® medium cultured in the applicant's BACT/ALERT® VIRTUO® system. As is known per se, blood culture consists in multiplying the number of bacteria and yeasts initially contained in a blood sample in order to make them detectable (identification of the "presence" or "absence" of bacteria or yeast) and to facilitate their subsequent characterization thanks to a greater biomass. To do this, the blood is mixed with a culture medium, for example based on trypticase soy, supplemented with absorbent polymer beads. The blood culture sample is therefore complex in that it comprises the elements naturally present in blood (red blood cells, white blood cells, platelets, fibrinogen, etc.) as well as the elements specific to blood culture. Although the number of bacteria and yeasts has been multiplied, they may still constitute a tiny part of the sample and be masked by, or confused with, elements that are heterogeneous, in particular in number, shape, size and color.

The method continues, at 24, with the production of a Gram slide from the positive blood culture. This production comprises the smearing, at 240, of a fraction of the blood culture so as to obtain a smear thickness preferably less than 10 µm, this thickness corresponding to the depth of field of a microscope with a magnification of 1000 used later in the workflow. By matching the thickness to said depth of field, only a two-dimensional inspection of the smear is necessary, which facilitates the analysis described below. Once smeared, the sample undergoes Gram staining as known per se, for example carried out automatically using the PREVI® COLOR GRAM instrument marketed by the Applicant, the purpose of this staining being to color the bacteria differently according to their Gram.

Once dried and covered with a coverslip, the Gram slide is positioned, at 26, in a microscope 40 (Figure 3) with a high-magnification oil immersion objective 42, between 60x and 100x, coupled to a Köhler-type illumination system 44 (illumination with white incoherent light in at least the 400 nm to 900 nm range) and to an RGB imaging system 46. The system 46 comprises, for example, a two-dimensional sensor of CMOS photosites, covered with a Bayer matrix for producing color images in a manner known per se and placed in the image plane of the objective 42. The optical system of the microscope and the imaging system are chosen and/or controlled so that bacteria and yeasts, which are objects of a few hundred nanometers to a few tens of micrometers, cover at least 5 pixels, preferably at least 10 pixels, in the image obtained.

Since the field of view of such an objective is limited, the Gram slide 48 is advantageously placed on a mobile support 50 that can be moved by means of a stage with piezoelectric motors 52, allowing the slide to be moved in the (x,y) plane perpendicular to the optical axis z of the objective 42. This stage is connected to a computer unit 54, which is also connected to the sensor 46 and which coordinates the movement of the support 50, and therefore of the slide 48, with the image capture by the sensor 46, in order to image the entire slide 48, or at least the entire surface of the slide on which the sample 56 is spread. Preferably, the raw images acquired from the slide partially overlap in order to avoid edge effects, for example bacteria, clusters or chains of bacteria being cut off.
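As an illustration of how partially overlapping fields of view can cover the smear, the following minimal sketch computes stage positions along one axis and combines them into an (x, y) grid. The function names, the field-of-view size and the 25% overlap are hypothetical values chosen for the example and do not come from the description.

```python
import math
from itertools import product


def scan_positions(extent_um: float, fov_um: float, overlap_fraction: float) -> list:
    """Left-edge positions of fields of view covering [0, extent_um] along one axis."""
    if not 0 <= overlap_fraction < 1:
        raise ValueError("overlap_fraction must be in [0, 1)")
    if fov_um >= extent_um:
        return [0.0]
    step = fov_um * (1 - overlap_fraction)  # displacement between two captures
    n = math.ceil((extent_um - fov_um) / step) + 1
    # Clamp the last position so the final field of view ends at the border.
    return [min(i * step, extent_um - fov_um) for i in range(n)]


def scan_grid(width_um: float, height_um: float, fov_um: float, overlap_fraction: float) -> list:
    """(x, y) stage positions for a raster scan with overlapping fields of view."""
    xs = scan_positions(width_um, fov_um, overlap_fraction)
    ys = scan_positions(height_um, fov_um, overlap_fraction)
    return list(product(xs, ys))


# Hypothetical example: 1000 µm x 500 µm region, 100 µm field of view, 25% overlap.
grid = scan_grid(1000.0, 500.0, 100.0, 0.25)
```

Each position in `grid` would be sent to the stage before triggering the sensor; the overlap ensures that an object cut by the border of one raw image appears whole in a neighboring one, provided it is smaller than the overlap width.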

The collection of images can be kept as is to form the set of "sub-images", the collection then corresponding to an "image" as described below; alternatively, a single image of the slide can be obtained by computer reconstruction in a manner known per se, or the collection of images can be re-divided into "sub-images". According to the invention, three scales are obtained for a Gram slide: an image (composite or not), sub-images constituting the image, and patches constituting the sub-images. This collection of digital images, and preferably each of these digital images, covers a sufficiently large area of the slide to be a priori representative of the population of microorganisms present in the initial sample, so that the class or classes of microorganisms present on the slide are a priori present somewhere in the image.

Referring again to Figure 2, an image of the slide with microscopic or submicroscopic resolution (in this example, a high-resolution composite image, or "ICHR" image) is therefore produced at step 28 of the workflow. By way of illustration, for Gram slides conventionally used in the laboratory, with dimensions of 25 mm by 75 mm, this results in an image of at least several tens of millions of pixels (150 million pixels with the inventors' prototype), with a lateral pixel dimension corresponding to 40 nm, each pixel being coded on 16 bits per color. In particular, the acquisition system 40, 46 produces a few hundred pixels for a bacillus of the order of a micrometer.
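The order of magnitude quoted for a bacillus can be checked with a short calculation. The sketch below assumes, purely for illustration, that the object is approximated by a rectangular footprint; the helper name `pixels_covered` is hypothetical.

```python
def pixels_covered(width_um: float, height_um: float, pixel_nm: float) -> int:
    """Approximate number of pixels covered by a rectangular footprint."""
    px_w = width_um * 1000.0 / pixel_nm   # footprint width in pixels
    px_h = height_um * 1000.0 / pixel_nm  # footprint height in pixels
    return round(px_w * px_h)


# A 1 µm x 1 µm footprint sampled at 40 nm per pixel spans 25 x 25 pixels.
n_pixels = pixels_covered(1.0, 1.0, 40.0)
```

With a 40 nm pixel, a one-micrometer footprint covers 625 pixels, which is in line with the "few hundred pixels" stated above and well above the minimum of 5 to 10 pixels required of the optical system.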

The laboratory workflow 20 continues with an automatic, computer-implemented step 30 of analyzing the image produced in order to characterize at least the Gram of the bacteria present in it, and more specifically to predict the class or classes of the Gram slide among the classes described above. In particular, a "multi-class" prediction step 300 having a three-stage architecture as described below is implemented. This prediction generates a vector whose components correspond to the probabilities of the classes, or at least to scores between 0 and 1 whose sum is equal to 1, as well as a confidence index ("IC") associated with the prediction. The specificity of the prediction is optimized so that if the index IC exceeds a predetermined confidence threshold (step 302), no further characterization test of the slide is necessary. The result of the prediction can thus be pushed directly to the clinician (step 32), for example by means of a report received by email or a notification on a smartphone or equivalent, and the clinician can on this basis choose an appropriate antibiotic therapy to administer to the patient (step 34).
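The thresholding of step 302 can be sketched as follows. The description does not say how the confidence index is computed, so this sketch assumes, as a labeled simplification, that it equals the highest class score; the class labels, the `Decision` type and the threshold value of 0.9 are likewise hypothetical.

```python
from dataclasses import dataclass

# Hypothetical labels for the eight slide classes of the embodiment.
CLASS_NAMES = (
    "no bacteria or yeast",
    "Gram-negative bacilli",
    "Gram-positive bacilli",
    "coryneform Gram-positive bacilli",
    "Gram-negative cocci",
    "Gram-positive cocci in chains",
    "Gram-positive cocci in clusters",
    "yeasts",
)


@dataclass(frozen=True)
class Decision:
    predicted_class: str
    confidence: float
    auto_report: bool  # True: push to the clinician; False: manual review


def decide(scores: list, threshold: float = 0.9) -> Decision:
    """Apply the confidence threshold of step 302 to a vector of class scores."""
    if len(scores) != len(CLASS_NAMES):
        raise ValueError("one score per class is expected")
    if abs(sum(scores) - 1.0) > 1e-6:
        raise ValueError("scores are expected to sum to 1")
    best = max(range(len(scores)), key=scores.__getitem__)
    confidence = scores[best]  # assumption: confidence index = top score
    return Decision(CLASS_NAMES[best], confidence, confidence > threshold)


confident = decide([0.01, 0.93, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01])
uncertain = decide([0.30, 0.40, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05])
```

In this sketch, a slide whose decision has `auto_report` set to `False` would be routed to the technician-assisted reading rather than reported automatically.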

Advantageously, but optionally, the prediction architecture includes an attention mechanism described below. In addition to making the prediction more interpretable, this mechanism makes it possible to identify areas of the Gram slide image likely to contain microorganisms. Thus, where the mechanism has focused its attention, that is to say where it has produced a higher weighting, presumptive areas of microorganism presence are displayed, at 304, on a computer screen as an overlay on the Gram slide image, for a laboratory technician who is an expert in slide interpretation. The technician, aided by the display, can then manually characterize the slide (step 36). Optionally, or in addition, the technician can also observe the slide directly through the microscope and characterize it in the conventional manner. Note that this display can be produced for the entire image or broken down, for example onto each of the sub-images, to make it easier to read.
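One simple way to turn per-patch attention weights into an overlay is sketched below. It is an assumption-laden illustration: the patches are taken to form a regular, non-overlapping grid, and the quantile rule used to select the "presumptive" patches is a hypothetical choice, not one given in the description.

```python
import numpy as np


def presumptive_zone_mask(attention, grid_shape, patch_size, quantile=0.9):
    """Boolean pixel mask marking the patches with the highest attention weights."""
    weights = np.asarray(attention, dtype=float).reshape(grid_shape)
    threshold = np.quantile(weights, quantile)
    patch_mask = weights >= threshold
    # Expand each grid cell to its pixel footprint (non-overlapping patches assumed).
    rows = np.repeat(patch_mask, patch_size, axis=0)
    return np.repeat(rows, patch_size, axis=1)


# Toy sub-image made of 2 x 2 patches of 4 x 4 pixels; one patch dominates.
mask = presumptive_zone_mask([0.1, 0.1, 0.7, 0.1], (2, 2), patch_size=4, quantile=0.75)
```

The resulting mask can be alpha-blended over the sub-image so that the technician's eye is drawn to the highlighted patches.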

Advantageously, but optionally, the prediction architecture also includes an intermediate prediction step at the level of the patches that make up the image and its sub-images. Here again, the predictions for the different patches can be presented to the laboratory technician as an overlay on the image, which not only directs the technician's attention to the presumptive areas but also provides presumptive information on the class of the microorganisms present in the different areas of the image. Here again, the display can be broken down into sub-images.

As described below, the prototype developed by the inventors, trained on fewer than 600 slides, has a macro accuracy of 95% for a slide rejection rate of 17%. Although it can be greatly improved, this prototype already makes it possible to automate the Gram reading of a large proportion of blood culture samples, while providing information beyond the Gram alone (bacillus, coccus, cluster, chain, yeast, etc.). The time saved by such automation effectively compensates for the growing scarcity of human expertise in this field.

B. MULTI-STAGE PREDICTION ARCHITECTURE FOR THE AUTOMATIC ANALYSIS OF COMPLEX IMAGES, IN PARTICULAR OF GRAM SLIDES

An embodiment of the multi-stage architecture of a predictive model (or "predictor") according to the invention will now be described in more detail. Although this architecture is described in relation to the analysis of an RGB image, it can also be implemented for the analysis of images of a different nature, in particular holographic images as will be described below, or multispectral or hyperspectral images.

The invention takes advantage of the very high resolution of the ICHR image, which can be divided into several tens, preferably at least a hundred, sub-images, each sub-image being in turn subdivided into several tens of patches, each patch having pixel dimensions sufficient to contain a microorganism or a morphotype (cluster, chain, etc.) in its entirety. Such a subdivision is illustrated in Figure 4, which shows a regular subdivision of the image into sub-images and of the sub-images into patches. In particular, in the sub-image and the patch illustrated, the bacilli correspond to the dark objects, are smaller than the patch, and are entirely contained in the patch. For example, the patches have a dimension of 256 pixels by 256 pixels (or 224 by 224), corresponding to a real surface of approximately 10 × 10 µm², whereas the typical maximum dimension of bacteria and yeasts is of the order of a micrometer, and the sub-images have a dimension of 1024 pixels by 1536 pixels. In a preferred variant of the invention, neighboring patches partially overlap in order to avoid edge effects when training the predictor at the patch level. Similarly, neighboring sub-images partially overlap for the same reason. Alternatively, the sub-images do not cover the entire ICHR image and do not overlap, so as to better cover the diversity linked to the inhomogeneity of the smear.
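The subdivision into patches, with or without overlap, can be illustrated by the following minimal sketch. The helper names and the 50% overlap are hypothetical, and treating 1024 as the height and 1536 as the width of a sub-image is an assumption; the same function could equally be applied to cut an image into sub-images.

```python
import numpy as np


def tile_origins(length: int, tile: int, stride: int) -> list:
    """Start coordinates of tiles of size `tile` along an axis of size `length`."""
    if tile > length:
        raise ValueError("tile larger than the axis")
    origins = list(range(0, length - tile + 1, stride))
    if origins[-1] != length - tile:
        origins.append(length - tile)  # last tile placed flush with the border
    return origins


def split(array: np.ndarray, tile_hw: tuple, stride_hw: tuple) -> list:
    """Cut `array` into tiles; a stride smaller than the tile gives overlap."""
    (th, tw), (sh, sw) = tile_hw, stride_hw
    return [
        array[y:y + th, x:x + tw]
        for y in tile_origins(array.shape[0], th, sh)
        for x in tile_origins(array.shape[1], tw, sw)
    ]


# One sub-image of 1024 x 1536 pixels, three color channels.
sub_image = np.zeros((1024, 1536, 3), dtype=np.uint8)
patches_no_overlap = split(sub_image, (256, 256), (256, 256))
patches_half_overlap = split(sub_image, (256, 256), (128, 128))
```

Without overlap, a 1024 by 1536 sub-image yields 4 × 6 = 24 patches of 256 by 256 pixels, consistent with "several tens of patches"; a stride of half a patch raises this to 7 × 11 = 77 patches.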

As detailed below in relation to the training of the stages of the multi-class predictor, this subdivision, made possible by the very high resolution of the ICHR image, is associated with an annotation of decreasing strength from the patches up to the ICHR image. In particular, for a database of training ICHR images subdivided into sub-images, each sub-image being subdivided into patches:

i. for each sub-image, patches are strongly annotated, at a minimum with the class(es) of the microorganisms they contain. In particular, an annotated patch is advantageously, but optionally, associated with a double annotation: a first, global annotation corresponding to the class(es) of the microorganisms it contains, and a second annotation for each of its pixels coding the presence or absence of a microorganism;

ii. each sub-image is annotated globally with the class(es) of the microorganisms present;

iii. each image is annotated globally with the class(es) of the microorganisms present.

According to the invention, a first stage extracts features from the patches by means of a patch-level class prediction model; these patch features are transmitted to a second stage, which extracts features from the sub-images by means of a sub-image-level class prediction model; these sub-image features are in turn transmitted to a third stage, which carries out the final class prediction for the microorganisms present in the image, and therefore on the Gram slide.
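The three annotation levels listed above can be represented, as a minimal sketch, by the following data structures. All type names, field names and class labels are hypothetical. The sketch also assumes that "no microorganism" is encoded as an empty set of classes, so that the labels at a finer level are always contained in those of the coarser level.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

@dataclass
class PatchAnnotation:
    classes: Set[str]                               # strong label: class(es) in the patch
    pixel_mask: Optional[List[List[int]]] = None    # optional per-pixel presence (1) / absence (0)

@dataclass
class SubImageAnnotation:
    classes: Set[str]                               # global label only
    patches: Dict[Tuple[int, int], PatchAnnotation] = field(default_factory=dict)

@dataclass
class ImageAnnotation:
    classes: Set[str]                               # global label only
    sub_images: Dict[Tuple[int, int], SubImageAnnotation] = field(default_factory=dict)

def is_consistent(image: ImageAnnotation) -> bool:
    """Check that finer-level labels are contained in coarser-level labels."""
    for sub in image.sub_images.values():
        if not sub.classes <= image.classes:
            return False
        for patch in sub.patches.values():
            if not patch.classes <= sub.classes:
                return False
    return True

# Hypothetical example: one sub-image (i, j) = (0, 0) with two annotated patches (k, l).
image = ImageAnnotation(
    classes={"gram_negative_bacilli"},
    sub_images={(0, 0): SubImageAnnotation(
        classes={"gram_negative_bacilli"},
        patches={(0, 0): PatchAnnotation({"gram_negative_bacilli"}),
                 (0, 1): PatchAnnotation(set())},   # patch without microorganism
    )},
)
```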

In the following, an ICHR image is subdivided into a two-dimensional matrix of sub-images referenced by the indices (i, j), and each sub-image is subdivided into a two-dimensional matrix of patches referenced by the indices (k, l).

Referring to Figure 5, a particular embodiment of the predictor 60 comprises:

A. an analysis flow 62 for each sub-image of an ICHR image, each flow producing a vector of scores, one score per class (denoted “Class_scores(i, j)” for the sub-image of coordinates (i, j)). The analysis flows 62 are run independently of one another and are identical in terms of computational modules;

B. a global analysis flow 64 receiving the feature vectors from each flow 62 (in the preferred variant illustrated in Figure 5, the score vectors) and producing a vector of scores, one score per class, for the ICHR image, denoted “Class_image”, together with a confidence index for this prediction, denoted “Conf_index”.

Each analysis flow 62 for a sub-image comprises a feature extractor 66, preferably consisting of the convolutional part of a convolutional neural network (Figure 6A). Alternatively, the extractor comprises the convolutional part of a neural network followed by a flattening layer (Figure 6B). Alternatively, the extractor comprises the convolutional part of a neural network followed by a flattening layer, itself followed by one or more fully connected layers (Figure 6C). Optionally, these variants are completed downstream by one or more fully connected layers of neurons, the feature vector then corresponding to the output of the last fully connected layer.

The function of this extractor is to extract, from each patch making up the sub-image, features denoted “Emb” that summarize the information contained in the patch, notably in terms of the presence or absence of microorganisms.

Referring to Figure 6A, this convolutional neural network 66_CNN is trained on a database 66_BDD of annotated training patches, the function of the network 66_CNN being to predict the class(es) of the microorganisms present in the patches. In a preferred embodiment of the invention, the neural network 66_CNN is a network of the VGG16, ResNet, MobileNetV2 or EfficientNet type, preferably pre-trained on public databases such as those available at the URL https://www.image-net.org, in particular a VGG16 network trained on the ImageNet “ImageNet Large Scale Visual Recognition Challenge (ILSVRC)” database (https://www.image-net.org/challenges/LSVRC/index.php). Although it is possible to take a blank network, i.e. one initialized with random weights, and train it ab initio on annotated patches, the use of a pre-trained network increases the macro accuracy of the prediction of the class(es) of the ICHR image by several percent. Among the dozens of pre-trained networks tested by the inventors, a VGG16 performs best, leading to an increase in macro accuracy of approximately 5% compared to a network trained only on patches of slide images. Since the architecture of such a network is fixed, the dimension of the patches taken from the ICHR images is chosen to be identical to the input dimension of the network. This dimension satisfies the condition on the dimensions of the microorganisms, bacteria and yeasts, described above.

In an advantageous but optional variant, the predictions returned by the convolutional neural network 66_CNN can be used to identify the areas of the sub-images where a given class is probably present. These results can be presented in the form of a “heat map” superimposed on the image. A sub-part 66 of the network 66_CNN is then extracted, together with its parameters. This part 66 can comprise the convolutional layers of 66_CNN and possibly some of the fully connected layers located downstream. New fully connected layers can also, optionally, be added downstream. This element 66 functions as an extractor of patch features that are relevant for the prediction of the classes concerned.

Once the features have been extracted from each patch, the corresponding feature vectors are passed to a second prediction stage 68, based on neural networks, which implements a multi-class prediction of the class(es) of microorganisms present in the sub-image made up of the patches. More specifically, as illustrated in Figure 7, this second stage 68 is the “MIL” (Multiple Instance Learning) portion of a network of the MIL-CNN type whose feature extractor part comes from the model 66_CNN, advantageously the extractor 66.

The second predictor 68 comprises:

i. an upstream stage 70 implementing a gated attention mechanism. For each feature vector from the patches (denoted Emb_{i,j}(k, l) for the patch of coordinates (k, l) in the sub-image of coordinates (i, j)), this stage 70 calculates a coefficient a_{i,j}(k, l) measuring the weight of the patch in the class prediction of the sub-image, and multiplies the corresponding vector by this weight to produce a new feature vector a_{i,j}(k, l) × Emb_{i,j}(k, l) for each patch. In a preferred, but not mandatory, embodiment, this weight is calculated according to a relation [equation not reproduced in this text] in which U ∈ ℝ^{Q×R} and W are matrices forming parameters of the predictor 68, Q is the dimension of the feature vectors Emb_{i,j}(k, l), R is a predetermined positive integer, for example equal to 512, sigm is the sigmoid non-linear function, and ⊙ is the term-by-term multiplication operator. Such a mechanism is notably described in the article by M. Ilse et al., “Attention-based Deep Multiple Instance Learning”, arXiv:1802.04712v4 [cs.LG], 28 Jun 2018. Other attention mechanisms are however possible.

ii. downstream of the attention mechanism stage 70, or integrated into it, a pooling stage 72 reducing the overall dimensionality of the feature vectors. For example, stage 72 produces for each sub-image (i, j) a single feature vector Z(i, j) from the K × L vectors from the patches [relation not reproduced in this text].

iii. downstream of the pooling stage 72, a stage 74 with fully connected layers of neurons, receiving the vector Z(i, j) as input and producing as output a vector of prediction scores Class_scores(i, j). The scores are normalized, lying between 0 and 1, and by convention the higher a score, the higher the probability that the corresponding class is present in the sub-image. In the embodiment considered, this downstream stage can be reduced to a sigmoid-type layer that produces the scores, but if necessary other layers can be inserted between the pooling stage and the stage producing the scores.
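As a minimal sketch of stages 70, 72 and 74, the following code implements the gated attention pooling in the form given in the article by Ilse et al., namely attention logits w^T (tanh(V^T Emb) ⊙ sigm(U^T Emb)) normalized by a softmax over the patches of a sub-image, followed by a weighted sum of the patch embeddings and a sigmoid output layer. The parameter names `V` and `w`, the output parameters `C` and `b`, the use of a weighted sum for the pooling and all dimensions are assumptions taken from that article or chosen for the example; the exact relations of the present embodiment may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_attention_pool(emb, V, U, w):
    """emb: (N, Q) patch embeddings of one sub-image; V, U: (Q, R); w: (R,).

    Returns the attention weights a (N,) and the pooled vector z (Q,).
    """
    gate = np.tanh(emb @ V) * sigmoid(emb @ U)   # (N, R), term-by-term product
    logits = gate @ w                            # (N,) one logit per patch
    logits = logits - logits.max()               # numerical stability of the softmax
    a = np.exp(logits)
    a = a / a.sum()                              # weights sum to 1 over the patches
    z = a @ emb                                  # weighted sum of the embeddings
    return a, z

def class_scores(z, C, b):
    """Sigmoid output layer: one score in [0, 1] per class (C: (Q, n_classes))."""
    return sigmoid(z @ C + b)

# Hypothetical dimensions: N = K x L = 24 patches, Q = 16, R = 8, 3 classes.
rng = np.random.default_rng(0)
N, Q, R, n_classes = 24, 16, 8, 3
emb = rng.normal(size=(N, Q))
a, z = gated_attention_pool(emb, rng.normal(size=(Q, R)),
                            rng.normal(size=(Q, R)), rng.normal(size=R))
scores = class_scores(z, rng.normal(size=(Q, n_classes)), np.zeros(n_classes))
```

In this sketch, setting all attention parameters to zero gives uniform weights, so the pooled vector reduces to the mean of the patch embeddings. This illustrates that the attention stage generalizes a simple average pooling.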

In a preferred embodiment, the model 66 and its downstream part 68 are thus integrated into a new predictor that operates at the sub-image level and is the one implemented in the analysis flow 62 of Figure 5. As described in the article by Ilse et al., this new predictor can be trained on examples of sub-images, in particular by conventional backpropagation techniques. Preferably, but optionally, the parameters of part 66 of the model are left free to evolve during this training, so that at the end of the process they no longer have the values obtained from the training of the model 66_CNN. In this way, the feature extractor 66 and the downstream layers 68 are co-optimized.

Referring to Figure 8, the last stage 64 of the predictor 60 comprises:

i. an upstream stage 80 receiving descriptors, derived for example from the prediction score vectors Class_scores(i, j) of all the sub-images making up the image of the Gram slide, and calculating from these vectors a distribution of the scores over the slide for each of the classes. This stage 80 takes advantage of the large number of sub-images making up the ICHR image, which results from the high resolution of the latter. More particularly, since the ICHR image is made up of several tens, or even a hundred or more, sub-images, it is possible to calculate in a statistically meaningful manner a distribution of the scores of the classes present in the ICHR image. One advantage of calculating a distribution is that it provides a characterization of the ICHR image that is independent of the position of the microorganisms in the image while taking the entire image into account. In particular, for applications as sensitive as in vitro patient diagnostics, notably for sepsis, it is questionable to return a result based on a single or limited area of the slide. Indeed, a prediction based on a single sub-image may be erroneous or far too uncertain.

For example, for each class of microorganism, stage 80 calculates an approximation of this distribution consisting in extracting the following statistics: the maximum of the scores; the 95th percentile; the median score; the 5th percentile.

This approximation is quick to calculate and allows the overall performance of the prediction to be tuned through each of its components. In particular, the maximum and the 95th percentile set the level of sensitivity of the prediction for the classes of microorganisms actually present (the 95th percentile is chosen here, but other values above 50% are possible depending on the desired sensitivity). The median attenuates the prediction errors made at the sub-image level. The 5th percentile sets the sensitivity of the prediction for the “no microorganism” class (the 5th percentile is chosen here, but other values below 50% are possible depending on the desired sensitivity). It should be noted that simpler approximations of the distribution are also within the scope of the invention (for example only the maximum score for each class), as are more complex approximations (for example a polynomial interpolation of the scores, or an approximation by a distribution characterized by its equations, such as the Fisher, Gauss or Bernoulli distributions). In particular, the advantage of descriptive statistics is that they can take into account the specificities of the application, known to the microbiology expert, as illustrated by the choice of percentiles to help tune sensitivity and specificity, which are two important criteria in in vitro diagnostics (IVD).

ii. a downstream stage 82 receiving each of the class distributions and the number of sub-images I × J, and predicting as output the class(es) Class_image of the microorganisms present in the ICHR image together with a confidence index Conf_index for this prediction. More particularly, the prediction implemented by stage 82 is a multi-class prediction by machine learning, and preferably a prediction that does not use a neural network. Indeed, the features received by this stage are structured: their number is determined and fixed, and their nature is known. “Classic” machine-learning approaches (i.e. not based on neural networks), such as approaches based on support vector machines (SVM), K-nearest neighbors or decision trees, to name a few, are therefore better suited in terms of performance, interpretability and ease of training. Among these approaches, stage 82 preferentially implements a “Random Forest” prediction, which is particularly effective at avoiding overfitting and at delivering an interpretable confidence index (for example equal to the maximum of the percentages of votes obtained by each class). Indeed, as the inventors have observed for the interpretation of complex images, the confidence index of the Random Forest takes a high value for accurate predictions (i.e. consistent with the microorganisms actually present on the slide) and collapses otherwise. The choice of a threshold value such as that used in step 302 (Figure 2), for an application as sensitive as in vitro diagnosis, is thus made easier and more robust. The Random Forest predictive model is for example that described in the article by L. Breiman, “Random Forests”, Machine Learning, 45(1), 5-32, 2001. Note that the descriptors received by stage 80 are not necessarily derived from the prediction scores returned by the predictor 68. In particular, it is possible to use descriptors taken from layers located further upstream in its architecture.
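The processing of stages 80 and 82 can be illustrated by the following sketch, which builds the fixed-length descriptor (maximum, 95th percentile, median and 5th percentile per class, plus the number of sub-images) and derives a confidence index as the largest share of tree votes. The function names, the layout of the descriptor and the threshold value are assumptions made for the example. The votes are supplied directly rather than produced by a trained Random Forest.

```python
import numpy as np

def distribution_features(class_scores):
    """class_scores: (n_sub_images, n_classes) array of sub-image scores.

    Returns a vector of length 4 * n_classes + 1: for each class the maximum,
    95th percentile, median and 5th percentile, then the number of sub-images.
    """
    s = np.asarray(class_scores, dtype=float)
    stats = np.stack([s.max(axis=0),
                      np.percentile(s, 95, axis=0),
                      np.median(s, axis=0),
                      np.percentile(s, 5, axis=0)], axis=1)   # (n_classes, 4)
    return np.concatenate([stats.ravel(), [s.shape[0]]])

def vote_confidence(tree_votes, n_classes):
    """tree_votes: one predicted class index per tree of the forest.

    Returns (predicted class, confidence), the confidence being the largest
    fraction of votes obtained by a class.
    """
    counts = np.bincount(np.asarray(tree_votes), minlength=n_classes)
    shares = counts / counts.sum()
    return int(shares.argmax()), float(shares.max())

# Hypothetical example: 120 sub-images, 3 classes, 10 trees.
rng = np.random.default_rng(0)
features = distribution_features(rng.random((120, 3)))
predicted, conf_index = vote_confidence([1, 1, 1, 0, 1, 1, 2, 1, 1, 1], n_classes=3)

THRESHOLD = 0.7                       # assumed value, to be set per application
prediction_accepted = conf_index >= THRESHOLD
```

Because the descriptor has a fixed length whatever the number of sub-images, it can be passed to any conventional classifier that expects tabular input.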

The different predictive models (patch, sub-image or image level) described above are “multi-class” predictive models. Alternatively, the patch model and/or the sub-image model and/or the image model can implement a “multi-label” prediction, i.e. a prediction of the presence/absence type for each of the classes considered, rather than the presence of only one of the classes (the negative class then being treated as a separate class). Multi-label prediction makes it possible to handle certain special cases, such as polymicrobial samples presenting several Gram types. The advantage of choosing multi-class predictive models is that it reduces the amount of annotated training data required and/or the corresponding campaign for collecting Gram slides or samples. Indeed, in the context of microbial infections, the polymicrobial case, for example, is far less frequent than monomicrobial infections, so that much less associated data exists, which makes multi-label predictive models more difficult to train.
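The difference between the two decision rules can be sketched as follows: a multi-class decision retains the single class with the highest score, whereas a multi-label decision retains every class whose score reaches a threshold. The class names and the threshold of 0.5 are hypothetical.

```python
def decode_multiclass(scores, class_names):
    """Return the single class with the highest score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best]

def decode_multilabel(scores, class_names, threshold=0.5):
    """Return every class whose score reaches the threshold (possibly none)."""
    return [name for name, s in zip(class_names, scores) if s >= threshold]

# Hypothetical scores for a sample containing two Gram types.
names = ["negative", "gram_positive", "gram_negative"]
scores = [0.1, 0.8, 0.7]
single = decode_multiclass(scores, names)
several = decode_multilabel(scores, names)
```

On these example scores, the multi-class rule reports only one Gram type, while the multi-label rule reports both.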

Referring to Figure 9A, a preferred use of the output of the predictor 60, following its analysis of a slide image 90, comprises displaying on a computer screen 92 the image 90 with its light intensity adjusted patch by patch according to the coefficients a_{i,j}(k, l). In particular, the higher the weight of a patch, signifying its high importance in the prediction at the sub-image level, the brighter the patch, as illustrated by the patch 94, which is much lighter than the other areas of the image 90. In this way, a laboratory technician can, if desired, inspect this patch for the microorganisms present and confirm or refute its predicted content. The technician can thus display the entire image or, preferably for reasons of readability, a particular sub-image as illustrated in this figure. Preferably, the predicted class or classes, encoded in the Class_image vector, the associated confidence index Conf_index, and a signal encoding the failure of the prediction when the index is below the threshold are also displayed. The brighter patches then act as presumptive areas likely to contain microorganisms. This aid allows the technician to carry out the analysis more quickly by focusing on these areas first.
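By way of illustration only, the patch-by-patch brightness adjustment described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the display routine of the invention: the function name `modulate_brightness`, the dictionary layout keyed by patch position (k, l), the grayscale pixel range and the `floor` gain are all hypothetical.

```python
def modulate_brightness(patches, weights, floor=0.2):
    """Scale each patch's pixel intensities by its normalized weight.

    patches: dict mapping (k, l) -> 2D list of grayscale values in [0, 255]
    weights: dict mapping (k, l) -> non-negative patch weight
    floor:   hypothetical minimum gain so that low-weight patches stay visible
    """
    w_max = max(weights.values())
    out = {}
    for key, patch in patches.items():
        # Gain lies in [floor, 1]; the highest-weight patch keeps full brightness.
        ratio = weights[key] / w_max if w_max > 0 else 0.0
        gain = floor + (1.0 - floor) * ratio
        out[key] = [[min(255, round(v * gain)) for v in row] for row in patch]
    return out


# Two identical patches; only their weights differ.
patches = {(0, 0): [[200, 200], [200, 200]], (0, 1): [[200, 200], [200, 200]]}
weights = {(0, 0): 0.9, (0, 1): 0.1}
display = modulate_brightness(patches, weights)
```

In this sketch the patch with the highest weight is left unchanged and all others are darkened in proportion, so that the technician's eye is drawn to the former first.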

Advantageously, as illustrated on the second display screen of Figure 9A and in the images of Figure 9B, the sub-images are also analyzed by the convolutional neural network 66_CNN, whose predictions can be displayed on a screen, advantageously in the form of “heat maps”, so that the technician can see directly which areas of the image are likely to correspond to microorganisms, and to which class these belong.

C. DATA ANNOTATION AND TRAINING OF THE MULTI-STAGE PREDICTION ARCHITECTURE

C.1. DATA ANNOTATION

Annotation is performed at three scales: at the level of the entire image, which corresponds to the slide; at the level of the sub-image; and at the level of the patch.

Let us first consider the slide level, which is the relevant level from a biological and medical point of view. Each image a priori covers an area large enough to be representative of the contents of the slide, and of the initial sample. Any available knowledge about the contents of the sample or slide can therefore be used to produce an annotation at the image level.

Preferably, the Gram slides come from microbiological analysis laboratories, where they are prepared and annotated under real conditions by technicians specializing in Gram analysis. However, the interpretation of complex slides can be difficult, even for an experienced technician. Preferably, a portion of the sample used to prepare a slide is characterized in greater depth. In particular, the actual identity of the microorganisms present in the sample is determined, preferably by MALDI-TOF mass spectrometry, using a VITEK® MS marketed by the Applicant. The image of the Gram slide first inherits the annotation of the slide made in the laboratory, referred to as “raw”. This raw annotation is then verified by a second expert, and annotation errors and ambiguities are corrected using the additional characterizations of the microorganisms. The final annotation of the slide image is then referred to as “ground truth”. In addition to correcting the raw annotation, a first sorting of the slides is carried out: those containing a polymicrobial mixture or Campylobacter are removed from the training set. In order to improve the overall performance of the predictor according to the invention, Acinetobacter are assigned to the “Gram-Negative Bacillus” class. Other characteristics of the samples (metadata) are also collected, such as, where applicable, the type of culture medium included in the sample, the time to positivity of the blood culture, the time elapsed for the development of the Gram stain, or an antibiogram of the microorganisms.

Let us now describe the annotation at the sub-image level. Each sub-image of an image receives by default the “ground truth” annotation of that image, but this annotation can be modified by the second expert, for example where the microorganism present on the slide is absent from the sub-image considered. To do this, in order to characterize the image of the slide in its entirety, the second expert goes through each zone of the image and therefore each sub-image. Preferably and optionally, when the second expert is unable to annotate a sub-image unambiguously, that sub-image is excluded from the training set. Preferably and optionally, the “ground truth” annotation of the sub-images is modified and reduced to a single class, so that a “multi-class” type model is then sufficient, the rare sub-images presenting two classes simultaneously being excluded from the training set. Preferably and optionally, in the particular case of bacteria with several morphs (for example, Gram-variable bacilli such as Bacillus subtilis, which is taxonomically a Gram-positive bacillus but often has the appearance of Gram-negative bacilli), the annotation at the sub-image level may be modified to reflect the actual appearance of the bacteria, so that the sub-image annotation may differ from the image annotation. This provides an annotation for all or part of the sub-images, which are typically a hundred times more numerous than the images in the database. The sub-images are strongly annotated, since the designer of the predictor according to the invention injects knowledge, and therefore additional a priori information, into the annotation of the sub-image. This annotation is used to improve the training of the predictor.

Following the same logic, an even stronger annotation can be produced at the patch level, on at least part of the dataset. In one embodiment of the invention, this annotation is carried out by first performing a semantic segmentation of the microorganisms, that is to say by associating a Gram type with each pixel of a sub-image. Various methods are possible for this semantic segmentation step. For example, a semi-automated segmentation can be carried out using a tool such as ilastik, described in the article by S. Berg et al., “ilastik: interactive machine learning for (bio)image analysis”, Nature Methods (2019), and available at the URL https://www.ilastik.org/, which associates a microorganism or non-microorganism status with each pixel. The annotation of the sub-image can then be used to associate a class with the microorganism pixels. A final manual step carried out by an expert makes it possible to correct the segmentation masks obtained, and also to flag ambiguous areas. Once the semantic segmentation has been carried out, an explicit algorithm assigns a class to each patch of the sub-image, depending on the number of pixels of that class present on the patch. Just as the sub-image annotation is not necessarily identical to the image annotation, the patch annotation is not necessarily identical to the sub-image annotation. In particular, when certain classes are linked to the state of organization of the microorganisms (chains, clusters), such an organization is not always detectable at the level of a patch of reduced extent. Accordingly, where the number of microorganism pixels on a patch is not sufficiently large, the algorithm can convert the annotation so that it no longer reflects the state of aggregation. For example, the class “Gram-Positive Cocci Aggregated in Chains” is converted by this process to “Gram-Positive Cocci - indeterminate state of aggregation”. Finally, a new “ambiguous” class can be assigned to a patch either when some of its pixels have been annotated as such by an expert, or when the number of pixels associated with a microorganism is very low (which generally corresponds to an organism seen only partially because it lies at the edge of the patch).
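The explicit patch-labeling rule described above can be sketched as follows. This is an illustrative sketch only: the class names, the two pixel-count thresholds, the use of `None` for background pixels and the returned "background" label are assumptions chosen for the example and are not taken from the description.

```python
from collections import Counter

# Hypothetical mapping that drops the aggregation state from a class name.
DROP_AGGREGATION = {
    "GPC_chain": "GPC_undetermined",
    "GPC_cluster": "GPC_undetermined",
}


def label_patch(mask, min_pixels=4, min_pixels_for_aggregation=12):
    """Assign one class to a patch from its per-pixel segmentation mask.

    mask: 2D list whose entries are a class name, "ambiguous" (flagged by
          the expert), or None for background pixels.
    """
    counts = Counter(label for row in mask for label in row if label is not None)
    if not counts:
        return "background"              # no microorganism pixel at all
    if "ambiguous" in counts:
        return "ambiguous"               # expert flagged part of the patch
    cls, n = counts.most_common(1)[0]    # dominant class and its pixel count
    if n < min_pixels:
        return "ambiguous"               # organism probably cut by the patch edge
    if n < min_pixels_for_aggregation:
        return DROP_AGGREGATION.get(cls, cls)  # aggregation not assessable
    return cls


def make_mask(n_pixels, label, size=4):
    """Build a size x size toy mask with n_pixels pixels set to label."""
    flat = [label] * n_pixels + [None] * (size * size - n_pixels)
    return [flat[i * size:(i + 1) * size] for i in range(size)]


labels = [label_patch(make_mask(n, "GPC_chain")) for n in (16, 6, 2, 0)]
```

With these toy thresholds, a patch fully covered by chain-forming cocci keeps its class, a sparsely covered patch loses its aggregation state, and a patch with only a few microorganism pixels is marked ambiguous.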

Once this process has been implemented, a patch-level annotation is obtained for all or part of the available dataset. Note that the number of patches, in the modality considered, is typically of the order of a few tens or hundreds per sub-image. The patches are very strongly annotated, since the designer of the predictor according to the invention injects additional a priori information into the patch annotation. This annotation is used to improve the training of the predictor.

The slide images are stored with their “ground truth” annotation in a computer memory for further processing. Similarly, the sub-images and their annotations are stored in a computer memory, as are the patches with their patch annotations. Three databases are thus created.

The method described above yields an annotation at three levels of increasing strength: image level, sub-image level and patch level. This considerably increases the overall strength of the annotation on the training set. Given the number of images available (737 in the implementation example considered) and their dimensionality (more than 100 million pixels in the example considered), direct training of a predictor using only the image-level annotation would most likely be doomed to failure.

C.2. TRAINING THE MULTI-STAGE PREDICTOR

A method of training the predictor 60 is now described. According to this method, each stage is trained independently of the others while relying on the same training Gram slide images. Referring to Figure 10A, a method 100 for training the predictor 60 comprises constituting, at 102, a database of training Gram slide images annotated with the class or classes of microorganisms present, in particular a database of annotated images, annotated sub-images and annotated patches as described above.

The method continues with the training, at 104, of the convolutional neural network 66_CNN to predict the patch classes. This training can start from an untrained network. However, as described previously, an initial pre-trained network is preferably selected and then retrained on the database of annotated patches. In this option, depending on the pre-trained network chosen, a normalization of the pixel values of the patches can be implemented. In particular, certain networks available in network libraries, notably those based on the “TensorFlow” software libraries, are trained on images whose pixel values are standardized. In this case, a normalization of the patches is provided upstream of the network 66_CNN, for example a standardization (centering of the data on 0 and scaling of the standard deviation to 1). This normalization stage is then optionally provided in the predictor according to the invention, upstream of the feature extractor 66.
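As a minimal sketch of the standardization mentioned above (zero mean, unit standard deviation), the following example standardizes one patch using its own statistics. Whether the statistics are computed per patch, per channel or over the whole dataset is not specified here; the per-patch choice, the grayscale 2D layout and the function name `standardize` are assumptions.

```python
from statistics import fmean, pstdev


def standardize(patch):
    """Center the pixel values of a patch on 0 and scale them to unit std.

    patch: 2D list of numeric pixel values (single channel, for simplicity).
    """
    values = [v for row in patch for v in row]
    mu = fmean(values)
    sigma = pstdev(values) or 1.0  # a flat patch has sigma 0: avoid dividing by 0
    return [[(v - mu) / sigma for v in row] for row in patch]


z = standardize([[10, 20], [30, 40]])
flat = [v for row in z for v in row]
```

The same transformation would have to be applied identically at training time and at prediction time for the extractor to see consistent inputs.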

The annotated patch database is split into a training set and a test set, ensuring (a technique called “blocking”) that all patches from a given image are placed either all in the training set or all in the test set. The hyperparameters of the network 66_CNN are optimized on the training set by cross-validation, for example with 4 “folds” built using the same “blocking” technique (or “blocked cross-validation”), which guarantees in particular that all patches associated with a given image fall in the same fold. The values of the hyperparameters are optimized, for example, by grid search, by random search, or by a Bayesian method.
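The “blocking” constraint can be sketched as follows: folds are assigned at the image level, and every patch inherits the fold of the image it comes from. This is a simplified illustration with hypothetical image identifiers and a round-robin assignment; it ignores class balancing, which a practical split would likely also consider.

```python
def blocked_folds(image_ids, n_folds=4):
    """Return one fold index per patch, constant within each source image.

    image_ids: list giving, for each patch, the identifier of its source image.
    """
    unique_images = sorted(set(image_ids))
    # Round-robin assignment of images (not patches) to folds.
    fold_of_image = {img: i % n_folds for i, img in enumerate(unique_images)}
    return [fold_of_image[img] for img in image_ids]


# Eight patches drawn from five slide images (hypothetical identifiers).
patch_image_ids = ["s1", "s1", "s2", "s3", "s3", "s4", "s5", "s5"]
folds = blocked_folds(patch_image_ids)
```

Library implementations of grouped cross-validation, such as `GroupKFold` in scikit-learn, provide the same guarantee when the image identifier is passed as the group.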

In a variant, the patches annotated as “ambiguous” are kept for training and testing, and the training includes an additional “ambiguous” class, which groups the patches for which the annotating expert is not certain of the microorganisms present.

Once the network 66_CNN has been trained, its convolutional part is extracted, the last layers being removed from the network.

In a first variant of the training of the predictive model at the sub-image level, the extractor is integrated into a MIL-CNN type model by transfer learning. This network is, for example, the one described in the article by M. Ilse et al. For example, the source code of the network, as well as its training source code, are those available on GitHub at the address https://github.com/AMLab-Amsterdam/AttentionDeepMIL. The method continues, at 108, with the training of the MIL-CNN network, the sub-image database being split into a training set and a test set with the same distribution of images between training and test sets as for the training of the patch model. The values of its hyperparameters are again optimized by cross-validation, for example by grid search, by random search, or by a Bayesian method. Preferably, these hyperparameters include the architecture of the fully connected layer stage 74 (e.g. number of layers, number of neurons per layer, etc.). In this variant, the convolutional part and the MIL part of the trained MIL-CNN model may constitute, respectively, the extractor 66 at the patch level and the prediction model 68 at the sub-image level. Alternatively, the MIL portion is extracted to form the model 68 and the extractor is the one trained in the previous step.

In a second variant of the training of the predictive model at the sub-image level, only the MIL portion of a MIL-CNN network is trained. For example, all the patches of all the sub-images are processed by the trained extractor 66 so as to obtain corresponding sets of feature vectors, each set being annotated with the annotation of the corresponding sub-image, as illustrated in Figure 10A. These sets of feature vectors and their annotations are stored in a database 106. The model 68 is then trained on this database by cross-validation as described previously.
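To make the notion of a set (or “bag”) of feature vectors concrete, the sketch below pools one bag into a single sub-image vector with attention weights of the form softmax(w · tanh(V h)), which is the pooling used in the attention-based MIL literature. It is a sketch under assumptions, not the trained model 68: the matrix `V`, the vector `w` and the toy feature vectors are hypothetical values, and the actual pooling layer may differ (for example by using a gated variant).

```python
import math


def attention_pool(bag, V, w):
    """Pool a bag of patch feature vectors into one vector.

    bag: list of feature vectors (one per patch), all of the same length
    V:   hypothetical projection matrix, given as a list of rows
    w:   hypothetical scoring vector, one entry per row of V
    Returns (pooled_vector, attention_weights).
    """
    scores = []
    for h in bag:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ti for wi, ti in zip(w, hidden)))
    # Numerically stable softmax over the patches of the bag.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(bag[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, bag)) for d in range(dim)]
    return pooled, weights


bag = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]   # three patches, two features each
V = [[0.5, -0.5], [0.3, 0.8]]
w = [1.0, 1.0]
pooled, weights = attention_pool(bag, V, w)
```

The returned weights sum to 1 over the patches of a sub-image, which is what makes them usable as per-patch importance values for display.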

Once the MIL stage 68 has been trained, each batch of sub-images corresponding to an image is processed, at 110, by this stage 68 and by the distribution calculation module 80 in order to produce score distributions, which are stored with the corresponding image annotations in a database. Then, at 112, the last stage 82 of the predictor, based on a Random Forest, is trained, the database of distributions being split into a training set and a test set, again with the same distribution of the slides as previously. The hyperparameters are again optimized by cross-validation with 4 folds. Their values are optimized, for example, by grid search, the number of hyperparameters being smaller than for the previous predictors.
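One simple way to summarize the sub-image scores of a slide as a fixed-length distribution is a normalized histogram, sketched below for a single class. The exact form of the distribution computed by the module 80 is not restated here, so the histogram, the number of bins and the function name `score_histogram` are assumptions made purely for illustration.

```python
def score_histogram(scores, n_bins=5):
    """Normalized histogram of sub-image scores in [0, 1] for one class.

    scores: one score per sub-image of the slide
    Returns a list of n_bins fractions that sum to 1.
    """
    counts = [0] * n_bins
    for s in scores:
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        counts[idx] += 1
    return [c / len(scores) for c in counts]


# Hypothetical scores of six sub-images for one class.
sub_image_scores = [0.05, 0.10, 0.92, 0.97, 0.99, 0.40]
features = score_histogram(sub_image_scores)
```

Concatenating such vectors over all classes would give one fixed-length feature vector per slide, whatever the number of sub-images, which is the kind of input a Random Forest classifier expects.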

Optionally, the method continues, at 114, with the calculation of a performance criterion of the multi-stage predictor according to the invention, for example the overall accuracy, as a function of a rejection rate of the slides. In particular, this calculation consists in processing all of the slide images with the predictor and varying the confidence threshold of step 302 (Figure 2). The slides that do not pass the threshold are discarded, the rejection rate being equal to the percentage of discarded slides, and the overall accuracy is calculated on the remaining slides. According to the invention, if the rejection rate is deemed too high or the accuracy too low, the method continues with the acquisition of new slides and/or the additional annotation of patches and the retraining of the predictor as described previously, in order to reduce the rejection rate and increase the accuracy.
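The accuracy versus rejection-rate calculation can be sketched as follows. The data layout (one tuple of confidence index, predicted class and true class per slide), the class labels and the function name are hypothetical; the rule that a slide is kept when its confidence index is at least equal to the threshold follows the description above.

```python
def accuracy_vs_rejection(predictions, thresholds):
    """Compute (threshold, rejection_rate, accuracy) for each threshold.

    predictions: list of (confidence, predicted_class, true_class), one per slide
    """
    curve = []
    for t in thresholds:
        # Slides whose confidence index is below the threshold are rejected.
        kept = [(p, y) for c, p, y in predictions if c >= t]
        rejection = 1.0 - len(kept) / len(predictions)
        # Accuracy is computed on the remaining slides only.
        accuracy = sum(p == y for p, y in kept) / len(kept) if kept else None
        curve.append((t, rejection, accuracy))
    return curve


predictions = [
    (0.95, "GNB", "GNB"),
    (0.90, "GPC", "GPC"),
    (0.60, "GNB", "GPC"),     # low-confidence error
    (0.55, "Yeast", "Yeast"),
]
curve = accuracy_vs_rejection(predictions, [0.0, 0.7])
```

On this toy data, raising the threshold from 0.0 to 0.7 rejects half of the slides and raises the accuracy on the remaining ones from 0.75 to 1.0, which illustrates the trade-off that the threshold controls.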

Figure 10B illustrates an example of accuracy versus rejection rate for the prototype of the multi-stage predictor designed by the inventors after several iterations. In this example, when targeting an accuracy greater than or equal to 95%, the rejection rate obtained is 17%, meaning that this prototype, if used as is in combination with the image acquisition system described in Figure 3, can process more than 80% of Gram slides automatically, without additional testing. This performance was obtained on an initial dataset of 737 Gram slides, each divided into 100 sub-images, each sub-image being divided into 24 patches. Out of a total of nearly 1.8 million patches, only 3%, or approximately 50,000, were strongly annotated, while the total number of parameters of the predictor according to the invention is approximately 15 million.

Figure 11A presents in more detail the performance obtained by the prototype of the predictor according to the invention, trained on the set of Gram slides described above, this performance being measured on all the slides (i.e. with a rejection rate set to 0%). Figure 11B presents the corresponding confusion matrix.

D. APPLICATION OF THE INVENTION TO SIGNALS OTHER THAN RGB SIGNALS

An application of the invention to images whose pixel values are coded on three color channels (RGB) has been described. The invention also applies to other types of signals, in particular digital images whose pixel values encode holographic information at one or more wavelengths.

For example, the system for acquiring the image of the Gram slide is that described in patent applications EP4307051 and EP4172699, connected to a calculation module for reconstructing a refocused image. Independently of the particular variants described in these applications, the principle of holographic imaging is to illuminate the slide with coherent light at one or more wavelengths, to record the corresponding intensity images, and to produce by computer reconstruction (so-called "parametric" reconstruction, as described for example in these applications, or so-called "non-parametric" reconstruction, as described for example in applications WO2016075279, WO2017077238 or WO 17207184) a holographic digital image of the slide, each pixel of which is coded by an intensity value and a phase value for each illumination wavelength (eight wavelengths in an embodiment described in both applications EP4307051 and EP4172699, each pixel thus being coded on 16 channels).

In a first variant, the feature extractor 66 is derived from a convolutional network 66_CNN that is not pre-trained, which allows it to take all the channels as input. In a second variant, the extractor is derived from a pre-trained convolutional network, in particular one pre-trained on RGB images, as described previously. In this second variant, a dimensionality reduction stage is provided upstream of the extractor 66 if the number of channels per pixel is greater than 3. Advantageously, this stage consists of calculating three principal components of the holographic image and injecting them into the extractor. Advantageously, the principal components chosen for the dimensionality reduction are hyperparameters of the training of the convolutional network 66_CNN.
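The dimensionality reduction stage of the second variant can be sketched as follows, under stated assumptions: the holographic image is held as a NumPy array of shape (height, width, channels), the principal components are computed per image from the channel covariance matrix, and the components to keep are given by a hypothetical `component_indices` parameter (ranked by decreasing variance) standing in for the hyperparameter mentioned above. None of these names comes from the description.

```python
import numpy as np


def reduce_channels_pca(image, component_indices=(0, 1, 2)):
    """Project an (H, W, C) image onto selected principal components.

    component_indices: ranks (0 = largest variance) of the principal
    components to keep; a hypothetical stand-in for the hyperparameter.
    Images with 3 channels or fewer are returned unchanged.
    """
    height, width, channels = image.shape
    if channels <= 3:
        return image

    # Treat every pixel as one observation of a C-dimensional vector.
    pixels = image.reshape(-1, channels).astype(np.float64)
    centered = pixels - pixels.mean(axis=0)

    # Eigen-decomposition of the symmetric C x C channel covariance matrix.
    covariance = np.cov(centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)

    # eigh returns eigenvalues in ascending order; rank them descending.
    order = np.argsort(eigenvalues)[::-1]
    selected = order[list(component_indices)]
    components = eigenvectors[:, selected]

    projected = centered @ components
    return projected.reshape(height, width, len(component_indices))


# Synthetic 16-channel image (8 wavelengths x intensity and phase).
rng = np.random.default_rng(0)
hologram = rng.normal(size=(32, 48, 16))
reduced = reduce_channels_pca(hologram)
```

The resulting three-channel array has the same spatial size as the input and can be fed to an extractor that expects three input channels; any rescaling needed to match the value range seen during pre-training is left out of this sketch.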

The annotation of the slide images, sub-images and patches, as well as the training of the predictor, are carried out in a manner analogous to that described above. Figures 12A and 12B show the performance of the holographic-image prototype developed by the inventors, obtained on the initial set of 737 Gram slides.

E. EXTENSION OF THE TEACHING OF THE DETAILED EMBODIMENTS

i. The application of the invention to the characterization of microorganisms in a sample prepared on a Gram slide has been described. The invention applies to any type of digital image comprising objects to be characterized. For example, remaining in the field of biology, this technique applies to the imaging of eukaryotic cells, in particular fluorescently labeled cells, whose different types can be detected, as well as their different organelles, such as the cell nucleus. Another example is hematology, where the different formed elements of blood (erythrocytes, leukocytes) can be detected. A third example is the analysis of urine samples, where again the formed elements can be detected together with microorganisms. A fourth example is histopathology, to detect cancer cells that may cover only a very small part of the surface of a very high-resolution image. For those of these applications in which quantification or counting is important (which is not the case for positive blood cultures), analyzing the number of patches and/or sub-images positive for a given class can serve as a starting point for counting the different cell types.

In general, the invention finds application in the characterization of any object in a high-resolution image, composite or not.

ii. An embodiment has been described in which the features provided to the last stage are the prediction scores of the second stage. Alternatively, the features are for example those generated by a layer of neurons of the MIL portion. According to the invention, descriptive statistics are also generated from these features and communicated to the final predictive model.

iii. A predictive model at the sub-image level implementing an attention mechanism has been described. Alternatively, this model does not implement this function, the feature vectors from the patches being grouped directly via a pooling method such as averaging or max pooling.
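The difference between the attention-based grouping and the plain pooling alternatives of point iii can be sketched as follows. This is a minimal illustration in pure Python: the patch feature vectors are short lists of floats, and the attention scores, which in a trained network would be produced by a learned layer, are passed in directly as a hypothetical `scores` argument. The function names are assumptions made for this example.

```python
import math


def mean_pool(features):
    """Average the patch feature vectors component by component."""
    count = len(features)
    return [sum(column) / count for column in zip(*features)]


def max_pool(features):
    """Keep, for each component, the largest value over the patches."""
    return [max(column) for column in zip(*features)]


def attention_pool(features, scores):
    """Weighted sum of patch feature vectors with softmax weights.

    scores: one unnormalized attention score per patch (hypothetical;
    a trained network would compute them from the features).
    """
    if len(features) != len(scores):
        raise ValueError("one score per patch is required")
    # Numerically stable softmax over the patch scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [
        sum(w * value for w, value in zip(weights, column))
        for column in zip(*features)
    ]


# Two patches with two features each (illustrative values).
patches = [[1.0, 0.0], [3.0, 4.0]]
pooled_mean = mean_pool(patches)
pooled_max = max_pool(patches)
pooled_uniform = attention_pool(patches, [0.0, 0.0])
pooled_focused = attention_pool(patches, [0.0, 100.0])
```

With equal scores the attention pooling reduces to the mean, and with one dominant score it returns essentially the feature vector of the corresponding patch, which is what lets a sub-image containing a few informative patches among many empty ones be represented by those patches.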

F. COMPUTER IMPLEMENTATION

The training phase of the predictor of the invention, in particular that described in relation to Figure 10A, and the prediction phase, in particular that described in relation to Figure 5, apart from the steps of preparing the slide and acquiring the optical signal corresponding to its image, are computer-implemented, namely by means of hardware circuits comprising computer memories (cache, RAM, ROM, etc.) and one or more microprocessors or processors (CPU and/or GPU), organized or not as compute nodes, as needed to execute the computer instructions stored in the memories for implementing said phases. It will be understood that, as far as computation is concerned, any type of computer architecture may be suitable, and that the description above and below should not be understood as limiting the scope of the invention.

Several architectures are possible, as illustrated for example in Figures 13-15.

In a first architectural variant (fig. 13), a first organization 2000, for example the Applicant, hosts or controls one or more compute servers 2002 associated with one or more databases 2002 for storing annotated slide images, and the training phase is implemented by the organization 2000 on its server(s) 2002. A second organization 2006, for example a microbiology laboratory, hosts a microscope 2008 as described above, connected to, or incorporating, a personal computer, a desktop computer or a server 2010, and carries out the steps from preparation of the Gram slide through acquisition of the digital image of the slide, which is stored by the computing unit 2010. This image is then pushed, through a remote connection network, to the first organization 2000 so that the remainder of the prediction phase is carried out on the server 2002 (or on a computing unit different from that used for training, for example in the cloud as Software as a Service). The resulting report on the classification of the microorganisms present, if any, is then pushed, through the network 2012, to the second organization 2006, which decides whether or not to take therapeutic measures depending on the report.

A second architectural variant (fig. 14) differs from the first in that a copy of the software implementing the prediction phase is downloaded to the second organization, which carries out the entire prediction phase using the computer 2010 or a compute server (not shown). In a variant of this architecture, this download corresponds to a copy of the software installed on the computer 2010, the computer being supplied by the first organization 2000. In a third variant (fig. 15), all of the training and prediction phases are implemented by a single organization 2006.

Claims

1. Method for predicting a class of microorganisms contained in a sample, from among several classes of microorganisms, the method comprising:

A. the preparation of a slide, in particular a microscope slide, comprising spreading the sample on said slide;

B. the acquisition of at least one digital image of the slide with micrometric or sub-micrometric resolution;

C. the application, implemented by computer, of a model for predicting the class of microorganisms as a function of the acquired image, characterized in that said image is subdivided into a plurality of sub-images and each sub-image is subdivided into patches, and in that the application of the prediction model comprises:

D. for each patch, applying a microorganism feature extractor, said extractor comprising a convolutional part of a first convolutional neural network, said first network being trained on a set of training patches comprising microorganisms, said microorganisms being individually annotated by at least one class;

E. for each sub-image, the application of a second neural network, connected to receive the characteristics extracted from the patches constituting said sub-image, said second network comprising an upstream pooling layer and one or more downstream layers comprising a layer implementing a prediction of at least one class, in the form of a score, said second network being trained on training sub-images comprising microorganisms, each training sub-image being globally annotated; and

F. for the acquired image: a. the calculation of a characteristic vector calculated for the sub-images; b. the application of a prediction model of at least one class for the microorganisms present in the sample based on the characteristic vector, said prediction model being trained on characteristic vectors calculated from training images.

2. Method according to claim 1, characterized in that the characteristic vector is a distribution of the scores calculated for the sub-images.

3. Method according to one of claims 1 or 2, characterized in that each training image is subdivided into sub-images and each of said sub-images is subdivided into patches, and in that less than 50% of the patches of the training images are annotated, said annotated patches forming the training patches of the first convolutional network.

4. Method according to claim 3, characterized in that less than 10% of the patches of the training images are annotated.

5. Method according to any one of the preceding claims, characterized in that the training of the first and second convolutional networks and of the prediction model is configured to obtain a macro prediction specificity greater than or equal to 90%.

6. Method according to any one of the preceding claims, characterized in that the prediction model of step F is a “random forest” model.

7. Method according to any one of the preceding claims, characterized in that the upstream pooling layer is associated with an attention layer configured to apply a weight to the output of each extractor.

8. Method according to any one of the preceding claims, characterized in that the second network is a MIL-CNN network trained by batches of instances, the batches of instances being made up of training patches.

9. Method according to any one of the preceding claims, characterized in that the characteristic vector of step F.a comprises for each class:

- the maximum score among the sub-images; and/or

- the X-th percentile of the scores among the sub-images, with X greater than or equal to 90%, preferably equal to 95%; and/or

- the Y-th percentile of the scores among the sub-images, with Y less than or equal to 10%, preferably equal to 5%; and/or

- the median score among the sub-images.
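By way of illustration only, the per-class statistics listed above can be assembled into a single vector as in the following sketch. It assumes that the second network yields one score per class for each sub-image, uses linear interpolation between order statistics for the percentiles, and takes 95 and 5 as the values of X and Y; the function names and the ordering of the entries in the vector are assumptions made for this example.

```python
import math


def percentile(values, q):
    """Percentile q (0 to 100) with linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("at least one value is required")
    position = (len(ordered) - 1) * q / 100
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] * (1 - fraction) + ordered[upper] * fraction


def slide_feature_vector(sub_image_scores, high_q=95, low_q=5):
    """Build the slide-level vector from per-sub-image class scores.

    sub_image_scores: list with one entry per sub-image, each entry being
    a list of scores, one per class.
    For each class, in order: maximum, high percentile, low percentile, median.
    """
    n_classes = len(sub_image_scores[0])
    vector = []
    for k in range(n_classes):
        scores = [entry[k] for entry in sub_image_scores]
        vector.extend([
            max(scores),
            percentile(scores, high_q),
            percentile(scores, low_q),
            percentile(scores, 50),
        ])
    return vector


# Five sub-images, two classes (illustrative scores).
scores = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6], [0.5, 0.5]]
vector = slide_feature_vector(scores)
```

The vector has a fixed length of four entries per class whatever the number of sub-images, which is what allows a conventional tabular model to be applied to it.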

10. Method according to claim 8, characterized in that it further comprises the number of sub-images in the image.

11. Method according to any one of the preceding claims, characterized in that the first convolutional network is pre-trained on images not comprising microorganisms and then trained on annotated training patches.

12. Method according to claim 10, characterized in that the first pre-trained convolutional network is a VGG16 or ResNet network.

13. A method according to any one of the preceding claims, characterized in that the microorganisms comprise bacteria and the classes comprise at least Gram positive and Gram negative, and in that the preparation of the slide comprises the preparation of a Gram slide.

14. Method according to claim 13, characterized in that the classes of microorganisms further comprise classes of morphotypes.

15. Method according to any one of the preceding claims, characterized in that the classes of microorganisms comprise a “neither bacteria nor yeast” class, a “Gram-negative bacilli” class, a “Gram-positive bacilli” class, a “Gram-positive coryneform bacilli” class, a “Gram-negative cocci” class, a “Gram-positive cocci in chains” class, a “Gram-positive cocci in clusters” class and a “yeasts” class.

16. Method according to any one of the preceding claims, characterized in that the sample comprises blood.

17. Method according to any one of the preceding claims, characterized in that the sample is a positive blood culture.

18. Method for training a model for predicting a class of microorganisms among several classes of microorganisms from a digital image of a slide on which a sample likely to comprise microorganisms is spread, said training method comprising:

A. the creation of a training database comprising digital slide images annotated by one or more classes of microorganisms, the slide images being divided into a plurality of sub-images annotated by one or more classes of microorganisms and the training sub-images being subdivided into patches, at least part of the patches being annotated by one or more classes of microorganisms,

B. training a first convolutional neural network based on the annotated patches;

C. training a second neural network from the annotated base of sub-images, said second network comprising at least one extractor of patch characteristics, a patch characteristics pooling stage, and a prediction stage of the class(es) of microorganisms present in the sub-image;

D. the creation of a database of distributions of the classes present in the training slides based on the classes predicted by the second neural network applied to the annotated sub-images;

E. the training of a model for predicting at least one class of microorganisms present in a slide based on spatial distributions of the classes, method according to which the predictor comprises the convolutional part of the first neural network, downstream of which is connected the second neural network, downstream of which is connected the prediction model.

19. Training method according to claim 18, characterized in that the training of the first network is carried out on the classes of microorganisms to which an “ambiguous” class is added, an annotated patch also being annotated by this class in the event of uncertainty about the objects present in the patch.

20. Training method according to claim 18 or 19, characterized in that the training of the second network comprises the training of a MIL-CNN network, the trained MIL portion of the MIL-CNN network constituting the second trained neural network.

21. Method for predicting a class of microorganisms contained in a sample, from among several classes of microorganisms, the method comprising:

C. the application, implemented by computer, of a model for predicting the class of microorganisms in a sample spread on a slide based on an image acquired from said slide, characterized in that said image is subdivided into sub-images and each sub-image is subdivided into patches, and in that the application of the prediction model comprises:

E. for each sub-image, the application of a second neural network, connected to receive the characteristics extracted from the patches constituting said sub-image, said second network comprising an upstream pooling layer and one or more downstream prediction layers of at least one class, in the form of a score, said second network being trained on training sub-images comprising microorganisms, each training sub-image being globally annotated by at least one class; and

F. for the acquired image: a. the calculation of a feature vector based on the scores calculated for the sub-images; b. the application of a prediction model of at least one class for the microorganisms present in the sample based on the feature vector, said prediction model being trained on feature vectors calculated from training images.

22. Prediction method according to claim 21, characterized in that steps D to F are in accordance with any one of claims 2 to 17.

23. System for predicting a class of microorganisms contained in a sample, among several classes of microorganisms, the system comprising a computer unit configured to implement a model for predicting the class of microorganisms in a sample spread on a slide as a function of an image acquired from said slide, characterized in that said image is subdivided into sub-images and each sub-image is subdivided into patches, and in that the application of the prediction model comprises:

F. for the acquired image: a. the calculation of a feature vector calculated for the sub-images; b. the application of a prediction model of at least one class for the microorganisms present in the sample based on the feature vector, said prediction model being trained on feature vectors calculated from training images.

24. Prediction system according to claim 23, characterized in that the computer unit is configured to implement steps D to F which are in accordance with any one of claims 2 to 17.

25. Computer program product comprising a computer memory storing computer-readable instructions for implementing steps D to F according to any one of claims 2 to 17.

26. Computer program product comprising a computer memory storing computer-readable instructions for implementing steps A to E according to any one of claims 18 to 20.

27. Method for predicting a class of objects contained in a digital image from among several classes of objects, the method according to which the digital image is subdivided into a plurality of sub-images and each sub-image is subdivided into patches, and according to which:

D. for each patch, applying an object feature extractor, said extractor comprising a convolutional part of a first convolutional neural network, said first network being trained on a set of training patches comprising objects, said objects being individually annotated by at least one class;

E. for each sub-image, the application of a second neural network, connected to receive the characteristics extracted from the patches constituting said sub-image, said second network comprising an upstream pooling layer and one or more downstream prediction layers of at least one class, in the form of a score, said second network being trained on training sub-images comprising objects, each training sub-image being globally annotated; and

F. for the acquired image: a. the calculation of a characteristic vector calculated for the sub-images; b. the application of a prediction model of at least one class for the objects present in the sample according to the characteristic vector, said prediction model being trained on characteristic vectors calculated from training images.