
EP4097647A1 - Method for quality assurance of an example-based system - Google Patents

Method for quality assurance of an example-based system

Info

Publication number
EP4097647A1
EP4097647A1 EP21711743.1A EP21711743A EP4097647A1 EP 4097647 A1 EP4097647 A1 EP 4097647A1 EP 21711743 A EP21711743 A EP 21711743A EP 4097647 A1 EP4097647 A1 EP 4097647A1
Authority
EP
European Patent Office
Prior art keywords
examples
complexity
quality
determined
assessment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP21711743.1A
Other languages
German (de)
English (en)
Inventor
Thomas Waschulzik
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Siemens Mobility GmbH
Original Assignee
Siemens Mobility GmbH
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens Mobility GmbH
Publication of EP4097647A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/2163 Partitioning the feature space
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning

Definitions

  • the invention relates to a method for quality assurance of an example-based system.
  • Example-based systems, such as artificial neural networks, are known in principle. They are usually used in areas in which a direct algorithmic solution does not exist or cannot be adequately created using conventional software methods. With example-based systems, a task can be learned on the basis of a number of examples, and the learned task can then be applied to a number of other examples.
  • this object is achieved by a method for quality assurance of an example-based system, in which the example-based system is created and trained on the basis of collected examples which form an example set.
  • the respective example of the sample set includes an input value which is in an input space.
  • a quality assessment (or quality indicator), which represents the coverage of the input space by examples of the example set, is determined based on the distribution of the input values in the input space.
  • the invention is based on the one hand on the knowledge that example-based systems such as neural networks are often viewed as black boxes.
  • the internal processing of information is not analyzed and an understandable model is not generated.
  • the system is not verified by an inspection. This leads to reservations when using example-based systems in tasks with a high level of criticality.
  • the invention is also based on the knowledge that when recording examples for creating and training the example-based system, it is often unknown how many examples must be recorded in which areas of the input space in order to create a suitable knowledge base.
  • the solution according to the invention eliminates these problems in that the coverage of the input space is determined by means of examples based on the distribution of the input values in the input space. This results in a mapping of the input space, which serves as a basis for the further acquisition of examples for the creation of a suitable knowledge base. In this way, the acquisition of the examples can be controlled according to the distribution in the input space, although the specific type of classifier or approximator has not yet been determined. The number of degrees of freedom with which the knowledge base is trained does not yet have to be specified either. By knowing the areas in which further examples have to be recorded, the examples can be recorded in a more targeted manner and consequently the costs for the recording of examples (since fewer examples have to be recorded in total) can be considerably reduced.
  • mapping of the input space for example-based systems
  • Coding of the characteristics is a prerequisite for the use of mapping of the input space for example-based systems.
  • the raw data are converted into a representation adapted to the solution of the task by application-specific transformations.
  • This representation is converted using standard procedures so that it can be used as the activity of the input neurons of a neural network (so-called coding).
  • the quality assessment which represents the coverage of the input space by examples of the sample set, can be used on the level of the representations and on the level of the coding.
  • the invention is further based on the knowledge that the coding and/or representation of the input features in the input space preferably has a semantic connection with the desired output of the example-based system. For example, pixel values of an RGB image are unsuitable as input for the size-, rotation- and translation-invariant classification of objects.
  • the input space should preferably be mapped if, for example, preprocessing has determined features that have a semantic relationship to the outputs.
  • the invention is further based on the knowledge that the ratio between the number of independent input features, which determine the dimension of the state space spanned, and the number of examples to be recorded for the configuration, training, evaluation and testing of the system is preferably not too large, because with a large ratio the coverage of the input space by examples is not sufficient.
  • the invention is also based on the knowledge that the dimensions which span the state space are preferably semantically independent of one another (i.e. represent independent aspects of the task). Furthermore, the dimensions are preferably of equal relevance for solving the task. Further preferably, only a single classification task or approximation task is considered for quality assurance. For example, in an artificial neural network that is used as a single shot multibox detector (SSD), only the classification for a given object size in a so-called default box (i.e. with a given aspect ratio, with a given scaling and at a given position in the image) is considered.
  • the example-based system is preferably provided for use in a safety-related function.
  • a safety-related function is understood to be a function of a system that is safety-relevant, i.e. whose behavior has an impact on the safety of the system's environment.
  • the term safety is to be understood here in the sense of so-called "safety" (as opposed to "security").
  • the goal of protecting the environment of a system from dangers emanating from the system is referred to as "safety", in contrast to the goal of protecting the system from dangers emanating from the environment of the system, which is referred to as "security".
  • the determination comprises: distributing representatives in the input space and assigning a number of examples of the sample set to the respective representative.
  • the examples assigned to the representative are located in a surrounding area of the input space which surrounds the representative.
  • a local quality assessment for the surrounding area is determined as a quality assessment.
  • example data sets are determined within the surrounding areas that are assigned to the representatives.
  • the local quality assessments are calculated for each of these example data sets.
  • the subdivision of the example set into several surrounding areas brings with it the advantages that usually result from the divide-and-conquer approach.
  • a developer of the example-based system can concentrate on those parts of the input space in which certain quality criteria are not met by the determined quality assessment.
  • a representative example is preferably distributed as a representative.
  • the distribution is preferably a uniform distribution.
  • a grid for arranging the representative examples is selected in the input space.
  • the grid can be set individually for each dimension of the input space.
  • a criterion for defining the grid for example in the case of categorical variables, can be a model of target properties of the example distribution in the input space, which is made on the basis of the requirements of the example-based system.
  • the grid can have a hierarchical structure, for example in order to map hierarchical codings.
  • a representative example is distributed in each hypercube in the input space of the grid. In the case of a hierarchical structure of the grid, a representative example is distributed per hierarchy level.
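  • The grid-based distribution of representatives and the assignment of examples to them can be sketched as follows. This is a minimal illustration under stated assumptions, not the method as claimed: the function names `assign_to_grid` and `representative`, the use of the hypercube centre as the representative's position, and the sample data are all hypothetical. The number of examples per cell serves as a simple local quality assessment.

```python
from collections import Counter
from math import floor


def assign_to_grid(examples, origin, cell_size):
    """Count the examples per grid cell (one representative per hypercube).

    examples  : list of input vectors (tuples of floats)
    origin    : lower corner of the grid, one value per dimension
    cell_size : edge length of a cell, one value per dimension, so the
                grid can be set individually for each dimension
    """
    counts = Counter()
    for x in examples:
        cell = tuple(floor((xi - o) / s) for xi, o, s in zip(x, origin, cell_size))
        counts[cell] += 1  # only non-empty cells are ever stored
    return counts


def representative(cell, origin, cell_size):
    """Centre of the hypercube, assumed here as the representative's position."""
    return tuple(o + (c + 0.5) * s for c, o, s in zip(cell, origin, cell_size))


# hypothetical two-dimensional example set
examples = [(0.1, 0.2), (0.4, 0.3), (1.2, 0.1), (2.7, 1.9)]
counts = assign_to_grid(examples, origin=(0.0, 0.0), cell_size=(1.0, 1.0))
```

  • Because a `Counter` returns 0 for cells that were never touched, empty surrounding areas need no storage, which matches the idea of storing a representative only if at least one example lies in its surrounding area.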
  • the representative is a center of a cluster, which is determined by means of a cluster method.
  • the cluster method is preferably used to determine the position and to determine the extent of the respective cluster in the input space. More preferably, the cluster method is carried out taking into account output values of the examples that are in an output space.
  • the clusters can be configured on the basis of property requirements of the example-based system or on the basis of a subset of example data. In the application of the example-based system, for example, a set of examples can be recorded in an early phase, selected on the basis of knowledge so as to meet the requirements. The distribution of this example data is then quality assured. In a subsequent project phase, further examples with the same distribution can be recorded.
  • each example of the quality-assured sample set represents a representative for the following phase of capturing the examples. This ensures that an additional quality-assured set of examples is captured for each initial example.
  • the position of the representative can for example be determined by the cluster center.
  • a hierarchical clustering method can be used in which a representative is inserted per cluster and per hierarchical level and in which each example per hierarchical level is assigned to a cluster and consequently to a representative.
  • the set of examples that is available for calculating the quality assessment is then assigned to the clusters and consequently to the representative using a predefined metric. For an example that cannot be assigned to a cluster, a new cluster with a representative is preferably created. Alternatively, this example is recorded separately by a quality assessment together with other examples that could not be assigned to any cluster.
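  • The assignment of examples to cluster representatives, including the creation of a new cluster for an example that cannot be assigned, might look like the following sketch. The Euclidean metric, the `max_distance` cut-off and the function name `assign_to_clusters` are assumptions made for illustration; the text only requires some predefined metric.

```python
from math import dist


def assign_to_clusters(examples, centers, max_distance):
    """Assign each example to the nearest cluster centre (representative).

    An example farther than max_distance from every centre opens a new
    cluster whose representative is the example itself.
    """
    centers = list(centers)
    members = {i: [] for i in range(len(centers))}
    for x in examples:
        nearest = min(range(len(centers)),
                      key=lambda i: dist(x, centers[i]),
                      default=None)
        if nearest is None or dist(x, centers[nearest]) > max_distance:
            centers.append(x)  # new cluster with its own representative
            nearest = len(centers) - 1
            members[nearest] = []
        members[nearest].append(x)
    return centers, members


# hypothetical data: two predefined clusters and one outlying example
centers, members = assign_to_clusters(
    examples=[(0.5, 0.0), (4.5, 5.0), (10.0, 10.0)],
    centers=[(0.0, 0.0), (5.0, 5.0)],
    max_distance=2.0,
)
```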
  • the examples are preferably not assigned to a representative in full, but only to a predetermined proportion. This can result, for example, from the fact that a cluster algorithm is used which provides a partial assignment of the examples to the sample data sets (for example a percentage assignment to several surrounding areas, the sum of the proportions being 1).
  • the respective example is taken into account according to the associated proportion.
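  • A partial assignment, in which each example contributes to several representatives with proportions that sum to 1, can be illustrated as below. Inverse-distance weighting is an assumption chosen for brevity; the text leaves the cluster algorithm that produces the proportions open, and the function names are hypothetical.

```python
from math import dist


def partial_assignment(x, representatives, eps=1e-12):
    """Split one example among the representatives; the shares sum to 1."""
    inverse = [1.0 / (dist(x, r) + eps) for r in representatives]
    total = sum(inverse)
    return [w / total for w in inverse]


def weighted_counts(examples, representatives):
    """Number of examples per representative, each counted by its share."""
    counts = [0.0] * len(representatives)
    for x in examples:
        for i, share in enumerate(partial_assignment(x, representatives)):
            counts[i] += share
    return counts


# hypothetical data: both examples lie exactly between the representatives
shares = partial_assignment((1.0, 0.0), [(0.0, 0.0), (2.0, 0.0)])
counts = weighted_counts([(1.0, 0.0), (1.0, 0.0)], [(0.0, 0.0), (2.0, 0.0)])
```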
  • the quality assessment is preferably determined on the basis of the number of examples assigned to the respective representative or on the basis of other features. This is particularly advantageous if the specific examples are no longer used in the following.
  • the specific examples or a reference to the examples are stored in the representative (transformation of the sample data volume into a structure based on the topography of the input space). This is advantageous if the specific examples are needed later.
  • the storage space required for the processing is preferably reduced in that the representatives are only stored if there is at least one example in the respective surrounding area.
  • the quality assessment comprises a statistical means which is determined on the basis of the sample set and / or the examples assigned to a respective representative.
  • a histogram of the number of examples assigned to a representative is created as a statistical means.
  • a statistical measure, in particular a mean value, median, minimum, maximum and/or quantile of the number of examples assigned to a representative, is determined as the statistical means.
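  • The histogram and the statistical measures over the number of examples per representative can be computed with Python's standard library, as in this sketch. The function name `summarize_counts` and the sample counts are hypothetical.

```python
from collections import Counter
from statistics import mean, median, quantiles


def summarize_counts(counts_per_representative):
    """Histogram and summary statistics of the examples per representative."""
    counts = list(counts_per_representative)
    return {
        "histogram": Counter(counts),  # how many representatives have n examples
        "mean": mean(counts),
        "median": median(counts),
        "min": min(counts),
        "max": max(counts),
        "quartiles": quantiles(counts, n=4),
    }


# hypothetical numbers of examples assigned to eight representatives
summary = summarize_counts([0, 0, 1, 2, 2, 2, 5, 8])
```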
  • adjacent surrounding areas are determined in the input space, the respective representatives of which are assigned a number of examples that meets a predefined quality criterion of the quality assessment.
  • the specified quality criterion is preferably met if the number of examples assigned to a respective representative falls below or exceeds a specified quality threshold value or lies in a specified quality band of the quality assessment.
  • different neighborhood relationships can be used, for example the Von Neumann neighborhood (also called the 4-neighborhood), the Moore neighborhood (also called the 8-neighborhood) or the neighborhood from graph theory.
  • the defined neighborhood relationships must be transferred accordingly to higher-dimensional spaces: in three-dimensional space, for example, the 6-neighborhood for cuboids with common faces, the 18-neighborhood for cuboids with common edges and the 26-neighborhood for cuboids with common corner points can be considered.
  • the neighborhood is defined by how many dimensions two grid points may differ in order to still be seen as adjacent.
  • a context area is determined within the input space, which consists of neighboring surrounding areas, the representatives of which are each assigned a number of examples that meet a predefined quality criterion.
  • the predefined quality criterion is preferably met when the number of examples assigned to a respective representative falls below or exceeds a predefined quality threshold value or is in a predefined quality band of the quality assessment.
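  • The neighborhood definition by the number of differing dimensions, and the grouping of adjacent surrounding areas into context areas, can be sketched as follows. Grid cells are assumed to be integer index tuples, the breadth-first search is one possible way to find connected cells, and the function names are hypothetical.

```python
from collections import deque
from itertools import product


def neighbours(cell, max_diff_dims):
    """Cells that differ from `cell` by one step in at most max_diff_dims dimensions."""
    for offset in product((-1, 0, 1), repeat=len(cell)):
        changed = sum(1 for o in offset if o != 0)
        if 1 <= changed <= max_diff_dims:
            yield tuple(c + o for c, o in zip(cell, offset))


def context_areas(cells, max_diff_dims=1):
    """Group the given cells into connected context areas (breadth-first search)."""
    remaining = set(cells)
    areas = []
    while remaining:
        start = remaining.pop()
        area, queue = {start}, deque([start])
        while queue:
            current = queue.popleft()
            for n in neighbours(current, max_diff_dims):
                if n in remaining:
                    remaining.remove(n)
                    area.add(n)
                    queue.append(n)
        areas.append(area)
    return areas


# hypothetical cells whose representatives have too few examples
sparse_cells = [(0, 0), (0, 1), (1, 1), (3, 3)]
areas = context_areas(sparse_cells, max_diff_dims=1)
```

  • In two dimensions, `max_diff_dims=1` yields the Von Neumann neighborhood and `max_diff_dims=2` the Moore neighborhood; in three dimensions the values 1, 2 and 3 yield 6, 18 and 26 neighbours respectively.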
  • the location and size of areas of the input space in which too few examples were recorded can be determined in a particularly advantageous manner.
  • a particular advantage of the embodiment is that sub-areas of the input space can be identified in which the example values do not provide a sufficient basis for a safety-critical application. This in turn has the advantage that corrective action can be taken, for example by recording further examples or by restricting the knowledge base in the application to the high-quality context areas.
  • the determination of the areas in which too few examples were recorded has the advantage that attacks by adversarial examples can be counteracted preventively, because in these areas the probability of a successful attack by an adversarial example is comparatively high. It can be reduced by recording further examples in these areas or by restricting the knowledge base to the high-quality context areas.
  • Quality assessments can be calculated on the basis of the determined context areas. For example, the number of representatives in a context area can be determined. Histograms can be created for the size or other properties of a context area. In addition, statistical measures such as a mean value, median, quantile or standard deviation of properties of the context areas can be calculated, and the extent of the context areas in the dimensions of the input space can be determined. The dimensions can be arranged in the order of the greatest extent of the context area.
  • examples are recorded in the respective surrounding area if the quality assessment determined for the respective surrounding area is less than a predetermined quality threshold value.
  • examples are removed from a respective surrounding area if the quality assessment determined for the respective surrounding area is greater than a predetermined quality threshold value.
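  • The two corrective actions, recording further examples and removing examples, can be combined into a simple per-area plan, as sketched below. Using two separate thresholds, as well as the function name and the action labels, are assumptions for illustration; the text speaks only of a predetermined quality threshold value in each case.

```python
def plan_corrections(quality_by_area, lower_threshold, upper_threshold):
    """Decide per surrounding area whether to record or remove examples."""
    plan = {}
    for area, quality in quality_by_area.items():
        if quality < lower_threshold:
            plan[area] = "record more examples"
        elif quality > upper_threshold:
            plan[area] = "remove examples"
        else:
            plan[area] = "keep"
    return plan


# hypothetical local quality assessments (number of examples per area)
plan = plan_corrections({"a": 0, "b": 5, "c": 50},
                        lower_threshold=3, upper_threshold=20)
```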
  • the respective example comprises an output value that lies in an output space.
  • a local complexity assessment is determined for the respective surrounding area, which represents a complexity of a task of the example-based system defined by the examples of the surrounding area.
  • the local complexity assessment is determined by the relative position of the examples of the surrounding area to one another in the input space and output space.
  • the complexity assessment corresponds, for example, to the quality indicators described in section 4 (QUEEN quality indicators) of WASCHULZIK. These quality indicators can be defined and used for the representation or the coding of the features (see section 4.5 of WASCHULZIK).
  • the integrated quality indicator $QI_2$ according to section 4.6 of WASCHULZIK, defined there by formula 4.21, is used as the quality indicator for the representations. According to formula 4.18 from WASCHULZIK, it is built from the normalized distance between the represented inputs (NRE) and the normalized distance between the represented outputs (NRA).
  • $x$ is the pair $(x_1, x_2)$ consisting of the two examples $x_1$ and $x_2$.
  • $x_1$ and $x_2$ are examples from the example set $P$.
  • $P = \mathrm{BAG}[p_1, p_2, \ldots, p_{|P|}]$ is the set of elements of $P$, where
  • BAG is a multiset (called multiset or bag in English), as defined in Specification 21.5 on page 27 of the WASCHULZIK appendix.
  • the QAG task is defined in definition 3.1 on page 23 of WASCHULZIK and is referred to there as the QUEEN task.
  • $d_{RE}(x)$ is an abbreviation for the distance in the input space $d_{re}(ve_{p_{x_1}}, ve_{p_{x_2}})$ and $d_{RA}(x)$ is an abbreviation for the distance in the output space $d_{ra}(va_{p_{x_1}}, va_{p_{x_2}})$.
  • the definition of the distance between the representation of two examples according to WASCHULZIK is based on the Euclidean norm.
  • the distance in the input space is defined, in accordance with formula 4.3 from WASCHULZIK, as the Euclidean distance between the represented inputs of the two examples.
  • an aggregated complexity assessment is determined by aggregating the local complexity assessments.
  • the aggregated complexity assessment has the advantage that a developer of the example-based system can easily perform his quality assurance.
  • a histogram of the complexity in the different areas surrounding the input space is created as an aggregated complexity assessment.
  • the range of values of the complexity assessments is binned (i.e. divided into ranges).
  • the bins preferably contain only the number of surrounding areas with a corresponding complexity when the positions of the surrounding areas are no longer required.
  • This histogram is preferably combined with information about the number of examples, for example in a histogram of the number of examples assigned to the representative. More preferably, information about the representatives is stored in the histogram so that they can be used for detailed analyses.
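  • A binned histogram of local complexity assessments that also keeps the representatives in each bin can be sketched as follows. The bin edges, the dictionary layout and the function name are hypothetical; if the positions of the surrounding areas are no longer needed, only the length of each bin would be kept.

```python
def complexity_histogram(complexity_by_representative, bin_edges):
    """Bin local complexity values; each bin keeps the representatives it contains.

    Values outside [bin_edges[0], bin_edges[-1]] are ignored in this sketch.
    """
    n_bins = len(bin_edges) - 1
    bins = [[] for _ in range(n_bins)]
    for rep, value in complexity_by_representative.items():
        for i in range(n_bins):
            is_last = i == n_bins - 1
            in_bin = bin_edges[i] <= value < bin_edges[i + 1]
            if in_bin or (is_last and value == bin_edges[-1]):
                bins[i].append(rep)
                break
    return bins


# hypothetical local complexity assessments per representative (grid cell)
bins = complexity_histogram(
    {(0, 0): 0.1, (0, 1): 0.7, (1, 1): 1.0, (2, 2): 0.5},
    bin_edges=[0.0, 0.5, 1.0],
)
counts = [len(b) for b in bins]
```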
  • surrounding areas whose complexity assessment falls below a predefined complexity threshold value are identified on the basis of the aggregated complexity assessment.
  • in these surrounding areas, the task of the example-based system is implemented through an algorithmic solution. This is particularly advantageous for applications with high quality requirements, for example in the case of safety-oriented functions.
  • the input space is divided hierarchically on the basis of the quality assessment.
  • a hierarchical mapping of the input space is preferably achieved through the hierarchical division of the input space.
  • the hierarchy is furthermore preferably derived from the representation or coding of the input feature and / or from the analysis of the complexity of the task.
  • the density of the representatives can either be increased dynamically (until a homogeneous complexity is achieved) or a new hierarchy level can be introduced.
  • a new hierarchy level is introduced by adding a new subdivision with a higher resolution in the area of the representative. The procedure can be iterated by adding a further hierarchy level in the high-resolution area when the local complexity increases again. This means that the resolution can be dynamically adapted to the task at hand.
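  • The iterative introduction of hierarchy levels can be illustrated by a recursive subdivision that halves a hypercube in every dimension while its local complexity is too high. This is a sketch under assumptions: the splitting rule, the `max_depth` limit, and the caller-supplied function `local_complexity` (here replaced by the toy function `toy_complexity`) are hypothetical.

```python
from itertools import product


def refine(lower, upper, local_complexity, threshold, max_depth, depth=0):
    """Recursively split a hypercube while its local complexity exceeds threshold.

    local_complexity(lower, upper) is a caller-supplied (hypothetical) function
    returning the complexity assessment of the examples inside the box.
    Returns a list of (lower, upper, depth) leaf cells.
    """
    if depth >= max_depth or local_complexity(lower, upper) <= threshold:
        return [(lower, upper, depth)]
    mid = tuple((lo + hi) / 2 for lo, hi in zip(lower, upper))
    leaves = []
    # each dimension is halved, giving 2**d children per new hierarchy level
    for corner in product((0, 1), repeat=len(lower)):
        child_lo = tuple(m if c else lo for c, lo, m in zip(corner, lower, mid))
        child_hi = tuple(hi if c else m for c, hi, m in zip(corner, upper, mid))
        leaves += refine(child_lo, child_hi, local_complexity,
                         threshold, max_depth, depth + 1)
    return leaves


def toy_complexity(lower, upper):
    # hypothetical stand-in: the task is "complex" only near the point (0.1, 0.1)
    inside = all(lo <= 0.1 < hi for lo, hi in zip(lower, upper))
    return 1.0 if inside else 0.0


leaves = refine((0.0, 0.0), (1.0, 1.0), toy_complexity, threshold=0.5, max_depth=2)
```

  • With the toy function, only the quadrant containing the complex region is subdivided a second time, so the resolution adapts to where the task is difficult.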
  • a complexity distribution is determined by means of a histogram representation of the complexity assessment over k nearest neighbors of an example in the input space.
  • it is determined for the local environment of an example how the complexity is distributed.
  • the characteristic of the complexity in the local environment of the example is determined and, so to speak, a fingerprint of the local environment of the example is determined with regard to the complexity.
  • the value range of the complexity assessments is preferably binned for the histogram display (i.e. divided into ranges). For example, the binned values are plotted on the y-axis and the increasing k (of the k nearest neighbors) on the x-axis.
  • the number of values of the complexity assessment is preferably stored for each calculated histogram field (binned complexity assessment, k). More preferably, identification information (for example a number) identifying the example in the vicinity of which the complexity distribution was determined is also stored.
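  • The histogram field (binned complexity, k) for a single example can be sketched as follows. The per-pair value used here, the absolute difference of input and output distances after scaling each to [0, 1] within the neighborhood, is only a stand-in for the complexity assessment and is not the QUEEN indicator from WASCHULZIK; the function name and parameters are hypothetical.

```python
from collections import Counter
from math import dist


def knn_complexity_histogram(index, inputs, outputs, k_max, n_bins):
    """Histogram field (complexity bin, k) for the example at `index`.

    For each k, the stand-in complexity values of the k nearest neighbours
    in the input space are binned and counted.
    """
    others = [j for j in range(len(inputs)) if j != index]
    others.sort(key=lambda j: dist(inputs[index], inputs[j]))
    nearest = others[:k_max]
    if not nearest:
        return Counter()
    d_in = [dist(inputs[index], inputs[j]) for j in nearest]
    d_out = [dist(outputs[index], outputs[j]) for j in nearest]
    scale_in = max(d_in) or 1.0    # avoid division by zero
    scale_out = max(d_out) or 1.0
    values = [abs(a / scale_in - b / scale_out) for a, b in zip(d_in, d_out)]
    field = Counter()
    for k in range(1, len(values) + 1):
        for v in values[:k]:
            b = min(int(v * n_bins), n_bins - 1)
            field[(b, k)] += 1
    return field


# hypothetical data: the outputs equal the inputs, so all values fall in bin 0
inputs = [(0.0,), (1.0,), (2.0,), (3.0,)]
field = knn_complexity_histogram(0, inputs, inputs, k_max=3, n_bins=5)
```

  • The returned counter, together with the index of the example, plays the role of the fingerprint of the local environment described above.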
  • the example-based system is intended for use in a safety-related function, the safety-related function comprising object recognition based on image recognition, in which the object is recognized using the example-based system.
  • the object recognition is used during automated operation of a vehicle, in particular a track-bound vehicle, a motor vehicle, an aircraft, a watercraft and/or a spacecraft.
  • object recognition during automated operation of a vehicle is a particularly expedient embodiment of a safety-related function.
  • the object recognition is necessary, for example, to recognize obstacles on the road or to analyze traffic situations with regard to the right of way of road users.
  • the motor vehicle is, for example, a passenger car, a truck or a tracked vehicle.
  • the watercraft is, for example, a ship or a submarine.
  • the vehicle can be manned or unmanned.
  • An example of an application area is the autonomous or automated driving of a rail vehicle.
  • object recognition systems are used to analyze scenes that are digitized with sensors. This scene analysis is necessary, for example, to recognize obstacles on the road or to analyze traffic situations with regard to the right of way of road users.
  • Systems based on the use of examples with which the parameters of the pattern recognition system are trained are currently used particularly successfully for the recognition of the objects. Examples of this are neural networks, e.g. with deep learning algorithms.
  • the example-based system is provided for use in a safety-related function, the safety-related function comprising a classification based on sensor data from organisms.
  • the tissue classification of animal or human tissue is a particularly useful implementation of a safety-oriented function in the field of medical image processing.
  • the organisms include, for example, Archaea (archaebacteria), Bacteria (true bacteria) and Eukarya (organisms with cell nuclei), or tissue from Protista (also Protoctista), Plantae (plants), Fungi (fungi, chitin fungi) and Animalia (animals).
  • the example-based system comprises an artificial neural network with one or more layers of neurons that are not input neurons or output neurons and that are trained with backpropagation.
  • the one or more layers of neurons that are not input neurons or output neurons are often referred to in technical terms as "hidden” neurons.
  • the training of neural networks with many layers of hidden neurons is also often referred to by experts as deep learning.
  • a special type of deep learning network for pattern recognition are the so-called convolutional neural networks (CNNs).
  • SSD networks (Single Shot MultiBox Detector), see: SSD: Single Shot MultiBox Detector. European Conference on Computer Vision. Lecture Notes in Computer Science. 9905. pp. 21-37. arXiv:1512.02325
  • the invention also relates to a computer program comprising instructions which, when the program is executed by a computing unit, cause the computing unit to carry out the method of the type described above.
  • the invention also relates to a computer-readable storage medium, comprising instructions which, when executed by a computing unit, cause the computing unit to carry out the method of the type described above.
  • FIG. 1 schematically shows the sequence of an exemplary embodiment of a method according to the invention
  • FIG. 2 schematically shows the structure of an example-based system according to the exemplary embodiment of the method according to the invention
  • FIG. 3 schematically shows a two-dimensional input space according to the exemplary embodiment of the method according to the invention
  • FIG. 4 shows a schematic side view of a track-bound vehicle on a route
  • FIG. 5 shows a hierarchical division of the input space
  • FIG. 6 shows two axis diagrams which represent the application of the complexity assessment to a first synthetic function
  • FIG. 7 shows two axis diagrams which represent the application of the complexity assessment to a second synthetic function
  • FIG. 8 shows two axis diagrams which represent the application of the complexity assessment to a third synthetic function
  • FIG. 9 schematically shows a further example of a two-dimensional input space in accordance with a further exemplary embodiment of the method according to the invention.
  • FIG. 1 shows a schematic flow diagram which represents the sequence of an exemplary embodiment of a method according to the invention for quality assurance of an example-based system.
  • FIG. 2 shows schematically the structure of an example-based system 1 in which the quality assurance of the system is carried out using the exemplary embodiment of the method according to the invention.
  • the example-based system 1 is a system with supervised learning and is formed by an artificial neural network 2, which has a layer 4 of input neurons 5 and a layer 6 of output neurons 7.
  • the artificial neural network 2 has several layers 8 of neurons 9 that are not input neurons 5 or output neurons 7.
  • the artificial neural network 2 is a so-called multi-layer perceptron, but it can also be a recurrent neural network, a convolutional neural network, or in particular a so-called single-shot multi-box detector network.
  • the example-based system and the method according to the invention are implemented using one or more computer programs.
  • the computer program comprises commands which, when the program is executed by a computer unit, cause the computer unit to carry out the method according to the invention in accordance with the exemplary embodiment shown in FIG.
  • the computer program is stored on a computer-readable storage medium.
  • the example-based system is used in a safety-related function of a system.
  • the behavior of the function therefore has an impact on the safety of the system's environment.
  • An example of a safety-related function is object recognition based on image recognition, in which the object is recognized using the example-based system 1.
  • the object recognition is used, for example, in automated operation of a vehicle, in particular a track-bound vehicle 40 shown in FIG. 4, a motor vehicle, an aircraft, a watercraft or a spacecraft.
  • Further examples of a safety-related function are a classification based on sensor data from organisms, e.g. from Archaea (archaebacteria), Bacteria (true bacteria) and Eukarya (organisms with cell nuclei) or from tissue from Protista (also Protoctista), Plantae (plants), Fungi (fungi, chitin fungi) and Animalia (animals), a safe control of industrial plants, a classification of chemical substances, a classification of signatures of vehicles or a controller in the field of industrial automation.
  • in a method step A, it is determined which examples are to be collected.
  • in a method step B, the examples are collected:
  • the collected examples form an example set.
  • the respective example has an input value 12, which lies in an input space, and an output value 14, which lies in an output space.
  • in the case of object recognition, as one of several possible examples of a safety-related function, the examples are collected by providing the track-bound vehicle 40 with a camera unit 42 for capturing images.
  • the camera unit 42 is oriented in the direction of travel 41 in such a way that a spatial area 43 ahead in the direction of travel 41 is captured by the camera unit.
  • the track-bound vehicle 40 drives with the camera unit 42 in the direction of travel 41 along a route 44.
  • scenes that are relevant for the creation and training of the example-based system 1 for object recognition are simulated.
  • cardboard figures, crash test dummies or actors 45 are used to represent people on the route 44 who are to be recognized by means of the example-based system 1 to be created and trained.
  • scenes can be simulated using so-called virtual reality.
  • in a method step C, a quality assessment, which represents the coverage of the input space by examples of the example set, is determined.
  • in a method step C1, representatives are distributed in the input space.
  • FIG. 3 shows a two-dimensional input space 20 as an example. In the actual application of the method according to the invention, the input space and output space will often have a higher dimensionality.
  • the examples 22 of the example set are shown as crosshairs 23 in FIG.
  • the representatives 24 are evenly distributed and are shown as intersection points 25 of the grid 26 shown.
  • a respective representative 28 is assigned a number of examples 29 of the example set.
  • the examples 29 assigned to the representative 28 are located in a surrounding area 30 of the input space 20, which surrounds the respective representative 28.
  • the surrounding area 30 is shown by way of example in FIG. 3 as a dotted area.
  • As a quality assessment, a local quality assessment for the surrounding area 30 is determined in a method step C3.
  • In a method step C4, adjacent surrounding areas 32-36 are determined in the input space whose respective representatives are each assigned a number of examples that falls below a predetermined quality threshold value.
  • these surrounding areas 32-36 are shown as areas with diagonal stripes.
  • the surrounding areas 32-36 are areas in which there is no example.
  • a context area 38 is determined within the input space 20, which consists of the adjacent surrounding areas 32-36, the representatives of which are each assigned a number of examples that are below a predetermined quality threshold. This determines the position and size of areas of input space 20 in which too few examples have been recorded. In other words: partial areas of the input space 20 are identified in which the example values do not provide a sufficient basis for a safety-critical application.
  • Corrective action can be taken on the basis of the identification: For this purpose, for example, in a method step D, further examples are recorded in a respective surrounding area if the quality assessment determined for the respective surrounding area is less than a predetermined quality threshold.
  • a local complexity assessment is determined for the respective surrounding area, which represents a complexity of a task of the example-based system defined by the examples of the surrounding area.
  • the local complexity assessment is determined according to a method step E1 by the relative position of the examples of the surrounding area to one another in the input space 20 and the output space. That is to say, the complexity assessment is defined based on the consideration of the similarity of the distances between the examples in the input space 20 and the distances in the output space.
  • the task of the example-based system has a comparatively low complexity if the distances in the input space 20 (apart from the scaling) correspond approximately to the distances in the output space.
  • the complexity assessment is used to identify areas in which, due to the high complexity of the task of the example-based system, a comparatively high number of examples must be recorded. For example, in areas of the input space 20 in which there is a higher complexity, the density of the representatives is dynamically increased until a homogeneous complexity is reached. Alternatively, a new hierarchy level can be introduced (as is described below by way of example with reference to FIG. 5).
  • the complexity assessment corresponds to the quality indicators described in section 4 (QUEEN quality indicators) of WASCHULZIK. These quality indicators can be defined and used for the representation or coding of the characteristics (see section 4.5 of WASCHULZIK).
  • An example of this quality indicator for the representations is the integrated quality indicator QI 2 according to Section 4.6 of WASCHULZIK.
  • an aggregated complexity assessment is determined by aggregating the local complexity assessments: For example, the aggregated complexity assessment forms a histogram of the complexity in the various surrounding areas of the input space. For this purpose, the value range of the complexity assessments is binned (i.e., divided into ranges). The bins contain only the number of surrounding areas with the corresponding complexity, provided that the positions of the surrounding areas are no longer required.
  • This histogram is summarized with information about the number of examples, for example also in a histogram about the number of examples assigned to the representative. More preferably, information about the representatives is stored in the histogram so that they can be used for detailed analyses.
  • On the basis of the complexity assessment, it can be checked in a method step F whether an appropriate number of examples was recorded in all areas. If an area is identified in which too many examples were captured with low complexity, examples can be removed from this area. This reduction of the examples reduces the storage space requirement and the costs for the calculations, e.g. for quality assurance measures based on the sample data volume. If an area is identified in which too few examples were recorded (e.g. because the complexity is comparatively high), further examples may have to be recorded in this area. The latter case frequently occurs in those areas in which a new hierarchical level has been introduced (as is described below by way of example with reference to FIG. 5). After further examples have been recorded, a quality assurance loop (according to method steps C to E) is run through until all the desired quality requirements are met.
  • In a method step G, surrounding areas are identified whose complexity assessment falls below a predetermined complexity threshold.
  • the task of the example-based system is implemented according to a method step H by an algorithmic solution if the functionality of the system (i.e., semantic relationships) is known for the surrounding area.
  • the system's task is therefore implemented as a conventional algorithm (instead of an example-based system).
  • the statistical system is also created in step H or the structure of the neural network is established and the neural network is trained.
  • FIG. 5 shows, by way of example, a hierarchical division of an input space 120, by means of which a hierarchical mapping of the input space is achieved.
  • the collected examples 122 of the example set are shown as stars 123 and circles 125 in FIG.
  • the stars 123 and circles 125 are examples of different object classes (i.e. have a different position in the output space).
  • a new hierarchy level 126 can also be introduced.
  • the new hierarchy level 126 is introduced, for example, by adding a new subdivision 132 with a higher resolution 134 in the area 130.
  • the procedure can be iterated by adding a further hierarchy level in the high-resolution area when the local complexity increases again.
  • FIGS. 6 to 8 each show, for a synthetic function, a histogram of the distribution of the complexity evaluation over k-nearest neighbors of a preselected example.
  • the example is a proxy or a center of a cluster (as described above).
  • the example can also be an example selected from the area surrounding a representative, which was selected for a more in-depth investigation with regard to the complexity of the task.
  • Figure 6 shows Figure 4.1 from WASCHULZIK on the left and Figure 4.4 from WASCHULZIK on the right.
  • Figure 7 shows Figure 4.17 from WASCHULZIK on the left and Figure 4.20 from WASCHULZIK on the right.
  • the axis diagram in FIG. 7 on the right is scaled in such a way that 40 stands for the value 1.
  • Figure 8 shows Figure 4.41 from WASCHULZIK on the left and Figure 4.44 from WASCHULZIK on the right.
  • y = sin(8 * pi * x / 300) + br(seed, 300) is shown as an axis diagram on the left in FIG. It is a sine function that has stochastic noise in the ranges of x from 0 to 50 and from 100 to 200.
  • the axis diagram in Figure 8 is scaled in such a way that 40 stands for the value 1.
  • the person skilled in the art can also identify the representatives in which, for example, very
  • FIG. 9 shows an exemplary embodiment of an input space 220 in which the representatives each form a center of a cluster which is determined by means of a clustering method. Examples 222 of the example set are shown in FIG. 9 as crosshairs 223.
  • FIG. 9 shows, by way of example, four clusters 230, 232, 234 and 236, each of which comprises several examples. These examples lie within a dashed border line in the representation, which does not represent an actual delimitation of a cluster, but has only been drawn in for illustration.
  • the clusters 230, 232, 234 and 236 each have an associated cluster center 240, 242, 244 and 246 (shown as a plus).
  • the cluster centers 240, 242, 244, 246 each lie centrally within the cluster and are assigned to a cluster regardless of the boundaries of the grid of the input space.
  • the clusters according to FIG. 9 have the advantage that they represent the topology of the data in a particularly suitable manner.
  • the grid according to FIG. 3 has the advantage that the uncovered areas are mapped more appropriately.
  • the coverage of the input space (according to method step C) can be calculated using the grid and the complexity assessment (according to method step E) can also be calculated using the cluster center in addition to the grid.
  • Which approach is more suitable can also depend on the neural network method. If the coding neurons can move in the input space, then the cluster approach is preferably chosen or the cluster centers are equated with the positions of the coding neurons in the input space.
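
The coverage check of method steps C1 to C4 and the grouping of poorly covered surrounding areas into a contiguous area can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from the application: the function names `coverage_counts` and `low_coverage_regions` are hypothetical, each grid cell stands in for the surrounding area of one evenly distributed representative, the local quality assessment is simply the number of examples in the cell, and adjacency is taken as the 4-neighborhood in a two-dimensional input space.

```python
from collections import deque

import numpy as np


def coverage_counts(inputs, lower, upper, cells_per_dim):
    """Count examples per grid cell (assumption: one cell = one surrounding area)."""
    inputs = np.asarray(inputs, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # Map every input value to an integer cell index along each dimension.
    scaled = (inputs - lower) / (upper - lower) * cells_per_dim
    idx = np.clip(scaled.astype(int), 0, cells_per_dim - 1)
    counts = np.zeros((cells_per_dim,) * inputs.shape[1], dtype=int)
    # Unbuffered add, so several examples in the same cell are all counted.
    np.add.at(counts, tuple(idx.T), 1)
    return counts


def low_coverage_regions(counts, threshold):
    """Group adjacent cells with fewer than `threshold` examples (2-D grids only)."""
    low = counts < threshold
    seen = np.zeros_like(low, dtype=bool)
    regions = []
    for start in zip(*np.nonzero(low)):
        if seen[start]:
            continue
        region, queue = [], deque([start])
        seen[start] = True
        # Breadth-first flood fill over the 4-neighborhood of each cell.
        while queue:
            i, j = queue.popleft()
            region.append((int(i), int(j)))
            for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                inside = 0 <= ni < low.shape[0] and 0 <= nj < low.shape[1]
                if inside and low[ni, nj] and not seen[ni, nj]:
                    seen[ni, nj] = True
                    queue.append((ni, nj))
        regions.append(sorted(region))
    return regions


# Usage: three examples in the unit square, 2 x 2 grid.
examples = [[0.1, 0.1], [0.2, 0.2], [0.9, 0.9]]
counts = coverage_counts(examples, lower=[0, 0], upper=[1, 1], cells_per_dim=2)
regions = low_coverage_regions(counts, threshold=2)
```

Each returned region lists the grid indices of one contiguous group of under-covered cells, which gives the position and size of a part of the input space where further examples would have to be recorded in the sense of method step D.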

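The idea behind the local complexity assessment (method step E1) and its aggregation into a histogram can be illustrated with a simple proxy. The sketch below is an assumption made for illustration only and is not the QUEEN quality indicator from WASCHULZIK: `local_complexity` (a hypothetical name) compares the pairwise distances of the examples in the input space with those in the output space after dividing each set by its mean, so that a pure rescaling yields a complexity of zero. Returning 0.0 for fewer than two examples is likewise an assumption.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distances between all unordered pairs of points."""
    points = np.asarray(points, dtype=float)
    if points.ndim == 1:
        points = points[:, None]
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(points), k=1)
    return dist[upper]


def _normalise(distances):
    """Remove the overall scale by dividing by the mean distance."""
    mean = distances.mean()
    return distances / mean if mean > 0 else distances


def local_complexity(inputs, outputs):
    """Illustrative proxy: mean mismatch of scale-free input and output distances."""
    d_in = pairwise_distances(inputs)
    d_out = pairwise_distances(outputs)
    if d_in.size == 0:
        return 0.0  # assumption: nothing to compare with fewer than two examples
    return float(np.abs(_normalise(d_in) - _normalise(d_out)).mean())


def complexity_histogram(complexities, bins, value_range):
    """Aggregate local complexities: number of surrounding areas per bin."""
    counts, edges = np.histogram(complexities, bins=bins, range=value_range)
    return counts, edges


# Usage: a linear mapping has zero complexity under this proxy,
# a non-monotonic mapping of the same inputs does not.
linear = local_complexity([[0], [1], [2]], [[0], [3], [6]])
bumpy = local_complexity([[0], [1], [2]], [[0], [1], [0]])
counts, edges = complexity_histogram([linear, bumpy], bins=2, value_range=(0.0, 1.0))
```

Surrounding areas that fall into the upper bins would be candidates for a denser distribution of representatives or a new hierarchy level, while areas in the lowest bin are candidates for method steps G and H.
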
Abstract

The invention relates to a method for quality assurance of an example-based system. In order to improve quality assurance by means of said method, the example-based system (1) is created and trained using collected examples (22) that form an example set. The respective example (22) of the example set has an input value (12) that lies in an input space (20). The quality assessment, which represents coverage of the input space (20) by the examples (22) of the example set, is determined (C) on the basis of the distribution of the input values (12) in the input space (20). Figure 1:
EP21711743.1A 2020-03-11 2021-02-24 Method for quality assurance of an example-based system Pending EP4097647A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102020203135.5A DE102020203135A1 (de) 2020-03-11 2020-03-11 Method for quality assurance of an example-based system
PCT/EP2021/054507 WO2021180470A1 (fr) 2020-03-11 2021-02-24 Method for quality assurance of an example-based system

Publications (1)

Publication Number Publication Date
EP4097647A1 (fr) 2022-12-07

Family

ID=74873684

Family Applications (1)

Application Number Title Priority Date Filing Date
EP21711743.1A Pending EP4097647A1 (fr) 2020-03-11 2021-02-24 Method for quality assurance of an example-based system

Country Status (5)

Country Link
US (1) US20230121276A1 (fr)
EP (1) EP4097647A1 (fr)
CN (1) CN115280328A (fr)
DE (1) DE102020203135A1 (fr)
WO (1) WO2021180470A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4379671A1 (fr) * 2022-12-01 2024-06-05 Siemens Mobility GmbH Evaluation of input-output data sets using local complexity values, and associated data structure

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8086640B2 (en) * 2008-05-30 2011-12-27 Caterpillar Inc. System and method for improving data coverage in modeling systems
US11651277B2 (en) * 2010-03-15 2023-05-16 Numenta, Inc. Sparse distributed representation for networked processing in predictive system
US20170046615A1 (en) * 2015-08-13 2017-02-16 Lyrical Labs Video Compression Technology, LLC Object categorization using statistically-modeled classifier outputs
US10657457B1 (en) * 2013-12-23 2020-05-19 Groupon, Inc. Automatic selection of high quality training data using an adaptive oracle-trained learning framework
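EP3138075A4 (fr) * 2014-04-30 2017-10-25 Battelle Memorial Institute Decision support system for hospital quality assessment
CN107392255B (zh) * 2017-07-31 2020-06-12 深圳先进技术研究院 Method and apparatus for generating minority-class image samples, computing device, and storage medium
CN108830201B (zh) * 2018-06-01 2020-06-23 平安科技(深圳)有限公司 Method and apparatus for obtaining sample triplets, computer device, and storage medium
CN109102029B (zh) * 2018-08-23 2023-04-07 重庆科技学院 Method for assessing the quality of face samples synthesized by an information-maximizing generative adversarial network model
US20210357729A1 (en) * 2018-09-27 2021-11-18 Carnegie Mellon University System and method for explaining the behavior of neural networks
CN110059752A (zh) * 2019-04-19 2019-07-26 江苏工程职业技术学院 Statistical learning query method based on information-entropy sampling estimation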

Also Published As

Publication number Publication date
DE102020203135A1 (de) 2021-09-16
CN115280328A (zh) 2022-11-01
WO2021180470A1 (fr) 2021-09-16
US20230121276A1 (en) 2023-04-20

Similar Documents

Publication Publication Date Title
EP3785177B1 (fr) Method and device for determining a network configuration of a neural network
DE102017220307B4 (de) Device and method for recognizing traffic signs
DE102017203276B4 (de) Method and device for determining a trajectory in off-road scenarios
EP4046049B1 (fr) Attack deterrence generator, method for preventing an attack on an AI unit, and computer-readable storage medium
DE102019209644A1 (de) Method for training a neural network
DE102019214402A1 (de) Method and device for processing data by means of a convolutional neural network
EP3828758A1 (fr) Method for classifying objects, object classification circuit, motor vehicle
DE102021201124A1 (de) Training image classifier networks
DE102008036219A1 (de) Method for detecting objects in the surroundings of a vehicle
DE102021207613A1 (de) Method for quality assurance of a system
DE102020203047A1 (de) Efficient simultaneous inference computation for multiple neural networks
DE102019209463A1 (de) Method for determining a confidence value of an object of a class
EP4097647A1 (fr) Method for quality assurance of an example-based system
DE102020208080A1 (de) Detection of objects in images with equivariance or invariance to object size
EP4000011B1 (fr) Component-based processing of input variables
EP4323862A1 (fr) Method for quality assurance of a system
WO2022069182A1 (fr) Quality assurance method for an example-based system
DE102020128952A1 (de) Method and assistance device for two-stage image-based scene recognition, and motor vehicle
DE102007025620A1 (de) Device for determining an object and/or existence probability of a search object in a readout window of an image, method, and computer program
DE102023105860A1 (de) Method and data processing device for lidar-based environment detection, and motor vehicle therewith
DE102021131179A1 (de) Shape-prioritized image classification using deep convolutional networks
EP4033452B1 (fr) Domain-independent training of image classifiers
DE102022212374A1 (de) Computer-implemented method for detecting objects
DE102022213064A1 (de) Detection of unknown objects by neural networks for vehicles
DE102022212666A1 (de) Computer-implemented methods for anchor-based and keypoint-based detection of object centers

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20220829

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20250207

RAP3 Party data changed (applicant data changed or rights of an application transferred)

Owner name: SIEMENS MOBILITY GMBH