CN116168201B - Lane line segmentation method and device without accurate data labeling - Google Patents
Lane line segmentation method and device without accurate data labeling
- Publication number
- CN116168201B CN116168201B CN202310390212.4A CN202310390212A CN116168201B CN 116168201 B CN116168201 B CN 116168201B CN 202310390212 A CN202310390212 A CN 202310390212A CN 116168201 B CN116168201 B CN 116168201B
- Authority
- CN
- China
- Prior art keywords
- lane line
- semantic segmentation
- line data
- data
- segmentation network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/588—Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Molecular Biology (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Image Analysis (AREA)
Abstract
The application relates to a lane line segmentation method and device that do not require accurately labeled data. The method comprises: inputting a lane line sample image into an initial semantic segmentation network to obtain a predicted image, and training the initial semantic segmentation network against the predicted image and the lane line sample image to obtain a target semantic segmentation network. Training the initial semantic segmentation network comprises a first training process and a second training process. In the first training process, a candidate semantic segmentation network is generated according to a loss function computed on the first lane line data. In the second training process, the second lane line data is corrected according to the prediction result that the candidate semantic segmentation network outputs for it, the sample training set is updated accordingly, and the candidate semantic segmentation network is trained further to obtain the target semantic segmentation network model, which is then used to identify lane lines in road images to be detected. With the provided scheme, lane line segmentation can be completed even when the annotation data is inaccurate.
Description
Technical Field
The present disclosure relates to the field of map image processing, and in particular to a lane line segmentation method and apparatus that do not require accurately labeled data.
Background
When an autonomous vehicle is driving, or when a high-precision map is being produced, the precise position of the lane lines must be known in order to determine a safe driving area or to make decisions based on the lane lines.
At present, various image segmentation models have been developed for segmenting different kinds of images. An image segmentation model generally performs pixel-level classification of an image and can therefore segment the image finely. Before semantic segmentation of lane lines, every pixel belonging to a lane line in the image must be labeled manually. Such labeling is difficult to make accurate at the pixel level, and it is hard to guarantee that every pixel is labeled correctly, so manually labeled data usually contains some mislabeled pixels.
Because erroneous data is used, the model's estimates are worse than they would otherwise be. In addition, the labeled data is used to update the network weights, so erroneous labels cause the segmentation weights to be updated incorrectly. On the other hand, if the labeling must be accurate for every pixel, annotators have to spend far more time labeling, which raises the labeling cost, and even then it is difficult to guarantee that every pixel is labeled correctly.
Therefore, how to achieve accurate lane line segmentation when the lane line annotation data used to train the semantic segmentation network is inaccurate remains an unsolved problem in the prior art.
Disclosure of Invention
In order to solve, or at least partially solve, the problems in the related art, the application provides a lane line segmentation method and device that do not require accurately labeled data, which can remove lane line data with abnormal annotations and improve the accuracy of lane line segmentation.
The first aspect of the present application provides a lane line segmentation method without precisely labeling data, including:
the method comprises the steps of: obtaining a lane line sample image, and identifying first lane line data and second lane line data of the lane line sample image, wherein the first lane line data comprises the lane line region pixels whose lane line annotation in the lane line sample image is accurate, and the second lane line data comprises the lane line region pixels whose lane line annotation in the lane line sample image is inaccurate;
in a first training process, inputting the lane line sample image into an initial semantic segmentation network, the output of which includes a prediction result for the first lane line data; calculating a first loss function of the initial semantic segmentation network from the prediction result for the first lane line data and the first lane line data itself, and training the initial semantic segmentation network according to the first loss function to obtain a candidate semantic segmentation network;
in a second training process, obtaining a second lane line prediction result output by the candidate semantic segmentation network, judging whether the second lane line prediction result meets a preset prediction condition, and, if it does, correcting the second lane line data according to the second lane line prediction result and updating the corrected second lane line data into the training set of the candidate semantic segmentation network model;
training the candidate semantic segmentation network model on the updated training set to obtain a target semantic segmentation network model, and identifying lane lines in a road image to be detected based on the target semantic segmentation network model.
Optionally, identifying the first lane line data and the second lane line data of the lane line sample image includes:
identifying lane line pixels of the lane line sample image based on an edge detection algorithm, the lane line pixels carrying lane line labels;
detecting the edge pixels among the lane line pixels, setting those edge pixels as unlabeled pixels, and generating the first lane line data from the pixels that still carry a lane line label;
and removing the first lane line data from the lane line pixels to obtain the second lane line data.
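The splitting of annotated pixels into reliable interior pixels and unreliable edge pixels can be sketched as follows. This is a minimal illustration under stated assumptions: a binary annotation mask and a simple 4-neighbour interior test standing in for the edge-detection step; the function and variable names are hypothetical, not taken from the patent.

```python
import numpy as np

def split_lane_annotations(mask):
    """Split an annotated lane-line mask into reliable interior pixels
    (first lane line data) and unreliable edge pixels (second lane line data).

    mask: 2-D uint8 array, 1 = pixel carries a lane-line label, 0 = background.
    Returns (first_data, second_data) as boolean arrays of the same shape.
    """
    m = mask.astype(bool)
    # A labelled pixel is "interior" only if all 4 neighbours are also labelled.
    padded = np.pad(m, 1, constant_values=False)
    interior = (m
                & padded[:-2, 1:-1] & padded[2:, 1:-1]   # up, down neighbours
                & padded[1:-1, :-2] & padded[1:-1, 2:])  # left, right neighbours
    first_data = interior                 # reliable: away from the label boundary
    second_data = m & ~interior           # unreliable: edge of the labelled region
    return first_data, second_data
```

For a 3x3 block of labelled pixels, only the centre pixel survives as first lane line data; the surrounding ring becomes second lane line data.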
Optionally, calculating the first loss function of the initial semantic segmentation network from the prediction result for the first lane line data and the first lane line data, and training the initial semantic segmentation network according to the first loss function to obtain the candidate semantic segmentation network, includes:
inputting the lane line sample image into the initial semantic segmentation network to obtain a lane line prediction result, the lane line prediction result including a prediction result for the first lane line data, which in turn includes a first prediction probability that the first lane line data is a target lane line;
calculating the first loss function of the initial semantic segmentation network from the first lane line data and the first lane line prediction result, and adjusting the parameters of the initial semantic segmentation network according to the first loss function to obtain the candidate semantic segmentation network.
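As a rough sketch of how such a first loss function can be restricted to the reliably annotated pixels, the following computes a per-pixel binary cross-entropy and averages it only over the first lane line data. This is a minimal numpy illustration, not the patent's actual implementation; names are hypothetical.

```python
import numpy as np

def first_loss(pred_prob, labels, first_mask, eps=1e-7):
    """Binary cross-entropy restricted to reliably labelled pixels.

    pred_prob:  predicted probability that each pixel is lane line, shape (H, W)
    labels:     1 = lane line, 0 = background, shape (H, W)
    first_mask: True where the annotation is considered reliable (first data)
    """
    p = np.clip(pred_prob, eps, 1 - eps)
    bce = -(labels * np.log(p) + (1 - labels) * np.log(1 - p))
    # Unreliable (second) pixels contribute nothing to the loss or gradient.
    return float(bce[first_mask].mean())
```

In a real training loop the same masking would be applied before backpropagation, so the potentially mislabelled second lane line data cannot corrupt the weight updates of the first training process.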
Optionally, the method further comprises: starting the second training process when the first training process meets a second training process starting condition. The starting condition includes: the number of training iterations of the initial semantic segmentation network in the first training process reaching a preset number.
Optionally, when the second lane line prediction result meets the preset prediction condition, correcting the second lane line data according to the second lane line prediction result includes:
the second lane line prediction result comprises a second prediction probability that the second lane line data is a target lane line;
when the second prediction probability is smaller than a first threshold and the second lane line data carries a lane line label, judging that the lane line annotation of the second lane line data is wrong, and eliminating the wrongly annotated lane line data;
and when the second prediction probability is larger than a second threshold and the second lane line data does not carry a lane line label, judging that the lane line annotation of the second lane line data was missed, and adding a lane line label to that second lane line data.
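The two correction rules above can be sketched as follows. The threshold values are placeholders, since the patent does not state the concrete first and second thresholds; the function name is hypothetical.

```python
import numpy as np

def correct_second_data(pred_prob, labels, second_mask, t_low=0.3, t_high=0.7):
    """Correct unreliable annotations using the candidate network's prediction.

    - A labelled pixel predicted with probability < t_low is treated as a
      labelling error and its lane-line label is removed.
    - An unlabelled pixel predicted with probability > t_high is treated as a
      missed label and a lane-line label is added.
    t_low / t_high stand in for the patent's first and second thresholds,
    whose concrete values are not given.
    """
    corrected = labels.copy()
    wrong = second_mask & (labels == 1) & (pred_prob < t_low)
    missed = second_mask & (labels == 0) & (pred_prob > t_high)
    corrected[wrong] = 0    # eliminate mislabelled lane-line pixels
    corrected[missed] = 1   # add labels the annotator missed
    return corrected
```

Pixels whose prediction falls between the two thresholds are left unchanged, which matches the patent's requirement that correction happens only when the prediction condition is met.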
Optionally, after adding the lane line label to the second lane line data whose annotation was missed, the method includes:
inputting the updated lane line data into the candidate semantic segmentation network to obtain a third lane line prediction result;
and, when the third lane line prediction result meets the preset prediction condition, inputting the third lane line prediction result into the candidate semantic segmentation network, so as to realize dynamic growth of the second lane line data.
Optionally, the dynamic growth of the second lane line data includes: stopping training of the candidate semantic segmentation network when the distance from a first pixel point to a second pixel point is greater than a preset pixel distance, where the first pixel point is the pixel in the predicted lane line data of the second lane line data that lies farthest from the middle of the lane line, and the second pixel point is the pixel carrying a lane line label in the second lane line data to be corrected that is closest to the first pixel point.
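One possible reading of this stopping criterion, sketched in numpy: growth stops once some predicted lane-line pixel lies farther than the preset pixel distance from every labelled pixel. The exact distance computation is an assumption on my part; the patent only describes the first/second pixel points in prose, and the function name is hypothetical.

```python
import numpy as np

def should_stop_growth(pred_pixels, labelled_pixels, max_dist):
    """Stop criterion for dynamic growth of the second lane line data.

    pred_pixels:     (N, 2) pixel coordinates predicted as lane line
    labelled_pixels: (M, 2) pixel coordinates that still carry a lane-line label
    Growth stops once some predicted pixel lies farther than max_dist from
    every labelled pixel, i.e. the prediction has drifted away from the
    annotated lane-line region.
    """
    if len(pred_pixels) == 0 or len(labelled_pixels) == 0:
        return False
    # Distance from each predicted pixel to its nearest labelled pixel.
    diff = pred_pixels[:, None, :] - labelled_pixels[None, :, :]
    nearest = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    return bool(nearest.max() > max_dist)
```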
A second aspect of the present application provides a lane line segmentation apparatus without precisely labeling data, the apparatus comprising:
an acquisition module, configured to acquire a lane line sample image and to identify first lane line data and second lane line data of the lane line sample image, where the first lane line data are the lane line region pixels whose lane line annotation in the lane line sample image is accurate, and the second lane line data are the lane line region pixels whose lane line annotation in the lane line sample image is inaccurate;
a first training module, configured to input the lane line sample image into an initial semantic segmentation network in a first training process, the output of which includes a prediction result for the first lane line data; to calculate a first loss function of the initial semantic segmentation network from the prediction result for the first lane line data and the first lane line data; and to train the initial semantic segmentation network according to the first loss function to obtain a candidate semantic segmentation network;
a second training module, configured to obtain, in a second training process, a second lane line prediction result output by the candidate semantic segmentation network, to judge whether the second lane line prediction result meets a preset prediction condition, and, when it does, to correct the second lane line data according to the second lane line prediction result and update the corrected second lane line data into the training set of the candidate semantic segmentation network model;
a segmentation module, configured to train the candidate semantic segmentation network model on the updated training set to obtain a target semantic segmentation network model, and to identify lane lines in a road image to be detected based on the target semantic segmentation network model.
A third aspect of the present application provides an electronic device, comprising:
a processor; and
a memory having executable code stored thereon which, when executed by the processor, causes the processor to perform the method as described above.
A fourth aspect of the present application provides a computer readable storage medium having stored thereon executable code which, when executed by a processor of an electronic device, causes the processor to perform a method as described above.
The technical scheme provided in this application can have the following beneficial effects. In the first aspect, training the initial semantic segmentation network comprises a first training process and a second training process. First lane line data and second lane line data of a lane line sample image are identified, where the first lane line data are the lane line region pixels whose annotation is accurate and the second lane line data are the lane line region pixels whose annotation is inaccurate. In the first training process, the lane line sample image is input into the initial semantic segmentation network to obtain a prediction result for the first lane line data; a first loss function of the initial semantic segmentation network is calculated from this prediction result and the first lane line data, and the initial semantic segmentation network is trained according to the first loss function to obtain a candidate semantic segmentation network. In this way, the first training process produces a highly accurate candidate semantic segmentation model trained only on accurate lane annotations.
In the second aspect, in the second training process, a second lane line prediction result output by the candidate semantic segmentation network is obtained and checked against a preset prediction condition; when the condition is met, the second lane line data is corrected according to the second lane line prediction result and the corrected data is updated into the training set of the candidate semantic segmentation network model. The candidate model is then trained on the updated training set to obtain the target semantic segmentation network model, which is used to identify lane lines in road images to be detected. Because the second training process corrects the second lane line data using the predictions of the network from the first training process, deviations introduced by manual annotation can be corrected while the training set of the semantic segmentation model is updated, so the final target semantic segmentation network is more accurate.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The foregoing and other objects, features and advantages of the application will be apparent from the following more particular descriptions of exemplary embodiments of the application as illustrated in the accompanying drawings wherein like reference numbers generally represent like parts throughout the exemplary embodiments of the application.
Fig. 1 is a flow chart of a lane line segmentation method without precisely labeling data according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the dynamic growth of lane line data according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a lane line segmentation apparatus without precisely labeling data according to an embodiment of the present application;
fig. 4 is a schematic structural view of a vehicle shown in an embodiment of the present application.
Detailed Description
Embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The terminology used in the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the present application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms "first," "second," "third," etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more such features. In the description of the present application, "a plurality" means two or more, unless explicitly defined otherwise.
In order to facilitate a better understanding of the technical solutions of the present application, the following description refers to the terms related to the present application.
1. Artificial intelligence (Artificial Intelligence, AI): the theory, methods, technologies, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the capabilities of perception, reasoning, and decision-making.
Artificial intelligence is a comprehensive discipline spanning a wide range of fields, involving both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
2. Machine learning (ML): a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills, and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to give computers intelligence; it is applied throughout all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
3. Convolutional neural network (Convolutional Neural Network, CNN): a feedforward neural network whose artificial neurons respond to surrounding units within their receptive field; it performs very well on large-scale image processing. A convolutional neural network consists of one or more convolutional layers and fully connected layers at the top (as in a classical neural network), and also includes associated weights and pooling layers.
4. Deep features: image features extracted by a deep network, containing abstract information about the image.
5. Semantic segmentation: assigning a corresponding class label to each pixel in an image according to the object of interest to which that pixel belongs.
6. Semantic image: the result obtained after assigning a category label to each pixel in the image.
7. Mask image: in the embodiments of the present application, the annotation image used to represent the road image. It may be a binary image containing a first type of pixel with a first value and a second type of pixel with a second value; for example, a value of 0 in the binary image indicates that the pixel is not selected, and a value of 1 indicates that the pixel is selected.
8. Conditional generative adversarial network (Conditional Generative Adversarial Nets, CGAN): an improvement on the basic GAN, achieved by feeding additional conditional information to the generator and discriminator of the original GAN. The additional conditional information may be a category label or other auxiliary information.
9. ImageNet database: a large-scale image database covering 1000 categories.
10. MobileNet V2, a commonly used lightweight network model architecture, is trained on an ImageNet database and can be used to extract image features.
11. Image classification and category: image classification refers to an image processing method that distinguishes objects of different categories according to the different features reflected in the image information. It uses a computer to analyze the image quantitatively and assigns each pixel or region in the image to one of several categories, replacing human visual interpretation. A category may also be referred to as a class. In embodiments of the present application there may be two or more categories, such as vehicles, roads, and so on. When the image semantic segmentation model is applied to different scenes, the categories to be annotated may differ accordingly. Each object in the image is actually made up of pixels, and the category of a pixel corresponds to the category of the object it belongs to.
12. Sample image and target image: all belong to the image, in the embodiment of the application, the image used for training the model is called a sample image, and the image which is processed by using the model later is called a target image.
13. Edge information and edge pixel points: edge information describes the pixels in an image whose neighborhood gray levels change discontinuously; a pixel whose neighborhood gray levels change discontinuously is an edge pixel. Edge information may specifically include the gray values of the edge pixels, the shapes formed by the edge pixels, and so on. Edges exist widely between objects and the background and between one object and another. Edge information in an image may be obtained by image edge detection.
14. Conditional random fields (CRFs): a discriminative probabilistic model and a type of random field, commonly used to label or analyze sequence data such as natural language text or biological sequences. A conditional random field is a conditional probability distribution model P(Y|X) that represents a Markov random field over a set of output random variables Y given a set of input random variables X; that is, a CRF is characterized by the assumption that the output random variables form a Markov random field. Conditional random fields can be seen as a generalization of the maximum entropy Markov model to labeling problems. Like a Markov random field, a conditional random field is an undirected graphical model: vertices in the graph represent random variables, and links between vertices represent dependencies between the random variables. In a conditional random field, the distribution of the random variables Y is a conditional probability given the observed random variables X. In principle the graph layout of a conditional random field can be arbitrary, but a common layout is a chain architecture, for which relatively efficient algorithms are available for training, inference, and decoding. A conditional random field is a typical discriminative model whose joint probability can be written as a product of potential functions, with linear-chain conditional random fields being the most common.
When an autonomous vehicle is driving, or when a high-precision map is being produced, the precise position of the lane lines must be known in order to determine a safe driving area or to make decisions based on the lane lines. At present, various image segmentation models have been developed for segmenting different kinds of images. Among them, an important class is the image semantic segmentation model, which generally performs pixel-level classification of an image and can therefore segment the image finely. Before semantic segmentation of lane lines, every pixel belonging to a lane line in the image must be labeled manually. Such labeling is difficult to make accurate at the pixel level, and it is hard to guarantee that every pixel is labeled correctly, so manually labeled data usually contains some mislabeled pixels. If the labeling must be accurate for every pixel, annotators have to spend far more time labeling, which raises the labeling cost, and even then it is difficult to guarantee that every pixel is labeled correctly.
In view of the above problems, embodiments of the present application provide a lane line segmentation method that does not require accurate labeling of data, and improves the efficiency of lane line segmentation from road images.
The following describes the technical scheme of the embodiments of the present application in detail with reference to the accompanying drawings.
Referring to fig. 1, a flow chart of a lane line segmentation method without precisely labeling data according to an embodiment of the present application mainly includes steps S101 to S104, which are described as follows:
step S101: the method comprises the steps of obtaining a lane line sample image, and identifying first lane line data and second lane line data of the lane line sample image, wherein the first lane line data is lane line region pixels with accurate lane line marking in the lane line sample image, and the second lane line data is lane line region pixels with inaccurate lane line marking in the lane line sample image.
In this embodiment, the lane line sample image is an image containing manual lane line annotations. Because the manually annotated lane line pixels may be inaccurate, the lane line pixels are divided into first lane line data, whose annotated pixel region is comparatively accurate, and second lane line data, whose annotated pixel region is inaccurate.
Following step S101, the acquired lane line sample image is input into an initial semantic segmentation network to obtain a predicted image, and the initial semantic segmentation network is trained against the predicted image and the lane line sample image to obtain a target semantic segmentation network. Training the initial semantic segmentation network comprises a first training process, which yields a candidate semantic segmentation network, and a second training process.
In one embodiment, identifying first lane line data and second lane line data of a lane line sample image includes: performing edge detection on the lane line sample image subjected to gray level processing, and determining pixel points with lane line characteristics; and marking lane line marks on pixel points of the lane line features, and marking non-lane line marks on other pixels in the road image.
In this embodiment, the edge detection includes: detecting edge information of the sample image, determining edge pixel points with lane line edge characteristics in the sample image according to the edge information, and correspondingly determining lane line pixels of the edge pixel points included in the image. Edge pixels can be understood as pixels whose pixel values vary greatly according to the lane line characteristics. Specifically, the sample image is converted into a gray scale image, and then edge information of the image is extracted based on a preset edge detection algorithm, such as a canny edge detection algorithm.
In one embodiment, in a first training process, identifying first lane line data and second lane line data of a lane line sample image includes: identifying lane line pixels of the lane line sample image based on an edge detection algorithm, wherein the lane line pixels carry lane line labels; detecting edge pixels of lane line pixels, setting the edge pixels of the lane line pixels as unlabeled pixels, and generating first lane line data based on the pixels carrying lane line labeling; and eliminating the first lane line data in the lane line pixels to obtain the second lane line data.
Because the reliability of the labeling of the middle pixels of the lane line pixels is stronger, the embodiment eliminates the edge pixels and divides the lane line data into reliable first lane line data and unreliable second lane line data.
Machine learning includes supervised learning (supervised learning), unsupervised learning (unsupervised learning), and semi-supervised learning (semi-supervised learning). In supervised learning, data is annotated, in the form of (x, t), where x is the input data and t is the annotation. (all annotation data are also known as group trunk). The data from the model function then appears in the form of (x, y). Where x is the previous input data and y is the value of the model prediction. The labels are compared with the model predicted results. Y and t are compared in the loss function (loss function/error function) to calculate loss (loss/error). Therefore, if the annotation data is not the group trunk, errors will be generated in the calculation of loss, and thus the model quality will be affected.
The estimation of the model is worse than actually, due to the use of erroneous data. Additionally, the tag data is also used to update the weights, and erroneous tag data can cause weight update errors. It is therefore necessary to use high quality data. Since lane line pixels are marked with lane line marks, the lane line pixels often have problems at the lane line edges. Therefore, lane line marks of the lane line edge pixels in the first lane line data are changed into unlabeled marks, and the generated second lane line data are more accurate than the first lane line data.
In one embodiment, identifying lane line edge pixels of the first lane line data, setting the lane line edge pixels to be unlabeled with a mark, and obtaining second lane line data includes: identifying lane line edge pixels of the first lane line data; determining a lane line edge interval of a lane line edge pixel according to the first pixel threshold; and setting lane pixels in the lane line edge interval as unlabeled marks.
In this embodiment, the lane mark may be a pixel value with a gray value of 255, and the unlabeled mark may be a pixel value with a gray value of 0. According to the embodiment, the first lane line data is changed into the second lane line data, so that a more accurate lane line sample set can be obtained. In a specific embodiment, the lane line edge pixel value may be set to 8, that is, 8 pixels are removed from the lane line edge pixel at the end, and the remaining lane line pixels are lane line edge intervals.
Step S102: in a first training process, inputting lane line sample images into an initial semantic segmentation network, wherein the output result of the initial semantic segmentation network comprises a prediction result of first lane line data; according to the prediction result of the first lane line data and the first lane line data, a first loss function of the initial semantic segmentation network is calculated, and the initial semantic segmentation network is trained according to the first loss function, so that a candidate semantic segmentation network is obtained.
In one embodiment, training an initial semantic segmentation network based on first lane line data and lane line sample images to obtain candidate semantic segmentation networks includes: inputting the lane line sample image into an initial semantic segmentation network to obtain a lane line prediction result, wherein the lane line prediction result comprises a prediction result of first lane line data, and the prediction result of the first lane line data comprises a first prediction probability that the first lane line data is a target lane line; according to the first lane line data and the first lane line prediction result, a first loss function of the initial semantic segmentation network is calculated, and parameters of the initial semantic segmentation network are adjusted according to the first loss function, so that candidate semantic segmentation networks are obtained.
In one embodiment, the method further comprises: when the first training process meets the starting condition of the second training process, starting the second training process; the second training process starting conditions include: the number of times in the first training process meets the preset training number.
Specifically, setting the number of manual training to 175 and the preset number of training to 50, and starting a second training process after the 50 th training of the initial semantic segmentation network is completed.
Step S103: and acquiring a second lane line prediction result output by the candidate semantic segmentation network, judging whether the second lane line prediction result meets a preset prediction condition in a second training process, correcting second lane line data according to the second lane line prediction result under the condition that the second lane line prediction result meets the preset prediction condition, and updating the corrected second lane line data to a training set of the candidate semantic segmentation network model.
Since the candidate semantic segmentation network in step S102 already has a certain judging capability, the second lane line can be judged according to the candidate semantic segmentation network. In step S103, the second lane line prediction result is a probability of being the target lane line output based on the image corresponding to the second lane line data. And judging whether the second lane line prediction result is available according to the probability and a preset probability threshold value. And under the condition that the second lane line prediction result meets the preset prediction condition, correcting the second lane line data according to the second lane line prediction result, and updating the corrected second lane line data to the training set of the candidate semantic segmentation network model.
In one embodiment, when the second lane line prediction result meets the preset prediction condition, correcting the second lane line data according to the second lane line prediction result includes: the second lane line prediction result comprises the prediction probability that the second lane line data is the target lane line; under the condition that the prediction probability is smaller than a first threshold value and the second lane line data carries lane line marks, judging that lane line marking of the second lane line data is wrong, and eliminating lane line data with wrong lane line marking; and under the condition that the prediction probability is larger than a second threshold value and the second lane line data does not carry the lane line mark, judging that the lane line marking of the second lane line data is missed, and adding the lane line mark to the second lane line data of the lane line marking missed.
Specifically, whether the second lane line prediction result is available is judged to be divided into two cases, wherein in the case that the prediction probability is smaller than a first threshold value and lane line marks are carried by second lane line data, lane line marking errors of the second lane line data are judged, the second lane line is considered to be required to shrink at the moment, and lane line data with the lane line marking errors are removed. And secondly, judging that the lane line marking of the second lane line data is missed and adding the lane line mark to the second lane line data marked with the missed mark under the condition that the prediction probability is larger than a second threshold value and the second lane line data does not carry the lane line mark. The lane lines are considered to need to be flared at this time.
Specifically, the first threshold value is set to be 2 percent, the second threshold value is set to be 90 percent, the second lane line data has 6 pixels, two pixels meeting the first threshold value and the first condition are set, four pixels meeting the second threshold value and the second condition are set, the two pixels are removed, the four pixels are reserved, and the four pixels are second lane line prediction pixels of the second lane line pixel extension.
In one embodiment, modifying the second lane line data based on the second lane line prediction result comprises: each prediction result only corrects the lane line pixels within the first pixel distance at the two ends of the first lane line data. For example, if the first pixel distance is 4 pixels, the second lane line prediction result is a prediction result that the two ends of the first lane line pixel extend outwards by 4 pixels.
In one embodiment, after adding the lane line identifier to the lane line-labeled second lane line data, the method includes: inputting the updated lane line data into a candidate semantic segmentation network to obtain a third lane line prediction result; and under the condition that the third lane line prediction result meets the preset prediction condition, inputting the third lane line prediction result into a candidate semantic segmentation network so as to realize the dynamic growth of the second lane line data.
In this embodiment, the pixel length is the same for each dynamic growth. The length of the second lane line data predictor is the same as the length of the third lane line data predictor. For example, if the first pixel distance is the distance from the end of the second lane line prediction result to the same end of the first lane line data, the pixel distance from the end of the third lane line prediction result to the end of the second lane line in the same direction is also the first pixel distance.
In this embodiment, the dynamic growth of the second lane line data includes: when the distance from the first pixel point to the second pixel point is greater than the preset pixel distance, the candidate semantic segmentation network stops training; the first pixel points are pixels far away from the middle of the lane lines in the predicted lane line data of the second lane line data, and the second pixel points are pixels closest to the first pixel points in the second lane line data to be corrected and carrying lane line marks. For example, the pixel length of the first lane line data is 8, if the pixel length generated by the candidate semantic segmentation network dynamically at two ends of the lane line pixel is 4, the candidate semantic segmentation network generates 8+4+4=16 lane lines for the first time, when the lane lines with the 16 pixel lengths are input into the candidate semantic segmentation network, the lane lines with the 4 pixel lengths are generated again at two ends, the length of the predicted growing lane line is 16+4+4=24, but if the outermost one of the lane lines with the 24 length at this time is marked as the first pixel point, at this time, in the same direction, the pixel point closest to the first pixel point in the second lane line data to be corrected and carrying the lane line mark is the second pixel point, the distance between the first pixel point and the second pixel point is X, if the preset pixel distance is 4, if X is greater than 4, the lane line distance of the first pixel is generated, at this time the pixel length is still 16.
In this embodiment, according to the intermediate data (new label data) of the lane line, the data extends toward the two side edges of the two ends, so as to update the two side edge data of the original lane line.
Specifically, as shown in fig. 2, fig. 2 is a schematic diagram of dynamic growth of lane line data according to an embodiment of the present application. Taking the example that the threshold value of the data output of the second lane line is a pixel value 8 each time, the pixel value 8 adjacently represents 8 pixel points around one pixel point. The method is that the edge part of the lane line data A0 in the sample picture is removed, only accurate marking data in the lane line is reserved, the data after the edge is removed is represented by A (first lane line data), the pixel point of the A is removed from A0 to be second lane line data, the second lane line still carries lane line identification at this time, but the second lane line does not participate in iteration of the initial semantic segmentation network in the first training process. A is dynamically updated during the training process. The specific updating flow is as follows: in the first training process, a sample image A0 is input into an initial semantic segmentation network, the initial semantic segmentation network outputs a segmentation result, a first lane line prediction probability corresponding to a first lane line A is selected from the segmentation result, and a loss function is calculated based on the first lane line prediction probability and the first lane line data A. In the first training process, the recognition capability of the network is improved, and the initial semantic segmentation network is trained according to the loss function calculated by the first lane line A to obtain a candidate semantic segmentation model. 
In the second training process, obtaining a result A2 output by the candidate semantic segmentation network model, wherein the result A2 comprises a predicted result of a first lane line A and a predicted result of a second lane line; and correcting the second lane line according to the prediction result of the second lane line, taking the case that the lane line needs to be newly marked as an example under the condition of meeting the preset prediction condition, correcting lane line data in A2 at the moment to obtain a third lane line A3, setting the growth threshold value of the third lane line to be 8 in the training process, namely searching 8 adjacent positions of each edge pixel point in each direction between A2 and A3, wherein the 8 positions correspond to 8 positions output by a network, updating input lane line data if the value of the corresponding 8 positions output by the network is larger than the set threshold value, and setting the position to be in a marked state, otherwise, keeping an unmarked state. And replacing A3 with A2, updating the sample picture training set, inputting A3 into a candidate semantic segmentation network model trained after the training set is updated, sequentially searching 8 adjacent positions of each edge pixel point in each direction of A3 to obtain a third lane line prediction result, and correcting the third lane line prediction result under the condition that the 8 pixels meet preset prediction conditions to obtain fourth lane line data. At this time, the second lane line data undergoes two rounds of training in the second training process, and the edge pixel point of the fourth lane line data in each direction is larger than 16 pixels of the second lane line data (only 8 pixels are added in each training). In this embodiment, only the case of lane-line growth is considered.
When the network calculates the loss function to carry out back propagation, only the pixel points with marked states are used. For example, in the first dynamic growth process, data A is input, and the first growth lane line data A1 is obtained by growth, wherein A1 is larger than A8 pixel adjacent positions. Inputting data A1 into the target neural network to obtain lane line data A2 of a second time growth, wherein A2 is larger than A1, the adjacent positions of 8 pixels are larger than A1, and repeatedly inputting the lane line data until the lane line data B triggering a preset stop condition.
In one embodiment, the method comprises: performing convolution training on a preset neural network according to the first lane line data and the second lane line data to obtain a target neural network, including: inputting the second lane line data into a preset neural network to obtain an output result of the preset convolutional network; calculating a loss function of a preset convolution network according to the output result and the first lane line data; training a preset neural network according to a loss function of the preset convolution network to obtain a target neural network, and outputting first lane line data by the target neural network according to the input second lane line data.
In this embodiment, the second lane line data is relatively accurate, the first lane line data is input into the preset convolutional neural network, a segmentation result of the first lane line data is obtained, a loss function of the second lane line is obtained by comparing the segmentation result with the second lane line data, and the preset convolutional network is iterated according to the loss function, so that the target neural network is obtained.
In embodiments of the present application, a conditional random field algorithm may be used to calculate and label the probability of becoming a lane line for each lane line pixel. And comparing probability labels of the same pixels to obtain a loss function.
Step S104: training the candidate semantic segmentation network model according to the updated training set to obtain a target semantic segmentation network model, and identifying lane lines of the road image to be detected based on the target semantic segmentation network model.
In this embodiment, since the updated training set is more accurate, the updated training set may enable more accurate calculation of the loss function of the candidate semantic segmentation model.
A second aspect of the present application provides a lane line segmentation apparatus without precisely labeling data, as shown in fig. 3, the apparatus includes:
the obtaining module 301 is configured to obtain a lane line sample image, identify first lane line data and second lane line data of the lane line sample image, where the first lane line data is lane line region pixels with accurate lane line labeling in the lane line sample image, and the second lane line data is lane line region pixels with inaccurate lane line labeling in the lane line sample image;
the first training module 302 is configured to input a lane line sample image into an initial semantic segmentation network in a first training process, where an output result of the initial semantic segmentation network includes a prediction result of first lane line data; calculating a first loss function of the initial semantic segmentation network according to the prediction result of the first lane line data and the first lane line data, training the initial semantic segmentation network according to the first loss function, and obtaining candidate semantic segmentation networks;
The second training module 303 is configured to obtain a second lane line prediction result output by the candidate semantic segmentation network in a second training process, determine whether the second lane line prediction result meets a preset prediction condition, correct second lane line data according to the second lane line prediction result when the second lane line prediction result meets the preset prediction condition, and update the corrected second lane line data to a training set of the candidate semantic segmentation network model;
the segmentation module 304 is configured to train the candidate semantic segmentation network model according to the updated training set to obtain a target semantic segmentation network model, and identify a lane line of the road image to be detected based on the target semantic segmentation network model.
In one embodiment, identifying first lane line data and second lane line data of a lane line sample image includes: identifying lane line pixels of the lane line sample image based on an edge detection algorithm; detecting edge pixels of lane line pixels, setting the edge pixels of the lane line pixels as unlabeled pixels, and generating first lane line data based on middle lane line pixels of the lane line pixels and pixel labeling of the middle lane line; and eliminating the first lane line data in the lane line pixels to obtain the second lane line data.
In one embodiment, calculating a first loss function of the initial semantic segmentation network according to a prediction result of the first lane line data and the first lane line data, training the initial semantic segmentation network according to the first loss function, and obtaining a candidate semantic segmentation network, including: inputting the lane line sample image into an initial semantic segmentation network to obtain a lane line prediction result, wherein the lane line prediction result comprises a prediction result of first lane line data, and the prediction result of the first lane line data comprises a first prediction probability that the first lane line data is a target lane line; according to the first lane line data and the first lane line prediction result, a first loss function of the initial semantic segmentation network is calculated, and parameters of the initial semantic segmentation network are adjusted according to the first loss function, so that candidate semantic segmentation networks are obtained.
In one embodiment, the method further comprises: when the first training process meets the starting condition of the second training process, starting the second training process; the second training process starting conditions include: in the first training process, the initial semantic segmentation network training times meet the preset training times.
In one embodiment, when the second lane line prediction result meets the preset prediction condition, correcting the second lane line data according to the second lane line prediction result includes: the second lane line prediction result comprises a second prediction probability that second lane line data is a target lane line; under the condition that the second prediction probability is smaller than the first threshold value and the second lane line data carries lane line marks, judging that lane line marking of the second lane line data is wrong, and eliminating lane line data with wrong lane line marking; and under the condition that the second prediction probability is larger than a second threshold value and the second lane line data does not carry lane line marks, judging that lane line marks of the second lane line data are missed, and adding the lane line marks to the second lane line data with the lane line marks being missed.
In one embodiment, after adding the lane line identifier to the lane line-labeled second lane line data, the method includes: inputting the updated lane line data into a candidate semantic segmentation network to obtain a third lane line prediction result; and under the condition that the third lane line prediction result meets the preset prediction condition, inputting the third lane line prediction result into a candidate semantic segmentation network so as to realize the dynamic growth of the second lane line data.
In one embodiment, the dynamic growth of the second lane line data includes: and stopping training by the candidate semantic segmentation network when the distance from the predicted lane line data of the second lane line data to the last marked pixel point of the lane line is greater than the preset pixel distance.
The technical scheme that this application provided can include following beneficial effect: the technical scheme that this application provided can include following beneficial effect: the first aspect the present application trains the initial semantic segmentation network including a first training process and a second training process; first lane line data and second lane line data of a lane line sample image are identified, wherein the first lane line data is lane line region pixels with accurate lane line marking in the lane line sample image, and the second lane line data is lane line region pixels with inaccurate lane line marking in the lane line sample image; in a first training process, inputting lane line sample images into an initial semantic segmentation network to obtain a prediction result of first lane line data and a prediction result of second lane line; calculating a first loss function of the initial semantic segmentation network according to the prediction result of the first lane line data and the first lane line data, training the initial semantic segmentation network according to the first loss function, and obtaining candidate semantic segmentation networks; in the first training process, a candidate semantic segmentation model with high accuracy is obtained through accurate lane marking training.
In the second aspect, in the second training process, judging whether a second lane line prediction result meets a preset prediction condition, correcting second lane line data according to the second lane line prediction result under the condition that the second lane line prediction result meets the preset prediction condition, and updating the corrected second lane line data to a training set of the candidate semantic segmentation network model; training the candidate semantic segmentation network model according to the updated training set to obtain a target semantic segmentation network model, and identifying lane lines of the road image to be detected based on the target semantic segmentation network model. The second training process corrects the second lane line data according to the second lane line prediction result in the first training process, so that deviation caused by manual annotation can be corrected, and meanwhile, the training set of the semantic segmentation model is updated, so that the finally obtained target semantic segmentation network result is more accurate.
Referring to fig. 4, a schematic structural diagram of an electronic device according to an embodiment of the present application is shown. The vehicle 400 includes a memory 410 and a processor 420.
The processor 420 may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
Memory 410 may include various types of storage units, such as system memory, read Only Memory (ROM), and persistent storage. Where the ROM may store static data or instructions that are required by the processor 420 or other modules of the computer. The persistent storage may be a readable and writable storage. The persistent storage may be a non-volatile memory device that does not lose stored instructions and data even after the computer is powered down. In some embodiments, the persistent storage device employs a mass storage device (e.g., magnetic or optical disk, flash memory) as the persistent storage device. In other embodiments, the persistent storage may be a removable storage device (e.g., diskette, optical drive). The system memory may be a read-write memory device or a volatile read-write memory device, such as dynamic random access memory. The system memory may store instructions and data that are required by some or all of the processors at runtime. Furthermore, memory 410 may include any combination of computer-readable storage media including various types of semiconductor memory chips (e.g., DRAM, SRAM, SDRAM, flash memory, programmable read-only memory), magnetic disks, and/or optical disks may also be employed. In some implementations, memory 410 may include readable and/or writable removable storage devices such as Compact Discs (CDs), digital versatile discs (e.g., DVD-ROMs, dual layer DVD-ROMs), blu-ray discs read only, super-density discs, flash memory cards (e.g., SD cards, min SD cards, micro-SD cards, etc.), magnetic floppy disks, and the like. The computer readable storage medium does not contain a carrier wave or an instantaneous electronic signal transmitted by wireless or wired transmission.
The memory 410 has stored thereon executable code that, when processed by the processor 420, can cause the processor 420 to perform some or all of the methods described above.
Furthermore, the method according to the present application may also be implemented as a computer program or computer program product comprising computer program code instructions for performing part or all of the steps of the above-described method of the present application.
Alternatively, the present application may also be embodied as a computer-readable storage medium (or non-transitory machine-readable storage medium or machine-readable storage medium) having stored thereon executable code (or a computer program or computer instruction code) which, when executed by a processor of a vehicle (or a server, etc.), causes the processor to perform part or all of the steps of the above-described method according to the present application.
The embodiments of the present application have been described above, the foregoing description is exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the improvement of technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (10)
1. A lane line segmentation method that does not require precisely labeled data, characterized by comprising the following steps:
acquiring a lane line sample image, and identifying first lane line data and second lane line data of the lane line sample image, wherein the first lane line data are lane line region pixels in the lane line sample image whose lane line labels are accurate, and the second lane line data are lane line region pixels in the lane line sample image whose lane line labels are inaccurate;
in a first training process, inputting the lane line sample image into an initial semantic segmentation network, wherein an output result of the initial semantic segmentation network comprises a prediction result of the first lane line data; calculating a first loss function of the initial semantic segmentation network according to the prediction result of the first lane line data and the first lane line data, and training the initial semantic segmentation network according to the first loss function to obtain a candidate semantic segmentation network;
in a second training process, acquiring a second lane line prediction result output by the candidate semantic segmentation network, judging whether the second lane line prediction result meets a preset prediction condition, and, in the case where the second lane line prediction result meets the preset prediction condition, correcting the second lane line data according to the second lane line prediction result and adding the corrected second lane line data to a training set of the candidate semantic segmentation network model;
training the candidate semantic segmentation network model according to the updated training set to obtain a target semantic segmentation network model, and identifying lane lines of a road image to be detected based on the target semantic segmentation network model.
2. The method of claim 1, wherein identifying the first lane line data and the second lane line data of the lane line sample image comprises:
identifying lane line pixels of the lane line sample image based on an edge detection algorithm, wherein the lane line pixels carry lane line labels;
detecting edge pixels among the lane line pixels, setting the edge pixels as unlabeled pixels, and generating the first lane line data according to the remaining pixels carrying lane line labels;
and eliminating the first lane line data from the lane line pixels to obtain the second lane line data.
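The interior/edge split of claim 2 can be sketched with a simple 4-neighbour test standing in for a full edge detection algorithm (an illustrative assumption; the claim does not fix a particular edge detector, and `split_lane_labels` is a hypothetical helper name):

```python
import numpy as np

def split_lane_labels(label_mask: np.ndarray):
    """Split a binary lane-line mask into confidently labeled interior
    pixels (first data) and uncertain edge pixels (second data).

    label_mask: 2-D array of 0/1 lane-line annotations.
    Returns (first_data, second_data) as boolean masks.
    """
    lane = label_mask.astype(bool)
    # A lane pixel is an edge pixel if any 4-neighbour is background.
    padded = np.pad(lane, 1, mode="constant", constant_values=False)
    interior = (
        lane
        & padded[:-2, 1:-1] & padded[2:, 1:-1]   # up / down neighbours
        & padded[1:-1, :-2] & padded[1:-1, 2:]   # left / right neighbours
    )
    first_data = interior            # accurately labeled region pixels
    second_data = lane & ~interior   # edge pixels: labels treated as unreliable
    return first_data, second_data
```

On a 3x3 block of lane pixels this keeps only the centre pixel as first data; the eight boundary pixels become the second data whose labels the later training stages vet and correct.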
3. The method of claim 1, wherein calculating the first loss function of the initial semantic segmentation network according to the prediction result of the first lane line data and the first lane line data, and training the initial semantic segmentation network according to the first loss function to obtain the candidate semantic segmentation network, comprises:
inputting the lane line sample image into the initial semantic segmentation network to obtain a lane line prediction result, wherein the lane line prediction result comprises the prediction result of the first lane line data, and the prediction result of the first lane line data comprises a first prediction probability that the first lane line data is a target lane line;
calculating the first loss function of the initial semantic segmentation network according to the first lane line data and the first prediction probability, and adjusting parameters of the initial semantic segmentation network according to the first loss function to obtain the candidate semantic segmentation network.
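Claim 3's first loss, computed only over the accurately labeled first lane line data, could take the form of a masked binary cross-entropy (one plausible choice; the claim does not specify the loss form, and `masked_bce_loss` is a hypothetical name):

```python
import numpy as np

def masked_bce_loss(pred_prob, target, first_mask, eps=1e-7):
    """Binary cross-entropy restricted to the accurately labeled (first)
    lane-line pixels, so the uncertain pixels contribute nothing.

    pred_prob:  2-D array of predicted lane probabilities in (0, 1).
    target:     2-D array of 0/1 labels.
    first_mask: 2-D array, nonzero where the label is trusted.
    """
    p = np.clip(pred_prob, eps, 1 - eps)        # numerical safety
    per_pixel = -(target * np.log(p) + (1 - target) * np.log(1 - p))
    mask = first_mask.astype(bool)
    return float(per_pixel[mask].mean())        # average over trusted pixels only
```

Because the uncertain (second) pixels are excluded by the mask, their possibly wrong labels cannot pull the initial network's parameters in the wrong direction during the first training process.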
4. The method according to claim 1, wherein the method further comprises: starting the second training process when the first training process meets a second training process starting condition; the second training process starting condition comprises: in the first training process, the number of times the initial semantic segmentation network has been trained reaching a preset number of training times.
5. The method according to claim 1, wherein the second lane line prediction result comprises a second prediction probability that the second lane line data is a target lane line, and wherein, in the case where the second lane line prediction result meets the preset prediction condition, correcting the second lane line data according to the second lane line prediction result comprises:
in the case where the second prediction probability is smaller than a first threshold value and the second lane line data carries a lane line mark, determining that the lane line labeling of the second lane line data is wrong, and eliminating the second lane line data whose lane line labeling is wrong;
and in the case where the second prediction probability is larger than a second threshold value and the second lane line data does not carry a lane line mark, determining that a lane line label of the second lane line data is missing, and adding a lane line mark to the second lane line data whose lane line label is missing.
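The two correction rules of claim 5 amount to a per-pixel decision driven by the candidate network's confidence. A minimal sketch, assuming threshold values of 0.3 and 0.7 for illustration (the claim leaves the first and second thresholds unspecified):

```python
def correct_second_labels(probs, labels, low_thr=0.3, high_thr=0.7):
    """Apply the claim-5 correction rules to the uncertain (second) pixels.

    probs:  predicted probability that each pixel is a lane line.
    labels: current annotation per pixel, 1 = carries a lane line mark,
            0 = no mark.
    Returns the corrected labels; None marks a pixel eliminated as
    mislabeled.
    """
    corrected = []
    for p, lab in zip(probs, labels):
        if lab == 1 and p < low_thr:
            corrected.append(None)   # marked but model disagrees: label wrong, eliminate
        elif lab == 0 and p > high_thr:
            corrected.append(1)      # unmarked but model is confident: label missed, add
        else:
            corrected.append(lab)    # prediction condition not met: keep as-is
    return corrected
```

The middle band between the two thresholds is deliberately left untouched, so only confident disagreements between the network and the noisy annotation trigger a correction.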
6. The method of claim 5, wherein after adding the lane line mark to the second lane line data whose lane line label is missing, the method comprises:
inputting the updated lane line data into the candidate semantic segmentation network to obtain a third lane line prediction result;
and in the case where the third lane line prediction result meets the preset prediction condition, inputting the third lane line prediction result into the candidate semantic segmentation network and correcting the third lane line data, so as to realize dynamic growth of the second lane line data.
7. The method of claim 6, wherein the dynamic growth of the second lane line data comprises: stopping training of the candidate semantic segmentation network when the distance from a first pixel point to a second pixel point is greater than a preset pixel distance, wherein the first pixel point is a pixel, in the predicted lane line data of the second lane line data, that is far from the middle of the lane line, and the second pixel point is the pixel carrying a lane line mark, in the second lane line data to be corrected, that is closest to the first pixel point.
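Claim 7's stopping rule can be sketched as a nearest-neighbour distance test (`should_stop_growth` and the `max_dist` value are illustrative assumptions; the preset pixel distance is not fixed by the claim):

```python
import math

def should_stop_growth(grown_pixel, labeled_pixels, max_dist=5.0):
    """Claim-7 stopping rule for dynamic growth.

    grown_pixel:    (row, col) of the newly predicted pixel far from the
                    middle of the lane line (the first pixel point).
    labeled_pixels: (row, col) tuples of second-lane-line pixels that
                    carry a lane line mark.
    Returns True when the nearest marked pixel is farther than max_dist,
    i.e. the growth has drifted too far and training should stop.
    """
    nearest = min(math.dist(grown_pixel, q) for q in labeled_pixels)
    return nearest > max_dist
```

This keeps the iterative self-labeling from claim 6 anchored to the original annotation: growth that wanders more than `max_dist` pixels away from any marked pixel is treated as drift and terminates training.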
8. A lane line segmentation apparatus that does not require precisely marked data, the apparatus comprising:
an acquisition module, configured to acquire a lane line sample image and identify first lane line data and second lane line data of the lane line sample image, wherein the first lane line data are lane line region pixels in the lane line sample image whose lane line labels are accurate, and the second lane line data are lane line region pixels in the lane line sample image whose lane line labels are inaccurate;
a first training module, configured to input the lane line sample image into an initial semantic segmentation network in a first training process, wherein an output result of the initial semantic segmentation network comprises a prediction result of the first lane line data; calculate a first loss function of the initial semantic segmentation network according to the prediction result of the first lane line data and the first lane line data; and train the initial semantic segmentation network according to the first loss function to obtain a candidate semantic segmentation network;
a second training module, configured to acquire, in a second training process, a second lane line prediction result output by the candidate semantic segmentation network, judge whether the second lane line prediction result meets a preset prediction condition, correct the second lane line data according to the second lane line prediction result in the case where the second lane line prediction result meets the preset prediction condition, and add the corrected second lane line data to a training set of the candidate semantic segmentation network model;
a segmentation module, configured to train the candidate semantic segmentation network model according to the updated training set to obtain a target semantic segmentation network model, and identify lane lines of a road image to be detected based on the target semantic segmentation network model.
9. An electronic device, comprising:
a processor; and
a memory having executable code stored thereon, which when executed by the processor causes the processor to perform the method of any of claims 1 to 7.
10. A computer readable storage medium having stored thereon executable code which when executed by a processor of an electronic device causes the processor to perform the method of any of claims 1 to 7.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310390212.4A (CN116168201B) | 2023-04-13 | 2023-04-13 | Lane line segmentation method and device without accurate data labeling |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN116168201A (en) | 2023-05-26 |
| CN116168201B (en) | 2023-06-20 |
Family
ID=86422161
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113705515A (en) * | 2021-09-03 | 2021-11-26 | 北京百度网讯科技有限公司 | Training of semantic segmentation model and generation method and equipment of high-precision map lane line |
| CN113762004A (en) * | 2020-11-04 | 2021-12-07 | 北京京东乾石科技有限公司 | Lane line detection method and device |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10303956B2 (en) * | 2017-08-23 | 2019-05-28 | TuSimple | System and method for using triplet loss for proposal free instance-wise semantic segmentation for lane detection |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11640714B2 (en) | Video panoptic segmentation | |
| CN111126592B (en) | Method and device for outputting prediction results and generating neural network and storage medium | |
| CN114818707B (en) | Knowledge graph-based automatic driving decision method and system | |
| EP3620980B1 (en) | Learning method, learning device for detecting lane by using cnn and testing method, testing device using the same | |
| CN111626177A (en) | PCB element identification method and device | |
| JP2018097807A (en) | Learning device | |
| CN113673589B (en) | Label selection adaptive incremental detection method and system based on border distance measurement | |
| CN113821637A (en) | Long text classification method and device, computer equipment and readable storage medium | |
| US11836223B2 (en) | Systems and methods for automated detection of building footprints | |
| CN102867192B (en) | A kind of Scene Semantics moving method propagated based on supervision geodesic line | |
| CN118152897B (en) | Sensitive information identification model training method, device, equipment and storage medium | |
| US20250363785A1 (en) | Iterative refinement of annotated datasets | |
| CN116189130A (en) | Lane line segmentation method and device based on image annotation model | |
| CN113159049A (en) | Training method and device of weak supervision semantic segmentation model, storage medium and terminal | |
| CN117689868A (en) | Target detection method, device and computer readable storage medium | |
| KR20240035066A (en) | Learning method for weakly supervised image segmentation model | |
| CN116168201B (en) | Lane line segmentation method and device without accurate data labeling | |
| CN118097289B (en) | An open-world object detection method based on visual large model enhancement | |
| CN116071719A (en) | Lane line semantic segmentation method and device based on model dynamic correction | |
| CN118015402B (en) | Training method and device for small sample migration target detection model and computer equipment | |
| CN113642671A (en) | Semi-supervised meta-learning method and device based on task distribution changes | |
| WO2022024315A1 (en) | Accuracy estimation program, device, and method | |
| CN115658894B (en) | Entity relation extraction-oriented data annotation rule discovery method and device | |
| EP4469916A1 (en) | A system and method for quality check of labelled images | |
| CN119600290B (en) | A method, device, electronic device and storage medium for instance segmentation of infrared images of transmission lines |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | | |
| SE01 | Entry into force of request for substantive examination | | |
| GR01 | Patent grant | | |