Disclosure of Invention
The present application has been made in view of the above problems, and has as its object to provide an image processing method, a computer device, and a computer-readable storage medium that overcome or at least partially solve the above problems.
According to an aspect of the present application, there is provided an image processing method including:
acquiring an image, wherein the image comprises a foreground portion and a background portion;
acquiring feature data respectively corresponding to the foreground portion and the background portion of the image;
inputting the feature data respectively corresponding to the foreground portion and the background portion of the image into a mixer of a neural network model, wherein the mixer is configured to predict pixel transparency of the image by comparing the foreground portion with the background portion;
and determining, by the mixer, a pixel transparency of the image.
Optionally, the determining, by the mixer, the pixel transparency of the image includes:
obtaining a pixel transparency relative value of the foreground portion and the background portion according to the feature data respectively corresponding to the foreground portion and the background portion;
and determining the pixel transparency of the image according to the pixel transparency relative value and the pixel transparencies respectively corresponding to the foreground portion and the background portion.
Optionally, the method further comprises:
performing image segmentation processing based on the pixel transparency of the image.
Optionally, the neural network model further includes at least two decoders, and the method further includes:
inputting the feature data of the image into the decoders respectively, to obtain the feature data respectively corresponding to the foreground portion and the background portion of the image, and the pixel transparencies respectively corresponding to the foreground portion and the background portion.
Optionally, the neural network model further includes an encoder, and the method further includes:
and inputting the image into the encoder to obtain the feature data of the image.
According to another aspect of the present application, there is provided an image processing method including:
determining a first content element and a second content element of the image;
acquiring a pixel attribute relative value of the first content element and the second content element;
determining a pixel attribute of the image according to the pixel attribute relative value and the pixel attributes respectively corresponding to the first content element and the second content element;
and performing image processing based on the pixel attributes.
Optionally, the first content element includes an image foreground portion, the second content element includes an image background portion, and determining the first content element and the second content element of the image includes:
identifying at least one target object in the image;
and determining the image foreground part and the image background part according to the corresponding relation between the target object and the image foreground part or the image background part.
Optionally, the determining the first content element and the second content element of the image includes:
inputting the image into a first content decoder and a second content decoder of a neural network model to obtain the first content element and the second content element of the image respectively.
Optionally, the acquiring the pixel attribute relative value of the first content element and the second content element includes:
acquiring feature data respectively corresponding to the first content element and the second content element of the image;
and determining the pixel attribute relative value of the foreground part and the background part of the image according to the acquired characteristic data.
Optionally, the determining the pixel attribute relative value of the foreground portion and the background portion of the image according to the acquired feature data includes:
and inputting the characteristic data respectively corresponding to the first content element and the second content element and the characteristic data of the image into a mixer of a neural network to obtain the pixel attribute relative value of the foreground part and the background part of the image.
Optionally, the acquiring the feature data respectively corresponding to the first content element and the second content element of the image includes:
inputting the image into an encoder of a neural network model to obtain feature data of a plurality of dimensions corresponding to the image;
and inputting the feature data of the image into a first content decoder and a second content decoder of the neural network model to respectively obtain the feature data corresponding to the first content element and the second content element of the image.
Optionally, the pixel attribute of the image is within a numerical interval formed by the pixel attribute of the first content element and the pixel attribute of the second content element.
Optionally, the determining, according to the pixel attribute relative value and the pixel attributes respectively corresponding to the first content element and the second content element, the pixel attribute of the image includes:
determining attribute weights of pixel attributes corresponding to the first content element and the second content element respectively according to the pixel attribute relative values and preset numerical relationships;
and determining the pixel attribute of the image according to the pixel attributes respectively corresponding to the first content element and the second content element and the attribute weights corresponding to those pixel attributes.
Optionally, the pixel attribute includes a pixel transparency, and the image processing based on the pixel attribute includes:
and performing image segmentation processing according to the pixel transparency of the image.
Optionally, before the image processing based on the pixel attribute, the method further includes:
determining a target object to be processed in the image.
Optionally, the determining the target object to be processed in the image includes:
and determining the target object to be processed in the image according to whether at least one target object in the image belongs to a preset category.
According to another aspect of the present application, there is provided an image processing method including:
determining a plurality of content elements of the image;
acquiring pixel attribute relative values among the content elements;
determining a pixel attribute of the image according to the pixel attribute relative values and the pixel attributes of the content elements;
and performing image processing based on the pixel attributes.
According to another aspect of the present application, there is provided an image processing method including:
acquiring an input first image;
determining feature data respectively corresponding to a foreground portion and a background portion of the first image by using a first neural network and a second neural network;
determining a pixel attribute of the first image according to the feature data respectively corresponding to the foreground portion and the background portion of the first image;
extracting a first content element based on the pixel attribute of the first image;
obtaining a second image according to the first content element and an updated second content element;
and providing the second image.
Optionally, the method further comprises:
determining feature data of an intermediate scene portion of the first image using a third neural network;
The determining the pixel attribute of the first image according to the feature data corresponding to the foreground part and the background part of the first image respectively includes:
and determining the pixel attribute of the first image according to the feature data respectively corresponding to the foreground portion, the background portion, and the intermediate scene portion of the first image.
According to another aspect of the present application, there is provided a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements one or more of the methods described above when executing the computer program.
According to another aspect of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements one or more of the methods described above.
According to the embodiment of the application, after the first content element and the second content element of an image are determined, the pixel attribute relative value of the two content elements is obtained. The pixel attribute of the image can then be obtained according to the pixel attribute relative value and the pixel attributes respectively corresponding to the first content element and the second content element, and further image processing can be carried out according to the pixel attribute. For example, the first content element and the second content element may respectively correspond to the foreground and the background of the image, and the pixel attribute may be pixel transparency.
In addition, the embodiment of the application determines the pixel attribute of the image according to the pixel attribute relative value and the pixel attributes respectively corresponding to the first content element and the second content element, so that the pixel attribute of the image lies within the range formed by the pixel attributes corresponding to the first content element and the second content element, which ensures the accuracy of the matting.
The foregoing description is only an overview of the technical solutions of the present application. In order that the technical means of the present application may be more clearly understood and implemented in accordance with the content of the specification, and that the above and other objects, features, and advantages of the present application may become more apparent, specific embodiments of the present application are set forth below.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
For a better understanding of the present application, the following description is given to illustrate the concepts related to the present application to those skilled in the art:
The embodiment of the application divides an image into a plurality of content elements, which may include a first content element and a second content element; the specific division manner may be determined according to actual service requirements. For example, the foreground portion of the image is taken as the first content element and the background portion as the second content element. The foreground portion is the subject that the image presents or captures, typically an object positioned near the lens and in motion, while the background portion is typically the environment in which the foreground portion is located. The content elements may also be divided by image region, with the image content of one region serving as the first content element and the image content of another region as the second content element. Alternatively, the division may follow the classification of the image content, where classifications include text, graphics, video, templates, and the like; for example, the text portion of the image may serve as the first content element and the video portion as the second content element. The specific contents of the first content element and the second content element may likewise be set according to actual service requirements, and the present application is not limited in this respect.
The image and content elements each have a corresponding pixel attribute, which may be, for example, one or more of a pixel RGB value (color value of the pixel in three channels of red, green, and blue), a pixel transparency, a pixel position coordinate, a content classification to which the pixel corresponds, a pixel sharpness, and the like. In one embodiment of the application, the pixel attributes may include pixel transparency.
The pixel attribute relative value characterizes the relative relationship among a plurality of content elements with respect to a pixel attribute; it may take the form of the pixel attribute of one or more pixels, a ratio of pixel attributes, a weighted average of pixel attributes, and so on. Taking pixel transparency as an example, the pixel attribute relative value may be the pixel transparency of one or more content elements, or a relative value (e.g., a ratio) of the pixel transparencies of two content elements.
The pixel attribute of the image is determined according to the pixel attributes and the pixel attribute relative values of the plurality of content elements, and the resulting image attribute then serves as the basis for image processing.
Image processing to which embodiments of the present application relate may include image segmentation, noise removal, enhancement or restoration, and so forth. How to perform image processing according to the image attribute can be set according to actual service requirements, which is not limited by the present application.
The image processing method of the present application will be described below using an application scenario in which the image processing includes image segmentation, the image content elements include a foreground portion and a background portion, the pixel attributes corresponding to the image and the content elements include pixel transparency, and the pixel attribute relative values of the content elements or the image include pixel transparency relative values.
Referring to fig. 1, a flowchart of an embodiment of an image processing method based on a neural network model according to a first embodiment of the present application is shown, where the image processing process is performed using the neural network model, and the neural network model includes a mixer, and the method may specifically include the following steps:
step 101, an image is acquired, wherein the image comprises a foreground portion and a background portion.
Step 102, acquiring characteristic data corresponding to a foreground part and a background part of the image respectively.
The content elements into which the image is divided comprise a foreground portion and a background portion; after the image is acquired, the feature data of the foreground portion and the background portion are further acquired. These feature data are subsequently input into the mixer to obtain the pixel transparency relative value of the foreground portion and the background portion.
In Computer Vision (CV), for a machine to recognize an image, the image must be abstracted into a form the machine can understand; that is, features are extracted from the image and the image is represented by feature data. The feature data may comprise features of a content element in one or more dimensions, typically in the form of a vector, i.e., the image is vectorized. Examples include the number of content elements corresponding to a certain data category, the number of pixels, position coordinates, and so on. The feature data in the embodiment of the application are used to represent the content elements; constructing high-dimensional feature data allows the content elements to be represented more accurately.
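As a purely illustrative sketch of vectorization (the function name and the choice of statistics are assumptions for this example; a trained encoder would produce far higher-dimensional learned features):

```python
def image_to_features(pixels):
    # Toy vectorization: represent a grayscale image (a list of rows of
    # intensities in [0, 255]) by a handful of scalar statistics.
    flat = [p for row in pixels for p in row]
    n = len(flat)
    mean = sum(flat) / n
    # A crude spread measure as one more dimension of the feature vector.
    variance = sum((p - mean) ** 2 for p in flat) / n
    return [n, mean, variance]
```

The point is only that an image becomes a fixed-length numeric vector that downstream components (decoders, the mixer) can consume.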
And step 103, inputting the characteristic data corresponding to the foreground part and the background part of the image into a mixer of the neural network model.
The neural network model may be trained in a supervised manner: the mixer is obtained by labeling the feature data of image samples with the corresponding pixel transparency relative values and applying a neural network training method. In the embodiment of the application, the mixer is used to predict the transparency of the image by comparing the foreground portion with the background portion.
Step 104, determining, by the mixer, the pixel transparency of the image.
In an optional embodiment, when determining the transparency of the image by the mixer, the pixel transparency relative value of the foreground portion and the background portion may be obtained according to the feature data corresponding to the foreground portion and the background portion, and further, the pixel transparency of the image may be determined according to the pixel transparency relative value and the pixel transparency corresponding to the foreground portion and the background portion.
Determining the pixel transparency of the image from the pixel transparencies of the content elements and their relative value bounds the result: the pixel transparency of the image falls within the range delimited by the pixel transparencies respectively corresponding to the foreground portion and the background portion, which ensures the accuracy of the matting.
The specific scheme for determining the pixel transparency of the image from the pixel transparencies of the content elements and the transparency relative value can be set according to actual requirements. For example, the pixel transparency of the image may be a weighted combination of the pixel transparency of the foreground portion and that of the background portion, where the weights are determined by the transparency relative value; this may be expressed as: pixel transparency of the image = pixel transparency of the foreground portion × transparency relative value + pixel transparency of the background portion × (1 − transparency relative value).
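The weighted combination above can be sketched directly; the function name is an assumption for this example, and the formula is the one stated in the text:

```python
def blend_transparency(fg_alpha, bg_alpha, relative_value):
    # Foreground weight = relative value; background weight = its
    # complement, so the two weights sum to 1 and the result always
    # lies between bg_alpha and fg_alpha.
    return fg_alpha * relative_value + bg_alpha * (1.0 - relative_value)
```

Because the weights are complementary, the output is a convex combination of the two transparencies, which is what keeps the image's transparency inside the interval formed by the foreground and background values.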
In an alternative embodiment, the image segmentation process may also be performed based on the pixel transparency of the image.
According to the embodiment of the application, the feature data respectively corresponding to the foreground portion and the background portion of the image are input into the mixer to obtain the pixel transparency relative value of the two portions; the pixel transparency of the image is determined according to the pixel transparency relative value and the pixel transparencies respectively corresponding to the foreground portion and the background portion, and image segmentation processing is then performed based on the pixel transparency of the image.
In the embodiment of the application, the neural network model further comprises at least two decoders. Each decoder takes the feature data of the image as input and, using the trained model, decodes the image feature data to obtain the feature data respectively corresponding to the foreground portion and the background portion of the image, as well as the pixel transparencies respectively corresponding to the foreground portion and the background portion.
In the embodiment of the application, the neural network model further comprises an encoder, from which the feature data of the image can be obtained: the image is input into the encoder, which recognizes image features along preset dimensions to produce the feature data of the image.
Referring to fig. 2, a flowchart of an embodiment of an image processing method according to a second embodiment of the present application is shown, and the method may specifically include the following steps:
step 201, a first content element and a second content element of an image are determined.
Step 202, obtaining a pixel attribute relative value of the first content element and the second content element.
Step 203, determining the pixel attribute of the image according to the pixel attribute relative value and the pixel attributes respectively corresponding to the first content element and the second content element.
And 204, performing image processing based on the pixel attributes.
According to the embodiment of the application, after the first content element and the second content element of an image are determined, the pixel attribute relative value of the two content elements is obtained. The pixel attribute of the image can then be obtained according to the pixel attribute relative value and the pixel attributes respectively corresponding to the first content element and the second content element, and further image processing can be carried out according to the pixel attribute. For example, the first content element and the second content element may respectively correspond to the foreground and the background of the image, and the pixel attribute may be pixel transparency.
In addition, the embodiment of the application determines the pixel attribute of the image according to the pixel attribute relative value and the pixel attributes respectively corresponding to the first content element and the second content element, so that the pixel attribute of the image lies within the range formed by the pixel attributes corresponding to the first content element and the second content element, which ensures the accuracy of the matting.
In the embodiment of the application, optionally, the first content element includes an image foreground portion and the second content element includes an image background portion. When determining the first content element and the second content element of the image, at least one target object in the image can be identified, and the image foreground portion and the image background portion are determined according to the correspondence between each target object and the image foreground portion or the image background portion: the region or layer where a target object corresponding to the foreground is located is identified as the image foreground portion, and the region or layer where a target object corresponding to the background is located is identified as the image background portion. For example, if a pedestrian is recognized in the image, the image region where the pedestrian is located may be defined as the foreground portion; if a frame pattern is recognized in the image, the image region where the frame pattern is located may be defined as the background portion.
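A minimal sketch of such a correspondence, assuming a hypothetical category table and a detector that returns (category, region) pairs (both are illustrative assumptions, not the application's trained components):

```python
# Hypothetical correspondence table: categories treated as foreground
# subjects (near the lens, moving) versus background environment.
FOREGROUND_CATEGORIES = {"person", "animal", "vehicle"}
BACKGROUND_CATEGORIES = {"frame_pattern", "sky", "wall"}

def classify_regions(detections):
    # Split detected objects (category, region) into foreground and
    # background regions using the table above; unknown categories
    # are simply skipped in this sketch.
    foreground, background = [], []
    for category, region in detections:
        if category in FOREGROUND_CATEGORIES:
            foreground.append(region)
        elif category in BACKGROUND_CATEGORIES:
            background.append(region)
    return foreground, background
```

In practice the correspondence would be set per service requirement, as the text notes for the pedestrian and frame-pattern examples.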
In the embodiment of the present application, optionally, a decoder included in the neural network model may be used to obtain content elements of an image, and when determining a first content element and a second content element of the image, the image may be input into the first content decoder and the second content decoder of the neural network model to obtain the first content element and the second content element of the image respectively.
In the embodiment of the application, optionally, when acquiring the pixel attribute relative value of the first content element and the second content element, the feature data respectively corresponding to the first content element and the second content element of the image can be acquired, and the pixel attribute relative value of the foreground portion and the background portion of the image is determined according to the acquired feature data.
In the embodiment of the application, optionally, a mixer included in the neural network model may be used to obtain the pixel attribute relative values between the content elements of the image. When determining the pixel attribute relative value of the foreground portion and the background portion of the image according to the acquired feature data, the feature data respectively corresponding to the first content element and the second content element, together with the feature data of the image, can be input into the mixer of the neural network to obtain the pixel attribute relative value of the foreground portion and the background portion of the image.
In the embodiment of the application, optionally, when the characteristic data corresponding to the first content element and the second content element of the image are obtained, the image can be input into the encoder of the neural network model to obtain the characteristic data corresponding to a plurality of dimensions of the image, and the characteristic data of the image is input into the first content decoder and the second content decoder of the neural network model to obtain the characteristic data corresponding to the first content element and the second content element of the image respectively.
According to the scheme for determining the pixel attribute of the image in the embodiment of the application, the pixel attribute of the image is in a numerical interval formed by the pixel attribute of the first content element and the pixel attribute of the second content element.
In the embodiment of the application, the pixel attribute may include pixel transparency, and when image processing is performed based on the pixel attribute, image segmentation processing may be performed according to the pixel transparency of the image.
In the embodiment of the application, optionally, the attribute weights of the pixel attributes corresponding to the first content element and the second content element are determined according to the relative values of the pixel attributes and the preset numerical relationships, and the pixel attributes of the image are determined according to the pixel attributes corresponding to the first content element and the second content element and the attribute weights corresponding to the pixel attributes.
The preset numerical relationship between the pixel attribute relative value and the attribute weights can be set according to actual requirements, so that once the pixel attribute relative value is obtained, the attribute weights of the pixel attributes respectively corresponding to the first content element and the second content element are obtained, and the pixel attribute of the image is then calculated according to these attribute weights.
For example, if the attribute weight of the pixel transparency of one content element is the pixel attribute relative value, and the attribute weights of the pixel transparencies of the two content elements sum to 1, the attribute weight of the other content element follows directly. Taking content elements comprising a foreground portion and a background portion as an example, the pixel transparency of the image can be characterized as: foreground portion transparency × foreground attribute weight + background portion pixel transparency × background attribute weight, and further as: foreground portion transparency × transparency relative value + background portion pixel transparency × (1 − transparency relative value).
In an embodiment of the present application, optionally, before performing image processing based on the pixel attribute, the method further includes determining a target object to be processed in the image. The target object may be identified using various image identification techniques applicable in the art.
In the embodiment of the present application, optionally, when determining the target object to be processed in the image, the target object to be processed in the image may be determined according to whether at least one target object in the image belongs to a preset category. For example, according to whether the target object is an animal, it is determined that the target object needs to be subjected to matting processing. The specific preset categories can be set according to actual application requirements.
Referring to fig. 3, a flowchart of an embodiment of an image processing method according to a third embodiment of the present application is shown, and the method may specifically include the steps of:
step 301, a plurality of content elements of an image are determined.
In step 302, pixel attribute relative values between content elements are obtained.
Step 303, determining the pixel attribute of the image according to the pixel attribute relative value and the pixel attribute of the content element.
And step 304, performing image processing based on the pixel attributes.
In this embodiment, the image may include a plurality of content elements, for example, two or more; for specific implementation details, refer to the description of the foregoing embodiments, which is not repeated here.
According to the embodiment of the application, after a plurality of content elements of an image are determined, the pixel attribute relative values of the content elements are obtained, the pixel attribute of the image is obtained according to the pixel attribute relative values and the pixel attributes of the content elements, and the image is further processed according to the pixel attribute. For example, the content elements may include the foreground and the background of the image, and the pixel attribute may be pixel transparency.
In addition, the pixel attribute of the image is determined according to the pixel attribute relative values and the pixel attributes respectively corresponding to the content elements, so that the pixel attribute of the image lies within the numerical interval formed by the pixel attributes of the plurality of content elements, which ensures the accuracy of the matting.
In order to enable those skilled in the art to better understand the present application, a scheme of the present application will be described below by way of specific examples.
Fig. 4 shows an architectural diagram of image processing in one example of the application. It can be seen that the neural network model includes at least an encoder, a decoder, and a mixer. The corresponding processing flow comprises:
1. Image input
2. Encoder encoding to obtain characteristic data
3. The foreground decoder and the background decoder decode the feature data to obtain the foreground features and the background features, and the pixel transparencies respectively corresponding to the foreground and the background are stored in the video memory
4. The mixer mixes the foreground and background features to obtain the pixel transparency relative value of the foreground and the background, which is stored in the video memory
5. Determining the pixel transparency of the image according to the transparency relative value, and storing the pixel transparency in a memory to facilitate the subsequent image segmentation processing
6. Image segmentation based on pixel transparency of an image
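The six steps above can be sketched end to end. Everything here is a stand-in: the "encoder", "decoders", and "mixer" are toy functions with assumed outputs (opaque foreground, transparent background, a fixed relative value), not the application's trained networks; only the data flow between steps follows the text.

```python
def encode(image):
    # Step 2 (toy "encoder"): one feature per row, the row mean.
    return [sum(row) / len(row) for row in image]

def decode_foreground(features):
    # Step 3 (toy foreground decoder): derived features plus a pixel
    # transparency, assumed fully opaque here.
    return [f * 0.5 for f in features], 1.0

def decode_background(features):
    # Step 3 (toy background decoder), assumed fully transparent.
    return [f * 0.1 for f in features], 0.0

def mix(fg_features, bg_features):
    # Step 4 (toy "mixer"): a fixed relative value; the real mixer is a
    # trained network comparing the two feature sets.
    return 0.6

def pixel_transparency(image):
    features = encode(image)                         # step 2
    fg_feat, fg_alpha = decode_foreground(features)  # step 3
    bg_feat, bg_alpha = decode_background(features)  # step 3
    r = mix(fg_feat, bg_feat)                        # step 4
    return fg_alpha * r + bg_alpha * (1.0 - r)       # step 5
```

Step 6 would then threshold or mask pixels using the resulting transparency to perform the segmentation.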
Referring to fig. 5, a block diagram of an embodiment of an image processing apparatus based on a neural network model according to a fourth embodiment of the present application may specifically include:
a relative value obtaining module 401, configured to input the feature data respectively corresponding to the foreground portion and the background portion of the image into the mixer to obtain the pixel transparency relative value of the foreground portion and the background portion;
a transparency determining module 402, configured to determine a pixel transparency of the image according to the pixel transparency relative value and the pixel transparency corresponding to the foreground portion and the background portion respectively;
an image segmentation module 403, configured to perform image segmentation processing based on pixel transparency of the image.
In an alternative embodiment of the present application, the neural network model further includes at least two decoders, and the apparatus further includes:
and a characteristic data processing module, configured to input the characteristic data of the image into the decoders respectively, to obtain the characteristic data respectively corresponding to the foreground portion and the background portion of the image and the pixel transparencies respectively corresponding to the foreground portion and the background portion.
In an alternative embodiment of the present application, the neural network model further includes an encoder, and the apparatus further includes:
and the characteristic data acquisition module is used for inputting the image into an encoder to obtain the characteristic data of the image.
According to the embodiment of the application, the characteristic data corresponding to the foreground part and the background part of the image are input into the mixer to obtain the corresponding pixel transparency relative values respectively, the pixel transparency of the image is determined according to the pixel transparency relative values and the pixel transparency corresponding to the foreground part and the background part respectively, and the image segmentation processing is further carried out based on the pixel transparency of the image.
In addition, according to the embodiment of the application, the pixel transparency corresponding to the image is determined according to the pixel transparency relative value and the pixel transparency corresponding to the foreground part and the background part respectively, so that the pixel transparency of the image can be in the range of the pixel transparency corresponding to the foreground part and the background part, and the matting accuracy is ensured.
Referring to fig. 6, there is shown a block diagram of an embodiment of an image processing apparatus according to a fifth embodiment of the present application, which may specifically include:
An element determination module 501 for determining a first content element and a second content element of an image;
A parameter obtaining module 502, configured to obtain a pixel attribute relative value of the first content element and the second content element;
An attribute determining module 503, configured to determine a pixel attribute of the image according to the pixel attribute relative value and pixel attributes corresponding to the first content element and the second content element respectively;
an image processing module 504, configured to perform image processing based on the pixel attribute.
In an alternative embodiment of the present application, the first content element includes an image foreground portion, the second content element includes an image background portion, and the element determination module includes:
a target object recognition sub-module for recognizing at least one target object in the image;
and the content determination submodule is used for determining the image foreground part and the image background part according to the corresponding relation between the target object and the image foreground part or the image background part.
In an optional embodiment of the present application, the element determination module is specifically configured to input the image into a first content decoder and a second content decoder of a neural network model, to obtain the first content element and the second content element of the image respectively.
In an alternative embodiment of the present application, the parameter obtaining module includes:
The characteristic data acquisition sub-module is used for acquiring characteristic data corresponding to the first content element and the second content element of the image respectively;
And the relative value determining submodule is used for determining the relative value of the pixel attribute of the foreground part and the background part of the image according to the acquired characteristic data.
In an optional embodiment of the present application, the relative value determining submodule is specifically configured to input feature data corresponding to the first content element and the second content element respectively and feature data of the image into a mixer of the neural network, so as to obtain a relative value of pixel attributes of a foreground portion and a background portion of the image.
In an alternative embodiment of the present application, the feature data obtaining submodule is specifically configured to input the image into an encoder of the neural network model to obtain feature data corresponding to a plurality of dimensions of the image, and input the feature data of the image into a first content decoder and a second content decoder of the neural network model to obtain feature data corresponding to a first content element and a second content element of the image respectively.
In an alternative embodiment of the application, the pixel attribute of the image is within the numerical interval formed by the pixel attribute of the first content element and the pixel attribute of the second content element.
In an alternative embodiment of the present application, the attribute determining module includes:
the parameter determination submodule is used for determining the attribute weight of the pixel attribute corresponding to the first content element and the second content element respectively according to the pixel attribute relative value and a preset numerical relation;
And the operation submodule is used for determining the pixel attribute of the image according to the pixel attribute corresponding to the first content element and the second content element and the attribute weight corresponding to the pixel attribute.
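By way of illustration only, one possible "preset numerical relation" is a sigmoid mapping from the relative value to a pair of weights summing to one; the application does not fix this relation, so the sigmoid here is an assumption. Because the result is a convex combination, it necessarily lies in the interval spanned by the two pixel attributes:

```python
import numpy as np

def attribute_weights(relative_value):
    # Assumed 'preset numerical relation': a sigmoid mapping the
    # relative value to two weights that sum to one.
    w_first = 1.0 / (1.0 + np.exp(-relative_value))
    return w_first, 1.0 - w_first

def combine_attribute(attr_first, attr_second, relative_value):
    # Weighted combination of the two content elements' pixel
    # attributes; a convex combination stays within their interval.
    w1, w2 = attribute_weights(relative_value)
    return w1 * attr_first + w2 * attr_second

# Example: foreground transparency 0.9, background transparency 0.1
alpha = combine_attribute(0.9, 0.1, relative_value=2.0)
```

A relative value of 0 yields equal weights of 0.5, while large positive values weight the first content element more heavily.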
In an optional embodiment of the application, the image processing module is specifically configured to perform image segmentation processing according to pixel transparency of the image.
In an alternative embodiment of the present application, the apparatus further comprises:
And the object determining module is used for determining a target object to be processed in the image before the image processing is performed based on the pixel attribute.
In an optional embodiment of the present application, the object determining module is specifically configured to determine, according to whether at least one target object in the image belongs to a preset category, a target object to be processed in the image.
According to the embodiment of the application, after the first content element and the second content element of the image are determined, the pixel attribute relative value of the two content elements is obtained, the pixel attribute of the image can then be obtained according to the pixel attribute relative value and the pixel attributes respectively corresponding to the first content element and the second content element, and image processing can further be performed according to the pixel attribute. For example, the first content element and the second content element respectively correspond to the foreground and the background of the image, and the pixel attribute is the pixel transparency.
In addition, the embodiment of the application determines the pixel attribute of the image according to the relative value of the pixel attribute and the pixel attribute corresponding to the first content element and the second content element respectively, so that the pixel attribute of the image can be in the range of the pixel attribute corresponding to the first content element and the second content element, and the accuracy of the matting is ensured.
Referring to fig. 7, there is shown a block diagram of an embodiment of an image processing apparatus according to a sixth embodiment of the present application, which may specifically include:
an element determination module 601 for determining a plurality of content elements of an image;
A relative value obtaining module 602, configured to obtain a relative value of a pixel attribute between content elements;
an attribute determining module 603, configured to determine a pixel attribute of the image according to the pixel attribute relative value and the pixel attribute of the content element;
an image processing module 604, configured to perform image processing based on the pixel attribute.
According to the embodiment of the application, after a plurality of content elements of an image are determined, the pixel attribute relative values of the content elements are obtained, the pixel attribute of the image can then be obtained according to the pixel attribute relative values and the pixel attributes of the content elements, and image processing can further be performed according to the pixel attribute. For example, the content elements comprise the foreground and the background of the image, and the pixel attribute is the pixel transparency.
In addition, the embodiment of the application determines the pixel attribute of the image according to the pixel attribute relative value and the pixel attribute respectively corresponding to the content elements, so that the pixel attribute of the image can be positioned in a numerical interval formed by the pixel attributes of a plurality of content elements, and the accuracy of the matting is ensured.
Referring to fig. 8, a flowchart of an embodiment of an image processing method according to a seventh embodiment of the present application may specifically include:
Step 701, an input first image is acquired.
The first image is an image to be processed, and comprises a first content element and a second content element, and may also comprise at least one other content element. For example, the first content element is a foreground portion, the second content element is a background portion, and another content element is an intermediate scene portion, that is, content in the image that belongs to neither the background portion nor the foreground portion. The content elements and the manner of identifying them can be set according to actual needs.
Step 702, determining feature data corresponding to a foreground portion and a background portion of the first image respectively by using a first neural network and a second neural network.
Step 703, determining the pixel attribute of the first image according to the feature data corresponding to the foreground portion and the background portion of the first image.
After the first image is acquired, the embodiment of the application further determines the relative value of the pixel attribute of the first content element and the second content element, and determines the pixel attribute of the image according to the relative value of the pixel attribute. Specific details of the determination may be referred to above in embodiments 1-6, and will not be described here again.
Step 704, extracting the first content element based on the pixel attribute of the first image.
The first content element may be extracted based on the pixel attribute of the image. Taking the pixel attribute of the image as the pixel transparency and the first content element as the foreground content as an example, the foreground content of the image can be extracted according to the pixel transparency of the image.
Step 705, obtaining a second image according to the first content element and the updated second content element.
The acquired first content element may be resynthesized with the updated second content element into a second image; the first image and the second image thus differ in their second content elements. When the scheme is applied to a matting scenario, the extracted foreground content can be combined with new background content to synthesize a new image.
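By way of illustration only, the resynthesis can follow the conventional matting composition C = alpha * F + (1 - alpha) * B; the application does not prescribe a specific formula, so this standard equation is assumed here:

```python
import numpy as np

def composite(foreground, new_background, alpha):
    # Conventional matting composition: C = alpha * F + (1 - alpha) * B.
    a = alpha[..., np.newaxis]  # broadcast the matte over RGB channels
    return a * foreground + (1.0 - a) * new_background

foreground = np.ones((2, 2, 3))    # extracted first content element (white)
background = np.zeros((2, 2, 3))   # updated second content element (black)
alpha = np.array([[1.0, 0.5],
                  [0.0, 0.25]])    # pixel transparency of the image

second_image = composite(foreground, background, alpha)
```

Pixels with transparency 1.0 show only the foreground, pixels with 0.0 show only the new background, and intermediate values blend the two, which is what makes soft edges (hair, fur) composite cleanly.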
Step 706, the second image is provided.
Further, the second image may be provided for display or use.
In an alternative embodiment, the first image may include an intermediate scene portion, i.e. content other than the foreground and background portions, and a third neural network may be used to determine the feature data of the intermediate scene portion of the first image.
Accordingly, when determining the pixel attribute of the first image according to the feature data respectively corresponding to the foreground portion and the background portion of the first image, the pixel attribute of the first image may be determined according to the feature data respectively corresponding to the foreground portion, the background portion, and the intermediate scene portion of the first image.
It will be appreciated that in a specific implementation, the number and types of content elements included in the first image may be set according to actual needs, and further, a plurality of corresponding neural networks may be used to perform corresponding feature extraction and processing, and combinations of corresponding encoders, decoders and mixers may also be different, which is not limited in this application. To increase the efficiency of image processing, multiple neural network models (including encoders, decoders, and mixers) may also be used for image processing.
Processing with a plurality of neural networks can improve both the processing efficiency and the image processing effect. In view of factors such as specific requirements and cost, different levels of service can be provided for different clients: for clients requiring higher-precision image processing, a scheme with a plurality of neural networks can be adopted at a higher resource cost, while for clients with lower requirements or a need to save cost, fewer neural networks can be selected, for example, a single neural network is adopted for feature extraction.
It should be further noted that the order of the steps may be adjusted according to actual needs; for example, different neural network models may perform their feature extraction steps simultaneously or sequentially, and the positions and order of individual components may also differ among the respective neural network models.
A schematic diagram according to another example of the application is shown with reference to fig. 9.
In this example, the image is input through the terminal interface; after receiving the image, the terminal preprocesses it and then predicts the transparency of the image using the neural network model. The neural network model comprises an encoder, a foreground decoder, a background decoder, and a mixer.
Firstly, the encoder is used to obtain the characteristic data of the image, and the characteristic data are input into a foreground neural network and a background neural network, namely the foreground decoder and the background decoder of the neural network model, to obtain the characteristic data respectively corresponding to the foreground content and the background content.
Inputting the characteristic data into a mixer of the neural network model to obtain pixel transparency relative values of the foreground part and the background part, and determining the pixel transparency of the image according to the pixel transparency relative values and the pixel transparency corresponding to the foreground part and the background part respectively.
The foreground portion of the image can be matted out according to the pixel transparency of the image and further synthesized with a new background portion into a new image, which can be output to the terminal for display.
The foreground portion may also be pre-processed, for example, to apply an image effect (e.g., a cartoon effect) or adjust the orientation (e.g., the direction a person faces), and the position of the foreground portion within the background portion may be set when the new image is synthesized.
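By way of illustration only, setting the position of the foreground portion within the background portion can be sketched as an offset composite; the function name, sizes, and offsets are hypothetical:

```python
import numpy as np

def place_foreground(background, foreground, alpha, top, left):
    # Composite a smaller foreground patch onto the background at a
    # chosen (top, left) position; the original background array is
    # not modified. Sizes and offsets here are illustrative only.
    out = background.copy()
    h, w = alpha.shape
    region = out[top:top + h, left:left + w]
    a = alpha[..., np.newaxis]  # broadcast the matte over RGB channels
    out[top:top + h, left:left + w] = a * foreground + (1.0 - a) * region
    return out

background = np.zeros((6, 6, 3))   # 6x6 black background
foreground = np.ones((2, 2, 3))    # 2x2 white foreground patch
alpha = np.ones((2, 2))            # fully opaque matte

result = place_foreground(background, foreground, alpha, top=2, left=3)
```

Only the 2x2 region at rows 2-3, columns 3-4 is overwritten; the rest of the background is preserved.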
Part of the above-described image processing may be performed on the terminal, or the image may be transmitted to a server through the network for processing.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
Embodiments of the present disclosure may be implemented as a system configured as desired using any suitable hardware, firmware, software, or any combination thereof. Fig. 10 schematically illustrates an example system (or apparatus) 800 that may be used to implement various embodiments described in this disclosure.
For one embodiment, FIG. 10 illustrates an exemplary system 800 having one or more processors 802, a system control module (chipset) 804 coupled to at least one of the processor(s) 802, a system memory 806 coupled to the system control module 804, a non-volatile memory (NVM)/storage device 808 coupled to the system control module 804, one or more input/output devices 810 coupled to the system control module 804, and a network interface 812 coupled to the system control module 804.
The processor 802 may include one or more single-core or multi-core processors, and the processor 802 may include any combination of general-purpose or special-purpose processors (e.g., graphics processors, application processors, baseband processors, etc.). In some embodiments, system 800 can function as a browser as described in embodiments of the present application.
In some embodiments, the system 800 can include one or more computer-readable media (e.g., system memory 806 or NVM/storage 808) having instructions and one or more processors 802 combined with the one or more computer-readable media configured to execute the instructions to implement the modules to perform the actions described in this disclosure.
For one embodiment, the system control module 804 may include any suitable interface controller to provide any suitable interface to at least one of the processor(s) 802 and/or any suitable device or component in communication with the system control module 804.
The system control module 804 may include a memory controller module to provide an interface to the system memory 806. The memory controller modules may be hardware modules, software modules, and/or firmware modules.
The system memory 806 may be used to load and store data and/or instructions for the system 800, for example. For one embodiment, system memory 806 may include any suitable volatile memory, such as a suitable DRAM. In some embodiments, the system memory 806 may include double data rate type four synchronous dynamic random access memory (DDR4 SDRAM).
For one embodiment, the system control module 804 may include one or more input/output controllers to provide an interface to the NVM/storage 808 and the input/output device(s) 810.
For example, NVM/storage 808 may be used to store data and/or instructions. NVM/storage 808 may include any suitable nonvolatile memory (e.g., flash memory) and/or may include any suitable nonvolatile storage device(s) (e.g., one or more Hard Disk Drives (HDDs), one or more Compact Disc (CD) drives, and/or one or more Digital Versatile Disc (DVD) drives).
NVM/storage 808 may include storage resources that are physically part of the device on which system 800 is installed or which may be accessed by the device without being part of the device. For example, NVM/storage 808 may be accessed over a network via input/output device(s) 810.
Input/output device(s) 810 may provide an interface for system 800 to communicate with any other suitable devices; input/output device 810 may include communication components, audio components, sensor components, and the like. Network interface 812 may provide an interface for system 800 to communicate over one or more networks, and system 800 may communicate wirelessly with one or more components of a wireless network according to any of one or more wireless network standards and/or protocols, for example accessing a wireless network based on a communication standard such as WiFi, 2G, or 3G, or a combination thereof.
For one embodiment, at least one of the processor(s) 802 may be packaged together with logic of one or more controllers (e.g., memory controller modules) of the system control module 804. For one embodiment, at least one of the processor(s) 802 may be packaged together with logic of one or more controllers of the system control module 804 to form a System In Package (SiP). For one embodiment, at least one of the processor(s) 802 may be integrated on the same die with logic of one or more controllers of the system control module 804. For one embodiment, at least one of the processor(s) 802 may be integrated on the same die with logic of one or more controllers of the system control module 804 to form a system on chip (SoC).
In various embodiments, system 800 may be, but is not limited to being, a browser, workstation, desktop computing device, or mobile computing device (e.g., a laptop computing device, handheld computing device, tablet, netbook, etc.). In various embodiments, system 800 may have more or fewer components and/or different architectures. For example, in some embodiments, system 800 includes one or more cameras, keyboards, Liquid Crystal Display (LCD) screens (including touch screen displays), non-volatile memory ports, multiple antennas, graphics chips, Application Specific Integrated Circuits (ASICs), and speakers.
If the display includes a touch panel, the display screen may be implemented as a touch-screen display to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe operation.
The embodiment of the application also provides a non-volatile readable storage medium, wherein one or more modules (programs) are stored in the storage medium, and when the one or more modules are applied to a terminal device, the terminal device can be caused to execute the instructions of each method step in the embodiments of the present application.
In one example, a computer device is provided comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements a method according to an embodiment of the application when executing the computer program.
There is also provided in one example a computer readable storage medium having stored thereon a computer program, characterized in that the program when executed by a processor implements a method as in one or more of the embodiments of the application.
Embodiments of the application disclose an image processing method and apparatus. Example 1 includes an image processing method, comprising:
acquiring an image, wherein the image comprises a foreground portion and a background portion;
acquiring characteristic data corresponding to a foreground part and a background part of the image respectively;
Inputting feature data corresponding to a foreground portion and a background portion of the image, respectively, into a mixer of a neural network model, wherein the mixer is used for predicting pixel transparency of the image by comparing the foreground portion and the background portion;
and determining, by the mixer, a pixel transparency of the image.
Example 2 may include the method of example 1, wherein the determining, by the mixer, the transparency of the image comprises:
Obtaining a pixel transparency relative value of the foreground part and the background part according to the characteristic data respectively corresponding to the foreground part and the background part;
and determining the pixel transparency of the image according to the pixel transparency relative value and the pixel transparency respectively corresponding to the foreground part and the background part.
Example 3 may include the method of example 1, wherein the method further comprises:
And performing image segmentation processing based on the pixel transparency of the image.
Example 4 may include the method of example 1, wherein the neural network model further comprises at least two decoders, the method further comprising:
And respectively inputting the characteristic data of the image into the decoder to respectively obtain the characteristic data respectively corresponding to the foreground part and the background part of the image and the pixel transparency respectively corresponding to the foreground part and the background part.
Example 5 may include the method of example 1, wherein the neural network model further comprises an encoder, the method further comprising:
and inputting the image into an encoder to obtain the characteristic data of the image.
Example 6 includes an image processing method, comprising:
Determining a first content element and a second content element of the image;
Acquiring a pixel attribute relative value of the first content element and the second content element;
Determining the pixel attribute of the image according to the pixel attribute relative value and the pixel attribute respectively corresponding to the first content element and the second content element;
and performing image processing based on the pixel attributes.
Example 7 may include the method of example 6, wherein the first content element includes an image foreground portion, the second content element includes an image background portion, and the determining the first content element and the second content element of the image includes:
identifying at least one target object in the image;
and determining the image foreground part and the image background part according to the corresponding relation between the target object and the image foreground part or the image background part.
Example 8 may include the method of example 6, wherein the determining the first content element and the second content element of the image comprises:
And inputting the image into a first content decoder and a second content decoder of the neural network model to obtain a first content element and a second content element of the image respectively.
Example 9 may include the method of example 6, wherein the obtaining the pixel attribute relative value of the first content element and the second content element comprises:
Acquiring characteristic data corresponding to a first content element and a second content element of the image respectively;
and determining the pixel attribute relative value of the foreground part and the background part of the image according to the acquired characteristic data.
Example 10 may include the method of example 9, wherein the determining pixel attribute relative values of a foreground portion and a background portion of the image from the acquired feature data comprises:
and inputting the characteristic data respectively corresponding to the first content element and the second content element and the characteristic data of the image into a mixer of a neural network to obtain the pixel attribute relative value of the foreground part and the background part of the image.
Example 11 may include the method of example 9, wherein the acquiring feature data for each of the first content element and the second content element of the image includes:
Inputting the image into an encoder of a neural network model to obtain characteristic data of a plurality of dimensions corresponding to the image;
And inputting the characteristic data of the image into a first content decoder and a second content decoder of a neural network model to respectively obtain the characteristic data corresponding to the first content element and the second content element of the image.
Example 12 may include the method of example 6, wherein the pixel attribute of the image is within a numerical interval of the pixel attribute of the first content element and the pixel attribute of the second content element.
Example 13 may include the method of example 6, wherein the determining the pixel attribute of the image from the pixel attribute relative value and the pixel attributes respectively corresponding to the first content element and the second content element comprises:
determining attribute weights of pixel attributes corresponding to the first content element and the second content element respectively according to the pixel attribute relative values and preset numerical relationships;
And determining the pixel attribute of the image according to the pixel attribute corresponding to the first content element and the second content element and the attribute weight corresponding to the pixel attribute.
Example 14 may include the method of example 6, wherein the pixel attribute comprises a pixel transparency, and the image processing based on the pixel attribute comprises:
And performing image segmentation processing according to the pixel transparency of the image.
Example 15 may include the method of example 6, wherein, prior to the image processing based on the pixel attributes, the method further comprises:
And determining a target object to be processed in the image.
Example 16 may include the method of example 15, wherein the determining a target object to be processed in the image comprises:
and determining the target object to be processed in the image according to whether at least one target object in the image belongs to a preset category.
Example 17 includes an image processing method, comprising:
Determining a plurality of content elements of the image;
acquiring pixel attribute relative values among content elements;
determining the pixel attribute of the image according to the pixel attribute relative values and the pixel attributes of the content elements;
and performing image processing based on the pixel attributes.
Example 18 includes an image processing method, comprising:
Acquiring an input first image;
Determining characteristic data corresponding to a foreground part and a background part of the first image respectively by adopting a first neural network and a second neural network;
determining pixel attributes of the first image according to the characteristic data respectively corresponding to the foreground part and the background part of the first image;
extracting the first content element based on pixel attributes of the first image;
Obtaining a second image according to the first content element and the updated second content element;
Providing the second image.
Example 19 may include the method of example 18, further comprising:
determining feature data of an intermediate scene portion of the first image using a third neural network;
The determining the pixel attribute of the first image according to the feature data corresponding to the foreground part and the background part of the first image respectively includes:
and determining the pixel attribute of the first image according to the feature data respectively corresponding to the foreground portion, the background portion, and the intermediate scene portion of the first image.
Example 20 includes a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the method as in one or more of examples 1-19 when the computer program is executed.
While certain embodiments have been illustrated and described for purposes of description, a wide variety of alternative and/or equivalent embodiments or implementations calculated to achieve the same purposes may be substituted for the embodiments shown and described without departing from the scope of the embodiments of the present application. This disclosure is intended to cover any adaptations or variations of the embodiments discussed herein. It is manifestly intended, therefore, that the embodiments described herein be limited only by the claims and the equivalents thereof.