CN113610818A - A Position-Controllable Human Head Segmentation Method - Google Patents
- Publication number
- CN113610818A (application CN202110917750.5A; granted publication CN113610818B)
- Authority
- CN
- China
- Prior art keywords
- head
- feature
- human head
- module
- key point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20021—Dividing image into blocks, subimages or windows
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Quality & Reliability (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a position-controllable human head segmentation method. The method comprises a human head key point detection module, a position correction module and a human head segmentation module: the position correction module corrects the user's click position to match it to a detected head key point, and the key point information together with the head segmentation module then produces the final head segmentation result. The beneficial effects of the invention are: the method accurately segments a single head in a multi-person scene, is more flexible, improves runtime efficiency, and can be deployed on mobile phones.
Description
Technical Field
The invention relates to the technical field of image processing, in particular to a human head segmentation method based on position control.
Background
Human head segmentation is a common feature of current short-video software. It provides the basis for special-effect play such as making facial expression packs, head stickers, and swapping a user's head onto an animated body.
At present, most head segmentation is built on deep learning, typically using a semantic segmentation model to extract the head regions in an image. However, when there are many people in the picture, such a model cannot extract a single, targeted head, which greatly reduces playability. Although instance segmentation techniques can segment individual heads in a multi-person scene, current instance segmentation methods have low runtime efficiency and require complex post-processing, making them unsuitable for on-device deployment.
Disclosure of Invention
To overcome these defects of the prior art, the invention provides a position-controllable human head segmentation method that is both flexible and computationally efficient.
In order to achieve the purpose, the invention adopts the following technical scheme:
a human head segmentation method based on position control specifically comprises the following steps:
(1) preprocessing the input picture: scaling its resolution to 256×256, then normalizing it so that pixel values lie in the range [−1, 1];
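The preprocessing in step (1) can be sketched as follows. This is an illustrative reconstruction (the patent gives no code): it assumes 8-bit input pixels and uses nearest-neighbour resizing for brevity, where a production system would likely use bilinear interpolation.

```python
import numpy as np

def preprocess(image: np.ndarray, size: int = 256) -> np.ndarray:
    """Resize an H x W x C uint8 image to size x size (nearest neighbour)
    and normalize pixel values to [-1, 1]."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size          # source row for each output row
    cols = np.arange(size) * w // size          # source column for each output column
    resized = image[rows[:, None], cols[None, :]]
    # Map uint8 values [0, 255] -> [-1.0, 1.0]
    return resized.astype(np.float32) / 127.5 - 1.0
```
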
(2) building a human head key point detection module and feeding the picture from step (1) into it to obtain a 1×256×256 key point feature map; position analysis of this feature map yields the center point coordinates of each head. Let N = {N1, N2, N3, …} denote the set of detected heads, where Ni = (xi, yi) gives the specific position of each head in the picture;
(3) letting the user's click position be I = (x, y); since the click may not land exactly on a head, the position correction module iterates over the N head positions, computes the distance from I to each, and compares the N distances to find the head Nj closest to the click. This head is taken by default as the one the user wants to extract from the picture;
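The nearest-distance matching of step (3) can be sketched as below; the function and variable names are our own, as the patent specifies only the behaviour.

```python
import math

def match_click_to_head(click, heads):
    """Return the index of the detected head center closest to the click.

    click: the user's click position (x, y)
    heads: list of detected head centers [(x1, y1), (x2, y2), ...]
    """
    distances = [math.hypot(click[0] - hx, click[1] - hy) for hx, hy in heads]
    # The closest head is taken, by default, as the one the user wants.
    return min(range(len(heads)), key=distances.__getitem__)
```

For example, with head centers at (10, 10) and (200, 50), a click at (190, 60) selects the second head (index 1).
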
(4) applying Gaussian blur to the coordinates of head Nj to obtain the position condition map. The Gaussian is computed as G(x, y) = exp(−((x − xj)² + (y − yj)²) / (2σ²)), where σ = 2, and the Gaussian kernel size is set to 10 to enlarge the information range of the position condition, so that the head at the specified position is segmented more accurately;
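The position condition of step (4) can be sketched as an unnormalized 2D Gaussian heatmap centred on the chosen head. This is an assumed concrete form consistent with the stated σ = 2; the patent's exact normalization is not specified.

```python
import numpy as np

def position_condition(center, size=256, sigma=2.0):
    """Build a size x size heatmap: an unnormalized 2D Gaussian
    with peak value 1 at the selected head center (x_j, y_j)."""
    coords = np.arange(size, dtype=np.float64)
    gx = np.exp(-((coords - center[0]) ** 2) / (2 * sigma ** 2))  # 1-D Gaussian along x
    gy = np.exp(-((coords - center[1]) ** 2) / (2 * sigma ** 2))  # 1-D Gaussian along y
    # Outer product gives the separable 2-D Gaussian; rows index y, columns index x
    return gy[:, None] * gx[None, :]
```
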
(5) constructing a human head segmentation module: a fully convolutional neural network composed of an encoding module and a decoding module. The encoding module consists of 4 feature extraction units, each made of two convolution layers and one downsampling layer with factor 2, so the encoder downsamples by 16× in total. The decoding module restores the feature size with combinations of one convolution layer and one 2× upsampling, each time fusing the result with the same-sized encoder feature; after 4 such operations the output feature has the same size as the original image. Finally, the output feature is passed through a sigmoid activation;
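The resolution bookkeeping of the described encoder/decoder can be traced with a small sketch; this is a shape trace only, not a trained network.

```python
def segmentation_shapes(input_size=256, units=4):
    """Trace feature-map resolutions through the step (5) network:
    each encoder unit (two convs + downsample) halves the resolution;
    each decoder step (conv + 2x upsample, fused with the same-sized
    encoder feature) doubles it back."""
    encoder = [input_size]
    for _ in range(units):
        encoder.append(encoder[-1] // 2)
    decoder = [encoder[-1]]
    for _ in range(units):
        decoder.append(decoder[-1] * 2)
    return encoder, decoder
```

With units=4 the encoder path is 256 → 128 → 64 → 32 → 16 (16× total downsampling) and the decoder restores 16 → 32 → 64 → 128 → 256; the key point network of step (2) would be the same trace with units=5 (32× downsampling).
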
(6) concatenating the input picture from step (1) with the head position condition map from step (4) and feeding the result into the head segmentation network of step (5) to obtain a mask for the single head at the chosen position, completing the head segmentation the user requested.
In this method, the head key point detection network and the position correction module determine the position of the head the user wants segmented; the segmentation network then uses this position to segment that head accurately while ignoring the other people in the picture. The method can therefore accurately segment a single head in a multi-person scene, is more flexible, improves runtime efficiency, and can be deployed on mobile phones.
Preferably, in step (2), the human head key point detection module is a fully convolutional neural network composed of an encoding module and a decoding module. The encoding module consists of 5 feature extraction units, each made of two convolution layers and one downsampling layer with factor 2, so the encoder downsamples by 32× in total. The decoding module restores the feature size with combinations of two convolution layers and one 2× upsampling, each time fusing the result with the same-sized encoder feature; after 5 such operations the output feature map has the same size as the original image. The key point positions in the output feature map are then analyzed, and the specific coordinates of each key point in the original image are obtained by taking an expectation over the map.
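The "expectation" decoding of key point coordinates can be sketched as a soft-argmax over the normalized heatmap; this is our interpretation of the patent's brief description, with assumed function names.

```python
import numpy as np

def expected_coordinates(heatmap):
    """Decode one key point as the expectation of the pixel coordinates
    under the heatmap treated as a probability distribution (soft-argmax),
    yielding sub-pixel precision."""
    h, w = heatmap.shape
    p = heatmap / heatmap.sum()          # normalize to a distribution
    ys, xs = np.mgrid[0:h, 0:w]          # coordinate grids (row = y, col = x)
    return float((p * xs).sum()), float((p * ys).sum())
```
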
Preferably, in step (3), the closest distance dst between position I and the N head positions is calculated as dst = min over i = 1, …, N of √((x − xi)² + (y − yi)²).
The beneficial effects of the invention are: the method accurately segments a single head in a multi-person scene, is more flexible, improves runtime efficiency, and can be deployed on mobile phones.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The invention is further described with reference to the following figures and detailed description.
In the embodiment shown in fig. 1, a method for segmenting a human head based on position control specifically includes the following steps:
(1) preprocessing the input picture: scaling its resolution to 256×256, then normalizing it so that pixel values lie in the range [−1, 1];
(2) building a human head key point detection module and feeding the picture from step (1) into it to obtain a 1×256×256 key point feature map; position analysis of this feature map yields the center point coordinates of each head. Let N = {N1, N2, N3, …} denote the set of detected heads, where Ni = (xi, yi) gives the specific position of each head in the picture. The human head key point detection module is a fully convolutional neural network composed of an encoding module and a decoding module. The encoding module consists of 5 feature extraction units, each made of two convolution layers and one downsampling layer with factor 2, so the encoder downsamples by 32× in total. The decoding module restores the feature size with combinations of two convolution layers and one 2× upsampling, each time fusing the result with the same-sized encoder feature; after 5 such operations the output feature map has the same size as the original image. The key point positions in the output feature map are then analyzed, and the specific coordinates of each key point in the original image are obtained by taking an expectation over the map;
(3) letting the user's click position be I = (x, y); since the click may not land exactly on a head, the position correction module iterates over the N head positions, computes the distance from I to each, and compares the N distances to find the head Nj closest to the click. This head is taken by default as the one the user wants to extract from the picture. The closest distance dst between position I and the N head positions is calculated as dst = min over i = 1, …, N of √((x − xi)² + (y − yi)²);
(4) applying Gaussian blur to the coordinates of head Nj to obtain the position condition map. The Gaussian is computed as G(x, y) = exp(−((x − xj)² + (y − yj)²) / (2σ²)), where σ = 2, and the Gaussian kernel size is set to 10 to enlarge the information range of the position condition, so that the head at the specified position is segmented more accurately;
(5) constructing a human head segmentation module: a fully convolutional neural network composed of an encoding module and a decoding module. The encoding module consists of 4 feature extraction units, each made of two convolution layers and one downsampling layer with factor 2, so the encoder downsamples by 16× in total. The decoding module restores the feature size with combinations of one convolution layer and one 2× upsampling, each time fusing the result with the same-sized encoder feature; after 4 such operations the output feature has the same size as the original image. Finally, the output feature is passed through a sigmoid activation;
(6) concatenating the input picture from step (1) with the head position condition map from step (4) and feeding the result into the head segmentation network of step (5) to obtain a mask for the single head at the chosen position, completing the head segmentation the user requested.
The whole method comprises a human head key point detection module, a position correction module and a human head segmentation module. The position correction module corrects the user's click position to match it to a head key point, and the key point information together with the head segmentation module produces the final head segmentation result. The whole system adopts a lightweight model design, so runtime efficiency is high. The head key point detection network and the position correction module determine the position of the head the user wants segmented; the segmentation network then uses this position to segment that head accurately while ignoring the other people in the picture. The method can therefore accurately segment a single head in a multi-person scene, is more flexible, improves runtime efficiency, and can be deployed on mobile phones.
Claims (3)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110917750.5A CN113610818B (en) | 2021-08-11 | 2021-08-11 | A position-controllable head segmentation method |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110917750.5A CN113610818B (en) | 2021-08-11 | 2021-08-11 | A position-controllable head segmentation method |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN113610818A true CN113610818A (en) | 2021-11-05 |
| CN113610818B CN113610818B (en) | 2024-12-13 |
Family
ID=78340207
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202110917750.5A Active CN113610818B (en) | A position-controllable head segmentation method | 2021-08-11 | 2021-08-11 |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN113610818B (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106960195A (en) * | 2017-03-27 | 2017-07-18 | 深圳市丰巨泰科电子有限公司 | A kind of people counting method and device based on deep learning |
| CN108304820A (en) * | 2018-02-12 | 2018-07-20 | 腾讯科技(深圳)有限公司 | A kind of method for detecting human face, device and terminal device |
| US20190057507A1 (en) * | 2017-08-18 | 2019-02-21 | Samsung Electronics Co., Ltd. | System and method for semantic segmentation of images |
| CN111339395A (en) * | 2020-02-11 | 2020-06-26 | 山东经贸职业学院 | Data information matching method and system for electronic commerce system |
| CN111670457A (en) * | 2017-12-03 | 2020-09-15 | 脸谱公司 | Optimization of dynamic object instance detection, segmentation and structure mapping |
Also Published As
| Publication number | Publication date |
|---|---|
| CN113610818B (en) | 2024-12-13 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN110969124B (en) | Two-dimensional human body posture estimation method and system based on lightweight multi-branch network | |
| CN112330574B (en) | Portrait restoration method and device, electronic equipment and computer storage medium | |
| CN113112416B (en) | A semantically guided face image restoration method | |
| CN116664397B (en) | TransSR-Net structured image super-resolution reconstruction method | |
| Hu et al. | Face restoration via plug-and-play 3D facial priors | |
| CN113808005B (en) | A method and device for face posture migration based on video drive | |
| CN108537754B (en) | Face image restoration system based on deformation guide picture | |
| CN112906706A (en) | Improved image semantic segmentation method based on coder-decoder | |
| CN108596833A (en) | Super-resolution image reconstruction method, device, equipment and readable storage medium storing program for executing | |
| CN112966574A (en) | Human body three-dimensional key point prediction method and device and electronic equipment | |
| CN110059698A (en) | The semantic segmentation method and system based on the dense reconstruction in edge understood for streetscape | |
| CN111784623A (en) | Image processing method, image processing device, computer equipment and storage medium | |
| CN106960202A (en) | A kind of smiling face's recognition methods merged based on visible ray with infrared image | |
| CN105960657A (en) | Face hallucination using convolutional neural networks | |
| CN109325915A (en) | A super-resolution reconstruction method for low-resolution surveillance video | |
| CN111899169B (en) | Method for segmenting network of face image based on semantic segmentation | |
| CN107871103B (en) | A face authentication method and device | |
| CN113935435A (en) | Multi-modal emotion recognition method based on space-time feature fusion | |
| Shiri et al. | Identity-preserving face recovery from stylized portraits | |
| CN115731597A (en) | A face mask mask image automatic segmentation and restoration management platform and method | |
| CN116258627A (en) | A system and method for super-resolution restoration of extremely degraded face images | |
| CN110163156A (en) | It is a kind of based on convolution from the lip feature extracting method of encoding model | |
| CN113516604B (en) | Image restoration method | |
| CN118015142B (en) | Face image processing method, device, computer equipment and storage medium | |
| CN112200816A (en) | Method, device and equipment for segmenting region of video image and replacing hair |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |