CN108830139A - Depth context prediction method, apparatus, medium and device for human body key points - Google Patents
Depth context prediction method, apparatus, medium and device for human body key points — Download PDF
- Publication number
- CN108830139A (application CN201810395949.4A)
- Authority
- CN
- China
- Prior art keywords
- human body
- key point
- body key
- depth
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
Embodiments of the present application disclose a depth context prediction method for human body key points, a neural network training method, corresponding apparatuses, an electronic device, a computer-readable storage medium and a computer program. The depth context prediction method for human body key points includes: obtaining an image to be processed; providing the image to be processed to a neural network; and performing depth context prediction processing on human body key points via the neural network to obtain the depth context of the human body key points, where the depth context of the human body key points indicates the relative depth ordering (front/behind relationship) between the key points. The technical solution provided by the present application helps to improve the accuracy of three-dimensional human body pose prediction, and thus helps to avoid the adverse effects that erroneous three-dimensional pose predictions would otherwise have on applications such as interactive entertainment and behavior analysis.
Description
Technical field
The present application relates to computer vision technology, and in particular to a depth context prediction method for human body key points, a depth context prediction apparatus for human body key points, a neural network training method, a neural network training apparatus, an electronic device, a computer-readable storage medium and a computer program.
Background
Three-dimensional human body pose prediction plays a role in technical fields such as interactive entertainment and behavior analysis. During three-dimensional pose prediction, errors in predicting the depth of human body key points often lead to erroneous three-dimensional poses. For example, an arm that should be located in front of the body may, because of a depth prediction error for the corresponding key points, end up being predicted behind the body. Such errors adversely affect interactive entertainment, behavior analysis and similar applications. How to improve the accuracy of three-dimensional human body pose prediction is therefore a technical problem worthy of attention.
Summary of the invention
Embodiments of the present application provide technical solutions for predicting the depth context of human body key points and for training a neural network.
According to one aspect of the embodiments of the present application, a depth context prediction method for human body key points is provided. The method includes: obtaining an image to be processed; and providing the image to be processed to a neural network, which performs depth context prediction processing on human body key points to obtain the depth context of the human body key points, where the depth context of the human body key points indicates the relative depth ordering between the key points.
In one embodiment of the present application, obtaining the image to be processed includes obtaining the image to be processed and feature maps of at least two human body key points of the image, and providing the image to the neural network includes providing both the image to be processed and the feature maps of the human body key points to the neural network.
In another embodiment of the present application, the feature maps of the human body key points include heat maps of the human body key points.
In a further embodiment of the present application, performing the depth context prediction processing via the neural network includes: forming, via the neural network and according to the image to be processed and the feature maps of the human body key points, feature values of at least two human body key points; obtaining differences between the feature values; and forming the depth context of the human body key points based on the differences.
In another embodiment of the present application, the depth context of the human body key points includes information characterizing whether one human body key point is in front of or behind another human body key point.
In another embodiment of the present application, this information includes a probability value that one human body key point is in front of or behind another human body key point.
In another embodiment of the present application, the depth context of the human body key points includes a depth context matrix of the human body key points, where the number of rows and the number of columns of the matrix both equal the number of human body key points, the n-th row of the matrix corresponds to the n-th human body key point, the m-th column corresponds to the m-th human body key point, and the value at row n, column m indicates the probability that the n-th human body key point is in front of (or behind) the m-th human body key point.
In another embodiment of the present application, the neural network is trained in advance using a plurality of image samples annotated with depth context annotation information of human body key points, where the depth context annotation information indicates the relative depth ordering between the human body key points.
According to another aspect of the embodiments of the present application, a neural network training method is provided. The method includes: obtaining an image sample; providing the image sample to a neural network to be trained, which performs depth context prediction processing on human body key points to obtain the depth context of the human body key points; and supervising the predicted depth context of the human body key points with the depth context annotation information of the human body key points of the image sample, so as to perform supervised learning on the neural network to be trained.
In one embodiment of the present application, obtaining the image sample includes obtaining the image sample and feature maps of at least two human body key points of the image sample, and providing the image sample to the neural network to be trained includes providing both the image sample and the feature maps of the human body key points to the neural network to be trained.
In another embodiment of the present application, the depth context annotation information of the human body key points of the image sample is formed from the annotated three-dimensional coordinates of the human body key points of the image sample, or is formed by manual annotation.
In a further embodiment of the present application, the depth context annotation information of the human body key points of the image sample includes annotation information characterizing whether one human body key point is in front of or behind another human body key point.
In a further embodiment of the present application, this annotation information includes an annotated probability value that one human body key point is in front of or behind another human body key point.
In a further embodiment of the present application, the depth context annotation information of the human body key points includes a depth context annotation matrix of the human body key points, where the number of rows and columns of the annotation matrix both equal the number of human body key points, the n-th row corresponds to the n-th human body key point, the m-th column corresponds to the m-th human body key point, and the annotation value at row n, column m indicates the annotated probability that the n-th human body key point is in front of (or behind) the m-th human body key point.
In a further embodiment of the present application, the annotated probability value is: a first annotation value, indicating that the annotated depth coordinate of one human body key point in three-dimensional space is greater than the sum of the annotated depth coordinate of the other human body key point and a predetermined value; or a second annotation value, indicating that the annotated depth coordinate of one human body key point in three-dimensional space is less than the annotated depth coordinate of the other human body key point minus the predetermined value; or a third annotation value, indicating that the absolute difference between the annotated depth coordinates of the two human body key points in three-dimensional space does not exceed the predetermined value.
According to another aspect of the embodiments of the present application, a depth context prediction apparatus for human body key points is provided. The apparatus includes: a first obtaining module, configured to obtain an image to be processed; and a first depth context module comprising a neural network, configured to provide the image to be processed to the neural network, which performs depth context prediction processing on human body key points to obtain the depth context of the human body key points, where the depth context of the human body key points indicates the relative depth ordering between the key points.
In one embodiment of the present application, the first obtaining module is further configured to obtain the image to be processed and feature maps of at least two human body key points of the image, and the first depth context module is further configured to provide both the image to be processed and the feature maps of the human body key points to the neural network.
In another embodiment of the present application, the feature maps of the human body key points include heat maps of the human body key points.
In a further embodiment of the present application, the neural network includes: a first unit, configured to form feature values of at least two human body key points according to the image to be processed and the feature maps of the human body key points; a second unit, configured to obtain differences between the feature values; and a third unit, configured to form the depth context of the human body key points based on the differences.
In a further embodiment of the present application, the second unit includes a vector difference computing unit, configured to compute pairwise differences between the feature values of the multiple human body key points.
In a further embodiment of the present application, the third unit includes a context forming unit, configured to form the depth context of the human body key points according to at least one difference.
According to another aspect of the embodiments of the present application, a neural network training apparatus is provided. The apparatus includes: a second obtaining module, configured to obtain an image sample; a second depth context module comprising a neural network to be trained, configured to provide the image sample to the neural network to be trained, which performs depth context prediction processing on human body key points to obtain the depth context of the human body key points; and a supervision module, configured to supervise the predicted depth context of the human body key points with the depth context annotation information of the human body key points of the image sample, so as to perform supervised learning on the neural network to be trained.
In one embodiment of the present application, the second obtaining module is further configured to obtain the image sample and feature maps of at least two human body key points of the image sample, and the second depth context module is further configured to provide both the image sample and the feature maps of the human body key points to the neural network to be trained.
In another embodiment of the present application, the apparatus further includes: a first annotation module, configured to form the depth context annotation information of the human body key points of the image sample from the annotated three-dimensional coordinates of the human body key points of the image sample; or a second annotation module, configured to provide a manual annotation interface and to form the depth context annotation information of the human body key points of the image sample according to information received through that interface.
In a further embodiment of the present application, the depth context annotation information of the human body key points of the image sample includes annotation information characterizing whether one human body key point is in front of or behind another human body key point.
In a further embodiment of the present application, this annotation information includes an annotated probability value that one human body key point is in front of or behind another human body key point.
In a further embodiment of the present application, the depth context annotation information of the human body key points includes a depth context annotation matrix of the human body key points, where the number of rows and columns of the annotation matrix both equal the number of human body key points, the n-th row corresponds to the n-th human body key point, the m-th column corresponds to the m-th human body key point, and the annotation value at row n, column m indicates the annotated probability that the n-th human body key point is in front of (or behind) the m-th human body key point.
In a further embodiment of the present application, the annotated probability value is: a first annotation value, indicating that the annotated depth coordinate of one human body key point in three-dimensional space is greater than the sum of the annotated depth coordinate of the other human body key point and a predetermined value; or a second annotation value, indicating that the annotated depth coordinate of one human body key point in three-dimensional space is less than the annotated depth coordinate of the other human body key point minus the predetermined value; or a third annotation value, indicating that the absolute difference between the annotated depth coordinates of the two human body key points in three-dimensional space does not exceed the predetermined value.
According to another aspect of the embodiments of the present application, an electronic device is provided, including: a memory for storing a computer program; and a processor for executing the computer program stored in the memory, where execution of the computer program implements any of the method embodiments of the present application.
According to another aspect of the embodiments of the present application, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed by a processor, any of the method embodiments of the present application is implemented.
According to another aspect of the embodiments of the present application, a computer program is provided, including computer instructions; when the computer instructions run in a processor of a device, any of the method embodiments of the present application is implemented.
Based on the depth context prediction method and apparatus for human body key points, the neural network training method and apparatus, the electronic device, the computer-readable storage medium and the computer program provided by the present application, a neural network is used to predict the depth context of human body key points. Because this depth context indicates the relative depth ordering between the key points, it can serve as reference information for three-dimensional human body pose prediction and thus helps to avoid prediction errors during three-dimensional pose prediction. The technical solution provided by the present application therefore helps to improve the accuracy of three-dimensional human body pose prediction and to avoid the adverse effects that erroneous predictions would have on interactive entertainment, behavior analysis and similar applications.
The technical solution of the present application is described in further detail below with reference to the accompanying drawings and embodiments.
Brief description of the drawings
The accompanying drawings, which form part of the specification, illustrate embodiments of the present application and, together with the description, serve to explain the principles of the application.
The present application can be understood more clearly from the following detailed description with reference to the accompanying drawings, in which:
Fig. 1 is a flowchart of one embodiment of the depth context prediction method for human body key points of the present application;
Fig. 2 is a schematic diagram of an image to be processed of the present application;
Fig. 3 is a schematic diagram of one embodiment of the heat maps of 16 human body key points of the image to be processed shown in Fig. 2;
Fig. 4 is a schematic diagram of one embodiment of the human body key points of the image to be processed shown in Fig. 2;
Fig. 5 is a schematic diagram of one embodiment of the depth context matrix of human body key points of the present application;
Fig. 6 is a schematic diagram of one embodiment of performing depth context prediction processing on human body key points for the image to be processed shown in Fig. 2;
Fig. 7 is a flowchart of one embodiment of the neural network training method of the present application;
Fig. 8 is a structural schematic diagram of one embodiment of the depth context prediction apparatus for human body key points of the present application;
Fig. 9 is a structural schematic diagram of one embodiment of the neural network training apparatus of the present application;
Fig. 10 is a block diagram of an example device for implementing embodiments of the present application.
Detailed description of embodiments
Various exemplary embodiments of the present application are now described in detail with reference to the accompanying drawings. It should be noted that, unless otherwise specified, the relative arrangement of components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the application.
It should also be understood that, for ease of description, the sizes of the various parts shown in the drawings are not drawn according to actual proportions.
The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the application or its uses.
Techniques, methods and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate such techniques, methods and devices should be considered part of the specification.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further discussed in subsequent drawings.
The embodiments of the present application can be applied to electronic devices such as terminal devices, computer systems and servers, which can operate together with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments and/or configurations suitable for use with electronic devices such as terminal devices, computer systems and servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, and distributed cloud computing environments including any of the above systems, and so on.
Electronic devices such as terminal devices, computer systems and servers can be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. In general, program modules may include routines, programs, target programs, components, logic, data structures and the like that perform particular tasks or implement particular abstract data types. The computer system/server can be implemented in a distributed cloud computing environment, in which tasks are executed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media including storage devices.
Exemplary embodiments
Fig. 1 is a flowchart of one embodiment of the depth context prediction method for human body key points of the present application.
As shown in Fig. 1, the method of this embodiment mainly includes step S100 and step S110, which are as follows:
S100: obtain an image to be processed.
S110: provide the image to be processed to a neural network, and perform depth context prediction processing on human body key points via the neural network to obtain the depth context of the human body key points.
In an optional example, the image to be processed in the present application can be an original image to be processed, or a local region of an original image that contains a human body, or an image obtained by processing such a local region. In addition, the image to be processed can be a static picture or photo, or a video frame of a dynamic video. One specific example of an image to be processed is shown in Fig. 2.
In an optional example, the human body contained in the image to be processed can be a complete human body (for example, the image shown in Fig. 2 contains a complete human body), or a partial human body caused by occlusion, camera coverage angle or other reasons. The present application does not limit the specific form of the human body in the image to be processed.
In an optional example, the present application can obtain not only the image to be processed but also feature maps of at least two human body key points of the image. The number of human body key points of the image to be processed is at least two and is usually larger, for example 12, 14 or 16 key points. The number of key points can be determined on the principle that they should roughly describe the human body pose; the present application does not limit the specific number. Correspondingly, the number of key point feature maps obtained for the image to be processed is also at least two and is usually equal to the number of key points; for example, when the image has 12, 14 or 16 human body key points, the feature maps of all 12, 14 or 16 key points can be obtained. The present application does not limit the specific number of feature maps obtained.
In an optional example, a feature map of a human body key point is generally used to represent the features of that key point; for example, it can be a heat map of the key point. The present application does not limit the specific form of the feature maps. In the following description the technical solution is sometimes explained using key point heat maps as an example, but this does not mean that the present application must use heat maps.
In an optional example, the heat maps of the human body key points of the image to be processed can be obtained using an existing two-dimensional human body key point feature extraction technique. For example, a neural network for extracting human body key point features (hereinafter referred to as the feature extraction neural network) can be used to obtain the heat map of each key point. Specifically, the image to be processed is provided to the feature extraction neural network, which performs key point feature extraction processing; from the information output by this network, the heat maps of at least two (for example, all) human body key points in the image can be obtained.
In an optional example, when the number of key points is 16, the heat maps of 16 human body key points can be obtained for the image shown in Fig. 2; a specific example of these 16 heat maps is shown in Fig. 3. In Fig. 3, each heat map corresponds to one human body key point, and different heat maps correspond to different key points; the 16 heat maps correspond to the 16 key points numbered 0-15 in Fig. 4. Each heat map in Fig. 3 contains one bright spot, which can generally be regarded as the location of the corresponding key point. For example, the heat map in the upper-left corner of Fig. 3 corresponds to the key point numbered 5 in Fig. 4, and the heat map in the lower-left corner of Fig. 3 corresponds to the key point numbered 13 in Fig. 4.
In an optional example, the feature extraction neural network may include, but is not limited to, convolutional layers, nonlinear ReLU layers, pooling layers, fully connected layers and so on; the more layers it contains, the deeper the network. Its network structure may adopt, but is not limited to, the structures used by neural networks such as AlexNet, deep residual networks (ResNet) or VGGNet (Visual Geometry Group Network). The present application does not limit the specific implementation used to obtain the heat maps of the human body key points, nor the network structure of the feature extraction neural network.
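To make the data flow concrete, the following is a minimal sketch of how per-key-point heat maps might be produced by a small 2D pose backbone. The layer choices, the helper name HeatmapExtractor and the setting of 16 key points (matching Figs. 3 and 4) are illustrative assumptions, not the configuration prescribed by the present application.
```python
import torch
import torch.nn as nn

class HeatmapExtractor(nn.Module):
    """Illustrative feature extraction neural network: a small convolutional
    backbone followed by a 1x1 head that outputs one heat map per key point."""
    def __init__(self, num_keypoints: int = 16):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(256, num_keypoints, kernel_size=1)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # image: (N, 3, H, W) -> heat maps: (N, 16, H/8, W/8)
        return self.head(self.backbone(image))

extractor = HeatmapExtractor()
image = torch.randn(1, 3, 256, 256)   # stand-in for the image of Fig. 2
heatmaps = extractor(image)           # one heat map per human body key point
print(heatmaps.shape)                 # torch.Size([1, 16, 32, 32])
```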
In an optional example, the depth context of the human body key points can be used to predict the three-dimensional human body pose, for example by providing it as input to a corresponding neural network that performs three-dimensional pose prediction processing. The depth context of the human body key points can of course also be used for other purposes, such as predicting the depth values of the key points; the present application does not limit the specific application scenario.
In an optional example, the depth context of the human body key points generally represents the relative depth ordering between the key points; for two human body key points, it represents which of the two is in front of the other. These two key points can be any two of all the human body key points, or two key points designated in advance.
In an optional example, the depth context of the human body key points may include, for any two human body key points, information characterizing whether one of them is in front of or behind the other; for example, for every pair of key points among all the human body key points, the information characterizing whether one key point of the pair is in front of or behind the other. This information can specifically be a probability value that one key point is in front of (or behind) the other.
In an optional example, the depth context of the human body key points can be a depth context matrix of the human body key points, where the number of rows and columns of the matrix both equal the number of key points in the image to be processed; for example, when the number of key points is an integer A, the matrix is an A × A matrix. The n-th row of the matrix corresponds to the n-th key point (for example, the key point numbered n) and the m-th column corresponds to the m-th key point (for example, the key point numbered m). The value at row n, column m indicates the probability that the n-th key point is in front of (or behind) the m-th key point, and its value range is typically 0-1. The depth context can also be represented in other forms such as an array; the present application does not limit the specific representation.
A specific example of the depth context matrix of the human body key points is shown in Fig. 5, which shows a 16 × 16 matrix, i.e. the number of key points is 16. The matrix in Fig. 5 contains 16 × 16 = 256 values, each of which is a probability value between 0 and 1. For example, the value at row 0, column 2 is 0.2, indicating that the probability that the key point numbered 0 in Fig. 4 is in front of the key point numbered 2 is 0.2. As another example, the value at row 1, column 15 is 0.3, indicating that the probability that the key point numbered 1 is in front of the key point numbered 15 is 0.3. Note that the values on the diagonal (row n, column n) are all set to 0.5, indicating that the probability that the key point numbered n is in front of itself is 0.5; a probability value of 0.5 can be interpreted as meaning that front and back cannot be distinguished. In this paragraph, "in front of" can equally be replaced by "behind".
Representing the depth context of every pair of human body key points as a probability matrix expresses the depth ordering clearly and in an organized way, which facilitates subsequent three-dimensional human body pose prediction based on the depth context of the key points.
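As a small illustration of how such a matrix might be consumed by a downstream consumer, the sketch below builds a toy 16 × 16 probability matrix in the style of Fig. 5 and reads off a pairwise front/behind decision. The decision threshold of 0.5 and the helper name relative_order are illustrative assumptions.
```python
import numpy as np

NUM_KEYPOINTS = 16

# Toy depth context matrix: entry (n, m) is the probability that
# key point n is in front of key point m.
rng = np.random.default_rng(0)
P = rng.uniform(0.0, 1.0, size=(NUM_KEYPOINTS, NUM_KEYPOINTS))
np.fill_diagonal(P, 0.5)  # a key point is neither in front of nor behind itself

def relative_order(P: np.ndarray, n: int, m: int) -> str:
    """Return a readable front/behind decision for key points n and m."""
    if P[n, m] > 0.5:
        return f"key point {n} is predicted to be in front of key point {m}"
    if P[n, m] < 0.5:
        return f"key point {n} is predicted to be behind key point {m}"
    return f"the order of key points {n} and {m} is ambiguous"

print(relative_order(P, 0, 2))
print(relative_order(P, 1, 15))
```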
In an optional example, the neural network used to perform the depth context processing of the human body key points (hereinafter referred to as the depth context neural network) can first form the feature values of at least two human body key points according to the image to be processed; then compute the differences between feature values according to the feature values it has formed (for example, the differences between every pair of feature values among all the feature values); and finally form the depth context of the human body key points based on the computed differences.
In an optional example, in addition to providing the image to be processed to the depth context neural network, the present application can also provide the feature maps of the human body key points of the image to the depth context neural network. For example, the image to be processed and the key point feature maps can be merged (i.e. concatenated) to form a tensor of the image to be processed (which may also be called the input tensor), and this tensor is provided to the depth context neural network, which performs the depth context prediction processing on the human body key points; the depth context of the key points is then obtained from the information output by the network. Providing the key point feature maps to the depth context neural network helps to improve the accuracy of the depth context prediction processing performed by the network, and thus the prediction accuracy of the depth context of the human body key points.
In an optional example, the depth context neural network can first form the feature values of at least two human body key points (for example, the feature values of all key points) according to the image to be processed and the key point feature maps; then compute the differences between feature values (for example, the differences between every pair of feature values among all the feature values); and finally form the depth context of the human body key points based on the computed differences.
In an optional example, the depth context neural network may include three parts: a residual network unit, a vector difference computing unit and a context forming unit. The depth context prediction of the human body key points is implemented by these three parts.
In an optional example, the residual network unit is configured to form the feature values of at least two human body key points according to the image to be processed, or according to the image to be processed and the heat maps of the human body key points. The image to be processed and the feature maps of at least one human body key point (for example, the feature maps of all key points) can be merged to form the input tensor, which is provided to the residual network unit of the depth context neural network; the residual network unit forms multiple feature values (which may also be called a feature vector) from the input tensor. For example, for the image to be processed shown in Fig. 2 and the heat maps of the 16 human body key points shown in Fig. 3, the residual network unit forms 16 feature values (which may also be called a 16-dimensional feature vector), where each feature value corresponds to one human body key point and its feature map.
In an optional example, the residual network unit may include, but is not limited to, convolutional layers, nonlinear ReLU layers, pooling layers, fully connected layers and so on; the more layers it contains, the deeper the network. The residual network unit can specifically be ResNet-18, ResNet-34, ResNet-50 or the like; the present application does not limit the network structure of the residual network unit.
In an optional example, the vector difference computing unit is configured to compute, for every two feature values among the feature values output by the residual network unit (for example, every pair among all the feature values, such as the 16-dimensional feature vector mentioned above), the difference between them, and to output the computed differences, for example the pairwise differences between all feature values.
In an optional example, the feature value difference computation performed by the vector difference computing unit can be expressed as the following formula (1):
F_ij = F_i - F_j        formula (1)
In formula (1), F_ij denotes the difference between feature value F_i and feature value F_j; F_i denotes the feature value of the i-th human body key point; and F_j denotes the feature value of the j-th human body key point.
In an optional example, the context forming unit is configured to convert the differences output by the vector difference computing unit into the depth context of the human body key points. For example, the context forming unit converts each of the differences output by the vector difference computing unit into a probability value between 0 and 1, so that the depth context matrix of the human body key points, such as the 16 × 16 matrix shown in Fig. 5, can be obtained.
In an optional example, the conversion performed by the context forming unit can be expressed as formula (2), which maps each difference to a probability value. In formula (2), P_ij denotes the resulting probability value, and F_ij denotes the difference between the feature value F_i of the i-th human body key point and the feature value F_j of the j-th human body key point.
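The following sketch illustrates formulas (1) and (2) together: the pairwise differences of the key point feature values are computed and then squashed into probabilities between 0 and 1. The exact form of formula (2) is not reproduced above; the logistic sigmoid used below is one common choice and is purely an assumption.
```python
import torch

def pairwise_differences(features: torch.Tensor) -> torch.Tensor:
    """Formula (1): F_ij = F_i - F_j for every pair of key point feature values.

    features: tensor of shape (K,), one scalar feature value per key point
    (K = 16 in the example of Figs. 3-5). Returns a (K, K) tensor.
    """
    return features.unsqueeze(1) - features.unsqueeze(0)

def to_probabilities(differences: torch.Tensor) -> torch.Tensor:
    """Stand-in for formula (2): squash each difference F_ij into a probability
    P_ij in (0, 1). A logistic sigmoid is assumed here; the present application
    only requires a conversion to values between 0 and 1."""
    return torch.sigmoid(differences)

features = torch.randn(16)   # 16-dimensional feature vector from the residual network unit
P = to_probabilities(pairwise_differences(features))
print(P.shape)               # torch.Size([16, 16]), cf. the matrix of Fig. 5
```
One convenient property of this assumed choice is that a zero difference maps to 0.5, which matches the diagonal entries of the matrix in Fig. 5.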
In an optional example, one embodiment of obtaining the depth context prediction result of the human body key points using a depth context neural network composed of a residual network unit, a vector difference computing unit and a context forming unit is shown in Fig. 6. Specifically, the leftmost part of Fig. 6 is the image to be processed. By providing the image to be processed to the feature extraction neural network (the network used to predict the two-dimensional human body pose in Fig. 6), the heat maps of all the human body key points of the image can be obtained (the multiple heat maps in the middle of Fig. 6). The image to be processed and the heat maps of all the key points are merged (concatenated) to form the input tensor, which is provided to the residual network unit of the depth context neural network (the ResNet in the middle of Fig. 6); the ResNet computes the feature value of each key point from the input tensor, so that the feature value of each key point can be obtained from the information output by the ResNet. The feature values of all the key points output by the ResNet are provided to the vector difference computing unit (the Pairwise Layer to the right of the middle of Fig. 6), which computes the pairwise differences between the feature values of the key points. All the differences computed by the vector difference computing unit are provided to the context forming unit (the Rank Transfer at the center-right of Fig. 6), which converts each difference into a probability value between 0 and 1, thereby obtaining the depth context matrix of the human body key points (the probability matrix at the rightmost of Fig. 6).
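Putting the pieces of Fig. 6 together, the following is a minimal end-to-end sketch of the depth context neural network: a ResNet-style backbone maps the concatenated image-plus-heat-map tensor to one feature value per key point, a pairwise layer forms the differences of formula (1), and a rank-transfer step converts them to probabilities. The use of torchvision's resnet18, the input channel count (3 image channels plus 16 heat maps) and the sigmoid conversion are illustrative assumptions rather than the configuration prescribed by the present application.
```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class DepthContextNet(nn.Module):
    """Sketch of the depth context neural network of Fig. 6."""
    def __init__(self, num_keypoints: int = 16):
        super().__init__()
        # Residual network unit: a ResNet-18 whose first convolution accepts the
        # concatenated image (3 channels) + key point heat maps (16 channels),
        # and whose head outputs one feature value per key point.
        self.backbone = resnet18(weights=None)
        self.backbone.conv1 = nn.Conv2d(3 + num_keypoints, 64, kernel_size=7,
                                        stride=2, padding=3, bias=False)
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, num_keypoints)

    def forward(self, image: torch.Tensor, heatmaps: torch.Tensor) -> torch.Tensor:
        # Merge (concatenate) the image and the heat maps into the input tensor.
        heatmaps = nn.functional.interpolate(heatmaps, size=image.shape[-2:])
        x = torch.cat([image, heatmaps], dim=1)
        feats = self.backbone(x)                         # (N, K) feature values
        # Pairwise layer, formula (1): F_ij = F_i - F_j.
        diffs = feats.unsqueeze(2) - feats.unsqueeze(1)  # (N, K, K)
        # Rank transfer: convert each difference to a probability in (0, 1).
        return torch.sigmoid(diffs)                      # depth context matrix

net = DepthContextNet()
image = torch.randn(1, 3, 256, 256)
heatmaps = torch.randn(1, 16, 64, 64)
depth_context = net(image, heatmaps)
print(depth_context.shape)                               # torch.Size([1, 16, 16])
```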
In an optional example, the depth context neural network is trained using multiple image samples, each of which is usually annotated with depth context annotation information of the human body key points. The depth context annotation information indicates the relative depth ordering between the key points in the image sample and is used to supervise the learning of the neural network to be trained; it can specifically be a depth context matrix of the human body key points.
Fig. 7 is a flowchart of one embodiment of the neural network training method of the present application. Fig. 7 includes step S700, step S710 and step S720, which are as follows:
S700: obtain an image sample.
S710: provide the image sample to the neural network to be trained, and perform depth context prediction processing on human body key points via the neural network to be trained, to obtain the depth context of the human body key points.
S720: supervise the predicted depth context of the human body key points with the depth context annotation information of the human body key points of the image sample, so as to perform supervised learning on the neural network to be trained.
In an optional example, the image sample in the present application can be an original image sample, or a local region of an original image sample that contains a human body, or an image obtained by processing such a local region. The image sample can be a static picture sample or photo sample, or a video frame sample of a dynamic video sample.
In an optional example, the human body contained in the image sample can be a complete human body, or a partial human body caused by occlusion, camera coverage angle or other reasons. The present application does not limit the specific form of the human body in the image sample.
In an optional example, image samples can be read from a training data set and provided to the depth context neural network to be trained. The training data set contains multiple image samples for training the neural network, and in general each image sample is annotated with depth context annotation information of the human body key points. One or more image samples can be read from the training data set at a time, either randomly or sequentially in the order in which the samples are arranged.
In an optional example, the depth context annotation information of the human body key points generally indicates the relative depth ordering between the key points; for two human body key points, it represents which of the two is in front of the other. These two key points can be any two of all the human body key points, or two key points designated in advance.
In an optional example, the depth context annotation information of the human body key points may include, for two human body key points, annotation information characterizing whether one of them is in front of or behind the other; for example, for every pair of key points among all the human body key points, the annotation information characterizing whether one key point of the pair is in front of or behind the other. This annotation information can specifically be an annotated probability value that one key point is in front of (or behind) the other.
In an optional example, the depth context annotation information of the human body key points can specifically be a depth context annotation matrix of the human body key points, where the number of rows and columns of the annotation matrix both equal the number of human body key points of the image sample; for example, when the number of key points is an integer A, the annotation matrix is an A × A matrix. The n-th row of the annotation matrix corresponds to the n-th key point (for example, the key point numbered n) and the m-th column corresponds to the m-th key point (for example, the key point numbered m). The value at row n, column m indicates the annotated probability that the n-th key point is in front of (or behind) the m-th key point, and its value range is typically 0-1. The depth context annotation information can also be represented in other forms such as an array; the present application does not limit the specific representation.
In an optional example, the depth context annotation information of the human body key points of an image sample can be formed from the annotated three-dimensional coordinates of the human body key points of the image sample, or can be formed by manual annotation.
In an optional example, each annotation value in the depth context annotation matrix of the human body key points can take a first annotation value, a second annotation value or a third annotation value. The first annotation value indicates that the annotated depth coordinate of one human body key point in three-dimensional space is greater than the sum of the annotated depth coordinate of the other key point and a predetermined value; the second annotation value indicates that the annotated depth coordinate of one key point is less than the annotated depth coordinate of the other key point minus the predetermined value; and the third annotation value indicates that the absolute difference between the annotated depth coordinates of the two key points does not exceed the predetermined value. The annotation value at row i, column j of the depth context annotation matrix can be expressed by the following formula (3):
M_ij = 1,   if z_i' > z_j' + ε
M_ij = 0,   if z_i' < z_j' - ε
M_ij = 0.5, if |z_i' - z_j'| ≤ ε        formula (3)
In formula (3), M_ij denotes the annotation value (i.e. the annotated probability value) at row i, column j of the depth context annotation matrix; 1 is the first annotation value (which may also be called the first annotated probability value); 0 is the second annotation value (the second annotated probability value); 0.5 is the third annotation value (the third annotated probability value); z_i' denotes the z coordinate in the annotated three-dimensional depth coordinates of the i-th human body key point; z_j' denotes the corresponding z coordinate of the j-th human body key point; and ε denotes the predetermined value, i.e. a constant whose size can be determined according to the actual application.
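The following sketch builds such an annotation matrix directly from the annotated z coordinates, implementing the rule of formula (3); the choice of ε = 0.05 (in whatever depth units the annotations use) is an illustrative assumption.
```python
import numpy as np

def annotation_matrix(z: np.ndarray, eps: float = 0.05) -> np.ndarray:
    """Build the depth context annotation matrix M of formula (3).

    z: array of shape (K,) holding the annotated depth (z) coordinate of each
    of the K human body key points in three-dimensional space.
    """
    K = z.shape[0]
    M = np.full((K, K), 0.5)            # third annotation value by default
    diff = z[:, None] - z[None, :]      # diff[i, j] = z_i' - z_j'
    M[diff > eps] = 1.0                 # first annotation value: z_i' > z_j' + eps
    M[diff < -eps] = 0.0                # second annotation value: z_i' < z_j' - eps
    return M

# Example: annotated depth coordinates for 16 key points of one image sample.
z = np.random.default_rng(1).uniform(-1.0, 1.0, size=16)
M = annotation_matrix(z)
print(M.shape)          # (16, 16); diagonal entries are 0.5, as in Fig. 5
```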
In an optional example, the present application can obtain not only the image sample but also feature maps of at least two human body key points of the image sample. The number of human body key points of the image sample is at least two and is usually larger, for example 12, 14 or 16 key points. Correspondingly, the number of key point feature maps obtained for the image sample is at least two and is usually equal to the number of key points; for example, when the image sample has 12, 14 or 16 human body key points, the feature maps of all 12, 14 or 16 key points can be obtained. The present application does not limit the specific number of feature maps obtained for the image sample.
In an optional example, a feature map of a human body key point is generally used to represent the features of that key point; for example, it can be a heat map of the key point. The present application does not limit the specific form of the feature maps.
In an optional example, the heat maps of the human body key points of the image sample can be obtained using an existing two-dimensional human body key point feature extraction technique. For example, a neural network for extracting human body key point features (hereinafter referred to as the feature extraction neural network) can be used to obtain the heat map of each key point of the image sample. Specifically, the image sample is provided to the feature extraction neural network, which performs key point feature extraction processing; from the information output by this network, the heat maps of at least two (for example, all) human body key points in the image sample can be obtained.
In an optional example, the depth context neural network to be trained can first form the feature values of at least two human body key points according to the image sample; then compute the differences between feature values according to the feature values it has formed (for example, the differences between every pair of feature values among all the feature values); and finally form the depth context of the human body key points based on the computed differences.
In an optional example, in addition to providing the image sample to the depth context neural network to be trained, the feature maps of the human body key points of the image sample can also be provided to the network. For example, the image sample and the key point feature maps can be merged (i.e. concatenated) to form a tensor of the image sample (which may also be called the input tensor), and this tensor is provided to the depth context neural network to be trained, which performs the depth context prediction processing on the human body key points; the depth context of the key points is then obtained from the information output by the network to be trained. Providing the key point feature maps to the depth context neural network to be trained helps to improve the accuracy of the depth context prediction processing it performs, and thus helps to improve the performance of the depth context neural network.
In an optional example, the depth context neural network to be trained can first form the feature values of at least two human body key points (for example, the feature values of all key points) according to the image sample and the key point feature maps; then compute the differences between feature values (for example, the differences between every pair of feature values among all the feature values); and finally form the depth context of the human body key points based on the computed differences.
In an optional example, in the case where the depth context neural network to be trained in the present application includes a residual network unit, a vector difference computing unit, and a context forming unit, the present application may first merge the image sample and the feature maps of at least one human body key point (for example, the feature maps of all the human body key points) to form the input tensor, and supply the input tensor to the residual network unit to be trained, which forms a plurality of feature values (which may also be referred to as feature vectors) from the input tensor. Next, the vector difference computing unit to be trained performs feature-value difference calculation for pairs of feature values output by the residual network unit to be trained (for example, every two feature values among all the feature values) and outputs the calculated differences; for instance, the vector difference computing unit to be trained outputs the pairwise differences among all the feature values. Finally, the context forming unit to be trained converts the differences output by the vector difference computing unit to be trained into the depth context of the human body key points. For example, the context forming unit to be trained converts each difference output by the vector difference computing unit to be trained into a probability value between 0 and 1, so that the present application can obtain the depth context matrix of the human body key points.
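As an illustration of how these three units could be wired together, the sketch below assumes a PyTorch implementation with a small custom residual backbone; the layer sizes, the 16-key-point setting, and the use of a sigmoid in the context forming unit are assumptions for illustration rather than details fixed by the application.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))

class DepthContextNet(nn.Module):
    """Residual network unit -> vector difference computing unit -> context forming unit."""
    def __init__(self, num_keypoints=16, feat_dim=64):
        super().__init__()
        self.num_keypoints, self.feat_dim = num_keypoints, feat_dim
        # Residual network unit: turns the input tensor into one feature vector per key point.
        self.residual_unit = nn.Sequential(
            nn.Conv2d(3 + num_keypoints, 32, 7, stride=2, padding=3), nn.ReLU(),
            ResidualBlock(32), ResidualBlock(32),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, num_keypoints * feat_dim),
        )
        # Context forming unit: converts each pairwise difference into a probability in (0, 1).
        self.context_unit = nn.Sequential(nn.Linear(feat_dim, 1), nn.Sigmoid())

    def forward(self, input_tensor):
        b = input_tensor.shape[0]
        feats = self.residual_unit(input_tensor).view(b, self.num_keypoints, self.feat_dim)
        # Vector difference computing unit: difference between every two feature vectors.
        diffs = feats.unsqueeze(2) - feats.unsqueeze(1)   # (B, K, K, feat_dim)
        return self.context_unit(diffs).squeeze(-1)       # (B, K, K) depth context matrix

net = DepthContextNet()
x = torch.randn(2, 3 + 16, 128, 128)   # image concatenated with 16 key-point feature maps
print(net(x).shape)                    # torch.Size([2, 16, 16])
```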
In an optional example, the present application may take, as guidance information, the difference between the depth context of the human body key points output by the depth context neural network to be trained and the depth context annotation information of the human body key points of the image sample, and perform supervised learning on the depth context neural network to be trained with a corresponding loss function, with the aim of reducing this difference.
In an optional example, the loss function of the present application can be expressed in the form of the following formula (4):
In the above formula (4), C_ij ≡ C(F_ij) denotes the loss term based on the i-th key point and the j-th key point; M_ij denotes the annotation value (for example, an annotated probability value) in the i-th row and j-th column of the depth context annotation matrix; P_ij denotes the probability value output by the depth context neural network to be trained for the i-th key point and the j-th key point; F_ij denotes the difference between the feature value F_i of the i-th human body key point and the feature value F_j of the j-th human body key point, both of which are obtained by calculation in the depth context neural network to be trained.
In the case where the number of human body key points of the image sample is 16, the number of C_ij terms can be 256, and the present application can use these 256 C_ij terms to perform supervised learning on the depth context neural network to be trained.
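Since formula (4) itself is not reproduced in this text, the sketch below only illustrates one plausible realization of the described supervision: a cross-entropy style term C_ij computed from the annotated value M_ij and the predicted probability P_ij for every key-point pair. The cross-entropy form and the summation over all 256 pairs are assumptions for illustration, not the formula of the application.

```python
import torch

def pairwise_depth_context_loss(pred_probs, mark_matrix):
    """pred_probs[i, j] plays the role of P_ij; mark_matrix[i, j] plays the role of M_ij.
    A binary cross-entropy form is assumed here as one common choice for supervising
    pairwise probabilities; the exact formula (4) of the application may differ."""
    eps = 1e-7
    p = pred_probs.clamp(eps, 1.0 - eps)
    c = -(mark_matrix * torch.log(p) + (1.0 - mark_matrix) * torch.log(1.0 - p))  # one C_ij per pair
    return c.sum()  # 16 key points -> 256 C_ij terms summed

pred = torch.rand(16, 16)
marks = torch.randint(0, 2, (16, 16)).float()
print(pairwise_depth_context_loss(pred, marks))
```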
In an optional example, when the training of the depth context neural network to be trained reaches a predetermined iteration condition, the current training process ends. The predetermined iteration condition in the present application may include: the difference between the human body key point depth context output by the depth context neural network and the human body key point depth context annotation information of the image sample meets a predetermined difference requirement. In the case where the difference meets the predetermined difference requirement, the depth context neural network to be trained is successfully trained. The predetermined iteration condition in the present application may also include: the number of image samples used for training the depth context neural network to be trained reaches a predetermined quantity requirement, and so on. In the case where the number of image samples used reaches the predetermined quantity requirement but the difference does not meet the predetermined difference requirement, the depth context neural network to be trained is not successfully trained. A depth context neural network whose training has been successfully completed can be used to perform the depth context prediction processing of human body key points on an image to be processed.
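For illustration, a training loop with the two predetermined iteration conditions described above might look as follows; the optimizer, the learning rate, the difference threshold, and the sample-quantity limit are assumptions and not values given by the application.

```python
import torch

def train(net, loader, loss_fn, max_samples=100000, diff_threshold=0.05):
    """Train until the predetermined difference requirement is met or the
    predetermined quantity of image samples has been used."""
    opt = torch.optim.Adam(net.parameters(), lr=1e-4)
    samples_used, last_loss = 0, float("inf")
    for input_tensor, mark_matrix in loader:
        probs = net(input_tensor)
        loss = loss_fn(probs, mark_matrix)
        opt.zero_grad()
        loss.backward()
        opt.step()
        samples_used += input_tensor.shape[0]
        last_loss = loss.item()
        if last_loss <= diff_threshold:       # predetermined difference requirement met
            return True                       # training completed successfully
        if samples_used >= max_samples:       # predetermined quantity requirement reached
            break
    return last_loss <= diff_threshold        # otherwise training is not successful
```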
Fig. 8 is a structural schematic diagram of an embodiment of the depth context prediction apparatus for human body key points of the present application. As shown in Fig. 8, the apparatus of this embodiment mainly includes: a first obtaining module 800 and a first depth context module 810. Optionally, the apparatus may further include: a second obtaining module 900, a second depth context module 910, a supervision module 920, a first labeling module 930, and a second labeling module 940.
The first obtaining module 800 is configured to obtain an image to be processed.
The first depth context module 810 includes a neural network. The first depth context module 810 is configured to supply the image to be processed to the neural network, so that the neural network performs the depth context prediction processing of human body key points, to obtain the depth context of the human body key points. The depth context of the human body key points in the present application is used to indicate the relative depth position relationship between human body key points.
In an optional example, the first obtaining module 800 may obtain the image to be processed and the feature maps of at least two human body key points of the image to be processed. In this case, the first depth context module 810 may supply both the image to be processed and the feature maps of the human body key points to the neural network. The feature map of a human body key point in the present application may include: a heat map of the human body key point.
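For illustration, a key-point heat map is commonly rendered as a Gaussian peak centred at the key point's image coordinates; the sketch below assumes this common construction, which is not prescribed by the application.

```python
import numpy as np

def keypoint_heatmap(x, y, height, width, sigma=4.0):
    """Render one key point at (x, y) as a Gaussian heat map of shape (height, width)."""
    ys, xs = np.mgrid[0:height, 0:width]
    return np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))

hm = keypoint_heatmap(120, 80, 256, 256)
print(hm.shape, hm.max())   # (256, 256) 1.0
```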
In an optional example, the neural network of the present application may include: a first unit, a second unit, and a third unit. The first unit is configured to form the feature values of at least two human body key points according to the image to be processed and the feature maps of the human body key points. The second unit is configured to obtain the differences between the feature values. The third unit is configured to form the depth context of the human body key points based on the differences.
In an optional example, the first unit may specifically be a residual network unit. The second unit may be a vector difference computing unit. The vector difference computing unit is configured to perform feature-value difference calculation on every two feature values among the feature values of the plurality of human body key points, to obtain the pairwise differences between the feature values. The third unit may be a context forming unit. The context forming unit is configured to form the depth context of the human body key points according to at least one difference.
For the specific operations performed by the first obtaining module 800 and the first depth context module 810, reference may be made to the description of the steps of Fig. 1 in the above method embodiment. For the second obtaining module 900, the second depth context module 910, the supervision module 920, the first labeling module 930, and the second labeling module 940, reference may be made to the description of Fig. 9 in the following apparatus embodiment. Details are not repeated here.
Fig. 9 is a structural schematic diagram of an embodiment of the training apparatus for the neural network of the present application. The training apparatus shown in Fig. 9 mainly includes: a second obtaining module 900, a second depth context module 910, and a supervision module 920. Optionally, the apparatus may further include: a first labeling module 930 and a second labeling module 940.
The second obtaining module 900 is configured to obtain an image sample.
The second depth context module 910 includes the neural network to be trained. The second depth context module 910 is configured to supply the image sample to the neural network to be trained, so that the neural network to be trained performs the depth context prediction processing of human body key points, to obtain the depth context of the human body key points.
The supervision module 920 is configured to supervise the depth context of the human body key points by using the depth context annotation information of the human body key points of the image sample, so as to perform supervised learning on the neural network to be trained.
In an optional example, the second obtaining module 900 may obtain the image sample and the feature maps of at least two human body key points of the image sample. In this case, the second depth context module 910 may supply the image sample and the feature maps of the human body key points to the neural network to be trained.
The first labeling module 930 is configured to form the depth context annotation information of the human body key points of the image sample by using the coordinate annotation information of the human body key points of the image sample in three-dimensional space.
The second labeling module 940 is configured to provide a manual annotation interface, and to form the depth context annotation information of the human body key points of the image sample according to the information received through the manual annotation interface.
In an optional example, the depth context annotation information of the human body key points of the image sample in the present application may include: annotation information characterizing that one human body key point is in front of or behind another human body key point. Optionally, the annotation information characterizing that one human body key point is in front of or behind another human body key point may include: a probability annotation value of one human body key point being in front of or behind another human body key point. Optionally, the depth context annotation information of the human body key points may include: a depth context annotation matrix of the human body key points. The number of rows and the number of columns of the annotation matrix are both the number of human body key points; the n-th row of the annotation matrix corresponds to the n-th human body key point, the m-th column of the annotation matrix corresponds to the m-th human body key point, and the annotation value in the n-th row and m-th column of the annotation matrix indicates the probability annotation value of the n-th human body key point being in front of or behind the m-th human body key point.
In an optional example, the probability annotation value in the present application may be: a first annotation value, a second annotation value, or a third annotation value. The first annotation value indicates that the depth coordinate annotation information of one human body key point in three-dimensional space is greater than the sum of the depth coordinate annotation information of another human body key point in three-dimensional space and a predetermined value. The second annotation value indicates that the depth coordinate annotation information of one human body key point in three-dimensional space is less than the difference between the depth coordinate annotation information of another human body key point in three-dimensional space and the predetermined value. The third annotation value indicates that the absolute value of the difference between the depth coordinate annotation information of one human body key point in three-dimensional space and the depth coordinate annotation information of another human body key point in three-dimensional space does not exceed the predetermined value.
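For illustration, such an annotation matrix could be derived from the annotated depth coordinates as sketched below; the concrete annotation values (1.0, 0.0, 0.5) and the predetermined value (threshold) are assumptions used only for this example.

```python
import numpy as np

def depth_context_mark_matrix(z, threshold=0.05, first=1.0, second=0.0, third=0.5):
    """z[n]: annotated depth coordinate of the n-th key point in 3D space.
    Returns the annotation matrix M, where M[n, m] is the probability annotation
    value of key point n being in front of or behind key point m."""
    z = np.asarray(z, dtype=np.float64)
    n = len(z)
    m = np.full((n, n), third)                       # third annotation value by default
    m[z[:, None] > z[None, :] + threshold] = first   # first annotation value
    m[z[:, None] < z[None, :] - threshold] = second  # second annotation value
    return m

print(depth_context_mark_matrix([0.10, 0.30, 0.12]))
```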
For the specific operations performed by the second obtaining module 900, the second depth context module 910, the supervision module 920, the first labeling module 930, and the second labeling module 940, reference may be made to the description of the steps of Fig. 7 in the above method embodiment. Details are not repeated here.
Example devices
Figure 10 shows an exemplary device 1000 suitable for implementing the present application. The device 1000 may be a control system/electronic system configured in an automobile, a mobile terminal (for example, a smart mobile phone), a personal computer (PC, for example, a desktop or notebook computer), a tablet computer, a server, or the like.
In Figure 10, the device 1000 includes one or more processors, a communication unit, and the like. The one or more processors may be: one or more central processing units (CPU) 1001, and/or one or more graphics processors (GPU) 1013 that perform the depth context prediction of human body key points by using a neural network, and so on. The processors may execute various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) 1002 or executable instructions loaded from a storage portion 1008 into a random access memory (RAM) 1003. The communication unit 1012 may include, but is not limited to, a network card, and the network card may include, but is not limited to, an IB (InfiniBand) network card. The processors may communicate with the read-only memory 1002 and/or the random access memory 1003 to execute the executable instructions, are connected to the communication unit 1012 through a bus 1004, and communicate with other target devices via the communication unit 1012, so as to complete the corresponding steps in the present application.
For the operations performed by the above instructions, reference may be made to the related description in the above method embodiments, which is not detailed here. In addition, the RAM 1003 may also store various programs and data required for the operation of the device. The CPU 1001, the ROM 1002, and the RAM 1003 are connected to each other through the bus 1004.
In the case where the RAM 1003 is present, the ROM 1002 is an optional module. The RAM 1003 stores executable instructions, or executable instructions are written into the ROM 1002 at runtime, and the executable instructions cause the central processing unit 1001 to execute the steps included in the above-described method. An input/output (I/O) interface 1005 is also connected to the bus 1004. The communication unit 1012 may be provided in an integrated manner, or may be provided as a plurality of sub-modules (for example, a plurality of IB network cards) connected to the bus respectively.
The I/O interface 1005 is connected to the following components: an input portion 1006 including a keyboard, a mouse, and the like; an output portion 1007 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 1008 including a hard disk and the like; and a communication portion 1009 including a network card such as a LAN card or a modem. The communication portion 1009 performs communication processing via a network such as the Internet. A drive 1010 is also connected to the I/O interface 1005 as needed. A removable medium 1011, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1010 as needed, so that a computer program read therefrom is installed into the storage portion 1008 as needed.
It should be particularly noted that the architecture shown in Figure 10 is only an optional implementation. In practice, the number and types of the components in Figure 10 may be selected, deleted, added, or replaced according to actual needs. Components with different functions may also be arranged separately or in an integrated manner; for example, the GPU and the CPU may be arranged separately, or the GPU may be integrated on the CPU, and the communication unit may be arranged separately or may be integrated on the CPU or the GPU, and so on. These alternative embodiments all fall within the protection scope of the present application.
In particular, according to the embodiments of the present application, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, the embodiments of the present application include a computer program product, which includes a computer program tangibly embodied on a machine-readable medium; the computer program includes program code for executing the steps shown in the flowcharts, and the program code may include instructions corresponding to the steps of the methods provided in the present application.
In such embodiments, the computer program may be downloaded and installed from a network through the communication portion 1009, and/or installed from the removable medium 1011. When the computer program is executed by the central processing unit (CPU) 1001, the instructions for implementing the above corresponding steps described in the present application are executed.
In one or more optional embodiments, the embodiments of the present disclosure further provide a computer program product for storing computer-readable instructions which, when executed, cause a computer to execute the depth context prediction method for human body key points or the neural network training method described in any of the above embodiments.
The computer program product may be implemented specifically by hardware, software, or a combination thereof. In an optional example, the computer program product is embodied as a computer storage medium; in another optional example, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK).
In one or more optional embodiments, the embodiments of the present disclosure further provide another depth context prediction method for human body key points and another training method for a neural network, together with the corresponding apparatuses, electronic devices, computer storage media, computer programs, and computer program products. The method includes: a first apparatus sends, to a second apparatus, a human body key point depth context prediction instruction or a neural network training instruction, the instruction causing the second apparatus to execute the depth context prediction method for human body key points or the neural network training method in any of the above possible embodiments; and the first apparatus receives the depth context prediction result of the human body key points or the neural network training result sent by the second apparatus.
In some embodiments, the human body key point depth context prediction instruction or the neural network training instruction may specifically be a call instruction. The first apparatus may instruct, by means of a call, the second apparatus to perform the depth context prediction operation for human body key points or the neural network training operation; accordingly, in response to receiving the call instruction, the second apparatus may execute the steps and/or processes in any embodiment of the above depth context prediction method for human body key points or the neural network training method.
It should be understood that terms such as "first" and "second" in the embodiments of the present disclosure are used merely for distinction and should not be construed as limiting the embodiments of the present disclosure. It should also be understood that, in the present disclosure, "a plurality of" may refer to two or more, and "at least one" may refer to one, two, or more. It should also be understood that any component, data, or structure mentioned in the present disclosure may generally be understood as one or more, unless explicitly limited or the context suggests otherwise. It should also be understood that the description of the embodiments in the present disclosure emphasizes the differences between the embodiments; for the same or similar parts, the embodiments may be referred to each other, and for brevity, they are not repeated one by one.
The methods and apparatuses, electronic devices, and computer-readable storage media of the present application may be implemented in many ways. For example, the methods and apparatuses, electronic devices, and computer-readable storage media of the present application may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the steps of the methods is merely for illustration, and the steps of the methods of the present application are not limited to the order specifically described above, unless otherwise specifically stated. In addition, in some embodiments, the present application may also be implemented as programs recorded in a recording medium, and these programs include machine-readable instructions for implementing the methods according to the present application. Thus, the present application also covers a recording medium storing the programs for executing the methods according to the present application.
The description of the present application is given for the purposes of illustration and description, and is not intended to be exhaustive or to limit the present application to the disclosed form. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were selected and described in order to better explain the principles and practical applications of the present application, and to enable those skilled in the art to understand the present application and thereby design various embodiments with various modifications suited to particular uses.
Claims (10)
1. A depth context prediction method for human body key points, characterized by comprising:
obtaining an image to be processed;
supplying the image to be processed to a neural network, and performing, via the neural network, depth context prediction processing of human body key points, to obtain a depth context of the human body key points;
wherein the depth context of the human body key points is used to indicate the relative depth position relationship between human body key points.
2. The method according to claim 1, characterized in that the obtaining an image to be processed comprises:
obtaining the image to be processed and feature maps of at least two human body key points of the image to be processed;
and the supplying the image to be processed to a neural network comprises:
supplying the image to be processed and the feature maps of the human body key points to the neural network.
3. The method according to claim 2, characterized in that the feature map of a human body key point comprises: a heat map of the human body key point.
4. The method according to any one of claims 2 to 3, characterized in that the performing, via the neural network, depth context prediction processing of human body key points comprises:
forming, via the neural network, feature values of at least two human body key points according to the image to be processed and the feature maps of the human body key points, obtaining differences between the feature values, and forming the depth context of the human body key points based on the differences.
5. The method according to any one of claims 1 to 4, characterized in that the depth context of the human body key points comprises:
information characterizing that one human body key point is in front of or behind another human body key point.
6. A training method for a neural network, characterized by comprising:
obtaining an image sample;
supplying the image sample to a neural network to be trained, and performing, via the neural network to be trained, depth context prediction processing of human body key points, to obtain a depth context of the human body key points;
supervising the depth context of the human body key points by using depth context annotation information of the human body key points of the image sample, so as to perform supervised learning on the neural network to be trained.
7. A depth context prediction apparatus for human body key points, characterized by comprising:
a first obtaining module, configured to obtain an image to be processed;
a first depth context module including a neural network, configured to supply the image to be processed to the neural network, and to perform, via the neural network, depth context prediction processing of human body key points, to obtain a depth context of the human body key points;
wherein the depth context of the human body key points is used to indicate the relative depth position relationship between human body key points.
8. A training apparatus for a neural network, characterized by comprising:
a second obtaining module, configured to obtain an image sample;
a second depth context module including a neural network to be trained, configured to supply the image sample to the neural network to be trained, and to perform, via the neural network to be trained, depth context prediction processing of human body key points, to obtain a depth context of the human body key points;
a supervision module, configured to supervise the depth context of the human body key points by using depth context annotation information of the human body key points of the image sample, so as to perform supervised learning on the neural network to be trained.
9. An electronic device, comprising:
a memory, configured to store a computer program;
a processor, configured to execute the computer program stored in the memory, wherein when the computer program is executed, the method according to any one of claims 1 to 6 is implemented.
10. A computer-readable storage medium, on which a computer program is stored, wherein when the computer program is executed by a processor, the method according to any one of claims 1 to 6 is implemented.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810395949.4A CN108830139A (en) | 2018-04-27 | 2018-04-27 | Depth context prediction technique, device, medium and the equipment of human body key point |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201810395949.4A CN108830139A (en) | 2018-04-27 | 2018-04-27 | Depth context prediction technique, device, medium and the equipment of human body key point |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN108830139A true CN108830139A (en) | 2018-11-16 |
Family
ID=64154905
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201810395949.4A Pending CN108830139A (en) | 2018-04-27 | 2018-04-27 | Depth context prediction technique, device, medium and the equipment of human body key point |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN108830139A (en) |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111079695A (en) * | 2019-12-30 | 2020-04-28 | 北京华宇信息技术有限公司 | Human body key point detection and self-learning method and device |
| CN111368594A (en) * | 2018-12-26 | 2020-07-03 | 中国电信股份有限公司 | Method and device for detecting key points |
| CN112036516A (en) * | 2020-11-04 | 2020-12-04 | 北京沃东天骏信息技术有限公司 | Image processing method and device, electronic equipment and storage medium |
| US20230298204A1 (en) * | 2020-06-26 | 2023-09-21 | Intel Corporation | Apparatus and methods for three-dimensional pose estimation |
| US12307802B2 (en) | 2019-12-20 | 2025-05-20 | Intel Corporation | Light weight multi-branch and multi-scale person re-identification |
| CN120726641A (en) * | 2025-08-25 | 2025-09-30 | 北京金隅天坛家具股份有限公司 | Intelligent recognition method and device for floor plan based on image recognition |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20170220904A1 (en) * | 2015-04-02 | 2017-08-03 | Tencent Technology (Shenzhen) Company Limited | Training method and apparatus for convolutional neural network model |
| CN107886069A (en) * | 2017-11-10 | 2018-04-06 | 东北大学 | A kind of multiple target human body 2D gesture real-time detection systems and detection method |
- 2018-04-27 CN CN201810395949.4A patent/CN108830139A/en active Pending
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20170220904A1 (en) * | 2015-04-02 | 2017-08-03 | Tencent Technology (Shenzhen) Company Limited | Training method and apparatus for convolutional neural network model |
| CN107886069A (en) * | 2017-11-10 | 2018-04-06 | 东北大学 | A kind of multiple target human body 2D gesture real-time detection systems and detection method |
Non-Patent Citations (1)
| Title |
|---|
| FRANZISKA MUELLER et al.: "GANerated Hands for Real-Time 3D Hand Tracking from Monocular RGB", arXiv e-prints * |
Cited By (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111368594A (en) * | 2018-12-26 | 2020-07-03 | 中国电信股份有限公司 | Method and device for detecting key points |
| CN111368594B (en) * | 2018-12-26 | 2023-07-18 | 中国电信股份有限公司 | Method and device for detecting key points |
| US12307802B2 (en) | 2019-12-20 | 2025-05-20 | Intel Corporation | Light weight multi-branch and multi-scale person re-identification |
| CN111079695A (en) * | 2019-12-30 | 2020-04-28 | 北京华宇信息技术有限公司 | Human body key point detection and self-learning method and device |
| CN111079695B (en) * | 2019-12-30 | 2021-06-01 | 北京华宇信息技术有限公司 | Human body key point detection and self-learning method and device |
| US20230298204A1 (en) * | 2020-06-26 | 2023-09-21 | Intel Corporation | Apparatus and methods for three-dimensional pose estimation |
| US12299927B2 (en) * | 2020-06-26 | 2025-05-13 | Intel Corporation | Apparatus and methods for three-dimensional pose estimation |
| CN112036516A (en) * | 2020-11-04 | 2020-12-04 | 北京沃东天骏信息技术有限公司 | Image processing method and device, electronic equipment and storage medium |
| CN120726641A (en) * | 2025-08-25 | 2025-09-30 | 北京金隅天坛家具股份有限公司 | Intelligent recognition method and device for floor plan based on image recognition |
| CN120726641B (en) * | 2025-08-25 | 2025-11-18 | 北京金隅天坛家具股份有限公司 | Intelligent house type graph recognition method and device based on image recognition |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN108460338B (en) | Human body posture estimation method and apparatus, electronic device, storage medium, and program | |
| CN108830139A (en) | Depth context prediction technique, device, medium and the equipment of human body key point | |
| Yang et al. | Deep plastic surgery: Robust and controllable image editing with human-drawn sketches | |
| CN108960036A (en) | 3 D human body attitude prediction method, apparatus, medium and equipment | |
| CN109522942B (en) | An image classification method, device, terminal device and storage medium | |
| Jabri et al. | Revisiting visual question answering baselines | |
| CN115735227A (en) | Inverting Neural Radiation Fields for Pose Estimation | |
| CN108509915A (en) | The generation method and device of human face recognition model | |
| CN108898185A (en) | Method and apparatus for generating image recognition model | |
| CN113762237B (en) | Text image processing method, device, equipment and storage medium | |
| CN106485773B (en) | A kind of method and apparatus for generating animation data | |
| CN109902548A (en) | A kind of object properties recognition methods, calculates equipment and system at device | |
| CN108229496A (en) | The detection method and device of dress ornament key point, electronic equipment, storage medium and program | |
| CN114005169B (en) | Face key point detection method and device, electronic equipment and storage medium | |
| CN107679466A (en) | Information output method and device | |
| CN115066687A (en) | Radioactivity data generation | |
| CN109359517A (en) | Image recognition method and device, electronic device, storage medium, program product | |
| CN109165562A (en) | Training method, crosswise joint method, apparatus, equipment and the medium of neural network | |
| CN113569852A (en) | Training method and device of semantic segmentation model, electronic equipment and storage medium | |
| CN109241988A (en) | Feature extracting method and device, electronic equipment, storage medium, program product | |
| Dave et al. | Simulation of analytical chemistry experiments on augmented reality platform | |
| CN108229680A (en) | Nerve network system, remote sensing images recognition methods, device, equipment and medium | |
| CN110910478B (en) | GIF map generation method and device, electronic equipment and storage medium | |
| CN108154153A (en) | Scene analysis method and system, electronic equipment | |
| CN115222845A (en) | Method and device for generating style font picture, electronic equipment and medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20181116 |
| RJ01 | Rejection of invention patent application after publication | |