
CN111079616A - A Neural Network-Based Single Person Motion Pose Correction Method - Google Patents

A Neural Network-Based Single Person Motion Pose Correction Method

Info

Publication number
CN111079616A
Authority
CN
China
Prior art keywords
joint point
standard
picture
human body
spatial domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911258388.4A
Other languages
Chinese (zh)
Other versions
CN111079616B (en)
Inventor
谢雪梅
高旭
孔龙飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN201911258388.4A
Publication of CN111079616A
Application granted
Publication of CN111079616B
Legal status: Active
Anticipated expiration

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract



The invention discloses a single-person movement posture correction method based on a neural network, which mainly solves the problem that the exercise guidance physical education teachers can currently give students is low in accuracy and efficiency. The implementation scheme is: download an image data set containing human body joint points together with its corresponding annotation files, and construct a training data set; build a human body joint point detection network based on spatial domain conversion and train it with the training data set; collect standard motion and ordinary motion pictures, input them respectively into the trained network to obtain their joint point coordinates, form a standard motion data set and an ordinary motion data set, and match them to obtain a standard matching picture for each ordinary motion picture; calculate the Euclidean distance between each pair of corresponding joint points in the ordinary motion picture and its standard matching picture, and count the joint points whose distance exceeds the scoring threshold; these are the action points that need correction. The invention improves the accuracy and training efficiency of movement posture correction and can be used for single-person movement posture correction.


Description

Single-person movement posture correction method based on neural network
Technical Field
The invention belongs to the technical field of image recognition and computer vision, and mainly relates to a single-person movement posture correction method which can be used for guiding the training of ordinary people.
Background
With rapid modern social and economic development, many people neglect the importance of exercise to health. To address this, the state has added a series of sports events, such as solid-ball throwing and long-distance running, to the middle school entrance examination to urge students to exercise. However, because the country's population base is large, the gap between the number of physical education teachers and the number of students is wide, and students cannot be guided promptly and effectively. There is therefore an urgent need for an intelligent method of correcting the exercise posture of ordinary people.
At present, correction of single-person motion posture is mainly guided by physical education teachers, who evaluate and correct students' actions based on their own experience in the sport. This mode of guidance depends heavily on the instructor's level in the sport, and when the instructor's experience is biased, training often has the opposite effect. In addition, because China's population base is huge and the number of physical education teachers is limited, not every student can be fully guided, which is unfair to the students who receive no guidance.
Disclosure of Invention
Aiming at the defects of existing motion posture correction methods, the invention provides a single-person motion posture correction method based on a neural network, so as to improve the accuracy and efficiency of motion posture correction.
The idea of the invention is to build a human body joint point detection network based on spatial domain conversion, construct a standard motion data set and an ordinary motion data set, set the scoring threshold to 50, and determine the action points needing correction. The method comprises the following implementation steps:
(1) collecting a training data set:
(1a) downloading an image data set containing human body joint points and storing the image data set into a training image folder A;
(1b) downloading a label file corresponding to the data set, and storing the label file into a training label folder B;
(1c) putting the image folder and the label folder into the same folder to form a training data set;
(2) constructing a human body joint point detection network based on spatial domain conversion, which is formed by cascading an image spatial domain conversion sub-network and a human body joint point detection sub-network, wherein:
the image space domain conversion sub-network consists of 3 convolutional layers in sequence;
the human body joint point detection sub-network comprises 9 convolutional layers and 4 deconvolution layers, the 4 deconvolution layers being connected in sequence between 8 sequentially cascaded convolutional layers and a final convolutional layer;
(3) training a human body joint point detection network based on spatial domain conversion:
(3a) reading a training data set image from a training image folder A, inputting the image into the human body joint point detection network based on spatial domain conversion constructed in the step (2), generating a spatial conversion image through an image spatial conversion sub-network in the human body joint point detection network, and outputting a predicted coordinate value of a human body joint point through the human body joint point detection sub-network by the spatial conversion image;
(3b) reading the labeled coordinate values corresponding to the images of the training data set from the training labeled folder B, calculating the loss value L of the human body joint point network, and training the network constructed in the step (2) by using the loss value and adopting a random gradient descent algorithm to obtain a trained human body joint point detection network based on spatial domain conversion;
(4) constructing a standard motion data set:
(4a) shooting a standard action video demonstrated by a standard athlete;
(4b) collecting each frame of the shot standard action video into a picture, and storing the picture into a standard picture folder C;
(4c) respectively inputting the collected pictures into a trained human body joint point detection network based on spatial domain conversion to obtain coordinate information of each human body joint point, and storing the obtained coordinate information into a standard labeling folder D;
(5) constructing an ordinary motion data set:
(5a) shooting a non-standard action video demonstrated by an ordinary athlete;
(5b) collecting each frame of the shot non-standard action video into an image, and storing the image into a test image folder E;
(5c) respectively inputting the collected pictures into a trained human body joint point detection network based on spatial domain conversion to obtain coordinate information of each human body joint point, and storing the obtained coordinate information into a test labeling folder F;
(6) setting the scoring threshold to 50 and determining the action points needing correction:
(6a) reading coordinate information corresponding to the test picture from the test labeling folder F;
(6b) reading coordinate information corresponding to the standard picture from the standard labeling folder D;
(6c) sequentially calculating the Euclidean distance sum of the coordinates of the joint points of the test picture and the standard picture, and taking the standard picture with the minimum Euclidean distance sum as a standard matching picture of the test picture;
(6d) calculating the Euclidean distance of each joint point between the test picture and its standard matching picture, and counting the joint points whose distance exceeds the set scoring threshold; these are the joint points to be corrected.
Compared with the prior art, the invention has the following advantages:
1. The recognition accuracy is high.
Existing posture correction methods depend heavily on a teacher's exercise experience and level; when the teacher's experience is biased or the teacher is not proficient in a given sport, students' exercise and training are often misled. The invention builds a human body joint point detection network based on spatial domain conversion, collects standard motion videos, and defines standard actions strictly, so the accuracy of guidance is greatly improved.
2. The training efficiency is high.
In existing posture correction practice, because teachers are far fewer than students, students often cannot be guided whenever they need it. By establishing a universal motion posture detection method, the invention enables students to receive guidance at any time, greatly improving training efficiency.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention.
FIG. 2 is a picture of the standard action collected in the present invention.
FIG. 3 is a picture of the test action collected in the present invention.
Detailed Description
Embodiments of the present invention are further described below with reference to the accompanying drawings.
Referring to fig. 1, the specific implementation steps for this example are as follows.
Step 1, a training data set is collected.
(1.1) downloading an image data set containing human body joint points from an open website and storing the image data set into a training image folder A;
(1.2) downloading a label file corresponding to the data set from the public website, and storing the label file into a training label folder B;
the label file contains coordinate information of 18 joint points in the human body, and the 18 joint points are respectively as follows: nose, neck, right shoulder, right elbow, right wrist, left shoulder, left elbow, left wrist, right hip, right knee, right ankle, left hip, left knee, left ankle, right eye, left eye, right ear, and left ear;
(1.3) putting the image folder and the label folder into the same folder to form the training data set.
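By way of illustration only, step 1's folder layout can be read in Python as follows. This is a minimal sketch under stated assumptions: the image and annotation file formats, and the one-JSON-annotation-file-per-image naming, are not fixed by the patent.

```python
import json
import os
from PIL import Image  # Pillow, an assumed image-loading library

def load_training_set(root):
    """Pair each image in folder A with its annotation in folder B (steps 1.1-1.3).
    Assumes one JSON file per image holding 18 [x, y] joint coordinates."""
    image_dir, label_dir = os.path.join(root, "A"), os.path.join(root, "B")
    samples = []
    for name in sorted(os.listdir(image_dir)):
        stem, _ = os.path.splitext(name)
        image = Image.open(os.path.join(image_dir, name)).convert("RGB")
        with open(os.path.join(label_dir, stem + ".json")) as f:
            joints = json.load(f)  # list of 18 [x, y] pairs
        samples.append((image, joints))
    return samples
```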
Step 2, building a human body joint point detection network based on spatial domain conversion.
(2.1) constructing an image spatial domain conversion sub-network:
the sub-network is in turn composed of 3 convolutional layers, of which:
the convolution kernel size of the 1 st convolution layer is 1 multiplied by 1, the number of convolution kernels is 3, and the step length is 1;
the convolution kernel size of the 2 nd convolution layer is 1 multiplied by 1, the number of convolution kernels is 64, and the step length is 1;
the convolution kernel size of the 3 rd convolution layer is 1 × 1, the number of convolution kernels is 3, and the step size is 1.
(2.2) constructing a human joint point detection sub-network:
the sub-network comprises 9 convolutional layers and 4 anti-convolutional layers, and the structural relationship is as follows: first convolution layer → second convolution layer → third convolution layer → fourth convolution layer → fifth convolution layer → sixth convolution layer → seventh convolution layer → eighth convolution layer → first reverse convolution layer → second reverse convolution layer → third reverse convolution layer → fourth reverse convolution layer → ninth convolution layer, wherein:
the convolution kernel size of the first convolutional layer is 3 × 3, the number of convolution kernels is 128, and the step size is 1;
the convolution kernel size of the second convolutional layer is 1 × 1, the number of convolution kernels is 256, and the step size is 2;
the convolution kernel size of the third convolutional layer is 3 × 3, the number of convolution kernels is 256, and the step size is 1;
the convolution kernel size of the fourth convolutional layer is 1 × 1, the number of convolution kernels is 256, and the step size is 2;
the convolution kernel size of the fifth convolutional layer is 3 × 3, the number of convolution kernels is 256, and the step size is 1;
the convolution kernel size of the sixth convolutional layer is 1 × 1, the number of convolution kernels is 256, and the step size is 2;
the convolution kernel size of the seventh convolutional layer is 3 × 3, the number of convolution kernels is 256, and the step size is 1;
the convolution kernel size of the eighth convolutional layer is 1 × 1, the number of convolution kernels is 256, and the step size is 1;
the convolution kernel size of the first deconvolution layer is 3 × 3, the number of convolution kernels is 256, and the step size is 2;
the convolution kernel size of the second deconvolution layer is 3 × 3, the number of convolution kernels is 128, and the step size is 2;
the convolution kernel size of the third deconvolution layer is 3 × 3, the number of convolution kernels is 128, and the step size is 2;
the convolution kernel size of the fourth deconvolution layer is 3 × 3, the number of convolution kernels is 128, and the step size is 1;
the convolution kernel size of the ninth convolutional layer is 1 × 1, the number of convolution kernels is 18, and the step size is 1.
(2.3) cascading the built image spatial domain conversion sub-network with the human body joint point detection sub-network to form the human body joint point detection network based on spatial domain conversion.
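By way of illustration only, the two sub-networks and their cascade can be sketched in PyTorch as follows. This is a minimal sketch under stated assumptions: the patent fixes only kernel size, kernel count, and stride, so the padding and output_padding values, the absence of activation functions, and the class names here are assumptions; how coordinates are decoded from the 18 output maps is addressed in the training sketch in step 3.

```python
import torch
import torch.nn as nn

class SpatialTransformSubnet(nn.Sequential):
    """Image spatial domain conversion sub-network: three 1 × 1 conv layers (step 2.1)."""
    def __init__(self):
        super().__init__(
            nn.Conv2d(3, 3, kernel_size=1, stride=1),   # 1st layer: 3 kernels
            nn.Conv2d(3, 64, kernel_size=1, stride=1),  # 2nd layer: 64 kernels
            nn.Conv2d(64, 3, kernel_size=1, stride=1),  # 3rd layer: 3 kernels
        )

class JointDetectionSubnet(nn.Sequential):
    """Joint point detection sub-network: 8 cascaded convs, 4 deconvs, and a final
    1 × 1 conv producing one map per joint point (step 2.2). Padding/output_padding
    are assumptions chosen so stride-2 layers exactly halve or double feature size."""
    def __init__(self):
        super().__init__(
            nn.Conv2d(3, 128, 3, stride=1, padding=1),    # first conv
            nn.Conv2d(128, 256, 1, stride=2),             # second conv
            nn.Conv2d(256, 256, 3, stride=1, padding=1),  # third conv
            nn.Conv2d(256, 256, 1, stride=2),             # fourth conv
            nn.Conv2d(256, 256, 3, stride=1, padding=1),  # fifth conv
            nn.Conv2d(256, 256, 1, stride=2),             # sixth conv
            nn.Conv2d(256, 256, 3, stride=1, padding=1),  # seventh conv
            nn.Conv2d(256, 256, 1, stride=1),             # eighth conv
            nn.ConvTranspose2d(256, 256, 3, stride=2, padding=1, output_padding=1),
            nn.ConvTranspose2d(256, 128, 3, stride=2, padding=1, output_padding=1),
            nn.ConvTranspose2d(128, 128, 3, stride=2, padding=1, output_padding=1),
            nn.ConvTranspose2d(128, 128, 3, stride=1, padding=1),
            nn.Conv2d(128, 18, 1, stride=1),              # ninth conv: 18 joint maps
        )

class JointDetectionNet(nn.Module):
    """Cascade of the two sub-networks (step 2.3)."""
    def __init__(self):
        super().__init__()
        self.transform = SpatialTransformSubnet()
        self.detect = JointDetectionSubnet()

    def forward(self, x):  # x: (N, 3, H, W), H and W divisible by 8
        return self.detect(self.transform(x))  # (N, 18, H, W) joint point maps
```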
Step 3, training the human body joint point detection network based on spatial domain conversion.
(3.1) reading a training data set image from the training image folder A, inputting the image into the human body joint point detection network based on spatial domain conversion constructed in the step (2), generating a spatial conversion image through an image spatial conversion sub-network in the human body joint point detection network, and outputting a predicted coordinate value of the human body joint point through the human body joint point detection sub-network by the spatial conversion image;
(3.2) reading the labeled coordinate values corresponding to the images of the training data set from the training labeled folder B, and calculating the loss value L of the human body joint point detection network based on spatial domain conversion:
L = Σ_{i=1..18} [(x_i' − x_i)² + (y_i' − y_i)²]
where i denotes the serial number of a human body joint point, x_i' and y_i' denote the labeled abscissa and ordinate of the joint point with that serial number, and x_i and y_i denote the abscissa and ordinate of the predicted coordinate output by the human body joint point detection network based on spatial domain conversion;
(3.3) using the loss value L of the human body joint point detection network based on spatial domain conversion, training the network constructed in step (2) with the stochastic gradient descent algorithm:
(3.3.1) taking the derivative of the loss value of the human body joint point detection network based on the spatial domain conversion:
F = ∂L/∂θ

where F denotes the derivative of the loss value L of the human body joint point detection network based on spatial domain conversion with respect to its network parameter θ, and θ denotes the parameters of the network;
(3.3.2) calculating an updated value of the human body joint point detection network parameter based on the spatial domain conversion:
θ_2 = θ − αF

where θ_2 denotes the updated value of the parameters of the human body joint point detection network based on spatial domain conversion, and α is the learning rate of the network, taken as 0.00025;
(3.3.3) replacing the parameter θ of the original network with the updated value θ_2 of the parameters of the human body joint point detection network based on spatial domain conversion;
(3.3.4) iterating steps (3.3.1) to (3.3.3) 150,000 times to obtain the trained human body joint point detection network based on spatial domain conversion.
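For illustration, a minimal training-loop sketch under the hyper-parameters stated above (learning rate α = 0.00025, 150,000 iterations) follows; it assumes the `JointDetectionNet` sketch from step 2. The patent does not say how the predicted coordinates are read from the 18 output maps, so the differentiable soft-argmax decoding here is an assumption, not the patented readout.

```python
import torch

def soft_argmax(heatmaps):
    """Decode (N, 18, H, W) joint maps into (N, 18, 2) (x, y) coordinates in a
    differentiable way — an assumed decoding, not specified in the patent."""
    n, c, h, w = heatmaps.shape
    probs = heatmaps.flatten(2).softmax(dim=2).view(n, c, h, w)
    xs = torch.arange(w, dtype=torch.float32).view(1, 1, 1, w)
    ys = torch.arange(h, dtype=torch.float32).view(1, 1, h, 1)
    return torch.stack(((probs * xs).sum(dim=(2, 3)),
                        (probs * ys).sum(dim=(2, 3))), dim=2)

def train(net, samples, iterations=150000, lr=0.00025):
    """Stochastic gradient descent per steps 3.2-3.3. `samples` is an assumed
    list of (image, label_xy) pairs: (3, H, W) tensors and (18, 2) coordinates."""
    optimizer = torch.optim.SGD(net.parameters(), lr=lr)
    for step in range(iterations):
        image, label_xy = samples[step % len(samples)]
        pred_xy = soft_argmax(net(image.unsqueeze(0))).squeeze(0)
        loss = ((pred_xy - label_xy) ** 2).sum()  # loss value L
        optimizer.zero_grad()
        loss.backward()   # F = ∂L/∂θ
        optimizer.step()  # θ_2 = θ − αF
    return net
```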
Step 4, constructing a standard motion data set.
(4.1) shooting a standard motion video demonstrated by a standard athlete, wherein the shooting equipment is Canon EOS 5D Mark IV, and the video frame rate is 60 frames/second;
(4.2) collecting each frame of the shot standard motion video into a picture as shown in figure 2, and storing the picture into a standard picture folder C;
(4.3) respectively inputting the captured pictures into the trained human body joint point detection network based on spatial domain conversion to obtain the coordinate information of each human body joint point, and storing the obtained coordinate information into the standard labeling folder D.
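As an illustration of steps 4.1–4.2 (and likewise 5.1–5.2 below), each frame of the shot video can be captured into a picture with OpenCV; the file naming and JPEG format in this sketch are assumptions.

```python
import os
import cv2  # OpenCV

def video_to_frames(video_path, out_dir):
    """Capture every frame of a motion video as a picture in out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    count = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of video reached
            break
        cv2.imwrite(os.path.join(out_dir, f"{count:06d}.jpg"), frame)
        count += 1
    cap.release()
    return count

# Standard-action frames go to folder C; non-standard ones to folder E (step 5).
video_to_frames("standard_action.mp4", "C")
```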
Step 5, constructing an ordinary motion data set.
(5.1) shooting a non-standard action video demonstrated by an ordinary athlete, wherein the shooting equipment is a Canon EOS 5D Mark IV and the video frame rate is 60 frames/second;
(5.2) collecting each frame of the shot non-standard motion video into a picture as shown in figure 3, and storing the picture into a test picture folder E;
(5.3) respectively inputting the captured pictures into the trained human body joint point detection network based on spatial domain conversion to obtain the coordinate information of each human body joint point, and storing the obtained coordinate information into the test labeling folder F.
Step 6, determining the action points needing correction.
(6.1) reading the coordinate information corresponding to the test picture from the test labeling folder F;
(6.2) reading coordinate information corresponding to the standard picture from the standard labeling folder D;
(6.3) sequentially calculating the sum of Euclidean distances between the coordinates of the test picture and the coordinates of the joint points of the standard picture:
P = Σ_{i=1..18} [(a_i' − a_i)² + (b_i' − b_i)²]

where P denotes the sum of the Euclidean distances between the joint point coordinates of the test picture and those of the standard picture, i denotes the serial number of a human body joint point, a_i' and b_i' denote the abscissa and ordinate of the joint point with that serial number in the test picture, and a_i and b_i denote the abscissa and ordinate of the joint point with that serial number in the standard picture.
(6.4) from the computed sums of Euclidean distances between the joint point coordinates of the test picture and those of the standard pictures, taking the standard picture with the minimum sum as the standard matching picture of the test picture;
(6.5) calculating the Euclidean distance between the test picture and each joint point in the standard matching picture:
Q_j = (c_j' − c_j)² + (d_j' − d_j)², j = 1, 2, ..., 18

where Q_j denotes the Euclidean distance between the coordinates of the j-th joint point of the test picture and of the standard matching picture, j denotes the serial number of a human body joint point, c_j' and d_j' denote the abscissa and ordinate of the j-th joint point in the test picture, and c_j and d_j denote the abscissa and ordinate of the j-th joint point in the standard matching picture.
(6.6) setting the scoring threshold to 50, and counting the joint points whose Euclidean distance between the test picture and the standard matching picture exceeds the threshold; these are the joint points to be corrected.
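Putting steps 6.3–6.6 together, a minimal sketch of the matching and scoring stage might look as follows; it assumes the joint coordinates have already been loaded from folders D and F as NumPy arrays, and follows the P and Q_j formulas above (sums of squared coordinate differences).

```python
import numpy as np

SCORE_THRESHOLD = 50  # scoring threshold set in step 6.6

JOINT_NAMES = ["nose", "neck", "right shoulder", "right elbow", "right wrist",
               "left shoulder", "left elbow", "left wrist", "right hip",
               "right knee", "right ankle", "left hip", "left knee",
               "left ankle", "right eye", "left eye", "right ear", "left ear"]

def match_and_score(test_joints, standard_sets):
    """test_joints: (18, 2) array for one test picture; standard_sets: list of
    (18, 2) arrays, one per standard picture. Returns the index of the best
    standard matching picture and the names of the joints to correct."""
    # Steps 6.3-6.4: pick the standard picture minimizing the summed distance P.
    sums = [np.sum((test_joints - s) ** 2) for s in standard_sets]
    best = int(np.argmin(sums))
    # Steps 6.5-6.6: per-joint distance Q_j against the matched picture.
    q = np.sum((test_joints - standard_sets[best]) ** 2, axis=1)
    to_correct = [JOINT_NAMES[j] for j in range(18) if q[j] > SCORE_THRESHOLD]
    return best, to_correct
```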
The foregoing description is only an example of the present invention and is not intended to limit the invention, so that it will be apparent to those skilled in the art that various changes and modifications in form and detail may be made therein without departing from the spirit and scope of the invention.

Claims (8)

1. A single-person motion posture correction method based on a neural network, characterized by comprising:
(1) collecting a training data set:
(1a) downloading an image data set containing human body joint points and storing it into a training image folder A;
(1b) downloading the label files corresponding to the data set and storing them into a training label folder B;
(1c) putting the image folder and the label folder into the same folder to form the training data set;
(2) building a human body joint point detection network based on spatial domain conversion, formed by cascading an image spatial domain conversion sub-network and a human body joint point detection sub-network, wherein:
the image spatial domain conversion sub-network is composed of 3 convolutional layers in sequence;
the human body joint point detection sub-network comprises 9 convolutional layers and 4 deconvolution layers, the 4 deconvolution layers being connected in sequence between 8 sequentially cascaded convolutional layers and a final convolutional layer;
(3) training the human body joint point detection network based on spatial domain conversion:
(3a) reading a training data set image from the training image folder A and inputting it into the network built in (2); the image spatial conversion sub-network generates a spatially converted image, which then passes through the human body joint point detection sub-network to output the predicted coordinate values of the human body joint points;
(3b) reading the labeled coordinate values corresponding to the training data set image from the training label folder B, calculating the loss value L of the human body joint point network, and using this loss value to train the network built in (2) with the stochastic gradient descent algorithm, obtaining the trained human body joint point detection network based on spatial domain conversion;
(4) constructing a standard motion data set:
(4a) shooting a standard action video demonstrated by a standard athlete;
(4b) capturing each frame of the shot standard action video as a picture and storing it into a standard picture folder C;
(4c) respectively inputting the captured pictures into the trained human body joint point detection network based on spatial domain conversion to obtain the coordinate information of each human body joint point, and storing the obtained coordinate information into a standard label folder D;
(5) constructing an ordinary motion data set:
(5a) shooting a non-standard action video demonstrated by an ordinary athlete;
(5b) capturing each frame of the shot non-standard action video as a picture and storing it into a test picture folder E;
(5c) respectively inputting the captured pictures into the trained human body joint point detection network based on spatial domain conversion to obtain the coordinate information of each human body joint point, and storing the obtained coordinate information into a test label folder F;
(6) setting the scoring threshold to 50 and determining the action points needing correction:
(6a) reading the coordinate information corresponding to the test picture from the test label folder F;
(6b) reading the coordinate information corresponding to the standard pictures from the standard label folder D;
(6c) calculating in turn the sum of the Euclidean distances between the joint point coordinates of the test picture and those of each standard picture, and taking the standard picture with the minimum sum as the standard matching picture of the test picture;
(6d) calculating the Euclidean distance of each joint point between the test picture and its standard matching picture, and counting the joint points whose distance exceeds the set scoring threshold; these are the action points to be corrected.

2. The method according to claim 1, wherein the label files downloaded in (1b) contain human body pictures and the position coordinates of 18 joint points of the human body in each picture, the 18 joint points being: nose, neck, right shoulder, right elbow, right wrist, left shoulder, left elbow, left wrist, right hip, right knee, right ankle, left hip, left knee, left ankle, right eye, left eye, right ear and left ear.

3. The method according to claim 1, wherein the 3 convolutional layers of the spatial domain conversion sub-network in (2) have the following parameters:
the convolution kernel size of the 1st convolutional layer is 1 × 1, the number of convolution kernels is 3, and the step size is 1;
the convolution kernel size of the 2nd convolutional layer is 1 × 1, the number of convolution kernels is 64, and the step size is 1;
the convolution kernel size of the 3rd convolutional layer is 1 × 1, the number of convolution kernels is 3, and the step size is 1.

4. The method according to claim 1, wherein the human body joint point detection sub-network built in (2) has the structure: first convolutional layer → second convolutional layer → third convolutional layer → fourth convolutional layer → fifth convolutional layer → sixth convolutional layer → seventh convolutional layer → eighth convolutional layer → first deconvolution layer → second deconvolution layer → third deconvolution layer → fourth deconvolution layer → ninth convolutional layer, with the following layer parameters:
the convolution kernel size of the first convolutional layer is 3 × 3, the number of convolution kernels is 128, and the step size is 1;
the convolution kernel size of the second convolutional layer is 1 × 1, the number of convolution kernels is 256, and the step size is 2;
the convolution kernel size of the third convolutional layer is 3 × 3, the number of convolution kernels is 256, and the step size is 1;
the convolution kernel size of the fourth convolutional layer is 1 × 1, the number of convolution kernels is 256, and the step size is 2;
the convolution kernel size of the fifth convolutional layer is 3 × 3, the number of convolution kernels is 256, and the step size is 1;
the convolution kernel size of the sixth convolutional layer is 1 × 1, the number of convolution kernels is 256, and the step size is 2;
the convolution kernel size of the seventh convolutional layer is 3 × 3, the number of convolution kernels is 256, and the step size is 1;
the convolution kernel size of the eighth convolutional layer is 1 × 1, the number of convolution kernels is 256, and the step size is 1;
the convolution kernel size of the first deconvolution layer is 3 × 3, the number of convolution kernels is 256, and the step size is 2;
the convolution kernel size of the second deconvolution layer is 3 × 3, the number of convolution kernels is 128, and the step size is 2;
the convolution kernel size of the third deconvolution layer is 3 × 3, the number of convolution kernels is 128, and the step size is 2;
the convolution kernel size of the fourth deconvolution layer is 3 × 3, the number of convolution kernels is 128, and the step size is 1;
the convolution kernel size of the ninth convolutional layer is 1 × 1, the number of convolution kernels is 18, and the step size is 1.

5. The method according to claim 1, wherein the loss value L of the human body joint point detection network calculated in (3b) is computed as:

L = Σ_{i=1..18} [(x_i' − x_i)² + (y_i' − y_i)²]

where i denotes the serial number of a human body joint point, x_i' and y_i' denote the labeled abscissa and ordinate of the joint point with that serial number, and x_i and y_i denote the abscissa and ordinate of the predicted coordinate output by the human body joint point detection network.

6. The method according to claim 1, wherein the training of the human body joint point detection network based on spatial domain conversion with the stochastic gradient descent algorithm in (3b) is implemented as follows:
(3b1) taking the derivative of the loss value of the network:

F = ∂L/∂θ

where F denotes the derivative of the loss value L of the human body joint point detection network based on spatial domain conversion with respect to its network parameter θ, and θ denotes the parameters of the network;
(3b2) calculating the updated value of the network parameters:

θ_2 = θ − αF

where θ_2 denotes the updated value of the network parameters, and α is the learning rate of the network, taken as 0.00025;
(3b3) replacing the parameter θ of the original network with the updated value θ_2;
(3b4) iterating steps (3b1) to (3b3) 150,000 times to obtain the trained human body joint point detection network based on spatial domain conversion.

7. The method according to claim 1, wherein the sum of the Euclidean distances between the joint point coordinates of the test picture and those of the standard picture calculated in (6c) is computed as:

P = Σ_{i=1..18} [(a_i' − a_i)² + (b_i' − b_i)²]

where P denotes the sum of the Euclidean distances between the joint point coordinates of the test picture and those of the standard picture, i denotes the serial number of a human body joint point, a_i' and b_i' denote the abscissa and ordinate of the joint point with that serial number in the test picture, and a_i and b_i denote the abscissa and ordinate of the joint point with that serial number in the standard picture.

8. The method according to claim 1, wherein the Euclidean distance of each joint point between the test picture and its standard matching picture calculated in (6d) is computed as:

Q_j = (c_j' − c_j)² + (d_j' − d_j)², j = 1, 2, ..., 18

where Q_j denotes the Euclidean distance between the coordinates of the j-th joint point of the test picture and of the standard matching picture, j denotes the serial number of a human body joint point, c_j' and d_j' denote the abscissa and ordinate of the j-th joint point in the test picture, and c_j and d_j denote the abscissa and ordinate of the j-th joint point in the standard matching picture.
CN201911258388.4A 2019-12-10 2019-12-10 Single-person movement posture correction method based on neural network Active CN111079616B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911258388.4A CN111079616B (en) 2019-12-10 2019-12-10 Single-person movement posture correction method based on neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911258388.4A CN111079616B (en) 2019-12-10 2019-12-10 Single-person movement posture correction method based on neural network

Publications (2)

Publication Number Publication Date
CN111079616A true CN111079616A (en) 2020-04-28
CN111079616B CN111079616B (en) 2022-03-04

Family

ID=70313971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911258388.4A Active CN111079616B (en) 2019-12-10 2019-12-10 Single-person movement posture correction method based on neural network

Country Status (1)

Country Link
CN (1) CN111079616B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118893636A (en) * 2024-10-09 2024-11-05 烟台大学 A robot posture estimation method and system based on convolutional neural network

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108762495A (en) * 2018-05-18 2018-11-06 深圳大学 The virtual reality driving method and virtual reality system captured based on arm action
CN109086754A (en) * 2018-10-11 2018-12-25 天津科技大学 A kind of human posture recognition method based on deep learning
WO2019035586A1 (en) * 2017-08-18 2019-02-21 강다겸 Method and apparatus for providing posture guide
CN110175566A (en) * 2019-05-27 2019-08-27 大连理工大学 Hand posture estimation system and method based on RGBD fusion network
CN110245623A (en) * 2019-06-18 2019-09-17 重庆大学 A real-time human motion posture correction method and system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019035586A1 (en) * 2017-08-18 2019-02-21 강다겸 Method and apparatus for providing posture guide
CN108762495A (en) * 2018-05-18 2018-11-06 深圳大学 The virtual reality driving method and virtual reality system captured based on arm action
CN109086754A (en) * 2018-10-11 2018-12-25 天津科技大学 A kind of human posture recognition method based on deep learning
CN110175566A (en) * 2019-05-27 2019-08-27 大连理工大学 Hand posture estimation system and method based on RGBD fusion network
CN110245623A (en) * 2019-06-18 2019-09-17 重庆大学 A real-time human motion posture correction method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Chen Cong, "Behavior Recognition Based on Human Body Posture Sequence Extraction and Analysis", China Doctoral Dissertations Full-text Database (Basic Sciences) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118893636A (en) * 2024-10-09 2024-11-05 烟台大学 A robot posture estimation method and system based on convolutional neural network

Also Published As

Publication number Publication date
CN111079616B (en) 2022-03-04

Similar Documents

Publication Publication Date Title
CN108734104B (en) Body-building action error correction method and system based on deep learning image recognition
CN110097094B (en) A Few-shot Classification Method Based on Multiple Semantic Fusion Oriented to Character Interaction
CN110428486B (en) Virtual interactive fitness method, electronic device and storage medium
CN107945118A (en) A kind of facial image restorative procedure based on production confrontation network
Zhang et al. Semi-supervised action quality assessment with self-supervised segment feature recovery
CN112464915B (en) Push-up counting method based on human skeleton point detection
CN113705540A (en) Method and system for recognizing and counting non-instrument training actions
CN111507185B (en) A Fall Detection Method Based on Stacked Hollow Convolutional Networks
Zhao et al. 3d pose based feedback for physical exercises
CN113361928A (en) Crowdsourcing task recommendation method based on special-pattern attention network
CN115731608A (en) Physical exercise training method and system based on human body posture estimation
KR20220013347A (en) System for managing and evaluating physical education based on artificial intelligence based user motion recognition
CN115171208A (en) Sit-up posture evaluation method and device, electronic equipment and storage medium
CN115131879A (en) Action evaluation method and device
CN111079616B (en) Single-person movement posture correction method based on neural network
CN117058758B (en) Intelligent sports examination method based on AI technology and related device
CN117333949A (en) Method for identifying limb actions based on video dynamic analysis
CN116824697A (en) Motion recognition method and device and electronic equipment
CN111783697A (en) Wrong question detection and target recommendation system and method based on convolutional neural network
CN108549857A (en) Event detection model training method, device and event detecting method
CN115546893A (en) A cheerleading video evaluation visualization method and system
Bernardo et al. Determining exercise form correctness in real time using human pose estimation
CN119810925A (en) A computer vision-based method for analyzing pickleball player motion
Rozaliev et al. Methods and applications for controlling the correctness of physical exercises performance
CN111563443A (en) Continuous motion action accuracy evaluation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant