Disclosure of Invention
Because the physiques of different students, and therefore their physique test data, differ widely, existing methods are prone to false detection or missed detection. This reduces the accuracy of abnormal data detection, so that accurate physique test data of the students cannot be acquired. To solve this technical problem, the invention aims to provide a physique test data acquisition method and system for college students, and the adopted technical scheme is as follows:
The invention provides a college student physique test data acquisition method, which comprises the following steps:
Acquiring body measurement data of each student in different dimensions, wherein each body measurement data corresponds to a time stamp, and acquiring standard scores of each body measurement data according to deviation of each body measurement data relative to all body measurement data in the dimension of each body measurement data;
Taking any one student as a student to be tested, taking any one body measurement data of the student to be tested as data to be tested, and obtaining the relative deviation of the data to be tested according to the difference of the standard scores between the data to be tested and other body measurement data except the data to be tested in the student to be tested; screening suspicious body measurement data from all body measurement data based on the relative deviation;
According to the difference of the body measurement data of the same student in any two dimensions and the relative deviation of the body measurement data, obtaining the correlation between any two dimensions; taking any one suspicious body measurement data as target data, taking the student corresponding to the target data as a target student, and obtaining real abnormal parameters of the target data according to the difference of the standard scores, and the correlation between dimensions, between the target data and the other body measurement data of the target student, together with the difference of the time stamps and the difference of the relative deviation between the target data and other suspicious body measurement data;
According to the real abnormal parameters of the suspicious body measurement data, adjusting the standard scores of the suspicious body measurement data, and screening abnormal body measurement data from all the suspicious body measurement data;
And optimizing the acquisition of student body measurement data based on the abnormal body measurement data.
Further, the obtaining the relative deviation of the data to be measured according to the standard score difference between the data to be measured and other body measurement data except the data to be measured in the student to be measured includes:
taking the absolute value of the difference value of the standard score between the data to be measured and each other body measurement data except the data to be measured in the student to be measured as a first score difference between the data to be measured and each other body measurement data except the data to be measured in the student to be measured;
and carrying out normalization processing on the average value of the first score difference between the data to be measured and all other body measurement data except the data to be measured in the student to be measured to obtain the relative deviation of the data to be measured.
Further, the obtaining the correlation between any two dimensions according to the difference of the body measurement data of the same student in any two dimensions and the relative deviation of the body measurement data includes:
Respectively taking the arbitrarily selected two dimensions as a first dimension and a second dimension, and respectively normalizing body measurement data in the first dimension and the second dimension to obtain standardized data of each body measurement data;
Taking the absolute value of the difference value of the standardized data of each student in the first dimension and the second dimension as the standardized distance of the body measurement data of each student between the first dimension and the second dimension;
Selecting one dimension from the first dimension and the second dimension as a target dimension, and dividing standardized data of each preset first number of students into a group in the target dimension to serve as a standardized data group;
In each standardized data set, taking a student corresponding to the maximum value of standardized data as a first marked student of each standardized data set, taking a student corresponding to the minimum value of standardized data as a second marked student of each standardized data set, taking a set of the first marked students of all standardized data sets as a first set of target dimensions, and taking a set of the second marked students of all standardized data sets as a second set of target dimensions;
Taking the intersection of the first set between the first dimension and the second dimension as a first intersection; taking the intersection of the second set between the first dimension and the second dimension as a second intersection;
And obtaining the correlation between the first dimension and the second dimension according to the distribution of the standardized distances of all students between the first dimension and the second dimension, the relative deviation of the body measurement data of each student in the first dimension and the second dimension, and the number of the students in the first intersection and the second intersection.
Further, the calculation formula of the correlation between the first dimension and the second dimension is:
wherein the quantities appearing in the formula are: the correlation between the first dimension and the second dimension; the standardized distance of the body measurement data of the i-th student between the first dimension and the second dimension; the average of the standardized distances of all students between the first dimension and the second dimension; the relative deviation of the body measurement data of the i-th student in the first dimension; the relative deviation of the body measurement data of the i-th student in the second dimension; the number of students; the number of students in the first intersection; the number of students in the second intersection; the number of students in the first set of the first dimension; the number of students in the second set of the first dimension; the exponential function with the natural constant e as its base; and a preset parameter taking values in a specified range.
Further, the obtaining the real abnormal parameters of the target data includes:
Taking the absolute value of the difference value of the standard score between the target data and each other body measurement data except the target data in the target student as a second score difference between the target data and each other body measurement data except the target data in the target student;
Normalizing the correlation between the dimension of the target data and the dimension of each other body measurement data except the target data in the target students to obtain weight parameters;
Weighting and summing the second fractional differences by using the weight parameters to obtain initial abnormal parameters of the target data;
Selecting a second preset number of other suspicious body measurement data closest to the timestamp of the target data from the dimension of the target data as reference data of the target data;
And adjusting the initial abnormal parameters according to the difference of the time stamp between the target data and each reference data and the difference of the relative deviation between the target data and each reference data to obtain the real abnormal parameters of the target data.
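The first four of the steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names are hypothetical, the correlations are normalized by dividing by their sum so that the weights add up to 1 (the text only says "normalizing"), and timestamps are treated as plain numbers. The final adjustment by timestamp difference and relative-deviation difference is left out because its formula is not reproduced in this text.

```python
def initial_abnormal_parameter(target_score, other_scores, correlations):
    """Weighted sum of second score differences.

    target_score: standard score of the target data.
    other_scores: standard scores of the target student's other data.
    correlations: correlation between the target dimension and the
                  dimension of each entry in other_scores (same order).
    Assumption: weights are the correlations divided by their sum.
    """
    total = sum(correlations)
    weights = [c / total for c in correlations]
    return sum(w * abs(target_score - z)
               for w, z in zip(weights, other_scores))


def nearest_references(target_ts, candidates, k):
    """Pick the k suspicious data closest in time to the target.

    candidates: list of (timestamp, relative_deviation) tuples for the
                other suspicious data in the target's dimension.
    """
    return sorted(candidates, key=lambda c: abs(c[0] - target_ts))[:k]


# Hypothetical values for illustration only.
param = initial_abnormal_parameter(3.0, [0.4, 0.5, 0.45], [2.0, 1.0, 1.0])
refs = nearest_references(100, [(90, 0.7), (130, 0.8), (101, 0.9), (40, 0.66)], 2)
```

With these inputs the dimension with the strongest correlation contributes half of the weight, so a large score difference in a strongly correlated dimension raises the initial abnormal parameter more than the same difference in a weakly correlated one.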
Further, the calculation formula of the real abnormal parameters of the target data is as follows:
wherein the quantities appearing in the formula are: the real abnormal parameters of the target data; the initial abnormal parameters of the target data; the timestamp of the target data; the timestamp of the i-th reference data of the target data; the relative deviation of the target data; the relative deviation of the i-th reference data of the target data; the exponential function with the natural constant e as its base; and the second preset number, which takes values in a specified range.
Further, the step of adjusting the standard score of the suspicious body measurement data according to the real abnormal parameters of the suspicious body measurement data, and the step of screening abnormal body measurement data from all suspicious body measurement data includes:
Taking the product value of the real abnormal parameter and the standard score of each suspicious body measurement data as an adjustment standard score of each suspicious body measurement data;
Based on the Grubbs test, abnormal body measurement data are screened from all suspicious body measurement data according to the adjusted standard score of each suspicious body measurement data.
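As a rough sketch of these two steps, the Python below multiplies each real abnormal parameter by the corresponding standard score and then applies a single pass of a Grubbs-style check to the adjusted scores. The function names and sample numbers are hypothetical. The critical value is supplied by the caller and should be taken from a Grubbs table for the chosen significance level and sample size; the value used in the example is only a placeholder for illustration.

```python
from statistics import mean, stdev


def adjusted_scores(real_params, std_scores):
    """Adjusted standard score = real abnormal parameter * standard score."""
    return [p * z for p, z in zip(real_params, std_scores)]


def grubbs_candidate(values):
    """Return (G, index) for the value farthest from the mean.

    G = |x - mean| / s, where s is the sample standard deviation.
    Requires at least three values that are not all identical.
    """
    mu, s = mean(values), stdev(values)
    idx = max(range(len(values)), key=lambda i: abs(values[i] - mu))
    return abs(values[idx] - mu) / s, idx


def flag_abnormal(values, critical_value):
    """Return the index of the flagged value, or None if G is not
    above the caller-supplied critical value (single pass only)."""
    g, idx = grubbs_candidate(values)
    return idx if g > critical_value else None


# Hypothetical real abnormal parameters and standard scores.
params = [0.2, 0.3, 0.25, 0.9, 0.2, 0.3, 0.25]
scores = [2.1, 2.0, 2.2, 3.5, 2.0, 2.1, 2.3]
adj = adjusted_scores(params, scores)
flagged = flag_abnormal(adj, critical_value=2.02)  # placeholder threshold
```

A full Grubbs procedure would remove the flagged value and repeat with a critical value recomputed for the smaller sample; that loop is omitted here to keep the sketch short.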
Further, the optimizing the acquisition of student body test data based on the abnormal body test data comprises:
taking students corresponding to the abnormal body measurement data as students to be collected, and taking body measurement items of dimensions corresponding to the abnormal body measurement data as items to be collected;
and removing the abnormal body measurement data from the database, testing the items to be collected of the students to be collected again, obtaining new body measurement data of the students to be collected, and recording the new body measurement data into the database.
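The re-collection workflow can be pictured with a small in-memory stand-in for the database. The dictionary layout, keyed by (student, dimension), and the function names are assumptions made only for this sketch; a real deployment would use whatever storage the school already has.

```python
def plan_recollection(database, abnormal_keys):
    """Remove abnormal records and return the retest list.

    database: dict mapping (student, dimension) -> measured value.
    abnormal_keys: iterable of (student, dimension) flagged as abnormal.
    Returns the (student to be collected, item to be collected) pairs.
    """
    to_collect = []
    for key in abnormal_keys:
        if key in database:
            del database[key]          # drop the abnormal record
            to_collect.append(key)     # schedule this item for retest
    return to_collect


def record_retest(database, student, dimension, value):
    """Store the newly measured value for a retested item."""
    database[(student, dimension)] = value


# Hypothetical records for illustration only.
db = {("s1", "long_jump"): 120.0, ("s1", "height"): 175.0}
todo = plan_recollection(db, [("s1", "long_jump")])
record_retest(db, "s1", "long_jump", 232.0)
```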
Further, the obtaining the standard score of each of the body measurement data according to the deviation of each of the body measurement data relative to all of the body measurement data in the dimension thereof comprises:
Taking any one of the measured data as data to be analyzed;
Taking the average value of all body measurement data in the dimension of the data to be analyzed as the data average value of the dimension of the data to be analyzed, and taking the standard deviation of all body measurement data in the dimension of the data to be analyzed as the data standard deviation of the dimension of the data to be analyzed;
Taking the absolute value of the difference value between the data to be analyzed and the data average value as the data deviation of the data to be analyzed;
And obtaining a standard score of the data to be analyzed, wherein the standard score is positively correlated with the data deviation, and the standard score is negatively correlated with the data standard deviation.
The invention also provides a college student physical testing data acquisition system, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the steps of any one college student physical testing data acquisition method when executing the computer program.
The invention has the following beneficial effects:
The invention addresses the situation in which the body measurement data of different students differ widely, so that existing methods produce false detections or missed detections, the accuracy of abnormal data detection is reduced, and accurate body measurement data of the students cannot be acquired. First, body measurement data of each student in different dimensions are acquired, and the standard score reflects the degree of deviation between each body measurement data and the overall data of its dimension, so that the degree of abnormality of the body measurement data can subsequently be analyzed on the basis of the standard score while excluding the influence of the large differences between students. Under normal conditions, the deviations of the same student's body measurement data relative to the overall data of their respective dimensions are similar. The relative deviation therefore gives a preliminary indication of how likely each body measurement data is to be abnormal, and the body measurement data that may be abnormal, namely the suspicious body measurement data, are preliminarily screened by the relative deviation. The correlation between dimensions is then obtained, and the real abnormal parameters of each suspicious body measurement data are calculated from the differences in standard score weighted by the correlation between dimensions, together with the differences in timestamp and in relative deviation with respect to other suspicious body measurement data, so that the real abnormal parameters reflect the degree of abnormality more accurately. Finally, the standard scores of the suspicious body measurement data are adjusted by the real abnormal parameters, abnormal body measurement data are screened from the suspicious body measurement data, and the acquisition of student body measurement data is optimized on that basis, which improves the accuracy of abnormal data detection and allows more accurate body measurement data to be obtained.
Detailed Description
To further explain the technical means adopted by the invention to achieve its intended purpose and the effects obtained, the specific implementation, structure, features and effects of the college student physique test data acquisition method and system according to the invention are described in detail below with reference to the accompanying drawings and the preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
An embodiment of a method and a system for collecting physique test data of college students:
the invention provides a method and a system for collecting physique test data of college students, which are specifically described below with reference to the accompanying drawings.
Referring to fig. 1, a flowchart of a method for collecting physical testing data of students in colleges and universities according to an embodiment of the present invention is shown, where the method includes:
Step S1: acquiring body measurement data of each student in different dimensions, wherein each body measurement data corresponds to a time stamp, and acquiring the standard score of each body measurement data according to the deviation of each body measurement data relative to all body measurement data in the dimension of each body measurement data.
The physical test is a comprehensive test for detecting the overall physical condition of students, and through the test result, schools can know the overall physical level and characteristics of the students, discover the students with weak physical constitution or health problems in time, and purposefully develop physical education plans and course arrangements to help the students to improve physical level and motor skills.
Due to equipment failure or human error, the body measurement data of students contain abnormal data. The related art usually identifies abnormal data by exploiting the large deviation between the abnormal data and the overall data, and then processes the abnormal data so that accurate body measurement data can be acquired. However, when the number of students is very large, the physiques of different students differ widely and so do their body measurement data; existing methods then produce false detections or missed detections, the accuracy of abnormal data detection is reduced, and accurate body measurement data of the students cannot be acquired. Therefore, the embodiment of the invention provides a college student physique test data acquisition method to solve this problem.
In the embodiment of the invention, during the physical testing of students, standardized testing equipment specified by the state is used to collect the body measurement data of each student, and the collected body measurement data of all students are recorded in a database to facilitate subsequent calculation and analysis. The body measurement data of each student cover a plurality of dimensions, which include at least height, weight, sit-and-reach, pull-ups, standing long jump and the like; the specific types and number of dimensions may be set by the implementer according to the specific implementation scenario and are not limited herein.
Considering that some abnormal data exist in the body measurement data of students because of equipment faults or human recording errors, existing methods such as the Grubbs test usually identify abnormal data by exploiting the large deviation between the abnormal data and the overall data. The embodiment of the invention therefore first analyzes the deviation of each body measurement data relative to all body measurement data in its dimension, and uses the resulting standard score to reflect the degree of deviation between each body measurement data and the overall data of its dimension. The standard score can subsequently be adjusted to improve the accuracy of abnormal data detection, so that the acquisition of body measurement data is optimized and accurate body measurement data of the students are obtained.
Preferably, in one embodiment of the present invention, the method for obtaining the standard score of each measured data specifically includes:
In order to facilitate clearer analysis, any one of the measured data can be used as the data to be analyzed; taking the average value of all body measurement data in the dimension of the data to be analyzed as the data average value of the dimension of the data to be analyzed, and taking the standard deviation of all body measurement data in the dimension of the data to be analyzed as the data standard deviation of the dimension of the data to be analyzed; taking the absolute value of the difference value between the data to be analyzed and the data average value as the data deviation of the data to be analyzed; and obtaining the standard score of the data to be analyzed, wherein the standard score is positively correlated with the data deviation, and the standard score is negatively correlated with the data standard deviation. The expression of the standard score may specifically be, for example:
$$z = \frac{\lvert x - \mu \rvert}{\sigma}$$
wherein $z$ represents the standard score of the data to be analyzed; $x$ represents the data to be analyzed; $\mu$ represents the average value of all body measurement data in the dimension of the data to be analyzed, namely the data average value of that dimension; $\sigma$ represents the standard deviation of all body measurement data in the dimension of the data to be analyzed, namely the data standard deviation of that dimension. Because the body measurement data of all students in the same dimension are unlikely to be all equal, $\sigma \neq 0$.
In the process of obtaining the standard score of the data to be analyzed, a larger standard score $z$ indicates a larger deviation of the data to be analyzed relative to the overall data of its dimension. The standard score can be further adjusted in subsequent steps to exclude interference from factors such as the large physique differences between students, and abnormal body measurement data are then detected based on the adjusted standard score. A larger $\lvert x - \mu \rvert$ means a larger difference between the data to be analyzed $x$ and the data average value $\mu$ of its dimension, hence a larger deviation relative to the overall data of that dimension and a larger standard score $z$. Because data in different dimensions have different units and magnitudes, the data standard deviation $\sigma$ is used to standardize $\lvert x - \mu \rvert$, which puts the standard scores of all body measurement data on a common scale and facilitates subsequent calculation and analysis across dimensions.
The standard score of each physical measurement data can be obtained through the method, and the physical measurement data of each student and the standard score of each physical measurement data are obtained.
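The standard score computation described above can be sketched in Python for a single dimension. This is a minimal illustration, assuming the population standard deviation is intended (the text does not say whether the population or sample form is used); the function name and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def standard_scores(values):
    """Return |x - mean| / std for every value in one dimension.

    values: body measurement data of all students in one dimension.
    Assumption: population standard deviation (statistics.pstdev).
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical; score is undefined")
    return [abs(x - mu) / sigma for x in values]


# Hypothetical standing long jump results in cm, one per student.
long_jump_cm = [230.0, 245.0, 210.0, 238.0, 120.0]
scores = standard_scores(long_jump_cm)
```

In this example the last entry lies far below the others, so it receives the largest standard score. Whether it is a recording error or a genuinely weak result is what the later steps are meant to decide.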
Step S2: taking any one student as a student to be tested, taking any one body measurement data of the student to be tested as data to be tested, and obtaining the relative deviation of the data to be tested according to the difference of standard scores between the data to be tested and other body measurement data except the data to be tested in the student to be tested; suspicious body test data is screened from all body test data based on the relative deviation.
Colleges and universities have many students who come from different regions and have quite different living habits, so their physiques differ widely, and so do the normal body measurement data of different students. As a result, some normal body measurement data also deviate from the overall data to a certain degree. If abnormal data are identified directly, as in existing methods, by exploiting the large deviation between abnormal data and the overall data, false detections or missed detections may occur and the accuracy of abnormal data detection is reduced.
Although the physiques and body measurement data of different students differ widely, which prevents abnormal data from being detected accurately, the physique level of the same student is consistent under normal conditions. That is, the deviations of one student's body measurement data relative to the overall data of their respective dimensions are similar. For example, if a student has a relatively weak physique, that student's results in items such as long-distance running and standing long jump will all differ from those of students with normal physique to a similar degree. Conversely, if a student's body measurement data differ greatly from those of other students in only one item, while the differences in the other items are small, the student's data for that item are more likely to be abnormal. Since the standard score reflects the degree of deviation between each body measurement data and the overall data of its dimension, the differences between the standard scores of the same student's body measurement data can be used to assess this possibility.
Preferably, in one embodiment of the present invention, the method for acquiring the relative deviation of the data to be measured specifically includes:
Taking the absolute value of the difference value of the standard score between the data to be measured and each other body measurement data except the data to be measured in the students to be measured as a first score difference between the data to be measured and each other body measurement data except the data to be measured in the students to be measured; and carrying out normalization processing on the average value of the first score difference between the data to be measured and all other body measurement data except the data to be measured in the student to be measured to obtain the relative deviation of the data to be measured. The expression of the relative deviation may specifically be, for example:
$$R = \mathrm{Norm}\left(\frac{1}{N-1}\sum_{i=1}^{N-1}\lvert z - z_i \rvert\right)$$
wherein $R$ represents the relative deviation of the data to be measured; $z$ represents the standard score of the data to be measured; $z_i$ represents the standard score of the body measurement data of the $i$-th other dimension of the student to be tested, excluding the data to be measured; $N$ represents the number of dimensions, which is also the number of body measurement items, so $N-1$ is the number of dimensions other than the one in which the data to be measured are located; $\mathrm{Norm}(\cdot)$ represents the normalization function.
In the process of acquiring the relative deviation of the data to be measured, a larger relative deviation $R$ indicates a larger difference between the deviation of the data to be measured and the deviations of the student's body measurement data in the other dimensions, and hence a greater possibility that the data to be measured are abnormal. A larger $\lvert z - z_i \rvert$ means a larger difference between the deviation of the data to be measured and the deviation of the student's body measurement data in the $i$-th other dimension, and hence a larger relative deviation $R$. The normalization function limits $R$ to the range $[0, 1]$, which is convenient for the subsequent preliminary screening.
In one embodiment of the present invention, the normalization process may specifically be, for example, maximum and minimum normalization processes, and the normalization in the subsequent steps may be performed by using the maximum and minimum normalization processes, and in other embodiments of the present invention, other normalization methods may be selected according to a specific range of values, which will not be described herein.
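A minimal Python sketch of the relative deviation follows, using max-min normalization as mentioned above. It assumes the normalization is taken over the raw values of all students and all dimensions together, which the text does not state explicitly; the function name, the table layout and the sample scores are hypothetical.

```python
def relative_deviations(score_table):
    """Compute the relative deviation of every body measurement datum.

    score_table: dict mapping student -> list of standard scores,
                 one per dimension (at least two dimensions).
    Returns a dict of the same shape with values in [0, 1].
    Assumption: min-max normalization over all students and dimensions.
    """
    raw = {}
    for student, scores in score_table.items():
        n = len(scores)
        # Mean absolute difference to the student's other standard scores.
        raw[student] = [
            sum(abs(z - other) for j, other in enumerate(scores) if j != i)
            / (n - 1)
            for i, z in enumerate(scores)
        ]
    flat = [v for row in raw.values() for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:  # every raw value identical: nothing stands out
        return {s: [0.0] * len(row) for s, row in raw.items()}
    return {s: [(v - lo) / span for v in row] for s, row in raw.items()}


# Hypothetical standard scores for three students in four dimensions.
score_table = {
    "A": [0.2, 0.3, 0.25, 0.2],
    "B": [1.1, 1.0, 1.2, 1.1],
    "C": [0.4, 0.5, 3.0, 0.45],
}
rel = relative_deviations(score_table)
```

Student B deviates from the cohort in every dimension by a similar amount and so receives small relative deviations, whereas student C's third value stands apart from C's other scores and receives the largest one.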
The relative deviation of each body measurement data can be obtained by using the same method, and the possibility of abnormality of the body measurement data can be primarily reflected through the relative deviation, so that the embodiment of the invention can primarily screen the body measurement data possibly having abnormality from all the body measurement data based on the relative deviation, namely suspicious body measurement data, and can further screen the abnormal body measurement data from the suspicious body measurement data, thereby improving the accuracy of abnormal data detection.
Preferably, in one embodiment of the present invention, body measurement data whose relative deviation is greater than a preset threshold are taken as suspicious body measurement data. The preset threshold is generally chosen from a limited range of values; in one embodiment of the present invention it is set to 0.65. The specific value of the preset threshold may also be set by the implementer according to the specific implementation scenario, and is not limited herein.
The relative deviation of each body measurement data is obtained, suspicious body measurement data is screened out preliminarily, the abnormal degree of the body measurement data can be analyzed based on the relative deviation later, meanwhile, the influence of the abnormal body measurement data in the analysis process can be reduced through the relative deviation, and abnormal body measurement data is screened out from the suspicious body measurement data.
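The preliminary screening amounts to a threshold comparison, sketched below in Python. The default of 0.65 is the value given for one embodiment; the function name and the input layout (student mapped to a list of relative deviations indexed by dimension) are assumptions for illustration.

```python
def screen_suspicious(rel_dev, threshold=0.65):
    """Return (student, dimension index) pairs whose relative deviation
    is strictly greater than the threshold."""
    return [
        (student, dim)
        for student, row in rel_dev.items()
        for dim, value in enumerate(row)
        if value > threshold
    ]


# Hypothetical relative deviations for two students in two dimensions.
suspicious = screen_suspicious({"A": [0.10, 0.70], "B": [0.65, 0.90]})
```

Note that a value exactly equal to the threshold is not flagged, because the text specifies "greater than".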
Step S3: according to the difference of the body measurement data of the same student in any two dimensions and the relative deviation of the body measurement data, obtaining the correlation between any two dimensions; taking any suspicious body measurement data as target data, taking a student corresponding to the target data as a target student, and obtaining real abnormal parameters of the target data according to the difference of standard scores and the correlation between dimensions of the target data and other body measurement data except the target data in the target student and the difference of time stamps and the difference of relative deviation between the target data and other suspicious body measurement data.
Although the physiques and body measurement data of different students differ widely, the body measurement items are correlated to some extent. For example, students with good standing long jump results tend to do well in sprinting, and overweight students often perform poorly in pull-ups; the body measurement data of each student vary more consistently across items that are more strongly correlated. The embodiment of the invention therefore analyzes the correlation between any two dimensions based on the difference of the same student's body measurement data in the two dimensions. The relative deviation of the body measurement data is combined into this analysis to reduce the influence of abnormal data on the correlation and improve its accuracy, so that the degree of abnormality of the suspicious body measurement data can subsequently be analyzed based on the correlation between dimensions.
Preferably, in one embodiment of the present invention, the method for acquiring the correlation between any two dimensions specifically includes:
Referring to fig. 2, a flowchart of a method for obtaining correlation between any two dimensions according to an embodiment of the present invention is shown.
Step S301: the two arbitrarily selected dimensions are taken as the first dimension and the second dimension respectively. Because body measurement data in different dimensions have different units and magnitudes, the data must be standardized before the correlation is analyzed so that they can be compared on a common scale. The body measurement data in the first dimension and in the second dimension are therefore standardized separately to obtain the standardized data of each body measurement data.
Step S302: taking the absolute value of the difference between the standardized data of each student in the first dimension and in the second dimension as the standardized distance of the body measurement data of each student between the first dimension and the second dimension.
The standardized distance reflects the difference between the standardized data of the same student in the first dimension and in the second dimension; the similarity of the standardized distances of all students can subsequently be used to analyze the correlation between the two dimensions.
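Steps S301 and S302 can be sketched in Python as follows. The text does not say which standardization is used in step S301; this sketch assumes a signed z-score with the population standard deviation, and max-min normalization would be an equally plausible reading. Function names and sample data are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Signed z-score of each value (assumed standardization method)."""
    mu, sigma = mean(values), pstdev(values)
    return [(x - mu) / sigma for x in values]


def standardized_distances(first_dim, second_dim):
    """Per-student |standardized first value - standardized second value|.

    Both lists must list the same students in the same order.
    """
    a, b = standardize(first_dim), standardize(second_dim)
    return [abs(x - y) for x, y in zip(a, b)]


# Hypothetical data: long jump in cm and a sprint score (higher is better).
long_jump_cm = [230.0, 245.0, 210.0, 238.0]
sprint_score = [80.0, 90.0, 65.0, 85.0]
dist = standardized_distances(long_jump_cm, sprint_score)
```

When two dimensions rank and space the students in the same way, every standardized distance is close to zero, which is the behavior a strongly correlated pair of items should show.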
Step S303: selecting one dimension from the first dimension and the second dimension as a target dimension, and, in the target dimension, dividing the standardized data of every preset first number of students into one group, which serves as a standardized data group. The preset first number is generally chosen from a limited range of values; in one embodiment of the present invention it is set to 10. The specific value of the preset first number may also be set by the implementer according to the specific implementation scenario, and is not limited herein.
Step S304: in each standardized data group, taking the student corresponding to the maximum value of the standardized data as the first marked student of that group and the student corresponding to the minimum value of the standardized data as the second marked student of that group; taking the first marked students of all standardized data groups as the first set of the target dimension and the second marked students of all standardized data groups as the second set of the target dimension.
Step S305: taking the intersection of the first sets of the first dimension and the second dimension as the first intersection; and taking the intersection of the second sets of the first dimension and the second dimension as the second intersection.
The students in the first intersection are those who correspond to the maximum value of a standardized data group in both the first dimension and the second dimension, and the students in the second intersection are those who correspond to the minimum value of a standardized data group in both dimensions. The more students the first intersection and the second intersection contain, the more consistent the variation trend of the body measurement data in the first dimension and the second dimension, so these counts can be used in the subsequent correlation analysis.
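Steps S303 to S305 can be sketched as follows. The sketch assumes that students are listed in the same fixed order in both dimensions and that groups are consecutive chunks of that order; the text does not state how students are assigned to groups. The function names and sample values are hypothetical.

```python
def marked_sets(std_data, group_size=10):
    """Return (first_set, second_set): student indices holding the maximum and
    the minimum standardized value within each consecutive group."""
    first_set, second_set = set(), set()
    for start in range(0, len(std_data), group_size):
        group = range(start, min(start + group_size, len(std_data)))
        first_set.add(max(group, key=lambda i: std_data[i]))
        second_set.add(min(group, key=lambda i: std_data[i]))
    return first_set, second_set

def intersections(std_a, std_b, group_size=10):
    """First and second intersections between two dimensions."""
    a_max, a_min = marked_sets(std_a, group_size)
    b_max, b_min = marked_sets(std_b, group_size)
    return a_max & b_max, a_min & b_min

# Hypothetical standardized data for six students, grouped three at a time.
std_a = [0.1, 0.9, -0.5, 0.3, 1.2, -1.0]
std_b = [0.2, 1.1, -0.7, -0.9, 0.8, 0.0]
first_inter, second_inter = intersections(std_a, std_b, group_size=3)
```

Here students 1 and 4 hold the group maximum in both dimensions, while only student 2 holds the group minimum in both, so the first intersection has two members and the second has one.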
Step S306: obtaining the correlation between the first dimension and the second dimension according to the distribution of the standardized distances of all students between the first dimension and the second dimension, the relative deviation of the body measurement data of each student in the first dimension and the second dimension, and the numbers of students in the first intersection and the second intersection.
The formula itself is missing from this text; a form consistent with the symbol definitions and the explanation that follow is:
$$R=\exp\!\left(-\frac{1}{n}\sum_{i=1}^{n}\frac{\left(d_i-\bar{d}\right)^2}{p_i\,q_i+\varepsilon}\right)\times\frac{m_1+m_2}{M_1+M_2},\qquad d_i=\lvert x_i-y_i\rvert$$
wherein $R$ represents the correlation between the first dimension and the second dimension; $d_i$ represents the standardized distance of the body measurement data of the $i$-th student between the first dimension and the second dimension; $x_i$ represents the standardized data of the body measurement data of the $i$-th student in the first dimension; $y_i$ represents the standardized data of the body measurement data of the $i$-th student in the second dimension; $\bar{d}$ represents the average of the standardized distances of all students between the first dimension and the second dimension; $p_i$ represents the relative deviation of the body measurement data of the $i$-th student in the first dimension; $q_i$ represents the relative deviation of the body measurement data of the $i$-th student in the second dimension; $n$ represents the number of students; $m_1$ represents the number of students in the first intersection; $m_2$ represents the number of students in the second intersection; $M_1$ represents the number of students in the first set of the first dimension; $M_2$ represents the number of students in the second set of the first dimension; $\exp(-\,\cdot\,)$ represents an exponential function with the natural constant $e$ as its base, used for negative-correlation normalization; $\varepsilon$ represents a preset parameter that ensures the denominator is greater than or equal to 1. The general value range of $\varepsilon$ is missing from this text; in one embodiment of the invention $\varepsilon$ is set to 1, and its specific value may also be set by the practitioner according to the specific implementation scenario, which is not limited herein.
In the process of acquiring the correlation between the first dimension and the second dimension, the stronger the correlation $R$, the more consistent the variation trend of each student's body measurement data in the first dimension and the second dimension. The standardized distance $d_i$ reflects the difference between the standardized data of the same student in the two dimensions. If the variation trends in the two dimensions are more consistent, the standardized distances $d_i$ of all students are closer to one another and their variance is smaller; the variance of the standardized distances is therefore analyzed. At the same time, abnormal data among the body measurement data may interfere with the calculation of the correlation. The larger the relative deviation, the more likely the body measurement data is abnormal, so $p_i\,q_i+\varepsilon$ is introduced as a denominator in the variance analysis of the standardized distance to reduce the interference of abnormal body measurement data with the correlation analysis.
$\frac{m_1+m_2}{M_1+M_2}$ represents the ratio of the number of students in the first intersection and the second intersection to the number of students corresponding to the extreme values in all standardized data groups of the first dimension. If $M_1'$ and $M_2'$ represent the number of students in the first set and in the second set of the second dimension respectively, then $M_1+M_2$ may be replaced by $M_1'+M_2'$, wherein $M_1+M_2=M_1'+M_2'$. The larger $m_1+m_2$ is, the more students correspond to the same type of extreme value in the standardized data groups of both dimensions, the more consistent the variation trend of each student's body measurement data in the two dimensions, and the larger the correlation $R$.
The combination of $p_i$ and $q_i$ may also be calculated by addition, that is, as $p_i+q_i$; the present invention is not limited thereto.
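The correlation calculation of step S306 can be sketched in Python. Because the formula is not reproduced in this text, the sketch assumes one form that matches the symbol definitions: the exponential of the negated variance of the standardized distances, with each squared term divided by the product of the two relative deviations plus the preset parameter, multiplied by the overlap ratio of the intersections. Relative deviations are assumed to be non-negative. The function name, argument names and sample values are hypothetical.

```python
import math

def correlation(distances, dev_a, dev_b,
                n_first_inter, n_second_inter,
                n_first_set, n_second_set, eps=1.0):
    """Assumed form: exp(-weighted variance of distances) * overlap ratio."""
    n = len(distances)
    d_bar = sum(distances) / n
    # Squared deviations are damped for students whose data look abnormal
    # (large relative deviation), so they disturb the variance less.
    weighted_var = sum(
        (d - d_bar) ** 2 / (p * q + eps)
        for d, p, q in zip(distances, dev_a, dev_b)
    ) / n
    overlap = (n_first_inter + n_second_inter) / (n_first_set + n_second_set)
    return math.exp(-weighted_var) * overlap

# Hypothetical inputs for four students.
devs = [0.1, 0.2, 0.1, 0.3]
r_consistent = correlation([0.2, 0.2, 0.2, 0.2], devs, devs, 2, 1, 2, 2)
r_scattered = correlation([0.0, 1.5, 0.1, 2.0], devs, devs, 2, 1, 2, 2)
```

With identical standardized distances the variance term vanishes and the result equals the overlap ratio; scattered distances give a smaller value for the same overlap.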
The above method yields the correlation between any two dimensions. Under normal conditions, a student's body measurement data behave consistently in two strongly correlated dimensions. To detect abnormal body measurement data among the suspicious body measurement data more accurately, any one suspicious body measurement data is first taken as the target data, and the student corresponding to the target data is taken as the target student. To eliminate the interference of large physique differences between students, the degree of abnormality of the suspicious body measurement data is analyzed from the standard score differences between the target data and the target student's other body measurement data, together with the correlation between dimensions. Meanwhile, abnormal body measurement data are usually concentrated in time; for example, an equipment fault may make the body measurement data of a whole group of students in the same long-distance run abnormal, and this concentration in time also makes the relative deviations of the abnormal data concentrated. The differences in timestamp and in relative deviation between the target data and other suspicious body measurement data are therefore combined to obtain the real abnormal parameter of the target data, so that abnormal body measurement data can subsequently be screened accurately.
Preferably, in one embodiment of the present invention, the method for acquiring the real abnormal parameters of the target data specifically includes:
Referring to fig. 3, a flowchart of a method for acquiring real abnormal parameters of target data according to an embodiment of the invention is shown.
Step S311: taking the absolute value of the difference in standard score between the target data and each other body measurement data of the target student as the second score difference between the target data and that other body measurement data.
Similar to the way the first score difference is obtained in the step of calculating the relative deviation, this eliminates the influence of large differences between students when body measurement data are analyzed for abnormality.
Step S312: normalizing the correlation between the dimension of the target data and the dimension of each other body measurement data of the target student to obtain the weight parameters.
For two strongly correlated dimensions, if a student's suspicious body measurement data is inconsistent with that student's body measurement data in the other dimension, the suspicious body measurement data is more likely to be abnormal. The weight parameters obtained in step S312 and the second score differences obtained in step S311 can therefore be used for a preliminary analysis of the degree of abnormality of the suspicious body measurement data.
Step S313: weighting and summing the second score differences using the weight parameters to obtain the initial abnormal parameter of the target data. The formula is missing from this text; the form implied by the symbol definitions is:
$$A=\sum_{j=1}^{N-1}\frac{r_j}{\sum_{k=1}^{N-1}r_k}\,\lvert z-z_j\rvert$$
wherein $A$ represents the initial abnormal parameter of the target data; $z$ represents the standard score of the target data; $z_j$ represents the standard score of the target student's body measurement data in the $j$-th dimension other than the dimension of the target data; $r_j$ represents the correlation between the dimension of the target data and the $j$-th other dimension; $r_k$ represents the correlation between the dimension of the target data and the $k$-th other dimension; $N$ represents the number of dimensions; $\lvert z-z_j\rvert$ represents the second score difference; $\frac{r_j}{\sum_{k}r_k}$ represents the weight parameter.
The initial abnormal parameter $A$ gives a preliminary indication of how likely the target data is to be abnormal: the larger $A$ is, the more likely the target data is abnormal data. It can be adjusted later so that the abnormality analysis of the suspicious body measurement data becomes more accurate. The larger the second score difference, the less consistent the deviation of the target data is with that of the target student's other body measurement data; if, in addition, the correlation between the dimension of the target data and the dimension of that other body measurement data is strong, the target data is even more likely to be abnormal and $A$ is larger. The second score differences are therefore weighted by the weight parameters to obtain the initial abnormal parameter $A$.
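Steps S311 to S313 can be sketched as a correlation-weighted sum of score differences. The sketch assumes that normalizing the correlations means dividing each by their sum, which is the reading suggested by the symbol definitions; the function name and the sample values are hypothetical.

```python
def initial_abnormal_parameter(target_score, other_scores, correlations):
    """Weighted sum of |z - z_j|, with weights r_j / sum(r_k).

    other_scores[j] is the target student's standard score in another
    dimension; correlations[j] is the correlation between that dimension
    and the dimension of the target data.
    """
    total = sum(correlations)
    if total == 0:
        # No correlated dimension carries any weight (handling is an assumption).
        return 0.0
    return sum(
        (r / total) * abs(target_score - z)
        for z, r in zip(other_scores, correlations)
    )

# Hypothetical target student: standard score 3.0 in the suspicious dimension,
# 0.5 and 1.0 in two other dimensions whose correlations are 0.75 and 0.25.
a_init = initial_abnormal_parameter(3.0, [0.5, 1.0], [0.75, 0.25])
```

The strongly correlated dimension dominates the result, so a mismatch with that dimension raises the initial abnormal parameter more than the same mismatch with a weakly correlated one.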
Analyzing the degree of abnormality of suspicious body measurement data using only the standard score differences and the correlation between dimensions yields an initial abnormal parameter that does not yet reflect the abnormality accurately enough. Abnormal body measurement data are usually concentrated in time; for example, an equipment fault can make the body measurement data of a whole group of students in the same long-distance run abnormal, and this concentration in time gives the relative deviations of the abnormal data the same concentration. The differences in timestamp and in relative deviation between suspicious body measurement data can therefore be analyzed next, and the initial abnormal parameter adjusted accordingly, to obtain a real abnormal parameter that reflects the abnormality of the suspicious body measurement data more accurately.
Step S314: because abnormal body measurement data are concentrated in time, selecting, in the dimension of the target data, a preset second number of other suspicious body measurement data whose timestamps are closest to the timestamp of the target data as the reference data of the target data. The preset second number generally lies within a value range that is missing from this text; in one embodiment of the present invention it is set to 30, and its specific value may also be set by the practitioner according to the specific implementation scenario, which is not limited herein.
Step S315: adjusting the initial abnormal parameter according to the difference in timestamp and the difference in relative deviation between the target data and each reference data, to obtain the real abnormal parameter of the target data. The formula is missing from this text; one form consistent with the symbol definitions and the explanation that follow is, for example:
$$T=A\times\frac{1}{K}\sum_{k=1}^{K}\exp\!\left(-\left(\lvert t-t_k\rvert+\lvert\delta-\delta_k\rvert\right)\right)$$
wherein $T$ represents the real abnormal parameter of the target data; $A$ represents the initial abnormal parameter of the target data; $t$ represents the timestamp of the target data; $t_k$ represents the timestamp of the $k$-th reference data of the target data; $\delta$ represents the relative deviation of the target data; $\delta_k$ represents the relative deviation of the $k$-th reference data of the target data; $\exp(-\,\cdot\,)$ represents an exponential function with the natural constant $e$ as its base, used for negative-correlation normalization; $K$ represents the preset second number.
The larger the real abnormal parameter $T$, the more likely the target data is abnormal data. The smaller $\lvert t-t_k\rvert$ is, the smaller the difference in timestamp between the target data and the reference data, the more concentrated they are in time, and the larger $T$. The smaller $\lvert\delta-\delta_k\rvert$ is, the smaller the difference in relative deviation between the target data and the reference data, the more concentrated their relative deviations, and the larger $T$. The exponential term is therefore used as the weight of the initial abnormal parameter $A$, adjusting $A$ to obtain the more accurate real abnormal parameter $T$.
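Steps S314 and S315 can be sketched as follows. Since the formula is not reproduced in this text, the sketch assumes that the initial abnormal parameter is multiplied by the average, over the reference data, of an exponential of the negated sum of the timestamp difference and the relative-deviation difference. Timestamps are assumed to be expressed in a unit for which differences are of order one (hours, for instance). All names and sample values are hypothetical.

```python
import math

def nearest_references(target_idx, timestamps, k):
    """Indices of the k other suspicious records closest in time to the target."""
    others = [i for i in range(len(timestamps)) if i != target_idx]
    others.sort(key=lambda i: abs(timestamps[i] - timestamps[target_idx]))
    return others[:k]

def real_abnormal_parameter(initial, t, dev, ref_times, ref_devs):
    """Assumed form: initial * mean(exp(-(|t - t_k| + |dev - dev_k|)))."""
    if not ref_times:
        # No reference data: leave the initial value unchanged (an assumption).
        return initial
    weight = sum(
        math.exp(-(abs(t - tk) + abs(dev - dk)))
        for tk, dk in zip(ref_times, ref_devs)
    ) / len(ref_times)
    return initial * weight

# Hypothetical suspicious records in one dimension (timestamps in hours).
timestamps = [0.0, 10.0, 1.0, 5.0]
refs = nearest_references(0, timestamps, k=2)

clustered = real_abnormal_parameter(2.0, 0.0, 1.5, [0.0, 0.0], [1.5, 1.5])
isolated = real_abnormal_parameter(2.0, 0.0, 1.5, [8.0, 9.0], [0.2, 0.1])
```

When the reference data coincide with the target in both time and relative deviation, the weight is 1 and the initial abnormal parameter is kept; an isolated suspicious record is scaled down sharply.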
The real abnormal parameters of each suspicious body measurement data can be obtained by the same method, the factors with overlarge physique differences of different students are eliminated in the process of solving the real abnormal parameters, and the real abnormal parameters and the standard scores can be combined subsequently to screen the abnormal body measurement data from the suspicious body measurement data.
Step S4: adjusting the standard scores of the suspicious body measurement data according to their real abnormal parameters, and screening abnormal body measurement data from all the suspicious body measurement data.
Existing methods, such as Grubbs' test, usually identify abnormal data by directly using the large deviation between abnormal data and the overall data: the standard score of the body measurement data is compared with the corresponding critical value in the Grubbs table, without considering the differences in physique between students. In the embodiment of the invention, the standard score is first adjusted by the obtained real abnormal parameter before this comparison, so that abnormal body measurement data can be detected accurately; the students concerned can then be handled further, and the acquisition of college body measurement data is optimized.
Preferably, in one embodiment of the present invention, the method for acquiring abnormal body measurement data specifically includes:
Taking the product of the real abnormal parameter and the standard score of each suspicious body measurement data as the adjusted standard score of that suspicious body measurement data; based on Grubbs' test, screening abnormal body measurement data from all suspicious body measurement data according to the adjusted standard score of each suspicious body measurement data. The specific screening process is as follows. First, the detection level $\alpha$ used by Grubbs' test is determined; one embodiment of the invention fixes a specific value of $\alpha$ (the value is missing from this text), and $\alpha$ may also be set by the operator according to the implementation scenario, which is not limited herein. The corresponding critical value is then determined from the Grubbs table using the detection level and the number of college students, and suspicious body measurement data whose adjusted standard score is greater than the critical value are taken as abnormal body measurement data. Grubbs' test is a technical means well known to those skilled in the art and is not described herein.
The screening process eliminates the factors of overlarge physique or body measurement data difference of different students, so that the accuracy of the detected abnormal body measurement data is higher, and the follow-up optimization processing of body measurement data acquisition is facilitated.
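The screening of step S4 can be sketched as below. The critical value is passed in as an argument because the text takes it from the Grubbs table for the chosen detection level and sample size; the value 3.0 used in the example is a placeholder and not a value from that table. Standard scores are assumed to be non-negative deviation measures. The function name and the sample values are hypothetical.

```python
def screen_abnormal(standard_scores, real_abnormal_params, critical_value):
    """Return indices of suspicious records whose adjusted standard score
    (real abnormal parameter * standard score) exceeds the critical value."""
    adjusted = [
        t * abs(z)
        for t, z in zip(real_abnormal_params, standard_scores)
    ]
    return [i for i, g in enumerate(adjusted) if g > critical_value]

# Hypothetical suspicious records.
scores = [3.5, 4.0, 2.0]
params = [1.0, 0.5, 1.2]
abnormal = screen_abnormal(scores, params, critical_value=3.0)
```

The second record has the largest raw standard score, but its small real abnormal parameter pulls the adjusted score below the threshold, so only the first record is flagged.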
Step S5: optimizing the acquisition of student body measurement data based on the abnormal body measurement data.
Abnormal body measurement data cannot reflect the actual physical condition of a student, so that the acquisition of the student body measurement data needs to be further optimized based on the abnormal body measurement data.
Preferably, the method for optimizing the acquisition of student body measurement data in one embodiment of the invention specifically comprises the following steps:
Taking students corresponding to the abnormal body measurement data as students to be collected, and taking body measurement items of dimensions corresponding to the abnormal body measurement data as items to be collected; the abnormal body measurement data are removed from the database, the items to be collected of the students to be collected are tested again, new body measurement data of the students to be collected are obtained, and the new body measurement data are recorded into the database, so that the body measurement data of each student can be collected accurately.
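The acquisition optimization of step S5 amounts to deleting the abnormal records and queuing the affected student and item pairs for re-testing. A minimal sketch follows, assuming the database can be modeled as a dictionary keyed by (student, item); this data model, the function names and the sample values are hypothetical.

```python
def plan_retests(database, abnormal_keys):
    """Remove abnormal records and return the (student, item) pairs to re-test."""
    to_retest = []
    for key in abnormal_keys:
        if key in database:
            del database[key]
            to_retest.append(key)
    return to_retest

def record_retest(database, key, new_value):
    """Store the newly measured value for a re-tested (student, item) pair."""
    database[key] = new_value

# Hypothetical database with one abnormal 1000 m time in seconds.
db = {("s1", "1000m"): 999.0, ("s1", "sit_reach"): 14.0}
pending = plan_retests(db, [("s1", "1000m")])
record_retest(db, ("s1", "1000m"), 238.0)
```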
Based on the same inventive concept, one embodiment of the invention also provides a college student physical testing data acquisition system, which comprises a memory, a processor and a computer program, wherein the memory is used for storing the corresponding computer program, the processor is used for running the corresponding computer program, and the computer program can realize the method described in the steps S1-S5 when running in the processor.
In summary, in the embodiment of the invention, firstly, body measurement data of each student in different dimensions is obtained, and standard scores of each body measurement data are obtained according to deviation of each body measurement data relative to all body measurement data in the dimension where the body measurement data are located; taking any one student as a student to be tested, taking any one body measurement data of the student to be tested as data to be tested, and obtaining the relative deviation of the data to be tested according to the difference of standard scores between the data to be tested and other body measurement data except the data to be tested in the student to be tested; screening suspicious body measurement data from all body measurement data based on the relative deviation; according to the difference of the body measurement data of the same student in any two dimensions and the relative deviation of the body measurement data, obtaining the correlation between any two dimensions; taking any suspicious body measurement data as target data, taking a student corresponding to the target data as a target student, and obtaining real abnormal parameters of the target data according to the difference of standard scores and the correlation between dimensions of the target data and other body measurement data except the target data in the target student and the difference of time stamps and the difference of relative deviation between the target data and other suspicious body measurement data; according to the real abnormal parameters of the suspicious body measurement data, the standard scores of the suspicious body measurement data are adjusted, and abnormal body measurement data are screened out from all the suspicious body measurement data; and optimizing the acquisition of student body measurement data based on the abnormal body measurement data.
An embodiment of an anomaly detection method for student physical test data acquisition:
Equipment faults or human errors can introduce a large amount of abnormal data into students' body measurement data. The related art usually identifies abnormal data by its large deviation from the overall data and then processes it further to achieve accurate acquisition of body measurement data. However, when the number of students is very large, the physique of different students differs considerably and so do their body measurement data; existing methods then produce false detections or missed detections, which reduces the accuracy of abnormal data detection.
In order to solve the problem, the embodiment provides an anomaly detection method for collecting student physical test data, which comprises the following steps:
Step S1: and acquiring body measurement data of each student in different dimensions, wherein each body measurement data corresponds to a time stamp, and acquiring the standard score of each body measurement data according to the deviation of each body measurement data relative to all body measurement data in the dimension of each body measurement data.
Step S2: taking any one student as a student to be tested, taking any one body measurement data of the student to be tested as data to be tested, and obtaining the relative deviation of the data to be tested according to the difference of standard scores between the data to be tested and other body measurement data except the data to be tested in the student to be tested; suspicious body test data is screened from all body test data based on the relative deviation.
Step S3: according to the difference of the body measurement data of the same student in any two dimensions and the relative deviation of the body measurement data, obtaining the correlation between any two dimensions; taking any suspicious body measurement data as target data, taking a student corresponding to the target data as a target student, and obtaining real abnormal parameters of the target data according to the difference of standard scores and the correlation between dimensions of the target data and other body measurement data except the target data in the target student and the difference of time stamps and the difference of relative deviation between the target data and other suspicious body measurement data.
Step S4: and according to the real abnormal parameters of the suspicious body measurement data, adjusting the standard scores of the suspicious body measurement data, and screening abnormal body measurement data from all the suspicious body measurement data.
Steps S1 to S4 are described in detail in the embodiments of the college student physique test data acquisition method and system above, and are not repeated here.
The beneficial effects brought by this embodiment are as follows. The method takes into account that large physique differences between students cause false detection or missed detection and thereby reduce the accuracy of abnormal data detection. First, body measurement data of each student in different dimensions are obtained, and the standard score reflects how far each body measurement data deviates from the overall data of its dimension, so that the degree of abnormality can subsequently be analyzed on the basis of the standard score. To eliminate the influence of large physique differences between students, it is considered that under normal conditions the body measurement data of the same student deviate to a similar degree from the overall data of their respective dimensions; the relative deviation therefore gives a preliminary indication of how likely each body measurement data is to be abnormal, and suspicious body measurement data are preliminarily screened by it. The correlation between dimensions is then analyzed, because two strongly correlated dimensions should show similar body measurement data for the same student, which allows the real abnormal parameter of the target data to be calculated accurately. Meanwhile, abnormal body measurement data are usually concentrated in time, so the differences in timestamp and in relative deviation between suspicious body measurement data are also used, and the resulting real abnormal parameter reflects the abnormality of the suspicious body measurement data more accurately. Finally, the standard score of each suspicious body measurement data is adjusted by its real abnormal parameter, so that abnormal body measurement data are detected accurately.
It should be noted that the order of the embodiments of the present invention is for description only and does not indicate the relative merits of the embodiments. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, the embodiments are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others.