WO2025016335A1 - Method and apparatus for generating action of virtual object, and cluster, medium and program product - Google Patents
- Publication number
- WO2025016335A1 (application PCT/CN2024/105342; CN2024105342W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- action
- target
- data
- motion
- template
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/80—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/806—Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
Definitions
- the present application relates to the field of motion capture technology, and in particular to a method, apparatus, cluster, medium and program product for generating actions of virtual objects.
- motion capture technology can be used to record the process of a real object performing limb movements.
- motion capture technology can use different sensors to obtain the movement information of a real object in real space and convert the captured motion information into animation data of a virtual skeleton through algorithms. Based on the obtained animation data, the process of a virtual object performing the limb movements can be generated, thereby restoring the actions performed by the real object.
- through motion capture technology, the process of real-world objects performing physical movements can be restored relatively accurately.
- however, displaying virtual objects performing physical movements in virtual space may require beautifying the movements on the basis of restoring the physical movements performed by real objects. Directly displaying motion-captured movements in virtual space therefore suffers from poor flexibility.
- the embodiments of the present application provide a method, device, cluster, medium and program product for generating actions of a virtual object, which ensure that the style of the actions performed by the object is preserved, beautify the actions performed by the virtual object, and improve the flexibility of generating actions performed by the virtual object.
- the present application provides a method for generating an action of a virtual object, the method comprising: obtaining first action data; the first action data is used to indicate the joint movement of the target object during the execution of a target action; obtaining second action data; the second action data is used to indicate the joint movement of the virtual object during the execution of the target action according to an action template; the virtual object is used to represent the target object in a virtual space; based on the first action data and the second action data, third action data is generated; the third action data is used to indicate the joint movement after the target action is adjusted in combination with the action template; according to the third action data, an animation video is generated; the animation video includes a clip of the virtual object executing the adjusted target action.
- in this way, the first action data is combined with the second action data to produce third action data adjusted toward the action template, so that a beautified animation video can be generated by rendering according to the third action data.
- the action performed by the virtual object is beautified while ensuring that the style of the action performed by the object is preserved, and the flexibility of the animation generated after the target action performed by the target object is virtualized is improved.
- third action data is generated based on the first action data and the second action data, including: extracting first action features based on the first action data; extracting second action features based on the second action data; and mixing the first action features and the second action features according to specified weights to generate third action data.
- third action data after feature mixing can be obtained, thereby achieving the purpose of adjusting the first action data in combination with the second action data.
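As a rough illustration of the weighted mixing step above, the sketch below blends two per-frame feature vectors according to a specified weight. The vector contents, the 50/50 weight and the function name `mix_features` are all assumptions; the patent does not fix a feature representation.

```python
def mix_features(feat_a, feat_b, weight):
    """Linearly blend two equal-length action-feature vectors.

    `weight` is the fraction taken from feat_b (the template features):
    weight = 0 keeps the captured features, weight = 1 keeps the template.
    """
    if len(feat_a) != len(feat_b):
        raise ValueError("feature vectors must have the same length")
    return [(1.0 - weight) * a + weight * b for a, b in zip(feat_a, feat_b)]

# Captured (first) features vs. template (second) features for one frame.
captured = [0.2, 0.5, 1.0]
template = [0.4, 0.9, 0.0]
mixed = mix_features(captured, template, weight=0.5)
```

With weight 0 the captured motion passes through unchanged, so the specified weight directly controls how strongly the template reshapes the action.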
- third action data is generated based on first action data and second action data, including: based on the first action data, obtaining a first animation curve of a target reference point; the target reference point is a joint used to perform a target action; the first animation curve is used to indicate the movement of the reference point during a time period when the target object performs the target action; based on the second action data, obtaining a second animation curve of the target reference point; the second animation curve is used to indicate the movement of the reference point during a time period when the virtual object performs the target action according to an action template; according to a spherical blending algorithm, the first animation curve and the second animation curve are blended to obtain a third animation curve; the third animation curve is used to indicate the movement of the reference point during a time period when the virtual object performs the target action adjusted in combination with the action template.
- in this way, the third animation curve can be obtained, thereby achieving the purpose of adjusting the first action data in combination with the second action data.
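The patent names a "spherical blending algorithm" without specifying it; spherical linear interpolation (slerp) of per-joint rotation quaternions is a common realization, and the sketch below assumes it. Quaternions are in (w, x, y, z) order, and the blend factor `t` plays the role of the specified weight.

```python
import math

def slerp(q0, q1, t):
    """Spherical linear interpolation between two unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:                      # take the shorter arc
        q1, dot = [-c for c in q1], -dot
    if dot > 0.9995:                   # nearly parallel: fall back to lerp
        out = [(1 - t) * a + t * b for a, b in zip(q0, q1)]
        n = math.sqrt(sum(c * c for c in out))
        return [c / n for c in out]
    theta = math.acos(dot)
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(q0, q1)]

# Blend a captured joint rotation toward the template rotation for one frame.
identity = [1.0, 0.0, 0.0, 0.0]                                       # no rotation
quarter_z = [math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)]  # 90° about z
halfway = slerp(identity, quarter_z, 0.5)                             # ≈ 45° about z
```

Running slerp frame by frame over the first and second animation curves would produce the blended third curve for each joint.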
- generating third action data based on the first action data and the second action data includes: if the target action is an action to be adjusted, generating the third action data based on the first action data and the second action data.
- the action to be adjusted is a target action whose similarity is less than a specified threshold after the frequency domain similarity calculation is performed between the first action data of the target action and the second action data of the target action.
- the frequency domain similarity calculation is performed on the first action data of the target action and the second action data corresponding to the action template of the target action. If the similarity obtained by the frequency domain similarity calculation is less than the specified threshold, it can be determined that the first action data of the target action needs to be adjusted, so the target action can be determined as the action to be adjusted. After determining that the target action is the action to be adjusted, the first action data can be adjusted in combination with the second action data to generate adjusted third action data, thereby achieving the purpose of adjusting the first action data in combination with the second action data.
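A minimal sketch of the frequency-domain similarity test described above, assuming a plain DFT magnitude spectrum and cosine similarity; the patent does not state which frequency-domain measure or threshold is used, so `THRESHOLD = 0.9` is an invented value.

```python
import cmath
import math

def magnitude_spectrum(signal):
    """Magnitudes of the discrete Fourier transform of a 1-D joint curve."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal)))
            for k in range(n)]

def frequency_similarity(curve_a, curve_b):
    """Cosine similarity of the two curves' magnitude spectra
    (1 = identical frequency content)."""
    sa, sb = magnitude_spectrum(curve_a), magnitude_spectrum(curve_b)
    dot = sum(a * b for a, b in zip(sa, sb))
    na = math.sqrt(sum(a * a for a in sa))
    nb = math.sqrt(sum(b * b for b in sb))
    return dot / (na * nb)

THRESHOLD = 0.9   # assumed value; the patent leaves the threshold unspecified

# Captured curve vs. a template curve with the same rhythm, smaller amplitude.
captured = [math.sin(2 * math.pi * i / 16) for i in range(16)]
template = [0.8 * math.sin(2 * math.pi * i / 16) for i in range(16)]
needs_adjustment = frequency_similarity(captured, template) < THRESHOLD
```

Curves with the same rhythm but different amplitudes score near 1 here, so only actions whose timing genuinely diverges from the template would be flagged for adjustment.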
- generating an animation video according to the third action data includes: generating a mixed animation video according to the third action data; and obtaining the animation video by performing low-pass filtering on the mixed animation video.
- in this way, the mixed animation video can be smoothed by a low-pass filtering algorithm to obtain a smooth animation video.
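As one possible low-pass filter, the sketch below smooths a per-frame curve with a centered moving average; the patent does not name a specific filter, so this stands in for whatever algorithm is actually used.

```python
def low_pass(samples, window=3):
    """Smooth a per-frame animation curve with a centered moving average.

    A moving average is one simple low-pass filter; near the ends of the
    clip the window is truncated so no frames are dropped.
    """
    half = window // 2
    out = []
    for i in range(len(samples)):
        lo, hi = max(0, i - half), min(len(samples), i + half + 1)
        out.append(sum(samples[lo:hi]) / (hi - lo))
    return out

# A jittery joint-angle curve before and after smoothing.
jittery = [0.0, 1.0, 0.0, 1.0, 0.0]
smoothed = low_pass(jittery, window=3)
```

Applying the filter to each joint curve of the mixed animation yields the smoothed clip; a Butterworth or Gaussian filter could serve equally well.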
- obtaining the first action data includes: obtaining the first action data by performing motion capture on a target object.
- in this way, motion capture technology can directly collect the first action data of the target object while it performs the target action, so as to achieve the purpose of subsequently adjusting the first action data in combination with the second action data to obtain beautified third action data.
- obtaining the first action data includes: obtaining an action video; the action video contains a segment of a target object performing a target action; and obtaining the first action data by performing motion capture on the target object in the action video.
- an action video of the target object performing the target action can be obtained, and the first action data can be obtained by motion capturing the process of the target object performing the target action in the action video, thereby achieving the purpose of subsequently adjusting the first action data in combination with the second action data to obtain beautified third action data.
- obtaining the second action data includes: in response to a received selection operation, determining an action template corresponding to a target action from at least one action template; and obtaining the second action data of the action template corresponding to the target action.
- the action template corresponding to the target action can be determined directly from at least one action template according to the received selection operation, and the second action data of the action template can be obtained, thereby enabling the user to directly select the action template corresponding to the target action from each action template.
- in this way, the user can easily choose the direction in which the target action is adjusted.
- obtaining the second action data includes: determining the action type of the target action based on the first action data; determining the action template corresponding to the target action according to the action type of the target action; and obtaining the second action data of the action template corresponding to the target action.
- the action type of the target action can be obtained, and the action template corresponding to the action type can be determined, and the second action data of the action template can be obtained.
- This can realize automatic identification of the action type of the target action, and determine the action template corresponding to the action type according to the action type, thereby obtaining the second action data corresponding to the action template.
- determining the action template of the target action includes: determining at least two action templates corresponding to the action type of the target action according to the action type of the target action; and determining the action template of the target action from the at least two action templates in response to a received selection operation.
- the action template corresponding to the action type can be automatically obtained. If the action type corresponds to at least two action templates, the user can determine one of the action templates through a selection operation, so that the user can choose the direction to adjust the target action.
- an embodiment of the present application provides a virtual object action generation device, which is used to execute any one of the virtual object action generation methods provided in the first aspect.
- the embodiment of the present application can divide the action generation device of the virtual object into functional modules according to the method provided in the first aspect above.
- each functional module can be divided according to each function, or two or more functions can be integrated into one processing module.
- the embodiment of the present application can divide the action generation device of the virtual object into an acquisition module, a processing module, and a generation module, etc. according to the function.
- for the possible technical solutions and beneficial effects of the functional modules divided above, reference may be made to the technical solutions provided in the above first aspect or its corresponding possible implementations, which will not be repeated here.
- an embodiment of the present application provides a computing device, the computing device comprising a processor and a memory, the processor being coupled to the memory; the memory being used to store computer instructions, the computer instructions being loaded and executed by the processor so that the computing device implements the method for generating actions of a virtual object as described in the above aspects.
- an embodiment of the present application provides a computing device cluster, which includes at least one computing device, each computing device including: a processor and a memory, the processor of at least one computing device is used to execute instructions stored in the memory of at least one computing device, so that the computing device cluster executes the virtual object action generation method provided in the various optional implementation methods of the above-mentioned first aspect.
- an embodiment of the present application provides a computer-readable storage medium, in which at least one computer program instruction is stored, and the computer program instruction is loaded and executed by a processor to implement the method for generating actions of a virtual object as described in the above aspects.
- an embodiment of the present application provides a computer program product, the computer program product including computer instructions, the computer instructions stored in a computer-readable storage medium.
- a processor of a computing device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computing device executes the method for generating an action of a virtual object provided in various optional implementations of the first aspect.
- FIG. 1 is a schematic diagram of a scene of action generation for a virtual object according to an exemplary embodiment;
- FIG. 2 is a schematic diagram of a motion capture device involved in the embodiment shown in FIG. 1 collecting motion data of a target object;
- FIG. 3 is a schematic diagram of a hardware structure of a computing device according to an exemplary embodiment;
- FIG. 4 is a schematic flow chart of a method for generating an action of a virtual object according to an exemplary embodiment;
- FIG. 5 is a flow chart of anomaly detection on action sequence data involved in the embodiment shown in FIG. 4;
- FIG. 6 is a schematic diagram of action beautification involved in the embodiment shown in FIG. 4;
- FIG. 7 is an architecture diagram of a system for generating actions of a virtual object according to an exemplary embodiment;
- FIG. 8 is a schematic structural diagram of an apparatus for generating an action of a virtual object according to an exemplary embodiment;
- FIG. 9 is a schematic diagram of a computing device according to an exemplary embodiment;
- FIG. 10 is a schematic diagram of a computing device cluster according to an exemplary embodiment;
- FIG. 11 is a schematic diagram of a connection method between computing device clusters according to an exemplary embodiment.
- "A and/or B" can mean: A exists alone, both A and B exist, or B exists alone.
- the character "/" generally indicates that the related objects are in an "or" relationship.
- "plural" means two or more.
- "at least one of the following" or similar expressions refers to any combination of these items, including any combination of a single item or plural items.
- at least one of a, b, or c can mean: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c can be single or multiple.
- the words "first", "second" and the like are used to distinguish identical or similar items with substantially the same functions and effects. Those skilled in the art will understand that the words "first", "second" and the like do not limit quantity or execution order, and do not necessarily indicate a difference.
- the words “exemplary” or “for example” are used to indicate examples, illustrations or explanations. Any embodiment or design described as “exemplary” or “for example” in the embodiments of the present application should not be interpreted as being more preferred or more advantageous than other embodiments or design solutions. Specifically, the use of words such as “exemplary” or “for example” is intended to present related concepts in a concrete manner for ease of understanding.
- the process of objects performing actions in the real space can be mapped in the virtual space for animation video display.
- virtual objects in the virtual space as digital avatars of objects in the real space
- the actions performed by virtual objects can be close to or even exceed the actions performed by objects in the real space.
- the body movements of objects in the real space can be obtained through motion capture technology.
- motion capture technology can be used to collect the motion data of an object in real space through optical sensors, inertial sensors, cameras and other devices.
- the motion of the object in real space can be restored and displayed in virtual space through motion capture technology, and the collected object motion data can be used to generate skeleton animation data.
- a skeleton can be a data structure, not displayed inside the model of the virtual object, composed of virtual joint rotation and position information.
- the corresponding virtual object is directly displayed in the virtual space to perform the action.
- although the action of an object in real space can be restored relatively accurately, in actual applications, especially in virtual-human scenarios, most users constructing virtual objects of themselves perceive the aesthetics of their own actions to be higher than they actually are. As a result, when the virtual human's action is restored and displayed according to the real action, the user is likely to be dissatisfied with its aesthetics when looking back at it.
- the user expects that the displayed virtual human action can have a certain aesthetic feeling under the premise of maintaining his own action style, while the virtual human action restored by the motion capture device is more realistic but lacks aesthetic feeling. Therefore, after the action data of real people is collected through motion capture technology, the virtual human action can be beautified according to artificial experience through action editing.
- motion editing can be done by technical artists using professional software tools to edit the position information of bone rotation to form animation.
- motion editing can be divided into two types: one is to edit directly from scratch, frame by frame; the other is to modify the motion data collected by motion capture technology.
- Professional software tools can be digital content creation (DCC) tools.
- Editing motion data through professional software tools can achieve the purpose of editing and beautifying the motion data collected by motion capture technology.
- however, manual editing is inefficient, the degree of beautification depends on the experience and aesthetic level of the technical artist, and the quality of the edited motion cannot be guaranteed.
- motion generation technology can also be used to automatically generate virtual object movements based on instructions, text or video input.
- the action generation technology can be based on an existing action library and machine learning technology to quickly generate the actions of virtual objects according to multimodal instructions.
- the generated virtual object actions rely on the action data pre-stored in the action library and cannot retain the style characteristics of the actions performed by the objects in reality.
- the embodiment of the present application aims to achieve the purpose of editing and beautifying the actions performed by virtual objects and displaying them while retaining the style of the actions performed by real objects, thereby improving the aesthetics of the actions performed by virtual objects.
- the action data of the object can be automatically edited and beautified according to the pre-set standard action data, thereby automatically obtaining the beautified action data, and generating an animation video based on the beautified action data for display, thereby beautifying the actions performed by virtual objects while ensuring that the style of the actions performed by the object is retained, and improving the flexibility of generating the actions performed by virtual objects.
- Figure 1 shows a scene schematic diagram of a virtual object action generation provided by an embodiment of the present application.
- the target object wears a motion capture device 11
- the motion capture device 11 collects the motion data of the target object
- the motion capture device 11 sends the collected motion data to the computing device 10.
- the computing device 10 automatically edits the collected motion data, and can generate an animation video of the virtual object performing a beautified action based on the motion data after automatic editing, and the beautified animation video is displayed through the animation video display interface 12.
- the computing device 10 may be a server, a computer device or a terminal, and the computing device 10 may be a device with video processing and video display functions.
- the computing device 10 is communicatively connected to the motion capture device 11 , and the motion capture device 11 can transmit data to the computing device 10 .
- the target object can be any moving object in the real space, such as a person, an animal, etc.
- the virtual object is used to represent the target object in the virtual space.
- the virtual object is a three-dimensional model created based on the animation skeleton technology. Each virtual object has its own shape, orientation and direction in the three-dimensional virtual space, and occupies a part of the space in the three-dimensional virtual space.
- Fig. 2 is a schematic diagram of a motion capture device involved in an embodiment of the present application collecting motion data of a target object.
- the motion capture device collects the motion data of the target object 21, whose body skeleton can be divided by joint points according to animation skeleton technology.
- the motion data may include the rotation angle of the joint, the motion acceleration, the force of the joint, etc.
- for example, the rotation angle of the knee joint, the rotation angle of the elbow joint, a motion acceleration of the target object 21 of 6 m/s², the force F on the knee joint, and the force F on the sole joint.
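The kinds of quantities listed above could be gathered per joint and per time point in a record like the following; the field names and units are illustrative assumptions, since the patent does not define a schema.

```python
from dataclasses import dataclass

@dataclass
class JointSample:
    """One joint's motion data at one capture time point."""
    joint: str             # e.g. "knee", "elbow", "sole"
    time_s: float          # capture time in seconds
    rotation_deg: float    # rotation angle of the joint
    accel_m_s2: float      # motion acceleration
    force_n: float         # force acting on the joint

# An illustrative knee sample matching the quantities listed above.
knee = JointSample(joint="knee", time_s=0.2, rotation_deg=35.0,
                   accel_m_s2=6.0, force_n=410.0)
```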
- in one case, the computing device 10 can obtain, through the motion capture device 11, the motion data of the target object captured by the motion capture device 11; in another case, the computing device 10 can obtain the motion data of the target object by obtaining a video containing the target object and analyzing the motion of the target object in the video.
- FIG 3 is a schematic diagram of the hardware structure of a computing device 10 provided in an embodiment of the present application.
- the computing device 10 may be a server, and the server may be an X86 architecture server, specifically a blade server, a high-density server, a rack server or a high-performance server, etc., wherein the server includes a processor 201, a memory 202 and a bus 203, and the server may also include an external display device 204.
- the processor 201 may include one or more processing units, for example, the processor 201 may include an application processor (AP), a modem processor, a graphics processor (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
- Different processing units may be independent devices or integrated in one or more processors.
- the memory 202 is used to store data, wherein the memory 202 includes but is not limited to random access memory (RAM), read-only memory (ROM), flash memory, or optical storage, database, etc.
- the processor 201, the memory 202 and the external display device 204 are usually connected to each other via the bus 203, or are connected to each other in other ways.
- the external display device 204 can be used to display an animation video display interface to show the user the animation video after motion beautification processing.
- the processor 201 can be used to compare the motion data of the target object with the motion data corresponding to the motion template to determine whether each motion performed by the target object needs to be beautified. In other words, the processor 201 can be used to identify the motion that needs to be beautified.
- the processor 201 can also be used to beautify the action performed by the target object, that is, the processor 201 can be used to query the action template closest to the action data from the database according to the collected action data of the target object.
- the processor 201 can also be used to perform data mixing on the collected action data of the target object combined with the action template to generate mixed motion data.
- the processor 201 can also be used to perform smoothing on the mixed motion data.
- the virtual object action generation method is applicable to the computing device 10 shown in Figures 1 and 3.
- FIG4 shows a flow chart of a method for generating an action of a virtual object provided by an exemplary embodiment of the present application.
- the method for generating an action of a virtual object can be executed by a computing device, and the method for generating an action of a virtual object includes the following steps:
- a computing device may acquire first motion data, which may be used to indicate a movement of a reference point when a target object performs a target motion.
- the process of the target object performing the target action can be realized in the real space
- the first action data can include the position information of the reference point in each video frame when the target object performs the target action in the real space, the rotation angle of the reference point, the motion acceleration, and the force condition of the reference point.
- the reference point may be any point on the skeleton when the target action is performed, and the reference point may include a joint point of the skeleton.
- the computing device can obtain the first motion data from the motion capture device, and the first motion data is the motion data of the target object when the motion capture device collects the motion data of the target object.
- the motion capture device can directly send the collected motion data of the target object to the computing device, and the computing device performs subsequent processing.
- the movement process of the target object can be recorded in the form of a video
- the computing device obtains the video recording the movement process of the target object, and obtains the motion data of the target object in the video when performing the target action through motion capture technology.
- the user uses a camera, optical sensor or inertial sensor to obtain the motion data of the target object.
- the user can also record the motion video of the target object and send the motion video of the target object to a computing device.
- the computing device parses the motion video to obtain the motion data corresponding to when the target object performs various types of actions.
- the target action may be an action belonging to a certain action category performed by a certain body part of the target object during the movement.
- the body parts of the target object may include the limbs, torso, shoulders, neck and head, and each body part may include multiple joint points.
- the motion capture device can capture the motion data of each joint point, and through the changes in the time series of each motion data, the type of action that each joint point participates in performing within a certain time period can be determined.
- each joint point at each time point has corresponding motion data at that time point, and each motion data within the time period for collecting the motion data can be obtained for each joint point.
- the type of action that the joint point participates in performing within each time period can be determined.
- the key time points can be determined based on the speed changes, phase changes or position conditions in the motion data of the joints, and the actions that the joints participate in can be divided according to the key time points.
- a computing device can determine the key frames in the motion video used to segment the action based on the motion conditions of each joint of the target object.
- Each video segment segmented according to the key frames can include the action execution process of at least one action type.
- the key frames at which the target object starts and ends walking can be determined, and based on these key frames, the action type of the target object's lower limbs in the video clip can be determined to be walking.
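As an illustration of the segmentation described above, the following sketch (hypothetical, not from the embodiment itself) marks key frames where a joint's speed crosses a moving/stationary threshold — for example, a foot joint touching the ground — and splits the frame range into action segments:

```python
# Illustrative sketch: segmenting a motion sequence into action clips by
# detecting key frames where a joint's speed falls below (or rises above)
# a contact threshold. Frame data and the threshold value are hypothetical.

def key_frames(speeds, threshold=0.05):
    """Return frame indices where the joint transitions between moving
    and (near-)stationary; these transitions are treated as key frames."""
    frames = []
    moving = speeds[0] > threshold
    for i, s in enumerate(speeds[1:], start=1):
        now_moving = s > threshold
        if now_moving != moving:       # state change => key frame
            frames.append(i)
            moving = now_moving
    return frames

def segment(speeds, threshold=0.05):
    """Split the frame range into clips delimited by key frames."""
    cuts = [0] + key_frames(speeds, threshold) + [len(speeds)]
    return [(a, b) for a, b in zip(cuts, cuts[1:]) if b > a]

# Foot-joint speeds for a short walk: stand, step, pause, step, stand.
foot_speed = [0.0, 0.0, 0.3, 0.6, 0.4, 0.02, 0.01, 0.5, 0.7, 0.0]
print(segment(foot_speed))   # e.g. [(0, 2), (2, 5), (5, 7), (7, 9), (9, 10)]
```

Each returned `(start, end)` pair is a candidate action segment whose type can then be classified from the joint velocities inside it.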
- the reference point is a skeletal joint point; the process of beautifying the movement of a skeletal joint point is only illustrative and does not constitute a limitation on the method of the embodiments of the present application.
- the computing device divides, by time point, each action performed during the movement of the target object according to the acquired first action data, and after determining the action type of each action performed by the target object, the computing device can obtain the action template corresponding to the determined action type.
- an action type of the target action is determined; according to the action type of the target action, an action template of the target action is determined; and action data of the action template of the target action is acquired as second action data.
- each action type corresponds to at least two action templates
- at least two action templates corresponding to the action type of the target action are determined according to the action type of the target action, and in response to a received selection operation, the action template of the target action can be determined from the at least two action templates.
- the action template may be pre-stored in the database and used to indicate standard actions of various action types.
- the second motion data may be motion data corresponding to an action template under the action category to which the target action belongs.
- the user can select one of the at least two action templates as the action template of the target action, and obtain the second action data corresponding to the action template.
- the action templates of the waving action type may include action template 1 and action template 2, where the waving amplitude of action template 1 is larger than that of action template 2, and the user may select one of them as the action template of the target action according to need.
- if the user needs to beautify the waving action performed by the target object toward a larger amplitude, the user may determine action template 1 as the action template of the target action through a selection operation, and then obtain the second action data corresponding to action template 1; if the user needs to beautify the waving action toward a smaller amplitude, the user may determine action template 2 as the action template of the target action through a selection operation, and then obtain the second action data corresponding to action template 2.
- in response to a received selection operation, an action template corresponding to the target action may be determined from at least one action template, and second action data of the action template corresponding to the target action may be acquired.
- various action templates can be displayed to the user, and the user directly selects the action template of the target action from the various action templates through a selection operation, and obtains the second action data corresponding to the action template of the target action.
- the computing device can display the action templates of different action types stored in the template library to the user, and each action type can correspond to an action template with different action effects.
- action templates corresponding to waving with different effects, action templates corresponding to running with different effects, action templates corresponding to squatting with different effects, etc. can be displayed.
- the computing device determines the action template corresponding to the target action by receiving a selection operation of the action template corresponding to the squatting effect required by the user, thereby obtaining the second action data of the action template corresponding to the target action.
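The template lookup and selection flow described above can be sketched as follows; the library contents, template names, and fields are all hypothetical stand-ins for whatever the template library actually stores:

```python
# Hypothetical sketch of a template library keyed by action type: each
# type maps to at least two templates with different effects, and a
# user's selection operation resolves to one template, whose action data
# then serves as the second action data.

TEMPLATE_LIBRARY = {
    "wave":  [{"name": "template 1", "amplitude": 1.2},
              {"name": "template 2", "amplitude": 0.6}],
    "squat": [{"name": "deep squat", "depth": 0.9},
              {"name": "half squat", "depth": 0.5}],
}

def templates_for(action_type):
    """Return the candidate templates for an action type."""
    return TEMPLATE_LIBRARY.get(action_type, [])

def select_template(action_type, selection):
    """Resolve a selection operation to one template."""
    for t in templates_for(action_type):
        if t["name"] == selection:
            return t
    raise KeyError(f"no template {selection!r} for {action_type!r}")

chosen = select_template("wave", "template 2")
print(chosen["amplitude"])   # the smaller-amplitude wave template
```

In the waving example from the text, selecting "template 1" instead would yield the larger-amplitude variant.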
- the computing device may modify the first action data in combination with the second action data to generate third action data.
- the third motion data may be used to indicate the joint movement after adjusting the target motion in combination with the motion template.
- the first action data and the second action data are mixed according to a specified weight to obtain mixed motion data, and the mixed motion data is then smoothed to obtain the third motion data.
- the third action data can be obtained by mixing the features of the target action with the features of the corresponding action template and then performing low-pass filtering on the mixed features to obtain smoothed features.
- a first action feature is extracted based on the first action data
- a second action feature is extracted based on the second action data
- the first action feature and the second action feature are mixed according to a specified weight to generate the third action data.
- the periodic motion features in the action data are extracted, and the third motion features are obtained by performing mixing processing on these motion features, thereby obtaining the third motion data.
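The mix-then-smooth step can be sketched as a per-frame linear blend at a specified weight followed by a moving-average low-pass filter; the weight value and filter width here are illustrative choices, not values fixed by the embodiment:

```python
# Minimal sketch of the mix-then-smooth step: per-frame features of the
# captured action (first data) and of the template (second data) are
# blended with a specified weight, then a simple moving-average low-pass
# filter smooths the result while preserving the clip duration.

def blend(first, second, weight=0.5):
    """Per-frame linear mix: `weight` on the template, the rest on capture."""
    return [(1 - weight) * a + weight * b for a, b in zip(first, second)]

def low_pass(signal, width=3):
    """Moving-average smoothing with edge clamping, preserving length."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

captured = [0.0, 1.0, 0.2, 0.9, 0.1]   # jittery knee-angle curve, say
template = [0.0, 0.5, 0.5, 0.5, 0.0]   # smooth template curve
mixed = blend(captured, template, weight=0.5)
smoothed = low_pass(mixed)
print([round(x, 3) for x in smoothed])
```

The smoothed sequence has the same number of frames as the input, matching the requirement below that the smoothed clip keeps the duration of the mixed clip.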
- FIG. 5 is a flowchart of anomaly detection on action sequence data involved in an embodiment of the present application.
- a computing device obtains first action data, which may be posture sequence data (S11), and performs periodic curve extraction on the first action data through an FFT autoencoder (S12). The computing device then performs a frequency domain similarity calculation between the features extracted from the first action data and the features obtained from periodic curve extraction of the action template (S13), and performs anomaly detection according to the calculated similarity result (S14), determining from the magnitude of the similarity whether the target action deviates too far from the action template. If the similarity between the target action and the action template is less than a specified threshold, the target action is determined to be abnormal (S15). If the target action is determined to be abnormal, the target action and the action template can be mixed (S16), and the mixed action data is then filtered to obtain a smooth curve feature, thereby obtaining the third action data (S17).
- based on the first action data, a first animation curve of a target reference point is obtained. The target reference point may be a reference point for executing the target action, and the first animation curve may be used to indicate the movement of the reference point during the time period when the target object executes the target action. Based on the second action data, a second animation curve of the target reference point is obtained; the second animation curve may be used to indicate the movement of the reference point during the time period when the virtual object executes the target action according to the action template. The first animation curve and the second animation curve are mixed according to a spherical blending algorithm to obtain a third animation curve, which may be used to indicate the movement of the reference point during the time period when the virtual object executes the target action adjusted in combination with the action template.
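Spherical blending of animation curves typically operates on joint rotations represented as quaternions. The following is a standard slerp (spherical linear interpolation) sketch — an assumed concrete form, since the embodiment does not spell out its blending formula:

```python
# Standard quaternion slerp: interpolates between a captured rotation
# and the template rotation at a specified weight t per frame.
# Quaternions are (w, x, y, z) tuples; values here are illustrative.

import math

def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions q0, q1."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:                    # take the shorter arc
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:                 # nearly parallel: lerp + renormalize
        mixed = tuple((1 - t) * a + t * b for a, b in zip(q0, q1))
        norm = math.sqrt(sum(c * c for c in mixed))
        return tuple(c / norm for c in mixed)
    theta = math.acos(dot)
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return tuple(s0 * a + s1 * b for a, b in zip(q0, q1))

identity = (1.0, 0.0, 0.0, 0.0)                                   # no rotation
quarter_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))  # 90 deg about z
mid = slerp(identity, quarter_z, 0.5)                              # 45 deg about z
print(tuple(round(c, 4) for c in mid))
```

Applying slerp frame by frame over the two animation curves, with the weight per frame as the blending weight, yields the blended third curve.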
- third action data is generated based on the first action data and the second action data.
- the action to be adjusted may be a target action whose similarity is less than a specified threshold after frequency domain similarity calculation is performed between the first action data of the target action and the second action data of the target action.
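The frequency-domain similarity check can be sketched as follows. A plain DFT and a cosine-similarity threshold of 0.9 are illustrative stand-ins for the FFT autoencoder and the unspecified threshold in the text:

```python
# Hedged sketch of the anomaly-detection path: extract a frequency-domain
# signature of a periodic pose curve with a DFT, compare it to the
# template's signature by cosine similarity, and flag the action as
# abnormal when the similarity falls below a threshold.

import math

def spectrum(signal, k_max=4):
    """Magnitudes of the first k_max DFT coefficients (ignoring DC)."""
    n = len(signal)
    mags = []
    for k in range(1, k_max + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = sum(-x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        mags.append(math.hypot(re, im))
    return mags

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def is_abnormal(action_curve, template_curve, threshold=0.9):
    sim = cosine_similarity(spectrum(action_curve), spectrum(template_curve))
    return sim < threshold

n = 32
walk = [math.sin(2 * math.pi * 2 * i / n) for i in range(n)]   # clean gait curve
slow = [math.sin(2 * math.pi * 1 * i / n) for i in range(n)]   # wrong rhythm
print(is_abnormal(walk, walk), is_abnormal(slow, walk))        # prints: False True
```

Only curves flagged abnormal here would go on to the mixing and smoothing steps; actions already close to the template are left untouched, preserving the performer's style.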
- FIG6 is a schematic diagram of an action beautification involved in an embodiment of the present application.
- the target action is walking
- the target object captured by the computing device exhibits an out-toed (splay-footed) posture when performing target action 41; target action 41 is beautified through the above-mentioned action mixing and action smoothing methods to obtain target action 42, in which the knee joint positions have been adjusted inward.
- the beautified action data can adjust the action in dimensions such as body posture, gait, gesture, and artistic movement.
- S104 Generate an animation video according to the third action data.
- a video clip of the virtual object performing the beautified target action can be generated from the third action data. Since the target object can perform multiple actions during the movement process, there may be multiple target actions that need to be beautified; therefore, the video clips generated for each target action need to be spliced to generate a complete animation video showing the movement process of the virtualized target object, which includes the virtual object performing each beautified target action.
- a mixed animation video is generated according to the third motion data; and the animation video is obtained by performing low-pass filtering on the mixed animation video.
- skeleton skinning rendering may be performed according to the third action data to generate an animation video.
- the embodiment of the present application obtains the first action data of the target object performing the target action in reality, and the second action data of the virtual object performing the target action according to the action template, and modifies the first action data in combination with the second action data to obtain the third action data adjusted to the action template, so that the beautified animation video can be generated by rendering according to the third action data.
- the action performed by the virtual object is beautified while ensuring that the style of the action performed by the object is retained, and the flexibility of the animation generated after the target action performed by the target object is virtualized is improved.
- Fig. 7 shows an architecture diagram of a virtual object motion generation system provided by an exemplary embodiment of the present application.
- the virtual object motion generation system includes a motion data processing module 310 , a motion beautification algorithm module 320 , a motion asset library module 330 and a visualization module 340 .
- the motion data processing module 310 may include a motion segmentation submodule 311 and a motion classification submodule 312 .
- the action segmentation submodule 311 is used to obtain the speed of each joint point from the first action data of the target object performing the target action, and to divide the target object's motion process into action segments in time according to cues such as speed changes, phase changes, or positional conditions such as a foot joint touching the ground or a palm joint hanging down.
- the action classification submodule 312 is used to determine the action type in the action segment according to the velocity information or angular velocity information of the key skeletal joints.
- the motion beautification algorithm module 320 may include a motion search algorithm submodule 321 , a motion mixing submodule 322 , a motion smoothing submodule 323 and a problem segment identification submodule 324 .
- the action search algorithm submodule 321 can be used to search the action asset library for a set of clips whose features are closest to the acquired action clip of the target action of the target object, and to sort these closest action clips.
- the action clips in the action asset library can be action templates.
- the action mixing submodule 322 may be configured to generate a mixed action segment using a spherical mixing algorithm according to a specified weight based on the action segment of the target action and the action segment corresponding to the action template closest to the action segment of the target action.
- the motion smoothing submodule 323 may be used to smooth the mixed motion clip based on a low-pass filtering algorithm to generate a smoothed motion clip having the same duration as the mixed motion clip.
- the problem segment identification submodule 324 can be used to determine the differences in rotation information, angular velocity information and position information of each joint point between the acquired action segment of the target action of the target object and the corresponding action segment of the action template, and to determine whether the differences are too large.
- the action asset library module 330 may include a human skeleton asset library submodule 331 , a standard animation asset library submodule 332 , a stylized animation asset submodule 333 and an open interface.
- the human skeleton asset library submodule 331 includes various human skeleton models such as male, female, old, young, tall, short, fat, and thin.
- the standard animation asset library submodule 332 stores standard animations corresponding to the models in the human skeleton asset library submodule 331, including animation clips and feature clips.
- the stylized animation asset submodule 333 stores the animation curves and frequency-domain features of joints such as the root, foot, wrist, elbow, and knee, extracted from the standard animation asset library submodule 332.
- the visualization module 340 may include a skeleton rendering submodule 341 and a skin rendering submodule 342.
- the skeleton rendering submodule 341 and the skin rendering submodule 342 can be used to perform animation rendering of virtual objects.
- the embodiment of the present application obtains the first action data of the target object performing the target action in reality, and the second action data of the virtual object performing the target action according to the action template, and modifies the first action data in combination with the second action data to obtain the third action data adjusted to the action template, so that the beautified animation video can be generated by rendering according to the third action data.
- the action performed by the virtual object is beautified while ensuring that the style of the action performed by the object is retained, and the flexibility of the animation generated after the target action performed by the target object is virtualized is improved.
- the action generation device of the virtual object includes at least one of the hardware structure and software modules corresponding to the execution of each function. It should be easily appreciated by those skilled in the art that, in combination with the units and algorithm steps of each example described in the embodiments disclosed herein, the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is executed in the form of hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Professional and technical personnel can use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of this application.
- the embodiment of the present application can divide the action generation device of the virtual object into functional units according to the above method example.
- each functional unit can be divided corresponding to each function, or two or more functions can be integrated into one processing unit.
- the above integrated unit can be implemented in the form of hardware or in the form of software functional units. It should be noted that the division of units in the embodiment of the present application is schematic and is only a logical functional division. There may be other division methods in actual implementation.
- FIG8 shows a schematic diagram of the structure of a virtual object motion generation device 500 provided by an exemplary embodiment of the present application.
- the virtual object motion generation device 500 is applied to a computing device, or the virtual object motion generation device 500 can be a computing device.
- the virtual object motion generation device 500 includes:
- the acquisition module 510 is used to acquire first action data, where the first action data is used to indicate the movement of the reference point while the target object performs the target action; and to acquire second action data, where the second action data is used to indicate the movement of the reference point while the virtual object performs the target action according to the action template; the virtual object is used to represent the target object in the virtual space;
- the processing module 520 is used to generate third motion data based on the first motion data and the second motion data; the third motion data is used to indicate the movement of the reference point after the target motion is adjusted in combination with the motion template;
- the generation module 530 is used to generate an animation video according to the third action data; the animation video includes a segment of the virtual object performing the adjusted target action.
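As a toy illustration only, the three modules of device 500 might be wired together as follows; all method names and internals here are invented stand-ins, with a simple weighted blend standing in for the processing step and string frames standing in for rendering:

```python
# Toy sketch of the module wiring of device 500 (module names from the
# text; all internals are hypothetical placeholders).

class AcquisitionModule:
    def first_action_data(self, video_frames):
        # stand-in for motion capture: here the frames pass through as-is
        return list(video_frames)

    def second_action_data(self, template):
        return list(template)

class ProcessingModule:
    def third_action_data(self, first, second, weight=0.5):
        # blend capture with template at a specified weight (S103)
        return [(1 - weight) * a + weight * b for a, b in zip(first, second)]

class GenerationModule:
    def animation_video(self, third):
        # stand-in for skeleton skinning rendering (S104)
        return [f"frame({round(x, 2)})" for x in third]

acq, proc, gen = AcquisitionModule(), ProcessingModule(), GenerationModule()
first = acq.first_action_data([0.0, 1.0, 0.5])
second = acq.second_action_data([0.0, 0.5, 0.5])
video = gen.animation_video(proc.third_action_data(first, second))
print(video)   # → ['frame(0.0)', 'frame(0.75)', 'frame(0.5)']
```

The split mirrors the division of labor stated above: acquisition produces the first and second action data, processing produces the third, and generation renders the animation video.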
- the acquisition module 510 may be used to execute S101 and S102 as shown in FIG. 4
- the processing module 520 may be used to execute S103 as shown in FIG. 4
- the generation module 530 may be used to execute S104 as shown in FIG. 4 .
- the acquisition module 510 is further used to, in response to a received selection operation, determine an action template corresponding to the target action from at least one action template; and acquire the second action data of the action template corresponding to the target action.
- the processing module 520 is also used to extract the first action feature based on the first action data; extract the second action feature based on the second action data; and mix the first action feature and the second action feature according to specified weights to generate the third action data.
- the processing module 520 is further configured to, if the target action is an action to be adjusted, generate the third action data based on the first action data and the second action data.
- the action to be adjusted is the target action whose similarity is less than a specified threshold after frequency domain similarity calculation is performed on the first action data of the target action and the second action data of the target action.
- the generating module 530 is further configured to generate a mixed animation video according to the third motion data
- the animation video is obtained by performing low-pass filtering on the mixed animation video.
- the acquisition module 510 is further used to acquire an action video; the action video contains a segment of the target object performing the target action; and the first action data is acquired by performing motion capture on the target object in the action video.
- the acquisition module 510 is also used to determine the action type of the target action based on the first action data; determine the action template of the target action according to the action type of the target action; and acquire the action data of the action template of the target action as the second action data.
- the acquisition module 510 is also used to determine at least two action templates corresponding to the action type of the target action according to the action type of the target action; and in response to a received selection operation, determine the action template of the target action from the at least two action templates.
- the acquisition module 510, the processing module 520 and the generation module 530 can all be implemented by software, or can be implemented by hardware. Exemplarily, the following takes the acquisition module 510 as an example to introduce the implementation of the acquisition module 510. Similarly, the implementation of the processing module 520 and the generation module 530 can refer to the implementation of the acquisition module 510.
- the acquisition module 510 may include code running on a computing instance.
- the computing instance may include at least one of a physical host (computing device), a virtual machine, and a container; further, there may be one or more such computing instances.
- the acquisition module 510 may include code running on multiple hosts/virtual machines/containers. It should be noted that the multiple hosts/virtual machines/containers used to run the code may be distributed in the same region (region) or in different regions.
- the multiple hosts/virtual machines/containers used to run the code may be distributed in the same availability zone (AZ) or in different AZs, where each AZ includes one data center or multiple data centers that are geographically close; typically, a region may include multiple AZs.
- multiple hosts/virtual machines/containers used to run the code can be distributed in the same virtual private cloud (VPC) or in multiple VPCs.
- a VPC is set up in a region.
- a communication gateway must be set up in each VPC to achieve interconnection between VPCs through the communication gateway.
- the acquisition module 510 may include at least one computing device, such as a server, etc.
- the acquisition module 510 may also be a device implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD).
- the PLD may be a complex programmable logical device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL) or any combination thereof.
- the multiple computing devices included in the acquisition module 510 can be distributed in the same region or in different regions.
- the multiple computing devices included in the acquisition module 510 can be distributed in the same AZ or in different AZs.
- the multiple computing devices included in the acquisition module 510 can be distributed in the same VPC or in multiple VPCs.
- the multiple computing devices can be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
- the acquisition module 510 can be used to execute any step in the action generation method of the virtual object
- the processing module 520 can be used to execute any step in the action generation method of the virtual object
- the generation module 530 can be used to execute any step in the action generation method of the virtual object.
- the steps that the acquisition module 510, the processing module 520, and the generation module 530 are responsible for implementing can be specified as needed, and the acquisition module 510, the processing module 520, and the generation module 530 respectively implement different steps in the action generation method of the virtual object to realize the full functions of the action generation device of the virtual object.
- the present application also provides a computing device 100.
- the computing device 100 includes: a bus 102, a processor 104, a memory 106, and a communication interface 108.
- the processor 104, the memory 106, and the communication interface 108 communicate with each other through the bus 102.
- the computing device 100 can be a server or a terminal device. It should be understood that the present application does not limit the number of processors and memories in the computing device 100.
- the bus 102 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
- the bus may be divided into an address bus, a data bus, a control bus, etc.
- for ease of representation, the bus is represented by only one line in FIG. 9, but this does not mean that there is only one bus or one type of bus.
- the bus 102 may include a path for transmitting information between various components of the computing device 100 (e.g., the memory 106, the processor 104, the communication interface 108).
- the processor 104 may include any one or more processors such as a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP) or a digital signal processor (DSP).
- the memory 106 may include a volatile memory, such as a random access memory (RAM).
- the memory 106 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid state drive (SSD).
- the memory 106 stores executable program codes, and the processor 104 executes the executable program codes to respectively implement the functions of the aforementioned acquisition module 510, processing module 520, and generation module 530, thereby implementing the action generation method of the virtual object. That is, the memory 106 stores instructions for executing the action generation method of the virtual object.
- the memory 106 stores executable codes
- the processor 104 executes the executable codes to respectively implement the functions of the aforementioned virtual object motion generation device, thereby implementing the virtual object motion generation method. That is, the memory 106 stores instructions for executing the virtual object motion generation method.
- the communication interface 108 uses a transceiver module such as, but not limited to, a network interface card or a transceiver to implement communication between the computing device 100 and other devices or a communication network.
- the computing device cluster includes at least one computing device 100.
- the memory 106 in one or more computing devices 100 in the computing device cluster may store the same instructions for executing the method for generating actions of a virtual object.
- the memory 106 in different computing devices 100 in the computing device cluster may store different instructions, which are respectively used to execute part of the functions of the virtual object action generation device. That is, the instructions stored in the memory 106 in different computing devices 100 may implement the functions of one or more modules among the acquisition module 510, the processing module 520 and the generation module 530.
- one or more computing devices in the computing device cluster can be connected via a network.
- the network can be a wide area network or a local area network, etc.
- FIG. 11 shows a possible implementation. As shown in FIG. 11 , two computing devices 100A and 100B are connected via a network. Specifically, the network is connected via a communication interface in each computing device.
- the memory 106 in the computing device 100A stores instructions for executing the functions of the acquisition module 510.
- the memory 106 in the computing device 100B stores instructions for executing the functions of the processing module 520 and the generation module 530.
- the connection mode of the computing device cluster shown in FIG. 11 takes into account that the action generation method of the virtual object provided in this application requires a large amount of data storage and computation, so the functions implemented by the processing module 520 and the generation module 530 are handed over to the computing device 100B for execution.
- the functions of the computing device 100A shown in FIG11 may also be completed by multiple computing devices 100.
- the functions of the computing device 100B may also be completed by multiple computing devices 100.
- the embodiment of the present application also provides another computing device cluster.
- the connection relationship between the computing devices in the computing device cluster can be similar to the connection mode of the computing device cluster described in Figures 10 and 11.
- the difference is that the memory 106 in one or more computing devices 100 in the computing device cluster can store the same instructions for executing the action generation method of the virtual object.
- the memory 106 of one or more computing devices 100 in the computing device cluster may also store partial instructions for executing the method for generating actions of a virtual object.
- the combination of one or more computing devices 100 may jointly execute instructions for executing the method for generating actions of a virtual object.
- the memory 106 in different computing devices 100 in the computing device cluster may store different instructions for executing part of the functions of the virtual object motion generation system. That is, the instructions stored in the memory 106 in different computing devices 100 may implement the functions of one or more devices in the virtual object motion generation device.
- the embodiment of the present application also provides a computer program product including instructions.
- the computer program product may be software or a program product including instructions that can be run on a computing device or stored in any available medium.
- when the instructions are run on at least one computing device, the at least one computing device is caused to execute the method for generating an action of a virtual object.
- the embodiment of the present application also provides a computer-readable storage medium.
- the computer-readable storage medium can be any available medium that a computing device can store, or a data storage device, such as a data center, containing one or more available media.
- the available medium can be a magnetic medium (e.g., a floppy disk, a hard disk, a tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state hard disk).
- the computer-readable storage medium includes instructions that instruct the computing device to execute the action generation method of the virtual object described above.
Description
This application claims priority to the Chinese patent application filed with the State Intellectual Property Office on July 17, 2023, with application number 202310876415.4 and application name "A motion beautification method, device and computing device cluster", the entire contents of which are incorporated by reference in this application.
This application claims priority to the Chinese patent application filed with the State Intellectual Property Office on November 16, 2023, with application number 202311544502.6 and application name "Virtual object action generation method, device, cluster, medium and program product", the entire contents of which are incorporated by reference in this application.
The present application relates to the field of motion capture technology, and in particular to a method, apparatus, cluster, medium and program product for generating motions of virtual objects.
With the rapid development of virtual digital space products such as animation, film and television, games, and the metaverse, more and more scenarios require displaying, in a virtual space, the process of a virtual object performing body movements. For the displayed process to approach or even surpass the real object's performance of those movements in reality, the technology for capturing and generating the body movements of virtual objects must be continuously improved.
Currently, motion capture technology can be used to obtain the process of a real object performing body movements: different sensors collect the object's motion information in real space, algorithms convert the captured motion information into animation data for a virtual skeleton, and the virtual object's performance of the movements can then be generated from that animation data, thereby reproducing the actions performed by the real object.
In the above related technology, motion capture can reproduce a real object's body movements fairly accurately. However, displaying a virtual object's body movements in a virtual space may require beautifying the movements on top of that faithful reproduction, so directly displaying, in the virtual space, a virtual character performing the movements captured by motion capture technology suffers from poor flexibility.
Summary of the Invention
Embodiments of this application provide a method, apparatus, cluster, medium and program product for generating actions of a virtual object, which beautify the actions performed by the virtual object while preserving the style of the actions performed by the real object, thereby improving the flexibility of generating the virtual object's actions.
In a first aspect, this application provides a method for generating actions of a virtual object, the method comprising: obtaining first action data, where the first action data indicates the joint movements of a target object while performing a target action; obtaining second action data, where the second action data indicates the joint movements of a virtual object while performing the target action according to an action template, the virtual object representing the target object in a virtual space; generating third action data based on the first action data and the second action data, where the third action data indicates the joint movements after the target action has been adjusted in combination with the action template; and generating an animation video according to the third action data, where the animation video contains a clip of the virtual object performing the adjusted target action.
It can be understood that by obtaining the first action data of the target object performing the target action in reality and the second action data of the virtual object performing the target action according to the action template, and modifying the first action data in combination with the second action data, third action data adjusted toward the action template is obtained, so that rendering according to the third action data produces a beautified animation video. Automatically adjusting the target action according to the action template beautifies the action performed by the virtual object while ensuring that the style of the action performed by the object is preserved, and improves the flexibility of the animation generated by virtualizing the target action performed by the target object.
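Purely as an illustration of the four steps of the first aspect (none of the helper names below appear in this application, and a real implementation operates on per-joint skeletal rotation data rather than scalar values), the pipeline can be sketched as:

```python
# Toy sketch of the first-aspect pipeline. All helpers are hypothetical
# stand-ins invented for illustration.

def blend_action_data(first, second, weight=0.5):
    """Adjust captured per-frame values toward the template (third action data)."""
    return [(1.0 - weight) * a + weight * b for a, b in zip(first, second)]

def render_animation(third):
    """Stand-in renderer: one 'frame' per adjusted value."""
    return ["frame(%.2f)" % v for v in third]

first_action_data = [10.0, 20.0, 30.0]    # captured joint angles per frame
second_action_data = [12.0, 18.0, 34.0]   # template joint angles per frame
third_action_data = blend_action_data(first_action_data, second_action_data)
video = render_animation(third_action_data)
print(video)  # ['frame(11.00)', 'frame(19.00)', 'frame(32.00)']
```

With `weight=0.5` each output frame sits halfway between the captured value and the template value, which is the sense in which the captured action is "adjusted toward" the template.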
In a possible implementation, generating the third action data based on the first action data and the second action data includes: extracting first action features based on the first action data; extracting second action features based on the second action data; and mixing the first action features and the second action features according to a specified weight to generate the third action data.
It can be understood that by extracting features from the first action data and the second action data respectively, and then mixing the features according to the specified weight, third action data with mixed features is obtained, thereby adjusting the first action data in combination with the second action data.
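A minimal sketch of the weighted feature mixing described above, assuming the extracted features are plain numeric vectors; `mix_features` is a hypothetical helper, not this application's actual algorithm:

```python
def mix_features(first_features, second_features, weight):
    """Linearly blend two equal-length feature vectors.

    weight = 0.0 keeps the captured motion's features unchanged;
    weight = 1.0 fully adopts the template motion's features.
    """
    if len(first_features) != len(second_features):
        raise ValueError("feature vectors must have the same length")
    return [(1.0 - weight) * a + weight * b
            for a, b in zip(first_features, second_features)]

captured = [0.2, 0.5, 0.9]   # features extracted from the first action data
template = [0.4, 0.1, 0.7]   # features extracted from the second action data
mixed = mix_features(captured, template, weight=0.5)
```

The single `weight` parameter is the "specified weight" of the implementation above; a per-joint or per-frame weight vector would be a straightforward extension.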
In a possible implementation, generating the third action data based on the first action data and the second action data includes: obtaining, based on the first action data, a first animation curve of a target reference point, where the target reference point is a joint used to perform the target action and the first animation curve indicates the movement of the reference point over the period during which the target object performs the target action; obtaining, based on the second action data, a second animation curve of the target reference point, where the second animation curve indicates the movement of the reference point over the period during which the virtual object performs the target action according to the action template; and blending the first animation curve and the second animation curve according to a spherical blending algorithm to obtain a third animation curve, where the third animation curve indicates the movement of the reference point over the period during which the virtual object performs the target action adjusted in combination with the action template.
It can be understood that by converting the first action data and the second action data into animation curves to obtain the first animation curve and the second animation curve, and blending the two curves with the spherical blending algorithm, the third animation curve is obtained, thereby adjusting the first action data in combination with the second action data.
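Spherical blending of rotation curves is commonly realized as quaternion slerp applied per keyframe; the following is an illustrative sketch under that assumption (the function names are invented, and this is not necessarily the exact algorithm of this application):

```python
import math

def slerp(q0, q1, t):
    """Spherical linear interpolation between two unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:                      # take the shorter arc
        q1, dot = [-c for c in q1], -dot
    if dot > 0.9995:                   # nearly parallel: normalized lerp
        out = [(1.0 - t) * a + t * b for a, b in zip(q0, q1)]
        norm = math.sqrt(sum(c * c for c in out))
        return [c / norm for c in out]
    theta = math.acos(dot)
    s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(q0, q1)]

def blend_curves(curve_a, curve_b, t):
    """Blend two per-frame quaternion curves of one joint into a third curve."""
    return [slerp(qa, qb, t) for qa, qb in zip(curve_a, curve_b)]
```

Unlike component-wise linear blending, slerp keeps each blended quaternion on the unit sphere, so the blended curve stays a valid rotation curve for the joint.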
In a possible implementation, generating the third action data based on the first action data and the second action data includes: if the target action is an action to be adjusted, generating the third action data based on the first action data and the second action data. An action to be adjusted is a target action for which the similarity, computed in the frequency domain between the first action data and the second action data of the target action, is less than a specified threshold.
It can be understood that a frequency-domain similarity is computed between the first action data of the target action and the second action data corresponding to its action template. If the resulting similarity is less than the specified threshold, it can be determined that the first action data of the target action needs to be adjusted, so the target action is determined to be an action to be adjusted. After that determination, the first action data is adjusted in combination with the second action data to generate the adjusted third action data, thereby adjusting the first action data in combination with the second action data.
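One way to realize the frequency-domain similarity test described above is to compare the magnitude spectra of the two joint-value sequences; the helper names and the choice of cosine similarity below are assumptions made for illustration only:

```python
import cmath

def dft_magnitudes(signal):
    """Magnitude spectrum of a real-valued sequence via a direct DFT."""
    n = len(signal)
    return [abs(sum(signal[k] * cmath.exp(-2j * cmath.pi * f * k / n)
                    for k in range(n)))
            for f in range(n)]

def frequency_similarity(seq_a, seq_b):
    """Cosine similarity between the magnitude spectra of two sequences."""
    ma, mb = dft_magnitudes(seq_a), dft_magnitudes(seq_b)
    dot = sum(a * b for a, b in zip(ma, mb))
    na = sum(a * a for a in ma) ** 0.5
    nb = sum(b * b for b in mb) ** 0.5
    return dot / (na * nb)

def needs_adjustment(seq_a, seq_b, threshold=0.9):
    """The target action is 'to be adjusted' when similarity < threshold."""
    return frequency_similarity(seq_a, seq_b) < threshold
```

The direct DFT is O(n^2) and only meant to keep the sketch dependency-free; a production system would use an FFT routine, and the threshold value is application-specific.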
In a possible implementation, generating the animation video according to the third action data includes: generating a blended animation video according to the third action data; and obtaining the animation video by applying low-pass filtering to the blended animation video.
It can be understood that, to smooth the motion in the blended animation video generated from the third action data (obtained by adjusting the first action data in combination with the second action data), the blended animation data can be passed through a low-pass filtering algorithm to smooth the motion in the animation, yielding a smooth animation video.
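The low-pass filtering step can be illustrated with a simple one-pole (exponential) smoother applied per animation channel; the actual filter design used by a real system may differ:

```python
def low_pass(frames, alpha=0.3):
    """One-pole low-pass filter over a sequence of per-frame channel values.

    Smaller alpha -> heavier smoothing. Applied to each animation channel,
    this damps frame-to-frame jitter introduced by blending.
    """
    if not frames:
        return []
    smoothed = [frames[0]]
    for value in frames[1:]:
        smoothed.append(smoothed[-1] + alpha * (value - smoothed[-1]))
    return smoothed

jittery = [0.0, 1.0, 0.0, 1.0, 0.0]       # alternating per-frame values
smooth = low_pass(jittery, alpha=0.5)     # swings noticeably less
```

A zero-phase (forward-backward) pass or a higher-order filter would avoid the slight lag a one-pole smoother introduces, at the cost of extra latency or computation.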
In a possible implementation, obtaining the first action data includes: obtaining the first action data by performing motion capture on the target object.
It can be understood that motion capture technology can directly collect the first action data of the target object performing the target action, enabling the first action data to be subsequently adjusted in combination with the second action data to obtain the beautified third action data.
In a possible implementation, obtaining the first action data includes: obtaining an action video, where the action video contains a clip of the target object performing the target action; and obtaining the first action data by performing motion capture on the target object in the action video.
It can be understood that an action video of the target object performing the target action can be obtained, and the first action data can be extracted by performing motion capture on the target object's performance in the video, again enabling the first action data to be subsequently adjusted in combination with the second action data to obtain the beautified third action data.
In a possible implementation, obtaining the second action data includes: in response to a received selection operation, determining the action template corresponding to the target action from at least one action template; and obtaining the second action data of that action template.
It can be understood that after the first action data is obtained, the action template corresponding to the target action can be determined from the at least one action template directly according to the received selection operation, and the second action data of that template obtained. Allowing the user to freely choose the action template corresponding to the second action data makes it easy for the user to choose the direction in which the target action is adjusted.
In a possible implementation, obtaining the second action data includes: determining the action type of the target action based on the first action data; determining the action template corresponding to the target action according to its action type; and obtaining the second action data of that action template.
It can be understood that after the first action data is obtained, the action type of the target action can be identified automatically, the action template corresponding to that type determined, and the second action data of that template obtained.
In a possible implementation, if each action type corresponds to at least two action templates, determining the action template of the target action according to its action type includes: determining, according to the action type of the target action, the at least two action templates corresponding to that type; and determining the action template of the target action from the at least two action templates in response to a received selection operation.
It can be understood that after the action type of the target action is determined, the action template corresponding to that type can be obtained automatically; if the type corresponds to at least two action templates, the user can pick one of them through a selection operation, making it easy for the user to choose the direction in which the target action is adjusted.
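The template selection described in the last few implementations can be sketched as a lookup over a hypothetical template registry; every name and data value below is invented for illustration:

```python
# Hypothetical registry: action type -> template name -> second action data
# (here just placeholder per-frame values).
TEMPLATE_LIBRARY = {
    "wave": {"standard_wave": [10.0, 20.0, 15.0]},
    "jump": {"standard_jump": [0.0, 30.0, 45.0],
             "stylized_jump": [0.0, 35.0, 50.0]},
}

def get_second_action_data(action_type, selection=None):
    """Return (template_name, data) for an action type.

    When the type maps to several templates, `selection` (the template
    name picked by the user's selection operation) chooses one;
    otherwise the single template is used automatically.
    """
    templates = TEMPLATE_LIBRARY[action_type]
    if len(templates) == 1:
        name = next(iter(templates))
    elif selection in templates:
        name = selection
    else:
        raise ValueError("multiple templates: a selection operation is required")
    return name, templates[name]
```

This mirrors the two flows above: a type with one template resolves automatically, while a type with several templates defers to the user's selection operation.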
In a second aspect, an embodiment of this application provides an apparatus for generating actions of a virtual object, the apparatus being configured to perform any of the action generation methods provided in the first aspect.
In a possible implementation, the apparatus can be divided into functional modules according to the method provided in the first aspect. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. Exemplarily, the apparatus may be divided by function into an acquisition module, a processing module, a generation module, and so on. For descriptions of the technical solutions executed by these functional modules and their beneficial effects, reference may be made to the first aspect or its corresponding possible implementations; details are not repeated here.
In a third aspect, an embodiment of this application provides a computing device comprising a processor and a memory coupled to the processor; the memory stores computer instructions that are loaded and executed by the processor to cause the computing device to implement the method for generating actions of a virtual object described in the above aspects.
In a fourth aspect, an embodiment of this application provides a computing device cluster comprising at least one computing device, each computing device comprising a processor and a memory; the processor of the at least one computing device executes instructions stored in the memory of the at least one computing device, causing the cluster to perform the action generation methods provided in the optional implementations of the first aspect.
In a fifth aspect, an embodiment of this application provides a computer-readable storage medium storing at least one computer program instruction, which is loaded and executed by a processor to implement the method for generating actions of a virtual object described in the above aspects.
In a sixth aspect, an embodiment of this application provides a computer program product comprising computer instructions stored in a computer-readable storage medium. A processor of a computing device reads the computer instructions from the storage medium and executes them, causing the computing device to perform the action generation methods provided in the optional implementations of the first aspect.
For specific descriptions of the second to sixth aspects and their implementations, reference may be made to the detailed description of the first aspect and its implementations; likewise, for their beneficial effects, reference may be made to the analysis of beneficial effects in the first aspect. Details are not repeated here.
These and other aspects of this application will become clearer from the following description.
Fig. 1 is a schematic diagram of a scene of generating actions of a virtual object according to an exemplary embodiment;
Fig. 2 is a schematic diagram of a motion capture device in the embodiment shown in Fig. 1 collecting motion data of a target object;
Fig. 3 is a schematic diagram of the hardware structure of a computing device according to an exemplary embodiment;
Fig. 4 is a schematic flowchart of a method for generating actions of a virtual object according to an exemplary embodiment;
Fig. 5 is a flowchart of anomaly detection on action sequence data in the embodiment shown in Fig. 4;
Fig. 6 is a schematic diagram of action beautification in the embodiment shown in Fig. 4;
Fig. 7 is an architecture diagram of a system for generating actions of a virtual object according to an exemplary embodiment;
Fig. 8 is a schematic structural diagram of an apparatus for generating actions of a virtual object according to an exemplary embodiment;
Fig. 9 is a schematic diagram of a computing device according to an exemplary embodiment;
Fig. 10 is a schematic diagram of a computing device cluster according to an exemplary embodiment;
Fig. 11 is a schematic diagram of a connection manner between computing devices in a cluster according to an exemplary embodiment.
To make the objectives, technical solutions and advantages of this application clearer, the implementations of this application are described in further detail below with reference to the accompanying drawings.
"Multiple" herein means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the associated objects.
In the description of this application, unless otherwise specified, "multiple" means two or more. "At least one of the following" or similar expressions refers to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, or c may mean: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b and c may each be single or multiple.
In addition, to describe the technical solutions of the embodiments clearly, the words "first", "second" and so on are used to distinguish identical or similar items with substantially the same functions and effects. Those skilled in the art will understand that these words limit neither quantity nor execution order, nor do they imply that the items are necessarily different. Meanwhile, words such as "exemplary" or "for example" indicate an example, illustration or explanation; any embodiment or design described as "exemplary" or "for example" should not be interpreted as preferable to or more advantageous than other embodiments or designs. Rather, such words are intended to present related concepts concretely for ease of understanding.
First, the application scenarios of the embodiments of this application are introduced by way of example.
Through virtual digital space products such as animation, film and television, games and the metaverse, the process of an object performing actions in real space can be mapped into a virtual space and displayed as an animation video. By constructing a virtual object in the virtual space as a digital avatar of the object in real space, the actions performed by the virtual object can approach or even surpass those performed by the real object. The body movements of the object in real space can be obtained through motion capture technology.
Motion capture technology may collect the action data of an object in real space through devices such as optical sensors, inertial sensors and cameras; the object's actions can then be reproduced and displayed in the virtual space, and skeletal animation data can be generated from the collected action data. A skeleton may be a data structure consisting of virtual joint rotation and position information that is not displayed inside the virtual object's model.
At present, after the process of a real object performing an action is obtained through motion capture, the corresponding virtual object directly performs that action in the virtual space. Although this reproduces the real object's actions fairly accurately, in practice, especially in virtual-human scenarios where a virtual object is built for a real person, most users perceive their own movements as more graceful than they actually are. As a result, when the virtual human's movements faithfully reproduce the real person's, users reviewing their virtual human tend to be unsatisfied with how graceful the displayed movements look. In other words, users expect the displayed virtual-human movements to retain their personal style while also having a certain aesthetic quality, whereas movements reproduced directly by a motion capture device are highly realistic but insufficiently graceful. Therefore, after a real person's action data is collected through motion capture, the virtual human's movements can be beautified by action editing according to manual experience.
Action editing may mean that technical artists use professional software tools to edit the rotation and position information of the skeleton to form an animation. Action editing falls into two kinds: editing from scratch, one video frame at a time; or modifying on the basis of action data collected through motion capture. The professional software tools may be digital content creation (DCC) tools.
Editing action data with professional software tools makes it possible to edit and beautify the data collected by motion capture. However, manual editing is inefficient, and the degree of beautification depends on the experience and aesthetic judgment of the technical artists, so the quality of the edited actions cannot be guaranteed.
In addition to manually editing motion-captured data with professional software tools, actions of a virtual object can also be generated automatically from instructions, text or video input through action generation technology.
Action generation technology may quickly generate actions of a virtual object from multimodal instructions, based on an existing action library and machine learning.
That is, a database containing various kinds of actions is built in advance as an action library; action feature information is then extracted from the library, or an action prediction model is trained on it. At runtime, matching future actions can be looked up in the action library according to the user's instructions and the object's current state, or the trained action prediction model can be invoked to predict the actions the virtual object will perform.
In this way, virtual object actions can be generated quickly, improving generation efficiency; however, the generated actions depend on the action data pre-stored in the library and cannot retain the stylistic characteristics of the actions the real object performs.
In view of this, to edit, beautify and display the actions performed by a virtual object while preserving the style of the real object's actions, and thereby improve how graceful the virtual object's actions look, embodiments of this application automatically edit the obtained action data against preset standard action data, obtain the beautified action data, and generate and display an animation video from it. This beautifies the actions performed by the virtual object while ensuring that the style of the object's actions is preserved, and improves the flexibility of generating the virtual object's actions.
其中,图1示出了本申请实施例提供的一种虚拟对象的动作生成的场景示意图。如图1所示,目标对象穿戴动作捕捉设备11,动作捕捉设备11采集目标对象的运动数据,动作捕捉设备11将采集到的运动数据发送给计算设备10,计算设备10通过对采集到的运动数据进行自动编辑处理,根据自动编辑处理后的运动数据可以生成虚拟对象执行美化后的动作的动画视频,并且将动作美化后的动画视频通过动画视频显示界面12进行展示。Among them, Figure 1 shows a scene schematic diagram of a virtual object action generation provided by an embodiment of the present application. As shown in Figure 1, the target object wears a motion capture device 11, the motion capture device 11 collects the motion data of the target object, and the motion capture device 11 sends the collected motion data to the computing device 10. The computing device 10 automatically edits the collected motion data, and can generate an animation video of the virtual object performing a beautified action based on the motion data after automatic editing, and the beautified animation video is displayed through the animation video display interface 12.
其中,计算设备10可以是服务器、计算机设备或者终端,该计算设备10可以是具有视频处理以及视频显示功能的设备。The computing device 10 may be a server, a computer device or a terminal, and the computing device 10 may be a device with video processing and video display functions.
计算设备10与动作捕捉设备11之间通信连接,动作捕捉设备11可以向计算设备10传输数据。The computing device 10 is communicatively connected to the motion capture device 11 , and the motion capture device 11 can transmit data to the computing device 10 .
The target object may be any moving object in real space, for example a person or an animal. The virtual object represents the target object in virtual space; it is a three-dimensional model created based on animation skeleton technology. Each virtual object has its own shape, volume, and orientation in the three-dimensional virtual space and occupies part of that space.
For example, FIG. 2 is a schematic diagram of a motion capture device collecting motion data of a target object according to an embodiment of the present application. As shown in FIG. 2, the motion capture device collects motion data of the target object 21 while the target object 21 moves; according to animation skeleton technology, the body skeleton of the target object 21 can be divided by its joint points.
The motion data may include the rotation angles of the joint points, the motion acceleration, the forces on the joint points, and the like. For example, as shown in FIG. 2, the rotation angle of the knee joint point is α, the rotation angle of the elbow joint point is β, the motion acceleration of the target object 21 is 6 m/s², the force on the knee joint point is F, and the force on the sole joint point is F.
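For concreteness, the per-frame motion data enumerated above could be carried in a small per-joint record. This is only an illustrative sketch; the field names and values are assumptions, not part of the embodiment.

```python
from dataclasses import dataclass

@dataclass
class JointSample:
    """One motion-capture sample for a single joint point (illustrative)."""
    joint: str           # e.g. "knee", "elbow", "sole"
    rotation_deg: float  # rotation angle of the joint point
    accel: float         # motion acceleration of the object, m/s^2
    force: float         # force on the joint point, N

# One frame of the data sketched in FIG. 2: knee angle alpha, elbow angle
# beta, acceleration 6 m/s^2, forces F on the knee and sole joint points
# (all numeric values here are made up for the example).
frame = [
    JointSample("knee", rotation_deg=35.0, accel=6.0, force=410.0),
    JointSample("elbow", rotation_deg=80.0, accel=6.0, force=55.0),
    JointSample("sole", rotation_deg=5.0, accel=6.0, force=620.0),
]
print(len(frame))  # → 3
```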
In a possible implementation, the computing device 10 may obtain a video containing the target object and obtain the motion data of the target object by performing motion capture analysis on the video.
That is, in one case the computing device 10 can obtain, via the motion capture device 11, the motion data of the target object collected by the motion capture device 11; in another case the computing device 10 can obtain a video containing the target object and derive the motion data of the target object by analyzing the motion of the target object in the video.
FIG. 3 is a schematic diagram of the hardware structure of a computing device 10 provided by an embodiment of the present application. The computing device 10 may be a server, for example an x86-architecture server such as a blade server, a high-density server, a rack server, or a high-performance server. The server includes a processor 201, a memory 202, and a bus 203, and may further include a display device 204.
The processor 201 may include one or more processing units; for example, it may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors.
The memory 202 is used to store data and includes, but is not limited to, random access memory (RAM), read-only memory (ROM), flash memory, optical storage, a database, and the like.
The memory 202 interacts with the processor 201 and may store action templates corresponding to various action types. Each action type may correspond to one or more action templates, which indicate the execution standard of that action type. In other words, an action performed exactly according to an action template needs no beautification; the greater the difference from the action template, the more beautification the corresponding action requires.
The processor 201, the memory 202, and the display device 204 are typically interconnected via the bus 203, or may be interconnected in other ways.
The display device 204 may be used to present the animation video display interface, so as to show the user the animation video after action beautification.
The processor 201 may be used to compare the motion data of the target object with the action data of the action templates to determine whether each action performed by the target object needs beautification; that is, the processor 201 may identify the actions that need to be beautified. The processor 201 may also be used to beautify the actions performed by the target object; that is, it may query the database for the action template closest to the collected action data of the target object. The processor 201 may further blend the collected action data of the target object with the action template to generate blended motion data, and may smooth the blended motion data.
It should be noted that the application scenarios and system architectures described in the embodiments of the present application are intended to explain the technical solutions of the embodiments more clearly and do not limit them. A person of ordinary skill in the art will appreciate that, as system architectures evolve and new service scenarios emerge, the technical solutions provided in the embodiments of the present application remain applicable to similar technical problems.
For ease of understanding, the method for generating actions of a virtual object provided by the present application is introduced below by way of example with reference to the accompanying drawings. The method is applicable to the computing device 10 shown in FIG. 1 and FIG. 3.
FIG. 4 is a flowchart of a method for generating an action of a virtual object provided by an exemplary embodiment of the present application. The method may be performed by a computing device and includes the following steps:
S101: Acquire first action data.
In an embodiment of the present application, the computing device may acquire first action data, which may indicate the movement of reference points while the target object performs a target action.
The target object may perform the target action in real space, and the first action data may include, for each video frame of the target object performing the target action in real space, the positions of the reference points, their rotation angles, the motion acceleration, and the forces on the reference points.
For example, a reference point may be any point on the skeleton involved in performing the target action, and the reference points may include the joint points of the skeleton.
In a possible implementation, the first action data is obtained by performing motion capture on the target object; alternatively, an action video containing a segment of the target object performing the target action is obtained, and the first action data is obtained by performing motion capture on the target object in the action video.
That is, the computing device may obtain the first action data from the motion capture device, the first action data being the motion data collected by the motion capture device while the target object performs the target action. The motion capture device may send the collected motion data of the target object directly to the computing device for subsequent processing.
In a possible implementation, the movement of the target object may be recorded as a video; the computing device obtains the video recording the movement of the target object and uses motion capture technology to extract the motion data of the target object performing the target action in the video.
That is, the user may capture the motion data of the target object with a camera, an optical sensor, or an inertial sensor; the user may also record a motion video of the target object and send it to the computing device, which parses the video to obtain the motion data corresponding to each type of action performed by the target object.
The target action may be an action of a certain action type performed by a certain body part of the target object during its movement.
For example, if the target object is a person, its body parts may include the limbs, the torso, and the shoulder-neck-head region, and each body part includes multiple joint points.
Different body parts can perform different types of actions.
For example, the upper limbs can wave, swing, punch, elbow, and so on; the lower limbs can walk, run, jump, squat, sit, dance, and so on; the shoulder-neck-head region can shake or nod the head; and the torso can bow, twist at the waist, and so on.
That is, the motion capture device may capture the motion data of each joint point, and the type of action each joint point participates in during a given time period can be determined from how its motion data changes over time.
In a possible implementation, once the motion data of each joint point of the target object during its movement is obtained, each joint point has corresponding motion data at every time point, since the movement proceeds in time sequence. For each joint point, the motion data over the collection period can be obtained; that is, the type of action the joint point participates in during each time period can be determined from its motion data at the successive time points.
For example, key time points can be determined from the speed changes, phase changes, or positions in the motion data of the joint points, and the actions the joint points participate in can be segmented at those key time points.
That is, from the position and ground-contact state of the sole joint points in a specified time period, combined with the rotation of the knee joint points and the speed changes of the target object in that period, it can be determined that the action type performed by the lower limbs of the target object in that period is walking.
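As a sketch of the key-time-point idea above: assuming a per-frame height of the sole joint point above the ground is available, ground-contact events after an airborne phase can mark candidate boundaries between action segments. The contact threshold and data are illustrative assumptions, not the embodiment's actual segmentation rule.

```python
def segment_by_contact(foot_heights, contact_eps=0.02):
    """Split a frame sequence into segments at foot ground-contact events.

    foot_heights: per-frame height of the sole joint point above the ground.
    Returns a list of (start_frame, end_frame) index pairs (illustrative).
    """
    contact = [h <= contact_eps for h in foot_heights]
    boundaries = [0]
    for i in range(1, len(contact)):
        # A new contact after an airborne phase marks a key time point.
        if contact[i] and not contact[i - 1]:
            boundaries.append(i)
    boundaries.append(len(foot_heights))
    return [(boundaries[i], boundaries[i + 1])
            for i in range(len(boundaries) - 1)]

# Two steps: the foot lifts off and touches down twice.
heights = [0.0, 0.0, 0.1, 0.2, 0.1, 0.0, 0.15, 0.05, 0.0, 0.0]
print(segment_by_contact(heights))  # → [(0, 5), (5, 8), (8, 10)]
```

In practice the boundary test would combine several signals (knee rotation, speed change, phase), as the passage describes; foot contact alone is just the simplest case.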
In a possible implementation, if the computing device obtains a motion video containing the movement of the target object, it can determine, from the movement of each joint point of the target object, the key frames in the video at which to segment actions; each video segment delimited by the key frames may include the execution of at least one action type.
That is, from the positions and ground-contact states of the sole joint points of the target object in the motion video, combined with the rotation of the knee joint points and the speed changes of the target object, the key frames at which the target object starts and stops walking can be determined, and the video segment in which the action type performed by the lower limbs is walking can be identified from those key frames.
In the embodiments of the present application, the reference points are the joint points of the skeleton; the process of beautifying the movement of the joint points is only an example and does not limit the method of the embodiments.
S102: Acquire second action data.
In an embodiment of the present application, the computing device may acquire second action data, which may indicate the movement of the reference points while the virtual object performs the target action according to an action template.
In a possible implementation, after the computing device segments, by time point, the actions performed during the movement of the target object according to the acquired first action data and determines the action type of each action, it can obtain the action template corresponding to the determined action type.
In a possible implementation, the action type of the target action is determined based on the first action data; the action template of the target action is determined according to its action type; and the action data of that action template is acquired as the second action data.
In one possible case, if each action type corresponds to at least two action templates, the at least two action templates corresponding to the action type of the target action are determined according to that action type, and in response to a received selection operation the action template of the target action can be determined from among them.
The action templates may be pre-stored in a database and indicate the standard action of each action type.
In a possible implementation, the second action data may be the action data corresponding to an action template under the action type to which the target action belongs.
That is, if each action type corresponds to at least two action templates, after the action type of the target action is determined, the user can select one of them as the action template of the target action, and the second action data corresponding to that template is acquired.
For example, if the action type of the target action is waving, the waving templates may include action template 1 and action template 2, where the waving amplitude of action template 1 is larger than that of action template 2; the user may select either one as the action template of the target action as needed. That is, if the user wants the waving action performed by the target object beautified toward a larger amplitude, the user can select action template 1 and the second action data corresponding to action template 1 is acquired; if the user wants it beautified toward a smaller amplitude, the user can select action template 2 and the second action data corresponding to action template 2 is acquired. In this way, with a simple operation, the user can beautify the target action more accurately toward the desired effect.
In a possible implementation, in response to a received selection operation, the action template corresponding to the target action may be determined from at least one action template, and the second action data of that template is acquired.
That is, after acquiring the first action data, the computing device can present the action templates to the user; the user selects the action template of the target action directly from them via a selection operation, and the second action data corresponding to that template is acquired.
For example, if the target action is a squat, after acquiring the first action data generated when the target object performs the squat, the computing device can present the action templates of the various action types stored in the template library, where each action type may correspond to templates with different effects: waving templates with different effects, running templates with different effects, squatting templates with different effects, and so on. By receiving the user's selection of the squat template with the desired effect, the computing device determines the action template corresponding to the target action and acquires its second action data.
S103: Generate third action data based on the first action data and the second action data.
In an embodiment of the present application, the computing device may modify the first action data in combination with the second action data to generate the third action data.
The third action data may indicate the joint movement after the target action is adjusted in combination with the action template.
In a possible implementation, the first action data and the second action data are blended according to specified weights to obtain blended action data, and the blended action data is then smoothed to obtain the third action data.
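The weighted blending step just described can be sketched as a per-frame interpolation between the captured data and the template data. The sketch below is linear over a single joint-angle channel with an assumed 0.6/0.4 weighting; the embodiment's actual blend (e.g. spherical blending of rotations) may differ.

```python
def blend(first, second, w_template=0.4):
    """Blend first (captured) and second (template) action data per frame.

    first, second: equal-length lists of joint angles (one channel shown).
    w_template: the specified weight given to the template; the remainder
    keeps the captured style. Purely linear, for illustration only.
    """
    assert len(first) == len(second)
    return [(1.0 - w_template) * f + w_template * s
            for f, s in zip(first, second)]

captured = [10.0, 30.0, 50.0]   # captured knee angles (made-up values)
template = [0.0, 20.0, 40.0]    # template knee angles (made-up values)
print(blend(captured, template))  # ≈ [6.0, 26.0, 46.0]
```

Keeping the template weight well below 1.0 is what preserves the performer's style while pulling the motion toward the standard, which is the trade-off the passage describes.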
That is, the features of the target action are blended with the features of the corresponding action template, and the blended features are then low-pass filtered to obtain smoothed features, from which the third action data can be obtained.
In a possible implementation, a first action feature is extracted based on the first action data, a second action feature is extracted based on the second action data, and the first action feature and the second action feature are blended according to specified weights to generate the third action data.
That is, a time-domain to frequency-domain conversion is performed on the first action data to extract its periodic action features; blending the action features yields a third action feature, from which the third action data is obtained.
For example, FIG. 5 is a flowchart of anomaly detection on action sequence data according to an embodiment of the present application. As shown in FIG. 5, the computing device acquires the first action data, which may be pose sequence data (S11); performs periodic curve extraction on the first action data with an FFT autoencoder (S12); computes the frequency-domain similarity between the features extracted from the first action data and the features obtained by periodic curve extraction on the action template (S13); and performs anomaly detection based on the computed similarity (S14). Whether the target action deviates too far from the action template is determined from the magnitude of the similarity: if the similarity between the target action and the action template is less than a specified threshold, the target action is determined to be abnormal (S15). If the target action is abnormal, the target action and the action template can be blended (S16), and the blended action data is then filtered to obtain a smooth curve feature, thereby obtaining the third action data (S17).
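The similarity check of steps S12 to S15 can be sketched with a plain magnitude spectrum and cosine similarity. This is a stand-in under stated assumptions: a naive DFT replaces the FFT autoencoder, cosine similarity stands in for the embodiment's frequency-domain similarity measure, and the threshold value is made up.

```python
import cmath
import math

def magnitude_spectrum(signal):
    """Naive DFT magnitude spectrum (stands in for the FFT step S12)."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]

def freq_similarity(a, b):
    """Cosine similarity of magnitude spectra (stands in for step S13)."""
    sa, sb = magnitude_spectrum(a), magnitude_spectrum(b)
    dot = sum(x * y for x, y in zip(sa, sb))
    na = math.sqrt(sum(x * x for x in sa))
    nb = math.sqrt(sum(x * x for x in sb))
    return dot / (na * nb)

# A captured joint-angle curve vs. a template curve of the same period.
n = 64
template = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
captured = [0.8 * math.sin(2 * math.pi * 4 * t / n) + 0.1 for t in range(n)]
sim = freq_similarity(captured, template)
THRESHOLD = 0.9  # illustrative specified threshold (S14/S15)
print(f"similarity={sim:.3f}, abnormal={sim < THRESHOLD}")
```

Comparing in the frequency domain makes the check insensitive to where the motion starts within its cycle, which is presumably why the flow extracts periodic curves before measuring similarity.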
In a possible implementation, a first animation curve of a target reference point is obtained based on the first action data, where the target reference point may be a reference point used to perform the target action, and the first animation curve may indicate the movement of the reference point during the time period in which the target object performs the target action; a second animation curve of the target reference point is obtained based on the second action data, where the second animation curve may indicate the movement of the reference point during the time period in which the virtual object performs the target action according to the action template; and the first animation curve and the second animation curve are blended by a spherical blending algorithm to obtain a third animation curve, which may indicate the movement of the reference point during the time period in which the virtual object performs the target action adjusted in combination with the action template.
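Spherical blending of two rotation curves is commonly done with quaternion slerp applied per frame; the sketch below assumes that reading. The (w, x, y, z) quaternion convention and the fixed 0.5 blend weight are illustrative assumptions.

```python
import math

def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:            # take the shorter arc
        q1 = tuple(-c for c in q1)
        dot = -dot
    dot = min(1.0, dot)
    theta = math.acos(dot)
    if theta < 1e-6:         # nearly identical rotations: fall back to lerp
        return tuple((1 - t) * a + t * b for a, b in zip(q0, q1))
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return tuple(s0 * a + s1 * b for a, b in zip(q0, q1))

def blend_curves(curve_a, curve_b, t=0.5):
    """Blend two animation curves (lists of per-frame quaternions)."""
    return [slerp(qa, qb, t) for qa, qb in zip(curve_a, curve_b)]

# Identity vs. a 90-degree rotation about Z; t=0.5 gives ~45 degrees.
q_id = (1.0, 0.0, 0.0, 0.0)
q_90z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
mid = blend_curves([q_id], [q_90z])[0]
print(mid)  # w component ≈ cos(pi/8) ≈ 0.924
```

Slerp keeps each blended frame a valid unit rotation at constant angular speed, which is why spherical blending rather than straight linear interpolation is used for joint rotations.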
In a possible implementation, if the target action is an action to be adjusted, the third action data is generated based on the first action data and the second action data.
The action to be adjusted may be a target action whose frequency-domain similarity, computed between its first action data and its second action data, is less than a specified threshold.
For example, FIG. 6 is a schematic diagram of action beautification according to an embodiment of the present application. As shown in FIG. 6, if the target action is walking and the knee joints of the target object have an out-toed posture when performing the target action 41, beautifying the target action 41 by the above action blending and action smoothing methods yields the target action 42, in which the knee joint point positions are adjusted inward. Beautifying the action data can adjust the action along dimensions such as body posture, gait, gesture, and artistic movement.
S104: Generate an animation video according to the third action data.
In an embodiment of the present application, a video segment of the virtual object performing the beautified target action can be generated from the third action data. Since the target object may perform multiple actions during its movement, multiple target actions may need beautification, so the video segments generated for the individual target actions need to be spliced into a complete animation video that presents the movement of the virtualized target object, including the virtual object performing each beautified target action.
In a possible implementation, a blended animation video is generated according to the third action data, and the animation video is obtained by performing low-pass filtering on the blended animation video.
In a possible implementation, skeleton and skinning rendering may be performed according to the third action data to render and generate the animation video.
In summary, the embodiments of the present application acquire the first action data of the target object performing the target action in reality and the second action data of the virtual object performing the target action according to the action template, and modify the first action data in combination with the second action data to obtain third action data adjusted toward the action template, so that rendering according to the third action data generates a beautified animation video. By automatically adjusting the target action according to the action template, the actions performed by the virtual object are beautified while the style of the actions performed by the object is preserved, and the flexibility of the animation generated by virtualizing the target actions performed by the target object is improved.
FIG. 7 is an architecture diagram of a virtual object action generation system provided by an exemplary embodiment of the present application. As shown in FIG. 7, the system includes an action data processing module 310, an action beautification algorithm module 320, an action asset library module 330, and a visualization module 340.
The action data processing module 310 may include an action segmentation submodule 311 and an action classification submodule 312.
The action segmentation submodule 311 is used to segment the movement of the target object in time into action clips according to key frames in the acquired first action data of the target object performing the target action, such as speed changes or phase changes of the joint points, the sole joint points touching the ground, or the palm joint points hanging down.
The action classification submodule 312 is used to determine the action type in each action clip according to the velocity or angular velocity information of the key skeletal joint points.
The action beautification algorithm module 320 may include an action search algorithm submodule 321, an action blending submodule 322, an action smoothing submodule 323, and a problem clip identification submodule 324.
The action search algorithm submodule 321 may be used to search the action asset library for the set of clips whose features are closest to the acquired action clip of the target action of the target object and to rank the closest clips; the action clips in the action asset library may be action templates.
The action blending submodule 322 may be used to generate a blended action clip from the action clip of the target action and the clip corresponding to the action template closest to it, according to specified weights, using a spherical blending algorithm.
The action smoothing submodule 323 may be used to smooth the blended action clip using a low-pass filtering algorithm, generating a smoothed action clip with the same duration as the blended clip.
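The low-pass smoothing performed by this submodule could be as simple as a centered moving average over each joint channel, which preserves the clip duration. The window size is an illustrative assumption; a real implementation might use a proper digital filter (e.g. a Butterworth low-pass) instead.

```python
def smooth(curve, window=3):
    """Low-pass smooth a 1-D joint curve with a centered moving average.

    Edge frames are averaged over their available neighbors, so the
    output has the same length (same clip duration) as the input.
    """
    half = window // 2
    out = []
    for i in range(len(curve)):
        lo, hi = max(0, i - half), min(len(curve), i + half + 1)
        out.append(sum(curve[lo:hi]) / (hi - lo))
    return out

# High-frequency jitter is attenuated while the overall level is kept.
jitter = [0.0, 1.0, 0.0, 1.0, 0.0]
print(smooth(jitter))  # edges 0.5; interior alternates near 1/3 and 2/3
```

Because the filter never lengthens or shortens the sequence, the smoothed clip stays frame-aligned with the blended clip it came from, matching the "same duration" requirement above.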
问题片段识别子模块324可以用于根据获取到的目标对象的目标动作的动作片段判断与动作模板对应的动作片段其中各个关节点的旋转信息、角速度信息以及位置信息之间的差异，判断是否差异过大。The problem segment identification submodule 324 can be used to compare the rotation information, angular velocity information and position information of each joint point between the acquired action segment of the target action of the target object and the action segment corresponding to the action template, and to determine whether the difference is excessive.
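As a non-limiting sketch, the per-joint comparison performed by submodule 324 can be expressed as a threshold check. The `max_rot` / `max_pos` thresholds and the (rotation, position) summary per joint are hypothetical simplifications of the rotation, angular-velocity and position comparison described above:

```python
def is_problem_segment(cap, tpl, max_rot=0.8, max_pos=0.3):
    # cap / tpl: per-joint dicts {joint_name: (rotation_angle, position)}
    # for the captured segment and the template segment; the segment is
    # flagged when any joint deviates too far from the template.
    for joint in cap:
        rot_diff = abs(cap[joint][0] - tpl[joint][0])
        pos_diff = abs(cap[joint][1] - tpl[joint][1])
        if rot_diff > max_rot or pos_diff > max_pos:
            return True
    return False
```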
其中，动作资产库模块330可以包括人体骨骼资产库子模块331、标准动画资产库子模块332、风格化动画资产子模块333以及开放接口。The action asset library module 330 may include a human skeleton asset library submodule 331, a standard animation asset library submodule 332, a stylized animation asset submodule 333 and an open interface.
人体骨骼资产库子模块331中包括男、女、老、少、高、矮、胖、瘦等各种人体骨骼模型。The human skeleton asset library submodule 331 includes various human skeleton models such as male, female, old, young, tall, short, fat, and thin.
标准动画资产库子模块332是与人体骨骼资产库子模块331中的模型相对应的标准动画，包括动画片段以及特征片段。The standard animation asset library submodule 332 contains the standard animations corresponding to the models in the human skeleton asset library submodule 331, including animation clips and feature clips.
风格化动画资产子模块333是从标准动画资产库子模块332中提取的根、脚、手腕、肘、膝盖等关节的动画曲线以及频域特征。The stylized animation asset submodule 333 contains the animation curves and frequency-domain features of joints such as the root, feet, wrists, elbows and knees, extracted from the standard animation asset library submodule 332.
其中，可视化模块340可以包括骨骼渲染子模块341以及蒙皮渲染子模块342。The visualization module 340 may include a skeleton rendering submodule 341 and a skin rendering submodule 342.
骨骼渲染子模块341以及蒙皮渲染子模块342可以用于进行虚拟对象的动画渲染。The skeleton rendering submodule 341 and the skin rendering submodule 342 can be used to perform animation rendering of virtual objects.
综上所述,本申请实施例通过获取目标对象在现实中执行目标动作的第一动作数据,以及虚拟对象按照动作模板执行目标动作时的第二动作数据,将第一动作数据结合第二动作数据进行修改,得到向动作模板调整的第三动作数据,从而使得按照第三动作数据渲染可以生成美化后的动画视频。通过按照动作模板自动调整目标动作,在保证了对象执行的动作的风格被保留的前提下,美化了虚拟对象执行的动作,提高了对目标对象执行的目标动作进行虚拟化后生成的动画的灵活性。In summary, the embodiment of the present application obtains the first action data of the target object performing the target action in reality, and the second action data of the virtual object performing the target action according to the action template, and modifies the first action data in combination with the second action data to obtain the third action data adjusted to the action template, so that the beautified animation video can be generated by rendering according to the third action data. By automatically adjusting the target action according to the action template, the action performed by the virtual object is beautified while ensuring that the style of the action performed by the object is retained, and the flexibility of the animation generated after the target action performed by the target object is virtualized is improved.
上述主要从方法的角度对本申请实施例的方案进行了介绍。可以理解的是,虚拟对象的动作生成装置为了实现上述功能,其包含了执行各个功能相应的硬件结构和软件模块中的至少一个。本领域技术人员应该很容易意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。The above mainly introduces the scheme of the embodiment of the present application from the perspective of the method. It is understandable that in order to realize the above functions, the action generation device of the virtual object includes at least one of the hardware structure and software modules corresponding to the execution of each function. It should be easily appreciated by those skilled in the art that, in combination with the units and algorithm steps of each example described in the embodiments disclosed herein, the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is executed in the form of hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Professional and technical personnel can use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of this application.
本申请实施例可以根据上述方法示例对虚拟对象的动作生成装置进行功能单元的划分,例如,可以对应各个功能划分各个功能单元,也可以将两个或两个以上的功能集成在一个处理单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。需要说明的是,本申请实施例中对单元的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。The embodiment of the present application can divide the action generation device of the virtual object into functional units according to the above method example. For example, each functional unit can be divided corresponding to each function, or two or more functions can be integrated into one processing unit. The above integrated unit can be implemented in the form of hardware or in the form of software functional units. It should be noted that the division of units in the embodiment of the present application is schematic and is only a logical functional division. There may be other division methods in actual implementation.
示例性的,图8示出了本申请一个示例性实施例提供的虚拟对象的动作生成装置500的结构示意图。该虚拟对象的动作生成装置500应用于计算设备中,或者,该虚拟对象的动作生成装置500可以是计算设备。该虚拟对象的动作生成装置500包括:Exemplarily, FIG8 shows a schematic diagram of the structure of a virtual object motion generation device 500 provided by an exemplary embodiment of the present application. The virtual object motion generation device 500 is applied to a computing device, or the virtual object motion generation device 500 can be a computing device. The virtual object motion generation device 500 includes:
获取模块510，用于获取第一动作数据；所述第一动作数据用于指示目标对象执行目标动作过程中参考点的运动情况；获取第二动作数据；所述第二动作数据用于指示所述虚拟对象按照动作模板执行所述目标动作过程中的所述参考点的运动情况；所述虚拟对象用于在虚拟空间中表征所述目标对象；The acquisition module 510 is used to acquire first action data, where the first action data is used to indicate the movement of the reference point in the process of the target object performing the target action; and to acquire second action data, where the second action data is used to indicate the movement of the reference point in the process of the virtual object performing the target action according to the action template; the virtual object is used to represent the target object in the virtual space;
处理模块520,用于基于所述第一动作数据以及所述第二动作数据,生成第三动作数据;所述第三动作数据用于指示结合所述动作模板调整所述目标动作后的所述参考点的运动情况; The processing module 520 is used to generate third motion data based on the first motion data and the second motion data; the third motion data is used to indicate the movement of the reference point after the target motion is adjusted in combination with the motion template;
生成模块530,用于按照所述第三动作数据,生成动画视频;所述动画视频包含所述虚拟对象执行调整后的目标动作的片段。The generation module 530 is used to generate an animation video according to the third action data; the animation video includes a segment of the virtual object performing the adjusted target action.
例如,结合图4,获取模块510可以用于执行如图4所示的S101、S102,处理模块520可以用于执行如图4所示的S103,生成模块530可以用于执行如图4所示的S104。For example, in conjunction with FIG. 4 , the acquisition module 510 may be used to execute S101 and S102 as shown in FIG. 4 , the processing module 520 may be used to execute S103 as shown in FIG. 4 , and the generation module 530 may be used to execute S104 as shown in FIG. 4 .
在一种可能的实现方式中,获取模块510,还用于,响应于接收到的选择操作,从至少一个动作模板中确定所述目标动作对应的动作模板;获取所述目标动作对应的动作模板的所述第二动作数据。In a possible implementation, the acquisition module 510 is further used to, in response to a received selection operation, determine an action template corresponding to the target action from at least one action template; and acquire the second action data of the action template corresponding to the target action.
在一种可能的实现方式中,处理模块520,还用于,基于所述第一动作数据,提取所述第一动作特征;基于所述第二动作数据,提取所述第二动作特征;按照指定权重,对所述第一动作特征以及所述第二动作特征进行混合,生成所述第三动作数据。In one possible implementation, the processing module 520 is also used to extract the first action feature based on the first action data; extract the second action feature based on the second action data; and mix the first action feature and the second action feature according to specified weights to generate the third action data.
在一种可能的实现方式中,处理模块520,还用于,基于所述第一动作数据,获取目标参考点的第一动画曲线;所述目标参考点是用于执行所述目标动作的关节;所述第一动画曲线用于指示在所述目标对象执行所述目标动作的时间段内参考点的运动情况;基于所述第二动作数据,获取所述目标参考点的第二动画曲线;所述第二动画曲线用于指示在所述虚拟对象按照所述动作模板执行所述目标动作的时间段内参考点的运动情况;按照球面混合算法,对所述第一动画曲线以及所述第二动画曲线进行混合,得到第三动画曲线;所述第三动画曲线是用于指示在虚拟对象执行结合所述动作模板调整所述目标动作的时间段内参考点的运动情况。In one possible implementation, the processing module 520 is also used to obtain a first animation curve of a target reference point based on the first action data; the target reference point is a joint used to perform the target action; the first animation curve is used to indicate the movement of the reference point during the time period when the target object performs the target action; based on the second action data, obtain a second animation curve of the target reference point; the second animation curve is used to indicate the movement of the reference point during the time period when the virtual object performs the target action according to the action template; according to a spherical blending algorithm, the first animation curve and the second animation curve are blended to obtain a third animation curve; the third animation curve is used to indicate the movement of the reference point during the time period when the virtual object performs the target action adjusted in combination with the action template.
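As an illustrative sketch of the spherical blending described above, the two animation curves can be blended frame by frame with spherical linear interpolation (slerp) between unit quaternions; the quaternion representation and the short-arc handling below are standard choices and not mandated by the embodiments:

```python
import math

def slerp(q0, q1, w):
    # Spherical linear interpolation between two unit quaternions
    # (w, x, y, z); w is the blend weight toward q1.
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0:                      # take the shorter arc
        q1, dot = [-c for c in q1], -dot
    theta = math.acos(min(dot, 1.0))
    if theta < 1e-6:                 # nearly parallel: plain lerp suffices
        return [a + w * (b - a) for a, b in zip(q0, q1)]
    s0 = math.sin((1 - w) * theta) / math.sin(theta)
    s1 = math.sin(w * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(q0, q1)]

def blend_curves(curve1, curve2, w):
    # Blend two equal-length quaternion animation curves frame by frame,
    # yielding the third (adjusted) animation curve.
    return [slerp(a, b, w) for a, b in zip(curve1, curve2)]
```

Here `curve1` would hold the captured first animation curve, `curve2` the template's second animation curve, and the result the third animation curve used for rendering.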
在一种可能的实现方式中,处理模块520,还用于,若所述目标动作是待调整动作,基于所述第一动作数据以及所述第二动作数据,生成所述第三动作数据。In a possible implementation, the processing module 520 is further configured to, if the target action is an action to be adjusted, generate the third action data based on the first action data and the second action data.
在一种可能的实现方式中,所述待调整动作是将所述目标动作的第一动作数据与所述目标动作的第二动作数据进行频域相似度计算后相似度小于指定阈值的所述目标动作。In a possible implementation, the action to be adjusted is the target action whose similarity is less than a specified threshold after frequency domain similarity calculation is performed on the first action data of the target action and the second action data of the target action.
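As a minimal sketch of the frequency-domain similarity test, one possible (hypothetical) realization compares the magnitude spectra of the two action signals with cosine similarity; a result below the specified threshold marks the target action as one to be adjusted:

```python
import cmath
import math

def dft_magnitudes(signal):
    # Naive discrete Fourier transform; returns the magnitude spectrum.
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)))
        for k in range(n)
    ]

def freq_similarity(sig_a, sig_b):
    # Cosine similarity between the two magnitude spectra; values lie in
    # [0, 1] because magnitudes are non-negative.
    ma, mb = dft_magnitudes(sig_a), dft_magnitudes(sig_b)
    num = sum(x * y for x, y in zip(ma, mb))
    den = (math.sqrt(sum(x * x for x in ma))
           * math.sqrt(sum(y * y for y in mb)))
    return num / den if den else 0.0
```

In practice an FFT would replace the O(n²) DFT, and the signal could be any per-joint channel of the first and second action data.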
在一种可能的实现方式中,生成模块530,还用于,按照所述第三动作数据,生成混合动画视频;In a possible implementation, the generating module 530 is further configured to generate a mixed animation video according to the third motion data;
通过对所述混合动画视频进行低通滤波处理,得到所述动画视频。The animation video is obtained by performing low-pass filtering on the mixed animation video.
在一种可能的实现方式中,获取模块510,还用于,通过对所述目标对象进行动作捕捉,获取所述第一动作数据。In a possible implementation, the acquisition module 510 is further configured to acquire the first motion data by performing motion capture on the target object.
在一种可能的实现方式中,获取模块510,还用于,获取动作视频;所述动作视频中包含所述目标对象执行所述目标动作的片段;通过对所述动作视频中的所述目标对象进行动作捕捉,获取所述第一动作数据。In a possible implementation, the acquisition module 510 is further used to acquire an action video; the action video contains a segment of the target object performing the target action; and the first action data is acquired by performing motion capture on the target object in the action video.
在一种可能的实现方式中,获取模块510,还用于,基于所述第一动作数据,确定所述目标动作的动作种类;按照所述目标动作的动作种类,确定所述目标动作的动作模板;将所述目标动作的动作模板的动作数据获取为所述第二动作数据。In a possible implementation, the acquisition module 510 is also used to determine the action type of the target action based on the first action data; determine the action template of the target action according to the action type of the target action; and acquire the action data of the action template of the target action as the second action data.
在一种可能的实现方式中,若每个动作种类对应有至少两个动作模板,获取模块510,还用于按照所述目标动作的动作种类,确定所述目标动作的动作种类对应的至少两个动作模板;响应于接收到的选择操作,从所述至少两个动作模板中确定所述目标动作的动作模板。In one possible implementation, if each action type corresponds to at least two action templates, the acquisition module 510 is also used to determine at least two action templates corresponding to the action type of the target action according to the action type of the target action; and in response to a received selection operation, determine the action template of the target action from the at least two action templates.
关于上述可选方式的具体描述可以参见前述的方法实施例,此处不再赘述。此外,上述提供的任一种虚拟对象的动作生成装置的解释以及有益效果的描述均可参考上述对应的方法实施例,不再赘述。For the detailed description of the above optional methods, please refer to the above method embodiments, which will not be repeated here. In addition, the explanation of any virtual object motion generation device provided above and the description of the beneficial effects can refer to the above corresponding method embodiments, which will not be repeated here.
其中,获取模块510、处理模块520和生成模块530均可以通过软件实现,或者可以通过硬件实现。示例性的,接下来以获取模块510为例,介绍获取模块510的实现方式。类似的,处理模块520和生成模块530的实现方式可以参考获取模块510的实现方式。Among them, the acquisition module 510, the processing module 520 and the generation module 530 can all be implemented by software, or can be implemented by hardware. Exemplarily, the following takes the acquisition module 510 as an example to introduce the implementation of the acquisition module 510. Similarly, the implementation of the processing module 520 and the generation module 530 can refer to the implementation of the acquisition module 510.
模块作为软件功能单元的一种举例,获取模块510可以包括运行在计算实例上的代码。其中,计算实例可以包括物理主机(计算设备)、虚拟机、容器中的至少一种。进一步地,上述计算实例可以是一台或者多台。例如,获取模块510可以包括运行在多个主机/虚拟机/容器上的代码。需要说明的是,用于运行该代码的多个主机/虚拟机/容器可以分布在相同的区域(region)中,也可以分布在不同的region中。进一步地,用于运行该代码的多个主机/虚拟机/容器可以分布在相同的可用区(availability zone,AZ)中,也可以分布在不同的AZ中,每个AZ包括一个数据中心或多个地理位置相近的数据中心。其中,通常一个region可以包括多个AZ。As an example of a software functional unit, the acquisition module 510 may include code running on a computing instance. Among them, the computing instance may include at least one of a physical host (computing device), a virtual machine, and a container. Further, the above-mentioned computing instance may be one or more. For example, the acquisition module 510 may include code running on multiple hosts/virtual machines/containers. It should be noted that the multiple hosts/virtual machines/containers used to run the code may be distributed in the same region (region) or in different regions. Furthermore, the multiple hosts/virtual machines/containers used to run the code may be distributed in the same availability zone (AZ) or in different AZs, each AZ including one data center or multiple data centers with similar geographical locations. Among them, usually a region may include multiple AZs.
同样,用于运行该代码的多个主机/虚拟机/容器可以分布在同一个虚拟私有云(virtual private cloud,VPC)中,也可以分布在多个VPC中。其中,通常一个VPC设置在一个region内,同一region 内两个VPC之间,以及不同region的VPC之间跨区通信需在每个VPC内设置通信网关,经通信网关实现VPC之间的互连。Similarly, multiple hosts/virtual machines/containers used to run the code can be distributed in the same virtual private cloud (VPC) or in multiple VPCs. Usually, a VPC is set up in a region. For cross-region communication between two VPCs within a region, or between VPCs in different regions, a communication gateway must be set up in each VPC to achieve interconnection between VPCs through the communication gateway.
模块作为硬件功能单元的一种举例,获取模块510可以包括至少一个计算设备,如服务器等。或者,获取模块510也可以是利用专用集成电路(application-specific integrated circuit,ASIC)实现、或可编程逻辑器件(programmable logic device,PLD)实现的设备等。其中,上述PLD可以是复杂程序逻辑器件(complex programmable logical device,CPLD)、现场可编程门阵列(field-programmable gate array,FPGA)、通用阵列逻辑(generic array logic,GAL)或其任意组合实现。As an example of a hardware functional unit, the acquisition module 510 may include at least one computing device, such as a server, etc. Alternatively, the acquisition module 510 may also be a device implemented by an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD may be a complex programmable logical device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL) or any combination thereof.
获取模块510包括的多个计算设备可以分布在相同的region中,也可以分布在不同的region中。获取模块510包括的多个计算设备可以分布在相同的AZ中,也可以分布在不同的AZ中。同样,获取模块510包括的多个计算设备可以分布在同一个VPC中,也可以分布在多个VPC中。其中,所述多个计算设备可以是服务器、ASIC、PLD、CPLD、FPGA和GAL等计算设备的任意组合。The multiple computing devices included in the acquisition module 510 can be distributed in the same region or in different regions. The multiple computing devices included in the acquisition module 510 can be distributed in the same AZ or in different AZs. Similarly, the multiple computing devices included in the acquisition module 510 can be distributed in the same VPC or in multiple VPCs. The multiple computing devices can be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.
需要说明的是,在其他实施例中,获取模块510可以用于执行虚拟对象的动作生成方法中的任意步骤,处理模块520可以用于执行虚拟对象的动作生成方法中的任意步骤,生成模块530可以用于执行虚拟对象的动作生成方法中的任意步骤,获取模块510、处理模块520、以及生成模块530负责实现的步骤可根据需要指定,通过获取模块510、处理模块520、以及生成模块530分别实现虚拟对象的动作生成方法中不同的步骤来实现虚拟对象的动作生成装置的全部功能。本申请还提供一种计算设备100。如图9所示,计算设备100包括:总线102、处理器104、存储器106和通信接口108。处理器104、存储器106和通信接口108之间通过总线102通信。计算设备100可以是服务器或终端设备。应理解,本申请不限定计算设备100中的处理器、存储器的个数。It should be noted that, in other embodiments, the acquisition module 510 can be used to execute any step in the action generation method of the virtual object, the processing module 520 can be used to execute any step in the action generation method of the virtual object, and the generation module 530 can be used to execute any step in the action generation method of the virtual object. The steps that the acquisition module 510, the processing module 520, and the generation module 530 are responsible for implementing can be specified as needed, and the acquisition module 510, the processing module 520, and the generation module 530 respectively implement different steps in the action generation method of the virtual object to realize the full functions of the action generation device of the virtual object. The present application also provides a computing device 100. As shown in FIG9, the computing device 100 includes: a bus 102, a processor 104, a memory 106, and a communication interface 108. The processor 104, the memory 106, and the communication interface 108 communicate with each other through the bus 102. The computing device 100 can be a server or a terminal device. It should be understood that the present application does not limit the number of processors and memories in the computing device 100.
总线102可以是外设部件互连标准（peripheral component interconnect，PCI）总线或扩展工业标准结构（extended industry standard architecture，EISA）总线等。总线可以分为地址总线、数据总线、控制总线等。为便于表示，图9中仅用一条线表示，但并不表示仅有一根总线或一种类型的总线。总线102可包括在计算设备100各个部件（例如，存储器106、处理器104、通信接口108）之间传送信息的通路。The bus 102 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of representation, only one line is shown in FIG. 9, but this does not mean that there is only one bus or one type of bus. The bus 102 may include a path for transmitting information between various components of the computing device 100 (e.g., the memory 106, the processor 104, the communication interface 108).
处理器104可以包括中央处理器(central processing unit,CPU)、图形处理器(graphics processing unit,GPU)、微处理器(micro processor,MP)或者数字信号处理器(digital signal processor,DSP)等处理器中的任意一种或多种。The processor 104 may include any one or more processors such as a central processing unit (CPU), a graphics processing unit (GPU), a microprocessor (MP) or a digital signal processor (DSP).
存储器106可以包括易失性存储器（volatile memory），例如随机存取存储器（random access memory，RAM）。存储器106还可以包括非易失性存储器（non-volatile memory），例如只读存储器（read-only memory，ROM）、快闪存储器、机械硬盘（hard disk drive，HDD）或固态硬盘（solid state drive，SSD）。The memory 106 may include a volatile memory, such as a random access memory (RAM). The memory 106 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD) or a solid state drive (SSD).
存储器106中存储有可执行的程序代码,处理器104执行该可执行的程序代码以分别实现前述获取模块510、处理模块520和生成模块530的功能,从而实现虚拟对象的动作生成方法。也即,存储器106上存有用于执行虚拟对象的动作生成方法的指令。The memory 106 stores executable program codes, and the processor 104 executes the executable program codes to respectively implement the functions of the aforementioned acquisition module 510, processing module 520, and generation module 530, thereby implementing the action generation method of the virtual object. That is, the memory 106 stores instructions for executing the action generation method of the virtual object.
或者,存储器106中存储有可执行的代码,处理器104执行该可执行的代码以分别实现前述虚拟对象的动作生成装置的功能,从而实现虚拟对象的动作生成方法。也即,存储器106上存有用于执行虚拟对象的动作生成方法的指令。Alternatively, the memory 106 stores executable codes, and the processor 104 executes the executable codes to respectively implement the functions of the aforementioned virtual object motion generation device, thereby implementing the virtual object motion generation method. That is, the memory 106 stores instructions for executing the virtual object motion generation method.
通信接口108使用例如但不限于网络接口卡、收发器一类的收发模块,来实现计算设备100与其他设备或通信网络之间的通信。The communication interface 108 uses a transceiver module such as, but not limited to, a network interface card or a transceiver to implement communication between the computing device 100 and other devices or a communication network.
本申请实施例还提供了一种计算设备集群。该计算设备集群包括至少一台计算设备。该计算设备可以是服务器,例如是中心服务器、边缘服务器,或者是本地数据中心中的本地服务器。在一些实施例中,计算设备也可以是台式机、笔记本电脑或者智能手机等终端设备。The embodiment of the present application also provides a computing device cluster. The computing device cluster includes at least one computing device. The computing device can be a server, such as a central server, an edge server, or a local server in a local data center. In some embodiments, the computing device can also be a terminal device such as a desktop computer, a laptop computer, or a smart phone.
如图10所示,所述计算设备集群包括至少一个计算设备100。计算设备集群中的一个或多个计算设备100中的存储器106中可以存有相同的用于执行虚拟对象的动作生成方法的指令。As shown in Fig. 10, the computing device cluster includes at least one computing device 100. The memory 106 in one or more computing devices 100 in the computing device cluster may store the same instructions for executing the method for generating actions of a virtual object.
在一些可能的实现方式中,该计算设备集群中的一个或多个计算设备100的存储器106中也可以分别存有用于执行虚拟对象的动作生成方法的部分指令。换言之,一个或多个计算设备100的组合可以共同执行用于执行虚拟对象的动作生成方法的指令。 In some possible implementations, the memory 106 of one or more computing devices 100 in the computing device cluster may also store partial instructions for executing the method for generating actions of a virtual object. In other words, the combination of one or more computing devices 100 may jointly execute instructions for executing the method for generating actions of a virtual object.
需要说明的是,计算设备集群中的不同的计算设备100中的存储器106可以存储不同的指令,分别用于执行虚拟对象的动作生成装置的部分功能。也即,不同的计算设备100中的存储器106存储的指令可以实现获取模块510、处理模块520和生成模块530中的一个或多个模块的功能。It should be noted that the memory 106 in different computing devices 100 in the computing device cluster may store different instructions, which are respectively used to execute part of the functions of the virtual object action generation device. That is, the instructions stored in the memory 106 in different computing devices 100 may implement the functions of one or more modules among the acquisition module 510, the processing module 520 and the generation module 530.
在一些可能的实现方式中,计算设备集群中的一个或多个计算设备可以通过网络连接。其中,所述网络可以是广域网或局域网等等。图11示出了一种可能的实现方式。如图11所示,两个计算设备100A和100B之间通过网络进行连接。具体地,通过各个计算设备中的通信接口与所述网络进行连接。在这一类可能的实现方式中,计算设备100A中的存储器106中存有执行获取模块510的功能的指令。同时,计算设备100B中的存储器106中存有执行处理模块520和生成模块530的功能的指令。In some possible implementations, one or more computing devices in the computing device cluster can be connected via a network. The network can be a wide area network or a local area network, etc. FIG. 11 shows a possible implementation. As shown in FIG. 11 , two computing devices 100A and 100B are connected via a network. Specifically, the network is connected via a communication interface in each computing device. In this type of possible implementation, the memory 106 in the computing device 100A stores instructions for executing the functions of the acquisition module 510. At the same time, the memory 106 in the computing device 100B stores instructions for executing the functions of the processing module 520 and the generation module 530.
图11所示的计算设备集群之间的连接方式可以是考虑到本申请提供的虚拟对象的动作生成方法需要大量地存储数据和计算数据，因此考虑将处理模块520和生成模块530实现的功能交由计算设备100B执行。The connection between the computing devices in the cluster shown in FIG. 11 may be chosen in view of the fact that the virtual object action generation method provided in the present application requires large amounts of data storage and computation; the functions implemented by the processing module 520 and the generation module 530 are therefore assigned to the computing device 100B.
应理解,图11中示出的计算设备100A的功能也可以由多个计算设备100完成。同样,计算设备100B的功能也可以由多个计算设备100完成。It should be understood that the functions of the computing device 100A shown in FIG11 may also be completed by multiple computing devices 100. Similarly, the functions of the computing device 100B may also be completed by multiple computing devices 100.
本申请实施例还提供了另一种计算设备集群。该计算设备集群中各计算设备之间的连接关系可以类似的参考图10和图11所述计算设备集群的连接方式。不同的是,该计算设备集群中的一个或多个计算设备100中的存储器106中可以存有相同的用于执行虚拟对象的动作生成方法的指令。The embodiment of the present application also provides another computing device cluster. The connection relationship between the computing devices in the computing device cluster can be similar to the connection mode of the computing device cluster described in Figures 10 and 11. The difference is that the memory 106 in one or more computing devices 100 in the computing device cluster can store the same instructions for executing the action generation method of the virtual object.
在一些可能的实现方式中,该计算设备集群中的一个或多个计算设备100的存储器106中也可以分别存有用于执行虚拟对象的动作生成方法的部分指令。换言之,一个或多个计算设备100的组合可以共同执行用于执行虚拟对象的动作生成方法的指令。In some possible implementations, the memory 106 of one or more computing devices 100 in the computing device cluster may also store partial instructions for executing the method for generating actions of a virtual object. In other words, the combination of one or more computing devices 100 may jointly execute instructions for executing the method for generating actions of a virtual object.
需要说明的是,计算设备集群中的不同的计算设备100中的存储器106可以存储不同的指令,用于执行虚拟对象的动作生成系统的部分功能。也即,不同的计算设备100中的存储器106存储的指令可以实现虚拟对象的动作生成装置中的一个或多个装置的功能。It should be noted that the memory 106 in different computing devices 100 in the computing device cluster may store different instructions for executing part of the functions of the virtual object motion generation system. That is, the instructions stored in the memory 106 in different computing devices 100 may implement the functions of one or more devices in the virtual object motion generation device.
本申请实施例还提供了一种包含指令的计算机程序产品。所述计算机程序产品可以是包含指令的,能够运行在计算设备上或被储存在任何可用介质中的软件或程序产品。当所述计算机程序产品在至少一个计算设备上运行时,使得至少一个计算设备执行虚拟对象的动作生成方法。The embodiment of the present application also provides a computer program product including instructions. The computer program product may be software or a program product including instructions that can be run on a computing device or stored in any available medium. When the computer program product is run on at least one computing device, the at least one computing device executes the method for generating an action of a virtual object.
本申请实施例还提供了一种计算机可读存储介质。所述计算机可读存储介质可以是计算设备能够存储的任何可用介质或者是包含一个或多个可用介质的数据中心等数据存储设备。所述可用介质可以是磁性介质（例如，软盘、硬盘、磁带）、光介质（例如，DVD）、或者半导体介质（例如固态硬盘）等。该计算机可读存储介质包括指令，所述指令指示计算设备执行虚拟对象的动作生成方法。The embodiment of the present application also provides a computer-readable storage medium. The computer-readable storage medium can be any available medium that a computing device can store, or a data storage device such as a data center containing one or more available media. The available medium can be a magnetic medium (e.g., a floppy disk, a hard disk, a tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state drive). The computer-readable storage medium includes instructions that instruct the computing device to execute the method for generating an action of a virtual object.
最后应说明的是:以上实施例仅用以说明本发明的技术方案,而非对其限制;尽管参照前述实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本发明各实施例技术方案的保护范围。 Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, rather than to limit it. Although the present invention has been described in detail with reference to the aforementioned embodiments, those skilled in the art should understand that they can still modify the technical solutions described in the aforementioned embodiments, or make equivalent replacements for some of the technical features therein. However, these modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the protection scope of the technical solutions of the embodiments of the present invention.
Claims (23)
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310876415 | 2023-07-17 | ||
| CN202310876415.4 | 2023-07-17 | ||
| CN202311544502.6A CN119323629A (en) | 2023-07-17 | 2023-11-16 | Method, device, cluster, medium and program product for generating actions of virtual objects |
| CN202311544502.6 | 2023-11-16 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025016335A1 true WO2025016335A1 (en) | 2025-01-23 |
Family
ID=94230891
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2024/105342 Pending WO2025016335A1 (en) | 2023-07-17 | 2024-07-12 | Method and apparatus for generating action of virtual object, and cluster, medium and program product |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN119323629A (en) |
| WO (1) | WO2025016335A1 (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN120182442A (en) * | 2025-03-03 | 2025-06-20 | 郑州光聚网络科技有限公司 | Animation generation method, system and device based on motion capture |
| CN120451346A (en) * | 2025-07-01 | 2025-08-08 | 上海岩潮体育科技有限公司 | Motion management method for virtual motion object and electronic device |
Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110675474A (en) * | 2019-08-16 | 2020-01-10 | 咪咕动漫有限公司 | Learning method, electronic device and readable storage medium for virtual character model |
| CN112711335A (en) * | 2021-01-19 | 2021-04-27 | 腾讯科技(深圳)有限公司 | Virtual environment picture display method, device, equipment and storage medium |
| CN113658213A (en) * | 2021-08-16 | 2021-11-16 | 百度在线网络技术(北京)有限公司 | Image presentation method, related device and computer program product |
| CN114758039A (en) * | 2020-12-28 | 2022-07-15 | 北京陌陌信息技术有限公司 | Sectional driving method and equipment of human body model and storage medium |
| WO2022241583A1 (en) * | 2021-05-15 | 2022-11-24 | 电子科技大学 | Family scenario motion capture method based on multi-target video |
| CN115496841A (en) * | 2022-09-19 | 2022-12-20 | 清华大学 | Animation generation method, device, electronic device and storage medium of virtual character |
| CN115546360A (en) * | 2021-06-29 | 2022-12-30 | 阿里巴巴新加坡控股有限公司 | Action result recognition method and device |
| CN116206370A (en) * | 2023-05-06 | 2023-06-02 | 北京百度网讯科技有限公司 | Drive information generation, drive method, device, electronic device, and storage medium |
| WO2023109753A1 (en) * | 2021-12-14 | 2023-06-22 | 魔珐(上海)信息科技有限公司 | Animation generation method and apparatus for virtual character, and storage medium and terminal |
- 2023-11-16: CN application CN202311544502.6A filed (publication CN119323629A, status pending)
- 2024-07-12: PCT application PCT/CN2024/105342 filed (publication WO2025016335A1, status pending)
Also Published As
| Publication number | Publication date |
|---|---|
| CN119323629A (en) | 2025-01-17 |
Similar Documents
| Publication | Title |
|---|---|
| US11790589B1 (en) | System and method for creating avatars or animated sequences using human body features extracted from a still image |
| US9240067B2 (en) | Animation of photo-images via fitting of combined models |
| WO2025016335A1 (en) | Method and apparatus for generating action of virtual object, and cluster, medium and program product | |
| CN108335345B (en) | Control method and device for facial animation model, and computing device | |
| CN108875539B (en) | Expression matching method, device and system and storage medium | |
| US11282257B2 (en) | Pose selection and animation of characters using video data and training techniques | |
| CN110245638A (en) | Video generation method and device | |
| CN108875633A (en) | Expression detection and expression driving method, device and system and storage medium | |
| CN110675475A (en) | Face model generation method, device, equipment and storage medium | |
| TWI532006B (en) | Method and system of simulating hair on 3d model | |
| CN106447785A (en) | Method for driving virtual character and device thereof | |
| US20120139899A1 (en) | Semantic Rigging of Avatars | |
| CN111833236A (en) | Method and device for generating three-dimensional face model simulating user | |
| CN109035415B (en) | Virtual model processing method, device, equipment and computer readable storage medium | |
| US11361467B2 (en) | Pose selection and animation of characters using video data and training techniques | |
| CN116529766A (en) | Automatic mixing of human facial expressions and whole-body gestures for dynamic digital mannequin creation using integrated photo-video volume capture system and mesh tracking | |
| TW202247107A (en) | Facial capture artificial intelligence for training models | |
| CN119850797A (en) | Action correction method and related equipment | |
| CN115690283A (en) | Two-dimensional animation production method and device based on motion sensing technology | |
| CN117504296A (en) | Action generating method, action displaying method, device, equipment, medium and product | |
| Zhang et al. | Optimizing Motion Capture Data in Animation Sequences Using Machine Learning Techniques | |
| CN119131207A (en) | Animation generation method, device and electronic equipment | |
| JP2025140839A (en) | Image processing device, program, and image processing method | |
| CN119205996A (en) | Method, device and storage medium for generating virtual image based on gesture action |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24842328; Country of ref document: EP; Kind code of ref document: A1 |