Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the embodiments to give the reader a better understanding of the present application. The technical solution claimed in the present application can nevertheless be implemented without these technical details, and with various changes and modifications based on the following embodiments.
The embodiments below are divided for convenience of description. The division should not limit the specific implementation of the present invention, and the embodiments may be combined with and refer to one another where no contradiction arises.
At present, a target stuck-point video, that is, a video that plays in step with the rhythm of its background music, is generated in one of two modes. In the first mode, existing software is used: multimedia files that conform to a template in the software (for example, the number of photos or the video length the template requires) are imported, and the imported files are synthesized with the background music of the template to form the target stuck-point video. In this mode, however, the background music cannot be changed, since only the music built into the template can be used, and the imported multimedia files must also meet the requirements of the template. The personalized requirements of the user therefore cannot be met, and the range of applications for generating target stuck-point videos is narrowed.
In the second mode, the target stuck-point video is produced manually with professional editing software: standard template music is selected, several video segments are cut according to the accent times of the standard template music, and the standard template music is then combined with the cut segments to form the target stuck-point video. Although this mode meets the personalized requirements of the user, it requires professional skill, and the whole process depends on manual work, which lowers the efficiency of generating the target stuck-point video.
The first embodiment of the present invention relates to a video generation method, the flow of which is shown in Fig. 1:
Step 101: Acquire a target stuck point of the multimedia file and the standard stuck point corresponding to the target stuck point, where the standard stuck point is a stuck point in preset standard template music.
Step 102: Edit the multimedia file according to the position of the target stuck point and the position of the corresponding standard stuck point, so that the target stuck point matches the corresponding standard stuck point.
Step 103: Overlay the edited multimedia file with the standard template music to form a target stuck-point video.
In this embodiment, the multimedia file is edited according to the position of each of its target stuck points and the position of the corresponding standard stuck point, so that each target stuck point matches its standard stuck point. After the edited multimedia file is overlaid with the standard template music, the result is a target stuck-point video in which the multimedia file plays in step with the rhythm of the standard template music. Because the multimedia file does not need to be edited manually, the target stuck-point video is formed faster and more efficiently. In addition, because the stuck points are acquired from whatever standard template music is supplied, the user can choose the standard template music instead of being bound to music fixed in a template, so the target stuck-point video can be generated flexibly and meets the personalized requirements of the user.
A second embodiment of the invention relates to a video generation method. The second embodiment describes steps 101 to 103 of the first embodiment in detail. The method can be applied to an electronic device such as a mobile phone or a computer, and its specific flow, shown in Fig. 2, includes:
Step 201: Acquire a target stuck point of the multimedia file and the standard stuck point corresponding to the target stuck point, where the standard stuck point is a stuck point in preset standard template music.
Specifically, the multimedia file may be a video file, an audio-video file, a picture, a photo, or the like. In this example, an audio-video file is used as the multimedia file. The preset standard template music may be customized by the user or may be default template music. In this example the standard template music is user-defined, and both the standard template music and the multimedia file can be imported into the electronic device by the user.
The multimedia file may include at least one target stuck point, and the standard template music may include at least one standard stuck point. The number of target stuck points in the multimedia file may be the same as or different from the number of standard stuck points in the standard template music, which is not limited in this example.
In one example, at least one target stuck point in the multimedia file is acquired; at least one standard stuck point in the standard template music is acquired; the target stuck points and the standard stuck points are each sorted in chronological order; and the following processing is performed for each target stuck point: the sort position of the target stuck point is obtained, the standard stuck point with the same sort position is looked up, and the target stuck point is associated with the standard stuck point that was found.
Specifically, the stuck points in the standard template music can be acquired in various ways. For example, a time point corresponding to an accent in the standard template music may be used as a standard stuck point, and a time point at which a pause occurs in the standard template music may also be used as a standard stuck point.
Similarly, the target stuck points in the multimedia file can be acquired in various ways, depending on the type of the multimedia file. For example, if the multimedia file is a set of photos, the time point at which the photo switches may be used as a target stuck point; if the multimedia file is an audio-video file, the time point corresponding to an accent in the audio-video file may be used as a target stuck point; if the multimedia file is a video file, the time point corresponding to a preset action in the video file may be used as a target stuck point.
Each target stuck point is marked on the time axis of the multimedia file to obtain a distribution diagram of the target stuck points, and each standard stuck point is marked on the time axis of the standard template music to obtain a distribution diagram of the standard stuck points. The target stuck points and the standard stuck points are then put into one-to-one correspondence in chronological order. The correspondence procedure is described below for three cases:
Case one: the number of target stuck points is greater than the number of standard stuck points.
As shown in the distribution diagrams of the target stuck points and the standard stuck points in Fig. 3, within a period of 0 to n seconds there are 5 target stuck points, sorted in chronological order as A1, A2, A3, A4 and A5, and 3 standard stuck points, sorted in chronological order as B1, B2 and B3. Each target stuck point is processed as follows: A1 is first in its ordering, and the standard stuck point that is first in its ordering is B1, so A1 corresponds to B1; likewise, A2 is second and the second standard stuck point is B2, so A2 corresponds to B2, and A3 corresponds to B3. No standard stuck point exists at the sort positions of A4 and A5, so the standard stuck points corresponding to A4 and A5 are determined to be empty.
Case two: the number of target stuck points is less than the number of standard stuck points.
As shown in the distribution diagrams of the target stuck points and the standard stuck points in Fig. 4, within a period of 0 to n seconds there are 3 target stuck points, sorted in chronological order as A1, A2 and A3, and 5 standard stuck points, sorted in chronological order as B1, B2, B3, B4 and B5. Each target stuck point is processed as follows: A1 is first in its ordering, and the standard stuck point that is first in its ordering is B1, so A1 corresponds to B1; A2 is second and the second standard stuck point is B2, so A2 corresponds to B2; similarly, A3 corresponds to B3. There is no target stuck point after A3, so the correspondence operation ends and B4 and B5 remain unused.
Case three: the number of target stuck points is equal to the number of standard stuck points.
As shown in the distribution diagrams of the target stuck points and the standard stuck points in Fig. 5, within a period of 0 to n seconds there are 3 target stuck points, sorted in chronological order as A1, A2 and A3, and 3 standard stuck points, sorted in chronological order as B1, B2 and B3. Each target stuck point is processed as follows: A1 is first in its ordering, and the standard stuck point that is first in its ordering is B1, so A1 corresponds to B1; A2 is second and the second standard stuck point is B2, so A2 corresponds to B2; similarly, A3 corresponds to B3.
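The three cases above reduce to pairing by rank after sorting. The following is a minimal sketch of that pairing, not the claimed implementation; the function name `match_by_order` and the use of `None` for an "empty" standard stuck point are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def match_by_order(
    targets: List[float], standards: List[float]
) -> List[Tuple[float, Optional[float]]]:
    """Pair each target stuck point with the standard stuck point of the same rank.

    Both lists hold time positions in seconds. A target whose rank has no
    standard stuck point is paired with None (case one); surplus standard
    stuck points are simply left unused (case two).
    """
    ordered_targets = sorted(targets)
    ordered_standards = sorted(standards)
    pairs = []
    for rank, target in enumerate(ordered_targets):
        standard = ordered_standards[rank] if rank < len(ordered_standards) else None
        pairs.append((target, standard))
    return pairs


# Case one: five target stuck points, three standard stuck points.
pairs = match_by_order([1.0, 2.5, 4.0, 6.0, 8.0], [1.5, 3.0, 5.0])
```

In this hypothetical run the last two targets (6.0 and 8.0) are paired with `None`, mirroring A4 and A5 in case one.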
Step 202: For each target stuck point, determine whether the position of the target stuck point coincides with the position of the corresponding standard stuck point. If the two positions do not coincide, execute step 203; otherwise, continue with the next target stuck point until all target stuck points have been processed.
Specifically, if the position of the target stuck point is equal to the position of the corresponding standard stuck point, the two positions coincide.
It should be noted that, whether before or after this determination, if the standard stuck point corresponding to a target stuck point is detected to be empty, that is, the target stuck point has no corresponding standard stuck point, the adjustment of the playing rate of the multimedia file ends immediately.
Step 203: Adjust the playing rate of the multimedia file.
In one example, if the position of the target stuck point is detected to be after the position of the corresponding standard stuck point, the playing rate of the multimedia file is increased so that the playing duration of the multimedia file is compressed by a preset duration, where the preset duration is the time difference between the target stuck point and the corresponding standard stuck point. If the position of the target stuck point is detected to be before the position of the corresponding standard stuck point, the playing rate of the multimedia file is reduced so that the playing duration of the multimedia file is extended by the preset duration.
Specifically, a first time difference between the position of the target stuck point and its previous reference position may be obtained, and a second time difference between the position of the corresponding standard stuck point and its previous reference position may be obtained. The playing rate of the multimedia file is adjusted according to the ratio of the first time difference to the second time difference, where the adjustment starts at the previous reference position (for example, the previous stuck point) and lasts for the second time difference.
Specifically, the stuck point of the multimedia file is moved by compressing or extending the playing duration, and the adjustment lasts for the second time difference. For example, let the target stuck point of the multimedia file be at position t1 and let its previous reference position be the previous stuck point position t0; the first time difference is Δt = t1 - t0. Let the standard stuck point corresponding to t1 be at position t1', with previous reference position t0', the previous standard stuck point; the second time difference is Δt' = t1' - t0'. The adjustment ratio is l = Δt/Δt'. If the current playing rate is s1, the adjusted playing rate is s1' = s1 × (Δt/Δt'). This adjustment starts at t0 and lasts for the second time difference Δt'.
It should be noted that the previous reference position of a target stuck point may also be the position of frame 0 of the multimedia file; similarly, the previous reference position of the corresponding standard stuck point may also be the position of frame 0 of the standard template music.
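The rate rule above can be sketched in a few lines. This is an illustrative helper under the stated definitions (first time difference over second time difference); the name `segment_rate` and its argument names are hypothetical, and positions are assumed to be in seconds.

```python
def segment_rate(
    base_rate: float,
    target: float,
    target_ref: float,
    standard: float,
    standard_ref: float,
) -> float:
    """Return the adjusted playing rate for one stuck point.

    first_diff is the distance from the target stuck point to its previous
    reference position; second_diff is the same distance on the template
    music. A ratio above 1 speeds playback up, a ratio below 1 slows it.
    """
    first_diff = target - target_ref
    second_diff = standard - standard_ref
    if first_diff <= 0 or second_diff <= 0:
        raise ValueError("stuck points must lie after their reference positions")
    return base_rate * first_diff / second_diff


# Target at 4 s, standard at 2 s, both measured from frame 0: play twice as fast.
faster = segment_rate(1.0, 4.0, 0.0, 2.0, 0.0)
# Target at 3 s, standard at 6 s: play at half speed.
slower = segment_rate(1.0, 3.0, 0.0, 6.0, 0.0)
```

When the target lies after the standard stuck point the ratio exceeds 1 and the duration is compressed; when it lies before, the ratio is below 1 and the duration is extended, matching the two directions described in step 203.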
When the position of a target stuck point does not coincide with the position of the corresponding standard stuck point, there are two cases; the adjustment for each is described below.
Case one: the position of the standard stuck point is before the position of the target stuck point.
The distribution of the standard stuck points of the standard template music is shown in Fig. 6 and includes standard stuck point positions B1 and B2; the target stuck point positions are labeled A1 and A2. The playing duration of the standard template music is assumed to equal the playing duration of the multimedia file and is denoted n seconds.
A1 and A2 are processed in chronological order. In this case the rate is increased so that the playing duration is compressed. A1 is processed as follows: the adjustment starting point is denoted x, with x = 0 seconds, so the first time difference is A1 - x and the second time difference is B1 - x. The rate is adjusted starting from 0 seconds, and the adjustment lasts for B1. If the normal playing rate of the multimedia file is s, then S(A1) = s × (A1 - x)/(B1 - x). After this adjustment, it is determined whether the next target stuck point position A2 coincides with the corresponding standard stuck point position B2. The adjustment starting point is now A1, which after the first adjustment lies at B1. The first time difference is A2 - A1 - x and the second time difference is B2 - B1 - x, so S(A2) = s × (A2 - A1 - x)/(B2 - B1 - x); with x = 0 this gives S(A2) = s × (A2 - A1)/(B2 - B1). After the duration B2 - B1 has elapsed, the playing duration of the multimedia file has been compressed by the preset duration, where the preset duration is the time difference between the target stuck point and the corresponding standard stuck point, and the rate returns to the original rate s.
The playing rate of the multimedia file can thus be expressed as follows: the playing rate is S(A1) from 0 to B1, S(A2) from B1 to B2 (a duration of B2 - B1), and s thereafter. After the interval from 0 to B1, A1 coincides with B1, and after the interval from B1 to B2, A2 coincides with B2. At this point, each target stuck point matches its corresponding standard stuck point.
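As a worked illustration of case one, take the hypothetical positions A1 = 4 s, A2 = 9 s, B1 = 2 s, B2 = 6 s and a normal rate s = 1. These numbers are assumptions chosen for the arithmetic and are not taken from Fig. 6.

```latex
\begin{aligned}
S(A_1) &= s\,\frac{A_1 - 0}{B_1 - 0} = \frac{4}{2} = 2, \\
S(A_2) &= s\,\frac{A_2 - A_1}{B_2 - B_1} = \frac{9 - 4}{6 - 2} = 1.25 .
\end{aligned}
```

Playing at rate 2 for the first 2 s consumes 4 s of the file, so A1 lands at B1; playing at rate 1.25 for the next 4 s consumes a further 5 s of the file, so A2 lands at B2. The total playing duration up to A2 shrinks from 9 s to 6 s.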
Case two: the position of the standard stuck point is after the position of the target stuck point.
The distribution of the standard stuck points of the standard template music is shown in Fig. 7 and includes standard stuck point positions B1 and B2; the target stuck point positions are labeled A1 and A2. The playing duration of the standard template music is assumed to equal the playing duration of the multimedia file and is denoted n seconds.
A1 and A2 are processed in chronological order. In this case the rate is reduced so that the playing duration is extended. A1 is processed as follows: the adjustment starting point is denoted x, with x = 0 seconds, so the first time difference is A1 - x and the second time difference is B1 - x. The rate is adjusted starting from 0 seconds, and the adjustment lasts for B1. If the normal playing rate of the multimedia file is s, then S(A1) = s × (A1 - x)/(B1 - x). After this adjustment, it is determined whether the next target stuck point position A2 coincides with the corresponding standard stuck point position B2. The adjustment starting point is now A1, which after the first adjustment lies at B1. The first time difference is A2 - A1 - x and the second time difference is B2 - B1 - x, so S(A2) = s × (A2 - A1 - x)/(B2 - B1 - x); with x = 0 this gives S(A2) = s × (A2 - A1)/(B2 - B1). After the duration B2 - B1 has elapsed, the playing duration of the multimedia file has been extended by the preset duration, where the preset duration is the time difference between the target stuck point and the corresponding standard stuck point, and the rate returns to the original rate s.
The playing rate of the multimedia file can thus be expressed as follows: the playing rate is S(A1) from 0 to B1, S(A2) from B1 to B2 (a duration of B2 - B1), and s thereafter. After the interval from 0 to B1, A1 coincides with B1, and after the interval from B1 to B2, A2 coincides with B2. At this point, each target stuck point matches its corresponding standard stuck point.
It can be understood that, within one multimedia file, some target stuck points may lie after their standard stuck points while others lie before them. Each target stuck point is then processed in the same way as in case one or case two, as applicable.
For example, the standard stuck point positions of the standard template music are distributed as shown in Fig. 8 and are labeled B2, B1, B3 and B4 in chronological order; the target stuck point positions are labeled A2, A1, A3 and A4. The playing duration of the standard template music is assumed to equal the playing duration of the multimedia file and is denoted n seconds. The original playing rate of the multimedia file is s. In the period from 0 to B2, the playing rate is S(A2) = s × A2/B2. If A1 is still earlier than B1 after this adjustment, the next playing rate is S(A1) = s × (A1 - A2)/(B1 - B2), applied for a duration of B1 - B2. The playing rates corresponding to A3 and A4 are adjusted in the same way and are not described again here.
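The per-stuck-point adjustments can be collected into a piecewise rate schedule on the output timeline. The sketch below is one way to do this under simplifying assumptions: rates are expressed as multiples of the normal rate (s = 1), the first reference position is frame 0, and the names `build_rate_schedule` and `source_position` are hypothetical helpers introduced here.

```python
from typing import List, Optional, Tuple

Segment = Tuple[float, float, float]  # (output_start, output_end, rate)


def build_rate_schedule(
    pairs: List[Tuple[float, Optional[float]]]
) -> List[Segment]:
    """Turn chronological (target, standard) pairs into rate segments.

    Processing stops at the first target with no standard stuck point,
    after which playback is assumed to continue at the normal rate.
    """
    schedule = []
    prev_target, prev_standard = 0.0, 0.0
    for target, standard in pairs:
        if standard is None:
            break
        rate = (target - prev_target) / (standard - prev_standard)
        schedule.append((prev_standard, standard, rate))
        prev_target, prev_standard = target, standard
    return schedule


def source_position(schedule: List[Segment], output_time: float) -> float:
    """Return how many seconds of the file have been consumed at output_time."""
    consumed, last_end = 0.0, 0.0
    for start, end, rate in schedule:
        if output_time <= start:
            break
        consumed += (min(output_time, end) - start) * rate
        last_end = end
    if output_time > last_end:
        consumed += output_time - last_end  # normal rate after the last segment
    return consumed


# Mixed case: the first target is late (4 s vs 2 s), the second is early (5 s vs 6 s).
schedule = build_rate_schedule([(4.0, 2.0), (5.0, 6.0)])
```

Checking `source_position` at each standard stuck point confirms the alignment: at output time 2 s the file is at 4 s, and at output time 6 s the file is at 5 s.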
Step 204: Overlay the edited multimedia file with the standard template music to form a target stuck-point video.
The edited multimedia file is overlaid with the standard template music to form the stuck-point video, in which the multimedia file plays in step with the rhythm of the standard template music.
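The source does not name a tool for the rate change and the overlay. One possible realization, offered purely as an assumption, is the ffmpeg command-line tool: each segment of the file is cut with the `trim` filter, retimed with `setpts`, and the segments are joined with `concat`. The helper name `build_filter_graph` and the file names in the usage note are hypothetical, and the sketch handles the video stream only.

```python
from typing import List, Tuple


def build_filter_graph(segments: List[Tuple[float, float, float]]) -> str:
    """Build an ffmpeg filter graph string for piecewise retiming.

    Each segment is (source_start, source_end, rate), with times in seconds
    on the original file and rate as a multiple of normal speed. Dividing
    the presentation timestamps by the rate speeds a segment up when
    rate > 1 and slows it down when rate < 1.
    """
    parts = []
    labels = []
    for index, (source_start, source_end, rate) in enumerate(segments):
        label = f"[v{index}]"
        parts.append(
            f"[0:v]trim=start={source_start}:end={source_end},"
            f"setpts=(PTS-STARTPTS)/{rate}{label}"
        )
        labels.append(label)
    parts.append(f"{''.join(labels)}concat=n={len(labels)}:v=1:a=0[v]")
    return ";".join(parts)


graph = build_filter_graph([(0, 4, 2.0), (4, 9, 1.25)])
```

The resulting string could then be passed to a command of the form `ffmpeg -i clip.mp4 -i template.mp3 -filter_complex "<graph>" -map "[v]" -map 1:a -shortest output.mp4`, which takes the retimed video from the first input and the audio from the template music. The remainder of the file after the last stuck point, played at the normal rate, is omitted from this sketch.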
Because each target stuck point is matched with its corresponding standard stuck point, the generated target stuck-point video is accurately aligned with the music, the visual effect is good, and the generation efficiency is high.
The steps of the above methods are divided only for clarity of description. In implementation, steps may be combined into a single step, or a step may be split into several steps; as long as the same logical relationship is preserved, such variations fall within the protection scope of this patent. Adding insignificant modifications to the algorithm or process, or introducing insignificant design changes without altering the core design of the algorithm or process, also falls within the protection scope of this patent.
A third embodiment of the present invention relates to a video generation method. This embodiment describes a specific implementation of step 101 in the first embodiment or step 201 in the second embodiment; the specific implementation process is shown in Fig. 9:
Step 301: Acquire at least one target stuck point in the multimedia file.
In one example, the energy value of the multimedia file at each sampling time point is obtained, and each sampling time point whose energy value satisfies a first preset condition is taken as a target stuck point.
Specifically, if the multimedia file is an audio-video file or a video file, the energy value of the multimedia file may be extracted with the energy extraction function of an audio processing tool. The energy extraction function is given by equation (1):
where p_ref is the maximum amplitude over the whole multimedia file, p_rms is the amplitude of the multimedia file at the current time position, and L_p is the energy value at the current time position.
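Equation (1) itself is not reproduced in this text. The symbols defined for it match the conventional definition of a level in decibels, so one plausible reading, stated here as an assumption and not as the original formula, is:

```latex
L_p = 20 \log_{10}\!\left(\frac{p_{\mathrm{rms}}}{p_{\mathrm{ref}}}\right)\ \mathrm{dB} \tag{1}
```

Under this reading, with p_ref taken as the maximum amplitude of the file, L_p is at most 0 dB and approaches 0 dB at the loudest positions.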
The multimedia file is sampled at preset sampling time points to obtain the energy value at each sampling time point. The first preset condition may be set according to the practical application; for example, the first preset condition is that the energy value exceeds a first threshold, where the first threshold is 1/4 of the energy peak in the multimedia file.
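The thresholding step can be sketched without any audio library. The example below makes several assumptions that are not in the source: the audio is already decoded into a list of amplitude samples, the energy value of a sampling time point is taken to be the RMS amplitude of a fixed-size window, and the names `window_rms` and `energy_stuck_points` are hypothetical.

```python
import math
from typing import List


def window_rms(samples: List[float], window: int) -> List[float]:
    """Return the RMS amplitude of each consecutive, non-overlapping window."""
    values = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        values.append(math.sqrt(sum(x * x for x in chunk) / window))
    return values


def energy_stuck_points(
    samples: List[float], sample_rate: int, window: int, ratio: float = 0.25
) -> List[float]:
    """Return the start times (in seconds) of windows whose energy exceeds
    ratio times the peak energy, following the 1/4-of-peak example."""
    energies = window_rms(samples, window)
    if not energies:
        return []
    threshold = ratio * max(energies)
    return [
        index * window / sample_rate
        for index, energy in enumerate(energies)
        if energy > threshold
    ]


# Four one-second windows at 4 samples per second: silent, loud, quiet, loud.
samples = [0.0] * 4 + [1.0] * 4 + [0.1] * 4 + [0.8] * 4
points = energy_stuck_points(samples, sample_rate=4, window=4)
```

With this synthetic signal the loud windows starting at 1 s and 3 s pass the threshold. A practical detector would probably also merge adjacent windows that exceed the threshold into a single stuck point, which the source does not discuss.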
In another example, the sampling time point corresponding to a key frame that satisfies a second preset condition in the multimedia file is taken as a target stuck point of the multimedia file.
Specifically, when the multimedia file is a video file or a collection of photos, the key frames in the multimedia file may be extracted. The second preset condition may be that the picture contains a specified action, for example a key frame containing an action such as raising a hand, clapping, or stomping.
Step 302: Acquire at least one standard stuck point in the standard template music.
Specifically, the energy value of the standard template music at each sampling time point is obtained, and each sampling time point whose energy value satisfies the first preset condition is taken as a standard stuck point. The energy value is extracted with the function shown in equation (1), so that the standard stuck points are obtained from the amplitude of the sound.
Step 303: Sort the target stuck points and the standard stuck points separately in chronological order.
Step 304: Perform the following processing for each target stuck point: obtain the sort position of the target stuck point, look up the standard stuck point with the same sort position, and associate the target stuck point with the standard stuck point that was found.
Step 303 and step 304 are substantially the same as the corresponding processing in step 201 of the second embodiment and are not described again here.
This embodiment provides several ways of acquiring stuck points, so that the stuck points can be acquired flexibly.
A fourth embodiment of the present invention relates to an electronic device, a block diagram of which is shown in Fig. 10. The electronic device includes at least one processor 401 and a memory 402 communicatively coupled to the at least one processor 401. The memory 402 stores instructions executable by the at least one processor 401, and the instructions, when executed by the at least one processor 401, enable the at least one processor 401 to perform the video generation method described above.
The memory and the processor are connected by a bus, which may include any number of interconnected buses and bridges, linking together one or more of the various circuits of the processor and the memory. The bus may also link various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor is transmitted over a wireless medium via an antenna, which further receives the data and transmits the data to the processor.
The processor is responsible for managing the bus and general processing and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. And the memory may be used to store data used by the processor in performing operations.
A fifth embodiment of the present invention relates to a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of video generation described above.
Those skilled in the art can understand that all or part of the steps of the methods in the foregoing embodiments may be implemented by a program instructing the related hardware. The program is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples for carrying out the invention, and that various changes in form and details may be made therein without departing from the spirit and scope of the invention in practice.