
CN115019752A - Intelligent composition method and device, electronic equipment and storage medium - Google Patents

Intelligent composition method and device, electronic equipment and storage medium

Info

Publication number
CN115019752A
CN115019752A
Authority
CN
China
Prior art keywords
sequence
music
preset
pitch
time value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210606472.6A
Other languages
Chinese (zh)
Inventor
谭媛月
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Bank of China
Original Assignee
Agricultural Bank of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agricultural Bank of China filed Critical Agricultural Bank of China
Priority to CN202210606472.6A
Publication of CN115019752A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H: ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00: Details of electrophonic musical instruments
    • G10H1/0008: Associated control or indicating means
    • G10H1/0025: Automatic or semi-automatic music composition, e.g. producing random music, applying rules from music theory or modifying a musical piece
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H: ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H1/00: Details of electrophonic musical instruments
    • G10H1/0033: Recording/reproducing or transmission of music for electrophonic musical instruments
    • G10H1/0041: Recording/reproducing or transmission of music for electrophonic musical instruments in coded form
    • G10H1/0058: Transmission between separate instruments or between individual components of a musical system
    • G10H1/0066: Transmission between separate instruments or between individual components of a musical system using a MIDI interface

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The present application provides an intelligent composition method and apparatus, an electronic device, and a storage medium. The method first generates a first voice part pitch sequence and a first voice part duration sequence, then performs curve fitting on them to obtain a second voice part pitch sequence and a second voice part duration sequence, and finally synthesizes the pitch and duration sequences of all voice parts into a two-voice-part composition melody. Because the pitch and duration of the first voice part are generated by a composition network structure that mixes three techniques, a Markov model, a bidirectional recurrent neural network and curve fitting, the method can fully mine the temporal and semantic information of past, current and future moments in a music sequence and, combined with music knowledge rules, generate a two-voice-part piece with a typical, distinctive musical style. It is suitable for complex musical environments and requires no selection of a fitness function.

Description

Intelligent composition method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of data processing, and in particular to an intelligent composition method and device, an electronic device, and a storage medium.
Background
Artificial-intelligence composition mines and learns from music data using technologies such as machine learning and deep learning in order to generate music comparable to a human composer's. It greatly lowers the threshold of music creation, so that even non-professionals with no grounding in music theory can take part, while also improving the efficiency of professional composers.
At present, existing intelligent composition methods mainly use music rules to measure the fitness of candidate music taking part in an evolutionary process, with a person then choosing the theme of the composition so that the music evolves toward that theme, avoiding randomness and aimlessness in the evolved result. However, the fitness function of such methods is difficult to select, the theme music must be chosen manually, and music generated by a single algorithm is simplistic and unsuited to complex musical environments.
Disclosure of Invention
Therefore, the application provides an intelligent composition method and device, an electronic device, and a storage medium, aiming to solve the problems of existing schemes: the fitness function is difficult to select, the composition theme must be chosen manually, and compositions generated by a single algorithm are simplistic and unsuited to complex musical environments.
In order to achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
the first aspect of the invention discloses an intelligent composition method, which comprises the following steps:
performing up-sampling encoding on a preset MIDI data set to obtain a MIDI data up-sampling sequence; the preset MIDI data set comprises a plurality of preset music pieces;
training a preset neural network with the MIDI data up-sampling sequence to obtain a music prediction model;
generating a first voice part pitch sequence and a first voice part duration sequence based on the music prediction model and a preset motif melody model; the preset motif melody model is a model, obtained from the music style of the MIDI data set and a Markov model, that can generate motif melody rules;
performing curve fitting based on the first voice part pitch sequence and the first voice part duration sequence to obtain a second voice part pitch sequence and a second voice part duration sequence;
and synthesizing the first voice part pitch sequence, the first voice part duration sequence, the second voice part pitch sequence and the second voice part duration sequence to obtain a two-voice-part composition melody.
Optionally, in the above intelligent composition method, performing up-sampling encoding on the preset MIDI data set to obtain the MIDI data up-sampling sequence includes:
extracting the music features of each piece in the preset MIDI data set to obtain a music feature set for each piece; the music feature set comprises: mode, MIDI pitch and duration;
unifying the mode of each piece in the preset MIDI data set based on a mode conversion table pre-stored with the preset MIDI data set, to obtain a unified-mode pitch sequence and a unified-mode duration sequence corresponding to the MIDI data set; wherein the unified-mode pitch sequence corresponds one-to-one with the unified-mode duration sequence;
and performing up-sampling encoding on the unified-mode pitch sequence according to the unified-mode duration sequence to obtain the MIDI data up-sampling sequence.
Optionally, in the above intelligent composition method, training the preset neural network with the MIDI data up-sampling sequence to obtain the music prediction model includes:
preprocessing the MIDI data up-sampling sequence to obtain a model training digital sequence; the preprocessing comprises normalization;
and training the preset neural network with the model training digital sequence as the input training set to obtain the music prediction model.
Optionally, in the above intelligent composition method, generating the first voice part pitch sequence and the first voice part duration sequence based on the music prediction model and the preset motif melody model includes:
determining an initial input sequence to be fed into the music prediction model;
generating the first voice part pitch sequence by using the initial input sequence and the motif pitch sequence produced by the preset motif melody model as the input sequence of the music prediction model;
and down-sampling the first voice part pitch sequence to obtain the first voice part duration sequence.
Optionally, in the above intelligent composition method, performing curve fitting based on the first voice part pitch sequence and the first voice part duration sequence to obtain the second voice part pitch sequence and the second voice part duration sequence includes:
performing polynomial fitting on the first voice part pitch sequence and the first voice part duration sequence respectively, to obtain a fitted pitch sequence and the second voice part duration sequence;
and performing pitch correction on the fitted pitch sequence to obtain the second voice part pitch sequence.
Optionally, in the above intelligent composition method, synthesizing the first voice part pitch sequence, the first voice part duration sequence, the second voice part pitch sequence and the second voice part duration sequence to obtain the two-voice-part composition melody includes:
determining the preset MIDI interfaces respectively corresponding to the first voice part pitch sequence, the first voice part duration sequence, the second voice part pitch sequence and the second voice part duration sequence;
and sending the first voice part pitch sequence, the first voice part duration sequence, the second voice part pitch sequence and the second voice part duration sequence to a preset MIDI synthesizer for synthesis through their determined preset MIDI interfaces, to obtain the two-voice-part composition melody.
Optionally, in the above intelligent composition method, after generating the first voice part pitch sequence and the first voice part duration sequence based on the music prediction model and the preset motif melody model, the method further includes:
determining the preset MIDI interface corresponding to the first voice part pitch sequence and the first voice part duration sequence;
and sending the first voice part pitch sequence and the first voice part duration sequence to a preset MIDI synthesizer for synthesis through the determined preset MIDI interface, to obtain a single-voice-part composition melody.
The second aspect of the present invention discloses an intelligent composition device, comprising:
an encoding unit, configured to perform up-sampling encoding on a preset MIDI data set to obtain a MIDI data up-sampling sequence; the preset MIDI data set comprises a plurality of preset music pieces;
a training unit, configured to train a preset neural network with the MIDI data up-sampling sequence to obtain a music prediction model;
a generation unit, configured to generate a first voice part pitch sequence and a first voice part duration sequence based on the music prediction model and a preset motif melody model; the preset motif melody model is a model, obtained from the music style of the MIDI data set and a Markov model, that can generate motif melody rules;
a fitting unit, configured to perform curve fitting based on the first voice part pitch sequence and the first voice part duration sequence to obtain a second voice part pitch sequence and a second voice part duration sequence;
and a synthesizing unit, configured to synthesize the first voice part pitch sequence, the first voice part duration sequence, the second voice part pitch sequence and the second voice part duration sequence to obtain a two-voice-part composition melody.
A third aspect of the invention discloses an electronic device comprising a processor and a memory; wherein:
the memory is configured to store computer instructions;
the processor is configured to execute the computer instructions stored in the memory, and in particular to perform the intelligent composition method disclosed in any implementation of the first aspect.
A fourth aspect of the present invention discloses a storage medium storing a program which, when executed, implements the intelligent composition method disclosed in any implementation of the first aspect.
The invention provides an intelligent composition method. After the music prediction model is obtained, a first voice part pitch sequence and a first voice part duration sequence are generated based on the music prediction model and a preset motif melody model, where the preset motif melody model is a model, built from the music style of the MIDI data and a Markov model, that can output motif melody rules. Curve fitting is then performed on the first voice part pitch sequence and duration sequence to obtain a second voice part pitch sequence and duration sequence. Finally, the pitch and duration sequences of the first and second voice parts are synthesized into a two-voice-part composition melody. The composition network structure, which mixes a Markov model, a bidirectional recurrent neural network and curve fitting, can fully mine the temporal and semantic information of past, current and future moments in a music sequence and, combined with music knowledge rules, generate a two-voice-part piece with a typical, specific musical style.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in their description are briefly introduced below. Obviously, the drawings in the following description show only embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of an intelligent composition method according to an embodiment of the present application;
Fig. 2 is a flowchart of obtaining a MIDI data up-sampling sequence according to an embodiment of the present application;
Fig. 3 is a flowchart of generating a music prediction model according to an embodiment of the present application;
Fig. 4 is a flowchart of determining a first voice part pitch sequence and a first voice part duration sequence according to an embodiment of the present application;
Fig. 5 is a flowchart of determining a second voice part pitch sequence and a second voice part duration sequence according to an embodiment of the present application;
Fig. 6 is a flowchart of synthesizing a two-voice-part composition melody according to an embodiment of the present application;
Fig. 7 is a flowchart of another intelligent composition method according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an intelligent composition device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
First, it should be noted that existing intelligent composition methods are mainly implemented with genetic algorithms. A genetic algorithm is a global optimization algorithm that evolves candidates using a fitness function. Genetic-algorithm composition controls music generation with such an algorithm: a given piece of music is encoded in some form, genetic operators evolve it, and a fitness function scores each evolution result, so that evolution continues until a satisfactory final solution is found.
Based on the above, an embodiment of the application provides an intelligent composition method to solve the problems of existing schemes: the fitness function is difficult to select, the composition theme must be chosen manually, and compositions generated by a single algorithm are simplistic and unsuited to complex musical environments.
Referring to fig. 1, the intelligent composition method mainly includes the following steps:
s100, up-sampling coding is carried out on a preset MIDI data set to obtain a MIDI data up-sampling sequence.
The preset MIDI data set comprises a plurality of preset style music pieces.
Assuming that the music in the preset style is the national folk music, the preset MIDI data set can comprise a plurality of pieces of music in the national folk music style.
It should be noted that the preset MIDI data set is preset before the execution of the intelligent composition operation; certainly, the preset music pieces in the preset MIDI data set can be added or deleted according to the actual application environment and the user requirement, and the number of the specific music pieces contained in the preset MIDI data set is not limited in the present application, and is within the protection scope of the present application.
In practical applications, the specific process of step S100, performing up-sampling encoding on the preset MIDI data set to obtain the MIDI data up-sampling sequence, may be as shown in fig. 2 and mainly includes the following steps:
S200: extract the music features of each piece in the preset MIDI data set to obtain a music feature set for each piece.
The music feature set includes features such as mode, duration and MIDI pitch.
Specifically, taking folk music as an example, MIDI pitch in a piece ranges over 0-127. MIDI duration is the relative length of each note in beats: if a quarter note counts as 1 beat, a duration of 0.5 represents half a beat. The mode feature covers all common keys in music, denoted 0-23.
It should be noted that the full MIDI range comprises 128 pitches, represented by 0-127, with two adjacent values corresponding to two adjacent keys on a keyboard. The values 0-23 represent the 24 keys of Western music: each of the twelve tones of equal temperament can serve as the tonic of both a major key and a minor key, so all keys can be encoded as 0-23.
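To make the representation concrete, the sketch below shows how these three features could be read from a MIDI file with the pretty_midi library. The library choice, file name, and single-track/constant-tempo assumptions are illustrative only; the patent does not prescribe a parser. Conveniently, pretty_midi's key numbering already follows the 0-23 scheme described above (0-11 major, 12-23 minor).

```python
# Illustrative feature extraction (not part of the patent): read mode,
# MIDI pitch and duration-in-beats from a MIDI file with pretty_midi.
import pretty_midi

pm = pretty_midi.PrettyMIDI("folk_song.mid")   # hypothetical input file
tempo = pm.estimate_tempo()                    # beats per minute (assumed constant)

# Mode/key: pretty_midi encodes keys as 0-23 (0-11 major, 12-23 minor).
key = pm.key_signature_changes[0].key_number if pm.key_signature_changes else 0

pitches, durations = [], []
for note in pm.instruments[0].notes:           # assume track 0 holds the melody
    pitches.append(note.pitch)                 # MIDI pitch, 0-127
    # Seconds -> beats, taking a quarter note as 1 beat at the given tempo.
    durations.append((note.end - note.start) * tempo / 60.0)
```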
S202: unify the mode of each piece in the preset MIDI data set based on the mode conversion table pre-stored with the preset MIDI data set, to obtain the unified-mode pitch sequence and unified-mode duration sequence corresponding to the MIDI data set.
The unified-mode pitch sequence corresponds one-to-one with the unified-mode duration sequence.
In practical applications, after the music feature set of each piece has been extracted, pieces in different keys can be converted, using the pre-stored mode conversion table, into a pitch sequence in a single unified key together with its corresponding duration sequence, yielding the unified-mode pitch sequence and unified-mode duration sequence for the data set.
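The conversion table itself is not disclosed in this record. As a minimal sketch of what unification can amount to, each piece can be transposed by a per-key semitone offset so that every tonic lands on the same pitch class; the function below is a hypothetical stand-in for the pre-stored table, and it keeps the pitch/duration pairing intact because durations are untouched.

```python
# Hypothetical stand-in for the pre-stored mode conversion table: unify by
# transposing every piece so its tonic becomes pitch class 0 (C).
def unify_mode(pitches, key, target_tonic=0):
    """key uses the 0-23 encoding above (0-11 major, 12-23 minor)."""
    tonic = key % 12                      # tonic pitch class of the source key
    shift = (target_tonic - tonic) % 12
    if shift > 6:                         # prefer the smaller transposition
        shift -= 12
    return [p + shift for p in pitches]   # durations stay as they are

# Example: a fragment in D major (key 2) moved into C major.
print(unify_mode([62, 64, 66, 67], key=2))    # -> [60, 62, 64, 65]
```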
S204: perform up-sampling encoding on the unified-mode pitch sequence according to the unified-mode duration sequence to obtain the MIDI data up-sampling sequence.
In practical applications, the unified-mode pitch sequence may be up-sampled and encoded according to its corresponding unified-mode duration sequence to obtain the MIDI data up-sampling sequence.
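In the spirit of S204, up-sampling encoding can be realized by repeating each pitch once per smallest time unit it occupies, so that one sequence carries both pitch and rhythm. A minimal sketch follows; the sixteenth-note grid (0.25 beat) is an assumed resolution, not a value fixed by the patent.

```python
# Up-sampling encoding sketch: repeat each pitch in proportion to its duration.
GRID = 0.25  # assumed smallest time unit, in beats (a sixteenth note)

def upsample(pitches, durations, grid=GRID):
    frames = []
    for pitch, dur in zip(pitches, durations):
        frames.extend([pitch] * round(dur / grid))   # one frame per grid step
    return frames

# A quarter note (1 beat) becomes 4 frames, a half note (2 beats) 8 frames.
print(upsample([60, 62], [1.0, 2.0]))
# -> [60, 60, 60, 60, 62, 62, 62, 62, 62, 62, 62, 62]
```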
S102: train a preset neural network with the MIDI data up-sampling sequence to obtain a music prediction model.
In practical applications, the specific process of step S102, training the preset neural network with the MIDI data up-sampling sequence to obtain the music prediction model, is shown in fig. 3 and mainly includes the following steps:
s300, preprocessing the MIDI data up-sampling sequence to obtain a model training digital sequence.
Wherein the pretreatment comprises: and (6) normalization processing.
In practical application, the MIDI data up-sampling sequence may be subjected to normalization conversion to obtain a digital sequence suitable for the preset neural network training, i.e., a model training digital sequence.
S302: train the preset neural network with the model training digital sequence as the input training set to obtain the music prediction model.
In practical applications, the preset neural network may be a bidirectional recurrent neural network, but it is not limited to this; other types of neural network may also be used. The application does not limit the specific network type, and all variants fall within its scope of protection.
Assuming the preset neural network is a bidirectional recurrent neural network, the music prediction model is obtained as follows: train the bidirectional recurrent neural network with the model training digital sequence as the input training set until the training error meets a preset requirement, then take the trained network as the music prediction model. The preset requirement may be that the error reaches a preset threshold or that the number of training iterations reaches a preset count.
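The patent names a bidirectional recurrent network but fixes no architecture, sizes, or training regime. The PyTorch sketch below is one plausible instantiation: a bidirectional LSTM trained to predict the next frame of the up-sampled sequence. It treats frames as token ids through an embedding rather than normalized reals, one of several workable encodings; every hyperparameter here is an illustrative assumption.

```python
# One plausible (assumed) instantiation of the music prediction model.
import torch
import torch.nn as nn

class MusicPredictor(nn.Module):
    def __init__(self, vocab=128, embed=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, embed)
        self.rnn = nn.LSTM(embed, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, vocab)   # 2x: forward + backward states

    def forward(self, x):                          # x: (batch, seq) of pitch ids
        out, _ = self.rnn(self.embed(x))
        return self.head(out)                      # (batch, seq, vocab) logits

model = MusicPredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train_step(batch):                             # batch: (batch, seq+1) pitch frames
    inputs, targets = batch[:, :-1], batch[:, 1:]  # next-frame prediction
    logits = model(inputs)
    loss = loss_fn(logits.reshape(-1, 128), targets.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

Training would loop train_step over batches drawn from the model training digital sequence until the error or iteration-count criterion above is met.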
S104: generate a first voice part pitch sequence and a first voice part duration sequence based on the music prediction model and a preset motif melody model.
The preset motif melody model is a model, built from the music style of the MIDI data set and a Markov model, that can generate motif melody rules.
In practical applications, the specific process of step S104, generating the first voice part pitch sequence and the first voice part duration sequence based on the music prediction model and the preset motif melody model, may be as shown in fig. 4 and mainly includes the following steps:
S400: determine the motif pitch sequence generated by the preset motif melody model.
In practical applications, when the music prediction model is used to generate the first voice part pitch sequence, the preset motif melody model may first be used to generate a motif pitch sequence. The specific process of generating the motif pitch sequence may follow the prior art and is not detailed in this application.
When the music prediction model is used to generate the first voice part pitch sequence, an initial input sequence is given to the preset motif melody model, which then generates a motif pitch sequence according to the Markov model; this serves as the input sequence of the music prediction model.
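The motif melody model is characterized in this record only as Markov-based generation conditioned on the style of the data set. A first-order chain over pitch transitions, counted from the corpus and sampled from a seed note, is a minimal sketch of that idea; the chain order and motif length are assumptions.

```python
# Minimal first-order Markov sketch of the motif melody model (assumed form).
import random
from collections import defaultdict

def fit_markov(corpus):                    # corpus: list of pitch sequences
    counts = defaultdict(lambda: defaultdict(int))
    for seq in corpus:
        for a, b in zip(seq, seq[1:]):     # count pitch-to-pitch transitions
            counts[a][b] += 1
    return counts

def sample_motif(counts, seed, length=8):
    motif, cur = [seed], seed
    for _ in range(length - 1):
        nxt = counts.get(cur)
        if not nxt:                        # dead end: hold the current note
            motif.append(cur)
            continue
        cur = random.choices(list(nxt), weights=list(nxt.values()))[0]
        motif.append(cur)
    return motif

counts = fit_markov([[60, 62, 64, 62, 60], [60, 64, 67, 64, 60]])
print(sample_motif(counts, seed=60))       # e.g. [60, 62, 64, 62, 60, 62, 60, 64]
```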
S402: use the motif pitch sequence as the input sequence of the music prediction model to generate the first voice part pitch sequence.
In practical applications, the motif pitch sequence may be fed to the music prediction model as its input sequence to generate the first voice part pitch sequence.
S404: down-sample the first voice part pitch sequence to obtain the first voice part duration sequence.
In practical applications, the first voice part pitch sequence may be decoded by down-sampling to recover its corresponding first voice part duration sequence.
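Down-sampling is simply the inverse of the encoding in S204: runs of identical frames collapse back into one note, and the run length times the grid gives the note's duration. A sketch under the same assumed sixteenth-note grid follows (consecutive repeated notes of the same pitch merge, a known limitation of this simple scheme):

```python
# Down-sampling sketch: inverse of the `upsample` encoding shown earlier.
from itertools import groupby

def downsample(frames, grid=0.25):          # grid: same assumed resolution
    pitches, durations = [], []
    for pitch, run in groupby(frames):      # group consecutive equal frames
        pitches.append(pitch)
        durations.append(sum(1 for _ in run) * grid)
    return pitches, durations

print(downsample([60, 60, 60, 60, 62, 62]))   # -> ([60, 62], [1.0, 0.5])
```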
S106: perform curve fitting based on the first voice part pitch sequence and the first voice part duration sequence to obtain a second voice part pitch sequence and a second voice part duration sequence.
In practical applications, the specific process of step S106, performing curve fitting based on the first voice part pitch sequence and the first voice part duration sequence to obtain the second voice part pitch sequence and the second voice part duration sequence, may be as shown in fig. 5 and mainly includes the following steps:
S500: perform polynomial fitting on the first voice part pitch sequence and the first voice part duration sequence respectively, to obtain a fitted pitch sequence and the second voice part duration sequence.
In practical applications, the second voice part may be modeled by polynomial curve fitting from the first voice part pitch sequence and duration sequence, directly yielding the fitted pitch sequence and the second voice part duration sequence.
In practical applications, the fitted pitch sequence may be used directly as the second voice part pitch sequence, or step S502 may be executed: perform pitch correction on the fitted pitch sequence to obtain the second voice part pitch sequence.
It should be noted that whether to perform pitch correction may be decided according to the specific application environment and user requirements; the application does not limit this, and both options fall within its scope of protection.
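Neither the polynomial degree nor the correction rule is stated in this record. As one plausible reading, the numpy sketch below fits a fixed-degree polynomial to the first voice part's pitch contour and, for pitch correction, snaps each fitted value to the nearest tone of the unified key; the degree and the C-major snapping rule are assumptions. The duration sequence would be fitted the same way and is passed through here for brevity.

```python
# Assumed curve-fitting sketch for the second voice part (degree and
# C-major correction rule are illustrative choices, not from the patent).
import numpy as np

C_MAJOR = {0, 2, 4, 5, 7, 9, 11}              # pitch classes of the unified key

def second_voice(pitches, durations, degree=4):
    x = np.arange(len(pitches))
    fitted = np.polyval(np.polyfit(x, pitches, degree), x)
    corrected = []
    for v in fitted:
        p = int(round(v))
        while p % 12 not in C_MAJOR:          # pitch correction: walk down
            p -= 1                            # to the nearest in-scale tone
        corrected.append(p)
    return corrected, list(durations)         # durations passed through here
```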
S108: synthesize the first voice part pitch sequence, the first voice part duration sequence, the second voice part pitch sequence and the second voice part duration sequence to obtain the two-voice-part composition melody.
The two-voice-part melody may be MIDI audio, but is not limited to this and may take other forms.
In practical applications, the specific process of step S108, synthesizing the four sequences to obtain the two-voice-part composition melody, may be as shown in fig. 6 and mainly includes the following steps:
S600: determine the preset MIDI interfaces respectively corresponding to the first voice part pitch sequence, the first voice part duration sequence, the second voice part pitch sequence and the second voice part duration sequence.
In practical applications, the interfaces of the preset MIDI synthesizer that accept the pitch and duration sequences of different voice parts generally differ, so before synthesizing the four sequences into the two-voice-part composition melody, the preset MIDI interface corresponding to each voice part's pitch sequence and duration sequence must be determined.
It should be noted that the preset MIDI interface for each pitch and duration sequence may be chosen according to the specific application environment and user requirements; the application does not limit this, and all variants fall within its scope of protection.
S602: according to the determined preset MIDI interfaces, send the first voice part pitch sequence, the first voice part duration sequence, the second voice part pitch sequence and the second voice part duration sequence to the preset MIDI synthesizer for synthesis, obtaining the two-voice-part composition melody.
In practical applications, once the preset MIDI interface for each voice part's pitch and duration sequence has been determined, each sequence may be fed to the preset MIDI synthesizer through its corresponding interface for synthesis, yielding the two-voice-part melody.
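The record does not identify the preset synthesizer or its interfaces. As an illustration only, the two voice parts can be assembled into a standard two-track MIDI file with pretty_midi, which any MIDI synthesizer can then render; the tempo, program, and velocity are arbitrary choices.

```python
# Illustrative synthesis sketch: write the two voice parts as a two-track
# MIDI file (pretty_midi stands in for the unspecified preset synthesizer).
import pretty_midi

def synthesize(parts, bpm=90, out="duet.mid"):     # parts: [(pitches, durations)]
    spb = 60.0 / bpm                               # seconds per beat
    pm = pretty_midi.PrettyMIDI()
    for pitches, durations in parts:               # one instrument per voice part
        inst = pretty_midi.Instrument(program=0)   # 0 = acoustic grand piano
        t = 0.0
        for pitch, dur in zip(pitches, durations):
            inst.notes.append(pretty_midi.Note(velocity=90, pitch=pitch,
                                               start=t, end=t + dur * spb))
            t += dur * spb
        pm.instruments.append(inst)
    pm.write(out)

synthesize([([60, 62, 64], [1.0, 1.0, 2.0]),       # first voice part
            ([48, 50, 52], [1.0, 1.0, 2.0])])      # second voice part
```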
Based on the above principle, the intelligent composition method provided by this embodiment first generates, after the music prediction model is obtained, a first voice part pitch sequence and a first voice part duration sequence based on the music prediction model and the preset motif melody model, where the preset motif melody model is a model, built from the music style of the MIDI data and a Markov model, that can output motif melody rules. Curve fitting is then performed on the first voice part pitch and duration sequences to obtain the second voice part pitch and duration sequences. Finally, the pitch and duration sequences of the first and second voice parts are synthesized into a two-voice-part composition melody. The composition network structure, mixing a Markov model, a bidirectional recurrent neural network and curve fitting, can fully mine the temporal and semantic information of past, current and future moments in a music sequence and, combined with music knowledge rules, generate a two-voice-part piece with a typical, specific musical style.
This application is particularly suited to creating music with a typical national folk style. The inventor's research shows that artificial-intelligence composition research on Western music is active, while automatic creation of folk music is still at an early stage, and that music generated by a single AI composition algorithm lacks emotional color and has an overly uniform stylistic theme.
In addition, the intelligent composition method provided by the application, a folk-music composition network structure that mixes a Markov model, a bidirectional recurrent neural network and curve fitting, combines the advantages of these techniques and can fill the research gap of artificial-intelligence composition in the folk-music field. It also addresses the differing strengths and weaknesses of current AI composition algorithms. In particular, traditional algorithms extract the pitch and duration of a melody as mutually independent training features during data preprocessing, which cannot represent the rhythmic relationship between pitch and duration; the network model therefore cannot learn the musical style well, and the resulting works sound dull, with a single stylistic theme and poor listenability.
Finally, intelligent composition usually comprises two parts, data feature representation and music feature generation, and the data feature representation strongly influences what the network learns. The intelligent composition method provided by the application therefore uses an up/down-sampling encoding for data feature representation: data are encoded jointly by pitch and duration before being fed to the network model for training, which helps the model learn the rhythmic style of the music, so that the composition network can generate music with a distinct style.
It should be noted that the Markov model is a common statistical probability model for sequence prediction, widely used in fields such as speech recognition, text generation, personalized recommendation, user behavior analysis, and image processing. The bidirectional recurrent neural network is a variant of the recurrent neural network, proposed to remedy the limitation that a classical unidirectional recurrent network can capture only current and past information when learning from sequence data, not information from later moments. Up/down-sampling means that each pitch is up-sampled according to the duration of its note and fed to the network to train the prediction model; the pitch sequence generated by the prediction model is then decoded by duration-based down-sampling to recover a pitch sequence and a duration sequence.
Optionally, in another embodiment provided in the present application, referring to fig. 7, after step S104 of generating the first voice part pitch sequence and the first voice part duration sequence based on the music prediction model and the preset motif melody model, the intelligent composition method further includes:
S700: determine the preset MIDI interface corresponding to the first voice part pitch sequence and the first voice part duration sequence.
In practical applications, the specific process of determining this preset MIDI interface is the same as in step S600 and is not repeated here.
S702: according to the determined preset MIDI interface, send the first voice part pitch sequence and the first voice part duration sequence to the preset MIDI synthesizer for synthesis, obtaining a single-voice-part composition melody.
In practical applications, once the preset MIDI interfaces for the first voice part pitch and duration sequences have been determined, the sequences may be fed to the preset MIDI synthesizer through the corresponding interfaces for synthesis, yielding the single-voice-part melody.
To satisfy users who want a one-voice-part composition, the first voice part pitch sequence and duration sequence may thus be synthesized directly after generation to obtain a single-voice-part composition melody.
Optionally, another embodiment of the present application further provides an intelligent composition device, please refer to fig. 8, where the device mainly includes:
an encoding unit 100, configured to perform upsampling encoding on a preset MIDI data set to obtain a MIDI data upsampling sequence; the preset MIDI data set comprises a plurality of pieces of preset style music.
The training unit 102 is configured to train a preset neural network by using the MIDI data up-sampling sequence, so as to obtain a music prediction model.
A generation unit 104 for generating a first sound part pitch sequence and a first sound part duration sequence based on the music piece prediction model and a preset motivational melody model; the preset motor melody model is a model capable of generating motor melody rules based on the music style type to which the MIDI data set belongs and the markov model.
Fitting section 106 is configured to perform curve fitting based on the first sound portion pitch sequence and the first sound portion duration sequence to obtain a second sound portion pitch sequence and a second sound portion duration sequence.
And a synthesizing unit 108 for synthesizing the first vocal part pitch sequence, the first vocal part time value sequence, the second vocal part pitch sequence, and the second vocal part time value sequence to obtain the two-vocal part composition melody.
Optionally, the encoding unit 100 is specifically configured to:
extracting the music features of each piece in the preset MIDI data set to obtain a music feature set for each piece; the music feature set comprises: mode, MIDI pitch and duration;
unifying the mode of each piece in the preset MIDI data set based on the mode conversion table pre-stored with the preset MIDI data set, to obtain the unified-mode pitch sequence and unified-mode duration sequence corresponding to the MIDI data set; wherein the unified-mode pitch sequence corresponds one-to-one with the unified-mode duration sequence;
and performing up-sampling encoding on the unified-mode pitch sequence according to the unified-mode duration sequence to obtain the MIDI data up-sampling sequence.
Optionally, the training unit 102 is specifically configured to:
preprocessing the MIDI data up-sampling sequence to obtain a model training digital sequence; the preprocessing comprises normalization;
and training the preset neural network with the model training digital sequence as the input training set to obtain the music prediction model.
Optionally, the generating unit 104 is specifically configured to:
and determining the motivation pitch sequence generated by the preset motivation melody model.
And generating a first part pitch sequence by using the motivation pitch sequence as an input sequence of a music prediction model.
And performing downsampling on the first sound part pitch sequence to obtain a first sound part time value sequence.
Optionally, the fitting unit 106 is specifically configured to:
and performing polynomial fitting on the first sound part pitch sequence and the first sound part duration sequence respectively to obtain a fitted pitch sequence and a second sound part duration sequence.
And carrying out pitch correction on the fitted pitch sequence to obtain a second sound pitch sequence.
Optionally, the synthesis unit 108 is specifically configured to:
and respectively determining preset MIDI interfaces corresponding to the first sound part pitch sequence, the first sound part duration value sequence, the second sound part pitch sequence and the second sound part duration value sequence.
And respectively sending the first sound part pitch sequence, the first sound part time value sequence, the second sound part pitch sequence and the second sound part time value sequence into a preset MIDI synthesizer for synthesis according to the determined preset MIDI interface to obtain the two sound part composition melody.
Optionally, the intelligent composition device further comprises:
and the determining unit is used for determining the preset MIDI interface corresponding to the first tone pitch sequence and the first tone duration value sequence.
And the synthesis subunit is used for respectively sending the first vocal part pitch sequence and the first vocal part time value sequence to the preset MIDI synthesizer for synthesis according to the determined preset MIDI interface to obtain a vocal part composition melody.
The intelligent composition device provided by this embodiment comprises: an encoding unit 100 for performing up-sampling encoding on a preset MIDI data set to obtain a MIDI data up-sampling sequence, the preset MIDI data set comprising a plurality of music pieces in a preset style; a training unit 102 for training a preset neural network with the MIDI data up-sampling sequence to obtain a music prediction model; a generation unit 104 for generating a first voice part pitch sequence and a first voice part duration sequence based on the music prediction model and a preset motif melody model, the preset motif melody model being a model, built from the music style of the MIDI data set and a Markov model, that can generate motif melody rules; a fitting unit 106 for performing curve fitting based on the first voice part pitch sequence and duration sequence to obtain a second voice part pitch sequence and duration sequence; and a synthesizing unit 108 for synthesizing the pitch and duration sequences of both voice parts into a two-voice-part composition melody. The device can fully mine the temporal and semantic information of past, current and future moments in a music sequence and, combined with music knowledge rules, generate a two-voice-part piece with a typical, specific musical style.
Optionally, another embodiment of the present application further provides an electronic device, including: a processor and a memory, wherein:
the memory is used for storing computer instructions;
the processor is configured to execute the computer instructions stored in the memory, and in particular to perform the intelligent composition method of any one of the above embodiments.
It should be noted that, for the related description of the intelligent composition method, reference may be made to the corresponding embodiments in fig. 1 to fig. 7, and details are not described herein again.
Optionally, another embodiment of the present application further provides a storage medium storing a program which, when executed, implements the intelligent composition method of any one of the above embodiments.
It should be noted that, for the related description of the intelligent composition method, reference may be made to the embodiments corresponding to fig. 1 to fig. 7, and details are not described herein again.
Features described in the embodiments in the present specification may be replaced with or combined with each other, and the same and similar portions among the embodiments may be referred to each other, and each embodiment is described with emphasis on differences from other embodiments. In particular, the system or system embodiments are substantially similar to the method embodiments and therefore are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for related points. The above-described system and system embodiments are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

Claims (10)

1. An intelligent composition method, comprising:
performing up-sampling encoding on a preset MIDI data set to obtain a MIDI data up-sampling sequence; the preset MIDI data set comprises a plurality of preset music pieces;
training a preset neural network with the MIDI data up-sampling sequence to obtain a music prediction model;
generating a first voice part pitch sequence and a first voice part duration sequence based on the music prediction model and a preset motif melody model; the preset motif melody model is a model, obtained from the music style of the MIDI data set and a Markov model, that can generate motif melody rules;
performing curve fitting based on the first voice part pitch sequence and the first voice part duration sequence to obtain a second voice part pitch sequence and a second voice part duration sequence;
and synthesizing the first voice part pitch sequence, the first voice part duration sequence, the second voice part pitch sequence and the second voice part duration sequence to obtain a two-voice-part composition melody.
2. The intelligent composition method of claim 1, wherein performing up-sampling encoding on the preset MIDI data set to obtain the MIDI data up-sampling sequence comprises:
extracting the music features of each piece in the preset MIDI data set to obtain a music feature set for each piece; the music feature set comprises: mode, MIDI pitch and duration;
unifying the mode of each piece in the preset MIDI data set based on a mode conversion table pre-stored with the preset MIDI data set, to obtain a unified-mode pitch sequence and a unified-mode duration sequence corresponding to the MIDI data set; wherein the unified-mode pitch sequence corresponds one-to-one with the unified-mode duration sequence;
and performing up-sampling encoding on the unified-mode pitch sequence according to the unified-mode duration sequence to obtain the MIDI data up-sampling sequence.
3. The intelligent composition method of claim 1, wherein training the preset neural network with the MIDI data up-sampling sequence to obtain the music prediction model comprises:
preprocessing the MIDI data up-sampling sequence to obtain a model training digital sequence; the preprocessing comprises normalization;
and training the preset neural network with the model training digital sequence as the input training set to obtain the music prediction model.
4. The intelligent composition method of claim 1, wherein generating the first voice part pitch sequence and the first voice part duration sequence based on the music prediction model and the preset motif melody model comprises:
determining the motif pitch sequence generated by the preset motif melody model;
generating the first voice part pitch sequence with the motif pitch sequence as the input sequence of the music prediction model;
and down-sampling the first voice part pitch sequence to obtain the first voice part duration sequence.
5. The intelligent composition method of claim 1, wherein performing curve fitting based on the first voice part pitch sequence and the first voice part duration sequence to obtain the second voice part pitch sequence and the second voice part duration sequence comprises:
performing polynomial fitting on the first voice part pitch sequence and the first voice part duration sequence respectively, to obtain a fitted pitch sequence and the second voice part duration sequence;
and performing pitch correction on the fitted pitch sequence to obtain the second voice part pitch sequence.
6. The intelligent composition method of claim 1, wherein synthesizing the first voice part pitch sequence, the first voice part duration sequence, the second voice part pitch sequence and the second voice part duration sequence to obtain the two-voice-part composition melody comprises:
determining the preset MIDI interfaces respectively corresponding to the first voice part pitch sequence, the first voice part duration sequence, the second voice part pitch sequence and the second voice part duration sequence;
and sending the first voice part pitch sequence, the first voice part duration sequence, the second voice part pitch sequence and the second voice part duration sequence to a preset MIDI synthesizer for synthesis through their determined preset MIDI interfaces, to obtain the two-voice-part composition melody.
7. The intelligent composition method of claim 1, further comprising, after generating the first voice part pitch sequence and the first voice part duration sequence based on the music prediction model and the preset motif melody model:
determining the preset MIDI interface corresponding to the first voice part pitch sequence and the first voice part duration sequence;
and sending the first voice part pitch sequence and the first voice part duration sequence to a preset MIDI synthesizer for synthesis through the determined preset MIDI interface, to obtain a single-voice-part composition melody.
8. An intelligent composition device, comprising:
an encoding unit, configured to perform up-sampling encoding on a preset MIDI data set to obtain a MIDI data up-sampling sequence; the preset MIDI data set comprises a plurality of preset music pieces;
a training unit, configured to train a preset neural network with the MIDI data up-sampling sequence to obtain a music prediction model;
a generation unit, configured to generate a first voice part pitch sequence and a first voice part duration sequence based on the music prediction model and a preset motif melody model; the preset motif melody model is a model, obtained from the music style of the MIDI data set and a Markov model, that can generate motif melody rules;
a fitting unit, configured to perform curve fitting based on the first voice part pitch sequence and the first voice part duration sequence to obtain a second voice part pitch sequence and a second voice part duration sequence;
and a synthesizing unit, configured to synthesize the first voice part pitch sequence, the first voice part duration sequence, the second voice part pitch sequence and the second voice part duration sequence to obtain a two-voice-part composition melody.
9. An electronic device comprising a processor and a memory; wherein:
the memory is configured to store computer instructions;
the processor is configured to execute the computer instructions stored in the memory, and in particular to perform the intelligent composition method of any one of claims 1 to 7.
10. A storage medium storing a program which, when executed, implements the intelligent composition method of any one of claims 1 to 7.
CN202210606472.6A, filed 2022-05-31 (priority 2022-05-31). Intelligent composition method and device, electronic equipment and storage medium. Status: Pending. Publication: CN115019752A (en).

Priority Applications (1)

Application Number: CN202210606472.6A; Priority Date: 2022-05-31; Filing Date: 2022-05-31; Title: Intelligent composition method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number: CN202210606472.6A; Priority Date: 2022-05-31; Filing Date: 2022-05-31; Title: Intelligent composition method and device, electronic equipment and storage medium

Publications (1)

Publication Number: CN115019752A; Publication Date: 2022-09-06

Family

ID=83071824

Family Applications (1)

Application Number: CN202210606472.6A; Title: Intelligent composition method and device, electronic equipment and storage medium; Status: Pending

Country Status (1)

Country: CN; Publication: CN115019752A (en)

Citations (7)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014170146A (en) * 2013-03-05 2014-09-18 Univ Of Tokyo Method and device for automatically composing chorus from japanese lyrics
KR20160121879A (en) * 2015-04-13 2016-10-21 성균관대학교산학협력단 Automatic melody composition method and automatic melody composition system
CN109671416A (en) * 2018-12-24 2019-04-23 成都嗨翻屋科技有限公司 Music rhythm generation method, device and user terminal based on enhancing study
CN111785236A (en) * 2019-04-02 2020-10-16 陈德龙 Automatic composition method based on motivational extraction model and neural network
CN111754962A (en) * 2020-05-06 2020-10-09 华南理工大学 System and method for intelligent auxiliary composition of folk songs based on up-down sampling
CN111739492A (en) * 2020-06-18 2020-10-02 南京邮电大学 A music melody generation method based on pitch contour curve
CN114141216A (en) * 2021-11-30 2022-03-04 北京小米移动软件有限公司 Music melody generation method, model training method, device and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
谭媛月, "Research on folk song composition based on Markov models and neural networks," China Master's Theses Full-text Database, Philosophy and Humanities, 15 February 2021 (2021-02-15), pages 086-175 *

Similar Documents

Publication Publication Date Title
CN112037754B (en) Method for generating speech synthesis training data and related equipment
CN1205827C (en) Portable communication terminal device with music mixing
US20110219940A1 (en) System and method for generating custom songs
CN108053814B (en) Speech synthesis system and method for simulating singing voice of user
CN111754962B (en) Folk song intelligent auxiliary composition system and method based on up-down sampling
CN112669811B (en) Song processing method and device, electronic equipment and readable storage medium
JP2011028130A (en) Speech synthesis device
CN110634465A (en) Music matching method, mobile terminal, data processing method and music matching system
CN112035699A (en) Music synthesis method, device, equipment and computer readable medium
JP2019168608A (en) Learning device, acoustic generation device, method, and program
CN110459201A (en) A kind of phoneme synthesizing method generating new tone color
CN113555027A (en) Voice emotion conversion method and device, computer equipment and storage medium
CN119400134A (en) Music generation method, device, electronic device and storage medium
Huang et al. Musical timbre style transfer with diffusion model
CN115762449B (en) Method and system for automatically generating conditional music theme melody based on Transformer
CN117809666A (en) Audio conversion method, device, equipment and storage medium
CN115810341B (en) Audio synthesis method, device, equipment and medium
CN112669815A (en) Song customization generation method and corresponding device, equipment and medium
Yanchenko et al. Classical music composition using state space models
CN115273806A (en) Song synthesis model training method and device, song synthesis method and device
CN115019752A (en) Intelligent composition method and device, electronic equipment and storage medium
CN1770258B (en) Rendition style determination apparatus and method
CN116704980B (en) Musical composition generation method, music generation model training method and equipment thereof
CN117690397A (en) Melody processing method, melody processing device, melody processing apparatus, melody processing storage medium, and melody processing program product
Vargas et al. Artificial musical pattern generation with genetic algorithms

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
RJ01: Rejection of invention patent application after publication (application publication date: 2022-09-06)