********* June 19, 2020 *********
The code and data are going through the internal review and will be released later!
********* August 26, 2020 *********
The dataset is still going through the internal review, please wait.
********* September 7, 2020 *********
The code & pose data are released!
This repo is the PyTorch implementation of "Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning". Our proposed approach significantly outperforms existing state-of-the-art methods in extensive experiments, on both automatic metrics and human judgements. From input music clips in the test set, it can generate creative long dance sequences (e.g., about one minute at 15 FPS) that are smooth, natural-looking, diverse, style-consistent, and matched to the beat of the music. With the help of 3D human pose estimation and 3D animation driving, this technique can be used to drive various 3D character models, such as the 3D model of Hatsune Miku (a very popular virtual character in Japan), and has great potential for virtual advertisement video generation.
Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning.
Ruozi Huang*, Huang Hu*, Wei Wu, Kei Sawada, Mi Zhang and Daxin Jiang.
[arXiv] [YouTube] [Project]
- Python 3.7
- PyTorch 1.3.1
Run sh install.sh to configure the environment.
- We released the dance pose data and the corresponding audio data on [Google Drive]. Please put the downloaded data/ into the project directory DanceRevolution/ and run prepro.py, which will generate the training data directory data/train and the test data directory data/test. The pose sequences are extracted from the collected dance videos at the original 30 FPS, while the audio data is in m4a format. Note that we developed a simple linear interpolation algorithm, interpolate_missing_keyjoints.py, to recover missing keyjoints and reduce the noise in the pose data, which is introduced by the imperfect extraction of OpenPose.
- If you plan to train the model with your own dance data, please install [OpenPose] for human pose extraction. After that, follow the hierarchical structure of the data/ directory to place your own extracted data, and run prepro.py to generate the training data and test data.
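To illustrate the idea of linearly interpolating missing keyjoints, here is a minimal sketch. It is not the code in interpolate_missing_keyjoints.py. The function name, the (frames, joints, 2) array layout, and the convention that a missing joint is stored as (0, 0) are all assumptions made for this example.

```python
import numpy as np


def interpolate_missing_joints(poses, missing_value=0.0):
    """Fill missing 2D joints by linear interpolation over time.

    poses: array of shape (num_frames, num_joints, 2).
    A joint is treated as missing in a frame when all of its coordinates
    equal `missing_value` (an assumption for this sketch).
    Returns a new array; the input is not modified.
    """
    poses = np.asarray(poses, dtype=float)
    filled = poses.copy()
    num_frames, num_joints, num_coords = poses.shape
    frames = np.arange(num_frames)

    for j in range(num_joints):
        missing = np.all(poses[:, j, :] == missing_value, axis=1)
        # Nothing to do if the joint is never missing, and nothing
        # to interpolate from if it is missing in every frame.
        if not missing.any() or missing.all():
            continue
        known = ~missing
        for c in range(num_coords):
            # np.interp interpolates linearly between known frames and
            # holds the nearest known value at the sequence boundaries.
            filled[missing, j, c] = np.interp(
                frames[missing], frames[known], poses[known, j, c]
            )
    return filled


if __name__ == "__main__":
    # One joint over three frames; the middle frame is missing.
    demo = np.array([[[1.0, 1.0]], [[0.0, 0.0]], [[3.0, 5.0]]])
    # The middle frame is filled with the midpoint (2.0, 3.0).
    print(interpolate_missing_joints(demo)[1, 0])
```

Interpolating each joint independently keeps the sketch simple. A joint that is missing at the start or end of a sequence is held at its nearest detected position instead of being extrapolated.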
- Ballet style
- Hiphop style
- Japanese Pop style
- Photo-Realistic Videos by vid2vid
We map the generated skeleton dances to photo-realistic videos with vid2vid. Specifically, we record a random dance video of a team member to train the vid2vid model. Then we generate photo-realistic videos by feeding the generated skeleton dances to the trained vid2vid model. Note that our team member has authorized us to use her portrait in the following demos.
- Driving 3D model by applying 3D human pose estimation and Unity animation to generated skeleton dances.
If you find this work useful for your research, please cite the following paper:
@article{huang2020dance,
title={Dance Revolution: Long Sequence Dance Generation with Music via Curriculum Learning},
author={Huang, Ruozi and Hu, Huang and Wu, Wei and Sawada, Kei and Zhang, Mi},
journal={arXiv preprint arXiv:2006.06119},
year={2020}
}