Kainan Peng
Amazon
Verified email at alumni.cmu.edu
Title
Cited by
Year
Deep Voice 3: Scaling text-to-speech with convolutional sequence learning
W Ping, K Peng, A Gibiansky, SO Arik, A Kannan, S Narang, J Raiman, ...
ICLR 2018, 2018
Cited by: 1011*, Year: 2018
Deep Voice 2: Multi-speaker neural text-to-speech
A Gibiansky, S Arik, G Diamos, J Miller, K Peng, W Ping, J Raiman, ...
NIPS 2017, 2962-2970, 2017
Cited by: 711*, Year: 2017
Neural voice cloning with a few samples
S Arik, J Chen, K Peng, W Ping, Y Zhou
NeurIPS 2018, 10019-10029, 2018
Cited by: 567, Year: 2018
ClariNet: Parallel wave generation in end-to-end text-to-speech
W Ping, K Peng, J Chen
ICLR 2019, 2018
Cited by: 475, Year: 2018
Non-Autoregressive Neural Text-to-Speech
K Peng, W Ping, Z Song, K Zhao
ICML 2020, 2019
Cited by: 203*, Year: 2019
WaveFlow: A Compact Flow-based Model for Raw Audio
W Ping, K Peng, K Zhao, Z Song
ICML 2020, 2019
Cited by: 182, Year: 2019
Systems and methods for multi-speaker neural text-to-speech
G Diamos, A Gibiansky, J Miller, K Peng, W Ping, J Raiman, Y Zhou
US Patent 10,896,669, 2021
Cited by: 119, Year: 2021
Systems and methods for neural voice cloning with a few samples
J Chen, K Peng, W Ping, Y Zhou
US Patent 11,238,843, 2022
Cited by: 73, Year: 2022
Systems and methods for neural text-to-speech using convolutional sequence learning
W Ping, K Peng
US Patent 10,796,686, 2020
Cited by: 56, Year: 2020
Incremental text-to-speech synthesis with prefix-to-prefix framework
M Ma, B Zheng, K Liu, R Zheng, H Liu, K Peng, K Church, L Huang
Findings of the Association for Computational Linguistics: EMNLP 2020, 3886-3896, 2020
Cited by: 39, Year: 2020
Vevo: Controllable zero-shot voice imitation with self-supervised disentanglement
X Zhang, X Zhang, K Peng, Z Tang, V Manohar, Y Liu, J Hwang, D Li, ...
arXiv preprint arXiv:2502.07243, 2025
Cited by: 35, Year: 2025
Parallel neural text-to-speech
K Peng, W Ping, Z Song, K Zhao
US Patent 11,017,761, 2021
Cited by: 32, Year: 2021
Systems and methods for parallel wave generation in end-to-end text-to-speech
W Ping, K Peng, J Chen
US Patent 10,872,596, 2020
Cited by: 27, Year: 2020
Multi-speaker end-to-end speech synthesis
J Park, K Zhao, K Peng, W Ping
arXiv preprint arXiv:1907.04462, 2019
Cited by: 22, Year: 2019
VoiceShop: A unified speech-to-speech framework for identity-preserving zero-shot voice editing
P Anastassiou, Z Tang, K Peng, D Jia, J Li, M Tu, Y Wang, Y Wang, M Ma
arXiv preprint arXiv:2404.06674, 2024
Cited by: 11, Year: 2024
Zero-shot accent conversion using pseudo siamese disentanglement network
D Jia, Q Tian, K Peng, J Li, Y Chen, M Ma, Y Wang, Y Wang
arXiv preprint arXiv:2212.05751, 2022
Cited by: 7, Year: 2022
Deep Voice 3: scaling text-to-speech with convolutional sequence learning
W Ping, K Peng, A Gibiansky, SO Arik, A Kannan, S Narang, J Raiman, ...
arXiv preprint, 2017
Cited by: 5, Year: 2017
Waveform generation using end-to-end text-to-waveform system
W Ping, K Peng, J Chen
US Patent 11,482,207, 2022
Cited by: 3, Year: 2022
Multi-speaker neural text-to-speech
G Diamos, A Gibiansky, J Miller, K Peng, W Ping, J Raiman, Y Zhou
US Patent 11,651,763, 2023
Cited by: 2, Year: 2023
SemAlignVC: Enhancing zero-shot timbre conversion using semantic alignment
S Mehta, Y Liu, Z Tang, K Peng, V Manohar, S Zhang, M Seltzer, Q He, ...
arXiv preprint arXiv:2507.09070, 2025
Cited by: 1, Year: 2025