Stars
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…
Official PyTorch implementation of CoverHunter
Train transformer language models with reinforcement learning.
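Chinese LLM capability evaluation leaderboard: currently covers 128 large models, including commercial models such as chatgpt, gpt-4o, Google gemini, Baidu ERNIE Bot (文心一言), Alibaba Tongyi Qianwen, Baichuan, iFlytek Spark, SenseTime senseChat, and minimax, as well as open-source models such as qwen2.5, llama3.1, glm4, internLM2.5, openbuddy, and AquilaChat. It provides not only a capability score leaderboard but also the raw outputs of all models!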
Pitch and duration tuner for audio fragments
A demo fingerprint implementation in Python
Fast constant-Q transform feature, implemented in C++
C++ Parallel Computing and Asynchronous Networking Framework
Natural Language Processing for the next decade. Tokenization, Part-of-Speech Tagging, Named Entity Recognition, Syntactic & Semantic Dependency Parsing, Document Classification
Please visit https://thuhcsi.github.io/icassp2022-hybrid-bottleneck-vc/
A 10000+ hours dataset for Chinese speech recognition
😝 TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supports English, French, Korean, Chinese, and German; easy to adapt to other languages)
Open Language Pre-trained Model Zoo
Real-time audio analysis library that supports acoustic feature extraction and real-time beat detection
TTS front end with BERT and CRF/LSTM (for Tacotron)
A TensorFlow implementation of Google's Tacotron speech synthesis with pre-trained model (unofficial)
Speech recognition software where the neural net is trained with TensorFlow and GMM training and decoding is done in Kaldi
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
An attempt at tracking the state of the art and recent results (bibliography) in speech recognition.
Roadmap of DL and ML, with some courses, study notes, and paper summaries
End-to-end Automatic Speech Recognition for Mandarin and English in Tensorflow