RWKV-howto

Possibly useful materials and tutorials for learning RWKV.

RWKV: Parallelizable RNN with Transformer-level LLM Performance.
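
To make the "parallelizable RNN" claim concrete, here is a minimal NumPy sketch of the WKV (weighted key value) recurrence at the core of RWKV's time-mixing, written in its sequential RNN form. This is a didactic sketch, not the official implementation: the function name `wkv_recurrent`, the unbatched `(T, C)` shapes, and the direct `exp(-w)` decay are illustrative assumptions, and the real RWKV-LM code uses a numerically stable, fused CUDA kernel.

```python
import numpy as np

def wkv_recurrent(k, v, w, u):
    """Naive sequential form of the WKV operator (per-channel).

    k, v : (T, C) key and value sequences
    w    : (C,) per-channel decay, applied as exp(-w) each step (assumed >= 0)
    u    : (C,) per-channel "bonus" added to the current token's key
    Returns a (T, C) array of weighted-key-value outputs.
    Illustrative only: no batching, no numerical stabilization.
    """
    T, C = k.shape
    a = np.zeros(C)                 # running exp-weighted sum of values
    b = np.zeros(C)                 # running sum of the exp weights
    out = np.zeros((T, C))
    for t in range(T):
        cur = np.exp(u + k[t])                      # current token gets the bonus u
        out[t] = (a + cur * v[t]) / (b + cur)       # normalized weighted average
        a = np.exp(-w) * a + np.exp(k[t]) * v[t]    # decay the past, add the present
        b = np.exp(-w) * b + np.exp(k[t])
    return out
```

Because the state is just a pair of exponentially decayed prefix sums, inference runs as an RNN with O(1) state per channel, while training can evaluate the same sums in a parallel, transformer-style fashion.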

Relevant Papers

  • 🌟(2023-05) RWKV: Reinventing RNNs for the Transformer Era arXiv

  • (2023-03) Resurrecting Recurrent Neural Networks for Long Sequences arXiv

  • (2023-02) SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks arXiv

  • (2022-08) Simplified State Space Layers for Sequence Modeling ICLR 2023

  • 🌟(2021-05) An Attention Free Transformer arXiv

  • (2021-10) Efficiently Modeling Long Sequences with Structured State Spaces ICLR 2022

  • (2020-08) Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention ICML 2020

  • (2018) Parallelizing Linear Recurrent Neural Nets Over Sequence Length ICLR 2018

  • (2017-09) Simple Recurrent Units for Highly Parallelizable Recurrence EMNLP 2018

  • (2017-11) MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks NeurIPS 2017

  • (2017-06) Attention Is All You Need NeurIPS 2017

  • (2016-11) Quasi-Recurrent Neural Networks ICLR 2017

Resources

  • Introducing RWKV - An RNN with the advantages of a transformer Hugging Face

  • Now that we have the Transformer architecture, can RNNs be abandoned entirely? Zhihu

  • What is the simplest effective form of an RNN? Zhihu

  • 🌟The RNN/CNN duality of RWKV Zhihu

  • Does an RNN's hidden layer need a nonlinearity? Zhihu

  • Google's new work tries to "resurrect" the RNN: can RNNs shine again? Su Jianlin

  • 🌟How the RWKV language model works Johan Sokrates Wind

  • 🌟The RWKV language model: An RNN with the advantages of a transformer Johan Sokrates Wind

  • The Unreasonable Effectiveness of Recurrent Neural Networks Andrej Karpathy blog

Code
