- https://www.zhihu.com/question/456443707
- https://zhuanlan.zhihu.com/p/367666974
- GPT-3: Language Models are Few-Shot Learners
- PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
- Attention Is All You Need