Bibliography (4):

  1. RoBERTa: A Robustly Optimized BERT Pretraining Approach

  2. PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation

  3. CPM-2: Large-scale Cost-effective Pre-trained Language Models

  4. Wikipedia Bibliography:

    1. Ceiling effect (statistics)