Bibliography (5):

  1. DeBERTa: Decoding-enhanced BERT with Disentangled Attention

  2. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

  3. RoBERTa: A Robustly Optimized BERT Pretraining Approach

  4. SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

  5. Wikipedia: Ensemble learning