- AraBERT: Transformer-based Model for Arabic Language Understanding
- CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters
- BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance
- Unigram LM: Byte Pair Encoding is Suboptimal for Language Model Pretraining
- Unsupervised Cross-lingual Representation Learning at Scale