“AceGPT, Localizing Large Language Models in Arabic”, 2023-09-21:
This paper is devoted to the development of a localized Large Language Model (LLM) specifically for Arabic, a language imbued with unique cultural characteristics inadequately addressed by current mainstream models. Significant concerns emerge when addressing cultural sensitivity and local values.
To address this, the paper proposes a comprehensive solution that includes further pre-training with Arabic texts, Supervised Fine-Tuning (SFT) using native Arabic instructions paired with GPT-4 responses in Arabic, and Reinforcement Learning with AI Feedback (RLAIF) employing a reward model attuned to local culture and values. The goal is to cultivate culturally cognizant and value-aligned Arabic LLMs capable of accommodating the diverse, application-specific needs of Arabic-speaking communities.
Comprehensive evaluations reveal that the resulting model, dubbed AceGPT, sets the state-of-the-art standard for open Arabic LLMs across various benchmarks, including instruction-following benchmarks (i.e., Arabic Vicuna-80 and Arabic AlpacaEval), knowledge benchmarks (i.e., Arabic MMLU and EXAMs), and the newly introduced Arabic Cultural and Value Alignment benchmark. Notably, AceGPT outperforms Turbo in the popular Vicuna-80 benchmark when evaluated with GPT-4, despite the benchmark’s limited scale.
Code, data, and models are available on GitHub.