“AdaVAE: Exploring Adaptive GPT-2s in Variational Autoencoders for Language Modeling”, 2022-05-12:
The variational autoencoder (VAE) has become the de-facto learning paradigm for jointly achieving representation learning and generation for natural language.
Nevertheless, existing VAE-based language models either employ elementary RNNs, which are not powerful enough to handle complex tasks in multi-task settings, or fine-tune two pre-trained language models (PLMs) for every downstream task, which is a huge drain on resources.
In this paper, we propose the first VAE framework empowered with adaptive GPT-2s (AdaVAE). Different from existing systems, we unify both the encoder and the decoder of the VAE model using GPT-2s with adaptive parameter-efficient components, and further introduce a Latent Attention operation to better construct the latent space from transformer models.
Experiments along multiple dimensions validate that AdaVAE effectively handles three related tasks (language modeling, representation modeling, and guided text generation), even with less than 15% of parameters activated during training.
Our code is available at https://github.com/ImKeTT/AdaVAE.
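The abstract only names the Latent Attention operation without detail. A minimal sketch of one plausible form, assuming it pools the encoder's token hidden states into a Gaussian latent via an attention-weighted sum (the function and parameter names here are hypothetical, not the paper's actual API):

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def latent_attention(hidden, query, W_mu, W_logvar, rng):
    """Pool token hidden states into a Gaussian latent via attention.

    hidden:         (T, d) encoder hidden states (one per token)
    query:          (d,)   learned query vector (hypothetical parameterization)
    W_mu, W_logvar: (d, z) projections to the latent mean / log-variance
    """
    d = hidden.shape[1]
    # Scaled dot-product attention scores of the query against each token.
    scores = softmax(hidden @ query / np.sqrt(d))   # (T,)
    pooled = scores @ hidden                        # (d,) weighted sum of states
    mu = pooled @ W_mu                              # (z,)
    logvar = pooled @ W_logvar                      # (z,)
    # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    return z, mu, logvar

# Toy usage with random "hidden states".
rng = np.random.default_rng(0)
T, d, zdim = 8, 16, 4
z, mu, logvar = latent_attention(
    rng.standard_normal((T, d)), rng.standard_normal(d),
    rng.standard_normal((d, zdim)), rng.standard_normal((d, zdim)), rng)
print(z.shape)  # (4,)
```

At generation time, z (or mu) would condition the GPT-2 decoder; the exact conditioning mechanism and attention parameterization in AdaVAE are given in the paper, not here.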