“Co-Training Improves Prompt-Based Learning for Large Language Models”, 2022-02-02:
We demonstrate that co-training (Blum & Mitchell, 1998) can improve the performance of prompt-based learning by using unlabeled data. While prompting has emerged as a promising paradigm for few-shot and zero-shot learning, it is often brittle and requires much larger models than the standard supervised setup.
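Classic co-training trains two classifiers on two "views" of the data, letting each one pseudo-label its most confident unlabeled examples for the shared training set. A minimal toy sketch of that loop (synthetic data and a nearest-centroid classifier as illustrative stand-ins; this is not the paper's actual setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-view dataset: both views carry the hidden label signal plus noise.
n, d = 400, 4
y_true = rng.integers(0, 2, n)
view_a = y_true[:, None] + 0.7 * rng.standard_normal((n, d))
view_b = y_true[:, None] + 0.7 * rng.standard_normal((n, d))

def fit(X, y):
    """Nearest-centroid classifier: one centroid per class."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, X):
    """Return predicted labels and a confidence margin per example."""
    dist = np.stack([np.linalg.norm(X - c, axis=1) for c in centroids])
    return dist.argmin(axis=0), np.abs(dist[0] - dist[1])

# Tiny labeled seed set with both classes represented.
idx = np.concatenate([np.where(y_true == 0)[0][:4],
                      np.where(y_true == 1)[0][:4]])
labels = y_true[idx].copy()

for _ in range(20):  # co-training rounds
    pool = np.setdiff1d(np.arange(n), idx)
    ca, cb = fit(view_a[idx], labels), fit(view_b[idx], labels)
    pa, ma = predict(ca, view_a[pool])
    pb, mb = predict(cb, view_b[pool])
    # Each view pseudo-labels its most confident pool example and hands the
    # label to the shared training set, growing it without new annotation.
    for pred, margin in ((pa, ma), (pb, mb)):
        j = margin.argmax()
        idx = np.append(idx, pool[j])
        labels = np.append(labels, pred[j])

final_acc = (predict(fit(view_a[idx], labels), view_a)[0] == y_true).mean()
print(f"view-A accuracy after co-training: {final_acc:.2f}")
```

In the paper's setting the two "views" are instead the prompt model's outputs and a smaller task-specific model's features, but the confidence-based pseudo-labeling loop is the same idea.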
We find that co-training makes it possible to improve the original prompt model and at the same time learn a smaller, downstream task-specific model. In the case where we only have partial access to a prompt model (e.g. output probabilities from GPT-3 (Brown et al. 2020)), we learn a calibration model over the prompt outputs. When we have full access to the prompt model’s gradients but full finetuning remains prohibitively expensive (e.g. T0 (Sanh et al. 2021)), we learn a set of continuous soft-prompt vectors to iteratively update the prompt model.
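In the partial-access setting, a calibration model can be as simple as a small classifier trained on the frozen prompt model's output probabilities. A minimal sketch, using synthetic, deliberately biased probabilities as a stand-in for real GPT-3 outputs and plain logistic regression as the calibrator (both are illustrative assumptions, not the paper's exact method):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for a frozen prompt model's class probabilities:
# the generator systematically over-scores class 0, so raw argmax is biased.
n = 500
y = rng.integers(0, 2, n)
logit0 = 1.0 + 0.5 * rng.standard_normal(n)
logit1 = 1.6 * y - 0.3 + 0.5 * rng.standard_normal(n)
logits = np.stack([logit0, logit1], axis=1)
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Calibration model: logistic regression over the prompt probabilities,
# trained on a small labeled slice by plain gradient descent.
X_tr, y_tr = probs[:100], y[:100]
w, b = np.zeros(2), 0.0
for _ in range(3000):
    p = 1 / (1 + np.exp(-(X_tr @ w + b)))   # sigmoid prediction
    grad = p - y_tr                          # logistic-loss gradient
    w -= 0.5 * X_tr.T @ grad / len(y_tr)
    b -= 0.5 * grad.mean()

raw_acc = (probs.argmax(axis=1) == y).mean()
cal_acc = ((probs @ w + b > 0) == y).mean()
print(f"raw argmax accuracy: {raw_acc:.2f}")
print(f"calibrated accuracy: {cal_acc:.2f}")
```

The point of the sketch is that the calibrator never touches the prompt model's weights; it only re-maps the output probabilities, which is all the partial-access setting allows.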
We find that models trained in this manner can improve performance on challenging datasets where there is currently a large gap between prompt-based learning and fully-supervised models.