- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- DALL·E 1: Creating Images from Text: "We've trained a neural network called DALL·E that creates images from text captions for a wide range of concepts expressible in natural language."
- GPT-3: Language Models are Few-Shot Learners
- https://www.alignmentforum.org/posts/Haawpd5rZrzkzvYRC/an-162-foundation-models-a-paradigm-shift-within-ai
- https://arxiv.org/pdf/2108.07258.pdf#page=26
- https://arxiv.org/pdf/2108.07258.pdf#page=34
- https://arxiv.org/pdf/2108.07258.pdf#page=42
- https://arxiv.org/pdf/2108.07258.pdf#page=54
- https://arxiv.org/pdf/2108.07258.pdf#page=85
- The Power of Scale for Parameter-Efficient Prompt Tuning
- https://mailchi.mp/459b1e4f860d/an-152how-weve-overestimated-few-shot-learning-capabilities
- https://thegradient.pub/prompting/
- https://mailchi.mp/aa6782968981/an-155a-minecraft-benchmark-for-algorithms-that-learn-without-reward-functions
- https://arxiv.org/pdf/2108.07258.pdf#page=92
- https://arxiv.org/pdf/2108.07258.pdf#page=114
- https://arxiv.org/pdf/2108.07258.pdf#page=118