Bibliography (5):

GPT-3: Language Models are Few-Shot Learners
Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks
https://x.com/denny_zhou/status/1532104072353808384
PaLM: Scaling Language Modeling with Pathways
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs