- WebGPT: Browser-assisted question-answering with human feedback
- https://x.com/dust4ai/status/1587104029712203778
- GPT-3: Language Models are Few-Shot Learners
- Learning from Human Preferences
- Fine-Tuning GPT-2 from Human Preferences
- https://openai.com/research/learning-to-summarize-with-human-feedback
- https://openai.com/research/summarizing-books
- TruthfulQA: Measuring How Models Mimic Human Falsehoods
- https://openai.com/research/debate
-