Reducing conversational agents’ overconfidence through linguistic calibration
Self-Consistency Improves Chain-of-Thought Reasoning in Language Models
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback