Bibliography (7):

https://x.com/AnthropicAI/status/1547250801130713090
Teaching Models to Express Their Uncertainty in Words
Reducing conversational agents’ overconfidence through linguistic calibration
Self-Consistency Improves Chain-of-Thought Reasoning in Language Models
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
https://github.com/google/BIG-bench
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback