Bibliography (3):
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Attention Is All You Need
Mamba: Linear-Time Sequence Modeling with Selective State Spaces