https://github.com/ggerganov/llama.cpp/pull/1773 (NN sampling, Codex, inner monologue (AI))