Bibliography (7):

  1. https://ml-gsai.github.io/LLaDA-demo/

  2. https://github.com/ML-GSAI/LLaDA

  3. Structured Denoising Diffusion Models in Discrete State-Spaces

  4. RADD: Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data

  5. Attention Is All You Need

  6. The Reversal Curse: LLMs trained on A-is-B fail to learn B-is-A

  7. https://openai.com/index/hello-gpt-4o/