- https://ml-gsai.github.io/LLaDA-demo/
- https://github.com/ML-GSAI/LLaDA
- Structured Denoising Diffusion Models in Discrete State-Spaces
- RADD: Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
- Attention Is All You Need
- The Reversal Curse: LLMs trained on A-is-B fail to learn B-is-A
- https://openai.com/index/hello-gpt-4o/