Bibliography (10):

  1. https://github.com/KindXiaoming/Omnigrok

  2. https://theinsideview.ai/eric#grokking

  3. The Goldilocks zone: Towards better understanding of neural network loss landscapes

  4. Decoupled Weight Decay Regularization

  5. 2022-liu-figure1-goldilockszoneofinitializationandrelationshiptogrokking.png

  6. https://arxiv.org/pdf/2210.01117#page=4

  7. 2022-liu-figure7-transformergrokkingvsweightnormformodularaddition.png

  8. Progress measures for grokking via mechanistic interpretability