“Explaining Grokking through Circuit Efficiency”, 2023-09-05:
One of the most surprising puzzles in neural network generalization is grokking: a network with perfect training accuracy but poor generalization will, upon further training, transition to perfect generalization. We propose that grokking occurs when the task admits a generalizing solution and a memorizing solution, where the generalizing solution is slower to learn but more efficient, producing larger logits with the same parameter norm.
We hypothesize that memorizing circuits become more inefficient with larger training datasets while generalizing circuits do not, suggesting there is a critical dataset size at which memorization and generalization are equally efficient.
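The efficiency argument can be illustrated with a toy calculation (a minimal sketch with made-up numbers, not taken from the paper): hold the parameter norm fixed and compare two circuits, where the more efficient one produces larger logits and therefore lower cross-entropy loss, so training with weight decay favors it.

```python
import numpy as np

def cross_entropy(logits, target):
    # Softmax cross-entropy for a single example.
    z = logits - logits.max()
    return -(z[target] - np.log(np.exp(z).sum()))

# Hypothetical illustration: two circuits classify 5 classes, both
# scaled to the same parameter norm. "Efficiency" = logit scale
# achieved per unit of parameter norm (numbers are invented).
norm_budget = 1.0
scale_generalizing = 8.0 * norm_budget  # more efficient circuit
scale_memorizing = 3.0 * norm_budget    # less efficient circuit

target = 2
base = np.array([-1.0, -1.0, 1.0, -1.0, -1.0])  # correct class favored

loss_gen = cross_entropy(scale_generalizing * base, target)
loss_mem = cross_entropy(scale_memorizing * base, target)

# At equal parameter norm, the more efficient circuit attains
# lower loss, so weight decay pressure selects for it.
print(loss_gen < loss_mem)
```

In this picture, growing the training set raises the parameter norm the memorizing circuit needs (it must store more examples) while the generalizing circuit's cost stays roughly fixed, which is what makes a critical dataset size possible.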
We make and confirm four novel predictions about grokking, providing evidence in favor of our explanation. Most strikingly, we demonstrate two novel and surprising behaviors: ungrokking, in which a network regresses from perfect to low test accuracy, and semi-grokking, in which a network shows delayed generalization to partial rather than perfect test accuracy.