Deep Double Descent: We show that the double descent phenomenon occurs in CNNs, ResNets, and transformers: performance first improves, then gets worse, and then improves again with increasing model size, data size, or training time.
https://arxiv.org/abs/1912.02292
Visualizing the Loss Landscape of Neural Nets
https://arxiv.org/abs/1712.09913
Essentially No Barriers in Neural Network Energy Landscape
https://arxiv.org/abs/1803.00885
A jamming transition from under- to over-parametrization affects loss landscape and generalization
https://arxiv.org/abs/1803.06969