Bibliography (7):

  1. Layer Normalization

  2. Pointer Sentinel Mixture Models

  3. https://github.com/sIncerass/powernorm