“Minimum Description Length Recurrent Neural Networks”, Nur Lan, Michal Geyer, Emmanuel Chemla & Roni Katzir (2021-10-31):

We train neural networks to optimize a Minimum Description Length (MDL) score, i.e., to balance the complexity of the network against its accuracy on a task.

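The objective has the standard two-part MDL form (a sketch; the paper's exact encoding scheme is its own). Writing $H$ for the network and $D$ for the training data:

$$\mathrm{MDL}(H, D) = |H| + |D : H|$$

where $|H|$ is the number of bits needed to encode the network (its weights and connectivity) and $|D : H| = -\log_2 P(D \mid H)$ is the number of bits needed to encode the data given the network. Growing the network can lower $|D : H|$ but raises $|H|$, so the minimizer is the smallest network that still predicts the data well.
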
We show that networks optimizing this objective function master tasks involving memory challenges and go beyond context-free languages. These learners master languages such as aⁿbⁿ, aⁿbⁿcⁿ, aⁿb²ⁿ, and aⁿbᵐcⁿ⁺ᵐ, and they perform addition. Moreover, they often do so with 100% accuracy.

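To make the target languages concrete, here is a minimal sketch (in Python, not the authors' code) of generators and a recognizer for two of them; aⁿbⁿcⁿ is the classic example of a language beyond context-free, since accepting it requires matching counts across three blocks:

```python
# Minimal illustration (not the authors' code) of two of the target languages.

def sample_anbn(n: int) -> str:
    """A string of the context-free language a^n b^n."""
    return "a" * n + "b" * n

def sample_anbncn(n: int) -> str:
    """A string of a^n b^n c^n, which is not context-free."""
    return "a" * n + "b" * n + "c" * n

def in_anbncn(s: str) -> bool:
    """Exact membership test for a^n b^n c^n."""
    n = len(s) // 3
    return len(s) == 3 * n and s == "a" * n + "b" * n + "c" * n

print(sample_anbncn(3))        # aaabbbccc
print(in_anbncn("aaabbbccc"))  # True
print(in_anbncn("aaabbbcc"))   # False
```

Writing such a recognizer by hand is trivial; the point is that the trained networks recover the same exact counting behavior purely from example sequences.
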
The networks are small, and their inner workings are transparent. We thus provide formal proofs that their perfect accuracy holds not only on a given test set, but for any input sequence.

To our knowledge, no other connectionist model has been shown to capture the underlying grammars for these languages in full generality.