“Linear Algebra With Transformers”, François Charton, 2021-12-03:

Transformers can learn to perform numerical computations from examples only.

I study 9 problems of linear algebra, from basic matrix operations to eigenvalue decomposition & matrix inversion, and introduce and discuss 4 encoding schemes to represent real numbers.
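To make "encoding schemes to represent real numbers" concrete, here is a minimal sketch of one plausible scheme: each real is rounded to a few significant digits and written as a sign token, one token per mantissa digit, and an exponent token. This is an illustration only. The function names `encode_real` and `decode_real`, the 3-digit default, and the token spellings are assumptions, and the scheme is not claimed to match any of the 4 encodings in the paper.

```python
import math

def encode_real(x, digits=3):
    """Encode x as [sign, d1, ..., dn, exponent] tokens (hypothetical scheme).

    The tokens represent sign * mantissa * 10**exponent, where the mantissa
    is an integer with `digits` digits.
    """
    if x == 0:
        return ["+"] + ["0"] * digits + ["E0"]
    sign = "+" if x > 0 else "-"
    mag = abs(x)
    # Choose the exponent so that the mantissa has `digits` digits.
    exp = math.floor(math.log10(mag)) - (digits - 1)
    mantissa = round(mag / 10 ** exp)
    if mantissa == 10 ** digits:  # rounding overflow, e.g. 9.996 -> 1000
        mantissa //= 10
        exp += 1
    return [sign] + list(str(mantissa)) + [f"E{exp}"]

def decode_real(tokens):
    """Invert encode_real, up to the rounding applied during encoding."""
    sign = 1 if tokens[0] == "+" else -1
    mantissa = int("".join(tokens[1:-1]))
    exp = int(tokens[-1][1:])
    return sign * mantissa * 10.0 ** exp

print(encode_real(3.14))   # ['+', '3', '1', '4', 'E-2']
print(encode_real(-0.5))   # ['-', '5', '0', '0', 'E-3']
```

Under such a scheme a matrix would be flattened into a sequence of these token groups, so sequence length grows with both the matrix size and the number of digits kept. That is the kind of trade-off a choice of encoding has to manage.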

On all problems, transformers trained on sets of random matrices achieve high accuracies (over 90%). The models are robust to noise, and can generalize out of their training distribution. In particular, models trained to predict Laplace-distributed eigenvalues generalize to different classes of matrices: Wigner matrices or matrices with positive eigenvalues. The reverse is not true.
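The matrix classes mentioned above can be sketched with NumPy. The block below builds a symmetric matrix with prescribed Laplace-distributed eigenvalues by conjugating a diagonal matrix with a random orthogonal matrix, and builds a Gaussian Wigner-type matrix by symmetrizing a matrix of i.i.d. normal entries. The helper names, the matrix size, and the scaling are assumptions for illustration and do not reproduce the paper's data generator.

```python
import numpy as np

def symmetric_with_eigenvalues(eigs, rng):
    """Return Q diag(eigs) Q^T for a random orthogonal Q (hypothetical helper)."""
    n = len(eigs)
    # QR factorization of a Gaussian matrix yields an orthogonal factor Q.
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q @ np.diag(eigs) @ q.T

def wigner_matrix(n, rng):
    """Symmetric matrix with Gaussian entries (off-diagonal variance 1)."""
    a = rng.standard_normal((n, n))
    return (a + a.T) / np.sqrt(2)

rng = np.random.default_rng(0)
n = 5

# Symmetric matrix whose eigenvalues are drawn from a Laplace distribution.
laplace_eigs = rng.laplace(loc=0.0, scale=1.0, size=n)
m = symmetric_with_eigenvalues(laplace_eigs, rng)

# eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
recovered = np.linalg.eigvalsh(m)
print(np.allclose(np.sort(laplace_eigs), recovered))

# A Wigner-type matrix, whose spectrum is not under direct control.
w = wigner_matrix(n, rng)
print(np.linalg.eigvalsh(w))
```

With the first helper the eigenvalue distribution can be chosen freely (for example, all positive), whereas the Wigner construction fixes the entry distribution and leaves the spectrum to follow from it.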