âNaturalProofs: Mathematical Theorem Proving in Natural Languageâ, 2021-03-24 (; backlinks; similar)â :
Understanding and creating mathematics using natural mathematical languageâthe mixture of symbolic and natural language used by humansâis a challenging and important problem for driving progress in machine learning.
As a step in this direction, we develop NaturalProofs, a multi-domain corpus of mathematical statements and their proofs, written in natural mathematical language. NaturalProofs unifies broad coverage, deep coverage, and low-resource mathematical sources, allowing for evaluating both in-distribution and zero-shot generalization.
Using NaturalProofs, we benchmark strong neural methods on mathematical reference retrieval and generation tasks which test a systemâs ability to determine key results that appear in a proof. Large-scale sequence models show promise compared to classical information retrieval methods, yet their performance and out-of-domain generalization leave substantial room for improvement.
NaturalProofs opens many avenues for research on challenging mathematical tasks.