“Language-Conditioned Absolute Unit NNs”, Gwern 2022-10-22:

Proposal for applying the AUNN neural net architecture to reconstruction of historical documents using pretrained large language models.

As an application of my proposed AUNN MLP neural net architecture, which can handle arbitrary-modality data, I sketch out a system for plugging large language models (LLMs) into AUNNs. The advantage is that this efficiently provides a highly-informative Greco-Roman language prior for reconstructing the text of damaged Herculaneum papyri imaged with advanced modalities such as X-ray scanning.
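To make the setup concrete, here is a toy PyTorch sketch of such an AUNN with one output head per modality; the Fourier featurization, layer sizes, and two-head readout are my illustrative assumptions, not fixed parts of the proposal:

```python
# Toy AUNN sketch (all details illustrative): an MLP mapping an absolute
# index n to whatever datum lives at that index: here, either a raw X-ray
# voxel value or a belief (logits) over the Greek/Latin token at n.
import torch
import torch.nn as nn

class AUNN(nn.Module):
    def __init__(self, n_features=256, d_hidden=1024, vocab=50257):
        super().__init__()
        # Random Fourier features lift a raw integer index into a
        # high-dimensional input that an MLP can fit smoothly.
        self.register_buffer("freqs", torch.randn(n_features) * 10.0)
        self.trunk = nn.Sequential(
            nn.Linear(2 * n_features, d_hidden), nn.GELU(),
            nn.Linear(d_hidden, d_hidden), nn.GELU(),
        )
        self.voxel_head = nn.Linear(d_hidden, 1)      # physical measurement at n
        self.token_head = nn.Linear(d_hidden, vocab)  # implicit text at n

    def forward(self, idx):                           # idx: (batch,) absolute indices
        x = idx.float().unsqueeze(-1) * self.freqs
        h = self.trunk(torch.cat([x.sin(), x.cos()], dim=-1))
        return self.voxel_head(h), self.token_head(h)
```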

Because there is so little raw data, and obtaining more will remain infeasible until convincing reconstructions can justify the risk of excavating more fragile papyri (a chicken-and-egg bootstrap problem), it is critical to use all available sources of information jointly & end-to-end.

Because an AUNN concentrates all raw data about all papyri into a single model, it can generate embeddings of the implicit reconstructed text at given locations in a papyrus. These embeddings are differentiable, so they can be fed into a frozen Greco-Roman large language model to be scored for their plausibility as real natural language, and the score then run backwards to update the AUNN weights to emit more plausible embeddings.
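One way to implement this loop (a sketch only: GPT-2 via HuggingFace `transformers` stands in for a Greco-Roman LLM, and the soft-token scoring rule below is just one possible choice): the AUNN emits a distribution over tokens at each location, the expected token embedding is fed to the frozen LLM as a differentiable input, and the loss penalizes disagreement between the LLM's next-token predictions and the AUNN's own beliefs.

```python
# Sketch of scoring the AUNN's implicit text with a frozen LLM and running
# the score backwards; GPT-2 is a stand-in for a Greco-Roman model, and the
# AUNN class is the toy one sketched above.
import torch
from transformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained("gpt2").eval()
for p in llm.parameters():
    p.requires_grad_(False)               # the language prior stays frozen

E = llm.get_input_embeddings().weight     # (vocab, d_model) token embedding matrix

def language_prior_loss(aunn, indices):
    """Differentiable implausibility of the AUNN's implicit text at `indices`."""
    _, token_logits = aunn(indices)       # (seq, vocab) beliefs per location
    probs = token_logits.softmax(-1)
    soft_embeds = probs @ E               # expected embeddings: differentiable "soft tokens"
    lm_logprobs = llm(inputs_embeds=soft_embeds.unsqueeze(0)).logits[0, :-1].log_softmax(-1)
    # Cross-entropy between the LLM's next-token predictions and the AUNN's
    # belief about each next location: small iff the text reads as real language.
    return -(probs[1:] * lm_logprobs).sum(-1).mean()

aunn = AUNN(vocab=llm.config.vocab_size)
loss = language_prior_loss(aunn, torch.arange(512))  # 512 consecutive text locations
loss.backward()                           # gradients flow into the AUNN weights only
```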

This constrains the naive raw/physics-only reconstructions (which are highly under-determined by the raw data) to the vanishingly small subset of reconstructions consistent with our extensive data on Greek/Latin natural language, and can potentially produce meaningful reconstructions beyond the reach of conventional approaches using naive priors & separate analyses.
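Putting the pieces together, the whole system collapses into a single objective optimized end-to-end over the one AUNN; this training loop reuses the sketches above, with synthetic stand-in data and a loss weighting that are, again, purely illustrative:

```python
# Joint end-to-end objective (illustrative): fit the raw X-ray measurements
# while simultaneously penalizing linguistically implausible implicit text.
# AUNN & language_prior_loss are from the sketches above; data is synthetic.
import torch
import torch.nn.functional as F

aunn = AUNN(vocab=llm.config.vocab_size)
opt = torch.optim.Adam(aunn.parameters(), lr=1e-4)
lam = 0.1                                 # weight of the language prior (a guess)

voxel_indices = torch.arange(4096)        # indices holding raw X-ray data
xray_values = torch.randn(4096, 1)        # stand-in for real measurements
text_indices = torch.arange(512)          # indices holding the implicit text

for step in range(1_000):
    voxels, _ = aunn(voxel_indices)
    loss = F.mse_loss(voxels, xray_values) + lam * language_prior_loss(aunn, text_indices)
    opt.zero_grad()
    loss.backward()
    opt.step()
```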