“Self-Play Reinforcement Learning Guides Protein Engineering”, Yi Wang, Hui Tang, Lichao Huang, Lulu Pan, Lixiang Yang, Huanming Yang, Feng Mu, Meng Yang2023-07-20 (, )⁠:

[code, Code Ocean] Designing protein sequences towards desired properties is a fundamental goal of protein engineering, with applications in drug discovery and enzymatic engineering. Machine learning-guided directed evolution has shown success in expediting the optimization cycle and reducing experimental burden. However, efficient sampling in the vast design space remains a challenge.

To address this, we propose EvoPlay, a self-play reinforcement learning framework based on the single-player version of AlphaZero. In this work, we mutate a single-site residue as an action to optimize protein sequences, analogous to playing pieces on a chessboard. A policy-value neural network reciprocally interacts with look-ahead Monte Carlo tree search to guide the optimization agent with breadth and depth.

We extensively evaluate EvoPlay on a suite of in silico directed evolution tasks over full-length sequences or combinatorial sites using functional surrogates. EvoPlay also supports AlphaFold2 as a structural surrogate to design peptide binders with high affinities, validated by binding assays.

Moreover, we harness EvoPlay to prospectively engineer luciferase, resulting in the discovery of variants with 7.8× bioluminescence improvement beyond wild type.

In sum, EvoPlay holds great promise for facilitating protein design to tackle unmet academic, industrial and clinical needs.