“A Library for Representing Python Programs As Graphs for Machine Learning”, 2022-08-15:
Graph representations of programs are commonly a central element of machine learning for code research.
We introduce an open-source Python library, python_graphs, that applies static analysis to construct graph representations of Python programs suitable for training machine learning models. Our library admits the construction of control-flow graphs, data-flow graphs, and composite “program graphs” that combine control-flow, data-flow, syntactic, and lexical information about a program.
We present the capabilities and limitations of the library, perform a case study applying the library to millions of competitive programming submissions, and showcase the library’s utility for machine learning research.
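
To make the idea of a composite graph concrete, the following is a minimal sketch that uses only the standard-library ast module. It is not the python_graphs API. The function name build_toy_program_graph and the edge labels "child" and "last_write" are hypothetical, chosen for illustration. The data-flow edges are a crude approximation: they follow source order only and ignore control flow and evaluation order, which a real static analysis would have to account for.

```python
import ast

# AST helper nodes (contexts, operators) can be shared between parents,
# so this sketch leaves them out of the graph.
_SKIPPED = (ast.expr_context, ast.operator, ast.unaryop, ast.boolop, ast.cmpop)


def build_toy_program_graph(source):
    """Return (nodes, edges) for a toy graph over the AST of `source`.

    Hypothetical illustration only; this is not how python_graphs works.
    nodes: list of AST node type names, indexed by node id.
    edges: list of (src, dst, edge_type) tuples.
    """
    tree = ast.parse(source)
    ast_nodes = [n for n in ast.walk(tree) if not isinstance(n, _SKIPPED)]
    index_of = {id(n): i for i, n in enumerate(ast_nodes)}
    nodes = [type(n).__name__ for n in ast_nodes]
    edges = []

    # Syntactic edges: parent -> child in the AST.
    for parent in ast_nodes:
        for child in ast.iter_child_nodes(parent):
            if id(child) in index_of:
                edges.append((index_of[id(parent)], index_of[id(child)], "child"))

    # Crude data-flow edges: link each variable read to the most recent
    # write of the same name in source order (control flow is ignored).
    names = [n for n in ast_nodes if isinstance(n, ast.Name)]
    names.sort(key=lambda n: (n.lineno, n.col_offset))
    last_write = {}
    for name in names:
        if isinstance(name.ctx, ast.Store):
            last_write[name.id] = index_of[id(name)]
        elif isinstance(name.ctx, ast.Load) and name.id in last_write:
            edges.append((last_write[name.id], index_of[id(name)], "last_write"))

    return nodes, edges


nodes, edges = build_toy_program_graph("x = 1\ny = x + 2\n")
for src, dst, kind in edges:
    print(f"{nodes[src]}[{src}] -{kind}-> {nodes[dst]}[{dst}]")
```

Keeping edge types as plain string labels on a single edge list mirrors the composite idea in the abstract: several kinds of information about one program share one set of nodes, and a model can treat each edge type separately.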