GitHub and other sources have reported that more than 50% of developers adopted AI-assisted development during 2023. What these sources haven’t reported is how the composition of code changes when AI is used.
We examine four years’ worth of data, encompassing more than 150 million changed lines of code, to determine how AI assistants influence the quality of the code being written. We find a substantial uptick in code churn and a concerning decrease in code reuse.
2023 marked the coming-out party for GitHub Copilot. In less than two years’ time, the AI programming assistant shot from “prototype” to “cornerstone”, used by millions of developers across hundreds of thousands of businesses. Its unprecedented growth defines a new era in “how code gets written.”
GitHub has published several insightful pieces of research on the growth and impact of AI on software development. Among their findings is that developers write code “55% faster” when using Copilot. This profusion of LLM-generated code raises the question: how do the quality and maintainability of that code compare to what a human would have written? Is it more similar to the careful, refined contributions of a Senior Developer, or more akin to the disjointed work of a short-term contractor?
To investigate, GitClear analyzed ~153 million changed lines of code, authored between January 2020 and December 2023 [A1]. This is the largest known database of highly structured code change data that has been used to evaluate code quality differences [A2].
We find disconcerting trends for maintainability. Code churn (the percentage of lines that are reverted or updated less than two weeks after being authored) is projected to double in 2024 compared to its 2021, pre-AI baseline. We further find that the share of “added code” and “copy/pasted code” is increasing relative to “updated”, “deleted”, and “moved” code. In this regard, code generated during 2023 more closely resembles the work of an itinerant contributor, prone to violating the DRY-ness of the repos visited.
We conclude with suggestions for managers seeking to maintain high code quality in spite of the forces currently opposing it.
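To make the churn definition above concrete, here is a minimal sketch of how the metric could be computed from per-line history. It is an illustration only, not GitClear's implementation: the `LineRecord` type, its fields, and the function names are all hypothetical, and the sketch assumes each authored line carries the timestamp of its first subsequent revert or update (if any).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# "Less than two weeks", per the churn definition in the text.
CHURN_WINDOW = timedelta(days=14)


@dataclass
class LineRecord:
    """Hypothetical record for one authored line (not GitClear's schema)."""
    authored_at: datetime
    # When the line was first reverted or updated; None if it never was.
    changed_at: Optional[datetime] = None


def is_churned(line: LineRecord) -> bool:
    """True if the line was reverted or updated within the churn window."""
    if line.changed_at is None:
        return False
    return line.changed_at - line.authored_at < CHURN_WINDOW


def churn_percentage(lines: list[LineRecord]) -> float:
    """Percentage of authored lines that churned; 0.0 for empty input."""
    if not lines:
        return 0.0
    churned = sum(is_churned(line) for line in lines)
    return 100.0 * churned / len(lines)


if __name__ == "__main__":
    t0 = datetime(2023, 3, 1)
    sample = [
        LineRecord(t0),                           # never changed
        LineRecord(t0, t0 + timedelta(days=3)),   # changed after 3 days: churned
        LineRecord(t0, t0 + timedelta(days=14)),  # exactly two weeks: not churned
        LineRecord(t0, t0 + timedelta(days=40)),  # changed much later: not churned
    ]
    print(churn_percentage(sample))  # 25.0
```

The strict `<` comparison follows the wording "less than two weeks"; whether a change at exactly 14 days should count is a boundary choice the text does not specify.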
Code Churn by Year
…GitClear’s data is split roughly two-thirds private corporations that have opted in to anonymized data sharing and one-third open source projects (mostly those run by Google, Facebook, and Microsoft). In addition to the code operation data, GitClear’s data set also segments and excludes lines that fall within auto-generated files or subrepo commits, or that meet the other exclusionary criteria enumerated in this documentation. As of January 2024, that documentation suggests that a little less than half of the “lines changed” counted by a conventional git stats aggregator (e.g., GitHub) would qualify for analysis among the 150 million lines in this study. The study does include commented lines; future research could compare comment vs. non-comment lines. It could also compare “test code” vs. “other types of code”, a distinction that probably influences the levels of copy/paste.