“CodeCompose: A Large-Scale Industrial Deployment of AI-Assisted Code Authoring”, 2023-05-20:
The rise of large language models (LLMs) has unlocked various applications of this technology in software development. In particular, generative LLMs have been shown to effectively power AI-based code authoring tools that can suggest entire statements or blocks of code during code authoring. In this paper we present CodeCompose, an AI-assisted code authoring tool developed and deployed at Meta [Facebook] internally. CodeCompose is based on the InCoder LLM [MoE] that merges generative capabilities with bi-directionality […it can understand inline comments in natural language and generate code that adheres to the comment, as shown in Figure 1(b). It can also fluently generate comments, messages, and documentation.].
We have scaled up CodeCompose to serve tens of thousands of developers at Meta, across 10+ programming languages [Python, JavaScript, C++, Hack, etc.], and several coding surfaces… Training for 4 epochs with sharded data parallelism took 4 days on a cluster of 128 A100 GPUs. We then deployed the model on a cluster of 150 A100 GPUs.
We discuss unique challenges in terms of user experience and metrics that arise when deploying such tools in large-scale industrial settings. We present our experience in making design decisions about the model and system architecture for CodeCompose that addresses these challenges.
Finally, we present metrics from our large-scale deployment of CodeCompose that show its impact on Meta’s internal code authoring experience over a 15-day time window, where 4.5 million suggestions were made by CodeCompose.
Quantitative metrics reveal that (1) CodeCompose has an acceptance rate of 22% across several languages, and (2) 8% of the code typed by users of CodeCompose is through accepting code suggestions from CodeCompose. Qualitative feedback indicates an overwhelming 91.5% positive reception for CodeCompose.
In addition to assisting with code authoring, CodeCompose is also introducing other positive side effects such as encouraging developers to generate more in-code documentation, helping them with the discovery of new APIs, etc.
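[The two quantitative metrics reported above can be illustrated with a minimal sketch. It assumes that acceptance rate is the fraction of shown suggestions that were accepted, and that the share of code "typed" through suggestions is accepted characters divided by all characters entered. The `Suggestion` record, its fields, and both function names are hypothetical and do not come from the paper; the paper's exact definitions, such as any filtering of briefly displayed suggestions, may differ.]

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Suggestion:
    """Hypothetical log record for one suggestion shown to a developer."""
    num_chars: int   # length of the suggested text, in characters
    accepted: bool   # True if the developer accepted the suggestion


def acceptance_rate(suggestions: Sequence[Suggestion]) -> float:
    """Fraction of shown suggestions that were accepted (assumed definition)."""
    if not suggestions:
        return 0.0
    return sum(s.accepted for s in suggestions) / len(suggestions)


def share_of_code_from_suggestions(
    suggestions: Sequence[Suggestion], manually_typed_chars: int
) -> float:
    """Accepted characters divided by all characters entered (assumed definition)."""
    accepted_chars = sum(s.num_chars for s in suggestions if s.accepted)
    total_chars = accepted_chars + manually_typed_chars
    return accepted_chars / total_chars if total_chars else 0.0


# Toy data for illustration only; these are not figures from the deployment.
log = [
    Suggestion(num_chars=10, accepted=True),
    Suggestion(num_chars=20, accepted=False),
    Suggestion(num_chars=30, accepted=True),
    Suggestion(num_chars=5, accepted=False),
]
rate = acceptance_rate(log)
share = share_of_code_from_suggestions(log, manually_typed_chars=360)
print(f"acceptance rate: {rate:.1%}, share of code from suggestions: {share:.1%}")
```

[Under these assumed definitions, the two numbers measure different things: acceptance rate counts suggestions, while the share of code counts characters, so a tool can have a moderate acceptance rate and still contribute only a small fraction of the total code entered.]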