“Communication and Code Dependency Effects on Software Code Quality: An Empirical Analysis of Herbsleb Hypothesis”, 2019-04-22:
Prior literature has suggested that in many projects 80% or more of the contributions are made by a small group of around 20% of the development team. Most prior studies deprecate a reliance on such a small inner group of “heroes”, arguing that it causes bottlenecks in development and communication. Despite this, such hero-driven projects are very common in open source. So what exactly is the impact of “heroes” on code quality?
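As a rough illustration of the 80/20 pattern described above, the following Python sketch flags a project as hero-driven when the top 20% of contributors account for at least 80% of contributions. The function name `is_hero_project`, the per-developer commit counts, and the choice of commits as the unit of contribution are all assumptions made for this example; they are not taken from the study.

```python
import math


def is_hero_project(commits_per_dev, dev_share=0.2, work_share=0.8):
    """Return True if the top `dev_share` of developers made at least
    `work_share` of all contributions (hypothetical helper)."""
    counts = sorted(commits_per_dev.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return False
    # Size of the core group: at least one developer. The round() guards
    # against floating-point noise such as 0.2 * 15 == 3.0000000000000004.
    core_size = max(1, math.ceil(round(dev_share * len(counts), 9)))
    return sum(counts[:core_size]) / total >= work_share


# Toy data (invented): one developer dominates the first project.
skewed = {"ann": 90, "bob": 4, "cat": 3, "dan": 2, "eve": 1}
even = {"ann": 20, "bob": 20, "cat": 20, "dan": 20, "eve": 20}
print(is_hero_project(skewed), is_hero_project(even))
```

The thresholds are parameters so that other cut-offs can be tried without changing the function.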
Herbsleb argues that if sections of code are strongly connected yet their developers are not, then that code will be buggy. To test the Herbsleb hypothesis, we develop and apply two metrics, (a) “social-ness” and (b) “hero-ness”, which measure (a) how much one developer comments on the issues of another and (b) how much one developer changes another developer’s code (“heroes” are those who change the most code, all around the system). In a result endorsing the Herbsleb hypothesis, in over 1,000 open source projects, we find that “social-ness” is a statistically stronger indicator of code quality (number of bugs) than “hero-ness”.
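The two metrics can be pictured as counts over pairs of developers. The Python sketch below is a minimal illustration under stated assumptions, not the study's implementation: the `(actor, owner)` event format, the helper `cross_developer_counts`, and the toy data are all hypothetical, and the actual metrics may be normalized or defined over a graph rather than as raw counts.

```python
from collections import Counter


def cross_developer_counts(events):
    """Count, per acting developer, the events that touch another
    developer's work. `events` is an iterable of (actor, owner) pairs;
    pairs where actor == owner are ignored (hypothetical helper)."""
    counts = Counter()
    for actor, owner in events:
        if actor != owner:
            counts[actor] += 1
    return counts


# Toy data (invented). Each pair is (who acted, whose issue or code it was).
issue_comments = [("ann", "bob"), ("ann", "cat"), ("bob", "ann"), ("cat", "cat")]
code_changes = [("ann", "bob"), ("ann", "cat"), ("ann", "bob"), ("bob", "bob")]

social_ness = cross_developer_counts(issue_comments)  # comments on others' issues
hero_ness = cross_developer_counts(code_changes)      # changes to others' code

# In this simplified reading, the "hero" is whoever changes the most of
# other developers' code.
hero = hero_ness.most_common(1)[0][0]
print(dict(social_ness), dict(hero_ness), hero)
```

With per-project scores of this kind, one could then compare how strongly each score relates to bug counts, which is the comparison the abstract reports.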
Hence we say that debates over the merits of “hero-ness” are subtly misguided. Our results suggest that the real benefit of these so-called “heroes” is not so much the code they generate but the pattern of communication required when the interaction between a large community of programmers passes through a small group of centralized developers. To say that another way, to build better code, build better communication flows between core developers and the rest.
To allow other researchers to confirm, improve, or refute our results, all our scripts and data are available online on GitHub.