The ability to build progressively on the achievements of earlier generations is central to human uniqueness, but experimental investigations of this cumulative cultural evolution lack real-world complexity.
Here, we studied the dynamics of cumulative culture using a large-scale data set from online collaborative programming competitions run over 14 years. We show that, within each contest population, performance increases over time through frequent ‘tweaks’ of the current best entry and rare innovative ‘leaps’ (successful tweak:leap ratio = 16:1), the latter associated with substantially greater variance in performance.
Cumulative cultural evolution reduces technological diversity over time, as populations focus on refining high-performance solutions. While individual entries borrow from few sources, iterative copying allows populations to integrate ideas from many sources, demonstrating a new form of collective intelligence.
Our results imply that maximizing technological progress requires accepting high levels of failure.
Figure 1: Scores over time. Normalized log-transformed scores over time (measured in days from the start of each contest) across all contests (n = 47,921 entries). For visualization, because some scores were zero, a small constant on the appropriate scale (here, 10) was added to each score before log-transforming. Note that in all contests lower scores are better. Each point on the graph is an entry. The red line traces the progress of the leading entries in the contest, i.e., the entries that held the best score at the time of their submission.
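The offset-log transform described in the caption can be sketched as follows. The scores below are hypothetical and the base-10 logarithm is an assumption; only the offset of 10 is taken from the caption. This is a minimal illustration, not the authors' plotting code:

```python
import math

def offset_log_score(score, offset=10.0):
    """Offset-log transform for plotting: log10(score + offset).

    The offset avoids log(0) for entries with a perfect score of zero;
    the caption states that the constant chosen was 10.
    """
    return math.log10(score + offset)

# Hypothetical scores, including a perfect score of zero (lower is better).
scores = [0.0, 5.0, 90.0]
transformed = [offset_log_score(s) for s in scores]
# transformed[0] is exactly 1.0, since log10(0 + 10) = 1
```

Any per-contest normalization applied before plotting is omitted here, as the caption does not specify it.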
…We analysed a database comprising 21,745,538 total lines of computer code (483,173 unique lines), originating from 47,967 entries to 19 online collaborative programming competitions organized over the course of 14 years by the MathWorks software company. In every contest, the organizers set a computational challenge and, over the course of one week, participants developed and submitted solutions in the form of MATLAB® code. Once an entry had been successfully evaluated, its score, code, and the username of the participant who submitted it became public and available for all other participants to build upon. The challenges were all NP-complete computer science problems.
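As a rough illustration of the total-versus-unique line counts reported above, a de-duplication pass over entry code might look like the sketch below. The toy entries are hypothetical, and the normalization rule (stripping surrounding whitespace, ignoring blank lines) is an assumption; the authors' actual counting pipeline is not specified here:

```python
def line_counts(entries):
    """Count total and unique non-blank lines across all entries.

    `entries` is a list of code strings, one per contest entry.
    Lines are stripped of surrounding whitespace before comparison
    (an assumed normalization, not necessarily the authors' rule).
    """
    total = 0
    unique = set()
    for code in entries:
        for line in code.splitlines():
            stripped = line.strip()
            if stripped:
                total += 1
                unique.add(stripped)
    return total, len(unique)

# Toy example: two entries that share one identical line.
entries = ["x = 1\ny = x + 1", "x = 1\nz = x * 2"]
# line_counts(entries) returns (4, 3): four lines in total, three unique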
…Leaps usually fail but can bring large advances: The success of an entry (whether it took the lead and, if so, by how much) was strongly related to the extent to which it copied existing material or introduced substantial innovation. Among entries that took the lead, we observed a statistically significant negative correlation between an entry's similarity to the previous leader and its associated improvement in score (Spearman's ρ = −0.15, p < 0.001), with the biggest improvements associated with the entries most different from the previous leader. Among entries that did not take the lead, however, innovation had the opposite effect (Spearman's ρ = −0.53, p < 0.001): the most innovative entries exhibited the poorest performance, measured as the absolute difference in score from the current leader. Hence tweaks were associated with smaller changes in score, whether positive or negative, while leaps produced both large improvements in score and spectacular failures (Figure 2d; Supplementary Figure 4). The distribution of entry performance relative to the current leader shows that, although leaps were overall more likely than tweaks of copied material to yield poorer scores, on rare occasions they generated statistically significantly larger benefits.
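The rank correlations reported above can be illustrated with a small, self-contained computation of Spearman's ρ (the Pearson correlation of the ranks, with tied values given average ranks). The similarity and improvement values below are hypothetical stand-ins, not data from the contests, and this is a sketch rather than the original analysis code:

```python
def spearman_rho(x, y):
    """Spearman rank correlation of two equal-length sequences.

    Computed as the Pearson correlation of the ranks; ties receive
    the average of the ranks they span (1-based ranking).
    """
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        i = 0
        while i < len(order):
            j = i
            # Extend j over any run of tied values.
            while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
                j += 1
            avg_rank = (i + j) / 2.0 + 1.0
            for k in range(i, j + 1):
                r[order[k]] = avg_rank
            i = j + 1
        return r

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: similarity to the previous leader vs. score improvement.
# Here similarity falls as improvement rises, a perfectly monotone
# negative relationship, so rho is exactly -1.0.
similarity = [0.95, 0.90, 0.80, 0.50, 0.20]
improvement = [0.01, 0.02, 0.05, 0.20, 0.50]
rho = spearman_rho(similarity, improvement)
```

The reported values (ρ = −0.15 and ρ = −0.53) are far from −1 because real entries are only weakly monotone in this way; the sketch shows the statistic, not the data.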