[OSF] Shepard’suniversal law of generalization is a remarkable hypothesis about how intelligent organisms should perceive similarity. In its broadest form, the universal law states that the level of perceived similarity between a pair of stimuli should decay as a concave function of their distance when embedded in an appropriate psychological space.
While extensively studied, evidence in support of the universal law has relied on low-dimensional stimuli and small stimulus sets that are very different from their real-world counterparts. This is largely because pairwise comparisons—as required for similarity judgments—scale quadratically in the number of stimuli.
…by analyzing an existing data set of 214,200 human similarity judgments and a newly collected data set of 390,819 human generalization judgments (n = 2,406 US participants) across 3 sets of natural images:
We provide strong evidence for the universal law in a naturalistic high-dimensional regime.
Humans constantly form generalizations, whether when trying to identify the color of an object or reasoning about which action to take based on past experiences. Understanding how generalizations relate to underlying psychological representations is a core problem in cognitive science. The universal law of generalization is a fundamental hypothesis concerning the nature of this relationship which states that the strength of generalization between two stimuli should decay as a universal exponential function of their psychological distance. While extensively studied, evidence for the universal law comes from small data sets and artificial stimuli that are very different from the real world. Our work is the first to provide strong evidence for the universal law in a high-dimensional naturalistic domain by collecting and analyzing 605,019 human similarity and generalization judgments for natural images.
…To address this gap, we leveraged recent advances in online recruitment as well as the availability of naturalistic image data sets to test the universal law of generalization in a high-dimensional setting. Specifically, we considered a data set of similarity judgments over 3 sets of images recently collected by Petersonet al2018 where each data set comprised 120 images from a given natural category, namely, animals, fruits, and vegetables. This data set consisted of 214,200 human judgments. To account for the different ways in which similarity scores can be constructed, we augmented this data set with a newly collected set of generalization judgments where participants rated how likely it is a certain blank property (Kemp & Tenenbaum2009; Osherson et al 199034ya; eg. having an enzyme) generalizes from one stimulus to another. The latter data set comprises 390,819 generalization judgments from 2,406 online participants. We used these data to directly test the universal law of generalization in this high-dimensional large-scale regime.
Figure 4: Generalization Gradients Across Domains of Natural Images and Tasks With the Optimal Model Fits Overlaid. Note: Error bars indicate 95% confidence intervals. “GAM” = generalized additive model; “MDS” = multidimensional scaling.