Language models accurately infer correlations between psychological items and scales from text alone


    Hommel & Arslan (2024) - Preprint Stage I.pdf
    Version: 2
    Created: April 05, 2024 | Last edited: June 20, 2024

    Abstract

    Many behavioural scientists disagree about core constructs and how they should be measured. Different literatures measure related constructs, but the connections are not always obvious to readers and meta-analysts. Many measures in behavioural science are based on agreement with survey items. Because these items are sentences, computerised language models can connect disparate measures and constructs and help researchers regain an overview of the rapidly growing, fragmented literature. Our fine-tuned language model, the SurveyBot3000, accurately predicts the correlations between survey items, the reliability of aggregated measurement scales, and the intercorrelations between scales from item positions in semantic vector space. In our pilot study, the out-of-sample accuracy was .71 for item correlations, .86 for reliabilities, and .89 for scale correlations. In a preregistered study, we will investigate whether the model's performance generalises to measures across behavioural science.
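
To make the idea of predicting from "item positions in semantic vector space" concrete, here is a minimal sketch. It is not the SurveyBot3000 pipeline. It assumes that the cosine similarity between two item vectors serves as the predicted item correlation, that reliability is the standardized Cronbach's alpha computed from those predicted correlations, and that a scale correlation is the correlation between unit-weighted sums of standardized items. The three-dimensional vectors are made-up toy values standing in for sentence embeddings, and all function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_matrix(vectors):
    """Pairwise cosine similarities, read here as predicted item correlations."""
    return [[cosine(u, v) for v in vectors] for u in vectors]

def standardized_alpha(corr):
    """Standardized Cronbach's alpha from a k x k item correlation matrix."""
    k = len(corr)
    off_diagonal = sum(corr[i][j] for i in range(k) for j in range(k) if i != j)
    mean_r = off_diagonal / (k * (k - 1))
    return k * mean_r / (1 + (k - 1) * mean_r)

def block_sum(corr, rows, cols):
    """Sum of the entries of corr at the given row and column indices."""
    return sum(corr[i][j] for i in rows for j in cols)

def scale_correlation(corr, idx_a, idx_b):
    """Correlation between two unit-weighted sums of standardized items."""
    cross = block_sum(corr, idx_a, idx_b)
    var_a = block_sum(corr, idx_a, idx_a)
    var_b = block_sum(corr, idx_b, idx_b)
    return cross / math.sqrt(var_a * var_b)

# Toy stand-ins for item embeddings (assumption: not real model output).
item_vectors = [
    [0.9, 0.1, 0.2],  # scale A, item 1
    [0.8, 0.2, 0.1],  # scale A, item 2
    [0.2, 0.9, 0.1],  # scale B, item 1
    [0.1, 0.8, 0.3],  # scale B, item 2
]

predicted = similarity_matrix(item_vectors)
scale_a, scale_b = [0, 1], [2, 3]

# Reliability of scale A, using only the rows and columns of its own items.
alpha_a = standardized_alpha([[predicted[i][j] for j in scale_a] for i in scale_a])
r_ab = scale_correlation(predicted, scale_a, scale_b)
```

In this sketch, the reliabilities and scale correlations are derived entirely from the predicted item correlation matrix, so any error in the item-level predictions carries over to the scale-level quantities.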

    Supplemental Materials

    https://osf.io/z47qs/

    Preprint DOI

    https://doi.org/10.31234/osf.io/kjuce

    License

    CC-By Attribution 4.0 International

    Disciplines

    Quantitative Psychology; Psychometrics; Quantitative Methods; Social and Behavioral Sciences

    Tags

    construct proliferation; jingle-jangle; language model; natural language processing
