“SAYCam: A Large, Longitudinal Audiovisual Dataset Recorded from the Infant’s Perspective”, Jess Sullivan, Michelle Mei, Amy Perfors, Erica Wojcik, Michael Frank, 2020-01-14:

We introduce a new resource: the SAYCam corpus.

Infants aged 6–32 months wore a head-mounted camera for ~2 hours per week, over the course of ~2.5 years.

The result is a large, naturalistic, longitudinal dataset of infant-perspective and child-perspective videos. Transcription efforts are underway, with over 200,000 words of naturalistic dialogue already transcribed. The dataset is also searchable using a number of criteria (e.g., age of participant, location, setting, objects present).

The resulting dataset will be of broad use to psychologists, linguists, and computer scientists.