“Goodbooks-10k: a New Dataset for Book Recommendations”, Zygmunt Zając, 2017-11-29:

There have been a few recommendation datasets for movies (Netflix, MovieLens) and music (Million Songs), but not for books. That is, until now. The dataset contains six million ratings for the ten thousand most popular books (those with the most ratings). There are also:

As to the source, let’s say that these ratings come from a site similar to goodreads.com, but with more permissive terms of use. There are a few types of data here:

All files are available on GitHub. Some of them are quite large, so GitHub won’t show their contents online. See samples for smaller CSV snippets. You can download individual zipped files from releases.
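
Once a ratings file or sample is downloaded, a first sanity check might look roughly like the sketch below. It uses only the Python standard library and runs on a tiny made-up sample, so nothing here is real data. The column names `user_id`, `book_id`, and `rating` are an assumption about the layout rather than something stated above, and `summarize_ratings` is a hypothetical helper, so adjust both to match the actual file.

```python
import csv
import io
from collections import Counter

# Made-up sample rows in an ASSUMED layout (user_id, book_id, rating).
# Replace this with an opened file, e.g. open("path/to/ratings.csv", newline="").
SAMPLE_CSV = """user_id,book_id,rating
1,10,5
1,20,3
2,10,4
3,10,2
3,30,5
"""


def summarize_ratings(csv_file):
    """Return (number of ratings, mean rating, ratings-per-book Counter)."""
    reader = csv.DictReader(csv_file)
    per_book = Counter()
    total = 0
    count = 0
    for row in reader:
        per_book[int(row["book_id"])] += 1  # count ratings for each book
        total += int(row["rating"])
        count += 1
    mean = total / count if count else 0.0
    return count, mean, per_book


count, mean, per_book = summarize_ratings(io.StringIO(SAMPLE_CSV))
most_rated = per_book.most_common(1)[0]  # (book_id, number of ratings)
print(count, mean, most_rated)
```

Streaming the rows with `csv.DictReader` keeps memory use low, which may matter for the larger files; a dataframe library would be a reasonable alternative if one is already installed.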