“The Most ‘Abandoned’ Books on GoodReads”, Gwern 2019-12-09:

Which books on GoodReads are most difficult to finish? Estimating proportions in December 2019 gives an entirely different result than absolute counts.

What books are hardest for a reader who starts them to finish, and most likely to be abandoned? I scrape a crowdsourced tag, ‘abandoned’, from the GoodReads book social network on 2019-12-09 to estimate the conditional probability of a book being abandoned.

The default GoodReads tag interface presents only raw counts of tags, not counts divided by total ratings (=reads). This conflates popularity with probability of being abandoned: a popular but rarely-abandoned book may have more abandoned tags than a less popular but often-abandoned book. Even after dividing, there is residual error from the winner’s curse: books with fewer ratings have noisier estimates than popular books, so they are overrepresented at the extremes of a ranking. I correct for both problems to see what more accurate rankings look like.
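
The difference between the three rankings can be illustrated with a minimal sketch. It assumes a simple beta-binomial shrinkage estimator, which is swapped in here for illustration and is not necessarily the model used in the analysis. The book names, counts, and the prior parameters `PRIOR_A` and `PRIOR_B` are all made up, not taken from the scraped data.

```python
from dataclasses import dataclass


@dataclass
class Book:
    title: str
    abandoned: int  # number of 'abandoned' tags (hypothetical counts)
    ratings: int    # total ratings, used as a proxy for reads


def raw_rate(book: Book) -> float:
    """Naive proportion: abandoned tags divided by total ratings."""
    return book.abandoned / book.ratings


def shrunken_rate(book: Book, prior_a: float, prior_b: float) -> float:
    """Posterior mean under a Beta(prior_a, prior_b) prior and binomial data.

    Books with few ratings are pulled strongly toward the prior mean;
    books with many ratings barely move.
    """
    return (book.abandoned + prior_a) / (book.ratings + prior_a + prior_b)


# Made-up numbers for illustration only; not the scraped GoodReads data.
books = [
    Book("Popular, rarely abandoned", abandoned=900, ratings=1_000_000),
    Book("Niche, often abandoned", abandoned=300, ratings=20_000),
    Book("Tiny sample, noisy", abandoned=3, ratings=100),
]

# Assumed prior: a 0.2% abandon rate, weighted like 5,000 pseudo-ratings.
PRIOR_A, PRIOR_B = 10.0, 4990.0

by_count = sorted(books, key=lambda b: b.abandoned, reverse=True)
by_raw = sorted(books, key=raw_rate, reverse=True)
by_shrunk = sorted(
    books, key=lambda b: shrunken_rate(b, PRIOR_A, PRIOR_B), reverse=True
)

for name, ranking in [("count", by_count), ("raw", by_raw), ("shrunk", by_shrunk)]:
    print(name, [b.title for b in ranking])
```

With these toy numbers, each ranking has a different book on top: raw counts favor the popular book, raw proportions favor the book with only 100 ratings, and the shrunken estimate favors the book that is both often abandoned and has enough ratings to support the estimate.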

Correcting for both changes the top-5 ranking completely, from (raw counts):

  1. The Casual Vacancy, J. K. Rowling

  2. Catch-22, Joseph Heller

  3. American Gods, Neil Gaiman

  4. A Game of Thrones, George R. R. Martin

  5. The Book Thief, Markus Zusak

to (shrunken posterior proportions):

  1. Black Leopard, Red Wolf, Marlon James

  2. Space Opera, Catherynne M. Valente

  3. Little, Big, John Crowley

  4. The Witches: Salem, 1692, Stacy Schiff

  5. Tender Morsels, Margo Lanagan

I also consider a model adjusting for covariates (author/average-rating/year), to see what books are most surprisingly often-abandoned given their pedigrees & rating etc. Abandon rates increase the newer a book is, and the lower the average rating.

Adjusting for those, the top-5 are:

  1. The Casual Vacancy, J. K. Rowling

  2. The Chemist, Stephenie Meyer

  3. Infinite Jest, David Foster Wallace

  4. The Glass Bead Game, Hermann Hesse

  5. Theft by Finding: Diaries (1977–2002), David Sedaris

Books at the top of the adjusted list appear to reflect a mix of highly-popular authors changing genres, and ‘prestige’ books which are highly-rated but a slog to read.

These results are interesting because they highlight that people read books for many reasons (such as marketing campaigns, literary prestige, or following a popular author), and that these reasons are reflected in their decision whether to continue reading or to abandon a book.