Algorithmic recommendations shape music consumption at scale, and understanding the role that different behavioral aspects play in how content is consumed is a central question for music streaming platforms. Focusing on the notions of familiarity, similarity and discovery, we identify the need for explicit consideration and optimization of such objectives, and establish the need to efficiently balance them when generating algorithmic recommendations for users.
We posit that while familiarity helps drive short-term engagement, jointly optimizing for discovery enables the platform to influence and shape consumption across suppliers. We propose a multi-level ordered-weighted averaging based objective balancer to help maintain a healthy balance with respect to familiarity and discovery objectives, and conduct a series of offline evaluations and online A/B tests to demonstrate that, despite the presence of strict trade-offs, we can achieve wins on both satisfaction-centric and discovery-centric objectives.
Our proposed methods and insights have implications for the design and deployment of practical approaches for music recommendations, and our findings demonstrate that they can lead to substantial improvements in recommendation quality on one of the world’s largest music streaming platforms.
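To make the ordered-weighted averaging (OWA) idea concrete, here is a minimal Python sketch. It is not the paper's implementation: the function names `owa` and `two_level_score`, the example scores, and the weight vectors are all illustrative assumptions. An OWA operator sorts its inputs in descending order and then takes a weighted sum, so the weights attach to rank positions rather than to particular objectives. Putting all the weight on the last position gives the minimum (AND-like behavior), and putting it on the first gives the maximum (OR-like behavior). The two-level arrangement, which first aggregates several satisfaction scores and then balances the result against a discovery score, is only one plausible reading of "multi-level".

```python
from typing import Sequence


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: weights apply to rank positions, not to inputs."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)  # largest value first
    return sum(w * v for w, v in zip(weights, ordered))


def two_level_score(
    satisfaction_scores: Sequence[float],
    discovery_score: float,
    inner_weights: Sequence[float],
    outer_weights: Sequence[float],
) -> float:
    """Hypothetical two-level balancer (an assumption, not the paper's exact form).

    Level 1 aggregates the satisfaction objectives into one score;
    level 2 balances that score against the discovery score.
    """
    satisfaction = owa(satisfaction_scores, inner_weights)
    return owa([satisfaction, discovery_score], outer_weights)


# Made-up per-track scores in [0, 1]: (satisfaction objectives, discovery).
candidates = {
    "familiar_hit": ([0.9, 0.8], 0.1),
    "balanced_track": ([0.6, 0.6], 0.5),
    "obscure_track": ([0.2, 0.3], 0.9),
}

# AND-like outer weights: a track scores well only if BOTH levels are high.
ranked = sorted(
    candidates,
    key=lambda t: two_level_score(*candidates[t], [0.5, 0.5], [0.0, 1.0]),
    reverse=True,
)
print(ranked)
```

With AND-like outer weights, a track that is strong on only one side is penalized, so the track with moderate scores on both objectives ranks first in this toy example. Moving weight toward the first position relaxes that requirement.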
…We investigate how the above-mentioned aspects of recommendations affect user behavior, and conduct large-scale analyses and multiple live experiments on the music streaming platform Spotify to investigate such questions. We view user consumption on Spotify through the lens of the identified recommendation aspects, and present insights about users’ preferences for familiar music, and the interplay between similarity, familiarity and discovery. We conduct a series of live A/B tests on a large user population on two distinct user-centric recommendation products, to test how the proposed objective balancing methods fare on key user engagement metrics. The proposed methods are able to obtain metric improvements on both user satisfaction and discovery-centric objectives, despite the presence of strict trade-offs. Finally, we view discovery as an enabler for shifting consumption to non-popular or tail artists, and present detailed results on how additionally optimizing for discovery helps in surfacing less popular artists.
…6.5 Impact on Suppliers: We hypothesized that discovery can act as an enabler of shifting consumption towards less popular artists (§3.3). We investigate to what extent this is true. In Figure 8, we consider a random sample of streamed content, and plot the stream share that went to artists in different popularity buckets. We observe that models which over-emphasize discovery are able to substantially shift consumption towards the right, i.e. transfer streams to less popular artists, as is evident from the right-shifted distribution of methods like OWA-SAT-Discovery (AND). Even rankers which provide a healthy balance between satisfaction gains and discovery gains are able to shift stream share towards less popular artists, and decrease stream share for more popular artists.
Figure 3: Impact on supplier distribution: simulating impact of varying proportions of discovery on supplier distribution.
Figure 8: Impact on Suppliers: stream share across different popularity percentiles, across different rankers.
Indeed, optimizing for discovery enables platforms to control consumption patterns, and divert consumption towards less popular or niche artists, who might otherwise not get enough exposure. Such a departure from relevant, popular and familiar content allows platforms to broaden the scope of music listening and shift consumption towards the tail and less familiar content.
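The stream-share comparison described for Figure 8 can be illustrated with a small Python sketch. This is a toy under stated assumptions, not the paper's analysis code: the function name `stream_share_by_bucket`, the artist names, the bucket labels (0 for most popular) and the stream logs are all made up. It only shows the bookkeeping: count the sampled streams that fall into each popularity bucket and normalize by the total.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Mapping


def stream_share_by_bucket(
    streamed_artists: Iterable[Hashable],
    artist_bucket: Mapping[Hashable, int],
) -> Dict[int, float]:
    """Fraction of sampled streams going to each popularity bucket."""
    counts = Counter(artist_bucket[artist] for artist in streamed_artists)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bucket: n / total for bucket, n in sorted(counts.items())}


# Hypothetical mapping: bucket 0 = most popular, bucket 2 = tail artists.
buckets = {"star": 0, "midlist": 1, "niche": 2}

# Made-up stream samples under two hypothetical rankers.
baseline_streams = ["star", "star", "star", "midlist"]
discovery_streams = ["star", "midlist", "niche", "niche"]

baseline_share = stream_share_by_bucket(baseline_streams, buckets)
discovery_share = stream_share_by_bucket(discovery_streams, buckets)
print(baseline_share)
print(discovery_share)
```

Comparing the two dictionaries bucket by bucket gives the kind of shift the passage describes: in this toy data the discovery-leaning sample puts less share in the most popular bucket and more in the tail bucket.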
[If you use recommender services heavily, for example by upvoting/downvoting everything you see, you often notice that after a while the results get worse, in a certain sense: the predictions are fine, but they lose diversity and you stop being able to find novel stuff. Fighting this sort of ‘recommender collapse’ requires you to find new things out-of-band, such as by using a separate site like Reddit which will keep exposing you to new things. This paper demonstrates that optimizing for exploration/discovery works, implying that the collapse is an RL explore-exploit problem, where the default is just exploitation.
Why don’t services already do that? Well, maybe it doesn’t actually benefit them. Most users churn rapidly, so there is no long-term worth exploring for (for either users or services); and exploring will have a large upfront cost and add to churn as users dislike the riskier recommendations. There is also a problem of revealed preferences & social desirability bias: if users are given an option of “show me only things I already like as I am boring and uncool and narrow-minded” vs “explore exciting new trends as a sophisticated connoisseur”, who would opt for the former? And yet, that is how most people are—like small children, a great many people want repetition & lack of variety to an extent that would shock cultural elitists. (Such people may not even realize this.) So a service would also need to infer users’ true preferences and how much novelty they want, regardless of what they claim. All of this discourages anything but the straightforward use of greedy recommenders.]