“GrokNet: Unified Computer Vision Model Trunk and Embeddings For Commerce”, Sean Bell, Yiqun Liu, Sami Alsheikh, Yina Tang, Ed Pizzi, M. Henning, Karun Singh, Omkar Parkhi, Fedor Borisyuk, 2020-08-22:

In this paper, we present GrokNet, a deployed image recognition system for commerce applications.

GrokNet leverages a multi-task learning approach to train a single computer vision trunk. We achieve a 2.1× improvement in exact product match accuracy when compared to the previous state-of-the-art Facebook product recognition system.

We achieve this by training on 7 datasets across several commerce verticals, using 80 categorical loss functions and 3 embedding losses. We share our experience of combining diverse sources with wide-ranging label semantics and image statistics, including learning from human annotations, user-generated tags, and noisy search engine interaction data.
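
As a rough illustration of the multi-task idea, the sketch below combines several categorical losses computed from one shared feature vector. It is not GrokNet's implementation: the abstract does not specify the trunk, the loss weighting, or the form of the embedding losses, so the function names (`linear_head`, `multi_task_loss`), the task names, and the per-task weights are all hypothetical, and the 3 embedding losses are omitted. It assumes each example carries labels for only some tasks, as would happen when datasets with different label sets are merged.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one example, given raw logits and an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[label]


def linear_head(features, weights):
    """One classification head: a matrix-vector product over the shared features."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def multi_task_loss(features, heads, labels, task_weights):
    """Weighted sum of per-task losses over one shared feature vector.

    Tasks with no label for this example are skipped (an assumption about
    how examples from different datasets would be mixed).
    """
    total = 0.0
    for name, weights in heads.items():
        if name not in labels:
            continue
        logits = linear_head(features, weights)
        total += task_weights[name] * softmax_cross_entropy(logits, labels[name])
    return total


# Hypothetical example: two heads sharing a 2-dimensional trunk output.
features = [1.0, 0.0]
heads = {
    "category": [[0.0, 0.0], [0.0, 0.0]],             # 2 classes
    "color": [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],    # 3 classes
}
task_weights = {"category": 1.0, "color": 0.5}

loss_one_label = multi_task_loss(features, heads, {"category": 1}, task_weights)
loss_two_labels = multi_task_loss(
    features, heads, {"category": 1, "color": 2}, task_weights
)
```

Because every head reads the same `features`, gradients from all tasks would flow back into a single trunk during training; that shared trunk is what the abstract describes, while everything else here is an illustrative simplification.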

GrokNet has demonstrated gains in production applications and operates at Facebook scale.