"It's About Time: Analog Clock Reading in the Wild", 2021-11-17 ():
[homepage; code] In this paper, we present a framework for reading analog clocks in natural images or videos.
Specifically, we make the following contributions: First, we create a scalable pipeline for generating synthetic clocks, reducing the need for labour-intensive manual annotation. Second, we introduce a clock recognition architecture based on spatial transformer networks (STN), trained end-to-end for clock alignment and recognition.
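A synthetic clock generator needs exact time labels for every rendered image. A minimal sketch of the underlying label math (the geometry of analog hands; the function names and the sampling scheme here are illustrative assumptions, not the paper's actual pipeline):

```python
import random

def hand_angles(hour, minute):
    """Clockwise angles (degrees from 12 o'clock) of the hour and minute hands."""
    minute_angle = minute * 6.0                      # 360 deg / 60 minutes
    hour_angle = (hour % 12) * 30.0 + minute * 0.5   # 360 deg / 12 hours, plus drift
    return hour_angle, minute_angle

def angles_to_time(hour_angle, minute_angle):
    """Invert the rendering: recover (hour, minute) from the two hand angles."""
    minute = round(minute_angle / 6.0) % 60
    hour = int((hour_angle - minute * 0.5) / 30.0 + 0.5) % 12
    return hour, minute

def random_clock_label():
    """Sample a random time with its hand angles, as a synthetic generator might."""
    hour, minute = random.randrange(12), random.randrange(60)
    return (hour, minute), hand_angles(hour, minute)

# Example: 3:30 puts the minute hand at 180 deg and the hour hand at 105 deg
# (halfway between 3 and 4), and the mapping round-trips exactly.
print(hand_angles(3, 30))
```

Because the time-to-angle mapping is invertible, the generator gets minute-accurate labels for free, which is what makes the synthetic pipeline scalable.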
We show that the model trained on the proposed synthetic dataset generalizes to real clocks with good accuracy, supporting a Sim2Real training regime.
Third, to further reduce the gap between simulation and real data, we leverage a special property of time, namely its uniformity (it advances at a constant rate), to generate reliable pseudo-labels on unlabelled real clock videos, and show that training on these videos offers further improvements while still requiring zero manual annotation.
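The uniformity idea can be made concrete: in a real-time video the displayed time must advance linearly with the frame timestamp, so per-frame predictions that break the linear trend can be rejected as unreliable pseudo-labels. A hedged sketch of such a filter, using a robust Theil-Sen line fit (the function names and the tolerance threshold are illustrative assumptions, not the paper's exact procedure):

```python
def theil_sen(xs, ys):
    """Robust line fit ys ~ a * xs + b via median of pairwise slopes."""
    slopes = sorted(
        (ys[j] - ys[i]) / (xs[j] - xs[i])
        for i in range(len(xs)) for j in range(i + 1, len(xs))
        if xs[j] != xs[i]
    )
    a = slopes[len(slopes) // 2]
    residuals = sorted(y - a * x for x, y in zip(xs, ys))
    b = residuals[len(residuals) // 2]
    return a, b

def uniformity_filter(timestamps, predicted_minutes, tol=2.0):
    """Keep per-frame predictions consistent with uniformly advancing time.

    timestamps        -- frame times, in minutes since the first frame
    predicted_minutes -- per-frame clock readings, in minutes past midnight
    tol               -- max deviation (minutes) from the fitted line
    """
    a, b = theil_sen(timestamps, predicted_minutes)
    return [
        (t, p) for t, p in zip(timestamps, predicted_minutes)
        if abs(p - (a * t + b)) <= tol
    ]

# Example: five frames one minute apart; the fourth reading is a misprediction
# and is dropped, while the consistent readings survive as pseudo-labels.
frames = [0.0, 1.0, 2.0, 3.0, 4.0]
preds = [600.0, 601.0, 602.0, 570.0, 604.0]
kept = uniformity_filter(frames, preds)
```

A robust fit is the natural choice here: an ordinary least-squares line would itself be dragged toward the mispredictions it is meant to reject.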
Lastly, we introduce three benchmark datasets based on COCO, Open Images, and the film The Clock, with full time annotations accurate to the minute.