“SickZil-Machine (SZMC): A Deep Learning Based Script Text Isolation System for Comics Translation”, U-Ram Ko, Hwan-Gue Cho, 2020-08-14:

The translation of comics (and Manga) involves removing text from foreign comic images and typesetting translated letters into them. The text in comics contains a variety of deformed letters drawn in arbitrary positions, over complex images or patterns. These letters have to be removed by experts, as computationally erasing them is very challenging. Although several classical image processing algorithms and tools have been developed, a completely automated method for erasing the text is still lacking.

Therefore, we propose an image processing framework called ‘SickZil-Machine’ (SZMC) that automates the removal of text from comics. SZMC works through a two-step process. In the first step, the text areas are segmented at the pixel level. In the second step, the letters in the segmented areas are erased and inpainted naturally to match their surroundings.
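The two-step structure described above (segment, then erase and inpaint) can be illustrated with a toy sketch. This is not the SZMC implementation: the abstract states that SZMC uses deep learning models for both steps, whereas the sketch below swaps in a fixed darkness threshold for segmentation and a simple neighbour-averaging fill for inpainting. All names (`segment_text`, `inpaint`, `remove_text`, the `threshold` parameter) are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]   # grayscale page, pixel values 0..255
Mask = List[List[bool]]   # True where a pixel is considered text


def segment_text(image: Image, threshold: int = 64) -> Mask:
    """Step 1 stand-in: mark dark pixels as text.

    Assumption: a fixed threshold replaces SZMC's learned segmentation model.
    """
    return [[px < threshold for px in row] for row in image]


def inpaint(image: Image, mask: Mask, max_iters: int = 100) -> Image:
    """Step 2 stand-in: fill masked pixels from their known 4-neighbours.

    Assumption: iterative neighbour averaging replaces SZMC's learned
    inpainting model. Pixels are filled from the mask border inward.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    unknown = [row[:] for row in mask]
    for _ in range(max_iters):
        updates = []
        for y in range(h):
            for x in range(w):
                if not unknown[y][x]:
                    continue
                neighbours = ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                vals = [out[ny][nx] for ny, nx in neighbours
                        if 0 <= ny < h and 0 <= nx < w and not unknown[ny][nx]]
                if vals:
                    updates.append((y, x, round(sum(vals) / len(vals))))
        if not updates:
            break  # nothing left to fill (or no known pixels to fill from)
        for y, x, value in updates:
            out[y][x] = value
            unknown[y][x] = False
    return out


def remove_text(image: Image) -> Tuple[Mask, Image]:
    """Run both steps: pixel-level segmentation, then erase and inpaint."""
    mask = segment_text(image)
    return mask, inpaint(image, mask)


# Toy page: light background with a short dark "stroke" in the middle row.
page = [[200] * 5 for _ in range(5)]
page[2][1] = page[2][2] = page[2][3] = 0
mask, cleaned = remove_text(page)
```

Keeping the mask as an explicit intermediate result mirrors the two-step description: the second step only needs the image and the pixel-level mask, so either stand-in could be replaced by a learned model without changing the other.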

SZMC employs deep-learning-based image segmentation and image inpainting models, and exhibited notable performance.

To train these models, we constructed a dataset of 285 pairs of original comic pages and text-area masks, and a dataset of 31,497 comic pages. We identified the characteristics of the dataset that could improve SZMC performance.

SZMC code is available.

[Keywords: comics translation, deep learning, image manipulation system]