ISauvola: Improved Sauvola’s Algorithm for Document Image Binarization
Binarization of historical documents is difficult and remains an open area of research. In this paper, a new binarization technique for document images is presented. The proposed technique is based on the most commonly used binarization method, Sauvola's, which performs relatively well on classical documents. However, three main defects remain: the window parameter of Sauvola's formula does not adapt automatically to the image content, the method is not robust to low contrast, and it is not invariant to contrast inversion. Thus, on documents such as magazines, the content may not be retrieved correctly. In this paper, we combine the image contrast, defined by the local image minimum and maximum, with the computed Sauvola binarization step to guarantee good-quality binarization of both low-contrast and well-contrasted objects within a single document, without manually adjusting the user-defined parameters to the document content. The efficiency of the proposed method is shown on both recent and historical document images from the DIBCO datasets, which include different types of degradation.
Document image, binarization, Sauvola’s method, image contrast.
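
The two ingredients named in the abstract, Sauvola's local threshold and a contrast measure built from the local minimum and maximum, can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the abstract does not specify how the two are combined, so the sketch only computes each one separately. The normalized contrast form (max - min) / (max + min + eps), the values k = 0.5 and R = 128 (commonly used defaults), the window radius r, and the helper `make_page` are all assumptions chosen for illustration.

```python
from math import sqrt


def window(img, y, x, r):
    """Pixel values in the (2r+1) x (2r+1) window centred on (y, x), clipped at the borders."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - r), min(h, y + r + 1))
            for i in range(max(0, x - r), min(w, x + r + 1))]


def sauvola_threshold(values, k=0.5, R=128.0):
    """Sauvola's local threshold T = m * (1 + k * (s / R - 1))."""
    n = len(values)
    m = sum(values) / n                               # local mean
    s = sqrt(sum((v - m) ** 2 for v in values) / n)   # local standard deviation
    return m * (1.0 + k * (s / R - 1.0))


def sauvola_binarize(img, r=1, k=0.5, R=128.0):
    """Label a pixel 1 (text) when its grey level is at or below the local threshold, else 0."""
    return [[1 if img[y][x] <= sauvola_threshold(window(img, y, x, r), k, R) else 0
             for x in range(len(img[0]))]
            for y in range(len(img))]


def contrast_map(img, r=1, eps=1e-9):
    """Assumed contrast form: (max - min) / (max + min + eps) over the local window."""
    out = []
    for y in range(len(img)):
        row = []
        for x in range(len(img[0])):
            vals = window(img, y, x, r)
            lo, hi = min(vals), max(vals)
            row.append((hi - lo) / (hi + lo + eps))
        out.append(row)
    return out


def make_page(background, stroke, size=5):
    """Hypothetical toy image: uniform background with a one-pixel vertical stroke in the middle column."""
    return [[stroke if x == size // 2 else background for x in range(size)]
            for _ in range(size)]


strong = make_page(200, 50)    # well-contrasted stroke
faint = make_page(200, 180)    # low-contrast stroke

strong_mask = sauvola_binarize(strong)
faint_mask = sauvola_binarize(faint)
faint_contrast = contrast_map(faint)
```

On these toy images, the fixed-parameter threshold labels the well-contrasted stroke as text but leaves the faint stroke unlabelled, while the contrast map remains nonzero along the faint stroke. This is the kind of low-contrast failure that motivates bringing a contrast measure into the binarization.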