Binarization of Document Images with Various Object Sizes

Hadjadj, Zineb; Meziane, Abdelkrim

Date

2017-02-01

Authors

Hadjadj, Zineb

Meziane, Abdelkrim

Publisher

CERIST

Abstract

Many document image binarization techniques try to separate foreground from background, but a large share of them fail to detect all the text pixels correctly because of degradations. In this paper, a new binarization method for document images is presented. The proposed method is based on the most commonly used binarization method, Sauvola’s, which performs relatively well on classical documents. However, three main defects remain: the window parameter of Sauvola’s formula does not adapt automatically to the image content, the method is not robust to low contrast, and it is not invariant with respect to contrast inversion. Thus, for some documents, the content may not be retrieved correctly. In this paper we try to overcome one of these limitations of Sauvola’s binarization: its poor handling of objects of various sizes. The well-known Chan-Vese active contour model is used in combination with the computed Sauvola binarization step to guarantee good-quality binarization for both small and large objects inside a single document, without manually adjusting the window size to the document content. The efficiency of the proposed method is shown on several document images with various object sizes.
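
For readers unfamiliar with the baseline, the following is a minimal sketch of plain Sauvola thresholding only, not of the method proposed in the report. It uses the standard formula T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation in a square window. The function names and the values window=15, k=0.5 and R=128 are assumptions chosen for illustration; the abstract does not state the parameters the authors used. The fixed `window` argument is the parameter the abstract says does not adapt to the image content.

```python
import numpy as np

def local_mean_std(img, window):
    """Local mean and standard deviation over a square window (odd size)."""
    pad = window // 2
    padded = np.pad(img.astype(np.float64), pad, mode="reflect")
    # Integral images of values and squared values, with a leading zero row/column.
    s1 = np.pad(padded, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    s2 = np.pad(padded ** 2, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    h, w = img.shape

    def window_sum(s):
        # Sum over each window via four lookups in the integral image.
        return (s[window:window + h, window:window + w]
                - s[:h, window:window + w]
                - s[window:window + h, :w]
                + s[:h, :w])

    n = window * window
    mean = window_sum(s1) / n
    var = np.maximum(window_sum(s2) / n - mean ** 2, 0.0)
    return mean, np.sqrt(var)

def sauvola_foreground(img, window=15, k=0.5, R=128.0):
    """Return a boolean mask that is True where a pixel is classified as text.

    window, k and R are illustrative defaults (assumptions), not values
    taken from the report.
    """
    mean, std = local_mean_std(img, window)
    threshold = mean * (1.0 + k * (std / R - 1.0))
    return img < threshold

if __name__ == "__main__":
    # Synthetic page: bright background with one small dark square.
    page = np.full((40, 40), 200, dtype=np.uint8)
    page[18:23, 18:23] = 20
    mask = sauvola_foreground(page)
    print("foreground pixels:", int(mask.sum()))
```

Because the window size is fixed, a window that suits small characters can be too small for large objects, whose interiors then look locally uniform. That is the limitation the report addresses by combining the Sauvola output with the Chan-Vese model.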

Keywords

Document image; binarization; Sauvola’s method; Chan-Vese active contour model.

URI

http://dl.cerist.dz/handle/CERIST/879

Collections

Research Reports
