1. Introduction
Physical page layout analysis, one of the first steps of OCR, divides an image into text and non-text regions and splits multi-column text into columns. This paper does not address logical layout analysis, which identifies headers, footers, body text, and numbered lists, and segments the text into articles.