IBM Journal of Research and Development
IBM Skip to main content
  Home     Products & services     Support & downloads     My account  

  Select a country  
Journals Home  
  Systems Journal  
Journal of Research
and Development
    Current Issue  
    Recent Issues  
    Papers in Progress  
    Recent publications  
    Author's Guide  
  Contact Us  
  Related links:  
     IBM Research  

IBM Journal of Research and Development  
Volume 26, Number 6, Page 647 (1982)
Image Processing and Pattern Recognition
  Full article: arrowPDF   arrowCopyright info


Document Analysis System

by K. Y. Wong, R. G. Casey, F. M. Wahl
This paper outlines the requirements and components for a proposed Document Analysis System, which assists a user in encoding printed documents for computer processing. Several critical functions have been investigated and the technical approaches are discussed. The first is the segmentation and classification of digitized printed documents into regions of text and images. A nonlinear, run-length smoothing algorithm has been used for this purpose. By using the regular features of text lines, a linear adaptive classification scheme discriminates text regions from others. The second technique studied is an adaptive approach to the recognition of the hundreds of font styles and sizes that can occur on printed documents. A preclassifier is constructed during the input process and used to speed up a well-known pattern-matching method for clustering characters from an arbitrary print source into a small sample of prototypes. Experimental results are included.
Related Subjects: Algorithms; Image processing; Pattern Recognition