Hardcopy Document Processing Workshop
Conference on Information and Knowledge Management
CIKM 2004

November 12, 2004 - Hyatt Arlington Hotel - Washington D.C.




Workshop Topic


A stated purpose of the CIKM 2004 Conference "is to identify challenging problems facing the development of future knowledge and information systems." A major and growing challenge for industry and government is the need to access and process the content of hardcopy documents. The focus of the workshop is to present current research and development addressing this challenge. Topics include, but are not limited to:

Information retrieval in the image domain
Information extraction in the image domain
Optical correlation for processing of text
Information retrieval of noisy OCR documents
Innovative OCR techniques
Robust OCR for degraded images
OCR post-processing techniques
Automatic categorization of noisy OCR documents
Clustering of noisy OCR documents
Entity extraction in noisy OCR documents
Processing of text (overlaid and in-scene text) in video images
Visualization of noisy OCR document collections
Processing of fax documents
Machine translation of noisy OCR
Automatic summarization of noisy OCR
Forms recognition of noisy text
Duplicate detection