Workshop Topic
A stated purpose of the CIKM 2004 Conference "is to identify challenging
problems facing the development of future knowledge and information systems."
A major (and growing) challenge for industry and government is the need to access
and process the content of hardcopy documents. The focus of the workshop is
to present current research and development addressing this challenge. Topics
include, but are not limited to:
Information retrieval in the image domain
Information extraction in the image domain
Optical correlation for processing of text
Information retrieval of noisy OCR documents
Innovative OCR techniques
Robust OCR for degraded images
OCR post-processing techniques
Automatic categorization of noisy OCR documents
Clustering of noisy OCR documents
Entity extraction in noisy OCR documents
Processing of text (overlaid and in-scene text) in video images
Visualization of noisy OCR document collections
Processing of fax documents
Machine translation of noisy OCR
Automatic summarization of noisy OCR
Forms recognition of noisy text
Duplicate detection