Contributed by: Jose Roberto Rodrigues Leal
Publication date: September 27, 2001
Clara OCR (http://www.claraocr.org) is character recognition software
developed by Ricardo Ueda Karpischeck. Its distinguishing feature is that
it runs on the various flavors of Unix, provided they use the X Window
System and support C.
Below I include a short excerpt from the FAQ found on the project's
website. Be sure to visit the website as well.
Clara differs from other OCR software in several respects:
1. Most well-known OCR programs are non-free, while Clara is free. Clara
targets the X Window System, offers batch processing and a web interface,
and supports cooperative revision efforts.
2. Most OCR software focuses on omnifont technology and disregards training.
Clara does not implement omnifont techniques and concentrates on building
specialized fonts (some day in the future, however, we may try
classification techniques that do not require training).
3. Most OCR software makes the revision of the recognized text a process
totally separate from the recognition. Clara pragmatically joins the two
processes, and makes training and revision parts of the same thing. In fact,
the OCR model implemented by Clara is an interactive effort in which the use
of heuristics alternates with revision and fine-tuning of the OCR, guided
by the user's experience and intuition.
4. Clara lets the user enter the transliteration of each pattern through an
interface that displays a graphic cursor directly over the image of the
scanned page, and it builds and maintains a mapping between graphic symbols
and their transliterations in the OCR output. This is a potentially useful
mechanism for documentation systems, and a valuable tool for typists and
reviewers. In fact, Clara OCR may be seen as a productivity tool for typists.
5. Most OCR software is integrated with scanning tools, offering the user
a unified interface to execute all steps from scanning to recognition. Clara
does not offer such an integrated interface, so you need separate software
(e.g. SANE) to perform scanning.
6. Most OCR software expects the input to be a graphic file encoded in TIFF
or other formats. Clara supports only raw PBM.
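
Because item 6 says Clara reads only raw PBM, a scanned page in any other format has to be converted first. In practice an existing converter (for example the Netpbm tools) would do this. The Python sketch below is not part of Clara; it only illustrates how the raw PBM ("P4") layout works, as described in the Netpbm format documentation: a short ASCII header, then each row packed 8 pixels per byte, most significant bit first, with 1 meaning black. The function name `to_raw_pbm` is hypothetical.

```python
def to_raw_pbm(rows):
    """Encode equal-length rows of 0/1 pixels as raw PBM (P4) bytes.

    In PBM, 1 means black and 0 means white.
    """
    height = len(rows)
    width = len(rows[0]) if height else 0
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same width")

    # ASCII header: magic number, then width and height in pixels.
    out = bytearray(f"P4\n{width} {height}\n".encode("ascii"))
    for row in rows:
        # Pack 8 pixels per byte, most significant bit first. The last
        # byte of a row is padded with zero bits when the width is not
        # a multiple of 8.
        for start in range(0, width, 8):
            byte = 0
            for offset, pixel in enumerate(row[start:start + 8]):
                if pixel:
                    byte |= 0x80 >> offset
            out.append(byte)
    return bytes(out)
```

Writing the returned bytes to a file opened in binary mode (for instance a file named `page.pbm`, a name chosen here only as an example) should give an image in the raw PBM layout that item 6 refers to.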