A Catalogue of Free/Open Source Software for Translators
|Category:||Editing and Publishing Tools|
|Typology:||Optical character recognition|
|Operating systems:||Windows, GNU/Linux|
|License:||Apache License 2.0|
From the project's website:
The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but since then it has been improved extensively by Google and is probably one of the most accurate open source OCR engines available. Combined with the Leptonica Image Processing Library it can read a wide variety of image formats and convert them to text in over 40 languages.
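As a rough illustration of the image-to-text conversion described above, the sketch below calls the `tesseract` command-line program from Python. It assumes the `tesseract` binary is installed and on the PATH, and that the traineddata file for the requested language is available; the helper names `build_tesseract_command` and `ocr_image` are hypothetical and not part of Tesseract itself.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Build the argument list for: tesseract IMAGE OUTPUTBASE -l LANG."""
    # Hypothetical helper: it only assembles the command, it does not run it.
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, output_base, lang="eng"):
    """Run Tesseract on one image and return the recognised text.

    Returns None when the tesseract binary cannot be found (assumption:
    the caller prefers a soft failure over an exception in that case).
    """
    if shutil.which("tesseract") is None:
        return None
    # Tesseract writes its plain-text result to OUTPUTBASE.txt.
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

For example, `ocr_image("page.png", "page", lang="deu")` would attempt to read German text from a hypothetical `page.png` and return the contents of `page.txt`. Languages are selected with three-letter codes such as `eng`, `fra`, or `deu`.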