Nuance ScanSoft OmniPage Pro 14 User Manual
Chapter 4
Languages
The program can read over 110 languages in three alphabets: Latin,
Greek and Cyrillic. The list in the OCR panel of the Options dialog
box shows which languages have dictionary support. A listing is also
provided on the ScanSoft web site.
In addition to user dictionaries, specialized dictionaries for certain
professions (currently medical, legal and financial) are available for
some languages. See the list and make selections in the OCR panel of the
Options dialog box.
Training
Training is the process of changing the OCR solutions assigned to
character shapes in the image. It is useful for uniformly degraded
documents or when an unusual typeface is used throughout a document.
Training will be less useful for texts with random distortions. Here is an
example, based on the letter “g”, which can be printed in different ways:
[Figure: four printed samples of text containing the letter “g”]
The first two examples do not need training, because both shapes are
normal for the letter “g” and the program can handle them. The third
example could benefit from training because the shape of “g” is unusual,
and all instances of “g” in the text are likely to look like this. The fourth
example is not good for training, because the first “g” is poorly printed,
and this shape is unlikely to appear again in the document.
The program identifies the language of recognized texts and displays it in the status
bar. This language marking is exported with the document. Use Set Language... in
the Tools menu to change the language marking for selected text. This does not
change the recognition language(s).