ABBYY FormReader User Guide
What is a form?
Questionnaires, social security forms, polling slips, and warranty cards are all forms, each used to collect
a different kind of information. How do forms differ from other types of documents?
1. A form has a set number of fields.
2. Field content is always determined by the field itself, for example by the field name. E.g. a “Last Name” field
contains only last names (if completed correctly), a “Date” field contains only dates, etc.
3. During form processing, only the field contents are of interest; all remaining form elements are
disregarded.
Gathering information can be a long and wearisome process, involving the input of hundreds if not thousands of forms.
ABBYY FormReader, however, makes life much easier, allowing the whole process to be automated. The inputting
process then consists of the following stages:
1. Application setup – the form to be processed is specified.
A form template is created within the program. It contains the geometrical locations of the fields,
the type of information each field is to contain, and other field parameters.
2. Form processing.
Completed forms are scanned and recognized (i.e. field images are converted into text) by the application.
An existing template is used to identify form field positions and the type of information contained within
them. Recognition results are subsequently verified and exported to a file or database.
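The template described in stage 1 can be pictured as a small data structure. The following is a minimal sketch only, not ABBYY FormReader's actual template format: the class names `FieldTemplate` and `FormTemplate`, the millimetre units, and the content-type strings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class FieldTemplate:
    """One field of a form: where it is and what it may contain (hypothetical)."""
    name: str           # e.g. "Last Name"
    left: float         # position on the blank form, in mm (assumed unit)
    top: float
    width: float
    height: float
    content_type: str   # e.g. "text", "date" (illustrative values)


@dataclass
class FormTemplate:
    """A form has a set number of fields; all other form elements are disregarded."""
    title: str
    fields: List[FieldTemplate] = field(default_factory=list)

    def field_names(self) -> List[str]:
        return [f.name for f in self.fields]

    def export_row(self, recognized: Dict[str, str]) -> Dict[str, str]:
        """Keep only values that belong to known fields, in template order."""
        return {f.name: recognized.get(f.name, "") for f in self.fields}


warranty_card = FormTemplate(
    title="Warranty card",
    fields=[
        FieldTemplate("Last Name", 20.0, 40.0, 120.0, 8.0, "text"),
        FieldTemplate("Date", 20.0, 55.0, 40.0, 8.0, "date"),
    ],
)

# "Logo" is not a field of the template, so it is dropped from the exported row.
row = warranty_card.export_row(
    {"Last Name": "SMITH", "Date": "01.02.2003", "Logo": "ABBYY"}
)
```

Because the template fixes the set of fields in advance, anything recognized outside those fields never reaches the exported row.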
Easy? In theory, yes; in practice, no, since not all forms used to gather information are suitable for automated input.
The aim of this guide is to explain exactly which requirements a form must meet if it is to be suitable for automated
processing, and to show you how to create your own forms using Microsoft Visio 2000, Microsoft Word 2000, and
Corel Draw.
What is a machine-readable form?
Two principal tasks are carried out during form recognition:
1. Locating fields.
This is by no means an easy task, as the scanned form image may be distorted in various ways, e.g. stretched,
skewed, or rotated. To allow these distortions to be corrected, the form must contain what are termed
reference points. For more information on reference points and other form elements, see “Elements of
machine-readable forms” (page 6).
2. Separating field contents from field borders.
The information entered in the fields must be clearly separated from other form elements: field borders,
background, service text, and explanatory text. In order for the application to do this correctly, the form must meet
certain requirements; these requirements define several form types. For more information on form types, see
“Types of machine-readable forms” (page 6).
In order for the above two tasks to be carried out successfully, the forms must correspond to the form pattern
exactly, i.e. forms of the same type must be printed using the same source document (pattern) so that the location of
all form elements is identical on each one. If this is not the case, i.e. the location of fields on different copies of the
form varies, the application will be unable to “find” the fields and, consequently, unable to recognize them. Copies
of the form will match the source document (pattern) only if the forms are printed professionally. For more
information regarding print quality, see “Print quality requirements” (page 15).
If the application is able to identify the field locations and separate the field contents from the field borders, the form
in question is deemed to be machine-readable. From now on such forms are simply referred to as forms.
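To illustrate why reference points make it possible to locate fields on a distorted scan, here is a minimal sketch. It assumes the distortion is affine (a combination of stretch, skew, rotation, and shift) and that three reference points have been found on the scanned image. It is not a description of ABBYY FormReader's internal method; the function names and coordinates are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _det3(m: List[List[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(a: List[List[float]], b: Sequence[float]) -> List[float]:
    """Solve the 3x3 linear system a * x = b with Cramer's rule."""
    d = _det3(a)
    if abs(d) < 1e-12:
        raise ValueError("reference points must not lie on one line")
    result = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        result.append(_det3(m) / d)
    return result


def fit_affine(template_pts: Sequence[Point], scan_pts: Sequence[Point]):
    """Return (a, b, c, d, e, f) with x' = a*x + b*y + c and y' = d*x + e*y + f."""
    rows = [[x, y, 1.0] for x, y in template_pts]
    a, b, c = _solve3(rows, [p[0] for p in scan_pts])
    d, e, f = _solve3(rows, [p[1] for p in scan_pts])
    return a, b, c, d, e, f


def to_scan(coeffs, pt: Point) -> Point:
    """Map a point from template coordinates to scan coordinates."""
    a, b, c, d, e, f = coeffs
    x, y = pt
    return (a * x + b * y + c, d * x + e * y + f)


# Reference points on the blank form (pattern) and where they appear on a scan
# that is stretched 2x horizontally and shifted by (10, 5).
template_refs = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
scan_refs = [(10.0, 5.0), (210.0, 5.0), (10.0, 105.0)]

coeffs = fit_affine(template_refs, scan_refs)
field_corner_on_scan = to_scan(coeffs, (50.0, 50.0))
```

In this example a field corner at (50, 50) on the pattern maps to (110, 55) on the scan. The mapping is only as good as the match between the printed copies and the pattern, which is why identical element locations on every copy matter.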
Form completion methods
A form may be completed in one of the following ways:
1) By hand (“handprint” completion). Letters, digits, and all other characters are written separately, with each
character having its own individual character space.
2) Using a matrix printer.
3) Using a typewriter.
4) Typographically. This refers to the use of inkjet and laser (not matrix) printers with a resolution of no less than
300 dpi.
5) Using a combination of the above.