Macromedia ColdFusion MX 7 Manual
Chapter 9: Indexing Collections with Verity Spider
-maxindmem
Syntax:
-maxindmem kilobytes
Specifies the maximum amount of memory, in kilobytes, used by each indexing thread. Specify
the number of threads with the -indexers option.
By default, each indexing thread uses as much memory as is available from the system.
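Because the limit applies to each thread, the ceiling for all indexing threads together is roughly the per-thread limit multiplied by the thread count. The following is a minimal sketch of that arithmetic; the function name and the sample values are hypothetical and are not part of Verity Spider.

```python
def indexing_memory_ceiling_kb(indexers: int, maxindmem_kb: int) -> int:
    """Upper bound, in kilobytes, on memory used by all indexing threads combined.

    indexers     -- value passed to -indexers (number of indexing threads)
    maxindmem_kb -- value passed to -maxindmem (per-thread limit in kilobytes)
    """
    if indexers < 1 or maxindmem_kb < 1:
        raise ValueError("both values must be positive integers")
    return indexers * maxindmem_kb


# Hypothetical example: 4 indexing threads, each capped at 65536 KB (64 MB).
total_kb = indexing_memory_ceiling_kb(4, 65536)
print(f"{total_kb} KB = {total_kb // 1024} MB")
```

This estimate covers indexing threads only; it says nothing about memory used by the rest of the Verity Spider process.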
-maxnumdoc
Syntax:
-maxnumdoc num_docs
Specifies the maximum number of documents to download or submit for indexing. The value for
num_docs does not necessarily correspond to the number of documents indexed. The following
factors affect the actual number:
• Whether the value of num_docs falls within a block of documents dictated by the -submitsize
option. If it does, the entire block of documents must be processed.
• Whether the retrieved documents are actually indexed. Documents that are invalid or corrupt
are not indexed.
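One way to read the first factor is that the document count is rounded up to the end of the current -submitsize block. The following sketch illustrates that reading only; it is an interpretation of the text above, not documented Verity Spider behavior, and the function name is hypothetical.

```python
def processed_ceiling(num_docs: int, submit_size: int) -> int:
    """Round num_docs up to the next whole block of submit_size documents.

    Assumption: when num_docs falls inside a block, the whole block is processed.
    """
    if num_docs < 0 or submit_size < 1:
        raise ValueError("num_docs must be >= 0 and submit_size must be >= 1")
    blocks = -(-num_docs // submit_size)  # ceiling division using integers only
    return blocks * submit_size


print(processed_ceiling(250, 100))  # 250 falls inside the third block of 100
print(processed_ceiling(200, 100))  # 200 ends exactly on a block boundary
```

Under this reading the result is an upper bound on the number of documents processed. The second factor can still lower the number actually indexed, because invalid or corrupt documents are skipped.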
-mimemap
Syntax:
-mimemap path_and_filename
Specifies a control file (simple ASCII text) that maps file extensions to MIME-types. This lets you
make custom associations and override defaults.
The following is the format for the control file:
#file_ext_no_dot mime-type
abc application/word
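To check a control file before passing it to -mimemap, you could read it back with a few lines of Python. This sketch assumes that lines beginning with # are comments and that each remaining line holds an extension and a MIME type separated by whitespace; neither rule is stated explicitly in this manual, and parse_mimemap is a hypothetical helper.

```python
def parse_mimemap(text: str) -> dict[str, str]:
    """Parse control-file text into an {extension: mime_type} mapping."""
    mapping = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Assumption: blank lines and lines starting with '#' carry no mapping.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError(f"line {number}: expected 'extension mime-type'")
        extension, mime_type = parts
        mapping[extension] = mime_type.strip()
    return mapping


sample = "#file_ext_no_dot mime-type\nabc application/word\n"
print(parse_mimemap(sample))
```

A line with only one field raises ValueError, which catches a mapping that was accidentally split across two lines.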
-nocache
Type: Web crawling only
Used with the -noindex or -nosubmit options, this option disables the caching of files during
website indexing. This reduces the demands on your disk space.
Normally, Verity Spider downloads URLs, writes them to a bulk insert file, and downloads
the documents themselves. During indexing, once the -submitsize threshold has been
reached, the cached files are indexed and then deleted. If you use the -noindex option, the bulk
insert file is submitted but not processed by Verity Spider, so the documents are not deleted
until indexing occurs. Indexing is then usually performed by mkvdk or collsvc, or you can run
Verity Spider again with the -processbif option.
By using the -nocache option in conjunction with the -noindex or -nosubmit option, you
avoid storing files locally. Files are downloaded only when indexing actually occurs.
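If you assemble Verity Spider command lines in a script, a small guard can enforce the pairing described above. This is a sketch only: add_nocache is a hypothetical helper, the executable name vspider in the example is an assumption, and this manual does not say what happens if -nocache is passed on its own.

```python
def add_nocache(args: list[str]) -> list[str]:
    """Return a copy of args with -nocache appended.

    Raises ValueError unless -noindex or -nosubmit is already present,
    mirroring the documented usage of -nocache.
    """
    if "-noindex" not in args and "-nosubmit" not in args:
        raise ValueError("-nocache is documented for use with -noindex or -nosubmit")
    if "-nocache" in args:
        return list(args)  # already present; do not add it twice
    return [*args, "-nocache"]


# Assumed executable name; the other options are the ones described in this section.
print(add_nocache(["vspider", "-noindex"]))
```

The helper returns a new list rather than modifying its argument, so the same base argument list can be reused for runs with and without caching.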
See also
.