Xerox DocuShare Support & Software Bulletin

Best Practices for Content Indexing
Recommended for Sites Containing Over 500,000 Documents
The intent of this document is to provide guidance for improving DocuShare 6.6.1 search indexing 
performance. These instructions should be included in the planning of any new or upgraded 6.6.1 
server.
Note: 
Refer to the DocuShare 6.6.1 Command Line Utilities Guide for more information on running the 
commands called out in this document.
Upgrading from releases prior to 6.5
Note: 
Perform these procedures after upgrading to 6.6.1 and before running dsindex index_all.
Consider the needs and usage of the site, then run any of the following commands that apply after 
upgrading to 6.6.1. 
DSHOME\bin\countFileNumber 
Run this command to gather metrics on the number of documents and file sizes currently on 
the site. This command identifies large files that will take longer to content index. 
DSHOME\bin\fixupMimeTypes 
Run this command to find files that may have been incorrectly identified with a text file 
MIME type. 
This command reports document and rendition objects whose MIME type, as assigned by the 
file guesser, does not match the MIME type associated with the file extension. Earlier versions 
of DocuShare set the MIME type configuration by default to the File Content algorithm (file 
type guesser) instead of the File Extension algorithm. The fixupMimeTypes command identifies 
documents such as *.dat files (containing many rows of numeric strings, commas, and other 
special characters) that may have been identified as .txt files in an earlier version of DocuShare. 
DSHOME\bin\fixupMimeTypes -f 
Run this command to correct the incorrectly assigned document MIME types so that they match 
the file extension. Documents are then indexed according to the MIME type that the MIME Types 
table associates with each file extension.
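
The kind of mismatch that fixupMimeTypes reports can be illustrated with a short Python sketch. It is not DocuShare code and does not reproduce the actual file guesser. The extension table, the content heuristic, and all function names below are hypothetical and chosen only to show why a *.dat file full of numeric rows can be typed as plain text by content while its extension maps to a different type.

```python
import os

# Hypothetical stand-in for the MIME Types table (extension -> MIME type).
EXTENSION_TABLE = {
    ".txt": "text/plain",
    ".dat": "application/octet-stream",
    ".pdf": "application/pdf",
}
DEFAULT_MIME = "application/octet-stream"


def guess_by_extension(name: str) -> str:
    """File Extension approach: look the extension up in the table."""
    ext = os.path.splitext(name)[1].lower()
    return EXTENSION_TABLE.get(ext, DEFAULT_MIME)


def guess_by_content(data: bytes) -> str:
    """Crude File Content approach (assumption): anything without NUL
    bytes that decodes as UTF-8 is treated as plain text."""
    if b"\x00" in data:
        return DEFAULT_MIME
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return DEFAULT_MIME
    return "text/plain"


def find_mismatches(files: dict) -> list:
    """Return (name, content_mime, extension_mime) for each file where
    the two approaches disagree."""
    mismatches = []
    for name, data in files.items():
        by_content = guess_by_content(data)
        by_extension = guess_by_extension(name)
        if by_content != by_extension:
            mismatches.append((name, by_content, by_extension))
    return mismatches


if __name__ == "__main__":
    sample = {
        "readings.dat": b"1,2,3\n4,5,6\n",  # numeric rows: looks like text
        "notes.txt": b"meeting notes\n",
        "image.bin": b"\x00\x01\x02",
    }
    for name, by_content, by_extension in find_mismatches(sample):
        print(f"{name}: content says {by_content}, extension says {by_extension}")
```

In this sketch only readings.dat is reported, because its content resembles text while its extension maps to a non-text type. That mirrors the case described above, in which such a file is content indexed as text unless its MIME type is corrected to match the extension.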