Cisco Email Security Appliance C160 User Guide

User Guide for AsyncOS 9.7 for Cisco Email Security Appliances
 
Chapter 22 Text Resources
Content Dictionaries
Dictionary Content
Words in dictionaries are created with one text string per line, and entries can be in plain text or in the 
form of regular expressions. Dictionaries can also contain non-ASCII characters. Defining dictionaries 
of regular expressions can provide more flexibility in matching terms, but doing so requires you to 
understand how to delimit words properly. For a more detailed discussion of Python-style regular 
expressions, consult the Python Regular Expression HOWTO, accessible from 
http://www.python.org/doc/howto/
 
Note
To use the special character # at the beginning of a dictionary entry, use the character class [#] so that 
the entry is not treated as a comment. 
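As a rough illustration of the points above, the following Python sketch treats each non-comment line as a Python-style regular expression. The sample entries, the function name load_entries, and the parsing logic are hypothetical; this is not how AsyncOS reads dictionaries, only a way to see why [#] is needed when an entry must begin with a literal # character.

```python
import re

# Hypothetical dictionary content: one text string per line.
SAMPLE_LINES = [
    "# this line is treated as a comment",
    "confidential",            # plain-text entry
    "[#]hashtag",              # character class keeps the leading # literal
    r"secret\s+project",       # regular-expression entry
]


def load_entries(lines):
    """Compile every non-blank line that does not start with '#'."""
    entries = []
    for line in lines:
        text = line.strip()
        if not text or text.startswith("#"):
            continue  # assumed comment handling, for illustration only
        entries.append(re.compile(text))
    return entries


entries = load_entries(SAMPLE_LINES)
message = "See #hashtag notes for the secret   project."
matched = [e.pattern for e in entries if e.search(message)]
print(matched)
```

In this sketch the first line is skipped as a comment, while the [#]hashtag entry survives and matches the literal text #hashtag in the message.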
For each term, you specify a “weight,” so that certain terms can trigger filter conditions more easily. 
When AsyncOS scans messages for the content dictionary terms, it “scores” the message by multiplying 
the number of instances of each term by the weight of that term. For example, two instances of a term with 
a weight of three result in a score of six. AsyncOS then compares this score with a threshold value 
associated with the content or message filter to determine if the message should trigger the filter action. 
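The scoring arithmetic described above can be sketched in a few lines of Python. The function name score_message, the sample term, and the threshold value are hypothetical. Summing the per-term products across several terms and treating "score greater than or equal to threshold" as a trigger are assumptions made for this sketch; the text above does not state either detail.

```python
import re


def score_message(text, weighted_terms):
    """Return the sum over all terms of (number of instances x weight)."""
    total = 0
    for pattern, weight in weighted_terms.items():
        instances = sum(1 for _ in re.finditer(pattern, text))
        total += instances * weight
    return total


# One term with a weight of three, appearing twice in the message body.
terms = {r"confidential": 3}
body = "This confidential memo is confidential."

score = score_message(body, terms)
threshold = 5                    # hypothetical filter threshold
triggers = score >= threshold    # assumed comparison, see lead-in
print(score, triggers)
```

With two instances and a weight of three, the sketch reproduces the score of six from the example in the text.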
You can also add smart identifiers to a content dictionary. Smart identifiers are algorithms that search 
for patterns in data that correspond to common numeric patterns, such as social security numbers and 
ABA routing numbers. These identifiers can be useful for policy enforcement. For more information about 
regular expressions, see “Regular Expressions in Rules” in the “Using Message Filters to Enforce Email 
Policies” chapter. For more information about smart identifiers, see “Smart Identifiers” in the “Using 
Message Filters to Enforce Email Policies” chapter.
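This guide does not describe how the smart identifiers are implemented. Purely to illustrate what "an algorithm that searches for a numeric pattern" can mean, the following Python sketch pairs a nine-digit regular expression with the publicly documented ABA routing number checksum. The function names are hypothetical, and nothing here should be read as the appliance's actual detection logic.

```python
import re

# Nine consecutive digits that are not part of a longer digit run.
NINE_DIGITS = re.compile(r"(?<!\d)\d{9}(?!\d)")


def aba_checksum_ok(digits):
    """ABA check: 3*(d1+d4+d7) + 7*(d2+d5+d8) + (d3+d6+d9) is a multiple of 10."""
    d = [int(c) for c in digits]
    total = (3 * (d[0] + d[3] + d[6])
             + 7 * (d[1] + d[4] + d[7])
             + (d[2] + d[5] + d[8]))
    return total % 10 == 0


def find_routing_numbers(text):
    """Return nine-digit candidates that also pass the checksum."""
    return [m.group() for m in NINE_DIGITS.finditer(text)
            if aba_checksum_ok(m.group())]


found = find_routing_numbers("Wire to 021000021, ref 123456789.")
print(found)
```

The checksum step is what separates this kind of pattern search from a plain regular expression: the second nine-digit string has the right shape but fails the check, so only the first is reported.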
Note
Dictionaries containing non-ASCII characters may or may not display properly in the CLI on your 
terminal. The best way to view and change dictionaries that contain non-ASCII characters is to export 
the dictionary to a text file, edit that text file, and then import the new file back into the appliance. For 
more information, see 
Related Topics
Word Boundaries and Double-byte Character Sets
In some languages (double-byte character sets), the concepts of a word, a word boundary, or case do not 
exist. Complex regular expressions that depend on what counts as a word character (represented as “\w” 
in regex syntax) cause problems when the locale or the encoding is not known for certain. For that 
reason, you may want to disable word-boundary enforcement.
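A small Python sketch can show one way word-boundary matching breaks down for such text. It uses Python's own re module with Unicode strings, which is an assumption about the matching environment and not a description of AsyncOS internals; the sample sentence and term are made up for illustration.

```python
import re

text = "これは日本語です"   # Japanese sentence written without spaces
term = "日本"

# With word-boundary enforcement: \b requires a transition between a
# word character and a non-word character on each side of the term.
bounded = re.search(r"\b" + re.escape(term) + r"\b", text)

# Without word-boundary enforcement: a plain substring-style match.
unbounded = re.search(re.escape(term), text)

print(bounded)                  # no match object is returned
print(unbounded is not None)
```

Because every neighboring character also counts as a word character, no boundary exists around the term, so the bounded search finds nothing even though the term is present. Dropping the boundary requirement lets the term match.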
Importing and Exporting Dictionaries as Text Files
The content dictionary feature also includes, by default, the following text files located in the 
configuration directory of the appliance: 
config.dtd
 
profanity.txt