Kaspersky Lab kaspersky anti-spam 2.0 Manual

Kaspersky Anti-Spam Operation and Filtering Philosophy 
 
black list (e.g. to check IP=202.103.129.8 via zone="blackholes.mail-abuse.org",
a request to DNS with the 8.129.103.202.blackholes.mail-abuse.org domain name
will be formed).
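
The name formation described above can be sketched in a few lines of Python. This is only an illustration of the octet-reversal convention, not the product's own code; the function name dnsbl_query_name is hypothetical, and the sketch builds the name without sending the DNS request.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the domain name that is looked up in a DNS-based black list.

    Hypothetical helper: the IPv4 octets are written in reverse order
    and the black list zone is appended.
    """
    # IPv4Address raises ValueError if the string is not a valid IPv4 address.
    addr = ipaddress.IPv4Address(ip)
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


print(dnsbl_query_name("202.103.129.8", "blackholes.mail-abuse.org"))
# prints 8.129.103.202.blackholes.mail-abuse.org
```

By the usual DNS black list convention, the address is treated as listed if the lookup of this name returns a record.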
The check of e-mail recipients is performed:
•  in common profiles – against the full list of message recipients;
•  in personal profiles – against only those message recipients to whom
the profile is applied.
A filtering rule can contain several conditions of different types at the same
time. For example, it can block messages where a recipient belongs to list A and
the sender belongs to list B (here B serves as the black list for the users
included in list A).
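
A minimal sketch of such a combined rule is given below, assuming that lists are plain sets of addresses. The names Message, recipients_to_check and rule_blocks are hypothetical and are not taken from the product. Passing profile_users=None models a common profile; passing a set of users models a personal profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str
    recipients: tuple


def recipients_to_check(message, profile_users=None):
    """Return the recipients that the profile takes into account.

    Common profile (profile_users is None): the full recipient list.
    Personal profile: only the recipients the profile is applied to.
    """
    if profile_users is None:
        return set(message.recipients)
    return set(message.recipients) & set(profile_users)


def rule_blocks(message, list_a, list_b, profile_users=None):
    """Block when a checked recipient is in list A AND the sender is in list B."""
    checked = recipients_to_check(message, profile_users)
    recipient_in_a = bool(checked & set(list_a))
    sender_in_b = message.sender in list_b
    return recipient_in_a and sender_in_b


msg = Message("spammer@example.net", ("alice@example.com", "bob@example.com"))
list_a = {"alice@example.com"}
list_b = {"spammer@example.net"}

print(rule_blocks(msg, list_a, list_b))                                     # common profile
print(rule_blocks(msg, list_a, list_b, profile_users={"bob@example.com"}))  # personal profile
```

In the second call the personal profile is applied to bob@example.com only, so the recipient condition is not met and the rule does not fire.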
4.3.2. Message content analysis – content filtering
An e-mail message may have no formal spam attributes (for example, it may
arrive from an address that is not included in any black list) and still contain
some "suspicious" information. Content filtering algorithms are used to detect
and process such messages in Russian or English.
The message content (including the Subject header) is analyzed using artificial
intelligence methods. Attached files in the following formats are also processed:
•  Plain text (ASCII, not multibyte);
•  HTML (2.0, 3.0, 3.2, 4.0, XHTML 1.0); 
•  Microsoft Word (versions 6.0, 95/97/2000/XP); 
•  RTF. 
 
The task of Kaspersky Anti-Spam is to decrease the flow of unwanted
mail that clogs users' mailboxes. Detection of 100% of unwanted
mail cannot be guaranteed, because excessively strict criteria would
inevitably cause some non-spam messages to be filtered out as well.
Two basic methods are used to detect messages with "suspicious" content: 
•  checking against sample messages (by comparison of their lexical 
content); 
•  detection of regular expressions – words and word combinations. 
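
The two methods can be illustrated with the sketch below. It does not reproduce the product's algorithms: the Jaccard word-set overlap, the 0.6 threshold, the example patterns and all function names are assumptions chosen only to show the idea of comparing lexical content with samples and of searching for word combinations.

```python
import re


def tokens(text):
    """Lower-cased set of words: a crude stand-in for 'lexical content'."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def lexical_similarity(message, sample):
    """Jaccard overlap of the two word sets (an assumed similarity measure)."""
    a, b = tokens(message), tokens(sample)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def matches_sample(message, samples, threshold=0.6):
    """Method 1: the message is lexically close to a known sample."""
    return any(lexical_similarity(message, s) >= threshold for s in samples)


# Method 2: hypothetical patterns for words and word combinations.
SUSPICIOUS_PATTERNS = [
    re.compile(r"\bfree\s+money\b", re.IGNORECASE),
    re.compile(r"\bclick\s+here\b", re.IGNORECASE),
]


def matches_pattern(message, patterns=SUSPICIOUS_PATTERNS):
    return any(p.search(message) is not None for p in patterns)


def looks_suspicious(message, samples):
    return matches_sample(message, samples) or matches_pattern(message)


samples = ["Buy cheap watches online today"]
print(looks_suspicious("buy cheap watches online today now", samples))  # close to a sample
print(looks_suspicious("Click here for your invoice", samples))         # pattern match
print(looks_suspicious("Meeting moved to Tuesday", samples))            # neither
```

Raising the threshold or narrowing the patterns lowers the number of false positives at the cost of missing more unwanted mail.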
All data used by Kaspersky Anti-Spam – the index (hierarchical category list),
sample messages, regular expressions, etc. – is stored in the content filtering
database.