Cisco Web Security Appliance S190 User Guide
AsyncOS 9.0 for Cisco Web Security Appliances User Guide
Chapter 9 Classify URLs for Policy Application
Regular Expressions
In the following example, the regular expression matches files ending in .exe, .zip, and .bin in the
downloads directory.
/downloads/.*\.(exe|zip|bin)
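The behavior of the example pattern can be sketched in Python. This is only an illustration: Python's `re` module stands in for the appliance's pattern-matching engine, which may differ in details, and the helper name `matches_download` and the sample paths are hypothetical.

```python
import re

# The pattern from the example above. Python's `re` module is an assumption
# here; it is not the engine the appliance uses.
PATTERN = re.compile(r"/downloads/.*\.(exe|zip|bin)")


def matches_download(url_path: str) -> bool:
    """Return True if the pattern is found anywhere in url_path."""
    return PATTERN.search(url_path) is not None


if __name__ == "__main__":
    # Hypothetical sample paths, chosen only for illustration.
    for path in (
        "/downloads/tools/setup.exe",
        "/downloads/archive.zip",
        "/downloads/readme.txt",
        "/files/setup.exe",
    ):
        print(path, matches_download(path))
```

Because the pattern has no end anchor, an unanchored search in Python also matches a path such as `/downloads/setup.exe.txt`. Whether the appliance anchors the expression is not stated in this section.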
Note
You must enclose regular expressions that contain blank spaces or non-alphanumeric characters in
ASCII quotation marks.
Guidelines for Avoiding Validation Failures
Follow these guidelines to minimize validation failures:
• Use literal expressions rather than wildcards and bracketed expressions whenever possible. A literal
  expression is essentially just straight text, such as “It’s as easy as ABC123”. This is less likely
  to fail than “It’s as easy as [A-C]{3}[1-3]{3}”. The latter expression results in the creation of
  nondeterministic finite automaton (NFA) entries, which can dramatically increase processing time.
• Avoid the use of an unescaped dot whenever possible. The dot is a special regular-expression
  character that means match any character except for a newline. If you want to match an actual dot,
  for example, as in “url.com”, then escape the dot using the \ character, as in “url\.com”. Escaped
  dots are treated as literal entries and therefore do not cause issues.
  Similarly, use more specific matches rather than unescaped dots wherever possible. For example, if
  you want to match a URL that is followed by a single digit, use “url[0-9]” rather than “url.”.
• Unescaped dots in a larger regular expression can be especially problematic and should be avoided.
  For example, “Four score and seven years ago our fathers brought forth on this continent, a new
  nation, conceived in Liberty, and dedicated to the proposition that all men are created .qual” may
  cause a failure. Replacing the dot in “.qual” with the literal “equal” should resolve the problem.
  Also, a pattern with an unescaped dot that will return more than 63 characters after the dot will be
  disabled by the pattern-matching engine. Correct or replace the pattern.
• You cannot use “.*” to begin or end a regular expression. You also cannot use “./” in a regular
  expression intended to match a URL, nor can you end such an expression with a dot.
• Combinations of wildcards and bracketed expressions can cause problems. Eliminate as many
  combinations as possible. For example,
  “id:[A-F0-9]{8}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{12}\) Gecko/20100101 Firefox/9\.0\.1\$”
  may cause a failure, while “Gecko/20100101 Firefox/9\.0\.1\$” will not.
  The latter expression does not include any wildcards or bracketed expressions, and both expressions
  use only escaped dots.
  When wildcards and bracketed expressions cannot be eliminated, try to reduce the expression’s
  size and complexity. For example, “[0-9a-z]{64}” may cause a failure. Changing it to something
  smaller or less complex, such as “[0-9]{64}” or “[0-9a-z]{40}”, may resolve the problem.
  If a failure occurs, try to resolve it by applying these rules to the wildcards (such as *, +, and .)
  and bracketed expressions.
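Several of the dot-related guidelines above can be checked mechanically before an expression is submitted. The following is a minimal sketch in Python, not the appliance's validator: the function `lint_pattern`, its warning strings, and the decision to treat only unescaped dots as a problem at the end of an expression are all assumptions made for illustration. It also ignores edge cases such as a dot inside a bracketed expression or a dot after an escaped backslash.

```python
import re
from typing import List

# A dot that is not preceded by a backslash (simplified escape detection).
UNESCAPED_DOT = re.compile(r"(?<!\\)\.")
# An unescaped dot immediately followed by a slash.
UNESCAPED_DOT_SLASH = re.compile(r"(?<!\\)\./")


def lint_pattern(pattern: str) -> List[str]:
    """Return warnings for guideline violations found in `pattern` (hypothetical helper)."""
    warnings = []
    if pattern.startswith(".*") or (
        pattern.endswith(".*") and not pattern.endswith("\\.*")
    ):
        warnings.append("begins or ends with '.*'")
    if UNESCAPED_DOT_SLASH.search(pattern):
        warnings.append("contains './'")
    if pattern.endswith(".") and not pattern.endswith("\\."):
        warnings.append("ends with an unescaped dot")
    if UNESCAPED_DOT.search(pattern):
        warnings.append("contains an unescaped dot")
    return warnings


if __name__ == "__main__":
    for candidate in ("url.com", r"url\.com", ".*example", "url."):
        print(repr(candidate), lint_pattern(candidate))

    # An unescaped dot matches any character, so "url.com" also matches "urlXcom".
    print(bool(re.search("url.com", "urlXcom")))
    print(bool(re.search(r"url\.com", "urlXcom")))
```

An empty list means none of these particular checks fired. It does not guarantee that the appliance will accept the expression, since size and complexity limits are not modeled here.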