Mikroelektronika MIKROE-442 Datasheet
mikroBasic PRO for dsPIC30/33 and PIC24
MikroElektronika
Regular Expressions
Introduction
Regular expressions are a widely used method of specifying patterns of text to search for. Special metacharacters allow you to specify, for instance, that a particular string you are looking for occurs at the beginning or end of a line, or contains n recurrences of a certain character.
Simple matches
Any single character matches itself, unless it is a metacharacter with a special meaning described below. A series of characters matches that series of characters in the target string, so the pattern “short” would match “short” in the target string. You can cause characters that normally function as metacharacters or escape sequences to be interpreted literally by preceding them with a backslash “\”. For instance, the metacharacter “^” matches the beginning of a string, but “\^” matches the character “^”, and “\\” matches “\”, etc.
Examples:
unsigned       matches string 'unsigned'
\^unsigned     matches string '^unsigned'
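The difference between a literal match and an escaped metacharacter can be tried out with a short script. This is only a sketch using Python's `re` module as a stand-in, on the assumption that it treats these basic patterns the same way as the engine described here; the sample strings and variable names are made up for illustration.

```python
import re

# A plain series of characters matches itself anywhere in the target.
plain = re.search(r"unsigned", "unsigned int counter")

# Unescaped "^" anchors the match at the beginning of the string,
# so this finds nothing: the target starts with "x", not "unsigned".
anchored = re.search(r"^unsigned", "x ^unsigned")

# Escaped "\^" matches the literal caret character instead.
literal_caret = re.search(r"\^unsigned", "x ^unsigned")

# "\\" matches a single literal backslash.
backslash = re.search(r"\\", r"C:\temp")

print(plain.group(0))          # unsigned
print(anchored)                # None
print(literal_caret.group(0))  # ^unsigned
print(backslash.group(0))      # \
```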
Escape sequences
Characters may be specified using escape sequences: “\n” matches a newline, “\t” a tab, etc. More generally, \xnn, where nn is a string of hexadecimal digits, matches the character whose ASCII value is nn. If you need a wide (Unicode) character code, you can use ‘\x{nnnn}’, where ‘nnnn’ is one or more hexadecimal digits.
\xnn - char with hex code nn
\x{nnnn} - char with hex code nnnn (one byte for plain text and two bytes for Unicode)
\t - tab (HT/TAB), same as \x09
\n - newline (NL), same as \x0a
\r - carriage return (CR), same as \x0d
\f - form feed (FF), same as \x0c
\a - alarm (bell) (BEL), same as \x07
\e - escape (ESC), same as \x1b
Examples:
unsigned\x20int   matches 'unsigned int' (note the space in the middle)
\tunsigned        matches 'unsigned' (preceded by a tab)
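The equivalences in the escape sequence list can be checked with a small script. The sketch below again uses Python's `re` module as a stand-in, which is an assumption about compatibility rather than a description of the engine documented here. In particular, Python's `re` does not accept `\e` or the `\x{nnnn}` form, so the escape character is written as `\x1b` and the wide form is left out.

```python
import re

# \x20 is the space character, so this pattern matches 'unsigned int'.
spaced = re.fullmatch(r"unsigned\x20int", "unsigned int")

# \t matches a tab that precedes the word.
tabbed = re.search(r"\tunsigned", "\tunsigned int")

# Each named escape should match the same character as its \xnn form.
pairs = {r"\t": 0x09, r"\n": 0x0A, r"\r": 0x0D, r"\f": 0x0C, r"\a": 0x07}
equivalent = all(
    re.fullmatch(name, chr(code)) and re.fullmatch(r"\x%02x" % code, chr(code))
    for name, code in pairs.items()
)

# ESC is written as \x1b here because Python's re rejects \e.
esc = re.fullmatch(r"\x1b", chr(0x1B))

print(spaced is not None)   # True
print(repr(tabbed.group(0)))
print(equivalent)           # True
print(esc is not None)      # True
```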
Character classes
You can specify a character class by enclosing a list of characters in [], which will match any one of the characters from the list. If the first character after the “[” is “^”, the class matches any character not in the list.
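A character class and its negated form can be compared side by side. This is a minimal sketch using Python's `re` module as a stand-in, assuming it handles `[...]` and `[^...]` the same way as the engine described here; the pattern and sample text are invented for illustration.

```python
import re

text = "cat hat bat"

# [ch] matches exactly one character that is either 'c' or 'h'.
in_class = re.findall(r"[ch]at", text)

# [^ch] matches exactly one character that is neither 'c' nor 'h'.
not_in_class = re.findall(r"[^ch]at", text)

print(in_class)      # ['cat', 'hat']
print(not_in_class)  # ['bat']
```

Note that a class always consumes exactly one character of the target, whether or not it is negated.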