Motorola G24 User Manual

Character Sets
1-20
 AT Commands Reference Manual
December 31, 2007
Unlike some legacy encodings, UTF-8 is easy to parse. Lead bytes and trail bytes are easily
distinguished, so moving forwards or backwards in a text string is easier in UTF-8 than in many
other multi-byte encodings.
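To illustrate why lead and trail bytes are easy to tell apart, here is a minimal Python sketch. It relies on the standard UTF-8 bit layout, in which every trail byte has the form 10xxxxxx. The function names are hypothetical and not part of the G24 command set, and the input is assumed to be valid UTF-8.

```python
def is_trail_byte(b: int) -> bool:
    """Trail (continuation) bytes always have the bit pattern 10xxxxxx."""
    return (b & 0xC0) == 0x80


def next_char_start(data: bytes, i: int) -> int:
    """Index of the character that follows the one starting at index i."""
    i += 1
    while i < len(data) and is_trail_byte(data[i]):
        i += 1
    return i


def prev_char_start(data: bytes, i: int) -> int:
    """Index of the first byte of the character that ends just before index i."""
    i -= 1
    while i > 0 and is_trail_byte(data[i]):
        i -= 1
    return i


# "a" takes 1 octet, "é" takes 2 and "€" takes 3, so the string is 6 octets long.
text = "aé€".encode("utf-8")
```

Stepping backwards from the end of `text` with `prev_char_start(text, len(text))` skips the two trail bytes of "€" and lands on its lead byte, without decoding anything before it.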
The codes in the first half of the first row in Character Set Table CS2 (UTF-8 <-> ASCII) are
replaced in this transformation format by their ASCII codes, which are single octets in the range
00h to 7Fh. The other UCS-2 codes are transformed to sequences of two or three octets in the
range 80h to FFh (the original UTF-8 definition allows up to six octets, but only for codes
beyond the UCS-2 range). Text containing only characters in Character Set Table CS3
(UTF-8 <-> UCS-2) is transformed to the same octet sequence, irrespective of whether it was
originally coded with UCS-2.
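The transformation of a single UCS-2 code can be sketched as follows. This is an illustrative Python version of the standard UTF-8 bit layout, not code taken from the G24 firmware; the function name `ucs2_code_to_utf8` is hypothetical.

```python
def ucs2_code_to_utf8(code: int) -> bytes:
    """Encode one UCS-2 code (0000h to FFFFh, surrogates excluded) as UTF-8 octets."""
    if not 0 <= code <= 0xFFFF or 0xD800 <= code <= 0xDFFF:
        raise ValueError(f"not a valid UCS-2 code: {code:#x}")
    if code < 0x80:
        # ASCII range: the code is stored unchanged in a single octet.
        return bytes([code])
    if code < 0x800:
        # Two octets: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code >> 6), 0x80 | (code & 0x3F)])
    # Three octets: 1110xxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xE0 | (code >> 12),
        0x80 | ((code >> 6) & 0x3F),
        0x80 | (code & 0x3F),
    ])


ascii_a = ucs2_code_to_utf8(0x0041)    # "A": one octet, 41h
e_acute = ucs2_code_to_utf8(0x00E9)    # "é": two octets, C3h A9h
euro = ucs2_code_to_utf8(0x20AC)       # "€": three octets, E2h 82h ACh
```

Only codes below 80h produce octets in the 00h to 7Fh range; every octet of a multi-octet sequence has its top bit set, which is why ASCII text passes through unchanged.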
8859-1 Character Set Management
ISO-8859-1 is an 8-bit character set, a major improvement over plain 7-bit US-ASCII.
Characters 0 to 127 are identical to US-ASCII, and positions 128 to 159 hold some
less-used control characters. Positions 160 to 255 hold language-specific characters.
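The three position ranges described above can be expressed as a small lookup. The sketch below is illustrative only; `classify_latin1` is a hypothetical helper name, and the decoding step uses Python's built-in "iso-8859-1" codec.

```python
def classify_latin1(position: int) -> str:
    """Name the ISO-8859-1 range that a character position (0 to 255) falls in."""
    if not 0 <= position <= 255:
        raise ValueError("ISO-8859-1 positions are limited to 0..255")
    if position <= 127:
        return "US-ASCII"
    if position <= 159:
        return "control"
    return "language-specific"


# Position E9h (233) is in the language-specific range and decodes to "é".
e_acute = bytes([0xE9]).decode("iso-8859-1")
```

Because every ISO-8859-1 position maps to the Unicode code point with the same number, `ord(e_acute)` is again 233.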
ISO-8859-1 covers most West European languages, such as French (fr), Spanish (es), Catalan 
(ca), Basque (eu), Portuguese (pt), Italian (it), Albanian (sq), Rhaeto-Romanic (rm), Dutch (nl), 
German (de), Danish (da), Swedish (sv), Norwegian (no), Finnish (fi), Faroese (fo), Icelandic (is), 
Irish (ga), Scottish Gaelic (gd) and English (en). Afrikaans (af) and Swahili (sw) are also included, 
extending coverage to much of Africa.
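One way to check whether a given text stays within ISO-8859-1 is simply to attempt the encoding. The following Python sketch assumes the host application holds the text as a Unicode string; `fits_latin1` is a hypothetical helper, not a G24 command.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character of text has an ISO-8859-1 position."""
    try:
        text.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


spanish_german = fits_latin1("Señor, Grüße")   # characters from listed languages
polish = fits_latin1("Łódź")                   # Polish is not in the list above
```

A check of this kind could be used to decide whether a message can be sent with the 8859-1 character set or needs UTF-8 or UCS-2 instead.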