Oracle Audio Technologies Application 9i User Manual

Oracle9i Application Server Wireless Edition Configuration Guide
7.1 Overview
This release of Wireless Edition supports single-byte, multi-byte, and fixed-width
encoding schemes that are based on national, international, and vendor-specific
standards.
If the character set is single-byte and includes only composite characters, the
number of characters and the number of bytes are the same. If the character set is
multi-byte, there is generally no such correspondence between the number of
characters and the number of bytes. A character can consist of one or more bytes,
depending on the specific multi-byte encoding scheme.
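The difference between character count and byte count can be checked directly. The
following is a minimal sketch in Python, not part of Wireless Edition itself; the
helper name byte_length and the sample strings are assumptions chosen for
illustration, and the codecs used (latin-1, shift_jis, utf-8) are Python's standard
ones rather than Oracle character set names.

```python
def byte_length(text: str, encoding: str) -> int:
    """Return the number of bytes needed to store text in the given encoding."""
    return len(text.encode(encoding))


# Hypothetical sample strings, chosen only for illustration.
western = "caf\u00e9"                 # 4 characters, all present in Latin-1
japanese = "\u65e5\u672c\u8a9e"       # 3 ideographic characters

# Single-byte character set: one byte per character.
print(len(western), byte_length(western, "latin-1"))

# Multi-byte character sets: the byte count depends on the encoding scheme.
print(len(japanese), byte_length(japanese, "shift_jis"))
print(len(japanese), byte_length(japanese, "utf-8"))
```

Any sizing rule that assumes one byte per character therefore holds only for the
single-byte case.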
The number of characters and the number of bytes typically differ when character
elements are combined to form a single character. For example, in the Thai
language, up to three separate character elements can be combined to form one
character, so one Thai character can require up to 3 bytes when TH8TISASCII or
another single-byte Thai character set is used, and up to 9 bytes when the UTF8
character set is used.
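The Thai arithmetic above can be reproduced with a short Python sketch. It assumes
that Python's tis_620 codec is a reasonable stand-in for a single-byte Thai
character set such as TH8TISASCII; the variable names and the sample cluster (a
consonant, a vowel sign, and a tone mark) are illustrative choices, not taken from
the guide.

```python
# One displayed Thai character built from three character elements:
# U+0E17 (consonant), U+0E35 (vowel sign), U+0E48 (tone mark).
thai_cluster = "\u0e17\u0e35\u0e48"

element_count = len(thai_cluster)

# Assumption: tis_620 approximates a single-byte Thai set such as TH8TISASCII.
single_byte_size = len(thai_cluster.encode("tis_620"))

# In UTF-8, each Thai character element occupies three bytes.
utf8_size = len(thai_cluster.encode("utf-8"))

print(element_count, single_byte_size, utf8_size)
```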
7.2 Multi-byte Encoding Schemes
Multi-byte encoding schemes are needed to support the ideographic scripts used in
Asian languages such as Chinese and Japanese, because these languages use
thousands of characters. These schemes use either a fixed number of bytes to
represent a character or a variable number of bytes per character.
7.2.1 Fixed-width Encoding Schemes
In a fixed-width multi-byte encoding scheme, each character is represented by a
fixed number n of bytes, where n is greater than or equal to two.
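Because every character occupies exactly n bytes, a fixed-width byte string can be
cut into characters by position alone. The sketch below illustrates this in Python,
using the standard utf-32-be codec (n = 4) as an example of a fixed-width scheme;
the function name split_fixed_width and the sample text are assumptions made for
illustration.

```python
from typing import List


def split_fixed_width(data: bytes, width: int) -> List[bytes]:
    """Cut data into consecutive chunks of exactly width bytes each."""
    if width < 1 or len(data) % width != 0:
        raise ValueError("data length is not a multiple of the character width")
    return [data[i:i + width] for i in range(0, len(data), width)]


# UTF-32 (big-endian, no byte order mark) stores every character in 4 bytes.
encoded = "A\u4e2d".encode("utf-32-be")
units = split_fixed_width(encoded, 4)
decoded = [unit.decode("utf-32-be") for unit in units]
print(len(encoded), decoded)
```

The character at index k always starts at byte offset k * n, so no scanning of
earlier bytes is needed to locate it.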
7.2.2 Variable-width Encoding Schemes
A variable-width encoding scheme uses one or more bytes to represent a single
character. Some multi-byte encoding schemes use certain bits to indicate the
number of bytes that represent a character. For example, if two bytes are the
maximum number of bytes used to represent a character, the most significant bit
can be set to indicate whether a byte is a single-byte character or the first byte
of a double-byte character. In other schemes, control codes differentiate
single-byte from double-byte characters. Another possibility is that a shift-out
code indicates that the subsequent bytes are double-byte characters until a
shift-in code is encountered.
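Two of the approaches described above can be sketched as simple byte scanners. This
is an illustrative Python sketch under stated assumptions, not the behavior of any
particular Oracle character set: the function names are hypothetical, the
high-bit rule assumes a scheme with at most two bytes per character (Python's
gb2312 codec is used only to produce sample input), and the shift codes are
assumed to be the ASCII control codes SO (0x0E) and SI (0x0F).

```python
from typing import List

SHIFT_OUT = 0x0E  # assumed code: switch to double-byte characters
SHIFT_IN = 0x0F   # assumed code: switch back to single-byte characters


def split_by_high_bit(data: bytes) -> List[bytes]:
    """Split data where a set most significant bit marks a double-byte lead byte."""
    chars = []
    i = 0
    while i < len(data):
        width = 2 if data[i] & 0x80 else 1
        if i + width > len(data):
            raise ValueError("truncated double-byte character")
        chars.append(data[i:i + width])
        i += width
    return chars


def split_by_shift_codes(data: bytes) -> List[bytes]:
    """Split data where shift-out/shift-in codes toggle double-byte mode."""
    chars = []
    double_byte = False
    i = 0
    while i < len(data):
        if data[i] == SHIFT_OUT:
            double_byte, i = True, i + 1
            continue
        if data[i] == SHIFT_IN:
            double_byte, i = False, i + 1
            continue
        width = 2 if double_byte else 1
        if i + width > len(data):
            raise ValueError("truncated double-byte character")
        chars.append(data[i:i + width])
        i += width
    return chars


mixed = "A\u4e2dB".encode("gb2312")
print([len(c) for c in split_by_high_bit(mixed)])
print(split_by_shift_codes(b"A\x0e\x30\x31\x32\x33\x0fB"))
```

In the high-bit scheme every byte carries its own width information, whereas in
the shift-code scheme the width of a byte depends on state established earlier in
the stream, so a scanner must start from a known position.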