ISO/IEC
... encoding. Up to the
present time, changes in Unicode and amendments to ISO/IEC 10646 have
tracked each other, so that the character repertoires and code point
...
...
UTF-1 has only historical interest, having been removed from ISO/IEC
10646. UTF-7 has the quality of encoding ...
... octets, where the number of octets, and the value of each, depend on
the integer value assigned to the character in ISO/IEC 10646. This
transformation format has the following characteristics (all values
...
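As a rough illustration of the point made in the excerpt above, that the number of octets depends on the integer value assigned to the character, the following Python sketch computes the octet count from the code point and compares it with the output of Python's built-in UTF-8 codec. The function name utf8_octet_count is hypothetical, and the thresholds assume the present-day definition of UTF-8 (code points up to U+10FFFF, at most four octets), which may be narrower than the range covered by the document excerpted here.

```python
def utf8_octet_count(code_point: int) -> int:
    """Return the number of octets UTF-8 uses for one code point.

    Assumption: present-day UTF-8 limits (0..0x10FFFF, 1 to 4 octets).
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of range for this sketch")
    if code_point < 0x80:
        return 1  # US-ASCII range, single octet
    if code_point < 0x800:
        return 2
    if code_point < 0x10000:
        return 3  # includes the post-Amendment 5 Hangul syllables block
    return 4


# Compare the computed count with what Python's own codec produces.
for ch in ("A", "\u00e9", "\uac00", "\U0001f600"):
    encoded = ch.encode("utf-8")
    assert len(encoded) == utf8_octet_count(ord(ch))
    print(f"U+{ord(ch):04X} -> {len(encoded)} octet(s): {encoded.hex(' ')}")
```

The sample characters are arbitrary picks, one per length class; U+AC00 is the first Hangul syllable under the post-Amendment 5 code point assignments.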
...
ISO/IEC 10646 is updated from time to time by published amendments;
similarly, different versions of the Unicode ...
...
In general, the changes amount to adding new characters, which does
not pose particular problems with old data. Amendment 5 to ISO/IEC
10646, however, has moved and expanded the Korean Hangul block,
thereby making any previous data containing Hangul characters invalid
...
... media types
containing text consisting of characters from the repertoire of
ISO/IEC 10646 including all amendments at least up to amendment 5
(Korean block), encoded to a sequence of octets using the encoding
...
... UTF-8" does not contain a version
identification, referring generically to ISO/IEC 10646. This is
intentional, the rationale being as follows:
...
...
Now the "Korean mess" (ISO/IEC 10646 amendment 5) is an incompatible
change, in principle contradicting the appropriateness of a version
...
... Hangul characters encoded according to Unicode 1.1 (or equivalently
ISO/IEC 10646 before amendment 5), and there is arguably no such data
to worry about, this being the very reason the incompatible change
was deemed acceptable.
...
... and provided no incompatible change actually occurs. Should
incompatible changes occur in a later version of ISO/IEC 10646, the
MIME charset ...
... containing Hangul syllables encoded to UTF-8 without taking into
account Amendment 5 of ISO/IEC 10646 (i.e. using the pre-amendment 5
code point assignments). Any other UTF-8 ...
... particular data not containing any Hangul syllables, and it
is felt important to strongly recommend against creating any new
Hangul-containing data without taking Amendment 5 of ISO/IEC 10646
into account.
...
... ISO/IEC 10646-1:1993. International Standard -- Information technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 1: Architecture ...
