RFC 4042:UTF-9 and UTF-18 Efficient Transformation...

2. Overview


   UTF-9 encodes [UNICODE] codepoints in the low-order 8 bits of a
   nonet, using the high-order bit to indicate continuation.  Surrogates
   are not used.

   [UNICODE] codepoints in the range U+0000 - U+00FF ([US-ASCII] and
   Latin 1) are represented by a single nonet; codepoints in the range
   U+0100 - U+FFFF (the remainder of the BMP) are represented by two
   nonets; and codepoints in the range U+10000 - U+10FFFF (remainder of
   [UNICODE]) are represented by three nonets.
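
   The nonet counts above follow from carrying one octet of the
   codepoint per nonet.  A minimal sketch in Python, assuming the
   octets are emitted most significant first and that the continuation
   bit is set on every nonet except the last; the function names
   utf9_encode and utf9_decode are illustrative only:

```python
def utf9_encode(codepoint):
    """Return the list of 9-bit values (nonets) for one codepoint."""
    if not 0 <= codepoint <= 0x10FFFF:
        raise ValueError("codepoint outside the [UNICODE] range")
    # Split the codepoint into octets, least significant first.
    octets = [codepoint & 0xFF]
    codepoint >>= 8
    while codepoint:
        octets.append(codepoint & 0xFF)
        codepoint >>= 8
    # Assumption: the most significant octet is transmitted first.
    octets.reverse()
    # Set the high-order (continuation) bit on all nonets but the last.
    return [octet | 0x100 for octet in octets[:-1]] + [octets[-1]]


def utf9_decode(nonets):
    """Return the list of codepoints carried by a sequence of nonets."""
    codepoints = []
    value = 0
    pending = False
    for nonet in nonets:
        # The low-order 8 bits carry data; the 9th bit means "more follows".
        value = (value << 8) | (nonet & 0xFF)
        pending = bool(nonet & 0x100)
        if not pending:
            codepoints.append(value)
            value = 0
    if pending:
        raise ValueError("sequence ends with the continuation bit set")
    return codepoints
```

   Under these assumptions, utf9_encode(0x0391) yields the two nonets
   403 221 (octal), and U+0041 yields the single nonet 101 (octal).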

   Non-[UNICODE] codepoints in [ISO-10646] (that is, codepoints in the
   range 0x110000 - 0x7fffffff) can also be represented in UTF-9 by
   obvious extension, but this is not discussed further as these
   codepoints have been removed from [ISO-10646] by ISO.

   UTF-18 encodes [UNICODE] codepoints in the Basic Multilingual Plane
   (BMP, plane 0), Supplementary Multilingual Plane (SMP, plane 1),
   Supplementary Ideographic Plane (SIP, plane 2), and Supplementary
   Special-purpose Plane (SSP, plane 14) in a single 18-bit value.  It
   does not encode planes 3 through 13, which are currently unused; nor
   planes 15 or 16, which are private spaces.
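
   Four planes of 65,536 codepoints each fill an 18-bit value exactly.
   The sketch below assumes that planes 0 through 2 are stored as-is
   and that plane 14 is folded into the remaining quarter of the 18-bit
   space (0x30000 - 0x3ffff); this overview does not spell that mapping
   out, and the function names are illustrative only:

```python
def utf18_encode(codepoint):
    """Map a codepoint in planes 0-2 or 14 to a single 18-bit value."""
    if 0x00000 <= codepoint <= 0x2FFFF:
        # BMP, SMP and SIP are assumed to be stored unchanged.
        return codepoint
    if 0xE0000 <= codepoint <= 0xEFFFF:
        # Assumption: plane 14 (SSP) occupies 0x30000 - 0x3FFFF.
        return codepoint - 0xB0000
    raise ValueError("codepoint is not representable in UTF-18")


def utf18_decode(value):
    """Map a single 18-bit value back to its codepoint."""
    if not 0 <= value <= 0x3FFFF:
        raise ValueError("value does not fit in 18 bits")
    if value >= 0x30000:
        # Undo the assumed plane 14 folding.
        return value + 0xB0000
    return value
```

   With this mapping, U+E0041 becomes the 18-bit value 0x30041, while
   codepoints in planes 3 through 13, 15, and 16 are rejected.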

   Normally, UTF-9 and UTF-18 should only be used in the context of
   9-bit storage and transport.  Although some protocols, e.g., [FTP],
   support transport of nonets, the current IETF protocol suite is quite
   deficient in this area.  The IETF is urged to take action to improve
   IETF protocol support for nonets.


