character set
... text is intended to be displayed "as-is". No special
software is required to get the full meaning of the
text, aside from support for the indicated character
set. Other subtypes are to be used for enriched text in
forms where application software may enhance the
appearance of the text, but such software must not be
...
... principally textual in form. A "charset" parameter may be used to
indicate the character set of the body text for "text" subtypes,
notably including the subtype "text/plain", which is a generic
...
...
This rule applies regardless of format or character set or sets
involved.
...
... Content-Type field
for "text/plain" data is the character set. This is specified with a
"charset" parameter, as in:
...
... Unlike some other parameter values, the values of the charset
parameter are NOT case sensitive. The default character set, which
must be assumed in the absence of a charset parameter, is US-ASCII ...
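
As an illustration of the two rules quoted in the excerpt above (charset values are not case sensitive, and US-ASCII is assumed when no charset parameter is present), here is a minimal sketch using Python's standard email package. The helper name charset_of is hypothetical and is not part of the quoted specification. Lowercasing the result is one way to make comparisons case-insensitive, not something the text mandates.

```python
from email.message import Message


def charset_of(content_type: str) -> str:
    """Return the charset named in a Content-Type value, in lower case.

    Falls back to "us-ascii" when the value carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the parameter coerced to lower case,
    # or the supplied fallback when the parameter is absent.
    return msg.get_content_charset("us-ascii")


if __name__ == "__main__":
    for value in ("text/plain; charset=ISO-8859-1", "text/plain"):
        print(value, "->", charset_of(value))
```

Because the value is normalized to lower case, "ISO-8859-1", "iso-8859-1" and "Iso-8859-1" all compare equal after parsing.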
... charset.
In particular, definers of future "text" subtypes should pay close
attention to the implications of multioctet character sets for their
subtype definitions.
...
... The charset parameter for subtypes of "text" gives a name of a
character set, as "character set" is defined in RFC 2045draft. The rules
...
... charset parameter for subtypes of "text" gives a name of a
character set, as "character set" is defined in RFC 2045draft. The rules
regarding line breaks ...
... regarding line breaks detailed in the previous section must also be
observed -- a character set whose definition does not conform to
these rules cannot be used in a MIME "text" subtype.
...
...
An initial list of predefined character set names can be found at the
end of this section. Additional character sets may be registered
...
... An initial list of predefined character set names can be found at the
end of this section. Additional character sets may be registered
with IANA.
...
... line break
restriction removed. Therefore, all character sets that conform to
the general definition of "character set" in RFC 2045draft ...
... removed. Therefore, all character sets that conform to
the general definition of "character set" in RFC 2045draft can be
registered for MIME ...
...
Note that if the specified character set includes 8-bit characters
and such characters are used in the body, a Content-Transfer-Encoding ...
... future, it is strongly recommended that new user agents explicitly
specify a character set as a media type parameter in the Content-Type
...
... US-ASCII" does not indicate an arbitrary 7-bit
character set, but specifies that all octets in the body must be
interpreted as characters according to the US-ASCII character set ...
... character set, but specifies that all octets in the body must be
interpreted as characters according to the US-ASCII character set.
National and application-oriented versions of ISO ...
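
The requirement that every octet of a body labelled US-ASCII be interpreted as a US-ASCII character can be checked mechanically. The following is a rough sketch only; the function name is_us_ascii_body is hypothetical, and rejecting octets above 0x7F is one simple reading of the excerpt, not a complete conformance test.

```python
def is_us_ascii_body(body: bytes) -> bool:
    """Return True if every octet in the body decodes as US-ASCII."""
    try:
        # Python's "ascii" codec rejects any octet with the high bit set.
        body.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(is_us_ascii_body(b"Hello, world\r\n"))
    print(is_us_ascii_body(b"caf\xe9\r\n"))
```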
... Internet mail is explicitly discouraged. The omission of the ISO 646
character set from this document is deliberate in this regard. The
character set name of "US-ASCII ...
... character set from this document is deliberate in this regard. The
character set name of "US-ASCII" explicitly refers to the character
set defined in ANSI ...
... character set name of "US-ASCII" explicitly refers to the character
set defined in ANSI X3.4-1986 [US-ASCII]. The new international
...
... ISO 646 is identical
to US-ASCII. The character set name "ASCII" is reserved and must not
be used for any purpose.
...
... version of the American Standard. Insofar as one of the purposes of
specifying a media type and character set is to permit the receiver
to unambiguously determine how the sender ...
... 8bit or multiple
octet character encodings MUST use an appropriate character set
specification to be consistent with MIME.
...
...
The complete US-ASCII character set is listed in ANSI X3.4-1986.
Note that the control characters ...
...
NOTE: An enormous proliferation of character sets exist beyond US-
ASCII. A large number of partially or totally overlapping character
sets ...
... character sets exist beyond US-
ASCII. A large number of partially or totally overlapping character
sets is NOT a good thing. A SINGLE character set that can be used
universally for representing all of the world's languages ...
... ASCII. A large number of partially or totally overlapping character
sets is NOT a good thing. A SINGLE character set that can be used
universally for representing all of the world's languages in Internet
mail ...
... Internet
mail would be preferable. Unfortunately, existing practice in
several communities seems to point to the continued use of multiple
character sets in the near future. A small number of standard
character sets are, therefore, defined for Internet ...
... character sets in the near future. A small number of standard
character sets are, therefore, defined for Internet use in this
document.
...
... ISO-8859]. Note
that the ISO 646 character sets have deliberately been
omitted in favor of their 8859 replacements, which are
the designated character sets ...
... character sets have deliberately been
omitted in favor of their 8859 replacements, which are
the designated character sets for Internet mail. As of
the publication of this document, the legitimate values
...
...
All of these character sets are used as pure 7bit or 8bit sets
without any shift or escape functions. The meaning of shift and
...
... 8bit sets
without any shift or escape functions. The meaning of shift and
escape sequences in these character sets is not defined.
...
...
The character sets specified above are the ones that were relatively
uncontroversial during the drafting of MIME. This document does not
...
... uncontroversial during the drafting of MIME. This document does not
endorse the use of any particular character set other than US-ASCII,
and recognizes that the future evolution of world character sets ...
... character set other than US-ASCII,
and recognizes that the future evolution of world character sets
remains unclear.
...
...
Note that the character set used, if anything other than US-ASCII,
must always be explicitly specified in the Content-Type ...
...
No character set name other than those defined above may be used in
Internet mail without the publication of a formal specification and
...
...
Implementors are discouraged from defining new character sets unless
absolutely necessary.
...
...
In general, composition software should always use the "lowest common
denominator" character set possible. For example, if a body contains
only US-ASCII characters, it SHOULD be marked as being in the US-
...
... ISO-8859-1, which, like all the ISO-8859
family of character sets, is a superset of US-ASCII. More generally,
if a widely-used character set ...
... character sets, is a superset of US-ASCII. More generally,
if a widely-used character set is a subset of another character set,
and a body contains only characters in the widely-used subset, it
...
... US-ASCII. More generally,
if a widely-used character set is a subset of another character set,
and a body contains only characters in the widely-used subset, it
should be labelled as being in that subset. This will increase the
...
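
The "lowest common denominator" rule in the excerpts above can be sketched as a search through candidate character sets, ordered from narrowest to widest, for the first one that can represent the whole body. The function name narrowest_charset and the two-entry candidate list are assumptions made for illustration; a real composer would use whatever ordered list of registered character sets it supports.

```python
def narrowest_charset(text, candidates=("us-ascii", "iso-8859-1")):
    """Return the first candidate charset able to encode all of text.

    Candidates must be ordered from subset to superset, so the first
    match is the narrowest label that fits the body.
    """
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # text contains characters outside this set
        return name
    raise ValueError("no candidate character set can represent the text")


if __name__ == "__main__":
    print(narrowest_charset("plain text"))
    print(narrowest_charset("caf\u00e9"))
```

A body containing only US-ASCII characters is labelled us-ascii even though iso-8859-1 could also represent it, which matches the subset rule quoted above.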
... data, such as file names and mail server commands, are required to be
in the US-ASCII character set.
...
... further specification of data format details, particularly the
specification of alternate character sets. Additional optional
header fields provide mechanisms for certain extensions deemed
...
