character set
... 822 is inadequate for the needs of mail users whose
languages require the use of character sets richer than US ASCII
[US-ASCII ...
... 2.a. A "text" Content-Type value, which can be used to represent
textual information in a number of character sets and
formatted text description languages in a standardized
...
... order to allow it to pass through mail transport mechanisms which
may have data or character set limitations.
4. Two additional header ...
... associated parameters will grow significantly with time. Several
other MIME fields, notably including character set names, are likely
to have new values defined over time. In order to ensure that the
set of such values is developed in an orderly, well-specified, and
...
... 822 mail.
The term "character set" is used in this document to refer to a
method used with one or more tables to convert encoded text to a
...
... ISO 2022's
techniques. However, a MIME character set name must fully specify
the mapping to be performed.
...
... parameters is not significant. Among the defined parameters is a
"charset" parameter by which the character set used in the body may
be declared. Comments are allowed in accordance with RFC 822 rules
...
... special software is required to get the full
meaning of the text, aside from support for the
indicated character set. Subtypes are to be used
for enriched text in forms where application
software may enhance the appearance of the text,
...
... 822 messages are typed by this protocol as plain text in
the US-ASCII character set, which can be explicitly specified as
"Content-type: text/plain ...
... header field, it is impossible to be certain that a
message is actually text in the US-ASCII character set, since it
might well be a message that, using the conventions that predate
this document, includes text in another character set ...
... character set, since it
might well be a message that, using the conventions that predate
this document, includes text in another character set or non-
textual data in a manner that cannot be automatically recognized
(e.g., a uuencoded compressed UNIX ...
... encoding of data that was originally in ISO-8859-1, and will be in
that character set again after decoding.
The following sections will define the two standard encoding ...
... transport, no encoding would be required for text in certain
character sets, while such encodings are clearly required for 7-
bit ...
... system to system, and the relationship between content-transfer-
encodings and character sets. For this reason, a canonical model
for encoding ...
...
The description is presumed to be given in the US-ASCII character
set, although the mechanism specified in [RFC-1522] may be used for
non-US-ASCII ...
... Content-Type. A
"charset" parameter may be used to indicate the character set of the
body text for some text subtypes, notably including the primary
subtype, "text/plain ...
... Content-Type field
for text/plain data is the character set. This is specified with a
"charset" parameter, as in:
...
... Unlike some other parameter values, the values of the charset
parameter are NOT case sensitive. The default character set, which
must be assumed in the absence of a charset parameter, is US-ASCII ...
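
A minimal sketch of the rule quoted above, assuming Python's standard `email` package as the parser. The helper name `declared_charset` is hypothetical and does not come from the source. The charset value is compared without regard to case, and US-ASCII is assumed when no charset parameter is present.

```python
from email import message_from_string


def declared_charset(raw_message: str) -> str:
    """Return the declared charset in lower case, defaulting to US-ASCII."""
    msg = message_from_string(raw_message)
    # get_content_charset() lower-cases the value, which gives the
    # case-insensitive comparison the text calls for. failobj supplies
    # the default when the parameter (or the whole header) is missing.
    return msg.get_content_charset(failobj="us-ascii")


with_param = "Content-Type: text/plain; charset=ISO-8859-1\n\nhello"
without_param = "Subject: no content type\n\nhello"

print(declared_charset(with_param))
print(declared_charset(without_param))
```

Normalising to lower case at the point of parsing means later comparisons can use plain string equality.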
... charset. In particular, definers
of future text subtypes should pay close attention to the
implications of multibyte character sets for their subtype
definitions.
...
... mapping which does not require external profiling information.
An initial list of predefined character set names can be found at the
end of this section. Additional character sets may be registered
...
... An initial list of predefined character set names can be found at the
end of this section. Additional character sets may be registered
with IANA, although the standardization of their use requires the
...
... IESG [RFC-1340] review and approval. Note that if the
specified character set includes 8-bit data, a Content-Transfer-
Encoding header ...
... future, it is strongly recommended that new user agents explicitly
specify a character set via the Content-Type header field. "US-
...
... Internet mail is explicitly discouraged. The omission of the ISO 646
character set is deliberate in this regard. The character set name
of "US-ASCII ...
... ISO 646
character set is deliberate in this regard. The character set name
of "US-ASCII" explicitly refers to ANSI ...
... ANSI X3.4-1986 [US-ASCII] only.
The character set name "ASCII" is reserved and must not be used for
any purpose.
...
... version of the American Standard. Insofar as one of the
purposes of specifying a Content-Type and character set is to
permit the receiver to unambiguously determine how the sender ...
... 8-bit or multiple octet character encodings MUST use an
appropriate character set specification to be consistent with this
specification.
...
...
The complete US-ASCII character set is listed in [US-ASCII]. Note
that the control characters ...
...
NOTE: Beyond US-ASCII, an enormous proliferation of character sets
is possible. It is the opinion of the IETF working group that a
...
... is possible. It is the opinion of the IETF working group that a
large number of character sets is NOT a good thing. We would
prefer to specify a single character set that can be used
...
... large number of character sets is NOT a good thing. We would
prefer to specify a single character set that can be used
universally for representing all of the world's languages in
...
... electronic mail. Unfortunately, existing practice in several
communities seems to point to the continued use of multiple
character sets in the near future. For this reason, we define
names for a small number of character sets ...
... character sets in the near future. For this reason, we define
names for a small number of character sets for which a strong
constituent base exists.
...
... ISO-8859]. Note that the ISO 646
character sets have deliberately been omitted in favor of
their 8859 replacements, which are the designated character
sets for Internet mail ...
... character sets have deliberately been omitted in favor of
their 8859 replacements, which are the designated character
sets for Internet mail. As of the publication of this
document, the legitimate values for "X" are the digits 1
...
... through 9.
The character sets specified above are the ones that were relatively
uncontroversial during the drafting of MIME. This document does not
...
... uncontroversial during the drafting of MIME. This document does not
endorse the use of any particular character set other than US-ASCII,
and recognizes that the future evolution of world character sets ...
... character set other than US-ASCII,
and recognizes that the future evolution of world character sets
remains unclear. It is expected that in the future, additional
character sets ...
... character sets
remains unclear. It is expected that in the future, additional
character sets will be registered for use in MIME.
...
... MIME.
Note that the character set used, if anything other than US-ASCII,
must always be explicitly specified in the Content-Type ...
... Content-Type field.
No other character set name may be used in Internet mail without the
publication of a formal specification and its registration ...
...
Implementors are discouraged from defining new character sets for
mail use unless absolutely necessary.
...
...
In general, mail-sending software must always use the "lowest common
denominator" character set possible. For example, if a body contains
only US-ASCII characters, it must be marked as being in the US-ASCII ...
... US-ASCII characters, it must be marked as being in the US-ASCII
character set, not ISO-8859-1, which, like all the ISO-8859 family of
...
... ISO-8859-1, which, like all the ISO-8859 family of
character sets, is a superset of US-ASCII. More generally, if a
widely-used character set ...
... character sets, is a superset of US-ASCII. More generally, if a
widely-used character set is a subset of another character set, and a
body contains only characters in the widely-used subset, it must be
...
... US-ASCII. More generally, if a
widely-used character set is a subset of another character set, and a
body contains only characters in the widely-used subset, it must be
labeled as being in that subset. This will increase the chances that
...
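
The "lowest common denominator" rule in the excerpts above can be sketched as a search through candidate character sets ordered from most restrictive to least. This is an illustration only: the function name `lowest_common_charset` and the default candidate list are assumptions, and Python's codec names stand in for the MIME charset names.

```python
def lowest_common_charset(text, candidates=("us-ascii", "iso-8859-1")):
    """Return the first candidate charset that can represent all of text.

    Candidates must be ordered subset-first, so a body containing only
    US-ASCII characters is labeled US-ASCII rather than ISO-8859-1.
    """
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # some character is outside this set; try the next
        return name
    raise ValueError("no candidate character set can represent the text")


print(lowest_common_charset("plain text"))
print(lowest_common_charset("caf\u00e9"))
```

Raising an error when no candidate fits, rather than silently picking a fallback, leaves the choice of a wider character set to the caller.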
... data, such as file names and mail server commands, are required to be
in the US-ASCII character set. If this proves problematic in
practice, a new mechanism may be required as a future extension to
MIME ...
... image, or several other kinds of data. A
distinguished parameter syntax allows further specification of data
format details, particularly the specification of alternate character
sets. Additional optional header fields provide mechanisms for
certain extensions deemed desirable by many implementors ...
... text. Be able to send at least text/plain messages, with the
character set specified as a parameter if it is not US-ASCII.
...
...
-- Recognize and display "text" mail
with the character set "US-ASCII."
...
... US-ASCII."
-- Recognize other character sets at
least to the extent of being able
to inform the user about what
...
... least to the extent of being able
to inform the user about what
character set the message uses.
-- Recognize the "ISO ...
...
-- Recognize the "ISO-8859-*" character
sets to the extent of being able to
display those characters that are
common to ISO ...
... gateways to systems that use the EBCDIC character set.
(1) Under some circumstances the encoding ...
... encapsulated text message in a non-ASCII character set.
The embedded multipart message has two parts to be displayed in
parallel, a picture and an audio ...
... associated parameters will grow significantly with time. Several
other MIME fields, notably character set names, access-type
parameters for the message/external-body type, and possibly even
...
... quoted-printable generally preferred if an encoding
is needed and the character set is mostly an ASCII superset.
...
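
To illustrate why quoted-printable tends to suit text that is mostly ASCII, the following sketch encodes the same ISO-8859-1 bytes both ways using Python's standard `quopri` and `base64` modules. The helper `encode_both` and the sample string are hypothetical.

```python
import base64
import quopri


def encode_both(text, charset="iso-8859-1"):
    """Encode text in charset, then return (quoted-printable, base64) bytes."""
    raw = text.encode(charset)
    return quopri.encodestring(raw), base64.b64encode(raw)


qp, b64 = encode_both("Caf\u00e9 au lait")
# Only the one non-ASCII octet is escaped in the quoted-printable form,
# so the rest stays readable; base64 obscures every character.
print(qp)
print(b64)
```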
... The body to be transmitted is created in the system's native format.
The native character set is used, and where appropriate local end of
line conventions are used as well. The body may be a UNIX-style text
...
... canonical
form that is used. Conversion to the proper canonical form may
involve character set conversion, transformation of audio data,
compression ...
... compression, or various other operations specific to the various
content types. If character set conversion is involved, however,
care must be taken to understand the semantics of the content-type ...
... semantics of the content-type,
which may have strong implications for any character set conversion,
e.g. with regard to syntactically meaningful characters in a text
subtype other than "plain".
...
... For example, in the case of text/plain data, the text must be
converted to a supported character set and lines must be delimited
with CRLF delimiters in accordance with RFC822 ...
...
must be first represented in the text/foo form, then (if necessary)
represented in the "bar" character set, and finally transformed via
the base64 algorithm ...
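
The ordering described in the excerpts above (canonical text form first, then the character set, then the transfer encoding) might be sketched as follows. The function name `encode_text_body` is hypothetical, and ISO-8859-1 stands in for the source's placeholder "bar" character set.

```python
import base64


def encode_text_body(text, charset):
    """Apply the three steps in order: CRLF line breaks, charset, base64."""
    # Step 1: canonical text form, with every line break written as CRLF.
    canonical = text.replace("\r\n", "\n").replace("\r", "\n").replace("\n", "\r\n")
    # Step 2: represent the canonical text in the chosen character set.
    octets = canonical.encode(charset)
    # Step 3: apply the content-transfer-encoding last.
    return base64.b64encode(octets)


encoded = encode_text_body("line one\nline two\n", "iso-8859-1")
print(base64.b64decode(encoded))
```

Decoding reverses the steps in the opposite order, so the receiver recovers CRLF-delimited text in the declared character set.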
... Coded Character Set--7-Bit American Standard Code for Information Interchange, ANSI X3.4-1986. ...
... Information Processing -- 8-bit Single-Byte Coded Graphic Character Sets -- Part 1: Latin Alphabet No. 1, ISO 8859-1:1987. Part 2: Latin alphabet No. 2, ISO 8859-2, 1987. Part 3: Latin alphabet No. 3, ISO ...
... International Standard--Information Processing--ISO 7-bit coded character set for information interchange, ISO 646:1983. ...
... Simonsen, K., "Character Mnemonics & Character Sets", RFC 1345, Rationel Almen Planlaegning, June 1992. ...
