RFC 1521:MIME (Multipurpose Internet Mail Extensio...
RFC-Ref

character set



... RFC 822 is inadequate for the needs of mail users whose languages require the use of character sets richer than US ASCII [US-ASCII ...
... 2.a. A "text" Content-Type value, which can be used to represent textual information in a number of character sets and formatted text description languages in a standardized ...
... order to allow it to pass through mail transport mechanisms which may have data or character set limitations. 4. Two additional header ...
... associated parameters will grow significantly with time. Several other MIME fields, notably including character set names, are likely to have new values defined over time. In order to ensure that the set of such values is developed in an orderly, well-specified, and ...


... RFC 822 mail. The term "character set" is used in this document to refer to a method used with one or more tables to convert encoded text to a ...
... ISO 2022's techniques. However, a MIME character set name must fully specify the mapping to be performed. ...


... parameters is not significant. Among the defined parameters is a "charset" parameter by which the character set used in the body may be declared. Comments are allowed in accordance with RFC 822 rules ...
... special software is required to get the full meaning of the text, aside from support for the indicated character set. Subtypes are to be used for enriched text in forms where application software may enhance the appearance of the text, ...
... RFC 822 messages are typed by this protocol as plain text in the US-ASCII character set, which can be explicitly specified as "Content-type: text/plain ...
... header field, it is impossible to be certain that a message is actually text in the US-ASCII character set, since it might well be a message that, using the conventions that predate this document, includes text in another character set ...
... character set, since it might well be a message that, using the conventions that predate this document, includes text in another character set or non-textual data in a manner that cannot be automatically recognized (e.g., a uuencoded compressed UNIX ...


... encoding of data that was originally in ISO-8859-1, and will be in that character set again after decoding. The following sections will define the two standard encoding ...
... transport, no encoding would be required for text in certain character sets, while such encodings are clearly required for 7-bit ...
... system to system, and the relationship between content-transfer-encodings and character sets. For this reason, a canonical model for encoding ...


... The description is presumed to be given in the US-ASCII character set, although the mechanism specified in [RFC-1522] may be used for non-US-ASCII ...


... Content-Type. A "charset" parameter may be used to indicate the character set of the body text for some text subtypes, notably including the primary subtype, "text/plain ...
... Content-Type field for text/plain data is the character set. This is specified with a "charset" parameter, as in: ...
... Unlike some other parameter values, the values of the charset parameter are NOT case sensitive. The default character set, which must be assumed in the absence of a charset parameter, is US-ASCII ...
... charset. In particular, definers of future text subtypes should pay close attention to the implications of multibyte character sets for their subtype definitions. ...
... mapping which does not require external profiling information. An initial list of predefined character set names can be found at the end of this section. Additional character sets may be registered ...
... An initial list of predefined character set names can be found at the end of this section. Additional character sets may be registered with IANA, although the standardization of their use requires the ...
... IESG [RFC-1340] review and approval. Note that if the specified character set includes 8-bit data, a Content-Transfer-Encoding header ...
... SMTP. The default character set, US-ASCII, has been the subject of some ...
... future, it is strongly recommended that new user agents explicitly specify a character set via the Content-Type header field. "US- ...
... Internet mail is explicitly discouraged. The omission of the ISO 646 character set is deliberate in this regard. The character set name of "US-ASCII ...
... ISO 646 character set is deliberate in this regard. The character set name of "US-ASCII" explicitly refers to ANSI ...
... ANSI X3.4-1986 [US-ASCII] only. The character set name "ASCII" is reserved and must not be used for any purpose. ...
... version of the American Standard. Insofar as one of the purposes of specifying a Content-Type and character set is to permit the receiver to unambiguously determine how the sender ...
... 8-bit or multiple octet character encodings MUST use an appropriate character set specification to be consistent with this specification. ...
... The complete US-ASCII character set is listed in [US-ASCII]. Note that the control characters ...
... NOTE: Beyond US-ASCII, an enormous proliferation of character sets is possible. It is the opinion of the IETF working group that a ...
... is possible. It is the opinion of the IETF working group that a large number of character sets is NOT a good thing. We would prefer to specify a single character set that can be used ...
... large number of character sets is NOT a good thing. We would prefer to specify a single character set that can be used universally for representing all of the world's languages in ...
... electronic mail. Unfortunately, existing practice in several communities seems to point to the continued use of multiple character sets in the near future. For this reason, we define names for a small number of character sets ...
... character sets in the near future. For this reason, we define names for a small number of character sets for which a strong constituent base exists. ...
... ISO-8859]. Note that the ISO 646 character sets have deliberately been omitted in favor of their 8859 replacements, which are the designated character sets for Internet mail ...
... character sets have deliberately been omitted in favor of their 8859 replacements, which are the designated character sets for Internet mail. As of the publication of this document, the legitimate values for "X" are the digits 1 ...
... through 9. The character sets specified above are the ones that were relatively uncontroversial during the drafting of MIME. This document does not ...
... uncontroversial during the drafting of MIME. This document does not endorse the use of any particular character set other than US-ASCII, and recognizes that the future evolution of world character sets ...
... character set other than US-ASCII, and recognizes that the future evolution of world character sets remains unclear. It is expected that in the future, additional character sets ...
... character sets remains unclear. It is expected that in the future, additional character sets will be registered for use in MIME. ...
... MIME. Note that the character set used, if anything other than US-ASCII, must always be explicitly specified in the Content-Type ...
... Content-Type field. No other character set name may be used in Internet mail without the publication of a formal specification and its registration ...
... IANA, or by private agreement, in which case the character set name must begin with "X-". ...
... Implementors are discouraged from defining new character sets for mail use unless absolutely necessary. ...
... In general, mail-sending software must always use the "lowest common denominator" character set possible. For example, if a body contains only US-ASCII characters, it must be marked as being in the US-ASCII ...
... US-ASCII characters, it must be marked as being in the US-ASCII character set, not ISO-8859-1, which, like all the ISO-8859 family of ...
... ISO-8859-1, which, like all the ISO-8859 family of character sets, is a superset of US-ASCII. More generally, if a widely-used character set ...
... character sets, is a superset of US-ASCII. More generally, if a widely-used character set is a subset of another character set, and a body contains only characters in the widely-used subset, it must be ...
... US-ASCII. More generally, if a widely-used character set is a subset of another character set, and a body contains only characters in the widely-used subset, it must be labeled as being in that subset. This will increase the chances that ...
... data, such as file names and mail server commands, are required to be in the US-ASCII character set. If this proves problematic in practice, a new mechanism may be required as a future extension to MIME ...
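The charset rules quoted in the snippets above (charset values are not case sensitive, US-ASCII is the default when no charset parameter is given, and senders should label a body with the narrowest character set that can represent it) can be sketched in Python. This is a minimal illustration, not part of RFC 1521: the function names and the two-entry candidate list are assumptions made for the example, and only the standard library's email.message.Message is used for parameter parsing.

```python
from email.message import Message

# Illustrative preference order, narrowest first. This list is an assumption
# for the example, not a list taken from the RFC.
CANDIDATE_CHARSETS = ("us-ascii", "iso-8859-1")


def declared_charset(content_type_value):
    """Return the charset declared in a Content-Type value, lowercased.

    Charset values are not case sensitive, so the result is normalized to
    lower case. US-ASCII is assumed when no charset parameter is present.
    """
    msg = Message()
    msg["Content-Type"] = content_type_value
    return msg.get_content_charset() or "us-ascii"


def lowest_common_charset(text, candidates=CANDIDATE_CHARSETS):
    """Return the first (narrowest) candidate charset that can encode text."""
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue
        return name
    raise ValueError("no candidate charset can represent this text")
```

With these helpers, a body containing only US-ASCII characters is labeled us-ascii rather than iso-8859-1, and a Content-Type value without a charset parameter is read as US-ASCII.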


... image, or several other kinds of data. A distinguished parameter syntax allows further specification of data format details, particularly the specification of alternate character sets. Additional optional header fields provide mechanisms for certain extensions deemed desirable by many implementors ...


... text. Be able to send at least text/plain messages, with the character set specified as a parameter if it is not US-ASCII. ...
... -- Recognize and display "text" mail with the character set "US-ASCII." ...
... US-ASCII." -- Recognize other character sets at least to the extent of being able to inform the user about what ...
... least to the extent of being able to inform the user about what character set the message uses. -- Recognize the "ISO ...
... -- Recognize the "ISO-8859-*" character sets to the extent of being able to display those characters that are common to ISO ...


... gateways to systems that use the EBCDIC character set. (1) Under some circumstances the encoding ...


... encapsulated text message in a non-ASCII character set. The embedded multipart message has two parts to be displayed in parallel, a picture and an audio ...


... associated parameters will grow significantly with time. Several other MIME fields, notably character set names, access-type parameters for the message/external-body type, and possibly even ...


... quoted-printable generally preferred if an encoding is needed and the character set is mostly an ASCII superset. ...
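As a rough illustration of the preference stated above, the following Python sketch picks a Content-Transfer-Encoding for a body that has already been converted to bytes. The function name and the one-third threshold for "mostly ASCII" are assumptions made for this example; RFC 1521 does not define a numeric cutoff, and the "7bit" branch ignores other requirements such as line length.

```python
import base64
import quopri


def pick_transfer_encoding(raw):
    """Choose a transfer encoding name for raw body bytes.

    Assumption for this sketch: a body counts as "mostly ASCII" when at most
    one third of its bytes have the high bit set.
    """
    non_ascii = sum(1 for b in raw if b >= 0x80)
    if non_ascii == 0:
        return "7bit"
    if non_ascii * 3 <= len(raw):
        return "quoted-printable"
    return "base64"


def apply_transfer_encoding(raw):
    """Return (encoding_name, encoded_bytes) for raw body bytes."""
    name = pick_transfer_encoding(raw)
    if name == "quoted-printable":
        return name, quopri.encodestring(raw)
    if name == "base64":
        return name, base64.encodebytes(raw)
    return name, raw
```

For ISO-8859-1 text with only a few accented letters, quoted-printable leaves most of the body readable, which is why it is the usual choice for character sets that are mostly an ASCII superset.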


... The body to be transmitted is created in the system's native format. The native character set is used, and where appropriate local end of line conventions are used as well. The body may be a UNIX-style text ...
... canonical form that is used. Conversion to the proper canonical form may involve character set conversion, transformation of audio data, compression ...
... compression, or various other operations specific to the various content types. If character set conversion is involved, however, care must be taken to understand the semantics of the content-type ...
... semantics of the content-type, which may have strong implications for any character set conversion, e.g. with regard to syntactically meaningful characters in a text subtype other than "plain". ...
... For example, in the case of text/plain data, the text must be converted to a supported character set and lines must be delimited with CRLF delimiters in accordance with RFC 822 ...
... must be first represented in the text/foo form, then (if necessary) represented in the "bar" character set, and finally transformed via the base64 algorithm ...
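The canonical encoding steps described in the snippets above (start from the native form, convert to a supported character set, delimit lines with CRLF, then apply the transfer encoding) can be sketched as follows for text/plain. The function name and the default charset are assumptions chosen for illustration, and base64 stands in for whichever transfer encoding is actually selected.

```python
import base64
import re


def encode_text_body(text, charset="iso-8859-1"):
    """Sketch of the canonical model for a text/plain body.

    1. Normalize local line endings (LF, CR, or CRLF) to CRLF.
    2. Convert the text to the declared character set.
    3. Apply the base64 transfer encoding to the resulting bytes.
    """
    canonical = re.sub(r"\r\n|\r|\n", "\r\n", text)
    raw = canonical.encode(charset)
    return base64.encodebytes(raw).decode("ascii")
```

Decoding reverses the steps: the base64 layer is removed first, and the bytes are then interpreted in the declared character set.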


... Coded Character Set--7-Bit American Standard Code for Information Interchange, ANSI X3.4-1986. ...
... ISO 7-bit and 8-bit coded character sets--Code extension techniques, ISO 2022:1986. ...
... Information Processing -- 8-bit Single-Byte Coded Graphic Character Sets -- Part 1: Latin Alphabet No. 1, ISO 8859-1:1987. Part 2: Latin alphabet No. 2, ISO 8859-2, 1987. Part 3: Latin alphabet No. 3, ISO ...
... International Standard--Information Processing--ISO 7-bit coded character set for information interchange, ISO 646:1983. ...
... Simonsen, K., "Character Mnemonics & Character Sets", RFC 1345, Rationel Almen Planlaegning, June 1992. ...


