      1 
      2 
      3 
      4 
      5 
      6 
      7             Network Working Group               N. Borenstein, Bellcore
      8             Request for Comments: 1341               N. Freed, Innosoft
      9                                                               June 1992
     10 
     11 
     12 
     13                    MIME  (Multipurpose Internet Mail Extensions):
     14 
     15 
     16                       Mechanisms for Specifying and Describing
     17                        the Format of Internet Message Bodies
     18 
     19 
     20           Status of this Memo
     21 
     22             This RFC specifies an IAB standards track protocol  for  the
     23             Internet  community, and requests discussion and suggestions
     24             for improvements.  Please refer to the  current  edition  of
     25             the    "IAB    Official    Protocol   Standards"   for   the
     26             standardization  state  and   status   of   this   protocol.
     27             Distribution of this memo is unlimited.
     28 
     29           Abstract
     30 
     31             RFC 822 defines  a  message  representation  protocol  which
     32             specifies  considerable  detail  about  message headers, but
     33             which leaves the message content, or message body,  as  flat
     34             ASCII  text.   This document redefines the format of message
     35             bodies to allow multi-part textual and  non-textual  message
     36             bodies  to  be  represented  and  exchanged  without loss of
     37             information.   This is based on earlier work  documented  in
     38             RFC  934  and  RFC  1049, but extends and revises that work.
     39             Because RFC 822 said so little about  message  bodies,  this
     40             document  is  largely  orthogonal to (rather than a revision
     41             of) RFC 822.
     42 
     43             In  particular,  this  document  is  designed   to   provide
     44             facilities  to include multiple objects in a single message,
     45             to represent body text in  character  sets  other  than  US-
     46             ASCII,  to  represent formatted multi-font text messages, to
     47             represent non-textual material  such  as  images  and  audio
     48             fragments,  and  generally  to  facilitate  later extensions
     49             defining new types of Internet mail for use  by  cooperating
     50             mail agents.
     51 
     52             This document does NOT extend Internet mail header fields to
     53             permit  anything  other  than  US-ASCII  text  data.   It is
     54             recognized that such extensions are necessary, and they  are
     55             the subject of a companion document [RFC-1342].
     56 
     57             A table of contents appears at the end of this document.
     58 
     59 
     60 
     61 
     62 
     63 
     64             Borenstein & Freed                                  [Page i]
     65 
     66 
     67 
     68 
     69 
     70 
     71 
     72             1    Introduction
     73 
     74             Since its publication in 1982, RFC 822 [RFC-822] has defined
     75             the   standard  format  of  textual  mail  messages  on  the
     76             Internet.  Its success has been such that the RFC 822 format
     77             has  been  adopted,  wholly  or  partially,  well beyond the
     78             confines of the Internet and  the  Internet  SMTP  transport
     79             defined  by RFC 821 [RFC-821].  As the format has seen wider
     80             use,  a  number  of  limitations  have  proven  increasingly
     81             restrictive for the user community.
     82 
     83             RFC 822 was intended to specify a format for text  messages.
     84             As such, non-text messages, such as multimedia messages that
     85             might include audio or images,  are  simply  not  mentioned.
     86             Even in the case of text, however, RFC 822 is inadequate for
     87             the needs of mail users whose languages require the  use  of
     88             character  sets  richer  than US ASCII [US-ASCII]. Since RFC
     89             822 does not specify mechanisms for mail  containing  audio,
     90             video,  Asian  language  text, or even text in most European
     91             languages, additional specifications are needed.
     92 
     93             One of the notable limitations of  RFC  821/822  based  mail
     94             systems  is  the  fact  that  they  limit  the  contents  of
     95             electronic  mail  messages  to  relatively  short  lines  of
     96             seven-bit  ASCII.   This  forces  users  to convert any non-
     97             textual data that they may wish to send into seven-bit bytes
     98             representable  as printable ASCII characters before invoking
     99             a local mail UA (User Agent,  a  program  with  which  human
    100             users  send  and  receive  mail). Examples of such encodings
    101             currently used in the  Internet  include  pure  hexadecimal,
    102             uuencode,  the  3-in-4 base 64 scheme specified in RFC 1113,
    103             the Andrew Toolkit Representation [ATK], and many others.
    104 
    105             The limitations of RFC 822 mail become even more apparent as
    106             gateways  are  designed  to  allow  for the exchange of mail
    107             messages between RFC 822 hosts and X.400 hosts. X.400 [X400]
    108             specifies  mechanisms  for the inclusion of non-textual body
    109             parts  within  electronic  mail   messages.    The   current
    110             standards  for  the  mapping  of  X.400  messages to RFC 822
    111             messages specify that either X.400  non-textual  body  parts
    112             should  be converted to (not encoded in) an ASCII format, or
    113             that they should be discarded, notifying the  RFC  822  user
    114             that  discarding has occurred.  This is clearly undesirable,
    115             as information that a user may  wish  to  receive  is  lost.
    116             Even  though  a  user's  UA  may  not have the capability of
    117             dealing with the non-textual body part, the user might  have
    118             some  mechanism  external  to the UA that can extract useful
    119             information from the body part.  Moreover, it does not allow
    120             for  the  fact  that the message may eventually be gatewayed
    121             back into an X.400 message handling system (i.e., the  X.400
    122             message  is  "tunneled"  through  Internet  mail), where the
    123             non-textual  information  would  definitely  become   useful
    124             again.
    125 
    134             RFC 1341   MIME: Multipurpose Internet Mail Extensions   June 1992
    135 
    136 
    137             This document describes several mechanisms that  combine  to
    138             solve most of these problems without introducing any serious
    139             incompatibilities with the existing world of RFC  822  mail.
    140             In particular, it describes:
    141 
    142             1.  A MIME-Version header field, which uses a version number
    143                  to  declare  a  message  to  be  conformant  with  this
    144                  specification and  allows  mail  processing  agents  to
    145                  distinguish  between  such messages and those generated
    146                  by older or non-conformant software, which is  presumed
    147                  to lack such a field.
    148 
    149             2.  A Content-Type header field, generalized from  RFC  1049
    150                  [RFC-1049],  which  can be used to specify the type and
    151                  subtype of data in the body of a message and  to  fully
    152                  specify  the  native  representation (encoding) of such
    153                  data.
    154 
    155                  2.a.  A "text" Content-Type value, which can be used to
    156                       represent  textual  information  in  a  number  of
    157                       character  sets  and  formatted  text  description
    158                       languages in a standardized manner.
    159 
    160                  2.b.  A "multipart" Content-Type value,  which  can  be
    161                       used  to  combine  several body parts, possibly of
    162                       differing types of data, into a single message.
    163 
    164                  2.c.  An "application" Content-Type value, which can be
    165                       used  to transmit application data or binary data,
    166                       and hence,  among  other  uses,  to  implement  an
    167                       electronic mail file transfer service.
    168 
    169                  2.d.  A "message" Content-Type value, for encapsulating
    170                       a mail message.
    171 
    172                  2.e.  An "image"  Content-Type value,  for  transmitting
    173                       still image (picture) data.
    174 
    175                  2.f.  An "audio"  Content-Type value, for  transmitting
    176                       audio or voice data.
    177 
    178                  2.g.  A "video"  Content-Type value,  for  transmitting
    179                       video or moving image data, possibly with audio as
    180                       part of the composite video data format.
    181 
    182             3.  A Content-Transfer-Encoding header field, which  can  be
    183                  used  to specify an auxiliary encoding that was applied
    184                  to the data in order to allow it to pass  through  mail
    185                  transport  mechanisms  which may have data or character
    186                  set limitations.
    187 
    188             4.  Two optional header fields that can be used  to  further
    189                  describe the data in a message body, the Content-ID and
    190                  Content-Description header fields.
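
As an illustration of the header fields listed above, the following sketch builds a small message with Python's standard `email` package and prints the resulting fields. It is not part of this specification: the `email` package follows later revisions of MIME, and the addresses, the Content-ID value, and the description text are made-up placeholders.

```python
from email.message import EmailMessage

# All addresses and identifiers below are hypothetical placeholders.
msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"
msg["Subject"] = "MIME header demo"

# set_content() fills in Content-Type and Content-Transfer-Encoding,
# and adds "MIME-Version: 1.0" if the field is not already present.
msg.set_content("Hello, world.\n", charset="us-ascii")

# The two optional descriptive fields are set by hand.
msg["Content-ID"] = "<greeting.1@example.com>"
msg["Content-Description"] = "A short plain-text greeting"

for name in ("MIME-Version", "Content-Type", "Content-Transfer-Encoding",
             "Content-ID", "Content-Description"):
    print(f"{name}: {msg[name]}")
```

The library chooses the transfer encoding itself here. A body that contained non-ASCII or very long lines would likely be given a different Content-Transfer-Encoding.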
    191 
    202             MIME has been carefully designed as an extensible mechanism,
    203             and  it  is  expected  that  the set of content-type/subtype
    204             pairs   and   their   associated   parameters   will    grow
    205             significantly with time.  Several other MIME fields, notably
    206             including character set names, are likely to have new values
    207             defined  over time.  In order to ensure that the set of such
    208             values is  developed  in  an  orderly,  well-specified,  and
    209             public  manner,  MIME  defines  a registration process which
    210             uses the Internet Assigned Numbers  Authority  (IANA)  as  a
    211             central  registry  for  such  values.   Appendix  F provides
    212             details about how IANA registration is accomplished.
    213 
    214             Finally, to specify and promote interoperability, Appendix A
    215             of  this  document  provides a basic applicability statement
    216             for a subset of the above mechanisms that defines a  minimal
    217             level of "conformance" with this document.
    218 
    219             HISTORICAL NOTE:  Several of  the  mechanisms  described  in
    220             this  document  may seem somewhat strange or even baroque at
    221             first reading.  It is important to note  that  compatibility
    222             with  existing  standards  AND  robustness  across  existing
    223             practice were two of the highest priorities of  the  working
    224             group   that   developed   this  document.   In  particular,
    225             compatibility was always favored over elegance.
    226 
    227             2    Notations, Conventions, and Generic BNF Grammar
    228 
    229             This document is being published in  two  versions,  one  as
    230             plain  ASCII  text  and  one  as  PostScript.  The latter is
    231             recommended, though the textual contents are  identical.  An
    232             Andrew-format  copy  of this document is also available from
    233             the first author (Borenstein).
    234 
    235             Although the mechanisms specified in this document  are  all
    236             described  in prose, most are also described formally in the
    237             modified BNF notation of RFC 822.  Implementors will need to
    238             be  familiar  with this notation in order to understand this
    239             specification, and are referred to RFC 822  for  a  complete
    240             explanation of the modified BNF notation.
    241 
    242             Some of the modified BNF in this document makes reference to
    243             syntactic  entities  that  are defined in RFC 822 and not in
    244             this document.  A complete formal grammar, then, is obtained
    245             by combining the collected grammar appendix of this document
    246             with that of RFC 822.
    247 
    248             The term CRLF, in this document, refers to the  sequence  of
    249             the  two  ASCII  characters CR (13) and LF (10) which, taken
    250             together, in this order, denote a  line  break  in  RFC  822
    251             mail.
    252 
    253             The term "character  set",  wherever  it  is  used  in  this
    254             document,  refers  to a coded character set, in the sense of
    255             ISO character set standardization  work,  and  must  not  be
    267             misinterpreted as meaning "a set of characters."
    268 
    269             The term "message", when not further qualified, means either
    270             the (complete or "top-level") message being transferred on a
    271             network, or  a  message  encapsulated  in  a  body  of  type
    272             "message".
    273 
    274             The term "body part", in this document,  means  one  of  the
    275             parts  of  the body of a multipart entity. A body part has a
    276             header and a body, so it makes sense to speak about the body
    277             of a body part.
    278 
    279             The term "entity", in this document, means either a  message
    280             or  a  body  part.  All kinds of entities share the property
    281             that they have a header and a body.
    282 
    283             The term "body", when not further qualified, means the  body
    284             of  an  entity, that is the body of either a message or of a
    285             body part.
    286 
    287             Note: the previous four definitions are  clearly  circular.
    288             This  is  unavoidable,  since the overall structure of a MIME
    289             message is indeed recursive.
    290 
    291             In this document, all numeric and octet values are given  in
    292             decimal notation.
    293 
    294             It must be noted that  Content-Type  values,  subtypes,  and
    295             parameter  names  as  defined  in  this  document  are case-
    296             insensitive.  However, parameter values  are  case-sensitive
    297             unless otherwise specified for the specific parameter.
    298 
    299             FORMATTING NOTE:  This document has been carefully formatted
    300             for   ease  of  reading.  The  PostScript  version  of  this
    301             document, in particular, places notes like this  one,  which
    302             may  be  skipped  by  the  reader, in a smaller,  italicized
    303             font and indents them as well.  In the text version, only the
    304             indentation  is  preserved,  so  if you are reading the text
    305             version of this you  might  consider  using  the  PostScript
    306             version  instead.  However,  all such notes will be indented
    307             and preceded by "NOTE:" or some similar  introduction,  even
    308             in the text version.
    309 
    310             The primary purpose  of  these  non-essential  notes  is  to
    311             convey  information about the rationale of this document, or
    312             to  place  this  document  in  the  proper   historical   or
    313             evolutionary  context.   Such  information may be skipped by
    314             those who are  focused  entirely  on  building  a  compliant
    315             implementation,  but  may  be  of  use  to those who wish to
    316             understand why this document is written as it is.
    317 
    318             For ease of  recognition,  all  BNF  definitions  have  been
    319             placed  in  a  fixed-width font in the PostScript version of
    320             this document.
    321 
    332             3    The MIME-Version Header Field
    333 
    334             Since RFC 822 was published in 1982, there has  really  been
    335             only  one  format  standard for Internet messages, and there
    336             has  been  little  perceived  need  to  declare  the  format
    337             standard  in  use.  This document is an independent document
    338             that complements RFC 822. Although the  extensions  in  this
    339             document have been defined in such a way as to be compatible
    340             with RFC 822, there are  still  circumstances  in  which  it
    341             might  be  desirable  for  a  mail-processing  agent to know
    342             whether a message was composed  with  the  new  standard  in
    343             mind.
    344 
    345             Therefore, this document defines a new header field,  "MIME-
    346             Version",  which is to be used to declare the version of the
    347             Internet message body format standard in use.
    348 
    349             Messages composed in  accordance  with  this  document  MUST
    350             include  such  a  header  field, with the following verbatim
    351             text:
    352 
    353             MIME-Version: 1.0
    354 
    355             The presence of this header field is an assertion  that  the
    356             message has been composed in compliance with this document.
    357 
    358             Since it is possible that a future document might extend the
    359             message format standard again, a formal BNF is given for the
    360             content of the MIME-Version field:
    361 
    362             MIME-Version := text
    363 
    364             Thus, future  format  specifiers,  which  might  replace  or
    365             extend  "1.0", are (minimally) constrained by the definition
    366             of "text", which appears in RFC 822.
    367 
    368             Note that the MIME-Version header field is required  at  the
    369             top  level  of  a  message. It is not required for each body
    370             part of a multipart entity.  It is required for the embedded
    371             headers  of  a  body  of  type  "message" if and only if the
    372             embedded message is itself claimed to be MIME-compliant.
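
A mail-processing agent could apply the rule above roughly as follows. This is a minimal sketch, not a definitive implementation: the function name `declares_mime` and the sample messages are hypothetical, and the check accepts only the verbatim value "1.0" (surrounding white space aside), without handling RFC 822 comments.

```python
from email import message_from_string


def declares_mime(raw_message: str) -> bool:
    """Return True if the top-level header carries 'MIME-Version: 1.0'."""
    msg = message_from_string(raw_message)
    version = msg.get("MIME-Version")  # field names match case-insensitively
    return version is not None and version.strip() == "1.0"


# Hypothetical sample messages.
mime_message = (
    "From: sender@example.com\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain\n"
    "\n"
    "Hello.\n"
)
old_message = (
    "From: sender@example.com\n"
    "\n"
    "Hello.\n"
)

print(declares_mime(mime_message))
print(declares_mime(old_message))
```

A message without the field is treated as having been generated by older or non-conformant software, as the Introduction describes.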
    373 
    397             4    The Content-Type Header Field
    398 
    399             The purpose of the Content-Type field  is  to  describe  the
    400             data  contained  in the body fully enough that the receiving
    401             user agent can pick an appropriate  agent  or  mechanism  to
    402             present  the  data  to the user, or  otherwise deal with the
    403             data in an appropriate manner.
    404 
    405             HISTORICAL NOTE:  The Content-Type header  field  was  first
    406             defined  in RFC 1049.  RFC 1049 Content-types used a simpler
    407             and less powerful syntax, but one that is largely compatible
    408             with the mechanism given here.
    409 
    410             The Content-Type  header field is used to specify the nature
    411             of  the  data  in  the body of an entity, by giving type and
    412             subtype identifiers, and by providing auxiliary  information
    413             that may be required for certain types.   After the type and
    414             subtype names, the remainder of the header field is simply a
    415             set of parameters, specified in an attribute/value notation.
    416             The set of meaningful parameters differs for  the  different
    417             types.   The  ordering  of  parameters  is  not significant.
    418             Among the defined parameters is  a  "charset"  parameter  by
    419             which  the  character  set used in the body may be declared.
    420             Comments are allowed in accordance with RFC  822  rules  for
    421             structured header fields.
    422 
    423             In general, the top-level Content-Type is  used  to  declare
    424             the  general  type  of  data,  while the subtype specifies a
    425             specific format for that type of data.  Thus, a Content-Type
    426             of  "image/xyz" is enough to tell a user agent that the data
    427             is an image, even if the user agent has no knowledge of  the
    428             specific  image format "xyz".  Such information can be used,
    429             for example, to decide whether or not to show a user the raw
    430             data from an unrecognized subtype -- such an action might be
    431             reasonable for unrecognized subtypes of text,  but  not  for
    432             unrecognized  subtypes  of image or audio.  For this reason,
    433             registered subtypes of audio, image, text, and video, should
    434             not  contain  embedded  information  that  is  really  of  a
    435             different type.  Such compound types should  be  represented
    436             using the "multipart" or "application" types.
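
The fallback decision described above can be sketched as a dispatch on the top-level type alone. The function name `fallback_action` and the returned action strings are assumptions made for illustration. The text only says that showing raw data may be reasonable for unrecognized subtypes of text, and not for image or audio.

```python
def fallback_action(content_type: str) -> str:
    """Pick a fallback for a subtype the user agent does not recognize."""
    # Only the top-level type is consulted; it is case-insensitive.
    top_level = content_type.split("/", 1)[0].strip().lower()
    if top_level == "text":
        # Raw text is usually still readable by a human.
        return "show raw data"
    if top_level in ("image", "audio", "video"):
        # Raw bytes of these types are not meaningful on a text display.
        return "do not show raw data"
    # Assumption: any other type is handled conservatively.
    return "do not show raw data"


print(fallback_action("text/xyz"))
print(fallback_action("image/xyz"))
```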
    437 
    438             Parameters are modifiers of the content-subtype, and do  not
    439             fundamentally  affect  the  requirements of the host system.
    440             Although  most  parameters  make  sense  only  with  certain
    441             content-types,  others  are  "global" in the sense that they
    442             might apply to any  subtype.  For  example,  the  "boundary"
    443             parameter makes sense only for the "multipart" content-type,
    444             but the "charset" parameter might make  sense  with  several
    445             content-types.
    446 
    447             An initial set of seven Content-Types  is  defined  by  this
    448             document.   This  set  of  top-level names is intended to be
    449             substantially complete.  It is expected  that  additions  to
    450             the   larger   set  of  supported  types  can  generally  be
    462             accomplished by  the  creation  of  new  subtypes  of  these
    463             initial  types.   In the future, more top-level types may be
    464             defined only by an extension to this standard.   If  another
    465             primary  type is to be used for any reason, it must be given
    466             a name starting  with  "X-"  to  indicate  its  non-standard
    467             status  and  to  avoid  a  potential  conflict with a future
    468             official name.
    469 
    470             In the Extended BNF notation  of  RFC  822,  a  Content-Type
    471             header field value is defined as follows:
    472 
    473             Content-Type := type "/" subtype *[";" parameter]
    474 
    475             type :=          "application"     / "audio"
    476                       / "image"           / "message"
    477                       / "multipart"  / "text"
    478                       / "video"           / x-token
    479 
    480             x-token := <The two characters "X-" followed, with no
    481                        intervening white space, by any token>
    482 
    483             subtype := token
    484 
    485             parameter := attribute "=" value
    486 
    487             attribute := token
    488 
    489             value := token / quoted-string
    490 
    491             token := 1*<any CHAR except SPACE, CTLs, or tspecials>
    492 
    493             tspecials :=  "(" / ")" / "<" / ">" / "@"  ; Must be in
    494                        /  "," / ";" / ":" / "\" / <">  ; quoted-string,
    495                        /  "/" / "[" / "]" / "?" / "."  ; to use within
    496                        /  "="                        ; parameter values
    497 
    498             Note that the definition of "tspecials" is the same  as  the
    499             RFC  822  definition  of "specials" with the addition of the
    500             three characters "/", "?", and "=".
    501 
    502             Note also that a subtype specification is MANDATORY.   There
    503             are no default subtypes.
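
The grammar above can be turned into a small parser. The following is a sketch under stated assumptions, not a reference implementation: the name `parse_content_type` is hypothetical, optional white space is tolerated around the separators, and RFC 822 comments and header folding are not handled.

```python
import re

# tspecials exactly as listed in the grammar above.
TSPECIALS = '()<>@,;:\\"/[]?.='
# token: one or more CHARs except SPACE, CTLs, and tspecials.
_TOKEN_CHARS = "".join(
    chr(c) for c in range(33, 127) if chr(c) not in TSPECIALS
)
TOKEN = "[" + re.escape(_TOKEN_CHARS) + "]+"
# quoted-string in the sense of RFC 822 (simplified).
QUOTED = r'"(?:[^"\\\r]|\\.)*"'

STANDARD_TYPES = {
    "application", "audio", "image", "message",
    "multipart", "text", "video",
}

_TYPE_RE = re.compile(rf"\s*({TOKEN})\s*/\s*({TOKEN})\s*")
_PARAM_RE = re.compile(rf";\s*({TOKEN})\s*=\s*({TOKEN}|{QUOTED})\s*")


def parse_content_type(value: str):
    """Return (type, subtype, params) for a Content-Type field value."""
    m = _TYPE_RE.match(value)
    if not m:
        # Covers a missing subtype: there are no default subtypes.
        raise ValueError("expected type/subtype")
    ctype, subtype = m.group(1).lower(), m.group(2).lower()
    is_x_token = ctype.startswith("x-") and len(ctype) > 2
    if ctype not in STANDARD_TYPES and not is_x_token:
        raise ValueError(f"unknown type without X- prefix: {ctype}")
    params = {}
    pos = m.end()
    while pos < len(value):
        p = _PARAM_RE.match(value, pos)
        if not p:
            raise ValueError(f"malformed parameter at offset {pos}")
        name, raw = p.group(1).lower(), p.group(2)
        if raw.startswith('"'):
            # Strip the quotes and undo quoted-pairs.
            raw = re.sub(r"\\(.)", r"\1", raw[1:-1])
        params[name] = raw  # values keep their original case
        pos = p.end()
    return ctype, subtype, params


print(parse_content_type('Multipart/Mixed; boundary="simple boundary"'))
```

Because "." and "/" are tspecials, a parameter value containing either character must be written as a quoted-string.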
    504 
    505             The  type,  subtype,  and  parameter  names  are  not   case
    506             sensitive.   For  example,  TEXT,  Text,  and  TeXt  are all
    507             equivalent.  Parameter values are normally  case  sensitive,
    508             but   certain   parameters   are  interpreted  to  be  case-
    509             insensitive, depending on the intended use.   (For  example,
    510             multipart  boundaries  are  case-sensitive, but the "access-
    511             type" for message/External-body is not case-sensitive.)
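
These case rules can be illustrated with a small normalization helper. This is a sketch only: the name `normalize_param` is hypothetical, and the set of parameters whose values are case-insensitive contains just "access-type", the one example named in the text.

```python
# Parameters whose values compare without regard to case. Only "access-type"
# is named by the text above; any longer list would be an assumption.
CASE_INSENSITIVE_VALUES = {"access-type"}


def normalize_param(name, value):
    """Lower-case the parameter name, and the value only where allowed."""
    name = name.lower()
    if name in CASE_INSENSITIVE_VALUES:
        value = value.lower()
    return name, value


print(normalize_param("Boundary", "AbC"))
print(normalize_param("ACCESS-TYPE", "FTP"))
```

Two boundaries that differ only in case therefore stay distinct after normalization, while two spellings of the same access-type compare equal.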
    512 
    513             Beyond this syntax, the only constraint on the definition of
    514             subtype  names  is  the  desire  that  their  uses  must not
    515             conflict.  That is, it would  be  undesirable  to  have  two
    527             different       communities       using       "Content-Type:
    528             application/foobar"  to  mean  two  different  things.   The
    529             process  of  defining  new  content-subtypes,  then,  is not
    530             intended to be a mechanism for  imposing  restrictions,  but
    531             simply  a  mechanism  for publicizing the usages. There are,
    532             therefore,  two  acceptable  mechanisms  for  defining   new
    533             Content-Type subtypes:
    534 
    535                  1.  Private values (starting  with  "X-")  may  be
    536                       defined  bilaterally  between two cooperating
    537                       agents  without   outside   registration   or
    538                       standardization.
    539 
    540                  2.   New  standard  values  must  be   documented,
    541                       registered  with,  and  approved  by IANA, as
    542                       described in Appendix F.  Where intended  for
    543                       public  use,  the  formats they refer to must
    544                       also be defined by a published specification,
    545                       and possibly offered for standardization.
    546 
    547             The seven  standard  initial  predefined  Content-Types  are
    548             detailed in the bulk of this document.  They are:
    549 
    550                  text --  textual  information.   The  primary  subtype,
    551                       "plain",  indicates plain (unformatted) text.   No
    552                       special software  is  required  to  get  the  full
    553                       meaning  of  the  text, aside from support for the
    554                       indicated character set.  Subtypes are to be  used
    555                       for  enriched  text  in  forms  where  application
    556                       software may enhance the appearance of  the  text,
    557                       but such software must not be required in order to
    558                       get the general  idea  of  the  content.  Possible
    559                       subtypes  thus include any readable word processor
    560                       format.   A  very  simple  and  portable  subtype,
    561                       richtext, is defined in this document.
    562                  multipart --  data  consisting  of  multiple  parts  of
    563                       independent  data  types.   Four  initial subtypes
    564                       are  defined,  including   the   primary   "mixed"
    565                       subtype,  "alternative"  for representing the same
    566                       data in multiple  formats,  "parallel"  for  parts
    567                       intended to be viewed simultaneously, and "digest"
    568                       for multipart entities in which each  part  is  of
    569                       type "message".
    570                  message  --  an  encapsulated  message.   A   body   of
    571                       Content-Type "message" is itself a fully formatted
    572                       RFC 822 conformant message which may  contain  its
    573                       own  different  Content-Type  header  field.   The
    574                       primary  subtype  is  "rfc822".    The   "partial"
    575                       subtype is defined for partial messages, to permit
    576                       the fragmented transmission  of  bodies  that  are
    577                       thought  to be too large to be passed through mail
    578                       transport    facilities.      Another     subtype,
    579                       "External-body",  is  defined for specifying large
    580                       bodies by reference to an external data source.
    592                  image --  image data.  Image requires a display  device
    593                       (such  as a graphical display, a printer, or a FAX
    594                       machine)  to  view   the   information.    Initial
    595                       subtypes  are  defined  for  two widely-used image
    596                       formats, jpeg and gif.
    597                  audio --  audio data,  with  initial  subtype  "basic".
    598                       Audio  requires  an audio output device (such as a
    599                       speaker or a telephone) to "display" the contents.
    600                  video --  video data.  Video requires the capability to
    601                       display   moving   images,   typically   including
    602                       specialized hardware and  software.   The  initial
    603                       subtype is "mpeg".
    604                  application --  some  other  kind  of  data,  typically
    605                       either uninterpreted binary data or information to
    606                       be processed by  a  mail-based  application.   The
    607                       primary  subtype, "octet-stream", is to be used in
    608                       the case of uninterpreted binary  data,  in  which
    609                       case  the  simplest recommended action is to offer
    610                       to write the information into a file for the user.
    611                       Two  additional  subtypes, "ODA" and "PostScript",
    612                       are defined for transporting  ODA  and  PostScript
    613                       documents  in  bodies.   Other  expected  uses for
    614                       "application"  include  spreadsheets,   data   for
    615                       mail-based  scheduling  systems, and languages for
    616                       "active" (computational) email.  (Note that active
    617                       email  entails  several  security   considerations,
    618                       which  are   discussed   later   in   this   memo,
    619                       particularly      in      the      context      of
    620                       application/PostScript.)
    621 
    622             Default RFC 822 messages are typed by this protocol as plain
    623             text  in the US-ASCII character set, which can be explicitly
    624             specified as "Content-type:  text/plain;  charset=us-ascii".
    625             If  no  Content-Type  is specified, either by error or by an
    626             older user agent, this default is assumed.   In the presence
    627             of  a  MIME-Version header field, a receiving User Agent can
    628             also assume  that  plain  US-ASCII  text  was  the  sender's
    629             intent.   In  the  absence  of a MIME-Version specification,
    630             plain US-ASCII text must still be assumed, but the  sender's
    631             intent might have been otherwise.
    632 
    633             RATIONALE:  In the absence of any Content-Type header  field
    634             or MIME-Version header field, it is impossible to be certain
    635             that a message is actually text in  the  US-ASCII  character
    636             set,  since  it  might  well  be  a  message that, using the
    637             conventions that predate this  document,  includes  text  in
    638             another  character  set or non-textual data in a manner that
    639             cannot  be  automatically  recognized  (e.g.,  a   uuencoded
    640             compressed  UNIX  tar  file).  Although  there  is  no fully
    641             acceptable alternative to treating such untyped messages  as
    642             "text/plain;  charset=us-ascii",  implementors should remain
    643             aware that if a message lacks both the MIME-Version and  the
    644             Content-Type  header  fields,  it  may  in  practice contain
    645             almost anything.
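
The defaulting rule above can be sketched as follows. This is a minimal illustration, not part of the specification: the function name and the use of a plain dict for header fields are assumptions made for the example.

```python
DEFAULT_CONTENT_TYPE = "text/plain; charset=us-ascii"


def effective_content_type(headers: dict) -> tuple:
    """Return (content_type, sender_intent_known) for a message.

    `headers` is a hypothetical mapping of header field names to values.
    Field names are compared without regard to case.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    content_type = lowered.get("content-type")
    if content_type is not None:
        return content_type, True
    # No Content-Type: plain US-ASCII text is assumed either way, but only
    # a MIME-Version field tells us that this was the sender's intent.
    return DEFAULT_CONTENT_TYPE, "mime-version" in lowered
```

A caller could use the second element of the result to decide how cautiously to treat an untyped body, since a message lacking both fields may contain almost anything.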
    646 
    657             It should be noted that  the  list  of  Content-Type  values
    658             given  here  may  be  augmented  in time, via the mechanisms
    659             described above, and that the set of subtypes is expected to
    660             grow substantially.
    661 
    662             When a mail reader encounters mail with an unknown  Content-
    663             type  value,  it  should generally treat it as equivalent to
    664             "application/octet-stream",  as  described  later  in   this
    665             document.
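
As a rough sketch of this fallback, a mail reader might normalize the Content-Type value and compare it against the set of types it can actually handle. The function name and the `supported` parameter are hypothetical; how a real reader records its capabilities is not specified here.

```python
FALLBACK_TYPE = "application/octet-stream"


def resolve_type(content_type: str, supported: set) -> str:
    """Map a Content-Type value to a type/subtype the reader can handle.

    `supported` is an assumed set of lowercase "type/subtype" strings.
    Parameters after the first ";" are ignored in this sketch.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type if media_type in supported else FALLBACK_TYPE
```

For example, a reader that supports only "text/plain" and "image/gif" would resolve "video/mpeg" to "application/octet-stream" and offer to save the body to a file.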
    666 
    667             5    The Content-Transfer-Encoding Header Field
    668 
    669             Many Content-Types which could usefully be  transported  via
    670             email  are  represented, in their "natural" format, as 8-bit
    671             character or binary data.  Such data cannot  be  transmitted
    672             over   some  transport  protocols.   For  example,  RFC  821
    673             restricts mail messages to 7-bit  US-ASCII  data  with  1000
    674             character lines.
    675 
    676             It is necessary, therefore, to define a  standard  mechanism
    677             for  re-encoding  such  data into a 7-bit short-line format.
    678             This  document  specifies  that  such  encodings   will   be
    679             indicated by a new "Content-Transfer-Encoding" header field.
    680             The Content-Transfer-Encoding field is used to indicate  the
    681             type  of  transformation  that  has  been  used  in order to
    682             represent the body in an acceptable manner for transport.
    683 
    684             Unlike Content-Types, a proliferation  of  Content-Transfer-
    685             Encoding  values  is  undesirable and unnecessary.  However,
    686             establishing   only   a   single   Content-Transfer-Encoding
    687             mechanism  does  not  seem  possible.    There is a tradeoff
    688             between the desire for a compact and efficient  encoding  of
    689             largely-binary  data  and the desire for a readable encoding
    690             of data that is mostly, but not entirely, 7-bit  data.   For
    691             this reason, at least two encoding mechanisms are necessary:
    692             a "readable" encoding and a "dense" encoding.
    693 
    694             The Content-Transfer-Encoding field is designed  to  specify
    695             an invertible mapping between the "native" representation of
    696             a type of data and a  representation  that  can  be  readily
    697             exchanged  using  7  bit  mail  transport protocols, such as
    698             those defined by RFC 821 (SMTP). This  field  has  not  been
    699             defined  by  any  previous  standard. The field's value is a
    700             single token specifying the type of encoding, as  enumerated
    701             below.  Formally:
    702 
    703             Content-Transfer-Encoding := "BASE64" / "QUOTED-PRINTABLE" /
    704                                          "8BIT"   / "7BIT" /
    705                                          "BINARY" / x-token
    706 
    707             These values are not case sensitive.  That  is,  Base64  and
    708             BASE64  and  bAsE64 are all equivalent.  An encoding type of
    709             7BIT requires that the body is already in a seven-bit  mail-
    710             ready representation.  This is the default value -- that is,
    722             "Content-Transfer-Encoding:  7BIT"   is   assumed   if   the
    723             Content-Transfer-Encoding header field is not present.
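
The grammar, the case-insensitivity rule, and the 7BIT default can be illustrated with a short sketch. The function name is an assumption, and the x-token check is simplified to a test for the "X-" prefix.

```python
from typing import Optional

STANDARD_ENCODINGS = {"base64", "quoted-printable", "8bit", "7bit", "binary"}


def normalize_encoding(value: Optional[str]) -> str:
    """Return the lowercase encoding token, or "7bit" if the field is absent."""
    if value is None:
        return "7bit"  # default when no Content-Transfer-Encoding is present
    token = value.strip().lower()  # values are not case sensitive
    if token in STANDARD_ENCODINGS:
        return token
    if token.startswith("x-") and len(token) > 2:
        return token  # non-standard x-token
    raise ValueError("not a valid Content-Transfer-Encoding: %r" % value)
```

With this sketch, "Base64", "BASE64", and "bAsE64" all normalize to "base64".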
    724 
    725             The values "8bit", "7bit", and "binary" all  imply  that  NO
    726             encoding  has  been performed. However, they are potentially
    727             useful as indications of the kind of data contained  in  the
    728             object,  and  therefore  of  the kind of encoding that might
    729             need to be performed for transmission in a  given  transport
    730             system.   "7bit"  means  that the data is all represented as
    731             short lines of US-ASCII data.  "8bit" means that  the  lines
    732             are  short,  but  there  may be non-ASCII characters (octets
    733             with the high-order bit set).  "Binary" means that not  only
    734             may non-ASCII characters be present, but also that the lines
    735             are not necessarily short enough for SMTP transport.
    736 
    737             The difference between  "8bit"  (or  any  other  conceivable
    738             bit-width  token)  and  the  "binary" token is that "binary"
    739             does not require adherence to any limits on line  length  or
    740             to  the  SMTP  CRLF semantics, while the bit-width tokens do
    741             require such adherence.  If the body contains  data  in  any
    742             bit-width   other  than  7-bit,  the  appropriate  bit-width
    743             Content-Transfer-Encoding token must be used  (e.g.,  "8bit"
    744             for unencoded 8 bit wide data).  If the body contains binary
    745             data, the "binary" Content-Transfer-Encoding token  must  be
    746             used.
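
One way to choose among the three labels for an unencoded body is sketched below. The 998-octet default is an assumption (the 1000-character RFC 821 line limit minus the CRLF), as is the decision to treat any bare CR or LF as a violation of SMTP CRLF semantics.

```python
def label_unencoded_body(body: bytes, max_line: int = 998) -> str:
    """Return "7bit", "8bit", or "binary" for a body that is not encoded."""
    for line in body.split(b"\r\n"):
        # Over-long lines or stray CR/LF octets break the line discipline
        # that the bit-width tokens promise, so only "binary" fits.
        if len(line) > max_line or b"\r" in line or b"\n" in line:
            return "binary"
    if any(octet > 127 for octet in body):
        return "8bit"  # short lines, but high-order bit set somewhere
    return "7bit"
```

Note that, as the following paragraphs explain, a sender on the Internet of this era would still have to encode anything labeled "8bit" or "binary" before transmission.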
    747 
    748             NOTE:  The distinction between the Content-Transfer-Encoding
    749             values  of  "binary,"  "8bit," etc. may seem unimportant, in
    750             that all of them really mean "none" -- that  is,  there  has
    751             been  no encoding of the data for transport.  However, clear
    752             labeling will be  of  enormous  value  to  gateways  between
    753             future mail transport systems with differing capabilities in
    754             transporting data that do not meet the restrictions  of  RFC
    755             821 transport.
    756 
    757             As of  the  publication  of  this  document,  there  are  no
    758             standardized  Internet transports for which it is legitimate
    759             to include unencoded 8-bit or binary data  in  mail  bodies.
    760             Thus  there  are  no  circumstances  in  which the "8bit" or
    761             "binary" Content-Transfer-Encoding is actually legal on  the
    762             Internet.   However,  in the event that 8-bit or binary mail
    763             transport becomes a reality in Internet mail, or  when  this
    764             document  is  used  in  conjunction  with any other 8-bit or
    765             binary-capable transport mechanism, 8-bit or  binary  bodies
    766             should be labeled as such using this mechanism.
    767 
    768             NOTE:  The five values  defined  for  the  Content-Transfer-
    769             Encoding  field  imply  nothing about the Content-Type other
    770             than the algorithm by which it was encoded or the  transport
    771             system requirements if unencoded.
    772 
    773             Implementors  may,  if  necessary,   define   new   Content-
    774             Transfer-Encoding  values, but must use an x-token, which is
    775             a name prefixed by "X-" to indicate its non-standard status,
    787             e.g.,    "Content-Transfer-Encoding:     x-my-new-encoding".
    788             However, unlike Content-Types and subtypes, the creation  of
    789             new   Content-Transfer-Encoding  values  is  explicitly  and
    790             strongly  discouraged,  as  it  seems   likely   to   hinder
    791             interoperability  with  little potential benefit.  Their use
    792             is allowed only  as  the  result  of  an  agreement  between
    793             cooperating user agents.
    794 
    795             If a Content-Transfer-Encoding header field appears as  part
    796             of  a  message header, it applies to the entire body of that
    797             message.   If  a  Content-Transfer-Encoding   header   field
    798             appears as part of a body part's headers, it applies only to
    799             the body of that  body  part.   If  an  entity  is  of  type
    800             "multipart"  or  "message", the Content-Transfer-Encoding is
    801             not permitted to have any  value  other  than  a  bit  width
    802             (e.g., "7bit", "8bit", etc.) or "binary".
    803 
    804             It should be noted that email is character-oriented, so that
    805             the  mechanisms  described  here are mechanisms for encoding
    806             arbitrary byte streams, not bit streams.  If a bit stream is
    807             to  be encoded via one of these mechanisms, it must first be
    808             converted to an 8-bit byte stream using the network standard
    809             bit  order  ("big-endian"),  in  which the earlier bits in a
    810             stream become the higher-order bits in a byte.  A bit stream
    811             not  ending at an 8-bit boundary must be padded with zeroes.
    812             This document provides a mechanism for noting  the  addition
    813             of such padding in the case of the application Content-Type,
    814             which has a "padding" parameter.
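
The bit-to-byte conversion described above can be sketched as follows. Representing the bit stream as a string of "0" and "1" characters is an assumption made for readability; the returned padding count is the kind of value that the "padding" parameter is meant to convey.

```python
def pack_bits(bits: str) -> tuple:
    """Pack a string of '0'/'1' characters into bytes, big-endian.

    Earlier bits become the higher-order bits of each byte. Returns
    (data, padding), where padding is the number of zero bits appended.
    """
    padding = (-len(bits)) % 8
    padded = bits + "0" * padding
    data = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return data, padding
```

For instance, the three-bit stream "101" becomes the single byte 0xA0 with five bits of padding.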
    815 
    816             The encoding mechanisms defined here explicitly  encode  all
    817             data  in  ASCII.   Thus,  for example, suppose an entity has
    818             header fields such as:
    819 
    820                  Content-Type: text/plain; charset=ISO-8859-1
    821                  Content-transfer-encoding: base64
    822 
    823             This should be interpreted to mean that the body is a base64
    824             ASCII  encoding  of  data that was originally in ISO-8859-1,
    825             and will be in that character set again after decoding.
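
A receiver therefore undoes the two layers in order: first the transfer encoding, then the character set. The sketch below uses Python's standard base64 module; the function name is an assumption.

```python
import base64


def decode_base64_text(body: bytes, charset: str) -> str:
    """Undo the base64 transfer encoding, then interpret the octets in `charset`."""
    raw_octets = base64.b64decode(body)  # ASCII characters back to original octets
    return raw_octets.decode(charset)    # octets to characters in the labeled set
```

For example, the body "Y2Fm6Q==" labeled with charset=ISO-8859-1 decodes to the four characters "café".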
    826 
    827             The following sections will define the two standard encoding
    828             mechanisms.    The   definition   of  new  content-transfer-
    829             encodings is explicitly discouraged and  should  only  occur
    830             when  absolutely  necessary.   All content-transfer-encoding
    831             namespace except that  beginning  with  "X-"  is  explicitly
    832             reserved  to  the  IANA  for future use.  Private agreements
    833             about   content-transfer-encodings   are   also   explicitly
    834             discouraged.
    835 
    836             Certain Content-Transfer-Encoding values may only be used on
    837             certain  Content-Types.   In  particular,  it  is  expressly
    838             forbidden to use any encodings other than "7bit", "8bit", or
    839             "binary"  with  any  Content-Type  that recursively includes
    840             other Content-Type  fields,   notably  the  "multipart"  and
    852             "message" Content-Types.  All encodings that are desired for
    853             bodies of type multipart or message  must  be  done  at  the
    854             innermost  level,  by encoding the actual body that needs to
    855             be encoded.
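
A composing or validating agent could enforce this restriction with a check along the following lines. The function name is hypothetical, and the sketch inspects only the top-level type.

```python
COMPOSITE_TYPES = {"multipart", "message"}
IDENTITY_ENCODINGS = {"7bit", "8bit", "binary"}


def encoding_allowed(content_type: str, encoding: str = "7bit") -> bool:
    """Return False if a multipart or message entity carries a real encoding."""
    top_level = content_type.split("/", 1)[0].strip().lower()
    if top_level in COMPOSITE_TYPES:
        return encoding.strip().lower() in IDENTITY_ENCODINGS
    return True
```

An agent that needs to send a multipart entity containing binary data would leave the multipart itself at "7bit" and apply base64 to the innermost body part only.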
    856 
    857             NOTE  ON  ENCODING  RESTRICTIONS:   Though  the  prohibition
    858             against  using  content-transfer-encodings  on  data of type
    859             multipart or message may  seem  overly  restrictive,  it  is
    860             necessary  to  prevent  nested  encodings, in which data are
    861             passed through an encoding  algorithm  multiple  times,  and
    862             must  be  decoded  multiple  times  in  order to be properly
    863             viewed.  Nested encodings  add  considerable  complexity  to
    864             user  agents:   aside  from  the obvious efficiency problems
    865             with such multiple encodings, they  can  obscure  the  basic
    866             structure  of a message.  In particular, they can imply that
    867             several decoding operations are necessary simply to find out
    868             what  types  of  objects a message contains.  Banning nested
    869             encodings may complicate the job of certain  mail  gateways,
    870             but  this  seems less of a problem than the effect of nested
    871             encodings on user agents.
    872 
    873             NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE  AND  CONTENT-
    874             TRANSFER-ENCODING:   It  may seem that the Content-Transfer-
    875             Encoding could be inferred from the characteristics  of  the
    876             Content-Type  that  is to be encoded, or, at the very least,
    877             that certain Content-Transfer-Encodings  could  be  mandated
    878             for  use  with  specific  Content-Types.  There  are several
    879             reasons why this is not the case. First, given  the  varying
    880             types  of  transports  used  for mail, some encodings may be
    881             appropriate for some Content-Type/transport combinations and
    882             not  for  others.  (For  example, in an  8-bit transport, no
    883             encoding would be required for  text  in  certain  character
    884             sets,  while  such  encodings are clearly required for 7-bit
    885             SMTP.)  Second, certain Content-Types may require  different
    886             types  of  transfer  encoding under different circumstances.
    887             For example, many PostScript bodies might  consist  entirely
    888             of  short lines of 7-bit data and hence require little or no
    889             encoding. Other PostScript bodies  (especially  those  using
    890             Level  2 PostScript's binary encoding mechanism) may only be
    891             reasonably represented using a  binary  transport  encoding.
    892             Finally,  since Content-Type is intended to be an open-ended
    893             specification  mechanism,   strict   specification   of   an
    894             association  between Content-Types and encodings effectively
    895             couples the specification of an application protocol with  a
    896             specific  lower-level transport. This is not desirable since
    897             the developers of a Content-Type should not have to be aware
    898             of all the transports in use and what their limitations are.
    899 
    900             NOTE ON TRANSLATING  ENCODINGS:   The  quoted-printable  and
    901             base64  encodings  are  designed  so that conversion between
    902             them is possible. The only  issue  that  arises  in  such  a
    903             conversion  is  the handling of line breaks. When converting
    904             from  quoted-printable  to  base64  a  line  break  must  be
    905             converted  into  a CRLF sequence. Similarly, a CRLF sequence
    917             in base64 data should be  converted  to  a  quoted-printable
    918             line break, but ONLY when converting text data.
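
A conversion from quoted-printable to base64 for text data might look like the sketch below, which relies on Python's standard quopri and base64 modules. It assumes the body is text, so every line break that survives decoding is rewritten as CRLF before re-encoding; that normalization step is the example's own choice and would be wrong for non-text data.

```python
import base64
import quopri
import re


def qp_text_to_base64(qp_body: bytes) -> bytes:
    """Re-encode a quoted-printable TEXT body as base64."""
    decoded = quopri.decodestring(qp_body)  # removes "=XX" escapes and soft breaks
    # Text only: make every remaining line break a CRLF sequence.
    canonical = re.sub(rb"\r?\n", b"\r\n", decoded)
    return base64.encodebytes(canonical)
```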
    919 
    920             NOTE  ON  CANONICAL  ENCODING  MODEL:     There   was   some
    921             confusion,  in  earlier  drafts  of this memo, regarding the
    922             model for when email data was to be converted  to  canonical
    923             form  and  encoded, and in particular how this process would
    924             affect the treatment of CRLFs, given that the representation
    925             of  newlines  varies greatly from system to system. For this
    926             reason, a canonical  model  for  encoding  is  presented  as
    927             Appendix H.
    928 
    929             5.1  Quoted-Printable Content-Transfer-Encoding
    930 
    931             The Quoted-Printable encoding is intended to represent  data
    932             that largely consists of octets that correspond to printable
    933             characters in the ASCII character set.  It encodes the  data
    934             in  such  a way that the resulting octets are unlikely to be
    935             modified by mail transport.  If the data being  encoded  are
    936             mostly  ASCII  text,  the  encoded  form of the data remains
    937             largely recognizable by humans.  A body  which  is  entirely
    938             ASCII  may also be encoded in Quoted-Printable to ensure the
    939             integrity of the data should  the  message  pass  through  a
    940             character-translating and/or line-wrapping gateway.
    941 
    942             In this encoding, octets are to be represented as determined
    943             by the following rules:
    944 
    945                  Rule #1:  (General  8-bit  representation)  Any  octet,
    946                  except  those  indicating a line break according to the
    947                  newline convention of the canonical form  of  the  data
    948                  being encoded, may be represented by an "=" followed by
    949                  a two digit hexadecimal representation of  the  octet's
    950                  value. The digits of the hexadecimal alphabet, for this
    951                  purpose, are "0123456789ABCDEF". Uppercase letters must
    952                  be used when sending  hexadecimal  data,  though  a  robust
    954                  implementation   may   choose  to  recognize  lowercase
    955                  letters on receipt. Thus, for  example,  the  value  12
    956                  (ASCII  form feed) can be represented by "=0C", and the
    957                  value 61 (ASCII  EQUAL  SIGN)  can  be  represented  by
    958                  "=3D".   Except  when  the  following  rules  allow  an
    959                  alternative encoding, this rule is mandatory.
    960 
    961                  Rule #2: (Literal representation) Octets  with  decimal
    962                  values  of 33 through 60 inclusive, and 62 through 126,
    963                  inclusive, MAY be represented as the  ASCII  characters
    964                  which  correspond  to  those  octets (EXCLAMATION POINT
    965                  through LESS THAN,  and  GREATER  THAN  through  TILDE,
    966                  respectively).
    967 
    968                  Rule #3: (White Space): Octets with values of 9 and  32
    969                  MAY   be  represented  as  ASCII  TAB  (HT)  and  SPACE
    970                  characters,  respectively,   but   MUST   NOT   be   so
    982                  represented at the end of an encoded line. Any TAB (HT)
    983                  or SPACE characters on an encoded  line  MUST  thus  be
    984                  followed  on  that  line  by a printable character.  In
    985                  particular, an "=" at  the  end  of  an  encoded  line,
    986                  indicating  a  soft line break (see rule #5) may follow
    987                  one or more TAB (HT) or SPACE characters.   It  follows
    988                  that  an  octet with value 9 or 32 appearing at the end
    989                  of an encoded line must  be  represented  according  to
    990                  Rule  #1.  This  rule  is  necessary  because some MTAs
    991                  (Message Transport  Agents,  programs  which  transport
    992                  messages from one user to another, or perform a part of
    993                  such transfers) are known to pad  lines  of  text  with
    994                  SPACEs,  and  others  are known to remove "white space"
    995                  characters from the end  of  a  line.  Therefore,  when
    996                  decoding  a  Quoted-Printable  body, any trailing white
    997                  space on a line must be deleted, as it will necessarily
    998                  have been added by intermediate transport agents.
    999 
   1000                  Rule #4 (Line Breaks): A line  break  in  a  text  body
   1001                  part,   independent   of  what  its  representation  is
   1002                  following the  canonical  representation  of  the  data
   1003                  being  encoded, must be represented by a (RFC 822) line
   1004                  break,  which  is  a  CRLF  sequence,  in  the  Quoted-
   1005                  Printable  encoding.  If isolated CRs and LFs, or LF CR
   1006                  and CR LF sequences are allowed  to  appear  in  binary
   1007                  data  according  to  the  canonical  form, they must be
   1008                  represented   using  the  "=0D",  "=0A",  "=0A=0D"  and
   1009                  "=0D=0A" notations respectively.
   1010 
   1011                  Note that many implementations may elect to encode  the
   1012                  local representation of various content types directly.
   1013                  In particular, this may apply to plain text material on
   1014                  systems  that  use  newline conventions other than CRLF
   1015                  delimiters. Such an implementation is permissible,  but
   1016                  the  generation  of  line breaks must be generalized to
   1017                  account for the case where alternate representations of
   1018                  newline sequences are used.
   1019 
   1020                  Rule  #5  (Soft  Line  Breaks):  The   Quoted-Printable
   1021                  encoding REQUIRES that encoded lines be no more than 76
   1022                  characters long. If longer lines are to be encoded with
   1023                  the  Quoted-Printable encoding, 'soft' line breaks must
   1024                  be used. An equal sign  as  the  last  character  on an
   1025                  encoded  line indicates such a non-significant ('soft')
   1026                  line break in the encoded text. Thus if the "raw"  form
   1027                  of the line is a single unencoded line that says:
   1028 
   1029                       Now's the time for all folk to come to the aid of
   1030                       their country.
   1031 
   1032                  This  can  be  represented,  in  the   Quoted-Printable
   1033                  encoding, as
   1034 
   1047                       Now's the time =
   1048                       for all folk to come=
   1049                        to the aid of their country.
   1050 
   1051                  This provides a mechanism with  which  long  lines  are
   1052                  encoded  in  such  a  way as to be restored by the user
   1053                  agent.  The 76  character  limit  does  not  count  the
   1054                  trailing   CRLF,   but  counts  all  other  characters,
   1055                  including any equal signs.
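
The five rules can be combined into a small encoder. The following is a minimal sketch, assuming the input is already in canonical form with CRLF line breaks; the function name and the `max_len` parameter are illustrative. It places soft line breaks greedily, so its output may differ from that of other conforming encoders.

```python
def qp_encode(data: bytes, max_len: int = 76) -> bytes:
    """Quoted-Printable encode canonical (CRLF-delimited) data."""
    encoded_lines = []
    for line in data.split(b"\r\n"):  # Rule #4: CRLF stays a line break
        tokens = []
        last = len(line) - 1
        for index, octet in enumerate(line):
            literal = 33 <= octet <= 60 or 62 <= octet <= 126     # Rule #2
            inner_space = octet in (9, 32) and index != last       # Rule #3
            if literal or inner_space:
                tokens.append(bytes([octet]))
            else:
                tokens.append(b"=%02X" % octet)                    # Rule #1
        current = b""
        for token in tokens:
            # Rule #5: leave room for the "=" that marks a soft line break.
            if len(current) + len(token) > max_len - 1:
                encoded_lines.append(current + b"=")
                current = b""
            current += token
        encoded_lines.append(current)
    return b"\r\n".join(encoded_lines)
```

Isolated CR and LF octets fall through to Rule #1 and come out as "=0D" and "=0A", and a TAB or SPACE at the end of an input line is written as "=09" or "=20" so that no encoded line ends in white space.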
   1056 
   1057             Since the hyphen character ("-") is represented as itself in
   1058             the  Quoted-Printable  encoding,  care  must  be taken, when
   1059             encapsulating a quoted-printable encoded body in a multipart
   1060             entity,  to  ensure that the encapsulation boundary does not
   1061             appear anywhere in the encoded body.  (A good strategy is to
   1062             choose a boundary that includes a character sequence such as
   1063             "=_" which can never appear in a quoted-printable body.  See
   1064             the   definition   of   multipart  messages  later  in  this
   1065             document.)
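
The "=_" strategy works because, in a quoted-printable body, "=" is followed only by two hexadecimal digits or by the end of the line, and "_" is neither. A boundary generator along these lines is sketched below; the function name and the use of 16 random bytes are assumptions.

```python
import secrets


def make_boundary() -> str:
    """Return a boundary string that cannot occur in quoted-printable data."""
    # "=_" is impossible in a quoted-printable body; the random suffix
    # makes collisions with other kinds of body content unlikely.
    return "=_" + secrets.token_hex(16)
```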
   1066 
   1067             NOTE:  The quoted-printable encoding represents something of
   1068             a   compromise   between   readability  and  reliability  in
   1069             transport.   Bodies  encoded   with   the   quoted-printable
   1070             encoding will work reliably over most mail gateways, but may
   1071             not work  perfectly  over  a  few  gateways,  notably  those
   1072             involving  translation  into  EBCDIC.  (In theory, an EBCDIC
   1073             gateway could decode a quoted-printable body  and  re-encode
   1074             it  using  base64,  but  such gateways do not yet exist.)  A
   1075             higher  level  of  confidence  is  offered  by  the   base64
   1076             Content-Transfer-Encoding.  A way to get reasonably reliable
   1077             transport through EBCDIC gateways is to also quote the ASCII
   1078             characters
   1079 
   1080                  !"#$@[\]^`{|}~
   1081 
   1082             according to rule #1.  See Appendix B for more information.
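
In code, this amounts to shrinking the set of octets that may be written literally. The sketch below shows the per-octet decision only; the names are hypothetical, and white space and line-break handling are omitted.

```python
EBCDIC_UNSAFE = b'!"#$@[\\]^`{|}~'


def qp_token_ebcdic_safe(octet: int) -> bytes:
    """Encode one non-whitespace octet, quoting EBCDIC-unsafe characters."""
    literal = (33 <= octet <= 60 or 62 <= octet <= 126) and octet not in EBCDIC_UNSAFE
    return bytes([octet]) if literal else b"=%02X" % octet
```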
   1083 
   1084             Because quoted-printable data is  generally  assumed  to  be
   1085             line-oriented,  it is to be expected that the breaks between
   1086             the lines  of  quoted-printable  data  may  be  altered  in
   1087             transport,  in  the  same  manner  that  plain text mail has
   1088             always been altered in Internet mail  when  passing  between
   1089             systems   with   differing  newline  conventions.   If  such
   1090             alterations are likely to constitute  a  corruption  of  the
   1091             data,  it  is  probably  more  sensible  to  use  the base64
   1092             encoding rather than the quoted-printable encoding.
   1093 
   1112             5.2  Base64 Content-Transfer-Encoding
   1113 
   1114             The  Base64   Content-Transfer-Encoding   is   designed   to
   1115             represent arbitrary sequences of octets in a form that need
   1116             not be humanly readable.  The encoding and decoding
   1117             algorithms are simple, and the encoded data are consistently
   1118             only about 33 percent larger than the unencoded data.  This
   1119             encoding is based on the one used in Privacy Enhanced Mail
   1120             applications, as defined in RFC 1113.  The base64 encoding
   1121             is adapted from RFC 1113, with one change:  base64
   1122             eliminates the "*" mechanism for embedded clear text.
   1123 
   1124             A 65-character subset of US-ASCII is used, enabling  6  bits
   1125             to  be  represented per printable character. (The extra 65th
   1126             character, "=", is used  to  signify  a  special  processing
   1127             function.)
   1128 
   1129             NOTE:  This subset has the important  property  that  it  is
   1130             represented   identically   in  all  versions  of  ISO  646,
   1131             including US ASCII, and all characters  in  the  subset  are
   1132             also  represented  identically  in  all  versions of EBCDIC.
   1133             Other popular encodings, such as the encoding  used  by  the
   1134             UUENCODE  utility  and the base85 encoding specified as part
   1135             of Level 2 PostScript, do not share  these  properties,  and
   1136             thus  do  not  fulfill the portability requirements a binary
   1137             transport encoding for mail must meet.
   1138 
   1139             The encoding process represents 24-bit groups of input  bits
   1140             as  output  strings of 4 encoded characters. Proceeding from
   1141             left  to  right,  a  24-bit  input  group   is   formed   by
   1142             concatenating  3  8-bit input groups. These 24 bits are then
   1143             treated as 4 concatenated 6-bit groups,  each  of  which  is
   1144             translated  into a single digit in the base64 alphabet. When
   1145             encoding a bit stream  via  the  base64  encoding,  the  bit
   1146             stream  must  be  presumed  to  be  ordered  with  the most-
   1147             significant-bit first.  That is, the first bit in the stream
   1148             will be the high-order bit in the first byte, and the eighth
   1149             bit will be the low-order bit in the first byte, and so on.
   1150 
   1151             Each 6-bit group is used as an index into  an  array  of  64
   1152             printable  characters. The character referenced by the index
   1153             is placed in the output string. These characters, identified
   1154             in  Table  1,  below,  are  selected so as to be universally
   1155             representable,  and  the  set   excludes   characters   with
   1156             particular  significance to SMTP (e.g., ".", "CR", "LF") and
   1157             to the encapsulation boundaries  defined  in  this  document
   1158             (e.g., "-").
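
The grouping step described above might look like the sketch below.
The names BASE64_ALPHABET and encode_group are illustrative, the
alphabet string simply spells out Table 1 in index order, and only a
complete 3-octet group is handled here.

```python
# The 64 encoding characters of Table 1, in index order.
BASE64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(octets: bytes) -> str:
    """Encode exactly three octets as four base64 characters."""
    if len(octets) != 3:
        raise ValueError("expected exactly 3 octets")
    # Concatenate the three octets, most significant bit first.
    group = (octets[0] << 16) | (octets[1] << 8) | octets[2]
    # Split the 24 bits into four 6-bit indexes, left to right.
    return "".join(
        BASE64_ALPHABET[(group >> shift) & 0x3F] for shift in (18, 12, 6, 0)
    )
```

With this sketch, the three octets of "Man" encode as "TWFu".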
   1159 
   1177                             Table 1: The Base64 Alphabet
   1178 
   1179                Value Encoding  Value Encoding  Value Encoding  Value Encoding
   1181                    0 A            17 R            34 i            51 z
   1182                    1 B            18 S            35 j            52 0
   1183                    2 C            19 T            36 k            53 1
   1184                    3 D            20 U            37 l            54 2
   1185                    4 E            21 V            38 m            55 3
   1186                    5 F            22 W            39 n            56 4
   1187                    6 G            23 X            40 o            57 5
   1188                    7 H            24 Y            41 p            58 6
   1189                    8 I            25 Z            42 q            59 7
   1190                    9 J            26 a            43 r            60 8
   1191                   10 K            27 b            44 s            61 9
   1192                   11 L            28 c            45 t            62 +
   1193                   12 M            29 d            46 u            63 /
   1194                   13 N            30 e            47 v
   1195                   14 O            31 f            48 w         (pad) =
   1196                   15 P            32 g            49 x
   1197                   16 Q            33 h            50 y
   1198 
   1199             The output stream (encoded bytes)  must  be  represented  in
   1200             lines  of  no more than 76 characters each.  All line breaks
   1201             or other characters not found in Table 1 must be ignored  by
   1202             decoding  software.   In  base64 data, characters other than
   1203             those in  Table  1,  line  breaks,  and  other  white  space
   1204             probably  indicate  a  transmission  error,  about  which  a
   1205             warning  message  or  even  a  message  rejection  might  be
   1206             appropriate under some circumstances.
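
As a rough sketch of the two rules above, an encoder can break its
output into lines of at most 76 characters, and a decoder can discard
anything outside Table 1 before decoding.  The helper names are
hypothetical, and the silent discard is one possible policy; a real
decoder might instead warn about unexpected characters.

```python
import base64

# Table 1 plus the "=" pad character.
BASE64_CHARS = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/="
)


def wrap_76(encoded: str) -> str:
    """Break encoded output into CRLF-separated lines of <= 76 chars."""
    return "\r\n".join(encoded[i:i + 76] for i in range(0, len(encoded), 76))


def lenient_decode(body: str) -> bytes:
    """Drop line breaks and any other non-alphabet characters, then decode."""
    kept = "".join(ch for ch in body if ch in BASE64_CHARS)
    return base64.b64decode(kept)
```

If the discarded characters were in fact corrupted data, the remaining
text may no longer be a whole number of 4-character units, in which
case base64.b64decode raises an error.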
   1207 
   1208             Special processing is performed if fewer than  24  bits  are
   1209             available  at  the  end  of  the data being encoded.  A full
   1210             encoding quantum is always completed at the end of  a  body.
   1211             When  fewer  than  24  input  bits are available in an input
   1212             group, zero bits  are  added  (on  the  right)  to  form  an
   1213             integral number of 6-bit groups.  Output character positions
   1214             which are not required to represent actual  input  data  are
   1215             set  to  the  character  "=".   Since all base64 input is an
   1216             integral number of octets,  only  the  following  cases  can
   1217             arise:  (1)  the  final  quantum  of  encoding  input  is an
   1218             integral multiple of  24  bits;  here,  the  final  unit  of
   1219             encoded  output will be an integral multiple of 4 characters
   1220             with no "=" padding, (2) the final quantum of encoding input
   1221             is  exactly  8  bits; here, the final unit of encoded output
   1222             will  be  two  characters  followed  by  two   "="   padding
   1223             characters,  or  (3)  the final quantum of encoding input is
   1224             exactly 16 bits; here, the final unit of encoded output will
   1225             be three characters followed by one "=" padding character.
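
The three end-of-data cases can be illustrated with a small encoder.
This is a sketch under the rules stated above, not a reference
implementation; encode_base64 is an illustrative name and the output
is not broken into lines.

```python
BASE64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_base64(data: bytes) -> str:
    """Encode octets as base64, padding the final quantum with "="."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Add zero bits on the right if fewer than 24 bits are available.
        group = int.from_bytes(chunk + b"\x00" * (3 - len(chunk)), "big")
        chars = [BASE64_ALPHABET[(group >> s) & 0x3F] for s in (18, 12, 6, 0)]
        # 3 octets -> 4 chars, 2 octets -> 3 chars + "=", 1 octet -> 2 chars + "==".
        keep = len(chunk) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)
```

Here "Man" encodes as "TWFu", "Ma" as "TWE=", and "M" as "TQ==".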
   1226 
   1227             Care must be taken to use the proper octets for line  breaks
   1228             if base64 encoding is applied directly to text material that
   1229             has not been converted to  canonical  form.  In  particular,
   1230             text  line  breaks  should  be converted into CRLF sequences
   1242             prior to base64 encoding. The important  thing  to  note  is
   1243             that this may be done directly by the encoder rather than in
   1244             a prior canonicalization step in some implementations.
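
One way an encoder could fold the line-break conversion into the
encoding step is sketched below.  The function names are hypothetical,
the text is assumed to be US-ASCII, and treating a bare CR as a line
break is an assumption about the local newline convention.

```python
import base64
import re


def canonicalize_line_breaks(text: str) -> str:
    """Rewrite LF, CR, and CRLF line breaks as CRLF."""
    # "\r\n" is listed first so an existing CRLF is matched as one unit.
    return re.sub(r"\r\n|\r|\n", "\r\n", text)


def encode_text(text: str) -> bytes:
    """Canonicalize line breaks, then base64-encode the result."""
    canonical = canonicalize_line_breaks(text)
    return base64.b64encode(canonical.encode("us-ascii"))
```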
   1245 
   1246             NOTE: There is no  need  to  worry  about  quoting  apparent
   1247             encapsulation  boundaries  within  base64-encoded  parts  of
   1248             multipart entities because no hyphen characters are used  in
   1249             the base64 encoding.
   1250 
   1251             6    Additional Optional Content- Header Fields
   1252 
   1253             6.1  Optional Content-ID Header Field
   1254 
   1255             In constructing a high-level user agent, it may be desirable
   1256             to   allow   one   body   to   make  reference  to  another.
   1257             Accordingly, bodies may be labeled  using  the  "Content-ID"
   1258             header  field,  which  is  syntactically  identical  to  the
   1259             "Message-ID" header field:
   1260 
   1261             Content-ID := msg-id
   1262 
   1263             Like  the  Message-ID  values,  Content-ID  values  must  be
   1264             generated to be as unique as possible.
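
As an illustration, Python's standard library can generate a value
with msg-id syntax.  The wrapper name, the "part" tag, and the
"example.org" domain below are placeholders chosen for this sketch.

```python
from email.utils import make_msgid


def new_content_id(domain: str = "example.org") -> str:
    """Return a fresh msg-id style value such as "<...part@example.org>"."""
    # make_msgid combines a timestamp, the process id, and random bits.
    return make_msgid(idstring="part", domain=domain)


header_line = "Content-ID: " + new_content_id()
```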
   1265 
   1266             6.2  Optional Content-Description Header Field
   1267 
   1268             The ability to associate some descriptive information with a
   1269             given body is often desirable. For example, it may be useful
   1270             to mark an "image" body as "a picture of the  Space  Shuttle
   1271             Endeavour."   Such   text   may   be   placed   in   the
   1272             Content-Description header field.
   1273 
   1274             Content-Description := *text
   1275 
   1276             The description is presumed to  be  given  in  the  US-ASCII
   1277             character  set,  although  the  mechanism specified in [RFC-
   1278             1342]  may  be  used  for  non-US-ASCII  Content-Description
   1279             values.
   1280 
   1307             7    The Predefined Content-Type Values
   1308 
   1309             This document defines seven initial Content-Type values  and
   1310             an  extension  mechanism  for private or experimental types.
   1311             Further standard types must  be  defined  by  new  published
   1312             specifications.   It is expected that most innovation in new
   1313             types of mail will take place as subtypes of the seven types
   1314             defined  here.   The  most  essential characteristics of the
   1315             seven content-types are summarized in Appendix G.
   1316 
   1317             7.1  The Text Content-Type
   1318 
   1319             The text Content-Type is intended for sending material which
   1320             is  principally textual in form.  It is the default Content-
   1321             Type.  A "charset" parameter may be  used  to  indicate  the
   1322             character set of the body text.  The primary subtype of text
   1323             is "plain".  This indicates plain (unformatted)  text.   The
   1324             default  Content-Type  for  Internet  mail  is  "text/plain;
   1325             charset=us-ascii".
   1326 
   1327             Beyond plain text, there are many formats  for  representing
   1328             what might be known as "extended text" -- text with embedded
   1329             formatting and  presentation  information.   An  interesting
   1330             characteristic of many such representations is that they are
   1331             to some extent  readable  even  without  the  software  that
   1332             interprets  them.   It is useful, then, to distinguish them,
   1333             at the highest level, from such unreadable data  as  images,
   1334             audio,  or  text  represented in an unreadable form.  In the
   1335             absence  of  appropriate  interpretation  software,  it   is
   1336             reasonable to show subtypes of text to the user, while it is
   1337             not reasonable to do so with most nontextual data.
   1338 
   1339             Such formatted textual  data  should  be  represented  using
   1340             subtypes  of text.  Plausible subtypes of text are typically
   1341             given by the common name of the representation format, e.g.,
   1342             "text/richtext".
   1343 
   1344             7.1.1     The charset parameter
   1345 
   1346             A critical parameter that may be specified in  the  Content-
   1347             Type  field  for  text  data  is the character set.  This is
   1348             specified with a "charset" parameter, as in:
   1349 
   1350                  Content-type: text/plain; charset=us-ascii
   1351 
   1352             Unlike some  other  parameter  values,  the  values  of  the
   1353             charset  parameter  are  NOT  case  sensitive.   The default
   1354             character set, which must be assumed in  the  absence  of  a
   1355             charset parameter, is US-ASCII.
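
A receiving agent might apply these two rules (case-insensitive
values, US-ASCII default) roughly as follows.  The sketch uses
Python's standard email package; body_charset is an illustrative name,
and returning the value in lower case is simply how that library
normalizes it.

```python
from email import message_from_string


def body_charset(raw_message: str) -> str:
    """Return the charset of a message, lowercased, defaulting to us-ascii."""
    msg = message_from_string(raw_message)
    # get_content_charset() lowercases the value; failobj is returned
    # when there is no Content-Type header or no charset parameter.
    return msg.get_content_charset(failobj="us-ascii")
```

Given the header "Content-type: text/plain; charset=ISO-8859-1", the
function returns "iso-8859-1"; with no Content-Type header at all it
returns "us-ascii".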
   1356 
   1357             An initial list of predefined character  set  names  can  be
   1358             found at the end of this section.  Additional character sets
   1359             may be registered with IANA  as  described  in  Appendix  F,
   1360             although the standardization of their use requires the usual
   1372             IAB  review  and  approval.  Note  that  if  the   specified
   1373             character  set  includes  8-bit  data,  a  Content-Transfer-
   1374             Encoding header field and a corresponding  encoding  on  the
   1375             data  are  required  in  order to transmit the body via some
   1376             mail transfer protocols, such as SMTP.
   1377 
   1378             The default character set, US-ASCII, has been the subject of
   1379             some  confusion  and  ambiguity  in the past.  Not only were
   1380             there some ambiguities in the definition,  there  have  been
   1381             wide  variations  in  practice.   In order to eliminate such
   1382             ambiguity and variations  in  the  future,  it  is  strongly
   1383             recommended  that  new  user  agents  explicitly  specify  a
   1384             character set via the Content-Type header field.  "US-ASCII"
   1385             does not indicate an arbitrary seven-bit character code, but
   1386             specifies that the body uses character coding that uses  the
   1387             exact  correspondence  of  codes  to characters specified in
   1388             ASCII.  National use variations of ISO 646 [ISO-646] are NOT
   1389             ASCII   and   their  use  in  Internet  mail  is  explicitly
   1390             discouraged. The omission of the ISO 646  character  set  is
   1391             deliberate  in  this regard.  The character set name of "US-
   1392             ASCII" explicitly refers  to ANSI X3.4-1986 [US-ASCII] only.
   1393             The  character  set name "ASCII" is reserved and must not be
   1394             used for any purpose.
   1395 
   1396             NOTE: RFC 821 explicitly specifies "ASCII",  and  references
   1397             an earlier version of the American Standard.  Insofar as one
   1398             of the purposes of specifying a Content-Type  and  character
   1399             set is to permit the receiver to unambiguously determine how
   1400             the sender intended the coded  message  to  be  interpreted,
   1401             assuming  anything  other than "strict ASCII" as the default
   1402             would risk unintentional and  incompatible  changes  to  the
   1403             semantics  of  messages  now being transmitted.    This also
   1404             implies that messages containing characters coded  according
   1405             to  national  variations on ISO 646, or using code-switching
   1406             procedures (e.g., those of ISO 2022), as well  as  8-bit  or
   1407             multiple   octet character encodings MUST use an appropriate
   1408             character set  specification  to  be  consistent  with  this
   1409             specification.
   1410 
   1411             The complete US-ASCII character set is listed in [US-ASCII].
   1412             Note  that  the control characters including DEL (0-31, 127)
   1413             have no defined meaning  apart  from  the  combination  CRLF
   1414             (ASCII  values 13 and 10) indicating a new line.  Two of the
   1415             characters have de facto meanings in wide use: FF (12) often
   1416             means  "start  subsequent  text  on  the  beginning of a new
   1417             page"; and TAB or HT (9) often  (though  not  always)  means
   1418             "move  the  cursor  to  the  next available column after the
   1419             current position where the column number is a multiple of  8
   1420             (counting  the  first column as column 0)." Apart from this,
   1421             any use of the control characters or DEL in a body  must  be
   1422             part   of   a  private  agreement  between  the  sender  and
   1423             recipient.  Such  private  agreements  are  discouraged  and
   1424             should  be  replaced  by  the  other  capabilities  of  this
   1425             document.
   1426 
   1437             NOTE:   Beyond  US-ASCII,  an  enormous   proliferation   of
   1438             character  sets  is  possible. It is the opinion of the IETF
   1439             working group that a large number of character sets is NOT a
   1440             good  thing.   We would prefer to specify a single character
   1441             set that can be used universally for representing all of the
   1442             world's   languages   in  electronic  mail.   Unfortunately,
   1443             existing practice in several communities seems to  point  to
   1444             the  continued  use  of  multiple character sets in the near
   1445             future.  For this reason, we define names for a small number
   1446             of  character  sets  for  which  a  strong  constituent base
   1447             exists.    It is our hope  that  ISO  10646  or  some  other
   1448             effort  will  eventually define a single world character set
   1449             which can then be specified for use in Internet mail, but in
   1450             advance  of  that definition we cannot specify the use of
   1451             ISO  10646,  Unicode,  or  any  other  character  set  whose
   1452             definition is, as of this writing, incomplete.
   1453 
   1454             The defined charset values are:
   1455 
   1456                  US-ASCII -- as defined in [US-ASCII].
   1457 
   1458                  ISO-8859-X -- where "X"  is  to  be  replaced,  as
   1459                       necessary,  for  the  parts of ISO-8859 [ISO-
   1460                       8859].  Note that the ISO 646 character  sets
   1461                       have  deliberately  been  omitted in favor of
   1462                       their  8859  replacements,  which   are   the
   1463                       designated  character sets for Internet mail.
   1464                       As of the publication of this  document,  the
   1465                       legitimate  values  for  "X" are the digits 1
   1466                       through 9.
   1467 
   1468             Note that the character set used,  if  anything  other  than
   1469             US-ASCII,   must  always  be  explicitly  specified  in  the
   1470             Content-Type field.
   1471 
   1472             No other character set name may be  used  in  Internet  mail
   1473             without  the  publication  of a formal specification and its
   1474             registration with IANA as described in  Appendix  F,  or  by
   1475             private agreement, in which case the character set name must
   1476             begin with "X-".
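
The naming rules above can be summarized in a small check.  This is a
sketch only: classify_charset and its three labels are invented for
illustration, and it knows nothing about names registered with IANA
after publication, which it reports as "unregistered".

```python
import re


def classify_charset(name: str) -> str:
    """Classify a charset name as "defined", "private", or "unregistered"."""
    lowered = name.lower()  # charset values are not case sensitive
    # US-ASCII, or ISO-8859-X with X being one of the digits 1 through 9.
    if re.fullmatch(r"us-ascii|iso-8859-[1-9]", lowered):
        return "defined"
    # Names used by private agreement must begin with "X-".
    if lowered.startswith("x-"):
        return "private"
    return "unregistered"
```

Note that the reserved name "ASCII" falls into the "unregistered"
class here, as does "ISO-8859-10".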
   1477 
   1478             Implementors are discouraged  from  defining  new  character
   1479             sets for mail use unless absolutely necessary.
   1480 
   1481             The "charset" parameter has been defined primarily  for  the
   1482             purpose  of  textual  data, and is described in this section
   1483             for that reason.   However,  it  is  conceivable  that  non-
   1484             textual  data might also wish to specify a charset value for
   1485             some purpose, in which  case  the  same  syntax  and  values
   1486             should be used.
   1487 
   1488             In general, mail-sending  software  should  always  use  the
   1489             "lowest  common  denominator"  character  set possible.  For
   1490             example, if a body contains  only  US-ASCII  characters,  it
   1502             should be marked as being in the US-ASCII character set, not
   1503             ISO-8859-1, which, like all the ISO-8859 family of character
   1504             sets,  is  a  superset  of  US-ASCII.   More generally, if a
   1505             widely-used character set is a subset of  another  character
   1506             set,  and a body contains only characters in the widely-used
   1507             subset, it should be labeled as being in that  subset.  This
   1508             will increase the chances that the recipient will be able to
   1509             view the mail correctly.
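
A sender could pick the "lowest common denominator" by trying
candidate character sets from most to least widely supported.  The
function name and the default candidate list are assumptions made for
this sketch; a real agent would choose its own ordering.

```python
def pick_charset(text: str, candidates=("us-ascii", "iso-8859-1")) -> str:
    """Return the first candidate charset that can represent the text."""
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # this charset cannot represent some character
        return name
    raise ValueError("no candidate charset can represent this text")
```

A body containing only US-ASCII characters is thus labeled
"us-ascii", while one containing an accented Latin-1 letter is labeled
"iso-8859-1".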
   1510 
   1511             7.1.2     The Text/plain subtype
   1512 
   1513             The primary subtype of text   is  "plain".   This  indicates
   1514             plain  (unformatted)  text.  The  default  Content-Type  for
   1515             Internet  mail,  "text/plain;  charset=us-ascii",  describes
   1516             existing  Internet practice, that is, it is the type of body
   1517             defined by RFC 822.
   1518 
   1519             7.1.3     The Text/richtext subtype
   1520 
   1521             In order to promote the  wider  interoperability  of  simple
   1522             formatted  text,  this  document defines an extremely simple
   1523             subtype of "text", the "richtext" subtype.  This subtype was
   1524             designed to meet the following criteria:
   1525 
   1526                  1.  The syntax must be extremely simple to  parse,
   1527                  so  that  even  teletype-oriented mail systems can
   1528                  easily strip away the formatting  information  and
   1529                  leave only the readable text.
   1530 
   1531                  2.  The syntax must be extensible to allow for new
   1532                  formatting commands that are deemed essential.
   1533 
   1534                  3.  The capabilities must be extremely limited, to
   1535                  ensure  that  it  can  represent  no  more than is
   1536                  likely to be representable by the  user's  primary
   1537                  word  processor.   While  this  limits what can be
   1538                  sent, it increases the  likelihood  that  what  is
   1539                  sent can be properly displayed.
   1540 
   1541                  4.  The syntax must be compatible  with  SGML,  so
   1542                  that,  with  an  appropriate  DTD  (Document  Type
   1543                  Definition, the standard mechanism for defining  a
   1544                  document  type  using SGML), a general SGML parser
   1545                  could be made to parse richtext.  However, despite
   1546                  this  compatibility,  the  syntax  should  be  far
   1547                  simpler than full SGML, so that no SGML  knowledge
   1548                  is required in order to implement it.
   1549 
   1550             The syntax of "richtext" is very simple.  It is assumed,  at
   1551             the  top-level,  to be in the US-ASCII character set, unless
   1552             of course a different charset parameter was specified in the
   1553             Content-type  field.   All  characters represent themselves,
   1554             with the exception of the "<" character (ASCII 60), which is
   1555             used   to  mark  the  beginning  of  a  formatting  command.
   1567             Formatting  instructions  consist  of  formatting   commands
   1568             surrounded  by angle brackets ("<>", ASCII 60 and 62).  Each
   1569             formatting command may be no  more  than  40  characters  in
   1570             length,  all in US-ASCII, restricted to the alphanumeric and
   1571             hyphen ("-") characters. Formatting commands may be preceded
   1572             by  a  forward slash or solidus ("/", ASCII 47), making them
   1573             negations, and such negations must always exist  to  balance
   1574             the  initial opening commands, except as noted below.  Thus,
   1575             if the formatting command "<bold>" appears  at  some  point,
   1576             there  must  later  be a "</bold>" to balance it.  There are
   1577             only three exceptions to this "balancing" rule:  First,  the
   1578             command "<lt>" is used to represent a literal "<" character.
   1579             Second, the command "<nl>" is used to represent  a  required
   1580             line  break.   (Otherwise,  CRLFs in the data are treated as
   1581             equivalent to  a  single  SPACE  character.)   Finally,  the
   1582             command  "<np>"  is  used to represent a page break.  (NOTE:
   1583             The 40 character  limit  on  formatting  commands  does  not
   1584             include  the  "<",  ">",  or  "/"  characters  that might be
   1585             attached to such commands.)
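
The syntax just described is simple enough that a plain-text reduction
fits in a few lines.  The sketch below is illustrative: the names are
hypothetical, command names are compared in lower case as an
assumption, a form feed stands in for the page break, and every other
command is dropped while the text it encloses is kept.

```python
import re

# "<", an optional "/", 1 to 40 alphanumeric or hyphen characters, ">".
COMMAND = re.compile(r"<(/?)([A-Za-z0-9-]{1,40})>")


def richtext_to_plain(body: str) -> str:
    """Strip richtext formatting commands, leaving readable text."""
    # A CRLF in the data is equivalent to a single SPACE character.
    body = body.replace("\r\n", " ")

    def replace(match):
        negated, name = match.group(1), match.group(2).lower()
        if not negated:
            if name == "lt":
                return "<"   # literal "<" character
            if name == "nl":
                return "\n"  # required line break
            if name == "np":
                return "\f"  # page break, shown as a form feed (assumption)
        return ""            # any other command is removed

    return COMMAND.sub(replace, body)
```

For example, "<bold>Hi</bold><nl>a <lt> b" followed by a CRLF and "c"
becomes "Hi", a line break, and then "a < b c".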
   1586 
   1587             Initially defined formatting commands, not all of which will
   1588             be implemented by all richtext implementations, include:
   1589 
   1590                  Bold -- causes the subsequent text  to  be  in  a  bold
   1591                       font.
   1592                  Italic -- causes the subsequent text to be in an italic
   1593                       font.
   1594                  Fixed -- causes the subsequent text to be  in  a  fixed
   1595                       width font.
   1596                  Smaller -- causes  the  subsequent  text  to  be  in  a
   1597                       smaller font.
   1598                  Bigger -- causes the subsequent text to be in a  bigger
   1599                       font.
   1600                  Underline  --  causes  the  subsequent   text   to   be
   1601                       underlined.
   1602                  Center -- causes the subsequent text to be centered.
   1603                  FlushLeft -- causes the  subsequent  text  to  be  left
   1604                       justified.
   1605                  FlushRight -- causes the subsequent text  to  be  right
   1606                       justified.
   1607                  Indent -- causes the subsequent text to be indented  at
   1608                       the left margin.
   1609                  IndentRight  --  causes  the  subsequent  text  to   be
   1610                       indented at the right margin.
   1611                  Outdent -- causes the subsequent text to  be  outdented
   1612                       at the left margin.
   1613                  OutdentRight  --  causes  the  subsequent  text  to  be
   1614                       outdented at the right margin.
   1615                  SamePage -- causes the subsequent text to  be  grouped,
   1616                       if possible, on one page.
   1617                  Subscript  --  causes  the  subsequent   text   to   be
   1618                       interpreted as a subscript.
   1632                  Superscript  --  causes  the  subsequent  text  to   be
   1633                       interpreted as a superscript.
   1634                  Heading -- causes the subsequent text to be interpreted
   1635                       as a page heading.
   1636                  Footing -- causes the subsequent text to be interpreted
   1637                       as a page footing.
   1638                  ISO-8859-X  (for any value of X  that  is  legal  as  a
   1639                       "charset" parameter) -- causes the subsequent text
   1640                       to be  interpreted  as  text  in  the  appropriate
   1641                       character set.
   1642                  US-ASCII  --  causes  the   subsequent   text   to   be
   1643                       interpreted as text in the US-ASCII character set.
   1644                  Excerpt -- causes the subsequent text to be interpreted
   1645                       as   a   textual   excerpt  from  another  source.
   1646                       Typically this will be displayed using indentation
   1647                       and  an  alternate font, but such decisions are up
   1648                       to the viewer.
   1649                  Paragraph  --  causes  the  subsequent   text   to   be
   1650                       interpreted    as   a   single   paragraph,   with
   1651                       appropriate  paragraph  breaks  (typically   blank
   1652                       space) before and after.
   1653                  Signature  --  causes  the  subsequent   text   to   be
   1654                       interpreted  as  a  "signature".  Some systems may
   1655                       wish to display signatures in a  smaller  font  or
   1656                       otherwise set them apart from the main text of the
   1657                       message.
   1658                  Comment -- causes the subsequent text to be interpreted
   1659                       as a comment, and hence not shown to the reader.
   1660                  No-op -- has no effect on the subsequent text.
   1661                  lt -- <lt> is replaced by a literal "<" character.   No
   1662                       balancing </lt> is allowed.
   1663                  nl -- <nl> causes a line break.  No balancing </nl>  is
   1664                       allowed.
   1665                  np -- <np> causes a page break.  No balancing </np>  is
   1666                       allowed.
   1667 
   1668             Each positive formatting command affects all subsequent text
   1669             until  the matching negative formatting command.  Such pairs
   1670             of formatting commands must be properly balanced and nested.
   1671             Thus, a proper way to describe text in bold italics is:
   1672 
   1673                       <bold><italic>the-text</italic></bold>
   1674 
   1675                  or, alternately,
   1676 
   1677                       <italic><bold>the-text</bold></italic>
   1678 
   1679                  but,  in  particular,  the  following  is  illegal
   1680                  richtext:
   1681 
   1682                       <bold><italic>the-text</bold></italic>
   1683 
   1684             NOTE:   The  nesting  requirement  for  formatting  commands
   1685             imposes  a  slightly  higher  burden  upon  the composers of
   1697             richtext  bodies,  but   potentially   simplifies   richtext
   1698             displayers  by  allowing  them  to be stack-based.  The main
   1699             goal of richtext is to be simple enough to  make  multifont,
   1700             formatted  email  widely  readable,  so  that those with the
   1701             capability of  sending  it  will  be  able  to  do  so  with
   1702             confidence.   Thus  slightly  increased  complexity  in  the
   1703             composing software was  deemed  a  reasonable  tradeoff  for
   1704             simplified  reading  software.  Nonetheless, implementors of
   1705             richtext  readers  are  encouraged  to  follow  the  general
   1706             Internet  guidelines  of being conservative in what you send
   1707             and liberal in what you accept.  Those implementations  that
   1708             can  do so are encouraged to deal reasonably with improperly
   1709             nested richtext.
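
The stack-based processing mentioned in the note above can be sketched as follows. This is only an illustration, not part of the richtext definition: the function name is hypothetical, and folding command names to lower case is an assumption made here for convenience.

```python
import re

# Commands that, per the list above, take no balancing close command.
STANDALONE = {"lt", "nl", "np"}
TAG = re.compile(r"<(/?)([^<>]*)>")

def is_properly_nested(richtext):
    """Return True if every positive command is closed in LIFO order."""
    stack = []
    for match in TAG.finditer(richtext):
        closing = match.group(1) == "/"
        name = match.group(2).lower()  # assumption: case is not significant
        if name in STANDALONE:
            if closing:
                return False  # e.g. </nl> is not allowed
            continue
        if not closing:
            stack.append(name)
        elif not stack or stack.pop() != name:
            return False  # close command does not match the innermost open one
    return not stack  # anything left open is unbalanced
```

A tolerant reader might use the same stack to recover from a mismatch (for instance by closing the intervening commands) rather than rejecting the text outright.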
   1710 
   1711             Implementations  must  regard  any  unrecognized  formatting
   1712             command  as  equivalent to "No-op", thus facilitating future
   1713             extensions to "richtext".  Private extensions may be defined
   1714             using  formatting  commands that begin with "X-", by analogy
   1715             to Internet mail header field names.
   1716 
   1717             It is worth noting that no special behavior is required  for
   1718             the TAB (HT) character. It is recommended, however, that, at
   1719             least  when  fixed-width  fonts  are  in  use,  the   common
   1720             semantics  of  the  TAB  (HT)  character should be observed,
   1721             namely that it moves to the next column position that  is  a
   1722             multiple  of  8.   (In  other words, if a TAB (HT) occurs in
   1723             column n, where the leftmost column is column 0,  then  that
   1724             TAB   (HT)   should   be  replaced  by  8-(n  mod  8)  SPACE
   1725             characters.)
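
The recommended TAB handling can be illustrated with a short sketch. The function name and the tab_width parameter are hypothetical; the arithmetic is the 8-(n mod 8) rule given above, with columns counted from 0.

```python
def expand_tabs(line, tab_width=8):
    """Replace each TAB (HT) with spaces up to the next multiple of tab_width."""
    out = []
    column = 0  # leftmost column is column 0
    for ch in line:
        if ch == "\t":
            pad = tab_width - (column % tab_width)
            out.append(" " * pad)
            column += pad
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```

For a single line of text this should agree with Python's built-in str.expandtabs(8); the loop is spelled out here only to make the column arithmetic visible.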
   1726 
   1727             Richtext also differentiates between "hard" and "soft"  line
   1728             breaks.   A line break (CRLF) in the richtext data stream is
   1729             interpreted as a "soft" line break,  one  that  is  included
   1730             only for purposes of mail transport, and is to be treated as
   1731             white space by richtext interpreters.  To include  a  "hard"
   1732             line  break (one that must be displayed as such), the "<nl>"
   1733             or "<paragraph>" formatting constructs should  be  used.  In
   1734             general, a soft line break should be treated as white space,
   1735             but when soft line breaks immediately follow  a  <nl>  or  a
   1736             </paragraph>  tag they should be ignored rather than treated
   1737             as white space.
   1738 
   1739             Putting all this  together,  the  following  "text/richtext"
   1740             body fragment:
   1741 
   1742                       <bold>Now</bold> is the time for
   1743                       <italic>all</italic> good men
   1744                        <smaller>(and <lt>women>)</smaller> to
   1745                       <ignoreme></ignoreme> come
   1746 
   1747                       to the aid of their
   1748                       <nl>
   1762                       beloved <nl><nl>country. <comment> Stupid
   1763                       quote! </comment> -- the end
   1764 
   1765             represents the following  formatted  text  (which  will,  no
   1766             doubt,  look  cryptic  in  the  text-only  version  of  this
   1767             document):
   1768 
   1769                  Now is the time for all good men (and <women>)  to
   1770                  come to the aid of their
   1771                  beloved
   1772 
   1773                  country. -- the end
   1774 
   1775             Richtext conformance:  A minimal richtext implementation  is
   1776             one  that  simply  converts "<lt>" to "<", converts CRLFs to
   1777             SPACE, converts <nl> to a newline according to local newline
   1778             convention,  removes  everything between a <comment> command
   1779             and the next balancing </comment> command, and  removes  all
   1780             other  formatting  commands  (all  text  enclosed  in  angle
   1781             brackets).
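
A minimal implementation along the lines just described might look like the sketch below. It is not the program from Appendix D. The function name is hypothetical, and two choices are assumptions made here: a bare LF is accepted as a line break in addition to CRLF, and command names are folded to lower case. The text is scanned token by token so that the "<" produced by <lt> is never mistaken for the start of a command.

```python
import re

# A token is either a formatting command or a line break in the data stream.
TOKEN = re.compile(r"<(/?)([^<>]*)>|\r\n|\n")

def richtext_to_plain(richtext, newline="\n"):
    """Convert richtext to plain text using only the minimal rules."""
    out = []
    comment_depth = 0
    pos = 0
    for m in TOKEN.finditer(richtext):
        if comment_depth == 0:
            out.append(richtext[pos:m.start()])  # literal text before the token
        pos = m.end()
        if m.group(2) is None:
            # Soft line break: treated as a SPACE (dropped inside comments).
            if comment_depth == 0:
                out.append(" ")
            continue
        closing = m.group(1) == "/"
        name = m.group(2).lower()
        if name == "comment":
            comment_depth = max(comment_depth + (-1 if closing else 1), 0)
        elif comment_depth == 0 and not closing:
            if name == "lt":
                out.append("<")
            elif name == "nl":
                out.append(newline)
        # Every other formatting command is simply removed.
    if comment_depth == 0:
        out.append(richtext[pos:])
    return "".join(out)
```

Note that this sketch does not apply the refinement that soft line breaks immediately following <nl> or </paragraph> be ignored, so its spacing can differ slightly from a fuller implementation.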
   1782 
   1783             NOTE ON THE RELATIONSHIP OF RICHTEXT TO SGML:   Richtext  is
   1784             decidedly  not  SGML,  and  must  not  be  used to transport
   1785             arbitrary SGML  documents.   Those  who  wish  to  use  SGML
   1786             document  types as a mail transport format must define a new
   1787             text or application subtype, e.g.,  "text/sgml-dtd-whatever"
   1788             or   "application/sgml-dtd-whatever",   depending   on   the
   1789             perceived readability  of  the  DTD  in  use.   Richtext  is
   1790             designed  to  be  compatible  with SGML, and specifically so
   1791             that it will be possible to define a richtext DTD if one  is
   1792             needed.   However,  this  does not imply that arbitrary SGML
   1793             can be called richtext, nor that richtext implementors  have
   1794             any  need  to  understand  SGML;  the  description  in  this
   1795             document is a complete definition of richtext, which is  far
   1796             simpler than complete SGML.
   1797 
   1798             NOTE ON THE INTENDED USE OF RICHTEXT:  It is recognized that
   1799             implementors  of  future  mail  systems  will want rich text
   1800             functionality  far  beyond  that   currently   defined   for
   1801             richtext.   The  intent  of  richtext is to provide a common
   1802             format for expressing that functionality in a form in  which
   1803             much  of  it, at least, will be understood by interoperating
   1804             software.  Thus,  in  particular,  software  with  a  richer
   1805             notion  of  formatted  text  than  richtext  can  still  use
   1806             richtext as its basic representation, but can extend it with
   1807             new  formatting  commands and by hiding information specific
   1808             to that software  system  in  richtext  comments.   As  such
   1809             systems  evolve,  it  is  expected  that  the  definition of
   1810             richtext  will  be  further  refined  by  future   published
   1811             specifications,  but  richtext  as  defined  here provides a
   1812             platform on which evolutionary refinements can be based.
   1813 
   1814             IMPLEMENTATION NOTE:  In  some  environments,  it  might  be
   1815             impossible  to combine certain richtext formatting commands,
   1827             whereas in  others  they  might  be  combined  easily.   For
   1828             example,  the  combination  of  <bold>  and  <italic>  might
   1829             produce bold italics on systems that support such fonts, but
   1830             there  exist  systems that can make text bold or italicized,
   1831             but not both.  In  such  cases,  the  most  recently  issued
   1832             recognized formatting command should be preferred.
   1833 
   1834             One of the major goals in the design of richtext was to make
   1835             it  so  simple  that  even  text-only mailers will implement
   1836             richtext-to-plain-text  translators,  thus  increasing   the
   1837             likelihood  that  multifont  text  will become "safe" to use
   1838             very widely.  To demonstrate this simplicity,  an  extremely
   1839             simple  35-line  C program that converts richtext input into
   1840             plain text output is included in Appendix D.
   1841 
   1892             7.2  The Multipart Content-Type
   1893 
   1894             In the case of multiple part messages, in which one or  more
   1895             different  sets  of  data  are  combined in a single body, a
   1896             "multipart" Content-Type field must appear in  the  entity's
   1897             header. The body must then contain one or more "body parts,"
   1898             each preceded by an encapsulation boundary, and the last one
   1899             followed  by  a  closing boundary.  Each part starts with an
   1900             encapsulation  boundary,  and  then  contains  a  body  part
   1901             consisting  of a header area, a blank line, and a body area.
   1902             Thus a body part is similar to an RFC 822 message in syntax,
   1903             but different in meaning.
   1904 
   1905             A body part is NOT to be interpreted as  actually  being  an
   1906             RFC  822  message.   To  begin  with,  NO  header fields are
   1907             actually required in body parts.  A body  part  that  starts
   1908             with  a blank line, therefore, is allowed and is a body part
   1909             for which all default values are to be assumed.  In  such  a
   1910             case,  the  absence  of  a Content-Type header field implies
   1911             that the encapsulation is plain  US-ASCII  text.   The  only
   1912             header  fields  that have defined meaning for body parts are
   1913             those the names of which begin with "Content-".   All  other
   1914             header  fields  are  generally  to be ignored in body parts.
   1915             Although  they  should  generally  be   retained   in   mail
   1916             processing,  they may be discarded by gateways if necessary.
   1917             Such other fields are permitted to appear in body parts  but
   1918             should  not  be  depended on. "X-" fields may be created for
   1919             experimental or private purposes, with the recognition  that
   1920             the information they contain may be lost at some gateways.
   1921 
   1922             The distinction between an RFC 822 message and a  body  part
   1923             is  subtle,  but  important.  A gateway between Internet and
   1924             X.400 mail, for example, must be able to tell the difference
   1925             between  a  body part that contains an image and a body part
   1926             that contains an encapsulated message, the body of which  is
   1927             an  image.   In order to represent the latter, the body part
   1928             must have "Content-Type: message", and its body  (after  the
   1929             blank  line)  must be the encapsulated message, with its own
   1930             "Content-Type: image" header  field.   The  use  of  similar
   1931             syntax facilitates the conversion of messages to body parts,
   1932             and vice versa, but the distinction between the two must  be
   1933             understood  by implementors.  (For the special case in which
   1934             all parts actually are messages, a "digest" subtype is  also
   1935             defined.)
   1936 
   1937             As stated previously, each  body  part  is  preceded  by  an
   1938             encapsulation boundary.  The encapsulation boundary MUST NOT
   1939             appear inside any of the encapsulated parts.   Thus,  it  is
   1940             crucial  that  the  composing  agent  be  able to choose and
   1941             specify the unique boundary that will separate the parts.
   1942 
   1943             All present and future subtypes of the "multipart" type must
   1944             use  an  identical  syntax.  Subtypes  may  differ  in their
   1945             semantics, and may impose additional restrictions on syntax,
   1957             but  must  conform  to the required syntax for the multipart
   1958             type.  This requirement ensures  that  all  conformant  user
   1959             agents  will  at least be able to recognize and separate the
   1960             parts of any  multipart  entity,  even  of  an  unrecognized
   1961             subtype.
   1962 
   1963             As stated in the definition of the Content-Transfer-Encoding
   1964             field, no encoding other than "7bit", "8bit", or "binary" is
   1965             permitted for entities of type "multipart".   The  multipart
   1966             delimiters  and  header fields are always 7-bit ASCII in any
   1967             case, and data within the body parts can  be  encoded  on  a
   1968             part-by-part  basis,  with  Content-Transfer-Encoding fields
   1969             for each appropriate body part.
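
The encoding restriction above lends itself to a simple check. The sketch below is illustrative only: the function name is hypothetical, and both the case-insensitive comparison and the default of "7bit" for a missing encoding are assumptions not stated in this section.

```python
ALLOWED_MULTIPART_ENCODINGS = {"7bit", "8bit", "binary"}

def multipart_encoding_allowed(content_type, encoding="7bit"):
    """Return False only for a multipart entity with a forbidden encoding."""
    main_type = content_type.split("/", 1)[0].strip().lower()
    if main_type != "multipart":
        return True  # the restriction applies to multipart entities only
    return encoding.strip().lower() in ALLOWED_MULTIPART_ENCODINGS
```

Individual body parts are not subject to this check; each may carry its own Content-Transfer-Encoding field.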
   1970 
   1971             Mail gateways, relays, and other mail  handling  agents  are
   1972             commonly  known  to alter the top-level header of an RFC 822
   1973             message.   In particular, they frequently  add,  remove,  or
   1974             reorder  header  fields.   Such  alterations  are explicitly
   1975             forbidden for the body part headers embedded in  the  bodies
   1976             of messages of type "multipart."
   1977 
   1978             7.2.1     Multipart:  The common syntax
   1979 
   1980             All subtypes of "multipart" share a common  syntax,  defined
   1981             in  this  section.   A simple example of a multipart message
   1982             also appears in this section.  An example of a more  complex
   1983             multipart message is given in Appendix C.
   1984 
   1985             The Content-Type field for multipart  entities requires  one
   1986             parameter,   "boundary",   which  is  used  to  specify  the
   1987             encapsulation  boundary.   The  encapsulation  boundary   is
   1988             defined   as  a  line  consisting  entirely  of  two  hyphen
   1989             characters ("-", decimal code 45) followed by  the  boundary
   1990             parameter value from the Content-Type header field.
   1991 
   1992             NOTE:  The hyphens are  for  rough  compatibility  with  the
   1993             earlier  RFC  934  method  of message encapsulation, and for
   1994             ease   of   searching   for   the   boundaries    in    some
   1995             implementations.  However, it should be noted that multipart
   1996             messages  are  NOT  completely  compatible  with   RFC   934
   1997             encapsulations;  in  particular,  they  do  not obey RFC 934
   1998             quoting conventions  for  embedded  lines  that  begin  with
   1999             hyphens.   This  mechanism  was  chosen  over  the  RFC  934
   2000             mechanism because the latter causes lines to grow with  each
   2001             level  of  quoting.  The combination of this growth with the
   2002             fact that SMTP implementations  sometimes  wrap  long  lines
   2003             made  the  RFC 934 mechanism unsuitable for use in the event
   2004             that deeply-nested multipart structuring is ever desired.
   2005 
   2006             Thus, a typical multipart Content-Type  header  field  might
   2007             look like this:
   2008 
   2009                  Content-Type: multipart/mixed;
   2022                       boundary=gc0p4Jq0M2Yt08jU534c0p
   2023 
   2024             This indicates that the entity consists  of  several  parts,
   2025             each itself with a structure that is syntactically identical
   2026             to an RFC 822 message, except that the header area might  be
   2027             completely  empty,  and  that the parts are each preceded by
   2028             the line
   2029 
   2030                  --gc0p4Jq0M2Yt08jU534c0p
   2031 
   2032             Note that the  encapsulation  boundary  must  occur  at  the
   2033             beginning  of  a line, i.e., following a CRLF, and that that
   2034             initial CRLF is considered to be part of  the  encapsulation
   2035             boundary  rather  than  part  of  the preceding part.    The
   2036             boundary must be followed immediately either by another CRLF
   2037             and the header fields for the next part, or by two CRLFs, in
   2038             which case there are no header fields for the next part (and
   2039             it is therefore assumed to be of Content-Type text/plain).
   2040 
   2041             NOTE:   The  CRLF  preceding  the  encapsulation   line   is
   2042             considered  part  of  the boundary so that it is possible to
   2043             have a part that does not end with  a  CRLF  (line   break).
   2044             Body  parts that must be considered to end with line breaks,
   2045             therefore, should have two CRLFs preceding the encapsulation
   2046             line, the first of which is part of the preceding body part,
   2047             and the  second  of  which  is  part  of  the  encapsulation
   2048             boundary.
   2049 
   2050             The requirement that the encapsulation boundary begins  with
   2051             a  CRLF  implies  that  the  body of a multipart entity must
   2052             itself begin with a CRLF before the first encapsulation line
   2053             --  that  is, if the "preamble" area is not used, the entity
   2054             headers must be followed by TWO CRLFs.  This is  indeed  how
   2055             such  entities  should be composed.  A tolerant mail reading
   2056             program, however, may interpret a  body  of  type  multipart
   2057             that  begins  with  an encapsulation line NOT initiated by a
   2058             CRLF  as  also  being  an  encapsulation  boundary,  but   a
   2059             compliant  mail  sending  program  must  not  generate  such
   2060             entities.
   2061 
   2062             Encapsulation  boundaries  must  not   appear   within   the
   2063             encapsulations,  and  must  be no longer than 70 characters,
   2064             not counting the two leading hyphens.
   2065 
   2066             The encapsulation boundary following the last body part is a
   2067             distinguished  delimiter that indicates that no further body
   2068             parts will follow.  Such a delimiter  is  identical  to  the
   2069             previous  delimiters,  with the addition of two more hyphens
   2070             at the end of the line:
   2071 
   2072                  --gc0p4Jq0M2Yt08jU534c0p--
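
As a rough sketch of how a composing agent could assemble such a body, the function below joins already-formatted body parts with encapsulation boundaries and appends the closing boundary. The function name is hypothetical, the preamble and epilogue are left empty, and ending the result with a final CRLF is a choice made here rather than a requirement.

```python
CRLF = "\r\n"

def build_multipart_body(parts, boundary):
    """Join body parts with encapsulation boundaries and a closing boundary.

    Each item in `parts` is a complete body part: a header area, a blank
    line, and a body area (the header area may be empty).
    """
    if not 1 <= len(boundary) <= 70:
        raise ValueError("boundary must be 1 to 70 characters long")
    if not parts:
        raise ValueError("at least one body part is required")
    marker = "--" + boundary
    for part in parts:
        if marker in part:
            raise ValueError("boundary occurs inside a body part")
    # The CRLF before each "--boundary" belongs to the boundary itself.
    body = "".join(CRLF + marker + CRLF + part for part in parts)
    return body + CRLF + marker + "--" + CRLF
```

Because the leading CRLF is counted as part of each boundary, a part that should end with a line break must carry its own trailing CRLF in the string passed in.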
   2073 
   2074             There appears to be room for additional information prior to
   2075             the  first  encapsulation  boundary  and following the final
   2087             boundary.  These areas should generally be left  blank,  and
   2088             implementations  should  ignore anything that appears before
   2089             the first boundary or after the last one.
   2090 
   2091             NOTE:  These "preamble" and "epilogue" areas  are  not  used
   2092             because  of the lack of proper typing of these parts and the
   2093             lack  of  clear  semantics  for  handling  these  areas   at
   2094             gateways, particularly X.400 gateways.
   2095 
   2096             NOTE:  Because encapsulation boundaries must not  appear  in
   2097             the  body  parts  being  encapsulated,  a  user  agent  must
   2098             exercise care to choose a unique boundary.  The boundary  in
   2099             the example above could have been the result of an algorithm
   2100             designed to produce boundaries with a very  low  probability
   2101             of  already  existing in the data to be encapsulated without
   2102             having to prescan  the  data.   Alternate  algorithms  might
   2103             result in more 'readable' boundaries for a recipient with an
   2104             old user agent, but would  require  more  attention  to  the
   2105             possibility   that   the   boundary   might  appear  in  the
   2106             encapsulated  part.   The  simplest  boundary  possible   is
   2107             something like "---", with a closing boundary of "-----".
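
One way to choose a boundary is sketched below, under stated assumptions: the function name, the default length of 24, and the alphanumeric alphabet are all hypothetical choices. Unlike the no-prescan approach mentioned in the note, this version also scans the data, which costs a pass over the parts but rules out a collision entirely.

```python
import secrets
import string

# Letters and digits only: a conservative subset of the permitted characters.
BOUNDARY_ALPHABET = string.ascii_letters + string.digits

def choose_boundary(parts, length=24, max_tries=100):
    """Pick a random boundary that does not occur in any of the parts."""
    if not 1 <= length <= 70:
        raise ValueError("boundary must be 1 to 70 characters long")
    for _ in range(max_tries):
        candidate = "".join(secrets.choice(BOUNDARY_ALPHABET) for _ in range(length))
        if not any(("--" + candidate) in part for part in parts):
            return candidate
    raise RuntimeError("could not find an unused boundary")
```

With 24 random characters a collision is already extremely unlikely, so the retry loop will normally succeed on its first pass.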
   2108 
   2109             As a very simple example, the  following  multipart  message
   2110             has  two  parts,  both  of  them  plain  text,  one  of them
   2111             explicitly typed and one of them implicitly typed:
   2112 
   2113                  From: Nathaniel Borenstein <nsb@bellcore.com>
   2114                  To:  Ned Freed <ned@innosoft.com>
   2115                  Subject: Sample message
   2116                  MIME-Version: 1.0
   2117                  Content-type: multipart/mixed; boundary="simple boundary"
   2119 
   2120                  This is the preamble.  It is to be ignored, though it
   2121                  is a handy place for mail composers to include an
   2122                  explanatory note to non-MIME compliant readers.
   2123                  --simple boundary
   2124 
   2125                  This is implicitly typed plain ASCII text.
   2126                  It does NOT end with a linebreak.
   2127                  --simple boundary
   2128                  Content-type: text/plain; charset=us-ascii
   2129 
   2130                  This is explicitly typed plain ASCII text.
   2131                  It DOES end with a linebreak.
   2132 
   2133                  --simple boundary--
   2134                  This is the epilogue.  It is also to be ignored.
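
To make the treatment of the preamble, epilogue, and line breaks concrete, here is a small sketch of a reader that separates the parts of a body such as the one above. The function name is hypothetical. It relies on the requirement that the boundary never occurs inside a part, and it also accepts a first boundary line that is not preceded by a CRLF, as a tolerant reader may.

```python
CRLF = "\r\n"

def split_multipart(body, boundary):
    """Return the body parts of a multipart body, dropping preamble and epilogue."""
    delim = CRLF + "--" + boundary  # the leading CRLF is part of the boundary
    if body.startswith("--" + boundary):
        body = CRLF + body  # tolerate a first boundary with no leading CRLF
    chunks = body.split(delim)
    parts = []
    for chunk in chunks[1:]:  # chunks[0] is the preamble and is ignored
        if chunk.startswith("--"):
            break  # closing boundary: whatever follows is the epilogue
        # Drop the remainder of the boundary line (normally just its CRLF).
        _, _, part = chunk.partition(CRLF)
        parts.append(part)
    return parts
```

Applied to the example, the first part comes back starting with an empty header area and ending without a line break, while the second part keeps its final CRLF, matching what the two parts say about themselves.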
   2135 
   2136             The use of a Content-Type of multipart in a body part within
   2137             another  multipart  entity  is explicitly allowed.   In such
   2138             cases, for obvious reasons, care must  be  taken  to  ensure
   2139             that  each  nested  multipart  entity  uses   a   different
   2140             boundary delimiter. See Appendix C for an example of  nested
   2152             multipart entities.
   2153 
   2154             The use of the multipart Content-Type  with  only  a  single
   2155             body  part  may  be  useful  in  certain  contexts,  and  is
   2156             explicitly permitted.
   2157 
   2158             The only mandatory parameter for the multipart  Content-Type
   2159             is  the  boundary  parameter,  which  consists  of  1  to 70
   2160             characters from a set of characters known to be very  robust
   2161             through  email  gateways,  and  NOT ending with white space.
   2162             (If a boundary appears to end with white  space,  the  white
   2163             space  must be presumed to have been added by a gateway, and
   2164             should  be  deleted.)   It  is  formally  specified  by  the
   2165             following BNF:
   2166 
   2167             boundary := 0*69<bchars> bcharsnospace
   2168 
   2169             bchars := bcharsnospace / " "
   2170 
   2171             bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_"
   2173                            / "," / "-" / "." / "/" / ":" / "=" / "?"
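
The grammar for the boundary parameter translates fairly directly into a regular expression. The sketch below is one possible rendering; the names are hypothetical, and ALPHA and DIGIT are taken to mean the ASCII letters and digits.

```python
import re

# Characters permitted by "bcharsnospace"; "bchars" additionally allows SPACE.
_NOSPACE = r"0-9A-Za-z'()+_,\-./:=?"
BOUNDARY_RE = re.compile(rf"[{_NOSPACE} ]{{0,69}}[{_NOSPACE}]")

def is_valid_boundary(value):
    """Check a boundary parameter value against the grammar above."""
    return BOUNDARY_RE.fullmatch(value) is not None

def normalize_boundary(value):
    """Delete trailing white space, presumed to have been added by a gateway."""
    return value.rstrip(" \t")
```

A reader would typically call normalize_boundary on the received parameter first and validate the result afterwards.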
   2174 
   2175             Overall, the body of a multipart entity may be specified  as
   2176             follows:
   2177 
   2178             multipart-body := preamble 1*encapsulation
   2179                            close-delimiter epilogue
   2180 
   2181             encapsulation := delimiter CRLF body-part
   2182 
   2183             delimiter := CRLF "--" boundary   ; taken from Content-Type
   2184                                               ; field.
   2185                                               ; when content-type is
   2186                                               ; multipart
   2187                                               ; There must be no space
   2188                                               ; between "--" and boundary.
   2189 
   2190             close-delimiter := delimiter "--" ; Again, no space before
   2191                                               ; "--"
   2192 
   2193             preamble :=  *text                ; to be ignored upon
   2194                                               ; receipt.
   2195 
   2196             epilogue :=  *text                ; to be ignored upon
   2197                                               ; receipt.
   2198 
   2199             body-part := <"message" as defined in RFC 822,
   2200                      with all header fields optional, and with the
   2201                      specified delimiter not occurring anywhere in
   2202                      the message body, either on a line by itself
   2203                      or as a substring anywhere.  Note that the
   2217                      semantics of a part differ from the semantics
   2218                      of a message, as described in the text.>
   2219 
   2220             NOTE:  Conspicuously missing from the multipart  type  is  a
   2221             notion  of  structured,  related body parts.  In general, it
   2222             seems premature to try to  standardize  interpart  structure
   2223             yet.  It is recommended that those wishing to provide a more
   2224             structured or integrated multipart messaging facility should
   2225             define   a   subtype  of  multipart  that  is  syntactically
   2226             identical, but  that  always  expects  the  inclusion  of  a
   2227             distinguished part that can be used to specify the structure
   2228             and integration of the other parts,  probably  referring  to
   2229             them  by  their Content-ID field.  If this approach is used,
   2230             other implementations will not recognize  the  new  subtype,
   2231             but  will  treat it as the primary subtype (multipart/mixed)
   2232             and will thus be able to show the user the  parts  that  are
   2233             recognized.
   2234 
   2235             7.2.2     The Multipart/mixed (primary) subtype
   2236 
   2237             The primary subtype for multipart, "mixed", is intended  for
   2238             use  when  the body parts are independent and intended to be
   2239             displayed  serially.   Any  multipart   subtypes   that   an
   2240             implementation does not recognize should be treated as being
   2241             of subtype "mixed".
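
The fallback rule for unrecognized subtypes is simple enough to show in a few lines. In this sketch the function name is hypothetical, and the set of recognized subtypes is only an example of what one implementation might support.

```python
# Hypothetical set of subtypes this particular implementation recognizes.
KNOWN_MULTIPART_SUBTYPES = frozenset({"mixed", "alternative", "digest"})

def effective_multipart_subtype(subtype, known=KNOWN_MULTIPART_SUBTYPES):
    """Map an unrecognized multipart subtype to the primary subtype, "mixed"."""
    normalized = subtype.strip().lower()
    return normalized if normalized in known else "mixed"
```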
   2242 
   2243             7.2.3     The Multipart/alternative subtype
   2244 
   2245             The multipart/alternative type is syntactically identical to
   2246             multipart/mixed,   but  the  semantics  are  different.   In
   2247             particular, each of the parts is an "alternative" version of
   2248             the same information.  User agents should recognize that the
   2249             contents of the various parts are interchangeable.  The user
   2250             agent  should  either  choose  the  "best" type based on the
   2251             user's environment and preferences, or offer  the  user  the
   2252             available  alternatives.  In general, choosing the best type
   2253             means displaying only the LAST part that can  be  displayed.
   2254             This  may be used, for example, to send mail in a fancy text
   2255             format in such  a  way  that  it  can  easily  be  displayed
   2256             anywhere:
   2257 
   2258             From:  Nathaniel Borenstein <nsb@bellcore.com>
   2259             To: Ned Freed <ned@innosoft.com>
   2260             Subject: Formatted text mail
   2261             MIME-Version: 1.0
   2262             Content-Type: multipart/alternative; boundary=boundary42
   2263 
   2264 
   2265             --boundary42
   2266             Content-Type: text/plain; charset=us-ascii
   2267 
   2268             ...plain text version of message goes here....
   2282             --boundary42
   2283             Content-Type: text/richtext
   2284 
   2285             .... richtext version of same message goes here ...
   2286             --boundary42
   2287             Content-Type: text/x-whatever
   2288 
   2289             .... fanciest formatted version of same message goes here ...
   2291             --boundary42--
   2292 
   2293             In this example, users  whose  mail  system  understood  the
   2294             "text/x-whatever"  format  would see only the fancy version,
   2295             while other users would see only the richtext or plain  text
   2296             version, depending on the capabilities of their system.
   2297 
   2298             In general, user agents that  compose  multipart/alternative
   2299             entities  should place the body parts in increasing order of
   2300             preference, that is, with the  preferred  format  last.  For
   2301             fancy  text,  the sending user agent should put the plainest
   2302             format first and the richest format  last.   Receiving  user
   2303             agents  should  pick  and  display  the last format they are
   2304             capable of  displaying.   In  the  case  where  one  of  the
   2305             alternatives  is  itself  of  type  "multipart" and contains
   2306             unrecognized sub-parts, the user agent may choose either  to
   2307             show that alternative, an earlier alternative, or both.
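
                    The selection rule above can be sketched in Python with
                    the standard "email" package.  This is only an
                    illustration: the helper name choose_alternative, the
                    set of displayable types, and the body text are
                    assumptions made for this example, not part of this
                    specification.

```python
from email import message_from_string

# Illustrative message modelled on the example above; the body text is made up.
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=boundary42

--boundary42
Content-Type: text/plain; charset=us-ascii

plain text version
--boundary42
Content-Type: text/richtext

richtext version
--boundary42
Content-Type: text/x-whatever

fanciest version
--boundary42--
"""


def choose_alternative(message, displayable):
    """Return the LAST body part whose type is in `displayable`, or None."""
    chosen = None
    for part in message.get_payload():
        if part.get_content_type() in displayable:
            chosen = part  # later parts are preferred over earlier ones
    return chosen


message = message_from_string(RAW)
best = choose_alternative(message, {"text/plain", "text/richtext"})
print(best.get_content_type())  # prints text/richtext
```

                    A reader that understands none of the listed types gets
                    None back and could then fall back to offering the user
                    the available alternatives.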
   2308 
   2309             NOTE:  From an implementor's perspective, it might seem more
   2310             sensible  to  reverse  this  ordering, and have the plainest
   2311             alternative last.  However, placing the plainest alternative
   2312             first    is    the    friendliest   possible   option   when
   2313             multipart/alternative entities  are  viewed  using  a
   2314             non-MIME-compliant mail reader.  While this approach does
   2315             impose some burden on compliant mail readers, interoperability with
   2316             older  mail  readers was deemed to be more important in this
   2317             case.
   2318 
   2319             It may be the case  that  some  user  agents,  if  they  can
   2320             recognize more than one of the formats, will prefer to offer
   2321             the user the choice of which format  to  view.   This  makes
   2322             sense, for example, if mail includes both a nicely-formatted
   2323             image version and an easily-edited text  version.   What  is
   2324             most  critical,  however, is that the user not automatically
   2325             be shown multiple versions of the  same  data.   Either  the
   2326             user  should  be shown the last recognized version or should
   2327             explicitly be given the choice.
   2328 
   2347             7.2.4     The Multipart/digest subtype
   2348 
   2349             This document defines a "digest" subtype  of  the  multipart
   2350             Content-Type.   This  type  is  syntactically  identical  to
   2351             multipart/mixed,  but  the  semantics  are  different.    In
   2352             particular,  in a digest, the default Content-Type value for
   2353             a   body   part   is   changed    from    "text/plain"    to
   2354             "message/rfc822".   This  is  done  to allow a more readable
   2355             digest format that is largely  compatible  (except  for  the
   2356             quoting convention) with RFC 934.
   2357 
   2358             A digest in this format might,  then,  look  something  like
   2359             this:
   2360 
   2361             From: Moderator-Address
   2362             MIME-Version: 1.0
   2363             Subject:  Internet Digest, volume 42
   2364             Content-Type: multipart/digest;
   2365                  boundary="---- next message ----"
   2366 
   2367 
   2368             ------ next message ----
   2369 
   2370             From: someone-else
   2371             Subject: my opinion
   2372 
   2373             ...body goes here ...
   2374 
   2375             ------ next message ----
   2376 
   2377             From: someone-else-again
   2378             Subject: my different opinion
   2379 
   2380             ... another body goes here...
   2381 
   2382             ------ next message ------
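
                    As a rough check of the changed default, the digest
                    above can be fed to the parser in Python's standard
                    "email" package, which applies the "message/rfc822"
                    default to parts of a multipart/digest that carry no
                    Content-Type field.  The variable names are
                    illustrative only.

```python
from email import message_from_string

RAW = """\
From: Moderator-Address
MIME-Version: 1.0
Subject: Internet Digest, volume 42
Content-Type: multipart/digest;
     boundary="---- next message ----"

------ next message ----

From: someone-else
Subject: my opinion

...body goes here ...

------ next message ----

From: someone-else-again
Subject: my different opinion

... another body goes here...

------ next message ------
"""

digest = message_from_string(RAW)
for part in digest.get_payload():
    # The part has no Content-Type field, so the digest default applies
    # and its body is parsed as an encapsulated message.
    inner = part.get_payload(0)
    print(part.get_content_type(), "|", inner["Subject"])
```

                    Note that the blank line after each boundary is what
                    marks the (empty) header area of the body part.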
   2383 
   2384             7.2.5     The Multipart/parallel subtype
   2385 
   2386             This document defines a "parallel" subtype of the  multipart
   2387             Content-Type.   This  type  is  syntactically  identical  to
   2388             multipart/mixed,  but  the  semantics  are  different.    In
   2389             particular,  in  a  parallel  entity,  all  of the parts are
   2390             intended to be presented in parallel, i.e.,  simultaneously,
   2391             on  hardware  and  software  that  are  capable of doing so.
   2392             Composing agents should be aware that many mail readers will
   2393             lack this capability and will show the parts serially in any
   2394             event.
   2395 
   2412             7.3  The Message Content-Type
   2413 
   2414             It is frequently desirable, in sending mail, to  encapsulate
   2415             another  mail  message. For this common operation, a special
   2416             Content-Type, "message", is defined.  The  primary  subtype,
   2417             message/rfc822,  has  no required parameters in the Content-
   2418             Type field.  Additional subtypes, "partial"  and  "External-
   2419             body",  do  have  required  parameters.   These subtypes are
   2420             explained below.
   2421 
   2422             NOTE:  It has been suggested that subtypes of message  might
   2423             be  defined  for  forwarded  or rejected messages.  However,
   2424             forwarded and rejected messages can be handled as  multipart
   2425             messages  in  which  the  first part contains any control or
   2426             descriptive  information,  and  a  second  part,   of   type
   2427             message/rfc822,   is  the  forwarded  or  rejected  message.
   2428             Composing rejection and forwarding messages in  this  manner
   2429             will  preserve  the type information on the original message
   2430             and allow it to be correctly presented to the recipient, and
   2431             hence is strongly encouraged.
   2432 
   2433             As stated in the definition of the Content-Transfer-Encoding
   2434             field, no encoding other than "7bit", "8bit", or "binary" is
   2435             permitted for messages  or  parts  of  type  "message".  The
   2436             message  header  fields are always US-ASCII in any case, and
   2437             data within the body can still be encoded, in which case the
   2438             Content-Transfer-Encoding  header  field in the encapsulated
   2439             message will reflect this.  Non-ASCII text in the headers of
   2440             an   encapsulated   message   can  be  specified  using  the
   2441             mechanisms described in [RFC-1342].
   2442 
   2443             Mail gateways, relays, and other mail  handling  agents  are
   2444             commonly  known  to alter the top-level header of an RFC 822
   2445             message.   In particular, they frequently  add,  remove,  or
   2446             reorder  header  fields.   Such  alterations  are explicitly
   2447             forbidden for  the  encapsulated  headers  embedded  in  the
   2448             bodies of messages of type "message."
   2449 
   2450             7.3.1     The Message/rfc822 (primary) subtype
   2451 
   2452             A Content-Type of "message/rfc822" indicates that  the  body
   2453             contains  an encapsulated message, with the syntax of an RFC
   2454             822 message.
   2455 
   2456             7.3.2     The Message/Partial subtype
   2457 
   2458             A subtype of message, "partial",  is  defined  in  order  to
   2459             allow  large  objects  to  be  delivered as several separate
   2460             pieces  of  mail  and  automatically  reassembled   by   the
   2461             receiving  user  agent.   (The  concept  is  similar  to  IP
   2462             fragmentation/reassembly in the basic  Internet  Protocols.)
   2463             This  mechanism  can  be  used  when  intermediate transport
   2464             agents limit the size of individual  messages  that  can  be
   2465             sent.   Content-Type  "message/partial"  thus indicates that
   2477             the body contains a fragment of a larger message.
   2478 
   2479             Three parameters must be specified in the Content-Type field
   2480             of  type  message/partial:  The  first,  "id",  is  a unique
   2481             identifier,  as  close  to  a  world-unique  identifier   as
   2482             possible,  to  be  used  to  match  the parts together.  (In
   2483             general, the identifier  is  essentially  a  message-id;  if
   2484             placed  in  double  quotes,  it  can  be  any message-id, in
   2485             accordance with the BNF for  "parameter"  given  earlier  in
   2486             this  specification.)   The second, "number", an integer, is
   2487             the part number, which indicates where this part  fits  into
   2488             the  sequence  of  fragments.   The  third, "total", another
   2489             integer, is the total number of parts. This  third  subfield
   2490             is  required  on  the  final  part,  and  is optional on the
   2491             earlier parts. Note also that these parameters may be  given
   2492             in any order.
   2493 
   2494             Thus, part 2 of a 3-part message  may  have  either  of  the
   2495             following header fields:
   2496 
   2497                  Content-Type: Message/Partial;
   2498                       number=2; total=3;
   2499                       id="oc=jpbe0M2Yt4s@thumper.bellcore.com"
   2500 
   2501                  Content-Type: Message/Partial;
   2502                       id="oc=jpbe0M2Yt4s@thumper.bellcore.com";
   2503                       number=2
   2504 
   2505             But part 3 MUST specify the total number of parts:
   2506 
   2507                  Content-Type: Message/Partial;
   2508                       number=3; total=3;
   2509                       id="oc=jpbe0M2Yt4s@thumper.bellcore.com"
   2510 
   2511             Note that part numbering begins with 1, not 0.
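
                    A minimal sketch of reading and checking these three
                    parameters, using Python's standard "email" package.
                    The function name partial_params and its error
                    messages are assumptions made for this example.

```python
from email.message import Message


def partial_params(content_type_value):
    """Return (id, number, total) from a message/partial Content-Type value.

    `total` is None when the parameter is absent, which is allowed on
    every part except the last.
    """
    holder = Message()
    holder["Content-Type"] = content_type_value
    if holder.get_content_type() != "message/partial":
        raise ValueError("not a message/partial entity")
    part_id = holder.get_param("id")
    number = holder.get_param("number")
    total = holder.get_param("total")
    if part_id is None or number is None:
        raise ValueError("id and number are required")
    number = int(number)
    total = int(total) if total is not None else None
    if number < 1 or (total is not None and number > total):
        raise ValueError("part number out of range")  # numbering starts at 1
    return part_id, number, total


value = 'Message/Partial; number=2; total=3; id="oc=jpbe0M2Yt4s@thumper.bellcore.com"'
print(partial_params(value))
```

                    Because parameters are looked up by name, the order in
                    which they appear in the field does not matter.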
   2512 
   2513             When the parts of a message broken up in this manner are put
   2514             together,  the  result is a complete RFC 822 format message,
   2515             which may have its own Content-Type header field,  and  thus
   2516             may contain any other data type.
   2517 
   2518             Message fragmentation and reassembly:  The  semantics  of  a
   2519             reassembled  partial  message  must  be those of the "inner"
   2520             message, rather than  of  a  message  containing  the  inner
   2521             message.   This  makes  it  possible, for example, to send a
   2522             large audio message as several partial messages,  and  still
   2523             have  it  appear  to the recipient as a simple audio message
   2524             rather than as an encapsulated message containing  an  audio
   2525             message.   That  is,  the  encapsulation  of  the message is
   2526             considered to be "transparent".
   2527 
   2528             When  generating   and   reassembling   the   parts   of   a
   2529             message/partial  message,  the  headers  of the encapsulated
   2530             message must be merged with the  headers  of  the  enclosing
   2542             entities.  In  this  process  the  following  rules  must be
   2543             observed:
   2544 
   2545                  (1) All of the headers from the initial  enclosing
   2546                  entity  (part  one),  except those that start with
   2547                  "Content-" and "Message-ID", must  be  copied,  in
   2548                  order, to the new message.
   2549 
   2550                  (2) Only those headers  in  the  enclosed  message
   2551                  which  start with "Content-" and "Message-ID" must
   2552                  be appended, in order, to the headers of  the  new
   2553                  message.   Any  headers  in  the  enclosed message
   2554                  which do not start  with  "Content-"  (except  for
   2555                  "Message-ID") will be ignored.
   2556 
   2557                  (3) All of the headers from  the  second  and  any
   2558                  subsequent messages will be ignored.
   2559 
   2560             For example, if an audio message is broken into  two  parts,
   2561             the first part might look something like this:
   2562 
   2563                  X-Weird-Header-1: Foo
   2564                  From: Bill@host.com
   2565                  To: joe@otherhost.com
   2566                  Subject: Audio mail
   2567                  Message-ID: id1@host.com
   2568                  MIME-Version: 1.0
   2569                  Content-type: message/partial;
   2570                       id="ABC@host.com";
   2571                       number=1; total=2
   2572 
   2573                  X-Weird-Header-1: Bar
   2574                  X-Weird-Header-2: Hello
   2575                  Message-ID: anotherid@foo.com
   2576                  Content-type: audio/basic
   2577                  Content-transfer-encoding: base64
   2578 
   2579                  ... first half of encoded audio data goes here...
   2580 
   2581             and the second half might look something like this:
   2582 
   2583                  From: Bill@host.com
   2584                  To: joe@otherhost.com
   2585                  Subject: Audio mail
   2586                  MIME-Version: 1.0
   2587                  Message-ID: id2@host.com
   2588                  Content-type: message/partial;
   2589                       id="ABC@host.com"; number=2; total=2
   2590 
   2591                  ... second half of encoded audio data goes here...
   2592 
   2593             Then,  when  the  fragmented  message  is  reassembled,  the
   2594             resulting  message  to  be displayed to the user should look
   2595             something like this:
   2596 
   2607                  X-Weird-Header-1: Foo
   2608                  From: Bill@host.com
   2609                  To: joe@otherhost.com
   2610                  Subject: Audio mail
   2611                  Message-ID: anotherid@foo.com
   2612                  MIME-Version: 1.0
   2613                  Content-type: audio/basic
   2614                  Content-transfer-encoding: base64
   2615 
   2616                  ... first half of encoded audio data goes here...
   2617                  ... second half of encoded audio data goes here...
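
                    The three merging rules can be sketched as follows.
                    This is an illustration under simplifying assumptions:
                    lines end in a bare LF, the helper names are invented
                    for this example, and the stand-in bodies "AAAA" and
                    "BBBB" replace the encoded audio data.  The merged
                    header fields come out in a slightly different order
                    than in the example above; RFC 822 does not attach
                    meaning to that order.

```python
from email.message import Message


def split_message(raw):
    """Split raw text into ([(name, value), ...], body) at the first blank line."""
    head, _, body = raw.partition("\n\n")
    headers = []
    for line in head.split("\n"):
        if line[:1] in (" ", "\t") and headers:
            name, value = headers[-1]  # continuation line: unfold it
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            headers.append((name.strip(), value.strip()))
    return headers, body


def part_number(headers):
    """Read the "number" parameter of the fragment's Content-Type field."""
    holder = Message()
    for name, value in headers:
        if name.lower() == "content-type":
            holder["Content-Type"] = value
    return int(holder.get_param("number"))


def is_content_or_id(name):
    lowered = name.lower()
    return lowered.startswith("content-") or lowered == "message-id"


def reassemble(fragments):
    """Merge message/partial fragments into (headers, body)."""
    parsed = sorted((split_message(raw) for raw in fragments),
                    key=lambda item: part_number(item[0]))
    outer_headers, first_body = parsed[0]
    inner_headers, inner_body = split_message(first_body)
    merged = [h for h in outer_headers if not is_content_or_id(h[0])]  # rule 1
    merged += [h for h in inner_headers if is_content_or_id(h[0])]     # rule 2
    # Rule 3: later fragments contribute only their bodies.
    body = inner_body + "".join(rest for _, rest in parsed[1:])
    return merged, body


PART_1 = (
    "X-Weird-Header-1: Foo\nFrom: Bill@host.com\nTo: joe@otherhost.com\n"
    "Subject: Audio mail\nMessage-ID: id1@host.com\nMIME-Version: 1.0\n"
    'Content-type: message/partial;\n     id="ABC@host.com";\n'
    "     number=1; total=2\n\n"
    "X-Weird-Header-1: Bar\nX-Weird-Header-2: Hello\n"
    "Message-ID: anotherid@foo.com\nContent-type: audio/basic\n"
    "Content-transfer-encoding: base64\n\nAAAA\n"
)
PART_2 = (
    "From: Bill@host.com\nTo: joe@otherhost.com\nSubject: Audio mail\n"
    "MIME-Version: 1.0\nMessage-ID: id2@host.com\n"
    'Content-type: message/partial;\n     id="ABC@host.com"; number=2; total=2\n\n'
    "BBBB\n"
)

headers, body = reassemble([PART_2, PART_1])  # arrival order need not matter
for name, value in headers:
    print(name + ": " + value)
print()
print(body, end="")
```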
   2618 
   2619             It should be  noted  that,  because  some  message  transfer
   2620             agents  may choose to automatically fragment large messages,
   2621             and because such  agents  may  use  different  fragmentation
   2622             thresholds,  it  is  possible  that  the pieces of a partial
   2623             message, upon reassembly, may prove themselves to comprise a
   2624             partial message.  This is explicitly permitted.
   2625 
   2626             It should also be noted that the inclusion of a "References"
   2627             field  in the headers of the second and subsequent pieces of
   2628             a fragmented message that references the Message-Id  on  the
   2629             previous  piece  may  be  of  benefit  to  mail readers that
   2630             understand and track references. However, the generation  of
   2631             such "References" fields is entirely optional.
   2632 
   2633             7.3.3     The Message/External-Body subtype
   2634 
   2635             The external-body subtype indicates  that  the  actual  body
   2636             data are not included, but merely referenced.  In this case,
   2637             the  parameters  describe  a  mechanism  for  accessing  the
   2638             external data.
   2639 
   2640             When  a   message   body   or   body   part   is   of   type
   2641             "message/external-body",   it  consists  of  a  header,  two
   2642             consecutive  CRLFs,  and  the   message   header   for   the
   2643             encapsulated  message.  If another pair of consecutive CRLFs
   2644             appears, this of course ends  the  message  header  for  the
   2645             encapsulated   message.   However,  since  the  encapsulated
   2646             message's body is itself external, it does NOT appear in the
   2647             area  that  follows.   For  example,  consider the following
   2648             message:
   2649 
   2650                  Content-type: message/external-body;
   2651                       access-type=local-file;
   2652                       name="/u/nsb/Me.gif"
   2653 
   2654                  Content-type:  image/gif
   2655 
   2656                  THIS IS NOT REALLY THE BODY!
   2657 
   2658             The area at the end, which  might  be  called  the  "phantom
   2659             body", is ignored for most external-body messages.  However,
   2660             it may be used to contain auxiliary  information  for  some
   2672             such  messages,  as  indeed  it  is  when the access-type is
   2673             "mail-server".   Of  the  access-types   defined   by   this
   2674             document, the phantom body is used only when the access-type
   2675             is "mail-server".  In all other cases, the phantom  body  is
   2676             ignored.
   2677 
   2678             The only always-mandatory  parameter  for  message/external-
   2679             body  is  "access-type";  all of the other parameters may be
   2680             mandatory or optional depending on the value of access-type.
   2681 
   2682                  ACCESS-TYPE -- One or more case-insensitive words,
   2683                  comma-separated,   indicating   supported   access
   2684                  mechanisms by  which  the  file  or  data  may  be
   2685                  obtained.  Values include, but are not limited to,
   2686                  "FTP", "ANON-FTP",  "TFTP",  "AFS",  "LOCAL-FILE",
   2687                  and   "MAIL-SERVER".  Future  values,  except  for
   2688                  experimental values beginning with "X-",  must  be
   2689                  registered with IANA, as described in Appendix F.
   2690 
   2691             In addition, the following three parameters are optional for
   2692             ALL access-types:
   2693 
   2694                  EXPIRATION -- The date (in the RFC 822 "date-time"
   2695                  syntax, as extended by RFC 1123 to permit 4 digits
   2696                  in the date field) after which  the  existence  of
   2697                  the external data is not guaranteed.
   2698 
   2699                  SIZE -- The size (in octets)  of  the  data.   The
   2700                  intent  of this parameter is to help the recipient
   2701                  decide whether or  not  to  expend  the  necessary
   2702                  resources to retrieve the external data.
   2703 
   2704                  PERMISSION -- A field that  indicates  whether  or
   2705                  not it is expected that clients might also attempt
   2706                  to  overwrite  the  data.   By  default,   or   if
   2707                  permission  is "read", the assumption is that they
   2708                  are not, and that if the data is  retrieved  once,
   2709                  it  is never needed again. If PERMISSION is "read-
   2710                  write", this assumption is invalid, and any  local
   2711                  copy  must  be  considered  no  more than a cache.
   2712                  "Read"  and  "Read-write"  are  the  only  defined
   2713                  values of permission.
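
                    One way a receiving agent might use EXPIRATION and
                    SIZE is sketched below.  The policy itself is an
                    assumption made for this example: this document says
                    only that the data's existence is not guaranteed after
                    the expiration date, and leaves the size decision to
                    the recipient.  The function name and the max_octets
                    limit are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def worth_fetching(expiration, size, now, max_octets):
    """Hypothetical policy: skip data that has expired or exceeds a size limit.

    `expiration` and `size` are the raw parameter values, or None if absent.
    `now` must be a timezone-aware datetime.
    """
    if expiration is not None and now > parsedate_to_datetime(expiration):
        return False
    if size is not None and int(size) > max_octets:
        return False
    return True


expiration = "Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
before = datetime(1991, 6, 14, 23, 0, 0, tzinfo=timezone.utc)
after = datetime(1991, 6, 15, 0, 0, 0, tzinfo=timezone.utc)
print(worth_fetching(expiration, "1000", before, 5000))
print(worth_fetching(expiration, "1000", after, 5000))
```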
   2714 
   2715             The precise semantics of the access-types defined  here  are
   2716             described in the sections that follow.
   2717 
   2718             7.3.3.1  The "ftp" and "tftp" access-types
   2719 
   2720             An access-type of FTP or TFTP  indicates  that  the  message
   2721             body is accessible as a file using the FTP [RFC-959] or TFTP
   2722             [RFC-783] protocols, respectively.  For these  access-types,
   2723             the following additional parameters are mandatory:
   2724 
   2737                  NAME -- The name of the  file  that  contains  the
   2738                  actual body data.
   2739 
   2740                  SITE -- A machine  from  which  the  file  may  be
   2741                  obtained, using the given protocol.
   2742 
   2743             Before the data is retrieved,  using  these  protocols,  the
   2744             user  will  generally need to be asked to provide a login id
   2745             and a password for the machine named by the site parameter.
   2746 
   2747             In addition, the  following  optional  parameters  may  also
   2748             appear when the access-type is FTP or ANON-FTP:
   2749 
   2750                  DIRECTORY -- A directory from which the data named
   2751                  by NAME should be retrieved.
   2752 
   2753                  MODE  --  A  transfer  mode  for  retrieving   the
   2754                  information, e.g. "image".
   2755 
   2756             7.3.3.2  The "anon-ftp" access-type
   2757 
   2758             The "anon-ftp" access-type is identical  to  the  "ftp"
   2759             access-type, except that the user need not be asked to provide a
   2760             name and password for the specified site.  Instead, the  ftp
   2761             protocol  will be used with login "anonymous" and a password
   2762             that corresponds to the user's email address.
   2763 
   2764             7.3.3.3  The "local-file" and "afs" access-types
   2765 
   2766             An access-type of "local-file"  indicates  that  the  actual
   2767             body  is  accessible  as  a  file  on the local machine.  An
   2768             access-type of "afs" indicates that the file  is  accessible
   2769             via  the  global  AFS  file  system.   In both cases, only a
   2770             single parameter is required:
   2771 
   2772                  NAME -- The name of the  file  that  contains  the
   2773                  actual body data.
   2774 
   2775             The following optional parameter may be used to describe the
   2776             locality  of  reference  for  the data, that is, the site or
   2777             sites at which the file is expected to be visible:
   2778 
   2779                  SITE -- A domain specifier for a machine or set of
   2780                  machines that are known to have access to the data
   2781                  file.  Asterisks may be used for wildcard matching
   2782                  to   a   part   of   a   domain   name,   such  as
   2783                  "*.bellcore.com", to indicate a set of machines on
   2784                  which the data should be directly visible, while a
   2785                  single asterisk may be used  to  indicate  a  file
   2786                  that  is  expected  to  be  universally available,
   2787                  e.g., via a global file system.
   2788 
   2789             7.3.3.4  The "mail-server" access-type
   2790 
   2802             The "mail-server" access-type indicates that the actual body
   2803             is  available  from  a mail server.  The mandatory parameter
   2804             for this access-type is:
   2805 
   2806                  SERVER -- The email address  of  the  mail  server
   2807                  from which the actual body data can be obtained.
   2808 
   2809             Because mail servers accept a variety of syntaxes, some of
   2810             which are multiline, the full command to be sent to a mail
   2811             server is not included as a parameter  on  the  content-type
   2812             line.   Instead,  it  may  be provided as the "phantom body"
   2813             when  the  content-type  is  message/external-body  and  the
   2814             access-type is mail-server.
   2815 
   2816             Note that  MIME  does  not  define  a  mail  server  syntax.
   2817             Rather,  it  allows  the  inclusion of arbitrary mail server
   2818             commands  in  the  phantom  body.   Implementations   should
   2819             include the phantom body in the body of the message they send
   2820             to the mail server address to retrieve the relevant data.
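
                    The mandatory parameters listed for each access-type
                    in sections 7.3.3.1 through 7.3.3.4 can be collected
                    into a small lookup table.  The sketch below is an
                    illustration only: the names REQUIRED and
                    missing_parameters are invented for this example, and
                    it handles a single access-type value rather than a
                    comma-separated list.

```python
from email.message import Message

# Mandatory parameters per access-type, as listed in 7.3.3.1 to 7.3.3.4.
REQUIRED = {
    "ftp": ("name", "site"),
    "tftp": ("name", "site"),
    "anon-ftp": ("name", "site"),
    "local-file": ("name",),
    "afs": ("name",),
    "mail-server": ("server",),
}


def missing_parameters(content_type_value):
    """Return the mandatory parameters absent from an external-body field."""
    holder = Message()
    holder["Content-Type"] = content_type_value
    access_type = holder.get_param("access-type")
    if access_type is None:
        return ["access-type"]
    required = REQUIRED.get(access_type.lower())  # values are case-insensitive
    if required is None:
        return []  # unknown or experimental "X-" type: nothing to check here
    return [name for name in required if holder.get_param(name) is None]


print(missing_parameters(
    'message/external-body; access-type=ANON-FTP; '
    'name="BodyFormats.ps"; site="thumper.bellcore.com"'))
print(missing_parameters("message/external-body; access-type=mail-server"))
```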
   2821 
   2867             7.3.3.5  Examples and Further Explanations
   2868 
   2869             With  the  emerging  possibility  of  very  wide-area   file
   2870             systems,  it becomes very hard to know in advance the set of
   2871             machines where a  file  will  and  will  not  be  accessible
   2872             directly  from the file system.  Therefore it may make sense
   2873             to provide both a file name, to be tried directly,  and  the
   2874             name of one or more sites from which the file is known to be
   2875             accessible.  An implementation can try  to  retrieve  remote
   2876             files  using FTP or any other protocol, using anonymous file
   2877             retrieval or prompting the user for the necessary  name  and
   2878             password.   If  an  external body is accessible via multiple
   2879             mechanisms, the sender may include multiple  parts  of  type
   2880             message/external-body    within    an    entity    of   type
   2881             multipart/alternative.
   2882 
   2883             However, the external-body mechanism is not intended  to  be
   2884             limited  to  file  retrieval,  as  shown  by the mail-server
   2885             access-type.  Beyond this, one  can  imagine,  for  example,
   2886             using a video server for external references to video clips.
   2887 
   2888             If an entity is of type  "message/external-body",  then  the
   2889             body  of  the  entity  will contain the header fields of the
   2890             encapsulated message.  The body itself is to be found in the
   2891             external  location.   This  means  that  if  the body of the
   2892             "message/external-body"  message  contains  two  consecutive
   2893             CRLFs,  everything  after  that  pair  is  NOT  part  of the
   2894             message itself.  For  most  message/external-body  messages,
   2895             this trailing area must simply be ignored.  However, it is a
   2896             convenient place for additional data that cannot be included
   2897             in  the  content-type  header field.   In particular, if the
   2898             "access-type" value is "mail-server", then the trailing area
   2899             must  contain  commands to be sent to the mail server at the
   2900             address given by the value of the SERVER parameter, as
   2901             described in section 7.3.3.4.
   2902 
   2903             The embedded message header fields which appear in the  body
   2904             of the message/external-body data can be used to declare the
   2905             Content-type  of  the  external  body.   Thus   a   complete
   2906             message/external-body  message,  referring  to a document in
   2907             PostScript format, might look like this:
   2908 
   2909                  From: Whomever
   2910                  Subject: whatever
   2911                  MIME-Version: 1.0
   2912                  Message-ID: id1@host.com
   2913                  Content-Type: multipart/alternative; boundary=42
   2914 
   2915 
   2916                  --42
   2917                  Content-Type: message/external-body;
   2918                       name="BodyFormats.ps";
   2932                       site="thumper.bellcore.com";
   2933                       access-type=ANON-FTP;
   2934                       directory="pub";
   2935                       mode="image";
   2936                       expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
   2937 
   2938                  Content-type: application/postscript
   2939 
   2940                  --42
   2941                  Content-Type: message/external-body;
   2942                       name="/u/nsb/writing/rfcs/RFC-XXXX.ps";
   2943                       site="thumper.bellcore.com";
   2944                       access-type=AFS;
   2945                       expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
   2946 
   2947                  Content-type: application/postscript
   2948 
   2949                  --42
   2950                  Content-Type: message/external-body;
   2951                       access-type=mail-server;
   2952                       server="listserv@bogus.bitnet";
   2953                       expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
   2954 
   2955                  Content-type: application/postscript
   2956 
   2957                  get rfc-xxxx doc
   2958 
   2959                  --42--
   2960 
   2961             Like the  message/partial  type,  the  message/external-body
   2962             type  is  intended to be transparent, that is, to convey the
   2963             data type in the external  body  rather  than  to  convey  a
   2964             message  with  a body of that type.  Thus the headers on the
   2965             outer and inner parts must be merged using the same rules as
   2966             for  message/partial.   In  particular,  this means that the
   2967             Content-type header is overridden, but the From and  Subject
   2968             headers are preserved.
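
As an illustration of the merging rule described above, the following Python sketch parses a message/external-body message with the standard `email` package and builds the effective header set. The helper name `merge_external_body_headers` is hypothetical, and the rule it applies (inner "Content-" fields replace the outer ones, all other outer fields are kept) is a simplifying assumption for this example, not the complete message/partial procedure.

```python
import email
from email.message import Message

# A single-part variant of the example above (ANON-FTP access only).
RAW = (
    "From: Whomever\n"
    "Subject: whatever\n"
    "MIME-Version: 1.0\n"
    "Content-Type: message/external-body;\n"
    '     name="BodyFormats.ps";\n'
    '     site="thumper.bellcore.com";\n'
    "     access-type=ANON-FTP;\n"
    '     directory="pub";\n'
    '     mode="image"\n'
    "\n"
    "Content-type: application/postscript\n"
    "\n"
)


def merge_external_body_headers(outer: Message) -> list[tuple[str, str]]:
    """Return outer headers with Content-* fields taken from the inner part.

    Simplified assumption: only fields whose names start with "Content-"
    are overridden; everything else (From, Subject, ...) is preserved.
    """
    inner = outer.get_payload(0)  # the embedded header block, parsed as a message
    merged = [(name, value) for name, value in outer.items()
              if not name.lower().startswith("content-")]
    merged += [(name, value) for name, value in inner.items()
               if name.lower().startswith("content-")]
    return merged


outer = email.message_from_string(RAW)
access_type = outer.get_param("access-type")   # parameter of the outer Content-Type
effective = dict(merge_external_body_headers(outer))
```

Here `effective` keeps the outer From and Subject fields, while its content type is the one declared for the external body.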
   2969 
   2970             Note that since the external bodies are not  transported  as
   2971             mail,  they  need  not  conform to the 7-bit and line length
   2972             requirements, but might in fact be  binary  files.   Thus  a
   2973             Content-Transfer-Encoding is not generally necessary, though
   2974             it is permitted.
   2975 
   2976             Note that the body of a message of  type  "message/external-
   2977             body"  is  governed  by  the  basic  syntax  for  an RFC 822
   2978             message.   In  particular,   anything   before   the   first
   2979             consecutive  pair  of  CRLFs  is  header  information, while
   2980             anything after it is body information, which is ignored  for
   2981             most access-types.
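
The header/body split described above can be sketched in a few lines of Python. The function name `split_external_body` and the sample text are assumptions made for illustration; the sketch also assumes that at least one header field is present, and it normalizes CRLF to LF before splitting.

```python
def split_external_body(body: str) -> tuple[str, str]:
    """Split the body of a message/external-body part at the first blank line.

    Everything before the first pair of consecutive line breaks is header
    information; everything after it is body information.
    """
    normalized = body.replace("\r\n", "\n")
    header_text, _, trailing_body = normalized.partition("\n\n")
    return header_text, trailing_body


# Sample taken from the mail-server part of the example message.
BODY = "Content-type: application/postscript\r\n\r\nget rfc-xxxx doc\r\n"
headers, trailing = split_external_body(BODY)
```

An implementation would then parse `headers` as ordinary RFC 822 header fields and decide, based on the access-type, whether `trailing` is used or ignored.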
   2982 
   2997             7.4  The Application Content-Type
   2998 
   2999             The "application" Content-Type is to be used for data  which
   3000             do  not fit in any of the other categories, and particularly
   3001             for data to be processed by mail-based uses  of  application
   3002             programs.  This is information which must be processed by an
   3003             application before it is  viewable  or  usable  to  a  user.
   3004             Expected  uses  for  Content-Type  application include mail-
   3005             based  file  transfer,  spreadsheets,  data  for  mail-based
   3006             scheduling    systems,    and    languages    for   "active"
   3007             (computational) email.  (The latter, in particular, can pose
   3008             security    problems   which   should   be   understood   by
   3009             implementors, and are considered in detail in the discussion
   3010             of the application/PostScript content-type.)
   3011 
   3012             For example, a meeting scheduler  might  define  a  standard
   3013             representation for information about proposed meeting dates.
   3014             An intelligent user agent  would  use  this  information  to
   3015             conduct  a dialog with the user, and might then send further
   3016             mail based on that dialog. More generally, there  have  been
   3017             several  "active"  messaging  languages  developed  in which
   3018             programs in a suitably specialized language are sent through
   3019             the   mail   and   automatically   run  in  the  recipient's
   3020             environment.
   3021 
   3022             Such  applications  may  be  defined  as  subtypes  of   the
   3023             "application"  Content-Type.   This  document  defines three
   3024             subtypes: octet-stream, ODA, and PostScript.
   3025 
   3026             In general, the subtype of application  will  often  be  the
   3027             name  of  the  application  for which the data are intended.
   3028             This does not mean, however, that  any  application  program
   3029             name  may  be used freely as a subtype of application.  Such
   3030             usages  must  be  registered  with  IANA,  as  described  in
   3031             Appendix F.
   3032 
   3033             7.4.1     The Application/Octet-Stream (primary) subtype
   3034 
   3035             The primary subtype of application, "octet-stream",  may  be
   3036             used  to indicate that a body contains binary data.  The set
   3037             of possible parameters includes, but is not limited to:
   3038 
   3039                  NAME -- a suggested name for the  binary  data  if
   3040                  stored as a file.
   3041 
   3042                  TYPE -- the general type  or  category  of  binary
   3043                  data.   This  is  intended  as information for the
   3044                  human recipient  rather  than  for  any  automatic
   3045                  processing.
   3046 
   3047                  CONVERSIONS -- the set  of  operations  that  have
   3048                  been  performed  on  the data before putting it in
   3049                  the mail (and before any Content-Transfer-Encoding
   3050                  that   might   have  been  applied).  If  multiple
   3062                  conversions have occurred, they must be  separated
   3063                  by  commas  and  specified  in the order they were
   3064                  applied -- that is, the leftmost conversion   must
   3065                  have  occurred  first,  and conversions are undone
   3066                  from right  to  left.   Note  that  NO  conversion
   3067                  values   are   defined   by  this  document.   Any
   3068                  conversion values that do  not  begin  with  "X-"
   3069                  must  be preceded by a published specification and
   3070                  by  registration  with  IANA,  as   described   in
   3071                  Appendix F.
   3072 
   3073                  PADDING -- the number of bits of padding that were
   3074                  appended  to  the  bitstream comprising the actual
   3075                  contents to  produce  the  enclosed  byte-oriented
   3076                  data.  This is useful for enclosing a bitstream in
   3077                  a body when the total number  of  bits  is  not  a
   3078                  multiple of the byte size.
   3079 
   3080             The values  for  these  attributes  are  left  undefined  at
   3081             present,  but  may  require specification in the future.  An
   3082             example of a common (though UNIX-specific) usage might be:
   3083 
   3084                  Content-Type:  application/octet-stream;
   3085                       name=foo.tar.Z; type=tar;
   3086                       conversions="x-encrypt,x-compress"
   3087 
   3088             However, it should be noted that the use of such conversions
   3089             is  explicitly  discouraged due to a lack of portability and
   3090             standardization.   The  use  of  uuencode  is   particularly
   3091             discouraged,   in  favor  of  the  Content-Transfer-Encoding
   3092             mechanism, which is both more standardized and more portable
   3093             across mail boundaries.
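
To make the ordering rule for CONVERSIONS concrete, the following Python sketch reads the parameter from the example header and derives the order in which the conversions would have to be undone. The helper name `conversions_applied` is hypothetical, and the sketch only computes the order; it does not implement any conversion, since none are defined by this document.

```python
from email.message import Message

part = Message()
part["Content-Type"] = (
    "application/octet-stream; name=foo.tar.Z; type=tar; "
    'conversions="x-encrypt,x-compress"'
)


def conversions_applied(part: Message) -> list[str]:
    """Return the conversions in the order they were applied (left to right)."""
    raw = part.get_param("conversions", "")
    return [item.strip() for item in raw.split(",") if item.strip()]


applied = conversions_applied(part)
# Conversions are undone from right to left, i.e. in reverse order.
undo_order = list(reversed(applied))
```

For the example header, the data were encrypted first and compressed second, so a receiver would uncompress first and decrypt second.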
   3094 
   3095             The recommended action for an implementation  that  receives
   3096             application/octet-stream  mail is to simply offer to put the
   3097             data in a file, with any  Content-Transfer-Encoding  undone,
   3098             or perhaps to use it as input to a user-specified process.
   3099 
   3100             To reduce the danger of transmitting rogue programs  through
   3101             the  mail,  it  is strongly recommended that implementations
   3102             NOT implement a path-search mechanism whereby  an  arbitrary
   3103             program  named  in  the  Content-Type  parameter  (e.g.,  an
   3104             "interpreter=" parameter) is found and  executed  using  the
   3105             mail body as input.
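
A minimal Python sketch of the recommended action follows: undo the Content-Transfer-Encoding and write the data to a file, without ever executing anything named in the header. The function name `save_octet_stream`, the fallback file name, and the choice to discard any directory components from the suggested NAME are assumptions of this example. The sketch is meant to be called only after the user has accepted the offer to save the data.

```python
import base64
import os
import tempfile
from email.message import Message
from email.utils import collapse_rfc2231_value

FALLBACK_NAME = "attachment.bin"  # hypothetical default when NAME is unusable


def save_octet_stream(part: Message, directory: str) -> str:
    """Write the decoded body of an application/octet-stream part to a file."""
    suggested = collapse_rfc2231_value(part.get_param("name", FALLBACK_NAME))
    # Treat NAME as a suggestion only: keep the last path component.
    safe_name = os.path.basename(suggested.replace("\\", "/"))
    if safe_name in ("", ".", ".."):
        safe_name = FALLBACK_NAME
    path = os.path.join(directory, safe_name)
    data = part.get_payload(decode=True) or b""  # undoes base64/quoted-printable
    with open(path, "xb") as handle:             # "x": refuse to overwrite a file
        handle.write(data)
    return path


part = Message()
part["Content-Type"] = 'application/octet-stream; name="../../etc/passwd"'
part["Content-Transfer-Encoding"] = "base64"
part.set_payload(base64.b64encode(b"\x00\x01binary").decode("ascii"))
saved_path = save_octet_stream(part, tempfile.mkdtemp())
```

The data are only stored; no parameter of the Content-Type field is ever interpreted as a program to run.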
   3106 
   3107             7.4.2     The Application/PostScript subtype
   3108 
   3109             A  Content-Type  of  "application/postscript"  indicates   a
   3110             PostScript    program.    The   language   is   defined   in
   3111             [POSTSCRIPT].  It is recommended  that  PostScript  sent
   3112             through  email  use the PostScript document structuring
   3113             conventions, and use them correctly, if at all possible.
   3114 
   3127             The execution  of  general-purpose  PostScript  interpreters
   3128             entails   serious   security  risks,  and  implementors  are
   3129             discouraged from simply sending PostScript email  bodies  to
   3130             "off-the-shelf"  interpreters.   While it is usually safe to
   3131             send PostScript to a printer, where the potential  for  harm
   3132             is  greatly constrained, implementors should consider all of
   3133             the  following  before  they  add  interactive  display   of
   3134             PostScript bodies to their mail readers.
   3135 
   3136             The remainder of this section outlines some, though probably
   3137             not  all,  of  the possible problems with sending PostScript
   3138             through the mail.
   3139 
   3140             Dangerous operations in the PostScript language include, but
   3141             may  not be limited to, the PostScript operators deletefile,
   3142             renamefile,  filenameforall,  and  file.    File   is   only
   3143             dangerous  when  applied  to  something  other than standard
   3144             input or output. Implementations may also define  additional
   3145             nonstandard  file operators; these may also pose a threat to
   3146             security.     Filenameforall,  the  wildcard   file   search
   3147             operator,  may  appear at first glance to be harmless. Note,
   3148             however, that this operator  has  the  potential  to  reveal
   3149             information  about  what  files the recipient has access to,
   3150             and this  information  may  itself  be  sensitive.   Message
   3151             senders  should  avoid the use of potentially dangerous file
   3152             operators, since these operators  are  quite  likely  to  be
   3153             unavailable  in secure PostScript implementations.  Message-
   3154             receiving and -displaying software should either  completely
   3155             disable  all  potentially  dangerous  file operators or take
   3156             special care not to delegate any special authority to  their
   3157             operation. These operators should be viewed as being done by
   3158             an outside agency when  interpreting  PostScript  documents.
   3159             Such  disabling  and/or  checking  should be done completely
   3160             outside of the reach of the PostScript language itself; care
   3161             should  be  taken  to  ensure  that  no  method  exists  for
   3162             reenabling full-function versions of these operators.
   3163 
   3164             The PostScript language provides facilities for exiting  the
   3165             normal  interpreter,  or  server, loop. Changes made in this
   3166             "outer"  environment   are   customarily   retained   across
   3167             documents, and may in some cases be retained semipermanently
   3168             in nonvolatile memory. The operators associated with exiting
   3169             the  interpreter  loop  have the potential to interfere with
   3170             subsequent document processing. As such, their  unrestrained
   3171             use  constitutes  a  threat  of  service denial.  PostScript
   3172             operators that exit the interpreter loop  include,  but  may
   3173             not  be  limited  to, the exitserver and startjob operators.
   3174             Message-sending software should not generate PostScript that
   3175             depends  on  exiting  the  interpreter  loop to operate. The
   3176             ability to exit  will  probably  be  unavailable  in  secure
   3177             PostScript     implementations.     Message-receiving    and
   3178             -displaying  software  should,  if  possible,  disable   the
   3179             ability   to   make   retained  changes  to  the  PostScript
   3180             environment. Eliminate the startjob and exitserver commands.
   3192             If  these  commands  cannot  be eliminated, at least set the
   3193             password associated with them to a hard-to-guess value.
   3194 
   3195             PostScript provides operators for  setting  system-wide  and
   3196             device-specific  parameters. These parameter settings may be
   3197             retained across jobs and may potentially pose  a  threat  to
   3198             the  correct  operation  of the interpreter.  The PostScript
   3199             operators that set system and device parameters include, but
   3200             may  not be limited to, the setsystemparams and setdevparams
   3201             operators.  Message-sending  software  should  not  generate
   3202             PostScript  that  depends on the setting of system or device
   3203             parameters to operate correctly. The ability  to  set  these
   3204             parameters will probably be unavailable in secure PostScript
   3205             implementations. Message-receiving and -displaying  software
   3206             should,  if  possible,  disable the ability to change system
   3207             and  device  parameters.  If  these  operators   cannot   be
   3208             disabled,  at least set the password associated with them to
   3209             a hard-to-guess value.
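
The operators named in the preceding paragraphs can be collected into a simple advisory check, sketched below in Python. This is not a security mechanism: as noted above, disabling must happen outside the reach of the PostScript language, and a textual scan can be evaded by programs that construct operator names at run time. The function name `flagged_operators` and the naive tokenizer are assumptions of this example; the scan deliberately over-reports (for instance, it also matches words inside strings).

```python
import re

# Operators named in the text as dangerous or as altering retained state.
DANGEROUS_OPERATORS = frozenset({
    "deletefile", "renamefile", "filenameforall", "file",
    "exitserver", "startjob",
    "setsystemparams", "setdevparams",
})

# Naive tokenizer: runs of characters that are not whitespace or delimiters.
_TOKEN = re.compile(r"[^\s()<>\[\]{}/%]+")


def flagged_operators(ps_source: str) -> set[str]:
    """Return the listed operator names that appear in the PostScript text."""
    found = set()
    for line in ps_source.splitlines():
        code = line.split("%", 1)[0]  # drop comments (ignores "%" inside strings)
        for token in _TOKEN.findall(code):
            if token in DANGEROUS_OPERATORS:
                found.add(token)
    return found


sample = "%!PS\n(/etc/passwd) (r) file\nshowpage\n"
warnings = flagged_operators(sample)
```

A mail reader might use such a check only to warn the user before handing a document to an interpreter that has itself been secured.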
   3210 
   3211             Some   PostScript   implementations   provide    nonstandard
   3212             facilities  for  the direct loading and execution of machine
   3213             code.  Such  facilities  are  quite    obviously   open   to
   3214             substantial  abuse.    Message-sending  software  should not
   3215             make use of such features. Besides being  totally  hardware-
   3216             specific,  they  are also likely to be unavailable in secure
   3217             implementations  of  PostScript.     Message-receiving   and
   3218             -displaying  software  should not allow such operators to be
   3219             used if they exist.
   3220 
   3221             PostScript is an extensible language, and many, if not most,
   3222             implementations   of  it  provide  a  number  of  their  own
   3223             extensions. This document does not deal with such extensions
   3224             explicitly   since   they   constitute  an  unknown  factor.
   3225             Message-sending software should not make use of  nonstandard
   3226             extensions;   they  are  likely  to  be  missing  from  some
   3227             implementations. Message-receiving and -displaying  software
   3228             should  make  sure that any nonstandard PostScript operators
   3229             are secure and don't present any kind of threat.
   3230 
   3231             It is  possible  to  write  PostScript  that  consumes  huge
   3232             amounts  of various system resources. It is also possible to
   3233             write PostScript programs that loop infinitely.  Both  types
   3234             of  programs  have  the potential to cause damage if sent to
   3235             unsuspecting recipients.   Message-sending  software  should
   3236             avoid  the  construction and dissemination of such programs,
   3237             which  is  antisocial.   Message-receiving  and  -displaying
   3238             software  should  provide  appropriate  mechanisms  to abort
   3239             processing of a document after a reasonable amount  of  time
   3240             has  elapsed. In addition, PostScript interpreters should be
   3241             limited to the consumption of only a  reasonable  amount  of
   3242             any given system resource.
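
The time limit recommended above can be sketched with Python's `subprocess` module. The function name `run_with_timeout` is hypothetical, and the Python executable is used here purely as a stand-in for a PostScript interpreter command. Only a wall-clock limit is shown; limits on memory or other resources would need operating-system facilities that are outside this sketch.

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_with_timeout(command: Sequence[str], document: bytes,
                     timeout_s: float) -> Optional[bytes]:
    """Feed a document to an interpreter; return its output or None on timeout."""
    try:
        completed = subprocess.run(
            command,
            input=document,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,   # the child is killed when this expires
            check=False,
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.stdout


# Stand-ins for an interpreter: one loops forever, one echoes its input.
LOOPING = [sys.executable, "-c", "while True: pass"]
ECHO = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]

aborted = run_with_timeout(LOOPING, b"", 1.0)
echoed = run_with_timeout(ECHO, b"ok", 30.0)
```

A document that loops infinitely is abandoned after the limit, while a well-behaved one is processed normally.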
   3243 
   3244             Finally, bugs may  exist  in  some  PostScript  interpreters
   3245             which  could  possibly  be  exploited  to  gain unauthorized
   3257             access to a  recipient's  system.   Beyond  noting  this
   3258             possibility,  there is no specific action to take to prevent
   3259             this, other than the timely correction of such bugs if  any
   3260             are found.
   3261 
   3262             7.4.3     The Application/ODA subtype
   3263 
   3264             The "ODA" subtype of application is used to indicate that  a
   3265             body  contains  information  encoded according to the Office
   3266             Document  Architecture  [ODA]   standards,  using  the  ODIF
   3267             representation  format.   For  application/oda, the Content-
   3268             Type line should also specify an attribute/value  pair  that
   3269             indicates  the document application profile (DAP), using the
   3270             key word "profile".  Thus an appropriate header field  might
   3271             look like this:
   3272 
   3273                  Content-Type:  application/oda; profile=Q112
   3274 
   3275             Consult the ODA standard [ODA] for further information.
   3276 
   3322             7.5  The Image Content-Type
   3323 
   3324             A Content-Type of "image" indicates that the body contains an
   3325             image.   The subtype names the specific image format.  These
   3326             names are case insensitive.  Two initial subtypes are "jpeg"
   3327             for the JPEG format, JFIF encoding, and "gif" for GIF format
   3328             [GIF].
   3329 
   3330             The list of image subtypes given here is  neither  exclusive
   3331             nor  exhaustive,  and  is expected to grow as more types are
   3332             registered with IANA, as described in Appendix F.
   3333 
   3334             7.6  The Audio Content-Type
   3335 
   3336             A Content-Type of "audio" indicates that the  body  contains
   3337             audio  data.   Although  there  is not yet a consensus on an
   3338             "ideal" audio format for use  with  computers,  there  is  a
   3339             pressing   need   for   a   format   capable   of  providing
   3340             interoperable behavior.
   3341 
   3342             The initial subtype of "basic" is  specified  to  meet  this
   3343             requirement by providing an absolutely minimal lowest common
   3344             denominator  audio  format.   It  is  expected  that  richer
   3345             formats for higher quality and/or lower bandwidth audio will
   3346             be defined by a later document.
   3347 
   3348             The content of the "audio/basic" subtype  is  audio  encoded
   3349             using  8-bit ISDN u-law [PCM]. When this subtype is present,
   3350             a sample rate of 8000 Hz and a single channel are assumed.
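
For readers unfamiliar with the encoding, the following Python sketch expands 8-bit u-law bytes into 16-bit linear samples using the commonly published G.711 expansion, and computes the playing time implied by the fixed 8000 Hz single-channel format. The function names are assumptions of this example, and the sketch is illustrative rather than a reference decoder.

```python
ULAW_BIAS = 0x84          # bias used by the standard u-law expansion
SAMPLE_RATE_HZ = 8000     # fixed by the audio/basic definition
CHANNELS = 1              # audio/basic is single-channel


def ulaw_to_linear(code: int) -> int:
    """Expand one 8-bit u-law code (0..255) to a signed linear sample."""
    u = ~code & 0xFF                       # u-law bytes are stored complemented
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if u & 0x80 else magnitude


def decode_audio_basic(data: bytes) -> list[int]:
    """Decode an audio/basic body into a list of linear samples."""
    return [ulaw_to_linear(byte) for byte in data]


def duration_seconds(data: bytes) -> float:
    """One byte per sample, one channel, 8000 samples per second."""
    return len(data) / (SAMPLE_RATE_HZ * CHANNELS)
```

Because every parameter of the format is fixed, the length of the body alone determines the playing time: 8000 bytes correspond to one second of audio.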
   3351 
   3352             7.7  The Video Content-Type
   3353 
   3354             A Content-Type of "video" indicates that the body contains a
   3355             time-varying-picture   image,   possibly   with   color  and
   3356             coordinated sound.   The  term  "video"  is  used  extremely
   3357             generically,  rather  than  with reference to any particular
   3358             technology or format, and is not meant to preclude  subtypes
   3359             such  as animated drawings encoded compactly.    The subtype
   3360             "mpeg" refers to video coded according to the MPEG  standard
   3361             [MPEG].
   3362 
   3363             Note  that  although  in  general  this  document   strongly
   3364             discourages  the  mixing of multiple media in a single body,
   3365             it is recognized that many so-called "video" formats include
   3366             a   representation  for  synchronized  audio,  and  this  is
   3367             explicitly permitted for subtypes of "video".
   3368 
   3369             7.8  Experimental Content-Type Values
   3370 
   3371             A Content-Type value beginning with the characters "X-" is a
   3372             private  value,  to  be  used  by consenting mail systems by
   3373             mutual agreement.  Any format without a rigorous and  public
   3374             definition  must  be named with an "X-" prefix, and publicly
   3375             specified  values  shall  never  begin  with  "X-".   (Older
   3387             versions  of  the  widely-used Andrew system use the "X-BE2"
   3388             name, so new systems  should  probably  choose  a  different
   3389             name.)
   3390 
   3391             In general, the use of  "X-"  top-level  types  is  strongly
   3392             discouraged.   Implementors  should  invent  subtypes of the
   3393             existing types whenever  possible.   The  invention  of  new
   3394             types   is  intended  to  be  restricted  primarily  to  the
   3395             development of new media types for email,  such  as  digital
   3396             odors  or  holography,  and  not  for  new  data  formats in
   3397             general. In many cases, a subtype  of  application  will  be
   3398             more appropriate than a new top-level type.
   3399 
   3452             Summary
   3453 
   3454             Using the MIME-Version, Content-Type, and  Content-Transfer-
   3455             Encoding  header  fields,  it  is  possible to include, in a
   3456             standardized way, arbitrary types of data objects  with  RFC
   3457             822  conformant  mail  messages.  No restrictions imposed by
   3458             either RFC 821 or RFC 822 are violated, and  care  has  been
   3459             taken  to  avoid  problems caused by additional restrictions
   3460             imposed  by  the  characteristics  of  some  Internet   mail
   3461             transport  mechanisms  (see Appendix B). The "multipart" and
   3462             "message"  Content-Types  allow  mixing   and   hierarchical
   3463             structuring  of  objects  of  different  types  in  a single
   3464             message.  Further  Content-Types  provide   a   standardized
   3465             mechanism  for  tagging  messages  or  body  parts as audio,
   3466             image, or several other  kinds  of  data.   A  distinguished
   3467             parameter syntax allows further specification of data format
   3468             details,  particularly  the   specification   of   alternate
   3469             character  sets.  Additional  optional header fields provide
   3470             mechanisms for certain extensions deemed desirable  by  many
   3471             implementors.  Finally, a number of useful Content-Types are
   3472             defined for general use by consenting user  agents,  notably
   3473             text/richtext, message/partial, and message/external-body.
   3474 
   3517             Acknowledgements
   3518 
   3519             This document is the result of the collective  effort  of  a
   3520             large  number  of  people,  at several IETF meetings, on the
   3521             IETF-SMTP  and  IETF-822  mailing  lists,   and   elsewhere.
   3522             Although   any  enumeration  seems  doomed  to  suffer  from
   3523             egregious  omissions,  the  following  are  among  the  many
   3524             contributors to this effort:
   3525 
   3526             Harald Tveit Alvestrand       Timo Lehtinen
   3527             Randall Atkinson              John R. MacMillan
   3528             Philippe Brandon              Rick McGowan
   3529             Kevin Carosso                 Leo Mclaughlin
   3530             Uhhyung Choi                  Goli Montaser-Kohsari
   3531             Cristian Constantinof         Keith Moore
   3532             Mark Crispin                  Tom Moore
   3533             Dave Crocker                  Erik Naggum
   3534             Terry Crowley                 Mark Needleman
   3535             Walt Daniels                  John Noerenberg
   3536             Frank Dawson                  Mats Ohrman
   3537             Hitoshi Doi                   Julian Onions
   3538             Kevin Donnelly                Michael Patton
   3539             Keith Edwards                 David J. Pepper
   3540             Chris Eich                    Blake C. Ramsdell
   3541             Johnny Eriksson               Luc Rooijakkers
   3542             Craig Everhart                Marshall T. Rose
   3543             Patrik Faeltstroem            Jonathan Rosenberg
   3544             Erik E. Fair                  Jan Rynning
   3545             Roger Fajman                  Harri Salminen
   3546             Alain Fontaine                Michael Sanderson
   3547             James M. Galvin               Masahiro Sekiguchi
   3548             Philip Gladstone              Mark Sherman
   3549             Thomas Gordon                 Keld Simonsen
   3550             Phill Gross                   Bob Smart
   3551             James Hamilton                Peter Speck
   3552             Steve Hardcastle-Kille        Henry Spencer
   3553             David Herron                  Einar Stefferud
   3554             Bruce Howard                  Michael Stein
   3555             Bill Janssen                  Klaus Steinberger
   3556             Olle Jaernefors               Peter Svanberg
   3557             Risto Kankkunen               James Thompson
   3558             Phil Karn                     Steve Uhler
   3559             Alan Katz                     Stuart Vance
   3560             Tim Kehres                    Erik van der Poel
   3561             Neil Katin                    Guido van Rossum
   3562             Kyuho Kim                     Peter Vanderbilt
   3563             Anders Klemets                Greg Vaudreuil
   3564             John Klensin                  Ed Vielmetti
   3565             Valdis Kletniek               Ryan Waldron
   3566             Jim Knowles                   Wally Wedel
   3567             Stev Knowles                  Sven-Ove Westberg
   3568             Bob Kummerfeld                Brian Wideen
   3582             Pekka Kytolaakso              John Wobus
   3583             Stellan Lagerstrom            Glenn Wright
   3584             Vincent Lau                   Rayan Zachariassen
   3585             Donald Lindsay                David Zimmerman

   3586             The authors apologize for  any  omissions  from  this  list,
   3587             which are certainly unintentional.
   3588 
   3647             Appendix A -- Minimal MIME-Conformance
   3648 
   3649             The mechanisms described in this  document  are  open-ended.
   3650             It  is definitely not expected that all implementations will
   3651             support all of the Content-Types described,  nor  that  they
   3652             will  all  share  the  same extensions.  In order to promote
   3653             interoperability,  however,  it  is  useful  to  define  the
   3654             concept  of  "MIME-conformance" to define a certain level of
   3655             implementation  that  allows  the  useful  interworking   of
   3656             messages  with  content that differs from US-ASCII text.  In
   3657             this  section,  we  specify  the   requirements   for   such
   3658             conformance.
   3659 
   3660             A mail user agent that is MIME-conformant MUST:
   3661 
   3662                  1.  Always generate a "MIME-Version:  1.0"  header
   3663                  field.
   3664 
   3665                  2.  Recognize the Content-Transfer-Encoding header
   3666                  field,  and  decode all received data encoded with
   3667                  either    the    quoted-printable    or     base64
   3668                  encodings.    Encode   any  data  sent  that   is
   3669                  not in seven-bit mail-ready  representation  using
   3670                  one  of  these  transformations  and  include  the
   3671                  appropriate    Content-Transfer-Encoding    header
   3672                  field,  unless  the underlying transport mechanism
   3673                  supports non-seven-bit data, as SMTP does not.
   3674 
   3675                  3.   Recognize  and  interpret  the   Content-Type
   3676                  header  field,  and  avoid  showing users raw data
   3677                  with a Content-Type field  other  than  text.   Be
   3678                  able  to  send  at least text/plain messages, with
   3679                  the character set specified as a parameter  if  it
   3680                  is not US-ASCII.
   3681 
   3682                  4.  Explicitly handle the  following  Content-Type
   3683                  values, to at least the following extents:
   3684 
   3685                  Text:
   3686                       -- Recognize  and  display  "text"  mail
   3687                            with the character set "US-ASCII".
   3688                       -- Recognize  other  character  sets  at
   3689                            least  to  the extent of being able
   3690                            to  inform  the  user  about   what
   3691                            character set the message uses.
   3692                       -- Recognize the "ISO-8859-*"  character
   3693                            sets to the extent of being able to
   3694                            display those characters  that  are
   3695                            common  to ISO-8859-* and US-ASCII,
   3696                            namely all  characters  represented
   3697                            by octet values 0-127.
   3698                       -- For unrecognized  subtypes,  show  or
   3699                            offer  to  show  the user the "raw"
   3700                            version of the data.  An ability at
   3712                            least to convert "text/richtext" to
   3713                            plain text, as shown in Appendix D,
   3714                            is encouraged, but not required for
   3715                            conformance.
   3716                  Message:
   3717                       -- Recognize and display at  least  the
   3718                            primary (822) encapsulation.
   3719                  Multipart:
   3720                       --   Recognize   the   primary   (mixed)
   3721                            subtype.    Display   all  relevant
   3722                            information on  the  message  level
   3723                            and  the body part header level and
   3724                            then display or  offer  to  display
   3725                            each     of    the    body    parts
   3726                            individually.
   3727                       -- Recognize the "alternative"  subtype,
   3728                            and    avoid   showing   the   user
   3729                            redundant         parts          of
   3730                            multipart/alternative mail.
   3731                       -- Treat any unrecognized subtypes as if
   3732                            they were "mixed".
   3733                  Application:
   3734                       -- Offer the ability to remove either of
   3735                            the  two types of Content-Transfer-
   3736                            Encoding defined in  this  document
   3737                            and  put  the resulting information
   3738                            in a user file.
   3739 
   3740                  5.  Upon encountering  any  unrecognized  Content-
   3741                  Type, an implementation must treat it as if it had
   3742                  a Content-Type of "application/octet-stream"  with
   3743                  no  parameter  sub-arguments.  How  such  data are
   3744                  handled is up to  an  implementation,  but  likely
   3745                  options   for   handling  such  unrecognized  data
   3746                  include  offering  to  write  the  data  to a file
   3747                  (decoded   from  its  mail  transport  format)  or
   3748                  letting  the  user  name  a  program  to which the
   3749                  decoded   data   should   be   passed   as  input.
   3750                  Unrecognized predefined types, which  in  a  MIME-
   3751                  conformant   mailer  might  still  include  audio,
   3752                  image, or video, should also be  treated  in  this
   3753                  way.
   3754 
   3755             A user agent that meets the above conditions is said  to  be
   3756             MIME-conformant.   The  meaning of this phrase is that it is
   3757             assumed  to  be  "safe"  to  send  virtually  any  kind   of
   3758             properly-marked  data to users of such mail systems, because
   3759             such systems will at least be able  to  treat  the  data  as
   3760             undifferentiated  binary, and will not simply splash it onto
   3761             the screen of unsuspecting users.   There is  another  sense
   3762             in  which  it is always "safe" to send data in a format that
   3763             is MIME-conformant, which is that such data will  not  break
   3764             or  be  broken by any known systems that are conformant with
   3765             RFC 821 and RFC 822.  User agents that  are  MIME-conformant
   3777             have  the  additional  guarantee  that  the user will not be
   3778             shown data that were never intended to be viewed as text.
   3779 
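                    As an illustration of rule 5 and of the fallbacks in rule 4,
                    the C sketch below maps a received type/subtype pair onto
                    the type a reader would act on.  The function name
                    effective_type, the helper ci_equal_n, and the contents of
                    known_types are assumptions made for this example; a real
                    reader would list whatever types it actually supports.  The
                    input is assumed to be the bare "type/subtype" string with
                    any parameters already removed.  Showing an unrecognized
                    text subtype "raw" is modelled here as text/plain.

```c
#include <ctype.h>
#include <stddef.h>

/* Compare at most n characters without regard to case.  Returns 1 when
   the strings match over that span (or both end earlier), else 0. */
static int ci_equal_n(const char *a, const char *b, size_t n)
{
    size_t i;
    for (i = 0; i < n; ++i) {
        int ca = tolower((unsigned char)a[i]);
        int cb = tolower((unsigned char)b[i]);
        if (ca != cb) return 0;
        if (ca == '\0') return 1;   /* both strings ended together */
    }
    return 1;
}

/* Hypothetical list of types this imaginary reader can display. */
static const char *const known_types[] = {
    "text/plain", "message/rfc822", "multipart/mixed",
    "multipart/alternative", "application/octet-stream"
};

/* Return the type to act on for a bare "type/subtype" string. */
const char *effective_type(const char *content_type)
{
    size_t i;
    for (i = 0; i < sizeof known_types / sizeof known_types[0]; ++i)
        if (ci_equal_n(content_type, known_types[i], (size_t)-1))
            return known_types[i];
    /* Rule 4: unrecognized multipart subtypes are treated as "mixed". */
    if (ci_equal_n(content_type, "multipart/", 10))
        return "multipart/mixed";
    /* Rule 4: unrecognized text subtypes are shown raw (assumption:
       modelled as plain text). */
    if (ci_equal_n(content_type, "text/", 5))
        return "text/plain";
    /* Rule 5: anything else is undifferentiated binary. */
    return "application/octet-stream";
}
```

                    A caller would then dispatch on the returned string, for
                    example offering to save the decoded data to a file when
                    the result is application/octet-stream.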
   3780 
   3842             Appendix B -- General Guidelines For Sending Email Data
   3843 
   3844             Internet email is not a perfect, homogeneous  system.   Mail
   3845             may  become  corrupted  at several stages in its travel to a
   3846             final destination. Specifically, email sent  throughout  the
   3847             Internet  may  travel  across  many networking technologies.
   3848             Many networking and mail technologies  do  not  support  the
   3849             full   functionality   possible   in   the   SMTP  transport
   3850             environment. Mail traversing these systems is likely  to  be
   3851             modified in such a way that it can be transported.
   3852 
   3853             There exist many widely-deployed non-conformant MTAs in  the
   3854             Internet.  These  MTAs,  speaking  the  SMTP protocol, alter
   3855             messages on the fly to take advantage of the  internal  data
   3856             structure  of the hosts they are implemented on, or are just
   3857             plain broken.
   3858 
   3859             The following guidelines may be useful to anyone devising  a
   3860             data  format  (Content-Type)  that  will  survive the widest
   3861             range of  networking  technologies  and  known  broken  MTAs
   3862             unscathed.    Note  that  anything  encoded  in  the  base64
   3863             encoding will satisfy these rules, but that some  well-known
   3864             mechanisms,  notably  the  UNIX uuencode facility, will not.
   3865             Note also that  anything  encoded  in  the  Quoted-Printable
   3866             encoding will survive most gateways intact, but possibly not
   3867             some gateways to systems that use the EBCDIC character set.
   3868 
   3869                  (1) Under some circumstances the encoding used for
   3870                  data  may change as part of normal gateway or user
   3871                  agent operation. In  particular,  conversion  from
   3872                  base64  to  quoted-printable and vice versa may be
   3873                  necessary. This may result  in  the  confusion  of
   3874                  CRLF  sequences  with  line  breaks  in  text body
   3875                  parts.  As  such,  the  persistence  of  CRLF   as
   3876                  something  other  than  a line break should not be
   3877                  relied on.
   3878 
   3879                  (2) Many systems may elect to represent and  store
   3880                  text  data  using local newline conventions. Local
   3881                  newline conventions may not match the RFC822  CRLF
   3882                  convention -- systems are known that use plain CR,
   3883                  plain LF, CRLF, or counted records.  The result is
   3884                  that isolated CR and LF characters  are  not  well
   3885                  tolerated  in    general;  they  may  be  lost  or
   3886                  converted to delimiters on some systems, and hence
   3887                  should not be relied on.
   3888 
   3889                  (3) TAB (HT) characters may be  misinterpreted  or
   3890                  may be automatically converted to variable numbers
   3891                  of  spaces.    This   is   unavoidable   in   some
   3892                  environments, notably those not based on the ASCII
   3893                  character  set.  Such   conversion   is   STRONGLY
   3894                  DISCOURAGED,  but  it  may occur, and mail formats
   3895                  should not rely on the  persistence  of  TAB  (HT)
   3907                  characters.
   3908 
   3909                  (4) Lines longer than 76 characters may be wrapped
   3910                  or  truncated  in some environments. Line wrapping
   3911                  and line truncation are STRONGLY DISCOURAGED,  but
   3912                  unavoidable  in  some  cases.  Applications  which
   3913                  require long lines  should  somehow  differentiate
   3914                  between  soft and hard line breaks.  (A simple way
   3915                  to  do  this  is  to  use   the   quoted-printable
   3916                  encoding.)
   3917 
   3918                  (5)  Trailing "white space" characters (SPACE, TAB
   3919                  (HT)) on a line may be discarded by some transport
   3920                  agents, while other transport agents may pad lines
   3921                  with  these characters so that all lines in a mail
   3922                  file are of equal  length.    The  persistence  of
   3923                  trailing  white  space,  therefore,  should not be
   3924                  relied on.
   3925 
   3926                  (6)  Many mail domains use variations on the ASCII
   3927                  character  set,  or  use  character  sets  such as
   3928                  EBCDIC which contain most but not all of  the  US-
   3929                  ASCII  characters.   The  correct  translation  of
   3930                  characters not in the "invariant"  set  cannot  be
   3931                  depended  on across character converting gateways.
   3932                  For example, this  situation  is  a  problem  when
   3933                  sending  uuencoded  information  across BITNET, an
   3934                  EBCDIC system.  Similar problems can occur without
   3935                  crossing  a gateway, since many Internet hosts use
   3936                  character sets other than ASCII  internally.   The
   3937                  definition  of  Printable  Strings  in  X.400 adds
   3938                  further restrictions in certain special cases.  In
   3939                  particular,  the only characters that are known to
   3940                  be consistent  across  all  gateways  are  the  73
   3941                  characters  that correspond to the upper and lower
   3942                  case letters A-Z and a-z, the 10 digits  0-9,  and
   3943                  the following eleven special characters:
   3944 
   3945                                 "'"  (ASCII code 39)
   3946                                 "("  (ASCII code 40)
   3947                                 ")"  (ASCII code 41)
   3948                                 "+"  (ASCII code 43)
   3949                                 ","  (ASCII code 44)
   3950                                 "-"  (ASCII code 45)
   3951                                 "."  (ASCII code 46)
   3952                                 "/"  (ASCII code 47)
   3953                                 ":"  (ASCII code 58)
   3954                                 "="  (ASCII code 61)
   3955                                 "?"  (ASCII code 63)
   3956 
   3957                  A maximally portable mail representation, such  as
   3958                  the   base64  encoding,  will  confine  itself  to
   3959                  relatively short lines of text in which  the  only
   3960                  meaningful  characters  are taken from this set of
   3972                  73 characters.
   3973 
   3974             Please note that the above list is NOT a list of recommended
   3975             practices  for  MTAs.  RFC  821  MTAs  are  prohibited  from
   3976             altering the character  of  white  space  or  wrapping  long
   3977             lines.   These  BAD and illegal practices are known to occur
   3978             on established networks, and implementations should be robust
   3979             in dealing with the bad effects they can cause.
   3980 
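                    The character-set and line-length guidelines above lend
                    themselves to a mechanical check.  The C sketch below tests
                    whether a line stays within the 73 invariant characters of
                    guideline (6) and the 76-character limit of guideline (4).
                    The names is_invariant_char and line_is_portable are
                    illustrative only.  The check is deliberately strict: a
                    line containing a space or TAB is rejected, which suits
                    encoded data such as base64 rather than ordinary text.

```c
#include <stddef.h>
#include <string.h>

/* The 73 characters listed in guideline (6): A-Z, a-z, 0-9 and the
   eleven special characters. */
static const char invariant_set[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "'()+,-./:=?";

/* Return 1 if c is one of the 73 invariant characters, else 0. */
int is_invariant_char(int c)
{
    return c != '\0' && strchr(invariant_set, c) != NULL;
}

/* Return 1 if the line (without its line terminator) is at most 76
   characters long and uses only invariant characters, else 0. */
int line_is_portable(const char *line)
{
    size_t len = strlen(line);
    size_t i;

    if (len > 76)
        return 0;
    for (i = 0; i < len; ++i)
        if (!is_invariant_char((unsigned char)line[i]))
            return 0;
    return 1;
}
```

                    Spelling the set out as a string, rather than testing
                    ranges such as 'A' to 'Z', keeps the check independent of
                    whether the letters are contiguous in the local character
                    set.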
   3981 
   4037             Appendix C -- A Complex Multipart Example
   4038 
   4039             What follows is the outline of a complex multipart  message.
   4040             This  message  has five parts to be displayed serially:  two
   4041             introductory  plain  text  parts,  an   embedded   multipart
   4042             message,  a  richtext  part, and a closing encapsulated text
   4043             message  in  a  non-ASCII  character  set.    The   embedded
   4044             multipart message has two parts to be displayed in parallel,
   4045             a picture and an audio fragment.
   4046 
   4047                  MIME-Version: 1.0
   4048                  From: Nathaniel Borenstein <nsb@bellcore.com>
   4049                  Subject: A multipart example
   4050                  Content-Type: multipart/mixed;
   4051                       boundary=unique-boundary-1
   4052 
   4053                  This is the preamble area of a multipart message.
   4054                  Mail readers that understand multipart format
   4055                  should ignore this preamble.
   4056                  If you are reading this text, you might want to
   4057                  consider changing to a mail reader that understands
   4058                  how to properly display multipart messages.
   4059                  --unique-boundary-1
   4060 
   4061                  ...Some text appears here...
   4062                  [Note that the preceding blank line means
   4063                  no header fields were given and this is text,
   4064                  with charset US ASCII.  It could have been
   4065                  done with explicit typing as in the next part.]
   4066 
   4067                  --unique-boundary-1
   4068                  Content-type: text/plain; charset=US-ASCII
   4069 
   4070                  This could have been part of the previous part,
   4071                  but illustrates explicit versus implicit
   4072                  typing of body parts.
   4073 
   4074                  --unique-boundary-1
   4075                  Content-Type: multipart/parallel;
   4076                       boundary=unique-boundary-2
   4077 
   4078 
   4079                  --unique-boundary-2
   4080                  Content-Type: audio/basic
   4081                  Content-Transfer-Encoding: base64
   4082 
   4083                  ... base64-encoded 8000 Hz single-channel
   4084                      u-law-format audio data goes here....
   4085 
   4086                  --unique-boundary-2
   4087                  Content-Type: image/gif
   4088                  Content-Transfer-Encoding: Base64
   4089 
   4090 
   4102                  ... base64-encoded image data goes here....
   4103 
   4104                  --unique-boundary-2--
   4105 
   4106                  --unique-boundary-1
   4107                  Content-type: text/richtext
   4108 
   4109                  This is <bold><italic>richtext.</italic></bold>
   4110                  <nl><nl>Isn't it
   4111                  <bigger><bigger>cool?</bigger></bigger>
   4112 
   4113                  --unique-boundary-1
   4114                  Content-Type: message/rfc822
   4115 
   4116                  From: (name in US-ASCII)
   4117                  Subject: (subject in US-ASCII)
   4118                  Content-Type: Text/plain; charset=ISO-8859-1
   4119                  Content-Transfer-Encoding: Quoted-printable
   4120 
   4121                  ... Additional text in ISO-8859-1 goes here ...
   4122 
   4123                  --unique-boundary-1--
   4124 
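                    A reader walking through a message like the one above has
                    to tell delimiter lines from body lines.  The following C
                    sketch shows one way to classify a single line against a
                    known boundary.  The enum line_kind and the function
                    classify_line are names invented for this example.  The
                    sketch assumes the line terminator has already been
                    stripped and does not tolerate trailing white space after
                    the boundary.

```c
#include <stddef.h>
#include <string.h>

enum line_kind { LINE_BODY, LINE_DELIMITER, LINE_CLOSE_DELIMITER };

/* Classify one line, given without its trailing CRLF, against the
   boundary taken from the enclosing Content-Type field. */
enum line_kind classify_line(const char *line, const char *boundary)
{
    size_t blen = strlen(boundary);
    const char *rest;

    /* A delimiter starts with "--" immediately followed by the boundary. */
    if (strncmp(line, "--", 2) != 0)
        return LINE_BODY;
    if (strncmp(line + 2, boundary, blen) != 0)
        return LINE_BODY;

    rest = line + 2 + blen;
    if (strcmp(rest, "--") == 0)
        return LINE_CLOSE_DELIMITER;   /* e.g. "--unique-boundary-1--" */
    if (*rest == '\0')
        return LINE_DELIMITER;         /* e.g. "--unique-boundary-1" */
    return LINE_BODY;                  /* boundary is only a prefix */
}
```

                    For the nested multipart/parallel part, the same function
                    would be called with the inner boundary
                    (unique-boundary-2) while that part is being read.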
   4125 
   4167             Appendix D -- A Simple Richtext-to-Text Translator in C
   4168 
   4169             One of the major goals in the design of the richtext subtype
   4170             of the text Content-Type is to make formatted text so simple
   4171             that even  text-only  mailers  will  implement  richtext-to-
   4172             plain-text  translators, thus increasing the likelihood that
   4173             multifont text will become "safe" to use  very  widely.   To
   4174             demonstrate  this  simplicity,  what follows is an extremely
   4175             simple  C  program   that   converts   richtext  input  into
   4176             plain text output:
   4177 
                         #include <stdio.h>
                         #include <string.h>
                         #include <ctype.h>

                         int main(void) {
                             int c, i;
                             char token[50];

                             while ((c = getc(stdin)) != EOF) {
                                 if (c == '<') {
                                     /* Read the command name, lowercased, up to 49 chars. */
                                     for (i = 0; (i < 49 && (c = getc(stdin)) != '>'
                                               && c != EOF); ++i) {
                                         token[i] = tolower(c);
                                     }
                                     if (c == EOF) break;
                                     /* Discard the tail of an overlong command. */
                                     if (c != '>') while ((c = getc(stdin)) != '>'
                                               && c != EOF) {;}
                                     if (c == EOF) break;
                                     token[i] = '\0';
                                     if (!strcmp(token, "lt")) {
                                         putc('<', stdout);
                                     } else if (!strcmp(token, "nl")) {
                                         putc('\n', stdout);
                                     } else if (!strcmp(token, "/paragraph")) {
                                         fputs("\n\n", stdout);
                                     } else if (!strcmp(token, "comment")) {
                                         /* Skip to the matching </comment>; comments nest. */
                                         int commct = 1;
                                         while (commct > 0) {
                                             while ((c = getc(stdin)) != '<'
                                              && c != EOF) ;
                                             if (c == EOF) break;
                                             for (i = 0; (i < 49 && (c = getc(stdin)) != '>'
                                                && c != EOF); ++i) {
                                                 token[i] = tolower(c);
                                             }
                                             if (c == EOF) break;
                                             if (c != '>') while ((c = getc(stdin)) != '>'
                                                && c != EOF) {;}
                                             if (c == EOF) break;
                                             token[i] = '\0';
                                             if (!strcmp(token, "/comment")) --commct;
                                             if (!strcmp(token, "comment")) ++commct;
                                         }
                                     } /* Ignore all other tokens */
                                 } else if (c != '\n') putc(c, stdout); /* raw newlines are dropped */
                             }
                             putc('\n', stdout); /* for good measure */
                             return 0;
                         }

   4238             It should be noted that one can do considerably better  than
   4239             this  in  displaying  richtext  data on a dumb terminal.  In
   4240             particular, one can replace font information such as  "bold"
   4241             with textual emphasis (like *this* or   _T_H_I_S_).  One can
   4242             also  properly  handle  the  richtext  formatting   commands
   4243             regarding  indentation, justification, and others.  However,
   4244             the above program is all  that  is  necessary  in  order  to
   4245             present richtext on a dumb terminal.
   4246 
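                    The textual emphasis mentioned above can be sketched as a
                    small lookup that a translator could call in its "ignore
                    all other tokens" branch.  The function name
                    emphasis_marker is hypothetical, and the choice of "*" for
                    bold and "_" for italic is an assumption made for this
                    example, not something this document prescribes.  The
                    token is assumed to be lowercased already.

```c
#include <string.h>

/* Return the plain-text marker to print for a richtext formatting
   token, or an empty string if the token has no textual equivalent.
   Opening and closing tokens ("bold", "/bold") share one marker. */
const char *emphasis_marker(const char *token)
{
    if (*token == '/')
        ++token;                      /* treat </x> like <x> */
    if (strcmp(token, "bold") == 0)
        return "*";                   /* assumed mapping */
    if (strcmp(token, "italic") == 0)
        return "_";                   /* assumed mapping */
    return "";
}
```

                    If a translator printed this marker for each such token,
                    the line "This is <bold><italic>richtext.</italic></bold>"
                    from Appendix C would come out as "This is *_richtext._*".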
   4247 
   4297             Appendix E -- Collected Grammar
   4298 
   4299             This appendix contains the complete BNF grammar for all  the
   4300             syntax specified by this document.
   4301 
   4302             By itself, however, this grammar is incomplete.   It  refers
   4303             to  several  entities  that  are defined by RFC 822.  Rather
   4304             than   reproduce   those   definitions   here,   and    risk
   4305             unintentional  differences  between  the  two, this document
   4306             simply refers the  reader  to  RFC  822  for  the  remaining
   4307             definitions.  Wherever a term is undefined, it refers to the
   4308             RFC 822 definition.
   4309 
   4310             attribute := token
   4311 
   4312             body-part := <"message" as defined in RFC 822,
   4313                      with all header fields optional, and with the
   4314                      specified delimiter not occurring anywhere in
   4315                      the message body, either on a line by itself
   4316                      or as a substring anywhere.>
   4317 
   4318             boundary := 0*69<bchars> bcharsnospace
   4319 
   4320             bchars := bcharsnospace / " "
   4321 
   4322             bcharsnospace :=    DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_"
   4323                            / "," / "-" / "." / "/" / ":" / "=" / "?"
   4325 
   4326             close-delimiter := delimiter "--"
   4327 
   4328             Content-Description := *text
   4329 
   4330             Content-ID := msg-id
   4331 
   4332             Content-Transfer-Encoding := "BASE64" / "QUOTED-PRINTABLE" /
   4333                                          "8BIT"   / "7BIT" /
   4334                                          "BINARY" / x-token
   4336 
   4337             Content-Type := type "/" subtype *[";" parameter]
   4338 
   4339             delimiter := CRLF "--" boundary   ; taken from Content-Type
   4340                                               ; field when content-type
   4341                                               ; is multipart.
   4342                                               ; There should be no space
   4343                                               ; between "--" and boundary.
   4345 
   4346             encapsulation := delimiter CRLF body-part
   4347 
   4348             epilogue := *text                  ; to be ignored upon receipt.
   4350 
   4362             MIME-Version := 1*text
   4363 
   4364             multipart-body := preamble 1*encapsulation
   4365                               close-delimiter epilogue
   4366 
   4367             parameter := attribute "=" value
   4368 
   4369             preamble := *text                  ; to be ignored upon receipt.
   4371 
   4372             subtype := token
   4373 
   4374             token := 1*<any CHAR except SPACE, CTLs, or tspecials>
   4375 
   4376             tspecials :=  "(" / ")" / "<" / ">" / "@"  ; Must be in
   4377                        /  "," / ";" / ":" / "\" / <">  ; quoted-string,
   4378                        /  "/" / "[" / "]" / "?" / "."  ; to use within
   4379                        /  "="                        ; parameter values
   4380 
   4381 
   4382             type :=   "application" / "audio"      ; case-insensitive
   4383                       / "image"     / "message"
   4384                       / "multipart" / "text"
   4385                       / "video"     / x-token
   4387 
   4388             value := token / quoted-string
   4389 
   4390             x-token := <The two characters "X-" followed, with no
   4391                        intervening white space, by any token>
   4392 
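                    Two of the productions above, boundary and token, are
                    simple enough to check by hand.  The C sketch below is one
                    possible reading of them: a boundary is 1 to 70 characters
                    drawn from bchars and must not end in a space, and a token
                    is one or more ASCII characters other than SPACE, CTLs,
                    and tspecials.  The function names is_valid_boundary and
                    is_token are illustrative only, and the code assumes an
                    ASCII execution character set and the default "C" locale.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* bcharsnospace := DIGIT / ALPHA / one of the listed specials */
static int is_bcharnospace(int c)
{
    return c != '\0' && c < 128 &&
           (isalnum(c) || strchr("'()+_,-./:=?", c) != NULL);
}

/* boundary := 0*69<bchars> bcharsnospace */
int is_valid_boundary(const char *s)
{
    size_t len = strlen(s);
    size_t i;

    if (len < 1 || len > 70)
        return 0;
    for (i = 0; i < len; ++i) {
        unsigned char c = (unsigned char)s[i];
        if (c != ' ' && !is_bcharnospace(c))
            return 0;
    }
    return s[len - 1] != ' ';         /* last character may not be a space */
}

/* token := 1*<any CHAR except SPACE, CTLs, or tspecials> */
int is_token(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s != '\0'; ++s) {
        unsigned char c = (unsigned char)*s;
        if (c <= 32 || c >= 127)      /* CTLs, SPACE, DEL, non-ASCII */
            return 0;
        if (strchr("()<>@,;:\\\"/[]?.=", c) != NULL)
            return 0;                 /* tspecials as listed above */
    }
    return 1;
}
```

                    Note that "." is among the tspecials in this grammar, so a
                    parameter value containing a period has to be written as a
                    quoted-string rather than a token.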
   4393 
   4427             Appendix F -- IANA Registration Procedures
   4428 
   4429             MIME  has  been  carefully  designed  to   have   extensible
   4430             mechanisms,  and  it  is  expected  that the set of content-
   4431             type/subtype pairs and their associated parameters will grow
   4432             significantly with time.  Several other MIME fields, notably
   4433             character  set  names,  access-type   parameters   for   the
   4434             message/external-body  type,  conversions parameters for the
   4435             application  type,  and  possibly   even   Content-Transfer-
   4436             Encoding  values, are likely to have new values defined over
   4437             time.  In order to ensure that the set  of  such  values  is
   4438             developed  in an orderly, well-specified, and public manner,
   4439             MIME defines a registration process which uses the  Internet
   4440             Assigned  Numbers Authority (IANA) as a central registry for
   4441             such values.
   4442 
   4443             In general, parameters in the content-type header field  are
   4444             used  to convey supplemental information for various content
   4445             types, and their use is defined when  the  content-type  and
   4446             subtype  are  defined.  New parameters should not be defined
   4447             as a way to introduce new functionality.
   4448 
   4449             In  order  to  simplify  and  standardize  the  registration
   4450             process,  this appendix gives templates for the registration
   4451             of new values with IANA.  Each of these is given in the form
   4452             of  an  email  message  template,  to  be  filled  in by the
   4453             registering party.
   4454 
   4455             F.1  Registration of New Content-type/subtype Values
   4456 
   4457             Note that MIME is  generally  expected  to  be  extended  by
   4458             subtypes.   If  a  new fundamental top-level type is needed,
   4459             its  specification  should  be  published  as  an   RFC   or
   4460             submitted  in  a  form   suitable  to  become an RFC, and be
   4461             subject to the Internet standards process.
   4462 
   4463                  To:  IANA@isi.edu
   4464                  Subject:  Registration of new MIME content-type/subtype
   4465 
   4466                  MIME type name:
   4467 
   4468                  (If the above is not an existing top-level MIME type,
   4469                  please explain why an existing type cannot be used.)
   4470 
   4471                  MIME subtype name:
   4472 
   4473                  Required parameters:
   4474 
   4475                  Optional parameters:
   4476 
   4477                  Encoding considerations:
   4478 
   4479                  Security considerations:
   4480 
   4492                  Published specification:
   4493 
   4494                  (The published specification must be an Internet RFC or
   4495                  RFC-to-be if a new top-level type is being defined, and
   4496                  must be a publicly available specification in any
   4497                  case.)
   4498 
   4499                  Person & email address to contact for further
   4500                  information:

   4501             F.2  Registration of New Character Set Values
   4502 
   4503                  To:  IANA@isi.edu
   4504                  Subject:  Registration of new MIME character set value
   4505 
   4506                  MIME character set name:
   4507 
   4508                  Published specification:
   4509 
   4510                  (The published specification must be an Internet RFC or
   4511                  RFC-to-be or an international standard.)
   4512 
   4513                  Person & email address to contact for further
   4514                  information:
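
A template such as the one in F.2 can also be filled in programmatically.
The following Python sketch builds (but does not send) such a message with
the standard email package.  The function name charset_registration and
any values passed to it are hypothetical; only the To and Subject lines
and the field labels are taken from the template above.

```python
from email.message import EmailMessage


def charset_registration(name: str, specification: str, contact: str) -> EmailMessage:
    """Build a message that follows the F.2 character set template."""
    msg = EmailMessage()
    # Addressee and subject are copied from the F.2 template.
    msg["To"] = "IANA@isi.edu"
    msg["Subject"] = "Registration of new MIME character set value"
    # The body repeats the template's field labels, each followed by the
    # value supplied by the registering party.
    msg.set_content(
        f"MIME character set name: {name}\n"
        "\n"
        f"Published specification: {specification}\n"
        "\n"
        "Person & email address to contact for further\n"
        f"information: {contact}\n"
    )
    return msg
```

A registering party might call it as charset_registration("X-EXAMPLE-CHARSET",
"RFC-to-be (hypothetical)", "Jane Doe <jane@example.org>") and review the
result with print(msg) before mailing it.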
   4515 
   4516             F.3  Registration of New Access-type Values for
   4517             Message/external-body
   4518 
   4519                  To:  IANA@isi.edu
   4520                  Subject:  Registration of new MIME Access-type for
   4521                       Message/external-body content-type
   4522 
   4523                  MIME access-type name:
   4524 
   4525                  Required parameters:
   4526 
   4527                  Optional parameters:
   4528 
   4529                  Published specification:
   4530 
   4531                  (The published specification must be an Internet RFC or
   4532                  RFC-to-be.)
   4533 
   4534                  Person & email address to contact for further
   4535                  information:
   4536 
   4537 
   4538             F.4  Registration of New Conversions Values for Application
   4539 
   4540                  To:  IANA@isi.edu
   4541                  Subject:  Registration of new MIME Conversions value
   4542                  for Application content-type
   4543 
   4544                  MIME Conversions name:
   4545 
   4557                  Published specification:
   4558 
   4559                  (The published specification must be an Internet RFC or
   4560                  RFC-to-be.)
   4561 
   4562                  Person & email address to contact for further
   4563                  information:
   4564 
   4621 
   4622             Appendix G -- Summary of the Seven Content-types
   4623 
   4624             Content-type: text
   4625 
   4626             Subtypes defined by this document:  plain, richtext
   4627 
   4628             Important Parameters: charset
   4629 
   4630             Encoding notes: quoted-printable generally preferred  if  an
   4631                  encoding  is  needed and the character set is mostly an
   4632                  ASCII superset.
   4633 
   4634             Security considerations:  Rich text formats such as TeX  and
   4635                  Troff  often contain mechanisms for executing arbitrary
   4636                  commands or file system operations, and should  not  be
   4637                  used  automatically unless these security problems have
   4638                  been addressed.  Even plain text  may  contain  control
   4639                  characters that can be used to exploit the capabilities
   4640                  of   "intelligent"   terminals   and   cause   security
   4641                  violations.   User  interfaces  designed to run on such
   4642                  terminals should be aware of and try  to  prevent  such
   4643                  problems.
   4644             ________________________________________________________________
   4645 
   4646             Content-type: multipart
   4647 
   4648             Subtypes defined by  this  document:    mixed,  alternative,
   4649                  digest, parallel.
   4650 
   4651             Important Parameters: boundary
   4652 
   4653             Encoding notes: No content-transfer-encoding other than
                         "7bit", "8bit", or "binary" is permitted.
   4654 
   4655             ________________________________________________________________
   4656 
   4657             Content-type: message
   4658 
   4659             Subtypes  defined  by  this  document:    rfc822,   partial,
   4660                  external-body
   4661 
   4662             Important Parameters: id, number, total
   4663 
   4664             Encoding notes: No content-transfer-encoding other than
                         "7bit", "8bit", or "binary" is permitted.
   4665 
   4666             ________________________________________________________________
   4667 
   4668             Content-type: application
   4669 
   4670             Subtypes  defined   by   this   document:      octet-stream,
   4671                  postscript, oda
   4672 
   4673             Important Parameters: profile
   4674 
   4687             Encoding notes: base64 generally preferred for  octet-stream
   4688                  or other unreadable subtypes.
   4689 
   4690             Security considerations:  This  type  is  intended  for  the
   4691             transmission  of data to be interpreted by locally-installed
   4692             programs.  If used,  for  example,  to  transmit  executable
   4693             binary  programs  or programs in general-purpose interpreted
   4694             languages, such as LISP programs or  shell  scripts,  severe
   4695             security  problems  could  result.   In  general, authors of
   4696             mail-reading  agents  are  cautioned  against  giving  their
   4697             systems  the  power  to  execute mail-based application data
   4698             without carefully  considering  the  security  implications.
   4699             While  it  is  certainly possible to define safe application
   4700             formats and even safe interpreters for unsafe formats,  each
   4701             interpreter  should  be  evaluated  separately  for possible
   4702             security problems.
   4703             ________________________________________________________________
   4704 
   4705             Content-type: image
   4706 
   4707             Subtypes defined by this document:  jpeg, gif
   4708 
   4709             Important Parameters: none
   4710 
   4711             Encoding notes: base64 generally preferred
   4712 
   4713             ________________________________________________________________
   4714 
   4715             Content-type: audio
   4716 
   4717             Subtypes defined by this document:  basic
   4718 
   4719             Important Parameters: none
   4720 
   4721             Encoding notes: base64 generally preferred
   4722 
   4723             ________________________________________________________________
   4724 
   4725             Content-type: video
   4726 
   4727             Subtypes defined by this document:  mpeg
   4728 
   4729             Important Parameters: none
   4730 
   4731             Encoding notes: base64 generally preferred
   4732 
   4733 
   4752             Appendix H -- Canonical Encoding Model
   4753 
   4754 
   4755 
   4756             There was some confusion, in earlier drafts  of  this  memo,
   4757             regarding  the model for when email data was to be converted
   4758             to canonical form and encoded, and in  particular  how  this
   4759             process  would affect the treatment of CRLFs, given that the
   4760             representation of newlines varies  greatly  from  system  to
   4761             system.   For this reason, a canonical model for encoding is
   4762             presented below.
   4763 
   4764             The process of composing a MIME message part can be modelled
   4765             as  being  done in a number of steps.  Note that these steps
   4766             are roughly similar to those steps used in RFC1113:
   4767 
   4768             Step 1.  Creation of local form.
   4769 
   4770             The body part to be transmitted is created in  the  system's
   4771             native format.   The native character set is used, and where
   4772             appropriate local end of line conventions are used as  well.
   4773             This may be a UNIX-style text file, or a Sun raster image, or
   4774             a VMS indexed file, or  audio  data  in  a  system-dependent
   4775             format   stored  only  in  memory,  or  anything  else  that
   4776             corresponds to the local model  for  the  representation  of
   4777             some form of information.
   4778 
   4779             Step 2.  Conversion to canonical form.
   4780 
   4781             The entire body part,  including  "out-of-band"  information
   4782             such   as   record   lengths  and  possibly  file  attribute
   4783             information, is converted to  a  universal  canonical  form.
   4784             The  specific  content  type of the body part as well as its
   4785             associated attributes dictate the nature  of  the  canonical
   4786             form  that is used.  Conversion to the proper canonical form
   4787             may involve  character  set  conversion,  transformation  of
   4788             audio   data,   compression,  or  various  other  operations
   4789             specific to the various content types.
   4790 
   4791             For example, in the case of text/plain data, the  text  must
   4792             be  converted to a supported character set and lines must be
   4793             delimited with CRLF delimiters in  accordance  with  RFC822.
   4794             Note  that the restriction on line lengths implied by RFC822
   4795             is eliminated  if  the  next  step  employs  either  quoted-
   4796             printable or base64 encoding.
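
As an illustration of Step 2 for text/plain data, the following Python
sketch converts text held with local newline conventions into
CRLF-delimited canonical form.  The function name to_canonical_text, the
default of US-ASCII as the target character set, and the treatment of a
bare CR as a line break are assumptions made for the example, not
requirements of this memo.

```python
def to_canonical_text(local_text: str, charset: str = "us-ascii") -> bytes:
    """Return text in canonical form: CRLF line breaks, encoded in `charset`."""
    # Collapse CRLF and bare CR to LF first, so that text which already
    # uses CRLF does not end up with doubled line breaks.
    unified = local_text.replace("\r\n", "\n").replace("\r", "\n")
    # Express every line break as the CRLF sequence used by RFC 822.
    canonical = unified.replace("\n", "\r\n")
    # Raises UnicodeEncodeError if the text cannot be represented in the
    # chosen character set; a real system would then pick another charset.
    return canonical.encode(charset)
```

For instance, to_canonical_text("a\nb\r\nc") yields the octets
b"a\r\nb\r\nc" regardless of which convention each input line used.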
   4797 
   4798             Step 3.  Apply transfer encoding.
   4799 
   4800             A Content-Transfer-Encoding appropriate for this  body  part
   4801             is  applied.   Note  that  there  is  no  fixed relationship
   4802             between the content  type  and  the  transfer  encoding.  In
   4803             particular,  it  may  be  appropriate  to base the choice of
   4804             base64 or quoted-printable  on  character  frequency  counts
   4805             which are specific to a given instance of body part.
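
One possible reading of "character frequency counts" is sketched below in
Python.  The function name choose_transfer_encoding and the default
threshold are assumptions for illustration.  The threshold of 1/6 is the
point at which the two encodings produce output of roughly equal size:
quoted-printable turns each escaped octet into three characters, while
base64 turns every three octets into four characters.  Soft line breaks
and readability are ignored, and CR and LF are treated as belonging to
line breaks.

```python
def choose_transfer_encoding(canonical: bytes, threshold: float = 1 / 6) -> str:
    """Choose between 'quoted-printable' and 'base64' for one body part."""
    if not canonical:
        return "quoted-printable"
    # Count the octets quoted-printable would have to write as "=XX":
    # the "=" sign itself and anything outside printable US-ASCII,
    # except TAB, CR, and LF.
    needs_escape = sum(
        1
        for octet in canonical
        if octet == 0x3D
        or not (0x20 <= octet <= 0x7E or octet in (0x09, 0x0A, 0x0D))
    )
    # Mostly-readable data stays readable with quoted-printable; data
    # dominated by escapes is smaller in base64.
    if needs_escape / len(canonical) > threshold:
        return "base64"
    return "quoted-printable"
```

Because the count is taken per body part, two parts with the same content
type may legitimately end up with different encodings.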
   4806 
   4817             Step 4.  Insertion into message.
   4818 
   4819             The encoded object is inserted  into  a  MIME  message  with
   4820             appropriate body part headers and boundary markers.
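
Steps 3 and 4 can be sketched together in Python as follows.  The helper
names (base64_lines, make_body_part, make_multipart) and the boundary
string used in the usage note are hypothetical, and the sketch emits
RFC 822 CRLF delimiters directly, which is only one of the designs the
model allows.

```python
import base64

CRLF = b"\r\n"


def base64_lines(data: bytes) -> bytes:
    """Base64-encode data and break the output into lines of at most 76 characters."""
    encoded = base64.b64encode(data)
    lines = [encoded[i:i + 76] for i in range(0, len(encoded), 76)]
    return CRLF.join(lines)


def make_body_part(content_type: str, transfer_encoding: str, encoded_body: bytes) -> bytes:
    """Prefix an already encoded body with its body part headers."""
    header_lines = [
        b"Content-Type: " + content_type.encode("ascii"),
        b"Content-Transfer-Encoding: " + transfer_encoding.encode("ascii"),
    ]
    # A blank line separates the body part headers from the body.
    return CRLF.join(header_lines) + CRLF + CRLF + encoded_body


def make_multipart(boundary: str, parts: list) -> bytes:
    """Wrap body parts in boundary markers, ending with the closing marker."""
    delimiter = b"--" + boundary.encode("ascii")
    out = b""
    for part in parts:
        out += delimiter + CRLF + part + CRLF
    # The final delimiter carries two trailing hyphens.
    return out + delimiter + b"--" + CRLF
```

A caller would pass canonical data through base64_lines, hand the result to
make_body_part("text/plain; charset=us-ascii", "base64", ...), and then
call make_multipart("example-boundary", [part]).  The boundary must not
occur in any of the parts.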
   4821 
   4822             It is vital to note that these steps are only a model;  they
   4823             are  specifically  NOT  a blueprint for how an actual system
   4824             would be built.  In particular, the model fails  to  account
   4825             for two common designs:
   4826 
   4827                  1.  In many cases the conversion  to  a  canonical
   4828                  form  prior  to encoding will be subsumed into the
   4829                  encoder itself, which  understands  local  formats
   4830                  directly.    For   example,   the   local  newline
   4831                  convention for text  bodyparts  might  be  carried
   4832                  through to the encoder itself along with knowledge
   4833                  of what that format is.
   4834 
   4835                  2.  The output of the encoders may  have  to  pass
   4836                  through  one  or  more  additional  steps prior to
   4837                  being transmitted as  a  message.   As  such,  the
   4838                  output  of  the  encoder may not be compliant with
   4839                  the formats specified by RFC822.   In  particular,
   4840                  once   again   it   may  be  appropriate  for  the
   4841                  converter's output to  be  expressed  using  local
   4842                  newline conventions rather than using the standard
   4843                  RFC822 CRLF delimiters.
   4844 
   4845             Other implementation variations  are  conceivable  as  well.
   4846             The  only  important  aspect  of this discussion is that the
   4847             resulting messages are consistent with those produced by the
   4848             model described here.
   4849 
   4850 
   4882             References
   4883 
   4884             [US-ASCII] Coded Character Set--7-Bit American Standard Code
   4885             for Information Interchange, ANSI X3.4-1986.
   4886 
   4887             [ATK]  Borenstein,  Nathaniel  S.,  Multimedia  Applications
   4888             Development with the Andrew Toolkit, Prentice-Hall, 1990.
   4889 
   4890             [GIF] Graphics Interchange Format (Version 89a), Compuserve,
   4891             Inc., Columbus, Ohio, 1990.
   4892 
   4893             [ISO-2022] International Standard--Information  Processing--
   4894             ISO  7-bit  and  8-bit  coded character sets--Code extension
   4895             techniques, ISO 2022:1986.
   4896 
   4897             [ISO-8859] Information Processing -- 8-bit Single-Byte Coded
   4898             Graphic  Character Sets -- Part 1: Latin Alphabet No. 1, ISO
   4899             8859-1:1987.  Part 2: Latin  alphabet  No.  2,  ISO  8859-2,
   4900             1987.  Part 3: Latin alphabet No. 3, ISO 8859-3, 1988.  Part
   4901             4:  Latin  alphabet  No.  4,  ISO  8859-4,  1988.   Part  5:
   4902             Latin/Cyrillic   alphabet,  ISO  8859-5,  1988.     Part  6:
   4903             Latin/Arabic  alphabet,  ISO  8859-6,   1987.      Part   7:
   4904             Latin/Greek   alphabet,   ISO   8859-7,   1987.     Part  8:
   4905             Latin/Hebrew alphabet, ISO 8859-8, 1988.     Part  9:  Latin
   4906             alphabet No. 5, ISO 8859-9, 1990.
   4907 
   4908             [ISO-646] International  Standard--Information  Processing--
   4909             ISO  7-bit coded  character set for information interchange,
   4910             ISO 646:1983.
   4911 
   4912             [MPEG]  Video  Coding  Draft  Standard  ISO  11172  CD,  ISO
   4913             IEC/JTC1/SC2/WG11 (Moving Picture Experts Group), May, 1991.
   4914 
   4915             [ODA] ISO 8613;  Information  Processing:  Text  and  Office
   4916             System;  Office  Document Architecture (ODA) and Interchange
   4917             Format (ODIF), Part 1-8, 1989.
   4918 
   4919             [PCM] CCITT, Fascicle III.4 - Recommendation G.711,  Geneva,
   4920             1972, "Pulse Code Modulation (PCM) of Voice Frequencies".
   4921 
   4922             [POSTSCRIPT]  Adobe  Systems,  Inc.,   PostScript   Language
   4923             Reference Manual,  Addison-Wesley, 1985.
   4924 
   4925             [X400]  Schicker, Pietro, "Message Handling Systems, X.400",
   4926             Message  Handling  Systems  and Distributed Applications, E.
   4927             Stefferud, O-j. Jacobsen,  and  P.  Schicker,  eds.,  North-
   4928             Holland, 1989, pp. 3-41.
   4929 
   4930             [RFC-783]  Sollins, K.R.  TFTP Protocol (revision 2).  June,
   4931             1981, MIT, RFC-783.
   4932 
   4933             [RFC-821]  Postel,  J.B.   Simple  Mail  Transfer  Protocol.
   4934             August, 1982, USC/Information Sciences Institute, RFC-821.
   4935 
   4947             [RFC-822]   Crocker, D.  Standard for  the  format  of  ARPA
   4948             Internet  text  messages. August, 1982, UDEL, RFC-822.
   4949 
   4950             [RFC-934]   Rose, M.T.; Stefferud, E.A.   Proposed  standard
   4951             for    message     encapsulation.  January,   1985, Delaware
   4952             and NMA, RFC-934.
   4953 
   4954             [RFC-959]   Postel,  J.B.;  Reynolds,  J.K.   File  Transfer
   4955             Protocol.      October,   1985,   USC/Information   Sciences
   4956             Institute, RFC-959.
   4957 
   4958             [RFC-1049]   Sirbu,  M.A.   Content-Type  header  field  for
   4959             Internet messages.  March, 1988, CMU,  RFC-1049.
   4960 
   4961             [RFC-1113]   Linn,  J.   Privacy  enhancement  for  Internet
   4962             electronic    mail:  Part    I  -  message  encipherment and
   4963             authentication procedures.   August,  1989, IAB Privacy Task
   4964             Force, RFC-1113.
   4965 
   4966             [RFC-1154]  Robinson, D.; Ullmann, R.  Encoding header field
   4967             for   Internet   messages.  April,   1990,   Prime Computer,
   4968             Inc., RFC-1154.
   4969 
   4970             [RFC-1342] Moore, Keith, Representation of Non-Ascii Text in
   4971             Internet   Message   Headers.   June,  1992,  University  of
   4972             Tennessee, RFC-1342.
   4973 
   4974             Security Considerations
   4975 
   4976             Security issues  are  discussed  in  Section  7.4.2  and  in
   4977             Appendix  G.   Implementors should pay special attention  to
   4978             the security implications of any mail content-types that can
   4979             cause the remote execution of any actions in the recipient's
   4980             environment.   In  such  cases,  the   discussion   of   the
   4981             application/postscript  content-type  in  Section  7.4.2 may
   4982             serve as a model for considering  other  content-types  with
   4983             remote execution capabilities.
   4984 
   4985 
   5012             Authors' Addresses
   5013 
   5014             For more information, the authors of this  document  may  be
   5015             contacted via Internet mail:
   5016 
   5017                                 Nathaniel S. Borenstein
   5018                                  MRE 2D-296, Bellcore
   5019                                      445 South St.
   5020                                Morristown, NJ 07962-1910
   5021 
   5022                                 Phone: +1 201 829 4270
   5023                                  Fax:  +1 201 829 7019
   5024                                 Email: nsb@bellcore.com
   5025 
   5026 
   5027                                        Ned Freed
   5028                              Innosoft International, Inc.
   5029                                  250 West First Street
   5030                                        Suite 240
   5031                                   Claremont, CA 91711
   5032 
   5033                                 Phone:  +1 714 624 7907
   5034                                  Fax: +1 714 621 5319
   5035                                 Email: ned@innosoft.com
   5036 
   5037 
   5143                                Table of Contents
   5144 
   5145 
   5146             1     Introduction.......................................  1
   5147             2     Notations, Conventions, and Generic BNF Grammar....  3
   5148             3     The MIME-Version Header Field......................  5
   5149             4     The Content-Type Header Field......................  6
   5150             5     The Content-Transfer-Encoding Header Field......... 10
   5151             5.1   Quoted-Printable Content-Transfer-Encoding......... 14
   5152             5.2   Base64 Content-Transfer-Encoding................... 17
   5153             6     Additional Optional Content- Header Fields......... 19
   5154             6.1   Optional Content-ID Header Field................... 19
   5155             6.2   Optional Content-Description Header Field.......... 19
   5156             7     The Predefined Content-Type Values................. 20
   5157             7.1   The Text Content-Type.............................. 20
   5158             7.1.1 The charset parameter.............................. 20
   5159             7.1.2 The Text/plain subtype............................. 23
   5160             7.1.3 The Text/richtext subtype.......................... 23
   5161             7.2   The Multipart Content-Type......................... 29
   5162             7.2.1 Multipart:  The common syntax...................... 30
   5163             7.2.2 The Multipart/mixed (primary) subtype.............. 34
   5164             7.2.3 The Multipart/alternative subtype.................. 34
   5165             7.2.4 The Multipart/digest subtype....................... 36
   5166             7.2.5 The Multipart/parallel subtype..................... 36
   5167             7.3   The Message Content-Type........................... 37
   5168             7.3.1 The Message/rfc822 (primary) subtype............... 37
   5169             7.3.2 The Message/Partial subtype........................ 37
   5170             7.3.3 The Message/External-Body subtype.................. 40
   5171             7.4   The Application Content-Type....................... 46
   5172             7.4.1 The Application/Octet-Stream (primary) subtype..... 46
   5173             7.4.2 The Application/PostScript subtype................. 47
   5174             7.4.3 The Application/ODA subtype........................ 50
   5175             7.5   The Image Content-Type............................. 51
   5176             7.6   The Audio Content-Type............................. 51
   5177             7.7   The Video Content-Type............................. 51
   5178             7.8   Experimental Content-Type Values................... 51
   5179                   Summary............................................ 53
   5180                   Acknowledgements................................... 54
   5181                   Appendix A -- Minimal MIME-Conformance............. 56
   5182                   Appendix B -- General Guidelines For Sending Email Data 59
   5183                   Appendix C -- A Complex Multipart Example.......... 62
   5184                   Appendix D -- A Simple Richtext-to-Text Translator in C 64
   5185                   Appendix E -- Collected Grammar.................... 66
   5186                   Appendix F -- IANA Registration Procedures......... 68
   5187                   F.1  Registration of New Content-type/subtype Values 68
   5188                   F.2  Registration of New Character Set Values...... 69
   5189                   F.3  Registration of New Access-type Values for Message/external-body 69
   5190                   F.4  Registration of New Conversions Values for Application 69
   5191                   Appendix G -- Summary of the Seven Content-types... 71
   5192                   Appendix H -- Canonical Encoding Model............. 73
   5193                   References......................................... 75
   5194                   Security Considerations............................ 76
   5195                   Authors' Addresses................................. 77
   5265