RFC 1521
7.2.1. Multipart: The common syntax
All subtypes of "multipart" share a common syntax, defined in this section. A simple example of a multipart message also appears in this section. An example of a more complex multipart message is given in Appendix C.
The Content-Type field for multipart entities requires one parameter, "boundary", which is used to specify the encapsulation boundary. The encapsulation boundary is defined as a line consisting entirely of two hyphen characters ("-", decimal code 45) followed by the boundary parameter value from the Content-Type header field.
NOTE: The hyphens are for rough compatibility with the earlier RFC
934 method of message encapsulation, and for ease of searching for
the boundaries in some implementations. However, it should be
noted that multipart messages are NOT completely compatible with
RFC 934 encapsulations; in particular, they do not obey RFC 934
quoting conventions for embedded lines that begin with hyphens.
This mechanism was chosen over the RFC 934 mechanism because the
latter causes lines to grow with each level of quoting. The
combination of this growth with the fact that SMTP implementations
sometimes wrap long lines made the RFC 934 mechanism unsuitable
for use in the event that deeply-nested multipart structuring is
ever desired.
WARNING TO IMPLEMENTORS: The grammar for parameters on the Content-Type field is such that it is often necessary to enclose the boundaries in quotes on the Content-Type line. This is not always necessary, but never hurts. Implementors should be sure to study the grammar carefully in order to avoid producing illegal Content-Type fields.
Thus, a typical multipart Content-Type header field might look like this:
Content-Type: multipart/mixed;
     boundary=gc0p4Jq0M2Yt08jU534c0p
But the following is illegal:
Content-Type: multipart/mixed;
     boundary=gc0p4Jq0M:2Yt08jU534c0p
(because of the colon) and must instead be represented as
Content-Type: multipart/mixed;
     boundary="gc0p4Jq0M:2Yt08jU534c0p"
This indicates that the entity consists of several parts, each itself with a structure that is syntactically identical to an RFC 822 message, except that the header area might be completely empty, and that the parts are each preceded by the line
--gc0p4Jq0M:2Yt08jU534c0p
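The quoting rule from the warning above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the helper names `boundary_parameter` and `delimiter_line` are hypothetical, and the character set is the "tspecials" list from the RFC 1521 Content-Type grammar. No escaping is attempted, on the assumption that the boundary has already been checked against the boundary grammar (which admits neither quotes nor backslashes).

```python
# Characters the RFC 1521 grammar calls "tspecials". A parameter value that
# contains any of them (or white space) has to be written as a quoted-string.
TSPECIALS = set('()<>@,;:\\"/[]?=')


def boundary_parameter(boundary: str) -> str:
    """Return the 'boundary=...' parameter text, quoting only when required."""
    needs_quotes = any(ch in TSPECIALS or ch.isspace() for ch in boundary)
    if needs_quotes:
        return f'boundary="{boundary}"'
    return f"boundary={boundary}"


def delimiter_line(boundary: str) -> str:
    """The encapsulation boundary: two hyphens, then the parameter value."""
    return "--" + boundary


print(boundary_parameter("gc0p4Jq0M2Yt08jU534c0p"))
print(boundary_parameter("gc0p4Jq0M:2Yt08jU534c0p"))
print(delimiter_line("gc0p4Jq0M:2Yt08jU534c0p"))
```

Since quoting "never hurts", an implementation could equally well quote every boundary and drop the test altogether.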
Note that the encapsulation boundary must occur at the beginning of a line, i.e., following a CRLF, and that the initial CRLF is considered to be attached to the encapsulation boundary rather than part of the preceding part. The boundary must be followed immediately either by another CRLF and the header fields for the next part, or by two CRLFs, in which case there are no header fields for the next part (and it is therefore assumed to be of Content-Type text/plain).
NOTE: The CRLF preceding the encapsulation line is conceptually
attached to the boundary so that it is possible to have a part
that does not end with a CRLF (line break). Body parts that must
be considered to end with line breaks, therefore, must have two
CRLFs preceding the encapsulation line, the first of which is part
of the preceding body part, and the second of which is part of the
encapsulation boundary.
Encapsulation boundaries must not appear within the encapsulations, and must be no longer than 70 characters, not counting the two leading hyphens.
The encapsulation boundary following the last body part is a distinguished delimiter that indicates that no further body parts will follow. Such a delimiter is identical to the previous delimiters, with the addition of two more hyphens at the end of the line:
--gc0p4Jq0M:2Yt08jU534c0p--
There appears to be room for additional information prior to the first encapsulation boundary and following the final boundary. These areas should generally be left blank, and implementations must ignore anything that appears before the first boundary or after the last one.
NOTE: These "preamble" and "epilogue" areas are generally not used
because of the lack of proper typing of these parts and the lack
of clear semantics for handling these areas at gateways,
particularly X.400 gateways. However, rather than leaving the
preamble area blank, many MIME implementations have found this to
be a convenient place to insert an explanatory note for recipients
who read the message with pre-MIME software, since such notes will
be ignored by MIME-compliant software.
NOTE: Because encapsulation boundaries must not appear in the body
parts being encapsulated, a user agent must exercise care to
choose a unique boundary. The boundary in the example above could
have been the result of an algorithm designed to produce
boundaries with a very low probability of already existing in the
data to be encapsulated without having to prescan the data.
Alternate algorithms might result in more 'readable' boundaries
for a recipient with an old user agent, but would require more
attention to the possibility that the boundary might appear in the
encapsulated part. The simplest boundary possible is something
like "---", with a closing boundary of "-----".
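One possible shape for such an algorithm is sketched below in Python. It is an assumption-laden illustration rather than a recommended method: the function names, the 32-character default length and the letters-and-digits alphabet are all choices made here, not taken from the text. `make_boundary` relies on randomness alone (no prescan), while `choose_boundary` adds the prescan that a more 'readable' or shorter boundary would need.

```python
import secrets
import string
from typing import Sequence

# Letters and digits only: all are legal boundary characters, and none of
# them forces the boundary parameter to be quoted.
ALPHABET = string.ascii_letters + string.digits


def make_boundary(length: int = 32) -> str:
    """Random boundary; a collision with the data is very unlikely."""
    if not 1 <= length <= 70:
        raise ValueError("boundary must be 1 to 70 characters long")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def appears_in(boundary: str, parts: Sequence[bytes]) -> bool:
    """True if '--' followed by the boundary occurs in any body part."""
    needle = b"--" + boundary.encode("ascii")
    return any(needle in part for part in parts)


def choose_boundary(parts: Sequence[bytes], attempts: int = 10) -> str:
    """Prescan variant: retry until the candidate is absent from the data."""
    for _ in range(attempts):
        candidate = make_boundary()
        if not appears_in(candidate, parts):
            return candidate
    raise RuntimeError("could not find a boundary absent from the data")
```

The substring check is deliberately stricter than necessary (it rejects the marker anywhere, not only at the start of a line), which keeps the sketch short.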
As a very simple example, the following multipart message has two parts, both of them plain text, one of them explicitly typed and one of them implicitly typed:
From: Nathaniel Borenstein <nsb@bellcore.com>
To: Ned Freed <ned@innosoft.com>
Subject: Sample message
MIME-Version: 1.0
Content-type: multipart/mixed; boundary="simple boundary"

This is the preamble. It is to be ignored, though it
is a handy place for mail composers to include an
explanatory note to non-MIME conformant readers.
--simple boundary

This is implicitly typed plain ASCII text.
It does NOT end with a linebreak.
--simple boundary
Content-type: text/plain; charset=us-ascii

This is explicitly typed plain ASCII text.
It DOES end with a linebreak.

--simple boundary--
This is the epilogue. It is also to be ignored.
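To make the CRLF rule concrete, here is a minimal line-based splitter in Python. It is a sketch under stated assumptions, not a complete MIME parser: `split_multipart` is a hypothetical name, the input is assumed to be the body of the multipart entity (everything after the message header) with CRLF line endings, and nested multiparts and header parsing are out of scope. It discards the preamble and epilogue, and it treats the CRLF in front of each boundary line as part of the boundary.

```python
from typing import List, Optional


def split_multipart(body: bytes, boundary: str) -> List[bytes]:
    """Return the raw body parts (header area, blank line, content)."""
    delimiter = b"--" + boundary.encode("ascii")
    close_delimiter = delimiter + b"--"
    parts: List[bytes] = []
    current: Optional[List[bytes]] = None  # None while still in the preamble

    for line in body.split(b"\r\n"):
        # Trailing white space on a boundary line is presumed gateway damage.
        stripped = line.rstrip(b" \t")
        if stripped in (delimiter, close_delimiter):
            if current is not None:
                # Joining with CRLF leaves out the CRLF that precedes the
                # boundary line, since it belongs to the boundary.
                parts.append(b"\r\n".join(current))
            if stripped == close_delimiter:
                return parts  # everything after this is epilogue
            current = []
        elif current is not None:
            current.append(line)

    raise ValueError("no close-delimiter found")
```

Run against the body of the sample message, this yields two parts. The first begins with an empty line (its header area is empty) and ends with "It does NOT end with a linebreak." with no trailing CRLF; the second ends with "It DOES end with a linebreak." followed by one CRLF.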
The use of a Content-Type of multipart in a body part within another multipart entity is explicitly allowed. In such cases, for obvious reasons, care must be taken to ensure that each nested multipart entity uses a different boundary delimiter. See Appendix C for an example of nested multipart entities.
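As a small illustration of nesting, the following sketch uses Python's standard `email.mime` classes to place one multipart entity inside another. The boundary strings "outer-boundary" and "inner-boundary" and the text contents are made up for the example; a real composer would normally let the library or a generator pick unique values.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Each multipart container is given its own, distinct boundary.
outer = MIMEMultipart("mixed", boundary="outer-boundary")
inner = MIMEMultipart("mixed", boundary="inner-boundary")

inner.attach(MIMEText("nested text"))
outer.attach(MIMEText("top-level text"))
outer.attach(inner)  # a multipart body part within a multipart entity

wire = outer.as_string()
print(wire)
```

Because the inner entity is the last part of the outer one, its closing delimiter appears in the output before the outer closing delimiter. With identical boundaries a reader could not tell where the inner entity ends.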
The use of the multipart Content-Type with only a single body part may be useful in certain contexts, and is explicitly permitted.
The only mandatory parameter for the multipart Content-Type is the boundary parameter, which consists of 1 to 70 characters from a set of characters known to be very robust through email gateways, and NOT ending with white space. (If a boundary appears to end with white space, the white space must be presumed to have been added by a gateway, and must be deleted.) It is formally specified by the following BNF:
boundary := 0*69<bchars> bcharsnospace
bchars := bcharsnospace / " "
bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_" / "," / "-" / "." / "/" / ":" / "=" / "?"
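The boundary grammar translates fairly directly into a regular expression. The Python sketch below reads the rule as "up to 69 characters that may include spaces, followed by one character that is not a space", which gives the 1 to 70 character range stated in the text. The names `is_valid_boundary` and `repair_boundary` are hypothetical; the latter applies the rule that trailing white space is presumed to have been added by a gateway and must be deleted.

```python
import re

# bcharsnospace: DIGIT / ALPHA and the listed punctuation; bchars adds " ".
_BCHARSNOSPACE = r"[0-9A-Za-z'()+_,\-./:=?]"
_BCHARS = r"[0-9A-Za-z'()+_,\-./:=? ]"

# Up to 69 bchars followed by exactly one bcharsnospace.
BOUNDARY_RE = re.compile(rf"{_BCHARS}{{0,69}}{_BCHARSNOSPACE}")


def is_valid_boundary(value: str) -> bool:
    """True if the whole string matches the boundary grammar."""
    return BOUNDARY_RE.fullmatch(value) is not None


def repair_boundary(value: str) -> str:
    """Drop trailing white space, presumed to come from a gateway."""
    return value.rstrip(" \t")


print(is_valid_boundary("simple boundary"))
print(is_valid_boundary("simple boundary "))
print(is_valid_boundary(repair_boundary("simple boundary ")))
```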
Overall, the body of a multipart entity may be specified as follows:
multipart-body := preamble 1*encapsulation close-delimiter epilogue
encapsulation := delimiter body-part CRLF
delimiter := "--" boundary CRLF ; taken from Content-Type field.
                                ; There must be no space
                                ; between "--" and boundary.
close-delimiter := "--" boundary "--" CRLF ; Again, no space by "--".
preamble := discard-text ; to be ignored upon receipt.
epilogue := discard-text ; to be ignored upon receipt.
discard-text := *(*text CRLF)
body-part := <"message" as defined in RFC 822, with all header fields optional, and with the specified delimiter not occurring anywhere in the message body, either on a line by itself or as a substring anywhere. Note that the semantics of a part differ from the semantics of a message, as described in the text.>
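Read from the composer's side, the grammar above amounts to a simple concatenation. The Python sketch below follows it production by production; `build_multipart_body` is a hypothetical helper, each element of `parts` is assumed to be an already-formed body part (header area, blank line, content), and the occurrence check is stricter than the grammar requires because it rejects the marker even without its trailing CRLF.

```python
from typing import Sequence

CRLF = b"\r\n"


def build_multipart_body(parts: Sequence[bytes], boundary: str,
                         preamble: bytes = b"", epilogue: bytes = b"") -> bytes:
    """multipart-body := preamble 1*encapsulation close-delimiter epilogue"""
    if not parts:
        raise ValueError("the grammar requires at least one encapsulation")
    for text in (preamble, epilogue):
        # discard-text is a sequence of CRLF-terminated lines (or nothing).
        if text and not text.endswith(CRLF):
            raise ValueError("preamble and epilogue must end with CRLF")

    marker = b"--" + boundary.encode("ascii")  # no space after "--"
    if any(marker in part for part in parts):
        raise ValueError("boundary occurs inside a body part")

    out = [preamble]
    for part in parts:
        # encapsulation := delimiter body-part CRLF
        out.extend([marker + CRLF, part, CRLF])
    # close-delimiter := "--" boundary "--" CRLF
    out.extend([marker + b"--" + CRLF, epilogue])
    return b"".join(out)
```

The CRLF appended after each body part is the one that is conceptually attached to the following boundary, so a part that should itself end with a line break has to carry its own trailing CRLF.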
NOTE: In certain transport enclaves, RFC 822 restrictions such as
the one that limits bodies to printable ASCII characters may not
be in force. (That is, the transport domains may resemble
standard Internet mail transport as specified in RFC 821 and
assumed by RFC 822, but without certain restrictions.) The
relaxation of these restrictions should be construed as locally
extending the definition of bodies, for example to include octets
outside of the ASCII range, as long as these extensions are
supported by the transport and adequately documented in the
Content-Transfer-Encoding header field. However, in no event are
headers (either message headers or body-part headers) allowed to
contain anything other than ASCII characters.
NOTE: Conspicuously missing from the multipart type is a notion of
structured, related body parts. In general, it seems premature to
try to standardize interpart structure yet. It is recommended
that those wishing to provide a more structured or integrated
multipart messaging facility should define a subtype of multipart
that is syntactically identical, but that always expects the
inclusion of a distinguished part that can be used to specify the
structure and integration of the other parts, probably referring
to them by their Content-ID field. If this approach is used,
other implementations will not recognize the new subtype, but will
treat it as the primary subtype (multipart/mixed) and will thus be
able to show the user the parts that are recognized.