<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
  <title>TAG Finding: Internet Media Type registration, consistency of
  use</title>
  <link rel="stylesheet" type="text/css"
  href="http://www.w3.org/StyleSheets/TR/base" />
</head>

<body>

<div class="head">
<p><a href="http://www.w3.org/"><img height="48" width="72" alt="W3C"
src="http://www.w3.org/Icons/w3c_home" /></a></p>

<h1>Internet Media Type registration, consistency of use</h1>
<h2>TAG Finding 3 June 2002 (Revised 4 September 2002)</h2>
<dl>
  <dt>This version:</dt>
    <dd><a
      href="http://www.w3.org/2001/tag/2002/0129-mime">http://www.w3.org/2001/tag/2002/0129-mime</a></dd>
  <dt>Editor:</dt>
    <dd>Tim Bray</dd>
</dl>

<p class="copyright"><a rel="copyright"
href="http://www.w3.org/Consortium/Legal/ipr-notice-20000612#Copyright">Copyright</a>
© 2002 <a href="http://www.w3.org/"><abbr
title="World Wide Web Consortium">W3C</abbr></a><sup>®</sup> (<a
href="http://www.lcs.mit.edu/"><abbr
title="Massachusetts Institute of Technology">MIT</abbr></a>, <a
href="http://www.inria.fr/"><abbr xml:lang="fr" lang="fr"
title="Institut National de Recherche en Informatique et Automatique">INRIA</abbr></a>,
<a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a
href="http://www.w3.org/Consortium/Legal/ipr-notice-20000612#Legal_Disclaimer">liability</a>,
<a
href="http://www.w3.org/Consortium/Legal/ipr-notice-20000612#W3C_Trademarks">trademark</a>,
<a
href="http://www.w3.org/Consortium/Legal/copyright-documents-19990405">document
use</a> and <a
href="http://www.w3.org/Consortium/Legal/copyright-software-19980720">software
licensing</a> rules apply.</p>
</div>
<hr />

<h2>Abstract</h2>

<p>Internet Media Types are an important part of the Web architecture. This
finding discusses three aspects of Internet Media Types: registration by W3C
Working Groups, consistency between Internet Media Type and content, and
consistency in the communication of character encoding information.</p>

<h2>Status of this document</h2>

<p>This document has been produced by the <a href="/2001/tag/">W3C
Technical Architecture Group (TAG)</a>. This version includes changes
that have not yet been approved by the TAG regarding (1) registration
requirements and (2) charset header information.</p>

<p>The TAG approved the previous draft of this finding
at its <a
href="http://lists.w3.org/Archives/Public/www-tag/2002Jun/0019">3 June
2002 teleconference</a>. The TAG originally reached consensus on this
issue at its <a
href="http://lists.w3.org/Archives/Public/www-tag/2002Jan/0235.html">28
Jan 2002</a> teleconference and, after its <a
href="/2002/05/20-tag-summary">20 May 2002 teleconference</a>,
announced it to www-tag. The TAG notes that Tantek Çelik expressed
dissent about this finding. At
its <a href="/2002/12/16-tag-summary">16 Dec 2002
teleconference</a>, the TAG agreed to add a publication date to this
document, consistent with the TAG's expectation that findings no
longer be modified in place.</p>

<p>These findings were derived from discussion of TAG issues <a
href="http://www.w3.org/2001/tag/ilist#w3cMediaType-1">w3cMediaType-1</a>,
<a
href="http://www.w3.org/2001/tag/ilist#customMediaType-2">customMediaType-2</a>,
and <a
href="http://www.w3.org/2001/tag/ilist#nsMediaType-3">nsMediaType-3</a>
but in some cases extend beyond the specifics of the issue that was
raised.</p>


<p><a href="/2001/tag/findings">Additional TAG findings</a>, both approved
and in draft state, may also be available. The TAG expects to incorporate
this and other findings into a Web Architecture Document that will be
published according to the process of the <a
href="/Consortium/Process-20010719/tr#Recs">W3C Recommendation Track</a>.</p>

<p>The terms MUST, SHOULD, and SHOULD NOT are used in this document in
accordance with RFC 2119 [<a href="#RFC2119">RFC2119</a>].</p>

<p>Please send comments on this finding to the publicly archived TAG mailing
list www-tag@w3.org (<a
href="http://lists.w3.org/Archives/Public/www-tag/">archive</a>).</p>

<h2><a name="Contents" id="Contents">Contents</a></h2>
<ol>
  <li><a href="#registration">Registration of Media Types by W3C Working
    Groups</a></li>
  <li><a href="#consistency">Consistency of Media Types and Message
    Contents</a></li>
  <li><a href="#char-encoding">Consistency in Communicating Character
    Encoding</a></li>
  <li><a href="#refs">References</a></li>
</ol>

<h2>1. <a name="registration" id="registration">Registration of Media Types
by W3C Working Groups</a></h2>

<p>W3C Working Groups engaged in defining a language SHOULD arrange for the
registration of an Internet Media Type (defined in RFC 2046 <a
href="#RFC2046">[RFC2046]</a>) for that language; see
[<a href="#iana-reg">IANAREG</a>] for registration instructions.
The IETF registration forms MUST be available for review along
with the specification no later than Candidate Recommendation
(or at last call if the Working Group expects to advance directly
to Proposed Recommendation). The 
IETF registration forms SHOULD be available for review
no later than last call.</p>

<p>The conventions and framework established by RFC 3023 <a
href="#RFC3023">[RFC3023]</a> SHOULD be followed when registering an Internet
Media Type for a language that uses XML syntax.</p>

<h2>2. <a name="consistency" id="consistency">Consistency of Media Types and
Message Contents</a></h2>

<p>The architecture of the Web depends on applications making dispatching and
security decisions for resources based on their Internet Media Types and
other MIME headers. It is a serious error for the response body to be
inconsistent with the assertions made about it by the MIME headers. Web
software SHOULD NOT attempt to recover from such errors by guessing, but
SHOULD report the error to the user to allow intelligent corrective
action.</p>

<p>An example of <strong>incorrect and dangerous behavior</strong> is a
user-agent that reads some part of the body of a response and decides to
treat it as HTML based on its containing a <code>&lt;!DOCTYPE</code>
declaration or <code>&lt;title&gt;</code> tag, when it was served as
<code>text/plain</code> or some other non-HTML type.</p>
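<p>As a minimal sketch of the alternative, a user agent can choose its
handler from the declared media type alone. The handler names and the
<code>UnsupportedMediaType</code> exception below are hypothetical and are
not taken from any particular user agent; the function deliberately takes no
body argument, so content sniffing cannot happen.</p>

<pre>
```python
from email.message import Message


class UnsupportedMediaType(Exception):
    """Raised so the problem can be reported instead of guessed around."""


def declared_media_type(content_type_header: str) -> str:
    """Return the lower-cased type/subtype from a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


def choose_handler(content_type_header: str) -> str:
    """Pick a handler from the declared media type only."""
    handlers = {  # hypothetical handler names, for illustration
        "text/plain": "show_as_text",
        "text/html": "render_as_html",
    }
    media_type = declared_media_type(content_type_header)
    if media_type not in handlers:
        # Report the problem; do not fall back to inspecting the body.
        raise UnsupportedMediaType(
            f"no handler registered for {media_type}"
        )
    return handlers[media_type]
```
</pre>

<p>Under this sketch a response served as <code>text/plain</code> is shown
as text even when its body begins with a DOCTYPE declaration.</p>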

<p>Examples of such inconsistencies that have been observed on the Web
include:</p>
<ul>
  <li>The Unicode encoding of a message body (XML document) is inconsistent
    with the value of the <code>charset</code> parameter in the message
    headers. See <a
    href="http://www.w3.org/2001/04/roadmap/xml-charset.svg">SVG diagram for
    determining character encoding</a>.</li>
  <li>The namespace of the root element of a message body (XML document) is
    inconsistent with the value of the media type header in the message
    headers.</li>
</ul>
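<p>The second inconsistency can be detected mechanically. The following is
an illustrative sketch only: the table pairing media types with root
namespaces is an assumption made for the example, and the function names are
hypothetical. It reports a mismatch rather than guessing which of the two
assertions is correct.</p>

<pre>
```python
import xml.etree.ElementTree as ET

# Illustrative table only: media types paired with the namespace expected
# on the root element. A real application would supply its own entries.
EXPECTED_ROOT_NAMESPACE = {
    "image/svg+xml": "http://www.w3.org/2000/svg",
    "application/xhtml+xml": "http://www.w3.org/1999/xhtml",
}


def root_namespace(xml_bytes: bytes) -> str:
    """Return the namespace URI of the root element, or "" if none."""
    root = ET.fromstring(xml_bytes)
    # ElementTree spells qualified names as "{namespace-uri}local-name".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return ""


def check_root_namespace(media_type: str, xml_bytes: bytes) -> None:
    """Raise ValueError when the root namespace contradicts the type."""
    expected = EXPECTED_ROOT_NAMESPACE.get(media_type.lower())
    if expected is None:
        return  # no expectation recorded for this media type
    actual = root_namespace(xml_bytes)
    if actual != expected:
        raise ValueError(
            f"{media_type} declared, but root namespace is {actual!r}"
        )
```
</pre>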

<h2>3. <a name="char-encoding" id="char-encoding">Consistency in
Communicating Character Encoding</a></h2>

<p>The first example in the preceding section is a particularly
troublesome case. Section 7.1 of [<a href="#RFC3023">RFC3023</a>] 
states:</p>

<blockquote>
<p>The use of the charset parameter is STRONGLY RECOMMENDED,
since this information can be used by XML processors to determine
authoritatively the charset of the XML MIME entity.</p>
</blockquote>

<p>and states that, when used, it is always authoritative. However, a
receiving application can, with very high reliability, determine the
encoding of an XML document by reading it, without reference to any
external headers. RFC 3023 reflects this in the following
sections:</p>
<ul>
<li>8.9 Application/xml with Omitted Charset and UTF-16 XML MIME
Entity</li>
<li>8.10 Application/xml with Omitted Charset and UTF-8 Entity</li>
<li>8.11 Application/xml with Omitted Charset and Internal Encoding
Declaration</li>
</ul>
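<p>A simplified sketch of such in-band detection follows. It is an
approximation, not a conforming implementation: it handles only a byte
order mark for UTF-8 or UTF-16 and an encoding declaration written in an
ASCII-compatible encoding, and it ignores UTF-16 without a byte order mark,
UTF-32 and EBCDIC. The function name <code>sniff_xml_encoding</code> is
hypothetical.</p>

<pre>
```python
import codecs
import re

# "\x3c" and "\x3e" are the left and right angle brackets, written as
# escapes. The pattern matches an XML declaration that names an encoding.
_ENCODING_DECL = re.compile(
    rb"^\x3c\?xml[^\x3e]*?\sencoding\s*=\s*"
    rb"[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']"
)


def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML entity from its first bytes alone."""
    # A byte order mark settles the question for UTF-8 and UTF-16.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    # Otherwise look for an encoding declaration in the XML declaration.
    match = _ENCODING_DECL.match(data[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # With neither a byte order mark nor a declaration, and no external
    # information, XML 1.0 requires the entity to be in UTF-8.
    return "utf-8"
```
</pre>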

<p>Thus there is no ambiguity when the charset is omitted, and the
STRONGLY RECOMMENDED injunction to use the charset is misplaced for
application/xml and for non-text "+xml" types. Consequently, for XML
representations, server-side applications SHOULD supply a charset
parameter only when there is complete certainty as to the encoding in
use. Otherwise, an incorrect charset value will cause a perfectly usable
representation to be rejected by an architecturally sound client.</p>

<p>We recommend that section 7.1 of [<a href="#RFC3023">RFC3023</a>]
be amended to something like the following:</p> 

<blockquote>
<p>The use of the charset parameter, when the charset is
reliably known and agrees with the encoding declaration, is
RECOMMENDED, since this information can be used by non-XML processors
to determine authoritatively the charset of the XML MIME
entity.</p>
</blockquote>
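<p>On the server side, the guidance in this section might be sketched as
follows. The function <code>xml_content_type</code> and its parameters are
hypothetical; the sketch assumes the media type is application/xml or a
non-text "+xml" type, and that the caller passes a charset only when it is
certain of it.</p>

<pre>
```python
from typing import Optional


def xml_content_type(
    media_type: str,
    known_charset: Optional[str] = None,
    declared_encoding: Optional[str] = None,
) -> str:
    """Build a Content-Type value, adding charset only when it is safe.

    known_charset: encoding the server is certain of, or None.
    declared_encoding: encoding named in the XML declaration, or None.
    """
    if known_charset is None:
        # Not certain: omit charset and let the document speak for itself.
        return media_type
    if (
        declared_encoding is not None
        and declared_encoding.lower() != known_charset.lower()
    ):
        # The header would contradict the document, so say nothing.
        return media_type
    return f"{media_type}; charset={known_charset}"
```
</pre>

<p>For example, <code>xml_content_type("application/xml")</code> yields a
bare <code>application/xml</code>, leaving the receiving application to read
the encoding from the document.</p>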


<h2>4. <a id="refs" name="refs">References</a></h2>
<dl>
  <dt><a id="RFC2046" name="RFC2046">[RFC2046]</a></dt>
    <dd>"<a href="http://www.ietf.org/rfc/rfc2046.txt">RFC2046: Multipurpose
      Internet Mail Extensions (MIME) Part Two: Media Types</a>", N. Freed
      and N. Borenstein, November 1996. Available at
      http://www.ietf.org/rfc/rfc2046.txt.</dd>
  <dt><a name="RFC2119" id="RFC2119">[RFC2119]</a></dt>
    <dd>"<a href="http://www.ietf.org/rfc/rfc2119">RFC2119: Key words for use
      in RFCs to Indicate Requirement Levels</a>", S. Bradner, March 1997.
      Available at http://www.ietf.org/rfc/rfc2119.txt.</dd>
  <dt><a id="RFC3023" name="RFC3023">[RFC3023]</a></dt>
    <dd>"<a href="http://www.ietf.org/rfc/rfc3023.txt">RFC3023: XML Media
      Types</a>", M. Murata, S. St. Laurent, D. Kohn, January 2001. Available
      at http://www.ietf.org/rfc/rfc3023.txt.</dd>
  <dt><a id="iana-reg" name="iana-reg">[IANAREG]</a></dt>
    <dd>"<a href="http://www.w3.org/2002/06/registering-mediatype">How
to Register a Media Type with IANA</a>". This is an informal
document intended to capture best practice
for requests that a MIME type defined by a W3C Recommendation be
registered in the IANA registry. This document may change as W3C
learns from experience or as processes in the various organizations
evolve. This document is available at
http://www.w3.org/2002/06/registering-mediatype.</dd>
</dl>
<hr />

<p>Last modified: $Date: 2002/12/17 13:06:11 $ by $Author: ijacobs $. $Revision: 1.33 $</p>
</body>
</html>