<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Character Model for the World Wide Web 1.0: Resource Identifiers </title><style type="text/css">
code { font-family: monospace; }
div.constraint,
div.issue,
div.note,
div.example,
div.notice { margin-left: 2em; }
.example-head, .note-head { font-weight: bold }
li p { margin-top: 0.3em;
margin-bottom: 0.3em; }
.rfc2119, .uname { text-transform: lowercase; font-variant: small-caps; }
.new-term { font-weight: bold }
.quote { font-style: italic }
.figure { margin-bottom: 2em; }
.caption {
text-align: center;
margin: 0.5em 2em;
font-style: italic;
}
.editor-note { font-style: italic; color: red; }
.req { background: #ffffcc; }
.reqId, .reqId a {
color: #005A9C;
background: white;
font-weight: bold;
font-style: italic;
text-decoration: none;
}
img { border: 0; }
@media print {
.req { background: #ffcc99 }
}
div.exampleInner pre { margin-left: 1em;
margin-top: 0em; margin-bottom: 0em}
div.exampleOuter {border: 4px double gray;
margin: 0em; padding: 0em}
div.exampleInner { background-color: #d5dee3;
border-top-width: 4px;
border-top-style: double;
border-top-color: #d3d3d3;
border-bottom-width: 4px;
border-bottom-style: double;
border-bottom-color: #d3d3d3;
padding: 4px; margin: 0em }
div.exampleWrapper { margin: 4px }
div.exampleHeader { font-weight: bold;
margin: 4px}
</style><link rel="stylesheet" type="text/css" href="http://www.w3.org/StyleSheets/TR/W3C-CR" /></head><body><div class="head"><p><a href="http://www.w3.org/"><img src="http://www.w3.org/Icons/w3c_home" alt="W3C" height="48" width="72" /></a></p> <h1><a name="title" id="title" />Character Model for the World Wide Web 1.0: Resource Identifiers </h1> <h2><a name="w3c-doctype" id="w3c-doctype" />W3C Candidate Recommendation 22 November 2004</h2><dl><dt>This version:</dt><dd>
<a href="http://www.w3.org/TR/2004/CR-charmod-resid-20041122/">http://www.w3.org/TR/2004/CR-charmod-resid-20041122</a></dd><dt>Latest version:</dt><dd>
<a href="http://www.w3.org/TR/charmod-resid">http://www.w3.org/TR/charmod-resid</a>
</dd><dt>Previous version:</dt><dd><a href="http://www.w3.org/TR/2004/WD-charmod-20040225/">http://www.w3.org/TR/2004/WD-charmod-20040225 (prior to document split)</a></dd><dt>Editors:</dt><dd>Martin J. Dürst, W3C <a href="mailto:duerst@w3.org">&lt;duerst@w3.org&gt;</a></dd><dd>François Yergeau (Invited Expert)</dd><dd>Richard Ishida, W3C <a href="mailto:ishida@w3.org">&lt;ishida@w3.org&gt;</a></dd><dd>Misha Wolf (until Dec 2002), Reuters Ltd. <a href="mailto:misha.wolf@reuters.com">&lt;misha.wolf@reuters.com&gt;</a></dd><dd>Tex Texin (Invited Expert), XenCraft <a href="mailto:tex@XenCraft.com">&lt;tex@XenCraft.com&gt;</a></dd></dl><p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> © 2004 <a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a><sup>®</sup> (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>, <a href="http://www.ercim.org/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> rules apply.</p></div><hr /><div> <h2><a name="abstract" id="abstract" />Abstract</h2><p>This Architectural Specification provides authors of specifications,
software developers, and content developers with a common reference for
the use of resource identifiers building on the Universal Character Set,
defined jointly by the Unicode Standard and
ISO/IEC 10646.</p><p>For topics such as
use of the terms '<span class="qterm">character</span>', '<span class="qterm">encoding</span>' and '<span class="qterm">string</span>', a reference processing model, choice and identification of character encodings, character escaping, and string indexing, see <cite> Character Model for the World Wide Web 1.0: Fundamentals</cite> <a href="#charmod1">[CharMod]</a>. For normalization and string identity matching, see <cite>Character Model for the World Wide Web 1.0: Normalization</cite> <a href="#charnorm">[CharNorm]</a>.</p></div><div> <h2><a name="status" id="status" />Status of this Document</h2><p><em>This section describes the status of this document at the time
of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the <a href="http://www.w3.org/TR/">W3C technical reports index</a> at http://www.w3.org/TR/.</em>
</p><p>This document is a Candidate Recommendation of the W3C. This document has been produced by the <a href="http://www.w3.org/International/Group/">W3C
Internationalization Working Group (I18N WG)</a> (Members only), with the help of the Internationalization Interest Group, as part of the <a href="http://www.w3.org/International/Activity">W3C
Internationalization Activity</a>.
Publication as a Candidate Recommendation does not imply endorsement by the W3C
Membership. This is a draft document and may be updated, replaced or obsoleted
by other documents at any time. It is inappropriate to cite this document as
other than work in progress.</p><p>Section 3 of this document was formerly <a href="http://www.w3.org/TR/2004/WD-charmod-20040225/#sec-URIs">Section
7</a> of the <a href="http://www.w3.org/TR/2004/WD-charmod-20040225/">Character
Model for the World Wide Web 1.0: Fundamentals</a> Last Call Working Draft
published 25 February 2004. A more detailed change log is given in <a href="#sec-Changes">Appendix C, Changes</a>.</p><p>The I18N WG invites comments on this specification. Comments
should be submitted by email to <a href="mailto:www-i18n-comments@w3.org">www-i18n-comments@w3.org</a>
(<a href="http://lists.w3.org/Archives/Public/www-i18n-comments/">public
archive</a>). Please send one email per comment where possible, otherwise
number comments clearly.</p><p>The WG is soliciting implementation reports on this specification and related
technology. The WG plans to submit this specification for consideration as a
W3C Proposed Recommendation as soon as the following conditions are met:</p><ol type="1"><li><p>The document <cite>Internationalized Resource Identifiers
(IRIs)</cite> is published as an RFC in IETF Proposed Standard status.</p></li><li><p>There is a test suite that tests the use of IRIs along at least the
following axes:</p><ol type="a"><li><p>use of IRIs in several document formats;</p></li><li><p>use of IRIs in several locations in the same document format;</p></li><li><p>use of non-ASCII characters in different parts of an IRI
(e.g. domain name part, path part);</p></li><li><p>use of IRIs in documents with various widely used character
encodings and with characters from various scripts;</p></li><li><p>use of document-specific escapes in IRIs;</p></li><li><p>use of IRIs with various URI schemes;</p></li><li><p>setup of various servers for IRIs;</p></li><li><p>the translation of IRIs into URIs.</p></li></ol></li><li><p>For each of the above-mentioned axes, there are at least
two implementations passing the applicable tests.</p></li><li><p>The WG has responded formally to all issues raised against this document.</p></li></ol><p>The WG is expecting that this will take at least until 15 January 2005,
but possibly longer.</p><p>Patent disclosures relevant to this specification may be found on the Working Group's <a href="http://www.w3.org/International/2002/Disclosures">patent disclosure page</a>. This document has been produced under the <a href="http://www.w3.org/TR/2002/NOTE-patent-practice-20020124">24 January 2002 CPP</a> as amended by the <a href="http://www.w3.org/2004/02/05-pp-transition">W3C Patent Policy Transition Procedure</a>. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">section 6 of the W3C Patent Policy</a>.</p></div><div class="toc"> <h2><a name="contents" id="contents" />Table of Contents</h2><p class="toc">1 <a href="#sec-Intro">Introduction</a><br /> 2 <a href="#sec-Conformance">Conformance</a><br /> 3 <a href="#sec-URIs">Character Encoding in Resource Identifiers</a><br /> </p> <h3><a name="appendices" id="appendices" />Appendices</h3><p class="toc">A <a href="#sec-References">References</a><br /> A.1 <a href="#sec-NormativeReferences">Normative References</a><br /> A.2 <a href="#sec-OtherReferences">Other References</a><br /> B <a href="#sec-Checklist">List of conformance criteria</a> (Non-Normative)<br /> C <a href="#sec-Changes">Changes</a> (Non-Normative)<br /> D <a href="#sec-Acknowledgements">Acknowledgements</a> (Non-Normative)<br /> </p></div><hr /><div class="body"><div class="div1"> <h2><a name="sec-Intro" id="sec-Intro" />1 Introduction</h2><p>The goal of the Character Model for the World Wide Web is to facilitate use
of the Web by all people, regardless of their language, script, writing system,
and cultural conventions, in accordance with the <a href="http://www.w3.org/Consortium/#goals"><cite>W3C goal of universal access</cite></a>.
One basic prerequisite to achieve this goal is to be able to transmit and
process the characters used around the world in a well-defined and well-understood
way.</p><p>The main target audience of this specification is W3C specification
developers. This specification and parts of it can be referenced from other W3C
specifications. It defines conformance criteria for W3C specifications as well
as other specifications.</p><p>The character model described in this specification provides authors of
specifications, software developers, and content developers with a common
reference for consistent, interoperable text manipulation on the World Wide Web.
Working together, these three groups can build a more international Web.</p><p>The topic addressed in this part of the Character Model for the World Wide Web is the character encoding of resource identifiers.
A resource identifier is a compact string of characters for identifying an abstract or physical resource.</p><p>Other parts of the Character Model address the fundamental aspects of the
model (<a href="#charmod1">[CharMod]</a>) and normalization and string identity matching
(<a href="#charnorm">[CharNorm]</a>). For more background information, please see <a href="#charmod1">[CharMod]</a>.</p><p>Topics not yet addressed, or barely touched, include fuzzy
matching and language tagging. Some of these topics may be addressed in
future versions or parts of this specification.</p><p>At the core of the model is the Universal Character Set (UCS), defined
jointly by the Unicode Standard <a href="#unicode">[Unicode]</a> and ISO/IEC 10646
<a href="#iso10646">[ISO/IEC 10646]</a>. In this document, <span class="new-term"> Unicode</span> is used as a
synonym for the Universal Character Set. The model will allow Web documents
authored in the world's scripts (and on different platforms) to be exchanged,
read, and searched by Web users around the world.</p></div><div class="div1"> <h2><a name="sec-Conformance" id="sec-Conformance" />2 Conformance</h2><p>This section explains the conditions that specifications, software, and Web content have to fulfill to be able to claim conformance to this specification.</p><p>The key words "<span class="rfc2119" >MUST</span>", "<span class="rfc2119" >MUST
NOT</span>", "<span class="rfc2119" >REQUIRED</span>", "<span class="rfc2119" >SHALL</span>",
"<span class="rfc2119" >SHALL NOT</span>", "<span class="rfc2119" >SHOULD</span>", "<span class="rfc2119" >SHOULD
NOT</span>", "<span class="rfc2119" >RECOMMENDED</span>", "<span class="rfc2119" >MAY</span>" and
"<span class="rfc2119" >OPTIONAL</span>" in this document are to be interpreted as
described in RFC 2119 <a href="#rfc2119">[RFC 2119]</a>.</p><div class="note"><p><span class="note-head">NOTE: </span>RFC 2119 makes it clear that requirements that use
<span class="rfc2119" >SHOULD</span> are not optional and must be complied with unless
there are specific reasons not to: "<span class="quote">This word, or the adjective
"RECOMMENDED", mean that there may exist valid reasons in particular
circumstances to ignore a particular item, but the full implications must be
understood and carefully weighed before choosing a different
course.</span>"
</p></div><p>This specification defines conformance criteria
for specifications. All
conformance criteria are
preceded by '<span class="qterm">[S]</span>' where 'S' stands for specifications.</p><p>A specification conforms to this document if it:</p><ol type="1"><li><p> does not violate any conformance criteria preceded by [S],</p></li><li><p>documents the reason for any deviation from criteria where the imperative is <span class="rfc2119" >SHOULD</span>, <span class="rfc2119" >SHOULD NOT</span>, or <span class="rfc2119" >RECOMMENDED</span>,</p></li><li><p>where applicable, requires implementations conforming to the specification to conform to this document,</p></li><li><p> where applicable, requires content conforming to the specification to conform to this document.</p></li></ol><div class="note"><p><span class="note-head">NOTE: </span>Requirements placed on specifications might indirectly cause requirements to be placed on implementations or content that claim to conform to those specifications. Likewise, requirements placed on content may affect implementations designed to produce such content, and so on.</p></div><p>Where this specification places requirements on processing, it is to be understood as a way to
specify the desired external behavior. Implementations can
use other means of achieving the same results, as
long as observable behavior is not affected.</p></div><div class="div1"> <h2><a name="sec-URIs" id="sec-URIs" />3 Character Encoding in Resource Identifiers</h2><p>According to the definition in RFC 2396 <a href="#rfc2396">[RFC 2396]</a>, URI
references are restricted to a subset of US-ASCII, with an escaping mechanism
to encode arbitrary byte values, using the %HH convention. However, the %HH
convention by itself is of limited use because there is no definitive mapping
from characters to bytes. Also, non-ASCII characters cannot be used directly.
<cite>Internationalized Resource Identifiers (IRIs)</cite>
<a href="#uri-i18n">[I-D IRI]</a> solves both problems with a uniform approach that
conforms to the <a href="http://www.w3.org/TR/charmod#sec-RefProcModel">Reference Processing
Model</a>. </p><p>
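</p><div class="example"><p><span class="example-head">EXAMPLE: </span>The ambiguity of the %HH convention on its own can be illustrated with a short Python sketch. It is not taken from RFC 2396 or from the IRI draft; the helper name <code>percent_escape</code> is hypothetical, and ISO-8859-1 is chosen only as one example of a legacy character encoding. The same character yields different escape sequences depending on which character encoding supplies the bytes:</p><div class="exampleInner"><pre>
```python
from urllib.parse import quote


def percent_escape(text, encoding):
    """Encode text to bytes with the given encoding, then apply %HH escaping."""
    # Hypothetical helper for illustration; safe="" escapes reserved characters too.
    return quote(text.encode(encoding), safe="")


# U+00E9 (e with acute accent) under two different character encodings.
latin1_form = percent_escape("\u00e9", "iso-8859-1")  # one byte: E9
utf8_form = percent_escape("\u00e9", "utf-8")  # two bytes: C3 A9

print(latin1_form)  # %E9
print(utf8_form)  # %C3%A9
```
</pre></div><p>A receiver of a bare %HH sequence cannot tell which encoding produced it, which is why a single agreed encoding such as UTF-8 is needed before the escapes can identify characters reliably.</p></div><p>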
<a id="C058" name="C058" href="#C058" ><span class="reqId">C058</span></a> <span class="req" >
<span class="requirement-type">[S]</span>
Specifications that define
protocol or format elements (e.g. HTTP headers, XML attributes, etc.) which are
to be interpreted as URI references (or specific subsets of URI references,
such as absolute URI references, URIs, etc.) <span class="rfc2119">SHOULD</span> use
<cite>Internationalized Resource Identifiers (IRIs)</cite>
<a href="#uri-i18n">[I-D IRI]</a> (or an appropriate subset thereof).
</span></p><p><a id="C059" name="C059" href="#C059" ><span class="reqId">C059</span></a> <span class="req" >
<span class="requirement-type">[S]</span>
Specifications <span class="rfc2119">MUST</span>
define when the conversion from IRI references to URI references (or subsets
thereof) takes place, in accordance with <cite>Internationalized Resource
Identifiers (IRIs)</cite>
<a href="#uri-i18n">[I-D IRI]</a>.
</span>
</p><div class="note"><p><span class="note-head">NOTE: </span>Many current specifications already contain provisions in
accordance with <cite>Internationalized Resource Identifiers
(IRIs)</cite>
<a href="#uri-i18n">[I-D IRI]</a>. For XML 1.0 <a href="#xml10">[XML 1.0]</a>,
see <a href="http://www.w3.org/TR/REC-xml#sec-external-ent">Section
4.2.2, External Entities</a>. XML Schema Part 2: Datatypes <a href="#xmlschema-2">[XML Schema-2]</a>
provides the <code class="keyword">anyURI</code> datatype (see
<a href="http://www.w3.org/TR/xmlschema-2/#anyURI">Section
3.2.17</a>). The XML Linking Language (XLink) <a href="#xlink">[XLink]</a>
provides the href attribute (see
<a href="http://www.w3.org/TR/xlink/#link-locators">Section 5.4, Locator
Attribute</a>). Further information and links can be found at
<cite>Internationalization: URIs and other identifiers</cite>
<a href="#i18nuri">[Info URI-I18N]</a>.</p></div><div class="note"><p><span class="note-head">NOTE: </span>Document formats should allow IRIs to be used; handlers for protocols
that do not currently support IRIs can convert the IRI to a URI when
the IRI is dereferenced.</p></div><p>
<a id="C060" name="C060" href="#C060" ><span class="reqId">C060</span></a> <span class="req" >
<span class="requirement-type">[S]</span>
Specifications that define
new syntax for URIs, such as a new URI scheme or a new kind of fragment
identifier, <span class="rfc2119">MUST</span> specify that characters outside the
US-ASCII repertoire are encoded using UTF-8 and %HH-escaping.
</span></p><p>This is in accordance
with <cite>Guidelines for new URL Schemes</cite>
<a href="#rfc2718">[RFC 2718]</a>, Section 2.2.5.</p><p><a id="C061" name="C061" href="#C061" ><span class="reqId">C061</span></a> <span class="req" >
<span class="requirement-type">[S]</span>
Specifications that define new syntax for URIs <span class="rfc2119">SHOULD</span> also define the
normalization requirements for the syntax they introduce.
</span>
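</p><div class="example"><p><span class="example-head">EXAMPLE: </span>The conversion referred to in C059 and the encoding required by C060 can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the algorithm of the IRI draft: the function name <code>iri_to_uri</code> is hypothetical, the input is assumed to be an already valid IRI held as a Unicode string, and no special handling of domain names is attempted.</p><div class="exampleInner"><pre>
```python
def iri_to_uri(iri):
    """Map an IRI string to a URI string (illustrative sketch only).

    Each non-ASCII character is replaced by the %HH escapes of its UTF-8
    bytes; ASCII characters, including existing %HH escapes, are kept as-is.
    """
    pieces = []
    for ch in iri:
        if ord(ch) >= 128:
            # Non-ASCII: encode as UTF-8, then escape every resulting byte.
            pieces.extend("%{:02X}".format(byte) for byte in ch.encode("utf-8"))
        else:
            pieces.append(ch)
    return "".join(pieces)


iri = "http://www.example.org/r\u00e9sum\u00e9.html"
uri = iri_to_uri(iri)
print(uri)  # http://www.example.org/r%C3%A9sum%C3%A9.html
```
</pre></div><p>A handler for a protocol that accepts only URIs could apply a mapping of this kind at the point where the IRI is dereferenced, leaving the IRI itself unchanged in the document.</p></div><p>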
</p></div></div><div class="back"><div class="div1"> <h2><a name="sec-References" id="sec-References" />A References</h2><div class="div2"> <h3><a name="sec-NormativeReferences" id="sec-NormativeReferences" />A.1 Normative References</h3><dl><dt class="label"><a name="uri-i18n" id="uri-i18n" />I-D IRI</dt><dd>Martin Dürst, Michel Suignard,
<a href="http://www.w3.org/International/iri-edit/draft-duerst-iri-08.txt"><cite>Internationalized
Resource Identifiers (IRIs)</cite></a>, Internet-Draft, September 2004. (See
<a href="http://www.w3.org/International/iri-edit/draft-duerst-iri-10.txt">http://www.w3.org/International/iri-edit/draft-duerst-iri-10.txt</a>.)
[NOTE: This reference will be updated once the IRI draft is available as an
RFC.]</dd><dt class="label"><a name="iso10646" id="iso10646" />ISO/IEC 10646</dt><dd>ISO/IEC 10646:2003,
<a href="http://www.iso.ch/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=29819"><cite>Information
technology -- Universal Multiple-Octet Coded Character Set (UCS)</cite></a>, as, from time to time, amended, replaced by a
new edition or expanded by the addition of new parts. (See
<a href="http://www.iso.ch/iso/en/ISOOnline.openerpage">http://www.iso.ch/iso/en/ISOOnline.openerpage</a> for the
latest version.)</dd><dt class="label"><a name="rfc2119" id="rfc2119" />RFC 2119</dt><dd>S. Bradner,
<a href="http://www.ietf.org/rfc/rfc2119.txt"><cite>Key words for use in RFCs
to Indicate Requirement Levels</cite></a>, IETF RFC 2119. (See
<a href="http://www.ietf.org/rfc/rfc2119.txt">http://www.ietf.org/rfc/rfc2119.txt</a>.)
</dd><dt class="label"><a name="rfc2396" id="rfc2396" />RFC 2396</dt><dd>T. Berners-Lee, R. Fielding, L.
Masinter, <a href="http://www.ietf.org/rfc/rfc2396.txt"><cite>Uniform Resource
Identifiers (URI): Generic Syntax</cite></a>, IETF RFC 2396, August 1998. (See
<a href="http://www.ietf.org/rfc/rfc2396.txt">http://www.ietf.org/rfc/rfc2396.txt</a>.)
[NOTE: This reference will be updated once the successor to this document,
<a href="http://www.ietf.org/internet-drafts/draft-fielding-uri-rfc2396bis-07.txt">draft-fielding-uri-rfc2396bis-07.txt</a>, is available as an RFC.]</dd><dt class="label"><a name="unicode" id="unicode" />Unicode</dt><dd>The Unicode Consortium,
<cite>The Unicode Standard, Version 4</cite>, ISBN 0-321-18578-1, as
updated from time to time by the publication of new versions. (See
<a href="http://www.unicode.org/unicode/standard/versions/">http://www.unicode.org/unicode/standard/versions</a>
for the latest version and additional information on versions of the standard
and of the Unicode Character Database).</dd></dl></div><div class="div2"> <h3><a name="sec-OtherReferences" id="sec-OtherReferences" />A.2 Other References</h3><dl><dt class="label"><a name="charmod1" id="charmod1" />CharMod</dt><dd>Martin J. Dürst,
François Yergeau, Richard Ishida, Misha Wolf, Tex Texin,
<a href="http://www.w3.org/TR/charmod"><cite>Character Model for the
World Wide Web 1.0: Fundamentals</cite></a>, W3C Proposed Recommendation 22 November 2004. (See
<a href="http://www.w3.org/TR/charmod">http://www.w3.org/TR/charmod</a>.)
</dd><dt class="label"><a name="charnorm" id="charnorm" />CharNorm</dt><dd>Martin J. Dürst,
François Yergeau, Richard Ishida, Misha Wolf, Tex Texin, Addison Phillips,
<a href="http://www.w3.org/TR/charmod-norm"><cite>Character Model for the
World Wide Web 1.0: Normalization</cite></a>, W3C Working Draft 25 February 2004. (See
<a href="http://www.w3.org/TR/charmod-norm">http://www.w3.org/TR/charmod-norm</a>.)
</dd><dt class="label"><a name="i18nuri" id="i18nuri" />Info URI-I18N</dt><dd>
<a href="http://www.w3.org/International/O-URL-and-ident"><cite>Internationalization: URIs and other identifiers</cite></a>. (See
<a href="http://www.w3.org/International/O-URL-and-ident">http://www.w3.org/International/O-URL-and-ident</a>.)
</dd><dt class="label"><a name="rfc2718" id="rfc2718" />RFC 2718</dt><dd>L. Masinter, H. Alvestrand, D.
Zigmond, R. Petke, <a href="http://www.ietf.org/rfc/rfc2718.txt"><cite>Guidelines for new URL
Schemes</cite></a>, IETF RFC 2718, November 1999. (See
<a href="http://www.ietf.org/rfc/rfc2718.txt">http://www.ietf.org/rfc/rfc2718.txt</a>.)</dd><dt class="label"><a name="xlink" id="xlink" />XLink</dt><dd>Steve DeRose, Eve Maler, David Orchard,
Eds, <a href="http://www.w3.org/TR/xlink/"><cite>XML Linking Language (XLink)
Version 1.0</cite></a>, W3C Recommendation 27 June 2001. (See
<a href="http://www.w3.org/TR/xlink/">http://www.w3.org/TR/xlink</a>.) </dd><dt class="label"><a name="xml10" id="xml10" />XML 1.0</dt><dd>Tim Bray, Jean Paoli, C. M.
Sperberg-McQueen, Eve Maler, François Yergeau, Eds.,
<a href="http://www.w3.org/TR/REC-xml"><cite>Extensible Markup Language (XML)
1.0</cite></a>, W3C Recommendation first published 10 February 1998, revised 4 February 2004. (See
<a href="http://www.w3.org/TR/REC-xml">http://www.w3.org/TR/REC-xml</a>.)
</dd><dt class="label"><a name="xmlschema-2" id="xmlschema-2" />XML Schema-2</dt><dd>Paul V. Biron, Ashok
Malhotra, Eds., <a href="http://www.w3.org/TR/xmlschema-2/"><cite>XML Schema
Part 2: Datatypes Second Edition</cite></a>, W3C Recommendation first published 2 May 2001, revised 28 October 2004. (See
<a href="http://www.w3.org/TR/xmlschema-2/">http://www.w3.org/TR/xmlschema-2</a>.)</dd></dl></div></div><div class="div1" > <h2><a name="sec-Checklist" id="sec-Checklist" />B List of conformance criteria (Non-Normative)</h2><p>Below is a list of the conformance criteria in this specification, in document order. This list can be used to check specifications for conformance to this specification.</p><p>When doing so, the following points should be kept in mind:</p><ul><li><p>To ensure that you understand the meaning, read the whole document first. Use this list as a quick reference only after having first read the conformance criteria in context in the main body of the text.</p></li><li><p>If the meaning of a conformance criterion in this list is still unclear after referring back to the surrounding text in the main body of the document, consider sending a comment to www-i18n-comments@w3.org (<a href="http://lists.w3.org/Archives/Public/www-i18n-comments/">publicly archived</a>).</p></li><li><p>Not all conformance criteria apply to all specifications. Before checking for actual conformance, applicability should
be checked. As an example, C060 only applies if your specification defines
new syntax for URIs, not if you are just using resource identifiers.</p></li></ul><table id="req-checklist"><tbody><tr><td class="reqId" style="vertical-align: top;"><a href="#C058">C058</a></td><td style="vertical-align: top;"><span class="requirement-type">[S]</span> </td><td>Specifications that define
protocol or format elements (e.g. HTTP headers, XML attributes, etc.) which are
to be interpreted as URI references (or specific subsets of URI references,
such as absolute URI references, URIs, etc.) <span class="rfc2119">SHOULD</span> use
<cite>Internationalized Resource Identifiers (IRIs)</cite>
<a href="#uri-i18n">[I-D IRI]</a> (or an appropriate subset thereof).</td></tr><tr><td class="reqId" style="vertical-align: top;"><a href="#C059">C059</a></td><td style="vertical-align: top;"><span class="requirement-type">[S]</span> </td><td>Specifications <span class="rfc2119">MUST</span>
define when the conversion from IRI references to URI references (or subsets
thereof) takes place, in accordance with <cite>Internationalized Resource
Identifiers (IRIs)</cite>
<a href="#uri-i18n">[I-D IRI]</a>.</td></tr><tr><td class="reqId" style="vertical-align: top;"><a href="#C060">C060</a></td><td style="vertical-align: top;"><span class="requirement-type">[S]</span> </td><td>Specifications that define
new syntax for URIs, such as a new URI scheme or a new kind of fragment
identifier, <span class="rfc2119">MUST</span> specify that characters outside the
US-ASCII repertoire are encoded using UTF-8 and %HH-escaping.</td></tr><tr><td class="reqId" style="vertical-align: top;"><a href="#C061">C061</a></td><td style="vertical-align: top;"><span class="requirement-type">[S]</span> </td><td>Specifications that define new syntax for URIs <span class="rfc2119">SHOULD</span> also define the
normalization requirements for the syntax they introduce.</td></tr></tbody></table></div><div class="div1"> <h2><a name="sec-Changes" id="sec-Changes" />C Changes (Non-Normative)</h2><p>This document is based on <a href="http://www.w3.org/TR/2004/WD-charmod-20040225/#sec-URIs">Section 7</a> of the <a href="http://www.w3.org/TR/2004/WD-charmod-20040225/#sec-URIs">Character Model for the World Wide Web 1.0: Fundamentals</a> Last Call Working Draft published 25 February 2004. Changes between Section 7 of that document and Section 3 of the present document are as follows:</p><ul><li><p>Title of section 3 changed from "Character Encoding in URI References" to "Character Encoding in Resource Identifiers".</p></li><li><p>Note after C059: "see Section 4.2.2, External Entities, and Erratum E26." changed to "see Section 4.2.2, External Entities.", since the erratum has been incorporated in a new edition.</p></li><li><p>C061: "Such specifications SHOULD" changed to "Specifications that define new syntax for URIs SHOULD".</p></li></ul><p>In addition, the remaining parts of this document have changed compared to the corresponding parts of the abovementioned Last Call Working Draft, as follows: The Introduction has been shortened to concentrate on the material in this document. The Conformance section has been reduced to take into account that this document only contains conformance criteria for specifications. The References section has been shortened by removing unrelated references. Where necessary, references have been updated. The reference to <a href="#uri-i18n">[I-D IRI]</a> was moved to the Normative References subsection. A List of Conformance Criteria and this section on Changes have been added.</p></div><div class="div1"> <h2><a name="sec-Acknowledgements" id="sec-Acknowledgements" />D Acknowledgements (Non-Normative)</h2><p>Tim
Berners-Lee and James Clark provided important details.
Asmus Freytag, Addison Phillips, and, in the early stages, Ian Jacobs provided significant help in the authoring and editing process. The W3C I18N WG and IG, as well as many others, provided many helpful comments and
suggestions.</p></div></div></body></html>