<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>XML Blueberry Requirements</title>
<link rel="stylesheet" type="text/css" media="screen" href="http://www.w3.org/StyleSheets/TR/W3C-WD">
</head>
<body>
<div class="head">
<p><a href="http://www.w3.org/"><img height="48" width="72" alt="W3C" src="http://www.w3.org/Icons/w3c_home">
</a>
</p>
<div align="Center">
<h1 align="Center">XML Blueberry Requirements</h1>
<h2>W3C Working Draft 21 September 2001</h2>
</div>
<dl>
<dt>This version:</dt>
<dd><a class="loc" href="http://www.w3.org/TR/2001/WD-xml-blueberry-req-20010921">
http://www.w3.org/TR/2001/WD-xml-blueberry-req-20010921</a>
</dd>
<dt>Latest version:</dt>
<dd><a href="http://www.w3.org/TR/xml-blueberry-req"> http://www.w3.org/TR/xml-blueberry-req</a>
</dd>
<dt>Previous version:</dt>
<dd><a href="http://www.w3.org/TR/2001/WD-xml-blueberry-req-20010620">http://www.w3.org/TR/2001/WD-xml-blueberry-req-20010620</a></dd>
<dt>Editor: </dt>
<dd><span class="name">John Cowan, Reuters </span><i> (<span class="email"><a href="mailto:jcowan@reutershealth.com">
jcowan@reutershealth.com</a>
</span> )</i></dd>
</dl>
<p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice-20000612#Copyright">
Copyright</a>
©2001 <a href="http://www.w3.org/"><abbr title="World Wide Web
Consortium">
W3C</abbr></a>
<sup>®</sup> (<a href="http://www.lcs.mit.edu/"><abbr title="Massachusetts Institute of
Technology">
MIT</abbr></a>
, <a href="http://www.inria.fr/"><abbr lang="fr" title="Institut National
de
Recherche en Informatique et Automatique">
INRIA</abbr></a>
, <a href="http://www.keio.ac.jp/">Keio</a>
), All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice-20000612#Legal_Disclaimer">
liability</a>
, <a href="http://www.w3.org/Consortium/Legal/ipr-notice-20000612#W3C_Trademarks">
trademark</a>
, <a href="http://www.w3.org/Consortium/Legal/copyright-documents-19990405">
document use</a>
and <a href="http://www.w3.org/Consortium/Legal/copyright-software-19980720">
software licensing</a>
rules apply.</p>
<hr></div>
<!-- head -->
<div>
<h2 class="abstract">Abstract </h2>
<p>This document lists the design principles and requirements
for the Blueberry revision of the XML Recommendation, a limited revision
of XML 1.0 being developed by the World Wide Web Consortium's XML Core Working
Group solely to address character set issues.</p>
</div>
<div>
<h2 class="status">Status of this document</h2>
<p>This is a W3C Working Draft produced as a deliverable of the
XML Core WG according to its charter and the current <a href="/XML/">XML
Activity</a>
process. A list of current W3C working drafts and notes can be found at
<a href="http://www.w3.org/TR" class="loc"> http://www.w3.org/TR</a>
.</p>
<p>This document is a work in progress representing the current
consensus of the W3C XML Core Working Group. It is published for review
by W3C members and other interested parties. Publication as a Working Draft
does not imply endorsement by the W3C membership. Comments should be sent
to <a href="mailto:www-xml-blueberry-comments@w3.org" class="loc"> www-xml-blueberry-comments@w3.org</a>
, which is an automatically and publicly <a href="http://lists.w3.org/Archives/Public/www-xml-blueberry-comments/">
archived email list</a>
.</p>
</div>
<h2 class="table-of-contents">Table of Contents</h2>
<dl class="table-of-contents">
<dt>1. <a href="#intro">Introduction</a>
</dt>
<dt>2. <a href="#design-principles">Design Principles</a>
</dt>
<dt>3. <a href="#requirements">Requirements</a>
</dt>
<dt>4. <a href="#references">References</a>
</dt>
</dl>
<div class="div1">
<h2><a name="intro">1. Introduction</a>
</h2>
<p>The W3C's XML 1.0 Recommendation <a href="#rec-xml">
[XML]</a>
was first issued in 1998 and, despite the issuance of many errata
culminating in the Second Edition of 2000, has remained (by intention)
unchanged with respect to what is well-formed XML and what is not.
This stability has been extremely useful for interoperability. However,
the Unicode Standard <a href="#rec-unicode"> [Unicode]</a>
on which XML 1.0 relies has not remained static, evolving from
version 2.0 to version 3.1. Characters present in Unicode 3.1 but
not in Unicode 2.0 may be used in XML character data. However,
they are not allowed in XML names such as element type names, attribute
names, enumerated attribute values, processing instruction targets, and so
on. In addition, some characters that should have been permitted
in XML names were not, due to oversights and inconsistencies in Unicode
2.0.</p>
<p>As a result, fully native-language XML markup
is not possible in <em>at least</em> the following languages:
Amharic, Burmese, Canadian aboriginal languages, Cherokee, Dhivehi, Hakka
Chinese (Bopomofo script), Khmer, Minnan Chinese (Bopomofo script),
Mongolian (traditional script), Oromo, Syriac, Tigre, and Yi, because the
characters required to write these languages did not exist in Unicode 2.0.
In addition, Chinese (particularly as used in Hong Kong) and Japanese can
use only a subset of their complete character repertoires in XML
names.
</p>
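<p>As a rough illustration of the restriction described above, the following
Python sketch reports which scripts in a candidate XML name were encoded only
after Unicode 2.0. The block ranges, the table name
<code>POST_2_0_BLOCKS</code>, and the helper
<code>blocks_blocking_xml10</code> are assumptions introduced for this
example; they are taken from the Unicode code charts rather than from this
draft, and the table covers only some of the scripts named above.</p>
<pre>
```python
# Unicode blocks for some of the scripts named above. The ranges are
# assumptions taken from the Unicode code charts, not from this draft.
POST_2_0_BLOCKS = {
    "Syriac": range(0x0700, 0x0750),
    "Thaana": range(0x0780, 0x07C0),
    "Myanmar": range(0x1000, 0x10A0),
    "Ethiopic": range(0x1200, 0x1380),
    "Cherokee": range(0x13A0, 0x1400),
    "Canadian Aboriginal Syllabics": range(0x1400, 0x1680),
    "Khmer": range(0x1780, 0x1800),
    "Mongolian": range(0x1800, 0x18B0),
    "Yi Syllables": range(0xA000, 0xA490),
}


def blocks_blocking_xml10(name):
    """Return the sorted names of listed blocks whose characters occur in name."""
    found = set()
    for ch in name:
        for block, codepoints in POST_2_0_BLOCKS.items():
            if ord(ch) in codepoints:
                found.add(block)
    return sorted(found)


# Hypothetical element names; the Ethiopic letters are written as escapes.
amharic_name = "\u1235\u121d"
latin_name = "name"

print(blocks_blocking_xml10(amharic_name))
print(blocks_blocking_xml10(latin_name))
```
</pre>
<p>A name for which the helper returns a non-empty list uses characters that,
according to the text above, XML 1.0 does not allow in names. An empty list
does not prove the name is acceptable, since the table is only a partial
excerpt.</p>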
<p>The point has been made that many of these languages
can be written using other scripts, notably the Latin script, which makes
transliterated native markup possible. However, exactly the same argument
applies to many languages (for example, Greek) that were already fully encoded
in Unicode 2.0. Discriminating against languages simply because their
scripts were not encoded in Unicode 2.0 is inherently unjust. Moreover,
working with transliteration is far more painful for native readers and writers
than working with the native script.
</p>
<p>XML 1.0 also attempts to adapt to the line-end
conventions of various modern operating systems, but discriminates
against the conventions used on IBM and IBM-compatible mainframes.
As a result, XML documents on mainframes are not plain text files according
to the local conventions. XML 1.0 documents generated on mainframes
must either violate the local line-end conventions, or employ otherwise
unnecessary translation phases before parsing and after generation.
Allowing straightforward interoperability is particularly important
when data stores are shared between mainframe and non-mainframe systems (as
opposed to being copied from one to the other).</p>
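<p>The translation phase mentioned above can be sketched in Python as
follows. This is a minimal sketch under a stated assumption: that the
mainframe line end corresponds to the Unicode NEL character (U+0085), which
this draft does not itself name. The pattern names and the
<code>normalize</code> helper are hypothetical and exist only for this
example.</p>
<pre>
```python
import re

# XML 1.0 line-end handling: CR LF pairs and lone CRs become LF.
XML10_LINE_END = re.compile("\r\n|\r")

# Hypothetical Blueberry-style handling. Assumption: the mainframe
# line end is NEL (U+0085); the draft itself names no character.
NEL = "\u0085"
BLUEBERRY_LINE_END = re.compile("\r\n|\r" + NEL + "|\r|" + NEL)


def normalize(text, pattern):
    """Replace every line end matched by pattern with a single LF."""
    return pattern.sub("\n", text)


# Text as it might arrive from a mainframe under the NEL assumption.
mainframe_text = "first" + NEL + "second" + NEL

xml10_result = normalize(mainframe_text, XML10_LINE_END)
blueberry_result = normalize(mainframe_text, BLUEBERRY_LINE_END)

print(repr(xml10_result))
print(repr(blueberry_result))
```
</pre>
<p>Under the XML 1.0 pattern the NEL characters pass through untouched, so
the text is not split into lines; under the extended pattern each one becomes
a line feed without a separate translation step.</p>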
<p>A new XML version, rather than a set of errata to
XML 1.0, is being created because the changes affect the definition of well-formed
documents. XML 1.0 processors must continue to reject documents that
contain new characters in XML names or new line-end conventions. It is
presumed that the distinction between XML 1.0 and XML Blueberry will be indicated
by the XML declaration. </p>
</div>
<div class="div1">
<h2><a name="design-principles">2. Design Principles</a>
</h2>
<ol>
<li>
<p>The XML 1.0 goals listed in section 1.1 of the
XML Recommendation are reaffirmed.</p>
</li>
<li>
<p>XML Blueberry documents shall permit the full
and straightforward use of writing systems supported by Unicode
3.1.
</p>
</li>
<li>
<p>XML Blueberry documents shall permit the full
and straightforward use of operating environments that support
Unicode 3.1.</p>
</li>
<li>
<p>The changes required for XML 1.0 processors to
also process XML Blueberry shall be as few and as small as
possible.</p>
</li>
</ol>
</div>
<div class="div1">
<h2><a name="requirements">3. Requirements</a>
</h2>
<ol>
<li>
<p>XML Blueberry documents shall allow the use within
XML names of all Unicode 3.1 characters, insofar as appropriate
for XML. </p>
</li>
<li>
<p>XML Blueberry documents shall support the line-end
conventions associated with Unicode 3.1, insofar as appropriate
for XML.
</p>
</li>
<li>
<p>The working group shall consider the issue of
future updates to Unicode.</p>
</li>
<li>
<p>The working group shall consider the issue of
W3C normalization as expressed in the W3C Character Model <a href="#charmod">
[CharMod]</a>.
</p>
</li>
<li>
<p>In creating XML Blueberry, the working group shall
not consider any revisions to XML 1.0 except those needed to
accomplish these requirements.</p>
</li>
</ol>
</div>
<div class="div1">
<h2><a name="references">4. References</a>
</h2>
<dl>
<dt>CharMod</dt>
<dd><a name="charmod"></a>
W3C (World Wide Web Consortium). <i> Character Model for the World
Wide Web</i> (work in progress). [Cambridge, MA]. <code><a href="http://www.w3.org/TR/charmod">
http://www.w3.org/TR/charmod</a>
</code></dd>
<dt>XML</dt>
<dd><a name="rec-xml"></a>
W3C (World Wide Web Consortium). <i> Extensible Markup Language
(XML) Recommendation.</i> Version 1.0, 2nd edition. [Cambridge, MA].
<code><a href="http://www.w3.org/TR/REC-xml">
http://www.w3.org/TR/REC-xml</a>
</code></dd>
<dt>Unicode</dt>
<dd><a name="rec-unicode"></a>
The Unicode Consortium. <i> The Unicode Standard, Version 3.1.</i>
[Reading, MA: Addison-Wesley Developers Press, 2000]. <code><a href="http://www.unicode.org">
http://www.unicode.org</a>
</code></dd>
</dl>
</div>
</body>
</html>