<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></meta><title> Authoring Techniques for XHTML & HTML Internationalization: Characters and Encodings 1.0</title><style type="text/css" >
</style><link rel="stylesheet" type="text/css" href="techniques.css" /><link rel="stylesheet" type="text/css" href="http://www.w3.org/StyleSheets/TR/W3C-WD" /></head><body><div style="text-align:center;"><p><a href="#contents">[ contents ]</a></p></div><div class="head" ><p><a href="http://www.w3.org/"><img src="http://www.w3.org/Icons/w3c_home" alt="W3C" height="48" width="72" /></a></p> <h1><a name="title" id="title" /> Authoring Techniques for XHTML & HTML Internationalization: Characters and Encodings 1.0</h1> <h2><a name="w3c-doctype" id="w3c-doctype" />W3C Working Draft 9 May 2004</h2><dl><dt>This version:</dt><dd>
<a href="http://www.w3.org/TR/2004/WD-i18n-html-tech-char-20040509/">http://www.w3.org/TR/2004/WD-i18n-html-tech-char-20040509/</a>
</dd><dt>Latest version:</dt><dd>
<a href="http://www.w3.org/TR/i18n-html-tech-char/">http://www.w3.org/TR/i18n-html-tech-char/</a>
</dd><dt>Previous version:</dt><dd><a href="http://www.w3.org/TR/2003/WD-i18n-html-tech-20031009/">http://www.w3.org/TR/2003/WD-i18n-html-tech-20031009/</a></dd><dt>Editor:</dt><dd>Richard Ishida, W3C</dd></dl><p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> © 2004 <a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a><sup>®</sup> (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>, <a href="http://www.ercim.org/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a>, <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-software">software licensing</a> rules apply.</p></div><hr /><div > <h2><a name="abstract" id="abstract" />Abstract</h2><p>It is important to consider character encoding matters when producing internationalized content, and
further to understand how to choose and declare encodings, how and when to use character escapes, etc.</p><p>This document is one of a series of documents providing HTML authors with techniques for developing
internationalized HTML using XHTML 1.0 or HTML 4.01, supported by CSS1, CSS2 and some aspects of CSS3. It focuses
specifically on advice about character sets, encodings, and other character-specific matters. It is produced by the
Guidelines, Education & Outreach Task Force (GEO) of the
<a href="http://www.w3.org/International/">W3C Internationalization Working Group (I18N WG)</a>. The GEO
Task Force encourages feedback about the content of this document as well as participation in the development of the
techniques by people who have experience creating Web content that conforms to internationalization needs.</p></div><div > <h2><a name="status" id="status" />Status of this Document</h2><p><em>This section describes the status of this document at the time of its publication. Other documents may
supersede this document. A list of current W3C publications and the latest revision of this technical report can be
found in the
<a href="http://www.w3.org/TR/">W3C technical reports index</a> at http://www.w3.org/TR/.</em></p><p>This is the First Public Working Draft of a document produced by the
<a href="http://www.w3.org/International/geo/">GEO (Guidelines, Education & Outreach) Task Force</a> of
the
<a href="http://www.w3.org/International/">W3C Internationalization Working Group (I18N WG)</a>. The
Internationalization Working Group is part of the
<a href="http://www.w3.org/International/Activity">W3C Internationalization Activity</a>. This is a draft
document that does not fully represent the consensus of the group at this time. The Working Group expects to advance
this Working Draft to Working Group Note.</p><p>The document provides practical techniques related to character sets, encodings, and other character-specific
matters that HTML content authors can use to ensure that their HTML is easily adaptable for an international audience.
These are techniques that are best addressed from the start of content development if unnecessary costs and resource
issues are to be avoided later on.</p><p>This document was last published as part of a larger document entitled
<a href="http://www.w3.org/TR/2003/WD-i18n-html-tech-20031009/">Authoring Techniques for XHTML & HTML
Internationalization 1.0</a>. The material in that document will now be published as a number of smaller independent
documents to allow for easier ongoing improvements and updates. The total number of such documents is not fixed, but
will grow as material and resources become available. The title of all related documents will begin with "Authoring
Techniques for XHTML & HTML Internationalization:..." and they can be found in the
<a href="http://www.w3.org/TR/">W3C technical reports index</a>.</p><p>The Task Force encourages feedback about the content of this document as well as participation in the
development of the guidelines by people who have experience creating Web content that conforms to internationalization
needs. Send comments about this document to
<a href="mailto:www-i18n-comments@w3.org">www-i18n-comments@w3.org</a>. The
<a href="http://lists.w3.org/Archives/Public/www-i18n-comments/">archives</a> for this list are publicly
available.</p><p>The Internationalization Working Group will not allow early implementation to constrain its ability to make
changes to this specification prior to final release. Publication as a Working Draft does not imply endorsement by the
W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It
is inappropriate to cite this document as other than work in progress.</p><p>This document has been produced under the
<a href="http://www.w3.org/TR/2002/NOTE-patent-practice-20020124">24 January 2002 CPP</a> as amended by the
<a href="http://www.w3.org/2004/02/05-pp-transition">W3C Patent Policy Transition Procedure</a>. The
Working Group maintains a
<a href="http://www.w3.org/International/2002/Disclosures">public list of patent disclosures</a> relevant
to this document; that page also includes instructions for disclosing a patent. An individual who has actual knowledge
of a patent which the individual believes contains Essential Claim(s) with respect to this specification should
disclose the information in accordance with
<a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">section 6 of the W3C Patent
Policy</a>. At the time of publication, the Working Group believed there were no patent disclosures relevant to this
specification.</p></div><div class="toc" > <h2><a name="contents" id="contents" />Table of Contents</h2><p class="toc">1 <a href="#ri20030912.142608197">Introduction</a><br /> 1.1 <a href="#ri20031001.170046667">Who should use this document</a><br /> 1.2 <a href="#ri20030912.142616699">How to use this document</a><br /> 1.3 <a href="#ri20030912.143319987">Standards addressed</a><br /> 1.4 <a href="#ri20030912.144634229">User agents addressed</a><br /> 1.5 <a href="#IDA4MFO">Editorial notes</a><br />2 <a href="#IDAPNFO">Choosing a page encoding </a><br />3 <a href="#ri20040310.054442951">Specifying a page encoding</a><br /> 3.1 <a href="#IDARVFO">Using the HTTP header</a><br /> 3.2 <a href="#IDAK1FO">Declaring the encoding in-document</a><br /> 3.3 <a href="#IDAIIGO">Declaring the encoding in more than one place</a><br /> 3.4 <a href="#IDA5JGO">Choosing names for your encodings</a><br />4 <a href="#IDAPNGO">Representing characters using escapes</a><br /></p> <h3><a name="appendices" id="appendices" />Appendices</h3><p class="toc">A <a href="#IDAPXGO">Acknowledgements</a><br />B <a href="#IDAXXGO">References</a><br /></p></div><hr /><div class="body" ><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="ri20030912.142608197" id="ri20030912.142608197" />1 Introduction</h2><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="ri20031001.170046667" id="ri20031001.170046667" />1.1 Who should use this document</h3><p >All HTML content authors working with XHTML 1.0, HTML 4.01, XHTML 1.1, CSS1, CSS2 and CSS3.</p><p >The term author is used in the sense described by the HTML 4.01 spec, ie. as a person or program that writes
or generates HTML documents.</p><p >This document provides guidance for the development of HTML so that it will support international usage.
This is the responsibility of all content authors, not just the localization group, and is relevant from the very start
of development. Ignoring the advice in this document, or relegating it to a later phase in the development, will only
add unnecessary costs and resource issues at a later date.</p><p >It is assumed that readers of this document are proficient in developing HTML and XHTML pages - this
document limits itself to providing advice related specifically to internationalization.</p></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="ri20030912.142616699" id="ri20030912.142616699" />1.2 How to use this document</h3><p >If you are new to this topic you may wish to read this document from end to end. It is, however, expected
that this document will normally be used for reference purposes - the reader dipping in to a particular section to find
out how to perform a specific task with internationalization in mind. </p><p >This document is one of several documents relating to the design of XHTML and HTML documents. An
<a href="http://www.w3.org/International/geo/html-tech/outline/html-authoring-outline.html">overview
document</a> is available that summarises all the recommendations of this and its companion documents together,
organized according to tasks that a developer of XHTML/HTML content may want to perform. When this material is used as
a reference, it is recommended that the overview document is used as a starting point.</p><p >Cross references and further resources are summarized at the end of each section.</p><p >Editorial notes have been left in this version of the document. These are marked
.</p><p >For information about the applicability of recommendations to user agents see below.</p></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="ri20030912.143319987" id="ri20030912.143319987" />1.3 Standards addressed</h3><p >This document provides techniques for developing pages using HTML 4.01, XHTML 1.0 and XHTML 1.1 with CSS1,
CSS2 and some parts of CSS3.</p><p >Note that XHTML source can be served as XML (using MIME types <code class="keyword">application/xhtml+xml</code>,
<code class="keyword">application/xml</code> or <code class="keyword">text/xml</code>) or HTML (using the MIME type <code class="keyword">text/html</code>).</p><p >It is very common for XHTML to be served as HTML, following the
<a href="http://www.w3.org/TR/xhtml1/#guidelines">compatibility guidelines in Appendix C </a>of the XHTML
1.0 specification. This allows authors with the right editing tools to produce valid XML code, which therefore lends
itself to processing with such things as scripting or XSLT, but is also well supported for display by most mainstream
browsers. (XHTML served as <code>application/xhtml+xml</code> is not well supported for browser display at the moment.)
In this document we wish to reflect practical reality for content authors, so we cover XHTML served as
<code class="keyword">text/html</code> in the techniques.</p><p >Indeed we encourage the use of XHTML, and all the examples (unless trying to make a specific point about
HTML 4.01) are written in XHTML.</p><p > For XHTML served as XML, this document limits its advice to documents served as
<code class="keyword">application/xhtml+xml</code>. Note that user agent support for XHTML served as XML is still patchy.</p></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="ri20030912.144634229" id="ri20030912.144634229" />1.4 User agents addressed</h3><p >In order to improve the value of this information to the user we try to ground techniques with information
about their applicability to particular user agents.</p><p >User agents, in this current version, means a number of mainstream browsers. (The scope may grow as
resources and test results become available for other user agents.)</p><p >In an attempt to make the task of tracking browser applicability manageable, we have chosen a 'base version'
for each of the user agents we are tracking for applicability. This base version represents a fairly recent,
standards-compliant version of the browser. Where a browser operates in both standards- and quirks-mode, standards-mode
is assumed (ie. you should use a DOCTYPE statement).</p><p >The base versions considered for this version of the document include:</p><ul ><li><p >Internet Explorer 6 (Windows)</p></li><li><p >Mozilla 1.4</p></li><li><p >Opera 7</p></li><li><p >Netscape Navigator 7</p></li><li><p >Safari</p></li><li><p >Internet Explorer 5 (Mac)</p></li></ul><p >If the technique is applicable to a base version of a user agent the name of that user agent will appear
immediately below the summary of the technique. If the technique is not applicable, the name will appear crossed out.
If the name does not appear at all, this signifies that further investigation is needed. If the technique is applicable
to a later version than the chosen base version, this will be indicated by adding the version number to the name.</p><p >Detailed information may also be provided from time to time about behavior of a user agent in an earlier
version than the base version, or about some particular aspect of the behavior of a base version or later user agent.
This is provided in a special boxed section within the body of the text.</p></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDA4MFO" id="IDA4MFO" />1.5 Editorial notes</h3><p ></p><p ></p><p ></p></div></div><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDAPNFO" id="IDAPNFO" />2 Choosing a page encoding </h2><div class="rule"><a id="ri20030112.213746362" name="ri20030112.213746362" href="#ri20030112.213746362">
Choose UTF-8 or another Unicode encoding for all content.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >When selecting a page encoding, consider both current and future localization requirements, and the benefits
of using the same encoding across all pages and all languages. These considerations make the use of Unicode an
attractive choice for the following reasons:</p>
<ul ><li><p >Unicode supports many languages, enabling the use of a single encoding across all pages and forms,
regardless of language.</p></li><li><p >Unicode allows many more languages to be mixed on a single page than almost any other choice. If the set
of languages to be represented on a single page cannot be represented directly by any single native encoding (such as
ISO-8859-1, Shift-JIS, etc.), then Unicode is almost certainly the best choice.</p></li><li><p >For dynamically-generated pages, a single encoding for all pages eliminates the need for server-side
logic to determine the character encoding for each page served.</p></li><li><p >For interactive applications using forms, a single encoding eliminates the need for server-side logic to
determine the character encoding of incoming form data.</p></li><li><p >Unicode enables a form in one language (e.g. English) to accept input in a different language (e.g.
Chinese).</p></li><li><p >Unicode (UTF-8) forms will be easier to migrate to XForms.</p></li></ul>
<p >UTF-8 and UTF-16 are both Unicode encodings. Since support for Unicode is currently limited to UTF-8 in many
user agents, UTF-8 is usually the appropriate Unicode encoding. However, as user agent support for UTF-16 expands,
UTF-16 will become an increasingly viable alternative.</p>
<p >Although there are other multi-script encodings (such as ISO-2022 and GB18030), Unicode generally provides
the best combination of user agent and script support.</p></div><div class="rule"><a id="ri20030112.21374337" name="ri20030112.21374337" href="#ri20030112.21374337">
If you don't use a Unicode encoding, select an encoding that best supports the languages / characters to be
included in the page text.
</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >There are some situations where selecting a Unicode encoding is not practical. If content is encoded in a
native encoding (legacy content or content originating from an external source) and the system lacks functionality for
converting content between encodings, Unicode may greatly complicate implementation. If such a site is only required to
serve single-script pages (containing languages that can be represented by a single native encoding), then the cost of
using a Unicode encoding may outweigh the benefits. In this case, a native encoding (such as ISO-8859-1, Shift-JIS,
etc.) may be a better choice.</p>
<p >Be sure to select an encoding that covers most
of the characters required for the content, and (if it is a form) all
of the characters that must be accepted as input.</p></div><div class="rule"><a id="ri20030314.181040685" name="ri20030314.181040685" href="#ri20030314.181040685">
Check that user agents (all agents that must render the page) adequately support the page encoding that you
have selected. If not, you might need to use a more widely supported encoding to achieve an adequate degree of user
agent support.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >Not all user agents support all page encodings, so it is important to understand which user agents must be
able to render the page, and be sure that they have adequate support for the page encoding you have selected.</p>
<p >In general, user agents are most likely to support the commonly-used native character encodings for the
major languages used on the web. Support for less commonly used encodings depends on the user agent. Older user agents,
or user agents that operate under severe memory limitations, may not support UTF-8.</p>
<p >It is important to note that support for a given encoding does not necessarily imply support for all writing
systems that encoding supports. For example, a user agent might support UTF-8, but not correctly display bidirectional
Arabic text encoded in UTF-8. To display a page correctly, a user agent must support both the page encoding and the
writing system.</p></div><div class="rule"><a id="ri20030112.213752611" name="ri20030112.213752611" href="#ri20030112.213752611">
Use character sets and encodings that will be accessible and common to your
users.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >.</p></div>
</div><div class="resources"><div class="small-head">Resources:</div><h4><a id="FIIDAPNFO" name="FIIDAPNFO">Further information</a></h4><ul><li>How do I specify the encoding?<br /><a href="#ri20040310.054442951" ><b>3 Specifying a page encoding</b></a><br /></li></ul><h4><a id="IGIDAPNFO" name="IGIDAPNFO">Implementation guidelines</a></h4><ul><li><a title="The Unicode Standard, Version 3" href="#unicode">[Unicode]</a> <a href="http://www.unicode.org/versions/Unicode4.0.0/">The Unicode Standard 4.0</a><br />The Unicode Standard
is very readable and contains a large amount of useful information besides code point
listings.</li></ul><h4><a id="RLIDAPNFO" name="RLIDAPNFO">Reference links</a></h4><ul><li> <a href="http://www.alanwood.net/unicode/index.html">Alan Wood’s Unicode Resources</a><br />Various resources
about Unicode and multilingual support in HTML, fonts, web browsers and other applications.</li></ul><h4><a id="SIDAPNFO" name="SIDAPNFO">Sources</a></h4><ul><li><a title="Character Model for the World Wide Web 1.0" href="#charmod">[CharMod]</a> <a href="http://www.w3.org/TR/charmod/#sec-Escaping">3.7 Character Escaping</a><br />Character Model for the
World Wide Web 1.0</li><li><a title="HTML 4.01 Specification" href="#html401">[HTML 4.01]</a> <a href="http://www.w3.org/TR/html401/charset.html#h-5.2.1">5.2.1 Choosing an encoding</a><br />HTML 4.01
spec</li></ul></div><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="ri20040310.054442951" id="ri20040310.054442951" />3 Specifying a page encoding</h2><p >For overviews of the mechanics of specifying a page encoding and additional examples, see the tutorial
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html">Character sets &
encodings</a>.</p><div class="rule"><a id="ri20040215.100236230" name="ri20040215.100236230" href="#ri20040215.100236230">
Always declare the encoding of your documents.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >Whether you declare the encoding by passing information alongside the document in the HTTP header, or inside
the document itself, you should always ensure that the encoding is declared. If you don't do this, the chances are high
that your document will be incorrectly rendered.</p>
<p >Note also that you should include a character encoding declaration even if your document uses a basic Latin
encoding such as ISO 8859-1. For example, Japanese user agents will default to a Japanese encoding that does not
include the accented letters, so their users may not see your text correctly unless you specify the
encoding.</p></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDARVFO" id="IDARVFO" />3.1 Using the HTTP header</h3><div class="rule"><a id="ri20030509.093901773" name="ri20030509.093901773" href="#ri20030509.093901773">
Where appropriate, declare the page's character encoding by setting the <code class="keyword">charset</code> parameter in the
HTTP <code class="keyword">Content-Type</code> header.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >According to the HTML specification, in a case of conflict the HTTP charset declaration has the highest
priority of all means of declaring the character set.</p>
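<p >As an illustration only, a response that declares UTF-8 in this way would carry a header line like the one below. The encoding value is an assumed example, and the rest of the response is omitted.</p>
<div class="example"><div class="small-head">Example:</div><p ><code>Content-Type: text/html; charset=utf-8</code></p></div>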
<p >Advantages to this approach:</p>
<ul ><li><p >User agents can easily find the character encoding information when it is sent in the HTTP header.</p></li><li><p >The HTTP header information has the highest priority in case of conflict, so this approach should be
used by intermediate servers that transcode the data (ie. convert to a different encoding). This is sometimes done for
small devices that only recognize a small number of encodings. Because the HTTP header information has precedence over
any in-document declaration, it doesn't matter that transcoders typically do not change the internal encoding
declarations, just the document encoding.</p></li></ul>
<p >There may be some disadvantages when dealing with static files or templates:</p>
<ul ><li><p >It may be difficult for content authors to change the encoding information on the server - especially
when dealing with an ISP. They will need knowledge of and access to the server settings.</p></li><li><p >Server settings may get out of synchronization with the document for one reason or another. This may
happen, for example, if you rely on the server default, and that default is changed. This is a very bad situation,
since the higher precedence of the HTTP information versus the in-document declaration may cause the document to become
unreadable.</p></li></ul>
<p >In addition, there are potential problems for both static and dynamic documents if they are to be saved by
the user or used from a location such as a CD or hard disk. In these cases encoding information from an HTTP header is
not available.</p>
<p >Similarly, if the character encoding is only declared in the HTTP header, this information may become
separated from files that are processed by such things as XSLT or scripts, or from files that are sent for
translation.</p>
<p >For these reasons you should always ensure that encoding information is <em>also</em> declared inside
the document.</p>
<p >Care should also be taken to ensure that the server-side settings are maintained if the file is moved or
the server technology is changed.</p></div><div class="rule"><a id="ri20040215.104619262" name="ri20040215.104619262" href="#ri20040215.104619262">
If declaring the character encoding in the HTTP header, ensure that the server-side settings will be
maintained, especially if the file is moved or the server technology is changed.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >Discrepancies may arise due to the document being moved, because a server administrator or other content
author changes settings that cascade to your document, or because the server or server version has changed, etc. Since
encoding declarations in the HTTP header have highest priority in determining the encoding of the document, it is a
very bad situation if the server-side settings are inadvertently changed.</p>
<p >If content authors need to set server-side settings, it is important to also ensure that they have the
required knowledge, access and privileges to do so. This is especially important when dealing with a third-party
ISP.</p>
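<p >How the header is set depends entirely on the server in use. As a hedged sketch, assuming an Apache HTTP Server with <code>mod_mime</code> enabled and permission to override settings in a per-directory <code>.htaccess</code> file (neither of which this document prescribes), a directive such as the following asks the server to add <code>charset=utf-8</code> to the <code class="keyword">Content-Type</code> header for files ending in <code>.html</code>. Check the result against your own server's documentation before relying on it.</p>
<div class="example"><div class="small-head">Example:</div><p ><code>AddCharset UTF-8 .html</code></p></div>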
</div><div class="rule"><a id="ri20040215.101337371" name="ri20040215.101337371" href="#ri20040215.101337371">
If declaring the character encoding in the HTTP header, always declare the encoding inside the document
too.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >This does not rule out also declaring it in the HTTP information provided by the server, but provides for
use of the document when the HTTP information is not available.</p>
<p >This is important for both static and dynamic documents if there is a chance that your documents will be
saved to or read from disk, CD, etc.</p>
<p >Also, if the character encoding is only declared in the HTTP header, this information may become separated
from files that are sent for translation or processed by such things as XSLT or scripts.</p>
<p >It is also valuable for developers, testers, or translation production managers who may want to perform a
visual check of a document.</p>
</div></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDAK1FO" id="IDAK1FO" />3.2 Declaring the encoding in-document</h3><div class="rule"><a id="ri20030112.213757177" name="ri20030112.213757177" href="#ri20030112.213757177">
For HTML documents and XHTML documents served as text/html, always use a <code class="keyword">meta</code> element to explicitly
declare the document's character encoding.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >The following is an example of a meta statement. For more information about usage, see the tutorial
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html">Character sets &
encodings</a>.</p>
<div class="example"><div class="small-head">Example:</div><p ><code>&lt;meta http-equiv="Content-Type" content="text/html; charset=utf-8"/&gt;</code></p></div>
<p >This approach is not appropriate for documents served as XML, but when serving a document as HTML, there
are no disadvantages and a couple of definite advantages, even if the encoding has been declared in the HTTP
header:</p>
<ul ><li><p >An in-document encoding allows the document to be read correctly when not on a server. This applies
not only to static documents read from disk or CD, but also dynamic documents that are saved by the reader.</p></li><li><p >An in-document declaration of this kind helps developers, testers, or translation production managers
who want to perform a visual check of a document. This applies particularly to static documents or templates used to
generate dynamic documents.</p></li></ul>
</div><div class="rule"><a id="ri20030112.223147682" name="ri20030112.223147682" href="#ri20030112.223147682">
Use <code class="keyword">meta</code> charset declarations as early as possible in the <code class="keyword">head</code>
element.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >This maximizes the likelihood that non-ASCII characters will be correctly recognized by the user
agent.</p>
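<p >A minimal sketch of such a <code class="keyword">head</code> element follows. The title text is a made-up example, chosen only because it contains a non-ASCII character that appears after the declaration rather than before it.</p>
<div class="example"><div class="small-head">Example:</div><pre><code>&lt;head&gt;
&lt;meta http-equiv="Content-Type" content="text/html; charset=utf-8" /&gt;
&lt;title&gt;Página de ejemplo&lt;/title&gt;
&lt;/head&gt;</code></pre></div>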
<p >The HTML spec says "The <code class="keyword">meta</code> declaration must only be used when the character encoding is
organized such that ASCII-valued bytes stand for ASCII characters (at least until the <code class="keyword">meta</code> element is parsed)."
</p></div><div class="rule"><a id="ri20031001.14582550" name="ri20031001.14582550" href="#ri20031001.14582550">
For XHTML served as <code class="keyword">application/xhtml+xml</code>, always use an XML declaration with an encoding
attribute.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >The following is an example of an XML declaration. For more information about usage, see the tutorial
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html">Character sets &
encodings</a>.</p>
<div class="example"><div class="small-head">Example:</div><p ><code><?xml version="1.0" encoding="UTF-8"?></code></p></div>
<p >If you are serving XHTML as <code class="keyword">application/xhtml+xml</code>, the encoding attribute is mandatory unless you
are using UTF-8 or UTF-16 or declaring the encoding in the HTTP header.</p>
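<p >As a minimal sketch of the case where the declaration cannot be omitted, assume a document saved as ISO-8859-1 (an encoding chosen here purely for illustration) and served without a charset in the HTTP header. The DOCTYPE is left out for brevity:</p>

```html
<?xml version="1.0" encoding="ISO-8859-1"?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  ...
</html>
```

<p >The XML declaration has to be the very first thing in the file; not even white space or a comment may come before it.</p>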
<p >Even if the document is encoded in UTF-8 or UTF-16, declaring the encoding in the document is useful
for the following reasons:</p>
<ul ><li><p >It is useful to have the encoding declared in the document when editing or processing the file as
XML.</p></li><li><p >An in-document declaration helps developers, testers, or translation production managers who want to
perform a visual check of a document. This is a good reason for including the encoding declaration even if the file is
in UTF-8 or UTF-16, despite the fact that it is not strictly necessary for these encodings.</p></li><li><p >An in-document encoding allows the document to be read correctly when not read from the server.</p></li><li><p >There is likely to be no other in-document alternative to express the character encoding. (The charset
<code class="keyword">meta</code> declaration is not recognized by XML processors.)</p></li></ul></div><div class="rule"><a id="ri20030509.100837166" name="ri20030509.100837166" href="#ri20030509.100837166">
For XHTML served as text/html, where practical use an XML declaration with an encoding
attribute.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >The following is an example of an XML declaration. For more information about usage, see the tutorial
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html">Character sets &
encodings</a>.</p>
<div class="example"><div class="small-head">Example:</div><p ><code><?xml version="1.0" encoding="UTF-8"?></code></p></div>
<p >A key reason for using XHTML is to take advantage of the benefits that XML brings for editing and
processing. When these documents are served as text/html to user agents, however, they are treated as HTML, not XML.</p>
<p >Advantages to including an XML declaration include the following:</p>
<ul ><li><p >If your document is not encoded in UTF-8 or UTF-16 and the encoding is not declared in an HTTP header,
it is necessary to have this in-document encoding declaration when editing or processing the file as XML, e.g. using
XSLT transformations or scripting, since the XML processors do not see HTTP information, and do not recognize the meta
charset statement described earlier.</p></li><li><p >In some cases, you may want to serve the same static document as either HTML or XML, depending on the
capabilities of the requesting user agent. This can be achieved by server-side logic. In these cases you will want to
have an XML declaration in the document when it is served as XML. (We are assuming that the appropriate declaration can
be added to the file via scripting for dynamically created documents.)</p></li></ul>
<p >On the other hand:</p>
<ul ><li><p >Because the XML declaration may cause undesirable effects in some user agents (see
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html#serving">Serving HTML &
XHTML</a>), you may prefer to omit it.</p></li><li><p >The XML declaration is not actually needed for HTML documents (which is what we are discussing here).
HTML processors do not use this information, and the encoding information should be included in the meta charset
statement described above.</p></li></ul>
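<p >As a sketch of a file that carries the XML declaration alongside the meta statement described earlier (UTF-8 is assumed, and the DOCTYPE is omitted for brevity):</p>

```html
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <!-- Read by HTML user agents when the file is served as text/html -->
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Same encoding declared for both kinds of processor</title>
  </head>
  <body><p>...</p></body>
</html>
```

<p >The XML declaration is what an XML processor reads when the file is edited or transformed as XML; the meta statement is what an HTML user agent reads.</p>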
<p >In summary we could say the following:</p>
<ul ><li><p >If the XML declaration will not cause your document any harm, it is best to include it. If you do use
an XML declaration, you should always declare the encoding in it.</p></li><li><p >If you are worried about the undesirable effects sometimes associated with use of the XML declaration
in HTML files, the best solution is to omit the declaration but encode the file in UTF-8 or UTF-16.</p></li><li><p >If you use UTF-8 or UTF-16, the file is still well-formed XML, and no XML declaration is
required.</p></li></ul></div><div class="rule"><a id="ri20040215.115249590" name="ri20040215.115249590" href="#ri20040215.115249590">
If you serve an XHTML file without an encoding declaration in the HTTP header or the XML declaration, you
must use either UTF-8 or UTF-16 as the document encoding.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >This is required by the XHTML specification.</p>
</div></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDAIIGO" id="IDAIIGO" />3.3 Declaring the encoding in more than one place</h3><div class="rule"><a id="ri20040215.121036394" name="ri20040215.121036394" href="#ri20040215.121036394">
If you declare the document's character encoding in more than one place, take steps to ensure that it is
always correct.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >If all declarations are correct, then there will be no conflicts.</p>
<p >If you serve encoding information in the HTTP header, it is particularly important to ensure that it is
always served correctly since this declaration has the highest priority. It is also the method most open to risks of
inadvertent change.</p>
<p >Also ensure that any editing or scripting tools you use consistently apply the correct encoding
information - especially if your tools add the declarations automatically.</p>
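<p >A sketch of the kind of mismatch to guard against, assuming a file that is actually saved as UTF-8 and in which a tool has added a meta statement naming a different encoding (the DOCTYPE is omitted for brevity):</p>

```html
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <!-- Conflict: this says ISO-8859-1, but the XML declaration says UTF-8 -->
    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
    <title>Conflicting encoding declarations</title>
  </head>
  <body><p>...</p></body>
</html>
```

<p >If no charset is sent in the HTTP header, an XML processor would read this file as UTF-8, while an HTML user agent that relies on the meta statement would read it as ISO-8859-1 and is likely to display non-ASCII characters incorrectly. Both declarations should name UTF-8.</p>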
</div>
</div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDA5JGO" id="IDA5JGO" />3.4 Choosing names for your encodings</h3><div class="rule"><a id="ri20030112.213749756" name="ri20030112.213749756" href="#ri20030112.213749756">
Use the preferred names from IANA's charset registry.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >The IANA charset registry shows a name plus a list of aliases for each registered charset value. One of
these is identified as the preferred MIME name. Wherever you declare the character encoding, use the preferred MIME
name in the charset value.</p>
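<p >As an illustration, the IANA registry lists several aliases for the Latin 1 charset and marks ISO-8859-1 as its preferred MIME name. A sketch of the recommended form, followed by two registered aliases that are best avoided in declarations:</p>

```html
<!-- Preferred MIME name, as marked in the IANA charset registry -->
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />

<!-- Registered aliases of the same charset: avoid these in declarations -->
<meta http-equiv="Content-Type" content="text/html; charset=latin1" />
<meta http-equiv="Content-Type" content="text/html; charset=csISOLatin1" />
```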
<p >This maximizes the likelihood of interoperability.</p></div><div class="rule"><a id="ri20040215.112209454" name="ri20040215.112209454" href="#ri20040215.112209454">
Do not invent your own encoding names using the <code class="keyword">x-</code> syntax.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >Inventing encoding names is not usually a good idea, since other user agents and tools are unlikely to recognize them, which limits interoperability.</p>
</div>
</div></div><div class="resources"><div class="small-head">Resources:</div><h4><a id="BIIDAAUFO" name="BIIDAAUFO">Background information</a></h4><ul><li><a title="Character Sets & Encodings in
 XHTML, HTML and CSS" href="#charEncTutorial">[CharEncTutorial]</a> <a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html#serving">Serving HTML &
XHTML</a><br />Describes possible problems when serving HTML files with an XML
declaration.</li><li><a title="Character Sets & Encodings in
 XHTML, HTML and CSS" href="#charEncTutorial">[CharEncTutorial]</a> <a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html#declaring">Declaring the document
encoding</a><br />Provides a description of how the charset information is passed with the HTTP header, and
more background.</li><li><a title="Character Sets & Encodings in
 XHTML, HTML and CSS" href="#charEncTutorial">[CharEncTutorial]</a> <a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html#declaring">Declaring the document
encoding</a><br />Shows how to set the character encoding in a meta
statement.</li><li><a title="Character Sets & Encodings in
 XHTML, HTML and CSS" href="#charEncTutorial">[CharEncTutorial]</a> <a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html#declaring">Declaring the document
encoding</a><br />Shows how to set the character encoding in an XML
declaration.</li></ul><h4><a id="RLIDAAUFO" name="RLIDAAUFO">Reference links</a></h4><ul><li><a title="Official Names for Character Sets" href="#iana">[IANA]</a> <a href="http://www.iana.org/assignments/character-sets">IANA charset
registry</a><br /></li><li> <a href="http://www.w3.org/International/O-HTTP-charset.html">The HTTP
charset parameter</a><br />Explains how to set the HTTP charset parameter of the Content-Type header on
various servers and with various dynamic technologies.</li></ul><h4><a id="SIDAAUFO" name="SIDAAUFO">Sources</a></h4><ul><li><a title="Hypertext
 Transfer Protocol -- HTTP/1.1" href="#rfc2616">[RFC2616]</a> <a href="http://www.ietf.org/rfc/rfc2616.txt">RFC2616: Hypertext Transfer Protocol --
HTTP/1.1</a><br /></li><li><a title="XHTML™ 1.0
 The Extensible HyperText Markup Language (Second Edition)" href="#xhtml1">[XHTML 1.0]</a> <a href="http://www.w3.org/TR/xhtml1/#C_9">3.1.1. Strictly Conforming Documents (towards the bottom of the
section)</a><br />General requirements for specification of encoding in XHTML
documents.</li><li> <a href="http://www.w3.org/International/questions/qa-setting-encoding-in-applications.html">FAQ: Setting encoding in web
authoring applications</a><br />How do I set character encoding in my web authoring
applications?</li><li><a title="XHTML™ 1.0
 The Extensible HyperText Markup Language (Second Edition)" href="#xhtml1">[XHTML 1.0]</a> <a href="http://www.w3.org/TR/xhtml1/#C_9">C.9 Character encoding</a><br />How to specify
character encoding for XHTML served as text/html using compatibility markup.</li><li><a title="HTML 4.01 Specification" href="#html401">[HTML 4.01]</a> <a href="http://www.w3.org/TR/html401/charset.html#h-5.2.2">5.2.2 Specifying the character
encoding</a><br />HTML 4.01 spec</li><li><a title="XHTML™ 1.0
 The Extensible HyperText Markup Language (Second Edition)" href="#xhtml1">[XHTML 1.0]</a> <a href="http://www.w3.org/TR/xhtml1/#strict">3.1.1. Strictly Conforming Documents</a><br />XHTML 1.0
requirements for use of the XML declaration.</li></ul></div><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDAPNGO" id="IDAPNGO" />4 Representing characters using escapes</h2><p >For an explanation of the different types of escape available in XHTML, HTML and CSS, see
<a href="/International/tutorials/tutorial-char-enc.html#entities">What are entities and NCRs?</a>.</p><div class="rule"><a id="ri20030112.223401895" name="ri20030112.223401895" href="#ri20030112.223401895">
Only use escapes for characters in exceptional circumstances. Create pages using an encoding that supports all
the characters you need.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >Using escapes can make source code difficult to read and maintain, and can also significantly increase
file size. Many English-speaking developers expect other languages to make only occasional use of
non-ASCII characters, but this is wrong.</p>
<p >There are three characters which should always appear in content as escapes, so that they do not interact
with the syntax of the markup:</p>
<ul ><li><p >&lt; (<)</p></li><li><p >&gt; (>)</p></li><li><p >&amp; (&)</p></li></ul>
<p >You may also want to represent the double-quote (") as &quot; - particularly in attribute text when you
need to use the same type of quotes as you used to surround the attribute value.</p>
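<p >A short sketch of these escapes in use (the content and the file name <code class="keyword">sign.png</code> are invented for illustration):</p>

```html
<!-- Element content: the three syntax characters are written as escapes -->
<p>The expression 1 &lt; 2 &amp;&amp; 3 &gt; 2 is true.</p>

<!-- Attribute value delimited by double quotes: inner quotes are escaped -->
<img src="sign.png" alt="A sign reading &quot;Exit&quot;" />
```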
<p >Escapes can be useful to represent characters not supported by the encoding you chose for the document, for
example to represent Chinese characters in an ISO Latin 1 document. You should ask yourself first, however, why you
have not changed the encoding of the document to something that covers all the characters you need (such as, of course,
UTF-8).</p>
<p >If your editing tool does not allow you to easily enter needed characters you may also resort to using
escapes. Note that this is not a long-term solution, nor one that works well if you have to enter a lot of such
characters - it takes longer and makes maintenance more difficult. Ideally you would choose an editing tool that
allowed you to enter these characters as characters.</p>
<p >A potentially very useful role for escapes is for characters that are invisible or ambiguous in
presentation.</p>
<p >One example would be Unicode character <span class="uname">200F: RIGHT-TO-LEFT MARK</span>. This character can be used
to clarify directionality in bidirectional text (e.g. when using the Arabic or Hebrew scripts). It has no graphic form,
however, so it is difficult to see where these characters are in the text, and if they are lost or forgotten they could
create unexpected results during later editing. Using &rlm; (or its NCR equivalent &#x200F;) instead makes it
very easy to spot these characters.</p>
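<p >A sketch of such a use, with an invented sentence: a left-to-right paragraph quotes a Hebrew word that ends in an exclamation mark, and the mark after it keeps the punctuation with the Hebrew text.</p>

```html
<!-- The escape is visible in the source; the character itself would not be -->
<p>The sign said "שלום!&rlm;" in large letters.</p>
```

<p >Without the mark, the bidirectional algorithm would treat the exclamation mark as part of the surrounding left-to-right text and place it to the right of the Hebrew word.</p>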
<p >An example of an ambiguous character is <span class="uname">00A0: NO-BREAK SPACE</span>. This type of space prevents
line breaking, but it looks just like any other space when used as a character. Using &nbsp; (or &#xA0;) makes
it quite clear where such spaces appear in the text.</p></div><div class="rule"><a id="ri20030112.223703527" name="ri20030112.223703527" href="#ri20030112.223703527">
Ensure that numbers in numeric character references always reference a Unicode
codepoint.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >It is a common error for people working on a page encoded in Windows code page 1252, for example, to try to
represent the euro sign using &#x80;. This is because the euro sign is encoded as the byte 0x80 in Windows code
page 1252. Using &#x80; would actually produce a control character, since the escape would be expanded as the character
at code point U+0080 in the Unicode repertoire. What was really needed was &#x20AC;.</p></div><div class="rule"><a id="ri20040312.072207969" name="ri20040312.072207969" href="#ri20040312.072207969">
When using escapes, use the hexadecimal form.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
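<p >As an illustration of this rule, the hexadecimal digits of a reference can be read directly against the U+ notation of the Unicode Standard, whereas the decimal form has to be converted first. Both paragraphs in this sketch reference the same two characters, á (U+00E1) and € (U+20AC):</p>

```html
<!-- Hexadecimal: the digits match U+00E1 and U+20AC -->
<p>&#xE1; and &#x20AC;</p>

<!-- Decimal: the same characters, but harder to relate to code charts -->
<p>&#225; and &#8364;</p>
```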
<p >Typically when the Unicode Standard refers to or lists characters it does so using a hexadecimal value. For
instance, the code point for the letter á may be referred to as U+00E1. Given the prevalence of this convention, it is
often useful, though not required, to use hexadecimal numeric values in escapes rather than decimal values. You do not
need to use leading zeros in escapes.</p></div><div class="rule"><a id="ri20040312.105840211" name="ri20040312.105840211" href="#ri20040312.105840211">
Use numeric character references rather than entities if your document is to be processed by unknown XML tools
or converted to XML.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >Any XML processor recognizes numeric character references such as &#xE1; as representing Unicode
characters. On the other hand, an entity such as &aacute; has to be declared in a DTD to be recognized
by an XML processor. Character entities are defined as part of the HTML and XHTML DTDs, but are often not declared in
other flavours of XML.</p>
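<p >A sketch of the difference, using an invented fragment:</p>

```html
<!-- Numeric character reference: understood by any XML processor -->
<p>M&#xE1;laga</p>

<!-- Entity: only understood where a DTD declares aacute -->
<p>M&aacute;laga</p>
```

<p >If this fragment were copied into an XML format that has no DTD declaring <code class="keyword">aacute</code>, an XML parser would reject the second paragraph as containing an undeclared entity, while the first would be read without problems.</p>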
<p >If there is a likelihood that you will want to repurpose or process this information (including sometimes
running it through localization tools), you should think carefully about which approach is most
appropriate.</p></div><div class="rule"><a id="ri20040312.070547700" name="ri20040312.070547700" href="#ri20040312.070547700">
If you use escapes to represent characters in a <code class="keyword">style</code> attribute, consider using CSS escapes rather
than NCRs or entities.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >This is likely to be a very rare occurrence: firstly, because it is usually better to put style information
in an external style sheet or a <code class="keyword">style</code> element; and, secondly, because there are not many situations where you are
likely to need non-ASCII characters in styling that appears in an attribute.</p>
<p >The issue arises because a <code class="keyword">style</code> attribute in XHTML or HTML can represent characters using NCRs,
entities or CSS escapes. On the other hand, the <code class="keyword">style</code> <em>element</em> in HTML can contain neither NCRs
nor entities, and the same applies to an external style sheet.</p>
<p >Because there is a tendency to want to move styles declared in attributes to the style element or an
external style sheet (for example, this might be done automatically using an application or script), it is safest to
use only CSS escapes.</p>
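<p >A sketch of a declaration after such a move, assuming a hypothetical class name <code class="keyword">city</code>: the CSS escape works unchanged inside a <code class="keyword">style</code> element, whereas an NCR in the same position would not be read as a character reference.</p>

```html
<style type="text/css">
  /* \FC is the CSS escape for ü; the space after it ends the escape */
  .city { font-family: L\FC beck; }
</style>
...
<span class="city">...</span>
```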
<p >For example, it is better to use</p>
<div class="example"><div class="small-head">Example:</div><p ><code><span style="font-family: L\FC beck">...</span></code></p></div>
<p >than</p>
<div class="example"><div class="small-head">Example:</div><p ><code><span style="font-family: L&#xFC;beck">...</span></code></p></div></div><div class="rule"><a id="ri20030112.223804174" name="ri20030112.223804174" href="#ri20030112.223804174">
If, for a specific application, it becomes necessary to refer to characters outside [ISO10646], characters
should be assigned to a private zone to avoid conflicts with present or future versions of the standard. Use of private
use characters is highly discouraged, however, for reasons of portability.</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >tbd</p></div><div class="rule"><a id="ri20030112.223911671" name="ri20030112.223911671" href="#ri20030112.223911671">
</a></div><div class="applicability"> IE(Win) Mozilla Opera NNav Safari IE(Mac) </div><div class="description">
<p >Discuss</p></div></div><div class="resources"><div class="small-head">Resources:</div><h4><a id="BIIDAPNGO" name="BIIDAPNGO">Background information</a></h4><ul><li><a title="Character Sets & Encodings in
 XHTML, HTML and CSS" href="#charEncTutorial">[CharEncTutorial]</a> <a href="/International/tutorials/tutorial-char-enc.html">What are entities and
NCRs?</a><br />Background reading about the use of escapes, including some examples not found
here.</li></ul><h4><a id="SIDAPNGO" name="SIDAPNGO">Sources</a></h4><ul><li><a title="Cascading Style Sheets, level 2 revision 1" href="#css21">[CSS2.1]</a> <a href="http://www.w3.org/TR/2004/CR-CSS21-20040225/syndata.html#q24">4.4.1 Referring to characters not represented in a
character encoding</a><br />Advises use of CSS escapes in style attributes.</li><li><a title="Character Model for the World Wide Web 1.0" href="#charmod">[CharMod]</a> <a href="http://www.w3.org/TR/charmod/#sec-Escaping">3.7 Character Escaping</a><br />Character Model for the
World Wide Web 1.0</li><li><a title="HTML 4.01 Specification" href="#html401">[HTML 4.01]</a> <a href="http://www.w3.org/TR/html401/charset.html#h-5.3">5.3 Character
references</a><br />HTML 4.01 spec</li></ul></div></div><div class="back" ><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDAPXGO" id="IDAPXGO" />A Acknowledgements</h2><p >The following GEO Task Force members have contributed their time and valuable comments to shaping these
guidelines:</p><p >Phil Arko, Steve Billings, Deborah Cawkwell, Wendy Chisholm, Andrew Cunningham, Martin Dürst, Lloyd Honomichl,
Russ Rolfe, Peter Sigrist, Tex Texin, Najib Tounsi</p></div><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" width="26" height="26" alt="Return to top of contents..." align="right" /></a><a name="IDAXXGO" id="IDAXXGO" />B References</h2><dl><dt class="label" ><a name="charEncTutorial" id="charEncTutorial" />CharEncTutorial</dt><dd >Richard Ishida,
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html"><cite>Character Sets & Encodings in
XHTML, HTML and CSS</cite></a>, Draft. (See
<a href="http://www.w3.org/International/tutorials/tutorial-char-enc.html">http://www.w3.org/International/tutorials/tutorial-char-enc.html</a>.)</dd><dt class="label" ><a name="charmod" id="charmod" />CharMod</dt><dd >M. J. Dürst, F. Yergeau, R. Ishida, M. Wolf, T. Texin,
<a href="http://www.w3.org/TR/charmod/"><cite>Character Model for the World Wide Web 1.0</cite></a>, Working Draft in
Last Call. (See <a href="http://www.w3.org/TR/charmod/">http://www.w3.org/TR/charmod/</a>.)</dd><dt class="label" ><a name="css21" id="css21" />CSS2.1</dt><dd >Håkon Wium Lie, Bert Bos, Tantek Çelik, Ian Hickson, Eds.,
<a href="http://www.w3.org/TR/2004/CR-CSS21-20040225/"><cite>Cascading Style Sheets, level 2 revision 1</cite></a>,
W3C Candidate Recommendation. (See <a href="http://www.w3.org/TR/2004/CR-CSS21-20040225/">http://www.w3.org/TR/2004/CR-CSS21-20040225</a>.) </dd><dt class="label" ><a name="html401" id="html401" />HTML 4.01</dt><dd >Dave Raggett, Arnaud Le Hors, Ian Jacobs, Eds.,
<a href="http://www.w3.org/TR/html401/"><cite>HTML 4.01 Specification</cite></a>, W3C Recommendation. (See
<a href="http://www.w3.org/TR/html401/">http://www.w3.org/TR/html401</a>.) </dd><dt class="label" ><a name="iana" id="iana" />IANA</dt><dd >Internet Assigned Numbers Authority,
<a href="http://www.iana.org/assignments/character-sets"><cite>Official Names for Character Sets</cite></a>. (See
<a href="http://www.iana.org/assignments/character-sets">http://www.iana.org/assignments/character-sets</a>.)
</dd><dt class="label" ><a name="rfc2616" id="rfc2616" />RFC2616</dt><dd >R. Fielding et al., <a href="http://www.ietf.org/rfc/rfc2616.txt"><cite>Hypertext
Transfer Protocol -- HTTP/1.1</cite></a>, June 1999. (See <a href="http://www.ietf.org/rfc/rfc2616.txt">http://www.ietf.org/rfc/rfc2616.txt</a>.)</dd><dt class="label" ><a name="unicode" id="unicode" />Unicode</dt><dd >The Unicode Consortium, <cite>The Unicode Standard, Version 3</cite>, ISBN
0-201-61633-5, as updated from time to time by the publication of new versions. (See
<a href="http://www.unicode.org/unicode/standard/versions/">http://www.unicode.org/unicode/standard/versions</a> for the
latest version and additional information on versions of the standard and of the Unicode Character Database).</dd><dt class="label" ><a name="xhtml1" id="xhtml1" />XHTML 1.0</dt><dd >W3C HTML Working Group, <a href="http://www.w3.org/TR/xhtml1/"><cite>XHTML™ 1.0
The Extensible HyperText Markup Language (Second Edition)</cite></a>, W3C Recommendation. (See
<a href="http://www.w3.org/TR/xhtml1/">http://www.w3.org/TR/xhtml1/</a>.) </dd></dl></div></div></body></html>