<!doctype html public '-//W3C//DTD HTML 4.0 Transitional//EN'>
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META HTTP-EQUIV="Content-Language" CONTENT="en">
<META NAME="Author" LANG="es" CONTENT="Manuel Tomas CARRASCO BENITEZ">
<TITLE>Primary Language in HTML</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF">
<H3 ALIGN='RIGHT'>
<A href='http://www.w3.org/'>
<img border='0' align='left' alt='W3C' src='http://www.w3.org/Icons/WWW/w3c_home'>
</A>
</H3>
<H1 ALIGN=CENTER>Primary Language in HTML</H1>
<H3 ALIGN=CENTER>World Wide Web Consortium Note 13-March-1998</H3>
<DL>
<DT>This version:</DT>
<DD>
<A HREF="http://www.w3.org/TR/1998/NOTE-html-lan-19980313.html">http://www.w3.org/TR/1998/NOTE-html-lan-19980313.html</A></DD>
<DT>Latest version:</DT>
<DD>
<A HREF="http://www.w3.org/TR/NOTE-html-lan">http://www.w3.org/TR/NOTE-html-lan</A></DD>
</DL>
<DL>
<DT>Editor:</DT>
<DD>
M.T. Carrasco Benitez
<A HREF="#CAR">[CAR]</A>
<A HREF="mailto:manuel.carrasco@emea.eudra.org">&lt;manuel.carrasco@emea.eudra.org&gt;</A></DD>
</DL>
<H2>Status of this document</H2>
This document is a NOTE made available by the W3 Consortium for discussion
only. This indicates no endorsement of its content, nor that the Consortium
has had any editorial control in its preparation, nor that the Consortium
has, is, or will be allocating any resources to the issues addressed by
the NOTE.
<P>
This document recommends how to mark the <EM>primary language(s)</EM>
in an HTML document.
It could be considered a clarification of the
<EM>HTML 4.0 Specification</EM>
<A HREF="#HTML40">[HTML40]</A>;
in particular,
it does not contradict the HTML 4.0 Specification.
The objective is to establish a
<EM>best practice</EM> in this field, where there is currently some confusion.
<H2>Abstract</H2>
In HTML elements,
the <CODE>lang</CODE> attribute specifies the natural language.
This document is mostly concerned with how to specify the primary language(s)
(there could be more than one)
and the <EM>base language</EM>
(there is only one)
in HTML documents.
<H2>Overview</H2>
Most existing documents are monolingual.
<EM>Linguistic versions</EM>
(e.g., translations) of the same text are often kept as separate documents.
This is indeed the most sensible approach.
<P>
Some documents are bilingual, and a few are trilingual or n-lingual.
Bilingual documents are usually short;
i.e., a few paragraphs.
N-lingual documents are usually very short; i.e., a few sentences.
<P>
The main reason for the existence of n-lingual documents is political;
i.e., in certain situations it is not politically correct to assume a base
language. A common practice is to provide one small document
that serves as a menu of languages,
as on the Europa server of the European Commission
<A HREF="#EUR">[EUR]</A>.
<P>
Another approach to choosing the language is to set the client (e.g.,
the browser) to the preferred language(s).
The client transmits the language(s) in the Accept-Language field of HTTP,
and the server responds with an appropriate document.
For example, the Spanish version will
be presented if the language preferences (in the browser) are Spanish and
French, in that order of preference, and the document is available
(on the server) in French, German and Spanish.
<H2>Where to specify the primary language(s)</H2>
There should be <STRONG>one</STRONG> recommended place
to specify the primary language(s).
It is recommended that the primary language(s) be specified in a META element.
For example:
<PRE>
&lt;HTML&gt;
&lt;HEAD&gt;
&lt;META HTTP-EQUIV="Content-Language" Content="fr"&gt;
&lt;TITLE lang=fr&gt;Mon doc&lt;/TITLE&gt;
&lt;/HEAD&gt;
&lt;BODY&gt;
&lt;SPAN lang=fr&gt;Je suis un Berlinois&lt;/SPAN&gt;.
&lt;/BODY&gt;
&lt;/HTML&gt;
</PRE>
<P>
The value of the <CODE>Content</CODE> attribute
of the META element is the same as the
value of the <CODE>Content-Language</CODE> header in HTTP;
i.e.,
a comma-separated list of language codes.
For example:
<P>
<CODE>
&lt;META HTTP-EQUIV="Content-Language" Content="fr,en"&gt;
</CODE>
<P>
These language codes are the same as those used in the <CODE>lang</CODE>
attribute of some HTML elements.
For example:
<P>
<CODE>
&lt;BODY LANG=fr&gt;
</CODE>
<P>
The language codes are defined in
<A HREF="#RFC1766">[RFC1766]</A>.
See also
<A HREF="http://www.w3.org/TR/REC-html40/struct/dirlang.html#h-8.1.1">
<EM>8.1.1 Language codes</EM>
</A>
of the HTML 4.0 Specification <A HREF="#HTML40">[HTML40]</A>
and
<A HREF="#RFC2068">[RFC2068]</A>.
<P>
The order of the languages in the Content-Language value is significant.
The first language in the list is the base language of the document;
i.e., any text not re-specified with the <CODE>lang</CODE> attribute is in
the base language.
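<P>
As an illustrative sketch of this rule (the title and sentences below are
invented for the example), a bilingual document whose base language is
French might be marked up as follows.
The unmarked paragraph is taken to be in French, the first language listed;
only the Spanish paragraph carries a <CODE>lang</CODE> attribute.
<PRE>
&lt;HTML&gt;
&lt;HEAD&gt;
&lt;META HTTP-EQUIV="Content-Language" Content="fr,es"&gt;
&lt;TITLE&gt;Bienvenue&lt;/TITLE&gt;
&lt;/HEAD&gt;
&lt;BODY&gt;
&lt;P&gt;Bonjour tout le monde.&lt;/P&gt;
&lt;P lang=es&gt;Hola a todos.&lt;/P&gt;
&lt;/BODY&gt;
&lt;/HTML&gt;
</PRE>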
<P>
A document with only minor fragments in other languages should not list
more than one language in the META element.
The rules for classifying a document as
monolingual, bilingual or n-lingual are the same as for printed books.
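<P>
For instance, an English document that merely quotes a short French phrase
would, under this guideline, still declare English alone and mark only the
fragment.
The following is a sketch; the title and sentence are invented for
illustration.
<PRE>
&lt;HTML&gt;
&lt;HEAD&gt;
&lt;META HTTP-EQUIV="Content-Language" Content="en"&gt;
&lt;TITLE&gt;A short story&lt;/TITLE&gt;
&lt;/HEAD&gt;
&lt;BODY&gt;
&lt;P&gt;She shrugged and said &lt;SPAN lang=fr&gt;c'est la vie&lt;/SPAN&gt;.&lt;/P&gt;
&lt;/BODY&gt;
&lt;/HTML&gt;
</PRE>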
<P>
The reasons for recommending the META element, as opposed to the
<CODE>lang</CODE> attribute on the HTML element, are:
<UL>
<LI>
N-lingual documents can be specified.
For example, a bilingual French/Spanish document can be specified.
</LI>
<LI>
The language(s) would be transmitted in the Content-Language field of the
HTTP header.
</LI>
</UL>
<P>
A <CODE>lang</CODE> attribute in the HTML element overrides the language
specified in the META element.
The inheritance rules are in
<A HREF="http://www.w3.org/TR/REC-html40/struct/dirlang.html#h-8.1.2">
<EM>
8.1.2 Language information and text direction
</EM>
</A>
of the HTML 4.0 Specification
<A HREF="#HTML40">[HTML40]</A>.
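<P>
The override can be illustrated with a deliberately conflicting sketch
(the content is invented for the example; real documents would normally
keep the two declarations consistent).
Here the META element names French, but the <CODE>lang</CODE> attribute on
the HTML element names Spanish, so under the rule above the unmarked text
is treated as Spanish.
<PRE>
&lt;HTML lang=es&gt;
&lt;HEAD&gt;
&lt;META HTTP-EQUIV="Content-Language" Content="fr"&gt;
&lt;TITLE&gt;Mi documento&lt;/TITLE&gt;
&lt;/HEAD&gt;
&lt;BODY&gt;
&lt;P&gt;Hola a todos.&lt;/P&gt;
&lt;/BODY&gt;
&lt;/HTML&gt;
</PRE>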
<H2>Acknowledgment</H2>
These recommendations reflect the rough consensus of the mailing list
www-international@w3.org
<A HREF="#LIST">[LIST]</A>
of the W3C and of a meeting held during the Unicode Conference in Mainz in March 1997.
<P>
In particular, thanks to
<BR>
<UL>
<LI> Bert Bos from the W3C,
<A HREF="http://www.w3.org/People/Bos/">
http://www.w3.org/People/Bos/
</A>
<LI> Martin Dürst from the W3C,
<A HREF="http://www.w3.org/People/W3Cpeople.html#Durst">
http://www.w3.org/People/W3Cpeople.html#Durst
</A>
</UL>
<H2>References</H2>
<DL>
<DT>
<A NAME="CAR"></A>[CAR]
<DD>
M.T. Carrasco Benitez.
<A HREF="http://dragoman.org/">http://dragoman.org/</A>
<DT>
<A NAME="EUR"></A>[EUR]
<DD>
Europa. <A HREF="http://europa.eu.int">http://europa.eu.int/</A>
<DT><A NAME="HTML40"></A>[HTML40]
<DD>
HTML 4.0 Specification.
<A HREF="http://www.w3.org/TR/REC-html40/">http://www.w3.org/TR/REC-html40/</A>
<BR>
In particular:
<BR>
<A HREF="http://www.w3.org/TR/REC-html40/intro/intro.html#h-2.3.1">2.3.1
Internationalization</A>
<BR>
<A HREF="http://www.w3.org/TR/REC-html40/charset.html#h-5.1">5.1 The
Document Character Set</A>
<BR>
<A HREF="http://www.w3.org/TR/REC-html40/struct/global.html#h-7.4.4">7.4.4
Meta data</A>
<BR>
<A HREF="http://www.w3.org/TR/REC-html40/struct/dirlang.html#h-8">8
Language information and text direction</A>
<DT>
<A NAME="LIST">[LIST]</A>
<DD>
<A HREF="http://www.w3.org/International/O-misc-mlists.html">
http://www.w3.org/International/O-misc-mlists.html
</A>
<DT>
<A NAME="RFC1766">[RFC1766]</A>
<DD>
<EM>Tags for the Identification of Languages</EM>, H. Alvestrand, March 1995.
<BR>
Available at
<A HREF="http://ds.internic.net/rfc/rfc1766.txt">http://ds.internic.net/rfc/rfc1766.txt</A>
<DT>
<A NAME="RFC2068">[RFC2068]</A>
<DD>
<EM>Hypertext Transfer Protocol -- HTTP/1.1</EM>,
R. Fielding, J. Gettys, J. Mogul, H. Frystyk Nielsen and T. Berners-Lee,
January 1997.
<BR>
Available at
<A HREF="http://ds.internic.net/rfc/rfc2068.txt">http://ds.internic.net/rfc/rfc2068.txt</A>
<BR>
In particular:
<BR>
3.10 Language Tags
<BR>
12 Content Negotiation
<BR>
12.3 Transparent Negotiation
<BR>
14.4 Accept-Language
<BR>
14.13 Content-Language
<BR>
14.43 Vary
<BR>
15.7 Privacy Issues Connected to Accept Headers
<BR>
</DL>
</BODY>
</HTML>