<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US"><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"/><title>Internationalization Best Practices: Specifying Language in XHTML &amp; HTML Content</title><style type="text/css">

</style><link rel="stylesheet" href="local.css" type="text/css"/><link rel="stylesheet" type="text/css" href="http://www.w3.org/StyleSheets/TR/W3C-WG-NOTE.css"/></head><body><div style="text-align:center;"><p>[ <a href="#contents">contents</a> ]</p></div><div class="head"><p><a href="http://www.w3.org/"><img src="http://www.w3.org/Icons/w3c_home" alt="W3C" height="48" width="72"/></a></p>
<h1><a name="title" id="title"></a>Internationalization Best Practices: Specifying Language in XHTML &amp; HTML Content</h1>
<h2><a name="w3c-doctype" id="w3c-doctype"></a>W3C Working Group Note 12 April 2007</h2><dl><dt>This version:</dt><dd> 
		<a href="http://www.w3.org/TR/2007/NOTE-i18n-html-tech-lang-20070412/">http://www.w3.org/TR/2007/NOTE-i18n-html-tech-lang-20070412/</a></dd><dt>Latest version:</dt><dd> 
		<a href="http://www.w3.org/TR/i18n-html-tech-lang/">http://www.w3.org/TR/i18n-html-tech-lang/</a>
		</dd><dt>Previous version:</dt><dd><a href="http://www.w3.org/TR/2006/WD-i18n-html-tech-lang-20060721/">http://www.w3.org/TR/2006/WD-i18n-html-tech-lang-20060721/</a></dd><dt>Editor:</dt><dd>Richard Ishida, W3C</dd></dl><p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> © 2007 <a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a><sup>®</sup> (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>, <a href="http://www.ercim.org/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> rules apply.</p></div><hr/><div>
<h2><a name="abstract" id="abstract"></a>Abstract</h2><p>Specifying the language of content is useful for a wide range of applications, from linguistically-sensitive
		  searching to applying language-specific display properties. In some cases the potential applications for language
		  information are still waiting for implementations to catch up, whereas in others, such as detection of language by
		  voice browsers, it is a necessity today. On the other hand, adding markup for language information to content is something that can and should be done
		  today. Without it, it will not be possible to take advantage of any future developments.</p></div><div>
<h2><a name="status" id="status"></a>Status of this Document</h2><p><em>This section describes the status of this document at the time of its publication. Other documents may
		  supersede this document. A list of current W3C publications and the latest revision of this technical report can be
		  found in the 
		  <a href="http://www.w3.org/TR/">W3C technical reports index</a> at http://www.w3.org/TR/.</em></p><p>This is a W3C Working Group Note  produced by the 
		  <a href="http://www.w3.org/International/core/">Internationalization Core Working Group</a>, part of the 
		  <a href="http://www.w3.org/International/Activity">W3C Internationalization Activity</a>.</p><p>This document is one of a planned series of documents providing HTML authors with best practices for developing
		  internationalized HTML using XHTML 1.0 or HTML 4.01, supported by CSS1, CSS2 and some aspects of CSS3. It focuses
		  specifically on advice about specifying the language of content. It is produced by the Internationalization Core Working Group of the 
		  <a href="http://www.w3.org/International/">W3C Internationalization Activity</a>.</p><p>The document provides practical best practices related to specifying the language of content that HTML content
		  authors can use to ensure that their HTML is easily adaptable for an international audience. These practices should be
		  applied from the start of content development if unnecessary costs and resource issues are to be avoided
		  later on.</p><p>Please send comments related to this document to 
		  <a href="mailto:www-international@w3.org">www-international@w3.org</a> (<a href="http://lists.w3.org/Archives/Public/www-international/">public archive</a>).</p><p>Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.</p><p>This document was produced by a group operating under the <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5 February 2004 W3C Patent Policy</a>. W3C maintains a <a href="http://www.w3.org/2004/01/pp-impl/32113/status">public list of any patent disclosures</a> made in connection with the deliverables of the group;  that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#def-essential">Essential Claim(s)</a> must disclose the information in accordance with <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">section 6 of the W3C Patent Policy</a>.</p></div><div class="toc">
<h2><a name="contents" id="contents"></a>Table of Contents</h2><div class="toc"><div class="toc1">1 <a href="#ri20030912.142608197">Introduction</a><div class="toc2">1.1 <a href="#ri20031001.170046667">Who should use this document</a></div>
<div class="toc2">1.2 <a href="#ri20030912.142616699">How to use this document</a></div>
<div class="toc2">1.3 <a href="#ri20030912.143319987">Technologies addressed</a></div>
</div>
<div class="toc1">2 <a href="#ri20050208.091505539">Why read this document?</a></div>
<div class="toc1">3 <a href="#ri20040808.100519373">Important concepts</a><div class="toc2">3.1 <a href="#ri20040808.101452727">The  language of the intended audience</a></div>
<div class="toc2">3.2 <a href="#ri20040808.102523274">The text-processing language</a></div>
<div class="toc2">3.3 <a href="#ri20050208.093646470">Relationships between language, character encoding and directionality
</a></div>
</div>
<div class="toc1">4 <a href="#ri20050208.095812479">Mechanisms for declaring language in HTML</a><div class="toc2">4.1 <a href="#ri20060630.133615821">Possible approaches</a></div>
<div class="toc2">4.2 <a href="#ri20060630.133619987">Which approach should I use?</a></div>
</div>
<div class="toc1">5 <a href="#ri20030510.102829377">Using attributes to declare language</a><ul style="margin-top:0;margin-bottom:0;"><li class="toc-technique"><a href="#ri20030112.213801634">Using attributes in the <code>html</code> tag</a></li><li class="toc-technique"><a href="#ri20040728.121403792">Using attributes in the <code>html</code> tag for multilingual audiences</a></li><li class="toc-technique"><a href="#ri20040721.18455302">Dividing multilingual documents</a></li><li class="toc-technique"><a href="#ri20030112.213804197">Identifying changes in language within the document</a></li><li class="toc-technique"><a href="#ri20040429.092928424">Choosing between <code class="keyword">lang</code> and <code class="keyword">xml:lang</code></a></li><li class="toc-technique"><a href="#ri20040808.110827800">Choosing between <code class="keyword">Content-Language</code> and attributes</a></li><li class="toc-technique"><a href="#ri20040429.094630704">Using the <code class="keyword">body</code> tag</a></li><li class="toc-technique"><a href="#ri20050128.175100333">Handling attribute values and element content in different languages</a></li></ul></div>
<div class="toc1">6 <a href="#ri20040728.121358444">Declaring metadata about the language of the intended audience</a><ul style="margin-top:0;margin-bottom:0;"><li class="toc-technique"><a href="#ri20040429.094220724">Using HTTP or a <code class="keyword">meta</code> tag  to indicate audience</a></li><li class="toc-technique"><a href="#ri20040728.121940236">Providing a comma-separated list of languages</a></li></ul></div>
<div class="toc1">7 <a href="#ri20030218.131140352">Choosing language values</a><ul style="margin-top:0;margin-bottom:0;"><li class="toc-technique"><a href="#ri20030112.224623362">Using BCP 47</a></li><li class="toc-technique"><a href="#ri20030112.224717800">Deciding on language tag length</a></li><li class="toc-technique"><a href="#ri20040429.113217290">Using <code class="keyword">Hans</code> and <code class="keyword">Hant</code> codes</a></li></ul></div>
<div class="toc1">8 <a href="#ri20040310.074302350">Indicating the language of a link destination</a><ul style="margin-top:0;margin-bottom:0;"><li class="toc-technique"><a href="#ri20050128.152033553">Identifying the language of a target document</a></li><li class="toc-technique"><a href="#ri20030112.224458239">Using <code class="keyword">hreflang</code> with CSS</a></li><li class="toc-technique"><a href="#ri20040808.173208643">Using flags to indicate languages</a></li></ul></div>
</div>
<h3><a name="appendices" id="appendices"></a>Appendix</h3><div class="toc1">A <a href="#d2e1861">Acknowledgments</a></div>
</div><hr/><div class="body"><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20030912.142608197" id="ri20030912.142608197"></a>1 Introduction</h2><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20031001.170046667" id="ri20031001.170046667"></a>1.1 Who should use this document</h3><p>All HTML content authors working with XHTML 1.0, HTML 4.01, XHTML 1.1, and CSS.</p><p>The term '<span class="qterm">author</span>' is used in the sense described by the HTML 4.01 specification, ie. as a
			 person or program that writes or generates HTML documents.</p><p>This document provides guidance for developers of HTML that enables support for international deployment.
			 Enabling international deployment is the responsibility of all content authors, not just localization groups or
			 vendors, and is relevant from the very start of development. Ignoring the advice in this document, or relegating it to
			 a later phase in the development process, will only add unnecessary costs and resource issues at a later date.</p><p>It is assumed that readers of this document are proficient in developing HTML and XHTML pages - this
			 document limits itself to providing advice specifically related to internationalization.</p></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20030912.142616699" id="ri20030912.142616699"></a>1.2 How to use this document</h3><p>This document is one of several relating to best practices for the design of Web content using W3C technologies.</p><p>If you are new to this topic you may wish to read the document from end to end. You will probably want to use the document later for reference purposes, however, dipping in to a particular section to
			 find out how to perform a specific task with internationalization in mind.</p><p>Each best practice recommendation is summarized tersely. The text that follows gives advice on how to implement the best practice, and provides additional explanations and discussion where appropriate. In some cases, the applicability of the recommendation may vary, depending on your aims and context.  Where there are pros and cons for a given recommendation, we try to clearly indicate those.</p><p>Additional resources are pointed to at the end of each best practice. To check whether new resources have become available since the publication of this document, follow the links at the end of the resource sections to the techniques and topic indexes provided on the Internationalization section of the W3C site.</p><div class="div3">
<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d2e120" id="d2e120"></a>1.2.1 User agent specific notes</h4><p>User
			 agents, in the current version of this document, means a number of mainstream browsers. (The scope may grow as
			 resources and test results become available for other user agents.)</p><p>If there is something you should know about how a best practice is supported by a particular user agent, we try to make that clear.</p><p>Small icons immediately after the initial statement of the best practice will indicate if there are notes you should read. The notes themselves appear in the descriptive text.</p><p>The user agents tested for the current document, their versions, and the icons used are as follows:</p><ul><li><p>Internet Explorer 7 <img align="middle" src="images/ie7.gif" alt="Internet Explorer icon" height="16" width="21"/></p></li><li><p>Internet Explorer 6 <img align="middle" src="images/ie6.gif" alt="Internet Explorer icon" height="16" width="21"/></p></li><li><p>Firefox 2.0 <img align="middle" src="images/firefox.gif" alt="Firefox icon" height="16" width="16"/></p></li><li><p>Opera 9.0 <img align="middle" src="images/opera.gif" alt="Opera icon" height="14" width="16"/></p></li><li><p>Netscape Navigator 8.1 <img align="middle" src="images/netscape.gif" alt="Netscape icon" height="16" width="16"/></p></li><li><p>Safari 2.0 <img align="middle" src="images/safari.gif" alt="Safari icon" height="16" width="16"/></p></li></ul><p>Detailed information may also be provided from time to time about behavior of a user agent in another
			 version than the base or current versions.</p></div></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20030912.143319987" id="ri20030912.143319987"></a>1.3 Technologies addressed</h3><p>This document provides best practices for developing pages using HTML 4.01, XHTML 1.0 and XHTML 1.1 with CSS.</p><p>XHTML 1.0 can be served as XML (using MIME types <code class="keyword">application/xhtml+xml</code>, <code class="keyword">application/xml</code> or
			 <code class="keyword">text/xml</code>) or HTML (using the MIME type <code class="keyword">text/html</code>).  It is very common for XHTML 1.0 to be served as HTML,  hopefully following the 
			 <a href="http://www.w3.org/TR/xhtml1/#guidelines">compatibility guidelines in Appendix C </a>of the XHTML
			 1.0 specification. This allows authors to produce valid XML code, which has benefits for processing with scripts or XSLT, but is also well supported for display by most mainstream browsers,
			 unlike XHTML served as <code>application/xhtml+xml</code>, which is not well supported by some browsers at the moment.</p><p>In this document we want to reflect practical reality for content authors, so we cover XHTML served as
			 <code class="keyword">text/html</code>. All the examples (unless trying to make
			 a specific point about HTML 4.01) are written in XHTML 1.0.</p><p> For XHTML served as XML, this document limits its advice to XHTML 1.1 documents served as
			 <code class="keyword">application/xhtml+xml</code>.</p><p>Where a browser operates in both 
			 <a href="http://www.w3.org/International/articles/serving-xhtml/#quirks">standards- and quirks-mode</a>,
			 standards-mode is assumed (ie. you should use a DOCTYPE statement).</p></div></div><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20050208.091505539" id="ri20050208.091505539"></a>2 Why read this document?</h2><p>Applications already exist that can use information about the <span class="new-term">natural language</span> (ie. the human, non-programmatic language) of content to deliver to users the most
		  relevant information or styling, based on their language preferences. The more content is tagged and tagged correctly, the more
		  useful and pervasive such applications will become.</p><p>Language information is useful for things such as authoring tools, translation tools, accessibility, font
		  selection, page rendering, search, and scripting.</p><p>These applications can't work, however, if the information about the language of the text is not available.  Language information should therefore be specified for the page as a whole, and wherever language changes within the
		  page.</p><p>In the future there will be other applications for language information, driven by developments in technology. For example, implementations of the CSS3 <code class="keyword">:first-letter</code> pseudo-element will need language information to apply correct styling. However, we are
		  currently faced with a circular problem. People who don't see the application of language information do not provide
		  information about their content, and language-related applications are slow to be deployed until this information is widely
		  available. This cycle can be broken by content authors taking steps now to declare language information. This is usually
		  very easy to do, and carries no penalties.</p></div><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20040808.100519373" id="ri20040808.100519373"></a>3 Important concepts</h2><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20040808.101452727" id="ri20040808.101452727"></a>3.1 The  language of the intended audience</h3><p>Metadata that describes the language of the intended audience is about <strong>the document as a whole</strong>. Such metadata may be
			 used for searching, serving the right language version, classification, etc. Where there are language changes in a document, information about the language of the intended audience is not specific enough to support <a title="" href="#ri20040808.102523274">text-processing</a> tasks such as text-to-speech, styling, or automatic font assignment.</p><p>The language of the intended audience does not include every language used in a document. Many documents on the Web contain
			 embedded fragments of content in different languages, whereas the page is clearly aimed at speakers of one particular
			 language. For example, a German city-guide for Beijing may contain useful phrases in Chinese, but  it is aimed at a German-speaking audience, not a Chinese one.</p><p>On the other hand, it  is also possible to imagine a situation where a document contains the same or parallel content in more
			 than one language. For example, a Web page may welcome Canadian readers with French content in the left column, and the
			 same content in English in the right-hand column. Here the document is equally targeted at speakers of both languages,
			 so there are <em>two</em> audience languages. This situation is not as common on the Web as in printed material
			 since it is easy to link to separate pages on the Web for different audiences, but it does occur where there are multilingual communities. Another use case is a blog or a news page aimed at a multilingual community, where some articles on a page are in one language and some in another. </p><p>There are also pages where the navigational information, including the page title, is in one language but
			 the real content of the page is in another. While this is not necessarily good practice, it doesn't change the fact that
			 the language of the intended audience is usually that of the content, regardless of the
			 language at the top of the document source.</p><p>Metadata about the language of the intended audience is usually best declared outside the document in the HTTP Content-Language header,
			 although there may be situations where an internal declaration using the <code class="keyword">meta</code> element is
			 appropriate.</p></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20040808.102523274" id="ri20040808.102523274"></a>3.2 The text-processing language</h3><p>When specifying the <span class="new-term">text-processing language</span> you are declaring the language in which <strong>a
			 specific range of text</strong> is actually written, so that user agents or applications that manipulate the text, such
			 as voice browsers, spell checkers, or style processors can effectively handle the text in question. So we are, by
			 necessity, talking about associating a <em>single</em> language with a <em>specific</em> range of text.</p><p>This specificity distinguishes the declaration of the language for text-processing from the <a title="" href="#ri20040808.101452727">language of the intended audience</a>.</p><p>The language for text-processing is usually best declared using attributes on elements, including the <code class="keyword">html</code> element that contains all the content of the document. Enclosed elements
			 inherit the declared value, but you can, of course, override an initial declaration by specifying a different language
			 on embedded elements where the language changes, eg. a French word in an English paragraph (see
			 <a class="section-ref" href="#ri20030510.102829377">Section 5: Using attributes to declare language</a>).</p></div><div class="div2">
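<p>As a minimal sketch of the inheritance and override behavior described for the text-processing language (the page title and sentence are invented for illustration), the following XHTML 1.0 document declares English on the <code class="keyword">html</code> element and marks a single French phrase inside an English paragraph.</p><div class="exampleOuter"><div class="exampleHeader">Example (illustrative sketch): Overriding an inherited language declaration for a range of text.</div><div class="exampleInner"><pre>&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"&gt;
&lt;head&gt;
  &lt;meta http-equiv="Content-Type" content="text/html; charset=utf-8" /&gt;
  &lt;title&gt;Borrowed phrases&lt;/title&gt;
&lt;/head&gt;
&lt;body&gt;
  &lt;!-- This paragraph inherits "en" from the html element. --&gt;
  &lt;p&gt;She has a certain
    &lt;!-- The span overrides the inherited value for this range of text only. --&gt;
    &lt;span lang="fr" xml:lang="fr"&gt;je ne sais quoi&lt;/span&gt;
    that is hard to describe.&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;</pre></div></div><p>Here each range of text is associated with exactly one language: French for the content of the <code class="keyword">span</code>, and English for everything else.</p>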
<h3><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20050208.093646470" id="ri20050208.093646470"></a>3.3 Relationships between language, character encoding and directionality
</h3><p><em>Language declarations in HTML and XHTML do not, and should not,  provide information about character encoding or the direction of
			 text.</em></p><p>There are separate mechanisms for declaring character encoding and directionality in HTML and XHTML, and
			 these ideas should not be confused with mechanisms for declaring language.</p><p><span class="new-term">Character encoding</span> refers to the sequences of bytes that are used to represent characters in text. It is
			 important to declare which encoding is being used for your document,  but this is a separate issue from declaring language. (To better understand character encoding declarations see <a href="/International/tutorials/tutorial-char-enc/">Character sets &amp; encodings in XHTML, HTML and CSS</a>.)</p><p>Some people think that information about language can be inferred from the character encoding, but this is
			 not true. There would have to be a one-to-one mapping between encoding and language for this to work, and there isn't. A single
			 character encoding such as ISO 8859-1 (Latin1), could encode both French and English, as well as a great many other
			 languages. In addition, different character encodings can be used for a single language, eg, Arabic could be encoded
			 with 'Windows-1256' or 'ISO 8859-6' or 'UTF-8'.</p><p><span class="new-term">Text direction</span> is another thing that should not be confused with language. In some scripts, such as Arabic and Hebrew, displayed text is read predominantly from right to left, although within that flow,
			 numbers and text from other scripts are displayed from left to right. Markup is needed to set the overall right-to-left context, and in some circumstances markup is needed to correctly render bidirectional text, but this cannot be done using language markup. (To better understand text direction and markup see <a href="/International/tutorials/bidi-xhtml/">Creating (X)HTML Pages in Arabic &amp; Hebrew</a>.)</p><p>As with encodings and language, there is not always a one-to-one mapping between language and script, and therefore
			 directionality. For example, Azerbaijani can be written using both right-to-left and left-to-right scripts, and the language code <code class="keyword">az</code> can be relevant for either. In addition, text direction markup used with inline text applies a range of different values to the text, whereas language is a simple switch that is not up to the tasks required.</p><p>Additional 
			 <a href="/International/technique-index">best practices</a> linked from the W3C Internationalization site
			 describe how to declare character encoding and text direction.</p></div></div><div class="div1">
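<p>To illustrate the separation described in Section 3.3, the following sketch (with invented Arabic content) shows the three declarations side by side in one XHTML 1.0 document. Each uses its own mechanism, and none of them implies the others.</p><div class="exampleOuter"><div class="exampleHeader">Example (illustrative sketch): Language, character encoding and directionality declared separately.</div><div class="exampleInner"><pre>&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"&gt;
&lt;!-- Language: lang and xml:lang. Base direction: dir. --&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml" lang="ar" xml:lang="ar" dir="rtl"&gt;
&lt;head&gt;
  &lt;!-- Character encoding: declared here, independently of the language. --&gt;
  &lt;meta http-equiv="Content-Type" content="text/html; charset=utf-8" /&gt;
  &lt;title&gt;مثال&lt;/title&gt;
&lt;/head&gt;
&lt;body&gt;
  &lt;p&gt;مرحبا بالعالم&lt;/p&gt;
&lt;/body&gt;
&lt;/html&gt;</pre></div></div><p>Removing the <code class="keyword">dir</code> attribute from this sketch would leave the language declaration intact but would no longer set the right-to-left context. Likewise, changing the encoding would say nothing about the language of the text.</p>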
<h2><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20050208.095812479" id="ri20050208.095812479"></a>4 Mechanisms for declaring language in HTML</h2><p>The HTML and XHTML specifications define a number of places where you can and can't declare language. In
		  <a class="section-ref" href="#ri20060630.133615821">Section 4.1: Possible approaches</a> we will simply show examples of the alternatives available. If you are familiar with this, jump to <a class="section-ref" href="#ri20060630.133619987">Section 4.2: Which approach should I use?</a>, which will discuss which method you should use, and when. </p><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20060630.133615821" id="ri20060630.133615821"></a>4.1 Possible approaches</h3><div class="div3">
<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d2e337" id="d2e337"></a>4.1.1 Attributes</h4><p>The first method is to use the <code class="keyword"><a href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/dirlang.html#h-8.1">lang</a></code> and <code class="keyword"><a href="http://www.w3.org/TR/2004/REC-xml-20040204/#sec-lang-tag">xml:lang</a></code> attributes on an XHTML element. </p><p>To set
		  the language of a whole document, you can use attributes on the <code class="keyword">html</code> tag.  This value will be inherited by the whole document, unless overridden by a declaration on a contained element.</p><p>You can also use attributes on elements that contain text in a language that is different from the surrounding content.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d2e357" id="d2e357"></a>Example 1: Attribute-based language declarations in an XHTML 1.0 document served as text/html.</div><p><code>&lt;html lang="en" xml:lang="en" xmlns= "http://www.w3.org/1999/xhtml"&gt;</code></p></div></div><div class="div3">
<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d2e363" id="d2e363"></a>4.1.2 Content-Language meta element</h4><p>Alternatively, you may find documents that put language information in a <code class="keyword">meta</code>
		  element with <code class="keyword">http-equiv</code> set to <code class="keyword">Content-Language</code>.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d2e377" id="d2e377"></a>Example 2: A Content-Language declaration in a meta element.</div><p><code>&lt;meta http-equiv="Content-Language" content="en" /&gt;</code></p></div></div><div class="div3">
<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d2e383" id="d2e383"></a>4.1.3 Dublin Core meta element</h4><p>Since the  <code class="keyword">meta</code>
		  element puts few limits on what you can say, it would  also be possible, though not very common, to express language information using Dublin Core notation.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d2e391" id="d2e391"></a>Example 3: A Dublin Core notation declaration in a meta element.</div><p><code>&lt;meta name="dc.language" content="en" /&gt;</code></p></div></div><div class="div3">
<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d2e397" id="d2e397"></a>4.1.4 HTTP header</h4><p>Language information may also be found in the 
		  <a href="http://www.faqs.org/rfcs/rfc1945.html">HTTP header</a> that is sent with a document (see the last
		  line in the following example of an HTTP header).</p><div class="exampleOuter"><div class="exampleHeader"><a name="d2e405" id="d2e405"></a>Example 4: An HTTP header containing a language declaration.</div><div class="exampleInner"><pre>HTTP/1.1 200 OK
Date: Wed, 05 Nov 2003 10:46:04 GMT
Server: Apache/1.3.28 (Unix) PHP/4.2.3
Content-Location: CSS2-REC.en.html
Vary: negotiate,accept-language,accept-charset
TCN: choice
P3P: policyref=http://www.w3.org/2001/05/P3P/p3p.xml
Cache-Control: max-age=21600
Expires: Wed, 05 Nov 2003 16:46:04 GMT
Last-Modified: Tue, 12 May 1998 22:18:49 GMT
ETag: "3558cac9;36f99e2b"
Accept-Ranges: bytes
Content-Length: 10734
Connection: close
Content-Type: text/html; charset=iso-8859-1
<strong>Content-Language: en</strong></pre></div></div></div><div class="div3">
<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d2e412" id="d2e412"></a>4.1.5 Multilingual readers</h4><p>Note that the <code class="keyword">meta</code> element with <code class="keyword">Content-Language</code> and the HTTP header both allow you to supply a
		  <em>list</em> of values. The example below declares the languages of the intended audience of the document to be (in equal measure) German,
		  French and Italian.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d2e426" id="d2e426"></a>Example 5: A meta element with a value of  multiple languages.</div><p><code>&lt;meta http-equiv="Content-Language" content="de, fr, it"/&gt;</code></p></div></div><div class="div3">
<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d2e432" id="d2e432"></a>4.1.6 CSS</h4><p>It is not possible to declare the language of text in CSS declarations.</p></div><div class="div3">
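<p>Although CSS cannot declare language, a style sheet can select content according to a language that has already been declared in the markup. The following is a minimal sketch, assuming the phrase is marked up with language attributes as shown; the style rule and text are invented for illustration, and user agent support for the <code class="keyword">:lang</code> selector should be checked before relying on it.</p><div class="exampleOuter"><div class="exampleHeader">Example (illustrative sketch): CSS selecting, but not declaring, the language of text.</div><div class="exampleInner"><pre>&lt;style type="text/css"&gt;
  /* Matches content declared as French in the markup.
     The rule itself does not declare any language. */
  :lang(fr) { font-style: italic; }
&lt;/style&gt;

&lt;!-- The language declaration lives in the markup, not in the style sheet. --&gt;
&lt;p&gt;The motto was &lt;span lang="fr" xml:lang="fr"&gt;liberté, égalité, fraternité&lt;/span&gt;.&lt;/p&gt;</pre></div></div>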
<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d2e437" id="d2e437"></a>4.1.7 DOCTYPE declarations</h4><p>Sometimes people are confused by what looks like a language declaration on the DOCTYPE declaration. These declarations may appear at the top of an HTML or XHTML file, before the <code class="keyword">html</code> element. Example <a class="example-ref" href="#ri20070314.144711765">6</a> shows a DOCTYPE declaration containing the sequence <code>EN</code>, which stands for '<span class="qterm">English</span>'. This, however, indicates the language of the <em>schema</em> associated with this document - it has nothing to do with the language of the document itself.</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20070314.144711765" id="ri20070314.144711765"></a>Example 6: A DOCTYPE declaration does not declare the language of the document.</div><div class="exampleInner"><pre>&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"&gt;</pre></div></div></div></div><div class="div2">
<h3><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20060630.133619987" id="ri20060630.133619987"></a>4.2 Which approach should I use?</h3><p>In short, this document recommends that you always declare the language of content to support <a title="" href="#ri20040808.102523274">text-processing</a> needs. We recommend that you do so using attributes in the <code class="keyword">html</code> element (to set the default language for the whole document) and on any element containing content in a different language.</p><p>Attribute-based language declarations are important for most of the applications of language information on the Web today, from spell-checking in the editor to styling and text-to-speech in the delivered page.</p><p>If you want to provide metadata about the <a title="" href="#ri20040808.101452727">language of the document's intended audience</a>, you should use one or more of the other mechanisms described in the previous section, i.e. not attributes.</p><p>There are still many unknowns surrounding the usefulness of HTTP headers or <code class="keyword">meta</code> elements for declaring the language of the intended audience, because
			 this information is currently little used. This may change in the future, particularly if libraries and similar users take an
			 increasing interest in language metadata.</p><div class="div3">
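<p>As a minimal sketch of the recommendation above, the following hypothetical XHTML 1.0 page (served as text/html) declares the default text-processing language with attributes on the <code class="keyword">html</code> element, marks a change of language with attributes on a <code class="keyword">span</code>, and separately supplies optional metadata about the intended audience in a <code class="keyword">meta</code> element. The page content is invented for illustration.</p><div class="exampleOuter"><div class="exampleHeader">Illustrative sketch: attributes for the text-processing language, with optional audience metadata.</div><div class="exampleInner"><pre>&lt;html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml"&gt;
&lt;head&gt;
  &lt;meta http-equiv="Content-Type" content="text/html; charset=utf-8"/&gt;
  &lt;!-- Optional metadata: the language of the intended audience --&gt;
  &lt;meta http-equiv="Content-Language" content="en"/&gt;
  &lt;title&gt;Opening hours&lt;/title&gt;
  &lt;/head&gt;
&lt;body&gt;
  &lt;!-- Inherits the default text-processing language (en) from the html element --&gt;
  &lt;p&gt;Our motto is &lt;span lang="fr" xml:lang="fr"&gt;<span xml:lang="fr" lang="fr">toujours ouvert</span>&lt;/span&gt;.&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;</pre></div></div>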
<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d2e484" id="d2e484"></a>4.2.1 Attributes vs. Content-Language: why they are different</h4><p>People are often particularly confused about the difference between declaring the language of the document as a whole using the Content-Language field in the HTTP header or <code class="keyword">meta</code> elements, and doing so using
			 an attribute on the <code class="keyword">html</code> element.</p><p>Much of the informal advice on the Web about how to declare the language of a
			 document tells you to just use the <code class="keyword">meta</code> tag to declare the language of the document. At least one
			 popular authoring tool automatically inserts language information that
			 you declare in the page properties dialog box into a <code class="keyword">meta</code> element only. We contend that if you are only going to do one thing you should declare language for text-processing purposes, and that attributes should be used for that, not the other methods.</p><p>The following bullet points discuss why attributes are most suited to declaring the text-processing language, and the other mechanisms to metadata declarations.</p><ol class="depth1"><li><p>HTTP and <code class="keyword">meta</code> declarations allow you to specify more than one language value.  This is inappropriate for labeling the text-processing language, which must be done one language at a time. On the other hand, multiple language values <em>are</em> appropriate when declaring  language for  documents that are aimed at speakers of more than one language. Attribute-based language declarations can only specify one language at a time, so they are less appropriate for specifying the language of the intended audience, but they are perfect for labeling the text-processing language for text.</p></li><li><p>The language information contained in HTTP headers is rarely used by mainstream
			 browsers for text-processing applications, and the implementations that do exist are inconsistent (see the <a href="http://www.w3.org/International/tests/#lang-decl">test results</a>). Unfortunately, we have yet to identify <em>any</em> user agent or application
			 that recognizes information declared in a <code class="keyword">meta</code> tag when it comes to text-processing. On the other hand, language information declared in the
			 <code class="keyword">html</code> tag <em>is</em> consistently recognized.</p></li><li><p>Since changes in the text-processing language within the document can only be done using attributes, it promotes consistency to use attributes on the <code class="keyword">html</code> element to express the default text-processing language of the document.</p></li><li><p>It is important to always know the default text-processing language for the document, but if the document is not read from a server, or the author is unable to apply the necessary server settings, the HTTP content header will not be available.</p></li></ol></div><div class="div3">
<h4><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d2e542" id="d2e542"></a>4.2.2 Alternative approaches to metadata</h4><p>When it comes to choosing between the HTTP header or the <code class="keyword">meta</code> element for expressing information about the intended audience, there is a lack of information on which to base any advice. In some ways the <code class="keyword">meta</code> element may appeal, because it is an in-document declaration. This avoids potential issues if authors cannot access server settings, particularly if dealing with an ISP, or if the document is to be read from a CD or other non-HTTP source. Until more practical use cases arise, however, this is just theory.</p><p>If, in the future, we see systematic use of in-document declarations of
			 audience language using the <code class="keyword">meta</code> element, it may also become acceptable to infer the language of the intended audience
			 from the language attribute on the <code class="keyword">html</code> element for documents with a monolingual audience. Discussion amongst
			 various stakeholders needs to take place, however, before this can be decided.</p><p>Nothing is known at this time about the value of using the Dublin Core approach mentioned in the previous section, which some people may consider reason enough not to use it. For in-document metadata declarations, the use of the Content-Language meta element is already much more widespread.</p></div></div></div><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20030510.102829377" id="ri20030510.102829377"></a>5 Using attributes to declare language</h2><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20030112.213801634" name="ri20030112.213801634" href="#ri20030112.213801634">Best Practice 1: Using attributes in the <code>html</code> tag</a></div><div class="rule">Always declare the default language for text in the page using attributes on  the <code class="keyword">html</code> tag, unless the document contains content aimed at speakers of more than one language.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p><strong>How to:</strong> Use the <code class="keyword">lang</code>
			 and/or <code class="keyword">xml:lang</code> attributes on the <code class="keyword">html</code> tag. Example <a class="example-ref" href="#ri20050131.134137927">7</a> declares
			 an HTML document to be in Canadian French:</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20050131.134137927" id="ri20050131.134137927"></a>Example 7: </div><p><code>&lt;html lang="fr-CA"&gt;</code></p></div><p>For details of which language attribute to use, see <a class="technique-ref" href="#ri20040429.092928424">Best Practice 5: Choosing between lang and xml:lang</a>.</p><p>For details of how to specify language values, see <a class="section-ref" href="#ri20030218.131140352">Section 7: Choosing language values</a>.</p><p><strong>Discussion:</strong> Declaring the default <a title="" href="#ri20040808.102523274">text-processing
			 language</a> is already important for applications such as 
			 <a href="http://www.w3.org/TR/WCAG10/#gl-abbreviated-and-foreign">accessibility</a> and searching, but
			 many 
			 <a href="/International/questions/qa-lang-why">other possible applications</a> for this information may
			 emerge over time.</p><p>Declaring the text-processing
			 language in the <code class="keyword">html</code> tag sets the default text-processing language for the whole document. It can be
			 overridden for portions of the document as required. For this reason you should try to always declare a language in the <code class="keyword">html</code> tag. It
			 is usually very easy to do when creating the content, but more difficult to retrofit later in order to take advantage of
			 language-related features.</p><p>Most documents contain content aimed at speakers of a single language, but where the intended audience is expected to read content in more than one language (e.g. a multilingual blog, or a page aimed at more than one language community) it may make more sense to declare the default text-processing language lower down in the document rather than in the <code class="keyword">html</code> tag. The best approach will depend on the structure used for
			 the document. See <a class="technique-ref" href="#ri20040728.121403792">Best Practice 2: Using attributes in the html tag for multilingual audiences</a>.</p><div class="note"><p><span class="note-head">Note: </span>See <a class="section-ref" href="#ri20040728.121358444">Section 6: Declaring metadata about the language of the intended audience</a> for information about language declarations using the HTTP header or the <code class="keyword">meta</code> tag and related to the <a title="" href="#ri20040808.101452727">language of the intended audience</a>.</p></div></div><div class="resources"><div class="small-head">Resources:</div><h4 class="resource-first"><a id="BId2e645" name="BId2e645">Background information</a></h4><ul><li>Why use the language attribute? A number of useful
		  reasons. W3C article.<br/> <a href="/International/questions/qa-lang-why">Why use the
			 language attribute?</a></li></ul><h4><a id="RLd2e645" name="RLd2e645">Reference links</a></h4><ul><li>W3C tutorial.<br/> <a href="/International/tutorials/language-decl/">Declaring Language in XHTML and HTML</a></li></ul><h4><a id="Sd2e645" name="Sd2e645">Sources</a></h4><ul><li>Express natural language in a document. (in Web Content Accessibility Guidelines, Guideline 4).<br/> <a href="http://www.w3.org/TR/WCAG10/#gl-abbreviated-and-foreign">Clarify natural language
			 usage</a></li><li>Use lang attribute on html tag (in Web Content Accessibility Techniques for HTML, section 2.2).<br/> <a href="http://www.w3.org/TR/WCAG10-HTML-TECHS/#identify-primary-lang">Identifying the primary
			 language</a></li><li>lang attribute definition in HTML 4.01 (in HTML 4.01 spec, section 8.1)<br/> <a href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/dirlang.html#h-8.1">Specifying the language of content:
			 the lang attribute</a></li><li>xml:lang attribute definition in XML 1.0. (in XML 1.0 spec, section 2.12).<br/> <a href="http://www.w3.org/TR/REC-xml/#sec-lang-tag">Language Identification</a></li><li>xml:lang and lang attribute definitions in XHTML 1.0 (section C.7)<br/> <a href="http://www.w3.org/TR/xhtml1/#C_7">The lang and
			 xml:lang Attributes</a></li></ul><div class="techIndexPtr"><h4>More resources</h4> <a href="/International/technique-index?topic=htmlauth">Technique index</a> - <a href="/International/resource-index?topic=lang">Topic index</a></div></div><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20040728.121403792" name="ri20040728.121403792" href="#ri20040728.121403792">Best Practice 2: Using attributes in the <code>html</code> tag for multilingual audiences</a></div><div class="rule">Where a document contains content aimed at speakers of more than one language, decide whether you want to declare one language in the <code class="keyword">html</code> tag, or leave the languages undefined until later.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p><strong>How to:</strong> Example <a class="example-ref" href="#ri20060630.161258582">8</a> shows a very simple document containing content aimed at multiple linguistic audiences. In this case, the document is split in two right after the <code class="keyword">body</code> element, and the author has delayed the declaration of the text-processing language until then.</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20060630.161258582" id="ri20060630.161258582"></a>Example 8: </div><div class="exampleInner"><pre>&lt;html xmlns= "http://www.w3.org/1999/xhtml"&gt; 
&lt;head&gt; 
  &lt;meta http-equiv="Content-Type" content="text/html; charset=utf-8"/&gt; 
  &lt;title&gt;Welcome - <span xml:lang="fr" lang="fr">Bienvenue</span>&lt;/title&gt; 
  &lt;/head&gt; 
&lt;body&gt; 
  &lt;div lang="en" xml:lang="en"&gt;
     &lt;h1&gt;Welcome!&lt;/h1&gt; 
     &lt;p&gt;Lots of text in English...&lt;/p&gt;
     &lt;/div&gt;
  &lt;div lang="fr" xml:lang="fr"&gt;
     &lt;h1&gt;<span xml:lang="fr" lang="fr">Bienvenue !</span>&lt;/h1&gt; 
     &lt;p&gt;<span xml:lang="fr" lang="fr">Beaucoup de texte en français...</span>&lt;/p&gt;
     &lt;/div&gt;
  &lt;/body&gt; 
&lt;/html&gt; </pre></div></div><div class="note"><p><span class="note-head">Note: </span>There is a problem when dealing with multilingual <code class="keyword">title</code> elements. Only one language can be
				declared for this element in HTML 4.01, since the only content allowed is character data. There is currently no adequate solution for this problem. In XHTML 2.0 this problem should disappear.</p></div><p>For details of which language attribute to use, see <a class="technique-ref" href="#ri20040429.092928424">Best Practice 5: Choosing between lang and xml:lang</a>.</p><p>For details of how to specify language values, see <a class="section-ref" href="#ri20030218.131140352">Section 7: Choosing language values</a>.</p><p><strong>Discussion:</strong> See the definition of <a title="" href="#ri20040808.101452727">the intended language of the audience</a>. Documents containing content aimed at an audience in more than one language are rare. A document is not aimed at a multilingual audience merely because it contains small amounts of text in another language. We are talking here about the languages the intended audience speaks.</p><p>Although we would normally recommend declaring the default <a title="" href="#ri20040808.102523274">text-processing language</a> in the <code class="keyword">html</code> tag, since only one language
			 can be defined at a time when using attributes, there may appear to be little point in doing so if a document has separate content to support multilingual audiences. It may be more appropriate to begin labeling the language on lower level elements, where the actual text is in one language or another.</p><p>If, however, the page header information or navigation is in one particular language, or there is a bias of
			 some other kind towards one particular language in the early part of the document, you may still want to use a language attribute on the <code class="keyword">html</code>
			 tag, and then override it in the appropriate lower level elements.</p></div><div class="resources"><div class="small-head">Resources:</div><h4 class="resource-first"><a id="BId2e751" name="BId2e751">Background information</a></h4><ul><li>Should I declare the language of my XHTML document using a language attribute, the Content-Language HTTP header, or a Content-Language meta element? W3C article.<br/> <a href="/International/questions/qa-http-and-lang#answer">Using HTTP and meta for language
			 information</a></li><li>Why use the language attribute? A number of useful
		  reasons. W3C article.<br/> <a href="/International/questions/qa-lang-why">Why use the
			 language attribute?</a></li></ul><div class="techIndexPtr"><h4>More resources</h4> <a href="/International/technique-index?topic=htmlauth">Technique index</a> - <a href="/International/resource-index?topic=lang">Topic index</a></div></div><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20040721.18455302" name="ri20040721.18455302" href="#ri20040721.18455302">Best Practice 3: Dividing multilingual documents</a></div><div class="rule">Where a document contains content aimed at speakers of more than one language, try to divide the document linguistically at the highest possible level, and
			 declare the appropriate language for each of  those divisions.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p>Dividing content in multiple languages at the highest possible level can simplify the process of guiding users to the
				text via searching, links, etc. It also reduces the work of labeling the language of document fragments. </p><p>For details of how to use language attributes, see <a class="section-ref" href="#ri20030218.131140352">Section 7: Choosing language values</a>.</p></div><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20030112.213804197" name="ri20030112.213804197" href="#ri20030112.213804197">Best Practice 4: Identifying changes in language within the document</a></div><div class="rule">Use the <code class="keyword">lang</code> and/or <code class="keyword">xml:lang</code> attributes around text to indicate any changes in
		  language.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p><strong>How to:</strong> Where the language of the text is different from the language declared in
			 the <code class="keyword">html</code> tag, indicate this using the <code class="keyword"><a href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/dirlang.html#h-8.1">lang</a></code> or <code class="keyword"><a href="http://www.w3.org/TR/REC-xml/#sec-lang-tag">xml:lang</a></code> attributes. For example, in HTML you would
			 write:</p><div class="exampleOuter"><div class="exampleHeader"><a name="d2e805" id="d2e805"></a>Example 9: </div><p><code>&lt;p&gt;The French for &lt;em&gt;Cat&lt;/em&gt; is &lt;em lang="fr"&gt;<span xml:lang="fr" lang="fr">chat</span>&lt;/em&gt;.&lt;/p&gt;</code></p></div><p>The <code class="keyword">lang</code> attribute can be used on all HTML elements except <code class="keyword">applet</code>, <code class="keyword">base</code>,
			 <code class="keyword">basefont</code>, <code class="keyword">br</code>, <code class="keyword">frame</code>, <code class="keyword">frameset</code>, <code class="keyword">iframe</code>, <code class="keyword">param</code> and <code class="keyword">script</code>.
			 (Note, by the way, that this means that you could use language attributes on things like bitmaps and audio files that
			 are language specific. Such information may be particularly useful for script-based processing of documents.)</p><p>If there is no markup around the text in a different language, use a <code class="keyword">span</code> element to delimit the
			 boundaries. Here is an example in XHTML 1.0 served as text/html:</p><div class="exampleOuter"><div class="exampleHeader"><a name="d2e850" id="d2e850"></a>Example 10: </div><p><code>&lt;p&gt;The title in Chinese is &lt;span lang="zh-Hans" xml:lang="zh-Hans"&gt;<span xml:lang="zh-hans" lang="zh-hans">中国科学院文献情报中心</span>&lt;/span&gt;.&lt;/p&gt;</code></p></div><p>For details of which language attribute to use, see <a class="technique-ref" href="#ri20040429.092928424">Best Practice 5: Choosing between lang and xml:lang</a>.</p><p>For details of how to specify language values, see <a class="section-ref" href="#ri20030218.131140352">Section 7: Choosing language values</a>.</p></div><div class="resources"><div class="small-head">Resources:</div><h4 class="resource-first"><a id="BId2e865" name="BId2e865">Background information</a></h4><ul><li>Why use the language attribute? A number of useful
		  reasons. W3C article.<br/> <a href="/International/questions/qa-lang-why">Why use the
			 language attribute?</a></li></ul><h4><a id="RLd2e865" name="RLd2e865">Reference links</a></h4><ul><li>W3C tutorial.<br/> <a href="/International/tutorials/language-decl/">Declaring Language in XHTML and HTML</a></li></ul><h4><a id="Sd2e865" name="Sd2e865">Sources</a></h4><ul><li>lang attribute definition in HTML 4.01 (in HTML 4.01 spec, section 8.1)<br/> <a href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/dirlang.html#h-8.1">Specifying the language of content:
			 the lang attribute</a></li><li>xml:lang attribute definition in XML 1.0. (in XML 1.0 spec, section 2.12).<br/> <a href="http://www.w3.org/TR/REC-xml/#sec-lang-tag">Language Identification</a></li><li>xml:lang and lang attribute definitions in XHTML 1.0 (section C.7)<br/> <a href="http://www.w3.org/TR/xhtml1/#C_7">The lang and
			 xml:lang Attributes</a></li><li>Express natural language in a document. (in Web Content Accessibility Guidelines, Guideline 4).<br/> <a href="http://www.w3.org/TR/WCAG10/#gl-abbreviated-and-foreign">Clarify natural language
			 usage</a></li><li>Use lang attribute when language changes in a document (in Web Content Accessibility Techniques for HTML, section 2.1)<br/> <a href="http://www.w3.org/TR/WCAG10-HTML-TECHS/#changes-in-lang">2.1 Identifying changes in
			 language</a></li></ul><div class="techIndexPtr"><h4>More resources</h4> <a href="/International/technique-index?topic=htmlauth">Technique index</a> - <a href="/International/resource-index?topic=lang">Topic index</a></div></div><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20040429.092928424" name="ri20040429.092928424" href="#ri20040429.092928424">Best Practice 5: Choosing between <code class="keyword">lang</code> and <code class="keyword">xml:lang</code></a></div><div class="rule">For HTML use the <code class="keyword">lang</code> attribute only, for XHTML 1.0 served as text/html use the <code class="keyword">lang</code> and
		  <code class="keyword">xml:lang</code> attributes, and for XHTML served as XML use the <code class="keyword">xml:lang</code> attribute
		  only.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p><strong>How to:</strong> When serving HTML use the <a href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/dirlang.html#h-8.1">lang</a> attribute. For example, the following declares a document to be in Canadian
			 French:</p><div class="exampleOuter"><div class="exampleHeader"><a name="d2e933" id="d2e933"></a>Example 11: </div><p><code>&lt;html lang="fr-CA"&gt;</code></p></div><p>When serving XHTML as text/html, use both the <code class="keyword">lang</code> attribute <em>and</em> the
			 <code class="keyword"><a href="http://www.w3.org/TR/xhtml1/#C_7">xml:lang</a></code> attribute. Example <a class="example-ref" href="#ri20050131.134638379">12</a> shows how you would
			 mark up a document for XHTML 1.0 served as text/html.</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20050131.134638379" id="ri20050131.134638379"></a>Example 12: </div><p><code>&lt;html lang="fr-CA" xml:lang="fr-CA" xmlns="http://www.w3.org/1999/xhtml"&gt;</code></p></div><p>If you are serving XHTML pages as XML (i.e. using a MIME type such as application/xhtml+xml), for instance serving XHTML 1.1 pages, use just the
			 <code class="keyword">xml:lang</code> attribute (see Example <a class="example-ref" href="#ri20050131.134718797">13</a>).</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20050131.134718797" id="ri20050131.134718797"></a>Example 13: </div><p><code>&lt;html xml:lang="fr-CA" xmlns="http://www.w3.org/1999/xhtml"&gt;</code></p></div><p><strong>Discussion:</strong> The <code class="keyword">xml:lang</code> attribute is the
			 standard way to identify language information in XML, but browsers recognize only the <code class="keyword">lang</code> attribute when the page is served as <code class="keyword">text/html</code>. On the other hand, when processing the document as XML, the <code class="keyword">xml:lang</code> attribute will be the most useful. Since XHTML 1.0 may be used in both an HTML and an XML context, you should use both attributes.</p><p>The <code class="keyword">lang</code> attribute will cause XHTML 1.1 pages to fail to validate, since it was removed from that language definition.</p></div><div class="resources"><div class="small-head">Resources:</div><h4><a id="Sd2e987" name="Sd2e987">Sources</a></h4><ul><li>lang attribute definition in HTML 4.01 (in HTML 4.01 spec, section 8.1)<br/> <a href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/dirlang.html#h-8.1">Specifying the language of content:
			 the lang attribute</a></li><li>xml:lang attribute definition in XML 1.0. (in XML 1.0 spec, section 2.12).<br/> <a href="http://www.w3.org/TR/REC-xml/#sec-lang-tag">Language Identification</a></li><li>xml:lang and lang attribute definitions in XHTML 1.0 (section C.7)<br/> <a href="http://www.w3.org/TR/xhtml1/#C_7">The lang and
			 xml:lang Attributes</a></li></ul><div class="techIndexPtr"><h4>More resources</h4> <a href="/International/technique-index?topic=htmlauth">Technique index</a> - <a href="/International/resource-index?topic=lang">Topic index</a></div></div><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20040808.110827800" name="ri20040808.110827800" href="#ri20040808.110827800">Best Practice 6: Choosing between <code class="keyword">Content-Language</code> and attributes</a></div><div class="rule">Use language
		  attributes rather than HTTP or meta elements to declare the default language for text processing.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p><strong>How to:</strong> Use the <code class="keyword">lang</code>
			 and/or <code class="keyword">xml:lang</code> attributes on the <code class="keyword">html</code> tag. Example <a class="example-ref" href="#ri20060630.16203883">14</a> declares
			 an HTML document to be in Canadian French:</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20060630.16203883" id="ri20060630.16203883"></a>Example 14: </div><p><code>&lt;html lang="fr-CA"&gt;</code></p></div><p><strong>Discussion:</strong> The basic reason is that current user agents rarely use information in the HTTP header or <code class="keyword">meta</code> element for <a title="" href="#ri20040808.102523274">text-processing language</a> applications, and the implementations that do exist are inconsistent (see the <a href="http://www.w3.org/International/tests/#lang-decl">test results</a>).</p><p>This is explained fully in <a class="section-ref" href="#ri20050208.095812479">Section 4: Mechanisms for declaring language in HTML</a>.</p></div><div class="resources"><div class="small-head">Resources:</div><h4 class="resource-first"><a id="BId2e1050" name="BId2e1050">Background information</a></h4><ul><li>Should I declare the language of my XHTML document using a language attribute, the
		  Content-Language HTTP header, or a Content-Language meta element?<br/> <a href="/International/questions/qa-http-and-lang#answer">W3C I18N FAQ: Using HTTP and meta for language
			 information</a></li></ul><h4><a id="Sd2e1050" name="Sd2e1050">Sources</a></h4><ul><li>HTML spec on Content-Language. It only says that the HTML language attribute has a higher precedence (in HTML 4.01 spec, section 8.1)<br/> <a href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/dirlang.html#h-8.1">Specifying the language of content:
			 the lang attribute</a></li><li>Content-Language definition in HTTP 1.1 (section 14.12).<br/> <a href="http://www.ietf.org/rfc/rfc2616.txt">Content-Language</a></li><li>lang attribute definition in HTML 4.01 (in HTML 4.01 spec, section 8.1)<br/> <a href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/dirlang.html#h-8.1">Specifying the language of content:
			 the lang attribute</a></li><li>xml:lang attribute definition in XML 1.0. (in XML 1.0 spec, section 2.12).<br/> <a href="http://www.w3.org/TR/REC-xml/#sec-lang-tag">Language Identification</a></li><li>xml:lang and lang attribute definitions in XHTML 1.0 (section C.7)<br/> <a href="http://www.w3.org/TR/xhtml1/#C_7">The lang and
			 xml:lang Attributes</a></li></ul><div class="techIndexPtr"><h4>More resources</h4> <a href="/International/technique-index?topic=htmlauth">Technique index</a> - <a href="/International/resource-index?topic=lang">Topic index</a></div></div><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20040429.094630704" name="ri20040429.094630704" href="#ri20040429.094630704">Best Practice 7: Using the <code class="keyword">body</code> tag</a></div><div class="rule">Do not declare the default language of a document in the <code class="keyword">body</code> element; use the <code class="keyword">html</code> element instead.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p><strong>How to:</strong> Use the <code class="keyword">lang</code>
			 and/or <code class="keyword">xml:lang</code> attributes on the <code class="keyword">html</code> tag. Example <a class="example-ref" href="#ri20060630.162233138">15</a> declares
			 an HTML document to be in Canadian French:</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20060630.162233138" id="ri20060630.162233138"></a>Example 15: </div><p><code>&lt;html lang="fr-CA"&gt;</code></p></div><p><strong>Discussion:</strong> The <code class="keyword">html</code> element is the highest level element in the
			 document, and is therefore most appropriate for declaring the default <a title="" href="#ri20040808.102523274">text-processing language</a> of the document. All elements within the document will
			 inherit that value.</p><p>The <code class="keyword">body</code> tag is usually the wrong place to express this information because it only refers to a
			 portion of the text in the document. For example, the text in the <code class="keyword">title</code> element is natural language text that
			 should also inherit the language information. If language is declared in the <code class="keyword">body</code> element, however, this is
			 not the case.</p><p>The only time declaring the language in the <code class="keyword">body</code> element would make sense is when the contents of the head and body elements are in different
			 languages.</p></div><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20050128.175100333" name="ri20050128.175100333" href="#ri20050128.175100333">Best Practice 8: Handling attribute values and element content in different languages</a></div><div class="rule">If the text in attribute values and element content is in different languages, consider using a nested
		  approach.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p><strong>Problem:</strong> You may come across a situation where the language of the text in an
			 attribute and the element content are in different languages. For example, at the top right corner of each page at the W3C Internationalization site, there are links to translated versions of that page (see Figure <a class="figure-ref" href="#ri20050131.132916912">1</a>). The
			 name of the language is given in the language of the target page, but a <code class="keyword">title</code> attribute contains the name in the
			 language of the current page:</p><div class="figure"><a name="ri20050131.132916912" id="ri20050131.132916912"/><img align="middle" src="images/language-link.gif" alt="Screen snap showing a tooltip containing the word 'Swedish' popping up from the document text&#xA;&#x9;&#x9;&#x9;&#x9;  'svenska'." height="88" width="146"/><div class="caption">Figure 1: An example of a scenario where the content and attribute value of an element could be in different
				languages.</div></div><p>If you create the code as shown in Example <a class="example-ref" href="#ri20050131.132304396">16</a> below, the language
			 attributes would actually indicate that not only the content but also the <code class="keyword">title</code> attribute text is in Swedish.
			 This is obviously incorrect.</p><div class="exampleOuter" xml:lang="en-US" lang="en-US"><div class="exampleHeader"><a name="ri20050131.132304396" id="ri20050131.132304396"></a>Example 16: An <strong>inappropriate</strong> way to label language when the attribute value and element content differ.</div><p><code>&lt;p&gt;&lt;a xml:lang="sv" lang="sv" title="Swedish"
				href="index.sv.html"&gt;<span xml:lang="sv" lang="sv">svenska</span>&lt;/a&gt;&lt;/p&gt;</code></p></div><p><strong>How to:</strong> Move the attribute containing text in a different language to another element, as shown  in this example, where  the <code class="keyword">p</code> tag inherits the default <code>en</code> setting of the
			 <code class="keyword">html</code> tag.</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20050131.133504872" id="ri20050131.133504872"></a>Example 17: A better way to label language when the attribute value and element content differ.</div><p><code>&lt;p title="Swedish"&gt;&lt;a xml:lang="sv" lang="sv"
				href="index.sv.html"&gt;<span xml:lang="sv" lang="sv">svenska</span>&lt;/a&gt;&lt;/p&gt;</code></p></div><p>The markup in Example <a class="example-ref" href="#ri20050131.133504872">17</a> lends itself easily to this approach. In other
			 cases you may need to add a <code class="keyword">span</code> element, to have somewhere  to attach the <code class="keyword">title</code> attribute.</p><p>For details of which language attribute to use, see <a class="technique-ref" href="#ri20040429.092928424">Best Practice 5: Choosing between lang and xml:lang</a>.</p><p>For details of how to specify language values, see <a class="section-ref" href="#ri20030218.131140352">Section 7: Choosing language values</a>.</p></div><div class="resources"><div class="small-head">Resources:</div><h4><a id="Sd2e1223" name="Sd2e1223">Sources</a></h4><ul><li>lang attribute definition in HTML 4.01 (in HTML 4.01 spec, section 8.1)<br/> <a href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/dirlang.html#h-8.1">Specifying the language of content:
			 the lang attribute</a></li><li>xml:lang attribute definition in XML 1.0. (in XML 1.0 spec, section 2.12).<br/> <a href="http://www.w3.org/TR/REC-xml/#sec-lang-tag">Language Identification</a></li><li>xml:lang and lang attribute definitions in XHTML 1.0 (section C.7)<br/> <a href="http://www.w3.org/TR/xhtml1/#C_7">The lang and
			 xml:lang Attributes</a></li></ul><div class="techIndexPtr"><h4>More resources</h4> <a href="/International/technique-index?topic=htmlauth">Technique index</a> - <a href="/International/resource-index?topic=lang">Topic index</a></div></div></div><div class="div1">
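<p>As a minimal sketch of the <code class="keyword">span</code> variant mentioned in Best Practice 8, the markup below wraps the link in an added <code class="keyword">span</code> element that carries the English <code class="keyword">title</code> text, so that the language attributes on the <code class="keyword">a</code> element label only the link content as Swedish. The surrounding sentence is invented for illustration, and the sketch assumes that the <code class="keyword">html</code> tag declares <code>en</code> as the default language.</p><div class="exampleOuter"><p><code>&lt;p&gt;This page is also available in &lt;span title="Swedish"&gt;&lt;a xml:lang="sv" lang="sv" href="index.sv.html"&gt;<span xml:lang="sv" lang="sv">svenska</span>&lt;/a&gt;&lt;/span&gt;.&lt;/p&gt;</code></p></div>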
<h2><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20040728.121358444" id="ri20040728.121358444"></a>6 Declaring metadata about the language of the intended audience</h2><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20040429.094220724" name="ri20040429.094220724" href="#ri20040429.094220724">Best Practice 9: Using HTTP or a <code class="keyword">meta</code> tag  to indicate audience</a></div><div class="rule">Consider using a Content-Language declaration in the HTTP header or a <code class="keyword">meta</code> tag to
		  declare metadata about the language(s) of the intended audience of a document.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p><strong>How to:</strong> Content-Language information sent in the HTTP header is defined on the
			 server. The method for setting that up is server-specific and is not discussed here.</p><p>Alternatively, you can add a <code class="keyword">Content-Language</code> declaration in a <code class="keyword">meta</code> element to the head of your document, as shown in Example <a class="example-ref" href="#ri20050131.13420100">18</a>.</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20050131.13420100" id="ri20050131.13420100"></a>Example 18: </div><p><code>&lt;meta http-equiv="Content-Language" content="en"/&gt;</code></p></div><p><strong>Discussion:</strong> The Content-Language declaration, whether it is used in the HTTP
			 header or a <code class="keyword">Content-Language meta</code> tag, can be useful for expressing metadata about the
			 <a title="" href="#ri20040808.101452727">language(s) of the intended audience</a> of the document being served.</p><div class="note"><p><span class="note-head">Note: </span>This is different from expressing the default language of content for
			 <a title="" href="#ri20040808.102523274">text-processing</a>, which should be done using a language attribute on the
			 <code class="keyword">html</code> tag (see <a class="technique-ref" href="#ri20030112.213801634">Best Practice 1: Using attributes in the html tag</a>).</p></div><p>The extent to which applications use metadata information in the HTTP header or a <code class="keyword">meta</code> tag, or which of the two is preferred, is not clear at this point.</p><p>Using <code class="keyword">Content-Language</code> in the HTTP header entails potential issues related to the maintenance and use of
			 server-side information. Many authors may find it difficult to access server settings, particularly when dealing with
			 an ISP. Also, pages may not always be located on servers. So this approach is not a solution that is always
			 available.</p><p>Sometimes a server has been set up to automatically serve a language-specific version of a resource based on the user's browser settings (content negotiation). In this case, your server is likely to send language information in the Content-Language header.</p><p>For further discussion of this topic, see 
			 <a class="section-ref" href="#ri20040808.100519373">Section 3: Important concepts</a> and <a class="section-ref" href="#ri20050208.095812479">Section 4: Mechanisms for declaring language in HTML</a>.</p></div><div class="resources"><div class="small-head">Resources:</div><h4><a id="RLd2e1313" name="RLd2e1313">Reference links</a></h4><ul><li>Should I declare the language of my XHTML document using a language attribute, the Content-Language HTTP header, or a Content-Language meta element? W3C article.<br/> <a href="/International/questions/qa-http-and-lang#answer">Using HTTP and meta for language
			 information</a></li></ul><h4><a id="Sd2e1313" name="Sd2e1313">Sources</a></h4><ul><li>Content-Language definition in HTTP 1.1 (section 14.12).<br/> <a href="http://www.ietf.org/rfc/rfc2616.txt">Content-Language</a></li><li>HTML on Content-Language, only says that the HTML language attribute has a higher precedence (in HTML 4.01 spec, section 8.1).<br/> <a href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/dirlang.html#h-8.1">Specifying the language of content:
			 the lang attribute</a></li></ul><div class="techIndexPtr"><h4>More resources</h4> <a href="/International/technique-index?topic=htmlauth">Technique index</a> - <a href="/International/resource-index?topic=lang">Topic index</a></div></div><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20040728.121940236" name="ri20040728.121940236" href="#ri20040728.121940236">Best Practice 10: Providing a comma-separated list of languages</a></div><div class="rule">Where a document contains content aimed at speakers of more than one language, use Content-Language with a comma-separated list of language tags.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p><strong>How to:</strong> Content-Language information sent in the HTTP header is defined on the
				server. The HTTP specification provides for more than one language to be expressed as the value of the Content-Language
				header.</p><p>Example <a class="example-ref" href="#ri20050131.134303270">19</a> shows part of the HTTP header sent from the server and
				declares a document to be aimed at speakers of three languages: German, French and
				Italian:</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20050131.134303270" id="ri20050131.134303270"></a>Example 19: </div><p><code>Content-Language: de,fr,it</code></p></div><p>The in-document <code class="keyword">Content-Language meta</code> element provides a similar possibility (see Example
				<a class="example-ref" href="#ri20050131.134414502">20</a>):</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20050131.134414502" id="ri20050131.134414502"></a>Example 20: </div><p><code>&lt;meta http-equiv="Content-Language" content="de,fr,it"/&gt;</code></p></div><p><strong>Discussion:</strong> It is not common to find a single page on the Web containing content aimed at an audience that speaks more than one language. One reason is that it is easy to link to alternative pages instead. On the other hand, such pages do exist. One example would be a welcome page in both English and Canadian French, or English and Welsh.  Another type of example would be a page aimed at an audience that is largely multilingual, and containing news or blog posts in more than one language (for example in India, where English and Hindi are common languages, but people also use their own regional language to communicate).</p></div><div class="resources"><div class="small-head">Resources:</div><h4><a id="Sd2e1364" name="Sd2e1364">Sources</a></h4><ul><li>Content-Language definition in HTTP 1.1 (section 14.12).<br/> <a href="http://www.ietf.org/rfc/rfc2616.txt">Content-Language</a></li></ul><div class="techIndexPtr"><h4>More resources</h4> <a href="/International/technique-index?topic=htmlauth">Technique index</a> - <a href="/International/resource-index?topic=lang">Topic index</a></div></div></div><div class="div1">
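<p>The following sketch shows how the two kinds of declaration discussed in Best Practices 9 and 10 might sit side by side in a welcome page written in English and Welsh. The page content is invented for illustration. The <code class="keyword">html</code> tag declares a single default text-processing language, a language attribute marks the Welsh paragraph, and the <code class="keyword">Content-Language meta</code> element lists both audience languages.</p><div class="exampleOuter"><p><code>&lt;html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"&gt;<br/>
&lt;head&gt;<br/>
&lt;meta http-equiv="Content-Language" content="en,cy"/&gt;<br/>
&lt;title&gt;Welcome&lt;/title&gt;<br/>
&lt;/head&gt;<br/>
&lt;body&gt;<br/>
&lt;p&gt;Welcome!&lt;/p&gt;<br/>
&lt;p xml:lang="cy" lang="cy"&gt;<span xml:lang="cy" lang="cy">Croeso!</span>&lt;/p&gt;<br/>
&lt;/body&gt;<br/>
&lt;/html&gt;</code></p></div><p>Keeping the two declarations separate means that the list of audience languages never has to be squeezed into a language attribute, which takes only one value.</p>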
<h2><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20030218.131140352" id="ri20030218.131140352"></a>7 Choosing language values</h2><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20030112.224623362" name="ri20030112.224623362" href="#ri20030112.224623362">Best Practice 11: Using BCP 47</a></div><div class="rule">Follow the guidelines in the IETF's BCP 47 for language attribute
		  values.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p><strong>How to:</strong> Choose subtags from the <a href="http://www.iana.org/assignments/language-subtag-registry">IANA Language Subtag Registry</a>.  If combining subtags, do so according to the syntax described by <a href="http://www.rfc-editor.org/rfc/bcp/bcp47.txt">BCP 47</a>.</p><p>For a gentle introduction to the registry and the BCP 47 rules for composing language tags, see 
			 <a href="/International/articles/language-tags/">Language tags in HTML and XML</a>. </p><p>Note that <code class="keyword"><a href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/dirlang.html#h-8.1">lang</a></code> and <code class="keyword"><a href="http://www.w3.org/TR/xhtml1/#C_7">xml:lang</a></code> attributes only take a single language value (unlike HTTP
			 Content-language headers).</p><p><strong>Discussion:</strong>  <a href="http://www.rfc-editor.org/rfc/bcp/bcp47.txt">BCP 47</a> points to IETF documents that define language tags (BCP stands for Best Current Practice). At the time this best practices document was published, the BCP 47 link pointed to <a href="http://www.rfc-editor.org/rfc/rfc4646.txt">RFC 4646, Tags for the Identification of Languages</a> and <a href="http://www.rfc-editor.org/rfc/rfc4647.txt">RFC 4647 Matching of Language Tags</a>. The first of these documents describes the syntax of language tags.  (See  the note below for the history.) </p><p>Using BCP 47 as a common reference for defining language tags ensures that your tags will be recognized widely.</p><div class="note"><p><span class="note-head">Note: </span>BCP 47 is a non-changing name for the latest in a series of IETF specifications normally referred to as RFCs. Each new RFC has a number which is typically not related to the number of any RFC it replaces and obsoletes. The original IETF specification that described values for language tags was <a href="http://www.ietf.org/rfc/rfc1766.txt">RFC 1766</a>.  This was then obsoleted by <a href="http://www.ietf.org/rfc/rfc3066.txt">RFC 3066</a>, which was also the first document called BCP 47. That was replaced in September 2006 by  two specifications, <a href="http://www.rfc-editor.org/rfc/rfc4646.txt">RFC 4646</a> and <a href="http://www.rfc-editor.org/rfc/rfc4647.txt">RFC 4647</a>. The first describes language tag syntax, the second describes how to match tags.  The associated <a href="http://www.iana.org/assignments/language-subtag-registry">IANA Language Subtag Registry</a> had been in force for some months already when these specifications were assigned numbers by the IETF.</p><p>RFC 4646 merely expands and clarifies the possibilities for
			 specifying languages. If you have been using RFC 1766 or RFC 3066 you do not need to make any changes to your code in
			 order to start using RFC 4646. Successors to
			 RFC 4646 will also retain backwards compatibility with tags created using RFC
			 4646.</p></div><p>The HTML specification still recommends the use of RFC 1766 for identifying language. However, an erratum to the HTML
				specification is planned, so <em>despite what the HTML specification
				currently says, you should use RFC 4646 or its successor when that is published</em>.</p></div><div class="resources"><div class="small-head">Resources:</div><h4><a id="IGd2e1446" name="IGd2e1446">How to's</a></h4><ul><li>How to choose the right attribute values. W3C article.<br/> <a href="/International/articles/language-tags/">Language tags in HTML and XML</a></li><li>Lists the codes you can use for language values.<br/> <a href="http://www.iana.org/assignments/language-subtag-registry">Language Subtag Registry</a></li><li>Latest version of the specification for constructing language tags (replacement for RFC 3066)<br/> <a href="http://www.rfc-editor.org/rfc/bcp/bcp47.txt">BCP 47, Tags for the Identification of Languages &amp; Matching of Language Tags</a></li></ul><h4><a id="RLd2e1446" name="RLd2e1446">Reference links</a></h4><ul><li>Rationale for changes introduced in RFC 4646 by one of its authors. W3C article.<br/> <a href="http://www.w3.org/International/articles/bcp47/">Understanding the New Language Tags</a></li></ul><div class="techIndexPtr"><h4>More resources</h4> <a href="/International/technique-index?topic=htmlauth">Technique index</a> - <a href="/International/resource-index?topic=lang">Topic index</a></div></div><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20030112.224717800" name="ri20030112.224717800" href="#ri20030112.224717800">Best Practice 12: Deciding on language tag length</a></div><div class="rule">Use the shortest possible language tag values.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p><strong>How to:</strong> The golden rule when creating language tags is to keep the tag as short as possible. Avoid region, script or other subtags except where
				they add useful distinguishing information. For instance, use <code>ja</code> for Japanese and not <code>ja-JP</code>, unless there is a particular
				reason that you need to say that this is Japanese <em>as spoken specifically in Japan</em>.</p><p>Similarly, do not use script or variant subtags unless they are needed to correctly distinguish your content from something else. Although RFC 4646 introduces script subtags, as RFC 4646 co-author
					Addison Phillips writes, "For virtually any content that does not use a script tag today, it remains the best practice not to use one in the
					future".</p><p>In the past, people tended to wonder which ISO language code to choose, since there are often 2-letter and 3-letter alternatives for the same language (and sometimes two 3-letter alternatives). Although there were clear rules about this in RFC3066, this question is now moot because now you should only use language tags specified in the <abbr title="">IANA</abbr> <a href="http://www.iana.org/assignments/language-subtag-registry">Language Subtag Registry</a>, and only one subtag exists per language in that registry (the shortest one).</p></div><div class="resources"><div class="small-head">Resources:</div><h4><a id="IGd2e1498" name="IGd2e1498">How to's</a></h4><ul><li>How to choose the right attribute values. W3C article.<br/> <a href="/International/articles/language-tags/">Language tags in HTML and XML</a></li><li>Lists the codes you can use for language values.<br/> <a href="http://www.iana.org/assignments/language-subtag-registry">Language Subtag Registry</a></li><li>Latest version of the specification for constructing language tags (replacement for RFC 3066)<br/> <a href="http://www.rfc-editor.org/rfc/bcp/bcp47.txt">BCP 47, Tags for the Identification of Languages &amp; Matching of Language Tags</a></li></ul><h4><a id="RLd2e1498" name="RLd2e1498">Reference links</a></h4><ul><li>Rationale for changes introduced in RFC 4646 by one of its authors. W3C article.<br/> <a href="http://www.w3.org/International/articles/bcp47/">Understanding the New Language Tags</a></li></ul><div class="techIndexPtr"><h4>More resources</h4> <a href="/International/technique-index?topic=htmlauth">Technique index</a> - <a href="/International/resource-index?topic=lang">Topic index</a></div></div><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." 
alt="Go to the table of contents."/></a><a id="ri20040429.113217290" name="ri20040429.113217290" href="#ri20040429.113217290">Best Practice 13: Using <code class="keyword">Hans</code> and <code class="keyword">Hant</code> codes</a></div><div class="rule">Where possible, use the codes <code class="keyword">zh-Hans</code> and <code class="keyword">zh-Hant</code> to refer to Simplified and Traditional
		  Chinese, respectively.</div><div class="applicability"><span class="applic-title">UA applicability issues for:   </span><img src="images/ie6.gif" alt="ie6"/>   </div><div class="description"><p><strong>How to:</strong> Use <code class="keyword">zh-Hans</code> and <code class="keyword">zh-Hant</code> for Simplified and Traditional Chinese,
			 respectively, in language attribute values, and possibly also for Content-Language values. These codes are available from the <abbr title="">IANA</abbr> <a href="http://www.iana.org/assignments/language-subtag-registry">Language Subtag Registry</a>.</p><div class="exampleOuter"><div class="exampleHeader"><a name="d2e1555" id="d2e1555"></a>Example 21: Simplified Chinese:</div><p><code>&lt;p lang="zh-Hans" xml:lang="zh-Hans"&gt;<span xml:lang="zh-Hans" lang="zh-Hans">当世界需要沟通时,请用统一码!</span>&lt;/p&gt;</code></p></div><div class="exampleOuter"><div class="exampleHeader"><a name="d2e1564" id="d2e1564"></a>Example 22: Traditional Chinese:</div><p><code>&lt;p lang="zh-Hant" xml:lang="zh-Hant"&gt;<span xml:lang="zh-Hant" lang="zh-Hant">當世界需要溝通時,請用統一碼!</span>&lt;/p&gt;</code></p></div><p><strong>Discussion:</strong> Simplified
			 vs. Traditional Chinese is a distinction based on <em>script</em>. Until recently there was no provision for using script information in language tags, so <code>zh-CN</code> (Chinese spoken
			 in Mainland China) was commonly used to label Simplified Chinese writing, and <code>zh-TW</code> (Chinese spoken in Taiwan) was
			 commonly used for Traditional Chinese writing. Apart from the fact that these tags describe a region rather than a script, you could not guarantee that others
			 would recognize these conventions, or even follow them. For example, some people used <code>zh-HK</code> to represent Traditional
			 Chinese.</p><p>You should start using the new tags as soon as possible in order to introduce widespread interoperability quickly. There is already substantial use of these codes.</p><p>On the other hand, in some cases you may need to assess the impact of changing the tags. This is not really an issue for
			 self-describing usage, such as with <code class="keyword">:lang</code> for application of language-based styling. It may be more of an
			 issue where external applications are looking for tags related to Chinese but are unaware of the <code>zh-Hans</code> and <code>zh-Hant</code>
			 variants.</p><p><strong>UA specific notes:</strong> There is one particular area where this may be an issue for the display of text on a user agent. Some user agents use language information to automatically choose a font for
				<abbr title="">CJK</abbr> ideographic text. However, note that this assumes that the following conditions hold:</p><ol class="depth1"><li><p>you have
				appropriate fonts set in your preferences,</p></li><li><p>the document styling does not apply a font, and</p></li><li><p>the user
				agent supports this behavior (not all do).</p></li></ol><p>So this is a fairly limited scenario.</p><p>The following summarizes support for this feature in the  <a href="#ri20030912.142616699">user
				agents tested for this document</a> at the time of writing. See the 
				<a href="/International/tests/results/results-lang-and-cjk-font">test results page</a> for more
				details and latest results.</p><p>Safari doesn't support this automatic font assignment. Firefox, Mozilla, Netscape, Opera and IE7 handle zh-Hans and zh-Hant as you would expect. IE6, however, applies the default font, which is Japanese.</p><p>Note that Firefox, Mozilla and Netscape also allow you to choose different font settings for Traditional Chinese in Taiwan and Hong Kong. They use the Taiwan font setting for zh-Hant and zh-TW. They use the Hong Kong font setting for zh-HK.</p></div><div class="resources"><div class="small-head">Resources:</div><h4><a id="IGd2e1631" name="IGd2e1631">How to's</a></h4><ul><li>Lists the codes you can use for language values.<br/> <a href="http://www.iana.org/assignments/language-subtag-registry">Language Subtag Registry</a></li><li>Latest version of the specification for constructing language tags (replacement for RFC 3066)<br/> <a href="http://www.rfc-editor.org/rfc/bcp/bcp47.txt">BCP 47, Tags for the Identification of Languages &amp; Matching of Language Tags</a></li></ul><h4><a id="RLd2e1631" name="RLd2e1631">Reference links</a></h4><ul><li>Rationale for changes introduced in RFC 4646 by one of its authors. W3C article.<br/> <a href="http://www.w3.org/International/articles/bcp47/">Understanding the New Language Tags</a></li></ul><h4><a id="Td2e1631" name="Td2e1631">Test data</a></h4><ul><li>Test results.<br/> <a href="/International/tests/results/lang-and-cjk-font">Automatic font assignment for
		  CJK text</a></li></ul><div class="techIndexPtr"><h4>More resources</h4> <a href="/International/technique-index?topic=htmlauth">Technique index</a> - <a href="/International/resource-index?topic=lang">Topic index</a></div></div></div><div class="div1">
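<p>To illustrate the advice on tag length in Best Practice 12, the sketch below contrasts a case where the language subtag alone is enough with one where a region subtag adds useful distinguishing information. The paragraph text is placeholder content invented for illustration. The tag <code>fr-CA</code> assumes that the author really does need to identify the text as Canadian French.</p><div class="exampleOuter"><p><code>&lt;p xml:lang="ja" lang="ja"&gt;<span xml:lang="ja" lang="ja">日本語</span>&lt;/p&gt;<br/>
&lt;p xml:lang="fr-CA" lang="fr-CA"&gt;<span xml:lang="fr-CA" lang="fr-CA">Bienvenue</span>&lt;/p&gt;</code></p></div>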
<h2><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="ri20040310.074302350" id="ri20040310.074302350"></a>8 Indicating the language of a link destination</h2><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20050128.152033553" name="ri20050128.152033553" href="#ri20050128.152033553">Best Practice 14: Identifying the language of a target document</a></div><div class="rule">When pointing to a resource in another language, consider the pros and cons of indicating the language of the
		  target document.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p><strong>Pros:</strong> May help the reader avoid wasted time linking to pages they can't read.</p><p><strong>Cons:</strong> May become out-of-date and so give incorrect information.</p><p><strong>Discussion:</strong> If you add some text or graphic to a link indicating that the target
			 document is in another language, it may allow the reader to decide in advance whether or not to follow the link,
			 according to their language skill. If the user follows the link, only to find out that they cannot read
			 the target document, this wastes time and introduces fatigue, and they may eventually lose confidence when faced with links that actually do go to readable
			 pages.</p><p>There are, however, potential problems with this approach if a newly translated version of the target document becomes available. Assume, for example, that a French page has
			 used this approach some time ago to point to a second document which at that time was only in English. Later, the second document is
			 translated into French and language negotiation is put in place. Unless the first French page referred to earlier is updated,
			 it will now be incorrectly warning French readers that the second  document is in English, and possibly discouraging them from
			 following a link to what is actually a perfectly legible document.</p></div><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20030112.224458239" name="ri20030112.224458239" href="#ri20030112.224458239">Best Practice 15: Using <code class="keyword">hreflang</code> with CSS</a></div><div class="rule">If you want to indicate that the target document of an <code class="keyword">a</code> element is in another language, consider the
		  pros and cons of using <code class="keyword">hreflang</code> with CSS.</div><div class="applicability"><span class="applic-title">UA applicability issues for:   </span><img src="images/ie7.gif" alt="ie7"/>   <img src="images/ie6.gif" alt="ie6"/>   </div><div class="description"><p><strong>Pros:</strong> May help the reader avoid wasted time linking to pages they can't read;
			 saves the author time and effort if <code class="keyword">hreflang</code> is used consistently.</p><p><strong>Cons:</strong> May become out-of-date and so give incorrect information; not all user
			 agents support the necessary CSS; problematic when linking to language negotiated sites.</p><p><strong>How to:</strong> This approach relies on CSS selectors that detect the value of the
			 <code class="keyword">hreflang</code> attribute and use the CSS content property to display an indicator of the language.</p><p>For example, the following link points to a page in Swedish.</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20050223.15165421" id="ri20050223.15165421"></a>Example 23: </div><p>There is also a page describing 
				<a href="http://www.w3.org/International/articles/serving-xhtml/Overview.sv.html">why a DOCTYPE is
				  useful</a> [sv].</p></div><p>The markup of the content would read as follows:</p><div class="exampleOuter"><div class="exampleHeader"><a name="d2e1723" id="d2e1723"></a>Example 24: </div><p><code>&lt;p&gt;There is also a page describing &lt;a href="swedish-doc.html" hreflang="sv"&gt;why a
				DOCTYPE is useful&lt;/a&gt;.&lt;/p&gt;</code></p></div><p>The code to enable this in CSS may be something like:</p><div class="exampleOuter"><div class="exampleHeader"><a name="d2e1728" id="d2e1728"></a>Example 25: </div><p><code>a[hreflang]:after { content: " [" attr(hreflang) "] "; }</code></p></div><p>This says, "For each <code class="keyword">a</code> element with an <code class="keyword">hreflang</code> attribute, add the value of that attribute
			 in square brackets after the link". </p><p>You could just as easily append text or even a graphic after the link by supplying it as the value of the <code class="keyword">content</code> property, rather than using <code class="keyword">attr(hreflang)</code>. This might be better if you are not sure that readers will recognize the language codes.</p><p>In that case you could use the following CSS. You would need one such rule for each target language.</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20070314.16531000" id="ri20070314.16531000"></a>Example 26: </div><p><code>a[hreflang = 'sv']:after { content: " [Swedish] "; }</code></p></div><p>This says, "For each <code class="keyword">a</code> element with an <code class="keyword">hreflang</code> attribute with a value of <code>sv</code>, add the value of the  <code>content:</code>
			 property after the link". The markup would be the same. The displayed result would be:</p><div class="exampleOuter"><div class="exampleHeader"><a name="ri20070314.165312593" id="ri20070314.165312593"></a>Example 27: </div><p>There is also a page describing 
				<a href="http://www.w3.org/International/articles/serving-xhtml/Overview.sv.html">why a DOCTYPE is
				  useful</a> [Swedish].</p></div><p><strong>Discussion:</strong> In HTML, the hreflang attribute on an <code class="keyword">a</code> element indicates
			 the language of the document at the other end of the link. In practice, <code class="keyword">hreflang</code> information is typically not picked up by
			 mainstream browsers. Besides, it is much better to ensure that the target document uses a language attribute in
			 the <code class="keyword">html</code> tag, so that this information should  not be needed.</p><p>It is perhaps (slightly) more common to use this attribute  to generate a visible marker attached to link text that
			 indicates the language of the destination page for the reader. The idea is to allow the reader to decide in advance
			 whether or not to follow the link, according to their language skill. </p><p>There are some usability-related pros and cons to this approach that are discussed in
			 <a class="technique-ref" href="#ri20050128.152033553">Best Practice 14: Identifying the language of a target document</a>.</p><p>There are also potential technical problems with this approach in Internet Explorer (see below). IE's lack of support matters, given its market share. It does not break the page, however; IE users simply do not see the generated text. This means that, as long as the information is not critical for the user, you can still use this technique, and it will provide an enhanced user experience in the other browsers.</p><p>Note also that if a resource is available in multiple languages via server-side content negotiation, it is not possible to express the range of languages that are available, since the
				  <code class="keyword">hreflang</code> attribute accepts only a single language as its value.</p><p><strong>UA issues:</strong> The following summarizes support for this feature in the      <a href="#ri20030912.142616699">user
				agents tested for this document</a> at the time of writing. See the 
				<a href="/International/tests/results/hreflang-style">test results page</a> for more
				details and latest results.</p><p>Internet Explorer 6 and 7 do
				  not support the <code class="keyword">:before</code> and <code class="keyword">:after</code> pseudo-elements, or the <code class="keyword">content</code> property.</p><p>The approach works in all the other user agents tested.</p></div><div class="resources"><div class="small-head">Resources:</div><h4><a id="IGd2e1823" name="IGd2e1823">How to's</a></h4><ul><li>:before and :after in the CSS 2.1 spec (section 12.1).<br/> <a href="http://www.w3.org/TR/2004/CR-CSS21-20040225/generate.html#x5">The :before and :after
			 pseudo-elements</a></li></ul><h4><a id="Sd2e1823" name="Sd2e1823">Sources</a></h4><ul><li>hreflang in the HTML spec (section 12.2).<br/> <a href="http://www.w3.org/TR/html401/struct/links.html#adef-hreflang">The A
			 element</a></li></ul><h4><a id="Td2e1823" name="Td2e1823">Test data</a></h4><ul><li>Test results<br/> <a href="/International/tests/results/hreflang-style.html">Hreflang content
		  generation</a></li></ul><div class="techIndexPtr"><h4>More resources</h4> <a href="/International/technique-index?topic=htmlauth">Technique index</a> - <a href="/International/resource-index?topic=lang">Topic index</a></div></div><div class="short-name"><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a id="ri20040808.173208643" name="ri20040808.173208643" href="#ri20040808.173208643">Best Practice 16: Using flags to indicate languages</a></div><div class="rule">Do not use flag icons to indicate languages.</div><div class="applicability"><span class="applic-title">No UA applicability issues.</span></div><div class="description"><p><strong>How to:</strong> Use text. See  Example <a class="example-ref" href="#ri20050223.15165421">23</a> for one illustration.</p><p><strong>Discussion:</strong> Flags represent countries, not languages. Numerous countries
			 share a language with other countries, and numerous countries have more than one official language. A flag therefore cannot identify a language unambiguously.</p></div></div></div><div class="back"><div class="div1">
<h2><a href="#contents"><img src="images/topOfPage.gif" align="right" height="26" width="26" title="Go to the table of contents." alt="Go to the table of contents."/></a><a name="d2e1861" id="d2e1861"></a>A Acknowledgments</h2><p>Members of the former GEO Working Group and the former GEO Task Force have contributed their time and
		  valuable comments to shaping these best practices. They include:</p><p>Phil Arko (Siemens), Steve Billings (Invited Expert), David Clarke (Invited Expert), Deborah Cawkwell (BBC World Service), Wendy Chisholm (W3C WAI), Andrew
		  Cunningham (State Library of Victoria), Martin Dürst (Invited Expert), Lloyd Honomichl (Invited Expert), Susan K. Miller (Boeing), Russ Rolfe
		  (Microsoft), Peter Sigrist (Invited Expert), Tex Texin (Yahoo), Najib Tounsi (<span xml:lang="fr" lang="fr">Ecole Mohammadia
		  d'Ingénieurs</span>).</p></div></div></body></html>