<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
       "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="content-type" content="text/html; charset=UTF-8" />
  <title>HTML/XForms/XHTML2 Architectural Vision</title>
  <style type="text/css">
        body {margin:1em; font-family: sans-serif }
        ul li {margin-top: 8px; line-height: 1.5 }
        .toc ul li {line-height: 1.2; margin-top: 2px}
        .contacted {display: none; background: #69F}
        .happy {display: none; background: #3D3 }
        .unhappy {display: none; background: #D33}
        .meh {display: none; background: #999}
        .agree {border: 2px solid #3D3; padding: 1px }
        .disagree {border: 2px solid #D33; padding: 1px }
        .discuss {border: 2px solid #93D; padding: 1px; background:#EDF }
        span em { background: #FFD }
    </style>
  <link href="/StyleSheets/base.css" rel="stylesheet" type="text/css" />
</head>

<body>
<h1><a href="/" shape="rect"><img src="/Icons/w3c_home" alt="W3C" width="72"
height="48" />
</a> <a href="/Interaction/"><img src="/Icons/interaction"
alt="Interaction Domain" />
</a> Architectural vision for HTML/XHTML2/Forms Chartering</h1>

<p>The discussion around the re-chartering of the HTML-related work was
extensive. In the interest of providing a convenient summary, this document
discusses the overall architectural vision behind the chartering of these
groups, and how they fit into the wider pattern of the Interaction Domain and
the overall Web Architecture.</p>

<p>The architectural directions along which the community is now moving are
the result of much input, and everyone involved in the new activity will have
to make some accommodation to the reality of the situation and the
requirements of others. There is a strong common component throughout this
work, a serious need on the part of users and web designers, and a
significant opportunity to improve this space for everyone.</p>

<h2 id="xml-arch">XML-based Architecture and tag soup</h2>

<p>W3C has in general assumed that XML is the correct way forward and that
implementations will fall into line as necessary over time. For the mobile
market, and for non-HTML client technologies like SMIL, SVG, MathML,
Timed-Text and so forth, this has indeed happened. For the desktop browser
market, however, tag soup markup has persisted much longer than we would have
expected or hoped. In consequence, the TAG <a
href="http://www.w3.org/2001/tag/issues.html?type=1#TagSoupIntegration-54">issue
TagSoupIntegration-54: Tag soup integration</a> has been opened to consider
whether the indefinite persistence of 'tag soup' HTML is consistent with a
sound architecture for the Web.</p>

<p>Pretending that the situation does not exist is not acceptable. There are
several ways to approach it:</p>
<ol>
  <li><p>Try to force users and implementers to greater adoption of the
    existing XHTML 1.x. In essence, this was the strategy before. There are
    several drawbacks, however:</p>
    <ol>
      <li><p>Since <a href="http://www.w3.org/TR/xhtml1/#guidelines">Appendix
        C of XHTML 1.0</a> allows such content to be sent to legacy user
        agents, users get no warning when their content is not well formed.
        Malformed content therefore proliferates. User agents start to assume
        that any XHTML 1.x is not well formed, or sniff it for hints such as
        an XML declaration or a Strict doctype.</p>
      </li>
      <li><p>Since XHTML added no new features (XHTML 1.0) or one new feature
        (Ruby, in XHTML 1.1), the incentive for users to move to the XML-based
        format is small. They get no reward for doing so, beyond the rather
        theoretical satisfaction of creating well-formed content.</p>
      </li>
    </ol>
  </li>
  <li><p>Create a new language, with a different media type, which is more
    extensible, more accessible, has richer semantics, and so forth. Older
    user agents which do not understand this format will not request it, and
    will reject it. This was the strategy for XHTML 2.0.</p>
    <p>Unfortunately this also has a drawback. While XHTML 2.0 has been
    adopted for authoring (for example, in device independent authoring) and
    in some corporate situations (where the XForms support is valuable and
    the choice of client can be controlled) it has not been successful among
    legacy browser vendors nor have new browser vendors emerged to promote
    it. Thus, client-side use remains small and this is a barrier to entry.
    This approach may well succeed in the longer term, but it does not seem
    to have sufficient traction currently.</p>
  </li>
  <li><p>Create independent but related languages for different audiences.
    This has a clear drawback relative to a single language, yet it can still
    be considered, especially if XML forms a common parsing model.</p>
    <p>It would have been possible (and there were some calls for this) for
    the primarily desktop-oriented, consumer-oriented language to have
    <em>only</em> a tag-soup serialization. However, that would certainly
    have a negative and divisive effect on the Web architecture. Gratuitous
    incompatibilities with XML should be strenuously avoided.</p>
    <p>Instead, the charter calls for two <em>equivalent</em> serializations
    to be developed by the HTML WG, corresponding to a single DOM (or
    infoset, though tag soup cannot be considered to have an infoset
    currently, while it can have a DOM). This ensures that decisions are not
    made which would preclude an XML serialization. It allows the two
    serializations to be inter-converted automatically. Because the language
    has new features, content authors have an incentive to use it; and
    because it has client-side implementations, they can actually deploy
    it.</p>
  </li>
</ol>

<p>Of these, W3C has chosen the third approach. If this new HTML-family
format is widely used, and if content that is not already serialized as XML
can be reliably converted to XML (reliably meaning that not only the
formatting but also the structure is the same, and the semantics are not
altered), then XML-based workflows can create and consume this content.
Meanwhile, enterprise-strength needs are met by XHTML2, which includes
XForms. The two formats are differentiated by deployment strategy and
expected field of use.</p>

<p>Interconversion between two serializations of a single DOM should be well
defined. Experience with, for example, HTML Tidy, and John Cowan's work on
TagSoup, demonstrates the feasibility (although, unlike the case with HTML
Tidy, the interconversion should not be seen as error correction).</p>
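<p>As an illustration only (the fragment below is a hypothetical example, not
taken from the charter or from any draft), the same small document tree might
be written in a tag soup serialization as:</p>
<pre>&lt;p&gt;First paragraph
&lt;p&gt;Second paragraph&lt;br&gt;with an image &lt;img src=logo.png alt=Logo&gt;</pre>

<p>and in an XML serialization as:</p>
<pre>&lt;p&gt;First paragraph&lt;/p&gt;
&lt;p&gt;Second paragraph&lt;br /&gt;with an image &lt;img src="logo.png" alt="Logo" /&gt;&lt;/p&gt;</pre>

<p>Assuming parsing rules like those of current desktop browsers, both forms
would yield two <code>p</code> elements, the second containing a
<code>br</code> and an <code>img</code> with the same attributes, differing
at most in insignificant white space. Neither form is an error to be
corrected; each is one spelling of the same tree.</p>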

<p>As mobile clients cannot afford the luxury of multiple parsers, and given
that an XML parser is already required, content which is expected to be
viewed on (or at least not to exclude) a mobile device should be authored
using the XML serialization. Also, as soon as there is a need for
any extensibility, the XML serialization (with use of XML namespaces) gains
an immediate practical advantage.</p>

<p>Over time, therefore, both the amount of content in this format and the
percentage of it in the XML serialization should be expected to
increase.</p>

<p>This direction does not diminish the role of XML as the central
architecture for markup on the Web and elsewhere. It is merely trying out more
creative, and hopefully more successful, ways to reach the same goal: building
bridges rather than barriers, and reducing the large step into a set of
separate steps which can be motivated independently.</p>

<h2 id="xml-integrate">Integration with the XML ecosystem</h2>

<p>The Compound Document Formats (CDF) WG, which has up to now worked on
compound documents by reference, has now started work on compound documents
by inclusion - real multi-namespace documents, where XML is clearly the only
way forward in this plan. This should also drive adoption (once more, on
mobile first and then later on the desktop).</p>
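<p>A rough sketch of the distinction, using a hypothetical file name and
content rather than anything drawn from the CDF drafts: a compound document by
reference points at a separate resource,</p>
<pre>&lt;object data="chart.svg" type="image/svg+xml"&gt;
  Fallback text for the chart
&lt;/object&gt;</pre>

<p>whereas a compound document by inclusion carries the foreign markup inline,
distinguished only by its namespace:</p>
<pre>&lt;div xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;p&gt;A circle:&lt;/p&gt;
  &lt;svg xmlns="http://www.w3.org/2000/svg" width="100" height="100"&gt;
    &lt;circle cx="50" cy="50" r="40" fill="#69F" /&gt;
  &lt;/svg&gt;
&lt;/div&gt;</pre>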

<p>The role of the XHTML 2 working group, both in creating an
enterprise-strength, extensible markup language and in producing spin-off
technologies which are applicable to other XML grammars, will also be
emphasized. In particular, the XHTML 2 WG will take part in the XML
Coordination group as well as the Hypertext Coordination group.</p>

<p>The issue of extensibility was raised by several commenters. Because XML
has namespaces, and namespaced attributes, there is a clear method for
creating compound documents with clearly identified extensions - from
components like MathML or SVG, to rich metadata. The tag soup form is
expected to be used only where no extensions are present.</p>
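<p>For instance, a fragment along the following lines mixes XHTML with a
MathML component and a namespaced metadata attribute. The <code>ex</code>
prefix, its namespace URI and the <code>ex:reviewed</code> attribute are
invented for this illustration; only the XHTML and MathML namespace URIs are
real.</p>
<pre>&lt;p xmlns="http://www.w3.org/1999/xhtml"
   xmlns:ex="http://example.org/ns/review"
   ex:reviewed="true"&gt;
  The square of x is
  &lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;
    &lt;msup&gt;&lt;mi&gt;x&lt;/mi&gt;&lt;mn&gt;2&lt;/mn&gt;&lt;/msup&gt;
  &lt;/math&gt;.
&lt;/p&gt;</pre>

<p>Because every element and attribute is bound to a namespace, a processor
can tell which parts belong to the host language and which are extensions,
even when it does not understand the extensions themselves.</p>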
<hr />
<address>
  Dan Connolly, Chris Lilley, Tim Berners-Lee
</address>


<p class="copyright"><a rel="Copyright"
href="/Consortium/Legal/ipr-notice#Copyright">Copyright</a> © 2007 <a
href="/"><acronym
title="World Wide Web Consortium">W3C</acronym></a><sup>®</sup> (<a
href="http://www.csail.mit.edu/"><acronym
title="Massachusetts Institute of Technology">MIT</acronym></a>, <a
href="http://www.ercim.org/"><acronym
title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>,
<a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a
href="/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a
href="/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a>, <a
rel="Copyright" href="/Consortium/Legal/copyright-documents">document use</a>
and <a rel="Copyright" href="/Consortium/Legal/copyright-software">software
licensing</a> rules apply. Your interactions with this site are in accordance
with our <a href="/Consortium/Legal/privacy-statement#Public">public</a> and
<a href="/Consortium/Legal/privacy-statement#Members">Member</a> privacy
statements.</p>

<p>Last modified: $Date: 2007/03/07 19:02:59 $</p>
</body>
</html>