<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
	<meta http-equiv="content-type" content="text/html; charset=utf-8" />
	<title>Semantic data extractor - QA @ W3C</title>
	<meta name="Keywords" content="qa, quality assurance, conformance, validity, test suite, Semantics Analysis, XHTML, HTML, outline" />
	<meta name="Description" content="W3C QA - A tool to create a semantic outline of a document" />
	<link rel="schema.DC" href="http://purl.org/dc" />
	<meta name="DC.Subject" lang="en" content="Semantics Analysis, XHTML, HTML, outline" />
	<meta name="DC.Title" lang="en" content="Semantic data extractor" />

	<meta name="DC.Description.Abstract" lang="en" content="A tool to create a semantic outline of a document" />
	<meta name="DC.Date.Created" content="2006-11-15" />
	<meta name="DC.Language" scheme="RFC1766" content="en" />
	<meta name="DC.Creator" content="Dominique Hazaël-Massieux" />
	<meta name="DC.Publisher" content="W3C - World Wide Web Consortium - http://www.w3.org" />
	<meta name="DC.Rights" content="http://www.w3.org/Consortium/Legal/copyright-documents" />
	    <style type="text/css" media="all">
	    @import "/QA/2006/01/blogstyle.css";
		.form {background-color: #dfebf7;
			padding: 1em;
			margin: 1em 250px 2em 2em;
			}
		.uri-field {background-color: #FFC;
			font-size: 1.1em;
			width: 25em;
			border-width: 1px;
			border-color: #555 #eee #eee #555;
			border-style: solid;			
			}
		.button-extract {
			display: block;
			background-color: #005A9C;
			color: #fff;
		}
	    </style>
	</head>

	<body>
	    <div id="banner">
	      <h1 id="title">
		<a href="http://www.w3.org/"><img height="48" alt="W3C" id="logo" src="http://www.w3.org/Icons/WWW/w3c_home_nb" /></a>
	        <a href="http://www.w3.org/QA/"><img src="http://www.w3.org/QA/2002/12/qa.png" alt="Quality Assurance" /></a>
	        Semantic Data Extractor
	</h1>
	    </div>
	    <ul class="navbar" id="menu">

	        <li><strong><a href="/QA/" title="Quality Assurance Web Site Home">W3C QA Home</a></strong></li>
	        <li><a href="/QA/IG/" title="The Quality Assurance Interest Group">QA IG</a></li>
	        <li><a href="/QA/Library/" title="Documents and Publications on Web and Quality">Documents</a></li>
	        <li><a href="/QA/Tools">Tools</a></li>
	        <li><a href="/QA/IG/#contact">Feedback</a></li>
	    </ul>

	<div id="searchbox">
	<form method="get" action="http://www.google.com/custom" enctype="application/x-www-form-urlencoded">
	<p id="formbox"><input type="text" size="15" class="textfield" name="q" accesskey="E" maxlength="255" /><input type="submit" class="submitfield" value="Search" id="goButton" name="sa" accesskey="G" /><input type="hidden" name="cof" value="T:black;LW:72;ALC:#ff3300;L:http://www.w3.org/Icons/w3c_home;LC:#000099;LH:48;BGC:white;AH:left;VLC:#660066;GL:0;AWFID:0b9847e42caf283e;" /><input type="hidden" id="searchW3C" name="sitesearch" checked="checked" value="www.w3.org/QA" /><input type="hidden" name="domains" value="www.w3.org/QA" /></p>
	</form>
	</div>



	    <div id="main"><!-- This DIV encapsulates everything in this page - necessary for the positioning -->
	        <div id="jumpbar">

	         <h2>Quick Introduction</h2>
			<p>This tool, powered by an <a href="/2002/08/extract-semantic">XSLT stylesheet</a>, tries to extract information from a semantically rich HTML document. It relies only on information made available by proper use of the semantics defined in HTML.</p>
	        <p>The aim is to show that semantically rich HTML adds value to your code: it allows better use of CSS and makes your HTML intelligible to a wider range of user agents (especially search engine bots).</p>
	        <p>As an aside, it can give user agent developers clues about hooks that could be worth adding to their products.</p>
	
			</div>

	        <h2>On-line service</h2>
	        <p>Try a demonstration of the service, which relies on both the <a href="/2001/05/xslt">W3C XSLT Servlet</a> and <a href="http://services.w3.org/tidy/tidy">tidy on-line</a>:</p>

<form class="form" action="http://www.w3.org/2002/08/xslt4html" method="get">
   	<p><label><span style="display:block;">URI of the HTML document to extract semantics from:</span> 
	  <input class="uri-field" type="text" name="xmlfile" /></label>
      <input type="hidden" name="xslfile" value="http://www.w3.org/2002/08/extract-semantic.xsl" />
      <input class="button-extract" type="submit" value="Extract semantics" />
    </p>
    </form>
	<h2>More Semantics?</h2>
	<p>If you have suggestions for improving this XSLT, please send patches to <a href="mailto:public-qa-dev@w3.org?subject=[Semantic%20Data%20Extractor]">public-qa-dev@w3.org</a>.</p>
	</div>

	<!-- Footer -->

	<address class="author">
	    <a href="http://validator.w3.org/check?uri=referer"><img
	        src="http://www.w3.org/Icons/valid-xhtml10" height="31" width="88"
	        alt="Valid XHTML 1.0!" /></a> Created on 2006-11-15 by Dominique Hazaël-Massieux<br />
	Last modified $Date: 2011/01/04 17:14:10 $ by $Author: dom $</address>

	    <p class="copyright">
	      <a rel="Copyright" href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> &copy; 1994-2006
	      <a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a>&reg;

	      (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>,
	      <a href="http://www.ercim.org/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>,
	      <a href="http://www.keio.ac.jp/">Keio</a>),
	      All Rights Reserved.
	      W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>,
	      <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a>,
	      <a rel="Copyright" href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a>
	      and <a rel="Copyright" href="http://www.w3.org/Consortium/Legal/copyright-software">software licensing</a>

	      rules apply. Your interactions with this site are in accordance
	      with our <a href="http://www.w3.org/Consortium/Legal/privacy-statement#Public">public</a> and
	      <a href="http://www.w3.org/Consortium/Legal/privacy-statement#Members">Member</a> privacy
	      statements.
	    </p>

	</body>
	</html>