<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>Semantic data extractor - QA @ W3C</title>
<meta name="Keywords" content="qa, quality assurance, conformance, validity, test suite, Semantics Analysis, XHTML, HTML, outline" />
<meta name="Description" content="W3C QA - A tool to create a semantic outline of a document" />
<link rel="schema.DC" href="http://purl.org/dc" />
<meta name="DC.Subject" lang="en" content="Semantics Analysis, XHTML, HTML, outline" />
<meta name="DC.Title" lang="en" content="Semantic data extractor" />
<meta name="DC.Description.Abstract" lang="en" content="A tool to create a semantic outline of a document" />
<meta name="DC.Date.Created" content="2006-11-15" />
<meta name="DC.Language" scheme="RFC1766" content="en" />
<meta name="DC.Creator" content="Dominique Hazaël-Massieux" />
<meta name="DC.Publisher" content="W3C - World Wide Web Consortium - http://www.w3.org" />
<meta name="DC.Rights" content="http://www.w3.org/Consortium/Legal/copyright-documents" />
<style type="text/css" media="all">
@import "/QA/2006/01/blogstyle.css";
.form {background-color: #dfebf7;
padding: 1em;
margin: 1em 250px 2em 2em;
}
.uri-field {background-color: #FFC;
font-size: 1.1em;
width: 25em;
border-width: 1px;
border-color: #555 #eee #eee #555;
border-style: solid;
}
.button-extract {
display: block;
background-color: #005A9C;
color: #fff;
}
</style>
</head>
<body>
<div id="banner">
<h1 id="title">
<a href="http://www.w3.org/"><img height="48" alt="W3C" id="logo" src="http://www.w3.org/Icons/WWW/w3c_home_nb" /></a>
<a href="http://www.w3.org/QA/"><img src="http://www.w3.org/QA/2002/12/qa.png" alt="Quality Assurance" /></a>
Semantic Data Extractor
</h1>
</div>
<ul class="navbar" id="menu">
<li><strong><a href="/QA/" title="Quality Assurance Web Site Home">W3C QA Home</a></strong></li>
<li><a href="/QA/IG/" title="The Quality Assurance Interest Group">QA IG</a></li>
<li><a href="/QA/Library/" title="Documents and Publications on Web and Quality">Documents</a></li>
<li><a href="/QA/Tools">Tools</a></li>
<li><a href="/QA/IG/#contact">Feedback</a></li>
</ul>
<div id="searchbox">
<form method="get" action="http://www.google.com/custom" enctype="application/x-www-form-urlencoded">
<p id="formbox"><input type="text" size="15" class="textfield" name="q" accesskey="E" maxlength="255" /><input type="submit" class="submitfield" value="Search" id="goButton" name="sa" accesskey="G" /><input type="hidden" name="cof" value="T:black;LW:72;ALC:#ff3300;L:http://www.w3.org/Icons/w3c_home;LC:#000099;LH:48;BGC:white;AH:left;VLC:#660066;GL:0;AWFID:0b9847e42caf283e;" /><input type="hidden" id="searchW3C" name="sitesearch" checked="checked" value="www.w3.org/QA" /><input type="hidden" name="domains" value="www.w3.org/QA" /></p>
</form>
</div>
<div id="main"><!-- This DIV encapsulates everything in this page - necessary for the positioning -->
<div id="jumpbar">
<h2>Quick Introduction</h2>
<p>This tool, powered by an <a href="/2002/08/extract-semantic">XSLT stylesheet</a>, tries to extract information from a semantically rich HTML document. It uses only the information made available through proper use of the semantics defined in HTML.</p>
<p>The aim is to show that semantically rich HTML adds value to your code: it allows better use of CSS and makes your HTML intelligible to a wider range of user agents, especially search engine bots.</p>
<p>As an aside, it can give user agent developers clues about hooks that could be worth adding to their products.</p>
</div>
<h2>On-line service</h2>
<p>Try a demonstration of the service, which relies on both the <a href="/2001/05/xslt">W3C XSLT Servlet</a> and <a href="http://services.w3.org/tidy/tidy">tidy on-line</a>:</p>
<form class="form" action="http://www.w3.org/2002/08/xslt4html" method="get">
<p><label><span style="display:block;">URI of the HTML document to extract semantics from:</span>
<input class="uri-field" type="text" name="xmlfile" /></label>
<input type="hidden" name="xslfile" value="http://www.w3.org/2002/08/extract-semantic.xsl" />
<input class="button-extract" type="submit" value="Extract semantics" />
</p>
</form>
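<p>Because the form uses the <code>get</code> method, submitting it amounts to following a link to the service with two query parameters, <code>xmlfile</code> and <code>xslfile</code>. As a minimal sketch, assuming a hypothetical document at <code>http://example.org/</code>, such a link could be written as:</p>
<pre><code>&lt;!-- Hypothetical example: http://example.org/ stands in for the document to analyse --&gt;
&lt;a href="http://www.w3.org/2002/08/xslt4html?xmlfile=http%3A%2F%2Fexample.org%2F&amp;amp;xslfile=http%3A%2F%2Fwww.w3.org%2F2002%2F08%2Fextract-semantic.xsl"&gt;Semantic outline of example.org&lt;/a&gt;</code></pre>
<p>The parameter values are percent-encoded, as a browser would do when submitting the form, and the ampersand is written as <code>&amp;amp;</code> so that the markup stays valid XHTML.</p>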
<h2>More Semantics?</h2>
<p>If you have suggestions for improving this XSLT, please send patches to <a href="mailto:public-qa-dev@w3.org?subject=[Semantic%20Data%20Extractor]">public-qa-dev@w3.org</a>.</p>
</div>
<!-- Footer -->
<address class="author">
<a href="http://validator.w3.org/check?uri=referer"><img
src="http://www.w3.org/Icons/valid-xhtml10" height="31" width="88"
alt="Valid XHTML 1.0!" /></a> Created Date: 2006-11-15 by Dominique Hazaël-Massieux<br />
Last modified $Date: 2011/01/04 17:14:10 $ by $Author: dom $</address>
<p class="copyright">
<a rel="Copyright" href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> © 1994-2006
<a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a>®
(<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>,
<a href="http://www.ercim.org/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>,
<a href="http://www.keio.ac.jp/">Keio</a>),
All Rights Reserved.
W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>,
<a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a>,
<a rel="Copyright" href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a>
and <a rel="Copyright" href="http://www.w3.org/Consortium/Legal/copyright-software">software licensing</a>
rules apply. Your interactions with this site are in accordance
with our <a href="http://www.w3.org/Consortium/Legal/privacy-statement#Public">public</a> and
<a href="http://www.w3.org/Consortium/Legal/privacy-statement#Members">Member</a> privacy
statements.
</p>
</body>
</html>