<?xml version="1.0" encoding="UTF-8"?><!--*- nxml -*-->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Gleaning Resource Descriptions from Dialects of Languages
(GRDDL)</title>
<style type="text/css">
.issue {
background-color:#dfd;
border: thin solid black;
color:black;
}
.designSketch {
background-color:#fdf;
border: thin solid black;
color:black;
}
.illustration {
margin-left:auto;
margin-right:auto;
text-align:center;
}
.example {
margin-left:auto;
margin-right:auto;
padding-top:0.5em;
padding-bottom:0.5em;
width:70%;
border-top:thin dashed black;
border-bottom:thin dashed black;
}</style>
<link rel="stylesheet" type="text/css" href="http://www.w3.org/StyleSheets/TR/W3C-CG-NOTE" />
</head>
<body xml:lang="en" lang="en">
<div class="head">
<a href="http://www.w3.org/"><img alt="W3C" src="http://www.w3.org/Icons/w3c_home"
height="48" width="72" /></a>
<h1>Gleaning Resource Descriptions from Dialects of Languages (GRDDL)</h1>
<h2>W3C Coordination Group Note 13 April 2004</h2>
<dl>
<dt>This Version:</dt>
<dd><a href="http://www.w3.org/TR/2004/NOTE-grddl-20040413/">http://www.w3.org/TR/2004/NOTE-grddl-20040413/</a></dd>
<dt>Latest Version:</dt>
<dd><a
href="http://www.w3.org/TR/grddl/">http://www.w3.org/TR/grddl/</a></dd>
<dt>Authors:</dt>
<dd><a href="/People/Dom/">Dominique Hazaël-Massieux</a></dd>
<dd><a
href="/People/Connolly/">Dan Connolly</a></dd>
</dl>
<p class="copyright"><a
href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a>
© 2003, 2004 <a href="http://www.w3.org/"><acronym
title="World Wide Web Consortium">W3C</acronym></a><sup>®</sup> (<a
href="http://www.csail.mit.edu/"><acronym
title="Massachusetts Institute of Technology">MIT</acronym></a>, <a
href="http://www.ercim.org/"><acronym
title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>,
<a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a
href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>,
<a
href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a>,
<a href="http://www.w3.org/Consortium/Legal/copyright-documents">document
use</a> and <a
href="http://www.w3.org/Consortium/Legal/copyright-software">software
licensing</a> rules apply.</p>
</div>
<hr />
<h2>Abstract</h2>
<p>This document presents GRDDL, a mechanism for encoding RDF statements in
XHTML and XML to be extracted by programs such as XSLT transformations.</p>
<div>
<h2>Status of This Document</h2>
<p><em>This section describes the status of this document at the time
of its publication. Other documents may supersede this document. A
list of current W3C publications and the latest revision of this
technical report can be found in the <a
href="http://www.w3.org/TR/">W3C technical reports index</a> at
<tt>http://www.w3.org/TR/</tt>.</em></p>
<p>As part of the work of the <a
href="http://www.w3.org/2001/sw/Activity">W3C Semantic Web
Activity</a>, the <a href="/2001/sw/CG/">Semantic Web Coordination Group</a> (Member-only) and the <a href="/MarkUp/">HTML Working
Group</a> started a task force on RDF in XHTML. This draft is a snapshot
of one of the designs discussed in that task force.</p>
<p>Please send review comments, implementation experience reports,
etc. to <a href= "mailto:public-rdf-in-xhtml-tf@w3.org"
>public-rdf-in-xhtml-tf@w3.org</a>, a mailing list with <a
href="http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/">public
archive</a>.</p>
<p>The <a
href="http://esw.w3.org/topic/EmbeddingRDFinHTML">EmbeddingRDFinHTML</a>
wiki topic is also available as a shared space for collected wisdom on
related topics.</p>
<p>A related <a
href="http://www.w3.org/2004/01/rdxh/specbg.html">design history and
rationale</a> discusses the contribution of this draft to RDF issues such
as <a
href="http://www.w3.org/2000/03/rdf-tracking/#faq-html-compliance"
>faq-html-compliance</a> and <a
href="http://www.w3.org/2000/03/rdf-tracking/#rdfms-validating-embedded-rdf"
>rdfms-validating-embedded-rdf</a> and Web Architecture issues such as
<a href="http://www.w3.org/2001/tag/issues.html?type=1#RDFinXHTML-35"
>RDFinXHTML-35</a> and <a
href="http://www.w3.org/2001/tag/issues.html?type=1#namespaceDocument-8"
>namespaceDocument-8</a>.</p>
<p>This is something of a design sketch, but it is backed by running
code. We provide a pair of online services, <a
href="http://www.w3.org/2003/11/rdf-in-xhtml-demo">one demo for
XHTML</a> and <a
href="http://www.w3.org/2004/01/rdxh/grddl-xml-demo">one demo for
generic XML</a> on an experimental, best-effort basis.</p>
<p>The editors are aware of a few <span class="issue">remaining issues,
marked up like this <q>@@@</q></span>.</p>
<p>A <a href="#changes">log of changes</a> is appended.</p>
<p><em>Publication as a Coordination Group Note does not imply
endorsement by the W3C Membership. This is a draft document and may be
updated, replaced or obsoleted by other documents at any time. It is
inappropriate to cite this document as other than work in
progress.</em></p>
</div>
<div>
<h2 id="toc">Contents</h2>
<ol>
<li><a href="#intro">Introduction</a></li>
<li><a href="#grddl-xhtml">GRDDL for XHTML</a></li>
<li><a href="#grddl-xml">GRDDL for XML</a></li>
<li><a href="#ns-bind">GRDDL for XML Namespace Documents</a></li>
<li><a href="#sec">Security Considerations</a></li>
<li class="issue">@@ References</li>
</ol>
<ul>
<li><a href="#changes">Changelog</a></li>
</ul>
<h3 id="toc-app">Supplementary Material</h3>
<ul>
<li><a
href="http://www.w3.org/2004/lambda/Sites/index.html">Example
Homepage with Dublin Core, GeoURL, RSS, Creative Commons, etc.</a></li>
<li><a id="notes" href="http://www.w3.org/2004/01/rdxh/specbg.html">Design History and Rationale</a></li>
</ul>
</div>
<div>
<h2 id="intro"><span class="gen">1.</span> Introduction</h2>
<p>An article by J. Kunze in 1999, <cite><a
href="http://www.ietf.org/rfc/rfc2731.txt">Encoding Dublin Core Metadata in
HTML</a></cite>, explains one way that the Dublin Core community encodes its
metadata in HTML documents. This metadata can also be expressed in the
Resource Description Framework (<a href="http://www.w3.org/RDF/">RDF</a>).</p>
<p>The mapping between the HTML encoding and the RDF encoding can be
represented as an XSLT transformation, <a
href="http://www.w3.org/2000/06/dc-extract/dc-extract.xsl">dc-extract.xsl</a>:</p>
<div class="illustration">
<img src="dc-extract.png" alt="diagram: HTML to RDF via dc-extract.xsl" /><br
/>
Decoding HTML metadata to RDF <br />
<small>(<a href="dc-extract.svg">svg</a>)</small></div>
<p>If the HTML author understood and agreed to these encoding conventions,
then their HTML document will conform to the syntactic conventions. In this
case, the mapping preserves the author's meaning. But an author may have
<em>accidentally</em> conformed to the syntactic conventions without any
knowledge of Dublin Core at all. In that case, the mapping most likely does
<em>not</em> preserve the author's meaning.</p>
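<p>The kind of mapping described above can be sketched in a few lines. The
following is a simplified illustration, not a port of
<code>dc-extract.xsl</code>: the function name <code>dc_statements</code> is
hypothetical, and the rule that a <code>DC.</code> prefix maps to the
lower-cased Dublin Core element name is an assumption made for the sketch.</p>

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
DC_NS = "http://purl.org/dc/elements/1.1/"


def dc_statements(xhtml_text, doc_uri=""):
    """Return (subject, predicate, object) triples for DC.* meta elements.

    Hypothetical helper: assumes 'DC.Subject' maps to the Dublin Core
    property 'subject' by dropping the prefix and lower-casing the rest.
    """
    root = ET.fromstring(xhtml_text)
    triples = []
    for meta in root.iter("{%s}meta" % XHTML_NS):
        name = meta.get("name", "")
        content = meta.get("content")
        if name.startswith("DC.") and content is not None:
            predicate = DC_NS + name[3:].lower()
            triples.append((doc_uri, predicate, content))
    return triples


SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Some Document</title>
    <meta name="DC.Subject" content="ADAM; Simple Search; Index+; prototype" />
  </head>
  <body><p>...</p></body>
</html>"""

for triple in dc_statements(SAMPLE):
    print(triple)
```

<p>Note that this sketch extracts statements from any document that happens
to match the syntax, which is exactly the accidental-conformance problem
described above.</p>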
<h2 id="grddl-xhtml"><span class="gen">2.</span> The GRDDL profile for
XHTML</h2>
<p>The HTML specification, in section <a href=
"http://www.w3.org/TR/1999/REC-html401-19991224/struct/global.html#h-7.4.4.3"
>7.4.4.3 Meta data profiles</a>, provides a mechanism for authors to
use particular metadata vocabularies and thereby indicate the author's
intent to use those terms in accordance with the conventions of the
community that originated the terms.</p>
<blockquote>
<p>Authors may wish to define additional link types not described in this
specification. If they do so, they should use a <a
href="http://www.w3.org/TR/1999/REC-html401-19991224/struct/global.html#profiles">profile</a>
to cite the conventions used to define the link types.</p>
</blockquote>
<p><dfn>GRDDL</dfn> is such a profile; it's a mechanism for <b>G</b>leaning
<b>R</b>esource <b>D</b>escriptions from <b>D</b>ialects of <b>L</b>anguages.
Use of the <tt><a
href="/2003/g/data-view">http://www.w3.org/2003/g/data-view</a></tt> profile
indicates that <em>RDF statements that result from transformation of the HTML
document to RDF by designated algorithms are part of the document's
meaning.</em></p>
<p>In this profile, the <tt>transformation</tt> link relationship relates a
document to an algorithm for gleaning resource descriptions from the
dialect the document is written in.</p>
<div class="illustration">
<img src="processing.png" alt="diagram: link to transformation" /><br />
Decoding HTML metadata to RDF <br />
<small>(<a href="processing.svg">svg</a>)</small>
</div>
<p class="issue">@@@ Should we namespace-qualify the token used in
<code>rel</code>? Cf. <a
href="http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2004Jan/0005.html">Profiles
attribute: A format to be defined</a> Karl Dubost 15 Jan 2004.</p>
<p>For example:</p>
<pre class="example">&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;head profile="http://www.w3.org/2003/g/data-view"&gt;
    &lt;title&gt;Some Document&lt;/title&gt;
    &lt;link rel="transformation"
          href="http://www.w3.org/2000/06/dc-extract/dc-extract.xsl" /&gt;
    &lt;meta name="DC.Subject"
          content="ADAM; Simple Search; Index+; prototype" /&gt;
    ...
  &lt;/head&gt;
  ...
&lt;/html&gt;</pre>
<p>The following RDF statement is part of the meaning of this document:</p>
<pre class="example">&lt;rdf:RDF
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
&gt;
  &lt;rdf:Description rdf:about=""&gt;
    &lt;dc:subject&gt;ADAM; Simple Search; Index+; prototype&lt;/dc:subject&gt;
  &lt;/rdf:Description&gt;
&lt;/rdf:RDF&gt;</pre>
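<p>A consumer might discover the designated transformations along the
following lines. This is a minimal sketch, assuming well-formed XHTML input;
the function name <code>find_transformations</code> is hypothetical, and
applying the XSLT itself is left out because it needs an XSLT processor.</p>

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


def find_transformations(xhtml_text):
    """Return the href of every rel="transformation" link, or [] when the
    head element does not cite the GRDDL profile."""
    root = ET.fromstring(xhtml_text)
    head = root.find("{%s}head" % XHTML_NS)
    if head is None:
        return []
    # HTML 4.01 allows several profile URIs separated by white space.
    profiles = head.get("profile", "").split()
    if GRDDL_PROFILE not in profiles:
        return []
    hrefs = []
    for link in head.iter("{%s}link" % XHTML_NS):
        # rel is a white-space separated list of link types.
        rels = link.get("rel", "").split()
        href = link.get("href")
        if "transformation" in rels and href:
            hrefs.append(href)
    return hrefs


WITH_PROFILE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Some Document</title>
    <link rel="transformation"
          href="http://www.w3.org/2000/06/dc-extract/dc-extract.xsl" />
    <meta name="DC.Subject" content="ADAM; Simple Search; Index+; prototype" />
  </head>
  <body><p>...</p></body>
</html>"""

# Same markup without the profile: the author made no GRDDL claim.
WITHOUT_PROFILE = WITH_PROFILE.replace(
    ' profile="http://www.w3.org/2003/g/data-view"', "")

print(find_transformations(WITH_PROFILE))
print(find_transformations(WITHOUT_PROFILE))
```

<p>Checking the profile before looking at any link is the point of the
design: without it, the links carry no statement of the author's intent.</p>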
<p>Transformation algorithms <b>should</b> be represented in XSLT. While
JavaScript, C, or any other programming language can technically express the
relevant information, XSLT is specifically designed to express XML to XML
transformations and has some good safety characteristics. Other
representations <b>may</b> be used by prior agreement of all concerned
parties.</p>
<p>Transformation algorithms <b>should</b> be well-defined functions whose
only input is the source document. The use of the XSLT
<code>document()</code> function to incorporate other data at transformation
time is an <b>error</b>.</p>
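<p>One way to flag such stylesheets before running them is a lexical scan of
their XPath-bearing attributes. The sketch below is only a heuristic under
stated assumptions: the function name <code>uses_document_function</code> is
hypothetical, and the check can report a false positive when the text
<code>document(</code> appears inside an XPath string literal.</p>

```python
import re
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Matches a call to the unprefixed document() function; the lookbehind
# rejects longer or prefixed names such as my:document( or sub-document(.
DOCUMENT_CALL = re.compile(r"(?<![\w.:-])document\s*\(")


def uses_document_function(xslt_text):
    """Heuristically report whether a stylesheet calls document()."""
    root = ET.fromstring(xslt_text)
    for elem in root.iter():
        in_xsl = elem.tag.startswith("{%s}" % XSL_NS)
        for value in elem.attrib.values():
            # XSLT instructions hold XPath in attributes (select, test, ...);
            # literal result elements hold it in {...} value templates.
            if (in_xsl or "{" in value) and DOCUMENT_CALL.search(value):
                return True
    return False


PURE = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:value-of select="//title"/>
  </xsl:template>
</xsl:stylesheet>"""

IMPURE = PURE.replace(
    'select="//title"',
    "select=\"document('http://www.example.org/other.xml')//title\"")

print(uses_document_function(PURE))
print(uses_document_function(IMPURE))
```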
<p class="issue">Limitations on <code>xsl:import</code>?</p>
<p>Note that an XHTML document may conform to a number of dialects
simultaneously and link to more than one decoding algorithm. For example, the
fictional <a
href="http://www.w3.org/2004/lambda/Sites/index.html">Joe
Lambda's Homepage</a> demonstrates a mixture of Dublin Core, Creative
Commons, RSS, FOAF, and geoURL dialects.</p>
</div>
<div>
<h2 id="grddl-xml"><span class="gen">3.</span> The GRDDL attribute in XML</h2>
<p>The GRDDL profile mechanism is a special case of GRDDL designed to fit
within the syntax of XHTML 1.0. The general form of GRDDL is an attribute
suitable for use with a wide variety of XML dialects.</p>
<p>Use of the <code>interpreter</code> attribute in the
<code>http://www.w3.org/2003/g/data-view#</code> namespace on the root
element of an XML document indicates that <em>RDF statements that result from
transformation of the XML document to RDF by designated algorithms are part
of the document's meaning.</em></p>
<p>The value of the <code>interpreter</code> attribute designates a
list of algorithms by URI reference. <span class="issue">@@@IRI
reference?</span></p>
<p>For example: <em class="issue">update to P3Q example?</em></p>
<pre class="example"><code>&lt;svg xmlns="http://www.w3.org/2000/svg"
     xmlns:data-view="http://www.w3.org/2003/g/data-view#"
     data-view:interpreter="http://www.example.org/2004/01/svg2dc.xsl"
     width="4cm" height="8cm"
     version="1.1" baseProfile="tiny" &gt;</code></pre>
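<p>Reading the attribute is straightforward with a namespace-aware parser.
The following is a minimal sketch: the function name
<code>interpreters</code> is hypothetical, and splitting the list on white
space is an assumption, since this draft does not state the separator.</p>

```python
import xml.etree.ElementTree as ET

DATA_VIEW_NS = "http://www.w3.org/2003/g/data-view#"


def interpreters(xml_text):
    """Return the URI references listed in the root element's
    interpreter attribute (empty list if the attribute is absent)."""
    root = ET.fromstring(xml_text)
    value = root.get("{%s}interpreter" % DATA_VIEW_NS, "")
    # Assumption: entries of the list are separated by white space.
    return value.split()


SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:data-view="http://www.w3.org/2003/g/data-view#"
     data-view:interpreter="http://www.example.org/2004/01/svg2dc.xsl"
     width="4cm" height="8cm"
     version="1.1" baseProfile="tiny" />"""

print(interpreters(SAMPLE))
```

<p>Because the lookup uses the namespace name rather than the prefix, it
works whatever prefix a document binds to the namespace.</p>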
</div>
<div>
<h2 id="ns-bind"><span class="gen">4.</span> XML Namespaces and embedded RDF</h2>
<p>The RDF property
<code>http://www.w3.org/2003/g/data-view#namespaceTransformation</code>
links an XML Namespace to an interpreter that may be applied to any document
which has its root element in that namespace, such that the output of the
interpreter will be an RDF/XML form of some (or all) of the information
content of the document.</p>
<p>For instance, given the XML Namespace
<code>http://www.example.net/fooML</code>,</p>
<div class="example">
<pre><code>&lt;rdf:Description rdf:about="http://www.example.net/fooML"&gt;
  &lt;namespaceTransformation xmlns='http://www.w3.org/2003/g/data-view#'
      rdf:resource='http://www.example.net/fooML2rdf.xsl' /&gt;
&lt;/rdf:Description&gt;</code></pre>
</div>
<p>asserts that if an XML document has a root element in the
<code>http://www.example.net/fooML</code> namespace and is run through
the XSLT style sheet <code>http://www.example.net/fooML2rdf.xsl</code>,
then the result will be valid RDF/XML conveying information that the
document can be considered to have expressed.</p>
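<p>The lookup this implies can be sketched as follows. The sketch makes
several assumptions: the function names are hypothetical, the namespace
description is already in hand as a string, and it is written in exactly the
<code>rdf:Description</code> form shown above. RDF/XML permits many
equivalent spellings, so a real implementation would use an RDF parser.</p>

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DATA_VIEW_NS = "http://www.w3.org/2003/g/data-view#"


def root_namespace(xml_text):
    """Return the namespace name of the root element, or None."""
    tag = ET.fromstring(xml_text).tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


def namespace_transformations(rdfxml_text, namespace_uri):
    """Return the namespaceTransformation values stated about namespace_uri.

    Only handles rdf:Description elements with rdf:about and a property
    element carrying rdf:resource; relative URIs are not resolved.
    """
    root = ET.fromstring(rdfxml_text)
    found = []
    for desc in root.iter("{%s}Description" % RDF_NS):
        if desc.get("{%s}about" % RDF_NS) != namespace_uri:
            continue
        for prop in desc.findall("{%s}namespaceTransformation" % DATA_VIEW_NS):
            resource = prop.get("{%s}resource" % RDF_NS)
            if resource:
                found.append(resource)
    return found


NAMESPACE_DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://www.example.net/fooML">
    <namespaceTransformation xmlns="http://www.w3.org/2003/g/data-view#"
        rdf:resource="http://www.example.net/fooML2rdf.xsl" />
  </rdf:Description>
</rdf:RDF>"""

INSTANCE = '<doc xmlns="http://www.example.net/fooML"><item/></doc>'

print(namespace_transformations(NAMESPACE_DOC, root_namespace(INSTANCE)))
```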
</div>
<div>
<h2 id="sec"><span class="gen">5.</span> Security considerations</h2>
<p><a href="http://www.faqs.org/rfcs/rfc2046.html">RFC 2046</a>, in
section 9. Security Considerations says:</p>
<blockquote>
<p>Implementors should pay special attention to the
security implications of any media types that can cause the remote
execution of any actions in the recipient's environment. In such
cases, the discussion of the "application/postscript" type may serve
as a model for considering other media types with remote execution
capabilities.</p>
</blockquote>
<p>Given the expressive power of XSLT, and the possibility to access external
resources from an XSLT style sheet (e.g. through the <code>document</code>
function or the <code>xsl:import</code> mechanism), implementors should take
the appropriate measures to prevent malicious usage of this mechanism.</p>
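<p>As one illustration of such a measure, an implementation could inspect a
style sheet's <code>xsl:import</code> and <code>xsl:include</code> references
before executing it. The sketch below is hypothetical throughout: the
function name <code>external_imports</code> and the same-host policy are
assumptions chosen for the example, not requirements of this draft, and
<code>xml:base</code> is ignored.</p>

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin, urlsplit

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def external_imports(xslt_text, base_uri):
    """Return the absolute URIs of xsl:import / xsl:include targets that
    live on a different host than the stylesheet itself."""
    base_host = urlsplit(base_uri).netloc
    root = ET.fromstring(xslt_text)
    flagged = []
    for name in ("import", "include"):
        for elem in root.iter("{%s}%s" % (XSL_NS, name)):
            href = elem.get("href")
            if href is None:
                continue
            # Resolve relative references against the stylesheet's own URI.
            target = urljoin(base_uri, href)
            if urlsplit(target).netloc != base_host:
                flagged.append(target)
    return flagged


STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:import href="common.xsl"/>
  <xsl:import href="http://other.example.org/lib.xsl"/>
</xsl:stylesheet>"""

print(external_imports(STYLESHEET, "http://www.example.org/2004/01/svg2dc.xsl"))
```

<p>A caller could refuse to run a style sheet for which this list is
non-empty, or fetch the flagged resources only after a separate policy
check.</p>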
</div>
<div>
<h2 id="changes"><em>Change History</em></h2>
<p>The <a href="http://www.w3.org/2003/11/rdf-in-xhtml-proposal">Nov
2003 draft</a> is a predecessor of this spec.</p>
<p>An <a href="http://www.w3.org/2004/01/rdxh/spec">editor's working draft</a> is also available; v1.11 was announced in <a
href="http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2004Jan/0011.html">a
message of 16 Jan</a>.</p>
</div>
</body>
</html>