<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="en-US" xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>The Semantic Web and its applications at W3C</title>
<link rel="stylesheet" type="text/css" title="W3C Talk"
href="http://www.w3.org/2003/Talks/Tools/w3ctalk-summary.css" />
<link rel="stylesheet" type="text/css" title="W3C Talk"
href="http://www.w3.org/2003/Talks/Tools/w3ctalk-proj.css" media="projection"/>
</head>
<body>
<h1>The Semantic Web and its applications at W3C</h1>
<address>Dominique Hazaël-Massieux<br />
W3C Team (Aix-en-Provence)<br />
SIMO - Madrid<br />
slides: <code>http://www.w3.org/2003/Talks/simo-semwebapp/</code>
</address>
<h1>Overview</h1>
<ol>
<li>Semantic Web Technologies</li>
<li>Examples of Semantic Web Applications in W3C Work</li>
</ol>
<h1>Semantic Web Technologies (1)</h1>
<p>Goal: Adding a Web of data to the Web of documents</p>
<p>Foundation: RDF (Resource Description Framework) makes it possible to state anything about anything</p>
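<p>A minimal sketch of one such statement in RDF/XML, assuming the Dublin Core <code>title</code> property is used to describe this talk (the choice of property is an illustration, not something prescribed by RDF):</p>
<pre><code>&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"&gt;
  &lt;!-- subject: the URI of these slides; property: dc:title; value: a literal --&gt;
  &lt;rdf:Description rdf:about="http://www.w3.org/2003/Talks/simo-semwebapp/"&gt;
    &lt;dc:title&gt;The Semantic Web and its applications at W3C&lt;/dc:title&gt;
  &lt;/rdf:Description&gt;
&lt;/rdf:RDF&gt;</code></pre>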
<h1>Semantic Web Technologies (2)</h1>
<div>
<img src="../../../DesignIssues/diagrams/sw-stack-2002.png" alt="architectural layers of the semantic web" />
<p>The layers of Semantic Web Technologies</p>
</div>
<h1>So what?</h1>
<ul>
<li>RDF has an XML syntax (RDF/XML) → Internationalization, XSLT (XML transformation language) and other common tools</li>
<li>RDF is perfect for merging data from various sources, and for mixing namespaces</li>
<li>RDF is for the Web → easier to share, to protect, to retrieve</li>
<li>a consistent stack of technologies → better integration</li>
<li>using URIs → network effect</li>
</ul>
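<p>As a rough illustration of merging data and mixing namespaces: two sources can describe the same URI with different vocabularies, and a processor that loads both ends up with one set of statements about that resource. All URIs under <code>example.org</code> and the <code>ex:testSuite</code> property below are hypothetical:</p>
<pre><code>&lt;rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://example.org/terms#"&gt;
  &lt;!-- statement as a first source might publish it (Dublin Core vocabulary) --&gt;
  &lt;rdf:Description rdf:about="http://example.org/spec"&gt;
    &lt;dc:title&gt;An example specification&lt;/dc:title&gt;
  &lt;/rdf:Description&gt;
  &lt;!-- statement from a second source, using a hypothetical vocabulary --&gt;
  &lt;rdf:Description rdf:about="http://example.org/spec"&gt;
    &lt;ex:testSuite rdf:resource="http://example.org/spec/tests"/&gt;
  &lt;/rdf:Description&gt;
&lt;/rdf:RDF&gt;</code></pre>
<p>Because both descriptions share the same <code>rdf:about</code> URI, they amount to two statements about a single resource.</p>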
<h1>Not tomorrow, today</h1>
<p>These benefits are NOT theoretical...</p>
<p>Not everything is standardized yet, but the prototypes are promising<br />
→ use of Semantic Web Technologies integrated in W3C Work</p>
<div>
<img src="sw-stack-annotated2.png" alt="architectural layers of the semantic web annotated as standard/experimental" />
<p>The layers of Semantic Web Technologies: what's standard, being standardized, experimental</p>
</div>
<h1>W3C Process and Deliverables</h1>
<ul>
<li>Main W3C Deliverables are its Technical Reports (TR)</li>
<li>produced following an adopted process</li>
<li>in a quite decentralized way</li>
</ul>
<h1><a href="../../../2002/01/tr-automation/">Technical Reports management automation</a></h1>
<ul>
<li><a href="../../../TR/Overview.html">TR page</a>: more than 400 referenced documents, produced by more than 500 editors</li>
<li>maintained by hand until Nov. 2002 (<em>I know, I used to do it!</em>)</li>
<li>now managed with Semantic Web technologies</li>
</ul>
<h1>TR automation (2)</h1>
<ul>
<li>W3C specifications must conform to the <a href="http://www.w3.org/2003/05/27-pubrules.html">W3C Publication Rules</a>:<ul>
<li>common format across W3C specifications</li>
<li>common set of metadata (title, date, editors list, status of the publication, ...)</li>
</ul></li>
<li>before: this metadata was copied by hand into the list of Technical Reports</li>
</ul>
<h1>TR automation (3)</h1>
<p>But now:</p>
<ul>
<li>an XSLT style sheet is used to <a href="pubrules-example.html">check the conformance to the publication rules</a></li>
<li>... and the same style sheet also extracts the metadata</li>
</ul>
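<p>The idea can be sketched with a small XSLT 1.0 style sheet. This is not the actual W3C checker: it only assumes that a document needs a non-empty XHTML <code>title</code>, reports a message when it is missing, and emits the title as a <code>dc:title</code> statement. The <code>uri</code> parameter and its default value are hypothetical:</p>
<pre><code>&lt;xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    exclude-result-prefixes="h"&gt;
  &lt;xsl:output method="xml" indent="yes"/&gt;
  &lt;!-- hypothetical parameter: URI of the document being checked --&gt;
  &lt;xsl:param name="uri" select="'http://example.org/spec'"/&gt;
  &lt;xsl:template match="/"&gt;
    &lt;!-- check: an assumed rule that the title must not be empty --&gt;
    &lt;xsl:if test="normalize-space(/h:html/h:head/h:title) = ''"&gt;
      &lt;xsl:message&gt;Rule not met: the document has no title&lt;/xsl:message&gt;
    &lt;/xsl:if&gt;
    &lt;!-- extraction: the XHTML title becomes a dc:title statement --&gt;
    &lt;rdf:RDF&gt;
      &lt;rdf:Description rdf:about="{$uri}"&gt;
        &lt;dc:title&gt;
          &lt;xsl:value-of select="normalize-space(/h:html/h:head/h:title)"/&gt;
        &lt;/dc:title&gt;
      &lt;/rdf:Description&gt;
    &lt;/rdf:RDF&gt;
  &lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;</code></pre>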
<h1>TR automation (4)</h1>
<div>
<p>Modeling the process as Rules</p>
<img src="tr-pub-process-simplified.png" alt="Simplified schema of the automated publication process" /><br />
→ a completely formalized digital library
</div>
<p>(in fact, <a href="tr-pub-process.png">the process is a bit more complex</a>)</p>
<h1>Benefits of TR automation</h1>
<p>Webmaster's point of view:</p>
<ul>
<li>fewer manual operations → fewer errors</li>
<li>much more efficient (publication rate grows steadily)</li>
<li>new views of the TR page (<a href="../../../TR/tr-editor.html">by editor</a>, <a href="../../../TR/tr-date.html">by date</a>, <a href="../../../TR/tr-title.html">by title</a>, <a href="../../../TR/tr-activity.html">by W3C Activity</a>) possible for free! (thanks to XSLT)</li>
</ul>
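<p>A "by title" view could be produced along these lines. This is a sketch only: it assumes the input is RDF/XML in which each report is an <code>rdf:Description</code> element carrying an <code>rdf:about</code> URI and a <code>dc:title</code>, which is not necessarily how the actual TR data is serialized:</p>
<pre><code>&lt;xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="rdf dc"&gt;
  &lt;xsl:template match="/rdf:RDF"&gt;
    &lt;ul&gt;
      &lt;!-- one list item per described report, sorted by title --&gt;
      &lt;xsl:for-each select="rdf:Description[dc:title]"&gt;
        &lt;xsl:sort select="dc:title"/&gt;
        &lt;li&gt;&lt;a href="{@rdf:about}"&gt;&lt;xsl:value-of select="dc:title"/&gt;&lt;/a&gt;&lt;/li&gt;
      &lt;/xsl:for-each&gt;
    &lt;/ul&gt;
  &lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;</code></pre>
<p>Changing the <code>xsl:sort</code> key (for instance to a date property) would give another view of the same data without touching the data itself.</p>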
<h1>But even better!</h1>
<p>The data gets reused and completed all over the place!</p>
<ul>
<li>our COO wants statistical analysis → <a href="../../05/tr-history/rec-history.svg">graphical overview</a> (wasn't available before)</li>
<li>our Communication Team needs numbers → <a href="tr-stats-sample.html">available at any time through XSLT</a> (was done manually before)</li>
<li>people want to integrate our list of standards on their site → <a href="../../../2002/01/tr-automation/tr.rdf">it's available to everyone in RDF/XML</a> (had to be parsed by ad-hoc scripts before)</li>
<li>our specification editors want <a href="http://www.w3.org/2000/06/webdata/xslt?xslfile=http%3A%2F%2Fwww.w3.org%2F2002%2F01%2Ftr-automation%2Ftr-biblio.xsl&amp;xmlfile=http%3A%2F%2Fwww.w3.org%2F2002%2F01%2Ftr-automation%2Ftr-logs.rdf&amp;uris=http%3A%2F%2Fwww.w3.org%2FTR%2Fqaframe-ops%2F%0D%0Ahttp%3A%2F%2Fwww.w3.org%2FTR%2F2003%2FWD-qaframe-spec-20030912%2F%0D%0A">automated bibliographies</a> for their specifications (wasn't available before)</li>
<li>a very non-intrusive approach (didn't need more work for anybody)</li>
</ul>
<h1>A basis for other works</h1>
<ul>
<li><a href="../../../QA/TheMatrix.html">The QA Matrix</a> now uses this as a basis for its data:<ul>
<li>only maintains what's relevant (validators, test suites, ...)</li>
<li>gets automated updates from the TR in RDF</li>
</ul></li>
<li><a href="../../03/Translations/OverviewLang">Translations at W3C</a><ul>
<li>only maintains what's relevant (translations URIs, translators, ...)</li>
<li>provides various views (by Technology, by Language)</li>
</ul></li>
<li>other sites re-use our digital library as a basis for other works</li>
</ul>
<h1>Related to...</h1>
<ul>
<li>the <a href="../../../2000/04/mem-news/public-groups.rdf">RDF structure of W3C</a> (automatically extracted from HTML) gives a very powerful <a href="../../02/W3COrg.svg">overview of the work at W3C</a> (and its dependencies)</li>
<li>the <a href="../../../2002/10/scrape/Introduction.html">WG Markup guidelines</a> allow us to get very detailed data on the Working Groups using them</li>
<li>the <a href="../../01/pubcalendar/calendar.html">W3C Publication Calendar</a> gives a 10,000-foot overview of the work happening at W3C</li>
</ul>
<h1>It's actually all about integration</h1>
<p>Integration of Web Technologies:</p>
<ul>
<li>XML → Internationalization (e.g. in Translations)</li>
<li>XSLT gives us full power over our data</li>
<li>RDF/S brings modeling of real entities, in a <a href="../../../2001/02pd/rec54-img.png">self-describing way</a></li>
<li>XHTML can be used as output and even as input (through social/technical conventions)</li>
<li>SVG (vector graphics) makes the output even more powerful</li>
</ul>
<h1>It's all about integration (2)</h1>
<p>Integration of tools:</p>
<ul>
<li><a href="../../../2000/10/swap/">CWM</a> and <a href="http://rdflib.net">RDFLib</a></li>
<li>Any (reasonably compliant) XSLT processor</li>
</ul>
<h1>Decentralized data management</h1>
<p>I'm presenting this work, but I have built only a small piece of it... Data are managed:</p>
<ul>
<li>by the Webmaster (list of TRs)</li>
<li>by the Quality Assurance Team (Matrix data)</li>
<li>by the Translations Management Team (Translations list)</li>
<li>by the Communication Team (structural data about W3C)</li>
<li>by the Working Group Chairs (detailed info about WG)</li>
</ul>
<p>In different formats: RDF/XML, Notation3, (X)HTML</p>
<h1>Decentralized data manipulation</h1>
<p>(credits go to)</p>
<ul>
<li>Ryan Lee</li>
<li>Ivan Herman</li>
<li>Dan Connolly</li>
<li>(myself)</li>
</ul>
<p>Because the data are on the Web (using URIs), data from various origins can be combined without trouble (network effect)</p>
<h1>Upcoming uses</h1>
<ul>
<li>reports of discrepancies in the data</li>
<li>graphical navigation of the W3C site (?), of the TR page (in progress)</li>
<li>more statistical analysis of our work</li>
<li>more automation of our processes</li>
</ul>
<h1>Comparisons with traditional solutions</h1>
<ul>
<li>Modeling, Business/Processing Rules are not "lost" in the code <br />
→ declarative approach more re-usable, easier to adapt</li>
<li>HTTP is the most ubiquitous protocol <br />
→ easy re-use of available tools and technologies (e.g. access control)</li>
<li>integrated technologies from top to bottom <strong>and</strong> open standards (cf. XML success)</li>
<li>highly decentralizable (cf. Web success :)</li>
</ul>
<h1>Thanks for your attention</h1>
<p>Your questions are welcome!</p>
<h2>References</h2>
<p>More details:</p>
<ul>
<li><a href="http://www.w3.org/2002/01/tr-automation/">TR automation</a></li>
<li><a href="http://www.w3.org/2003/03/Translations/">Translations management</a></li>
<li><a href="http://www.w3.org/2002/10/scrape/Introduction.html">WG Markup guidelines</a> for RDF scraping</li>
</ul>
</body>
</html>