<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="en-US" xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>The Semantic Web and its applications at W3C</title>
<link rel="stylesheet" type="text/css" title="W3C Talk"
      href="http://www.w3.org/2003/Talks/Tools/w3ctalk-summary.css" />
<link rel="stylesheet" type="text/css" title="W3C Talk"
      href="http://www.w3.org/2003/Talks/Tools/w3ctalk-proj.css" media="projection"/>
</head>
<body>
<h1>The Semantic Web and its applications at W3C</h1>
<address>Dominique Haza&euml;l-Massieux<br />
W3C Team (Aix-en-Provence)<br />
SIMO - Madrid<br />
slides: <code>http://www.w3.org/2003/Talks/simo-semwebapp/</code>
</address>
<h1>Overview</h1>
<ol>
<li>Semantic Web Technologies</li>
<li>Examples of Semantic Web Applications in W3C Work</li>
</ol>
<h1>Semantic Web Technologies (1)</h1>
<p>Goal: Adding a Web of data to the Web of documents</p>
<p>Foundation: RDF (Resource Description Framework) lets anyone state anything about anything</p>
<h1>Semantic Web Technologies (2)</h1>
<div>
<img src="../../../DesignIssues/diagrams/sw-stack-2002.png" alt="architectural layers of the semantic web" />
<p>The layers of Semantic Web Technologies</p>
</div>
<h1>So what?</h1>
<ul>
<li>RDF has an XML syntax &rarr; Internationalization, XSLT (XML transformation language) and other common XML tools</li>
<li>RDF is perfect for merging data from various sources, and for mixing namespaces</li>
<li>RDF is for the Web &rarr; easier to share, to protect, to retrieve</li>
<li>a consistent stack of technologies &rarr; better integration</li>
<li>using URIs &rarr; network effect</li>
</ul>
<h1>Not tomorrow, today</h1>
<p>These benefits are NOT theoretical...</p>
<p>Not everything is standardized yet, but the prototypes are promising<br />
&rarr; use of Semantic Web Technologies integrated in W3C Work</p>
<div>
<img src="sw-stack-annotated2.png" alt="architectural layers of the semantic web annotated as standard/experimental" />
<p>The layers of Semantic Web Technologies: what's standard, being standardized, experimental</p>
</div>
<h1>W3C Process and Deliverables</h1>
<ul>
<li>W3C's main deliverables are its Technical Reports (TR)</li>
<li>produced following an adopted process</li>
<li>in a quite decentralized way</li>
</ul>
<h1><a href="../../../2002/01/tr-automation/">Technical Reports management automation</a></h1>
<ul>
<li><a href="../../../TR/Overview.html">TR page</a>: more than 400 referenced documents, produced by more than 500 editors</li>
<li>maintained by hand until Nov. 2002 (<em>I know, I used to do it!</em>)</li>
<li>now managed with Semantic Web technologies</li>
</ul>
<h1>TR automation (2)</h1>
<ul>
<li>W3C Specifications must conform to the <a href="http://www.w3.org/2003/05/27-pubrules.html">W3C Publication Rules</a>:<ul>
<li>&rarr; common format across W3C Specifications</li>
<li>common set of metadata (title, date, editors list, status of the publication, ...)</li>
</ul></li>
<li>before: metadata was copied by hand into the list of Technical Reports</li>
</ul>
<h1>TR automation (3)</h1>
<p>But now:</p>
<ul>
<li>an XSLT is used to <a href="pubrules-example.html">check the conformance to the publication rules</a></li>
<li>... and this XSLT also extracts the metadata</li>
</ul>
<h1>TR automation (4)</h1>
<div>
<p>Modeling the process as Rules</p>
<img src="tr-pub-process-simplified.png" alt="Simplified schema of the automated publication process" /><br />
&rarr; a completely formalized digital library
</div>
<p>(in fact, <a href="tr-pub-process.png">the process is a bit more complex</a>)</p>
<h1>Benefits of TR automation</h1>
<p>Webmaster's point of view:</p>
<ul>
<li>less manual handling &rarr; fewer errors</li>
<li>much more efficient (publication rate grows steadily)</li>
<li>new views of the TR page (<a href="../../../TR/tr-editor.html">by editor</a>, <a href="../../../TR/tr-date.html">by date</a>, <a href="../../../TR/tr-title.html">by title</a>, <a href="../../../TR/tr-activity.html">by W3C Activity</a>) possible for free! (thanks to XSLT)</li>
</ul>
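<p>A minimal sketch of why the extra views cost nothing once the metadata is structured: each view is just another ordering or grouping of the same records. The site does this with XSLT over RDF; Python is used here purely for illustration, and the records and function names below are hypothetical.</p>
<pre>
```python
from collections import defaultdict

# Hypothetical metadata records: titles, dates and editors are invented
# for illustration and do not come from the real TR list.
reports = [
    {"title": "Spec B", "date": "2003-05-01", "editors": ["Ann", "Bob"]},
    {"title": "Spec A", "date": "2003-09-12", "editors": ["Bob"]},
    {"title": "Spec C", "date": "2002-11-20", "editors": ["Ann"]},
]


def by_title(records):
    """Titles in alphabetical order."""
    return [r["title"] for r in sorted(records, key=lambda r: r["title"])]


def by_date(records):
    """Titles, most recent first (ISO dates sort correctly as strings)."""
    ordered = sorted(records, key=lambda r: r["date"], reverse=True)
    return [r["title"] for r in ordered]


def by_editor(records):
    """Map each editor to the sorted list of titles they edited."""
    view = defaultdict(list)
    for r in records:
        for editor in r["editors"]:
            view[editor].append(r["title"])
    return {editor: sorted(titles) for editor, titles in sorted(view.items())}


print(by_title(reports))
print(by_date(reports))
print(by_editor(reports))
```
</pre>
<p>Adding another view means adding one small function; the underlying data is never touched.</p>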
<h1>But even better!</h1>
<p>The data gets reused and extended all over the place!</p>
<ul>
<li>our COO wants statistical analysis &rarr; <a href="../../05/tr-history/rec-history.svg">graphical overview</a> (wasn't available before)</li>
<li>our Communication Team needs numbers &rarr; <a href="tr-stats-sample.html">available at any time through XSLT</a> (was done manually before)</li>
<li>people want to integrate our list of standards on their site &rarr; <a href="../../../2002/01/tr-automation/tr.rdf">it's available to everyone in RDF/XML</a> (had to be parsed by ad-hoc scripts before)</li>
<li>our specification Editors want <a href="http://www.w3.org/2000/06/webdata/xslt?xslfile=http%3A%2F%2Fwww.w3.org%2F2002%2F01%2Ftr-automation%2Ftr-biblio.xsl&amp;xmlfile=http%3A%2F%2Fwww.w3.org%2F2002%2F01%2Ftr-automation%2Ftr-logs.rdf&amp;uris=http%3A%2F%2Fwww.w3.org%2FTR%2Fqaframe-ops%2F%0D%0Ahttp%3A%2F%2Fwww.w3.org%2FTR%2F2003%2FWD-qaframe-spec-20030912%2F%0D%0A">automated bibliographies</a> for their specifications (wasn't available before)</li>
<li>a very non-intrusive approach (no extra work for anybody)</li>
</ul>
<h1>A basis for other works</h1>
<ul>
<li><a href="../../../QA/TheMatrix.html">The QA Matrix</a> now uses this as a basis for its data:<ul>
<li>only maintains what's relevant (validators, test suites, ...)</li>
<li>gets automated updates from the TR in RDF</li>
</ul></li>
<li><a href="../../03/Translations/OverviewLang">Translations at W3C</a><ul>
<li>only maintains what's relevant (translations URIs, translators, ...)</li>
<li>provides various views (by Technology, by Language)</li>
</ul></li>
<li>other sites re-use our digital library as a basis for other works</li>
</ul>
<h1>Related to...</h1>
<ul>
<li>the <a href="../../../2000/04/mem-news/public-groups.rdf">RDF structure of W3C</a> (automatically extracted from HTML) gives a very powerful <a href="../../02/W3COrg.svg">overview of the work at W3C</a> (and its dependencies)</li>
<li>the <a href="../../../2002/10/scrape/Introduction.html">WG Markup guidelines</a> allow us to get very detailed data on the Working Groups using them</li>
<li>the <a href="../../01/pubcalendar/calendar.html">W3C Publication Calendar</a> gives a 10,000-foot overview of the work happening at W3C</li>
</ul>
<h1>It's actually all about integration</h1>
<p>Integration of Web Technologies:</p>
<ul>
<li>XML &rarr; Internationalization (e.g. in Translations)</li>
<li>XSLT gives us full control over our data</li>
<li>RDF/S brings modeling of real entities, in a <a href="../../../2001/02pd/rec54-img.png">self-describing way</a></li>
<li>XHTML can be used as output and even as input (through social/technical conventions)</li>
<li>SVG (vector graphics) makes the output even more powerful</li>
</ul>
<h1>It's all about integration (2)</h1>
<p>Integration of tools:</p>
<ul>
<li><a href="../../../2000/10/swap/">CWM</a> and <a href="http://rdflib.net">RDFLib</a></li>
<li>Any (reasonably compliant) XSLT processor</li>
</ul>
<h1>Decentralized data management</h1>
<p>I'm presenting this work, but I have built only a small piece of it... Data are managed:</p>
<ul>
<li>by the Webmaster (list of TRs)</li>
<li>by the Quality Assurance Team (Matrix data)</li>
<li>by the Translations Management Team (Translations list)</li>
<li>by the Communication Team (structural data about W3C)</li>
<li>by the Working Group Chairs (detailed info about WG)</li>
</ul>
<p>In different formats: RDF/XML, Notation3, (X)HTML</p>
<h1>Decentralized data manipulation</h1>
<p>(credits go to)</p>
<ul>
<li>Ryan Lee</li>
<li>Ivan Herman</li>
<li>Dan Connolly</li>
<li>(myself)</li>
</ul>
<p>Because the data are on the Web (using URIs), data from various origins can be combined without trouble (network effect)</p>
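<p>A rough illustration of that network effect, assuming RDF-like statements reduced to (subject, property, value) tuples: because both sources name the document by the same URI, merging is a plain set union. All URIs and values below are made up, and real RDF tooling such as CWM or RDFLib also handles details (property URIs, blank nodes) that this sketch ignores.</p>
<pre>
```python
# Hypothetical URI standing in for a Technical Report; not a real document.
TR = "http://example.org/TR/some-spec"

# Source 1: what a TR list might record (invented values).
tr_list = {
    (TR, "title", "Some Specification"),
    (TR, "status", "Working Draft"),
}

# Source 2: what a translations list might record about the same URI.
translations = {
    (TR, "translation", "http://example.org/fr/some-spec"),
    (TR, "title", "Some Specification"),  # same statement as in source 1
}


def merge(*graphs):
    """Merge graphs by set union: identical statements collapse into one."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Collect every property known about one subject, whatever its origin."""
    properties = {}
    for s, p, o in graph:
        if s == subject:
            properties.setdefault(p, set()).add(o)
    return properties


merged = merge(tr_list, translations)
info = describe(merged, TR)
print(len(merged))
print(sorted(info))
```
</pre>
<p>Neither source had to know about the other; the shared URI is the only coordination needed.</p>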
<h1>Upcoming uses</h1>
<ul>
<li>reports of discrepancies in the data</li>
<li>graphical navigation of the W3C site (?), of the TR page (in progress)</li>
<li>more statistical analysis of our work</li>
<li>more automation of our processes</li>
</ul>
<h1>Comparisons with traditional solutions</h1>
<ul>
<li>Modeling, Business/Processing Rules are not "lost" in the code <br />
&rarr; declarative approach more re-usable, easier to adapt</li>
<li>HTTP is the most ubiquitous protocol <br />
&rarr; easy re-use of available tools and technologies (e.g. access control)</li>
<li>integrated technologies from top to bottom <strong>and</strong> open standards (cf. XML success)</li>
<li>highly decentralizable (cf. Web success :)</li>
</ul>
<h1>Thanks for your attention</h1>
<p>Your questions are welcome!</p>
<h2>References</h2>
<p>More details:</p>
<ul>
<li><a href="http://www.w3.org/2002/01/tr-automation/">TR automation</a></li>
<li><a href="http://www.w3.org/2003/03/Translations/">Translations management</a></li>
<li><a href="http://www.w3.org/2002/10/scrape/Introduction.html">WG Markup guidelines</a> for RDF scraping</li>
</ul>
</body>
</html>