<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=UTF-8" />
  <meta name="RCS-Id"
  content="$Id: Overview.html,v 1.7 2008/06/07 08:29:30 eric Exp $" />
  <title>Experiences with the conversion of SenseLab databases to
  RDF/OWL</title>
  <style type="text/css">


      /*<![CDATA[*/
.mesh { background-color: #ffc }
.goa { background-color: #fcf }
.glbl { background-color: #ccf }
.plbl { background-color: #cfc }
.var { font-weight: bold }
.db { font-weight: bold }
.gene { color: blue }
.process{ color: red }
.senselab { background-color: #0ff }
.identifier { font-weight: bold }
.comment{ color: orange; font-size: 1.3em; }
.schema th { text-align: left }
table, td, th { border-style: solid;
                  border-width: 1px;
                  border-color: black;
                  border-bottom-color: gray;
                  border-right-color: gray; }
table.dbsTable { border-collapse: collapse; border-color: #000000; }
table.dbsTable td:first-child { vertical-align: top; }
table.dbsTable td { padding: 2px 5px 2px 5px; }
.at-issue {text-decoration: underline;}
.issue {background-color: #fcc;}
      /*]]>*/
  </style>
    <link rel="stylesheet" type="text/css" href="http://www.w3.org/StyleSheets/TR/W3C-IG-NOTE" />
</head>

<body>

<div class="head">
<p><a href="http://www.w3.org/"><img src="http://www.w3.org/Icons/w3c_home"
alt="W3C" height="48" width="72" /></a></p>

<h1 id="main">Experiences with the conversion of SenseLab databases to
RDF/OWL</h1>
<h2 class="no-num no-toc" id="w3c-doctype">W3C Interest Group Note 4 June 2008</h2>
<dl>
  <!-- dt>Editors working draft.</dt>
    <dd><span class="cvs-id">$Revision: 1.7 $ of
    $Date: 2008/06/07 08:29:30 $</span></dd -->
	<dt>This version:</dt>
	<dd><a href="http://www.w3.org/TR/2008/NOTE-hcls-senselab-20080604/">http://www.w3.org/TR/2008/NOTE-hcls-senselab-20080604/</a></dd>
	<dt>Latest version:</dt>
	<dd><a href="http://www.w3.org/TR/hcls-senselab/">http://www.w3.org/TR/hcls-senselab/</a></dd>
	<dt>Previous version:</dt>
	<dd><a href="http://www.w3.org/TR/2008/WD-hcls-senselab-20080404/">http://www.w3.org/TR/2008/WD-hcls-senselab-20080404/</a></dd>
  <dt>Editors:</dt>
    <dd>Matthias Samwald, <a href="http://ycmi.med.yale.edu/">Yale Center for
      Medical Informatics</a> / <a href="http://www.deri.ie/">DERI Galway</a>
      / <a href="http://www.semantic-web.at/">Semantic Web Company</a> &lt;<a
      href="mailto:samwald@gmx.at">samwald@gmx.at</a>&gt;</dd>
    <dd>Kei-Hoi Cheung, Yale Center for Medical Informatics &lt;<a
      href="mailto:kei.cheung@yale.edu">kei.cheung@yale.edu</a>&gt;</dd>
  <dt>Contributors:</dt>
    <dd>Alan Ruttenberg, <a href="http://sciencecommons.org/">Science
      Commons</a> &lt;<a
      href="mailto:alanruttenberg@gmail.com">alanruttenberg@gmail.com</a>&gt;</dd>
    <dd>Huajun Chen, Yale Center for Medical Informatics / <a
      href="http://www.zju.edu.cn/english/">Zhejiang University</a> &lt;<a
      href="mailto:huajunsir@zju.edu.cn">huajunsir@zju.edu.cn</a>&gt;</dd>
</dl>

<p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> &copy; 2008 <a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a><sup>&reg;</sup> (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>, <a href="http://www.ercim.org/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> rules apply.</p>
</div>
<hr title="Separator for header" />

<div>
<h2 class="notoc" id="abstract">Abstract</h2>

<p>One of the challenges facing the Semantic Web for Health Care and Life
Sciences is that of converting relational databases into Semantic Web format.
The issues and the steps involved in such a conversion have not been well
documented. To this end, we have created this document to describe the
process of converting SenseLab databases into OWL. SenseLab is a collection
of relational (Oracle) databases for neuroscientific research. The conversion
of these databases into RDF/OWL format is an important step towards realizing
the benefits of Semantic Web in integrative neuroscience research. This
document describes how we represented some of the SenseLab databases in
Resource Description Framework (RDF) and Web Ontology Language (OWL), and
discusses the advantages and disadvantages of these representations. Our OWL
representation is based on the reuse and extension of existing standard OWL
ontologies developed in the biomedical ontology communities. The purpose of
this document is to share our implementation experience with the
community.</p>
</div>

<div>
<h2 id="status">Status of This Document</h2>

<p><em>This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the <a href="http://www.w3.org/TR/">W3C technical reports index</a>  at http://www.w3.org/TR/.</em></p>

      <p>This is 
an Interest Group Note
<!-- an editor's draft -->
of the <a href="http://www.w3.org/2001/sw/hcls/">Semantic Web in Health Care and Life Sciences Interest Group (HCLS)</a>, part of the <a href="http://www.w3.org/2001/sw/">W3C Semantic Web Activity</a>. It is considered stable and expected to be published as an Interest Group Note in May 2008.

This document serves as a companion to 
<a href="http://www.w3.org/TR/2008/NOTE-hcls-kb-20080604/">A Prototype Knowledge Base for the Life Sciences</a>
 and describes the process for integrating new data into an existing biological database. We hope other groups who plan to convert their databases into RDF/OWL format will benefit from this document.</p>

<p>The document was produced by the <a href="http://www.w3.org/2001/sw/hcls/">Semantic Web in Health Care and Life Sciences Interest Group (HCLS)</a>, part of the <a href="http://www.w3.org/2001/sw/">W3C Semantic Web Activity</a> (<a href="http://www.w3.org/2001/sw/hcls/charter">see charter</a>). Comments may be sent to the <a href="http://lists.w3.org/Archives/Public/public-semweb-lifesci/">publicly archived</a> <a href="mailto:public-semweb-lifesci@w3.org">public-semweb-lifesci@w3.org</a> mailing list. Feedback is encouraged, as is participation in the recently <a href="http://www.w3.org/2008/05/HCLSIGCharter">re-chartered</a> HCLSIG. A <a href="WD2NOTE">list of changes since the last publication</a> is available.</p>

<p>Publication as an Interest Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.</p>

<p>This document was produced by a group operating under the disclosure
obligations of the <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5 February 2004 W3C Patent Policy</a>. The group does 
not expect this document to become a W3C Recommendation. An 
individual who has actual knowledge of a patent which the individual 
believes contains <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#def-essential">Essential Claim(s)</a> must disclose the information to
<a href="mailto:public-semweb-lifesci@w3.org">public-semweb-lifesci@w3.org</a> [<a href="http://lists.w3.org/Archives/Public/public-semweb-lifesci/">public archive</a>]
in accordance with <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">section 6 of the W3C Patent Policy</a>.</p>

</div>
<hr />

<div class="toc">
<h2 id="TOC">Table of Contents</h2>
<ul class="toc">
  <li><a href="#process">Conversion process</a> 
    <ul>
      <li><a href="#sources">Original data sources</a></li>
      <li><a href="#first">Initial RDF and OWL conversions</a> 
        <ul>
          <li><a href="#Motivation">Motivation</a></li>
          <li><a href="#Process">Process</a></li>
          <li><a href="#Outcome">Outcome </a></li>
        </ul>
      </li>
      <li><a href="#revised">Revised OWL conversions</a> 
        <ul>
          <li><a href="#Motivation1">Motivation</a></li>
          <li><a href="#Process1">Process</a></li>
          <li><a href="#Outcome1">Outcome</a></li>
        </ul>
      </li>
    </ul>
  </li>
  <li><a href="#advantages">Advantages</a></li>
  <li><a href="#disadvantages">Disadvantages</a></li>
  <li><a href="#future">Future directions and plans</a></li>
  <li><a href="#suggestions">Suggestions based on our experiences</a></li>
  <li><a href="#conclusion">Conclusion</a></li>
  <li><a href="#references">References</a></li>
  <li><a href="#Acknowledg">Acknowledgements (Informative)</a></li>
</ul>
</div>

<hr />

<h2 id="process">Conversion process</h2>

<h3 id="sources">Original data sources</h3>

<p>The SenseLab databases can be accessed through a web interface at the
SenseLab web site [<a href="#ref-SENSELAB-WEB">SENSELAB-WEB</a>]. SenseLab is
divided into a number of specialised databases, of which we have converted
three to Semantic Web formats. These databases are NeuronDB, BrainPharm and
ModelDB. All databases are based on compartmental models of neurons. NeuronDB
contains descriptions of anatomic locations, cell architecture and
physiologic parameters of neuronal cells. The pilot BrainPharm database is
intended to support research on drugs for the treatment of neurological
disorders. It enhances the descriptions in a portion of NeuronDB with
descriptions of the actions of pathological and pharmacological agents.
ModelDB is a large repository of computational neuroscience models and
simulations. The mathematical models in ModelDB are annotated with references
to NeuronDB. Taken together, these databases allow the researcher to query
information and to run simulations pertaining to the function of neurons in
healthy and disease states. All databases contain extensive literature
references and excerpts from texts that have been used to curate the database
entries.</p>

<p>The databases are based on the "entity-attribute-value with classes and
relationships" (EAV/CR) schema [<a href="#ref-EAV-CR">EAV-CR</a>]. The data
can also be downloaded from the SenseLab Semantic Web development portal [<a
href="#ref-SENSELAB-SW">SENSELAB-SW</a>] as a database dump in Microsoft
Access format and as text.</p>

<h3 id="first">Initial RDF and OWL conversions</h3>

<h4 id="Motivation">Motivation</h4>

<p>Our motivation was to make the SenseLab databases available in RDF(S) [<a
href="#ref-RDF">RDFS</a>] (without OWL) and in OWL DL [<a
href="#ref-OWL-Overview">OWL Overview</a>]. The two versions were developed
in parallel in order to compare the conversion processes and their outcomes.
We wanted to explore the issues that arise when mapping relational databases
to an RDF/OWL structure. In addition, we wanted to explore the possibility of
automatic translation from EAV/CR to RDF.</p>

<h4 id="Process">Process</h4>

<p>We developed a converter application in Java that queried the SenseLab
database and wrote RDF/XML files. The conversion was fully automatic for the
RDF version, but required some manual editing for the OWL version.</p>

<h4 id="Outcome">Outcome </h4>

<p>These conversions were too tied to the original database structure, which
resulted in inconsistent OWL ontologies. Some shortcomings of the first
conversion to OWL were: </p>
<ul>
  <li>'Part of' relations were incorrectly represented as subclass relations.
    This seems to be one of the most common mistakes in ontology development
    in general. </li>
  <li>Class disjoints<a id="disjoint-ref" href="#disjoint">¹</a> were
    missing, which made it hard to find inconsistencies and data entry errors.
  </li>
  <li>After disjoints were introduced, we found some previously unidentified
    inconsistencies with the help of OWL reasoners: some classes (e.g.
    'GABA') were subclasses of both 'neurotransmitter' and 'receptor', which
    was wrong. This was an artifact caused by the automated conversion --
    both GABA transmitters and GABA receptors were simply labeled with 'GABA'
    in the source database. The conversion algorithm generated URIs based on
    these labels, so they were represented with identical URIs
    (<code>http://neuroweb.med.yale.edu/senselab/neuron_ontology.owl#<strong>GABA</strong></code>).
    This grave mistake would not have been noticed without the use of OWL
    reasoning. </li>
  <li>Some of the labels of entities generated by the Java converter were
    very terse and not understandable outside the user interface of the
    original database. For example, "Ded" was the label of the "distal part
    of the dendrite". </li>
</ul>

<p id="disjoint"><a href="#disjoint-ref">¹</a> Declaring two classes disjoint
in OWL asserts that they have no members in common. A reasoner can use such
assertions to flag inconsistent models.</p>
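
<p>The URI clash described above can be reproduced in a few lines. The
following Python sketch is not the original Java converter: the function names
and the sample rows are hypothetical and only illustrate the mechanism. It
mints URIs from labels alone, reports URIs that end up shared by entries of
different kinds, and shows one possible remedy, which is to make the kind of
entity part of the URI fragment.</p>

```python
from collections import defaultdict

NAMESPACE = "http://neuroweb.med.yale.edu/senselab/neuron_ontology.owl#"

# Hypothetical rows as (kind of entity, label); not taken from the actual database.
ROWS = [
    ("neurotransmitter", "GABA"),
    ("receptor", "GABA"),
    ("neurotransmitter", "Dopamine"),
]


def uri_from_label(label):
    """Naive scheme: the URI fragment is the label alone."""
    return NAMESPACE + label


def uri_from_kind_and_label(kind, label):
    """Possible remedy: make the kind of entity part of the fragment."""
    return NAMESPACE + label + "_" + kind


def find_collisions(rows):
    """Return the URIs that the naive scheme assigns to more than one kind."""
    kinds_by_uri = defaultdict(set)
    for kind, label in rows:
        kinds_by_uri[uri_from_label(label)].add(kind)
    return {uri: sorted(kinds) for uri, kinds in kinds_by_uri.items() if len(kinds) > 1}


collisions = find_collisions(ROWS)
for uri, kinds in collisions.items():
    print(uri, "is shared by:", ", ".join(kinds))
```

<p>A check of this kind can run before any OWL is written, so that clashes
are caught even when no disjointness axioms are available to a reasoner.</p>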

<h3 id="revised">Revised OWL conversions</h3>

<p>The revised OWL conversion was based on the first OWL conversions
described above. The design of the revised SenseLab ontologies follows the
"ontological realism" approach [<a href="#ref-SMITH-2004">SMITH-2004</a>].
This means that the revised ontologies are focused on direct representations
of physical objects and processes (e.g., neuronal cells, ionic currents), and
not on their abstractions (e.g., concepts or database entries). </p>

<h4 id="Motivation1">Motivation</h4>

<p>Our motivation was threefold: to manually correct the logical
inconsistencies in the first version of the OWL ontology, to make use of
foundational ontologies (BFO, Relation Ontology) where possible, and to map
the ontology to other neuroscience ontologies. </p>

<h4 id="Process1">Process</h4>

<p>An ontology containing basic class hierarchies and relations was created
manually, based on the structure of the existing SenseLab databases. It could
not be generated from the database structure automatically, because that
would not have resulted in a logically consistent ontology. A domain expert
inspected and edited this ontology with Protege 3.2 [<a href="#ref-PROTEGE">PROTEGE</a>] and
TopBraid Composer [<a href="#ref-TOPBRAID">TOPBRAID</a>]. The ontologies were
built upon established foundational ontologies in order to maximize
interoperability with other existing and forthcoming biomedical Semantic Web
resources. The foundational ontologies used were:</p>
<ul>
  <li>the Relation Ontology [<a href="#ref-RO">RO</a>] from the Open
    Biomedical Ontologies repository [<a href="#ref-OBO">OBO</a>], which
    defines basic relations such as 'part of', 'participant of' or 'contained
    in'. </li>
  <li>the Basic Formal Ontology [<a href="#ref-BFO">BFO</a>], which defines
    basic classes such as 'process', 'object', 'quality' or 'function'. </li>
</ul>

<p>Based on this manually created basic ontology, the data from the SenseLab
databases were then automatically converted to OWL using programs written in
Java and Python. The automated export scripts extended the manually created
basic ontology through the creation of subclasses, OWL property restrictions
and individuals. The resulting ontologies show no clearly distinguishable
divide between the 'schema' and 'data'. </p>

<p>The OWL export of NeuronDB was based on a transformation from the EAV/CR
model of the SenseLab database to files in RDF/XML syntax by a Java program.
The export from ModelDB and BrainPharm was based on a simple flat text file
export of the databases. The text file exports were converted to RDF/XML
files with a Python script. </p>

<p>For mappings to external bioinformatics databases that did not yet offer
stable URIs for reference on the Semantic Web, we used the URI scheme for
database record identifiers established by Science Commons [<a
href="#ref-SC-URI">SC-URI</a>]. URIs for database records could simply be
generated by concatenating the record identifier to a predefined namespace.
For example, the Entrez Gene record with ID '3579' was identified by the URI
<a
href="http://purl.org/commons/record/ncbi_gene/3579"><code>http://purl.org/commons/record/ncbi_gene/<strong>3579</strong></code></a>,
the UniProt record 'P46663' was identified by <a
href="http://purl.org/commons/record/uniprotkb/P46663"><code>http://purl.org/commons/record/uniprotkb/<strong>P46663</strong></code></a>
and the PubMed record with ID '11160518' was identified by <a
href="http://purl.org/commons/record/pmid/11160518"><code>http://purl.org/commons/record/pmid/<strong>11160518</strong></code></a>.
The database entries were connected to the ontological representations of
real-world entities through relations such as
<code>has_nucleotide_sequence_described_by</code>. For example, the gene of
the Dopamine Receptor D1 (DRD1) is defined through a reference to NCBI record
1812, which contains a description of the sequence of this specific gene: </p>

<p><tt>&lt;http://purl.org/ycmi/senselab/neuron_ontology.owl#DRD1_Gene&gt;
owl:equivalentClass _:property_restriction1 .</tt><br />
<tt>_:property_restriction1 owl:onProperty
senselab:has_nucleotide_sequence_described_by .</tt><br />
<tt>_:property_restriction1 owl:hasValue
&lt;http://purl.org/commons/record/ncbi_gene/1812&gt; .</tt></p>
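
<p>The concatenation scheme for record URIs can be sketched in Python as
follows. This is an illustration rather than the code used in the project: the
function name <code>record_uri</code> and its whitespace check are our own
assumptions, while the namespace and the database keys are those that appear
in the URIs above.</p>

```python
RECORD_NAMESPACE = "http://purl.org/commons/record/"


def record_uri(database, record_id):
    """Concatenate the namespace, the database key and the record identifier."""
    record_id = str(record_id).strip()
    # Assumption: an empty identifier or one containing whitespace is rejected.
    if not record_id or any(ch.isspace() for ch in record_id):
        raise ValueError("record identifier must be non-empty and free of whitespace")
    return RECORD_NAMESPACE + database + "/" + record_id


print(record_uri("ncbi_gene", 3579))
print(record_uri("uniprotkb", "P46663"))
print(record_uri("pmid", 11160518))
```

<p>Because the URI is a pure function of the database key and the record
identifier, independent groups that follow the same scheme arrive at the same
URI for the same record without any coordination.</p>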

<p>Mappings were made to the following ontologies: </p>
<ul>
  <li>the BAMS ontology which was derived from the Brain Architecture
    Management System [<a href="#ref-BAMS">BAMS</a>]</li>
  <li>the Subcellular Anatomy Ontology (SAO) created by the Cell Centered
    Database project [<a href="#ref-SAO">SAO</a>]</li>
  <li>the BirnLex ontology developed by members of the Biomedical Informatics
    Research Network [<a href="#ref-BIRNLEX">BIRNLEX</a>]</li>
  <li>the Common Anatomy Reference Ontology (CARO)<!-- [<a
    href="#ref-CARO">CARO</a>] --></li>
  <li>the Gene Ontology [<a href="#ref-GO">GO</a>]</li>
  <li>the Ontology for Biomedical Investigations (OBI) [<a
    href="#ref-OBI">OBI</a>]</li>
</ul>

<p>The mappings were made with the following cross-ontology relations: <a
href="http://www.w3.org/TR/owl-ref/#equivalentClass-def">owl:equivalentClass</a>,
<a href="http://www.w3.org/TR/rdf-schema/#ch_subclassof">rdfs:subClassOf</a>
and the <a href="http://www.obofoundry.org/ro/#details">"has part" relation
from the OBO relation ontology</a>. </p>

<p><img alt="Ontology import hierarchy" src="ontology_import_hierarchy.png"
/></p>

<p>Figure 1: Import hierarchy of OWL ontologies. Ontologies printed in bold
were created by the SenseLab team; the others were created by other groups.
The arrows point from the imported ontology to the importing ontology; for
example, the NeuronDB Ontology imports the Relation Ontology. Import
statements are transitive; for example, the ModelDB Ontology imports both the
NeuronDB Ontology and the Relation Ontology.</p>


<p><img alt="Examples of ontology mappings" src="examples-of-mappings.png"
/></p>

<p>Figure 2: Examples of relations ('mappings') spanning between classes from
the NeuronDB ontology (in the middle) and classes from external
ontologies.</p>


<p>Terse rdfs:labels were replaced by more descriptive ones that can be
understood without knowledge of the context. For example, the
rdfs:label "Ded" was changed to "Distal part of equivalent dendrite (Ded)".
Note that, in this case, the original label was preserved in brackets,
because it might still be useful for people who <em>do</em> know the
context. </p>
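
<p>The relabeling convention can be expressed as a small helper. The Python
sketch below is hypothetical: the lookup table would have to be filled in by a
domain expert, and only the "Ded" entry comes from this document.</p>

```python
# Hypothetical lookup table from terse labels to descriptive names.
DESCRIPTIONS = {
    "Ded": "Distal part of equivalent dendrite",
}


def descriptive_label(terse_label, descriptions=DESCRIPTIONS):
    """Return 'description (terse label)', or the terse label if none is known."""
    description = descriptions.get(terse_label)
    if description is None:
        return terse_label
    return "{} ({})".format(description, terse_label)


print(descriptive_label("Ded"))
```

<p>Labels without an entry in the table are returned unchanged, so the helper
can be applied to every label and the remaining terse ones can be reviewed
later.</p>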

<p>The ontology development was moved to a Subversion (SVN) system on a
central web server. For most of the development period, before this move, the
ontologies were developed on the client side and periodically uploaded via
FTP. This led to problems when more than one person was working on the
ontologies at a time. It was also impossible for users to access previous
versions of an ontology, since only the most recent version was available on
the web site. </p>

<p>The namespaces / ontology locations were changed to PURL-based URIs. For
example, the URI
<code>http://<strong>neuroweb.med.yale.edu</strong>/senselab/neuron_ontology.owl#Dopamine</code>
was changed to <a
href="http://purl.org/ycmi/senselab/neuron_ontology.owl#Dopamine"><code>http://<strong>purl.org/ycmi</strong>/senselab/neuron_ontology.owl#Dopamine</code></a>
('ycmi' stands for 'Yale Center for Medical Informatics'). PURL-based URIs
are easier to maintain when server configurations change or (in the worst
case) the original server is unavailable and the ontologies need to be served
from a different location. The increased stability of PURLs encourages the
re-use of entities in ontologies developed by other groups -- which is a key
factor in the creation of a coherent Semantic Web. </p>

<p>A SPARQL endpoint for the SenseLab ontologies was set up using the open
source version of the OpenLink Virtuoso server [<a
href="#ref-VIRTUOSO">VIRTUOSO</a>]. A SPARQL endpoint is a service that
allows clients to query an RDF store with the SPARQL query language through
simple HTTP GET requests. The ontologies were loaded into the triple store of
the server to make them accessible to SPARQL queries. Each ontology file was
put into a separate labeled graph, and the label of each graph was identical
to the URL of the ontology file. For example, the ontology located at <a
href="http://purl.org/ycmi/senselab/neuron_ontology.owl">http://purl.org/ycmi/senselab/neuron_ontology.owl</a>
was loaded into a graph labeled
<code>http://purl.org/ycmi/senselab/neuron_ontology.owl</code>. Loading each
ontology into a separate graph makes it possible to restrict SPARQL queries
to certain graphs and hence to certain ontologies. As a result, queries can
be more selective and can be executed with better performance.</p>
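
<p>As an illustration of such a GET request, the following Python sketch
builds the request URL for a query that is restricted to the NeuronDB graph.
It relies on the <code>query</code> and <code>default-graph-uri</code>
parameters defined by the SPARQL Protocol; the endpoint address is the one
given in the Outcome section, and the function name and the sample query are
hypothetical. The sketch only constructs the URL and does not contact the
server.</p>

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://hcls.deri.ie/sparql"
NEURON_GRAPH = "http://purl.org/ycmi/senselab/neuron_ontology.owl"

# Sample query: fetch ten arbitrary triples from the selected graph.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_query_url(endpoint, query, graph_uri=None):
    """Return the GET URL for a SPARQL query, optionally limited to one graph."""
    params = [("query", query)]
    if graph_uri is not None:
        # 'default-graph-uri' is the SPARQL Protocol parameter that selects
        # the graph against which the query is evaluated.
        params.append(("default-graph-uri", graph_uri))
    return endpoint + "?" + urlencode(params)


url = build_query_url(ENDPOINT, QUERY, NEURON_GRAPH)
print(url)
```

<p>Omitting the graph argument leaves the choice of default graph to the
server configuration.</p>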

<h4 id="Outcome1">Outcome</h4>

<p>The final products of the project are accessible at <a
href="http://neuroweb.med.yale.edu/senselab/">http://neuroweb.med.yale.edu/senselab/</a>.
An SVN repository can be accessed through a web interface at <a
href="http://neuroweb.med.yale.edu/svn/trunk/ontology/senselab/">http://neuroweb.med.yale.edu/svn/trunk/ontology/senselab/</a>.
The SPARQL endpoint can be accessed at <a
href="http://hcls.deri.ie/sparql">http://hcls.deri.ie/sparql</a>. The
SenseLab OWL ontologies are mentioned as an example of the application of OBO
ontologies in the article <em>The OBO Foundry: coordinated evolution of
ontologies to support biomedical data integration</em> [<a
href="#ref-OBO-ARTICLE">OBO-ARTICLE</a>]. </p>

<h2 id="advantages">Advantages</h2>

<p>We experienced the following advantages from using RDF/OWL:</p>
<ul>
  <li>The use of OWL significantly eased the integration of SenseLab data
    with ontologies developed by other projects. OWL-based data integration
    does not require the development and maintenance of central mediators,
    reducing development and maintenance costs. The ontology integration can
    be accomplished by creating meaningful relations between entities in
    distributed ontologies.<br />
  </li>
  <li>Ontologies can be modularized; dependencies between ontologies can be
    made explicit through 'owl:imports' statements. This makes distributed
    development of ontology modules feasible and encourages the re-use of
    selected ontology modules by other groups.<br />
  </li>
  <li>Good OWL ontologies are self-descriptive because every entity can be
    annotated with text.</li>
  <li>Reasoners can be used to identify errors and real (i.e., conscious)
    contradictions in submitted data sets. You might find more errors and
    contradictions than you expected.</li>
  <li>Ontologies can be used to directly represent biological reality without
    introducing unnecessary abstractions such as database tables, data
    dictionaries, and documents.</li>
</ul>

<h2 id="disadvantages">Disadvantages</h2>

<p>We experienced the following problems while using RDF/OWL:</p>
<ul>
  <li>The open-source ontology editors used for this project (conducted in
    2007) were relatively unreliable. A lot of time was spent working around
    software bugs that made the editors unstable and introduced errors into
    the generated RDF/OWL. Future versions of freely available editors or
    currently available commercial ontology editors might be preferable. </li>
  <li>Descriptions of OWL classes and their relations (i.e., OWL property
    restrictions) result in very complex and unintuitive RDF graphs. This
    makes it hard to generate them automatically, or use SPARQL to query such
    ontologies. </li>
  <li>Current reasoners can still have performance problems when checking /
    classifying complex OWL ontologies. </li>
  <li>The RDF/XML serialisation of RDF is not very easy to work with. It is
    often a source of errors. </li>
</ul>

<h2 id="future">Future directions and plans</h2>

<p>The SenseLab ontologies will be further integrated with other
neuroscientific and biomedical ontologies. User-friendly applications will be
developed to query a multitude of interrelated ontologies in a scientifically
meaningful way. To this end, we have implemented a prototype Web application
called 'Entrez Neuron' that allows the user to query data across multiple
sources based on keywords. The user can browse the query results and
retrieve more detailed information about neurons based on a
'brain-anatomy/neuron' view. A paper describing this application was
presented at the <a href="http://esw.w3.org/topic/HCLS/WWW2008">WWW/HCLS2008
workshop</a>. Currently, we are expanding this application to include more
views and features.</p>

<h2 id="suggestions">Suggestions based on our experiences</h2>

<p>Based on our experiences we can make the following suggestions for other
projects that have similar goals:</p>
<ul>
  <li>Try to create consistent OWL DL ontologies. Pure RDF(S) without OWL
    constructs is not much simpler than OWL DL and often leads to the
    creation of too many properties because pure RDF(S) does not support
    property restrictions. </li>
  <li>Try to re-use entities and properties from existing ontologies where
    possible. </li>
  <li>If you do not want to import another ontology in its entirety (e.g.
    because it would be too large, too buggy or would introduce unnecessary
    constructs), you can still 'copy &amp; paste' portions of the ontology
    into your own. </li>
  <li>Try to base your ontology on a foundational ontology like BFO, OBO
    Relation Ontology or DOLCE [<a href="#ref-DOL">DOL</a>]. </li>
  <li>Where possible use the rdfs:label property to give clear,
    understandable labels to each entity and property in the ontology. Try to
    formulate labels in a way that makes them understandable without too much
    additional context (e.g. a certain user interface). </li>
  <li>Where possible, give concise rdfs:comments. </li>
  <li>Make a habit out of running your ontology through the RDF validator [<a
    href="#ref-RDF-VALID">RDF-VALID</a>] periodically, especially when you
    create RDF/XML with scripts that you wrote yourself. Keep in mind that
    the RDF validator does not throw an error message when URIs contain blank
    spaces. Blank spaces in URIs are problematic for many Semantic Web
    applications, so try to make sure that your URIs do not contain blank
    spaces. </li>
  <li>Check the consistency of your OWL ontology periodically. We used the
    Pellet reasoner [<a href="#ref-PELLET">PELLET</a>], which seems to be the
    best choice at the moment. </li>
  <li>Use purl.org URIs for your ontologies. You can easily register a
    sub-domain at purl.org free of charge. </li>
  <li>If you write a program that generates RDF/OWL, do <strong>not</strong>
    try to write RDF/XML code directly. RDF/XML is relatively complicated and
    messy, which makes it very easy to produce syntactic or even semantic
    errors. Use an RDF or OWL API for writing triples instead. If that is not
    possible, generate your RDF in the much simpler Turtle syntax instead of
    RDF/XML. The Turtle syntax is a subset of the N3 syntax [<a href="#ref-n3">N3</a>]. You can
    save the resulting RDF in Turtle format to a text file. If you need
    RDF/XML for another application, you can convert the Turtle to RDF/XML in
    a second step. </li>
</ul>
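
<p>Since the RDF validator does not report blank spaces in URIs, a separate
check can be useful in a conversion script. The Python sketch below is one
hypothetical way to do this: it lists the URIs that contain whitespace and
percent-encodes the offending characters. The function names and the second
sample URI are made up for illustration. Replacing spaces with underscores
when a URI is first minted is an equally valid remedy.</p>

```python
from urllib.parse import quote


def uris_with_whitespace(uris):
    """Return the URIs that contain at least one whitespace character."""
    return [uri for uri in uris if any(ch.isspace() for ch in uri)]


def escape_whitespace(uri):
    """Percent-encode whitespace characters and leave everything else untouched."""
    return "".join(quote(ch) if ch.isspace() else ch for ch in uri)


SAMPLE_URIS = [
    "http://purl.org/ycmi/senselab/neuron_ontology.owl#Dopamine",
    # Hypothetical URI generated from a label that contains a space.
    "http://purl.org/ycmi/senselab/neuron_ontology.owl#Distal dendrite",
]

for uri in uris_with_whitespace(SAMPLE_URIS):
    print("problematic:", uri)
    print("escaped:    ", escape_whitespace(uri))
```

<p>Running such a check on every generated URI before the triples are written
keeps the problem from propagating into downstream applications.</p>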

<h2 id="conclusion">Conclusion</h2>

<p>We experienced clear benefits from using Semantic Web technologies for the
integration of SenseLab data with other neuroscientific data in a consistent,
flexible and decentralised manner. The main obstacle in our work was the lack
of mature and scalable open source software for editing the complex,
expressive ontologies we were dealing with. Since the quality of these tools
is rapidly improving, this may cease to be an issue in the near future. The
detailed analysis of the experiences with the SenseLab ontologies and other
complex biomedical ontologies may help drive the improvement of current
ontology editors.</p>

<h2 id="references">References</h2>
<dl>
  <dt><a name="ref-EAV-CR" id="ref-EAV-CR"></a>[EAV-CR]</dt>
    <dd><i>L. Marenco, N. Tosches, C. Crasto, G. Shepherd, P.L. Miller and
      P.M. Nadkarni, Achieving evolvable Web-database bioscience applications
      using the EAV/CR framework: recent advances, J Am Med Inform Assoc.
      (2003) 10(5):444-53</i> </dd>
  <dt><a name="ref-SENSELAB-WEB" id="ref-SENSELAB-WEB"></a>[SENSELAB-WEB]</dt>
    <dd><i><a href="http://senselab.med.yale.edu/">SenseLab database</a></i>,
    http://senselab.med.yale.edu/</dd>
  <dt><a name="ref-SENSELAB-SW" id="ref-SENSELAB-SW"></a>[SENSELAB-SW]</dt>
    <dd><i><a href="http://neuroweb.med.yale.edu/senselab/">SenseLab Semantic
      Web Development</a></i>, http://neuroweb.med.yale.edu/senselab/ </dd>
  <dt><a name="ref-PROTEGE" id="ref-PROTEGE"></a>[PROTEGE]</dt>
    <dd><i><a href="http://protege.stanford.edu/">The Protege Ontology Editor
      and Knowledge Acquisition System</a></i>, http://protege.stanford.edu/
    </dd>
  <dt><a name="ref-TOPBRAID" id="ref-TOPBRAID"></a>[TOPBRAID]</dt>
    <dd><i><a
      href="http://www.topbraidcomposer.org/">TopBraid Composer</a></i>,
      http://www.topbraidcomposer.org/ </dd>
  <dt><a name="ref-RO" id="ref-RO"></a>[RO]</dt>
    <dd><i><a href="http://www.obofoundry.org/ro/">Relation Ontology</a></i>,
      http://www.obofoundry.org/ro/ </dd>
  <dt><a name="ref-OBO" id="ref-OBO"></a>[OBO]</dt>
    <dd><i><a href="http://obofoundry.org">The Open Biomedical
      Ontologies</a></i>, http://obofoundry.org/</dd>
  <dt><a name="ref-BFO" id="ref-BFO"></a>[BFO]</dt>
    <dd><i><a href="http://www.ifomis.uni-saarland.de/bfo/">Basic Formal
      Ontology (BFO)</a></i>, http://www.ifomis.uni-saarland.de/bfo/</dd>
  <dt><a name="ref-SC-URI" id="ref-SC-URI"></a>[SC-URI]</dt>
    <dd><i><a
      href="http://sw.neurocommons.org/2007/uri-explanation.html">Explanation
      of HCLS and Science Commons URIs</a></i>,
      http://sw.neurocommons.org/2007/uri-explanation.html</dd>
  <dt><a name="ref-BAMS" id="ref-BAMS"></a>[BAMS]</dt>
    <dd><i><a href="http://brancusi.usc.edu/bkms/">The Brain Architecture
      Management System</a></i>, http://brancusi.usc.edu/bkms/ </dd>
  <dt><a name="ref-SAO" id="ref-SAO"></a>[SAO]</dt>
    <dd><i><a href="http://ccdb.ucsd.edu/CCDBWebSite/sao.html">CCDB
      Subcellular Anatomy Ontology</a></i>,
      http://ccdb.ucsd.edu/CCDBWebSite/sao.html</dd>
  <dt><a name="ref-CARO" id="ref-CARO"></a>[CARO]</dt>
    <dd><i><a
      href="http://www.obofoundry.org/cgi-bin/detail.cgi?id=caro">Common
      Anatomy Reference Ontology </a></i>,
      http://www.obofoundry.org/cgi-bin/detail.cgi?id=caro </dd>
  <dt><a name="ref-BIRNLEX" id="ref-BIRNLEX"></a>[BIRNLEX]</dt>
    <dd><i><a href="http://fireball.drexelmed.edu/birnlex/OWLdocs/">BIRNLex Ontology Documentation</a></i>,
      http://fireball.drexelmed.edu/birnlex/OWLdocs/ </dd>
  <dt><a name="ref-GO" id="ref-GO"></a>[GO]</dt>
    <dd><i><a href="http://geneontology.org/">Gene Ontology</a></i>,
      http://geneontology.org/</dd>
  <dt><a name="ref-OBI" id="ref-OBI"></a>[OBI]</dt>
    <dd><i><a href="http://obi.sourceforge.net/">Ontology of Biomedical
      Investigation</a></i>, http://obi.sourceforge.net/ </dd>
  <dt><a name="ref-VIRTUOSO" id="ref-VIRTUOSO"></a>[VIRTUOSO]</dt>
    <dd><i><a href="http://virtuoso.openlinksw.com/">OpenLink Universal
      Integration Middleware - Virtuoso Product Family</a></i>,
      http://virtuoso.openlinksw.com/ </dd>
  <dt><a name="ref-OBO-ARTICLE" id="ref-OBO-ARTICLE"></a>[OBO-ARTICLE]</dt>
    <dd><i>The OBO Foundry: coordinated evolution of ontologies to support
      biomedical data integration</i>, Barry Smith, Michael Ashburner,
      Cornelius Rosse, Jonathan Bard, William Bug, Werner Ceusters <em>et
      al.</em>, Nature Biotechnology 25, 1251 - 1255, 2007,
      http://dx.doi.org/10.1038/nbt1346 </dd>
  <dt><a name="ref-DOL" id="ref-DOL"></a>[DOL]</dt>
    <dd><i><a href="http://www.loa-cnr.it/DOLCE.html">DOLCE Ontology</a></i>,
      http://www.loa-cnr.it/DOLCE.html </dd>
  <dt><a name="ref-RDF-VALID" id="ref-RDF-VALID"></a>[RDF-VALID]</dt>
    <dd><i><a href="http://www.w3.org/RDF/Validator/">RDF Validator</a></i>,
      http://www.w3.org/RDF/Validator/</dd>
  <dt><a name="ref-PELLET" id="ref-PELLET"></a>[PELLET]</dt>
    <dd><i><a href="http://pellet.owldl.org/">The PELLET Open Source OWL DL
      Reasoner</a></i>, http://pellet.owldl.org/ </dd>
  <dt><a name="ref-SMITH-2004" id="ref-SMITH-2004"></a>[SMITH-2004]</dt>
    <dd><em>Beyond Concepts: Ontology as Reality Representation</em>, Barry
      Smith, in A. Varzi, L. Vieu, eds., Proceedings of FOIS (IOS Press,
      Amsterdam, 2004) 319-330. <a
      href="http://ontology.buffalo.edu/bfo/BeyondConcepts.pdf">http://ontology.buffalo.edu/bfo/BeyondConcepts.pdf</a></dd>
  <dt><a name="ref-kb" id="ref-kb"></a>[KB]</dt>
    <dd><i><a href="../NOTE-hcls-kb-20080604/">A Prototype Knowledge Base for the Life Sciences</a></i>,
      http://www.w3.org/TR/2008/NOTE-hcls-kb-20080604/ </dd>
  <dt><a name="ref-n3" id="ref-n3"></a>[N3]</dt>
    <dd><i><a href="http://www.w3.org/2000/10/swap/Primer">Primer: Getting
      into RDF and Semantic Web using N3</a></i>,
      http://www.w3.org/2000/10/swap/Primer </dd>
  <dt><a name="ref-OWL-Overview" id="ref-OWL-Overview"></a>[OWL Overview]</dt>
    <dd><i><a href="http://www.w3.org/TR/2004/REC-owl-features-20040210/">OWL
      Web Ontology Language Overview</a></i>, Deborah L. McGuinness and Frank
      van Harmelen, Editors, W3C Recommendation, 10 February 2004,
      http://www.w3.org/TR/2004/REC-owl-features-20040210/ . <a
      href="http://www.w3.org/TR/owl-features/">Latest version</a> available
      at http://www.w3.org/TR/owl-features/ </dd>
  <dt><a id="ref-RDF" name="ref-RDF">[RDFS]</a></dt>
    <dd><i><a href="http://www.w3.org/TR/2004/REC-rdf-schema-20040210/">RDF
      Vocabulary Description Language 1.0: RDF Schema</a></i>, Dan Brickley and
      R.V. Guha, Editors, W3C Recommendation, 10 February 2004,
      http://www.w3.org/TR/2004/REC-rdf-schema-20040210/ .
      <a href="http://www.w3.org/TR/rdf-schema/">Latest version</a> available
      at http://www.w3.org/TR/rdf-schema/ </dd>
</dl>

<h2 id="Acknowledg">Acknowledgements (Informative)</h2>

<p>Thanks to Huajun Chen and Ernest Lim, who contributed to the SenseLab
conversion. Thanks to Gordon Shepherd, Perry Miller, Luis Marenco and Tom
Morse for their input, suggestions and support. Thanks to Susie Stephens for
her detailed suggestions for improving this document. Thanks to Alan
Ruttenberg for his technical suggestions during the conversion process.
Thanks to Eric Prud'hommeaux for technical advice and assistance on the
creation of this document.</p>
</body>
</html>