<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
<meta name="RCS-Id"
content="$Id: Overview.html,v 1.7 2008/06/07 08:29:30 eric Exp $" />
<title>Experiences with the conversion of SenseLab databases to
RDF/OWL</title>
<style type="text/css">
/*<![CDATA[*/
.mesh { background-color: #ffc }
.goa { background-color: #fcf }
.glbl { background-color: #ccf }
.plbl { background-color: #cfc }
.var { font-weight: bold }
.db { font-weight: bold }
.gene { color: blue }
.process{ color: red }
.senselab { background-color: #0ff }
.identifier { font-weight: bold }
.comment{ color: orange; font-size: 1.3em; }
.schema th { text-align: left }
table, td, th { border-style: solid;
border-width: 1px;
border-color: black;
border-bottom-color: gray;
border-right-color: gray; }
table.dbsTable { border-collapse: collapse; border-color: #000000; }
table.dbsTable td:first-child { vertical-align: top; }
table.dbsTable td { padding: 2px 5px 2px 5px; }
.at-issue {text-decoration: underline;}
.issue {background-color: #fcc;}
/*]]>*/
</style>
<link rel="stylesheet" type="text/css" href="http://www.w3.org/StyleSheets/TR/W3C-IG-NOTE" />
</head>
<body>
<div class="head">
<p><a href="http://www.w3.org/"><img src="http://www.w3.org/Icons/w3c_home"
alt="W3C" height="48" width="72" /></a></p>
<h1 id="main">Experiences with the conversion of SenseLab databases to
RDF/OWL</h1>
<h2 class="no-num no-toc" id="w3c-doctype">W3C Interest Group Note 4 June 2008</h2>
<dl>
<!-- dt>Editors working draft.</dt>
<dd><span class="cvs-id">$Revision: 1.7 $ of
$Date: 2008/06/07 08:29:30 $</span></dd -->
<dt>This version:</dt>
<dd><a href="http://www.w3.org/TR/2008/NOTE-hcls-senselab-20080604/">http://www.w3.org/TR/2008/NOTE-hcls-senselab-20080604/</a></dd>
<dt>Latest version:</dt>
<dd><a href="http://www.w3.org/TR/hcls-senselab/">http://www.w3.org/TR/hcls-senselab/</a></dd>
<dt>Previous version:</dt>
<dd><a href="http://www.w3.org/TR/2008/WD-hcls-senselab-20080404/">http://www.w3.org/TR/2008/WD-hcls-senselab-20080404/</a></dd>
<dt>Editors:</dt>
<dd>Matthias Samwald, <a href="http://ycmi.med.yale.edu/">Yale Center for
Medical Informatics</a> / <a href="http://www.deri.ie/">DERI Galway</a>
/ <a href="http://www.semantic-web.at/">Semantic Web Company</a> &lt;<a
href="mailto:samwald@gmx.at">samwald@gmx.at</a>&gt;</dd>
<dd>Kei-Hoi Cheung, Yale Center for Medical Informatics &lt;<a
href="mailto:kei.cheung@yale.edu">kei.cheung@yale.edu</a>&gt;</dd>
<dt>Contributors:</dt>
<dd>Alan Ruttenberg, <a href="http://sciencecommons.org/">Science
Commons</a> &lt;<a
href="mailto:alanruttenberg@gmail.com">alanruttenberg@gmail.com</a>&gt;</dd>
<dd>Huajun Chen, Yale Center for Medical Informatics / <a
href="http://www.zju.edu.cn/english/">Zhejiang University</a> &lt;<a
href="mailto:huajunsir@zju.edu.cn">huajunsir@zju.edu.cn</a>&gt;</dd>
</dl>
<p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">Copyright</a> © 2008 <a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a><sup>®</sup> (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>, <a href="http://www.ercim.org/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> rules apply.</p>
</div>
<hr title="Separator for header" />
<div>
<h2 class="notoc" id="abstract">Abstract</h2>
<p>One of the challenges facing Semantic Web for Health Care and Life
Sciences is that of converting relational databases into Semantic Web format.
The issues and the steps involved in such a conversion have not been well
documented. To this end, we have created this document to describe the
process of converting SenseLab databases into OWL. SenseLab is a collection
of relational (Oracle) databases for neuroscientific research. The conversion
of these databases into RDF/OWL format is an important step towards realizing
the benefits of Semantic Web in integrative neuroscience research. This
document describes how we represented some of the SenseLab databases in
Resource Description Framework (RDF) and Web Ontology Language (OWL), and
discusses the advantages and disadvantages of these representations. Our OWL
representation is based on the reuse and extension of existing standard OWL
ontologies developed in the biomedical ontology communities. The purpose of
this document is to share our implementation experience with the
community.</p>
</div>
<div>
<h2 id="status">Status of This Document</h2>
<p><em>This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the <a href="http://www.w3.org/TR/">W3C technical reports index</a> at http://www.w3.org/TR/.</em></p>
<p>This is
an Interest Group Note
<!-- an editor's draft -->
of the <a href="http://www.w3.org/2001/sw/hcls/">Semantic Web in Health Care and Life Sciences Interest Group (HCLS)</a>, part of the <a href="http://www.w3.org/2001/sw/">W3C Semantic Web Activity</a>. It is considered stable and expected to be published as an Interest Group Note in May 2008.
This document serves as a companion to
<a href="http://www.w3.org/TR/2008/NOTE-hcls-kb-20080604/">A Prototype Knowledge Base for the Life Sciences</a>
and describes the process for integrating new data into an existing biological database. We hope other groups who plan to convert their databases into RDF/OWL format will benefit from this document.</p>
<p>The document was produced by the <a href="http://www.w3.org/2001/sw/hcls/">Semantic Web in Health Care and Life Sciences Interest Group (HCLS)</a>, part of the <a href="http://www.w3.org/2001/sw/">W3C Semantic Web Activity</a> (<a href="http://www.w3.org/2001/sw/hcls/charter">see charter</a>). Comments may be sent to the <a href="http://lists.w3.org/Archives/Public/public-semweb-lifesci/">publicly archived</a> <a href="mailto:public-semweb-lifesci@w3.org">public-semweb-lifesci@w3.org</a> mailing list. Feedback is encouraged, as is participation in the recently <a href="http://www.w3.org/2008/05/HCLSIGCharter">re-chartered</a> HCLSIG. A <a href="WD2NOTE">list of changes since the last publication</a> is available.</p>
<p>Publication as an Interest Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.</p>
<p>This document was produced by a group operating under the disclosure
obligations of the <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5 February 2004 W3C Patent Policy</a>. The group does
not expect this document to become a W3C Recommendation. An
individual who has actual knowledge of a patent which the individual
believes contains <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#def-essential">Essential Claim(s)</a> must disclose the information to
<a href="mailto:public-semweb-lifesci@w3.org">public-semweb-lifesci@w3.org</a> [<a href="http://lists.w3.org/Archives/Public/public-semweb-lifesci/">public archive</a>] in accordance with
<a href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">section 6 of the W3C Patent Policy</a>.</p>
</div>
<hr />
<div class="toc">
<h2 id="TOC">Table of Contents</h2>
<ul class="toc">
<li><a href="#process">Conversion process</a>
<ul>
<li><a href="#sources">Original data sources</a></li>
<li><a href="#first">Initial RDF and OWL conversions</a>
<ul>
<li><a href="#Motivation">Motivation</a></li>
<li><a href="#Process">Process</a></li>
<li><a href="#Outcome">Outcome </a></li>
</ul>
</li>
<li><a href="#revised">Revised OWL conversions</a>
<ul>
<li><a href="#Motivation1">Motivation</a></li>
<li><a href="#Process1">Process</a></li>
<li><a href="#Outcome1">Outcome</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#advantages">Advantages</a></li>
<li><a href="#disadvantages">Disadvantages</a></li>
<li><a href="#future">Future directions and plans</a></li>
<li><a href="#suggestions">Suggestions based on our experiences</a></li>
<li><a href="#conclusion">Conclusion</a></li>
<li><a href="#references">References</a></li>
<li><a href="#Acknowledg">Acknowledgements (Informative)</a></li>
</ul>
</div>
<hr />
<h2 id="process">Conversion process</h2>
<h3 id="sources">Original data sources</h3>
<p>The SenseLab databases can be accessed through a web interface at the
SenseLab web site [<a href="#ref-SENSELAB-WEB">SENSELAB-WEB</a>]. SenseLab is
divided into a number of specialised databases, of which we have converted
three to Semantic Web formats. These databases are NeuronDB, BrainPharm and
ModelDB. All databases are based on compartmental models of neurons. NeuronDB
contains descriptions of anatomic locations, cell architecture and
physiologic parameters of neuronal cells. The pilot BrainPharm database is
intended to support research on drugs for the treatment of neurological
disorders. It enhances the descriptions in a portion of NeuronDB with
descriptions of the actions of pathological and pharmacological agents.
ModelDB is a large repository of computational neuroscience models and
simulations. The mathematical models in ModelDB are annotated with references
to NeuronDB. Taken together, these databases allow the researcher to query
information and to run simulations pertaining to the function of neurons in
healthy and disease states. All databases contain extensive literature
references and excerpts from texts that have been used to curate the database
entries.</p>
<p>The databases are based on the "entity-attribute-value with classes and
relationships" (EAV/CR) schema [<a href="#ref-EAV-CR">EAV-CR</a>]. The data
can also be downloaded from the SenseLab Semantic Web development portal [<a
href="#ref-SENSELAB-SW">SENSELAB-SW</a>] as a database dump in Microsoft
Access format and as text.</p>
<h3 id="first">Initial RDF and OWL conversions</h3>
<h4 id="Motivation">Motivation</h4>
<p>Our motivation was to make the SenseLab databases available in RDF(S) [<a
href="#ref-RDF">RDFS</a>] (without OWL) and in OWL DL [<a
href="#ref-OWL-Overview">OWL Overview</a>]. The two versions were developed
in parallel in order to compare the difference between the conversion
processes and the outcomes. We wanted to explore the issues in mapping
relational databases to RDF/OWL structure. In addition, we wanted to explore
the possibility of automatic translation from EAV/CR to RDF.</p>
<h4 id="Process">Process</h4>
<p>We developed a converter application in Java that queried the SenseLab
database and wrote RDF/XML files. The conversion was fully automatic for the
RDF version, but required some manual editing for the OWL version.</p>
<h4 id="Outcome">Outcome </h4>
<p>These conversions were too closely tied to the original database
structure, which resulted in inconsistent OWL ontologies. Some shortcomings
of the first conversion to OWL were: </p>
<ul>
<li>'Part of' relations were incorrectly represented as subclass relations.
This seems to be one of the most common mistakes in ontology development
in general. </li>
<li>Disjointness axioms between classes<a id="disjoint-ref" href="#disjoint">¹</a> were
missing, which made it hard to find inconsistencies and data entry errors.
</li>
<li>After disjoints were introduced, we found some previously unidentified
inconsistencies with the help of OWL reasoners: some classes (e.g.
'GABA') were subclasses of both 'neurotransmitter' and 'receptor', which
was wrong. This was an artifact caused by the automated conversion --
both GABA transmitters and GABA receptors were simply labeled with 'GABA'
in the source database. The conversion algorithm generated URIs based on
these labels, so they were represented with identical URIs
(<code>http://neuroweb.med.yale.edu/senselab/neuron_ontology.owl#<strong>GABA</strong></code>).
This grave mistake would not have been noticed without the use of OWL
reasoning. </li>
<li>Some of the labels of entities generated by the Java converter were
very terse and not understandable outside the user interface of the
original database. For example, "Ded" was the label of the "distal part
of the dendrite". </li>
</ul>
<p id="disjoint"><a href="#disjoint-ref">¹</a> Disjoint classes are used in
OWL to assert that they have no members in common. Inferences from this can
be used to flag any inconsistent models.</p>
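<p>The label collision described in the list above can be reproduced in a few
lines. The following is a minimal sketch, not the converter we used: the
function names and the sample rows are hypothetical, and a real check would
be delegated to an OWL reasoner. It only shows why label-derived URIs
combined with disjointness declarations expose the 'GABA' problem.</p>
<pre>
```python
# Hypothetical sketch: function names and sample rows are illustrative
# assumptions, not part of the SenseLab converter.
BASE = "http://neuroweb.med.yale.edu/senselab/neuron_ontology.owl#"


def mint_uri(label):
    """Build a URI from a label, the way a naive converter might."""
    return BASE + label.strip().replace(" ", "_")


def find_disjoint_clashes(rows, disjoint_pairs):
    """Return URIs that end up under two classes declared disjoint.

    rows: iterable of (label, parent_class) pairs from the source database.
    disjoint_pairs: set of frozensets, each naming two disjoint classes.
    """
    parents_by_uri = {}
    for label, parent in rows:
        parents_by_uri.setdefault(mint_uri(label), set()).add(parent)
    clashes = {}
    for uri, parents in parents_by_uri.items():
        for pair in disjoint_pairs:
            if pair.issubset(parents):
                clashes[uri] = sorted(pair)
    return clashes


# Two different entities share the label 'GABA' in the source data.
rows = [
    ("GABA", "neurotransmitter"),
    ("GABA", "receptor"),
    ("Dopamine", "neurotransmitter"),
]
disjoint = {frozenset(["neurotransmitter", "receptor"])}
clashes = find_disjoint_clashes(rows, disjoint)
print(clashes)
```
</pre>
<p>Without the disjointness declaration the shared URI would simply gain two
superclasses and the error would pass unnoticed.</p>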
<h3 id="revised">Revised OWL conversions</h3>
<p>The revised OWL conversion was based on the first OWL conversions
described above. The design of the revised SenseLab ontologies follows the
"ontological realism" approach [<a href="#ref-SMITH-2004">SMITH-2004</a>].
This means that the revised ontologies are focused on direct representations
of physical objects and processes (e.g., neuronal cells, ionic currents), and
not on their abstractions (e.g., concepts or database entries). </p>
<h4 id="Motivation1">Motivation</h4>
<p>The goals of the revision were to manually correct the logical
inconsistencies in the first version of the OWL ontology, to make use of
foundational ontologies (BFO, Relation Ontology) where possible, and to map
the ontology to other neuroscience ontologies. </p>
<h4 id="Process1">Process</h4>
<p>An ontology containing basic class hierarchies and relations was manually
created, based on the structure of existing SenseLab databases. This basic
ontology could not be created from the database structure in an automated
process because this would not have resulted in a logically consistent
ontology. This ontology was edited by a domain expert, based on inspection
and manual editing with Protege 3.2 [<a href="#ref-PROTEGE">PROTEGE</a>] and
Topbraid Composer [<a href="#ref-TOPBRAID">TOPBRAID</a>]. The ontologies were
built upon established foundational ontologies in order to maximize the
interoperability with other existing and forthcoming biomedical Semantic Web
resources. The foundational ontologies used were:</p>
<ul>
<li>the Relation Ontology [<a href="#ref-RO">RO</a>] from the Open
Biomedical Ontologies repository [<a href="#ref-OBO">OBO</a>], which
defines basic relations such as 'part of', 'participant of' or 'contained
in'. </li>
<li>the Basic Formal Ontology [<a href="#ref-BFO">BFO</a>], which defines
basic classes such as 'process', 'object', 'quality' or 'function'. </li>
</ul>
<p>Based on this manually created basic ontology, the data from the SenseLab
databases were then automatically converted to OWL using programs written in
Java and Python. The automated export scripts extended the manually created
basic ontology through the creation of subclasses, OWL property restrictions
and individuals. The resulting ontologies show no clearly distinguishable
divide between the 'schema' and 'data'. </p>
<p>The OWL export of NeuronDB was based on a transformation from the EAV/CR
model of the SenseLab database to files in RDF/XML syntax by a Java program.
The export from ModelDB and BrainPharm was based on a simple flat text file
export of the databases. The text file exports were converted to RDF/XML
files with a Python script. </p>
<p>For mappings to external bioinformatics databases that did not yet offer
stable URIs for reference on the Semantic Web, we used the URI scheme for
database record identifiers established by Science Commons [<a
href="#ref-SC-URI">SC-URI</a>]. URIs for database records could simply be
generated by concatenating the record identifier to a predefined namespace.
For example, the Entrez Gene record with ID '3579' was identified by the URI
<a
href="http://purl.org/commons/record/ncbi_gene/3579"><code>http://purl.org/commons/record/ncbi_gene/<strong>3579</strong></code></a>,
the Uniprot record 'P46663' was identified by <a
href="http://purl.org/commons/record/uniprotkb/P46663"><code>http://purl.org/commons/record/uniprotkb/<strong>P46663</strong></code></a>
and the Pubmed record with ID '11160518' was identified by <a
href="http://purl.org/commons/record/pmid/11160518"><code>http://purl.org/commons/record/pmid/<strong>11160518</strong></code></a>.
The database entries were connected to the ontological representations of
real-world entities through relations such as
<code>has_nucleotide_sequence_described_by</code>. For example, the gene of
the Dopamine Receptor D1 (DRD1) is defined through a reference to NCBI record
1812, which contains a description of the sequence of this specific gene: </p>
<p><tt>&lt;http://purl.org/ycmi/senselab/neuron_ontology.owl#DRD1_Gene&gt;
owl:equivalentClass _:property_restriction1 .</tt><br />
<tt>_:property_restriction1 owl:onProperty
senselab:has_nucleotide_sequence_described_by .</tt><br />
<tt>_:property_restriction1 owl:hasValue
&lt;http://purl.org/commons/record/ncbi_gene/1812&gt; .</tt></p>
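<p>The URI scheme and the restriction pattern can be sketched as follows.
This is an illustration only: the helper names are hypothetical, and we
assume here that the <code>senselab:</code> prefix expands to the NeuronDB
ontology namespace. The triples are kept as plain tuples, so the sketch does
not depend on any particular RDF library.</p>
<pre>
```python
# Sketch only: helper names are hypothetical. The record namespace and the
# property name come from the text; the expansion of the senselab: prefix
# is an assumption.
RECORD_NS = "http://purl.org/commons/record/"
SENSELAB_NS = "http://purl.org/ycmi/senselab/neuron_ontology.owl#"
OWL_NS = "http://www.w3.org/2002/07/owl#"


def record_uri(database, record_id):
    """Concatenate a record identifier onto the per-database namespace."""
    return RECORD_NS + database + "/" + str(record_id)


def has_value_restriction(class_uri, property_uri, value_uri, bnode="_:r1"):
    """Return the three triples of the pattern as (subject, predicate, object)."""
    return [
        (class_uri, OWL_NS + "equivalentClass", bnode),
        (bnode, OWL_NS + "onProperty", property_uri),
        (bnode, OWL_NS + "hasValue", value_uri),
    ]


triples = has_value_restriction(
    SENSELAB_NS + "DRD1_Gene",
    SENSELAB_NS + "has_nucleotide_sequence_described_by",
    record_uri("ncbi_gene", 1812),
)
for triple in triples:
    print(triple)
```
</pre>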
<p>Mappings were made to the following ontologies: </p>
<ul>
<li>the BAMS ontology which was derived from the Brain Architecture
Management System [<a href="#ref-BAMS">BAMS</a>]</li>
<li>the Subcellular Anatomy Ontology (SAO) created by the Cell Centered
Database project. [<a href="#ref-SAO">SAO</a>]</li>
<li>the BirnLex ontology developed by members of the Biomedical Informatics
Research Network [<a href="#ref-BIRNLEX">BIRNLEX</a>]</li>
<li>the Common Anatomy Reference Ontology (CARO)<!-- [<a
href="#ref-CARO">CARO</a>] --></li>
<li>the Gene Ontology [<a href="#ref-GO">GO</a>]</li>
<li>the Ontology of Biomedical Investigation (OBI) [<a
href="#ref-OBI">OBI</a>]</li>
</ul>
<p>The mappings were made with the following cross-ontology relations: <a
href="http://www.w3.org/TR/owl-ref/#equivalentClass-def">owl:equivalentClass</a>,
<a href="http://www.w3.org/TR/rdf-schema/#ch_subclassof">rdfs:subClassOf</a>
and the <a href="http://www.obofoundry.org/ro/#details">"has part" relation
from the OBO relation ontology</a>. </p>
<p><img alt="Ontology import hierarchy" src="ontology_import_hierarchy.png"
/></p>
<p>Figure 1: Import hierarchy of OWL ontologies. Ontologies printed in bold
have been created by the SenseLab team, other ontologies have been created by
other groups. The arrows point from the imported ontology to the importing
ontology, e.g., the NeuronDB Ontology imports the Relation Ontology. Import
statements are transitive, e.g., the ModelDB Ontology imports both the
NeuronDB ontology and the Relation ontology.</p>
<p></p>
<p><img alt="Examples of ontology mappings" src="examples-of-mappings.png"
/></p>
<p>Figure 2: Examples of relations ('mappings') spanning between classes from
the NeuronDB ontology (in the middle) and classes from external
ontologies.</p>
<p></p>
<p>Terse rdfs:labels were replaced by more descriptive ones that could be
better understood without knowledge about context. For example, the
rdfs:label "Ded" was changed to "Distal part of equivalent dendrite (Ded)".
Note that, in this case, the original label was also preserved (in brackets),
because it might still be useful for people that <em>do</em> know about the
context. </p>
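<p>A relabeling step of this kind might look like the sketch below. The
function name and the lookup table are hypothetical; in practice the
expanded labels have to be supplied by a domain expert.</p>
<pre>
```python
# Hypothetical helper; the expansion table is an illustrative assumption.
EXPANSIONS = {"Ded": "Distal part of equivalent dendrite"}


def descriptive_label(terse):
    """Expand a terse label and keep the original in brackets."""
    full = EXPANSIONS.get(terse)
    if full is None:
        return terse  # nothing known: leave the label untouched
    return "{} ({})".format(full, terse)


print(descriptive_label("Ded"))
```
</pre>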
<p>The ontology development was moved to a Subversion (SVN) system on a
central web server. Before this move, during most of the development, the
ontologies were simply developed on the client side and periodically uploaded
via FTP. This led to problems when more than one person was working on the
ontologies at a time. It was also impossible for users to access previous
versions of an ontology, since only the most recent version was available on
the web site. </p>
<p>The namespaces / ontology locations were changed to PURL-based URIs. For
example, the URI
<code>http://<strong>neuroweb.med.yale.edu</strong>/senselab/neuron_ontology.owl#Dopamine</code>
was changed to <a
href="http://purl.org/ycmi/senselab/neuron_ontology.owl#Dopamine"><code>http://<strong>purl.org/ycmi</strong>/senselab/neuron_ontology.owl#Dopamine</code></a>
('ycmi' stands for 'Yale Center for Medical Informatics'). PURL-based URIs
are easier to maintain when server configurations change or (in the worst
case) the original server is unavailable and the ontologies need to be served
from a different location. The increased stability of PURLs encourages the
re-use of entities in ontologies developed by other groups -- which is a key
factor in the creation of a coherent Semantic Web. </p>
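<p>The namespace change amounts to a prefix rewrite over every URI in the
ontology files. A minimal sketch, with a hypothetical function name and the
two namespaces taken from the example above:</p>
<pre>
```python
# Sketch only: the function name is hypothetical.
OLD_NS = "http://neuroweb.med.yale.edu/senselab/"
PURL_NS = "http://purl.org/ycmi/senselab/"


def to_purl(uri):
    """Rewrite a server-bound URI to its PURL form; leave others alone."""
    if uri.startswith(OLD_NS):
        return PURL_NS + uri[len(OLD_NS):]
    return uri


print(to_purl(OLD_NS + "neuron_ontology.owl#Dopamine"))
```
</pre>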
<p>A SPARQL endpoint for the SenseLab ontologies was set up using the open
source version of the Openlink Virtuoso server [<a
href="#ref-VIRTUOSO">VIRTUOSO</a>]. A SPARQL endpoint is a service that
allows clients to query an RDF store with the SPARQL query language through
simple HTTP GET requests. The ontologies were loaded into the triple store of
the server to make them accessible to SPARQL queries. Each ontology file was
put into a separate labeled graph; the label of each graph was identical to
the URL of the ontology file. For example, the ontology located at <a
href="http://purl.org/ycmi/senselab/neuron_ontology.owl">http://purl.org/ycmi/senselab/neuron_ontology.owl</a>
was loaded into a graph labeled
<code>http://purl.org/ycmi/senselab/neuron_ontology.owl</code>. Loading each
ontology into a separate graph makes it possible to restrict SPARQL queries
to certain graphs and hence, certain ontologies. This has the advantage that
queries can be more selective and can be executed with better performance.</p>
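<p>The sketch below shows how such a GET request can be assembled and
restricted to one graph. It only builds the request URL and does not contact
any server. The function name and the sample query are hypothetical;
<code>query</code> and <code>default-graph-uri</code> are the parameter
names defined by the SPARQL protocol, and the endpoint address is the one
listed under Outcome.</p>
<pre>
```python
from urllib.parse import urlencode

ENDPOINT = "http://hcls.deri.ie/sparql"  # endpoint named in this document
GRAPH = "http://purl.org/ycmi/senselab/neuron_ontology.owl"


def sparql_get_url(endpoint, query, graph=None):
    """Build the URL for a SPARQL protocol query sent as an HTTP GET."""
    params = [("query", query)]
    if graph is not None:
        # Standard protocol parameter that restricts the default graph.
        params.append(("default-graph-uri", graph))
    return endpoint + "?" + urlencode(params)


# Illustrative query: list a few triples from the selected graph.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
url = sparql_get_url(ENDPOINT, QUERY, GRAPH)
print(url)
```
</pre>
<p>Dropping the <code>graph</code> argument sends the same query against
the store's default dataset, which on a shared endpoint may span many
ontologies.</p>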
<h4 id="Outcome1">Outcome</h4>
<p>The final products of the project are accessible at <a
href="http://neuroweb.med.yale.edu/senselab/">http://neuroweb.med.yale.edu/senselab/</a>.
A SVN repository can be accessed through a web interface at <a
href="http://neuroweb.med.yale.edu/svn/trunk/ontology/senselab/">http://neuroweb.med.yale.edu/svn/trunk/ontology/senselab/</a>.
The SPARQL endpoint can be accessed at <a
href="http://hcls.deri.ie/sparql">http://hcls.deri.ie/sparql</a>. The
SenseLab OWL ontologies are mentioned as an example of the application of OBO
ontologies in the article <em>The OBO Foundry: coordinated evolution of
ontologies to support biomedical data integration</em> [<a
href="#ref-OBO-ARTICLE">OBO-ARTICLE</a>]. </p>
<h2 id="advantages">Advantages</h2>
<p>We experienced the following advantages from using RDF/OWL:</p>
<ul>
<li>The use of OWL significantly eased the integration of SenseLab data
with ontologies developed by other projects. OWL-based data integration
does not require the development and maintenance of central mediators,
reducing development and maintenance costs. The ontology integration can
be accomplished by creating meaningful relations between entities in
distributed ontologies.<br />
</li>
<li>Ontologies can be modularized; dependencies between ontologies can be
made explicit through 'owl:imports' statements. This makes distributed
development of ontology modules feasible and encourages the re-use of
selected ontology modules by other groups.<br />
</li>
<li>Good OWL ontologies are self-descriptive because every entity can be
annotated with text.</li>
<li>Reasoners can be used to identify errors and real (i.e., conscious)
contradictions in submitted data sets. You might find more errors and
contradictions than you expected.</li>
<li>Ontologies can be used to directly represent biological reality without
introducing unnecessary abstractions such as database tables, data
dictionaries, and documents.</li>
</ul>
<h2 id="disadvantages">Disadvantages</h2>
<p>We experienced the following problems while using RDF/OWL:</p>
<ul>
<li>The open-source ontology editors used for this project (conducted in
2007) were relatively unreliable. A lot of time was spent working around
software bugs that made the editors unstable and caused errors in the
generated RDF/OWL. Future versions of freely available editors or
currently available commercial ontology editors might be preferable. </li>
<li>Descriptions of OWL classes and their relations (i.e., OWL property
restrictions) result in very complex and unintuitive RDF graphs. This
makes it hard to generate them automatically, or use SPARQL to query such
ontologies. </li>
<li>Current reasoners can still have performance problems when checking /
classifying complex OWL ontologies. </li>
<li>The RDF/XML serialisation of RDF is not very easy to work with. It is
often a source of errors. </li>
</ul>
<h2 id="future">Future directions and plans</h2>
<p>The SenseLab ontologies will be further integrated with other
neuroscientific and biomedical ontologies. User friendly applications will be
developed to query a multitude of interrelated ontologies in a scientifically
meaningful way. To this end, we have implemented a prototype Web application
called 'Entrez Neuron' that allows the user to query data across multiple
sources based on key words. The user can browse the query results and
retrieve more detailed information about neurons based on a
'brain-anatomy/neuron' view. A paper describing this application was
published in the <a href="http://esw.w3.org/topic/HCLS/WWW2008">WWW/HCLS2008
workshop</a>. Currently, we are expanding this application to include more
views and features.</p>
<h2 id="suggestions">Suggestions based on our experiences</h2>
<p>Based on our experiences we can make the following suggestions for other
projects that have similar goals:</p>
<ul>
<li>Try to create consistent OWL DL ontologies. Pure RDF(S) without OWL
constructs is not much simpler than OWL DL and often leads to the
creation of too many properties because pure RDF(S) does not support
property restrictions. </li>
<li>Try to re-use entities and properties from existing ontologies where
possible. </li>
<li>If you do not want to import another ontology in its entirety (e.g.
because it would be too large, too buggy or would introduce unnecessary
constructs), you can still 'copy &amp; paste' portions of the ontology
into your own. </li>
<li>Try to base your ontology on a foundational ontology like BFO, OBO
Relation Ontology or DOLCE [<a href="#ref-DOL">DOL</a>]. </li>
<li>Where possible use the rdfs:label property to give clear,
understandable labels to each entity and property in the ontology. Try to
formulate labels in a way that makes them understandable without too much
additional context (e.g. a certain user interface). </li>
<li>Where possible, give concise rdfs:comments. </li>
<li>Make a habit out of running your ontology through the RDF validator [<a
href="#ref-RDF-VALID">RDF-VALID</a>] periodically, especially when you
create RDF/XML with scripts that you wrote yourself. Keep in mind that
the RDF validator does not throw an error message when URIs contain blank
spaces. Blank spaces in URIs are problematic for many Semantic Web
applications, so try to make sure that your URIs do not contain blank
spaces. </li>
<li>Check the consistency of your OWL ontology periodically. We used the
Pellet reasoner [<a href="#ref-PELLET">PELLET</a>], which seems to be the
best choice at the moment. </li>
<li>Use purl.org URIs for your ontologies. You can easily register a
sub-domain at purl.org free of charge. </li>
<li>If you write a program that generates RDF/OWL, do <strong>not</strong>
try to write RDF/XML code directly. RDF/XML is relatively complicated and
messy, and it is very easy to produce syntactic or even semantic errors
because of that. So if you write a program that generates RDF, use an RDF
or OWL API for writing triples. If that is not possible, generate your
RDF in the much simpler TURTLE syntax instead of RDF/XML. The TURTLE
syntax is a subset of the N3 syntax [<a href="#ref-n3">N3</a>]. You can
save the resulting RDF in TURTLE format to a text file. If you need
RDF/XML for another application, you can convert the TURTLE to RDF/XML in
a second step. </li>
</ul>
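<p>Because the RDF validator does not report blank spaces in URIs, a
separate check in the export script can help. The following is a minimal
sketch; the function name and the second sample URI are hypothetical.</p>
<pre>
```python
# Sketch only: function name and sample URIs are illustrative assumptions.
def uris_with_whitespace(uris):
    """Return the URIs that contain any whitespace character."""
    return [u for u in uris if any(ch.isspace() for ch in u)]


candidates = [
    "http://purl.org/ycmi/senselab/neuron_ontology.owl#Dopamine",
    "http://purl.org/ycmi/senselab/neuron_ontology.owl#Purkinje cell",
]
bad = uris_with_whitespace(candidates)
print(bad)
```
</pre>
<p>Running such a check before serialisation catches labels that were
copied into URIs without replacing the spaces.</p>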
<h2 id="conclusion">Conclusion</h2>
<p>We experienced clear benefits from using Semantic Web technologies for the
integration of SenseLab data with other neuroscientific data in a consistent,
flexible and decentralised manner. The main obstacle in our work was the lack
of mature and scalable open source software for editing the complex,
expressive ontologies we were dealing with. Since the quality of these tools
is rapidly improving, this may cease to be an issue in the near future. The
detailed analysis of the experiences with the SenseLab ontologies and other
complex biomedical ontologies may help drive the improvement of current
ontology editors.</p>
<h2 id="references">References</h2>
<dl>
<dt><a name="ref-EAV-CR" id="ref-EAV-CR"></a>[EAV-CR]</dt>
<dd><i>L. Marenco, N. Tosches, C. Crasto, G. Shepherd, P.L. Miller and
P.M. Nadkarni, Achieving evolvable Web-database bioscience applications
using the EAV/CR framework: recent advances, J Am Med Inform Assoc.
(2003) 10(5):444-53</i> </dd>
<dt><a name="ref-SENSELAB-WEB" id="ref-SENSELAB-WEB"></a>[SENSELAB-WEB]</dt>
<dd><i><a href="http://senselab.med.yale.edu/">SenseLab database</a></i>,
http://senselab.med.yale.edu/</dd>
<dt><a name="ref-SENSELAB-SW" id="ref-SENSELAB-SW"></a>[SENSELAB-SW]</dt>
<dd><i><a href="http://neuroweb.med.yale.edu/senselab/">SenseLab Semantic
Web Development</a></i>, http://neuroweb.med.yale.edu/senselab/ </dd>
<dt><a name="ref-PROTEGE" id="ref-PROTEGE"></a>[PROTEGE]</dt>
<dd><i><a href="http://protege.stanford.edu/">The Protege Ontology Editor
and Knowledge Acquisition System</a></i>, http://protege.stanford.edu/
</dd>
<dt><a name="ref-TOPBRAID" id="ref-TOPBRAID"></a>[TOPBRAID]</dt>
<dd><i><a href="http://www.topbraidcomposer.org/">TopBraid Composer</a></i>,
http://www.topbraidcomposer.org/ </dd>
<dt><a name="ref-RO" id="ref-RO"></a>[RO]</dt>
<dd><i><a href="http://www.obofoundry.org/ro/">Relation Ontology</a></i>,
http://www.obofoundry.org/ro/ </dd>
<dt><a name="ref-OBO" id="ref-OBO"></a>[OBO]</dt>
<dd><i><a href="http://obofoundry.org">The Open Biomedical
Ontologies</a></i>, http://obofoundry.org/</dd>
<dt><a name="ref-BFO" id="ref-BFO"></a>[BFO]</dt>
<dd><i><a href="http://www.ifomis.uni-saarland.de/bfo/">Basic Formal
Ontology (BFO)</a></i>, http://www.ifomis.uni-saarland.de/bfo/</dd>
<dt><a name="ref-SC-URI" id="ref-SC-URI"></a>[SC-URI]</dt>
<dd><i><a
href="http://sw.neurocommons.org/2007/uri-explanation.html">Explanation
of HCLS and Science Commons URIs</a></i>,
http://sw.neurocommons.org/2007/uri-explanation.html</dd>
<dt><a name="ref-BAMS" id="ref-BAMS"></a>[BAMS]</dt>
<dd><i><a href="http://brancusi.usc.edu/bkms/">The Brain Architecture
Management System</a></i>, http://brancusi.usc.edu/bkms/ </dd>
<dt><a name="ref-SAO" id="ref-SAO"></a>[SAO]</dt>
<dd><i><a href="http://ccdb.ucsd.edu/CCDBWebSite/sao.html">CCDB
Subcellular Anatomy Ontology</a></i>,
http://ccdb.ucsd.edu/CCDBWebSite/sao.html</dd>
<dt><a name="ref-CARO" id="ref-CARO"></a>[CARO]</dt>
<dd><i><a
href="http://www.obofoundry.org/cgi-bin/detail.cgi?id=caro">Common
Anatomy Reference Ontology </a></i>,
http://www.obofoundry.org/cgi-bin/detail.cgi?id=caro </dd>
<dt><a name="ref-BIRNLEX" id="ref-BIRNLEX"></a>[BIRNLEX]</dt>
<dd><i><a href="http://fireball.drexelmed.edu/birnlex/OWLdocs/">BIRNLex Ontology Documentation</a></i>,
http://fireball.drexelmed.edu/birnlex/OWLdocs/ </dd>
<dt><a name="ref-GO" id="ref-GO"></a>[GO]</dt>
<dd><i><a href="http://geneontology.org/">Gene Ontology</a></i>,
http://geneontology.org/</dd>
<dt><a name="ref-OBI" id="ref-OBI"></a>[OBI]</dt>
<dd><i><a href="http://obi.sourceforge.net/">Ontology of Biomedical
Investigation</a></i>, http://obi.sourceforge.net/ </dd>
<dt><a name="ref-VIRTUOSO" id="ref-VIRTUOSO"></a>[VIRTUOSO]</dt>
<dd><i><a href="http://virtuoso.openlinksw.com/">OpenLink Universal
Integration Middleware - Virtuoso Product Family</a></i>,
http://virtuoso.openlinksw.com/ </dd>
<dt><a name="ref-OBO-ARTICLE" id="ref-OBO-ARTICLE"></a>[OBO-ARTICLE]</dt>
<dd><i>The OBO Foundry: coordinated evolution of ontologies to support
biomedical data integration</i>, Barry Smith, Michael Ashburner,
Cornelius Rosse, Jonathan Bard, William Bug, Werner Ceusters <em>et
al.</em>, Nature Biotechnology 25, 1251 - 1255, 2007,
http://dx.doi.org/10.1038/nbt1346 </dd>
<dt><a name="ref-DOL" id="ref-DOL"></a>[DOL]</dt>
<dd><i><a href="http://www.loa-cnr.it/DOLCE.html">DOLCE Ontology</a></i>,
http://www.loa-cnr.it/DOLCE.html </dd>
<dt><a name="ref-RDF-VALID" id="ref-RDF-VALID"></a>[RDF-VALID]</dt>
<dd><i><a href="http://www.w3.org/RDF/Validator/">RDF Validator</a></i>,
http://www.w3.org/RDF/Validator/</dd>
<dt><a name="ref-PELLET" id="ref-PELLET"></a>[PELLET]</dt>
<dd><i><a href="http://pellet.owldl.org/">The PELLET Open Source OWL DL
Reasoner</a></i>, http://pellet.owldl.org/ </dd>
<dt><a name="ref-SMITH-2004" id="ref-SMITH-2004"></a>[SMITH-2004]</dt>
<dd><i>Beyond Concepts: Ontology as Reality Representation</i>, Barry
Smith, in A. Varzi, L. Vieu, eds., Proceedings of FOIS (IOS Press,
Amsterdam, 2004) 319-330. <a
href="http://ontology.buffalo.edu/bfo/BeyondConcepts.pdf">http://ontology.buffalo.edu/bfo/BeyondConcepts.pdf</a></dd>
<dt><a name="ref-kb" id="ref-kb"></a>[KB]</dt>
<dd><i><a href="../NOTE-hcls-kb-20080604/">A Prototype Knowledge Base for the Life Sciences</a></i>,
http://www.w3.org/TR/2008/NOTE-hcls-kb-20080604/ </dd>
<dt><a name="ref-n3" id="ref-n3"></a>[N3]</dt>
<dd><i><a href="http://www.w3.org/2000/10/swap/Primer">Primer: Getting
into RDF and Semantic Web using N3</a></i>,
http://www.w3.org/2000/10/swap/Primer </dd>
<dt><a name="ref-OWL-Overview" id="ref-OWL-Overview"></a>[OWL Overview]</dt>
<dd><i><a href="http://www.w3.org/TR/2004/REC-owl-features-20040210/">OWL
Web Ontology Language Overview</a></i>, Deborah L. McGuinness and Frank
van Harmelen, Editors, W3C Recommendation, 10 February 2004,
http://www.w3.org/TR/2004/REC-owl-features-20040210/ . <a
href="http://www.w3.org/TR/owl-features/">Latest version</a> available
at http://www.w3.org/TR/owl-features/ </dd>
<dt><a id="ref-RDF" name="ref-RDF">[RDFS]</a></dt>
<dd><i><a href="http://www.w3.org/TR/2004/REC-rdf-schema-20040210/">RDF
Vocabulary Description Language 1.0: RDF Schema</a></i>, Dan Brickley and
R.V. Guha, Editors, W3C Recommendation, 10 February 2004,
http://www.w3.org/TR/2004/REC-rdf-schema-20040210/ . <a
href="http://www.w3.org/TR/rdf-schema/">Latest version</a> available
at http://www.w3.org/TR/rdf-schema/ </dd>
</dl>
<h2 id="Acknowledg">Acknowledgements (Informative)</h2>
<p>Thanks to Huajun Chen and Ernest Lim who contributed to the SenseLab
conversion. Thanks to Gordon Shepherd, Perry Miller, Luis Marenco and Tom
Morse for their input, suggestions and support. Thanks to Susie Stephens for
her detailed suggestions for improving this document. Thanks to Alan
Ruttenberg for his technical suggestions during the conversion process.
Thanks to Eric Prud'hommeaux for technical advice and assistance on the
creation of this document.</p>
</body>
</html>