<HTML>
<HEAD>
<STYLE>
DIV.abstract {
font-style: italic;
}
DIV.copyright {
font-weight: bold;
margin: 5%;
}
.example {
background: orange;
padding: 0.2em 1em 0.2em 1em;
border: none;
white-space: pre;
font-family: monospace;
}
P.caption {
font-weight: bold;
}
</STYLE>
<!-- Created with AOLpress/2.0 -->
<TITLE ALIGN=Center>Query + Metadata + Logic = Metalog</TITLE>
<LINK rel=stylesheet href="http://www.w3.org/StyleSheets/Core/Modernist"
type="text/css">
</HEAD>
<BODY BGCOLOR="white">
<H1 ALIGN=Center>
Query + Metadata + Logic = Metalog
</H1>
<DIV class=authors>
<H3 ALIGN=Center>
<A HREF="http://www.w3.org/People/Massimo/">Massimo Marchiori</A>,
<A HREF="http://www.w3.org/People/Janne/">Janne Saarela</A><BR>
{massimo,jsaarela}@w3.org
</H3>
</DIV>
<DIV class=organization>
<H2 ALIGN=Center>
<I><A HREF="http://www.w3.org">The World Wide Web Consortium (W3C)</A></I>
</H2>
</DIV>
<DIV class=abstract>
<P>
The Resource Description Framework (RDF) Model and Syntax Specification describes
a metadata infrastructure which can accommodate classification elements from
different vocabularies, i.e. schemas. The underlying model consists of a labeled
directed acyclic graph which can be linearized into eXtensible Markup Language
(XML) transfer syntax for interchange between applications.
<P>
This paper will demonstrate how a new query language, Metalog, allows
users to write inference rules and queries in English-like syntax. We will
demonstrate how these reasoning rules have equivalent representation both
as RDF descriptions and as logic programs. We will also show how an automated
compilation between these translations is possible.
<P>
For the sake of clarity, here we will just give an overview of the system,
trying to avoid technicalities and cumbersome details.
</DIV>
<H2>
Query Languages
</H2>
<P>
In general, query languages are formal languages to retrieve data from a
database. Standardized languages already exist to retrieve information
from different types of databases, such as Structured Query Language (SQL)
for relational databases, and Object Query Language (OQL) and SQL3 for object
databases.
<P>
Semi-structured query languages such as XML-QL [3] operate on the
document-level structure.
<P>
Logic programs consist of facts and rules where valid inference rules are
used to determine all the facts that apply within a given model.
<P>
With RDF, the most suitable approach is to focus on the underlying data model.
Even though XML-QL could be used to query RDF descriptions in their XML encoded
form, a single RDF data model could not be correctly determined with a single
XML-QL query due to the fact that RDF allows several XML syntax encodings
for the same data model.
<H2>
The Metalog Approach
</H2>
<P>
RDF provides the basis for structuring the data present on the web in a
consistent and accurate way. However, RDF is only the first step towards
the construction of what Tim Berners-Lee calls the "web of knowledge", a
World Wide Web where data is structured, <I>and</I> users can fully benefit
from this structure when accessing information on the web. RDF only provides
the "basic vocabulary" in which data can be expressed and structured. This
leaves open the whole problem of <I>accessing</I> and <I>managing</I> this
structured data.
<P>
Metalog provides a "logical" view of metadata present on the web. The Metalog
approach is composed of three components.
<P>
In the first component, a particular data semantics is established. Metalog
provides a way to express logical relationships like "and", "or" and so on,
and to build up complex <I>inference rules</I> that encode logical reasoning.
This "semantic layer" builds on top of RDF using a so-called <I>RDF schema</I>.
<P>
The second component consists of a "logical interpretation" of RDF data
(optionally enriched with the semantic schema) into logic programming. This
way, the understood semantics of RDF is unfolded into its logical components
(that is, a logic program). This means that any reasoning on RDF data can
be performed by acting upon the corresponding logical view, the logic program,
which provides a neat and powerful way to reason about data.
<P>
The third component is a language interface for writing structured data and
reasoning rules. In principle, the first component already suffices: data
and rules can be written directly in RDF, using RDF syntax and the metalog
schema. However, this is not convenient from a practical viewpoint. Indeed,
RDF syntax aims to be an encoding language rather than a user-friendly
language, and it is well recognised in the RDF community and among vendors
that typical applications will provide more user-friendly interfaces
between the "raw RDF" code and the user. Our proposed language is innovative
in that it stresses user-friendliness as much as possible: a program
is a collection of <I>natural language</I> assertions. We think this feature
will be particularly important for the wide deployment not only of metalog,
but of RDF itself: the success of metadata and of proper structuring
of information on the web is measured by the number of people who will actually
spend time and energy writing (and/or translating) data into the structured
format. Therefore, it is of primary importance that the entry level is kept
extremely easy, so that the difficulty of learning how to encode
and structure data does not block the widespread diffusion of metadata on
the web.
<P>
Another important feature of the language, in this respect, is indeed that
it can be used just as an interface to RDF, without the metalog extensions.
This way, users will be able to access and structure metadata using RDF in
a smooth and seamless way, using the metalog language.
<H2>
The Metalog Schema
</H2>
<P>
The first correspondence in Metalog is between the basic RDF data model and
the predicates in logic. The RDF data model consists of so-called
<I>statements</I>. Statements are triples consisting of a subject (the
"resource"), a predicate (the "property"), and an object (the "literal").
Metalog views an RDF statement in the logical setting as just a binary predicate
involving the subject and the literal. For example, the RDF statement expressing
the fact that <I>Tim Berners-Lee invented the Web</I> (formally, the RDF
triple <I>{invented, Tim Berners-Lee, Web}</I>) is seen in logic programming
as the predicate <I>invented(Tim Berners-Lee, Web)</I>.
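<P>
As a minimal sketch of this mapping, the following Python fragment renders a
triple as a binary predicate. The <TT>Statement</TT> type and the
<TT>to_predicate</TT> helper are hypothetical names introduced here for
illustration only; they are not part of Metalog or of any RDF library.
<PRE>
```python
from typing import NamedTuple


class Statement(NamedTuple):
    """An RDF statement in the {predicate, subject, object} order used above."""
    predicate: str  # the RDF property
    subject: str    # the RDF resource
    obj: str        # the RDF object


def to_predicate(stmt: Statement) -> str:
    """Render a statement as the binary predicate predicate(subject, object)."""
    return f"{stmt.predicate}({stmt.subject}, {stmt.obj})"


web = Statement("invented", "Tim Berners-Lee", "Web")
print(to_predicate(web))  # prints: invented(Tim Berners-Lee, Web)
```
</PRE>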
<P>
Once the basic correspondence between the RDF data model
and predicates in logic is established, the next step comes easily: we can extend RDF so that
the mapping to logic is able to take advantage of all of the logical
relationships present in logical systems. That is to say, beyond the ability
to express <I>static facts</I>, we want the ability to encode <I>dynamic
reasoning rules</I>, like in logic programming.
<P>
In order to do so, we need at least:
<UL>
<LI>
the standard logical connectors (<I>and</I>, <I>or</I>, <I>not</I>)
<LI>
variables
</UL>
<P>
The metalog schema extends plain RDF with this "logical layer", making it
possible to express arbitrary logical relationships within RDF. In fact, the metalog
schema provides more accessories besides the aforementioned basic ones (for
example, the "implies" connector); however, so as not to burden the discussion,
we do not go into further detail on this topic. What the reader should keep
in mind is just that the Metalog schema provides the "meta-logic" operators
to reason with RDF statements.
<P>
Technically, this is quite easy to do: the metalog schema is just a schema
as defined by the RDF schema specification where, for example, <I>and</I>
and <I>or</I> are subinstances of the RDF <I>Bag</I> connector.
<P>
The mapping between "metalog RDF" and logical formulas is then completely
natural: for each RDF statement that does not use a metalog connector, there
is a corresponding logical predicate as defined before. Then, the metalog
connectors are translated into the corresponding logical connectors in the
natural way (so, for instance, the metalog <I>and</I> connector is mapped
using logical conjunction, while the metalog <I>or</I> connector is mapped
using logical disjunction).
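<P>
The translation of connectors can be sketched as a small recursive function.
The nested-tuple encoding of a "metalog RDF" expression used below is an
assumption made purely for illustration (the actual encoding is RDF), and
<TT>to_formula</TT> is a hypothetical helper name.
<PRE>
```python
def to_formula(node):
    """Translate a nested (connector, operand, ...) tuple into a formula string.

    A plain string stands for a predicate that is already in logical form.
    """
    if isinstance(node, str):
        return node
    connector, *operands = node
    parts = [to_formula(op) for op in operands]
    if connector == "not":
        return f"not({parts[0]})"
    if connector in ("and", "or"):
        # conjunction / disjunction of all operands
        return "(" + f" {connector} ".join(parts) + ")"
    if connector == "implies":
        body, head = parts
        return f"{body} => {head}"
    raise ValueError(f"unknown connector: {connector}")


rule = ("implies",
        ("and", 'degree(SHE,"math")', 'degree(SHE,"computer science")'),
        'is(SHE,"really smart")')
print(to_formula(rule))
```
</PRE>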
<H2>
The Metalog Syntax
</H2>
<P>
Note that the RDF metalog schema and the corresponding translation into logical
formulas are absolutely general. However, in practice, one also needs to
be able to process the resulting logical formulas in an effective way. In
other words, while the RDF metalog schema nicely extends RDF with the full
power of first order predicate calculus, thus greatly increasing the
<I>expressiveness</I> of basic RDF, there is still the other,
<I>computational</I>, side of the coin: how to process and effectively reason
with all these logical inference rules.
<P>
It is well known that dealing with full first order predicate
calculus is, in general, computationally infeasible. So, what we would like to
have is a subset of predicate calculus that is still expressive enough,
and also computationally feasible: our choice fell on <I>logic programming</I>.
Logic programming (see e.g. [1]) is a well known programming paradigm that selects
a subset of full first-order predicate calculus (so-called Horn clauses); it is a very
powerful and expressive paradigm, and has the further advantage that it has been
widely studied in the database community (a subset of logic programming,
<EM>datalog</EM>, even has the advantage that its computations always terminate,
a feature of obvious interest for web queries).
<P>
The third level is then the actual syntax interface between the user and
this "metalog RDF" encoding, with the constraint that the expressiveness
of the language must fit within that provided by logic programming.
<P>
The metalog syntax has been explicitly designed to be
totally natural-language based, trying to avoid any possible technicalities,
and therefore making the language extremely readable and self-descriptive.
<P>
Metalog achieves this goal through a careful use of upper/lower case
and quotes, and by allowing a rather liberal positioning of the keywords (an
advanced parser then disambiguates the keywords in each metalog program
line).
<P>
Upper/lower case is used to distinguish between variables and other words:
variables are expressed using names all in upper case (for example,
<TT>FOO</TT> is a variable). Words that are in lower case are either keywords
(reserved words) or, if not, they are ignored. For example, <TT>then</TT>
is a keyword, while <TT>foo</TT> is not, and so it is just ignored (it is
only syntactic sugar). In the current version of metalog, words cannot mix
upper and lower case: this helps to reduce errors and to improve readability,
since it strengthens the visual difference between variables and the other
words.
<P>
Finally, any name which is between double quotes (for example, "John") is
a datum (a fixed constant).
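<P>
These lexical conventions can be illustrated with a small token classifier.
This is only a sketch under stated assumptions: the <TT>KEYWORDS</TT> set
contains just the reserved words named in this paper, the regular expression
is a simplification, and <TT>classify</TT> is a hypothetical helper, not the
actual metalog parser.
<PRE>
```python
import re

# Assumption: only the reserved words named in this paper; the real set is larger.
KEYWORDS = {"then", "imply", "implies", "and", "or", "order", "not"}

# A token is either a double-quoted datum or a run of letters.
TOKEN = re.compile(r'"[^"]*"|[A-Za-z]+')


def classify(line):
    """Label each token as datum, variable, keyword or ignored."""
    labelled = []
    for tok in TOKEN.findall(line):
        if tok.startswith('"'):
            labelled.append(("datum", tok.strip('"')))
        elif tok.isupper():
            labelled.append(("variable", tok))
        elif tok.islower():
            kind = "keyword" if tok in KEYWORDS else "ignored"
            labelled.append((kind, tok))
        else:
            # words may not mix upper and lower case
            raise ValueError(f"mixed-case word not allowed: {tok}")
    return labelled


tokens = classify('if SHE has a "degree" in "math" then SHE "is" "smart"')
print([t for t in tokens if t[0] != "ignored"])
```
</PRE>
<P>
Words such as <TT>has</TT> or <TT>in</TT> come out labelled as ignored, which
is what lets a program line read like an English sentence.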
<H3>
Keywords
</H3>
<P>
The following set of keywords is reserved in metalog. Interpretation of the
keywords is done in metalog on a positional basis: the position of the keyword
with respect to other keywords and/or other data determines the interpretation
of the sentence. The reserved keywords are:
<UL>
<LI>
<TT>then</TT> is a keyword for the logical implication (=>)<BR>
For example,<BR>
<TT>if SHE has a "degree" in "math" then SHE "is" "smart"</TT><BR>
is translated into the logical formula<BR>
<I>degree(SHE,"math") => is(SHE,"smart")</I><BR>
(Note that, here and in the following examples, we provide directly the
translation into the logical formula, to save space; a more detailed translation
would have to also show the intermediate RDF model, which is in any case
trivial to derive).
<LI>
<TT>imply</TT> is a keyword for the logical implication (=>)
<LI>
<TT>implies</TT> is a keyword for the logical implication (=>)
<LI>
<TT>and</TT> can be either the metalog <I>and</I>, or it can be used to indicate
the presence of an RDF Bag: this is disambiguated by the context<BR>
For example, <TT><BR>
if SHE has a "degree" in "math" and SHE has a "degree" in "computer science"
as well then SHE "is" "really smart"</TT>.<BR>
is translated into the logical formula<BR>
(<I>degree(SHE,"math") and degree(SHE,"computer science"))=> is(SHE,"really
smart")</I><BR>
On the other hand, as said, <TT>and</TT> can be used to denote an RDF Bag (a
set):<BR>
<TT>the "technical report 231" has as "authors" "Mary" and "John"</TT>.<BR>
is translated into the logical formulas (the translation here is more involved
since the RDF Bag construct is used):<BR>
<TT>authors("technical report 231",foo).<BR>
rdf:type(foo,rdf:Bag).<BR>
rdf:_1(foo,"Mary").<BR>
rdf:_2(foo,"John").</TT>
<LI>
<TT>or</TT> can be either the metalog <I>or</I>, or it can be used to indicate
the presence of an RDF Alt (a list of alternatives): this is disambiguated
by the context
<LI>
<TT>order</TT>: the presence of this keyword turns an RDF Bag into an RDF
Seq (an ordered list).<BR>
For example,<BR>
<TT>the "technical report 231" has as "authors" "Mary" and "John" in
this order</TT>.<BR>
is translated into the logical formulas (the translation here is more involved
since the RDF Seq construct is used):<BR>
<TT>authors("technical report 231",foo).<BR>
rdf:type(foo,rdf:Seq).<BR>
rdf:_1(foo,"Mary").<BR>
rdf:_2(foo,"John").</TT>
<LI>
<TT>not</TT> can be combined with any other metalog constructs, and its
interpretation is logical negation<BR>
For example,<BR>
<TT>if SHE has a "degree" in "math" then SHE "is" not "stupid"</TT>.<BR>
is translated into the logical formula<BR>
<I>degree(SHE,"math") => not(is(SHE,"stupid"))</I>
</UL>
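<P>
The container translation used for <TT>and</TT> and <TT>order</TT> can be
sketched as follows. The function name <TT>container_facts</TT> and its
parameters are hypothetical, and the node name <TT>foo</TT> simply mirrors the
placeholder used in the examples above.
<PRE>
```python
def container_facts(subject, prop, members, ordered=False, node="foo"):
    """Emit the logical facts for an RDF Bag, or an RDF Seq when ordered."""
    kind = "rdf:Seq" if ordered else "rdf:Bag"
    facts = [
        f'{prop}("{subject}",{node}).',   # the property points at the container
        f"rdf:type({node},{kind}).",      # the container's type
    ]
    # one numbered membership fact per member: rdf:_1, rdf:_2, ...
    for i, member in enumerate(members, start=1):
        facts.append(f'rdf:_{i}({node},"{member}").')
    return facts


for fact in container_facts("technical report 231", "authors",
                            ["Mary", "John"], ordered=True):
    print(fact)
```
</PRE>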
<P>
The dot is the separator between metalog program lines. For formatting purposes,
carriage returns, line feeds and tabs can be used: they are simply ignored.
Similarly, commas and semicolons can be used as well.
<P>
A trailing question mark is used to denote a query.
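<P>
A rough sketch of how a program text could be split into program lines
according to these rules is given below. It assumes that a dot or question
mark inside double quotes belongs to the datum rather than acting as a
separator; this assumption, and the helper name <TT>split_program</TT>, are
ours and are not taken from the metalog parser.
<PRE>
```python
def split_program(text):
    """Split text into (kind, sentence) pairs; kind is 'assertion' or 'query'."""
    lines, current, quoted = [], [], False
    for ch in text:
        if ch == '"':
            quoted = not quoted
        if not quoted and ch in ".?":
            # collapse carriage returns, line feeds and tabs into single spaces
            sentence = " ".join("".join(current).split())
            if sentence:
                lines.append(("query" if ch == "?" else "assertion", sentence))
            current = []
        elif not quoted and ch in ",;":
            current.append(" ")   # commas and semicolons are ignored
        else:
            current.append(ch)
    return lines


program = 'the "author" of "report 1.0" is "John".\nwhat "language" does\t"John" "speak"?'
for kind, sentence in split_program(program):
    print(kind, "|", sentence)
```
</PRE>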
<P>
<I>Note</I>: metalog programs also have a facility to express namespaces
via the keyword <TT>namespace</TT>. We will not go into further detail since
we won't be explicitly using namespace sugar here, but in passing we
just mention that essentially the <TT>namespace</TT> keyword has the same
functionality as the <TT>xmlns</TT> attribute for XML namespaces. Also, there
are a number of other keywords that deal, for example, with numeric operations
(e.g., <TT>greater</TT>, <TT>less</TT>, etc.), but for the sake of brevity
we do not go into their (rather obvious) description.
<H3>
A more detailed example
</H3>
<P>
Suppose we want to encode the rule that if a person has written a document
in some language (for example, English), then he can speak in that language.
The corresponding metalog program would be:
<P>
<TT>if the "language" of a DOCUMENT is Y <BR>
and the "author" of the DOCUMENT is X <BR>
then X can "speak" Y.</TT>
<P>
This can be translated into the following piece of RDF syntax:
<P>
<TT><Procedure><BR>
<Head><BR>
<and><BR>
<Predicate name="speak"><BR>
<rdf:Seq><BR>
<rdf:li><Variable>X</Variable></rdf:li><BR>
<rdf:li><Variable>Y</Variable></rdf:li><BR>
</rdf:Seq><BR>
</Predicate><BR>
</and><BR>
</Head><BR>
<Body><BR>
<and><BR>
<Predicates><BR>
<rdf:Seq><BR>
<rdf:li><BR>
<Predicate
name="creator"><BR>
<rdf:Seq><BR>
<rdf:li><Variable>DOCUMENT</Variable></rdf:li><BR>
<rdf:li><Variable>X</Variable></rdf:li><BR>
</rdf:Seq><BR>
</Predicate><BR>
</rdf:li><BR>
<rdf:li><BR>
<Predicate name="language"><BR>
<rdf:Seq><BR>
<rdf:li><Variable>DOCUMENT</Variable></rdf:li><BR>
<rdf:li><Variable>Y</Variable></rdf:li><BR>
</rdf:Seq><BR>
</Predicate><BR>
</rdf:li><BR>
</rdf:Seq><BR>
</Predicates><BR>
</and><BR>
</Body><BR>
</Procedure></TT>
<P>
And, finally, this corresponds to the logical formula
<P>
<I>speak(X,Y) <= (author(DOCUMENT,X) and language(DOCUMENT,Y))</I>
<P>
So, suppose we have already grabbed from somewhere on the web some pieces
of RDF that tell us, for example, that "John" is the author of "technical
report 231", and that the language of "technical report 231" is
"English".<BR>
Then, if we want to know which language John speaks, we can just ask
<P>
<TT>what "language" does "John" "speak"?</TT>
<P>
which is translated into the corresponding query
<P>
<I>speak("John",Y).</I>
<P>
Running this query in the corresponding logic program gives the result that
<I>Y="English"</I>, that is to say, the predicate
<I>speak("John","English")</I> is true.<BR>
Hence, the corresponding metalog sentence, returned as answer, is:
<P>
<TT>"John" "speaks" "English"</TT>
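<P>
To make the evaluation step concrete, here is a minimal sketch of how the rule
and the query above could be evaluated by naive bottom-up (forward chaining)
evaluation, as is common for datalog. This illustrates the general technique
only; it is not a description of the engine actually used by metalog, and the
names <TT>Var</TT>, <TT>match</TT>, <TT>solve</TT> and <TT>forward_chain</TT>
are hypothetical.
<PRE>
```python
class Var(str):
    """Marks a term as a logical variable (plain strings are constants)."""


X, Y, DOCUMENT = Var("X"), Var("Y"), Var("DOCUMENT")

# Facts as (predicate, subject, object) triples, taken from the example above.
facts = {
    ("author", "technical report 231", "John"),
    ("language", "technical report 231", "English"),
}

# speak(X,Y) holds if author(DOCUMENT,X) and language(DOCUMENT,Y) hold.
rule = (("speak", X, Y),
        [("author", DOCUMENT, X), ("language", DOCUMENT, Y)])


def match(pattern, fact, env):
    """Return env extended so that pattern equals fact, or None on mismatch."""
    env = dict(env)
    for p, f in zip(pattern, fact):
        if isinstance(p, Var):
            if env.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return env


def solve(goals, known, env):
    """Yield every variable binding that satisfies all goals in order."""
    if not goals:
        yield env
        return
    for fact in known:
        bound = match(goals[0], fact, env)
        if bound is not None:
            yield from solve(goals[1:], known, bound)


def forward_chain(rules, known):
    """Apply the rules until no new fact can be derived (a fixpoint)."""
    known = set(known)
    while True:
        new = {tuple(env[t] if isinstance(t, Var) else t for t in head)
               for head, body in rules
               for env in solve(body, known, {})} - known
        if not new:
            return known
        known |= new


closure = forward_chain([rule], facts)
answers = [env[Y] for env in solve([("speak", "John", Y)], closure, {})]
print(answers)  # prints: ['English']
```
</PRE>
<P>
Because the rules contain no function symbols, only finitely many facts can be
derived, so the loop in <TT>forward_chain</TT> always reaches a fixpoint.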
<P>
As far as real data are concerned, we have run the above
example on a set of 2700 RDF data model triples that correspond to the
data available at the World Wide Web Consortium technical reports page. This
page presents the public documents the consortium has published along with
their authors, dates, and URIs. Therefore, one can get a complete knowledge
base regarding W3C's authors that is flexible and elegantly extensible.
<H2>
Related work
</H2>
<P>
The use of Web infrastructure to accommodate logic programs has been suggested
by (Sandewall, 1996) and (Loke &amp; Davison, 1996). The latter approach
suggests using familiar logic program notation to place facts and queries
on HTML pages. The embedded rules also have the ability to refer to other
HTML pages with other predicates using a namespace mechanism. In this way,
their evaluation context grows with the number of HTML pages they retrieve
to find facts that satisfy the queries.
<H2>
Conclusions
</H2>
<P>
To the best of our knowledge, this is the first work that addresses the problem
of querying RDF models and extending them with the ability to express reasoning
rules. The metalog model that we have sketched is general enough to be of
wide use, and powerful enough to fulfill most of the generic user's needs.
Moreover, it is elegantly integrated within the "big picture" of W3C's standards,
with a particular eye toward extensibility and future improvements.
It tries to lower the "access level" to metadata and reasoning management
through a top-level syntax based on natural language, enabling not only easy
and fast writing of complex relationships, but also extremely high
readability. Finally, it can be used even without the logical extensions,
just to provide a user-friendly interface to RDF. Future work that we plan
to do within W3C is the deployment of a publicly accessible prototype of
the system, so as to foster large-scale use of structured metadata on the
web.
<H3>
Acknowledgements
</H3>
<P>
The authors would like to thank Bert Bos for his help in running the test
sets.
<H3>
References
</H3>
<OL>
<LI>
Das, S.K. (1992). <I>Deductive Databases and Logic Programming</I>. Addison
Wesley.
<LI>
Dan Brickley, R.V. Guha, A. Layman, "Resource Description Framework
(RDF) Schema Specification". W3C Draft.
<BR>
<A HREF="http://www.w3.org/TR/WD-rdf-schema/">http://www.w3.org/TR/WD-rdf-schema/</A>
<LI>
Alin Deutsch (University of Pennsylvania), Mary Fernandez (AT&amp;T Labs),
Daniela Florescu (INRIA), Alon Levy (University of Washington), Dan Suciu
(AT&amp;T Labs), "XML-QL". W3C Note.
<BR>
<A HREF="http://www.w3.org/TR/1998/NOTE-xml-ql-19980819">http://www.w3.org/TR/1998/NOTE-xml-ql-19980819</A>
<LI>
Lassila, O., Swick, R. (1998). <I>Resource Description Framework (RDF) Model
and Syntax Specification</I>. W3C Working Draft.<BR>
<A href="http://www.w3.org/TR">http://www.w3.org/TR/</A>
<LI>
Loke, S.W., Davison, A. (1996). Logic Programming with the World Wide Web.
<I>Proc. of the 7th ACM Conf. on Hypertext.</I><BR>
<A href="http://www.cs.unc.edu/~barman/HT96/P14/lpwww.html">http://www.cs.unc.edu/~barman/HT96/P14/lpwww.html</A>
<LI>
Niemelä, I., Simons, P. (1997). Smodels -- an implementation of the stable
model and well-founded semantics for normal logic programs. <I>Proc. of the
4th Int. Conf. on Logic Programming and Non-Monotonic Reasoning</I>. Dagstuhl,
Germany.<BR>
<A href="http://saturn.hut.fi/pub/papers/lpnmr97-sd.ps.gz">http://saturn.hut.fi/pub/papers/lpnmr97-sd.ps.gz</A>
<LI>
Ramakrishnan, R., Srivastava, D., Sudarshan, S. (1992). CORAL: Control, Relations
and Logic. <I>Proc. of the Int. Conf. on VLDB</I>.
<LI>
Sandewall, E. (1996). Towards a World-Wide Data Base. <I>Proc. of the 5th
Int. WWW Conf.</I>
</OL>
<H3>
Appendix A - Query schema in RDF
</H3>
<P>
In the following, we provide (part of) the Metalog schema, to give the
technically oriented reader more insight into how the schema effectively
uses RDF schema facilities.
<PRE>
&lt;RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:rdf="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:rdfs="http://www.w3.org/TR/WD-rdf-schema#"&gt;
  &lt;rdfs:Class ID="Procedure" /&gt;
  &lt;Predicate ID="Head"&gt;
    &lt;rdfs:comment xml:lang="en"&gt;Head of the procedure&lt;/rdfs:comment&gt;
    &lt;rdfs:domain rdf:resource="#Procedure"/&gt;
    &lt;rdfs:range rdf:resource="#Connector"/&gt;
    &lt;rdfs:range rdf:resource="#Predicate"/&gt;
  &lt;/Predicate&gt;
  &lt;Predicate ID="Body"&gt;
    &lt;rdfs:comment xml:lang="en"&gt;Body of the procedure&lt;/rdfs:comment&gt;
    &lt;rdfs:domain rdf:resource="#Procedure"/&gt;
    &lt;rdfs:range rdf:resource="#Connector"/&gt;
    &lt;rdfs:range rdf:resource="#Predicate"/&gt;
  &lt;/Predicate&gt;
  &lt;Predicate ID="Predicates"&gt;
    &lt;rdfs:comment xml:lang="en"&gt;Predicates combined with a connector&lt;/rdfs:comment&gt;
    &lt;rdfs:domain rdf:resource="#Connector"/&gt;
    &lt;rdfs:range rdf:resource="#Predicate"/&gt;
    &lt;rdfs:range rdf:resource="#Connector"/&gt;
    &lt;!-- this last range definition enables recursion --&gt;
  &lt;/Predicate&gt;
  &lt;rdfs:Class ID="Connector" /&gt;
  &lt;rdfs:Class ID="And"&gt;
    &lt;rdfs:subClassOf rdf:resource="#Connector" /&gt;
  &lt;/rdfs:Class&gt;
  &lt;rdfs:Class ID="Or"&gt;
    &lt;rdfs:subClassOf rdf:resource="#Connector" /&gt;
  &lt;/rdfs:Class&gt;
  &lt;rdfs:Class ID="Not"&gt;
    &lt;rdfs:subClassOf rdf:resource="#Connector" /&gt;
  &lt;/rdfs:Class&gt;
  &lt;rdfs:Class ID="Predicate" /&gt;
  &lt;Predicate ID="Variable"&gt;
    &lt;rdfs:comment xml:lang="en"&gt;Variable within a predicate&lt;/rdfs:comment&gt;
    &lt;rdfs:domain rdf:resource="#Predicate"/&gt;
    &lt;rdfs:range rdf:resource="http://www.w3.org/FictionalSchemas/useful_types#String"/&gt;
  &lt;/Predicate&gt;
  &lt;Predicate ID="Constant"&gt;
    &lt;rdfs:comment xml:lang="en"&gt;Constant within a predicate&lt;/rdfs:comment&gt;
    &lt;rdfs:domain rdf:resource="#Predicate"/&gt;
    &lt;rdfs:range rdf:resource="http://www.w3.org/FictionalSchemas/useful_types#String"/&gt;
  &lt;/Predicate&gt;
&lt;/RDF&gt;
</PRE>
<P>
<HR>
<P>
</BODY></HTML>