<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content=
"HTML Tidy for Mac OS X (vers 31 October 2006 - Apple Inc. build 13), see www.w3.org" />
<title>
Rules and Facts -- Axioms of Web architecture
</title>
<link rel="Stylesheet" href="di.css" type="text/css" />
<meta http-equiv="Content-Type" content="text/html" />
</head>
<body bgcolor="#DDFFDD" text="#000000">
<address>
Tim Berners-Lee<br />
Date: 1998, last change: $Date: 2009/08/27 21:38:09 $<br />
Status: personal view only. Editing status: first draft.
</address>
<p>
<a href="./">Up to Design Issues</a>
</p>
<h3>
Axioms of Web Architecture: n
</h3>
<hr />
<h1>
Rules and Facts: Inference engines vs Web
</h1>
<p>
An attempt to explain part of the relationship
between the Semantic Web and inference engines, whether
existing or legacy, and to discuss the relationship between
inference rules and logical facts.
</p>
<p>
The Semantic Web is a universal space for anything which can
be expressed in classical logic. In the world of Knowledge
Representation (KR) there are many different systems, and the
following is an attempt to generalize.
</p>
<p>
Each system typically has a distinction between data and
rules. The data is a pool of information in one language
(sometimes a very simple one without negation, like basic RDF).
The rules control the inference steps which the inference engine
makes. The rules are written in a restricted language so as
to preserve some computability property. Algernon, for example,
restricts its rules to forward chaining but assures Socratic
completeness.
</p>
<p>
When integrating rules with the semantic web, one must
realize that a rule contains two separate pieces of
information. Take a rule in a certain inference system
</p>
<p>
g(a,c) |= d(a,b) & d(b,c)
</p>
<p>
which is defined to mean "whenever you find a new
relationship where any a is the daughter of some b, then if
for that b there is any c for which b is the daughter of c,
then conclude that a is the granddaughter of c". Here,
"conclude" means add to the database. This is a procedural
instruction.
</p>
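<p>
(As a sketch only, tied to no particular engine: the procedural
reading of this rule is a simple forward-chaining step. In the
Python below, the facts and function names are invented for
illustration.)
</p>
<pre>
# Forward-chaining sketch for the rule  g(a,c) |= d(a,b) & d(b,c)
# A "daughter" fact is a pair (a, b): a is the daughter of b.

def forward_chain(daughter_facts):
    """Add every granddaughter fact the rule licenses."""
    granddaughter = set()
    for (a, b) in daughter_facts:
        for (b2, c) in daughter_facts:
            if b == b2:                    # chain d(a,b) with d(b,c)
                granddaughter.add((a, c))  # "conclude": add to the database
    return granddaughter

facts = {("alice", "betty"), ("betty", "carol")}
print(forward_chain(facts))  # {('alice', 'carol')}
</pre>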
<p>
It involves an out-of-band decision (maybe by a person) as to
whether all granddaughter relationships should be added to
the database the moment they can be, or whether the
relationship would only be used when a query is
made. This rule can be exchanged between two inference
engines of the same type, but it does not as a rule make
sense to anyone else.
</p>
<p>
In fact, of course, this rule would be nonsense if it were
not for the fact in classical logic that
</p>
<p>
∀a,b,c g(a,c) &lt;= d(a,b) & d(b,c)
</p>
<p>
This fact, unlike the rule, can be directly expressed in the
semantic web language. When the rule is used in deducing
something, it is this fact which forms a step in the
proof. Every semantic web proof validator will be able to
handle it.
</p>
<p>
Exposing rules as classical logic facts strips the
(pragmatically useful) hint information which controls the
actual sequence of operation of a local inference engine.
When the facts corresponding to all the rules of all the
inference engines are put onto the web, then the great thing
is that all the knowledge is represented in the same space.
The drawback is that there is no one inference engine which
can answer arbitrary queries. But that is not a design goal
of the semantic web. The goal is to unify everything which
can be expressed in classical logic (including more
mathematics when we get to it) without further constraint. We
must be able to describe the world, and our hopes and needs
and terms and conditions. A system which tries to constrain
the expressive power cannot be universal.
</p>
<h2>
Non-monotonic "logics"
</h2>
<p>
Now there are some systems which in fact use classical logic
directly, and others, "non-monotonic logics", in which adding
a fact can change something which was previously "believed
true" to being "believed false". (Describing them as logics
may be regarded by some as questionable.) For example, given
that "birds can fly", the system will believe that Pingu can
fly because Pingu is a penguin and a penguin is a bird,
until it is told that penguins can't fly. Then it will
assume that all birds can fly except for penguins. Such
systems use concepts of "defaults" -- things to be assumed
unless one is told otherwise. They are fundamentally
closed-world systems, in that the concept of "belief" is
always implicitly made with respect to a given closed set of
facts.
</p>
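<p>
(A caricature of such a default in a few lines of Python - a sketch
only, with invented names: the answer for Pingu flips when the
penguin exception is added to the closed set of facts.)
</p>
<pre>
# Non-monotonic default: "birds can fly unless told otherwise".
subclass_of   = {"penguin": "bird"}   # a penguin is a bird
flies_default = {"bird": True}        # default belief about birds
exceptions    = set()                 # classes known not to fly

def can_fly(kind):
    # Walk up the class hierarchy; exceptions override the default.
    while kind is not None:
        if kind in exceptions:
            return False
        if flies_default.get(kind):
            return True
        kind = subclass_of.get(kind)
    return False

print(can_fly("penguin"))   # True  - believed, given the facts so far
exceptions.add("penguin")   # new fact: penguins can't fly
print(can_fly("penguin"))   # False - the earlier belief is withdrawn
</pre>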
<p>
One can export such information into the semantic web in two
ways. One can export the rule system specifically, ending up
with a statement of the form "there is an assertion of birds
being able to fly which is unchallenged in the xxxx corpus
by any assertion contradicting that which applies to birds or
any other superclass of penguins". This effectively is a
reification of the non-monotonic system, an analysis not of
penguins but of the inference system and what its state is.
This may be so unwieldy that it is only useful to systems
which use the same inference system. The second way to export
the data is to just record the classical logic statement as
the output of the inference engine: "The xxxx system has
output that Pingu can fly." In certain cases, a system might
risk incorporating such statements into a classical inference
system. This is the logical equivalent of declaring, "Well, I
don't think such a book exists because it wasn't in
Blackwell's catalog". We do such things all the time, but a secure
system is unlikely to be set up to incorporate such
information. (A more secure system would, for example, given
the publisher and year, find a definitive list from the
publisher of books published in that year, which would allow
it to prove that such a book did not exist.)
</p>
<p>
The choice of classical logic for the Semantic Web is not an
arbitrary choice among equals. Classical logic is the only
way that inference can scale across the web. There are some
logics which simply do not have a consistent set of axioms -
fuzzy logic, for example, tends to believe something to a
greater extent as a function of how often evidence for it has
been presented. Closed world systems don't scale because the
reference to the scope of a default is typically implicit, and
different from one fact to another. When a fact is presented
as a fact, the "Oh yeah?" function of demanding justification
can be satisfied by a proof in a universal language of proof.
Non-classical heuristic systems may have been used to
discover the proof, but once the proof has been found it can
be checked as valid by any semantic web system.
</p>
<p>
In the diagram, I have put heuristic systems above the
semantic web bus, and classical systems below. In the later
chapters of Weaving the Web I try to describe the importance
of the web in supporting both types of system.
</p>
<hr />
<p>
[thanks to Lynn Stein/LCS for raising and largely answering
the question of non-monotonic logics]
</p>
<h2>
Inconsistent data
</h2>
<p>
What, they say, will happen when this huge mass of classical
logic meets its first inconsistency? Surely, once you have one
statement that A and another somewhere on the web that not A,
then doesn't the whole system fall apart? Surely, then you
can deduce anything?
</p>
<p>
This fear of course is quite valid - or would be if all
assertions in the whole world were regarded as being on an equal
footing. Some imagine that an RDF parser will simply search
all XML documents on the web for any facts, and add them to a
massive set of believed assertions. This is not how realistic
systems will actually work.
</p>
<p>
On the web, a fact may be asserted in an expression. That
expression may be part of a formula. The formula may involve
negation, and may involve quotation. The whole formula is
found by parsing some document. There is no a priori reason
to believe any document on the web. The reason to believe a
document will be found in some information (metadata) about
the document. That metadata may be an endorsement of the
document - another RDF statement, which in turn was found in
another document, and so on.
</p>
<p>
A real system may work backwards or forwards (or both). I
would call working forwards a system which is given a
configuration page to work from, which in turn points to other
pages which in turn are used as valid data. I would call
working backwards a system which, when looking for an answer
to a query, looks at a global index to find any document at
all which mentions a given term. It then searches the
documents turned up for answers to the query. Only when it
has found an answer does it check back to see whether the data
can be derived directly or indirectly from sources it has
been set up to trust.
</p>
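<p>
(The backward check might be sketched as follows - hypothetical
URLs and data structures; a real system would dereference documents
and parse the RDF they contain.)
</p>
<pre>
# Sketch: a statement found in `doc` is believed only if a chain of
# endorsements leads back to a source the system was set up to trust.
trusted_roots = {"http://example.org/config"}       # configured trust
endorsed_by = {                                     # doc -> endorsing doc
    "http://example.org/data":  "http://example.org/index",
    "http://example.org/index": "http://example.org/config",
}

def is_believed(doc, seen=None):
    seen = seen or set()
    if doc in trusted_roots:
        return True
    if doc in seen:              # guard against endorsement cycles
        return False
    endorser = endorsed_by.get(doc)
    return endorser is not None and is_believed(endorser, seen | {doc})

print(is_believed("http://example.org/data"))   # True: chains to config
</pre>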
<p>
Digital signature (see trust) of course adds a notion of
security to the whole process. The first step is that a
document is not endorsed without giving the checksum it had
when believed. The second step is to specify more powerful
rules of the form
</p>
<blockquote>
<p>
"whatever any document says so long it is signed with key
57832498437".
</p>
</blockquote>
<p>
In practice, particular authorities are trusted only for
specific purposes. The semantic web must support this. You
must be able to restrict the information believed along the
lines of,
</p>
<blockquote>
<p>
"whatever any document says of the form xxxx is a meber of
W3C so long as it is signed wiht key 32457934759432".
</p>
</blockquote>
<p>
or, for example,
</p>
<blockquote>
<p>
"whatever any document says of the form "a is an employee
of IBM" so long as it is signed by with key 3213123098129".
</p>
</blockquote>
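<p>
(Such a restricted rule might be approximated as a filter over
statement/key pairs - purely illustrative Python, reusing the key
value from the example above; the pattern match stands in for "of
the form xxxx is a member of W3C".)
</p>
<pre>
# Believe a statement only when it matches the stated form AND is
# signed with the expected key (the key value is a placeholder).
import re

MEMBERSHIP_KEY = "32457934759432"
MEMBER_PATTERN = re.compile(r"^\w+ is a member of W3C$")

def believe(statement, signing_key):
    """Accept only 'xxxx is a member of W3C' under the membership key."""
    return (signing_key == MEMBERSHIP_KEY
            and MEMBER_PATTERN.match(statement) is not None)

print(believe("ExampleCorp is a member of W3C", MEMBERSHIP_KEY))  # True
print(believe("ExampleCorp owes me money", MEMBERSHIP_KEY))       # False
</pre>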
<p>
There is a choice here, and I am not sure right now which
appeals to me most. One is to say precisely,
</p>
<blockquote>
<p>
"whatever any document <em><strong>says</strong></em> of
the form xxxx is a member of W3C so long as it is signed
with key 32457934759432".
</p>
</blockquote>
<p>
The other is to say,
</p>
<blockquote>
<p>
"whatever is of form xxxx and <em><strong>can be
inferred</strong></em> from information signed with key
32457934759432"
</p>
</blockquote>
<p>
In the first case, we are making an arbitrary requirement for
a statement to be phrased in a particular way. This seems
unnecessarily bureaucratic, and more difficult to treat
consistently. Normally we like to be able to replace any set of
formulae with another set which can be deduced from it.
However, in this case we have to preserve the actual form in
case we need to match it against a pattern. This is very
messy.
</p>
<p>
In the second case, we fall prey to the inconsistency trap.
Once any pair of conflicting statements can be deduced from
information signed with a given key, then anything can be
deduced from information signed with the key: the key is
completely broken. Of course, only that key is broken, so a
trust system can remove any reason it has to trust that key.
However, the attacked system may not realize what has
happened before it has been convinced that the sun rises in
the west.
</p>
<p>
Is there a way to limit the domain of trust in a key while
allowing information to be processed in a consistent way
throughout the system? Yes - maybe - there are many. Each KR
system which uses a limited logic does so in order (partly)
to solve this problem. We just qualify "can be inferred" by
the type of inference rules which may be used. This means the
generic proof engine either has to work through a reified
version of the rules, or it has to know the rule sets - to
incorporate each proof engine. Maybe we only need one.
</p>
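<p>
(One way to picture that qualification, as a sketch with invented
names: the trust statement carries not just a key but the set of
inference rules under which conclusions from that key's data are
accepted.)
</p>
<pre>
# Trust in a key, qualified by the inference rules allowed. A
# conclusion is believed only if every step of its proof used a rule
# licensed for that key (rule names here are illustrative).
allowed_rules = {"key-32457934759432": {"subclass", "conjunction"}}

def accept_proof(key, proof_steps):
    """proof_steps: the rule names used to derive the conclusion."""
    licensed = allowed_rules.get(key, set())
    return all(step in licensed for step in proof_steps)

print(accept_proof("key-32457934759432", ["subclass"]))             # True
print(accept_proof("key-32457934759432", ["negation-as-failure"]))  # False
</pre>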
<hr />
<p>
<a href="Overview.html">Up to Design Issues</a>
</p>
<p>
<a href="../People/Berners-Lee">Tim BL</a>
</p>
</body>
</html>