<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content=
"HTML Tidy for Mac OS X (vers 31 October 2006 - Apple Inc. build 13), see www.w3.org" />
<title>
Paper Trail: Web architecture ideas
</title>
<link href="di.css" rel="stylesheet" type="text/css" />
</head>
<body bgcolor="#DDFFDD" text="#000000" xml:lang="en" lang="en">
<address>
Tim Berners-Lee
<p>
Date: February 1999. Last modified: $Date: 2004/04/20
19:21:17 $
</p>
<p>
Status:
</p>
</address>
<address>
<p>
An example of how a social machine can be made without a
center. Editing status: Draft. Comments welcome
</p>
</address>
<p>
<a href="Overview.html">Up to Design Issues</a>
</p>
<h3>
Ideas about future Web architecture
</h3>
<hr />
<h1>
Paper Trail
</h1>
<p>
Here we look at the relationship between documents (living or
dead but basically bits of state) and messages (events with
associated data, including typically but not essentially
sender and recipient).
</p>
<p>
Here is a proposal for a project: a "Paper trail" state machine
for workflow. The concept here is that the state of any
transaction is, in the real world (and in this formalization,
in the Web), just a function of all the messages which form part
of a protocol.
</p>
<blockquote>
<h3>
Epilogue (2001/05)
</h3>
<p>
The <a href="/2001/01/WSWS">Web Services workshop</a>, in
discussing transactions over the Net, surfaced the need for
process flow descriptions.
</p>
<h3>
Update (2004/03)
</h3>
<p>
The <a href="/2000/10/swap/">Semantic Web Application
Platform (SWAP)</a> now has enough functionality to
implement these ideas. See <a href=
"/2000/10/swap/ppt-bank/">ppt-bank</a>, especially <a href=
"/2000/10/swap/ppt-bank/checking.n3">checking.n3</a>.
</p>
</blockquote>
<h2>
Introduction
</h2>
<p>
Social processes look like state machines. However, they
don't exist as a state variable stored in one place, but as a
trail of documents. You know the true state of the machine
only if you have access to the latest documents. (This is not
the problem addressed here; this is real life being
modelled.) <em>Paper-trail</em> is a system which allows one
to follow a strict process by creating new documents in a
constrained fashion. Every paper-trail document has a pointer
to a "paper-trail schema" which defines its document type (e.g.
"constitutional amendment"), a pointer to its justification
documents, and (maybe) a notarization of when it was checked
against the schema by the paper-trail program. The schema
defines:
</p>
<ul>
<li>Prerequisites for a document being valid, in terms of
other documents
</li>
<li>Hints to other document types you can make from this one
(state transitions)
</li>
</ul>
<h3>
Example
</h3>
<blockquote>
<p>
To make a new W3C working draft, the schema requires
pointers to the old working draft, the new document, and the
editor's authorization. The editor must be defined as editor on
the home page of the working group, where the working group page
is pointed to by the old draft. If all those exist, then the new
document is created from all that and notarized (time
stamped) by the software. The human-readable part of the
document is created as a (simple macro) function of the
input documents. A document also has buttons to take you
to a form to turn it into another type of document
according to hints in the schema.
</p>
</blockquote>
<h3>
Example
</h3>
<blockquote>
<p>
A button on a Working Draft takes you to a form for
promoting it to a "proposed recommendation". This requires
different things (all the above plus endorsement of the new
draft by the director or any two members of the management
group).
</p>
</blockquote>
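<p>
The two examples above can be pictured with a minimal sketch in
Python. Everything named here (<code>Schema</code>,
<code>Document</code>, <code>create</code>, and the document type
strings) is invented for illustration, and the check that the
editor is listed on the working group home page is left out.
</p>
<pre>
```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Schema:
    """Hypothetical paper-trail schema for one document type."""
    doc_type: str
    prerequisites: tuple = ()   # document types that must be cited
    hints: tuple = ()           # document types that can be made from this one


@dataclass(frozen=True)
class Document:
    schema: Schema
    justifications: tuple = ()                # pointers to supporting documents
    notarized_at: Optional[datetime] = None   # set only by the checker


def create(schema, justifications):
    """Check the prerequisites, then return a notarized (time-stamped) document."""
    supplied = {doc.schema.doc_type for doc in justifications}
    missing = [p for p in schema.prerequisites if p not in supplied]
    if missing:
        raise ValueError(f"missing prerequisites: {missing}")
    if any(doc.notarized_at is None for doc in justifications):
        raise ValueError("every justification must itself be notarized")
    return Document(schema, tuple(justifications), datetime.now(timezone.utc))


AUTHORIZATION = Schema("editor authorization")
NEW_TEXT = Schema("new document")
WORKING_DRAFT = Schema(
    "working draft",
    prerequisites=("working draft", "new document", "editor authorization"),
    hints=("proposed recommendation",),
)

# Assumption: the very first draft pre-exists, so it is built by hand here.
old_draft = Document(WORKING_DRAFT, (), datetime(1999, 2, 1, tzinfo=timezone.utc))
new_draft = create(
    WORKING_DRAFT,
    [old_draft, create(NEW_TEXT, []), create(AUTHORIZATION, [])],
)
print(new_draft.schema.hints)   # the state transitions a GUI could offer as buttons
```
</pre>
<p>
A "proposed recommendation" schema would, in the same style, list
the working draft and the endorsements among its prerequisites.
</p>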
<h2>
Technology
</h2>
<p>
If you are considering this as a student project, consider
these directions:
</p>
<ul>
<li>Use RDF within the document to express its state.
</li>
<li>Develop declarative language for defining the
prerequisites - ideally in RDF too.
</li>
<li>Develop GUI for creating a new document by supplying the
prerequisites
</li>
<li>Allow hooks for digital signatures, but you don't have to
implement them
</li>
</ul>
<h2 id="Generalizi">
Generalizing for formal protocols
</h2>
<p>
The concept of a paper trail is common in conventional
administration, but the model can also be applied to
well-defined computer protocols.
</p>
<h2 id="Model">
Model
</h2>
<p>
The model is that a protocol P defines a state s<sub>n</sub>
as a function of a message m<sub>n</sub>, the previous state
s<sub>n-1</sub>, and the time t.
</p>
<p>
s<sub>n</sub>= P(m<sub>n</sub>, s<sub>n-1</sub>, t)
</p>
<p>
or for that matter as a function of all the messages to date
</p>
<p>
s<sub>n</sub>= P'({m<sub>i</sub>}<sub>i=1..n</sub>)
</p>
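<p>
The two forms are related by a fold over the message list. Here is
a minimal sketch in Python, assuming (this is not in the formula
above) that each message carries its own timestamp; the toy state
and the names <code>S0</code>, <code>P</code> and
<code>P_prime</code> are illustrative only.
</p>
<pre>
```python
from functools import reduce

S0 = {"seen": 0, "last_time": None}   # generic initial state


def P(message, state, t):
    """One protocol step: the new state from a message, the old state, and the time."""
    return {"seen": state["seen"] + 1, "last_time": t}


def P_prime(messages):
    """The same state, computed as a function of all the messages to date."""
    return reduce(lambda state, m: P(m["body"], state, m["time"]), messages, S0)


messages = [{"body": "a", "time": 1}, {"body": "b", "time": 2}]

# Stepping one message at a time ...
state = S0
for m in messages:
    state = P(m["body"], state, m["time"])

# ... gives the same answer as folding over the whole trail.
print(state == P_prime(messages))   # True
```
</pre>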
<p>
The state could be a logical formula, an RDF graph, or an XML
document, or just a number, in decreasing order of interest.
The system can be any one of a number of types of machine,
including the well-known finite state machine and the push-down
automaton.
</p>
<p>
In an XML world, think of the state and the messages all
being expressed in XML, and the protocol maybe being an XSLT
script.
</p>
<p>
The state must record everything necessary for calculating
future states for any new message. It could also record the
results of the protocol. For example, the state of TCP (where
IP packets are the {m}) must hold the state of the packets
unacknowledged in the sliding window, but when the connection
has been successfully closed it could hold either just
"terminal state", or also the ordered set of bytes
transferred in the connection.
</p>
<p>
The protocol function can be seen as an information
destroying function. By specifying what needs to be
remembered, it defines what can be thrown away. This is of
course very important. One might in some cases
still want to spool the messages for security, but the actual
information needed to describe the state of affairs is
limited.
</p>
<p>
Typically, to be valid, messages will link back to previous
messages either directly or through common threading
identifiers of some sort. A message without such a reference
will in most cases not have any effect on the state.
</p>
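<p>
As a small sketch of that rule in Python: a message is applied
only if it points back to a message already in the trail. The
field names <code>id</code>, <code>in_reply_to</code> and
<code>body</code> are assumptions made for the example.
</p>
<pre>
```python
def step(state, message):
    """Apply one message; messages outside the thread leave the state alone."""
    if message.get("in_reply_to") not in state["ids"]:
        return state
    return {
        "ids": state["ids"] | {message["id"]},
        "bodies": state["bodies"] + [message["body"]],
    }


# State after a hypothetical opening message with identifier "m1".
state = {"ids": frozenset({"m1"}), "bodies": ["offer"]}
state = step(state, {"id": "m2", "in_reply_to": "m1", "body": "accept"})
state = step(state, {"id": "m9", "in_reply_to": "zz", "body": "spam"})
print(state["bodies"])   # ['offer', 'accept']
```
</pre>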
<p>
There will in general be error states, which the protocol
does not allow, and to which any message that is invalid in some
way will lead. Functionally there need only be one error
state, but in practice one might want to preserve the state
before the error and details of the error. Some protocols
model most errors themselves by sending.
</p>
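<p>
A sketch of an error state which keeps the state before the error
and the details, in Python. The toy protocol (adding up integer
messages) and the class names are invented, and the choice that an
error state absorbs all later messages is an assumption.
</p>
<pre>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ok:
    value: int


@dataclass(frozen=True)
class Error:
    before: Ok     # the state before the error
    detail: str    # details of the error


def step(state, message):
    """Toy protocol: valid messages are integers, which are summed."""
    if isinstance(state, Error):
        return state   # assumption: once in error, stay there
    if not isinstance(message, int):
        return Error(before=state, detail=f"not an integer: {message!r}")
    return Ok(state.value + message)


bad = step(step(Ok(0), 3), "three")
print(bad.before, bad.detail)
```
</pre>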
<p>
There must obviously be a set M<sub>0</sub> of valid ways to
start a protocol in the first place from the generic initial
state s<sub>0</sub>. For example, in TCP one sends a SYN
message; on the telephone one picks up the receiver. For any
m in M<sub>0</sub>, P(m, s<sub>0</sub>) will be a valid
(non-error) state.
</p>
<p>
There will in some systems be a set F of final states, in
which no further messages can have any effect on the state.
For any s in F, P(m,s) = s for all m.
</p>
<p>
For example, in the US, when 7 years have passed since a
transaction occurred, then all records may be discarded as no
one, not even the tax man, has the right to query them. The state
is reduced to a minimum. Most systems can be modelled in a
simple or complex way, the simple way ignoring a lot of the
auditing processes for example. A simple model of a loan
between two people has a state which is the balance amount
and one final state when that is zero. Other systems are
designed to remain in a non-final state: a lifetime warranty is
a protocol which remains in a non-final state (until you die!),
waiting for any message that you are dissatisfied with the
product.
</p>
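<p>
The simple loan model can be written down directly. This Python
sketch assumes that the messages are repayments in whole currency
units and that an overpayment is clamped at zero; neither detail
comes from the description above.
</p>
<pre>
```python
from functools import reduce

FINAL_STATES = {0}   # the loan is over once the balance reaches zero


def loan(balance, payment):
    """One step of the loan protocol: apply a repayment to the balance."""
    if balance in FINAL_STATES:
        return balance                   # P(m, s) = s for every s in F
    return max(balance - payment, 0)     # assumption: overpayment is clamped


print(reduce(loan, [40, 60], 100))   # 0
print(loan(0, 25))                   # 0
```
</pre>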
<p>
Real systems are part of bigger systems, and so the real
protocol will function as part of a larger protocol. For
example, a working group at W3C goes through many internal
state changes, and (on a simple model) the last is when their
work is accepted by the Consortium as a whole as a
Recommendation. This is a message leaving the system, which
forms part of the larger protocol. Modelling this is clearly
interesting. (To demonstrate this nesting by an example of it
breaking, think of the case of a working group not arriving
at consensus and passing on not only a final document but
also a minority report, basically a peek into the internal
workings of the group, which did not in fact arrive in its
final state.) This would include modelling tasks which can
split, and be recursively delegated, and so on.
</p>
<h2>
Cool things
</h2>
<p>
This system can allow well-defined social processes to work
e.g. on a net newsgroup, or by email; i.e., it works in a
write-only medium.
</p>
<p>
It models real life in commerce well, where the state really
is an abstract thing and one's perception of it depends on
the set of messages one has had access to.
</p>
<p>
Hopefully we can use this model to define systems which are
even more powerfully distributed than any we use at the
moment.
</p>
<h2 id="Linking">
Linking Remote operations and Data Formats
</h2>
<p>
I must have discussed the relationships between remote
operations and data formats before. Maybe I have made a table
with schema languages compared against interface definition
languages, and so on.
</p>
<p>
Now we have a clear way of expressing the relationship
between the two. A Protocol definition document defines a
document as a function of messages, which can be represented
as documents - so we can look at remote operations in terms
of documents. Typically RPC messages are very constrained:
this model allows much more complicated multi-party protocols
to be defined.
</p>
<h2>
Challenges if you finish early
</h2>
<p>
If making a paper trail machine was fun, here are some more
ideas.
</p>
<ul>
<li>Add time-aware social processes such as promises and
timeouts.
</li>
<li>Do you need to be able to prove non-existence of
documents? Locally to an author or globally?
</li>
<li>States can split. (draft can go to W3C or IETF process or
both. How can you limit this, when socially undesirable?)
</li>
<li>Develop proofs that processes will achieve given ends.
</li>
<li>Model processes near you:
<ul>
<li>auction
</li>
<li>peer review journal
</li>
<li>presidential impeachment ;-)
</li>
<li>internet newsgroup creation
</li>
<li>formation of a company
</li>
<li>MIT purchasing (possible PhD thesis ;-)
</li>
</ul>
</li>
<li>Develop theories in which players are
<ul>
<li>collaborative
</li>
<li>competitive
</li>
<li>allowed to create new schemas to achieve their ends
</li>
</ul>
</li>
<li>Model existing systems near you:
<ul>
<li>TCP
</li>
<li>HTTP...
</li>
</ul>
</li>
<li>Develop a protocol machine, which, acting on behalf of
one agent, will determine when that agent has a possible move
to make, and when in fact the protocol is acting for that
agent. Develop a GUI which helps a human user choose from the
set of possible options at that state of the protocol.
</li>
</ul>
<h2 id="Products">
Products
</h2>
<p>
The thing which would come out of this idea would, I imagine,
be a standard language for writing protocols. Of course, it
would mainly be something else, such as an rdf-logic
language, or Prolog or whatever, but there would have to be
hooks to define it to be a definition of a protocol.
</p>
<p>
This takes the self-describing web concept into a new area:
that messages are self-describing in that they contain a
pointer to the language in which they are written, and that
includes (or points to) the protocol to which they claim to
adhere.
</p>
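<p>
In miniature, a self-describing message might look like the
following Python sketch. The <code>protocol</code> field, the
example.org URI and the local registry are all assumptions; a real
system would presumably dereference the pointer to fetch the
protocol definition instead of looking it up in a table.
</p>
<pre>
```python
def count_step(state, message):
    """Toy protocol: the state is the number of messages seen."""
    return state + 1


# Registry keyed by protocol identifier; the URI is a made-up example.
PROTOCOLS = {"http://example.org/protocols/count": count_step}


def apply_message(state, message):
    """Look up the protocol the message claims to adhere to, then apply it."""
    step = PROTOCOLS.get(message["protocol"])
    if step is None:
        raise LookupError(f"unknown protocol: {message['protocol']}")
    return step(state, message)


message = {"protocol": "http://example.org/protocols/count", "body": "hello"}
print(apply_message(0, message))   # 1
```
</pre>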
<p>
@@ Add pointers to work done with Notation3
</p>
<hr />
<p>
<a href="Overview.html">Up to Design Issues</a>;
</p>
<p>
Thanks for some fun discussions with Dan Connolly about these
ideas.
</p>
</body>
</html>