<HTML>
<HEAD>
<TITLE>Document Management for Web Specs</TITLE>
</HEAD>
<BODY>
<P>
<!-- context info -->
<A HREF="../../"><IMG alt="WWW" src="http://www.w3.org/hypertext/WWW/Icons/WWW/WWWlogo48.gif"></A>
<A href="../"> <IMG src="../../Icons/WWW/html_48x48.gif" ALT="MarkUp"></A>
| <A HREF="./">SGML</A>
<H1>
Document Management for Web Specs
</H1>
<P>
Aaargh! Maintaining specs is a Royal Pain! We need to automate this!
<P>
See also:
<UL>
<LI>
<A HREF="http://lists.w3.org/Archives/Public/spec-prod/">spec-prod@w3.org
Mail Archives</A>
<LI>
<A NAME="xmlspec" HREF="../../XML/#xml-spec">W3C XML Specification DTD
(“XMLspec”)</A>
<LI>
<A HREF="http://dri.cornell.edu/pub/davis/html-parser.html">Jim Davis's HTML
parser</A>, with RFC generator
<LI>
<A HREF="../../People/Connolly/drafts/html-design">HTML design notebook</A>,
with list of implementations
<LI>
<A HREF="../../implementations">HTML parser implementations</A> (needs beefing
up)
</UL>
<H2>
Requirements
</H2>
<UL>
<LI>
Single Source Format (for each spec, if not a common format for all specs)
<LI>
PDF Output
<LI>
EPSF Figures (reduced to SVG/PNG for online)
<LI>
Tables
<LI>
HTML output
<LI>
Plain Text output according to IETF formatting guidelines
<LI>
Automatic TOC generation, section numbering
</UL>
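<P>
The TOC and section-numbering requirement can be sketched in a few lines of
Python. This is only an illustration, not an existing tool: the names
number_sections and render_toc and the "sec-N" anchor names are made up for
the example, and it assumes well-formed H2 through H6 headings with H1
reserved for the document title.

```python
import re

# Matches H2..H6 headings; H1 is treated as the document title and skipped.
HEADING = re.compile(r"<h([2-6])\b[^>]*>(.*?)</h\1>", re.IGNORECASE | re.DOTALL)


def number_sections(html):
    """Return (numbered_html, toc); toc is a list of (number, title) pairs."""
    counters = [0] * 5  # one counter per heading level, H2..H6
    toc = []

    def repl(match):
        level = int(match.group(1)) - 2           # H2 -> index 0
        title = " ".join(match.group(2).split())  # collapse whitespace
        counters[level] += 1
        for deeper in range(level + 1, len(counters)):
            counters[deeper] = 0                  # restart deeper levels
        number = ".".join(str(n) for n in counters[: level + 1])
        toc.append((number, title))
        tag = "H%d" % (level + 2)
        return '<%s><A NAME="sec-%s">%s %s</A></%s>' % (
            tag, number, number, title, tag)

    return HEADING.sub(repl, html), toc


def render_toc(toc):
    """Render the collected headings as a flat list of links."""
    items = ['<LI><A HREF="#sec-%s">%s %s</A>' % (num, num, title)
             for num, title in toc]
    return "<UL>\n" + "\n".join(items) + "\n</UL>"
```

<P>
As written, the sketch drops any attributes on the heading tags and would
number a document twice if run twice; a real tool would have to strip the
numbers from a previous run first.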
<H2>
Goals
</H2>
<UL>
<LI>
Open standard source format
<LI>
HTTP-based document management (PUT, version control, ...)
<LI>
Direct Manipulation Rich-Text Editing, ala FrameMaker, MS Word
<LI>
Direct Manipulation HyperLink Editing, ala Nexus (navipress is close)
<LI>
Other Automated navigation structures: index, cross references, glossary,
references
<LI>
FrameMaker interoperability
<LI>
MS Word Interoperability
<LI>
Emacs/vi interoperability (for low-bandwidth situations)
<LI>
LaTeX interoperability (nice typesetting)
<LI>
Version control with change logs
<LI>
Meaningful diffs
</UL>
<H2>
Wishes
</H2>
<UL>
<LI>
Annotation support (for writer's comments, group comments, public comments)
<LI>
"structured sed" -- API for document manipulation (e.g. for TOC generation,
glossary, etc.)
<LI>
<A NAME=automated-bibliography>BibTeX/refer-like database</A>. Here's how
it works:
<UL>
<LI>
The schema for the database is modeled after bibtex/refer: class (thesis,
techreport, etc.), title, author, date, abstract ...
<LI>
Each record also can be marked "surrogate," meaning the authoritative source
is somewhere else: the IETF abstracts file, a W3C tech report, etc.
<LI>
To refer to an entry in a document, we use some stylized markup. For example,
in the head: &lt;link rel=bibliography href="/Bibliography?"&gt;. Then, at
the point of the reference, some sort of transclusion link markup: &lt;a
rel="embed"
href="/Bibliography?id=draft-ietf-http-v10-4;fields=title,author,date,status,abstract"&gt;
... &lt;/a&gt; (we might need to use RANGE or AS/AE to avoid nested A elements)
<LI>
The server checks documents at PUT time. When it sees such a reference, it
consults the bibliography database (which might involve updating the bib
DB from the ietf drafts index) and fills in the appropriate fields.
<LI>
Voilà! We never need to manually cite documents again. Not only are the citations
reusable (in specs, overviews, etc.), but the database can be a valuable
browsing/searching resource in and of itself.
</UL>
<LI>
Sharable elements/entities (boilerplate, cross-references)
<LI>
Author-chunks distinct from reader-chunks
<LI>
Equation support
<LI>
PowerPoint interoperability (goes beyond the scope of specs into presentations)
</UL>
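<P>
The bibliography wish above (fill in citation fields when a document is PUT)
might look roughly like this. It is a sketch under several assumptions: the
database is a plain in-memory dict with placeholder values, the names BIB_DB,
parse_query and fill_citations are invented for the example, and the embed
link is matched with a simple regular expression rather than a real parser.

```python
import re
from html import escape

# Hypothetical stand-in for the bibliography database; values are placeholders.
BIB_DB = {
    "draft-ietf-http-v10-4": {
        "title": "(sample title)",
        "author": "(sample author)",
        "date": "(sample date)",
    },
}

# An embed link: <a rel="embed" href="/Bibliography?...query..."> ... </a>
EMBED = re.compile(
    r'<a\s+rel="embed"\s+href="/Bibliography\?([^"]*)"\s*>.*?</a>',
    re.IGNORECASE | re.DOTALL,
)


def parse_query(query):
    """Split 'id=X;fields=a,b' into a dict, using ';' as the separator."""
    params = {}
    for part in query.split(";"):
        key, _, value = part.partition("=")
        params[key] = value
    return params


def fill_citations(html, db):
    """Replace the content of each embed link with the requested fields."""

    def repl(match):
        params = parse_query(match.group(1))
        record = db.get(params.get("id", ""))
        if record is None:
            return match.group(0)  # unknown id: leave the link untouched
        fields = [f for f in params.get("fields", "").split(",") if f]
        text = ", ".join(escape(record[f]) for f in fields if f in record)
        return '<a rel="embed" href="/Bibliography?%s">%s</a>' % (
            match.group(1), text)

    return EMBED.sub(repl, html)
```

<P>
Because the link itself is kept and only its content is rewritten, the server
could run the same step again on a later PUT to refresh a citation after the
database changes.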
<H2>
Possible Solutions
</H2>
<DL>
<DT>
<A href="../../Tools/Multiformat">Multiformat tools</A>
<DD>
This was used for the PNG spec
<DT>
FrameMaker, WebMaker, ??? print-to-text tool
<DD>
This is what Roy Fielding (and a lot of other folks) use.
<UL>
<LI>
+ Direct-Manipulation editing
<LI>
+ WYSIWYG PostScript output
<LI>
+ Automatic TOC, cross references, index, section numbering
<LI>
+ Automatic HTML output with TOC, chunking, navigation (prev/next/up)
<LI>
- generating plain-text is a bear
<LI>
- generating HTML requires a baroque toolset that we don't have a license
for.
<LI>
- need a Frame license
<LI>
- no way to edit over a telnet connection
<LI>
- no way to edit over an HTTP connection (must have local access to files)
</UL>
<DT>
HTML+, dsr's tools
<DD>
Dave Raggett edits the HTML with a text editor (mostly BBEdit on a Mac).
He's got some little tools written in C to produce plain text.
<UL>
<LI>
+ Automatic TOC, headers/footers in text
<LI>
+ renders HTML math in plain text output
<LI>
+ HTML is easy to edit over a telnet connection
<LI>
+ can work with Navipress to edit HTML via HTTP
<LI>
- no postscript output tools
<LI>
- text generation tool often requires manual post-processing
<LI>
- author chunks must be the same as reader chunks
<LI>
- manual maintenance of prev/next links in HTML
<LI>
- manual maintenance of cross-references, index, etc.
<LI>
- little structure in the source format (e.g. no explicit "Abstract" structure)
</UL>
<DT>
Snafu DTD, gf tools, Texi2HTML, COST, Joe English
<DD>
This is what I ended up using for HTML 2.0
<UL>
<LI>
+ structured source format
<LI>
+ open standard source format
<LI>
+ Automated postscript output in IETF format with TOC, headers, footers,
cross-references, References, section numbers, glossary
<LI>
+ Automated plaintext output with same features
<LI>
+ Automated HTML output with glossary, TOC, navigation links
<LI>
+ LaTeX output
<LI>
+ TeXinfo output
<LI>
+ RTF output
<LI>
+ meaningful diffs (cuz I edited the source with a text editor)
<LI>
+ low-bandwidth access (emacs over telnet works fine)
<LI>
- Tools require outside support (Joe English)
<LI>
- Tools are baroque
<LI>
- no WYSIWYG tools (perhaps SoftQuad Author/Editor?)
</UL>
<DT>
LinuxDoc
<DD>
<UL>
<LI>
- print-to-text tool isn't IETF happy
<LI>
- no direct-manipulation editing tools
</UL>
<DT>
LaTeX, latex2html, IETF print-to-text tools
<DD>
<DT>
MS Word, rtf2html, ??? print-to-text tool
<DD>
</DL>
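<P>
Plain-text generation is the sore spot in several of the options above. As a
rough sketch of the pagination step only, the following wraps paragraphs and
cuts them into fixed-length pages with a header and a page-number footer. The
72-column and 58-line limits are assumptions to be checked against the IETF
guidelines, and paginate is a made-up name, not part of any tool listed here.

```python
import textwrap

LINE_WIDTH = 72   # assumed maximum characters per line
PAGE_LENGTH = 58  # assumed maximum lines per page


def paginate(paragraphs, header, width=LINE_WIDTH, page_length=PAGE_LENGTH):
    """Wrap paragraphs and split them into pages separated by form feeds."""
    lines = []
    for para in paragraphs:
        lines.extend(textwrap.wrap(" ".join(para.split()), width))
        lines.append("")  # blank line between paragraphs

    body_length = page_length - 4  # room for header, footer and two blanks
    pages = []
    for start in range(0, len(lines), body_length):
        number = len(pages) + 1
        body = lines[start:start + body_length]
        body += [""] * (body_length - len(body))  # pad the last page
        footer = ("[Page %d]" % number).rjust(width)
        pages.append("\n".join([header[:width], ""] + body + ["", footer]))
    return "\f".join(pages)
```

<P>
Tables, figures and keeping section headings off the bottom of a page are all
left out; those are the parts that tend to need manual post-processing.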
<H2>
Ideal Solution
</H2>
<DL>
<DT>
Source format: HTML dialect
<DD>
use a strict HTML dialect with: tables, class=abstract, possibly math.
<DT>
Document Manipulation API: java interface
<DD>
formerly:
<BLOCKQUOTE>
There are lots of web libraries for python. We could eventually specify the
interfaces in ILU and use them from lots of languages (C, C++, java, scheme,
CommonLisp, Modula-3), but we'd prototype and develop using python.
<P>
I've already written little tools to do things like relativize links and
such. Rather than doing TOC generation, section numbering, etc. during
translation, we'd do it in-place in the source, but automatically.
</BLOCKQUOTE>
<P>
I changed my mind, since Java has at least the potential to address the
installation bugs. Plus, it looks like we can write for the Java VM in Scheme
(see Kawa).
<DT>
Chunking support: python scripts
<DD>
This would handle chunking many HTML documents into one for printing, and
many-to-many chunking for author/reader convenience.
<DT>
PostScript Output: python implementation of Mosaic print tool
<DD>
This code is already written. Guido translated the postscript printing code
from Mosaic into python. We could adapt things like headers/footers for our
needs. This eliminates the need for a TeX installation.
<DT>
Postscript Output: libwww TeX module?
<DD>
use HTTeXGen module in libwww to generate TeX. It doesn't currently support
all the features we need, but it could work. It would rely on a many-to-one
html-to-html filter
<DT>
Postscript Output: html2lout?
<DD>
lout is kinda like TeX, but it was written after the dawn of PostScript,
so there's less redundancy between lout and PS than between TeX and PS. The
syntax of lout is also cleaner. Lout has table, equation, etc. packages.
A clean html2lout filter should be much more reliable and hands-free than
anything based on TeX.
<DT>
Plain-Text output: custom python app?
<DD>
There is already Python code to do simple HTML-to-text formatting, but handling
multiple documents, tables, etc. needs to be added, as well as IETF style.
<DT>
Plain-Text output: libwww module?
<DD>
same feature enhancements would be needed.
</DL>
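<P>
The many-to-one half of the chunking idea could start out as simply as the
sketch below, which pulls the BODY out of each chunk and concatenates them
under one HEAD. The names body_of and merge_chunks are invented for the
example, and fixing up links between chunks is deliberately ignored.

```python
import re

# Contents of the BODY element, if the chunk has one.
BODY = re.compile(r"<body\b[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def body_of(html):
    """Return the BODY contents, or the whole text if there are no BODY tags."""
    match = BODY.search(html)
    return match.group(1).strip() if match else html.strip()


def merge_chunks(title, chunks):
    """Concatenate the bodies of several HTML chunks into one document."""
    parts = ["<HTML>", "<HEAD>", "<TITLE>%s</TITLE>" % title, "</HEAD>", "<BODY>"]
    for chunk in chunks:
        parts.append(body_of(chunk))
        parts.append("<HR>")  # visual break between chunks
    if chunks:
        parts.pop()           # no rule after the last chunk
    parts += ["</BODY>", "</HTML>"]
    return "\n".join(parts)
```

<P>
For printing, the merged output would then go to whichever PostScript route
is chosen. Links that point from one chunk to another would still need to be
rewritten as fragment links within the merged file.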
<H2>
Wish list
</H2>
<DL>
<DT>
Direct manipulation grammar editor
<DD>
for SGML DTDs, RFC822 grammars in HTTP specs, etc.
</DL>
<H2>
References
</H2>
<DL>
<DT>
<A href="http://www.inf.tu-dresden.de/~jw6/doc/sdc/index.html">SDC</A>
<DD>
structured document conversion.
<A href="http://www.comp.vuw.ac.nz/Technical/SGML/">in use at vuw.ac.nz</A>.
Gotta check it out...
<DT>
<A href="http://www.chiark.greenend.org.uk/~ijackson/debiandoc-sgml-markup/">Debiandoc-SGML
markup manual</A>
<DD>
4 February 1997 Ian Jackson ijackson@gnu.ai.mit.edu.
<P>
source in <A href="ftp://ftp.debian.org/debian/unstable/source/text">debian
archive under text</A>
<DT>
<A href="http://nathan.gmd.de/persons/thomas.gordon.html">Dr. Thomas F.
Gordon</A>
<DD>
GMD FIT - German National Research Center for Information Technology<BR>
Research Division Artificial Intelligence<BR>
53754 Sankt Augustin, Germany<BR>
email: thomas.gordon@gmd.de; phone: (+49 2241) 14-2665
<DT>
<A href="http://liinwww.ira.uka.de/bibliography/index.html">The Collection
of Computer Science Bibliographies</A>
<DD>
Copyright © 1995-1996 Alf-Christian Achilles
<P>
Great for the reference section!
<DT>
<A href="http://www.jclark.com/sp/spam.htm">spam</A>
<DD>
<A href="http://www.sil.org/sgml/archEngine.html">example of using spam to
munge HTML</A>
<DT>
<A href="http://fatman.mathematik.tu-muenchen.de/~schwarz/sgml-tools/">SGML
tools</A>
<DD>
as used in <A href="http://sunsite.unc.edu/LDP/">The Linux Documentation
Project</A>
<DT>
<A HREF="http://search.yahoo.com/bin/search?p=Postscript">Postscript</A>
<DD>
<DT>
<A HREF="http://search.yahoo.com/bin/search?p=python">Python</A>
<DD>
<DT>
<A HREF="http://search.yahoo.com/bin/search?p=SGML">SGML</A>
<DD>
<DT>
<A HREF="http://search.yahoo.com/bin/search?p=LinuxDocSGML">LinuxDocSGML</A>
<DD>
<DT>
<A href="../Relavent#lout">lout</A>
<DD>
<A href="http://www.ptc.spbu.ru/mail-archives/lout/0095.html">Re: Lout to
HTML Jin S. Choi (jsc@atype.com) Wed, 13 Nov 1996 19:34:47 -0500 </A>. Nifty
thread about LOUT, SGML, DSSSL, HTML, etc. I agree!
<DT>
Joe English
<DD>
<DT>
<A HREF="http://www.ccil.org/~esr/home.html">Eric Raymond</A>
<DD>
Linux, computational linguistics, www-html
&lt;199512221800.NAA09004@locke.ccil.org&gt;
<DT>
<A href="ftp://ftp.ietf.org/ietf/1id-guidelines.txt">IETF draft guidelines</A>
</DL>
<P>
<P>
@@ I know from first-hand experience that producing multi-purpose technical
specifications (e.g. IETF plain text, online hypertext, and postscript) is
tricky and tedious. I try to keep track of tools that might provide solutions
to this problem.
<DL>
<DT>
<A href="http://www.jclark.com/sp.html">SP</A>
<DD>
a new C++ based SGML parser by James Clark, the author of SGMLS
<DT>
<A href="ftp://ftp.ifi.uio.no/pub/SGML/Demo/dtd-fragments-0.2.tar.gz">DTD
Fragments</A>
<DD>
<BLOCKQUOTE>
Another SGMLS/Perl formatter, DTD Fragments. It's not DTD specific and does
output to HTML, ASCII and TROFF, it does require a DTD to generic element
mapping in Perl for any specific DTD and comes with DocBook and Linuxdoc
mappings. The next version will have RTF output, Snafu DTD mapping and better
support for applying different styles to the output.
<ADDRESS>
<A href="mailto:ken@bitsko.slc.ut.us">Ken MacLeod</A>
</ADDRESS>
</BLOCKQUOTE>
<DT>
<A HREF="http://www.uottawa.ca/~dmeggins/">SGMLSpm</A>
<DD>
Another perl5/nsgmls toolset. Includes some support for DocBook-&gt;LaTeX,
HTML conversion, though that part of the code looks like a one-time shot,
not a complete implementation.
<DT>
<A HREF="http://www.oac.uci.edu/indiv/ehood/dtd2html.doc.html" >DTD2HTML</A>
<DD>
An SGML DTD documentation/navigation tool by
<A href="http://www.oac.uci.edu/indiv/ehood/">Earl Hood</A><BR>
This tool translates an SGML DTD into HTML, providing hypertext navigation
of the document structure. Handy for learning SGML.
<DT>
<A name="psgml" HREF="http://www.lysator.liu.se/projects/about_psgml">PSGML</A>
<DD>
A GNU Emacs mode for SGML files
<DT>
<A HREF="http://www.sq.com/hm-ftp.html" >Getting HotMetaL by FTP</A>
<DT>
<A HREF="http://www.sq.com/panor-pr.html" >SoftQuad Inc. Panorama Press
Release</A>
<DT>
<A HREF="http://www.informatik.tu-muenchen.de/~schwarz/linuxdoc-sgml/">Linux
Doc/SGML</A>
<DD>
These guys have taken a very practical approach to SGML for technical
documentation. They started with SGMLs from James Clark and the QWERTZ DTD,
which mirrors LaTeX structure. Then they added down-translators for groff,
HTML, and others. Looks promising.
<P>
Hmmm... on closer examination, this is something of a hack. They hacked the
DTD, hacked the down-translators, etc. I like the idea of using a LaTeX-like
DTD, but I think I'll wait till this matures a little more. Also:
<A HREF="ftp://sunsite.unc.edu/pub/Linux/utils/text/">distribution archive</A>.
<DT>
<A HREF="ftp://ftp.th-darmstadt.de/pub/text/sgml/misc/" >GF: General SGML
Formatter</A>
<DD>
another SGMLs based SGML to HTML converter supporting a few sophisticated
DTDs
<DT>
<A HREF="http://web.nexor.co.uk/mak/doc/html/sgml-lib/html-sgml.html" >Setting
up PSGML and sgmls for HTML</A>
<DT>
<A HREF="ftp://ftp.jclark.com/pub/sp" >Remote file ftp.jclark.com/pub/sp</A>
<DT>
<A HREF="http://www.art.com/cost/" >CoST</A>
<DD>
Copenhagen SGMLs Tool -- SGMLs meets Tcl<BR>
maintained by Joe English
</DL>
<P>
<HR>
<ADDRESS>
<A HREF="../People/Connolly/">Dan Connolly</A><BR>
created 1995/12/05<BR>
last update by $Author: connolly $ on $Date: 1999/11/23 20:35:13 $
</ADDRESS>
</BODY></HTML>