<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<title>Formalizing Web Technology -- WebWorld Orlando</title>
</head>

<h1>Formalizing Web Technology</h1>



<address>
<a href="http://www.hal.com/~connolly">Daniel W. Connolly</a><br>

HaL Software Systems<br>

	Austin, TX
</address>



<p><a href="../index.html#webworld9401">Web World</a><br>

January 30, 1994 - Orlando, FL<br>

$Id: ww9401.html,v 1.1 1997/07/06 06:21:13 connolly Exp $

<p>(<a href="speak.html">speaker's notes</a>)

<hr>

<h2>Formalizing Web Technology</h2>


<dl>
<dt>	<a href="#proven">Proven Value of the Core Technology</a>

<dd>Distributed Hypermedia is an idea whose time has come.


<dt>	<a href="#change">Forces for Change</a>

<dd> The web represents a significant market force, and resources are
being pooled from many directions to satisfy the needs and desires of
that market.

<dt>	<a href="#capturing">Capturing the State of the Art</a>

<dd>Where are we right now?

<dt>	<a href="#stabilizing">Stabilizing Forces</a>

<dd>Deployment of new features does not come without cost.

<dt>	<a href="#breaking">Breaking Down HTML</a>

<dd> The HTML 2.0 spec is a good stake in the ground, but it should be broken
into smaller, more modular documents.

<dt>	<a href="#w3c">W3C - The Center of Evolution</a>

<dd>The core technology of the web should be in the hands of a "community trust," where 
anyone can contribute, and everyone gains.

<dt>	<a href="#looking">Looking Ahead</a>

<dd>How will new features affect the technology base? What research and developments 
are on the horizon?

</dl>

<hr>

<h2><a name="proven">Proven Value of the Core Technology</a></h2>

<p> There are few novel technologies in the World-Wide Web. It is simply an effective application of 
ideas that have been tested and proven:

<ul>

<li>	Sharing information makes people more effective

<li>	The Internet is an excellent basis for a distributed information system

<p>HTTP is a very simple information retrieval protocol. As such, it has provided an 
extensible basis for a number of valuable applications.
<p>

<li>	HyperText and HyperMedia are an effective way to represent human
knowledge

<p>HTML is a simple structured document representation, capable of representing 
many common forms of communications. URLs comprise a simple hierarchical 
document address space, which can accommodate many of the existing information 
systems on the Internet.
<p>

<li>	A direct manipulation interface (i.e. "point and click") is easy to use

<p>NCSA Mosaic was an instant, overnight success.

</ul>

<p><strong>The result: The web is now a vital, global 
information system.</strong>
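
<p>The two points above about URLs and HTTP can be sketched in a few lines of
Python. This is only an illustration, not part of the original talk: the base
address and the helper names <code>resolve</code> and <code>request_text</code>
are assumptions made for the example. It shows how a relative reference is
resolved within the hierarchical address space, and how little text a minimal
HTTP/1.0 retrieval request needs.

<pre>
```python
from urllib.parse import urljoin, urlsplit

# Hypothetical base address, chosen only for illustration.
BASE = "http://www.example.org/talks/1994/ww9401.html"


def resolve(reference, base=BASE):
    """Resolve a relative reference against a base URL."""
    return urljoin(base, reference)


def request_text(url):
    """Build the text of a minimal HTTP/1.0 GET request for a URL."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Request line followed by an empty line; no headers are required.
    return "GET " + path + " HTTP/1.0\r\n\r\n"


if __name__ == "__main__":
    for ref in ("speak.html", "../index.html", "/~connolly"):
        print(ref, "resolves to", resolve(ref))
    print(repr(request_text(resolve("speak.html"))))
```
</pre>

<p>The sketch only builds the request text; opening a connection and reading
the response are left out to keep the example self-contained.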

<hr>

<h2><a name="change">Forces for Change</a></h2>



<ul>
<li>	The consumers on the web represent a substantial commercial market, but 
the web does not currently support secure, reliable transactions.

<li>	Finding information on the web is difficult, and opportunities for automated 
searching have been demonstrated.

<li>	The same information could be delivered for less cost with caching and 
replication.

<li>	Until HTML (or some other ubiquitous data format) provides the expressive 
capability of contemporary word-processing and desktop publishing 
packages, information providers will feel constrained.
</ul>


<p><strong>How Do We Increase the Quality of Service and 
Security, and provide for Resource Discovery?</strong>

<hr>

<h2><a name="stabilizing">Stabilizing Forces</a></h2>

<p><strong>Maintaining Confidence in the Technology</strong>

<ul>

<li>	People resist change

<p>The technology will be perceived as stable as long as individual sites and users can 
choose between staying with their old applications and upgrading to participate in 
the new features. If they are forced to change their operation in response to changes 
that they did not ask for, they will be upset.
<p>

<li>	Mistakes are costly

<p>Once technology is deployed, it never really goes away. Mistakes represent a 
documentation, development, and support burden for a long time. It is critical to 
experiment and gain experience before wide deployment.

<p>
<li>	Consumers demand quality software products

<p>Internet tools have moved from research projects, to user-supported software, and 
now to the consumer market. Commercial development takes time: time to learn the 
technology and develop products, including testing, support, and documentation.

<p>
<li>	Mission critical applications must not be compromised

<p>Otherwise, the vast resources available for development of mission critical 
applications will simply be applied somewhere other than the web.

</ul>

<hr>

<h2><a name="capturing">Capturing the State of the Art</a></h2>

<p><strong>To make a change with confidence, we must be able to assess the scope of the change. 
Minimal design, i.e. modularization and information hiding, is necessary to 
identify the scope of a change's effects. </strong>

<ul>

<li>	TimBL's original writings

<li>	NCSA documentation: NCSA httpd, CGI, CCI

<li>	HTML+

<li>	HTML interoperability group

<li>	HTML, HTTP, URI working groups

<li>	IIIR, integrated directory services, quality information services, HTTP 
security

</ul>

<hr>

<h2><a name="breaking">Breaking Down HTML</a></h2>

<dl>

<dt>		HTML Syntax

<dd>how to decide whether a sequence of characters is a valid HTML document, and if so,
how to create a parse tree.

<dt>		Interpretation of HTML Idioms

<dd>an informal description of the meaning and suggested rendering of an HTML parse
tree.

<dt>		The text/html Internet Media Type

<dd>registration of HTML as a MIME type. Charset issues. Newline issues. Appendices
specifically addressing SMTP transport and HTTP transport issues. Security issues.

<dt>		World-Wide Web User Agents and Applications

<dd>Specific techniques: basic HREF links, ISINDEX, FORMS, ISMAP, .mailcap,
$WWW_HOME, mailto:, proxies, security issues. Suggestions for documentation,
default configuration, etc.

<dt>	World-Wide Web Hypermedia Architecture

<dd>formal discussion of the WWW hypertext model: documents, anchors, links,
searching. Formal discussion of common abstractions from ftp, http, gopher, WAIS,
etc. Definition of correct caching/proxy behavior.

</dl>

<hr>

<h2><a name="w3c">W3C - The Center of Evolution</a></h2>

<ul>

<li>	Pooling our resources

<p>The development of the web has been a research/volunteer effort. The current 
demand is more than that community can support. In fact, it's more than almost any 
one company or organization could shoulder. A consortium allows all the interested 
and motivated parties to contribute without taking on the entire burden.

<p>

<li>	An Open Market

<p>Various companies will carve out their niche in the vast marketplace of W3 products 
and services, but none of these companies has the last word. The core technology will 
remain royalty-free, which allows it to spread quickly.
<p>

<li>	Open discussion balanced with rapid progress

<p>The Internet Engineering Task Force working groups provide a forum for open 
communication and consensus building, and the W3 consortium provides resources to 
research and develop the technologies. Consortium members will have early access to 
the technology in order to be able to support it when it is publicly released.

</ul>

<hr>

<h2><a name="looking">Looking Ahead</a></h2>

<dl>
<dt>		HTML Syntax

<dd><ul>

<li>	Elements for new features: super/subscript, tables, etc.

<li>	ISO special character entities, and how they show up in the parse tree

<li>	Conformance testing.

<li>	Entity declarations, marked sections.

<li>	Math markup?

</ul>

<dt>		Interpretation of HTML Idioms

<dd><ul>

<li>	Tables, Figures. Super/subscript.

<li>	DSSSL-Lite. 

<li>	Toolbars (next/previous/up).

<li>	Vendor- and application-specific extensions.
</ul>

<dt>		The text/html Internet Media Type
<dd><ul>

<li>	Character sets

<li>	Versions, levels, format negotiation issues

<li>	Vendor- and application-specific extensions.
</ul>

<dt>		World-Wide Web User Agents
<dd>
<ul>
<li>	File upload

<li>	Embedded presentation

<li>	Mandatory display of copyrights. Display of security information

<li>	Desktop message bus (CCI/OLE/Tooltalk/AppleEvents)

<li>	Distributed editing, annotation, and other forms of collaboration.

<li>	Resource discovery technology (e.g. Harvest, Verity) will have user interface 
implications.
</ul>

<dt>		World-Wide Web Hypermedia Architecture
<dd><ul>

<li>	link relationships

<li>	embedding, compound document architecture

<li>	the web as a knowledge base

<li>	isomorphisms with HyTime

<li>	Publishing model (URNs/URCs, copyright, payment, replication, authentication,
access control).

<li>	Common attributes (aka "meta-information") and taxonomies for distributed 
searching.
</ul>

<dt>		HTTP

<dd><ul>

<li>	Security

<li>	Variations on Proxy: no-cache

<li>	Session management, and application-level packets.

<li>	Transactions

<li>	Desktop message-bus, UDP version of the protocol.
</ul>

</dl>